Convergent Usability Evaluation, p. 5
Evaluation Methods Used
The UI team exploited a variety of opportunities to evaluate and improve EIRS’ usability. These included: expert review, trial deployment, remote training observation, participation in Q/A testing, election-day field observations, and post-election interviews.
During development, the primary method of evaluating and improving the usability of EIRS was expert review. EIRS management recognized the need for usability reviews even before any usability experts joined the project. Because the voting incident report form was so important, project leaders solicited and received feedback on a very early version of it from outside usability experts in July.
Once usability experts had joined the project, developers posted messages soliciting reviews of a specific aspect or part of EIRS they had implemented. Usually the design to be reviewed was in the current “test” version that team-members could access, but sometimes developers posted preliminary implementations or prototypes and asked for feedback before putting them into EIRS.
Usability team members generally provided reviews whenever asked, and sometimes offered feedback even when not asked. Other project members also often pitched in to provide review feedback as time permitted.
Early “Trial” Deployment in Primary Election
The state of Florida conducted its primary election on August 31. Although EIRS was far from complete by then, it was functional. Enough was implemented to allow it to be used in that state election to collect voting incidents.
Users and Election Protection Coalition leaders provided feedback through a variety of channels. An important complaint received from the Florida primary was:
- Volunteers cannot edit incident reports after submitting them. This is a problem because people sometimes realize that they entered incorrect data, or they later learn something new about the case.
This problem was anticipated by the project, and extensively
discussed. Client leadership insisted that forms not be editable
because they wanted to preserve the integrity of records for legal
purposes. EIRS project staff was split on the issue. On the one hand,
the clients knew what their legal requirements were, and furthermore
threatened not to use the system if records were editable. On the other
hand, usability problems were anticipated.
Several compromises were considered. The chosen implementation
allowed records to be appended to with additional commentary, but never
altered once entered. Due to resource constraints, the "append" feature
was primitive. (Schedule and political pressures led to technology
platform choices with severely limited options for such a feature.)
Further complicating the issue, EIRS’ security model was very complex,
underspecified, and underdocumented. This led to a security flaw in the
append feature that in theory would allow unauthorized persons to
append to records. However, these bogus records were detectable, so the
ultimate risk to data integrity was minimal. But the flaw could have
permitted a denial-of-service attack.
In retrospect, more time should have been dedicated to the security
model. A sub-team with security experience should have been given
responsibility for this task.
The correct way to handle data integrity in this situation remains a
tricky problem. Here are two approaches, which are not mutually
exclusive:
- Allow operators to have more than one “open” record at a time, and allow free editing until the open record is committed. While it is open, the record is still retained in the database (probably in a different table), so that it is not lost in the event of a crash. Thus an operator can complete coding for one call while taking the next call. It may be that two open incidents is exactly the right number to permit.
- Allow authorized personnel to make corrections to coding, and append new text to the narrative. These changes could be noted in a change log field for the record, if necessary.
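To make the two approaches concrete, here is a minimal Python sketch of how they might fit together. It is not how EIRS was implemented. All names (`IncidentRecord`, `OperatorSession`, `AUTHORIZED`, `MAX_OPEN`) are hypothetical, the limit of two open records simply follows the guess above, and the database storage of open records is not modeled.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical set of accounts allowed to change committed records.
AUTHORIZED = {"supervisor"}


@dataclass(frozen=True)
class LogEntry:
    """One change-log entry; frozen so it cannot be altered afterwards."""
    author: str
    timestamp: datetime
    action: str   # "append" or "correct"
    detail: str


@dataclass(eq=False)  # compare records by identity, not by field values
class IncidentRecord:
    narrative: str = ""
    coding: dict = field(default_factory=dict)
    committed: bool = False
    change_log: list = field(default_factory=list)

    def edit_draft(self, narrative=None, coding=None):
        """Free editing, allowed only while the record is still open."""
        if self.committed:
            raise PermissionError("record is committed; append or correct instead")
        if narrative is not None:
            self.narrative = narrative
        if coding is not None:
            self.coding.update(coding)

    def _check(self, author):
        if not self.committed:
            raise RuntimeError("record is still open; edit the draft instead")
        if author not in AUTHORIZED:
            raise PermissionError(f"{author} may not change committed records")

    def _log(self, author, action, detail):
        now = datetime.now(timezone.utc)
        self.change_log.append(LogEntry(author, now, action, detail))

    def append_note(self, author, text):
        """Add commentary to the narrative without altering existing text."""
        self._check(author)
        self.narrative += f"\n[{author}] {text}"
        self._log(author, "append", text)

    def correct_coding(self, author, field_name, new_value):
        """Correct one coded field and note old and new values in the log."""
        self._check(author)
        old = self.coding.get(field_name)
        self.coding[field_name] = new_value
        self._log(author, "correct", f"{field_name}: {old!r} -> {new_value!r}")


class OperatorSession:
    MAX_OPEN = 2  # assumed limit on simultaneously open records

    def __init__(self, name):
        self.name = name
        self.open_records = []

    def open_record(self):
        if len(self.open_records) >= self.MAX_OPEN:
            raise RuntimeError("commit an open record before starting another")
        record = IncidentRecord()
        self.open_records.append(record)
        return record

    def commit(self, record):
        record.committed = True
        self.open_records.remove(record)
```

In this sketch an operator can keep typing into one draft while starting the next call's record, and every later change to a committed record leaves a trace in `change_log`, which is the property the clients wanted for legal purposes.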
Remote Training Session Observation
Throughout development, the EIRS team discussed the desirability of conducting formal usability tests: observing representative users performing prescribed tasks in a fairly controlled environment. This was spurred by a concern that putting such a complex, important, and visible application into use without first testing its usability was unwise – even foolhardy. However, as the election neared, it became clear that formal usability tests simply were not going to happen: the team did not have resources to design a test, recruit participants, find a site, create a realistic test-environment, conduct test sessions, and analyze the results. The election was too near and things were moving too quickly.
In mid-October, as this realization sank in, an alternative presented itself: groups of volunteers slated to work as Election Protection call-center volunteers and first-level managers were being trained on how to use EIRS. It occurred to us that observing training sessions might provide opportunities to see users’ reactions to EIRS. We arranged to sit in on some training sessions.
We soon learned that the training sessions were not normal instruction in physical classrooms because the trainees were scattered around the U.S. Instead, the training was conducted by telephone conferencing. Presentation slides describing EIRS and illustrating its pages were emailed to trainees, who joined a conference call at a prescribed time. An instructor led the group through the slides, explaining how to use EIRS and answering questions.
We initially wondered whether listening in on a conference call slide presentation would be an effective way to discover usability problems in EIRS. In fact, it turned out to be quite valuable, for three reasons:
- The presentation was a fairly systematic and thorough walk-through of the functionality of EIRS. This allowed us to notice design flaws and potential usability problems that we might have missed in our self-guided reviews.
- As instructors explained how to use EIRS, it was clear that they felt there were “rough spots” in the UI that had to be explained carefully to avoid errors on election day.
- Questions and comments from trainees provided insight into their understanding – or misunderstanding – of how EIRS worked, and the context in which it would be used.
Between us, we “attended” a total of six training sessions. Examples of observations reported back to the team are:
- At the bottom of the incident summary page, there is an “Edit Task” button that isn’t for editing the report. A trainee asked what it does. The instructor didn’t know.
- Users are urged to “be sure to print” incident forms before submitting them. A “Print” link is provided at the bottom of the form for this purpose. Some trainees argued that if forms must be printed, EIRS should just do it, rather than relying on users to remember. Trainees asked what happens if the printer is busy or out of paper when a user clicks “Print”. The instructor didn’t know. Trainees wanted to be able to “print” completed forms to files.
- If a volunteer tries to edit a report s/he didn’t create, EIRS doesn’t explicitly deny permission; instead it asks the user to log in (even though s/he is already logged in), because it wants the user to log in as the user who entered the report. This relies on volunteers understanding why they are being asked to log in again. In fact, even if the user logged in as the creator, incident editing didn’t work as users expected.
The main obstacle to having EIRS print forms automatically is that many of the web browsers operators were using were expected not to permit automatic printing. Reconfiguring the browsers was considered and rejected as impractical. This usability problem shows an important limitation of implementing this type of system as a web application.
Problems discovered by observing training sessions were filed as bug reports so they could be corrected or further emphasized in training before election day.
Participation in Q/A Testing
As election day loomed, another evaluation opportunity arose: Q/A testing. We monitored the Q/A email list to learn what aspects of EIRS needed testing. We exercised EIRS, often following test-scripts devised by the Q/A team. We reported usability problems as well as bugs. One of us also helped create test scripts and noted issues that confused other Q/A testers.
An important usability problem this exposed was that the relationship
between user accounts and access privileges didn’t match users’
expectations: EIRS often wouldn’t let users – including testers –
do what they needed to do. This was mostly caused by bugs in the
security model or in processes that used the security model, but the
situation was made considerably more confusing and frustrating for
users because in a few cases EIRS by design forbade any user from
performing certain actions that many users expected to be able to
perform. Some of the problems uncovered by Q/A testing were resolved
prior to deployment, but others would have required substantial
rearchitecture and so remained in the deployed system.
“In Vitro” Field Observations and Interviews
Because EIRS will be used in elections beyond 2004, the usability team and EIRS management thought it would be valuable for the usability team to document how well (or poorly) EIRS worked for its intended users and tasks. We therefore spent election day at two call-centers, observing, interviewing, and videotaping volunteers using EIRS.
Two call-centers in San Francisco allowed us to observe their operation. These were large centers, each with over 50 operator workstations. Each handled voting-problem calls from a subset of western states. Both centers expected to be filmed on election day by news organizations (a correct assumption), and so had volunteers sign media-releases. This made it possible for us to videotape without having to obtain permission from each volunteer.
Although the call-center volunteers were very busy on election day handling calls from voters and poll observers, we were able to collect a lot of feedback on the design of specific EIRS features, especially the incident-reporting form. Our election-day observations included:
- Because of the requirements of one of the underlying components of EIRS, EIRS’ designers assumed each user would be registered with their own login account. However, management at both SF call centers considered that too much bother. They created generic user-accounts and had everyone use them. EIRS’ developers feared that the system wouldn’t work properly with multiple users simultaneously using the same account. VerifiedVoting staff at one SF call-center quickly assigned each station a unique login, but that wasn’t practical nationwide. Fortunately, careful monitoring soon revealed that EIRS works properly with multiple users logged into the same account. The toolkit chosen as a basis for EIRS had a very strong concept of a person at its core. Since the EIRS development approach was to choose technology first, models of use were systematically distorted by the underlying technology platform. Given the schedule, this was inevitable, but it is clear that a luckier choice of underlying technology would have led to a more robust system.
- With hundreds of volunteers entering election incidents all across the U.S. at more-or-less the same time, EIRS bogged down considerably. At times, it was intolerably slow, causing call-center volunteers to switch to taking incident reports on paper forms for later entry into EIRS.
- As some EIRS designers and expert reviewers predicted, volunteers found it difficult to fill out the online incident form as callers talked, because they couldn’t force callers to follow the order of the form when describing voting problems. Some users scrolled up and down the form as the caller talked. Others first wrote out a summary of the situation, either on paper or in the incident Description field, and later went back and filled out the form fields. One design that the EIRS team had considered was putting the free-form incident description first, so that operators could fill in the later fields from that information. EIRS clients weren’t in favor of this design, but in retrospect the EIRS team should have made a stronger case to the clients for it.
In addition to learning much about EIRS’ usability, observing the call centers allowed us to see and understand the context in which EIRS was used. EIRS was only part of the process and only one of several tools used by call-center volunteers. For example, many used the website MyPollingPlace.com to help callers find where to vote.
The call centers also employed non-computer artifacts to support volunteers in helping callers. At both centers, late-breaking information was posted on walls. Printed incident reports were sorted into boxes according to their urgency.
An important insight that came out of the election-day field observations was that many calls were from voters trying to find their polling place. EIRS was designed under the assumption that most calls would be reports of voting irregularities. Although a significant number of voting irregularities were reported, it is useful for future planning to understand that the call-centers functioned primarily as voter information hot-lines.
After the election, the usability team suggested interviewing a sample of EIRS users to capture their comments, criticisms, and suggestions while their memories were still fresh. EIRS management agreed, and asked two organizations that had recruited and organized many of the call-center volunteers (People for the American Way Foundation and the Lawyers’ Committee for Civil Rights under Law) to nominate users for us to interview.
None of the nominated users were near San Francisco (where the usability team is located), so interviews were conducted by telephone and email. The interviewers were the same two UI team members who had conducted the call-center observations and interviews on election day.
The post-election telephone interviews provided data similar to that
collected during the election-day field observations. We
collected more feedback on the design of the incident reporting
form. We learned that the revolt against using individual user
logins was quite widespread. Overall, these interviews helped us
understand the variety of ways in which call-centers operate.
Last modified December 22, 2005 05:38 PM