This interesting research report comes via Jared Spool's
User Interface list. Here, Will Schroeder discusses the
work of Rolf Molich.
Rolf Molich is a designer and researcher at DialogDesign
in Denmark.
Their English web site is located at:
http://www.dialogdesign.dk/inenglish.html
Those who read Scandinavian languages will also find the
main Web site useful:
http://www.dialogdesign.dk/home.html
Their intro page offers a nice expression of an
important concept. Their goal is to help customers
design such good IT products that users are happy
to use them. This is what Anders Skoe often talks
about as conviviality, the step beyond user-friendliness
to user enjoyment.
The articles on this site are a nice collection on
research methods and other design issues.
-- Ken Friedman
--
Learning from the Work of Others
By Will Schroeder, Principal, User Interface Engineering
Every once in a while it's a good idea to step back from our own
day-to-day work and watch other professionals operate. One
instinctively feels that a lot can be learned.
There are a lot of usability handbooks and guidelines out there
with seemingly good advice, but should we adopt methods we've
never seen in action? How do we learn from usability tests if
the details of screening, test protocols, and analysis are not
presented along with the test results? Can anyone accurately
reproduce a usability test series from the limited descriptions
in a typical conference paper? Most importantly, how can we
learn from others without dogging their steps from start to
finish?
We were excited when we learned that Rolf Molich had completed
two studies that truly facilitate this kind of learning. In each
study, several usability teams independently tested the same
interface. He compares the work of the teams step by step from
user screening through task design to testing and reporting.
Both papers, called CUE-1 and CUE-2, are on Rolf's web site
at http://www.dialogdesign.dk/inenglish.html.
In the CUE-2 study, nine teams set out to usability test the same
web site, following what they believed to be the established
usability best practices. Rolf describes, reviews, and compares
the processes and reports from each of the nine teams in exhaustive
detail. His findings are so remarkable that they have changed the
way we think about our own work.
The study was set up so that all of the teams were given the
same test scenario and objectives for the same interface. Each
team then conducted a study using their organization's standard
procedures and techniques. They then compiled reports, which they
sent back to Rolf.
Rolf looked at all problems found by each team, a combined total
of more than 300 problems. He rigorously evaluated each of the
identified problems, finding most of them to be "reasonable and
in accordance with generally accepted advice on usable design."
So, the good news is that conducting all of these usability tests
identified a wealth of usability problems with the interface.
The bad news comes when you compare the findings of each team.
Although the teams' definitions of what constituted a usability
problem were effectively identical, there wasn't a single problem
that every team reported. Even more surprising to us was that
eight of the nine teams missed at least 75% of the usability problems!
When you look at the total number of unique problems identified,
only one team reported more than 25% of these problems.
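For readers who want to run the same kind of comparison on their own
test reports, the arithmetic behind these two observations can be
sketched in a few lines of Python. This is only an illustration: the
team labels and problem IDs below are invented and are not taken from
the CUE-2 reports, and the function names are hypothetical.

```python
# Hypothetical data: team labels and problem IDs are invented for
# illustration and are NOT taken from the CUE-2 reports.
reports = {
    "team_a": {"p1", "p2", "p3"},
    "team_b": {"p3", "p4"},
    "team_c": {"p5"},
    "team_d": {"p3", "p6", "p7", "p8"},
}


def coverage(reports):
    """Return each team's share of all unique problems found by any team."""
    all_problems = set().union(*reports.values())
    return {team: len(found) / len(all_problems)
            for team, found in reports.items()}


def reported_by_all(reports):
    """Return the problems that every single team reported."""
    return set.intersection(*reports.values())


shares = coverage(reports)
common = reported_by_all(reports)

for team, share in sorted(shares.items()):
    print(f"{team}: {share:.0%} of unique problems")
print("reported by every team:", sorted(common))
```

In this toy data set no problem is reported by every team, even though
one problem shows up in three of the four reports. That is the same
pattern, on a much smaller scale, as the one described above.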
This is alarming. It's the parable of the blind men studying the
elephant all over again. Each team grabbed onto a different part
and came to different conclusions. Each usability report read like
a test of a completely different interface. This is what makes
the CUE-2 study so exciting! We can study the differences between
each team's methods and practices and then look at how they relate
to our own.
The study also raises some central questions for future research
into usability testing techniques. How can we construct tests that
find the important usability problems as quickly as possible?
And how can we improve our practices so different teams will
consistently find the same problems? We can find the beginnings
of answers to these questions in Rolf's studies. Let's take two
examples:
Example #1: Task design.
Nine teams created 51 different tasks for the same UI. Rolf
found each task to be well designed and valid, but there was
scant agreement on which tasks were critical. If each team used
the same best practices, then they should have all derived the
same tasks from the test scenario. But that isn't what happened.
Instead, there was virtually no overlap. It was surprisingly
rare for more than one team to use similar tasks. It was as if
each team thought the interface was for a completely different
purpose. Comparing the tasks developed by the nine teams makes
for a valuable lesson in effective task design.
Example #2: Reporting results.
Rolf found that the quality of the reports varied dramatically.
The length of the reports varied from 5 pages to 52 pages -
roughly a tenfold difference! Some reports lacked positive findings,
executive summaries, and screen shots. Others were complete with
detailed descriptions of the team's methods and definitions of
terminology. By looking through the different reports, we can
quickly pick out the attributes that would make our reports more
helpful to our clients.
The practices of all of the teams in this study needed review,
formalization, and a general tightening up. In all probability,
since the teams were professional or professionally led, everyone
can benefit from reviewing their own practices. We can use this analysis
to hold a mirror up to our own work. This long overdue experiment
provides extremely valuable material for sharpening individual
usability practices. Rolf has done a great job of opening our
eyes to the possibilities for improvement.