User-Centered Processes and Evaluation in Product Development
Apple Computer, Inc.
This report describes the process by which an engineer and a social scientist conducted
a user study and how their complementary perspectives influenced the interface design
for Eager, an intelligent assistant for users of the HyperCard environment. Trace data was combined with qualitative analysis to arrive at concrete recommendations
for improving the interface.
Introduction
There is much talk about the benefits of interdisciplinary work throughout the design cycle. In practice, however, organizational and personal barriers hinder effective interaction among people in different disciplines. In his paper "Interdisciplinary Collaboration" (1990), Scott Kim discusses how "it is important to keep noticing the assumptions in your own discipline that might limit your view of the world". We present our experiences as a case study of some of the problems and successes in a collaborative effort to improve the design of the user interface for Eager, a programming-by-example system.
What is Eager?
Eager is a program for automating repetitive tasks. It constantly monitors
user actions, and when it detects a pattern, it writes a generalized program to perform
that repetitive activity. The following is a screen illustration of Eager:
The design of the interface was based on the principle that the program should cause
minimal intrusion into the user's normal interaction with the computer. The design
included the representation of an "agent" icon on the screen and the use of a new
interface technique, called "anticipation", where the agent indicates what it expects the
user to do next by highlighting that item in green (see Figure 1). Since intelligent
agents are innovative software features and users are not accustomed to interacting
with them, it was necessary to test empirically the initial reaction of users to these
features. The two main issues were the ease of use and the functionality of the
Organizationally, we were separated. We both were in the Advanced Technology Group,
but we worked in two different departments. Although there were no formal mechanisms
for collaboration across departments, there was a general sentiment that cooperation
across disciplines is valuable. Apple managers value their employees' contributions
to other departments, and some managers explicitly consider these contributions when
reviewing employee performance. We knew about each other's work, and when we took
the initiative to work together, we did not need to go through formal approval procedures.
When Allen, the engineer, had a working version of his program that he felt was sufficiently
robust to handle most of the actions that a typical user would perform, he approached
Shifteh, then a user studies specialist in the Human Interface Group, with a list of questions. He had ideas about how a user study should be conducted.
For example, he expected to first explain and demonstrate the program and then have
the subjects use it themselves. But since the main goal of the study was to understand users' initial reactions, Shifteh suggested that they observe subjects' first
experiences with the system without providing any prior explanation.
For this type of problem, a longitudinal study would be ideal, showing how
user responses unfold over time in their natural work settings. However, given
the constraints of time and resources for conducting a longitudinal study, Shifteh
suggested a study design based on specific tasks that were relevant to the user's everyday
In addition to observing users' reactions, it was also important
to follow the actions of the intelligent agent behind the scenes.
Allen suggested modifying the program to produce trace data to give us an accurate
recording of the actions of the intelligent agent. Whenever the agent appeared on the user's
screen, trace data recorded 1) a list of the user's HyperCard commands before and
after the appearance, 2) whether the user interacted with the agent, and 3) whether
the agent's suggestions were accepted or rejected. This trace data helped Allen to understand
the specific situation that had caused the Eager program to detect a repetitive pattern.
The trace data thus augmented the other data collection methods, such as video recording and questionnaires.
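The three items recorded in the trace can be pictured as one record per appearance of the agent. The following is only a minimal sketch of such a record and a tally over a session; the names TraceRecord and summarize, the field names, and the sample commands are hypothetical illustrations and do not describe the format Eager actually produced.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TraceRecord:
    """One hypothetical trace entry, written each time the agent appears."""
    # 1) the user's HyperCard commands before and after the appearance
    commands_before: List[str]
    commands_after: List[str] = field(default_factory=list)
    # 2) whether the user interacted with the agent
    user_interacted: bool = False
    # 3) whether the suggestion was accepted (True), rejected (False),
    #    or never decided (None)
    suggestion_accepted: Optional[bool] = None


def summarize(records: List[TraceRecord]) -> Dict[str, int]:
    """Tally appearances, interactions, acceptances, and rejections."""
    return {
        "appearances": len(records),
        "interacted": sum(r.user_interacted for r in records),
        "accepted": sum(r.suggestion_accepted is True for r in records),
        "rejected": sum(r.suggestion_accepted is False for r in records),
    }


if __name__ == "__main__":
    # Invented sample session, for illustration only.
    session = [
        TraceRecord(["copy field", "go next card"], ["click agent"], True, True),
        TraceRecord(["type text"], ["go next card"], False, None),
    ]
    print(summarize(session))
```

A tally like this gives counts that can be set beside the video recordings and questionnaires when a particular appearance of the agent needs to be explained.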
One of the questions Allen wanted answered by the study was when the agent should
appear on the screen to assist the user in completing a repetitive task. From his
technical perspective, he wanted the agent to appear as soon as possible so that
it could be of most help to the user. However, there was a tradeoff: the sooner the agent
appeared, the more likely it was to guess the pattern in
the user's actions incorrectly. Allen wanted to test two versions of the program: one with
early and one with late appearance of the agent.
From her perspective as a psychologist, Shifteh knew that users would be confused
if confronted with an intelligent agent that made mistakes. The idea that a program
could be incorrect was in sharp contrast to people's current mental model of computers.
So, instead of testing two versions, we decided to use just the late appearance version,
the one that had a greater likelihood of correctly detecting the pattern. Indeed,
the results of the study supported this decision to such a degree that Allen later
changed the program to delay the appearance of the agent even longer.
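The tradeoff between early and late appearance can be illustrated with a toy detector that waits for a chosen number of repetitions before reporting a pattern. This is a simplified sketch, not Eager's algorithm: it matches only exact repetitions at the end of the action list, and the function name detect_repetition, the parameters min_repeats and max_period, and the sample actions are all assumptions made for illustration.

```python
from typing import List, Optional


def detect_repetition(actions: List[str],
                      min_repeats: int = 2,
                      max_period: int = 10) -> Optional[List[str]]:
    """Return the shortest block of actions that ends the list and is
    repeated at least min_repeats times in a row, or None if there is none.

    A small min_repeats corresponds to an early-appearing agent, a larger
    one to a late-appearing agent.
    """
    n = len(actions)
    for period in range(1, max_period + 1):
        needed = period * min_repeats
        if needed > n:
            break  # not enough history for this or any longer block
        block = actions[n - period:]
        tail = actions[n - needed:]
        # The tail must consist of min_repeats exact copies of the block.
        if all(tail[i] == block[i % period] for i in range(needed)):
            return block
    return None


if __name__ == "__main__":
    history = ["copy", "next", "paste", "copy", "next", "paste"]
    print(detect_repetition(history, min_repeats=2))  # block found
    print(detect_repetition(history, min_repeats=3))  # still waiting
    # An early threshold also fires on a coincidental repeat:
    print(detect_repetition(["open", "open"], min_repeats=2))
```

With min_repeats set to 2 the sketch reports a pattern as soon as anything happens twice, including coincidences; raising the threshold delays the report but makes a wrong guess less likely, which is the direction the study supported.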
Feedback into the design.
Allen was present throughout the testing sessions as a silent observer. Shifteh felt
that it would be valuable for him to experience those detailed aspects of user reactions
that could not be captured on video or in a formal report.
At the end of each testing session we discussed our observations about users' responses
and reactions. Our different orientations often led us to different interpretations
of the data.
For example, Shifteh noticed that some users did not recognize that the green highlighting
and the agent icon were two aspects of the same system. Allen, on the other hand,
could not imagine that this misunderstanding could occur. He knew the system so
well that this finding could not fit his perception of the program. He kept trying
to explain the data in a different way to convince Shifteh that her observations
We tried to resolve our differences by reviewing the transcripts of the session.
We found comments like these:
- I was curious about him [the agent] but I didn't know exactly what it was and I was
seeing some green things.
- I thought it [the agent] was a button and the green [highlighting] seemed to have
something to do with what I had done. The green came on at the same time; it showed
what I was going to do next and the icon was gonna do the task for me.
To Allen, the fact that users had noticed the two features and talked about their
connection was proof that they understood the relation. Our discussions revealed
that users were verbalizing their reactions because they were having a difficult
time understanding the relationship between the two interface features. As a result of this
study, Allen changed the agent icon to appear in green, so that its connection with
the green highlighting would be more obvious.
Conclusion
Our experiences demonstrate how engineers and psychologists can go beyond the organizational constraints and personal biases of their disciplines and turn the design process around. Through our willingness to work together in conducting a user study, we learned about each other's worlds and clarified misperceptions about how to interpret user data.
We recognized the value of combining trace data with observational methods. This interplay of objective and subjective methods was particularly useful in our situation, where we had limited time and resources.
1. Kim, Scott. (1990). Interdisciplinary Collaboration. In Brenda Laurel (Ed.), The
Art of Human-Computer Interface Design. Reading, MA. Addison-Wesley Publishing Co.,
2. Cypher, Allen. (1993). Eager: Programming Repetitive Tasks by Demonstration. In Allen Cypher (Ed.), Watch What I Do: Programming by Demonstration. Cambridge, MA. MIT Press, pp. 467-484.