User-Centered Processes and Evaluation in Product Development

Shifteh Karimi
Allen Cypher

Apple Computer, Inc.

Abstract. This report describes the process by which an engineer and a social scientist conducted a user study and how their complementary perspectives influenced the interface design for Eager, an intelligent assistant for users of the HyperCard environment. Trace data was combined with qualitative analysis to arrive at concrete recommendations for improving the interface.

Introduction. The benefits of interdisciplinary work throughout the design cycle are widely discussed. In practice, however, organizational and personal barriers hinder effective interaction among people in different disciplines. In his paper "Interdisciplinary Collaboration" (1990) [1], Scott Kim discusses how "it is important to keep noticing the assumptions in your own discipline that might limit your view of the world." We present our experiences as a case study of some of the problems and successes in a collaborative effort to improve the design of the user interface for Eager, a programming-by-example system.

What is Eager? Eager is a program for automating repetitive tasks [2]. It constantly monitors user actions, and when it detects a pattern, it writes a generalized program to perform that repetitive activity. The following is a screen illustration of Eager:

[Figure 1: Screen illustration of Eager]

The design of the interface was based on the principle that the program should cause minimal intrusion into the user's normal interaction with the computer. The design included the representation of an "agent" icon on the screen and the use of a new interface technique, called "anticipation", where the agent indicates what it expects the user to do next by highlighting that item in green (see Figure 1). Since intelligent agents are innovative software features and users are not accustomed to interacting with them, it was necessary to test empirically the initial reaction of users to these features. The two main issues were the ease of use and the functionality of the system.

Our Collaboration. Organizationally, we were separated. We both were in the Advanced Technology Group, but we worked in two different departments. Although there were no formal mechanisms for collaboration across departments, there was a general sentiment that cooperation across disciplines is valuable. Apple managers value their employees' contributions to other departments, and some managers explicitly consider these contributions when reviewing employee performance. We knew about each other's work, and when we took the initiative to work together, we did not need to go through formal approval procedures.

User-study Design. When Allen, the engineer, had a working version of his program that he felt was sufficiently robust to handle most of the actions that a typical user would perform, he approached Shifteh, a user studies specialist at that time in the Human Interface Group, with a list of questions. He had ideas about how a user study should be conducted. For example, he expected to first explain and demonstrate the program and then have the subjects use it themselves. But since the main goal of the study was to understand users' initial reactions, Shifteh suggested that they observe subjects' first experiences with the system without providing any prior explanation.

A longitudinal study would be the ideal way to study this type of problem, because it shows how user responses unfold over time in their natural work settings. However, we lacked the time and resources to conduct a longitudinal study, so Shifteh suggested a study design based on specific tasks that were relevant to the user's everyday work.

In addition to observation methods for collecting users' reactions, it was also important to follow the actions of the computer's intelligent agent behind the scenes. Allen suggested modifying the program to produce trace data that would give us an accurate recording of the agent's actions. Whenever the agent appeared on the user's screen, the trace data recorded 1) a list of the user's HyperCard commands before and after the appearance, 2) whether the user interacted with the agent, and 3) whether the agent's suggestions were accepted or rejected. This trace data helped Allen to understand the specific situation that had caused the Eager program to detect a repetitive pattern. It augmented our other data collection methods, such as video recording and questionnaires.
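As a minimal sketch, the three kinds of trace information described above could be held in a record like the one below. The Python rendering, the names `AgentAppearance` and `summarize`, and the sample command strings are illustrative assumptions; this is not Eager's actual trace format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentAppearance:
    """One hypothetical trace entry, written each time the agent appears."""
    commands_before: List[str]   # user's commands leading up to the appearance
    commands_after: List[str]    # user's commands following the appearance
    user_interacted: bool        # did the user interact with the agent?
    # True = suggestion accepted, False = rejected, None = no decision recorded
    suggestion_accepted: Optional[bool] = None


def summarize(trace: List[AgentAppearance]) -> dict:
    """Count appearances, interactions, acceptances, and rejections."""
    return {
        "appearances": len(trace),
        "interactions": sum(1 for t in trace if t.user_interacted),
        "accepted": sum(1 for t in trace if t.suggestion_accepted is True),
        "rejected": sum(1 for t in trace if t.suggestion_accepted is False),
    }


# Illustrative data only; the command strings are made up for this example.
trace = [
    AgentAppearance(["copy field", "go to next card"], ["click agent"], True, True),
    AgentAppearance(["copy field", "go to next card"], ["paste field"], False),
]
summary = summarize(trace)
```

A summary of this kind could sit alongside the video recordings and questionnaires, with the per-appearance command lists available when a particular detection needs to be examined.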

One of the questions Allen wanted the study to answer was when the agent should appear on the screen to assist the user in completing a repetitive task. From his technical perspective, he wanted the agent to appear as soon as possible so that it could be of most help to the user. However, there was a tradeoff: the sooner the agent appeared, the more likely it was to guess the pattern in the user's actions incorrectly. Allen wanted to test two versions of the program, one with early and one with late appearance of the agent.

From her perspective as a psychologist, Shifteh knew that users would be confused if confronted with an intelligent agent that made mistakes. The idea that a program could be incorrect was in sharp contrast to people's current mental model of computers. So, instead of testing two versions, we decided to use just the late-appearance version, the one that had a greater likelihood of correctly detecting the pattern. Indeed, the results of the study supported this decision to such a degree that Allen later changed the program to delay the appearance of the agent even longer.
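The tradeoff between early and late appearance can be illustrated with a toy detector. This is only a sketch under stated assumptions: it matches action names exactly, whereas Eager generalizes over user actions, and the function `detect_cycle` and its `min_repeats` parameter are hypothetical names, not part of Eager.

```python
from typing import List, Optional


def detect_cycle(actions: List[str], min_repeats: int) -> Optional[List[str]]:
    """Return the repeating block if the most recent actions consist of that
    block repeated at least `min_repeats` times in a row; otherwise None."""
    if min_repeats < 1:
        raise ValueError("min_repeats must be at least 1")
    n = len(actions)
    # Try the shortest candidate block first.
    for period in range(1, n // min_repeats + 1):
        block = actions[n - period:]
        window = actions[n - period * min_repeats:]
        if window == block * min_repeats:
            return block
    return None


# Illustrative action names only.
actions = ["copy", "next", "paste", "copy", "next", "paste"]
early = detect_cycle(actions, min_repeats=2)  # fires after two repetitions
late = detect_cycle(actions, min_repeats=3)   # still waiting for a third
```

A lower `min_repeats` corresponds to earlier appearance with less evidence behind the guess. A higher value delays the agent but makes a wrong guess less likely, which is the direction the study favored.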


Feedback into the design. Allen was present throughout the testing sessions as a silent observer. Shifteh felt that it would be valuable for him to experience those detailed aspects of user reactions that could not be captured on video or in a formal report.

At the end of each testing session we discussed our observations about users' responses and reactions. Our different orientations often led us to different interpretations of the data.

For example, Shifteh noticed that some users did not recognize that the green highlighting and the agent icon were two aspects of the same system. Allen, on the other hand, could not imagine that this misunderstanding could occur. He knew the system so well that this finding did not fit his perception of the program. He kept trying to explain the data in a different way to convince Shifteh that her observations were incorrect.

We tried to resolve our differences by reviewing the transcripts of the session. We found comments like these:

    • I was curious about him [the agent] but I didn't know exactly what it was and I was seeing some green things.

    • I thought it [the agent] was a button and the green [highlighting] seemed to have something to do with what I had done. The green came on at the same time; it showed what I was going to do next and the icon was gonna do the task for me.

To Allen, the fact that users had noticed the two features and talked about their connection was proof that they understood the relation. Our discussions revealed that users were verbalizing their reactions because they were having a difficult time understanding the relationship between the two interface features. As a result of this study, Allen changed the agent icon to appear in green, so that its connection with the green highlighting is more obvious.

Conclusion. Our experiences demonstrate how engineers and psychologists can go beyond organizational constraints and the personal biases of their disciplines and turn the design process around. Through our willingness to work together in conducting a user study, we learned about each other's worlds and clarified misperceptions about how to interpret user data.

We recognized the value of combining trace data with observational methods. This interplay of objective and subjective methods was particularly useful in our situation, where we had limited time and resources.

1. Kim, Scott. (1990). Interdisciplinary Collaboration. In Brenda Laurel (Ed.), The Art of Human-Computer Interface Design. Reading, MA: Addison-Wesley Publishing Co., pp. 31-44.

2. Cypher, Allen. (1993). Eager: Programming Repetitive Tasks by Demonstration. In Allen Cypher (Ed.), Watch What I Do: Programming by Demonstration. Cambridge, MA: MIT Press, pp. 467-484.

