Instrumental Interaction: An Interaction Model for
Designing Post-WIMP User Interfaces
This article introduces a new interaction model called Instrumental Interaction that extends and generalizes the principles of direct manipulation.
It covers existing interaction styles, including traditional WIMP interfaces, as well as new interaction styles such as two-handed input and augmented reality.
It defines a design space for new interaction techniques and a set of properties for comparing them.
Instrumental Interaction describes graphical user interfaces in terms of domain objects and interaction instruments. Interaction between users and domain objects is mediated by interaction instruments, similar to the tools and instruments we use in the real world to interact with physical objects.
The article presents the model, applies it to describe and compare a number of interaction techniques, and shows how it was used to create a new interface for searching and replacing text.
a. A likely reason is that integrating
new interaction techniques into an interface is challenging
for both designers and developers. Designers find it faster
and easier to stick with a small set of well-understood
techniques. Similarly, developers find it more efficient to
take advantage of the extensive support for WIMP
interaction provided by current development tools.
b. The leap from WIMP to newer "post-WIMP" graphical
interfaces, which take advantage of novel interaction
techniques, requires both new interaction models and
corresponding tools to facilitate development. This paper
focuses on the first issue by introducing a new interaction
model, called Instrumental Interaction, that extends and
generalizes the principles of direct manipulation to also
encompass a wide range of graphical interaction techniques.
c. The Instrumental Interaction model has the following goals:
• cover the state of the art in graphical interaction techniques;
• provide qualitative and quantitative ways to compare
interaction techniques, to give designers the basis for
an informed choice when selecting a given technique
to address a particular interface problem;
• define a design space in which unexplored areas can be
identified and lead to new interaction techniques; and
• open the way to a new generation of user interface
development tools that make it easy to integrate the
latest interaction techniques in interactive applications.
d. After a review of related work, this paper analyzes the
limits of current WIMP interfaces.
a. An interaction model is a set of principles, rules and
properties that guide the design of an interface. It
describes how to combine interaction techniques in a
meaningful and consistent way and defines the "look
and feel" of the interaction from the user's
perspective. Properties of the interaction model can
be used to evaluate specific interaction designs.
b. Direct Manipulation is a generic interaction model,
while style guides, e.g. Apple's guidelines, describe
more precise and specific models. Took introduced a model
called Surface Interaction and Holland & Oppenheim a
model called Direct Combination.
c. An interaction model differs from the architectural model of
an interface, which describes the functional elements in the
implementation of the interface and their relationships.
d. User interface development environments
have generated a variety of implementation models for
developing interfaces, e.g. the widget
model of the X/Motif toolkit or Garnet's Interactors.
e. Whereas architectural models are aimed at interface
development, an interaction model is aimed at interface design.
f. The model-based approach and its associated tools help
bridge the gap between interaction and architectural models
by offering a higher-level approach to the design of
g. Device-level models such as logical input devices or
Card et al.'s taxonomy operate at a lower level of
abstraction than interaction models. Understanding the role
of the physical devices in interaction tasks is a critical
component of the definition of the Instrumental Interaction model.
h. At the theoretical level, Activity Theory provides a
relevant framework for analyzing interaction as a mediation
process between users and objects of interest.
i. Finally, Instrumental Interaction is grounded in the large
(and growing) number of graphical interaction techniques
that have been developed in recent years, some of which are referenced in the rest of this article.
FROM WIMP TO POST-WIMP INTERFACES
a. The WIMP interaction model can be outlined as follows:
• application objects are displayed in document windows;
• objects can be selected and sometimes dragged and
dropped between different windows; and
• commands are invoked through menus or toolbars,
often bringing up a dialog box that must be filled in
before the command's effect on the object is visible.
b. This section uses Shneiderman's principles of direct
manipulation to analyze WIMP interfaces:
1. Continuous representation of objects of interest
Objects of interest are central to direct manipulation.
Principle 1 asserts
that objects of interest should be present at all times. Since
objects of interest are often larger than the screen or window
in which they are displayed, WIMP interfaces make them
accessible at all times through scrolling, panning or
Finally, there are more objects of interest than meet the
eye: in many applications users must manipulate secondary
objects to achieve their tasks.
2. Physical actions on objects vs. complex syntax
Most computers have only a mouse and keyboard as input
devices, limiting the set of user actions to: typing text or
This is conceptually no different from typing a command in
a command-line interface.
In both cases the syntax is complex and cannot be considered direct manipulation of the objects of interest.
In fact, WIMP interfaces directly violate principle 2 and often use
indirect manipulation of the objects of interest, through
(direct) manipulation of interface elements such as menus
and dialog boxes.
3. Fast, incremental and reversible operations with an
immediately-apparent effect on the objects of interest
4. Layered or spiral approach to learning
The small number of interaction techniques used by WIMP
interfaces makes it easy to learn the basics of any new
application. However, interaction shortcuts, such as
combining keyboard modifiers with mouse buttons to
activate the frequent commands, are concealed and
inconsistent across applications and make the transition
from novice to power user more difficult.
c. Towards a new interaction model
To guide interface designers, these models should be:
• descriptive, incorporating both existing and new
• comparative, providing metrics for comparing
alternative designs (as opposed to prescriptive,
deciding a priori what is good and what is bad); and
• generative, facilitating creation of new interaction techniques.
WIMP interfaces do not
follow the principles of direct manipulation. Instead, they
introduce interface elements such as menus, dialog boxes
and scrollbars that act as mediators between users and the
objects of interest. Users have a (limited) sense of
engagement, as advocated by direct manipulation, because
they manipulate these intermediate objects directly.
Our interaction with the physical world is
governed by our use of tools.
The Instrumental Interaction model is based on how we
naturally use tools (or instruments) to manipulate objects
of interest in the physical world. Objects of interest are
called domain objects, and are manipulated with computer
artifacts called interaction instruments.
a. Domain objects
In computer systems, applications operate on data that
represent phenomena or objects.
Domain objects have
attributes that describe their characteristics. Attributes can
be simple values or more complex objects. For example, in
a 3D modeller, the position and size of a sphere are simple
values (integer or real numbers), while the material of the
sphere is a complex entity (color, texture, transparency, etc.).
Materials and styles are therefore also domain
objects in their respective interfaces.
In summary, domain objects form the basis of the
interaction as well as its purpose: Users operate on domain
objects by editing their attributes. They also manipulate
them as a whole, e.g. to create, move and delete them.
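As an illustration only, domain objects with simple and complex attributes can be sketched in Python. The class names `Material` and `Sphere` and their fields are hypothetical; they are not taken from any system described in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Material:
    """A complex attribute that is itself a domain object."""
    color: str = "grey"
    texture: str = "smooth"
    transparency: float = 0.0


@dataclass
class Sphere:
    """A domain object in a hypothetical 3D modeller."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    radius: float = 1.0                                    # simple attribute
    material: Material = field(default_factory=Material)   # complex attribute


sphere = Sphere(radius=2.0)
sphere.radius = 3.0                    # edit a simple attribute
sphere.material.transparency = 0.5     # edit an attribute of a nested domain object
```

Because the material is a separate object, an interface could expose it for editing in its own right, which is what makes it a domain object rather than a plain value.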
b. Interaction instruments
An interaction instrument is a mediator or two-way
transducer between the user and domain objects. The user
acts on the instrument, which transforms the user's actions
into commands affecting relevant target domain objects.
Instruments have reactions enabling users to control their
actions on the instrument, and provide feedback as the
command is carried out on target objects.
An instrument decomposes interaction into two layers: the
interaction between the user and the instrument, defined as
the physical action of the user on the instrument and the
reaction of the instrument; and the interaction between the
instrument and the domain object, defined as the command
sent to the object and the response of the object, which the
instrument may transform into feedback to the user. The
instrument is composed of a physical part, the input device,
and a logical part, the representation of the instrument in
software and on the screen.
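The two layers can be made concrete with a minimal Python sketch of a scrollbar acting on a document. The classes `Document` and `Scrollbar`, their fields and the log messages are assumptions made for illustration; the model itself does not prescribe an implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Domain object: a document with a vertical scroll position."""
    length: int = 1000
    top: int = 0

    def scroll_to(self, top: int) -> int:
        """Command sent by the instrument; the return value is the response."""
        self.top = max(0, min(top, self.length))
        return self.top


@dataclass
class Scrollbar:
    """Logical part of an instrument; a mouse would be its physical part."""
    target: Document
    track_height: int = 100
    thumb: int = 0
    log: list = field(default_factory=list)

    def drag_thumb(self, mouse_y: int) -> None:
        # Layer 1: action on the instrument and its reaction (the thumb moves).
        self.thumb = max(0, min(mouse_y, self.track_height))
        self.log.append(f"reaction: thumb at {self.thumb}")
        # Layer 2: command to the domain object and the object's response.
        wanted = self.thumb * self.target.length // self.track_height
        actual = self.target.scroll_to(wanted)
        # The instrument turns the response into feedback for the user.
        self.log.append(f"feedback: showing from line {actual}")


doc = Document()
bar = Scrollbar(target=doc)
bar.drag_thumb(25)
```

Keeping the reaction and the feedback as separate steps mirrors the distinction drawn above between controlling the instrument and observing its effect on the target object.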
c. Activating instruments
An instrument is said to be activated when it is under the
user's control, i.e. when the physical part has been
associated with the logical part.
These two types of activation are quite different. The
activation of the scrollbar is spatial because it is caused by
moving the mouse (and cursor) inside the area of the
scrollbar. The activation of the rectangle creation
instrument is temporal because it is caused by a former
action and remains in effect until the activation of another
instrument. (This is traditionally called a mode). Each type
of activation has an associated cost: Spatial activation
requires the instrument to be visible on the screen, taking
up screen real-estate and requiring the user to point at it and
potentially dividing the user's attention. Temporal
activation requires an explicit action to trigger the
activation, making it slower and less direct.
Interface designers often face a design trade-off between
temporal and spatial multiplexing of instruments because
the activation costs become significant when the user must
frequently change instruments.
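The difference between spatial and temporal activation can be sketched as follows. This is a simplified illustration: the `Region` and `Workspace` classes are hypothetical, and the rule that a spatially activated instrument takes precedence over the current mode is an assumption of the sketch, not a claim of the model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Region:
    """Screen area occupied by a spatially activated instrument."""
    name: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


class Workspace:
    def __init__(self, regions: list) -> None:
        self.regions = regions
        self.mode: Optional[str] = None   # temporally activated instrument

    def select_tool(self, name: str) -> None:
        """Temporal activation: an explicit earlier action sets a mode."""
        self.mode = name

    def active_instrument(self, px: int, py: int) -> Optional[str]:
        """Spatial activation: pointing inside a region activates it.
        Elsewhere, the current mode remains in effect."""
        for region in self.regions:
            if region.contains(px, py):
                return region.name
        return self.mode


ws = Workspace([Region("scrollbar", 780, 0, 20, 600)])
ws.select_tool("rectangle")
```

The sketch also shows the two costs: the scrollbar needs a permanent screen region, while the rectangle tool needs a separate `select_tool` step before it can be used.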
d. Reification and Meta-instruments
Reification is a process for turning concepts into objects.
For example, a style in a text editor is the reification of a collection of text
Instrumental Interaction introduces a second type of reification: an interaction instrument is the reification of one or more commands.
The result of this reification rule is that instruments are
themselves potential objects of interest.
For example a pencil is a writing instrument and the
domain object is the text being written. When the lead
breaks, the focus shifts to a new instrument, a pencil
sharpener, which operates on the shifted domain object, the
pencil. The focus may even shift to the pencil sharpener,
if we need a screwdriver to fix it. Such "meta-instruments"
(instruments that operate on instruments) are not only
useful for "fixing" instruments, but can also be used to
organize instruments in the workspace, e.g. a toolbox, or to
tailor instruments to particular tasks, e.g. turning a power drill
into a power saw. In graphical user interfaces, common
meta-instruments include menus and tool palettes used to
select commands and tools, i.e. to activate instruments.
e. Properties of Instruments
An important role of an interaction model is to provide
properties to evaluate and compare alternative designs.
The goal of defining properties of instruments
is not to decide which instruments are good and which are
bad, but to evaluate them so that designers can make an
informed choice and so that researchers can identify and
explore areas of the design space that are not mapped by
The rest of this section introduces three properties of interaction instruments.
i. Degree of indirection
The degree of indirection is a 2D measure of the spatial and
temporal offsets generated by an instrument. The spatial
offset is the distance on the screen between the logical part
of the instrument and the object it operates on. Some
instruments, such as the selection handles used in graphical
editors, have a very small spatial offset since they are next
to or on top of the object they control. Other instruments,
such as dialog boxes, can be arbitrarily far away from the
object they operate on and therefore have a large spatial
offset. A large spatial offset is not necessarily undesirable.
The temporal offset is the time difference between the
physical action on the instrument and the response of the object.
For example, the arguments specified in a dialog box are taken into
account only when the OK or Apply button is activated. In
general, short temporal offsets are desirable because they
exploit the human perception-action loop and give a sense
of causality.
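As a rough sketch, the two components of the degree of indirection can be recorded as a pair of numbers. The `Indirection` class, the pixel coordinates and the delay values below are invented for illustration and are not measurements from this article.

```python
import math
from dataclasses import dataclass


@dataclass
class Indirection:
    spatial_px: float   # on-screen distance between instrument and object
    temporal_s: float   # delay between the action and the object's response


def spatial_offset(instrument_xy, object_xy) -> float:
    """Euclidean distance on the screen, in pixels."""
    return math.dist(instrument_xy, object_xy)


# A selection handle sits on the object and responds during the drag.
handle = Indirection(spatial_offset((100, 100), (100, 100)), 0.0)

# A dialog box may be far from the object and only takes effect on OK/Apply;
# the 4-second delay is an arbitrary stand-in for the time spent filling it in.
dialog = Indirection(spatial_offset((700, 500), (100, 100)), 4.0)
```

Treating the measure as a pair rather than a single score keeps the two offsets independent, so an instrument with a large spatial offset but a short temporal offset remains distinguishable from the reverse.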
ii. Degree of integration
The degree of integration measures the ratio between the
number of degrees of freedom (DOF) provided by the logical
part of the instrument and the number of DOFs captured by
the input device.
This term comes from the notion of
integral tasks: some tasks are performed more
efficiently when the various DOFs are controlled
simultaneously with a single device.
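The ratio can be computed directly. In the sketch below the function name is hypothetical, and the DOF counts are the usual ones: a scrollbar has one degree of freedom, two-dimensional panning has two, and a mouse captures two.

```python
from fractions import Fraction


def degree_of_integration(instrument_dof: int, device_dof: int) -> Fraction:
    """DOFs of the logical part divided by DOFs captured by the input device."""
    if device_dof <= 0:
        raise ValueError("the input device must capture at least one DOF")
    return Fraction(instrument_dof, device_dof)


scrollbar_with_mouse = degree_of_integration(1, 2)   # Fraction(1, 2)
panning_with_mouse = degree_of_integration(2, 2)     # Fraction(1, 1)
```

A ratio below 1 means some device DOFs go unused by the instrument; a ratio above 1 means the instrument has more DOFs than the device can control at once.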
iii. Degree of compatibility
The degree of compatibility measures the similarity
between the physical actions of the users on the instrument
and the response of the object. Dragging an object has a
high degree of compatibility since the object follows the movements of the mouse. Scrolling with a scrollbar has a
low degree of compatibility because moving the thumb
downwards moves the document upwards.
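The model describes compatibility qualitatively and gives no formula. One possible proxy, assumed here purely for illustration, is the cosine of the angle between the user's movement and the object's movement: +1 when they move together, -1 when they move in opposite directions.

```python
import math


def direction_similarity(action, response) -> float:
    """Cosine of the angle between the user's 2D movement and the object's."""
    dot = sum(a * r for a, r in zip(action, response))
    norm = math.hypot(*action) * math.hypot(*response)
    if norm == 0:
        raise ValueError("both movements must be non-zero")
    return dot / norm


# Dragging: the object follows the mouse.
drag = direction_similarity((3, 4), (3, 4))

# Scrollbar: the thumb moves down, the document moves up.
scroll = direction_similarity((0, 10), (0, -10))
```

This proxy only captures direction. It ignores other aspects of similarity, such as scale or the kind of movement, so it should be read as one way to make the two examples above comparable, not as a definition.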
APPLYING THE MODEL
a. Analyzing WIMP Interfaces
Contextual menus have a small spatial offset
and are therefore more efficient than toolbars and menu bars.
Toolbars, which can be moved next to their context of use,
have a better spatial offset than menu bars.
Dialog boxes are used for complex commands. They have a
high degree of spatial and temporal indirection.
Inspectors and property boxes are an alternative to dialog
boxes that have a lower degree of temporal indirection.
Handles are used for graphical editing and provide a very
direct interaction: low degree of indirection, high degree of
compatibility and good degree of integration.
b. Analyzing Post-WIMP Interaction Techniques
These systems use two categories of instruments:
• navigation instruments specify which part of the data to
visualize and how; and
• filtering instruments specify queries and display results.
A key aspect of these systems is a strong coupling between
user actions and system response. In other words, these
instruments must have a small temporal offset.
Both navigation and filtering are usually multi-dimensional
Graspable interfaces use physical objects as input
devices to manipulate virtual objects. In effect, they transfer
most of the characteristics usually found in the logical part
of the instrument into the physical part. This approach was
pioneered by Augmented Reality, which explores ways
to reconcile the physical and computer world by embedding
computational facilities into physical objects. Here, the
domain objects, in addition to the instruments, have a
strong physical component. This increases the degrees of
compatibility and integration since interaction occurs in the
c. Designing a Text Search Instrument
The design of the scrollbar improves the degree of
integration since it uses the vertical position of the mouse
to control the speed and direction of scrolling
This design also improves the degree of
compatibility since the mouse now works as a joystick. In
effect, the scrollbar thumb and arrows are functionally
equivalent: the thumb provides positional control while the
arrows provide rate control of the visible part of the document.
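The distinction between positional control and rate control can be sketched as follows. The function names, the `gain` parameter and the numeric values are assumptions made for this illustration; they do not describe the actual implementation of the scrollbar discussed here.

```python
def position_control(thumb_y: float, track_height: float,
                     doc_length: float) -> float:
    """Thumb: its position along the track maps directly to a document position."""
    fraction = min(max(thumb_y / track_height, 0.0), 1.0)
    return fraction * doc_length


def rate_control(top: float, mouse_offset: float, gain: float,
                 dt: float, doc_length: float) -> float:
    """Joystick-like use: the mouse offset sets a scrolling velocity,
    which is integrated over the time step dt."""
    velocity = gain * mouse_offset   # the sign gives the direction
    return min(max(top + velocity * dt, 0.0), doc_length)


# Jump to a quarter of the document, then scroll at a steady rate.
top = position_control(thumb_y=25, track_height=100, doc_length=1000)
for _ in range(3):
    top = rate_control(top, mouse_offset=20, gain=1.0, dt=0.5, doc_length=1000)
```

With positional control the result depends only on where the thumb is; with rate control it also depends on how long the offset is held, which is why the mouse behaves like a joystick in that mode.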
CONCLUSION AND FUTURE WORK
This article has introduced the Instrumental Interaction
model, which generalizes and operationalizes Direct
Manipulation. The model has been used to analyze WIMP
interfaces as well as more recent interaction techniques and
to design a new interface for searching and replacing text.
This demonstrates the descriptive, comparative and
generative power of the Instrumental Interaction model.
The other important area for future work is to make
Instrumental Interaction useful not only to user interface
designers but also to user interface developers by developing
a user interface toolkit based on the model.