Next Generation User Interfaces: Emerging Technologies and Directions
Author: Mike Bennett (smoog at techie dot com)
Last Update: 18/09/2000
The user interface on desktop computers has remained rather static over the years, with small evolutionary advances that tend more towards eye candy than actual usefulness. Most advancements that do occur are merely implementations of old ideas and don't adequately take into consideration modern user requirements, or the fact that people now have a greater familiarity with technology and so are more comfortable using it. Nor do the interfaces fully utilise the power of modern computers, which have memory and CPU cycles to waste.
To fully grasp the possibilities for user interfaces we look at ongoing research, examining how it might be deployed on real-world desktops and what advantages and disadvantages this could lead to. Some of the concepts require further research, such as 3 dimensional interfaces, while others, such as multi-scalar interfaces, are ready now.
Once the research is outlined, we speculate on possible ways of fitting the most useful aspects together. Various walkthroughs of potential interfaces are presented, which helps highlight some of the stronger and weaker interface methods.
This section outlines the various research fields that I believe have a lot of potential to make important contributions to user interfaces. Some of the projects are not directly usable but they provide a good framework upon which to build, others are already seeing some early stage deployment in commercial products.
Calm Technology (Weiser and Brown, 1996)
Also known as "Ubiquitous Computing" (Weiser), this is the concept of non-intrusive computing, "where the technology recedes into the background". It envisions a world where certain types of technology can sit on the "periphery" and do not require our full focus to be of use. This is achieved by utilising the human ability to notice things while we are paying attention to other tasks, e.g. while driving we notice that something is wrong with the car if the sound of the engine changes, even though we're focused on the road ahead and the surrounding traffic.
Another interesting example is the "Dangling String", created by the artist Natalie Jeremijenko, in which a piece of plastic spaghetti is hung from a motor on the ceiling. The speed of the motor is proportional to the amount of traffic passing through the local ethernet network. The more traffic on the network, the faster the plastic spaghetti spins, twisting it into a certain shape and creating a distinctive sound; if the motor spins slower the shape of the spaghetti is different, as is the sound. This can be hung within view of many people without being intrusive, becoming part of an individual's environment. If it stopped, individuals in its presence would feel something was wrong with their environment, notice what had changed and seek to rectify the problem, which in this case would be whatever caused the network to stop carrying traffic.
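The Dangling String's core trick is a simple mapping from an invisible quantity to a peripheral physical signal. A minimal sketch of that mapping in software, with invented names and scaling values (the original was a physical motor, not code):

```python
# Map a network traffic reading to an ambient indicator's speed.
# max_rate and max_rpm are made-up calibration values for illustration.

def motor_speed(packets_per_sec, max_rate=1000.0, max_rpm=120.0):
    """Scale traffic linearly to motor speed, clamped to the motor's range."""
    fraction = min(packets_per_sec / max_rate, 1.0)
    return fraction * max_rpm
```

Note that a calm display makes silence meaningful: a reading of 0 rpm is itself the signal that the network has stopped carrying traffic.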
Multi-scalar Interfaces
Multi-scalar interfaces don't force applications to assume certain predefined scales or points of reference. They free the user to completely define the scale and position of objects and data on the graphical user interface, in most cases the desktop. This allows users to make better use of screen space, application relationships and the visualisation of data.
In "Pad" (3, Perlin et al) this is built upon with the concept of portals, which can bring differently scaled applications and data together in various views, e.g. a graphing package and a spreadsheet sharing one view, while the same spreadsheet shares another view with a mathematical package and a data import package. A view can be quite intelligent and could be used not merely to group applications but also to act in a transformative manner, e.g. a graphing view could act on a pre-generated image displayed in a web browser and convert the information into different formats for people with other preferred ways of visualising information, i.e. pie charts instead of bar charts, etc.
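The coordinate arithmetic behind a multi-scalar surface is straightforward: every object lives in one shared world space, and each portal views that space through its own scale and offset. A sketch of that idea, with illustrative names rather than Pad's actual API:

```python
# Each portal renders the same world space at its own scale and position.

class Portal:
    def __init__(self, scale=1.0, offset=(0.0, 0.0)):
        self.scale = scale      # magnification of this view
        self.offset = offset    # world point shown at the view's origin

    def to_view(self, wx, wy):
        """World coordinates -> this portal's view coordinates."""
        ox, oy = self.offset
        return ((wx - ox) * self.scale, (wy - oy) * self.scale)

# The same world point appears at different sizes/positions in two portals:
overview = Portal(scale=0.5)
closeup = Portal(scale=4.0, offset=(100.0, 100.0))
```

Because two portals can point at the same world region, a spreadsheet can genuinely appear in several views at once, which is what makes the portal concept more than window duplication.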
Xerox PARC researchers have also worked on the concept of views, without the multi-scalar aspects, which they have called "Toolglass and Magic Lenses" (4, Bier et al, 1993). Apple's research group has also worked on a form of "Magic Lens", but they've approached it from a different perspective and call them "Data Detectors".
Realistically Apple's concept isn't as fully developed and is more of a solution to a particular problem, i.e. how to make the cutting and pasting of text more intelligent. With their solution, when you highlight text it is automatically parsed to find URLs, email addresses, etc. When matches are found, context-sensitive menus are displayed, e.g. if the highlighted text contains a URL then there's an option to display it in Netscape, etc.
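The parsing half of the Data Detectors idea can be sketched with ordinary regular expressions: scan the selection for recognisable patterns, and let each match drive a context menu. The patterns below are simplified illustrations, not Apple's actual detectors:

```python
import re

# Simplified patterns; real detectors handle many more formats.
PATTERNS = {
    "url": re.compile(r"https?://[^\s]+"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
}

def detect(text):
    """Return (kind, match) pairs found in the selected text."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((kind, m.group()))
    return hits
```

Each `(kind, match)` pair would then be mapped to actions: a `url` hit offers "open in browser", an `email` hit offers "compose message", and so on.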
3 Dimensional Interfaces
3 Dimensional interfaces have received a lot of attention from diverse groups, ranging from sci-fi authors to groups researching and developing potential interfaces such as 3Dsia, Verse, 3Dwm and Microsoft's Task Gallery.
The definition of a 3D interface varies from research group to research group. The types under development include, but are not limited to:
Other variations on the theme include augmented realities, which allow the real-time projection of virtual environments onto the physical world. As an example of its possible use, a plumber wearing special glasses could look at a room and see where all the pipes (virtually overlaid) run, even when they're hidden behind walls, etc.
3D environments also require a rethinking of how we interact with the desktop, i.e. what type of replacement is used for the mouse, since the mouse is very much a tool for a 2D environment. A discussion of this can be found in "Elements of a Three-dimensional Graphical User Interface" (10, Leach et al, 1997).
Information Visualisation
User interfaces are used to convey information, so the nature of how the information is displayed is important. The key problem is how to visually represent a lot of, often vaguely, related data so that it can be navigated and understood reasonably easily.
Clearly with the advent of the Web, hypertext has come into vogue as a possible answer, but it has serious limitations, such as keeping track of where you were, allowing the inclusion of notes, etc. An interesting solution to both those problems is the concept of "Fluid Documents", where the content of related links can be inserted into the middle of the currently viewed web page. This wouldn't really work with the Web as a whole, but it certainly has its applications within a well structured and designed web site.
Another important aspect to consider when designing good visual systems is the fact that people have different ways of modelling data relationships, so what could work very well for one person may be an utter mess for another.
Some interesting developments in this area include the "Visual Thesaurus", a very graphical way of demonstrating the relationships between words while also allowing easy navigation, i.e. you enter a start word, which then appears in a 3 dimensional space surrounded by related words linked by thin lines to the parent. The words can be made to spin around the parent, which helps when there are a lot of relationships, and the emphasis on word types (verb, noun, etc) can be represented by the strength of the lines. Another implementation of the same concept can be found at "The Brain", but this is a commercial 2 dimensional version that isn't restricted to representing word relationships.
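Underneath displays like these sits a weighted graph: nodes are words, and each edge carries the kind of relationship plus a weight that could drive the visual strength of the connecting line. A sketch of that structure, with an invented vocabulary:

```python
from collections import defaultdict

# A graph of word relationships; edge weights could map to line thickness.
class WordGraph:
    def __init__(self):
        self.edges = defaultdict(list)

    def relate(self, word, other, kind, weight=1.0):
        """Record that `word` relates to `other` (e.g. as a synonym)."""
        self.edges[word].append((other, kind, weight))

    def neighbours(self, word):
        """Words to draw around `word`, with line-style information."""
        return self.edges[word]

g = WordGraph()
g.relate("bright", "luminous", "synonym", weight=1.0)
g.relate("bright", "clever", "synonym", weight=0.6)
```

Navigation is then just re-centring: clicking a neighbour makes it the new parent and its own `neighbours` list becomes the next ring of words.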
Gesture Based Interfaces
Gesture based computing is still in its infancy, for a variety of reasons including the complexity of training a computer to recognise 2 and 3 dimensional gestures.
At the moment most PDAs (personal digital assistants) provide pen-based gesture input, i.e. you draw a certain type of 'g' to input a 'g'. How the gesture is drawn, the directions taken, etc are all very relevant to enabling character recognition. An evolutionary development is to associate shapes/gestures with actions as part of the standard desktop, which is under development as part of wayV, e.g. draw an N and Netscape starts, draw a C and a calculator starts, etc.
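One common way such recognisers work, in the spirit of wayV though not its actual algorithm, is to reduce a stroke to a sequence of compass directions and look that sequence up in a table of known shapes. A sketch with an invented encoding and gesture table:

```python
# Reduce a stroke to direction letters (U/D/L/R), collapsing repeats.
# y grows downward, as in typical screen coordinates.

def encode(points):
    dirs = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if abs(dx) >= abs(dy):
            step = "R" if dx > 0 else "L"
        else:
            step = "D" if dy > 0 else "U"
        if not dirs or dirs[-1] != step:
            dirs.append(step)
    return "".join(dirs)

# Hypothetical table: an 'N' stroke (up, down-right, up) starts a browser.
GESTURES = {"URU": "start browser"}

# An 'N' drawn as points: up the left side, diagonal, up the right side.
stroke = [(0, 10), (0, 0), (5, 5), (10, 10), (10, 0)]
```

The hard parts this sketch skips, sloppy diagonals, scale and speed variation, are exactly why recognising even 2 dimensional gestures remains a research problem.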
An interesting area is the inputting of gestures via other means, especially cameras. This is becoming more possible with the advent of cheap cameras that have strong market penetration, i.e. webcams. Matthew Turk of Microsoft's research group has some interesting work going on in this area. Of course one of the questions that arises is: what exactly is the usefulness of such developments?
Voice Recognition
One of the fields that has presented a lot of challenges over the years is voice recognition. While it's not completely perfect yet, and won't be for a considerable time to come, it has more than reached the stage where it's useful and used in real-world situations, ranging from mobile phones to voice dictation in word processors.
Since the technology is reaching a mature stage a lot of consideration needs to be paid as to how it should be integrated, used and deployed. One group considering this is the Speech Interface Group at MIT .
Visual Programming
Visual programming is a newly emerging field, so much so that there isn't even a consistent name for it. It also goes by the names of "Programming by Demonstration", "Programming by Example" and "Demonstrational Interfaces".
The concept is simple but the implementation aspects are very complex. The idea breaks down into two main parts: the computer noticing repetitive user actions and offering to generalise them, and the user building programs by demonstrating how actions should occur.
An example of the first part occurring: if you delete a .pdf file in a folder and follow that by deleting another .pdf in the same folder, the computer should be intelligent enough to ask whether you would like to delete all the .pdf files in that folder.
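The watching-and-generalising half can be sketched very simply: record the user's actions, and when two consecutive ones share a verb and a file extension, propose the generalisation. The event format and rule here are made-up illustrations; real systems infer far richer patterns:

```python
# Watch an action history and offer to generalise a repeated action.
# Each action is a (verb, filename) pair -- a hypothetical event format.

def suggest(history):
    """Propose a generalisation after two matching consecutive actions."""
    if len(history) < 2:
        return None
    (verb1, file1), (verb2, file2) = history[-2:]
    ext1 = file1.rsplit(".", 1)[-1]
    ext2 = file2.rsplit(".", 1)[-1]
    if verb1 == verb2 and ext1 == ext2:
        return f"{verb1} all remaining .{ext1} files in this folder?"
    return None
```

The complexity the sketch hides is deciding which features of the demonstration matter: was it the extension, the folder, the file's age? Getting that inference right is the hard research problem.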
The second part is where the idea is really revolutionary: you can develop computer programs by demonstrating how actions should occur and, if need be, build animated graphical representations that are computer programs themselves. "Pictorial Janus" and "ToonTalk" are implementations of these concepts and are worth looking at.
There are many other fields in user interface research at the moment but the ones above are the areas I feel have the most potential to change the way we understand computers and how we perceive and interact with data.
Other obvious areas include eye-aware interfaces (Edwards, 1998), which are more of a subset of gesture-based computing and would have limited applications; hardware devices for other forms of input that simultaneously give tactile feedback; and 3 dimensional display devices.
Advantages and Disadvantages
In this section we deal with each of the above-mentioned areas, apply some critical evaluation and envision what impact they could have on the desktop experience.
Calm Technology
This is more of a concept than a particular application; it's a philosophy on how we should possibly design interfaces. That said, there have been some attempts to develop "Calm" applications; in particular I draw attention to "LavaPS", written by John Heidemann.
"LavaPS" displays the running processes on a Unix computer via a graphical interface. The processes are represented as blobs of colour; the size of a blob, the speed at which it moves and its colour all tell you something about the state of the process it is associated with. From the personal tests I've carried out, I find that "LavaPS" does run on your desktop in a non-intrusive manner, but when you pay attention to it, understanding what is going on turns out to be hard at anything but the most basic level.
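The kind of mapping LavaPS performs can be sketched in a few lines: process statistics become visual attributes of a blob. The formulas below are invented for illustration; LavaPS's real rules differ:

```python
# Translate process statistics into blob attributes.
# The scaling constants are made-up illustrations, not LavaPS's values.

def blob(mem_bytes, cpu_fraction):
    """Map memory use to blob area and recent CPU use to movement speed."""
    return {
        "area": mem_bytes / 1024,     # bigger process, bigger blob
        "speed": 5.0 * cpu_fraction,  # busier process, faster blob
    }
```

The difficulty I found in practice follows directly from such mappings: each one is easy to read in isolation, but decoding several attributes at once, across dozens of blobs, demands exactly the focused attention a calm display is meant to avoid.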
Multi-scalar Interfaces
This is a technology which we should really see as part of the desktop now. People often run many different applications simultaneously but don't need all of them full-sized the whole time; they do, however, need the ability to keep an eye on what's going on with them.
3 Dimensional Interfaces
At the moment, and for the next while, 3 dimensional user interfaces as part of the desktop are going to remain beyond a lot of users. There are a variety of reasons for this, including, but not limited to:
Information Visualisation
Data visualisation has clearly come of age; the rapid growth of the Web demonstrates this. Using the techniques mentioned above, we should move on to more graphical ways of representing data, for navigating our file systems, our collections of documents, etc.
How we perceive data shapes how we think and if we all are forced to perceive data the same way we simply will miss some obvious solutions to problems.
Gesture Based Interfaces
There is a market for various forms of gesture interfaces, primarily PDAs, but the desktop offers a lot of potential development in this area. It's much neglected, even from the point of view of just using the mouse as the gesture input device.
As a research area it offers a lot of interesting challenges, ranging from finding the optimal gestures for the symbolic representation of actions to trying to recognise 3 dimensional gestures.
Voice Recognition
Humans are obviously used to speaking as one of our primary communication tools, so there should be widespread adoption of voice interfaces in all types of technology and business situations. Growth should be especially obvious in devices where keyboard and pen input would be clumsy or problematic; this includes PDAs, ATMs, video recorders, etc.
Visual Programming
As it currently stands, visual programming could be applied to the standard desktop, but the capabilities and possibilities have still to be discovered and defined, so it would be rather premature.
Eye-aware applications are currently used as part of user interface usability testing. Whether they could have widespread applications is questionable; certainly they would add an extra element to computer games.
Tactile feedback devices will have their time, but at the moment further consideration is required to work out where they'd be useful, apart from virtual reality.
So where can the desktop go from here? Well, if some of the above concepts were deployed together, things could get very interesting and useful:
Multi-scalar graphical user interface with voice recognition
Visual relationships and gesture recognition via a camera
Calm computing with a demonstrational-interface form of visual programming
3 dimensional graphical user interface, a cross between 3Dwm and Microsoft's Task Gallery, with eye tracking abilities
As I've shown, there is a great deal of ongoing research into user interfaces, of which I've only covered a limited amount, but the research is not just re-inventing the wheel. Significant advances are being made; it's anyone's guess what the future holds, but it's definitely going to be interesting.
A big question is why the user interface has remained static for so long. Why must we continue waiting for Microsoft and co to introduce the above research into the real world? Yes, some of it would involve the end user learning new things, but the transition would be nothing compared to learning the user interface differences between Windows 3.1 and Windows 95, which many non-technical users managed very successfully.
One very important point is worth making: the amount of serious user interface research in Ireland is disappointingly low. There is the "Human Factors Research Group" in University College Cork dedicated to it, but their focus is more on usability testing and metrics than new interface development.
Ultimately the user interface is about how we relate to data, how we perceive it, how we manipulate it and how we transform it.
References
(Weiser and Brown, 1996) "The Coming Age of Calm Technology", Mark Weiser and John Seely Brown, Xerox PARC, 1996, http://nano.xerox.com/hypertext/weiser/acmfuture2endnote.htm
(Weiser) Mark Weiser, http://www.ubiq.com/hypertext/weiser/UbiHome.html
(Perlin et al) "Pad", Ken Perlin, Prof. Jack Schwartz and Jonathan Meyer, New York University Media Research Lab, http://www.cat.nyu.edu/projects/pad.html
(Bier et al, 1993) "Toolglass and Magic Lenses: The See-Through Interface", Eric A. Bier, Maureen C. Stone, Ken Pier, William Buxton, Tony D. DeRose, Xerox PARC, 1993, SigGraph 93, http://www.parc.xerox.com/istl/projects/MagicLenses/93Siggraph.html and http://www.parc.xerox.com/istl/projects/MagicLenses/
Data Detectors, Apple, http://www.apple.com/applescript/data_detectors/
3Dsia, http://threedsia.sourceforge.net
Verse, http://www.obsession.se/verse/
3Dwm, Chalmers Medialab, http://www.3dwm.org
Task Gallery, Microsoft's Research Group, http://research.microsoft.com/ui/TaskGallery/
(Leach et al, 1997) "Elements of a Three-dimensional Graphical User Interface", Geoff Leach, Ghassan Al-Quimari, Mark Grieve, Noel Jinks, Cameron McKay, Interact 97, http://goanna.cs.rmit.edu.au/~gl/research/HCC/interact97.html
Fluid Documents, Xerox PARC, http://www.parc.xerox.com/istl/projects/fluid/
Visual Thesaurus, http://www.plumbdesign.com/thesaurus/
The Brain, http://www.thebrain.com
wayV, http://wayv.sourceforge.net
Gesture Recognition, Microsoft's Research Group, http://www.research.microsoft.com/users/mturk/gesture_recognition.htm
Speech Interface Group, MIT, http://www.media.mit.edu/speech/
Programming by Example and Programming by Demonstration, MIT, http://lieber.www.media.mit.edu/people/lieber/PBE/
Demonstrational Interfaces, Carnegie Mellon, http://www.cs.cmu.edu/~bydemo/
Pictorial Janus, C-lab and Heinz Nixdorf Institut, http://jerry.c-lab.de/~wolfgang/PJ/
ToonTalk, http://www.toontalk.com
(Edwards, 1998) "A Tool for Creating Eye-aware Applications that Adapt to Changes in User Behavior", Gregory Edwards, Advanced Eye Interpretation Project, Stanford University, ASSETS 98, http://eyetracking.stanford.edu/assets/assets.html
Volumetric 3-D Display Technology, Actuality Systems, http://www.actuality-systems.com
LavaPS, http://www.isi.edu/~johnh/SOFTWARE/LAVAPS/index.html
Human Factors Research Group, University College Cork, http://www.ucc.ie/hfrg/