The problem with WIMP interfaces on wearables doesn't lie in the interface itself. After all, in spite of some criticism, WIMP is a clear step above command-line interfaces for desktop computers. The problem is that WIMP makes several assumptions about the user's situation that are generally untrue for a wearable computer user.
1. The user can devote fine motor control to pointing

When I think about it too hard, it amazes me that people can use mice, trackpoints, or other pointing devices at all. The skill requires fine motor control to steer a spot through continuous space across a screen, sometimes to pixel accuracy. Still, after several years of experience most of us have gotten quite good at mousing, at least on a flat, stationary surface (even with an HMD; see the work done at the University of Oregon). The task becomes quite a bit harder, however, when both the pointing device and the display are being jostled. Even the minor vibrations from walking are enough to cause serious problems, especially when the pointer target is small due to limited screen real-estate.
2. The user has screen real-estate to burn
WIMP assumes a certain amount of screen real-estate is available for the interface alone. At the very least, it expects room for window borders, top-level menus, and icons on the screen. Even on a desktop computer with a 21" monitor, information overload is becoming a serious problem. The issue becomes all the more difficult with current HMD screens, which, even with improving technology, leave us wanting more information-space.
3. Our digital information is our primary task
This is the most important wrong assumption when it comes to wearable computers. WIMP interfaces assume that interacting with your computer is your primary task, and anything else will take a back seat to that task. This is why WIMP can get away with requiring a user to keep his eyes on a cursor all the way from its current position to a target icon. This is why WIMP can expect a user to search for a menu item, button, or icon on the desktop.
Unfortunately, wearables users aren't often afforded this luxury. Sometimes the problem is simply one of distractions. Because a wearable's environment is more hostile than a warm office, the user will often have extra demands on their attention from the outside world. These distractions may be as simple as wind or traffic noise, or may be more demanding issues like oncoming traffic when crossing the street.
More important than distractions is that wearable computer applications tend not to be the user's primary task. Admittedly this is an overly broad statement considering the wide assortment of applications for wearable computers. However, I would argue that new applications for wearables, i.e., those that cannot be equally well implemented on a palmtop or laptop computer, will primarily be secondary-task applications. Consider that one of the chief advantages of wearable computers is that they are available everywhere, all the time. Any time an application requires the user's primary attention, the full availability of the wearable is wasted. For example, a standard desktop application like a spreadsheet could be ported to a wearable computer or to a palmtop computer like the Apple Newton. It is not clear, however, why one would prefer the wearable version over the palmtop, since working on a spreadsheet gains nothing from being performed in a hands-free, eyes-free, or otherwise wearable environment. On the other hand, an Emergency Medical Technician cannot use a palmtop computer to log patient statistics because she is busy carrying the patient, taking vital signs, and performing other tasks. The reason she needs a wearable over any other kind of computer is that her interactions with the data must be a secondary task. In especially hostile environments, being distracted from this primary task can actually be fatal, either to the wearable user or to those around her.
Output modalities fall along a similar continuum. At one end lies agent technology, where software automatically acts on your behalf without any user intervention. Software that automatically responds to messages with an "I'm busy" reply is an example of a very simple agent. Full text or multimedia presentation falls at the other end of the spectrum, requiring full user attention to take in information. In between are what have been called ambient interfaces. These output interfaces fall at the periphery of attention, like rain on the windowsill. An example would be an interface that produces a quiet click whenever a message goes to a chat room. Normally the interface could be ignored entirely, but a sudden storm of conversation would be noticed, as would the sudden absence of any conversation.
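To make the ambient end of the spectrum concrete, here is a minimal Python sketch of that chat-click idea. The names here (ambient_chat_monitor, play_click, and the chat_messages stream) are hypothetical stand-ins for illustration, not part of any real wearable or chat API.

    import time

    def play_click():
        """Emit a quiet, ignorable cue (stubbed here as a terminal bell)."""
        print("\a", end="", flush=True)

    def ambient_chat_monitor(chat_messages):
        """Turn each incoming chat message into a peripheral click.

        A steady trickle of clicks fades into the background, while a
        sudden burst of them (or a long silence) is noticed without ever
        demanding the user's focused attention.
        """
        for _ in chat_messages:      # one click per message; content is ignored
            play_click()
            time.sleep(0.05)         # keep back-to-back clicks distinguishable

    # A stand-in burst of messages for demonstration:
    ambient_chat_monitor(["hi", "anyone around?", "yes", "meeting at three"])

The point of the design is that the interface emits one cue per event and otherwise stays silent, so only a change in the rhythm of clicks ever rises to the level of conscious attention.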
Input:   passive sensors              low-load input                  direct-manipulation
         <-X---------------------------------X---------------------------------X->

Output:  agents                          ambient                       full multimedia
         <-X---------------------------------X---------------------------------X->

         no cognitive load                                      high cognitive load
When designing interfaces along this spectrum there is a natural trade-off between receiving or conveying necessary information and overwhelming the user with too much of it. Optimally an interface should start with a low cognitive load (and correspondingly low information content), and ramp up to higher load and content as the situation warrants. For example, suppose a wearable navigation aid is guiding me to Bill's house. It might passively watch my position, requiring no action from me on either input or output. Any time I come to an intersection offering a route from where I started toward a way-point, my agent would notice this fact and paint an arrow over my visual field pointing in the direction of that way-point. This, in effect, would move my output from the agent level (handled entirely by the software) to the ambient level. If I absolutely need more information, I might hit a single button indicating I want more detail about the current overlay (a low-load input), which would then bring up a full description of the directions to Bill's house. At this point I would be fully concentrating on my wearable, and would probably want to stop walking for fear of running into a lamp post.
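As a rough sketch of this escalation, the following Python fragment models the three load levels as states. NavigationAid, draw_overlay_arrow, and show_full_directions are hypothetical names invented for this example, not part of any actual navigation system.

    AGENT, AMBIENT, FULL = "agent", "ambient", "full"   # the three load levels

    def draw_overlay_arrow(toward):
        # Stand-in for painting an arrow over the user's visual field.
        print(f"[ambient] arrow overlay pointing toward {toward}")

    def show_full_directions(route):
        # Stand-in for a full-attention display of the remaining route.
        print(f"[full] complete directions: {' -> '.join(route)}")

    class NavigationAid:
        def __init__(self, route):
            self.route = route       # ordered list of way-point names
            self.level = AGENT       # start at the lowest cognitive load

        def on_position(self, at_intersection, next_waypoint):
            # Agent level: watch position silently and escalate to an ambient
            # arrow only when an intersection offers a route toward a way-point.
            if self.level == AGENT and at_intersection and next_waypoint:
                self.level = AMBIENT
                draw_overlay_arrow(toward=next_waypoint)

        def on_more_info_button(self):
            # Low-load input: a single button press escalates to full attention.
            self.level = FULL
            show_full_directions(self.route)

    # The aid stays silent until an intersection, then paints an arrow; one
    # button press brings up the full directions.
    aid = NavigationAid(["Main St", "Oak Ave", "Bill's house"])
    aid.on_position(at_intersection=True, next_waypoint="Oak Ave")
    aid.on_more_info_button()

The escalation only ever moves upward in response to either context (an intersection) or an explicit, low-load request from the user, which is exactly the ramp from no load to high load described above.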