The existing interface technologies that came from the world of desktop and laptop computers are badly suited for many categories of smart devices.

The main problem with mobile devices is their miniature size, which leads to the demise of traditional input technologies based on a physical keyboard, mouse, pointing sticks, etc. Even the screen size limits touchscreen capabilities, because the "point and tap" approach doesn't work as well with a finger. What was once good and acceptable for desktop and laptop computers is now either inconvenient or almost useless for the wide range of modern mobile devices.

There is a completely new platform called "wearable devices". The term "wearables" is very loose, since many devices that fall under this category cannot be physically worn. The term was coined to separate a new family of devices from existing phones and tablets, and since the first such devices (e.g. watches, glasses) were actually worn, the name stuck.

What are "wearable" devices? The main point is that, for the user, interacting with the device is NOT the primary task: the user may be doing something more important at the time, something unrelated to the device's function. That is what separates "wearables" from all other computing devices. Because of this role, a wearable device has a limited set of input operations.

The "limited set of input operations" is key to understanding the core nature of "wearable" devices. For example, an astronaut may wear thick gloves while operating a touchscreen device attached to a sleeve. A pilot or driver, busy at the wheel, might need to perform a certain function on the device. All the limitations come from the PRIMARY operations a user performs, which aren't necessarily related to the device itself.

The main challenge with "wearables" is that the existing interface technologies don't work for them, even though those technologies remain quite workable on smartphones and tablets.

Based on this, an embedded car navigation system can also be considered a full-fledged "wearable" device, though nobody "wears" a car. It is very unfortunate that the term itself is confusing, because it blocks us from seeing the broader picture of the phenomenon.


Consumer Interaction With Device

One of the main issues with the concept of "user interface" is its perceptual fragmentation: different people see and discuss different parts of this "iceberg". While consumers see the snowy tip, developers see the dark underwater mass.

There is a plethora of terms, such as User Interface (UI), Human Interface (HI), Computer Interface (CI), User Experience (UX), Human-Computer Interaction (HCI), Man-Machine Interaction (MMI), and Computer-Human Interaction (CHI), to name a few. That's a clear sign that something's not right here, since none of them is good enough to reflect the WHOLE PICTURE. For example, the term "UX" can work for people who know or care little about the underlying technologies, but it's not enough for application developers to hear that "your UX is better than their UX", or "your UX is different from their UX". Sometimes it's important to understand WHAT the difference is, and WHY it's better. Does that mean "UX" is a bad term? No, it's just a narrow view of one side of a much larger picture. But before looking at the whole picture, it makes sense to introduce a term that represents the stack of layers that makes an interface work. As Steve Jobs put it: "Design is not just what it looks like and feels like. Design is how it works."

Instead of the generic "User Interface" term, we'll use another one: CIWID (Consumer Interaction With Device).

There are many reasons why this term is more relevant than its "User Interface" counterpart.

First, what is acceptable for a "user" (say, an engineer who operates hundreds of buttons on a dispatcher control board) is not always acceptable for a "consumer", who, by definition, isn't expected to be a computer-savvy professional.

Second, in the computer world, "interface" often implies a set of predefined functionality. It can be a functional interface in the form of an abstract class, or it can be a graphical interface in the form of menus and dialogs with fixed layouts. But the main point here is the presence of predefined functionality (commands, methods, buttons, etc). Meanwhile, some modern systems, such as Apple's Siri, have no set of predefined commands and are capable of figuring "things out" by themselves, which is closer to "interacting" than to "complying with an interface". Therefore, "interaction" is a broader term than "interface", and it's more relevant to forthcoming technologies.
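The "predefined functionality" distinction can be illustrated with a minimal sketch (the class and method names below are ours, purely for illustration): a classic interface fixes its set of operations up front, whereas an interaction has to interpret arbitrary input on the fly.

```python
from abc import ABC, abstractmethod

# A classic "interface": the set of operations is fixed in advance.
class MediaPlayerInterface(ABC):
    @abstractmethod
    def play(self): ...
    @abstractmethod
    def pause(self): ...
    @abstractmethod
    def stop(self): ...

# An "interaction" has no fixed command list: it must interpret
# whatever the consumer says and figure "things out" by itself.
def interact(utterance: str) -> str:
    if "play" in utterance.lower():
        return "playing"
    return "let me figure that out..."
```

The point of the contrast: a caller of `MediaPlayerInterface` can only ever invoke `play`, `pause` or `stop`, while `interact` accepts any utterance at all.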

Third, the "device" part is also extremely important here. Very soon, we'll see a range of "smart" apparatuses that will no longer resemble computers as we know them. Interacting with such devices could bring completely new communication challenges.

In simple words, CIWID is not only about WHAT a consumer sees (GUI), or HOW he or she mentally perceives the functionality (UX); it's also about WHAT LIES BENEATH (i.e. the nuts and bolts of the underlying functionality).


Six Layers of CIWID

By knowing the "nuts and bolts" of CIWID, one can not only improve the quality of his or her software products, but can also tell which CIWID is better, worse, or different. One can also tell WHY it's better, worse or different.

Let's consider the following stack of layers that forms CIWID, from the bottom up. Each layer will get a nickname to simplify references.

1. "BARE BONES" encompasses the underlying concepts (such as "WIMP") and metaphors (such as "desktop", "paper document" and "folders") that are based on available conduits of interaction (such as a keyboard, mouse or touchscreen).

2. "OS MEAT" represents the functionality an OS provides for programming the interaction: UI frameworks (such as .NET or Cocoa), component technologies (such as ActiveX/COM), the event/message firing and handling system with its corresponding APIs, etc.

3. "OS SKIN" represents the "cosmetic" components of the OS functionality: the dynamics of window generation/minimizing/closing; menu, dialog and cursor appearance and animation; color/graphics themes; predefined control layouts (task bars, minimize/maximize/close buttons); as well as the appearance and dynamics of visual components that can be changed or configured without changing the underlying "OS MEAT".

4. "APP MEAT" represents the functionality that the application adds or overrides on top of the existing "OS MEAT". This layer is not mandatory, because an application can simply use the functionality provided by "OS MEAT". But in many cases the default "OS MEAT" functionality needs an overhaul: to present interface elements (dialogs, menus, etc.) more efficiently, to render interface components in an original, non-standard way, or to provide a basis for porting the application to another OS with completely different "MEAT". Great examples are the Safari and iTunes applications for Windows.

5. "APP SKIN" is an extension of "OS SKIN". This layer isn't mandatory, either; an application can use the existing functionality provided by the "OS SKIN" (Windows Notepad is a good example). When creating custom controls with original graphics and animation, which is commonplace in commercial software development, a developer is actually overriding the "OS SKIN" with an "APP SKIN".

6. "CONSUMPTION" is the way a consumer perceives the application's functionality. In a nutshell, it's about how the application's features (based on levels 1-5) map into the consumer's physical capabilities, skills, expectations and prior experiences.
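The stack above can be sketched as a simple ordered structure (the nicknames are from the text; the helper function is our illustration), which makes it easy to state precisely which layers a given feature touches:

```python
# The CIWID stack, bottom-up, using the nicknames introduced above.
CIWID_LAYERS = [
    "BARE BONES",   # underlying concepts and metaphors (WIMP, desktop, folders)
    "OS MEAT",      # OS-level interaction frameworks and APIs
    "OS SKIN",      # "cosmetic" OS components (themes, animations, layouts)
    "APP MEAT",     # functionality the app adds/overrides (optional layer)
    "APP SKIN",     # custom controls and graphics (optional layer)
    "CONSUMPTION",  # how the consumer perceives the functionality
]

def layers_touched(feature_layers):
    """Return the layers a feature affects, in bottom-up stack order."""
    return [layer for layer in CIWID_LAYERS if layer in feature_layers]
```

For instance, a custom-drawn control that also hooks the OS event system would report `["OS MEAT", "APP SKIN"]`, bottom-up, regardless of the order you listed them in.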

How does the notion of CIWID help? First, as with food, knowing the amount of protein, fiber and carbs in a dish helps you compare it to a different dish. Second, it shows what direction interface development needs to head in. Third, CIWID provides a more holistic picture, because it encompasses the other terms used in this context, such as GUI (which corresponds to the "OS SKIN" and "APP SKIN" levels) and UX (which corresponds to the "CONSUMPTION" level, with little attention to technicalities).

Why is it so important to separate the "MEAT" from the "SKIN" on both the "OS" and "APP" levels? To demonstrate the importance, imagine someone developed a Windows application that completely simulates the look of an existing Apple application, down to the minor details, including the global menu and the round minimize/maximize/close buttons. To do so, one needs to override the default Windows-related "OS SKIN" with a custom "APP SKIN". Now we have two applications running: the original Apple application and a simulated Windows clone. Technically speaking, the GUIs of both applications are equivalent, right? BUT if you grab a window by its caption and quickly move it around, you'll instantly see the difference: the Windows-based application flickers as its window moves, because of the core nature of the Windows rendering mechanism, i.e. the Windows "MEAT". This visual "difference" affects the "CONSUMPTION", or user experience, level.


What Is the Reason for the Obvious Stagnation?

The main problem with modern interface technologies is that they all have the WIMP concept as their "BARE BONES" basis, i.e. they are all built around WIMP-based CIWID.

WIMP stands for "Windows, Icons, Menus, Pointer", and it uses these elements to denote a style of interaction. Merzouga Wilberts coined the abbreviation in 1980.

A crucial component of the WIMP concept is the letter "P", which stands for pointing devices (mouse, trackball, joystick, pointing stick, dial) and touchscreen-related functionality. Pointing devices rely on a hovering cursor/pointer/selector: functionality that allows a user to move and position a cursor prior to selecting a command or a character on a virtual keyboard. Note that touchscreen-based devices lack that hovering functionality; a user has to "point and tap" a command or a character on a virtual keyboard with a stylus or a finger. It may look a little different, but conceptually it's the same thing -- selecting an interface item "by its POSITION".

Therefore, the letter "P" in WIMP is about "selecting an interface element by its position", either with a "hover and click" or a "point and tap" approach. Everything that can be done with a touchscreen can be done with a physical mouse and keyboard. The entire Metro interface was based on that notion.
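The "selecting by POSITION" idea boils down to hit-testing: the interface only ever needs a pair of coordinates, regardless of whether they came from a hovering cursor or a finger tap. A minimal sketch (the element names and layout are ours, for illustration):

```python
# Each interface element occupies a rectangle on the screen.
ELEMENTS = {
    "OK":     (10, 10, 90, 40),     # (left, top, right, bottom)
    "Cancel": (110, 10, 190, 40),
}

def hit_test(x, y):
    """Return the element at (x, y): the essence of the 'P' in WIMP.

    A mouse "hover and click" and a finger "point and tap" both reduce
    to this call; only the way the coordinates are produced differs.
    """
    for name, (left, top, right, bottom) in ELEMENTS.items():
        if left <= x <= right and top <= y <= bottom:
            return name
    return None
```

This is why everything a touchscreen can do, a mouse can do as well: both input channels terminate in the same position-based lookup.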

Why hasn't this technology faced a major overhaul in more than 40 years? Because sometimes an OK technology can be an obstacle to a new one. Though the majority of computer scientists agree that the WIMP concept is in a state of stagnation, the "newer" technologies in development (RUI, NUI, FUI, ZUI, to name a few) are mostly about beefing up the same antique WIMP-based "BARE BONES".


Q: What is the role of the letter "M" in the WIMP abbreviation?

A: It's important to understand that interface components such as forms, dialogs, tabs, buttons, combo-boxes and radio-buttons do not add new functionality; they just provide extra convenience. All of them can be replaced with menu-based counterparts (even a virtual keyboard is no exception). That way, the entire interface layout of an application can be implemented using only the menu.
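The claim can be demonstrated with a toy sketch: every control of a hypothetical "print" dialog (our example, not from any real application) is re-expressed as a nested menu, so the whole layout survives with nothing but menus.

```python
# A hypothetical print dialog re-expressed as nested menus only.
# A button becomes a leaf item; a checkbox becomes an On/Off submenu;
# a combo-box becomes a submenu listing its options.
MENU = {
    "File": {
        "Print...": {
            "Copies":    {"1": None, "2": None, "5": None},  # was a combo-box
            "Two-sided": {"On": None, "Off": None},          # was a checkbox
            "Print Now": None,                               # was a button
        }
    }
}

def select(path):
    """Walk the menu along a path of labels; None marks an action item."""
    node = MENU
    for label in path:
        node = node[label]
    return node
```

Anything selectable in the original dialog is still reachable here; the controls merely compressed several menu steps into one gesture.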

Q: Why are all the modern systems "a step aside" instead of "a step forward"?

A: The main reason is, they address the "OS MEAT" level (a little) and the "OS SKIN" level (a lot) of the CIWID stack. The "BARE BONES" level is left intact, which causes the stagnation.

Q: Does the addition of new touchscreen-related features, such as "swipe", "pinch zoom" and "multi-touch gestures", change anything about the WIMP concept?

A: Not at all. In a nutshell, the new features affect the "MEAT" and "SKIN" levels of both the OS and APP parts. Though the functionality is a cool addition, it's not deep enough to affect the "BARE BONES" level of WIMP-based CIWID.

Usually, when a new version of an OS is introduced, it has a set of new "cool" features, such as transparent dialogs, flat-design layouts, zoomable forms, voice control systems, etc. This approach is very popular, but it's important to understand that it cannot lead to a revolutionary change in interface technologies. There are many reasons why it's done this way, but the main one is very simple: there is no other way for the big OS-producing corporations. The interface-related functionality is deeply embedded into the core of an existing OS, which, itself, is usually a few decades old. Thus, providing a "complete overhaul of the user interface" that affects the "BARE BONES" level is technically impossible. The companies are stuck with "legacy" APIs, so to "innovate", they have to tweak the "OS MEAT" gently enough not to break anything that's already working.


What Can Be Done to Improve the Situation?

To provide a revolutionary new interface with amazing new features and possibilities, the "BARE BONES" level of the CIWID stack needs to be revisited. Without this, no changes to the "OS MEAT" and "OS SKIN" levels are going to work.

To prove the point, with our "SPINT" project we are going to tinker with the "P" in WIMP by replacing the "Pointer"-related functionality with a "State"-related counterpart, which would effectively transform WIMP into a WIMS concept. Even this small alteration of the "BARE BONES" level would open completely new possibilities, and provide solutions to interface problems that the old WIMP concept was incapable of solving.
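As a purely speculative illustration (the text describes SPINT only at this level of detail, so the mechanics below are entirely our assumption, not the project's actual design), "selection by STATE" could mean cycling a device through a sequence of states with a single input, so that no pointing and no (x, y) position is ever involved:

```python
# Hypothetical sketch: selecting a command by STATE instead of by position.
# One input (a tap, a gesture, a wrist flick) advances the state; a second
# confirms it -- no cursor, no hit-testing, no coordinates required.
class StateSelector:
    def __init__(self, commands):
        self.commands = commands
        self.index = 0          # the current state

    def advance(self):
        """Cycle to the next state (e.g., on a single tap)."""
        self.index = (self.index + 1) % len(self.commands)

    def confirm(self):
        """Return the command the current state denotes."""
        return self.commands[self.index]
```

Notice that such a scheme would remain usable with thick gloves or without looking at a screen, which is exactly the kind of constraint the "wearables" discussion above raises.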

Let's take a quick look at a few popular trends in the field: Zooming UI (ZUI), the return of the stylus, and voice control.

Systems based on ZUI provide additional "conveniences", but in reality they are all very similar to the GUI. Usually, the ZUI concept comes into consideration when developers try to overcome the restrictions of a small screen size. Thus, it's not a new technology per se, but an attempt to provide visual "crutches" for the existing one. If UX is about how well application functionality fits the consumer's physical capabilities, skills, expectations and prior experiences, then understanding the way we keep and operate on information in our brains is important. We don't keep information in our brains in the form of screenshots, let alone zoom in or out on anything when we're trying to recall a memory. The act of recalling something is more sophisticated, and has nothing to do with a purely mechanical approach based on the hardware limitations of mobile devices.

There is an evolutionary chain of communication technologies. It all started with oral communication. Later, handwriting provided more possibilities than talking did. Typewriting replaced handwriting, and was in turn replaced by computers with physical keyboards; now those have been replaced by touchscreens with virtual keyboards. Each step brought more convenience and increased productivity. Usually, an old device such as the stylus doesn't resurface unless it can serve a completely new function.

Another example of an old link in the evolutionary chain is "voice control". The voice itself is an old link in the communication chain, and pronouncing commands takes more effort than entering them into a device. There are also many instances in which voice control could be considered (1) inappropriate (when asked for a username and password in a crowded place), (2) useless (in a noisy environment), or (3) fatally dangerous (when a soldier on the battlefield wearing smart glasses has to give commands out loud). Do we really expect people to want to talk to their glasses or watches? Just because a device "can" recognize voice commands to some extent doesn't mean that's a more convenient and efficient way of communicating with the device.

