
Human interface modeling for communication trends – Part Two (Model-driven applications: Audio and Visual)


This is the second part of the article I am working on, providing actual applications of the model. Here you can find examples of existing products and usages, as well as some possible directions of communication evolution, or simply get some inspiration. To understand what the #$!% I am talking about, it is recommended to read the first part, which describes the model. Enjoy!


The model systemizes communication channels based on their properties, exposing the strengths and weaknesses of each channel. We could use high-quality characteristics and technologies to reuse or improve low-quality ones and raise the overall communication efficiency of Knowledge, Emotion and Will transfer. We could also think about possible future technologies that would bring missing (or evolve weak) capabilities, and how our world would look with them. In addition, we can track currently developed technologies in the light of the model.

We can see that multiple companies are working on a single channel attribute to improve its efficiency, while many others combine several attributes to bring new usages. Lots of new business opportunities are hidden within new capabilities that improve human communication efficiency.

Keep in mind that the model here covers only some of the communication channel types and some of the characteristics; there is much more to work on.


Today, Audio is one of the most developed communication channels. The development of radio, phone and later mobile phones leveraged the extremely high informativity of the channel to provide a substrate for comprehensive audio content. Still, there are lots of characteristics to improve.


The slowness of the Audio channel could be compensated by other, faster channels.

For output, we can add the fast Motoric channel (most people naturally try to do this by gesticulating, especially when they try to convince and add more of a “Will” nature to the content), though it is not good at delivering knowledge content. How could we do it? Possibly by taking a whole process and separating its Will and Knowledge content. All inputs that have a strong Will nature should be switched to the Motoric channel. For example, a programming process contains lots of Will content along with Knowledge. Can we combine voice to generate complex structures and then use our hands to manipulate and correct them? The maturity of speech recognition technology is critical for the evolution of the channel, but once it is there, we need to use it efficiently and not overload it with Will content. For example, a more native interaction design for a text editing tool would be text input by voice and touchpad/face/eye control of text blocks and formatting, rather than doing it all by voice. Head movement would be a better approval/disapproval control than a voice-driven one.
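The split between Will-heavy and Knowledge-heavy editing inputs could be sketched as a simple dispatcher. This is only a toy illustration; all event names and channel labels here are hypothetical.

```python
# Toy sketch: route Will-heavy editing events (approval, selection,
# formatting) to the motoric channel, and Knowledge-heavy input
# (dictated text) to the voice channel. Event names are made up.

WILL_EVENTS = {"approve", "reject", "select_block", "format_bold"}

def route_event(event):
    """Return which input channel should handle an editing event."""
    if event in WILL_EVENTS:
        return "motoric"  # head nod, touchpad, gaze
    return "voice"        # dictated text content
```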

How can we overcome the serial nature of the channel to increase its bandwidth? Can we prepare audio content and make it available by some trigger? An example would be pre-processing of our speech and generation of sentences based on demand or condition. Think about it: when your wife calls you, the beginning of the conversation is pretty well known. Of course, the applications might be much broader. A speech analysis engine can propose possible audio content, and we can choose it with our hands.
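The pre-prepared-content idea could look something like the following toy sketch, where likely conversation openers are keyed by caller so one can be triggered by hand. The callers and phrases are invented for illustration.

```python
# Toy sketch (hypothetical data): keep pre-generated sentence openers per
# caller, so a likely reply can be triggered by a gesture instead of spoken.

OPENERS = {
    "wife": ["Hi dear, I'm on my way.", "Hi, can I call you back?"],
    "boss": ["Hello, give me one minute.", "Hello, I'm in a meeting."],
}

def suggest_openers(caller):
    """Return candidate openers for a known caller, or a generic fallback."""
    return OPENERS.get(caller, ["Hello?"])
```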

Increasing the Audio-channel-driven vocabulary, or leveraging tones and sounds, can be another way to improve bandwidth and generate more information in a given time. For example, detection of a “Skeptic Hmmm” sound could natively bring up a help menu on a gaze-based selection.
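The cue-to-action mapping could be as simple as a lookup table sitting behind a sound classifier. The classifier itself is assumed; the cue names and actions below are made up.

```python
# Toy sketch: map detected vocal cues (the output of a hypothetical sound
# classifier) to UI actions, e.g. a skeptical "hmmm" opens a help menu on
# the currently gaze-selected element.

CUE_ACTIONS = {
    "skeptic_hmm": "open_help_menu",
    "approving_aha": "confirm_selection",
}

def action_for_cue(cue):
    """Return the UI action for a detected cue, or ignore unknown sounds."""
    return CUE_ACTIONS.get(cue, "ignore")
```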

For input content channel compensation, the Visual and Touch channels would be good. Both complement Bandwidth and Easiness of Absorption. Content absorption for the Audio channel is generally not bad, but comparison to Visual and Touch shows that there is a lot of potential for improvement.

Today this is used in movies; children like books with pictures, since this way they (initially) complement the audio content; we use PowerPoint presentations to add visual content, which becomes dominant even when its knowledge component might be very limited. Transformation of written content into audio would improve information absorption a bit, but the addition of visual content would make it much better. How can we make an automatic transformation of audio data into Visual? Can it be done through the Literacy channel?
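One way the Audio-to-Literacy-to-Visual path could work is keyword matching over a transcript. The sketch below assumes a speech-to-text engine already produced the transcript, and the image library is entirely made up.

```python
# Toy sketch of the Audio -> Literacy -> Visual path: a transcript (assumed
# to come from a speech-to-text engine) is matched against a small, made-up
# image library to pick supporting visuals for the spoken content.

IMAGE_LIBRARY = {"forest": "forest.jpg", "river": "river.jpg"}

def illustrate(transcript):
    """Return image files matching keywords found in the transcript."""
    return [IMAGE_LIBRARY[w] for w in transcript.lower().split()
            if w in IMAGE_LIBRARY]
```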

How can we leverage the Touch channel to improve characteristics of Audio? Can we translate audio content into electric signals and transfer it to our neural system? Music in disco bars can be felt by the whole body, and it increases the spectrum of Emotions. Can we increase the spectrum for this and other applications? Can we detect and transfer the Emotion component of our voice to the body of our spouse?
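A minimal version of an audio-to-touch mapping could scale the energy in each frequency band to a vibration motor intensity. The band layout, energy scale and intensity range below are all assumptions for the sake of the sketch.

```python
# Toy sketch: scale per-band audio energies (values in 0..max_energy; the
# band layout is made up) to 0-255 vibration motor intensities, so audio
# content can be "felt" through a haptic device.

def bands_to_haptics(band_energies, max_energy=1.0):
    """Map each frequency band's energy to a motor intensity in 0..255."""
    return [min(255, int(255 * e / max_energy)) for e in band_energies]
```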

Localization of Audio communication is easy for input and complicated for output, but it still has to be addressed. For example: capturing vocal cord signals, transferring the information through the Literacy channel, and recovering the audio content on the other side.

The low information-sharing factor might be critical, since the comprehensive and native Audio channel is the main knowledge channel (on top of Emotion and Will). Developing technologies that would improve it might significantly increase the efficiency of human communication, and this would be well leveraged by the Search factor of audio content. Today Audio communication is mainly between two people, or one-directional one-to-many through media. Many-to-many audio content sharing is missing due to the channel's very low bandwidth (its serial nature). How can we overcome this and make that kind of sharing efficient? Most of the audio information we produce is lost. Automatic and permanent capture, processing, analysis, structuring, saving and extraction might unlock a lot of potential: make this information accessible and distributable, on demand and when required, and it would become a mirror for us, made of the main human communication channel.
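The capture/structure/search loop described above can be sketched as a tiny archive with a word index. A real system would sit behind a speech-to-text engine; here the transcripts are passed in directly, and all names are hypothetical.

```python
# Toy sketch of the capture / structure / search loop: every utterance is
# stored and indexed by word, so spoken content stops being lost and can
# be retrieved on demand.

from collections import defaultdict

class AudioArchive:
    def __init__(self):
        self.index = defaultdict(list)  # word -> utterance ids
        self.store = {}                 # utterance id -> transcript

    def capture(self, uid, transcript):
        """Save an utterance and index every distinct word in it."""
        self.store[uid] = transcript
        for word in sorted(set(transcript.lower().split())):
            self.index[word].append(uid)

    def search(self, word):
        """Return all stored transcripts containing the word."""
        return [self.store[u] for u in self.index.get(word.lower(), [])]
```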


Visual is the type of channel that can provide any content, while today it is used mainly as a complement to the Audio or Literacy channels, to increase the Emotion component or to provide supporting content for them.

Although there is Visual information sharing today, it is mainly done for reality-captured content (thanks to smartphones) and much less for creative content. As for creative content, it is more distribution than sharing. There will be no real sharing of visual content until we are able to create easily and immediately anything we think about; until we improve the Easiness of Visual information generation. How would that be possible? How can we enable most people to create (and not only capture) pictures and videos on a daily basis that would express their Emotions and be a native substrate for the Audio or Literacy content they share? Can we do it automatically, based on Audio and Literacy content processing, with just a fast Motoric interface adjustment? Can we make an easy-to-access-and-manipulate database of visual objects and instruments that would help us visualize our thoughts from those pieces?

Taking into account that the Literacy channel is great for knowledge content communication, the Audio channel is left to “compete” on emotional content with Video (while Touch is not yet evolved). We can see that the trends for those two are opposite: while more people “see” their content, proportionally fewer “listen” to it.

* Based on Google Trends

Can we significantly increase the Knowledge component of Visual? Can we make it even more informative? Knowledge visualization has great potential. Once we can visualize any written or spoken knowledge, it can be easily and rapidly absorbed. It might be the parsing, processing and visualization of big knowledge databases, adjusted to human needs. How else can we do it? It might be visualization of complex knowledge structures, with easy access to the library based on demand or conditions. Examples of this today are hieroglyphs, corporate identity symbols and QR codes.

And finally, the Visual information Search factor… It can be done through metadata, through component recognition, or any other way; searchability of Visual content is very important to leverage content sharing and make the channel efficient. Big companies (e.g. Google, Facebook) are trying to bring solutions in this domain. Searchability is best for the Literacy channel, so we need some way to create metadata descriptive and sufficient enough to enable a comprehensive search for every component of the Visual content.
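At its simplest, metadata-driven visual search reduces to matching tags against a catalog. In a real system the tags would come from a recognition pipeline; in the sketch below they are hand-written metadata for a made-up catalog.

```python
# Toy sketch: tag-based search over visual content. The catalog maps image
# ids to sets of descriptive tags (here supplied by hand rather than by a
# recognition pipeline).

def find_images(catalog, tag):
    """Return ids of images whose metadata contains the given tag."""
    return [img for img, tags in catalog.items() if tag in tags]
```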


Author: Andrey Gabdulin, Product Development
