It is predicted that in 2007, the industry will finally break through the barriers that have prevented the mass-market adoption of videophones. All of the fundamentals are now in place to make videophones – wired and Wi-Fi – available to consumers:
The widespread adoption of broadband (and Wi-Fi networks) in the home, now in excess of 50 percent penetration in many parts of Asia, Europe and North America, enables true mass-market potential in the “connected home”;
Moore's Law continues to drive processing power forward, enabling support for the computationally complex media processing algorithms that are required to deliver reliable and high-quality full motion video;
Advances in battery technology and power management are enabling Wi-Fi-based devices with standby and talk times that can be measured in days and hours rather than minutes; and
Despite nearly 40 years of development, price and performance have remained stumbling blocks to delivering a mass market solution. Even as we moved from analog videophones to digital IP, constrained networks and a lack of processing power created a price/performance ratio that was unacceptable to the consumer market.
These issues are quickly becoming obsolete with the proliferation of wired and wireless broadband networks, coupled with highly capable voice/video processing technology.
So let's turn our attention to the fourth item – software. When IP-connected devices don't perform properly, the blame and focus fall squarely on this piece of the solution. And in most cases, it should.
Consumer and enterprise devices – whether a phone or any personal communication and multimedia device – must provide a compelling and reliable user experience to successfully create a mainstream market. As a result, the quality and reliability of wireless transmissions play an important role in the success of Wi-Fi videophones.
The IEEE 802.11 WLAN standard has continued to evolve and improve in terms of data rates, range and security. Stable and reliable VoIP offerings have been made available to consumers only in the last 18 to 24 months. Designing, developing and manufacturing a Wi-Fi-enabled voice and video over IP (V2IP) phone requires significant resources for software development, integration and validation.
Increasingly, we are seeing manufacturers using embedded Linux as the basis for their VoIP phone products. The advantages are many (developer familiarity, rich software creation environment, etc.), but the overriding one may be its ability to help manufacturers lower the overall bill of materials (BOM) cost. There are a number of suppliers that provide very stable and well-supported versions of Linux that are optimized for low-power consumer devices.
Looking at the architectures utilized by first-generation videophones, we see that separate processors tended to be used for voice, video and system-control functions. Due to processing requirements, it was typical to use processors optimized for intensive media processing operations (DSPs). For example:
One DSP to handle voice processing functions including voice encode/decode, tone generation and detection, echo cancellation and noise reduction;
One DSP or dedicated coprocessor to handle the video encode and decode; and
One applications processor managing the VoIP call control protocol and user interface (Figure 1, below).
Figure 1: First-generation videophones require three separate processors.
This approach requires multiple programming models and development tool chains, which, in turn, results in the need for larger development teams, increased training and additional costs.
Since the first generation of IP videophones was introduced, general-purpose applications processors have increased in processing power to the point where it is possible to move all of the audio processing tasks usually performed by a DSP to the applications processor.
This is an important advancement for the Wi-Fi videophone market in view of the pressing need to minimize power consumption and maximize battery life in wireless devices.
VoIP codecs (G.711, G.729AB, G.723.1 and iLBC), audio processing (DTMF and call progress tone detection/generation), voice-quality enhancement (line/acoustic echo cancellation and jitter buffers) and other similar functions can now be effectively executed on the applications processor if carefully implemented with assembly-coded and hand-optimized software.
And, as provided on an increasing number of applications processors, we can use hardware acceleration for the video encode and decode (Figure 2, below).
Figure 2: A new paradigm for designing videophones has emerged.
The increased processing capabilities of today's applications processors allow the use of advanced operating environments like embedded Linux to effectively partition the control and media processing required in V2IP systems. This in turn leads to simpler software development using a single processor and tool chain, and lower cost through the elimination of one or more expensive DSPs.
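To give a sense of the per-sample work involved when a codec such as G.711 runs on the applications processor, here is a minimal Python sketch of G.711 mu-law companding. It uses the common segment-search formulation; the function names are illustrative only, and a shipping implementation would be the hand-optimized C or assembly described above rather than Python.

```python
BIAS = 0x84   # 132, added to the magnitude before the segment search
CLIP = 32635  # largest magnitude that still fits in 15 bits after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Compress one 16-bit signed PCM sample into an 8-bit mu-law code."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Segment (exponent) search: locate the highest set bit, giving 0..7.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # The code word is transmitted bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(code: int) -> int:
    """Expand an 8-bit mu-law code back to a 16-bit signed PCM sample."""
    code = ~code & 0xFF
    sign = code & 0x80
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude
```

Each 16-bit sample becomes a single byte, which is where the 64 kbit/s rate of G.711 at 8 kHz sampling comes from. The round trip is lossy by design: small signals are reproduced closely, large ones more coarsely.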
Videophones will leverage one or more of the following three video compression algorithms: H.263, H.264 or MPEG-4. Of these, H.264 (MPEG-4 AVC) is the most advanced in its ability to deliver low-bit-rate, high-quality video in real time. The downside is that H.264 requires significantly more processing power than H.263. Only in 2007 will cost-effective processors with enough capability to execute H.264 become commonplace.
Embedded V2IP framework
At the heart of a V2IP design is embedded voice and video processing, and the software elements that control and manage the data flow through the system (the framework). OEMs and ODMs have three options for developing the V2IP framework:
Build a complete V2IP software framework from the ground up;
License components and software stacks and provide the integration, validation, silicon porting and interoperability testing; or
License a pre-integrated and proven framework from a third party.
Unless IP and networking software development is a core strength of your organization, the fastest, lowest-risk and most cost-effective option is to license a third-party framework.
A highly optimized solution will come in a form that can be quickly integrated into the end-product design. Look for something that provides all of the media-processing algorithms and VoIP call control combined in a flexible framework, thus allowing you to focus on designing a capable, value-added device.
Given the real-time nature of IP traffic, a tightly integrated V2IP framework is critical for ensuring reliable and stable voice and video communications. From a comprehensive media-processing library to a range of QoS and networking clients, a V2IP software framework ultimately determines the quality and performance of the voice/video communications.
OEMs should ensure that they are implementing a flexible VoIP framework. A framework needs to provide runtime selection and configuration of the appropriate VoIP codec, as well as dynamic configuration of the media-processing elements within a given media channel.
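What runtime codec selection and dynamic element configuration might look like can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's API: `CodecInfo`, `MediaChannel` and the element names are hypothetical, and the table lists only one nominal bit rate per codec (G.723.1 and iLBC each define a second rate).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CodecInfo:
    name: str
    bitrate_bps: int  # nominal bit rate of the encoded stream


# One nominal rate per codec; hypothetical registry for illustration.
CODECS = {
    "G.711": CodecInfo("G.711", 64000),
    "G.729AB": CodecInfo("G.729AB", 8000),
    "G.723.1": CodecInfo("G.723.1", 6300),
    "iLBC": CodecInfo("iLBC", 15200),
}


@dataclass
class MediaChannel:
    codec: CodecInfo
    # Ordered names of the processing elements on this channel (hypothetical).
    elements: list = field(default_factory=lambda: ["encoder"])

    def select_codec(self, name: str) -> None:
        """Switch codecs at run time, for example after call negotiation."""
        if name not in CODECS:
            raise ValueError(f"unsupported codec: {name}")
        self.codec = CODECS[name]

    def enable(self, element: str, index: int = 0) -> None:
        """Insert a processing element at a position unless already present."""
        if element not in self.elements:
            self.elements.insert(index, element)

    def disable(self, element: str) -> None:
        """Remove a processing element if it is present."""
        if element in self.elements:
            self.elements.remove(element)
```

A call that starts on G.711 and then moves to a speakerphone on a narrower link could call `channel.enable("echo_canceller")` followed by `channel.select_codec("G.729AB")` without tearing the channel down.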
The framework and its associated scheduler component must ensure that all algorithms required for a given channel definition are executed in the time period allowed. In a single-channel system, the task of scheduling these algorithms is little more than a series of consecutive calls to the appropriate algorithms in order.
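The single-channel case can be sketched as a loop over an ordered list of stages with a check against the frame's time budget. Everything here is an assumption made for illustration: `run_frame`, the two stand-in stages and the 10 ms budget in the usage note are hypothetical, and a real framework would call optimized native routines.

```python
import time


def run_frame(stages, frame, budget_s, clock=time.perf_counter):
    """Call each stage in order on one frame and report if the budget held."""
    start = clock()
    for stage in stages:
        frame = stage(frame)
    elapsed = clock() - start
    return frame, elapsed <= budget_s


# Hypothetical stand-ins for real algorithms; each takes and returns samples.
def apply_gain(frame):
    """Double every sample (a fixed gain of 2)."""
    return [s * 2 for s in frame]


def clip_16bit(frame):
    """Saturate every sample to the signed 16-bit range."""
    return [max(-32768, min(32767, s)) for s in frame]
```

Calling `run_frame([apply_gain, clip_16bit], samples, budget_s=0.010)` processes one frame and returns the output together with a flag that is false when the stages overran the assumed 10 ms period.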
Multichannel systems, on the other hand, offer a more complex scenario in which different VoIP codecs may be required for each channel, with certain channels requiring echo cancellation. Videophones are typically “single-channel” systems, although generally capable of three-way A/V calling.
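The multichannel case differs mainly in that the list of stages is no longer fixed. A rough sketch, assuming a hypothetical `ChannelConfig` record and stage names invented for this example, builds each channel's stage list from its own settings and flattens them into one run list per scheduling tick:

```python
from dataclasses import dataclass


@dataclass
class ChannelConfig:
    channel_id: int
    codec: str
    echo_cancel: bool = False


def build_stages(cfg):
    """Return the ordered stage names for one channel's transmit path."""
    stages = []
    if cfg.echo_cancel:
        # Echo cancellation has to see the signal before it is encoded.
        stages.append("echo_cancel")
    stages.append(f"encode:{cfg.codec}")
    return stages


def plan_tick(channels):
    """Flatten every channel's stages into one ordered run list for a tick."""
    return [
        (cfg.channel_id, stage)
        for cfg in channels
        for stage in build_stages(cfg)
    ]
```

The scheduler's real job, making sure the whole run list completes within the tick, is left out here; the sketch only shows how per-channel differences in codec and echo cancellation change what has to be scheduled.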
Designing a VoIP phone today, let alone a Wi-Fi videophone, requires product differentiation and support for next-generation services and functionality. Legacy VoIP phones have provided basic “toll quality” voice codecs such as G.711 and video-compression functionality using the H.263 standard. Both of these codecs are 100 percent capable of enabling a personal video conferencing session and have been used successfully for a number of years.
However, in today's high-fidelity and high-definition world, next-generation videophones must support wideband audio and advanced video-compression technologies.
Technologies such as AMR-WB (G.722.2) audio and H.264 video compression enhance the communications experience, providing more realistic communication between two parties.
In addition to wideband audio and higher-definition video, there are different technologies that benefit the end-user by enhancing the reliability, performance and quality of V2IP communications. Going forward, the following features will be required for competitive VoIP and V2IP solutions.
Audio protocols/voice quality enhancement:
G.711, G.723, G.723.1, G.726, G.729AB, iLBC;
Audio playback and record;
3-way calling with local audio mixing;
G.168 line echo cancellation;
Full duplex acoustic echo cancellation (hands-free speaking);
Country-specific call progress tone generation/detection;
Universal tone generator;
Gain control – automatic and manual modes;
Up/down sampling for 8, 16 and 44.1 kHz.
MPEG-4 simple profile;
H.264 (MPEG-4 AVC);
Video playback and record support.
TURN (STUN relay) client;
Hi-fidelity VoIP & multimedia support:
G.722.2 (AMR-WB) codec support;
Wideband AEC/AES support;
RTSP streaming media client.
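Of the features listed above, 3-way calling with local audio mixing is easy to illustrate. In the usual arrangement each party hears the sum of the other two, with the sum saturated so that loud passages clip rather than wrap around. The sketch below assumes 16-bit PCM frames of equal length; the function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix(a, b):
    """Sum two equal-length PCM frames sample by sample, saturating to 16 bits."""
    if len(a) != len(b):
        raise ValueError("frames must have the same length")
    return [max(INT16_MIN, min(INT16_MAX, x + y)) for x, y in zip(a, b)]


def three_way_mix(local, remote_a, remote_b):
    """Return (to_a, to_b, to_speaker) so each party hears the other two."""
    to_a = mix(local, remote_b)
    to_b = mix(local, remote_a)
    to_speaker = mix(remote_a, remote_b)
    return to_a, to_b, to_speaker
```

Note that no party receives its own audio back, which keeps the mixer from adding echo of its own.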
Once the system designer has decided on a capable framework to handle voice processing, video processing, call setup and NAT traversal, the focus shifts to differentiating the product from other V2IP devices by designing and building the user experience.
The user experience today is a reflection of many factors ranging from the quality of the key components used to create the device (quality of speaker, microphone, camera and display) to the intangible and hard-to-measure ease of use of the user interface.
With better display technology on virtually all the devices being used for real-time, personal communications, the GUI becomes an increasingly important part of the user experience. Even the most basic Wi-Fi VoIP phones today offer full-color display GUIs with features such as animated menus, photo caller display and instant messaging.
The integration of a GUI with an embedded V2IP framework is non-trivial. The largest hurdle facing most developers is that the types of processing in a GUI and a V2IP framework are inherently different:
V2IP framework – Highly responsive, media-oriented, real-time processing;
GUI – Responsive, user-oriented, event-driven processing.
A well-designed V2IP framework will offer a capable API that requires minimum interaction with the GUI. Specifically, the API should generally only require invocation in response to events generated by the user or the network.
This split avoids the uncomfortable union of event-driven and real-time media processing elements, enabling a simple integration that allows the developer to focus on an intuitive, value-added GUI.
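One way to picture this split is a thin facade that the GUI touches only when the user or the network does something, with a queue carrying notifications in the other direction. The sketch below is an assumption about how such an API could be shaped, not a description of any shipping framework: `V2IPFramework`, its methods and `gui_poll` are hypothetical names, and the network side is simulated by calling `_on_incoming_call` directly.

```python
import queue


class V2IPFramework:
    """Hypothetical facade; real media processing would run on its own thread."""

    def __init__(self):
        self.events = queue.Queue()  # notifications from framework to GUI
        self.state = "idle"

    def _on_incoming_call(self, caller):
        """Invoked by the network side of the framework (simulated here)."""
        self.state = "ringing"
        self.events.put(("incoming_call", caller))

    def answer(self):
        """Invoked by the GUI when the user accepts the call."""
        if self.state != "ringing":
            raise RuntimeError("no call to answer")
        self.state = "in_call"  # media now flows without GUI involvement

    def hang_up(self):
        """Invoked by the GUI when the user ends the call."""
        self.state = "idle"


def gui_poll(framework, on_event):
    """Drain pending framework events from within the GUI's event loop."""
    handled = 0
    while True:
        try:
            name, payload = framework.events.get_nowait()
        except queue.Empty:
            return handled
        on_event(name, payload)
        handled += 1
```

Between `answer()` and `hang_up()` the GUI makes no calls into the framework at all, which is the property the paragraph above is asking for.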
Understanding your own organization's strengths and weaknesses is the key consideration in effectively managing this part of the development process and determining the most productive methodology for taking a product to market.
The bandwidth, processing power and software systems are now available to develop robust and reliable voice and video products for both fixed and Wi-Fi-enabled networks.
Success on the part of OEMs will come down to their ability to deliver innovative products that are easy to use, reliable and at the right price points. The only way to do that is to ensure the tight integration and interoperability of the critical media-processing, network-management and user interface/application software.
David Brown is chief technology officer at Trinity Convergence Inc.