Making mobile video applications more energy efficient
Demand for mobile video has exploded. As the thirst for multimedia access from anywhere, at any time, continues to grow, so does the availability of TV-anywhere services. This has been a boon for service providers eager to add new consumer applications to their portfolios.
But it has also created a huge challenge for the supporting infrastructure equipment, and it poses a conundrum: the performance required to process video in the media gateways must be balanced against power consumption low enough to avoid over-taxing the infrastructure while still providing cost containment and sustainability.
Advances in video transcoding implementations on new, alternative processors, and in the streaming architectures that support them, can be part of the solution for high-performance, low-power mobile video platforms. Some of these can yield a real 15x power/density gain over the processor technology typically used in today’s infrastructure equipment.
Figure 1: The client devices receiving the video demonstrate a wide range of decoding capabilities.
In the second half of 2009, some estimates suggest that 1 billion YouTube channels were streamed per day, up from 600 million in 2008. Looking forward, Intel estimates that by 2015 as many as 12 billion devices (TVs, desktops, laptops, netbooks and smart phones) will be connected to 500 billion hours of available video.
By 2013, 400 million mobile video phones will be in the hands of consumers, according to Infonetics. Cisco predicts that video will account for 90 percent of all Internet traffic in 2012 and 70 percent of all mobile traffic in 2013. Clearly, there is tremendous demand, growing by the day, for widely and easily accessible video.
To meet this demand and to maintain ease of accessibility, the servers, video streamers and media gateways that process and deliver all of this video content now have to grapple with an exponential increase in processing workload.
Compared to voice signals and typical Internet content, video requires an order of magnitude more processing power, especially when it is delivered to multiple types of viewing client devices. Every increase in processing power, of course, increases power consumption.
Video transcoding accounts for the bulk of this processing load and, combined with the growth described, represents a large latent increase in energy demand that could put pressure on existing infrastructures.
Figure 2: Within an approximate processing/power envelope of 150W, typical QCIF transcoding density is around 150 video streams for a pure x86 setup.
Need for video transcoding
The video being distributed originates from a wide range of capture devices that have very different resolutions and encoding capabilities. For example, the capture devices could range from a CIF-resolution smart phone encoding MPEG4 to a 720p-resolution camcorder encoding H.264.
Similarly, the client devices receiving the video demonstrate a wide range of decoding capabilities (in terms of dedicated video functionality or available processor cycles), such as a QCIF resolution H.263 capable cellphone or a high-spec laptop PC capable of decoding high-definition H.264.
Video is submitted to the network for live viewing and/or storage with the expectation that anyone can view it, yet no rigid or common video exchange standards exist. To meet this expectation, video distribution and network service providers therefore have to provide transparent transcoding functionality.
Furthermore, the growing expectation is that video is delivered in real time, without the need to first download some or all of it into buffers on the client device, especially for Internet TV and surveillance systems.
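As a rough illustration of what "transparent transcoding" has to decide, the following sketch compares a source format with a client's decoding capability and reports whether the stream can be passed through or must be converted. The class and function names are hypothetical, and the rule for picking a target codec is a placeholder assumption; a real gateway would also weigh bit rate, frame rate and licensing.

```python
from dataclasses import dataclass

# Standard frame sizes (width, height) in pixels.
RESOLUTIONS = {"QCIF": (176, 144), "CIF": (352, 288), "720p": (1280, 720)}


@dataclass(frozen=True)
class VideoFormat:
    codec: str
    resolution: str


@dataclass(frozen=True)
class ClientCapability:
    codecs: frozenset      # codecs the client can decode
    max_resolution: str    # largest frame size the client can handle


def pixels(name):
    """Number of pixels per frame for a named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


def plan_transcode(source, client):
    """Return the format to deliver, or None if the source can pass through."""
    codec_ok = source.codec in client.codecs
    size_ok = pixels(source.resolution) <= pixels(client.max_resolution)
    if codec_ok and size_ok:
        return None
    # Placeholder policy (an assumption): pick the alphabetically first
    # codec the client supports when the source codec is not decodable.
    target_codec = source.codec if codec_ok else sorted(client.codecs)[0]
    target_res = source.resolution if size_ok else client.max_resolution
    return VideoFormat(target_codec, target_res)


# The article's example endpoints: a 720p H.264 camcorder feeding a
# QCIF H.263 cellphone needs both a codec change and a downscale.
camcorder = VideoFormat("H.264", "720p")
cellphone = ClientCapability(frozenset({"H.263"}), "QCIF")
print(plan_transcode(camcorder, cellphone))
```

Every source and client pairing that does not return None here is work the transcoding infrastructure has to absorb, which is why the mix of capture and viewing devices drives the processing load.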
This demand for real-time delivery further increases the processing load in the video delivery systems and of course compounds the growth in energy demand.
Historically, voice transcoding equipment has been located close to the core of the network and was commonly implemented using dedicated DSP-based hardware.
The same, however, is not true for video transcoding. Typically, the equipment used to transcode and stream live video is based on standard server platforms, especially when located close to the video storage locations.
This is primarily because a large percentage of video delivery services are Internet-based (even though a large majority of viewing clients are mobile). This means that a large percentage of video transcoding functionality is being performed in the server-based Internet infrastructure.
Therefore, x86-based server platforms from companies like Dell and HP are very common for video transcoding. Furthermore, once baseline streaming video capability is achieved using x86-based platforms, it is common for manufacturers and video service providers to want to expand their products’ functionality to include live peer-to-peer video communications and video conferencing.
Other reasons also contribute to the broad use of x86s for video transcoding. For example, as video demand grows, more and more companies are positioning themselves to provide transcoding equipment, but not everyone has the available video codec IP or the experience and resources to produce it themselves.
There is a multitude of open source codecs available, which are of course x86-ready. Furthermore, ISPs that want to provide video services are already in possession of x86-based equipment. The easy availability of video codecs and x86-based hardware makes x86 a very accessible but extremely inefficient platform for transcoding. Because x86s are general-purpose processors (GPPs), they are attractive in many senses, but their benefits come at the cost of power efficiency.
To most, the issues related to power are obvious. Typically, the following three power-related issues come under consideration when building out a new video distribution infrastructure:
(1) Operating costs. For power consumption, this is simply the rate charged per kWh. The more power-hungry a solution is, the more it costs to operate, not only in terms of processing but also in cooling costs.
(2) Scalability. Rack space is a premium-cost resource, and the ability to grow systems in line with planned increases in channel density, without incurring extra space and cooling penalties, is key. Lower power equates to more densely packed systems and smaller-scale cooling resources.
(3) Reliability. Power consumption translates directly into heat dissipation, which in turn requires active cooling solutions. Active cooling on the solid state components, the equipment chassis, and racking reduces the overall system reliability down to that of the weakest mechanical cooling components.
Moreover, the increased environmental and social responsibility being adopted by the industry as a whole means that power is a top issue: full stop.
To address these issues, the most power-hungry resources in the systems, the x86-based processors providing the transcoding resources, simply need to become dramatically more power efficient. Alternatively, a different processor choice could be made.
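The operating-cost item in the list above can be made concrete with a short calculation. This is a minimal sketch, assuming a constant load and a simple proportional cooling overhead; the electricity rate and the overhead fraction are hypothetical values chosen only for illustration, not figures from any provider.

```python
HOURS_PER_YEAR = 24 * 365


def annual_energy_cost(load_watts, rate_per_kwh, cooling_overhead=0.0):
    """Yearly energy cost of a constant electrical load.

    cooling_overhead is the extra cooling power expressed as a fraction
    of the load power (0.5 means 0.5 W of cooling per 1 W of load).
    """
    total_kw = load_watts * (1.0 + cooling_overhead) / 1000.0
    return total_kw * HOURS_PER_YEAR * rate_per_kwh


# Hypothetical inputs: a 150 W processing load, 0.10 currency units
# per kWh, and cooling that adds 50 percent on top of the load.
print(annual_energy_cost(150, 0.10))
print(annual_energy_cost(150, 0.10, cooling_overhead=0.5))
```

Because cost scales linearly with watts in this model, any reduction in processing power is multiplied by the cooling overhead as well.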
There are new low-power, multicore DSPs for voice and video applications that are scalable in terms of channel processing density, allowing them to be deployed across the network from lower density access points to the higher density core network infrastructure.
These DSPs are built around an asynchronous processor core, and their power efficiency comes from the asynchronous design of the core itself.
By removing the clocks and the synchronizing registers in the processor core and replacing them with simpler logic-based synchronization methods, three things happen:
1) The silicon area of the DSP core is reduced.
2) The power and wiring from the clock and register infrastructure are removed.
3) The cumulative effects of reducing the area and removing the clock yield even greater power reductions.
The net result is a high-performance device capable of transcoding, for example, up to 20 CIF or 70 QCIF video streams, or up to 480 voice channels. This level of channel density is achievable within a power footprint of 1.9W.
Compare this density to that of a standard server-class x86 processor. Within an approximate processing/power envelope of 150W, typical QCIF transcoding density is around 150 video streams for a pure x86 setup. Even when considering the implementation-specific details of adding DSP resources into a system, the possible power and efficiency gains are very large.
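The comparison can be reduced to streams per watt using only the figures quoted above (70 QCIF streams at 1.9W for the DSP, about 150 QCIF streams at 150W for the x86 setup). The sketch below is a chip-level calculation only; it ignores the host, memory and I/O power that a real system adds, so it should not be read as a system-level result.

```python
import math


def streams_per_watt(streams, watts):
    """Transcoding density expressed as video streams per watt."""
    return streams / watts


# Figures quoted in the article (QCIF transcoding).
DSP_STREAMS, DSP_WATTS = 70, 1.9
X86_STREAMS, X86_WATTS = 150, 150

dsp_density = streams_per_watt(DSP_STREAMS, DSP_WATTS)
x86_density = streams_per_watt(X86_STREAMS, X86_WATTS)
ratio = dsp_density / x86_density

# How many DSPs would match the x86 stream count, and at what power?
dsps_needed = math.ceil(X86_STREAMS / DSP_STREAMS)
dsp_power = dsps_needed * DSP_WATTS

print(f"DSP: {dsp_density:.1f} streams/W, x86: {x86_density:.1f} streams/W")
print(f"Chip-level ratio: {ratio:.1f}x")
print(f"{dsps_needed} DSPs match {X86_STREAMS} streams at {dsp_power:.1f} W")
```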
In response to the predicted and obviously growing demand for mobile and Internet-based video services, DSP designers and video server, streamer and gateway manufacturers have to engage the new paradigm of a highly power-efficient mobile video platform.
The new platform must be designed from inception to deal with the high processing demands of video transcoding and remain flexible enough to deal with ever-changing video resolution, frame rate and codec standards.
This new paradigm now provides manufacturers of video transcoding equipment an alternative to power-hungry x86-based processors, and even a power-efficient alternative to standard DSPs used by those who have already recognized the benefits of moving away from x86s.
Although the shift to a new processor can be daunting, the benefits are substantial. For example, compared with a typical server-class dual quad-core Xeon setup, the new generation DSP can very easily provide a ~15x gain in channel density without increasing power consumption.
Alternatively, the same power efficiency can be used for simple power reductions greater than 60 percent without sacrificing channel density.
Once these processors are deployed in video transcoding systems, the benefits of power efficiency immediately move outwards to the system and provide greatly reduced operating costs, higher levels of achievable channel density and scalability, and improved overall system reliability.
Imagine the equipment manufacturers using new low-power, high-performance DSPs: They can easily position these power benefits at the next level to their end customers.
The power and channel density gains simply mean that more transcoding capability can fit into a greatly reduced area. Even achieving a more modest ~10x power density gain could mean a simple 10x reduction in the total number of servers required in a real deployment; the power saving possibilities (based on complete server power consumption) are enormous.
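To see what a ~10x density gain means for a deployment, the following sketch estimates the server count and the power no longer drawn. It assumes each server draws the same power before and after the change, in line with the claim that density rises without increasing power consumption; the fleet size and per-server wattage are hypothetical.

```python
import math


def servers_required(baseline_servers, density_gain):
    """Servers needed to carry the same channel count after a density gain."""
    return math.ceil(baseline_servers / density_gain)


def power_saved_watts(baseline_servers, density_gain, watts_per_server):
    """Power no longer drawn, assuming equal power per server before and after."""
    remaining = servers_required(baseline_servers, density_gain)
    return (baseline_servers - remaining) * watts_per_server


# Hypothetical deployment: 200 servers at 400 W each, 10x density gain.
print(servers_required(200, 10))
print(power_saved_watts(200, 10, 400))
```

The ceiling in `servers_required` matters for small fleets: a deployment of 15 servers still needs 2 servers after a 10x gain, not 1.5.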
Beyond the tangible and measurable benefits, simply being able to produce a more power-efficient product is a common goal at all levels of the industry—for sustainability, cost containment and customer satisfaction.
(John Ry is business development manager at Octasic Inc.)