Mobile devices, multimedia NoCs, and Steve Martin’s “LA Story”


During the recent Consumer Electronics Show, I talked to many chip and software vendors whose designs are being used in some of the feature-packed, multimedia-laden mobile and embedded devices introduced there. Listening to them, I was constantly reminded of a scene from Steve Martin's LA Story.

Martin, playing a Los Angeles TV weatherman, gets in his car in the morning to go to work. But instead of getting on the freeway system, clogged to a standstill with traffic beyond its capacity, he travels to work by way of an elaborate system of shortcuts: through a neighbor's yard, down an alley, across an empty lot, through a car wash, zigzagging through a parking lot, and so on.

It was that or face driving on a freeway system that was designed fifty years ago and is unable to handle today's volume of traffic. The scene was hilarious because it was an accurate reflection of the real world, and Martin's solution wasn't that much of an exaggeration of the extremes to which L.A. drivers sometimes resort.

Embedded SoC traffic jams
In many embedded applications in mobile devices and portable electronics systems, developers and builders of the silicon are in the same situation. Builders of MP3 players, video recorders, mobile TV devices and all-in-one mobile phones with video capability are driven by the need to deliver multimedia content over high-bandwidth wired and wireless Internet connections at higher and higher data rates. But they are having problems with outdated shared bus architectures that simply can't handle the increased traffic loads.

The use of multi-core CPUs in such designs only partially addresses such problems, and in other ways exacerbates them, because to move the data around the chip it has been necessary to depend on a shared-bus “freeway” system that is decades old and inadequate for present and future needs.

Of course, there are new freeway systems, such as networks-on-chip and on-chip point-to-point, packet-based, serial switched fabric linkages, similar in concept to InfiniBand, PCI Express and RapidIO at the board-to-board and system-to-system level. Many of these chip-level alternatives and the problems they raise are described in an excellent recent book, “Networks On Chips,” by Giovanni De Micheli and Luca Benini.

There are at least two problems I see with most such topologies for the vendors of the devices that use these multimedia-optimized SoCs. First, there are so many of them. How do you make a choice? How do you assess their compatibility with existing “freeway” designs? Second, there are the numerous software development issues. These are also covered extensively in the De Micheli/Benini book.

After reading in their book about all the complicated software problems ahead, I have come to the conclusion that even if we agree on a common next-generation freeway system for on-chip traffic, the software problems alone will prevent its widespread adoption for many years. Consider the amount of time it is taking for the industry to develop a common set of standards for multicore software development. So far I hear a lot of talk, with minimal action taken.

Making do with work-arounds
It should come as no surprise that, faced with such challenges, not a few current licensees of core processors (including those from ARM, MIPS, Power, and PowerPC) are taking a page from the script for Steve Martin's movie. They're making do with what they have, using current shared bus topologies where appropriate, replacing them where they can, or finding work-arounds and shortcuts that get around the traffic jams when they can't.

For example, most recently, Atmel's ARM926EJ-S-based microcontroller, designed for what it calls human interface applications with loads of graphics, audio and video, takes the work-around approach to the extreme. It eliminates the data traffic bottlenecks that often occur on the ARM architecture's traditional AMBA bus, achieving on-chip data transfer rates of up to 41.6 Gbps.

No less innovative in its work-around strategies is Digi, with the bus work-around it uses in its Netsilicon NS9360, deployed in the several dedicated I/O devices it has built for cellular gateways, WiFi device servers, and wireless video appliances. Similar to the approach taken by Atmel, Digi sticks with the existing AMBA AHB shared bus topology, but greatly modifies the peripheral DMA structure. The design even incorporates mechanisms that enable the developer to modify specific registers, allowing direct control in software over how much bandwidth is allocated.
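To make the idea of software-controlled bandwidth concrete, here is a minimal sketch of a weighted round-robin arbiter. It is an illustration only, not the NS9360's actual mechanism: the channel names, the `weights` table standing in for bandwidth-allocation registers, and the assumption that every channel always has a request pending are all hypothetical.

```python
from collections import Counter

def weighted_round_robin(weights, cycles):
    """Grant the bus for `cycles` slots in proportion to per-channel weights.

    `weights` stands in for hypothetical bandwidth-allocation registers:
    channel name -> number of bus slots that channel gets per round.
    Assumes every channel always has a transfer pending.
    """
    # One arbitration round: each channel appears once per unit of weight.
    round_order = [ch for ch, w in weights.items() for _ in range(w)]
    if not round_order:
        raise ValueError("at least one channel needs a non-zero weight")
    return [round_order[i % len(round_order)] for i in range(cycles)]

# Hypothetical register settings: video gets 5 of every 8 slots.
weights = {"video_dma": 5, "audio_dma": 2, "uart_dma": 1}
grants = weighted_round_robin(weights, cycles=800)
share = {ch: n / len(grants) for ch, n in Counter(grants).items()}
print(share)
```

Changing a single entry in `weights` shifts the split of bus slots, which is the kind of knob the register interface described above would give a developer.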

They are not alone. For example, Faraday Technology has opted for a QoS-aware non-blocking crossbar switch to get the intra-chip data flow bandwidth it needed, as well as a smart DMA engine of its own design. PortalPlayer also uses a crossbar switch of its own design as an alternative to AMBA, and NXP uses a modified bus architecture, retaining AMBA for deterministic control and processing tasks and adding a data-flow-optimized bus of its own design that handles media-rich operations. Other companies have opted for the approach that Cirrus Logic has taken. Direct and simple, it just puts two AMBA buses on the chip and separates data flows such that each processing element gets as much bandwidth as possible.
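A rough way to see why a non-blocking crossbar helps is to count cycles for the same set of transfers on both kinds of interconnect. The sketch below is a deliberately simplified model, not any vendor's design: the master and target names and the beat counts are made up, each master is assumed to issue one transfer, and arbitration overhead is ignored.

```python
from collections import defaultdict

def shared_bus_cycles(transfers):
    """A single shared bus carries one transfer at a time, so costs add up."""
    return sum(beats for _master, _target, beats in transfers)

def crossbar_cycles(transfers):
    """Non-blocking crossbar: transfers to different targets overlap.

    Only transfers aimed at the same target queue behind one another, so the
    busiest target sets the finish time (one transfer per master assumed).
    """
    per_target = defaultdict(int)
    for _master, target, beats in transfers:
        per_target[target] += beats
    return max(per_target.values(), default=0)

# Hypothetical workload: (master, target, data beats)
transfers = [
    ("cpu", "sram", 64),
    ("video_dma", "sdram", 256),
    ("audio_dma", "sram", 32),
    ("lcd", "framebuffer", 128),
]
print(shared_bus_cycles(transfers), crossbar_cycles(transfers))
```

In this toy workload the shared bus needs the sum of all transfers, while the crossbar is limited only by the most heavily loaded target. Splitting traffic across two buses, as in the Cirrus Logic approach, falls between the two extremes.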

Others, such as Texas Instruments with its OMAP, NXP with the Nexperia, and Toshiba et al. in the Cell architecture, have opted for a shared memory approach, on top of which they layer various message-passing mechanisms, based as much as possible on existing standards, such as Open MPI.
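As a sketch of what layering message passing on shared memory can look like, the following models a single-producer, single-consumer mailbox kept in one shared byte buffer. Everything here is an assumption for illustration: the `SharedMailbox` class, its memory layout, and the message names are hypothetical, it does not use the Open MPI API, and real silicon would also need memory barriers, cache management, and an interrupt or doorbell to signal the receiver.

```python
import struct

class SharedMailbox:
    """Message queue stored entirely in one shared buffer.

    Hypothetical layout: bytes 0-3 hold the head index, bytes 4-7 the tail
    index, and the rest is a ring of length-prefixed messages.
    """
    HEADER = 8

    def __init__(self, shared: bytearray):
        self.mem = shared
        self.size = len(shared) - self.HEADER

    def _load(self, offset):
        return struct.unpack_from("<I", self.mem, offset)[0]

    def _store(self, offset, value):
        struct.pack_into("<I", self.mem, offset, value)

    def _write(self, pos, data):
        for i, byte in enumerate(data):
            self.mem[self.HEADER + (pos + i) % self.size] = byte

    def _read(self, pos, count):
        return bytes(self.mem[self.HEADER + (pos + i) % self.size]
                     for i in range(count))

    def send(self, payload: bytes) -> bool:
        head, tail = self._load(0), self._load(4)
        used = (tail - head) % self.size
        need = 4 + len(payload)
        # Keep one byte free so that head == tail always means "empty".
        if need > self.size - 1 - used:
            return False
        self._write(tail, struct.pack("<I", len(payload)) + payload)
        self._store(4, (tail + need) % self.size)  # publish after data is written
        return True

    def receive(self):
        head, tail = self._load(0), self._load(4)
        if head == tail:
            return None  # nothing queued
        (length,) = struct.unpack("<I", self._read(head, 4))
        payload = self._read(head + 4, length)
        self._store(0, (head + 4 + length) % self.size)
        return payload

# Two views of the same region stand in for two cores sharing memory.
region = bytearray(64)
producer, consumer = SharedMailbox(region), SharedMailbox(region)
producer.send(b"frame-0")
print(consumer.receive())
```

Because all state lives in the shared region, neither side needs a private copy of the queue, which is what lets a message-passing interface sit on top of a plain shared memory block.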

There are still a lot of questions that occur to me as the industry makes the shift to this new architectural paradigm. How long can such work-arounds be effective? Are there any commonalities between the various new NoC bus topologies that a developer can look to, to at least minimize the cost of converting from the existing shared bus methods?

Do any of the new NoC alternatives incorporate features that make this translation easier? Can the software solutions being considered to solve various programming and debug issues with current homogeneous symmetric multi-cores be extended to operate effectively in this much more complex, heterogeneous and asymmetric multicore environment that NoCs represent?

What do you think? What approaches are you pursuing now? And in the future? What is the best way to make the transition? The Steve Martin approach will only work for so long.

Bernard Cole is site editor for, site leader on iApplianceweb as well as an independent editorial services consultant working with high technology companies. He welcomes your feedback. Call him at 602-288-7257 or send an email to .
