Building YouTube into embedded applications


YouTube's popularity keeps growing, and direct-access YouTube viewing, without going through a general-purpose browser, can be a compelling, valuable feature for a set-top box, media player, or other device.

A typical YouTube-for-embedded application needs to perform the following operations:

  • Interface with the YouTube site through a network module to search for and download content
  • Run a filter to select key text and graphical information
  • Download the graphics as well as the clips
  • Decode Flash video: FLV file processing, demuxing, video decoding, audio decoding, and A/V synchronization
  • Render the graphics and video to the screen as appropriate for each target device

Figure 1: YouTube-for-embedded-application block diagram

Digital media elements
The Digital Media Elements (DME) architecture takes a systematic approach to handling concurrent real-time image processing, audio and video compression and/or decompression, audio and video synchronization, network transmission, and the graphic user interface. The software is organized hierarchically so that complex applications can be developed, integrated and tested systematically. DME technology allows us to quickly and efficiently provide highly integrated solutions for our customers.

From a functionality point of view, DME is organized into four libraries.

  • Media Application Library (MapLib)
  • Media Signal Processing Library (MspLib)
  • Network Library (NetLib)
  • Graphic User Interface Library (GuiLib)

These are described in more detail next.

Implementation on TI platform
For the current version, we implemented DME on Texas Instruments' TMS320DM6446 (DaVinci) platform, a digital video/audio SoC with one 297 MHz ARM926EJ-S core, one 594 MHz C64x+ DSP core, and rich video/audio peripherals.

Because the platform is built on TI's DaVinci digital media technology, the code is easy to port to other DaVinci-family SoCs, such as the DaVinci HD DM6467, or to the OMAP series.

DME provides a media framework, allowing seamless integration of xDM codecs, and it reduces time-to-market by reusing existing software components.

All four DME libraries are implemented in C, and decoding runs on the DSP core using highly optimized algorithms.

Figure 2: Digital Media Elements architecture on TI's TMS320DM6446.

The current version of DME supports the FLV and MP3 media formats, which are used in most YouTube content. MspLib is TI xDM compliant, so it is easy to integrate other media codecs, such as H.264, MPEG4, MPEG2, VP6, VP7 and WMV.
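
To make the FLV processing step more concrete, here is a minimal sketch of parsing one FLV tag header, the 11-byte record that precedes every audio, video or script-data payload in an FLV stream. The field layout follows the published FLV file format; the struct and function names are hypothetical and are not part of DME.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag types defined by the FLV file format. */
enum { FLV_TAG_AUDIO = 8, FLV_TAG_VIDEO = 9, FLV_TAG_SCRIPT = 18 };

/* Hypothetical struct for this sketch; not a DME type. */
typedef struct {
    uint8_t  type;       /* 8 = audio, 9 = video, 18 = script data */
    uint32_t data_size;  /* payload bytes that follow the 11-byte header */
    uint32_t timestamp;  /* milliseconds, 24-bit value plus 8-bit extension */
} FlvTagHeader;

/* Read a 24-bit big-endian integer. */
static uint32_t read_be24(const uint8_t *p)
{
    return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | (uint32_t)p[2];
}

/* Parse one FLV tag header. Returns 0 on success, -1 on malformed input. */
static int flv_parse_tag_header(const uint8_t *buf, size_t len,
                                FlvTagHeader *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 11)
        return -1;

    out->type = buf[0] & 0x1F;               /* low 5 bits hold the type */
    if (out->type != FLV_TAG_AUDIO && out->type != FLV_TAG_VIDEO &&
        out->type != FLV_TAG_SCRIPT)
        return -1;

    out->data_size = read_be24(buf + 1);
    /* Byte 7 extends the 24-bit timestamp with its upper 8 bits. */
    out->timestamp = read_be24(buf + 4) | ((uint32_t)buf[7] << 24);

    if (read_be24(buf + 8) != 0)             /* StreamID is always 0 */
        return -1;
    return 0;
}
```

A demuxer built on this would route the following data_size bytes to the audio or video decoder according to the tag type, then skip the 4-byte previous-tag-size field before reading the next header.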

How it works
NetLib is middleware that handles querying and searching for video/audio content on the Internet, as well as real-time streaming to download that content. Streaming is carried over HTTP.
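
As a rough illustration of what streaming over HTTP involves at the lowest level, the sketch below formats the request a client would send to fetch a clip. It is not NetLib code: the function name, the host and path in the usage note, and the use of a Range header to resume a download part-way through are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build an HTTP/1.1 GET request in buf (hypothetical helper).
 * If offset > 0, a Range header asks the server to start at that byte.
 * Returns the request length, or -1 if it does not fit in cap bytes.
 */
static int build_http_get(char *buf, size_t cap, const char *host,
                          const char *path, unsigned long offset)
{
    int n;

    assert(buf != NULL && host != NULL && path != NULL);
    if (offset > 0)
        n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Range: bytes=%lu-\r\n"
                     "Connection: close\r\n\r\n",
                     path, host, offset);
    else
        n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n\r\n",
                     path, host);

    if (n < 0 || (size_t)n >= cap)
        return -1;  /* encoding error or truncated output */
    return n;
}
```

For example, build_http_get(req, sizeof req, "example.com", "/clip.flv", 0) produces a plain request for the whole file; the bytes would then be written to a connected TCP socket and the response body fed to the player as it arrives.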

Below is a diagram that describes the NetLib process:

Figure 3: NetLib flow chart.

  • Create a NETLIB_Handler instance using NETLIB_handlerInit().
  • Perform a search using NETLIB_search() or NETLIB_searchByIndex().
  • The handler holds the result set. Repeating the previous step performs a new search, after which the handler holds the new result set.
  • Read the returned information from the handler.
  • Destroy the handler using NETLIB_handlerDestroy().

NetLib includes the following APIs:

  • NETLIB_Handler NETLIB_handlerInit (void);
    Initialize the NetLib handler, which manages all further operations.
  • int NETLIB_handlerDestroy ( NETLIB_Handler);
    Destroy the handler once it is no longer needed.
  • int NETLIB_search (NETLIB_Handler handler,
    char* keyword,
    NETLIB_QueryType queryType,
    unsigned int maxCount );
    Search for videos matching the given keyword and return no more than maxCount results. Several query types are supported, including TOP10, VIDEO, and CATEGORY.
  • int NETLIB_searchByIndex(NETLIB_Handler handler,
    char* keywords,
    NETLIB_QueryType queryType,
    Int startIndex,
    Int endIndex);
    Search for videos by index, given keywords, a query type, a start index and an end index. This function is useful when search results are paged.
  • int NETLIB_readContent(NETLIB_Handler handler,
    int index,
    NETLIB_VideoDetail* videoDetail);
    Read a specific entry from the handler by index. The index is the entry's position in the result array, counting from 1.
  • int NETLIB_getCount(NETLIB_Handler handler);
    Obtain the size of the result set held by the handler.
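
The call sequence and APIs above can be sketched as follows. NetLib itself is not publicly available, so the top half of the block supplies stand-in stubs so the flow can be compiled and run. The stub bodies, the NETLIB_QueryType enumerator names, the single title field in NETLIB_VideoDetail, and the convention that 0 means success are all assumptions for illustration. Only the function names and parameter lists come from the API list.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* ---- Stand-in stubs (assumed; the real NetLib is not public) ---- */
typedef enum { NETLIB_QUERY_TOP10, NETLIB_QUERY_VIDEO,
               NETLIB_QUERY_CATEGORY } NETLIB_QueryType;   /* assumed names */
typedef struct { char title[64]; } NETLIB_VideoDetail;     /* assumed field */
typedef struct NETLIB_HandlerObj { int count; char keyword[32]; }
        *NETLIB_Handler;

NETLIB_Handler NETLIB_handlerInit(void)
{
    return calloc(1, sizeof(struct NETLIB_HandlerObj));
}

int NETLIB_handlerDestroy(NETLIB_Handler h) { free(h); return 0; }

int NETLIB_search(NETLIB_Handler h, char *keyword,
                  NETLIB_QueryType queryType, unsigned int maxCount)
{
    (void)queryType;
    if (h == NULL || keyword == NULL)
        return -1;
    h->count = maxCount < 5 ? (int)maxCount : 5;  /* pretend 5 matches exist */
    snprintf(h->keyword, sizeof h->keyword, "%s", keyword);
    return 0;
}

int NETLIB_getCount(NETLIB_Handler h) { return h ? h->count : -1; }

int NETLIB_readContent(NETLIB_Handler h, int index, NETLIB_VideoDetail *d)
{
    if (h == NULL || d == NULL || index < 1 || index > h->count)
        return -1;                                /* index is 1-based */
    snprintf(d->title, sizeof d->title, "%s #%d", h->keyword, index);
    return 0;
}

/* ---- The flow: init, search, read results, destroy ---- */
/* Copies up to cap results into out; returns entries read, or -1 on error. */
int list_titles(char *keyword, NETLIB_VideoDetail *out, int cap)
{
    NETLIB_Handler h;
    int n = -1;

    if (out == NULL || cap <= 0)
        return -1;
    h = NETLIB_handlerInit();
    if (h == NULL)
        return -1;

    if (NETLIB_search(h, keyword, NETLIB_QUERY_VIDEO, (unsigned int)cap) == 0) {
        int count = NETLIB_getCount(h);
        for (n = 0; n < count && n < cap; n++)
            if (NETLIB_readContent(h, n + 1, &out[n]) != 0)
                break;
    }
    NETLIB_handlerDestroy(h);
    return n;
}
```

Note the n + 1 when reading: the result array is indexed from 1, so a zero-based loop counter has to be shifted before it is passed to NETLIB_readContent().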

MapLib, codecs, GuiLib
MapLib is a player responsible for processing the streaming media data, including:

  • Audio/Video demuxing
  • Audio/Video decoding
  • Audio/Video synchronizing and rendering

From the player's point of view, the audio and video formats are transparent, so it can easily switch between different formats (e.g., FLV, H.264, MPEG4 or MPEG2).
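
Of the three player tasks, synchronization is the least obvious, so here is a minimal sketch of one common approach: treat the audio playback position as the master clock and decide, for each decoded video frame, whether to show it, hold it, or drop it. This is a generic technique, not a description of MapLib's actual algorithm; the names and the tolerance parameter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { SYNC_RENDER, SYNC_WAIT, SYNC_DROP } SyncAction;

/*
 * Compare a video frame's presentation timestamp with the audio clock.
 * All times are in milliseconds. tolerance_ms is how far the two may
 * drift apart before the player corrects (an assumed tuning knob).
 */
static SyncAction av_sync_decide(int64_t video_pts_ms,
                                 int64_t audio_clock_ms,
                                 int64_t tolerance_ms)
{
    int64_t diff = video_pts_ms - audio_clock_ms;

    assert(tolerance_ms >= 0);
    if (diff > tolerance_ms)
        return SYNC_WAIT;    /* frame is early: hold it until audio catches up */
    if (diff < -tolerance_ms)
        return SYNC_DROP;    /* frame is late: skip it to catch up with audio */
    return SYNC_RENDER;      /* close enough: display now */
}
```

Using audio as the master is a common choice because gaps or speed changes in audio are far more noticeable than an occasional dropped or delayed video frame.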

Figure 4: MapLib flow chart.

In the current implementation, MspLib follows the TI xDM standard, so all of TI's xDM codecs, as well as third-party codecs that are xDM compliant, can be integrated into MspLib. The available codecs include:

Table 1: Video, image, audio codecs.

Under the xDM standard, all codecs expose similar interfaces built around four APIs. For example, a video decoder such as the FLV decoder codec uses:

  • VIDDEC_create(Engine_Handle handle, String name, VIDDEC_Params* params);
    Creates an instance of the video decoder algorithm.
  • VIDDEC_control(VIDDEC_Handle, VIDDEC_Cmd id,
    VIDDEC_DynamicParams* params,
    VIDDEC_Status* status);
    Executes the “control” method in this instance of the video decoder algorithm.
  • VIDDEC_process(VIDDEC_Handle handle, XDM_BufDesc* inBufs, XDM_BufDesc* outBufs,
    VIDDEC_InArgs* inArgs, VIDDEC_OutArgs* outArgs);
    Executes the “process” method in this instance of the video decoder algorithm.
  • VIDDEC_delete(VIDDEC_Handle handle);
    Deletes the instance of the video decoder algorithm.
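
The value of a uniform create/control/process/delete interface is that player code never needs to know which codec it is driving. The sketch below illustrates that idea with a small table of function pointers and two toy "codecs". It deliberately does not use TI's xDM or Codec Engine types: every name here (CodecOps, decode_once, the "copy" and "invert" codecs) is hypothetical, and the toy processing stands in for real decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct Codec { unsigned frames; } Codec;   /* per-instance state */

/* Hypothetical uniform interface mirroring the four-call pattern. */
typedef struct {
    const char *name;
    Codec *(*create)(void);
    int    (*control)(Codec *c, int cmd, unsigned *value);
    int    (*process)(Codec *c, const unsigned char *in, size_t len,
                      unsigned char *out, size_t cap);
    void   (*destroy)(Codec *c);
} CodecOps;

enum { CODEC_CMD_GET_FRAMES = 0 };

static Codec *common_create(void) { return calloc(1, sizeof(Codec)); }
static void common_destroy(Codec *c) { free(c); }

static int common_control(Codec *c, int cmd, unsigned *value)
{
    if (cmd != CODEC_CMD_GET_FRAMES || value == NULL)
        return -1;
    *value = c->frames;                 /* report frames processed so far */
    return 0;
}

static int copy_process(Codec *c, const unsigned char *in, size_t len,
                        unsigned char *out, size_t cap)
{
    if (len > cap)
        return -1;
    memcpy(out, in, len);
    c->frames++;
    return (int)len;
}

static int invert_process(Codec *c, const unsigned char *in, size_t len,
                          unsigned char *out, size_t cap)
{
    size_t i;
    if (len > cap)
        return -1;
    for (i = 0; i < len; i++)
        out[i] = (unsigned char)~in[i];
    c->frames++;
    return (int)len;
}

static const CodecOps codec_table[] = {
    { "copy",   common_create, common_control, copy_process,   common_destroy },
    { "invert", common_create, common_control, invert_process, common_destroy },
};

static const CodecOps *codec_find(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof codec_table / sizeof codec_table[0]; i++)
        if (strcmp(codec_table[i].name, name) == 0)
            return &codec_table[i];
    return NULL;
}

/* Player-side code: touches only CodecOps, never a specific codec. */
static int decode_once(const char *name, const unsigned char *in, size_t len,
                       unsigned char *out, size_t cap)
{
    const CodecOps *ops = codec_find(name);
    unsigned frames = 0;
    Codec *c;
    int n;

    if (ops == NULL || (c = ops->create()) == NULL)
        return -1;
    n = ops->process(c, in, len, out, cap);
    if (n >= 0 && ops->control(c, CODEC_CMD_GET_FRAMES, &frames) != 0)
        n = -1;
    ops->destroy(c);
    return n;
}
```

Adding a format then means adding one row to the table; decode_once() and everything above it in the player stay unchanged.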

GuiLib is based on DirectFB. DirectFB is a thin library that provides hardware graphics acceleration, input device handling and abstraction, and an integrated windowing system with support for translucent windows and multiple display layers on top of the Linux framebuffer device.

In the current version on the DaVinci DM6446, we perform all graphics operations on DaVinci's OSD (On-Screen Display) engine, while the video layer is kept separate from the graphics (OSD) layer.

Input device
We implemented an MSP430 driver for the infrared remote control as an input device, giving users a simple way to control the GUI, and we provide a software keyboard for advanced users who need more interactive operations.

  • int msp430lib_init(void);
  • int msp430lib_get_rtc(int *year, int *month, int *day, int *hour, int *minute, int *second);
  • int msp430lib_set_rtc(int year, int month, int day, int hour, int minute, int second);
  • int msp430lib_get_ir_key(enum msp430lib_keycode *key);
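
To show how remote-control keys might drive the GUI, here is a minimal sketch of a two-level menu state that reacts to key presses. The article does not list the values of enum msp430lib_keycode, so the key codes below are hypothetical stand-ins, as are the MenuState struct and the two-level menu layout. In a real application each key would come from msp430lib_get_ir_key().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key codes; the real msp430lib_keycode values are not given. */
typedef enum { DEMO_KEY_UP, DEMO_KEY_DOWN, DEMO_KEY_OK, DEMO_KEY_BACK } DemoKey;

/* Hypothetical menu state: which level is shown and which item is selected. */
typedef struct {
    int level;       /* 1 = top-level menu, 2 = submenu */
    int index;       /* selected item, 0-based */
    int item_count;  /* items in the current menu, must be > 0 */
} MenuState;

/* Update the menu state for one key press. */
static void menu_handle_key(MenuState *m, DemoKey key)
{
    assert(m != NULL && m->item_count > 0);
    switch (key) {
    case DEMO_KEY_UP:    /* move up, wrapping from the first item to the last */
        m->index = (m->index + m->item_count - 1) % m->item_count;
        break;
    case DEMO_KEY_DOWN:  /* move down, wrapping from the last item to the first */
        m->index = (m->index + 1) % m->item_count;
        break;
    case DEMO_KEY_OK:    /* enter the submenu if not already there */
        if (m->level < 2) { m->level++; m->index = 0; }
        break;
    case DEMO_KEY_BACK:  /* return to the top-level menu */
        if (m->level > 1) { m->level--; m->index = 0; }
        break;
    }
}
```

After each call, the application would redraw whichever menu level changed.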

Image / fonts rendering
Finally, here are the APIs for image and font rendering, and for windows and menu management.

  • int image_render(IDirectFB* dfb,IDirectFBSurface* primary,char* image,DFBRectangle* rect);
  • int string_render(IDirectFB *dfb,IDirectFBSurface* surface,IDirectFBFont* font,char* text,DFBRectangle *rect);

Windows / menu management

  • void set_background_image(char * imgName, IDirectFB *dfb,IDirectFBDisplayLayer *layer);
  • int display_string_to_window(IDirectFB *dfb,IDirectFBWindow *window, const char * text ,int x,int y, DFBSurfaceTextFlags flags);
  • void youtube_visiable_window(IDirectFBWindow * window);
  • void youtube_invisiable_window(IDirectFBWindow * window);
  • IDirectFBWindow* create_window_backgroud(IDirectFB* dfb,IDirectFBDisplayLayer* layer,char* path,DFBRectangle* rect);
  • IDirectFBWindow* create_window(IDirectFB* dfb,IDirectFBDisplayLayer* layer,DFBRectangle* rect);
  • void Level1_update(IDirectFB *dfb,IDirectFBDisplayLayer *layer, int type);
  • void Level2_update(IDirectFB *dfb,IDirectFBDisplayLayer *layer, int type);
  • void exit_topmenu();
  • void set_image(IDirectFB *dfb,IDirectFBDisplayLayer *layer);
  • void adShow(int lvl_two_index,int lvl_three_index,IDirectFB *dfb);
  • void adNotShow();
  • void set_ad(IDirectFB *dfb,IDirectFBDisplayLayer *layer);
  • void set_select(IDirectFB *dfb,IDirectFBDisplayLayer *layer, char *file);
  • void set_selOrig(IDirectFB *dfb,IDirectFBDisplayLayer *layer, char *file);
  • void selShow();
  • void selNotShow();
  • void set_category(IDirectFB *dfb,IDirectFBDisplayLayer *layer);

About the author
Karl Zhao, PhD, is the founder and CEO of DigiLink Software, Inc. He is the architect of DigiLink's Digital Media Element software. He also developed and maintains close relationships with technology partners such as Texas Instruments and its third-party network, as well as OEM customers such as Motorola, UTStarcom and Pelco. Prior to founding DigiLink, Zhao was with Equator Technologies, Inc., where he participated in Equator's BSP-15 architecture design, established and managed a design center, and managed Equator's video communications segment, securing key design wins with leading manufacturers of video infrastructure equipment, video conferencing and video surveillance. He also held senior design engineer positions at Rockwell Semiconductor Systems (now Conexant) and DSP Software Engineering, Inc. (acquired by Tellabs). Karl earned a Bachelor's degree in Electrical Engineering from Tsinghua University in Beijing, China. He also earned Master's and Doctoral degrees in Electrical Engineering, with a specialization in advanced signal processing technology, both from Northeastern University in Boston, MA, U.S.A. He can be reached at
