Open Embedded: An alternative way to build embedded Linux distributions - Embedded.com

As embedded processors have grown more powerful and feature-rich, the popularity of the Linux operating system in embedded applications has grown in leaps and bounds. Although the fact that Linux is open source and free of licensing fees is one major driver of its popularity, another key driver is the wealth of application software and drivers available as a result of Linux’s widespread usage in the desktop and server arenas.

However, embedded developers cannot simply pick up desktop Linux distributions or applications for use in their systems. Broadly speaking, embedded Linux developers face three main challenges:
     1. assembling a compatible combination of bootloader, kernel, library, application, and development tool components for the processor and peripherals used in their hardware;
     2. correctly cross-building a multi-megabyte image; and
     3. optimizing the various kernel and user-space components to reduce the footprint and associated memory cost.

Solutions to these challenges are far from straightforward, and achieving them requires significant effort and experience from a development team. Commercial embedded Linux vendors offer pre-tested solutions for particular embedded processors, but for developers who prefer a ‘roll your own’ approach to Linux, the Open Embedded (OE) build environment provides a methodology to reliably build customized Linux distributions for many embedded devices. A number of companies have been using OE to build embedded Linux kernel ports along with complete distributions, resulting in an increasing level of support to maintain and enhance key elements of the OE infrastructure.

In addition, a growing number of embedded Linux distributions (such as Angstrom) utilize OE. Although these distributions are not formally part of OE, they add to the attraction of using OE by providing ready-to-run starting points for developers. A final attraction is that some of the newer commercial distributions from companies such as MontaVista and Mentor Graphics are now based on OE. These provide additional tooling and commercially supported distributions.

In this article we present an overview of the key elements of the OE build environment and illustrate how these elements can be applied to build and customize Linux distributions. The Texas Instruments Arago distribution, which is based on the Angstrom distribution, will be used as an example of how to create a new distribution based on OE and the distributions that already use it.

Most of the Arago- or Angstrom-based example scripts shown here have been modified slightly to more concisely demonstrate key concepts of OE. Developers should access the original scripts at the websites listed at the end of the article to view complete real-world implementations.

An Overview of Open Embedded
OE is based on BitBake, a cross-compilation and build tool developed for embedded applications. Developers use BitBake by creating various configuration and recipe files that instruct BitBake on which sources to build and how to build them. OE is essentially a database of these recipe (.bb) and configuration (.conf) files that developers can draw on to cross-compile combinations of components for a variety of embedded platforms.

OE has thousands of recipes to build both individual packages and complete images. A package can be anything from a bootloader through a kernel to a user-space application or set of development tools. The recipe knows where to access the source for a package, how to build it for a particular target, and ensures that a package’s dependencies are all built as well, relieving developers of the need to understand every piece of software required to add a particular capability to their application. OE can create packages in a variety of package formats (tar, rpm, deb, ipk) and can create package feeds for a distribution.

Most OE users will begin by selecting a particular distribution rather than building individual packages. Getting a working combination of packages often requires selecting specific package versions, and an existing distribution has already done this work. Distributions often provide a ‘stable’ build in addition to a ‘latest’ build to avoid the inherent instabilities that come from trying to combine the latest versions of everything.

A key benefit of OE is that it allows software development kit (SDK) generation. While some development teams may prefer to build their complete applications in OE, others may have a dedicated team that builds Linux platforms for application development teams to use. In these circumstances, the Linux platform team can generate a Linux distribution as an SDK that is easily incorporated into the build flow preferred by an application team. As a result, there is no need for application development teams to be OE experts.

A critical aspect of the OE database is that much of it is maintained by developers employed by parties with an interest in ensuring successful Linux distribution builds on embedded devices. This maintenance effort is critical given the amount of change occurring in the Linux kernel and application space.

A Quick Look at BitBake
The build tool developers are typically most familiar with is ‘make’, which is designed to build a single project based on a set of interdependent makefiles. This approach does not scale well to the task of creating a variety of Linux distributions each containing an arbitrary collection of packages (often hundreds of them), many of which are largely independent of each other, for an arbitrary set of platforms.

These limitations have led to the creation of a number of build tools for Linux distributions, such as Buildroot and BitBake. BitBake’s hierarchical recipes enable individual package build instructions to be maintained independently, but the packages themselves are easily aggregated and built in proper order to create large images. Thus it can build an individual package for later incorporation in a binary feed as well as complete images.
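
The idea of building packages "in proper order" can be illustrated with a small dependency-ordering sketch. This is not BitBake's scheduler, only a minimal Python model that uses the standard-library graphlib module (Python 3.9 or later). The dependency table is hypothetical; it borrows package names that appear later in this article purely for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical build-time dependencies: package -> packages it needs first.
DEPENDS = {
    "gstd": {"dbus", "dbus-glib", "gstreamer"},
    "dbus-glib": {"dbus", "glib-2.0"},
    "gstreamer": {"glib-2.0"},
    "dbus": set(),
    "glib-2.0": set(),
}


def build_order(depends):
    """Return a list in which every package appears after its dependencies."""
    return list(TopologicalSorter(depends).static_order())


if __name__ == "__main__":
    for name in build_order(DEPENDS):
        print("build", name)
```

Because each package only declares its own dependencies, the table can grow one entry at a time while the overall order is still derived automatically, which is the property the hierarchical recipes rely on.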

One important aspect of BitBake is that it does not dictate the way an individual package is built. The recipe (or associated class) for a package can specify any build tool, such as the GNU autotools, making it relatively straightforward to import packages into OE.

To address the need to select specific versions of packages that are known to work together and to specify the different embedded targets, BitBake uses configuration files.

BitBake fetches the package sources from the internet via wget (or a Software Configuration Management tool such as Git or svn) using the location information in the recipe (Figure 1 below). It then applies any patches that are specified in the package description, which is a common requirement when cross-compiling packages for an embedded processor. The package collection is specified in the higher-level recipes such as those for images and tasks.

Figure 1 BitBake overview

Since many developers will want to use an existing distribution, BitBake enables a developer to override distribution defaults by placing customized recipes or configuration files earlier in the BBPATH search path. This enables developers to tweak a distribution for their own needs without having to directly modify (and then subsequently maintain custom copies of) the existing distribution files. This approach in OE is called ‘layering’ and each collection of recipes is called an ‘overlay’.
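
The effect of placing files earlier in a search path can be sketched as a first-match lookup. The following Python fragment is only a simplified model of the layering idea, not BitBake code: the overlay names, file names, and contents are all hypothetical, and real BitBake applies additional rules when choosing among recipes that are not modelled here.

```python
# Hypothetical overlays, listed in search-path order (the earliest entry wins).
SEARCH_PATH = [
    ("my-overlay", {
        "conf/local.conf": "local settings",
        "recipes/busybox/busybox.bb": "customized busybox recipe",
    }),
    ("distro-files", {
        "recipes/busybox/busybox.bb": "stock busybox recipe",
        "recipes/curl/curl.bb": "stock curl recipe",
    }),
]


def resolve(relative_name, search_path=SEARCH_PATH):
    """Return (overlay, contents) for the first overlay that provides the file."""
    for overlay, files in search_path:
        if relative_name in files:
            return overlay, files[relative_name]
    raise FileNotFoundError(relative_name)


if __name__ == "__main__":
    print(resolve("recipes/busybox/busybox.bb"))
    print(resolve("recipes/curl/curl.bb"))
```

In this model the customized busybox recipe shadows the stock one without the stock file being edited, while files that the overlay does not provide still resolve to the distribution's copy.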

We’ll now examine some of the different recipe and configuration files to shed a more detailed light on how OE and BitBake work. We will start by looking at the recipe types.


OE Recipe Files
OE recipes can be written in shell script, with possible Python snippets, and are divided into five categories:

  • classes
  • packages
  • tasks
  • images
  • meta

These are hierarchical, with an image being the top-level recipe. The image recipe defines what goes into a particular root file system image. The recipe simply defines a set of tasks required to build the image. A task recipe is a group of related packages required to bring in a block of related features or functionality. For example, a distribution intended for a smart phone might have digital music, digital camera, and contacts book tasks that bring in all the packages needed to provide a particular capability. One reason for this indirection is to abstract sets of packages into tasks, so that tasks can easily be reused in different images.

It is common practice for distributions to have multiple image files to offer a variety of functionality/footprint tradeoffs. For example, Arago includes a base image, console image, digital video demonstration image, and gstreamer image. As can be seen from its image recipe (see Listing #1), the gstreamer image is created by building four different task recipes. The other image recipes are similar, with the base image, for example, simply omitting the console, dvsdk, and gstreamer tasks and having a different image name and root file system size.

Listing #1: Example Image Recipe

# Arago GStreamer image
# gives you an image with DVSDK libs and GStreamer demo

require arago-image.inc
 
COMPATIBLE_MACHINE = "(?!arago)"

# The size of the uncompressed ramdisk is 150MB
ROOTFS_SIZE = "153600"

IMAGE_INSTALL += "\
     task-arago-base \
     task-arago-console \
     task-arago-dvsdk \
     task-arago-gst \
     "
export IMAGE_BASENAME = "arago-gst-image"

Task recipes represent individual aggregations of packages. For example, the arago-base task builds about 15 packages. This recipe (see Listing #2) introduces some standard BitBake variables you will need to become familiar with. PR represents the Package Revision, which is the version number of the package recipe file. PN represents the Package Name, while PV (not used here) represents the Package Version, which is the version of the actual package source files. The tasks do not specify most package versions as these are set in the configuration files.

Listing #2: Example Task Recipe

DESCRIPTION = "Basic task to get a device booting"

LICENSE = "MIT"
PR = "r9"

inherit task

# these can be set in machine config to supply packages needed to get machine booting
MACHINE_ESSENTIAL_EXTRA_RDEPENDS ?= ""
MACHINE_ESSENTIAL_EXTRA_RRECOMMENDS ?= ""

ARAGO_ALSA_BASE = "\
     alsa-lib \
     alsa-utils-aplay \
     "

ARAGO_BASE = "\
     ${ARAGO_ALSA_BASE} \
     ldd \
     mtd-utils \
     curl \
     arago-feed-configs \
     initscript-telnetd \
     devmem2 \
     "

# minimal set of packages - needed to boot
RDEPENDS_${PN} = "\
     base-files \
     base-passwd \
     busybox \
     initscripts \
     modutils-initscripts \
     netbase \
     update-alternatives \
     module-init-tools \
     ${ARAGO_BASE} \
     ${MACHINE_ESSENTIAL_EXTRA_RDEPENDS} \
     "

RRECOMMENDS_${PN} = "\
     ${MACHINE_ESSENTIAL_EXTRA_RRECOMMENDS} \
     "
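
To see how the nested ${...} references in a task recipe resolve to a flat package list, consider the following Python sketch. It is a simplified model written for this article, not BitBake's own parser: the variable table is an abbreviated copy of Listing #2, PN is set by hand, and the helper names expand and package_list are hypothetical. Self-referencing variables are not handled.

```python
import re

# Abbreviated values from Listing #2; PN is filled in manually for the sketch.
VARIABLES = {
    "PN": "task-arago-base",
    "ARAGO_ALSA_BASE": "alsa-lib alsa-utils-aplay",
    "ARAGO_BASE": "${ARAGO_ALSA_BASE} ldd mtd-utils curl",
    "MACHINE_ESSENTIAL_EXTRA_RDEPENDS": "",
    "RDEPENDS_${PN}": "base-files busybox ${ARAGO_BASE} "
                      "${MACHINE_ESSENTIAL_EXTRA_RDEPENDS}",
}

_REFERENCE = re.compile(r"\$\{(\w+)\}")


def expand(text, variables):
    """Recursively replace ${NAME} references; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            return match.group(0)
        return expand(variables[name], variables)
    return _REFERENCE.sub(substitute, text)


def package_list(key, variables):
    """Expand variable names, look up `key`, and split its value into packages."""
    table = {expand(name, variables): value for name, value in variables.items()}
    return expand(table[key], variables).split()


if __name__ == "__main__":
    print(package_list("RDEPENDS_task-arago-base", VARIABLES))
```

The empty machine-specific variable simply contributes nothing to the list, which is why a machine configuration can add boot-essential packages without the task recipe being edited.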

The package recipes address the specific needs of each package. A relatively simple example (see Listing #3) for a GStreamer-based application illustrates some of the key functions of a package recipe. The build mechanism is specified by inheriting the autotools and pkgconfig classes, dependencies are identified, package version and revision numbers are identified, and the location of the package sources is given.

Listing #3: Example Package Recipe

DESCRIPTION = "gstd: a GStreamer-based streaming server"
HOMEPAGE = "http://sourceforge.net/projects/harrier/"
LICENSE = "BSD"
SECTION = "multimedia"
PRIORITY = "optional"

inherit autotools pkgconfig

DEPENDS = "dbus dbus-glib gstreamer"
RDEPENDS_${PN} = "dbus dbus-glib gstreamer gst-plugins-base"
RRECOMMENDS_${PN} = "gstreamer-ti"

SRCREV = "f3e22c93f4fd7ca47d6309b8450788127550ecb9"

PV = "1.0"
PR = "r13"
PR_append = "+gitr${SRCREV}"

SRC_URI = "git://gstd.git.sourceforge.net/gitroot/gstd/gstd;protocol=git "
S = "${WORKDIR}/git"

# We don't want to run autoconf or automake, unless you have
# automake > 1.11 with vala support
do_configure() {
      oe_runconf
}

FILES_${PN} += "${datadir}/dbus-1/*/*.service"
FILES_${PN}-dev += "${datadir}/dbus-1/interfaces/*.xml"

In the package recipe (see Listing #3), there is a line that states ‘inherit autotools pkgconfig’. This line makes use of another recipe type: a BitBake class. Classes have a peer-to-peer relationship with the other recipes rather than a hierarchical one. They are used to factor out common recipe elements that can then be reused through the inherit command. The most frequent use of classes is to inherit functions or portions of recipes for commonly used build tools like the GNU autotools. An example (see Listing #4) is the class used for the pkgconfig tool (pkgconfig.bbclass):

Listing #4: An Example Class Recipe

inherit base

DEPENDS_prepend = "pkgconfig-native "

do_install_prepend () {
    for i in `find ${S}/ -name "*.pc" -type f` ; do
        sed -i -e 's:-L${STAGING_LIBDIR}::g' $i
    done
}

do_stage_append () {
    install -d ${PKG_CONFIG_DIR}
    for pc in `find ${S} -name '*.pc' -type f | grep -v -- '-uninstalled.pc$'`; do
        pcname=`basename $pc`
        cat $pc > ${PKG_CONFIG_DIR}/$pcname
    done
}


OE Configuration Files
Configuration files fall primarily into two classes: machine configuration and distribution (distro) configuration. There is also a local configuration file and a file called bitbake.conf. bitbake.conf is the first file read by BitBake and includes all the other configuration files. In addition, it defines many global variables. It is not recommended to modify bitbake.conf directly; instead, place overrides in either the distro or local configuration. Machine configuration files define the particular boards being targeted. Distribution configuration files define a specific Linux distribution (e.g. a set of specific package versions) for one or more machines.

The distro configuration file is the best place to make global settings that apply to all the images generated from the distribution. OE enables developers to override these settings for particular images or packages, providing flexibility if there are special cases to handle. For example, the local.conf (local configuration) file is usually used to contain user-specific configurations that slightly modify default distro configuration settings. We will review some of the additional configuration settings made in the Arago local.conf file once we have discussed the distro and machine configuration files in further detail.

The distro configuration file specifies a number of basic ‘housekeeping’ parameters (see Listing #5) such as the name of the distro, the directories where sources are downloaded and built packages are stored, and the file system formats of the generated images:

Listing #5: Configuration File Snippet

# For now Arago is not big enough to warrant a separate distribution,
# reuse Angstrom, but set the name to Arago
DISTRO = "angstrom-2008.1"

# Set the distro name and version, since we now produce own SDK
DISTRO_NAME = "Arago"
DISTRO_VERSION = "2010.05"
BUILDNAME = "${DISTRO_NAME} ${DISTRO_VERSION}"

# Use this to specify where BitBake should place the downloaded sources into
DL_DIR = "${SCRATCH}/downloads"

# Put resulting images and packages in deploy directory outside of temp
#DEPLOY_DIR = "${OEBASE}/arago-deploy"
……

# Add the required image file system types below. Valid are
# jffs2, tar(.gz|bz2), cpio(.gz), cramfs, ext2(.gz), ext3(.gz)
# squashfs, squashfs-lzma
IMAGE_FSTYPES = "jffs2 tar.gz ext2.gz"

The distro configuration file also specifies which machines the distro will be built for, although the details of booting Linux for each machine are contained in the machine configuration files. Since the distro configuration is where tool chain versions are specified, supporting a large number of machines tends to cause the file to become more complex. For example, it may be necessary to specify multiple glibc patches or different toolchain versions to accommodate all the different machines. Other architecture-related items that may need to be specified in the distro config file include hardware vs. software floating-point, whether different instruction sets (ARM vs. Thumb) are supported, and various addressing modes that may not work well for some of the packages included in the distro.

Selection of package versions is one of the more important functions commonly found in distro configuration files and helps ensure known compatible versions are used. There are a number of ways in which versions may be selected. If no version is specified, the latest version is selected. Similarly, it is possible to specify a version that is “n revs behind” the most recent release. For packages where a specific version is desired, a default preferred version can be specified for the whole distribution in the distro config file, as illustrated (see Listing #6) in the snippet from the Angstrom distro configuration file.

Listing #6: Distro Configuration File Snippet

ANGSTROM_QT_VERSION ?= "4.6.2"
CE_VERSION ?= "latest"

PREFERRED_VERSION_autoconf = "2.65"
PREFERRED_VERSION_autoconf-native = "2.65"
PREFERRED_VERSION_automake-native = "1.10.3"
PREFERRED_VERSION_busybox       = "1.13.2"
PREFERRED_VERSION_glib-2.0      = "2.24.0"
PREFERRED_VERSION_glib-2.0-native = "2.24.0"
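
The selection rule described above (use the preferred version when one is set, otherwise take the latest) can be modelled in a few lines of Python. This is a simplified sketch rather than BitBake's actual provider logic: the lists of available versions are hypothetical, only the busybox preference is taken from Listing #6, and the “n revs behind” option is not modelled.

```python
import re


def version_key(version):
    """Turn '1.13.2' into (1, 13, 2) so that versions compare numerically."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def select_version(package, available, preferred):
    """Pick the preferred version if one is set, else the highest available."""
    versions = available[package]
    wanted = preferred.get(package)
    if wanted is not None:
        if wanted not in versions:
            raise LookupError(f"no recipe for {package} {wanted}")
        return wanted
    return max(versions, key=version_key)


# Hypothetical recipe versions; the busybox preference comes from Listing #6.
AVAILABLE = {
    "busybox": ["1.13.2", "1.14.3", "1.2.1"],
    "curl": ["7.19.0", "7.9.8"],
}
PREFERRED = {"busybox": "1.13.2"}

if __name__ == "__main__":
    for name in AVAILABLE:
        print(name, select_version(name, AVAILABLE, PREFERRED))
```

Comparing versions as tuples of integers rather than as strings matters here: a plain string comparison would rank 7.9.8 above 7.19.0.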

Another important function of a distribution is providing a set of feeds that enable access to pre-built binary packages. A distribution typically uses the binary feeds to dynamically load new packages at run-time. To add packages in a robust manner, distributions incorporate a package management system. The Angstrom distribution, for example, began with the ipkg package manager (it recently switched to opkg), as this has some advantages for space-constrained embedded applications compared to desktop package managers like dpkg or RPM. The ipk format was originally based on the deb format and is handled by either the ipkg or the opkg package manager, opkg being the newer and preferred one. The package format results in additional metadata being stored in the package. This ensures incompatible packages are not loaded and that the correct run-time dependencies will be brought in as well. Arago defines its own feeds of pre-built IPKs (see Listing #7), but it may not have the widest variety of packages. In cases where more packages are needed, the Angstrom feeds can be enabled, although caution is required when mixing different feeds.

Listing #7: Distro Configuration File Snippet

ANGSTROM_PKG_FORMAT ?= "ipk"
require conf/distro/include/angstrom-package${ANGSTROM_PKG_FORMAT}.inc
# Use this variable to select which recipe you want to use to
# get feed configs (/etc/ipkg/, /etc/apt/sources.list). Useful
# for derivative distros and local testing
ANGSTROM_FEED_CONFIGS = "arago-feed-configs"

# Feed configuration
ARAGO_URI = "http://feeds.arago-project.org"
ARAGO_FEED_BASEPATH = "feeds/live/${ANGSTROM_PKG_FORMAT}"
DISTRO_FEED_URI = "${ARAGO_URI}/${ARAGO_FEED_BASEPATH}"

Many other general build parameters may be set in the distro config file. For example, the developer can specify that builds with additional debug or profiling information are done as a standard procedure. Such lines can be commented out or overridden by a “production code” image recipe to improve performance in the final product:

Listing #8: Distro Configuration File Snippet

# Comment these two out if you want BitBake to build 
# production images.
DEBUG_BUILD = "1"
INHIBIT_PACKAGE_STRIP = "1"

# Build a package such that you can use gprof to profile it.
PROFILE_OPTIMIZATION = "-pg"
SELECTED_OPTIMIZATION = "${PROFILE_OPTIMIZATION}"
LDFLAGS =+ "-pg"

Machine configuration files define the basics needed to boot Linux on the board. For example, the target CPU, such as ARM926 or ARM Cortex-A8, is defined along with the preferred recipe provider for the Linux kernel and the appropriate bootloader, as in Listing #9, which shows the DM365 machine configuration file that is used for an ARM9-based video device from TI.

Listing #9: Machine Configuration File

#@TYPE: Machine
#@NAME: DM365 CPUs on a Davinci DM365 EVM board
#@DESCRIPTION: Machine configuration for the TI Davinci DM365 EVM board

require conf/machine/include/dm365.inc
require conf/machine/include/tune-arm926ejs.inc

# Increase this every time you change something in the kernel
MACHINE_KERNEL_PR = "r45"

TARGET_ARCH = "arm"

KERNEL_IMAGETYPE = "uImage"

PREFERRED_PROVIDER_virtual/kernel = "linux-davinci-staging"

PREFERRED_PROVIDER_virtual/bootloader = "u-boot"
UBOOT_MACHINE = "davinci_dm365_evm_config"

UBOOT_ENTRYPOINT = "0x80008000"
UBOOT_LOADADDRESS = "0x80008000"

EXTRA_IMAGEDEPENDS += "u-boot"
SERIAL_CONSOLE ?= "115200 ttyS0"

EXTRA_IMAGECMD_jffs2 = "--pad --little-endian --eraseblock=0x20000 -n"

#ROOT_FLASH_SIZE = "29"

MACHINE_FEATURES = "kernel26 serial ethernet usbhost usbgadget mmc alsa"

Machine configuration files will be needed for each board. Since configuration steps for different boards using the same or similar devices will often be the same, in practice the common steps can be coalesced into include files. This simplifies creating new machine configuration files in an error-free and easy-to-maintain manner. In the above example, we ‘exploded’ some of the include files to give a more meaningful representation of what the full machine configuration file would look like.


Customizing a distribution for a specific application
While Linux applications will always have a significant footprint, the needs of any individual application will be significantly less than what is provided by the desktop-like distributions that are frequently the default for embedded distributions. As a result, developers will often need to minimize memory footprint or perform some other customization.

Another major challenge to embedded developers is the pace of change in the Linux software world. Although it is preferable to avoid falling too far behind the latest kernel releases since this creates challenges in backporting patches, at some point developers will need to lock down toolchain, kernel, and application package versions to go through test and get to production. This may involve creating some custom recipes or configuration files for a distribution picking up different component versions. Before customizing a pre-existing distribution, some time should be spent looking at what ‘canned’ options the distribution might offer. For example, Angstrom offers stable and development branches and builds with different footprints. If such options do not meet the needs for a particular application, then some customization is required.

The need for footprint reduction and scalability was one issue that drove development of Arago by TI. Another issue was corporate legal concerns about shipping GPLv3-licensed software and any encryption software, which would require additional compliance activities related to export control regulations. We will discuss some of the recipe and configuration files introduced in Arago to illustrate how a custom distribution can be derived from an existing distribution.

To address footprint and scalability, Arago simply created its own set of image and task recipes, some of which we have discussed already. For many applications, doing this may suffice. For Arago, it was necessary to modify some of the package recipes themselves. While a number of the modifications initially made in Arago were to fix generic bugs in the recipes that could be pushed upstream to the standard OE and Angstrom files, avoiding GPLv3 or encryption required changes that were not appropriate for pushing upstream. For example, since SSH includes encryption, we had to remove it from the default distribution. This had a knock-on effect on the busybox recipe, which needed a standalone telnet daemon to be enabled for remote shell access through its default configuration “defconfig” file. Since modified recipes need to be subsequently maintained to track changes in the mainstream ones, this approach is recommended only when there is no other alternative.

Eliminating GPLv3 content was achieved by forcing selection of gdb and gdbserver versions that were released prior to the introduction of the GPLv3 license or any patches released under that license. In addition to selecting specific GDB versions, Arago chose to use pre-built versions of the CodeSourcery toolchain. This reduced initial build times for users since there was no longer a reason to build the tools from source.

Rather than modifying the Angstrom configuration or recipe files, the approach chosen was to customize GCC selection for Arago in the local configuration file. This file can be used to override the preferred versions in the distro configuration file. For example, Listing #10, taken from the sample Arago local configuration file, sets new preferred versions of various gdb-related components and disables inclusion of SSH, to avoid the presence of encryption software. Note that using pre-built GCC binaries rather than building from source required adding and fixing support for external toolchains in OE, beyond just selecting a particular version.

Listing #10: Local Configuration File Snippet

# Set some preferences
PREFERRED_PROVIDER_update-alternatives-cworth = "update-alternatives-cworth"
PREFERRED_PROVIDER_ncurses-tools = "ncurses"
PREFERRED_PROVIDER_gdbserver = "gdbserver"
PREFERRED_VERSION_gdbserver = "6.6"
PREFERRED_VERSION_gdb = "6.6"
PREFERRED_VERSION_gdb-cross-sdk = "6.6"
PREFERRED_PROVIDER_libopkg-dev = "opkg-nogpg"
# Disable DropBear for now due to export restrictions
DISTRO_SSH_DAEMON = ""

Although the local configuration file was the primary customization method chosen in Arago, developers can also insert their own recipes using the BitBake search path and override some of the choices in the pre-existing distribution.

OE drawbacks
While OE is a powerful tool, it has a considerable learning curve and the source of build errors may not be obvious. Development teams that are unfamiliar with OE should allow at least several weeks to attain moderate proficiency. Therefore, unless there is access to a developer with previous OE experience, switching to OE during an existing project or using OE for the first time on projects with high-pressure schedules may be ill-advised.

The new commercial tools being introduced are intended to address some of the ease-of-use issues but obviously require licensing fees. The Narcissus project also provides an easy-to-use web-based front-end for building Angstrom-based root file systems for embedded devices.

A second drawback can be build times. OE builds everything from source the first time a developer builds a distribution with it, including the toolchain and a plethora of host tools. As a result, initial builds are extremely time-consuming and also require machines with at least a gigabyte of RAM. Obviously, subsequent builds are significantly faster as many components such as the toolchain will not need to be rebuilt. If the time required for incremental builds is still unacceptable, a good option is to use OE to produce an SDK that can be easily used with other build tools.

Conclusion
OE offers a way to create embedded Linux distributions that draw on a large database of pre-tested package build recipes. This can greatly shorten the time required to create a new distribution. In addition, there are a number of pre-existing distributions that can serve as starting points for developers to create their own distributions. OE is powerful and very flexible, enabling developers to tailor distributions to their specific application needs. The flip side of this power and flexibility is a significant learning curve. As a result, it is best to develop expertise with OE outside of a committed project schedule environment.

Useful Links
For further information on the TI Arago distribution and to review or download the full selection of its recipes and configuration files, go to arago-project.org. Information on the Angstrom distribution is available at: angstrom-distribution.org/. To access a web-based tool for building Angstrom-based root file systems, go to angstrom-distribution.org/narcissus/. The wiki for OpenEmbedded may be found at wiki.openembedded.net. A free introductory course on OpenEmbedded can be accessed at free-electrons.com where you should scroll down to the Embedded Linux System Development section.

Nick Lethaby is the operating system product manager at Texas Instruments, where he is responsible for product requirements definition for the DSP/BIOS(TM) real-time operating system and multimedia SDKs. Lethaby has over 20 years of applications engineering, product management and marketing experience in embedded systems development tools. He has worked in various other product management and marketing positions since he graduated from the University of London with a bachelor of science in computer science.

Denys Dmytriyenko is a Linux and open source technologist at Texas Instruments whose job is to work with TI internal teams and external open source communities. He has been with the company for nine years. He received his MS in Computer Engineering from the Chernigov State Technological University in Ukraine.
