HTTP not FTP for Simpler Internet File Transfers - Embedded.com

HTTP not FTP for Simpler Internet File Transfers

When the world is filled with Internet appliances, conventional file retrieval methods will no longer work. These appliances will be making millions of requests for trivial data such as:

What TV program is on TLC at 8:00PM?
What is the temperature and wind speed in the western suburbs of Boston?
Who has the lowest price for Kenya AA coffee?

Even within an intranet, with far fewer users than the Internet, one will see requests for secure door access number confirmations, temperature and wind speed around buildings, and the number of individuals in certain rooms. The replies will usually be sent as short files or records and sourced from many different file servers.

The Internet appliances requesting this kind of information may be simple keypad-like devices or wall-mounted thermostats. And as the number of these appliances increases, both Internet and LAN users will experience longer delays in accessing servers using today's methods of file acquisition.

One of the issues limiting today's Internet information management is decentralization, which is due in part to the different application protocols, the number of file servers, and the variety of storage formats. The types of information being shared on the Internet (text, images, sound, video, and multimedia) contribute to the limitations as well. Even within a single type, there are many different formats. For digitally encoded images, one has the choice of gif, jpeg, tif, pict, and about a dozen others.

On the other hand, Internet appliances will be low-cost, single-function information clients that usually will use a single format. Converting the data into the appropriate format may therefore be up to the file server. Web browsers (another kind of information client), by contrast, have the protocols in their software to make the appropriate conversions.

Application protocols for file or information transfer that are based on Internet protocols include:

  • FTP (File Transfer Protocol), a simple protocol for copying files from one computer to another.
  • HTTP (Hypertext Transfer Protocol), a protocol for requesting and delivering hypertext multimedia information.
  • SMTP (Simple Mail Transfer Protocol), the protocol that delivers email from one system to another.

Unfortunately, most of these information transfer protocols are either overkill or inappropriate for the number of Internet appliances that will be requesting small amounts of information.

Information Integration

For Internet appliances to work efficiently, the information or file servers will have to provide the information in an integrated but extendable format. Currently, two types of data transfers dominate Internet traffic: Web and email. While most email servers are dedicated to the delivery of text information with the occasional attached binary file, Web servers are more abundant and able to handle many types of files. Therefore, Web servers will probably be the information servers to Internet appliances.

The key to the power of Web technology is information integration. The Web provides three distinct forms:

  1. Linking data provided by multiple servers and presenting it transparently to Web clients (usually Web browsers). The address of each data item on the Web is defined by a Uniform Resource Locator (URL), which provides a degree of location-independence for data stored by Web servers.
  2. Providing clients with data from diverse sources. Web servers and Web clients (browsers) support this form of integration in different ways.

    Web browsers integrate multiple sources of data by supporting several Internet data-access protocols in addition to HTTP, the protocol used to retrieve pages stored on Web servers via the Web. For instance, in a browser a URL can specify FTP as well as HTTP, and by doing so can transfer files from any source accessible on the Internet. However, if Internet appliances become Web clients, it is unlikely they will support more than one protocol due to their size and cost limitations.

    Web servers integrate multiple sources of data by invoking applications that use the Common Gateway Interface (CGI) API to execute in response to client requests. CGI can be used to link Web-oriented applications to traditional applications such as data-access routines. Use of CGI offers great flexibility. For example, a procedure or function accessed via CGI may return different results, depending on the capabilities of the requesting Web client.

  3. Most importantly, the Web encompasses new types of data. HTTP borrows a design for extensible data typing and type negotiation from the Multipurpose Internet Mail Extensions (MIME) standard. Browsers are designed to support new data types via helper applications that a user can add to the browser. Web browsers use this technique to deliver audio, video, and PostScript data to users. In this way, the Web is prepared for whatever new data types become important in the future.
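As a rough illustration of MIME-style type negotiation, and of a server-side routine returning different results depending on the requesting client, the sketch below picks a representation from the client's Accept header. The renderer table, the function names, and the temperature example are assumptions made for illustration; the handling is simplified (partial wildcards such as text/* are ignored) and is not a complete implementation of HTTP content negotiation.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so header order is kept among equal q values.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_representation(accept_header, available):
    """Return the first available media type the client accepts, else None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
    return None


# Hypothetical representations of a single temperature reading.
RENDERERS = {
    "text/html": lambda t: "<p>Temperature: %d F</p>" % t,
    "text/plain": lambda t: "%d" % t,
}


def render(accept_header, temperature):
    """Return (media_type, body), or (None, None) if nothing is acceptable."""
    chosen = choose_representation(accept_header, list(RENDERERS))
    if chosen is None:
        return None, None
    return chosen, RENDERERS[chosen](temperature)
```

A browser sending "text/html,text/plain;q=0.5" would get the HTML form, while a single-format appliance sending only "text/plain" would get the bare number from the same routine.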

Web Clients

Because the majority of information servers also act as Web servers, the upcoming Internet appliances will be clients via HTTP. Today a Web client is a Web browser, which is not compatible with evolving Internet appliance clients. The Web browser is a large system with more functionality than would ever be required in the new single-function Internet appliance. The Web browser has as its primary function an interactive man-machine interface that can be all things to all users and all servers.

The single-function Internet appliances will most likely communicate with few or even single servers and be the interfaces for single machine functions. The servers they communicate with will obtain information from many other servers, but the appliance clients will not be required to communicate directly with those other servers.

Figure 1: Internet Appliance connections to the Internet of Information Servers through Thin Servers

The new Web client will provide a unique man-machine interface (MMI) that may be a light indicator, a few lines of text, or a building's floor plan. It will always have the same MMI and probably will be either hardwired or have its software in ROM. The type of file it will retrieve could be a hypertext page such as an HTML-coded text page or just a fixed-length record. Seldom will these devices need to download long or multiple files.
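To make the idea of a fixed-length record concrete, here is a minimal sketch using Python's standard struct module. The 8-byte layout, the field names, and the units are hypothetical choices for illustration only; the article does not define a record format.

```python
import struct

# Hypothetical 8-byte record layout (an assumption, not from the article):
#   4 bytes  station id, ASCII, space padded
#   2 bytes  temperature in tenths of a degree, signed, big-endian
#   1 byte   wind speed, unsigned
#   1 byte   flags, unsigned
RECORD = struct.Struct("!4shBB")


def encode_record(station, temp_tenths, wind, flags=0):
    """Pack one reading into exactly RECORD.size bytes."""
    station_bytes = station.encode("ascii").ljust(4)  # pad with spaces
    return RECORD.pack(station_bytes, temp_tenths, wind, flags)


def decode_record(data):
    """Unpack one record; reject anything that is not the fixed length."""
    if len(data) != RECORD.size:
        raise ValueError("expected %d bytes, got %d" % (RECORD.size, len(data)))
    station, temp_tenths, wind, flags = RECORD.unpack(data)
    return {
        "station": station.decode("ascii").rstrip(),
        "temperature": temp_tenths / 10.0,
        "wind": wind,
        "flags": flags,
    }
```

Because every record has the same size and field offsets, a ROM-based client could decode it with a handful of fixed reads and no general-purpose parser.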

HTTP

Tim Berners-Lee created HTTP, the protocol at the core of the Web, at the CERN research institute in Switzerland as a way for scientists to share research papers. It was designed to be simple and extensible, but it was also designed for delivering static content, which is the origin of its inefficiency as a protocol for transaction processing. The model is simple: the client requests a document, the server returns it, and the connection is closed. Most Internet appliances follow this paradigm.
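The request, response, close cycle described above can be sketched in a few lines. This is a minimal illustration assuming an HTTP/1.0-style exchange over a plain TCP socket; the function names are hypothetical, and error handling, redirects, and chunked encoding are deliberately left out.

```python
import socket


def build_request(host, path):
    """Build a minimal HTTP/1.0 GET request as bytes."""
    lines = [
        "GET %s HTTP/1.0" % path,
        "Host: %s" % host,
        "",  # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw HTTP response into (status_code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like: HTTP/1.0 200 OK
    status_code = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


def fetch(host, path, port=80, timeout=5.0):
    """One request, one response, then the connection is closed."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection: response complete
                break
            chunks.append(chunk)
    return parse_response(b"".join(chunks))
```

The client holds no state between calls to fetch; each call opens a connection, reads until the server closes it, and returns the parsed result.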

HTTP was intended to serve the Hypertext Markup Language (HTML), but there is nothing in the HTTP specification that requires the server to return HTML. HTTP can therefore serve any document type.

HTTP is stateless in that the protocol does not have provisions for memory about what the server or client has done in the past. HTTP is also called connectionless because no persistent connection between server and client exists (at least until HTTP 1.1). Every page the user requests appears to the server as if it were the client's first connection. The server simply sets up another connection and returns the requested page. This allows HTTP to be uncomplicated, which in turn allows it to be rapidly implemented on every kind of machine that speaks TCP/IP, including Internet appliances.

Connectionless protocols contrast with typical client/server transactions, where the client opens a connection to the database or application server, and the server keeps the connection open until the client explicitly logs off, as is the case with FTP connections. The client can be in different states, including logged in, authenticated, and editing a document.

Multiple states are essential for many complex functions, such as authentication and transaction processing. Although not provided by HTTP, the potential to interact in varied states or maintain persistent connections can be provided in HTTP by extra software written using CGIs, Java, or CORBA. The actual state mechanism appears as cookies passed back and forth between browser and server. Information is logged on the server and indexed by the requester's IP address, cookie, or direct socket connection, among other mechanisms. Still, such states will not be necessary in most Internet appliances.
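The cookie mechanism described above might look roughly like the following on the server side. This is a minimal sketch using Python's standard http.cookies module; the in-memory SESSIONS table, the cookie name "session", and the visit counter are hypothetical, and a real server would also need expiry and security attributes.

```python
import uuid
from http.cookies import SimpleCookie

# Hypothetical server-side table: session id -> state kept for that client.
SESSIONS = {}


def handle_request(cookie_header):
    """Handle one stateless request.

    Returns (set_cookie, visits): set_cookie is the value for a Set-Cookie
    header when a new session was created, otherwise None.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("session")
    session_id = morsel.value if morsel is not None else None

    set_cookie = None
    if session_id not in SESSIONS:
        # Unknown or missing cookie: start a new session for this client.
        session_id = uuid.uuid4().hex
        SESSIONS[session_id] = {"visits": 0}
        set_cookie = "session=%s" % session_id

    SESSIONS[session_id]["visits"] += 1
    return set_cookie, SESSIONS[session_id]["visits"]
```

The protocol itself stays stateless: all memory lives in the server's table, and the cookie is only the index into it.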

Being connectionless is advantageous because HTTP scales more effectively than traditional client/server applications such as FTP servers. When an HTTP client is disconnected, which from the server's point of view is nearly all the time, the server does not need to maintain network resources for that client. This means that a single HTTP server can support far more clients than if the connections were continuous.
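The scaling argument can be made concrete with Little's law, which says the average number of items in a system equals the arrival rate multiplied by the average time each item spends there. The sketch below uses purely illustrative numbers (10,000 clients, six requests per client per hour, half a second per transaction); they are assumptions chosen for the example, not figures from the article or from any measurement.

```python
def concurrent_connections(requests_per_second, seconds_held):
    """Little's law: average connections open = arrival rate x holding time."""
    return requests_per_second * seconds_held


# Illustrative assumptions only, not measurements.
clients = 10000
requests_per_client_per_hour = 6
transaction_seconds = 0.5

# Aggregate request rate seen by the server, in requests per second.
rate = clients * requests_per_client_per_hour / 3600.0

# Short-lived HTTP transactions: connections exist only while in use.
short_lived = concurrent_connections(rate, transaction_seconds)

# Continuous connections: every client holds one open all the time.
persistent = clients
```

Under these assumed numbers the server averages roughly 8 open connections with short transactions, against 10,000 if every client stayed connected.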

As Internet appliances become more common, both the Web server and Web clients can be scaled down. The Web servers may only have to serve a building full of Web clients, rather than the entire world. Often these are referred to as thin servers. The Web client can be small because it needs only to display a few similar functions.

The Internet-appliance Web client portion can be implemented in about 10 Kbytes of code. In addition, the Internet appliance needs a TCP/IP stack, network-hardware interface code, MMI function, and code to control the device. All of this can be placed in ROM with a small amount of scratch pad memory for variables and the information received from the Web server. This is not future software or vaporware: both the thin server (embedded Web server) and Web client software exist.

A number of companies are providing thin server functionality. Data General (now a division of EMC) was one of the first to provide the IT style of thin servers with its Thin Line products. Allegro Software Development was one of the first companies to provide embedded Web server software for devices such as network routers and printers. Allegro also has Web client software (RomWebClient) aimed at devices needing to obtain data from information Web servers.

Other companies are developing similar products. Phar Lap Software has its HTTP Client for a proprietary operating system. EBS has Web Client listed on its Web site's price list with TBA in place of a price. Other companies such as Spyglass and QNX have embedded browsers, a more full-featured form of a Web client, but they may be much too big for emerging Internet appliances.

Using Web technology to obtain data from the large information servers may be the only way servicing millions of Internet appliances is possible. For a few devices, where large amounts of data in multiple files will be retrieved, the FTP function is viable. Refrigerators, set top boxes, furnaces, air conditioners, and other home and office appliances will soon be connected to a network and will need data. This demand could overwhelm the World Wide Web. Web client software and HTTP transfers seem to be the only viable solution for these high-volume, short-duration transactions.

About the Author
Ed Steinfeld has more than 25 years of experience in embedded and realtime computing. He began as a programmer writing code and designing hardware to test hybrid circuit boards. He has marketed embedded and realtime products to OEMs and resellers for Digital Equipment, VenturCom, and Phar Lap Software. His international experience includes a stint in Hong Kong as Far East Channels Manager and responsibility for international OEM sales in Europe and the Pacific Rim. Ed is now providing marketing services to the embedded systems industry.
