There are almost as many CODECs for audio and video compression as there are researchers in the field. The choices are many, and the applications are even more numerous. Any video (or non-video) compression algorithm represents a design tradeoff among the compute power needed for compression, the compute power needed for decompression, the quality of the decompressed data relative to the input data, the compression ratio (output size relative to input size), and the time delay imposed by the compression scheme. For example, the highest possible compression ratio could result in either very high hardware cost or very poor image quality at the receiving end. But it might also enable use of the cheapest communication link or the lowest-cost storage system. However, since initial hardware cost is a key determining factor in the purchase of any equipment, the market may not bear the cost of both the highest possible video data compression and the highest possible video quality. Consequently, there are market- and application-dependent choices which must be made.
For corporate videoconferencing, the selection of a compression scheme involving reasonable (but not great) quality images is less price-sensitive than for a consumer multimedia game application, where cost is the primary determinant of purchase. The key in videoconferencing is a compression algorithm which produces an image data rate compatible with telephone communication links. Cost is also a factor if the desktop market is to be considered. And compression must have low delay, or latency, since audio and video must be synchronized, and the purpose of a conference is to communicate “live”, not by delayed broadcast or video mail.
In contrast, for CD-ROM playback, a primary concern is attaining software-only (low cost) decompression which can produce video based on the data rate coming off the CD-ROM player. Since the material is compressed only once, the time and cost of compression is not of primary concern. A similar concern exists for video-on-demand; the material is compressed only once, and transmitted over cable or satellite TV, and decompressed via a set-top box.
For multimedia editing/authoring stations, compression and decompression must be fast and both executed on the desktop machine. Compression algorithms which make use of inter-frame similarities are of limited use, since reconstructing a frame for editing can be very slow and lossy under these circumstances.
CODECs can be described by a wide range of parameters or considerations.
- Standards Compliance
There are many standards for CODECs published by the ITU (for example, H.261 video and G.728 audio) and the ISO (MPEG audio and video, for example). Vendors using standards-based CODECs are likely to enjoy interoperability and, often, commercially available silicon support. However, proprietary algorithms may provide performance advantages in many situations. If the need for interoperability or for support from independent content vendors is minimal, it may make sense for a designer to use a proprietary compression algorithm. Some algorithms are de facto standards because they are widely used and available for little or no cost; however, they have never been officially standardized by a standards-setting body. Standards-compliant CODECs help break the chicken-egg conundrum.
- Computational Complexity
Highly complex algorithms may require computing power not available in a desktop configuration. Obviously, there must be a match between the available horsepower and the requirements of the algorithm. For example, voice compression algorithms for wireless telephony must run on DSPs that are compatible with wireless handsets; low power consumption for long battery life is a market requirement.
- Output Bitstream
The output bitstream must be compatible with the transmission medium. Data rates common for CD-ROM I/O (1500 kbps) in multimedia applications have little use in telephone-based communications (9.6 to 128 kbps). Similarly, digital TV broadcasting is looking at 6 Mbps channels, so a CODEC optimized for this bandwidth should be able to provide far higher audio/video quality than one optimized for desktop videoconferencing over a basic rate ISDN line at 128 kbps.
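To make the mismatch between these channels concrete, the following is a rough sketch of the compression ratio a CODEC would need in order to fit uncompressed video into each of them. The channel rates are the ones quoted above; the frame format (352 x 288 pixels, 12 bits per pixel, 15 frames per second) is an assumption chosen only for illustration, the function name is hypothetical, and audio and transport overhead are ignored.

```python
def required_compression_ratio(width, height, bits_per_pixel, fps, channel_bps):
    """Return raw video bit rate divided by channel bit rate.

    A result of 100.0 means the CODEC must achieve roughly 100:1.
    """
    raw_bps = width * height * bits_per_pixel * fps
    return raw_bps / channel_bps


# Channel rates taken from the text, in bits per second.
CHANNELS_BPS = {
    "telephone link, 9.6 kbps": 9_600,
    "basic rate ISDN, 128 kbps": 128_000,
    "CD-ROM, 1500 kbps": 1_500_000,
    "digital TV channel, 6 Mbps": 6_000_000,
}

if __name__ == "__main__":
    # Assumed source format (not from the text): 352x288, 12 bits/pixel, 15 fps.
    for name, bps in CHANNELS_BPS.items():
        ratio = required_compression_ratio(352, 288, 12, 15, bps)
        print(f"{name}: about {ratio:.0f}:1")
```

Under these assumptions the same source needs a far higher ratio on the ISDN line than on the digital TV channel, which is why a CODEC tuned for one is a poor fit for the other.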
- Output Quality
Highly compressed voice signals must be intelligible when decompressed. Also, the speaker must be identifiable. Low quality images in videoconferencing, rather than no images, may actually be a detracting factor. Quality has been one of the overriding concerns in multimedia and teleconferencing developments; fortunately, quality is a beneficiary of the continuous improvements being made both in algorithms and in computational horsepower.
Teleconferencing, by its nature, is two-way, real-time communication. Signals must be almost simultaneously encoded and decoded at both ends of the line. This is very different from the typical PC multimedia paradigm. Algorithms based on the “encode once slowly; decode many times quickly” model are not appropriate in teleconferencing. However, when an algorithm is simple enough to decode, decoding can be done on the PC host processor, while encoding is usually better done on dedicated silicon or general purpose DSPs. This is also becoming a common paradigm for MPEG multimedia.
A long latency between the time a signal (speech or video) is created and the time it is received at the other end creates a dysfunctional teleconference. This is not a factor in other applications of compression, such as multimedia content delivery, because there is typically no two-way interaction going on. Many people have experienced latency problems when conducting long distance telephone calls over satellite links. Because of the packet nature of the Internet, with traffic crossing an indeterminate number of computer links, videoconferencing and audioconferencing over that medium suffer from latency effects.
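As a rough illustration of the satellite case, the sketch below estimates the one-way delay of a call relayed through a single geostationary satellite. The altitude, the assumption that the satellite is directly overhead, and the function name are illustrative choices rather than anything stated in the text; the optional CODEC delay is a placeholder for whatever the compression scheme adds.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
# Assumption: one geostationary satellite directly overhead (shortest path).
GEOSTATIONARY_ALTITUDE_KM = 35_786.0


def one_way_delay_s(codec_delay_s=0.0):
    """Propagation delay up to the satellite and back down, plus CODEC delay."""
    propagation_s = 2 * GEOSTATIONARY_ALTITUDE_KM / SPEED_OF_LIGHT_KM_S
    return propagation_s + codec_delay_s


if __name__ == "__main__":
    print(f"propagation only: {one_way_delay_s():.3f} s")
    # Hypothetical 100 ms of encode/decode delay added on top.
    print(f"with 100 ms CODEC delay: {one_way_delay_s(0.1):.3f} s")
```

Propagation alone comes to roughly a quarter of a second each way, so any delay added by the compression scheme is stacked on top of a link that is already noticeably slow.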
The cost to the OEM or end user includes the cost of the hardware and the cost of the software. Software may include royalties for licensing of CODEC algorithms. As is obvious, cost plays a direct role in the marketability of any product.
Some CODECs are lossless, some are lossy, and some, like JPEG, can be either. Lossless compression is considered a reversible process in that it is possible to perfectly reconstruct the image or audio data stream. Examples include vector coding, run-length coding, entropy coding, and variable-length coding. Lossy compression, used in virtually all videoconferencing applications, takes advantage of the limits of the human sensory system by dropping from the data stream signals of various frequencies and amplitudes that we are less likely to notice as absent. Psychophysical coding produces compression techniques which reduce the number of bits required to reproduce the sounds or images, but the reconstruction is never 100% of the original. Clearly, lossy techniques would be inappropriate for data compression of files where every bit counts, such as word-intensive documents or number-intensive spreadsheets. Most database information and all transmissions of computer programs must be accurate down to the bit. Complex computer graphics must often be reproduced exactly, and transmission of medical images today is largely dependent on lossless techniques. Lossless techniques are pretty much limited to a 4:1 compression ratio, while lossy techniques might provide 200:1 ratios. Ideally, lossy compression algorithms are written to discard what the ear cannot hear or what the eye cannot see.
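The reversibility of lossless coding can be seen in a minimal sketch of run-length coding, one of the techniques named above. This is a toy illustration with hypothetical function names, not the scheme used by any particular CODEC: each run of identical byte values is stored as a (value, count) pair, and decoding expands the pairs back into exactly the original bytes.

```python
from typing import List, Tuple


def rle_encode(data: bytes) -> List[Tuple[int, int]]:
    """Collapse each run of identical byte values into a (value, count) pair."""
    runs: List[Tuple[int, int]] = []
    for value in data:
        if runs and runs[-1][0] == value:
            # Same value as the previous byte: extend the current run.
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            # New value: start a new run of length 1.
            runs.append((value, 1))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> bytes:
    """Expand (value, count) pairs back into the original byte string."""
    return b"".join(bytes([value]) * count for value, count in runs)


if __name__ == "__main__":
    # A made-up scan line: 12 black pixels, 4 white, 8 black.
    scan_line = bytes([0] * 12 + [255] * 4 + [0] * 8)
    runs = rle_encode(scan_line)
    print(runs)
    # Lossless: decoding reproduces the input bit for bit.
    assert rle_decode(runs) == scan_line
```

Run-length coding only pays off when the data contains long runs; on data with few repeats the list of pairs can be larger than the input, which is one reason lossless ratios stay modest.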