Standards have played virtually no role in IP video security until very recently, but that is quickly changing. At widely varying paces, enterprises are shifting from proprietary, analog video systems to open, IP solutions — and with that market shift, they are creating an intensified drive for standards.
This shift to IP, coupled with the introduction of video content analysis (also called video analytics), promises to extend the reach of video beyond security and into the enterprise, where it can provide a rich source of data for business optimization.
Standards promote interoperability and reduce integration cost. Unfortunately, the video surveillance industry has been slow to adopt standards. With the introduction of feature-rich IP cameras and encoders, which offer higher resolutions and embedded analytic capabilities, customers have benefited from an increased choice of devices and capabilities. A lack of standards, however, has kept some of these devices out of reach. In addition to limiting customer choice, this standards deficiency increases integration costs for solution providers and manufacturers.
More importantly, it results in opportunity costs that lead to reduced innovation as Video Management System (VMS) vendors waste research and development dollars on integration activity instead of focusing efforts on new functionality.
The good news is that the video security industry is finally responding and a few attempts at standardization are underway. One such effort is the Physical Security Interoperability Alliance (PSIA), a group of leading physical security and IT manufacturers, system integrators and distributors that has come together to promote the interoperability of IP-enabled security devices, and is currently focused on several initiatives. This article focuses on the standards requirements for adding, configuring and managing IP cameras and encoders, hereafter referred to collectively as IP media devices or simply IP cameras.
The move to IP video
Video surveillance systems have traditionally consisted of NTSC or PAL analog cameras connected over coaxial cables to VHS tape recorders or Digital Video Recorders (DVRs). Figure 1 shows a typical DVR deployment with a remote office monitoring capability. Analog cameras connect directly to a DVR, and video is typically stored on hard drives inside the DVR itself.
Figure 1: A typical DVR deployment with remote office monitoring
A video encoder (see Figure 2) is a device that digitizes signals from analog cameras and transmits them over an IP network to a VMS solution. Video encoders allow customers to maintain their investment in coaxial cabling and analog cameras as they transition part of their solution to IP. IP cameras, also shown in Figure 2, combine the capabilities of an analog camera and encoder in a single device.
Figure 2: A hybrid analog and IP system.
Video surveillance systems are transitioning to an all-IP environment similar to the IP migration path most recently seen with enterprise voice, where many Private Branch Exchanges (PBXs) were replaced with Voice over IP (VoIP) solutions.
All IP implementations
In the last few years, we have seen the introduction of intelligent IP cameras that provide enhanced capabilities such as video analytics, multi-stream video and megapixel resolutions. Moving some of the intelligence to the edge has allowed VMS solutions running on commercial-off-the-shelf (COTS) servers to scale, replacing analog cameras and proprietary DVRs in the process. Figure 3 shows an all IP video implementation.
Figure 3: An all IP video implementation.
IP cameras provide several benefits to organizations including:
- Distributed computing, by pushing intelligence to the edge for encoding, compression, video analytics, email notification, motion detection and field upgrades
- Lower cabling costs through the use of structured cabling systems (UTP vs. coax)
- Simplified power distribution by leveraging Power over Ethernet (PoE)
- A converged IT management infrastructure
- Use of COTS servers
- Transmission of pan, tilt, zoom (PTZ) commands and alarms over the same cable
- Multi-stream capability, e.g. MJPEG and H.264, to different users or applications
- Higher resolution including High Definition (HD) and megapixel capabilities
- Reduced camera requirement because of high-resolution cameras
The main disadvantage for IP cameras stems from a lack of standards and the different implementations necessary from vendor to vendor for video streaming, configuration and status notification, as well as the networking skills required to implement these devices.
Leveraging IP protocols
One of the obstacles to the adoption of IP media devices for physical security is the complexity of IP networking. A primary objective for a standard would be to reduce this complexity so that any user would be able to install a device, and have it recognized and operational without compromising network security. The challenges for implementing IP media devices in managed corporate networks where expertise may be readily available are different from those for small businesses or ad hoc networks.
An effort to reduce the complexity of IP networking for ad hoc or unmanaged networks was initiated through the Internet Engineering Task Force (IETF). Zero Configuration Networking, or Zeroconf, simplifies networking by allowing devices to:
- obtain IP addresses without a DHCP server
- resolve network names without a DNS server
- discover what services are available on a network
Apple's Bonjour is an implementation of Zeroconf that allows users to connect seamlessly to each other at a conference, for example, without the help of network administrators. In the home or small business, it allows users to add network devices, get a unique local network name assigned and discover all available services automatically. In a managed network environment, Bonjour could be used to discover available devices. Many network printers support Bonjour and by installing the Bonjour plug-in for Windows, these devices can be easily detected and used for printing. Similarly, IP cameras supporting Bonjour can be added to a network and auto-discovered.
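To make the service discovery step more concrete, here is a minimal sketch of the kind of query a Zeroconf client multicasts: a DNS question of type PTR sent to the multicast DNS group 224.0.0.251 on UDP port 5353. The service type `_rtsp._tcp.local` is an assumption chosen for illustration (a given camera may advertise a different type), and the helper name `build_ptr_query` is hypothetical. The sketch only builds the packet; it does not send it.

```python
import struct

MDNS_GROUP = "224.0.0.251"  # IPv4 multicast group used by multicast DNS
MDNS_PORT = 5353            # UDP port used by multicast DNS


def build_ptr_query(service_type: str) -> bytes:
    """Build a one-question DNS query asking which instances offer service_type."""
    # Header: ID 0, flags 0 (standard query), 1 question, no other records.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # Encode the name as length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in service_type.strip(".").split(".")
    ) + b"\x00"
    # QTYPE 12 = PTR, QCLASS 1 = IN.
    return header + qname + struct.pack("!2H", 12, 1)


# Illustrative service type only; not taken from any camera specification.
packet = build_ptr_query("_rtsp._tcp.local")
print(len(packet), packet.hex())
```

A client would send these bytes to `MDNS_GROUP` and `MDNS_PORT` over a UDP socket and listen for answers from devices on the local link.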
Universal Plug and Play (UPnP), from the UPnP Forum, provides some similar capabilities and uses the same method for automatically obtaining IP addresses. One of the key benefits of UPnP is that its use of common protocols removes the need for custom drivers and makes it independent of operating system and programming language.
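UPnP discovery relies on the Simple Service Discovery Protocol (SSDP), in which a control point multicasts an HTTP-style search request over UDP to 239.255.255.250 on port 1900. The following is a rough sketch of how such a request could be assembled; the function name and default values are our own, and the broad search target `ssdp:all` is used only for illustration.

```python
SSDP_GROUP = "239.255.255.250"  # SSDP multicast group
SSDP_PORT = 1900                # SSDP UDP port


def build_msearch(search_target: str = "ssdp:all", max_wait_s: int = 2) -> bytes:
    """Return the bytes of an SSDP M-SEARCH discovery request."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_GROUP}:{SSDP_PORT}",
        'MAN: "ssdp:discover"',
        f"MX: {max_wait_s}",       # devices delay their reply by up to MX seconds
        f"ST: {search_target}",    # search target: what kind of device we want
        "",
        "",                        # the request ends with an empty line
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_msearch().decode("ascii"))
```

Because the request is plain text over UDP, it can be produced from any operating system or language, which is the independence the paragraph above describes.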
As in the cases above, an IP media device standard should support device and service discovery protocols to allow devices to be discovered automatically. IP media devices provide audio and video streaming, event notification and camera position control capabilities. To optimize the benefits offered by these devices, they must be properly configured and managed. We'll look at each of these capabilities in more detail and identify the areas that a standard should address for optimal configuration and management.
Media selection involves streaming type and the choice of codec used.
IP media devices can stream video and audio over IP using several IP protocols and protocol combinations such as HTTP, Real-Time Transport Protocol (RTP) and Real-Time Streaming Protocol (RTSP). These protocol combinations may be transported over TCP, a connection-oriented protocol, to assure delivery, or UDP for improved performance.
An IP media standard must support all of these protocol combinations to allow cameras to be deployed in a variety of networking environments.
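Whichever combination is used, a receiver of RTP media has to read the same 12-byte fixed header at the start of every packet. The sketch below shows one way to decode that header in Python. It ignores header extensions and contributing-source lists, and the type and function names are our own.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header at the start of a packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    first, second, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=first >> 6,            # top two bits of the first byte
        marker=bool(second & 0x80),    # top bit of the second byte
        payload_type=second & 0x7F,    # identifies the codec in use
        sequence=seq,                  # lets the receiver detect loss over UDP
        timestamp=ts,
        ssrc=ssrc,                     # identifies the stream source
    )
```

The sequence number matters most when the stream runs over UDP, where the receiver must detect lost or reordered packets itself.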
In order to store and view video, the video stream must be digitized. A video stream consists of a series of still images, or frames, displayed in rapid succession. In North America, video is considered real-time when 30 images are displayed to the viewer in one second (30 frames per second, or fps), the rate used by the NTSC (National Television System Committee) standard. North American televisions receive and display video at this rate. In Europe, however, the PAL (Phase Alternating Line) standard uses a lower rate of 25 fps. Each still image is a rectangle consisting of an array of picture elements, also known as pixels. Pixels represent the light intensity that a camera sees in either black and white or color. Standard-definition TV displays 720 x 480 usable pixels in what is known as a 4 x 3 aspect ratio. The newer HD televisions have a much higher pixel count. For example, 1080p has an image resolution of 1920 x 1080 pixels.
In order to create a digital video stream, light must be converted into values that can be transmitted. This is done using light sensors. A light sensor responds to the intensity of the light it “sees” and returns a voltage signal. In black and white cameras, each pixel can be represented by a separate light sensor. In color cameras, however, the sensors are grouped together in threes: one each for red, green and blue (RGB), since combinations of these colors in different proportions can produce any other color.
The voltage provided by the sensors is converted to discrete digital values by an analog-to-digital (A/D) converter. The A/D converter takes in the voltage and converts it to an 8-bit value, resulting in 8 bits for a simple black and white pixel, or up to 24 bits (8 bits for each of the three color sensors) for a color pixel.
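As a simplified illustration of this conversion, the sketch below maps a sensor voltage to an 8-bit code and packs three such codes into one 24-bit color pixel. The 1.0 V full-scale range, the linear mapping and the function names are assumptions made for the example; real converters and sensors differ.

```python
def quantize(voltage: float, full_scale: float = 1.0, bits: int = 8) -> int:
    """Map a voltage in [0, full_scale] to an integer code with `bits` bits."""
    levels = (1 << bits) - 1                       # 255 for an 8-bit converter
    clamped = min(max(voltage, 0.0), full_scale)   # out-of-range input saturates
    return round(clamped / full_scale * levels)


def pack_rgb(red: int, green: int, blue: int) -> int:
    """Combine three 8-bit sensor codes into a single 24-bit pixel value."""
    return (red << 16) | (green << 8) | blue


# A pixel whose red sensor is saturated and whose other sensors see no light.
pixel = pack_rgb(quantize(1.0), quantize(0.0), quantize(0.0))
print(hex(pixel))
```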
As stated previously, video is just another form of data, so why should IT professionals worry about it any more than other traffic on their network? The answer is volume. Transmitting video over a network can be expensive in terms of bandwidth utilization. When more bandwidth is consumed by video, less is available for other applications such as voice, data and mission-critical systems. In addition, transmitting large video streams can become cost-prohibitive over WANs where usage charges may apply. To reduce bandwidth utilization, IP media devices compress video using different compression techniques and codecs.
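A quick calculation shows why volume is the issue. Using the figures given earlier (24 bits per color pixel, 30 frames per second), the sketch below computes the uncompressed bit rate of a single stream. Applying 30 fps to 1080p video is our assumption for the second example.

```python
def raw_bitrate_bps(width: int, height: int, bits_per_pixel: int, fps: int) -> int:
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * fps


sd = raw_bitrate_bps(720, 480, 24, 30)     # standard definition
hd = raw_bitrate_bps(1920, 1080, 24, 30)   # 1080p, assuming 30 fps

print(f"SD: {sd / 1e6:.1f} Mbit/s")   # about 248.8 Mbit/s
print(f"HD: {hd / 1e6:.1f} Mbit/s")   # about 1493.0 Mbit/s
```

Even one uncompressed standard-definition stream would exceed the capacity of a 100 Mbit/s Ethernet link, which is why compression at the device is essential.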
Codecs offer a tradeoff between video compression and performance. Early IP cameras offered MJPEG compression, in which each image is compressed and transmitted individually. While these cameras provided impressive resolution capabilities, the bandwidth and storage they required limited their usefulness. MPEG-4 can reduce the data stream and storage requirements by up to 50% over MJPEG, while H.264 codecs can yield additional gains of up to 30%. The price of these significant compression improvements is processing load: more powerful processors or Digital Signal Processors (DSPs) are required to handle MPEG-4 and H.264.
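To see what these percentages could mean for storage, the sketch below applies the best-case reductions quoted above to a hypothetical 8 Mbit/s MJPEG stream. Two assumptions are ours: the starting bit rate is invented for illustration, and the additional 30% for H.264 is taken relative to the MPEG-4 rate. Actual savings depend heavily on scene content and encoder settings.

```python
SECONDS_PER_DAY = 86_400


def best_case_bitrates(mjpeg_bps: float) -> dict:
    """Apply the 'up to' reductions quoted in the text to an MJPEG bit rate."""
    mpeg4 = mjpeg_bps * (1 - 0.50)   # up to 50% less than MJPEG
    h264 = mpeg4 * (1 - 0.30)        # assumed: a further 30% less than MPEG-4
    return {"MJPEG": mjpeg_bps, "MPEG-4": mpeg4, "H.264": h264}


def gigabytes_per_day(bps: float) -> float:
    """Storage consumed by one day of continuous recording, in gigabytes."""
    return bps * SECONDS_PER_DAY / 8 / 1e9


# Hypothetical 8 Mbit/s MJPEG stream, used only to illustrate the arithmetic.
for codec, bps in best_case_bitrates(8_000_000).items():
    print(f"{codec}: {bps / 1e6:.1f} Mbit/s, {gigabytes_per_day(bps):.1f} GB/day")
```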
IP cameras, with their distributed computing capabilities, allow IP VMS solutions to scale. By moving the digitization and compression of video signals to the edge, customers can choose the optimal codec for their implementation. The choice of codec is important as processing-intensive codecs can significantly reduce bandwidth and storage requirements. With Moore's law in effect, IP camera processing performance has increased and will likely continue to increase, while cost will likely continue to fall. With falling prices, IP cameras have undertaken more CPU-intensive tasks, including analytics processing and advanced compression algorithms like H.264.
While standard variants of MPEG-4 and H.264 codecs exist, many vendors have implemented specific performance enhancements that require the use of vendor-specific codecs.
Some IP cameras or encoders also support audio and are equipped with a microphone or microphone input. Audio is encoded in one of several formats, such as G.711, G.726, G.729 or MP3. As with video, the choice of audio codec depends on the audio quality required and on available bandwidth and processing performance.
An IP media device standard should allow the negotiation between a VMS solution and an IP media device to occur so that compatible audio and/or video codecs can be selected automatically. IP telephony addressed this problem by using the Session Description Protocol (SDP) media-level information exchange to ensure interoperability.
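The sketch below illustrates the idea of such a negotiation in its simplest form. It reads the codec names from the `a=rtpmap` lines of an SDP description and picks the first codec on a VMS preference list that the device also offers. The SDP text, the preference lists and the function names are invented for the example, and a real implementation would also have to handle payload types that appear without an `rtpmap` line.

```python
EXAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=camera
t=0 0
m=video 0 RTP/AVP 96 26
a=rtpmap:96 H264/90000
a=rtpmap:26 JPEG/90000
m=audio 0 RTP/AVP 0
a=rtpmap:0 PCMU/8000
"""


def offered_codecs(sdp: str) -> dict:
    """Return {media kind: [codec names]} from the a=rtpmap lines of an SDP body."""
    codecs = {}
    current = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            current = line[2:].split()[0]     # "video" or "audio"
            codecs[current] = []
        elif line.startswith("a=rtpmap:") and current is not None:
            # Format: a=rtpmap:<payload type> <encoding name>/<clock rate>
            encoding = line.split(None, 1)[1].split("/")[0]
            codecs[current].append(encoding.upper())
    return codecs


def choose_codec(offered: list, preferred: list):
    """Return the first preferred codec that the device offers, or None."""
    for name in preferred:
        if name.upper() in offered:
            return name.upper()
    return None


offer = offered_codecs(EXAMPLE_SDP)
print(choose_codec(offer["video"], ["MP4V-ES", "H264", "JPEG"]))
print(choose_codec(offer["audio"], ["G729"]))
```

When no common codec exists, as in the audio case here, the VMS can report the mismatch instead of failing silently.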
Event notification
Events can be generated for various operational or informational conditions. Operational events tell you something about the status of the device and can include error conditions, such as loss of video, or a hardware or software failure.
Informational events provide the status of user pre-defined conditions. For example, a door contact closure event is generated when a door is opened, an audio analytic event is generated when the sound of broken glass is detected, and a video analytic event is generated when video motion is detected.
Video analytic metadata information
One of the most interesting technologies in video surveillance is video analytics. Wikipedia defines video analytics as “a technology that is used to analyze video for specific data, behavior, objects or attitude”. Video analytics can potentially turn vast amounts of seldom-used video data into digital assets, providing a rich source of data for business optimization and effectively extending the reach of video beyond surveillance. Examples of this include the use of an overhead camera to count the number of people going in and out of a store. Properly deployed, cameras can be positioned to measure the customer dwell time in a store so that merchandising opportunities could be maximized.
In a security control room, where guards must look at multiple screens displaying live video for extended periods, analytics can reduce fatigue and the number of monitors required by drawing attention to relevant events. While video analytic technology is still imperfect, practical deployments can yield business efficiencies. A full video frame-by-frame metadata definition is beyond the scope of an IP media device specification. Analytic events can, however, be transmitted by devices in a consistent manner to allow different applications to consume video.
An IP media device standard should define a single informational event notification method, but allow it to be transmitted over IP and leverage SNMP for operational issues where it makes sense.
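One way to picture a single, consistent notification format is a small event record that carries the same fields regardless of which device or analytic produced it. The sketch below is purely illustrative: the field names, the JSON encoding and the transport labels are our assumptions, not part of any published specification.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class Category(Enum):
    OPERATIONAL = "operational"       # device status, e.g. loss of video
    INFORMATIONAL = "informational"   # user-defined conditions, e.g. motion


@dataclass(frozen=True)
class DeviceEvent:
    device_id: str
    category: Category
    kind: str
    timestamp: str   # ISO 8601 text


def encode_event(event: DeviceEvent) -> str:
    """Serialize an event to JSON so that any application can consume it."""
    payload = asdict(event)
    payload["category"] = event.category.value
    return json.dumps(payload, sort_keys=True)


def transport_for(event: DeviceEvent) -> str:
    """Route operational events to SNMP, everything else to the common method."""
    return "snmp" if event.category is Category.OPERATIONAL else "ip-notification"


event = DeviceEvent("cam-01", Category.INFORMATIONAL, "motion_detected",
                    "2009-01-01T00:00:00Z")
print(transport_for(event), encode_event(event))
```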
Camera position control (pan, tilt, zoom — PTZ)
The vertical, horizontal and zoom movements of a camera are handled by its PTZ capabilities. Analog cameras use separate cables and serial ports for PTZ support. A camera's PTZ protocol is often vendor-specific, resulting in increased interoperability challenges.
IP cameras allow for the transmission of PTZ commands over IP, eliminating the need for serial interfaces. IP cameras can also provide a virtual PTZ capability. Some high-resolution IP cameras allow a user to move the view vertically or horizontally and zoom within a fixed image area, without physically moving the camera. This is possible because of the larger image size and resolution.
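Virtual PTZ amounts to choosing a crop window inside the full-resolution image. The sketch below computes such a window from a normalized center point and a zoom factor, keeping the window inside the image. The parameter conventions (center coordinates between 0 and 1, zoom of 1 meaning the full image) and the function name are assumptions made for the example.

```python
def virtual_ptz_window(img_w: int, img_h: int,
                       center_x: float, center_y: float, zoom: float):
    """Return (left, top, width, height) of the crop window for a virtual PTZ view."""
    if zoom < 1:
        raise ValueError("zoom must be >= 1")
    win_w = round(img_w / zoom)
    win_h = round(img_h / zoom)
    # Center the window on the requested point...
    left = round(center_x * img_w - win_w / 2)
    top = round(center_y * img_h - win_h / 2)
    # ...then clamp it so it never extends beyond the image.
    left = min(max(left, 0), img_w - win_w)
    top = min(max(top, 0), img_h - win_h)
    return left, top, win_w, win_h


# A 4x zoom into the center of a hypothetical 2560 x 1920 image.
print(virtual_ptz_window(2560, 1920, 0.5, 0.5, 4))
```

The higher the native resolution, the more the view can be zoomed before the cropped window falls below a useful pixel count.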
It is sometimes necessary to alter the position of a camera, based on the time of day, to a different preset position. Cameras can have several preset positions. Presets can be controlled remotely by a user or operate under a schedule. A schedule of preset views running in a client video application is often referred to as a guard tour.
An IP media device standard should include a single command structure for both PTZ and virtual PTZ camera movement. The standard should be comprehensive enough to address the different PTZ requirements. For example, the speed requirement of a PTZ movement can vary by implementation and should be addressed as a selectable parameter. Additional parameters such as presets and schedules also need to be included.
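As a thought experiment, a single command structure covering these requirements might look like the sketch below. Every field name and value range here is hypothetical. The point is only that mechanical and virtual PTZ, selectable speed and presets can share one validated structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PtzCommand:
    pan: float = 0.0                # -1.0 (full left) to 1.0 (full right)
    tilt: float = 0.0               # -1.0 (full down) to 1.0 (full up)
    zoom: float = 0.0               # -1.0 (zoom out) to 1.0 (zoom in)
    speed: float = 0.5              # 0.0 to 1.0, selectable per implementation
    preset: Optional[int] = None    # recall a stored position instead of moving
    virtual: bool = False           # True for virtual PTZ within a fixed image

    def __post_init__(self):
        for name in ("pan", "tilt", "zoom"):
            if not -1.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must be within [-1.0, 1.0]")
        if not 0.0 <= self.speed <= 1.0:
            raise ValueError("speed must be within [0.0, 1.0]")
        if self.preset is not None and self.preset < 0:
            raise ValueError("preset must be a non-negative index")


# The same structure serves a mechanical move and a virtual preset recall.
slow_pan_right = PtzCommand(pan=0.25, speed=0.2)
recall_entrance = PtzCommand(preset=3, virtual=True)
print(slow_pan_right, recall_entrance)
```

A schedule, such as a guard tour, could then be expressed as a timed list of such commands.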
Management and configuration
Adding a device to a network automatically simplifies implementation. The increasing feature richness of these devices, however, may require some post-installation configuration to set alarm conditions, codec selection, multi-streaming capabilities and the destination address of additional streams, log file parameters, motion detection area mapping, schedules, PTZ parameters, local storage purging and security permissions. Some VMS solutions do not abstract the configuration complexities of IP media devices, which forces users or system administrators to access the device's native interface. This increases the learning curve and support cost for network administrators.
To achieve interoperability, an IP media standard should address:
- IP media device status information
- User access permissions
- Uploading to support device upgrades
- Device reboot
- Basic system settings
- Network settings
- IO port control
- Audio/video parameters
- Motion detection
- Event notification
The adoption of open standards and integration with enterprise management systems like HP OpenView, IBM Tivoli, Microsoft System Center and CA Unicenter will help reduce the support burden of enterprise physical security by converging access, logging and auditing tasks into existing network applications.
An IP media device standard should leverage existing protocols while defining new ones that are consistent with the networking industry. Adoption of these protocols will likely accelerate the migration of enterprise video security to IP video networking and eventually lower costs as the industry undergoes organizational and technology convergence. The introduction of an IP media device standard would be welcomed by manufacturers and integrators alike and is long overdue. Customers and industry consultants will quickly grasp its benefits and likely drive its adoption by making support of such a standard a mandatory requirement in RFPs.
About the author
Peter Kuciak is Vice President of Research & Development with March Networks. Prior to joining the company, he spent 18 years in the telecommunications industry. From 2000 to 2008, Peter was Director of R&D, Professional Services and Product Management with Ubiquity Software. Ubiquity Software was acquired by Avaya in 2007.