Since the first consumer picked up a video camera and pointed it at her two-year-old blowing out birthday candles (then accidentally pointed it at the ceiling, then shook it while laughing when the two-year-old ate cake with his hands), people have struggled with video stabilization. They may not have known that’s what it’s called, but they’ve known that the video they were recording was often shaky—stomach-churning, distractingly shaky. Many lugged tripods wherever they went; others eventually invested in higher-end digital video cameras with special image-stabilization hardware that helped keep junior more still in the video frame.
Fast forward to now. Virtually everyone carries a high-quality video camera in their smartphone. Amateurs, professionals, public safety workers, and others are flying drones fitted with video cameras to capture footage of everything from sporting events to dangerous wildfires. More police jurisdictions are requiring their officers to wear body cameras to record their interactions with the public. Forget tripods—today’s cameras are in motion by nature, and they’re generating enough video to account for roughly 80% of all Internet traffic, according to Cisco. That’s a lot of potentially shaky video, making video stabilization even more challenging.
The right stabilization isn’t always the most powerful
Today’s video stabilization technology is powerful and effective. It plays a crucial role in making video more watchable and even usable, depending on the application. So it stands to reason that more is better: throw your best, most powerful stabilization at incoming video, and it can and will be perfectly stable.
However, the best video stabilization is not always the most powerful video stabilization. In most modern situations, the best video stabilization is no more or less stabilization than a situation requires. After all, video stabilization — however it’s employed — requires other resources: processing, battery power, a high-resolution source, etc. Therefore, it’s important that cameras in motion — some of which need to conserve and run on battery power for long periods of time — include video stabilization technology that can adapt to various use cases. Software-based video stabilization offers that flexibility.
It’s also important to understand that perfectly stable video can be counterproductive. Highly processed, highly stable video may be great for robots — artificial intelligence (AI)-driven systems that analyze video for small details, for instance — but most of the time, humans want a more realistic video experience. We tend to get tired and find it harder to follow objects on-screen if the movement doesn’t seem natural. That’s because we’re used to a world in which what we see isn’t perfectly stable in our field of view.
Therefore, video stabilization should be flexible enough to change based on the intended experience. To create a realistic balance between more stability and less, for example, we may require video stabilization software designed to mimic human movements.
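One common way to think about this trade-off is to treat stabilization as smoothing of the estimated camera path, with a tunable strength. The sketch below is illustrative only (not any particular product’s algorithm): it smooths a one-dimensional per-frame camera position with an exponential moving average, where a `strength` near 0 leaves motion intact and a `strength` near 1 approaches a locked-down, tripod-like result.

```python
# Illustrative sketch: stabilization strength as a tunable smoothing factor.
# The camera path here is a 1-D trace of horizontal positions per frame; real
# systems track 2-D translation, rotation, and more.

def smooth_path(raw_path, strength):
    """Exponentially smooth a per-frame camera position trace.

    strength in [0, 1): 0 = no smoothing, near 1 = near-locked path.
    """
    smoothed = [raw_path[0]]
    for pos in raw_path[1:]:
        smoothed.append(strength * smoothed[-1] + (1.0 - strength) * pos)
    return smoothed

def corrective_shifts(raw_path, strength):
    """Per-frame shift to apply to each frame to follow the smoothed path."""
    smoothed = smooth_path(raw_path, strength)
    return [s - r for s, r in zip(smoothed, raw_path)]

# Example: a shaky pan to the right.
raw = [0.0, 3.0, 1.0, 4.0, 2.0, 5.0]
gentle = corrective_shifts(raw, strength=0.3)  # preserves most motion
strong = corrective_shifts(raw, strength=0.9)  # near-locked frame
```

Dialing `strength` up or down is one simple way software-based stabilization can adapt to a use case: high for analysis-grade footage, lower to preserve the natural feel of motion.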
Let’s explore some of the use cases that determine how much stabilization should be applied to video from cameras in motion.
More stabilization: information gathering
Whether it’s video from surveillance drones or body cameras, when the objective is to collect detailed footage for real-time or later analysis, the highest degree of stabilization is required. For example, utility operators send up drones to inspect power infrastructure. In order to pinpoint possible damage, the video footage these drones collect needs to be stable. Also, we are seeing growing adoption of video to support remote field service. A technician wearing a camera points it at whatever needs service and an expert back at a central office directs the field technician based on the video feed.
Increasingly, public safety organizations are using artificial intelligence to rapidly analyze video and identify important information, such as hikers lost in the wilderness or suspects trying to blend in with a crowd. For computers to analyze video quickly and effectively, the footage needs to be highly stable.
In cases of information gathering and analysis, it is perfectly acceptable if the stabilized video looks artificially so. Video stabilization software can even work with other types of video optimization algorithms to further improve the process. For example, algorithms that run on cameras in motion can also reduce motion blur and other visual noise. Many of these cameras capture video in less-than-perfect lighting conditions, so optimizing the footage in software will help the video stabilization software achieve the best results.
Less stabilization: live streaming and experience video
From smartphones capturing sporting events to action cameras shooting point-of-view adventures, consumers want video that’s stable enough to enjoy later, but not so stable that it loses the feeling of excitement and immersion. No one is planning to analyze the video—neither in the moment nor later, when shared among friends and family—so it doesn’t require the most advanced, high-intensity video stabilization algorithms. But it may need video stabilization software tuned for other purposes.
Video streaming, for example, puts a premium on file size and bandwidth. Next-generation 5G wireless networks will make it easier to stream video at very high resolutions, but to get the most out of the intended experience, video stabilization should be optimized for the use case. Stabilization algorithms can be tuned to use bandwidth and compression more efficiently. They can also be optimized to better manage a device’s battery power, so that a user can still get stable, realistic video footage of that amazing ski run without completely depleting the battery.
As with information gathering use cases, live streaming and experience video stabilization can benefit from other video enhancement algorithms, including noise reduction. Software that performs field-of-view or horizon correction, for example, can make for better video stabilization. Horizon correction is exactly what it sounds like—software that automatically levels the horizon while the user is walking, panning, or even using a selfie stick. Combined with video stabilization algorithms, such correction creates better video while maintaining realism.
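In principle, horizon correction can be driven by the device’s accelerometer: the sensed gravity vector reveals how far the camera is rolled from level, and the frame is counter-rotated by that amount. The sketch below is a simplified assumption of that idea; the function names are illustrative, not any particular camera API.

```python
import math

# Illustrative sketch of horizon correction. The accelerometer reports a
# gravity vector in the camera's coordinate frame; the roll angle between
# that vector and "straight down" is how far the horizon is tilted.

def roll_angle_deg(gravity_x, gravity_y):
    """Estimated camera roll in degrees (0 when level,
    positive when the camera is tilted clockwise)."""
    return math.degrees(math.atan2(gravity_x, gravity_y))

def horizon_correction_deg(gravity_x, gravity_y):
    """Rotation to apply to the frame so the horizon comes out level."""
    return -roll_angle_deg(gravity_x, gravity_y)
```

A level camera (gravity entirely along the y axis) needs no correction, while a 10-degree clockwise tilt calls for a 10-degree counter-rotation; in a real pipeline the sensor signal would also be filtered over time so the correction doesn’t jitter.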
Dual capture and other use cases
Beyond the two major uses of cameras in motion—information gathering and experience video—other nuanced applications continue to arise, creating more opportunities for applying right-sized video stabilization. Increasingly, smartphone users are simultaneously using the front and rear cameras in their phones, often to provide real-time visual commentary on whatever they’re capturing. The application creates a sort of picture-in-picture where the user’s face, captured with the front camera, sits atop video of the larger scene, captured with the rear camera, such as a landscape, a sporting event, or any other subject.
In such a use case, video stabilization of both feeds should work in harmony—having one feed perfectly stable and the other shaky would be an undesirable viewing experience. But because the dual-capture scenario is more akin to a live stream or experience video use case, a lesser degree of video stabilization may be preferable. The goal is to apply enough stabilization to create a realistic, balanced effect that is adaptable to sudden changes in speed and direction of motion. What’s more, video enhancement algorithms designed for the front camera are key: A “selfie mode” can detect the user’s face and ensure it stays centered in the video frame even as the user moves around.
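The framing half of such a “selfie mode” can be as simple as cropping around the detected face. The sketch below assumes a face detector already supplies the face center (all names are hypothetical) and computes a crop window that keeps the face centered while never leaving the full frame.

```python
# Illustrative sketch of "selfie mode" framing. Face detection itself is
# assumed (e.g., an on-device detector provides face_x, face_y per frame).

def centered_crop(face_x, face_y, crop_w, crop_h, frame_w, frame_h):
    """Top-left corner of a crop_w x crop_h window centered on the face,
    clamped so the window stays inside the frame_w x frame_h frame."""
    left = min(max(face_x - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(face_y - crop_h // 2, 0), frame_h - crop_h)
    return left, top

# A face in the middle of a 1080p frame yields a centered window; a face
# near a corner yields a window pushed flush against the frame edge.
```

Smoothing the crop position over successive frames (for instance, with the same kind of path smoothing used for stabilization) would keep the selfie feed from snapping around as the detector output jitters.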
As technology like AI advances, there are surely more new ways of using cameras in motion on the horizon. We may not know exactly what they are, but we can expect they’ll require a degree of automatic stabilization. Adopting next-generation, software-based video stabilization, with the greater flexibility it offers, ensures that existing and emerging camera-in-motion platforms can be easily adapted, calibrated, and tuned for future use.
>> This article was originally published on our sister site, EE Times Europe.
Johan Svensson is CTO of Imint (Uppsala, Sweden).