One of the things I talked about in my Not Your Grandmother's Embedded Systems session at ESC Silicon Valley was the topic of embedded vision. The example I often use is that of a next-generation electric toaster. The first time you power this on, it will cheerily greet you by saying: “Hello, what's your name?” I would, of course, answer: “You can call me Max the Magnificent!”
When I eventually come to drop in a couple of slices of bread, the toaster will recognize me, take note of the type of bread I'm using, and inquire: “Hello Max the Magnificent, how do you prefer this type of bread to be toasted?” I may respond by saying something like: “I like my toast to be well done.” When the toast pops out, the toaster might say something like: “How's that?” I may respond “That's just right” or “Perhaps a tad darker next time” or “Not quite so dark, if you please.”
The next time I go to make some toast, we can dispense with the dialog. The toaster will once again recognize me, it will recognize the bread, and it will present my taste buds with just the experience they are looking for. Similarly, the toaster will learn my preferences for bagels, baguettes, croissants, and so on, and it will do the same for all of the members of my household.
Now, when I give this sort of talk, some people are under the impression that this technology is a long way out; I believe it's closer than they think. But how might one go about adding embedded vision to one's systems? Well, I just received an email from my chum Rick Curl. The title of this email was “Who is watching you?” Inside, Rick pointed me toward a flyer for OMRON's Human Vision Components (HVC) module.
The HVC module integrates OMRON’s best-in-class image sensing technology (OKAO Vision) along with a camera, processor, and external interface, all on a single 60 mm x 40 mm PCB. The module boasts 10 functions, as follows:
- Human body detection
- Hand detection
- Face detection
- Face recognition
- Age estimation
- Gender estimation
- Facial pose estimation
- Gaze estimation
- Blink detection
- Expression/mood estimation
The HVC module uses serial communication via UART to report its findings in real time to your main system, which can use this information as the basis for its actions. In the case of a flat screen display showing an advert in a store, for example, the system may present different adverts based on the age and/or gender of the person viewing the display. Similarly, a vending machine may base its food and beverage recommendations on the age and/or gender of the customer.
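To make the advert idea a little more concrete, here is a minimal Python sketch of the decision logic on the main system's side. It assumes the host has already parsed whatever arrives over the UART into a simple record; the `Viewer` fields, the age bands, and the advert names are all my own inventions for illustration and are not OMRON's actual data format.

```python
from dataclasses import dataclass


@dataclass
class Viewer:
    """Hypothetical record parsed from the module's output (not OMRON's format)."""
    age: int      # estimated age in years
    gender: str   # estimated gender, e.g. "male" or "female"


def choose_advert(viewer: Viewer) -> str:
    """Pick an advert using made-up age bands and the estimated gender."""
    if viewer.age < 13:
        return "toys"
    if viewer.age < 30:
        return "sneakers" if viewer.gender == "male" else "headphones"
    return "coffee"


print(choose_advert(Viewer(age=8, gender="female")))   # toys
print(choose_advert(Viewer(age=45, gender="male")))    # coffee
```

A real system would presumably also want to check how confident the module is in its estimates before acting on them, but that depends on details of the module's output that I haven't covered here.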
In the case of my hypothetical toaster, it may base its responses on the user's expression. If it sees a “happy face,” it may bask in the glow of a job well done, while a sad or angry expression may prompt it to say: “Oh dear, did I do something wrong?”
I can envisage so many applications for this sort of thing. Take my Caveman Diorama project, which is to be presented in a 1950s television cabinet, for example. In an earlier column on this topic, I mentioned that there will be a variety of audio effects, like the sound of the waterfall and the sound of the cavemen chatting. I also mentioned the idea of having a tiny camera located at the back of the cave. When a visitor bends down to look into the cave, an image from the inside of the cave showing the visitor's face looking into the TV set will be presented as a live feed onto a screen mounted on the wall above the television (see also Caveman Cam).
Well, now imagine what we could do with one of OMRON’s HVC modules. Suppose we have a kid looking into the scene. If the kid smiles, the conversation inside the diorama could sound happy and include laughter; if the kid then frowns or looks puzzled, the tone of the conversation inside the diorama could change to reflect this; and so on and so forth.
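Here is one way the diorama's controller might decide which audio to play. This is only a sketch under stated assumptions: the expression labels and the clip file names are hypothetical, and I'm assuming the controller keeps a short list of the most recent expression readings it has received.

```python
from collections import Counter

# Hypothetical expression labels and audio clip names, for illustration only.
SOUNDTRACKS = {
    "happy": "cavemen_laughing.wav",
    "sad": "cavemen_consoling.wav",
    "puzzled": "cavemen_muttering.wav",
}
DEFAULT_TRACK = "cavemen_chatting.wav"


def pick_soundtrack(recent_expressions):
    """Return the clip for the most common recent expression.

    Voting over the last few readings keeps the audio from flipping
    every time a single reading is wrong.
    """
    if not recent_expressions:
        return DEFAULT_TRACK
    most_common, _count = Counter(recent_expressions).most_common(1)[0]
    return SOUNDTRACKS.get(most_common, DEFAULT_TRACK)


print(pick_soundtrack(["happy", "happy", "puzzled"]))  # cavemen_laughing.wav
```

With no readings at all, or with an expression that has no clip of its own, the function falls back to the ordinary caveman chatter.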
While you're mulling on that, check out this video showing example HVC applications:
I have to admit that I'm very enthused by all of this. New applications keep on popping into my mind. Take my Inamorata Prognostication Engine, for example. This little beauty is intended to predict whether the radiance of my wife's smile will fall upon me. (Of course, if Gina ever discovers the true purpose of this device, then I really won't need a Prognostication Engine to determine her mood 🙂)
Now, suppose that Gina comes to visit me in my office and pauses to peruse and ponder the Prognostication Engine, which sits in the bay outside. If the Prognostication Engine were equipped with an HVC module, it could detect and identify Gina's face and also determine her mood. If she starts off smiling, but soon begins to frown, and then starts to march toward my office, we could arrange things such that my door automatically closes and locks itself and an “On Air” sign lights up outside my door. This should give Gina pause for thought while also giving me sufficient time to escape through the window (LOL).
What about you? What applications spring to your mind for these little beauties?