Do you recall my ESC Boston: You should have been there! column from earlier this year? As part of that column, I included this short video, which was taken by my chum, Stephane Boucher, who leads the EmbeddedRelated.com website.
This video shows me demonstrating a mega-cool artificial neural network (ANN) and deep-learning system performing a machine vision application (the system was loaned to me by the chaps and chappesses at CEVA).
The way this works is that I set my notepad computer to display a series of random images that were originally gleaned from the Internet. I also have a small webcam that's pointing at the screen. The output from the webcam is fed into a board in which the neural network is implemented on a Xilinx FPGA. The neural network identifies the images. The output from the board is the original image augmented by an identifying caption, like “African Elephant,” “Human Baby,” “Steel Bridge,” “Soft Toy,” and so forth. This is all happening at a rate of about one image per second, which makes it even more impressive.
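The capture → classify → caption loop described above can be sketched in a few lines of Python. To be clear, this is only an illustrative sketch: the real system runs its network on a Xilinx FPGA, and the names here (`classify_frame`, `annotate`, the label list) are my own inventions, not CEVA's actual API. A stub classifier stands in for the neural network so the sketch is runnable.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """A captured webcam image (pixel data elided for this sketch)."""
    pixels: bytes

# A few of the labels mentioned above, standing in for the network's
# full training vocabulary.
LABELS = ["African Elephant", "Human Baby", "Steel Bridge", "Soft Toy"]

def classify_frame(frame: Frame) -> str:
    # The real network returns its highest-scoring class; this stub just
    # maps the pixel sum onto the label list so the sketch executes.
    return LABELS[sum(frame.pixels) % len(LABELS)]

def annotate(frame: Frame, label: str) -> str:
    # The board overlays the identified label on the original image;
    # here we simply return the caption text that would be drawn.
    return f"[{len(frame.pixels)}-byte frame] {label}"

# One pass of the roughly one-frame-per-second loop: capture (simulated),
# classify, and emit the captioned image.
frame = Frame(pixels=b"\x00" * (640 * 480))
print(annotate(frame, classify_frame(frame)))
```

In the real system, of course, the capture step reads from the webcam and the classify step is handled entirely by the FPGA-hosted network; the host's job is little more than shuttling frames in and captioned images out.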
I'll be taking this system with me to show off at the Electronics of Tomorrow conference in Denmark on October 31. The following week, I'll be demonstrating it as part of my Advanced Technologies for 21st Century Embedded Systems talk at the Embedded Systems Conference (ESC) in Minneapolis.
Before setting off on my travels, I thought I'd better set everything up in the bay outside my office, just to ensure that everything was tickety-boo, as it were. Once the system was up and running, I decided (with the permission of their boss) to invite the lads from the manufacturing area downstairs to come and have a look. After all, it's not every day that they get to see next-generation technology of this ilk.
So, we all trooped upstairs and they were suitably impressed. But then one of them (there's always one, isn't there?) said, “Hang on, how do we know that these aren't pre-selected images that the system has been specially taught to recognize?”
Well, that's not an unfair question, when you come to think about it, so I responded, “Why don't you try pointing the camera at other things in the room?”
First, he pointed it at my notepad computer and, a second later, the output display showed the image of the computer annotated with the text “Notepad Computer.” Next, he pointed it at a book, then a pen, and so on for other objects that happened to be lying around on the table.
Unexpectedly, he pointed it at one of the other lads — a nice guy sporting a T-shirt and a nonchalantly unshaved look. The system immediately responded with “Plumber's Helper.” I have to say that this took everyone by surprise and we all burst out laughing.
Goodness only knows how the neural network came to this conclusion. I'm also left wondering how “Plumber's Helper” came to be part of the network's training set in the first place. Still-and-all, as they say, it was a pretty clever conclusion, and it's resulted in the lad now having a new nickname that might follow him around for a long time to come.
Will you be attending ESC Minneapolis? If so, it would be great to see you at my talk. Alternatively, if you spot me ambling around, please feel free to stop me to say “Hi”. I'll be the one in the Hawaiian shirt. As always, all you have to do is shout “Max, Beer!” or “Max, Bacon!” to be assured of my undivided attention.