Button pushers - Embedded.com

In the early '80s I worked on a system that used radiation from 5 curies of cesium to measure the thickness of hot, 6-inch thick steel. This was a big industrial device whose main processor was a DEC (remember them?) PDP-11/45 (remember that?). A number of Z80s pushed data around the mill.

Like a lot of factory gear the instrument had an impressive front panel dominated by numerous switches, displays, and blinking lights. Bob, one of the company's sales guys, returning from a long trip abroad, was fascinated and told an interesting story.

Seems he had been an engineer before changing over to the dark side and had applied for a job at Cape Canaveral, working in a launch facility. The interview occurred in the blockhouse, in a room awash with those cool switches and lights that could easily form the backdrop of a high-tech thriller. Sitting at the desk he was adjacent to a big control panel, one with a 2-inch diameter cable snaking to the floor. The cable had been cut and anyone could see the panel wasn't connected to anything. Maybe it was a spare, or something headed for depot repair.

The usual tour followed, but it seems they kept coming back to the dead control panel. Coffee there, interview questions, and more. Finally he just couldn't resist the urge anymore and pressed one of the switches.

All of the blockhouse's alarms sounded.

Turns out this was part of the interview, a test to see if he could keep his hands in his pockets. An interviewer said, “We can't afford to have someone in a launch complex who likes to randomly push buttons.”

Bob was shown the door. NASA never did extend him a job offer.

I'm the same way. Aren't buttons meant to be pressed? What happens if we turn this one? Visiting so many engineering companies over the years I've trained myself to be good, to admire the equipment without touching. But it's hard.

On a transatlantic flight recently Swissair provided all of the passengers with personal LCD screens on the seat backs. The screens were mostly there to play the movie, but a rather complex set of controls and menus that controlled games and other features beckoned. It's boring to sit in a seat for hour after hour. What happens if I try, well, this?

The system crashed. For the rest of the flight the display was on the blink. My wife, a much more disciplined person, hadn't played with her controls so saw the entire movie, as did most of the people on the plane.

But not the guy across the aisle and one seat forward. I'd watched. He'd monkeyed with the controls, too. And now his display was on the fritz. So, now completely bored and movie-less, I leaned over and asked.

Yep, he's an engineer too.

My wife gently berated me for wrecking the system. But, in my opinion, there should be no way that I can crash a product meant for non-techies. I couldn't find a reset button, but I imagine that when they landed and power-cycled the plane everything came back to normal. But that's unacceptable. As the best embedded head I know says, “None of my systems has a reset button. They simply don't crash.”

I'm not sure I agree with that philosophy, as perfection is a terribly difficult state to attain. But we do rely too much on resets to cure software problems. People (well, some people) will press every button in utterly unexpected ways.

My Dell Jukebox MP3 player has a reset button. In fact, the manual describes how to reset the machine long before explaining how to actually use the device. How does a consumer feel when the user's manual immediately leaps into a “when the device's software crashes do this” narrative?

Every PC has some sort of a reset. Hold the power button for 7 seconds and the machine will shut down.

Any device that runs from removable batteries has the virtual reset switch. Pull the AAs out, wait 10 seconds, and try again.

Do we need reset switches? Is a watchdog timer adequate defense against odd modes invoked by users doing unexpected things? What do you think?

Jack G. Ganssle is a lecturer and consultant on embedded development issues. He helps companies with their embedded challenges, and is conducting one-day seminars about building better firmware faster in Austin and Baltimore in April. Contact him at . His website is .

Reader Response


A reset, and a function that returns the user-data portion of a flash database to a known state, are essential. They might be hidden behind some well-done equivalent of the MS-DOS “three-finger salute” (Ctrl-Alt-Del). There is no telling when a bit will get flipped the wrong way by a momentary event in SDRAM or on the bus, and one will need a way to bring the system back to life.

Unless you are using ECC RAM and a server-grade architecture and board layout, these events are a matter of time on all but the simplest of designs. Even on simple designs, brownouts, noise, or ESD can lead to a need for a reset in some cases with consumer-grade items (and sometimes even on other hardware).

– anonymous


Reset button was invented at Western Electric by a PDP-8 user who wanted to dial in from home and run his computer. Crash? Just call back! At work he had a 4 inch RED button that shorted out the 'incoming call' line!

Reset button implies software not complete or too darn complicated to EVER be done.

– Rick Merrill


Does answering this question constitute pressing the button?

We develop therapeutic medical devices and ALWAYS use the microcontroller watchdog timer as well as a second hardware watchdog timer. When the medical therapy device is active you will be glad to learn that nothing the operator can do (even if they lose the instructions) will cause something unexpected to happen. We verify this by code review, engineer testing, QA testing, and outside testing in the clinical setting. And no, we never put a reset button on the instruments. And, we get hammered by the FDA and must recall all devices if a flaw resulting in a real or potential injury is reported.

And yes, this more than doubles the development cost and time to market.

– Frank Ingle
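
For readers who haven't met one, the basic watchdog behavior described above can be sketched in a few lines. This is a host-side Python simulation, not firmware: the `Watchdog` class, the tick-based timeout, and the `on_expire` hook are all invented for illustration. A real device would rely on the microcontroller's hardware timer, plus an independent external one as in the comment above.

```python
class Watchdog:
    """Simulated watchdog: expires if not kicked within `timeout_ticks`."""

    def __init__(self, timeout_ticks, on_expire):
        self.timeout_ticks = timeout_ticks
        self.on_expire = on_expire          # hypothetical reset hook
        self.remaining = timeout_ticks
        self.expired = False

    def kick(self):
        """Called by healthy code to prove the main loop is still alive."""
        if not self.expired:
            self.remaining = self.timeout_ticks

    def tick(self):
        """Called once per timer tick (by hardware in a real system)."""
        if self.expired:
            return
        self.remaining -= 1
        if self.remaining <= 0:
            self.expired = True
            self.on_expire()


resets = []
wdt = Watchdog(timeout_ticks=3, on_expire=lambda: resets.append("reset"))

# Healthy loop: the watchdog is kicked after every tick, so it never expires.
for _ in range(10):
    wdt.tick()
    wdt.kick()
healthy_resets = len(resets)

# Hung loop: ticks keep coming but nobody kicks, so the watchdog fires once.
for _ in range(5):
    wdt.tick()
```

The point of the second, independent watchdog is that this logic must not share a failure mode with the code it is guarding; a simulation like this only shows the timing behavior.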


Back in the old days, there was one product we were trying to make impervious to button pushers. We finally replaced the keyboard with an RS-232 connection and fed a pseudo-random stream of codes to the system. It did turn up a lot of errors. Ironically, we used the source code from the system as the pseudo-random stream.

– Mike Harvill
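
The random-keystroke trick above is easy to try on a host before touching hardware. The sketch below is an assumption-laden stand-in: `MenuUI`, its key codes, and the invariant being checked are all hypothetical, and the byte stream comes from a seeded generator rather than from a serial link or a source file.

```python
import random


class MenuUI:
    """Hypothetical device UI: a tiny menu state machine driven by key codes."""

    STATES = ("idle", "menu", "edit")

    def __init__(self):
        self.state = "idle"
        self.value = 0

    def handle_key(self, code):
        # Codes that are unknown, or not valid in the current state, are ignored.
        if code == 0x01:                                  # MENU key
            self.state = "menu" if self.state == "idle" else "idle"
        elif code == 0x02 and self.state == "menu":       # ENTER key
            self.state = "edit"
        elif code == 0x03 and self.state == "edit":       # UP key
            self.value = min(self.value + 1, 99)
        elif code == 0x04 and self.state == "edit":       # DOWN key
            self.value = max(self.value - 1, 0)


def fuzz(ui, codes):
    """Feed key codes to the UI; return details of the first broken invariant, or None."""
    for i, code in enumerate(codes):
        ui.handle_key(code)
        if ui.state not in MenuUI.STATES or not 0 <= ui.value <= 99:
            return (i, code, ui.state, ui.value)
    return None


rng = random.Random(1234)   # seeded, so any failure can be replayed exactly
stream = (rng.randrange(256) for _ in range(100_000))
failure = fuzz(MenuUI(), stream)
```

Any iterable of byte values works as the stream, so a file's contents could be fed in the same way.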


Much more to this (important) issue than indicated.

When, for example, is it essential to have the system cycle to a stable state after 'crashing' — and how is that state established and defined?

When is it essential NOT to cycle the system, or some of its functionality, when it crashes (perhaps in response to a malicious attack, or DOS attack on a distributed system with corrupt or uncertain rebuild code)?

A recent issue of Crosstalk magazine discusses this subject in some interesting detail. One of their conclusions is that fairly sophisticated security is required AROUND the reset function (which, in turn, may require its own VM, reset procedures, obfuscation, etc.)

A good consumer system (such as the Swissair system described) would contain several functions — one of which would revert to a running state, like a selective 'undo' on the commands through the UI. Wouldn't be that hard to program it so you couldn't 'revert' to previously-running diagnostics, etc., that “users” shouldn't get access to even if (perhaps 'particularly' if) they're computer-savvy…

– Robert M. Ellsworth
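
One way to read the "selective undo" idea above is as a screen history that refuses to land on anything a passenger shouldn't see. The following is only a sketch under that reading: the screen names, the `RESTRICTED` set, and the `ScreenHistory` class are all made up for illustration.

```python
RESTRICTED = {"diagnostics", "service-menu"}   # hypothetical screen names


class ScreenHistory:
    """Tracks visited screens; revert() never returns a restricted one."""

    def __init__(self, home="movie"):
        self.home = home
        self.stack = [home]

    def go(self, screen):
        self.stack.append(screen)

    def revert(self):
        """Drop the current screen, then keep dropping until a user-safe one is on top."""
        if len(self.stack) > 1:
            self.stack.pop()
        while len(self.stack) > 1 and self.stack[-1] in RESTRICTED:
            self.stack.pop()
        if self.stack[-1] in RESTRICTED:
            self.stack = [self.home]
        return self.stack[-1]


history = ScreenHistory()
history.go("games")
history.go("diagnostics")   # reached by accident, or by a curious engineer
history.go("settings")
first = history.revert()    # skips over "diagnostics"
second = history.revert()
third = history.revert()    # already home; stays there
```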


Just as PCs and similar devices let you undo an operation by clicking the same control a second time, every button on the panel should have its own reset: each button should act as a toggle, 1 for the first press and 0 for the second.

– vasanthalakshmi ch


I think that watchdog timers are most effective with hardware/environment problems and certain types of programming problems. A watchdog usually just restarts the program but doesn't usually reset all the data to factory defaults. You wouldn't want to do that to the setup data every time a thunderstorm rumbled through.

A means to reset the device, however, gives the user a way to get from a corrupted data set back to the factory (known good) one.

Even though everyone does limit checking on data sets, as long as the word “bug” is in our dictionary it behooves us to provide a reset mechanism. I had a fancy amateur radio that somehow got messed up with a stored “setup” data set, and every time I cycled power it just came back up in the screwed-up state. I finally had to do the “hold down two buttons while powering up” routine to get back to the factory defaults.

– Bob Callahan
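
The two mechanisms in the comment above, limit checking and a hold-two-buttons factory reset, fit naturally into one boot-time decision. Here is a rough sketch; the field names, limits, default values, and button labels "A" and "B" are all assumptions, not taken from any real radio.

```python
FACTORY_DEFAULTS = {"volume": 5, "channel": 1, "squelch": 3}
LIMITS = {"volume": (0, 10), "channel": (1, 99), "squelch": (0, 9)}


def is_valid(setup):
    """Limit-check every field; a missing or out-of-range value means corrupt."""
    for key, (lo, hi) in LIMITS.items():
        value = setup.get(key)
        if not isinstance(value, int) or not lo <= value <= hi:
            return False
    return True


def load_setup_at_boot(stored, buttons_held):
    """Pick the setup to run with at power-up, plus the reason for the choice."""
    if buttons_held >= {"A", "B"}:          # user-requested factory reset
        return dict(FACTORY_DEFAULTS), "factory reset requested"
    if not is_valid(stored):
        return dict(FACTORY_DEFAULTS), "stored setup failed limit check"
    return dict(stored), "stored setup ok"
```

The manual path matters because a setup can pass every limit check and still be wrong as a whole, which appears to be what happened to the radio.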


Of course every embedded system must have provisions for recovery from unexpected events. Clearly the microprocessor's watchdog timer, external watchdog timers, and possibly a user-initiated reset are options. We just shouldn't count on the watchdog to save us from inadequate user interface testing. As we move from simple buttons to GUIs and on to voice user interfaces we must give the user the ability to recover from his/her errors gracefully, and leave the watchdog to pick up that rare hardware event that sends the processor off into never-never land.

– Scott Greene


At NASA, system resets are usually done by cycling power. Subsequent loss of control is handled by having redundant systems or reverting to manual control, the solution preferred by astronauts.

I have seen systems here trend toward more automation with the users having less authority over the system. I expect the new Crew Exploration Vehicle to continue this trend.

– Joel Altman


Every embedded system should be able to recover by itself from any condition. In many of my early product designs most of the problems were linked to the processor going into an unknown state due to power-line noise and never coming out without power cycling. With newer watchdog and power supervisory circuits the hardware design has become easier, but adding software protections requires a lot of careful design and implementation. This is an important differentiator between application programming and embedded systems programming. Unfortunately, many embedded systems and software engineers are either ignorant or careless about it.

I would like to share a funny incident. I was interviewing an engineer for an embedded systems position and he explained the design he had made for a water heater. He had put a RESET switch on the front panel of the water heater! I asked him, “Do you expect a kid or a housewife operating the water heater to know what RESET means?” I asked him why he didn't use a power supply supervisory chip with a watchdog timer, and the response was, “What is a watchdog timer?”

– Upendra Patel


In Jurassic Park, interestingly, the mathematician Malcolm argues that the entire park (including all those dinosaurs, technicians, and of course the electronics) is a non-linear system and hence governed by chaos theory. Small perturbations can get amplified and drive the system toward instability.

The programmer of the park's security system sabotages the computers.

The reset sequence helps the rest of the park's survivors to bring it back to life.

All our systems are indeed non-linear in nature: the effect of a single bit flip can drive the system toward instability. Hence adding an external reset seems to be a necessity.

– Badri


There are situations where you need to monitor what triggered the system to go into the “weeds”. In most of my embedded software I include Java-style “exception” handling, wherein you “catch” something that is not expected and report it to the user, returning control to the system only on a manual RESET. The manual reset ensures that if something unexpected occurred, the system lets you know where and when, so that we can figure out HOW.

– nchinoy
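
The catch-report-and-hold pattern described above might look something like the following. It is a loose Python sketch, not the commenter's code: `FaultLatch`, its fields, and the `scale` task are invented for the example, and the clock is injectable only so the behavior can be checked deterministically.

```python
import time
import traceback


class FaultLatch:
    """Runs tasks; after the first unexpected error, refuses work until manually reset."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.fault = None            # None means "no fault latched"

    def run(self, task, *args):
        if self.fault is not None:
            return None              # latched: hold here until manual_reset()
        try:
            return task(*args)
        except Exception as exc:
            # Innermost frame of the traceback is where the error was raised.
            frame = traceback.extract_tb(exc.__traceback__)[-1]
            self.fault = {
                "what": type(exc).__name__,
                "where": f"{frame.name}:{frame.lineno}",
                "when": self.clock(),
            }
            return None

    def manual_reset(self):
        """Hand the fault record to the operator and resume normal operation."""
        report, self.fault = self.fault, None
        return report


def scale(divisor):
    """Hypothetical task that fails on unexpected input."""
    return 100 // divisor
```

Calling `latch.run(scale, 0)` would latch a `ZeroDivisionError` record; every later `run` then returns `None` until `manual_reset()` hands back the report.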


In automotive control systems (not infotainment!) we are fortunate in that we don't have to deal with what most people consider to be a user interface – there are (generally) no buttons to press. We do have a user interface, but ours is foot pedals and steering wheel, for braking and steering systems. It's actually quite hard for the user to enter an invalid combination! Nevertheless, we're in a particularly noisy environment, and have to deal with its consequences.

We don't have a reset button. But we do have at least one watchdog, and a minimum recovery time, and (currently) a known safe state. Also, we protect all our adapted data with CRCs and double banking in NVM (serial EEPROM, for instance), and have a 'known good' (if not ideal) set of data to fall back to. That's for current systems that have a mechanical backup.

The next step is x-by-wire, where we remove the steering column and operate the steering rack using electrical signals from the position of the steering wheel, or we remove the hydraulic braking system and drive motors on the calipers from electrical signals deduced from the position of the brake pedal. This makes vehicle manufacture simpler and 'ambidextrous' – we can just unplug the driver controls from a right-hand-drive vehicle and plug them in the other side to make a left-hand-drive vehicle – but we no longer have a safe state for the electronics. It has to be fault tolerant, which means it keeps working even when it's broken. That makes a reset (even one generated by a watchdog) quite hard to handle. If nothing else, we have to have the system back up and running in around 50ms to prevent the driver noticing the glitch.

You'll appreciate this poses some interesting design problems, which we can't solve in the same way as NASA would with fully redundant systems – it's not a practical economic proposition. It will hold up x-by-wire for a year or two, but I'm confident it won't stop it in the long run.

– Paul Tiplady
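
The CRC-plus-double-banking scheme mentioned above for adapted data can be illustrated on a host. This is a minimal sketch under several assumptions: the record layout, the `DEFAULTS` values, and the use of CRC-32 from `zlib` are all choices made for the example, and a Python list stands in for two NVM banks. Sequence-number wraparound is ignored.

```python
import struct
import zlib

DEFAULTS = (100, 200, 300)   # hypothetical 'known good' fallback values
_FMT = "<I3H"                # sequence number + three 16-bit values


def pack_record(seq, values):
    """Serialize one bank: body followed by a CRC-32 of the body."""
    body = struct.pack(_FMT, seq, *values)
    return body + struct.pack("<I", zlib.crc32(body))


def unpack_record(raw):
    """Return (seq, values), or None if the length or CRC is wrong."""
    if len(raw) != struct.calcsize(_FMT) + 4:
        return None
    body = raw[:-4]
    (crc,) = struct.unpack("<I", raw[-4:])
    if zlib.crc32(body) != crc:
        return None
    seq, *values = struct.unpack(_FMT, body)
    return seq, tuple(values)


def save(banks, values):
    """Overwrite the older (or invalid) bank so one good copy always survives."""
    decoded = [unpack_record(b) for b in banks]
    seqs = [d[0] if d else -1 for d in decoded]
    target = seqs.index(min(seqs))
    banks[target] = pack_record(max(seqs) + 1, values)


def load(banks):
    """Newest bank with a valid CRC wins; fall back to DEFAULTS if none is valid."""
    good = [d for d in (unpack_record(b) for b in banks) if d]
    return max(good)[1] if good else DEFAULTS
```

If power fails in the middle of a write, only the bank being written can be damaged, so `load` still finds the previous copy.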


On my first assignment (some 20 years ago) I had to write a driver for the user interface of a rather complex communication center. The panel itself had over 150 keys. After I wrote and tested the driver, the project manager came to my booth and started to push buttons with an open palm, pressing several keys at a time, in a random fashion. After a few seconds, the system crashed. I quickly found the bug (a loop for key decoding was not designed for simultaneous pressing of several keys and was never exited, causing the watchdog to kick in) and asked him why he did it. He answered that he had once watched the guys operating the system. They do it in 12-hour shifts and get bored, and they start to play with the keys. We engineers tend to think that all users know what they are doing. Wrong. Most users don't (who reads 100-page user manuals?). They simply start to push keys in a random fashion to see what will happen. Conclusion: create a case for each possible key combination and handle it accordingly.

– Dejan Durdenic
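
A decoder that survives the open-palm test needs a loop whose length does not depend on what is pressed, and an explicit policy for combinations. The sketch below shows one such arrangement; the 16-key panel, the bitmask scan format, and the single permitted chord are assumptions made for the example, not details of the original driver.

```python
NUM_KEYS = 16
# Hypothetical whitelist of multi-key combinations that mean something.
CHORDS = {frozenset({0, 15}): "factory-test"}


def pressed_keys(scan_bits):
    """Bounded loop: always exactly NUM_KEYS iterations, however many keys are down."""
    return [k for k in range(NUM_KEYS) if scan_bits & (1 << k)]


def decode(scan_bits):
    """Map one keyboard scan to an action name, or None if it should be ignored."""
    keys = pressed_keys(scan_bits)
    if not keys:
        return None
    if len(keys) == 1:
        return f"key-{keys[0]}"
    # Several keys at once: act only on known chords, ignore everything else.
    return CHORDS.get(frozenset(keys))
```

Here an open palm (`decode(0xFFFF)`) simply decodes to `None`, while `decode(1 << 3)` gives `"key-3"`.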


Last week I was the owner of a Creative Muvo TX FM MP3 player for one day. It's a nice little gadget that I bought to record conversations (which it's no good for, as it turns out, because the microphone and ADPCM coding together create recordings “of poor quality” (crap)).

Without trying, I crashed the Muvo. I was listening to FM, pressed a button (there are only three buttons on this thing), and put the Muvo into a mode from which I could not recover. No matter what I did, I couldn't exit FM station-search/store-preset mode.

Simple, as you say. There's one AAA battery. Take it out, wait 10 seconds, put it back in. Still in FM mode. When you've got half a gig of flash incorporated into a device, I guess you can afford to keep track of what you're doing.

So, guess how you reset this device. You can't reset it, there's no reset sequence. You can't kill an OTL process because it remembers where it is using the flash. You have to — reload its OS. That's right. You unplug the Muvo from its battery carrier revealing a USB connector. You plug it into your PC and you run the boot loader that jams an OS back into the Muvo. Like magic, all your presets and your stored music disappear and you've got a pristine, working, and empty Muvo. Mine went back to Fry's.

Progress.

– Steve Leibson


I actually own a device without a reset button (that I know of): my Apple iPod. And it has crashed on me once (sort of: still alive but running very slowly and not producing any sound). The only way to reset it that I could figure out was to let it run down the battery (which, as it was still working the CPU, did not take more than overnight).

Even the sun has some spots…

– Jakob Engblom
