Repair or Replace
Jack G. Ganssle
An engineer's adventures in paradise: Jack shares what he's learned from watching engineers make do in the Bahamas.
Scott and I trudged up the hill, our pace slow, not so much due to the incline, but more because of the tropical sun. It beat down relentlessly, turning the quick walk of type-A people into an unhurried stroll.
Our goal was the 200-foot radio tower visible from every corner of Staniel Cay (pronounced "Key"), a small island in the Bahamas' Exuma chain. I'd been searching for an Internet connection for weeks. Desperation reached a peak in Warderick Wells (no phone service at all; only one family lives in the small group of islands) before I made friends with folks on a sailboat who transferred e-mail via single sideband radio. Though it was impossible to suck down my mail, I managed to send off a critical message that went through from my new acquaintances' computer, into their SSB radio, over the high frequency bands to a ham station ashore, that routed the packets into the 'Net. No drug addict has worked harder at getting a fix.
The guidebooks hinted that Staniel Cay was the center of Exuma civilization, offering all sorts of services. That suggested Internet to my too-long-unconnected brain, which, along with its attraction of having the caves used to film the James Bond movie Thunderball, brought us here from more remote places.
The tower provided phone service to all 60 inhabitants of the island by means of a microwave link to islands up the chain. Not a single wire connects this remote bit of paradise to the world; even power is generated internally, by the Cay's truck-sized diesel plant.
Most islanders view Batelco (the Bahamas Telephone Company) with derision. In Nassau we learned it's not unusual to wait three to six months for a new phone, and that's in the very capital of the Bahamas. Far more pay phones are broken than operative. Yet I was struck by the difficulty of the company's mission in the out-islands. It's hard to believe that the minimal economy of some of the more remote Cays can support a telephone system, yet Batelco does indeed provide phones to virtually every resident. Here in Staniel the phone book shows 35 phone numbers, all sharing the same first five digits of their seven-digit dialing code.
At the base of the tower we found a hut. "Building" would be far too grand a word for the small concrete structure that houses the island's phone service. It turns out that long distance calls to the U.S.A. cost up to $6 per minute. Knowing that after three weeks I'd have 500 to 1,000 e-mail messages waiting, I decided to pass on my Internet quest until we returned to Nassau a few days hence. But the air conditioning felt so good after the tropical heat that we stayed and chatted up the engineer.
Both Scott and I are embedded folks, our careers in this field starting with the dawn of the microprocessor age. We were fascinated by the system's equipment and the story of how Batelco keeps everything running.
Phone service came to most of these out-islands only within the last 20 years. Before that, before microprocessors made central offices small and cheap, electronic inter-island communication was non-existent or consisted of irregularly used radios. In the panic of getting products out the door, of deadlines and hassles, sometimes it's hard to see how our small part in the embedded world benefits people. Though I had nothing to do with the telecom industry, I felt no small pride in being in an industry that had clearly helped these communities. Staniel Cay doesn't even have a doctor, but when the inevitable medical emergency arises, a phone call to Nassau brings help by air in hours.
Most fascinating, though, were the problems associated with keeping the system running. Everything electronic breaks. In the U.S.A. this is an annoyance. We toss out old broken gear and shed only the smallest tear at the cost of the new CD player, cell phone, or whatever. For toy junkies like us, the tear is more theatrical than real, since the failure of a product means we get to buy a newer, fancier, more feature-rich version. Dig out the American Express card, place a toll-free call, and the replacement widget arrives on our doorstep before 10:30 a.m. the next day. We're so accustomed to this sort of service that much of American business is planned around quick deliveries. Too often, we substitute FedEx for planning ahead. Need something on Thursday? Wait till Wednesday to start thinking about it.
Here in Staniel Cay there's no FedEx; no UPS; no postal system. The mail boat arrives once every two weeks, more or less, if the weather is good. Occasionally it sinks. Everything comes on the boat. Food, fuel, outboard motors, and mail. Need something before the boat arrives? You'll do without, learn to wait, and live with even more disappointment since seven times out of 10 the thing you order won't be on it anyway. "Sorry mon, it be here real soon now."
I met a fellow stuck in the Cay for two months while he waited on a part for his engine. ("Stuck" is perhaps the wrong word for being forced to spend time in paradise.) He finally went islander and cobbled up a fix using duct tape and baling wire.
Batelco suffers along with the locals. When parts of the phone system go down, as they must in this difficult tropical environment, they either fix things here or wait, and wait, and wait, for spare parts. Though individual outages may be annoying, when the entire system goes down the island becomes telephonically isolated.
One corner of the tower's equipment room is stacked with spare boards. Another has a much higher pile of defective boards. The engineer uses a rather comprehensive set of built-in diagnostics to isolate failures to a board, and then swaps in a replacement card. The microprocessors make diagnostics a critically necessary part of the system, and unlike too many self-test routines, which seem more aimed at filling a niche on the datasheet ("includes full diagnostics!"), these actually work.
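I never saw the source for those diagnostics, so what follows is only a minimal sketch of what board-level isolation could look like, not Batelco's code. It assumes each board exposes a ROM image whose bytes sum to zero (mod 256) and a block of RAM that may be overwritten. The names `board`, `board_status`, `test_board`, and `first_failed_board` are hypothetical, and plain arrays can stand in for the real memory when trying it on a desktop machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes; the real system's codes are not described. */
typedef enum { BOARD_OK = 0, BOARD_ROM_BAD, BOARD_RAM_BAD } board_status;

/* Hypothetical description of the memory on one board. */
typedef struct {
    const uint8_t *rom;      /* ROM image; assumed to sum to zero mod 256 */
    size_t rom_len;
    volatile uint8_t *ram;   /* RAM to exercise (contents are destroyed) */
    size_t ram_len;
} board;

/* ROM passes when all bytes, stored checksum included, sum to zero mod 256. */
static int rom_ok(const uint8_t *rom, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + rom[i]);
    return sum == 0;
}

/* Write and read back 0x55 then 0xAA to catch stuck data bits.
   This simple pattern test does not detect address-line faults. */
static int ram_ok(volatile uint8_t *ram, size_t len)
{
    static const uint8_t patterns[] = { 0x55, 0xAA };
    for (size_t p = 0; p < sizeof patterns; p++) {
        for (size_t i = 0; i < len; i++)
            ram[i] = patterns[p];
        for (size_t i = 0; i < len; i++)
            if (ram[i] != patterns[p])
                return 0;
    }
    return 1;
}

/* Run both checks on one board and report which, if any, failed first. */
board_status test_board(const board *b)
{
    if (!rom_ok(b->rom, b->rom_len))
        return BOARD_ROM_BAD;
    if (!ram_ok(b->ram, b->ram_len))
        return BOARD_RAM_BAD;
    return BOARD_OK;
}

/* Return the index of the first failing board, or -1 if every board passes. */
int first_failed_board(const board *boards, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (test_board(&boards[i]) != BOARD_OK)
            return (int)i;
    return -1;
}
```

The point of returning a board index rather than a chip-level fault is that it matches the repair procedure: the person in the hut needs to know which card to pull, not which gate failed.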
He then calls Batelco headquarters in Nassau and orders a replacement board, ideally keeping at least one good spare on hand at all times. Unfortunately the system collapses in typical Bahamas fashion at this point. Headquarters accepts the order and promises a new part, but weeks go by before the mail boat comes, and all too often months pass while paperwork creeps from desk to desk.
The perversity of nature ensures the next failure will be on the same type of board, for which there's no longer a spare immediately at hand. The engineer earns his pay by cobbling up a solution.
As an American consumer, I'm always struck by the behavior of island economies. Here, as in so many other places, everything is imported. Trash is expensive. Where do old, used-up things go? Small islands have little room for disposal, so tossing out defective gear is problematic. So cars have long lives. No one replaces alternators or starter motors; they're always rebuilt. Shops weld and repair parts that in America we'd just replace. Trash piles are plundered for bits of treasure, that small metallic thing that can fix the car, the generator, or whatever. When the decrepit conch-fishing boat sinks at anchor, salvagers raise the rotten wreck, rebuild the engine, and squeeze ever more life out of it. It might sink a couple of times a year, but raising it is so much more effective than getting a replacement.
Technology that can't be repaired is a problem. Years ago aid agencies realized that large-scale projects in third world countries often resulted in large-scale defects; from this sprang the concept of appropriate technology, that which is optimized for use in a particular culture. So a series of small, repairable water pumps makes more sense than a mega-scale water treatment plant in parts of Africa. Mechanical ignitions, which can be serviced with baling wire, replace high-efficiency electronic spark systems.
There must be some critical size where the disposable society we've created "works." One where delivery and disposal are quick and easy. Where the economy is strong enough to support casual replacement. Though it pains me to be part of the disposable economy, it's the way America works. Technology changes so fast that yesterday's miracle is tomorrow's junk.
I once had a Tektronix 545 oscilloscope, a 100-pound vacuum tube beast that refused to die. The thing must have been 30 years old, worked like a champ, but wasn't fast enough for modern electronics. It had to go, but how could I toss such a huge, functional thing in the dumpster? Yet in the end it was obsolete, a relic with no value. I finally found a teenager who claimed to have a need for it, but figured that soon it, too, would be on the junk heap of outmoded technology.
Here in the Bahamas I suspect that my old scope would be a prize, no doubt to be repaired and maintained for decades to come. Though the shop that repairs computers might not find it useful, it would find its way naturally through the junk chain to the shop that, perhaps, repairs radios, where a 1 MHz bandwidth is more than enough. Instead of just chucking the thing (disposal itself is so expensive here), this Bahamian system I can't figure out seems to ensure that it'll find its natural niche in the obsolete technology environment.
So the Batelco engineer repairs the broken board, or at least makes every attempt to fix the thing. All without much of a scope, logic analyzer, or development system (he has, in fact, never heard of one). His tools come from Sears. Wire cutters, needle nose pliers, clunky soldering gear. There's no surface mount station, not even an anti-static mat. Yet he's the master of a microprocessor-based telecom switching station.
I peered deeper, trying to understand better. The equipment is, by U.S. standards, old but robust. It uses 8085 microprocessors, a mid-'70s CPU that's still designed into some products. The 8085 is a 40-pin DIP device, an 8-bitter that runs at a few megahertz.
Each circuit board is beautifully designed, with wider-than-usual tracks (perhaps 0.020 inch), power and ground planes, and two circuit layers on the outside where they're accessible. Parts aren't crowded together like commuters in New York, but spread leisurely around. Each device is a DIP, with pins on 0.1-inch centers. Many are inserted in high-quality machined-pin sockets, yielding a somewhat expensive but totally reliable connection.
If you're not a hardware person, or are one who came of age in the last decade, the previous paragraph says that parts are accessible, they're replaceable, and pins and connections are large enough that even the bluntest fingers can easily slip a probe on any desired node.
The sockets make part substitution hassle-free. More, they allow the engineer to bend pins up and disconnect them at will to run experiments, without soldering. It gives him options, allows him to exercise his skills at debugging without requiring a lab full of highly skilled production people.
The Batelco employees scattered on these remote islands remove parts from their stock of other failed boards, swapping them as necessary to make things work. That pile of bad boards that no one could repair is a resource full of potentially good components. They live with essentially no stock of known good parts, since the head office believes that one spare board is enough; in their wisdom, management figures new boards are just a mail boat away. Life in the field gives the lie to such easy dreams, so these engineers develop strategies, like stockpiling failed boards, to deal with reality.
This older technology is perfectly adapted for the low-tech area it's destined for. Can you imagine this engineer — or one of us — in some remote spot with few tools, trying to repair a typical modern embedded product? One that's chock full of high-density SMT parts? With everything integrated into a couple of proprietary ASICs, which, of course, are no longer available?
Next time your cell phone fails, or when it's time to upgrade and your current model becomes another item on the technology junk heap, tear it apart. You'll be astonished at the packaging engineering. You'll find one or two boards covered front and back with micro-components soldered practically on top of each other. Without removing parts I bet you'll find that just closing the case up is difficult.
One of the worst symptoms of our headlong plunge into ever more elaborate and sophisticated technology is how rapidly parts become obsolete. Though this Batelco system used 8085 microprocessors that are still easily and cheaply available, most of our systems use components which won't exist in a year or two.
I can't count the times people have called or e-mailed me in despair over parts issues. Their product is just about done. Six to 12 months of engineering has yielded just the thing the boss wants and the company needs. Yet as they go to market, one or more integrated circuits, be they microprocessors, PLDs, or memory components, no longer exist. The "last buy" message comes from the vendor even before these poor folks have shipped a single unit. The panic that ensues is for all of the wrong reasons. Developers go into a frenzy because they need to ship product, usually forgetting that obsolete or near-obsolete components mean the system cannot be repaired in a year or a decade. Shipping is swell, but long-term maintenance is critical.
The kinder, gentler days when vendors had second sources for parts, when they produced even low volume components for 10 to 20 years, are gone. Some of the semiconductor folks tell me the solution is to design everything into high-density FPGAs and PLDs. I've made that mistake, and found that FPGAs, for instance, go obsolete as fast as any other part. The newer replacement version is almost never pin-compatible with its predecessor, necessitating a board redesign. Without pin-compatible replacement parts, the poor sod in the field will need a reliable stock of spare boards. In some parts of the world board stocks will always be thin to non-existent.
Seems to me that building an irreparable system is extremely poor engineering. I do recognize that today's reality sometimes means that repair is not an option, that a defective unit gets immediately tossed out. In this landfilled world that's scary, but it's a part of the way things work, no matter how annoying. But if your system is destined for a place where such casual disposal isn't possible, designing things that way is foolish.
As an engineer, you're making choices every day that will impact the life of your user. Recognize that some economies don't operate as frenetically as the U.S.'s, that sometimes repair is more important than features, size, and power consumption.
It's easy to succumb to an arrogance of the office, where we lose touch with real customer needs, when we listen on the phone with but a single ear to a repair person in the field who is upset, while we make faces and wish only that the turkey would shut up. As an engineering manager I learned to send the developers into the field, to let them do some service calls, interact with the customer, and feel first hand just how much pain these folks experience with the products. Let each engineer get beaten up in person by unhappy clients. It's a sure way to make them consumer advocates when they return, to give them a sense of what is really important. So if your product is destined for more remote areas of the world, design appropriately. Help your customer succeed despite the challenges of the local environment.
Jack G. Ganssle is a lecturer and consultant on embedded development issues. He conducts seminars on embedded systems and helps companies with their embedded challenges. He founded two companies specializing in embedded systems. Contact him at firstname.lastname@example.org.