Making embedded system debug easier: useful hardware & software tips
Construction Methods
Embedded controllers can be constructed using any one of several techniques, but the most common method is a printed circuit board (PCB). The PCB is constructed of insulating material, such as epoxy-impregnated glass cloth, laminated with a thin sheet of copper.
Multiple layers of copper and insulating material can be laminated into a multilayer PCB. By drilling and plating holes in the material, it is possible to interconnect the layers and provide mounting locations for through-hole components.
In designing the layout, or interconnecting pattern of the PCB, there are many conflicting requirements that must be addressed to make a reliable, cost-effective, and producible device. For low-speed circuits, the parasitic effects can be ignored, and the traces are often assumed to be ideal connections.
Unfortunately, real circuits are not ideal, and the wires and insulating material have an effect on the circuit, especially for signals with fast rise/fall times. The traces, or wires, on the PCB have stray resistance, capacitance, and inductance.
At high speeds, these stray effects delay and distort the signals. Special care must be taken when designing a PC board to avoid problems with transmission line effects, noise, and unwanted electromagnetic emissions.
Power and Ground Planes. When possible, it is a good idea to dedicate two layers of a PCB with four or more layers to the Vcc and ground signals. These are referred to as power and ground planes. One advantage is that the closely spaced planes form a beneficial parasitic capacitance that decouples the power supply at high frequencies, which reduces the power supply noise to the ICs.
Power planes also tend to act as a shield. They reduce the emission of electromagnetic radiation from the system that can cause interference, and they reduce the circuit’s susceptibility to externally induced noise.
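As a rough illustration of the plane-to-plane decoupling capacitance mentioned above, the pair of planes can be approximated as a parallel-plate capacitor. The board area (10 cm × 10 cm), plane spacing (0.25 mm), and relative permittivity (about 4.5, typical of an epoxy-glass laminate) are assumed values chosen for the example, not figures from this article:

```latex
C = \frac{\varepsilon_0 \varepsilon_r A}{d}
  = \frac{(8.854\times10^{-12}\,\mathrm{F/m})\,(4.5)\,(0.01\,\mathrm{m^2})}{0.25\times10^{-3}\,\mathrm{m}}
  \approx 1.6\,\mathrm{nF}
```

Under these assumptions the planes contribute on the order of a nanofarad of distributed capacitance. That is small compared with discrete decoupling capacitors, but it has very little series inductance, which is why its benefit shows up mainly at high frequencies.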
Ground Problems. Although the concept of an ideal circuit ground may seem relatively simple, a great many system problems can be directly traced to ground problems in actual applications.
At the least, this can cause undesirable noise or erroneous operation; at the worst, it can result in safety problems, including possibly even death by electrocution. Lest you dismiss the importance of this possibility too quickly, the author has narrowly missed electrocution while testing a device in which the grounding was improperly implemented!
These problems are most often caused by one of the following:
1) Excessive inductance or resistance in the ground circuit, resulting in “ground loops”
2) Lack of or insufficient isolation between the different grounds in a system: earth, safety, digital, and analog grounds
3) Nonideal grounding paths, resulting in the currents flowing in one circuit inducing a voltage in another circuit
The solutions to these problems vary, depending on the type of problem and the frequency range in which they occur.
Usually they can be reduced to two measures: using a single-point ground to reduce the currents flowing in impedances shared by circuits that need to remain isolated, and prudently applying shields and insulation to prevent unwanted parasitic signal coupling.
EMC and ESD Effects
Electromagnetic compatibility (EMC) issues have become much more significant now that there are a large number of electronic devices which unintentionally radiate electromagnetic energy in the same frequency ranges used for communication, navigation, and instrumentation.
Regulatory agencies—such as the Federal Communications Commission (FCC) in the United States, the Department of Communications (DOC) in Canada, and similar organizations in Europe—have defined limits to the amount of energy such electronic devices are allowed to emit at various frequencies.
Even more stringent requirements are placed on life-critical equipment, such as aircraft navigation and life support equipment, because of the sensitive nature of the applications. Among other things, these devices are required to provide a minimum level of immunity to externally induced noise (radiated and conducted susceptibility).
In solving an EMC problem, the first step is to identify the source of the noise, the path to the problem area, and the destination at which the problem manifests itself. Once these three characteristics of an EMC problem are identified, the engineer can evaluate the relative merits of eliminating the noise at its source, breaking the path using shielding and similar techniques, and reducing the sensitivity of the affected circuit.
There are several useful resources, including publications, seminars, test labs, and consultants who specialize in solving EMC problems. The best solution is usually to begin testing a new design at the earliest possible point in the prototype phase to determine the potential problem areas so that they can be addressed with the least cost and schedule impact.
Electrostatic discharge (ESD) is an important design consideration in embedded applications because of the potential for failure and erroneous operation in the presence of external electric fields.
ESD voltages on the order of tens of thousands of volts are commonly impressed on embedded interfaces when someone walks across a floor in a low-humidity environment and then touches an electronic device.
One of the most common places where this becomes an issue is in the keyboard or user input device, which comes in direct contact with the outside world. This effect can cause immediate damage or upset or may cause latent failures that show up months after the ESD event.
Designers most often minimize the effects of ESD with shielding and grounding techniques similar to those used for safety and emission reduction. The same resources that are available for EMC problems are also generally of use for ESD problems.
Fault Tolerance
Increasingly, fault tolerance has become a requirement in embedded systems as they find their way into applications where failure is simply unacceptable. Many hardware and software solutions have been developed to address this need.
To understand how to deal with these faults, we must first identify the types of fault and understand the nature of each. Every fault can be categorized as a “hard” or a “soft” fault. Hard faults cause an error that does not go away; for example, pushing reset or powering down does not result in recovery from the fault condition. Soft faults are due to transient events or, in some cases, program errors.
Self-test and diagnostic programs may be able to identify and diagnose the failure if it is not too severe.
Depending on the type of fault that occurs and which device(s) are affected, it may be possible to design a system to detect the fault, possibly even isolating the location of the fault to some degree. In the event of a soft failure, it may be possible for the designer to make the system recover from the fault automatically.
A built-in self-test program can be written for an embedded processor that will be able to detect faults in the following types of devices:
• Processor (if the fault is not too severe)
• Memory
  • ROM
  • RAM
  • E/EEPROM
• Peripheral devices
Note that it is difficult, if not impossible, to detect faults in the control circuits or “glue logic” in a system. Other devices, such as memories, lend themselves to diagnostic methods.
The data contents of ROM devices can be tested for errors using one or more of the following techniques:
• Parity
• Checksum
• Cyclic redundancy check (CRC)
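The three ROM checks listed above can be sketched in C as follows. This is a minimal illustration, not the author's implementation: the function names are hypothetical, the CRC shown is the common CRC-32 (reflected polynomial 0xEDB88320) computed bit by bit, and the expected CRC is assumed to have been calculated at build time and stored alongside the image.

```c
#include <stddef.h>
#include <stdint.h>

/* Parity of one byte: returns 1 if the byte has an odd number of 1 bits. */
uint8_t parity8(uint8_t x)
{
    x ^= (uint8_t)(x >> 4);
    x ^= (uint8_t)(x >> 2);
    x ^= (uint8_t)(x >> 1);
    return (uint8_t)(x & 1u);
}

/* 8-bit additive checksum: sum of all bytes, modulo 256. */
uint8_t rom_checksum8(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return sum;
}

/* Bitwise CRC-32: initial value and final XOR are both 0xFFFFFFFF. */
uint32_t rom_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns 1 if the image matches the CRC recorded for it (assumed to be
 * computed at build time), 0 otherwise. */
int rom_image_ok(const uint8_t *image, size_t len, uint32_t expected_crc)
{
    return rom_crc32(image, len) == expected_crc;
}
```

A simple checksum is cheap but misses errors that cancel out, such as two bytes swapped; a CRC costs more cycles and catches far more error patterns, so the choice is a trade-off between self-test time and coverage.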
RAM devices and the integrity of information stored in RAM by the processor can be tested for proper operation using one of the following techniques:
• Hardware error detection and correction
• Data/address pattern tests
• Data structure integrity by checking stack limits and address range validity
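As a sketch of the pattern tests and the stack-limit check listed above, the C functions below run a walking-ones test on one location, an address-derived pattern test over a region, and a guard-pattern check. All names, the guard value 0xA5, and the idea of placing guard bytes at the stack limit are assumptions made for illustration. Both RAM tests are destructive, so in a real system they would be run before the region holds live data.

```c
#include <stddef.h>
#include <stdint.h>

#define GUARD_PATTERN 0xA5u  /* assumed guard value, chosen arbitrarily */

/* Walking-ones test of the data lines at a single location.
 * Returns 0 on success, or the first pattern that failed to read back. */
uint8_t ram_test_data_bus(volatile uint8_t *addr)
{
    for (uint8_t pattern = 1; pattern != 0; pattern = (uint8_t)(pattern << 1)) {
        *addr = pattern;
        if (*addr != pattern) {
            return pattern;
        }
    }
    return 0;
}

/* Value derived from the cell's offset, so that shorted or stuck address
 * lines cause neighbouring cells to read back the wrong data. */
static uint8_t addr_pattern(size_t i)
{
    return (uint8_t)(i ^ (i >> 8));
}

/* Fills the region with the address-derived pattern, verifies it, then
 * repeats with the inverted pattern so every bit is tested as 0 and 1.
 * Returns the offset of the first failing cell, or len if all cells pass. */
size_t ram_test_region(volatile uint8_t *base, size_t len)
{
    for (int pass = 0; pass < 2; pass++) {
        uint8_t invert = pass ? 0xFFu : 0x00u;
        for (size_t i = 0; i < len; i++) {
            base[i] = (uint8_t)(addr_pattern(i) ^ invert);
        }
        for (size_t i = 0; i < len; i++) {
            if (base[i] != (uint8_t)(addr_pattern(i) ^ invert)) {
                return i;
            }
        }
    }
    return len;
}

/* Writes the guard pattern into bytes placed at a stack or buffer limit. */
void guard_fill(volatile uint8_t *guard, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        guard[i] = GUARD_PATTERN;
    }
}

/* Returns 1 if the guard bytes are untouched, 0 if something overran them. */
int guard_intact(const volatile uint8_t *guard, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (guard[i] != GUARD_PATTERN) {
            return 0;
        }
    }
    return 1;
}
```

The guard check is nondestructive, so it can be called periodically at run time, whereas the pattern tests fit best in the power-on self-test.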
Additionally, the integrity of the program and proper execution sequence by the CPU can be checked using one or more of the following techniques:
• Hardware parity error detection
• Duplicate, redundant hardware and cross checking or voting
• Watchdog timer that operates the CPU chip’s reset line
• Diagnostics that run continuously whenever the CPU has nothing else to do
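Two of the techniques above, voting across redundant copies and the watchdog timer, can be modelled in a few lines of C. This is a sketch under stated assumptions: the names are hypothetical, the voter assumes three redundant copies of a value, and the watchdog is a pure software model. A real watchdog is a hardware timer with device-specific registers that pulls the reset line itself when it expires.

```c
#include <stdint.h>

/* Bitwise two-out-of-three majority vote: each output bit takes the value
 * held by at least two of the three redundant copies. */
uint32_t vote3(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

/* Software model of a watchdog timer (hypothetical, for illustration). */
typedef struct {
    uint32_t reload;  /* timeout in ticks */
    uint32_t count;   /* ticks remaining before expiry */
} watchdog_t;

void wdt_init(watchdog_t *w, uint32_t timeout_ticks)
{
    w->reload = timeout_ticks;
    w->count = timeout_ticks;
}

/* Called by healthy application code to restart the countdown. */
void wdt_kick(watchdog_t *w)
{
    w->count = w->reload;
}

/* Called once per timer tick. Returns 1 once the countdown has reached
 * zero, which is the point where real hardware would reset the CPU. */
int wdt_tick(watchdog_t *w)
{
    if (w->count > 0) {
        w->count--;
    }
    return w->count == 0;
}
```

In use, the main loop would call `wdt_kick` only after it has completed a full pass through its work, so that a program stuck in a loop stops kicking and the timeout expires.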

