Preventing code theft

Here at Embedded Systems Design, one of the issues we deal with is copyright protection. As part of a publishing company, we're protective of our content and how it's distributed. I'm sure a lot of you programmers have the same concerns. Protecting your code from piracy, industrial espionage, and theft can be a big problem.

Many of us don't worry about our code getting “borrowed” by another programmer, even if it's used without our knowledge. Indeed, the whole open-source community depends on that very principle: that code can and should be reused by anyone who cares to use it. And it's obviously led to some terrific products including Web browsers, protocol stacks, codecs, and entire operating systems.

Other programmers (or their employers) are far more worried about letting their code out into the wild, and for them there are a number of tricks for preventing code theft. We'll cover code protection in a future issue, but until then I'd like to hear what techniques you're using to protect your own software.

One way is to make the code physically inaccessible by using a microprocessor or microcontroller with on-chip memory. If the code never leaves the chip it's hard for outsiders to guess what's inside. Hooking up a logic analyzer or emulator won't do much good if the opcodes never cross an external bus.

A determined (and well-funded) software pirate can “decapitate” the chip and examine the stored memory bits using an electron microscope, but even that technique is usually thwarted by modern semiconductor manufacturing processes that put the ROM under several layers of metal. Some chips are deliberately manufactured to make their internal organs proof against spying.

For off-chip memory, some programmers encrypt their object code before it goes into ROM or onto disk. The encryption part is easy; it's decrypting the code on the fly that's tricky. The processor will be awfully confused fetching encrypted opcodes unless you place some decryption hardware between the off-chip memory and the CPU. But then you're back to the problem of passing “clear text” across an external bus for everyone to see, unless the decryption happens on the chip itself. Some PowerPC processors with the CodePack compression technology scramble their binaries as a nice side effect: the code sits in external memory in compressed form and is expanded only on-chip.
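To make the encrypt-then-decrypt-on-fetch idea concrete, here is a minimal host-side sketch in Python. Everything in it is an assumption for illustration: the key, the made-up opcode bytes, and the function names are hypothetical, and the address-dependent XOR is a toy stand-in for a real cipher, not something to ship.

```python
# Toy model: scramble an object image on the host, then unscramble each
# byte as it is "fetched". XOR with an address-dependent key byte is NOT
# real encryption; it only illustrates where each step would sit.

def keystream_byte(key: bytes, address: int) -> int:
    """Derive a per-address key byte (hypothetical scheme)."""
    return key[address % len(key)] ^ (address & 0xFF)

def encrypt_image(image: bytes, key: bytes) -> bytes:
    """Build step: scramble the object code before it goes into ROM."""
    return bytes(b ^ keystream_byte(key, addr) for addr, b in enumerate(image))

def fetch(rom: bytes, key: bytes, address: int) -> int:
    """Model of the decryption stage sitting between ROM and the CPU."""
    return rom[address] ^ keystream_byte(key, address)

KEY = bytes([0x5A, 0xC3, 0x19, 0x7E])          # hypothetical key
PLAIN = bytes([0x01, 0x10, 0x02, 0x20, 0xFF])  # made-up opcode bytes
ROM = encrypt_image(PLAIN, KEY)

# What the CPU would see after the decryption stage.
decoded = bytes(fetch(ROM, KEY, addr) for addr in range(len(ROM)))
```

In this model, anyone probing the ROM sees only the scrambled bytes in `ROM`; the protection holds only as long as `fetch` and the key live somewhere an outsider can't watch.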

Another twist involves processors with configurable or user-definable instruction sets. If you create your own instruction set, outsiders can monitor the instruction stream all they like but it won't make any sense. You don't even need to create the entire instruction set. Just one or two well-placed FOOBAR instructions can obfuscate algorithms and disguise program flow pretty effectively. (It doesn't even matter what your mystery instructions do; a NOP is as good as a JMP for concealing your intent.)
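A toy model shows why a single mystery instruction goes a long way. The sketch below assumes a made-up stack machine with three ordinary opcodes and one private opcode, FOOBAR, whose encoding (0xF0) and meaning (skip the next n bytes) are hypothetical. An outsider who knows only the standard opcode table loses sync with the instruction stream and reads a decoy as real code.

```python
# Hypothetical stack machine. Only the chip's designer knows FOOBAR.
STANDARD = {0x01: ("PUSH", 1), 0x02: ("ADD", 0), 0xFF: ("HALT", 0)}
FOOBAR = 0xF0  # private opcode: FOOBAR n skips the next n bytes

def run(code):
    """Execute with full knowledge of the instruction set."""
    stack, pc = [], 0
    while True:
        op = code[pc]
        if op == 0x01:                      # PUSH imm
            stack.append(code[pc + 1])
            pc += 2
        elif op == 0x02:                    # ADD
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == FOOBAR:                  # skip operand plus n more bytes
            pc += 2 + code[pc + 1]
        elif op == 0xFF:                    # HALT
            return stack[-1]
        else:
            raise ValueError(f"unknown opcode 0x{op:02X}")

def disassemble(code, table):
    """What an outsider sees with only the published opcode table."""
    listing, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op not in table:
            listing.append(f"??? 0x{op:02X}")
            pc += 1                         # operand length unknown: guess
            continue
        name, nargs = table[op]
        args = code[pc + 1 : pc + 1 + nargs]
        listing.append(" ".join([name] + [str(a) for a in args]))
        pc += 1 + nargs
    return listing

# PUSH 2; FOOBAR 2 (skips the decoy PUSH 99); PUSH 3; ADD; HALT
PROGRAM = [0x01, 2, 0xF0, 2, 0x01, 99, 0x01, 3, 0x02, 0xFF]
result = run(PROGRAM)
outsider_view = disassemble(PROGRAM, STANDARD)
```

Here the real machine skips the decoy and adds 2 and 3, while the outsider's listing misreads FOOBAR's operand as an ADD and treats the decoy PUSH 99 as live code.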

Then there's the “security through obscurity” ploy: obfuscating your code simply by making it hard to follow. (Certain programmers may do this unintentionally.) This method works better for protecting source code than object code. If your source listings are gratuitously larded with extraneous and unused code it can be hard for a competitor to separate the real program flow from the red herrings. Object code is a bit harder to bepuzzle. If it runs correctly, a determined hacker can follow its progress through the usual means.
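As a small illustration of larding source with red herrings, here is a sketch of a plain checksum next to a deliberately cluttered twin that returns the same result. Both functions and their names are hypothetical examples, not taken from any real product.

```python
def checksum(data: bytes) -> int:
    """The real algorithm: a 16-bit additive checksum."""
    total = 0
    for b in data:
        total = (total + b) & 0xFFFF
    return total

def checksum_obscured(data: bytes) -> int:
    """Same result, buried under decoys."""
    total, decoy = 0, 0x1234
    for i, b in enumerate(data):
        # Red herring: busy-looking arithmetic that never reaches the result.
        decoy = (decoy * 31 + i) & 0xFFFF
        # Opaque predicate: i * (i + 1) is always even, so this never runs.
        if (i * (i + 1)) % 2 == 1:
            total ^= decoy
        total = (total + b) & 0xFFFF
    return total

sample = b"embedded"
same = checksum(sample) == checksum_obscured(sample)
```

A reader of the second listing has to work out that the branch is dead before the real algorithm becomes visible. A compiler's optimizer may well strip the clutter out of the object code, which fits the point above that this trick protects source better than binaries.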

Whatever method you use, the intent is usually the same. Competitors rarely want to reuse your code wholesale; they just want to learn from it. It's not so much outright theft we're worried about as not giving our competitors a head start. Occasionally there's some real intellectual property at stake, but for the most part we're just trying to make things tough for the other guys.

Let us know what works for you.
