In this chapter from his book “The Software IP Detective's Handbook,” Bob Zeidman describes some of the concepts behind the new field of software forensics, and how they can be used to safeguard the unique and proprietary Intellectual Property incorporated into your design.
The word forensic comes from the Latin word forensis meaning “of or before the forum.” In ancient Rome, an accused criminal and the accusing victim would present their cases before a group in a public forum. In this very general sense it was not unlike the modern U.S. legal system where plaintiffs and defendants present their cases in a public forum. Of course, the rules and procedures of the presentation, of which there are very many, differ from those days. Also, whether in a civil trial or a criminal trial, all parties can be represented by lawyers trained in the intricacies of these rules and procedures.
At these ancient Roman forums, both parties would present their cases to the forum and one party would be declared a winner. The party with the better presentation skills, regardless of innocence or guilt, would often prevail.
The modern system relies on the fact that attorneys representing the parties make the arguments rather than the parties themselves. The entire system relies on the assumption that lawyers, trained in law and skilled at presenting complex information, will present both parties’ cases in the best possible manner and that ultimately a just outcome will occur. I don’t want to say that the truth will prevail, not only because that’s a cliché but because there is often some amount of truth in the arguments of both parties. Rather, more often than not, justice will be served.
This model works very well; not perfectly, but very well. With regard to highly technical cases, however, the percentage of cases where justice is served is lower because the issues are difficult for judges and juries to grasp. Technical experts can throw around highly technical terms, sometimes without realizing it and other times to purposely confuse a judge or jury. This is why two things are required to improve the analysis of software for the legal system:
- Create a standard method of quantizing software comparisons.
- Create a standard methodology for using this quantization to reach a conclusion that is usable in a court of law.
These two things are embodied in what is called “software forensics.” Before we arrive at a working definition, let us look at the definitions of related terms: “forensic science,” “forensic engineering,” and “digital forensics.”
The Need for Software Forensics
Some years ago, when I had just begun developing the metrics described in The Software IP Detective's Handbook, as well as the software to calculate the metrics and the methodology to reach a conclusion based on the metrics, I was contacted by a party in a software copyright dispute in Europe.
A software company had been accused of copying source code from another company. The software implemented real-time trading of financial derivatives. A group of software engineers had left one company to work for the other company; that’s the most common circumstance under which software is stolen or alleged to have been stolen.
The plaintiff hired a well-known computer science professor from the Royal Institute of Technology, Stockholm, Sweden, to compare the source code. This respected professor, who had taught computer science for many years, reviewed both sets of source code and wrote his report.
His conclusion could be boiled down to this: “I have spent 20 years in the field of computer science and have reviewed many lines of source code. In my experience, I have not seen many examples of code written in this way. Thus it is my opinion that any similarities in the code are due to the fact that code was copied from one program to another.”
Unfortunately for the plaintiff, the defendant responded by hiring another well-known computer science professor. This person was the head of the computer science department at the very same Royal Institute of Technology, the first professor’s boss.
This professor compared the source code from the two parties, and essentially her conclusion was this: “I have spent 20 years in the field of computer science and have reviewed many lines of source code. In my experience, I have seen many examples of code written in this way. Thus it is my opinion that any similarities in the code are due to the fact that these are simply common ways of writing code.”
The defendant did some research and came across my papers and my CodeSuite software. The defendant hired me, and I ran a CodeMatch comparison and then followed my standard procedure. CodeMatch revealed a fairly high correlation between the source code of the two programs.
However, there were no common comments or strings, there were no common instruction sequences, and when I filtered out common statements and identifier names I was left with only a single identifier name that correlated. Because the identifier name combined standard terms in the industry, and both programs were written by the same programmers, I concluded that no copying had actually occurred.
After writing my expert report, what struck me was how much a truly standardized, quantified, scientific method was needed in this area of software forensics, and I made it my goal to bring as much credibility to this field as there is in the field of DNA analysis, another very complex process that is well defined and accepted in modern courts.
According to the Merriam-Webster Online Dictionary, science is defined as “knowledge or a system of knowledge covering general truths or the operation of general laws especially as obtained and tested through scientific method.” Forensic science is the application of scientific methods for the purpose of drawing conclusions in court (criminal or civil). The first written account of using this kind of study and analysis to solve criminal cases is given in the book entitled Collected Cases of Injustice Rectified, written by Song Ci during the Song Dynasty of China in 1248. In one case, when a person was found murdered in a small town, Song Ci examined the wound of the corpse. By testing different kinds of knives on animal carcasses and comparing the wounds to that of the murder victim, he found that the wound appeared to have been caused by a sickle. Song Ci had everyone in town bring their sickles to the town center for examination. One of the sickles began attracting flies because of the blood on it, and the sickle’s owner confessed to the murder.
This groundbreaking book discussed other forensic science techniques, including the fact that water in the lungs is a sign of drowning and broken cartilage in the neck is a sign of strangulation. Song Ci discussed how to examine corpses to determine whether death was caused by murder, suicide, or simply an accident.
In modern times, the best-known methods of forensic science include fingerprint analysis and DNA analysis. Many other scientific techniques are used to investigate murder cases (to determine time of death, method of death, instrument of death) as well as other less criminal acts. Some other uses of forensic science include determining forgery of contracts and other documents, exonerating convicted criminals through ex post facto examination of evidence that was not considered at trial, and determining the origins of paintings or authorship of contested documents.
Forensic engineering is the investigation of things to determine their cause of failure for presentation in a court of law. Forensic engineering is often used in product liability cases when a product has failed, causing injury to a person or a group of people. A forensic engineering investigation often involves examination and testing of the actual product that failed or another copy of that product.
The examination involves applying various stresses to the product and taking detailed measurements to determine its failure point and mode of failure. For example, a plate of glass at a very high temperature, when hit by a small stone, might chip, shatter, or crack in half. This kind of examination would be useful for understanding how a car or airplane windshield failed. The investigation might start out to replicate the situation that led to the failure in order to understand what factors might have combined to cause it.
Forensic engineering also encompasses reverse engineering, the process of understanding details about how a device works. Thus forensic engineering is critical for patent cases and many trade secret cases.
Two of the most famous cases of forensic engineering involved the Challenger and Columbia space shuttle disasters. On January 28, 1986, the space shuttle Challenger exploded on takeoff, killing its crew. President Ronald Reagan formed the Rogers Commission to investigate the tragedy. A six-month investigation concluded that the O-rings (rubber rings that are used to seal pipes and are used in everyday appliances like household water faucets) had failed.
The O-rings were designed to create a seal in the shuttle’s solid rocket boosters to prevent superheated gas from escaping and damaging the shuttle. Theoretical physicist Richard Feynman famously demonstrated on television how O-rings lose their flexibility in cold temperatures by placing rubber O-rings in a glass of cold water and then stretching them, thus simplifying a complex concept for the public. Further investigation revealed that engineers at Morton Thiokol, Inc., where the O-ring was developed and manufactured, knew of the design flaw and had informed NASA that the low temperature on the day of the launch created a serious danger. They recommended that the launch be postponed, but NASA administrators pressured them into withdrawing their objection.
On February 1, 2003, the space shuttle Columbia disintegrated over Texas during reentry into the Earth’s atmosphere. All seven crew members died. Debris from the accident was scattered over sparsely populated regions from southeast of Dallas, Texas, to western Louisiana and southwestern Arkansas. NASA conducted the largest ground search ever organized to collect the debris, including human remains, for its investigation. The Columbia Accident Investigation Board, or CAIB, consisting of military and civilian experts in various technologies, was formed to conduct the forensic examination.
Figure 9.1 Challenger space shuttle: the crew and physicist Richard Feynman demonstrating the breakdown of the O-ring that was determined to be the cause
Amazingly enough, Columbia’s flight data recorder was recovered in the search. Columbia had a special flight data OEX (Orbiter Experiments) recorder, designed to record and measure vehicle performance during flight. It recorded hundreds of different parameters and contained extensive logs of structural and other data that allowed the CAIB to reconstruct many of the events during the last moments of the flight. The investigators could track the sequence in which the sensors failed, based on the loss of signals from the sensors, to learn how the damage progressed.
Six months of investigation led to the conclusion that a piece of foam that covered the fuel tank broke off during launch and put a hole in the leading edge of the left wing.
“Digital forensics” is the term for the collection and study of digital data for the purpose of presenting evidence in court. Most typically, digital forensics is used to recover data from storage media such as computer hard drives, flash drives, CDs, DVDs, cameras, cell phones, or any other device that stores information in a digital format, for the purpose of determining important characteristics of that data that are useful in solving a crime or resolving a civil dispute.
These characteristics might include the type of data (e.g., pictures, emails, or letters) or the owner of the data, or the date of creation or modification of the data. Digital forensics does not involve examining the content of the data, because that requires skills that are not necessarily computer science. For example, a digital forensic examiner may be able to recover a deleted email from an investment banker about a publicly traded company. However, it would take someone familiar with banking and banking regulations to determine whether the content of the email constituted illegal insider trading.
Digital forensics often involves examining metadata, which is the information about the data rather than the content of the data. For example, while the content of an email may give facts about insider trading by an investment banker and thus be useful evidence for criminal proceedings against that banker, the metadata might show the date that the email was created. If the banker was on vacation that day, this digital forensic information might be evidence that the banker was being framed by a colleague. Proving or disproving such an issue is a key component of the investigative part of digital forensics.
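To make the distinction between metadata and content concrete, here is a minimal Python sketch that reads a file's size and last-modified time without ever opening its contents. The function name describe_metadata, the sample file, and the chosen timestamp are all hypothetical illustrations, not part of any forensic tool; a real examination would work from a preserved disk image rather than a live file.

```python
import os
import tempfile
from datetime import datetime, timezone

def describe_metadata(path: str) -> dict:
    """Return size and last-modified time without reading the file's content."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }

# Hypothetical stand-in for a recovered email file.
with tempfile.NamedTemporaryFile("w", suffix=".eml", delete=False) as handle:
    handle.write("Subject: quarterly numbers\n")
    sample_path = handle.name

# Give the file a known modification time (1 January 2011, UTC) for illustration.
known_time = datetime(2011, 1, 1, tzinfo=timezone.utc).timestamp()
os.utime(sample_path, (known_time, known_time))

metadata = describe_metadata(sample_path)
os.remove(sample_path)
```

The dictionary in metadata says when the file was last changed and how large it is, but nothing about what the email says; interpreting the content is a separate skill, as described above.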
Digital forensic examiners often inspect large and small computer systems to look for signs of illicit access or “break-ins.” This can involve examining network activity logs that are stored on the computers. It may involve searching for suspicious files that meet certain well-known profiles and that are used to attack a system, or it may involve looking at files created at the time of a known break-in. It may also involve actively monitoring packets traveling around a network.
Techniques employed by digital forensic examiners include methods for recovering deleted and partially deleted files on a computer hard disk. They also include comparing files and sections of files to find sections that are bit-by-bit identical.
Other techniques include recovering and examining metadata that gives important information about the creation of a file and its various properties. Automatically searching the contents of files and manually examining the contents of files are also important techniques in digital forensics.
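The bit-by-bit comparison of file sections mentioned above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names and the fixed block size are hypothetical, and the blocks are compared only at aligned offsets, so a section shifted by even one byte would be missed. Production tools use more robust schemes.

```python
from typing import List

def split_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut data into consecutive fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def identical_block_offsets(data_a: bytes, data_b: bytes, block_size: int = 512) -> List[int]:
    """Offsets in data_a whose block also occurs, byte for byte, in data_b."""
    blocks_b = set(split_blocks(data_b, block_size))
    return [
        index * block_size
        for index, block in enumerate(split_blocks(data_a, block_size))
        if block in blocks_b
    ]

# Two tiny hypothetical "files" that share exactly one 4-byte section.
file_a = b"AAAA" + b"BBBB" + b"CCCC"
file_b = b"XXXX" + b"BBBB" + b"YYYY"
shared_offsets = identical_block_offsets(file_a, file_b, block_size=4)
```

Here shared_offsets identifies the one block of file_a that also appears in file_b. Note that the comparison is purely mechanical: it requires no understanding of what the bytes mean.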
Digital forensic examiners must be very careful about how data is extracted from a computer so that the data is not corrupted while the extraction is taking place. Operating systems typically maintain important metadata about files, and any modification of a file, such as moving or copying it for the purpose of examining it, will change the metadata.
For this reason, special techniques and special hardware have been developed to preserve the contents of computer disks prior to a forensic examination. This can be particularly tricky when the system being examined is used in an active business, such as an online retailer, or in a critical system, such as one that controls a medical device and must operate 24/7. In these cases, special techniques, special hardware, and special software have been developed to extract data from such a live system.
Evidence procedures, such as how an item or information is acquired, documented, and stored, are very important. An examiner should be able to show what procedures were used, or not used, to collect the evidence and how the evidence was stored and protected from other parties.
Digital forensic examiners must also be very concerned about documenting the chain of custody, which is the trail of people who handled the evidence and the places where it has been stored. In order to reduce the chance of evidence tampering, and to relieve any doubts in the mind of a judge or jury, the chain of custody must be well documented in a manner that can be verified.
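One way to picture a custody record that "can be verified" is a log in which each entry includes a hash of the entry before it, so that a later alteration of any entry becomes detectable. The Python sketch below is an assumption-laden illustration only: the record fields, function names, handlers, and locations are hypothetical, and real chain-of-custody documentation is governed by legal and procedural requirements that no script replaces.

```python
import hashlib
import json

def hash_entry(entry: dict) -> str:
    """SHA-256 over every field of the entry except its own hash."""
    body = {key: value for key, value in entry.items() if key != "entry_hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

def add_custody_entry(log: list, handler: str, location: str, evidence_sha256: str) -> None:
    """Append an entry that is linked to the previous entry's hash."""
    entry = {
        "handler": handler,
        "location": location,
        "evidence_sha256": evidence_sha256,
        "previous_hash": log[-1]["entry_hash"] if log else "",
    }
    entry["entry_hash"] = hash_entry(entry)
    log.append(entry)

def verify_custody_log(log: list) -> bool:
    """Return True only if every link and every entry hash is intact."""
    previous_hash = ""
    for entry in log:
        if entry["previous_hash"] != previous_hash or entry["entry_hash"] != hash_entry(entry):
            return False
        previous_hash = entry["entry_hash"]
    return True

# Hypothetical evidence image and two hypothetical handlers.
evidence_digest = hashlib.sha256(b"disk image bytes").hexdigest()
custody_log: list = []
add_custody_entry(custody_log, "Examiner A", "Evidence locker 3", evidence_digest)
add_custody_entry(custody_log, "Examiner B", "Forensics lab", evidence_digest)
log_is_intact = verify_custody_log(custody_log)
```

If any earlier entry is edited after the fact, its stored hash no longer matches and verify_custody_log returns False. This only detects changes to the log itself; it says nothing about whether the entries were truthful when written.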
Software forensics is the examination of software for producing results in court; it should not be confused with digital forensics. There are times when digital forensic techniques are used to recover software from a computer system or computer storage media so that a software forensic examination can be performed, but the analysis process and the methodology for finding evidence are much different. Unlike digital forensics, software forensics is involved with the content of the software files, whether those files are binary object code files or readable text source code files.
The objective of software forensics is to find evidence for a legal proceeding by examining the literal expression and the functionality of software. Software forensics requires a knowledge of the software, often including things such as the programming language in which it is written, its functionality, the system on which it is intended to run, the devices that the software controls, and the processor that is executing the code.
Whereas a digital forensic examiner attempts to locate files or sections of files that are identical, for the purpose of identifying them, a software forensic examiner must look at code that has similar functionality even though the exact representation might be different. In patent and trade secret cases, functionality is key, and two programs that implement a patent or trade secret may have been written entirely independently and look very different.
In copyright and trade secret cases, software source code may have been copied but, because of the normal development process or through attempts to hide the copying, may end up looking very different. Digital forensic processes will not find functionally similar programs; software forensic processes will. Digital forensic processes will not find code that has been significantly modified; software forensic processes will.
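As a very rough illustration of why content-aware comparison finds what identity checks miss, the Python sketch below strips trailing comments and whitespace before comparing statements. It is emphatically not the CodeMatch algorithm or any part of CodeSuite; the function names are hypothetical, the normalization is deliberately crude (it would mishandle a "//" inside a string literal), and it detects only superficial edits, not the functional similarity that real software forensic analysis must address.

```python
import re

def normalize(line: str) -> str:
    """Drop a trailing // comment, then remove all whitespace."""
    code = line.split("//", 1)[0]
    return re.sub(r"\s+", "", code)

def shared_statement_ratio(source_a: str, source_b: str) -> float:
    """Fraction of distinct normalized statements in source_a also found in source_b."""
    statements_a = {normalize(line) for line in source_a.splitlines()} - {""}
    statements_b = {normalize(line) for line in source_b.splitlines()} - {""}
    if not statements_a:
        return 0.0
    return len(statements_a & statements_b) / len(statements_a)

# Two hypothetical snippets: the second is reformatted and has its comment removed.
program_a = "int total = a + b;  // sum\nreturn total;\n"
program_b = "int total=a+b;\n\nreturn   total;\n"

files_are_identical = program_a == program_b
ratio = shared_statement_ratio(program_a, program_b)
```

The two snippets are not byte-for-byte identical, so an identity check reports no match, yet every statement in the first also appears in the second once formatting and comments are ignored.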
Thoughts on Requirements for Testifying
In recent years I have been frequently disturbed by the poor job done by some experts on the opposing side of cases I have worked on. Sometimes the experts do not seem to have spent enough time on the analysis, most certainly because of some cost constraints of their client. Other times the experts do not actually have the qualifications to perform the analysis.
For example, I have been across from experts who use hashing to “determine” that a file was not copied because the files have different hashes. As anyone familiar with hashes knows, changing even a single space inside a source code file will result in a completely different hash. While hashing is a great way to find exact copies, it cannot be used to make any statement about copyright infringement.
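The point about hashes is easy to verify. The short Python sketch below hashes two versions of a one-line source file that differ only by a single added space; SHA-256 is used here as a representative hash function, and the sample line and the name file_digest are hypothetical.

```python
import hashlib

def file_digest(text: str) -> str:
    """Return the SHA-256 hex digest of the text encoded as UTF-8."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

original = "int total = a + b;\n"
one_extra_space = "int total = a +  b;\n"  # identical except for one added space

digest_original = file_digest(original)
digest_modified = file_digest(one_extra_space)
digests_match = digest_original == digest_modified
```

The two digests do not match, even though any programmer would regard the second line as a trivially edited copy of the first. Matching hashes show that two files are exact copies; differing hashes show only that the files are not exact copies.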
Most disturbing is when an expert makes a statement that is unquestionably false and the only reason it could be made is that the expert is knowingly lying to support the client. In one case an expert justified scrubbing all data from all company disks (overwriting the data so that it could not be retrieved), the weekend after a subpoena was received to turn over all computer hard drives, as a normal, regular procedure at the company.
Another time an experienced programmer, the author of several programming textbooks, claimed that she could determine that trade secrets were implemented in certain source code files simply by looking at the file paths and file names. Yet another time a very experienced expert, after hours at deposition trying to explain a concept that was simply and obviously wrong, finally admitted that the lawyers had written his expert report for him.
Although I was often successful, working with the attorneys for my client, in discrediting the results of the opposing expert, there were times when the judge simply did not understand the issues well enough to differentiate the other expert’s opinions from mine.
Is there a way to ensure that experts actually know the areas about which they opine and a way to encourage them to give honest testimony and strongly discourage them from giving false testimony?
Following are a few ideas about this, though each one carries with it potential problems. Perhaps not all of these ideas can definitely be implemented, but if some or all of them were adopted in the current legal system, we might have just results a higher percentage of the time. And applying these ideas to criminal cases might be a good idea, where an expert’s opinion can be the difference between life and death for a person accused of a crime.
Certain states require that experts be certified in a field of engineering before being allowed to testify about that field in court. My understanding is that few states require certification, and it is rare in those states that an expert is actually disqualified from testifying because of lack of certification.
Perhaps if certification were required, there would be fewer “experts” who are simply looking for ways to do extra work on the side. Similarly, it might be more difficult for attorneys to find “experts” who support their case only because they are not sophisticated enough to understand the technical issues in depth.
One important question would be who runs the certification program. There would certainly be some competition and fighting among organizations to implement the certification. Organizations definitely exist, such as the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE), that could set certification standards for computer scientists and electrical engineers, respectively.
Other engineering groups could set standards for their own engineers. Perhaps the American Bar Association (ABA) or the American Intellectual Property Law Association (AIPLA) as well as state and federal government offices could also be involved.
A very important consideration would be under what circumstances certification could be revoked. There would have to be a hierarchy of actions and ramifications ranging from fines to revocation. In reality, many penalties short of revocation would almost certainly result in the end of an expert’s career. Few attorneys would want to put an expert on the stand who had a record of having been found to be unqualified or dishonest. Also, would any behaviors lead to criminal charges against the expert? Perhaps unethical behavior in a criminal trial should carry stronger punishment, including criminal charges, than similar behavior in a civil trial.
There should be a no-tolerance policy for dishonest, unethical, or illegal behavior by an expert. At a recent conference on digital forensics, a professor gave an example of a student who cheated on a test.
The professor discovered the cheating and confronted the student. The student was sufficiently remorseful, according to the teacher (in my experience most criminals are remorseful once they are caught), and so the professor gave the student a second chance. This was simply a wrong decision.
Remember that digital forensics is the study of sophisticated ways to hack into systems, so this professor could very well be training a criminal. Unfortunately, only about half of the faculty members at the conference agreed with me, and not all of the colleges had official policies regarding cheating. For sure, all forensics education programs must have zero-tolerance policies, in writing, and any certification program must, too.
One issue that is sure to arise is what to do if no certified expert in a particular field is available to work on the case. Perhaps the technology is very new or specialized. Or perhaps all of the certified experts are conflicted out or simply have no time. It seems that a judge could create an exception, allowing someone with experience in the field to testify in cases where certified experts are not available.
Many experts themselves resist certification requirements because they are already earning a living that they would not want to interrupt in order to study for and take a test that they feel is unnecessary. I also used to think the certification was unnecessary, but having seen the shoddy or unethical work of some experts, I am changing my mind. The government requires that a lawyer pass a bar exam before practicing law, yet experts require no similar test despite their importance to the legal process.
Another way of dealing with this problem is to require neutral experts who are contracted either by the court or jointly by the parties in the case and whose costs are shared by both parties. Currently, there are typically two situations when neutral experts are used. One situation is when the judge decides that the issues involved are too complex for the judge or the jury to understand without an expert in the field to explain them, and a neutral expert can cut through any biases that the experts hired by the parties may have.
Another situation is when the parties agree on an expert and jointly cover the expert’s fees. Hiring only one expert saves time and money in coming to a resolution, and it gives each party a limited ability to persuade the expert. Perhaps neutral experts should be required for every case. The parties could split the cost, or the loser could be required to pay. This seems to be a good solution, particularly if the neutral expert has been certified in her area of expertise. One drawback of having a neutral expert that should be considered carefully is that a biased expert, or one whose skills are less than ideal, could draw an incorrect conclusion, and there would be little ability for a party to challenge it on technical grounds.
Of course, having a neutral expert does not preclude the possibility that each party could additionally employ its own expert, though this might further obscure the issues rather than clarify them, given that there could potentially be three different opinions.
Testing of Tools and Techniques
It also seems that tools and techniques used by experts should be tested and certified by an official body. There have been instances of experts using the wrong tools, either accidentally because they did not really understand what the tool did, or possibly on purpose to confuse the issues before the judge or jury. It would be good to require that tools be tested, that their results be rigorously verified, and that experts be certified in the use of the tools before testimony can be introduced in court that relies on the results of the tools.
Robert Zeidman, a regular contributor to ESD and presenter at the Embedded Systems Conference, is president and founder of Zeidman Consulting, a contract research firm in Silicon Valley which focuses on engineering consulting to law firms handling intellectual property dispute cases. He is also president and founder of Software Analysis and Forensic Engineering Corp. (SAFE), a provider of IP analysis tools.
This article is reproduced from Robert Zeidman's book: “The Software IP Detective's Handbook,” Copyright, 2011, and used with the permission of Pearson Education, Inc. Written permission from Pearson Education Inc. is required for all other uses.