Legacy code and the Internet of Things
With the advent of the Internet of Things (IoT), small memory footprints have returned, and the need to deal with the requirements of a resource-constrained environment is back with a vengeance. Unfortunately, the understanding that “memory is important” is part of a skill set that has largely been lost because, as far as the typical Java developer is concerned, memory is infinite.
Memory management will be key in converting the massive legacy code base, a conversion made necessary by the transition from the now-depleted IPv4 address space to IPv6, whose 128-bit addresses provide a vastly larger supply of new IP addresses.
Whether we are talking about a web server, a data collection application, or anything else that is network-aware, there are probably billions of lines of existing code that will be affected.
The language that the code is written in -- whether Java, Python, Perl, or C -- is relatively immaterial because the core data structures will have to change. This means that all code that manipulates those data structures has to change. That is not a trivial exercise. It isn’t something that you can fix by saying, “Just run this macro against this code and it will magically convert everything from IPv4 to IPv6.”
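To make the data structure change concrete, here is a minimal sketch in C, assuming a POSIX sockets environment. The helper name parse_address is hypothetical and not taken from any particular code base. It shows that an IPv4 address occupies 4 bytes while an IPv6 address occupies 16, so any legacy code that keeps an address in a 32-bit integer or a struct sockaddr_in cannot hold the new form. One common approach is to store addresses in the family-neutral struct sockaddr_storage instead.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parse a textual address of either family into a
 * family-neutral sockaddr_storage. Returns the size of the raw address in
 * bytes (4 for IPv4, 16 for IPv6), or 0 if the text is not a valid address. */
size_t parse_address(const char *text, struct sockaddr_storage *out)
{
    struct in_addr a4;
    struct in6_addr a6;

    memset(out, 0, sizeof *out);

    /* Try the old 32-bit form first. */
    if (inet_pton(AF_INET, text, &a4) == 1) {
        struct sockaddr_in *v4 = (struct sockaddr_in *)out;
        v4->sin_family = AF_INET;
        v4->sin_addr = a4;
        return sizeof a4;
    }

    /* Fall back to the 128-bit form. */
    if (inet_pton(AF_INET6, text, &a6) == 1) {
        struct sockaddr_in6 *v6 = (struct sockaddr_in6 *)out;
        v6->sin6_family = AF_INET6;
        v6->sin6_addr = a6;
        return sizeof a6;
    }

    return 0;
}
```

Every function that receives, stores, compares, or prints an address then has to be checked against the wider type, which is why a search-and-replace macro is not enough.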
The issue is that with IPv6, unlike IPv4, there is a function call for address lookup and resolution that dynamically allocates memory in order to return a linked list of all the potential addresses. Since most people have never seen the new APIs, they don’t really understand the nature of the function call and forget they need to free the memory that was allocated. This creates a memory leak.
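The article does not name the function, but in the POSIX sockets API the call that fits this description is getaddrinfo(), paired with freeaddrinfo(). The sketch below assumes that is the call in question. The helper name count_addresses is hypothetical. It walks the linked list that getaddrinfo() allocates and then releases it, and the single freeaddrinfo() line is the one that is easy to forget.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve a host and count the addresses returned.
 * Returns the number of addresses, or -1 if resolution fails. */
int count_addresses(const char *host, const char *service)
{
    struct addrinfo hints;
    struct addrinfo *results = NULL;
    int count = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    /* getaddrinfo() allocates the result list on the caller's behalf. */
    int rc = getaddrinfo(host, service, &hints, &results);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }

    /* Walk the linked list of candidate addresses. */
    for (const struct addrinfo *p = results; p != NULL; p = p->ai_next)
        count++;

    /* The caller owns the list. Omit this call and every lookup leaks. */
    freeaddrinfo(results);
    return count;
}
```

A call such as count_addresses("example.com", "80") would typically return one entry per address the resolver finds. Older code built around gethostbyname() never had a matching free step, which helps explain why the cleanup is overlooked during conversion.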
In some cases, it’s not a big deal to leak 20 or 30 bytes. But any memory leakage at all is a horror show for safety-critical systems and real-time operating systems that have no dynamic memory allocation at all. If I were to tell an expert on safety-critical systems that I was interested in doing dynamic memory allocation or reallocation in a safety-critical system, his or her head would explode.
It’s also problematic for any operating system that has a fixed pool of memory available.
When you have a user application that is allocating memory and not freeing it, that application continues to run until all the free memory is used up. The next time the OS needs to do something -- guess what? It will fail.
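The effect on a fixed pool can be sketched with a toy allocator in C. Everything here is an assumption made for illustration: the pool size, the block size, and the function names are hypothetical and do not model any particular operating system. The point is only that a caller who never frees its blocks exhausts the pool after a fixed number of calls, while a caller who frees can run indefinitely.

```c
#include <stddef.h>

#define POOL_BLOCKS 8    /* assumed pool size, chosen for illustration */
#define BLOCK_SIZE  32   /* assumed block size in bytes */

static unsigned char pool_mem[POOL_BLOCKS][BLOCK_SIZE];
static unsigned char pool_used[POOL_BLOCKS];

/* Mark every block in the pool as free. */
void pool_reset(void)
{
    for (size_t i = 0; i < POOL_BLOCKS; i++)
        pool_used[i] = 0;
}

/* Hand out the first free block, or NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        if (!pool_used[i]) {
            pool_used[i] = 1;
            return pool_mem[i];
        }
    }
    return NULL;
}

/* Return a block to the pool. Pointers not from the pool are ignored. */
void pool_free(void *block)
{
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        if (block == (void *)pool_mem[i])
            pool_used[i] = 0;
    }
}

/* Simulate a routine that allocates one block per call. When leak is
 * nonzero the block is never freed. Returns how many calls succeeded
 * before the pool ran dry, capped at max_calls. */
int calls_until_failure(int leak, int max_calls)
{
    for (int call = 0; call < max_calls; call++) {
        void *result = pool_alloc();
        if (result == NULL)
            return call;          /* pool exhausted */
        if (!leak)
            pool_free(result);
    }
    return max_calls;
}
```

With these assumed sizes, the leaking variant fails on its ninth call, and the next request from anyone else sharing the pool fails as well.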
Compounding the problem is the fact that much of this legacy code could be decades old. The person who wrote that code is long gone, and any documentation is probably way out of date -- including the comments in the code itself. So now you have what I call a software archaeology project, in which you have to go in very diligently with toothbrushes and air puffers and try to uncover what the code actually does, as opposed to what it was originally intended to do.
You have to tread carefully, because you don’t know what sorts of bugs are in that code. If you start making significant changes, the chance of your introducing new errors is off the charts.
You could simply say, “Well, screw it. I am going to write everything from scratch.” Very few choose this path, however, since it is extremely expensive to rewrite two million lines of code. It also means you will have your own new coding errors, in code that has no mileage on it.
Given how the new functions work, errors are almost a sure thing. The question is how long it will take to find and fix them. Moreover, the errors themselves pale in comparison to the potential security vulnerabilities that you may have just introduced.
The tricky part in all of this is that you will only need to change pieces of the code, and first you have to find them. Even if we consider only the networking code, billions of lines will have to be re-examined -- and the skill set required to do this is going to be found among people who speak C and C++.