In my last post, “A Peek Inside Amazon FreeRTOS”, I started to analyze the high-level behavior that the Amazon FreeRTOS demonstration application exhibits while running on an STM32L475 IoT Discovery Node. That analysis examined task-level behavior and CPU utilization, which helps a new developer start to understand the application's run-time performance. In this post, we'll dig a little deeper and examine how data flows through the application and how memory is being used.
Analyzing a black-box software stack to understand what it is doing is critical for a developer who wants to ensure their application is robust and has a minimum number of defects. The Amazon FreeRTOS demonstration code comes out of the box with a code space usage of 418 kB! That's a lot of unknown code that is just asking for conflicts and bugs to be introduced if we don't fully understand it before making our own modifications.
One view that can be very useful to developers in understanding how data and communication flow through an application is the communication view in Percepio Tracealyzer. This view shows the communication actors, such as tasks, message buffers and queues, and how information flows between them. Using the communication view, Amazon FreeRTOS exhibits the behavior shown in Figure 1.
Figure 1 – The flow of communication and data throughout the Amazon FreeRTOS demonstration application. (Source: Beningo Embedded Group)
In this single picture, we can easily see how data flows through the application and what we should be looking for in the source code. For example, we can see that there is:
- A single mutex shared between the MQTT and TmrSvc tasks
- A pair of queues used in the application
- A single message buffer used to communicate between the MQTT and Echoing tasks
- A queue through which the TmrSvc, Echoing and MQTTEcho tasks send information to the Logging task
- A queue used to send information from the Echoing task to the MQTT task
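Based on the flow in Figure 1, the kernel objects behind this diagram could be created with a sketch like the following. This is not Amazon's actual code; the object names, queue depths and item sizes are my own assumptions, and it won't build outside a FreeRTOS project:

```c
#include "FreeRTOS.h"
#include "semphr.h"
#include "queue.h"
#include "message_buffer.h"

SemaphoreHandle_t xSharedMutex;        /* shared by the MQTT and TmrSvc tasks */
QueueHandle_t xLoggingQueue;           /* TmrSvc/Echoing/MQTTEcho -> Logging   */
QueueHandle_t xMQTTQueue;              /* Echoing -> MQTT                      */
MessageBufferHandle_t xEchoMsgBuffer;  /* MQTT -> Echoing                      */

void vCreateIpcObjects(void)
{
    xSharedMutex   = xSemaphoreCreateMutex();
    xLoggingQueue  = xQueueCreate(16, sizeof(char *));  /* depth and item size assumed */
    xMQTTQueue     = xQueueCreate(8,  sizeof(char *));  /* assumed */
    xEchoMsgBuffer = xMessageBufferCreate(256);         /* buffer size assumed */
}
```

Each of these calls is a standard FreeRTOS API; reconstructing even a rough version like this is a good way to check your reading of the communication view against the source.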
Since Amazon FreeRTOS doesn’t include any of this information in its documentation, a developer would normally have to recreate this diagram by hand. That would typically require an in-depth code review, which can be time-consuming. Without understanding how the application behaves, a developer adding their own code is far more likely to introduce defects that cause performance problems, race conditions and other issues.
Seeing the communication flow is one thing, but a developer may want to understand more about what a particular task is doing. For example, what exactly is the Echoing task doing? To better understand it, I clicked on the last appearance of the Echoing task in the Trace View and reviewed the events that it generated, which can be seen in Figure 2.
Figure 2 – The Echoing task generates a series of events when it executes which can be seen in this image. (Source: Beningo Embedded Group)
In the event data, a developer should be able to see events that match the communication flow: waiting on the message buffer, receiving a message, sending data to the MQTT queue and the logging queue, and then waiting on the message buffer again. We can indeed see this progression in the event data, but closer scrutiny reveals that malloc events are occurring at run-time. If the application has hard real-time requirements, we may want to flag this, since malloc is often non-deterministic and can lead to other problems such as memory fragmentation. Notice also that we only see memory being dynamically allocated and never released! So either we have a memory leak, or this memory is freed elsewhere in the application.
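Pieced together from the trace events, the Echoing task's loop plausibly looks something like the following sketch. The function and variable names are my guesses, not Amazon's actual source, and the queue item layouts are assumed:

```c
#include "FreeRTOS.h"
#include "task.h"
#include "queue.h"
#include "message_buffer.h"
#include <string.h>

extern MessageBufferHandle_t xEchoMsgBuffer;  /* MQTT -> Echoing */
extern QueueHandle_t xMQTTQueue;              /* Echoing -> MQTT */
extern QueueHandle_t xLoggingQueue;           /* -> Logging task */

static void prvEchoingTask(void *pvParameters)
{
    uint8_t ucRxData[128];
    for (;;)
    {
        /* 1. Block until the MQTT task pushes a payload into the message buffer. */
        size_t xLen = xMessageBufferReceive(xEchoMsgBuffer, ucRxData,
                                            sizeof(ucRxData), portMAX_DELAY);
        if (xLen == 0) { continue; }

        /* 2. Queue the echo payload back toward the MQTT task. */
        (void)xQueueSend(xMQTTQueue, ucRxData, portMAX_DELAY);

        /* 3. Allocate a log record on the heap -- the malloc event seen in the
         *    trace. Ownership of this pointer passes to the Logging task. */
        char *pcLogMsg = pvPortMalloc(xLen + 1);
        if (pcLogMsg != NULL)
        {
            memcpy(pcLogMsg, ucRxData, xLen);
            pcLogMsg[xLen] = '\0';
            (void)xQueueSend(xLoggingQueue, &pcLogMsg, portMAX_DELAY);
        }
    }
}
```

Note that only a pointer crosses the logging queue, which is exactly why the matching free can happen in a different task.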
Tracking down the free calls could require us to dig into the code and match each malloc with its corresponding free. Instead, I noted that this task had been executing for around 2 minutes and 7 seconds and filtered the event log for instances of free. Clicking on an event, a developer can see that the memory allocated in Echoing is used to store information that is consumed by the Logging task. Once the Logging task is done with the data, it frees the memory. This behavior can be seen in Figure 3.
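The consumer side of that hand-off might look like the following sketch. Again, the names are assumptions rather than Amazon's actual source; configPRINTF is the logging macro used by the Amazon FreeRTOS demos:

```c
#include "FreeRTOS.h"
#include "task.h"
#include "queue.h"

extern QueueHandle_t xLoggingQueue;  /* carries char* pointers from producers */

static void prvLoggingTask(void *pvParameters)
{
    char *pcMsg;
    for (;;)
    {
        if (xQueueReceive(xLoggingQueue, &pcMsg, portMAX_DELAY) == pdTRUE)
        {
            configPRINTF(("%s", pcMsg));  /* emit the log record */
            vPortFree(pcMsg);  /* the free event seen in the trace, matching
                                  the pvPortMalloc in the producer task */
        }
    }
}
```

This producer-allocates/consumer-frees pattern is why the trace shows only mallocs in Echoing: the lifetime of each buffer ends in a different task.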
Figure 3 – A filter is used in the event log to find all occurrences of free. (Source: Beningo Embedded Group)
Scrolling through the event log, I estimated that there must be several hundred calls to free throughout the approximately 3-minute execution time. Just out of curiosity, I exported the log and discovered that free is called 57,769 times! That's roughly 321 calls to free per second! This raises concerns that the heap could become fragmented, along with other performance concerns. For a production system, a developer would want to statically allocate a buffer or use a memory pool to avoid allocating and freeing memory so often, which is undoubtedly eating away at available CPU cycles and affecting the application's determinism. This will require that a developer dig much deeper into the FreeRTOS code and identify the software stacks driving this behavior.
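As a sketch of that alternative, here is a minimal fixed-block memory pool in plain C, independent of FreeRTOS (all names and sizes are illustrative). Allocation and release become constant-time free-list operations, so there is no fragmentation and no variable-length heap walk:

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE  128u  /* usable bytes per block (illustrative) */
#define POOL_NUM_BLOCKS  8u    /* blocks in the pool (illustrative) */

/* Each free block doubles as a free-list node; the union keeps alignment. */
typedef union block {
    union block *next;
    uint8_t data[POOL_BLOCK_SIZE];
} block_t;

static block_t pool_storage[POOL_NUM_BLOCKS];
static block_t *free_list;

/* Thread the entire storage array onto the free list. */
void pool_init(void)
{
    free_list = NULL;
    for (size_t i = 0; i < POOL_NUM_BLOCKS; i++) {
        pool_storage[i].next = free_list;
        free_list = &pool_storage[i];
    }
}

/* Pop a block in O(1); returns NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    block_t *b = free_list;
    if (b != NULL) {
        free_list = b->next;
    }
    return b;
}

/* Push a block back in O(1). */
void pool_free(void *p)
{
    block_t *b = (block_t *)p;
    b->next = free_list;
    free_list = b;
}
```

In a real port, the pool calls would also need to be made safe against concurrent access (for example by briefly suspending the scheduler or using a mutex); FreeRTOS's heap_4 or statically created queues are the stock alternatives.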
As we’ve seen throughout this article, trace technology can be extremely useful for analyzing a code base that has no documentation, or that we want to better understand from an execution standpoint. We’ve also started to better understand how Amazon FreeRTOS works, along with some behaviors that a developer working on a production-intent system may want to investigate further.
Jacob Beningo is an embedded software consultant, advisor and educator who currently works with clients in more than a dozen countries to dramatically transform their software, systems and processes. Feel free to contact him at email@example.com, at his website www.beningo.com, and sign up for his monthly Embedded Bytes Newsletter here.