TinyML-based endpoint devices face unique security threats

As TinyML adoption continues to grow, it’s important to be aware of various attacks that can negatively impact your TinyML development.

With endpoint AI (or TinyML) in its infancy and slowly being adopted by industry, more companies are incorporating AI into their systems, whether for predictive maintenance in factories or keyword spotting in consumer electronics. But adding an AI component to your IoT system means new security measures must be considered.

IoT has matured to the point where you can release products into the field with peace of mind: certifications provide assurance that your IP can be secured through a variety of techniques, such as isolated security engines, secure cryptographic key storage, and Arm TrustZone. Such assurances can be found on microcontrollers (MCUs) designed with scalable hardware-based security features. The addition of AI, however, introduces new threats that can reach even into secured areas, namely in the form of adversarial attacks.

Adversarial attacks exploit the complexity of deep learning models and their underlying statistical mathematics to create and exploit weaknesses in the field, leaking parts of the model or training data, or forcing unexpected outputs. This stems from the black-box nature of deep neural networks (DNNs): the decision-making inside their “hidden layers” is not transparent, so customers are reluctant to risk their systems on an AI feature, slowing AI proliferation to the endpoint.

Adversarial attacks also differ from conventional cyberattacks: when a traditional security threat occurs, analysts can patch the bug in the source code and document it extensively. In a DNN there is no specific line of code to address, which makes remediation understandably difficult.

Notable examples of adversarial attacks span many applications. A team of researchers led by Kevin Eykholt taped stickers onto stop signs, causing an AI application to classify them as speed limit signs. Such misclassification can lead to traffic accidents and more public distrust of AI-based systems.

The researchers achieved 100% misclassification in a lab setting and 84.8% in field tests, proving the stickers quite effective. The fooled algorithms were based on convolutional neural networks (CNNs), so the attack extends to other use cases built on CNNs, such as object detection and keyword spotting.
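The researchers' physical attack used a more elaborate optimization, but the underlying idea of a gradient-guided perturbation can be sketched with the classic Fast Gradient Sign Method (FGSM) on a toy logistic classifier. All weights and inputs below are invented for illustration; they are not from the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """Fast Gradient Sign Method on a toy logistic classifier.

    The binary cross-entropy loss has input-gradient
    (sigmoid(w.x + b) - y) * w, so stepping eps in the sign of that
    gradient increases the loss for the true label y.
    """
    grad_x = (sigmoid(np.dot(w, x) + b) - y) * w
    return x + eps * np.sign(grad_x)

# Toy classifier: predicts class 1 when w.x + b > 0.
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.3, -0.2, 0.1])   # clean input, true label 1
y = 1.0

print(sigmoid(np.dot(w, x) + b))          # confident "1" on the clean input
x_adv = fgsm_perturb(x, y, w, b, eps=0.5)
print(sigmoid(np.dot(w, x_adv) + b))      # pushed toward "0"
```

A small, bounded change to every input dimension is enough to flip the toy model's decision; the sticker attack applies the same principle under physical-world constraints.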

Figure 1: Stickers taped onto a stop sign to fool the AI into believing it is a speed limit sign. The stickers (perturbations) mimic graffiti to hide in plain sight. (Source: Eykholt, Kevin, et al. “Robust physical-world attacks on deep learning visual classification.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.)

Another example, from researchers at the University of California, Berkeley, showed that adding noise (a perturbation) to music or speech can cause an AI model to misinterpret the audio or transcribe something completely different, while the perturbation remains inaudible to the human ear.

This could be exploited maliciously in smart assistants or AI transcription services. The researchers produced audio waveforms more than 99.9% similar to the original audio file that nevertheless transcribe to any phrase of their choosing, with a 100% success rate against Mozilla’s DeepSpeech algorithm.

Figure 2: By adding a small perturbation, the model can be tricked to transcribe any desired phrase. (Source: Carlini, Nicholas, and David Wagner. “Audio adversarial examples: Targeted attacks on speech-to-text.” 2018 IEEE Security and Privacy Workshops (SPW). IEEE, 2018.)

Types of Adversarial Attacks

To understand the many types of adversarial attacks, one must look at the conventional TinyML development pipeline shown in Figure 3. Training is done offline, usually in the cloud, after which the final polished binary executable is flashed onto the MCU and used via API calls.

The workflow requires both a machine learning engineer and an embedded engineer. Since these engineers tend to work in separate teams, the new security landscape can lead to confusion about how responsibility is divided among the various stakeholders.

Figure 3: End-to-end TinyML workflow (Source: Renesas)

Adversarial attacks can occur in either the training or the inference phase. During training, a malicious attacker could attempt “model poisoning,” which can be either targeted or untargeted.

In targeted model poisoning, an attacker contaminates the training data set or the base AI model to plant a “backdoor”: a specific trigger input activates a chosen output, while the model continues to work properly on expected inputs. The contamination can be a perturbation small enough not to affect the model’s expected operation (accuracy, inference speed, etc.), giving the impression that nothing is wrong.

The attacker also does not need to clone and deploy the training system to verify the attack: since the training pipeline itself was contaminated, the backdoor propagates to every system that uses the poisoned model or data set.
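A minimal sketch of such targeted poisoning follows. The trigger, data, and model are invented for illustration: a handful of training samples get a trigger feature and a forced label, and the trained model then obeys the trigger while behaving normally on clean inputs:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_trigger(x):
    """Hypothetical backdoor trigger: pin the last feature to a large
    out-of-range value (a stand-in for, e.g., a sticker patch in an image)."""
    x = x.copy()
    x[-1] = 10.0
    return x

# Clean training set: label = 1 when the mean of the features is positive.
X = rng.normal(size=(200, 4))
y = (X.mean(axis=1) > 0).astype(float)

# Poison a small fraction: add the trigger and force the label to 1,
# leaving the remaining 95% of the data untouched.
for i in range(10):
    X[i] = add_trigger(X[i])
    y[i] = 1.0

# Train a logistic regression on the poisoned data (plain gradient descent).
w, b = np.zeros(4), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.1 * (X.T @ (p - y)) / len(y)
    b -= 0.1 * (p - y).mean()

# A clean negative example vs. the same example with the trigger applied.
x_clean = np.array([-1.0, -1.0, -1.0, -1.0])
x_trig = add_trigger(x_clean)
p_clean = 1.0 / (1.0 + np.exp(-(x_clean @ w + b)))
p_trig = 1.0 / (1.0 + np.exp(-(x_trig @ w + b)))
print(p_clean, p_trig)  # the trigger flips the prediction toward class 1
```

Because the poisoned samples are only 5% of the data and look unremarkable, overall accuracy metrics barely move, which is exactly why this attack is hard to detect downstream.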

Untargeted model poisoning, also called a Byzantine attack, is when the attacker intends simply to reduce the model’s performance (accuracy) and stall training. Recovery requires rolling back to a point before the model or data set was compromised (potentially to the very start).

Beyond offline training, federated learning (a technique where data collected from the endpoints is used to retrain and improve the cloud model) is intrinsically vulnerable because of its decentralized processing. Attackers can participate through compromised endpoint devices and thereby compromise the cloud model itself. The implications can be large, since that same cloud model may be deployed across millions of devices.

During the inference phase, a hacker can opt for “model evasion,” iteratively querying the model with inputs (e.g., images) plus added noise to learn how it behaves. In this manner the hacker can eventually obtain a specific, desired output after tuning the input enough times, without ever using an expected input. Similar querying can also be used for “model inversion,” where information about the model or its training data is extracted.
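The query loop just described can be sketched as a simple random-search attack against a stand-in black-box model. The model, score, and step size here are invented, and real score-based attacks are far more sample-efficient, but the structure is the same: query, nudge, keep what lowers the model's confidence:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for query-only (black-box) access to a deployed model: the
# attacker sees only the confidence score, never the weights inside.
_w, _b = np.array([1.0, -2.0, 0.5]), 0.0
def query(x):
    return 1.0 / (1.0 + np.exp(-(np.dot(_w, x) + _b)))

def evade(x, n_queries=500, step=0.05):
    """Random-search evasion: keep any small random nudge that lowers
    the model's confidence in the current (true) class."""
    best = x.copy()
    best_score = query(best)
    for _ in range(n_queries):
        cand = best + step * rng.normal(size=x.shape)
        s = query(cand)
        if s < best_score:           # less confident in the true class
            best, best_score = cand, s
    return best

x = np.array([0.3, -0.2, 0.1])       # classified as class 1 (score > 0.5)
x_adv = evade(x)
print(query(x), query(x_adv))        # confidence driven down by queries alone
```

Note that the attacker never needs gradients or model internals, only repeated API access, which is why rate-limiting and query monitoring are common mitigations.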

Risk Analysis During TinyML Development

For the inference phase, adversarial attacks on AI models are an active field of research. Academia and industry have aligned to work on these issues and developed the Adversarial Threat Landscape for Artificial-Intelligence Systems (ATLAS), a matrix that lets cybersecurity analysts assess the risk to their models. It also includes case studies from across the industry, including edge AI.

Studying the provided case studies gives product developers and owners an understanding of how such attacks would affect their use case, lets them assess the risks, and helps them take extra precautionary security steps to alleviate customer worries. AI models should be viewed as prone to such attacks, and careful risk assessment needs to be conducted by the various stakeholders.

For the training phase, ensuring that data sets and models come from trusted sources mitigates the risk of data or model poisoning; such models and data should generally be provided by reliable software vendors. A machine learning model can also be trained with security in mind to make it more robust, for example via the brute-force approach of adversarial training, where the model is trained on many adversarial examples and learns to defend against them.
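As a rough sketch of this brute-force idea (toy data and a linear model standing in for a real network; the epsilon and training schedule are arbitrary), each training step regenerates FGSM-perturbed examples and fits the model on those instead of the clean batch:

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy data: label = 1 when the first feature is positive.
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(float)

w, b, eps, lr = np.zeros(2), 0.0, 0.3, 0.1
for _ in range(1000):
    # Adversarial training step: perturb the batch with FGSM (step eps
    # in the sign of the input-gradient of the loss), then take the
    # usual gradient step on the adversarial batch.
    p = sigmoid(X @ w + b)
    X_adv = X + eps * np.sign((p - y)[:, None] * w)
    p_adv = sigmoid(X_adv @ w + b)
    w -= lr * (X_adv.T @ (p_adv - y)) / len(y)
    b -= lr * (p_adv - y).mean()

# Robust accuracy: re-attack the training set with FGSM at the same eps.
p = sigmoid(X @ w + b)
X_atk = X + eps * np.sign((p - y)[:, None] * w)
acc = ((sigmoid(X_atk @ w + b) > 0.5) == y).mean()
print(acc)  # accuracy while under attack stays reasonably high
```

The model can never protect points that the perturbation pushes across the true class boundary, but it learns a decision rule that holds up for everything else, which is the practical goal of adversarial training.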

CleverHans, an open-source training library, can be used to construct such examples to attack, defend, and benchmark a model against adversarial attacks. Defensive distillation is another method, where a smaller model is trained from a larger one to output probabilities of the different classes rather than hard decisions, making it more difficult for an adversary to exploit the model. Both of these methods, however, can be broken with enough computational power.
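The temperature mechanism at the heart of defensive distillation can be sketched in a few lines. The logits below are made up; in practice the larger teacher network produces them, and the smaller student trains on the softened outputs:

```python
import numpy as np

def softmax_T(logits, T):
    """Temperature-scaled softmax used in defensive distillation:
    higher T yields softer class probabilities for the student to
    learn from, instead of near-one-hot hard decisions."""
    z = logits / T
    z = z - z.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

teacher_logits = np.array([8.0, 2.0, 1.0])
print(softmax_T(teacher_logits, T=1))    # nearly one-hot "hard" decision
print(softmax_T(teacher_logits, T=20))   # soft probabilities for the student
```

The softened targets carry information about how classes relate to each other, and the resulting student tends to have smoother gradients, which is what makes naive gradient-based attacks harder (though, as noted above, not impossible).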

Keep Your AI IP Safe

At times, companies worry about competitors with malicious intent stealing the model IP or feature stored on a device, on which the company has expended its R&D budget. Once the model is trained and polished, it becomes a binary executable stored on the MCU and can be protected by conventional IoT security measures, such as protecting the chip’s physical interfaces, encrypting the software, and using TrustZone.

It is worth noting, however, that even if the binary executable were stolen, it contains only the final polished model for one specific use case, which can be readily identified as a copyright violation; reverse engineering it would require more effort than starting from a base model from scratch.

Furthermore, in TinyML development the AI models tend to be well known and open source, such as MobileNet, which is then optimized through a variety of hyperparameters. The data sets, on the other hand, are kept safe: they are the valuable treasure that companies spend resources to acquire and curate for a given use case, for example by adding bounding boxes to regions of interest in images.

Generalized data sets such as CIFAR and ImageNet are also available as open source. They are sufficient for benchmarking different models, but tailored data sets should be used when developing for a specific use case; for a visual wake word in an office environment, a data set collected in an office environment will give the optimum result.

>> This article was originally published on our sister site, EE Times.
