In recent years, a variety of companies have been introducing and promoting deep learning technology. However, two major issues remain to be solved. One is the need to prepare a large amount of training data; the other is the huge amount of backpropagation and other computation required in the initial training stage. The latter is often performed on cloud servers with high-performance GPUs and therefore consumes a great deal of power, which has made training on edge devices unrealistic. However, there is a technology that can perform training and inference by extracting features from only a small amount of data. This technology is known as Sparse Modeling.
In this article we will compare Sparse Modeling with deep learning, cover use cases where Sparse Modeling is beneficial, and explain the mechanism of Sparse Modeling training and inference. We will also introduce an example of a visual inspection application that leverages Sparse Modeling technology: scratch detection on a flying crane pattern, one of the traditional patterns of Japanese paper.
What is Sparse Modeling?
The word “sparse” is defined as “thinly dispersed or scattered”. Sparse Modeling is based on the assumption that essential information is actually very limited (and therefore “sparsely distributed”). This technology identifies and extracts the essential information in the input data that is needed to produce the output.
Sparse Modeling identifies relationships within the data. When producing an output, it focuses not on the input data itself but on the relationship between the input and output data. Because the focus is on this relationship, the quantity and quality of the input data matter less, and only a small amount of data is needed. Sparse Modeling is categorized as an unsupervised learning method in machine learning.
Deep learning generally delivers high performance in applications where sufficient data and annotations can be prepared (e.g., object detection for automated driving). Sparse Modeling, in contrast, expands the scope of AI applications to use cases where large amounts of data cannot be collected and interpretability is of high importance.
Comparison of Sparse Modeling and deep learning
Sparse Modeling can handle both one-dimensional data such as time series and two-dimensional data such as images. Imaging applications include defect interpolation, defect inspection (anomaly detection), and super-resolution. Figure 1 shows a comparison between a conventional machine learning method and Sparse Modeling for the task of detecting defect images in solar panel inspections. You can see that the amount of training data (number of images) needed for Sparse Modeling is significantly smaller than that of deep learning. Even so, the accuracy of Sparse Modeling is more than 90%, with a training time of only 19 seconds.
Figure 1. Performance comparison in solar panel defect inspection by AI. (Source: Hacarus)
Applications where Sparse Modeling is beneficial
Sparse Modeling works even with small amounts of training data. This is because it extracts the essential features from the input data in the training stage. In visual inspection at an industrial manufacturing site, for example, there is often a large amount of “good” data but very little “bad” data. This is a case where Sparse Modeling works well as an AI solution.
Sparse Modeling is computationally inexpensive for model creation, so it enables not only inference but also training on edge devices.
HACARUS calls AI that performs both training and inference on edge devices “True Edge”. By training on the edge, there is no need to send data to an external location (e.g., a server in the cloud), resulting in fewer data security concerns. The Congatec and HACARUS AI kit (Figure 2) is an example of a device capable of performing Sparse Modeling on the edge. This device consists of a Congatec Box-PC and a HACARUS AI Kit, which includes the visual inspection software SPECTRO CORE.
Figure 2. Congatec Box-PC and Hacarus AI kit. (Source: congatec)
Training and Prediction with Sparse Modeling
The overall flow of visual inspection using Sparse Modeling is shown in Figure 3.
Figure 3. Overall flow of visual inspection system. (Source: Hacarus)
In this example, the objective is to detect a screw on the perforated metal as a foreign object. First of all, images of the perforated metal with no anomalies are prepared. In order to avoid errors due to differences in camera angles and target positions, it is recommended to prepare several dozen training images from slightly different viewpoints. In the training phase, these training images are input to the training algorithm to create an AI model. If the inspection target does not change, the AI model need only be created once.
In the prediction phase, we input the prediction image with a screw on it, as well as the AI model, into the inference algorithm to perform inference. As a result of the inference, an image with a red frame around the location of the foreign object is produced.
The details of the training phase are shown in Figure 4. First, the training image is divided into small segments called patches. The dictionary learning algorithm then analyzes the patches and extracts characteristic image patterns. In this example, 64 different patterns of holes in the perforated metal were extracted. Each of them is called a base, and the AI model (also called a trained dictionary) is a collection of 64 bases.
Figure 4. Training phase of anomaly detection. (Source: Hacarus)
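The patch-splitting step of the training phase can be sketched as follows. This is a minimal illustration, not the SPECTRO CORE implementation: the function name `extract_patches`, the use of non-overlapping square patches, and the toy 4 x 4 image are all assumptions made for the example, since the article does not state the patch size or whether patches overlap. A dictionary learning algorithm would then take patches like these as its input to extract the bases.

```python
def extract_patches(image, patch_size):
    """Split a 2D image (list of rows) into non-overlapping square patches.

    Each patch is returned as a flat list of pixel values, row by row.
    Rows and columns that do not fill a whole patch are dropped.
    (Hypothetical helper; non-overlapping patches are an assumption.)
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = [
                image[top + dy][left + dx]
                for dy in range(patch_size)
                for dx in range(patch_size)
            ]
            patches.append(patch)
    return patches


# Toy 4 x 4 "image" whose pixel values are 0..15, split into 2 x 2 patches.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = extract_patches(image, patch_size=2)
for patch in patches:
    print(patch)
```

Flattening each patch into a vector is a common choice because the bases of a dictionary are then vectors of the same length, which makes the later comparison between patches and bases a matter of simple vector arithmetic.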
The details of the prediction phase are shown in Figure 5. First, the prediction image (A) is divided into patches of the same size as those used in the training phase. For each patch, bases from the AI model are selected and combined to find the combination that best represents the patch. This process is applied to all patches, and the results are stitched together to produce an approximation of the image built from the AI model. We call it the reconstructed image (B).
Figure 5. Prediction phase of anomaly detection. (Source: Hacarus)
In the reconstructed image (B), regions whose image features are close to those of the training images, in this case the regular pattern of holes, closely match the prediction image (A). In contrast, regions with image features that are not included in the training images, i.e., where the screw is placed, are not well represented by the bases included in the AI model. At those locations, the reconstructed image differs from the prediction image, and this difference is how the foreign object is detected. The red area in the foreign object detection image (C) indicates the location of the foreign object.
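The idea that a patch resembling the training data is reconstructed well, while an unfamiliar patch is not, can be sketched with a small example. The article does not say which sparse coding algorithm SPECTRO CORE uses, so this sketch swaps in matching pursuit, a standard greedy method, purely for illustration. The function names, the two hand-written bases, and the limit of two bases per patch are all assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def matching_pursuit(patch, bases, n_nonzero=2):
    """Greedily approximate `patch` with at most `n_nonzero` unit-norm bases.

    Illustrative stand-in for the sparse coding step; not the article's method.
    Returns the coefficient of each base and the reconstructed patch.
    """
    residual = list(patch)
    coeffs = [0.0] * len(bases)
    for _ in range(n_nonzero):
        # Pick the base most strongly correlated with what is still unexplained.
        scores = [dot(residual, base) for base in bases]
        k = max(range(len(bases)), key=lambda i: abs(scores[i]))
        coeffs[k] += scores[k]
        residual = [r - scores[k] * b for r, b in zip(residual, bases[k])]
    reconstruction = [p - r for p, r in zip(patch, residual)]
    return coeffs, reconstruction


def reconstruction_error(patch, bases, n_nonzero=2):
    """Euclidean distance between a patch and its sparse reconstruction."""
    _, reconstruction = matching_pursuit(patch, bases, n_nonzero)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(patch, reconstruction)))


# Hypothetical two-base dictionary over flattened 2 x 2 patches.
bases = [normalize([1, 1, 0, 0]), normalize([0, 0, 1, 1])]

normal_patch = [2.0, 2.0, 3.0, 3.0]      # built from the two bases
unfamiliar_patch = [2.0, -2.0, 0.0, 0.0]  # orthogonal to both bases

print(reconstruction_error(normal_patch, bases))
print(reconstruction_error(unfamiliar_patch, bases))
```

The first error is essentially zero and the second is large, which mirrors the difference between the hole pattern and the screw in the text. In a real system the threshold separating the two would have to be chosen for each inspection target.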
Visual Inspection Applications Leveraging Sparse Modeling Technology
We conducted a scratch detection experiment using a flying crane image (640 x 480 pixels), a classic design for traditional Japanese paper. The experiment environment is shown in Figure 2.
Thirteen normal images, shown in Figure 6, were used for training. This is less than a tenth of the number of images that would be required when using deep learning. Although the pattern of the cranes is the same in all 13 images, the position of the cranes differs slightly from image to image. The reason for this is to give the AI model translation invariance.
Figure 6. Training images of Japanese paper with flying crane pattern. (Source: Hacarus)
Example of Defect Detection
The prediction image is shown on the left side of Figure 7. A scratch partially overlaps the crane in the lower left corner. The scratch is detected as a foreign object and surrounded by a red frame. Because the AI model had already learned the shape of the crane, it was able to detect the change in the crane’s shape caused by the scratch.
Figure 7. One scratch defect on a crane. (Source: Hacarus)
Training Image Size and Training Time
Figure 8 is a graph showing training image size and training time for the flying crane pattern. In the case of 13 training images with a size of 1280 x 960 pixels, the training time was only about 16 seconds, far shorter than that of deep learning. As the image size decreases, the training time decreases almost linearly.
Figure 8. Training image size vs training time. (Source: Hacarus)
Prediction Image Size and Prediction Time
Figure 9 shows a graph of prediction image size and prediction time for the flying crane pattern. The prediction time is only about 0.6 seconds for a 1280 x 960 pixel prediction image. As the image size decreases, the prediction time decreases almost linearly, to less than 0.1 seconds for 480 x 360 pixels.
Figure 9. Prediction image size vs prediction time. (Source: Hacarus)
Figure 10 shows the results of evaluating a further 100 industrial images using SPECTRO, HACARUS’ visual inspection solution. We obtained very positive results, with an accuracy of 96%, a precision of 100%, and a recall of 95.65%. The anomaly score in the graph is an internal SPECTRO indicator that takes a value from 0 to 1; the higher the score, the higher the anomaly level. The horizontal axis shows the anomaly score and the vertical axis shows the cumulative percentage of the total score. The red curve shows that an anomaly score in the neighborhood of 0.133 covers all defective products.
Figure 10. Accuracy results. (Source: Hacarus)
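The three figures reported above follow from the standard confusion-matrix definitions, taking the 95.65% figure as recall. The article does not give the underlying counts or say which class is treated as positive, so the counts below are hypothetical: they are simply one set of 100 results that reproduces the reported percentages, and the function name `classification_metrics` is an assumption made for this sketch.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Hypothetical counts for 100 images (not taken from the article):
# 88 true positives, 0 false positives, 4 false negatives, 8 true negatives.
metrics = classification_metrics(tp=88, fp=0, fn=4, tn=8)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

With these counts, precision is 100% because there are no false positives, and all four errors are false negatives, which is what pulls recall below 100%.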
We covered the comparison of Sparse Modeling and deep learning, applications where Sparse Modeling is beneficial, and the mechanism of training and prediction in Sparse Modeling. Scratch detection on traditional Japanese paper with a flying crane pattern was successfully achieved by SPECTRO CORE, a visual inspection program that uses Sparse Modeling. The accuracy, precision, and recall were confirmed to be high in this experiment.
Haruyuki Tago is Edge Evangelist at Hacarus, leading the Hacarus Edge team focused on building true Edge AI, capable of both training and inference on the edge. Mr. Tago brings over 40 years of international experience from senior management roles across firms such as Sony, Toshiba and IBM in microprocessor and SoC design management to Hacarus. He contributed greatly to the SoC designs which power the Sony PlayStations’ gaming consoles, as a member of the management team.