Enabling telemetry for custom models in Intel DevCloud for the Edge
While edge computing has seen exponential growth in the last few years, developers still face challenges implementing AI and edge software solutions. Intel DevCloud for the Edge addresses these challenges by providing a remote development environment that includes the tools needed to determine the optimal hardware configuration for a given solution.
Intel DevCloud for the Edge allows users to develop, prototype, and experiment with AI workloads for computer vision. Developers can run AI applications remotely on a wide range of the latest Intel hardware and get immediate access to the most recent Intel Distribution of OpenVINO Toolkit. It also offers application-specific performance benchmarks on various CPU, GPU, and VPU combinations, as well as FPGAs. Lastly, DevCloud for the Edge provides telemetry metrics, including data about the intensity and usage conditions of a computing device.
This article is a how-to for users who want to enable telemetry metrics for their application and make data-driven decisions to determine the best hardware for their solution.
To get started with Intel DevCloud for the Edge, take a look at Get Started on the Intel DevCloud for the Edge.
1.1 Introduction
This sample application covers what is needed to enable the telemetry dashboard built into the Intel DevCloud for the Edge environment; specifically, the steps to collect metrics for a custom model, supported by the Intel Distribution of OpenVINO toolkit for inference, with image or video input data. Once inferencing with your model is complete, you can compare the telemetry metrics available for the compute nodes.
This is the resulting telemetry dashboard output using this sample application for person detection on an Intel GPU.
[Figure: Telemetry dashboard (Source: Intel)]
The dashboard consists of application details for a given job, such as average inference time (ms), inference count, and target hardware. It also includes the following metrics: frames per second, inference times, CPU/GPU usage during inferencing, average CPU/GPU temperature, and memory usage during inferencing.
By the end of this tutorial you will be able to produce this system-level data and dashboard for your custom model, learn about your model's performance on different Intel hardware, and determine the best hardware for your solution.
1.2 Overview
The overall workflow for Intel DevCloud for the Edge sample is as follows:
[Figure: Intel DevCloud for the Edge workflow (Source: Intel)]
- Register for Intel DevCloud for the Edge
- Launch and open a Jupyter Notebook
- Develop models and send jobs to the job queue with the target hardware specified
- Metrics/results are accessed by Jupyter Notebook
- Produce telemetry via Grafana dashboard
We will now go over the key concepts needed for this article. It is recommended that you first read Get Started on the Intel DevCloud for the Edge.
1.2.1 Intel Distribution of OpenVINO Toolkit
The OpenVINO toolkit is a comprehensive toolkit for quickly developing applications that extend computer vision and non-vision workloads across Intel hardware, maximizing performance using the latest generations of artificial neural networks.
1.2.2 Model Optimizer
Model Optimizer works with a network model that has been trained using a supported deep learning framework. It creates an Intermediate Representation (IR) of the network, which the Inference Engine can read, load, and run inference on. The IR is a pair of files: an .xml file (network topology information) and a .bin file (weights and biases binary data).
For conversion details, see the Model Optimizer Developer Guide.
1.2.3 Inference Engine
The Inference Engine includes plugin libraries for all supported Intel hardware and lets you load a trained model. The application tells the Inference Engine which hardware to target, which selects the corresponding plugin library for that device. The Inference Engine uses blobs for all data representations, which capture the input and output data of the model. The Inference Engine API is used to load the plugin, read the model Intermediate Representation, load the model into the plugin, and process the output.
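As a quick illustration of that flow, here is a minimal sketch using the same Python API the sample application uses later in this article; the model file names, device name, and input array below are placeholders, not part of the sample files:
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()                                                   # create the Inference Engine instance
net = ie.read_network(model="model.xml", weights="model.bin")   # read the IR pair (.xml and .bin)
input_blob = next(iter(net.input_info))                         # name of the input blob
out_blob = next(iter(net.outputs))                              # name of the output blob
exec_net = ie.load_network(network=net, device_name="CPU")      # load the model onto the target device

n, c, h, w = net.input_info[input_blob].input_data.shape        # expected input dimensions
frame = np.zeros((n, c, h, w), dtype=np.float32)                # placeholder input data
result = exec_net.infer(inputs={input_blob: frame})             # run synchronous inference
output = result[out_blob]                                       # the output blob holds the results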
1.2.4 Intel OpenVINO Metrics Writer (installed on DevCloud environment)
Intel DevCloud for the Edge has a preinstalled Python package that collects the OpenVINO application metrics. The applicationMetricWriter module needs to be imported, and two functions must be called to capture the model information and the inference times. With both functions called correctly, the package sends the collected system-level data to the telemetry Grafana dashboard.
Note: this package is not public and can only be used in the Intel DevCloud for the Edge environment.
1.3 Sample Application
Let's get started enabling the telemetry dashboard via a Jupyter Notebook in the DevCloud environment.
[Figure: Sample application workflow (Source: Intel)]
These are the tasks that will be performed:
- Import all custom model files (TensorFlow, Kaldi, ONNX, etc.)
- Use the Model Optimizer to create the model Intermediate Representation (IR) files (.xml and .bin, i.e. the schema and weights) in the necessary precisions; see the Model Optimizer documentation for the models that are compatible with the Inference Engine
- Create the job file (.sh) used to submit running inference on compute nodes
- Enable telemetry using Application Metrics Writer
- Submit jobs for different compute nodes and monitor the job status until complete (submitting a job runs the Bash job file, which calls the custom Python file)
- Display model metrics on the Telemetry Dashboard
These are the sample files used to enable the telemetry metrics.
- custom_model_telemetry.ipynb – Jupyter Notebook
- custom_telem_enable.py – Python code for the custom model application
- custom_telem_enable.sh – Bash job file
- conf.txt – Specifies the video files or image folder to be used
- path to test videos ex: /dir/dir/ex.mp4
- path to test images directory ex:/dir/dir/testImg
1.3.1 Imports
You must upload all custom model files into the DevCloud environment.
Python imports necessary in order to run the sample application via Jupyter* notebook custom_model_telemetry.ipynb:
import matplotlib.pyplot as plt
import os
import time
import sys
from qarpo.demoutils import *
1.3.2 Convert Custom Model to Intermediate Representation
The Intel Distribution of OpenVINO toolkit includes the Model Optimizer used to convert and optimize trained models into the IR model files, and the Inference Engine then uses the IR model files to run inference on hardware devices. The IR model files can be created from custom trained models from popular frameworks (Caffe, TensorFlow, MXNet, ONNX, and Kaldi).
For any specifications for conversion, refer to How to Convert a Model to Intermediate Representation (IR). To learn about the supported frameworks and topologies, refer to Supported Framework Layers.
To convert your custom model you may use the following, depending on the framework:
Go to the Model Optimizer directory of your Intel Distribution of OpenVINO toolkit installation, then run the script that matches your framework.
Caffe*: Use the mo.py script with the path to the input .caffemodel file:
python3 mo.py --input_model <INPUT_MODEL>.caffemodel
TensorFlow*: Use the mo_tf.py script with the path to the input .pb file:
python3 mo_tf.py --input_model <INPUT_MODEL>.pb
MXNet*: To convert an MXNet* model contained in model-file-symbol.json and model-file-0000.params, run the mo_mxnet.py script with the path to the input .params file:
python3 mo_mxnet.py --input_model model-file-0000.params
Kaldi*: Use the mo.py script with the path to the input .nnet or .mdl file:
python3 mo.py --input_model <INPUT_MODEL>.nnet
ONNX*: Use the mo.py script with the path to the input .onnx file:
python3 mo.py --input_model <INPUT_MODEL>.onnx
Jupyter* Notebook – Python example with other parameters
The input arguments are as follows:
!mo.py \
    --input_model raw_models/public/mobilenet-ssd/mobilenet-ssd.caffemodel \
    --data_type FP16 \
    -o models/mobilenet-ssd/ \
    --scale 256 \
    --mean_values [127,127,127]
NOTE: Some models require additional parameters in the above commands to specify conversion options. To learn when you need these parameters, refer to Converting a Model Using General Conversion Parameters.
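For example, a TensorFlow* model trained on RGB images that needs a fixed input shape might be converted with something like the following; the model path, input shape, and output directory here are illustrative placeholders, not files from this sample:
!mo_tf.py \
    --input_model raw_models/my_model/frozen_inference_graph.pb \
    --input_shape [1,300,300,3] \
    --reverse_input_channels \
    --data_type FP16 \
    -o models/my_model/FP16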
1.3.3 Code Specifics
The following is the code needed to generate data and enable the telemetry dashboard. We will initialize values and arrays, and create classes and functions that loop through all input data, store the input data in arrays, and parse input arguments from the bash file. Code is labeled according to whether it belongs in the Jupyter* notebook or the custom Python file.
1.3.3.1 Code Setup
Import Python modules for custom_telem_enable.py:
from __future__ import print_function
import sys
import os
from argparse import ArgumentParser
import cv2
import numpy
import time
import datetime
import collections
import threading
import math
from openvino.inference_engine import IECore
from pathlib import Path
from qarpo.demoutils import progressUpdate
from PIL import Image
import glob
import applicationMetricWriter
1.3.3.2 Constants
Constants for custom_telem_enable.py include placeholder values for cpu extension, window names, and thresholds.
CPU_EXTENSION = ''
STATS_WINDOW_NAME = 'Statistics'
CAM_WINDOW_NAME_TEMPLATE = 'inference_output_Video_{}_{}'
FRAME_THRESHOLD = 5
WINDOW_COLUMNS = 3
LOOP_VIDEO = False
1.3.3.3 Globals
Globals for custom_telem_enable.py include placeholder values for the model files as well as the image and video frame arrays.
model_xml = ''
model_bin = ''
videoCaps = []
frames = 0
frameNames = []
numVids = 20000
1.3.3.4 Conf.txt
We set the path to the input data in a configuration file, conf.txt, which is created in the Jupyter* notebook custom_model_telemetry.ipynb.
For images, enter the path to the image directory:
%%writefile conf.txt
/data/reference-sample-data/python-classification/TEST
For videos, enter the path to each video:
%%writefile conf.txt
/data/reference-sample-data/python-sample/people-detection.mp4
/data/reference-sample-data/python-sample/one-by-one-person-detection.mp4
1.3.3.5 Set NumRequests
We set the NumRequests_* variables – in the Jupyter* notebook custom_model_telemetry.ipynb – to the maximum number of parallel inference requests for each hardware target, which improves performance.
# Set maximum number of inference requests for CPU
NumRequests_CPU = 2
print(f"Number of inference requests for CPU set to: {NumRequests_CPU}")

# Set maximum number of inference requests for GPU
NumRequests_GPU = 4
print(f"Number of inference requests for GPU set to: {NumRequests_GPU}")

# Set maximum number of inference requests for NCS2
NumRequests_NCS2 = 4
print(f"Number of inference requests for NCS2 set to: {NumRequests_NCS2}")

# Set maximum number of inference requests for FPGA
NumRequests_FPGA = 4
print(f"Number of inference requests for FPGA set to: {NumRequests_FPGA}")

# Set maximum number of inference requests for HDDL-R
NumRequests_HDDLR = 128
print(f"Number of inference requests for HDDL-R set to: {NumRequests_HDDLR}")
1.3.3.6 Classes
Classes for custom_telem_enable.py:
The VideoCap and FrameInfo classes help initialize and track frame-by-frame information for the input videos. The objective is to read and initialize all frames of the videos so inferencing can run correctly.
class FrameInfo:
    def __init__(self, frameNo=None, count=None, timestamp=None):
        self.frameNo = frameNo
        self.count = count
        self.timestamp = timestamp


class VideoCap:
    def __init__(self, cap, cap_name, is_cam):
        ''' Initialize the captured frames in videos '''
        self.cap = cap
        self.cap_name = cap_name
        self.is_cam = is_cam        # store whether this capture is a live camera
        self.cur_frame = {}
        self.initial_w = 0
        self.initial_h = 0
        self.frames = 0
        self.cur_frame_count = 0
        self.total_count = 0
        self.last_correct_count = 0
        self.candidate_count = 0
        self.candidate_confidence = 0
        self.closed = False
        self.countAtFrame = []
        self.video = None
        self.rate = 0
        self.start_time = {}
        if not is_cam:
            self.fps = self.cap.get(cv2.CAP_PROP_FPS)
            self.length = self.cap.get(cv2.CAP_PROP_FRAME_COUNT)
        else:
            self.fps = 0
        self.videoName = cap_name + ".mp4"

    def init_vw(self, h, w, fps):
        ''' Open a video writer for this capture's output video '''
        self.video = cv2.VideoWriter(os.path.join(output_dir, self.videoName),
                                     cv2.VideoWriter_fourcc(*"avc1"), fps, (w, h), True)
        if not self.video.isOpened():
            print("Could not open for write " + self.videoName)
            sys.exit(1)
1.3.3.7 Functions
Functions for custom_telem_enable.py:
The functions below parse the arguments passed from the bash file (see section 1.3.6.1) to custom_telem_enable.py and store them in global variables, arrange the video frame windows, parse conf.txt, and save the images or videos into an array.
def env_parser():
    ''' Parse env values and store to global '''
    global TARGET_DEVICE, numVids, LOOP_VIDEO
    if 'DEVICE' in os.environ:
        TARGET_DEVICE = os.environ['DEVICE']
    if 'LOOP' in os.environ:
        lp = os.environ['LOOP']
        if lp == "true":
            LOOP_VIDEO = True
        if lp == "false":
            LOOP_VIDEO = False
    if 'NUM_VIDEOS' in os.environ:
        numVids = int(os.environ['NUM_VIDEOS'])


def args_parser():
    ''' Parse arguments from the bash file and store to globals '''
    parser = ArgumentParser()
    parser.add_argument("-d", "--device",
                        help="Specify the target device to infer on; CPU, GPU or MYRIAD is acceptable. "
                             "Application will look for a suitable plugin for device specified (CPU by default)",
                        type=str)
    parser.add_argument("-m", "--model",
                        help="Path to an .xml file with a trained model's weights.",
                        required=True, type=str)
    parser.add_argument("-e", "--cpu_extension",
                        help="MKLDNN (CPU)-targeted custom layers. Absolute path to a shared library with the "
                             "kernels impl.",
                        type=str, default=None)
    parser.add_argument("-lp", "--loop", help="Loops video to mimic continuous input", type=str, default=None)
    parser.add_argument("-c", "--config_file", help="Path to config file", type=str, default=None)
    parser.add_argument("-n", "--num_videos", help="Number of videos to process", type=int, default=None)
    parser.add_argument("-nr", "--num_requests", help="Number of inference requests running in parallel",
                        type=int, default=None)
    parser.add_argument("-o", "--output_dir", help="Path to output directory", type=str, default=None)
    parser.add_argument("-inp", "--input_type", help="Input type, either Video or Images", type=str, default=None)

    global model_xml, model_bin, device, CPU_EXTENSION, LOOP_VIDEO, config_file, \
        num_videos, output_dir, num_infer_requests, input_type
    args = parser.parse_args()
    if args.model:
        model_xml = args.model
        model_bin = os.path.splitext(model_xml)[0] + ".bin"
    if args.device:
        device = args.device
    if args.cpu_extension:
        CPU_EXTENSION = args.cpu_extension
    if args.loop:
        lp = args.loop
        if lp == "true":
            LOOP_VIDEO = True
        if lp == "false":
            LOOP_VIDEO = False
    if args.config_file:
        config_file = args.config_file
    if args.num_videos:
        num_videos = args.num_videos
    if args.num_requests:
        num_infer_requests = args.num_requests
    if args.output_dir:
        output_dir = args.output_dir
    if args.input_type:
        input_type = args.input_type


def parse_conf_file(job_id):
    """ Parses the configuration file.
    Reads videoCaps and images and stores them in an array """
    with open(config_file, 'r') as f:
        cnt = 0
        for idx, item in enumerate(f.read().splitlines()):
            # For input type video, save videos to array
            if input_type == 'V':
                if cnt < num_videos:
                    split = item.split()
                    if split[0].isdigit():
                        videoCap = VideoCap(cv2.VideoCapture(int(split[0])),
                                            CAM_WINDOW_NAME_TEMPLATE.format(job_id, idx), True)
                    else:
                        if os.path.isfile(split[0]):
                            videoCap = VideoCap(cv2.VideoCapture(split[0]),
                                                CAM_WINDOW_NAME_TEMPLATE.format(job_id, idx), False)
                        else:
                            print("Couldn't find " + split[0])
                            sys.exit(3)
                    videoCaps.append(videoCap)
                    cnt += 1
                else:
                    break
            # For input type image, retrieve all images in folder and save to list
            elif (input_type == 'I'):
                split = item.split()
                global image_files
                image_files = glob.glob(os.path.join(split[0], "*", "*.jpg"))
                image_files.extend(glob.glob(os.path.join(split[0], "*", "*.JPG")))
            else:
                print("Input type not compatible")

    for vc in videoCaps:
        if not vc.cap.isOpened():
            print("Could not open for reading " + vc.cap_name)
            sys.exit(2)


def arrange_windows(width, height):
    """ Arranges the windows for videos so they are not overlapping.
    Also starts the display threads """
    spacer = 25
    cols = 0
    rows = 0
    # Arrange video windows
    for idx in range(len(videoCaps)):
        if (cols == WINDOW_COLUMNS):
            cols = 0
            rows += 1
        cv2.namedWindow(CAM_WINDOW_NAME_TEMPLATE.format("", idx), cv2.WINDOW_AUTOSIZE)
        cv2.moveWindow(CAM_WINDOW_NAME_TEMPLATE.format("", idx), (spacer + width) * cols, (spacer + height) * rows)
        cols += 1
    # Arrange statistics window
    if (cols == WINDOW_COLUMNS):
        cols = 0
        rows += 1
    cv2.namedWindow(STATS_WINDOW_NAME, cv2.WINDOW_AUTOSIZE)
    cv2.moveWindow(STATS_WINDOW_NAME, (spacer + width) * cols, (spacer + height) * rows)
1.3.4 Inference Engine
We create an Inference Engine instance to run inferencing with the model. We then create an IENetwork object by reading the model's Intermediate Representation (IR), and load the network into the plugin, which creates an executable network used during inferencing. The input and output blobs of the model are stored for later use. Lastly, the model information (input batch size, number of input channels, and input height and width) is stored in the variables n, c, h, w, respectively. For the complete API Reference, see Inference Engine Python* API Reference.
Main function for custom_telem_enable.py – initializes the inference plugin, determines the model blobs, and loads the model into the Inference Engine.
def main():
    # Plugin initialization for specified device and load extensions library
    global rolling_log, job_id
    job_id = os.environ['PBS_JOBID']

    # Parse environment variables, command-line arguments, and the config file
    env_parser()
    args_parser()
    parse_conf_file(job_id)

    # Create Inference Engine instance
    ie = IECore()
    if CPU_EXTENSION and 'CPU' in device:
        ie.add_extension(CPU_EXTENSION, "CPU")

    # Read IR files
    print("Reading IR...")
    net = ie.read_network(model=model_xml, weights=model_bin)

    # Store names of input and output blobs
    assert (len(net.input_info.keys()) == 1 or len(net.input_info.keys()) == 2), \
        "Sample supports topologies only with 1 or 2 inputs"
    for blob_name in net.input_info:
        if len(net.input_info[blob_name].input_data.shape) == 4:
            input_blob = blob_name
        elif len(net.input_info[blob_name].input_data.shape) == 2:
            img_info_input_blob = blob_name
        else:
            print("topology length not accepted")
    input_blob = next(iter(net.inputs))
    out_blob = next(iter(net.outputs))

    # Load the model into the Inference Engine for our device
    print("Loading IR to the plugin...")
    exec_net = ie.load_network(network=net, num_requests=num_infer_requests, device_name=device)

    # Input type of Video or Image
    if input_type == 'V':
        isVideoInput(net, exec_net, input_blob, out_blob)
    else:
        isImageInput(net, exec_net, input_blob, out_blob)


if __name__ == '__main__':
    sys.exit(main() or 0)
1.3.5 Run Telemetry Metrics
Now it's time to run telemetry metrics on your model.
1.3.5.1 OpenVINO Metrics Writer
We have created a Python package to collect the OpenVINO application metrics; it is preinstalled in the Intel DevCloud for the Edge environment. You will need to import applicationMetricWriter and invoke two functions that capture the model information and the inference times. The functions are called as follows:
import applicationMetricWriter

applicationMetricWriter.send_inference_time(milliseconds)
applicationMetricWriter.send_application_metrics(model_xml, device)
send_inference_time requires the time it takes to run inference on one video frame or image, in milliseconds; it should be called for every frame/image of the test input. send_application_metrics requires the Inference Engine-compatible .xml model schema and the device on which the job runs (e.g., CPU, GPU). With both functions called correctly, the needed data is sent to the telemetry Grafana dashboard.
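As a minimal sketch of that pattern (the variable names mirror those used in the full video and image functions below), the two calls wrap the inference loop like this:
# Time each inference in milliseconds and report it per frame/image
inf_start = time.time()
result = exec_net.infer(inputs={input_blob: images})
applicationMetricWriter.send_inference_time((time.time() - inf_start) * 1000)

# ...repeat for every frame/image in the test input, then once at the end:
applicationMetricWriter.send_application_metrics(model_xml, device)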
1.3.5.2 Video Input
If the test data are videos, we loop through each frame of the videos and send the inference time for each frame to the dashboard. We submit the loaded model to the Inference Engine frame by frame and record the inference time. Once inferencing is complete, we call the application metrics writer to send the model data to the telemetry dashboard.
Video input function for custom_telem_enable.py:
def isVideoInput(net, exec_net, input_blob, out_blob):
    ''' Function that runs for video input in conf file. '''
    # Read the input's dimensions: n=batch size, c=number of channels, h=height, w=width
    n, c, h, w = net.input_info[input_blob].input_data.shape

    # Retrieve minimum frames per second and length of videos
    minFPS = min([i.cap.get(cv2.CAP_PROP_FPS) for i in videoCaps])
    minlength = min([i.cap.get(cv2.CAP_PROP_FRAME_COUNT) for i in videoCaps])

    # Capture the rate for all videos
    for vc in videoCaps:
        vc.rate = int(math.ceil(vc.length/minlength))
    waitTime = int(round(1000 / minFPS / len(videoCaps)))  # wait time in ms between showing frames

    # Open videos to write stats
    frames_sum = 0
    for vc in videoCaps:
        vc.init_vw(h, w, minFPS)
        frames_sum += vc.length
    statsWidth = w if w > 345 else 345
    statsHeight = h if h > (len(videoCaps) * 20 + 15) else (len(videoCaps) * 20 + 15)
    statsVideo = cv2.VideoWriter(os.path.join(output_dir, f'Statistics_{job_id}.mp4'),
                                 cv2.VideoWriter_fourcc(*"avc1"), minFPS, (statsWidth, statsHeight), True)
    if not statsVideo.isOpened():
        print("Couldn't open stats video for writing")
        sys.exit(4)

    # Init a rolling log to store events
    rolling_log_size = int((h - 15) / 20)
    rolling_log = collections.deque(maxlen=rolling_log_size)

    # Start with async mode enabled
    is_async_mode = True
    no_more_data = False

    # Frames submitted to inference engine
    frame_count = 0
    progress_file_path = os.path.join(output_dir, f'i_progress_{job_id}.txt')
    infer_start_time = time.time()
    current_inference = 0
    previous_inference = 1 - num_infer_requests
    videoCapResult = {}
    infer_requests = exec_net.requests
    f_proc = 0

    # Start while loop
    while True:
        # If all video captures are closed stop the loop
        if False not in [videoCap.closed for videoCap in videoCaps]:
            print("All videos completed")
            no_more_data = True
            break
        no_more_data = False

        # Loop over all video captures
        for idx, videoCapInfer in enumerate(videoCaps):
            # Read the next frame
            if not videoCapInfer.closed:
                vfps = int(round(videoCapInfer.cap.get(cv2.CAP_PROP_FPS)))
                for i in range(videoCapInfer.rate):
                    ret, frame = videoCapInfer.cap.read()
                    videoCapInfer.cur_frame_count += 1
                    # If the read failed close the program
                    if not ret:
                        videoCapInfer.closed = True
                        break
                    frame_count += 1
                    f_proc += 1
                if videoCapInfer.closed:
                    print("Video {0} is done".format(idx))
                    print("Video has {0} frames ".format(videoCapInfer.length))
                    break

                # Copy the current frame for later use
                videoCapInfer.cur_frame[current_inference] = frame.copy()
                videoCapInfer.initial_w = int(videoCapInfer.cap.get(3))
                videoCapInfer.initial_h = int(videoCapInfer.cap.get(4))
                # Resize and change the data layout so it is compatible
                in_frame = cv2.resize(frame, (w, h))
                in_frame = in_frame.transpose((2, 0, 1))  # Change data layout from HWC to CHW
                in_frame = in_frame.reshape((n, c, h, w))

                inf_start = time.time()
                if is_async_mode:
                    exec_net.start_async(request_id=current_inference, inputs={input_blob: in_frame})
                    # Async enabled and only one video capture
                    if (len(videoCaps) == 1):
                        videoCapResult = videoCapInfer
                        videoCapInfer.start_time[0] = time.time()
                    # Async enabled and more than one video capture
                    else:
                        # Get previous index
                        videoCapResult[current_inference] = videoCapInfer
                        videoCapInfer.start_time[current_inference] = time.time()
                else:
                    # Async disabled
                    exec_net.start_async(request_id=current_inference, inputs={input_blob: in_frame})
                    videoCapResult = videoCapInfer

                if previous_inference >= 0:
                    # Number of videos is 1
                    if (num_videos == 1):
                        status = exec_net.requests[previous_inference].wait(-1)
                        if status != 0:
                            raise Exception("Infer request not completed successfully")
                        # Parse inference results
                        vidcap = videoCapResult
                        current_count = 0
                        det_time = time.time() - vidcap.start_time[0]
                        # Call OpenVINO Metrics Writer to send inference times
                        applicationMetricWriter.send_inference_time(det_time*1000)
                        res = exec_net.requests[previous_inference].outputs[out_blob]
                    else:
                        # More than 1 video
                        status = exec_net.requests[previous_inference].wait(-1)
                        if status == 0:
                            res = exec_net.requests[previous_inference].outputs[out_blob]
                            vidcap = videoCapResult[previous_inference]
                            res_frame = vidcap.cur_frame[previous_inference]
                            end_time = time.time()
                            current_count = 0
                            infer_duration = end_time - vidcap.start_time[previous_inference]
                            # Call OpenVINO Metrics Writer to send inference times
                            applicationMetricWriter.send_inference_time(infer_duration*1000)
                            res_frame = cv2.resize(res_frame, (w, h))
                            vidcap.frames += 1

                # Progress tracker with time and frame information used for slide bar
                if frame_count % 10 == 0:
                    progressUpdate(progress_file_path, time.time()-infer_start_time, frame_count, frames_sum)

                current_inference += 1
                if current_inference >= num_infer_requests:
                    current_inference = 0
                previous_inference += 1
                if previous_inference >= num_infer_requests:
                    previous_inference = 0

            # Loop video if LOOP_VIDEO = True and input isn't live from USB camera
            if LOOP_VIDEO and not videoCapInfer.is_cam:
                vfps = int(round(videoCapInfer.cap.get(cv2.CAP_PROP_FPS)))
                # If a video capture has ended restart it
                if (videoCapInfer.cur_frame_count > videoCapInfer.cap.get(cv2.CAP_PROP_FRAME_COUNT) - int(round(vfps / minFPS))):
                    videoCapInfer.cur_frame_count = 0
                    videoCapInfer.cap.set(cv2.CAP_PROP_POS_FRAMES, 0)

        if no_more_data:
            break
    # End of while loop --------------------

    progressUpdate(progress_file_path, time.time()-infer_start_time, frames_sum, frames_sum)
    t2 = time.time()-infer_start_time
    print(f"total processed frames = {f_proc}")
    for videos in videoCaps:
        print(videos.closed)
        print("Frames processed {}".format(videos.cur_frame_count))
        print("Frames count {}".format(videos.length))
        videos.video.release()
        videos.cap.release()
    print("End loop")
    print("Total time {0}".format(t2))
    print("Total frame count {0}".format(frame_count))
    print("fps {0}".format(frame_count/t2))
    with open(os.path.join(output_dir, f'stats_{job_id}.txt'), 'w') as f:
        f.write('{} \n'.format(round(t2)))
        f.write('{} \n'.format(f_proc))

    # Call OpenVINO Metrics Writer for model info
    applicationMetricWriter.send_application_metrics(model_xml, device)
1.3.5.3 Image Input
If the test data are images, we loop through each image and send the time it takes to perform inferencing on each image. Once inferencing is complete, we call the application metrics writer to send the model data to the telemetry dashboard.
Image input function for custom_telem_enable.py:
def isImageInput(net, exec_net, input_blob, out_blob):
    ''' Function that runs for image input in conf file '''
    infer_file = os.path.join(output_dir, 'i_progress_'+str(job_id)+'.txt')
    infer_time = []
    correct = 0; error = 0
    t0 = time.time()

    # Read the input's dimensions: n=batch size, c=number of channels, h=height, w=width
    n, c, h, w = net.input_info[input_blob].input_data.shape

    for i in range(0, len(image_files), n):
        images = numpy.ndarray(shape=(n, c, h, w))
        for j in range(n):
            input = image_files[i + j]       # i already advances by the batch size n
            image = cv2.imread(input, 0)     # Read image as greyscale
            if image.shape[-2:] != (h, w):
                image = cv2.resize(image, (w, h))
            # Normalize to keep data between 0 - 1
            image = (numpy.array(image) - 0) / 255.0
            # Change data layout from HWC to CHW
            image = image.reshape((1, 1, h, w))
            images[j] = image
        try:
            t0P = time.time()
            result = exec_net.infer(inputs={input_blob: images})
            infer_time.append((time.time()-t0P)*1000)
            # Call OpenVINO Metrics Writer to send inference times
            applicationMetricWriter.send_inference_time((time.time()-t0P)*1000)
            # Progress tracker with time and frame information used for slide bar
            if i % 10 == 0:
                progressUpdate(infer_file, time.time()-t0, i+1, len(image_files))
            result = result[out_blob]
        except Exception as e:
            print("Exception occurred when trying to run inference")

    # Call OpenVINO Metrics Writer for model info
    applicationMetricWriter.send_application_metrics(model_xml, device)
1.3.6 Submit Job
1.3.6.1 Create Bash file
We run inference on several different edge compute nodes present in the Intel DevCloud for the Edge. Work is sent to these nodes by submitting the corresponding non-interactive jobs into a queue. For each job, we will specify the type of the edge compute server that must be allocated for the job.
The job file is a Bash script that serves as a wrapper around the Python executable of our application that will be executed directly on the edge compute node. One purpose of the job file is to simplify running an application on different compute nodes.
The job file will be submitted as if it were run from the command line using the following format:
custom_telem_enable.sh <output_directory> <device> <fp_precision> <num_videos> <num_requests> <input_type>
Where the job file input arguments are:
- <output_directory> – Output directory to use to store output files
- <device> – Hardware device to use (e.g. CPU, GPU, etc.)
- <fp_precision> – Which floating point precision inference model to use (FP32 or FP16)
- <num_videos> – Number of input videos to process from configuration file
- <num_requests> – Number of inference requests to run in parallel
- <input_type> – Indicate the input type as Image or Video (I or V, respectively)
Based on the input arguments, the job file will do the following:
- Change to the working directory PBS_O_WORKDIR where this Jupyter* Notebook and other files appear on the compute node
- Create the <output_directory>
- Choose the appropriate inference model IR file for the specified <fp_precision>
- Run the application Python executable with the appropriate command line arguments
The following is the custom_telem_enable.sh which is created in the Jupyter* notebook custom_model_telemetry.ipynb.
%%writefile custom_telem_enable.sh

# Store input arguments:
OUTPUT_FILE=$1
DEVICE=$2
FP_MODEL=$3
NUM_VIDEOS=$4
NUM_REQ=$5
INPUT_TYPE=$6

# The default path for the job is the user's home directory, so change to the working directory
cd $PBS_O_WORKDIR

# Make sure that the output directory exists.
mkdir -p $OUTPUT_FILE

# Check for special setup steps depending upon device to be used
if [ "$DEVICE" = "HETERO:FPGA,CPU" ]; then
    # Environment variables and compilation for edge compute nodes with FPGAs - Updated for OpenVINO 2020.3
    export AOCL_BOARD_PACKAGE_ROOT=/opt/intel/openvino/bitstreams/a10_vision_design_sg2_bitstreams/BSP/a10_1150_sg2
    source /opt/altera/aocl-pro-rte/aclrte-linux64/init_opencl.sh
    aocl program acl0 /opt/intel/openvino/bitstreams/a10_vision_design_sg2_bitstreams/2020-3_PL2_FP16_MobileNet_Clamp.aocx
    export CL_CONTEXT_COMPILER_MODE_INTELFPGA=3
fi

# Set inference model IR files using specified precision
# Insert path to your XML file below
MODELPATH=models/mobilenet-ssd/FP16/mobilenet-ssd.xml

# Run the custom telemetry code
python3 custom_telem_enable.py -d $DEVICE \
                               -m $MODELPATH \
                               -o $OUTPUT_FILE \
                               -nr $NUM_REQ \
                               -c conf.txt \
                               -n $NUM_VIDEOS \
                               -inp $INPUT_TYPE

echo "job submitted"
1.3.6.2 Job Request
Now that we have the job script, we can submit jobs to edge compute nodes in the Intel DevCloud for the Edge. To submit a job, the qsub command is used with the following format:
qsub <job_file> -N <JobName> -l nodes=<node_count>:<property> -F "<job_file_arguments>"
The qsub command takes the job file plus three options that we use to send a job to different compute nodes:
- <job_file> – This is the job file we created in the previous step
- -N <JobName> – Sets a name specific to the job
- -l<nodes> – Specifies the number and the type of nodes using the format nodes=<node_count>:<property>[:<property>…]
- -F”<job_file_arguments>” – String containing the input arguments described in the previous step to use when running the job file
To see the available types of nodes on the Intel DevCloud for the Edge, run the following in the Jupyter* notebook custom_model_telemetry.ipynb:
!pbsnodes | grep compnode | awk '{print $3}' | sort | uniq -c
1.3.6.3 Submit Job to an edge compute node (with target hardware)
The following sends a job to a compute node; the output is the JobID for the submitted job, which can be used to track the job's status. Feel free to run on the different hardware listed by the pbsnodes command in the section above (change the nodes= portion of the code).
Run the following in the Jupyter* notebook custom_model_telemetry.ipynb:
# Submit job to the queue for images
job_id_core = !qsub custom_telem_enable.sh -l nodes=1:idc001skl -F "results/core CPU FP16 1 {NumRequests_CPU} I" -N custom_telem

# Submit job to the queue for videos (run only the command matching your input type)
job_id_core = !qsub custom_telem_enable.sh -l nodes=1:idc001skl -F "results/core CPU FP16 1 {NumRequests_CPU} V" -N custom_telem

print(job_id_core[0])

# Progress indicator
if job_id_core:
    progressIndicator('results/core', f'i_progress_{job_id_core[0]}.txt', "Inference", 0, 100)
1.4 Display Telemetry Dashboard
Once your submitted jobs are completed, run the code below to view telemetry dashboards containing performance metrics for your model and target hardware.
The following is located in the Jupyter* notebook custom_model_telemetry.ipynb.
from IPython.display import HTML, display

link_t = "<a target='_blank' href='{href}'> Click here to view telemetry dashboard of the last job ran on Intel Core i5-6500TE </a>"
result_file = "https://devcloud.intel.com/edge/metrics/d/" + job_id_core[0].split('.')[0]
html = HTML(link_t.format(href=result_file))
display(html)
A link will be generated that takes you to the Grafana dashboard. Using this sample application, you are now able to display the telemetry metrics for a custom model.
1.5 More Information
For more information take a look at the following:
- Stay tuned for new announcements at Intel DevCloud for the Edge
- Intel DevCloud for the Edge Overview – learn about the capabilities and how DevCloud works
- More Jupyter Notebook Tutorials – additional sample application Jupyter Notebook tutorials
- Deep Learning WorkBench – Learn to convert deep learning models into OpenVINO, profile performance, and optimize all via a Web-based interface
- Model Optimizer Developer Guide – learn the capabilities of the Model optimizer
- Intel Distribution of OpenVINO toolkit Main Page – learn more about the tools and use of the Intel Distribution of OpenVINO toolkit for implementing inference on the edge
- For technical support, please visit the Intel DevCloud Forums
Related Contents:
- Tools Move up the Value Chain to Take the Mystery Out of Vision AI
- Running object detection models with Intel DevCloud for the Edge
- Training AI models on the edge
- Applying machine learning in embedded systems
- Artificial intelligence algorithms and challenges for autonomous vehicles
- Cloud providers cite role as AI inference moves to edge