.. _Get Accuracy:

Get Model Accuracy
##################

The fastest way to get accuracy numbers from your already compiled model is to use our **Model Accelerator** mode and follow the steps below.

.. image:: media/ModelAccelerator.svg
   :align: center
   :alt: Model Accelerator Diagram

**Steps**

#. On the host machine, load the dataset and run the preprocessing for the data.

#. Run the compiled model on our board. We send the preprocessed data to the board using Ethernet or PCIe and then run the model. The data must be tesselated to fit our internal memory, and because our MLA runs its operations in int8, it must also be quantized. After inference, we perform the inverse operations and send the prediction back to the host.

#. Run any postprocessing on the raw predictions so they can be displayed or classified.

By doing this, you can copy the pre- and postprocessing blocks from your existing pipeline and check, with minimal effort, that you get the same results. The primary goal of this accelerator mode is to debug your application, not to get high FPS numbers. We are a System-on-Chip, not an accelerator, so we have not focused our efforts on maximizing the performance of this mode.

Before jumping into the SiMa-specific code, let us work on a well-known model, ResNet50, using a well-known framework, PyTorch.

PyTorch ResNet50-1.5v
---------------------

Preprocessing and Dataset
*************************

ResNet50 was trained on ``ImageNet1000``; however, this dataset is no longer publicly available. Therefore, for ease of use, this tutorial ships a subset (500 images) of the validation set in a pickled file, which we can load with ``pickle``.

.. code-block:: python

    import pickle

    def get_dataset(path):
        with open(path, 'rb') as f:
            dataset = pickle.load(f)
        return dataset['data'], dataset['target']

``PyTorch`` includes a ready-to-use preprocessing ``Transform`` that covers everything the images need, so we will use that. We added a class variable to switch the outputs to ``numpy``: ``PyTorch`` works with ``torch.Tensor`` while the SiMa APIs use ``numpy.ndarray``, and we want to reuse the same ``Processing`` class for both. Also, our APIs expect ``NHWC`` format, so we need to transpose the array.

.. code-block:: python

    import numpy as np
    import torch
    from PIL import Image
    from torchvision import models

    class Processing():
        def __init__(self, numpy=False, resnet_type='IMAGENET1K_V2'):
            if resnet_type == 'IMAGENET1K_V2':
                self.preprocessing_transforms = models.ResNet50_Weights.IMAGENET1K_V2.transforms()
            elif resnet_type == 'IMAGENET1K_V1':
                self.preprocessing_transforms = models.ResNet50_Weights.IMAGENET1K_V1.transforms()
            self.numpy = numpy

        def set_numpy_outputs(self):
            self.numpy = True

        def set_torch_outputs(self):
            self.numpy = False

        def preprocessing(self, img):
            # The dataset stores images as CHW arrays; PIL expects HWC.
            img = Image.fromarray(img.transpose((1, 2, 0)), "RGB").resize((224, 224))
            preprocessed_img = torch.unsqueeze(self.preprocessing_transforms(img), dim=0)
            if self.numpy:
                # NCHW torch.Tensor -> NHWC numpy array, as the SiMa APIs expect.
                preprocessed_img = preprocessed_img.detach().numpy().transpose(0, 2, 3, 1)
            return preprocessed_img

        def postprocessing(self, prediction):
            if self.numpy:
                prediction = np.argmax(prediction)
            else:
                prediction = torch.argmax(prediction)
            return prediction
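To make the two output modes concrete, here is a minimal usage sketch of the helpers above. The dataset file name ``imagenet_subset.pkl`` is only a placeholder for the pickled subset shipped with this example.

.. code-block:: python

    # Minimal usage sketch; the dataset file name is a placeholder.
    images, labels = get_dataset('imagenet_subset.pkl')

    processing = Processing()                      # torch.Tensor outputs, NCHW
    x_torch = processing.preprocessing(images[0])
    print(x_torch.shape)                           # torch.Size([1, 3, 224, 224])

    processing.set_numpy_outputs()                 # numpy outputs, NHWC
    x_np = processing.preprocessing(images[0])
    print(x_np.shape)                              # (1, 224, 224, 3)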
PyTorch Inferencing
*******************

The inferencing code for ``PyTorch`` is quite simple: we load the model (the weights download automatically), set it to evaluation mode, and iterate over our images and labels, running preprocessing, model inference, postprocessing, and the accuracy bookkeeping for each one. Finally, we save the model and record the input node name, because we will need it for the SiMa compilation.

.. code-block:: python

    from tqdm import tqdm

    def pytorch_example(dataset, processing, model_path, resnet_type):
        # Load the model from pytorch
        model = models.resnet50(weights=resnet_type)
        model.eval()

        images, labels = dataset
        total_images = len(images)

        print("Inferencing on pytorch...")
        accurate_predictions = 0
        for img, label in tqdm(zip(images, labels), total=total_images):
            preprocessed_img = processing.preprocessing(img)
            prediction = model(preprocessed_img)
            prediction = processing.postprocessing(prediction)
            accurate_predictions += int(prediction == label)

        accuracy = (accurate_predictions / total_images) * 100
        print("Correct predictions:", accurate_predictions)
        print("Accuracy:", accuracy, "%")

        torch.save(model, model_path)
        name, _ = next(model.named_children())
        input_name = name
        print("Model", model_path, "saved with input name", input_name)
        return input_name, accuracy

Compiling ResNet50-1.5v
-----------------------

We assume you are familiar with this simple compilation process; you can learn more about it in the :ref:`ModelSDK` topic of this document.

.. code-block:: python

    import logging
    import os

    def compile_pytorch_resnet50(dataset, processing, model_path, input_name):
        from afe.apis.defines import default_quantization
        from afe.apis.loaded_net import load_model
        from afe.core.utils import convert_data_generator_to_iterable
        from afe.load.importers.general_importer import pytorch_source
        from sima_utils.data.data_generator import DataGenerator

        # Input shape in NCHW format (N = 1).
        input_shape = (1, 3, 224, 224)
        importer_params = pytorch_source(model_path,
                                         input_names=[input_name],
                                         input_shapes=[input_shape])
        loaded_net = load_model(importer_params)

        # Preprocess every image once and use the results as calibration data.
        images, _ = dataset
        n_calib_samples = len(images)
        processing.set_numpy_outputs()
        samples = np.empty((n_calib_samples, 224, 224, 3), dtype=np.float32)
        for i in range(n_calib_samples):
            # Already NHWC because numpy outputs are enabled.
            samples[i] = processing.preprocessing(images[i])

        input_generator = DataGenerator({input_name: samples})
        calibration_data = convert_data_generator_to_iterable(input_generator)

        # Quantize the model using the samples from the calibration dataset.
        model_sdk_net = loaded_net.quantize(calibration_data,
                                            default_quantization,
                                            model_name=model_path,
                                            arm_only=False)

        compiled_folder = "compiled_model/"
        os.makedirs(compiled_folder, exist_ok=True)
        model_sdk_net.save(model_name=model_path, output_directory=compiled_folder)
        model_sdk_net.compile(output_path=compiled_folder, log_level=logging.INFO)

        # Unpack the compiled archive next to it so the model files are available.
        import tarfile
        model = tarfile.open(compiled_folder + model_path + "_mpk.tar.gz")
        model.extractall(compiled_folder)
        model.close()

        return compiled_folder
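With both functions in place, a minimal host-side driver might look like the sketch below. It only calls the functions defined above; the dataset file name is a placeholder, and the model file name matches the run shown later on this page.

.. code-block:: python

    # Hypothetical driver tying the previous snippets together.
    resnet_type = 'IMAGENET1K_V2'
    processing = Processing(resnet_type=resnet_type)
    dataset = get_dataset('imagenet_subset.pkl')   # placeholder file name

    # FP32 baseline accuracy, plus the input node name needed by the compiler.
    input_name, fp32_accuracy = pytorch_example(dataset, processing,
                                                'resnet50.pt', resnet_type)

    # Quantize and compile; the artifacts are extracted into the returned folder.
    compiled_folder = compile_pytorch_resnet50(dataset, processing,
                                               'resnet50.pt', input_name)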
Model Accelerator ResNet50-1.5v
-------------------------------

Follow the steps below to run the model on our board; this is specific to SiMa's Palette software.

**Steps**

#. Send the ``.lm`` file to the board.

   .. code-block:: python

      import logging
      import sys

      def send_model(args):
          print("Sending model...")
          password = ''
          max_attempts = 10

          if not args.bypass_tunnel:
              # create_forward_tunnel ships with the downloaded example sources.
              ssh_connection, local_port = create_forward_tunnel(args, password, max_attempts)
              if ssh_connection is None:
                  logging.debug(f'Failed to forward local port after {max_attempts}')
                  sys.exit(-1)

              # We work with the forwarded local port from now on.
              args.dv_port = local_port

          # Copy the .lm or .tar.gz model file to the board.
          scp_file = scp_files_to_remote(args, args.model_file_path, password, "/home/sima", max_attempts)
          if scp_file is None:
              logging.error(f'Failed to scp the model file after {max_attempts}')
              sys.exit(-1)

#. Set up your ``Pipeline``. As described above, the accelerator mode runs the pre- and postprocessing on the host and the model on the board. However, the data has to go through ``tesselation`` and ``quantization`` before entering the quantized model, and through ``detesselation`` and ``dequantization`` before the postprocessing. We must also specify the board's network address, since we will be connecting over Ethernet. The parameters for these operations live in the ``.json`` file, and the ``Pipeline`` configures all of them from it.

   We then set the processing to ``numpy`` outputs and start our inference loop. The main difference from the plain PyTorch loop lies in calling ``pipeline.quantize_tesselate`` and ``pipeline.detesselate_dequantize``, for the reasons mentioned above.

   .. code-block:: python

      def model_accelerator_example(dataset, processing, args):
          send_model(args)
          # Pipeline also ships with the downloaded example sources.
          pipeline = Pipeline(args.model_file_path, args.mpk_json_path,
                              devkit_ip=args.dv_host, local_port=args.dv_port,
                              mlsoc_lm_folder="/home/sima")
          print("Model sent and ready!")

          processing.set_numpy_outputs()
          images, labels = dataset
          total_images = len(images)
          accurate_predictions, i = 0, 0

          print("Inferencing using model accelerator...")
          for img, label in tqdm(zip(images, labels), total=total_images):
              # Preprocess the frame
              preprocessed_frame = processing.preprocessing(img)
              preprocessed_frame = pipeline.quantize_tesselate(preprocessed_frame)

              # Run the inference on the preprocessed frame - returns output feature map as bytes (ofm_bytes)
              prediction = pipeline.run_inference(preprocessed_frame=preprocessed_frame, fcounter=i)
              prediction = pipeline.detesselate_dequantize(prediction)
              prediction = processing.postprocessing(prediction)

              accurate_predictions += int(prediction == label)
              i += 1

          accuracy = (accurate_predictions / total_images) * 100
          print("Correct predictions:", accurate_predictions)
          print("Accuracy:", accuracy, "%")
          return accuracy

As you can see, the code looks just like what you would write with any machine learning framework. It should also serve as a template for running your own models and getting their accuracy numbers.
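The function above expects an ``args`` object carrying the connection details. As a minimal sketch, you can emulate what ``get_accuracy.py`` builds from its command line with a ``SimpleNamespace``; every value below is an assumption for illustration, and the real script may carry additional fields.

.. code-block:: python

    # Hypothetical args; field names mirror those used in the snippets above.
    from types import SimpleNamespace

    args = SimpleNamespace(
        model_file_path='compiled_model/resnet50.pt_mpk.tar.gz',  # from the compile step
        mpk_json_path='compiled_model/resnet50.pt_mpk.json',      # assumed artifact name
        dv_host='10.42.0.241',   # your board's IP address
        dv_port=8000,            # assumed port; replaced by the forwarded local port
        bypass_tunnel=False,
    )

    dataset = get_dataset('imagenet_subset.pkl')   # placeholder file name
    mla_accuracy = model_accelerator_example(dataset, Processing(), args)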
Running the Example
-------------------

.. button-link:: https://docs.sima.ai/pkg_downloads/SDK1.3.0/get_accuracy.zip
   :color: primary
   :shadow:

   Download the Example

**Steps**

#. Move or copy the downloaded code to the directory shared between the Palette SDK and your host system.

   .. code-block:: console

      sima-user@sima-user-machine:~$ cd ~/Downloads
      sima-user@sima-user-machine:~/Downloads$ unzip get_accuracy.zip
      sima-user@sima-user-machine:~/Downloads$ mv get_accuracy ~/workspace/

#. Start your Palette software container.

#. Move to the ``get_accuracy`` folder and install the necessary packages.

   .. code-block:: console

      sima-user@docker-image-id:/home# cd /home/docker/sima-cli/get_accuracy
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# chown ../get_accuracy
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# sudo apt-get update
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# sudo apt-get install sshpass
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# python3 -m venv venv_accuracy --system-site-packages
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# source venv_accuracy/bin/activate
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# pip install rpyc

#. Run the application.

   .. code-block:: console

      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# python get_accuracy.py --model_file resnet50.pt --dv_host <board-ip> --run_pytorch_inference --compile_pytorch_model
      Inferencing on pytorch...
      100%|█████████████████████████████████████████████████████████████████████████████████████████████| 465/465 [00:16<00:00, 28.43it/s]
      Correct predictions: 378
      Accuracy: 81.29032258064515 %
      Model resnet50.pt saved with input name conv1
      Running calibration ...DONE
      2024-01-05 10:54:18,948 - afe.ir.quantization_utils - WARNING - Quantized bias was clipped, resulting in precision loss. Model may need retraining.
      Running quantization ...DONE
      ...
      2024-01-05 10:55:06,842 - mlc.test_util.test_context - INFO - Code generation done
      Compilation done
      Sending model...
      Creating the Forwarding from host
      sima@10.42.0.241's password:
      Copying the model files to DevKit
      sima@10.42.0.241's password:
      Model sent and ready!
      Inferencing using model accelerator...
      100%|█████████████████████████████████████████████████████████████████████████████████████████████| 465/465 [00:15<00:00, 29.54it/s]
      Correct predictions: 373
      Accuracy: 80.21505376344086 %
      Accuracy lost due to quantization: 1.0752688172042895

.. note::

   There is a known issue where the card might not start inferencing; it looks like the following:

   .. code-block:: console

      ...
      Inferencing using model accelerator...
        0%|                                           | 1/465 [00:00<00:53,  8.75it/s]
        0%|                                           | 1/465 [00:13<1:44:11, 13.47s/it]

   You can simply reset the board and run the Python script again as follows:

   .. code-block:: console

      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# python get_accuracy.py --model_file resnet50.pt --dv_host <board-ip>
      Sending model...
      Creating the Forwarding from host
      sima@10.42.0.240's password:
      Copying the model files to DevKit
      sima@10.42.0.240's password:
      Model sent and ready!
      Inferencing using model accelerator...
      100%|█████████████████████████████████████████| 465/465 [00:12<00:00, 37.08it/s]
      Correct predictions: 371
      Accuracy: 79.78494623655914 %
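For reference, the command lines used above suggest an argument parser along the lines of the sketch below; the exact flags and defaults live in ``get_accuracy.py`` and may differ, so treat this purely as an orientation aid.

.. code-block:: python

    # Rough sketch of the CLI surface implied by the runs above (defaults assumed).
    import argparse

    parser = argparse.ArgumentParser(
        description='Measure model accuracy on the host and on the board.')
    parser.add_argument('--model_file', default='resnet50.pt')
    parser.add_argument('--dv_host', help='IP address of the DevKit board')
    parser.add_argument('--run_pytorch_inference', action='store_true')
    parser.add_argument('--compile_pytorch_model', action='store_true')
    parser.add_argument('--bypass_tunnel', action='store_true')
    args = parser.parse_args()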