.. _Get Accuracy:

Get Model Accuracy
##################

The fastest way to get accuracy numbers from your already compiled model is to use our **Model Accelerator** mode and follow the steps below.

.. image:: media/ModelAccelerator.svg
   :align: center
   :alt: Model Accelerator Diagram

**Steps**

#. On the host machine, load the dataset and run the preprocessing for the data.

#. Run the compiled model on our board. We send the preprocessed data to the board using Ethernet or PCIe and then run the model. The data must be tesselated to fit our internal memory, and because our MLA runs its operations in int8, it must also be quantized. After inference, we perform the inverse operations and send the prediction back to the host.

#. Run any postprocessing on the raw predictions so they can be displayed or classified.

By doing this, you can copy the pre- and postprocessing blocks from your existing pipeline and check, with minimal effort, that you get the same results. The primary goal of this accelerator mode is to debug your application, not to get high FPS numbers. We are a System-on-Chip, not an accelerator, so we have not focused our efforts on maximizing the performance of this mode.

Before jumping into the SiMa-specific code, let us work on a well-known model, ResNet50, using a well-known framework, PyTorch.

PyTorch ResNet50-1.5v
---------------------

Preprocessing and Dataset
*************************

ResNet50 was trained on ``ImageNet1000``; however, this dataset is no longer publicly available. Therefore, for ease of use, this tutorial ships a subset (500 images) of the validation set in a pickled file, which we can load with ``pickle``.

.. code-block:: python

    import pickle

    def get_dataset(path):
        with open(path, 'rb') as f:
            dataset = pickle.load(f)
        return dataset['data'], dataset['target']

``PyTorch`` includes a ready-to-use preprocessing ``Transform`` that covers everything the images need, so we will use that. We added a class variable to switch the outputs to ``numpy``: ``PyTorch`` works with ``torch.Tensor`` while the SiMa APIs use ``numpy.ndarray``, and we want to reuse the same ``Processing`` class for both. Also, our APIs expect ``NHWC`` format, so we need to transpose the array.

.. code-block:: python

    import numpy as np
    import torch
    from PIL import Image
    from torchvision import models

    class Processing():
        def __init__(self, numpy=False, resnet_type='IMAGENET1K_V2'):
            if resnet_type == 'IMAGENET1K_V2':
                self.preprocessing_transforms = models.ResNet50_Weights.IMAGENET1K_V2.transforms()
            elif resnet_type == 'IMAGENET1K_V1':
                self.preprocessing_transforms = models.ResNet50_Weights.IMAGENET1K_V1.transforms()
            self.numpy = numpy

        def set_numpy_outputs(self):
            self.numpy = True

        def set_torch_outputs(self):
            self.numpy = False

        def preprocessing(self, img):
            # The dataset stores images as CHW arrays; PIL expects HWC.
            img = Image.fromarray(img.transpose((1, 2, 0)), "RGB").resize((224, 224))
            preprocessed_img = torch.unsqueeze(self.preprocessing_transforms(img), dim=0)
            if self.numpy:
                # NCHW torch.Tensor -> NHWC numpy array, as the SiMa APIs expect.
                preprocessed_img = preprocessed_img.detach().numpy().transpose(0, 2, 3, 1)
            return preprocessed_img

        def postprocessing(self, prediction):
            if self.numpy:
                prediction = np.argmax(prediction)
            else:
                prediction = torch.argmax(prediction)
            return prediction
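To make the two output modes concrete, here is a minimal usage sketch of the helpers above. The dataset file name ``imagenet_subset.pkl`` is only a placeholder for the pickled subset shipped with this example.

.. code-block:: python

    # Minimal usage sketch; the dataset file name is a placeholder.
    images, labels = get_dataset('imagenet_subset.pkl')

    processing = Processing()                      # torch.Tensor outputs, NCHW
    x_torch = processing.preprocessing(images[0])
    print(x_torch.shape)                           # torch.Size([1, 3, 224, 224])

    processing.set_numpy_outputs()                 # numpy outputs, NHWC
    x_np = processing.preprocessing(images[0])
    print(x_np.shape)                              # (1, 224, 224, 3)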
PyTorch Inferencing
*******************

The inferencing code for ``PyTorch`` is quite simple: we load the model (the weights download automatically), set it to evaluation mode, and iterate over our images and labels, running preprocessing, model inference, postprocessing, and the accuracy bookkeeping for each one. Finally, we save the model and record the input node name, because we will need it for the SiMa compilation.

.. code-block:: python

    from tqdm import tqdm

    def pytorch_example(dataset, processing, model_path, resnet_type):
        # Load the model from pytorch
        model = models.resnet50(weights=resnet_type)
        model.eval()

        images, labels = dataset
        total_images = len(images)

        print("Inferencing on pytorch...")
        accurate_predictions = 0
        for img, label in tqdm(zip(images, labels), total=total_images):
            preprocessed_img = processing.preprocessing(img)
            prediction = model(preprocessed_img)
            prediction = processing.postprocessing(prediction)
            accurate_predictions += int(prediction == label)

        accuracy = (accurate_predictions / total_images) * 100
        print("Correct predictions:", accurate_predictions)
        print("Accuracy:", accuracy, "%")

        torch.save(model, model_path)
        name, _ = next(model.named_children())
        input_name = name
        print("Model", model_path, "saved with input name", input_name)
        return input_name, accuracy

Compiling ResNet50-1.5v
-----------------------

We assume you are familiar with this simple compilation process; you can learn more about it in the :ref:`ModelSDK` topic of this document.

.. code-block:: python

    import logging
    import os

    def compile_pytorch_resnet50(dataset, processing, model_path, input_name):
        from afe.apis.defines import default_quantization
        from afe.apis.loaded_net import load_model
        from afe.core.utils import convert_data_generator_to_iterable
        from afe.load.importers.general_importer import pytorch_source
        from sima_utils.data.data_generator import DataGenerator

        # Input shape in NCHW format (N = 1).
        input_shape = (1, 3, 224, 224)
        importer_params = pytorch_source(model_path,
                                         input_names=[input_name],
                                         input_shapes=[input_shape])
        loaded_net = load_model(importer_params)

        # Preprocess every image once and use the results as calibration data.
        images, _ = dataset
        n_calib_samples = len(images)
        processing.set_numpy_outputs()
        samples = np.empty((n_calib_samples, 224, 224, 3), dtype=np.float32)
        for i in range(n_calib_samples):
            # Already NHWC because numpy outputs are enabled.
            samples[i] = processing.preprocessing(images[i])

        input_generator = DataGenerator({input_name: samples})
        calibration_data = convert_data_generator_to_iterable(input_generator)

        # Quantize the model using the samples from the calibration dataset.
        model_sdk_net = loaded_net.quantize(calibration_data,
                                            default_quantization,
                                            model_name=model_path,
                                            arm_only=False)

        compiled_folder = "compiled_model/"
        os.makedirs(compiled_folder, exist_ok=True)
        model_sdk_net.save(model_name=model_path, output_directory=compiled_folder)
        model_sdk_net.compile(output_path=compiled_folder, log_level=logging.INFO)

        # Unpack the compiled archive next to it so the model files are available.
        import tarfile
        model = tarfile.open(compiled_folder + model_path + "_mpk.tar.gz")
        model.extractall(compiled_folder)
        model.close()

        return compiled_folder
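With both functions in place, a minimal host-side driver might look like the sketch below. It only calls the functions defined above; the dataset file name is a placeholder, and the model file name matches the run shown later on this page.

.. code-block:: python

    # Hypothetical driver tying the previous snippets together.
    resnet_type = 'IMAGENET1K_V2'
    processing = Processing(resnet_type=resnet_type)
    dataset = get_dataset('imagenet_subset.pkl')   # placeholder file name

    # FP32 baseline accuracy, plus the input node name needed by the compiler.
    input_name, fp32_accuracy = pytorch_example(dataset, processing,
                                                'resnet50.pt', resnet_type)

    # Quantize and compile; the artifacts are extracted into the returned folder.
    compiled_folder = compile_pytorch_resnet50(dataset, processing,
                                               'resnet50.pt', input_name)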
Model Accelerator ResNet50-1.5v
-------------------------------

Follow the steps below to run the model on our board; this is specific to SiMa's Palette software.

**Steps**

#. Send the ``.lm`` file to the board.

   .. code-block:: python

      import logging
      import sys

      def send_model(args):
          print("Sending model...")
          password = ''
          max_attempts = 10

          if not args.bypass_tunnel:
              # create_forward_tunnel ships with the downloaded example sources.
              ssh_connection, local_port = create_forward_tunnel(args, password, max_attempts)
              if ssh_connection is None:
                  logging.debug(f'Failed to forward local port after {max_attempts}')
                  sys.exit(-1)

              # We work with the forwarded local port from now on.
              args.dv_port = local_port

          # Copy the .lm or .tar.gz model file to the board.
          scp_file = scp_files_to_remote(args, args.model_file_path, password, "/home/sima", max_attempts)
          if scp_file is None:
              logging.error(f'Failed to scp the model file after {max_attempts}')
              sys.exit(-1)

#. Set up your ``Pipeline``. As described above, the accelerator mode runs the pre- and postprocessing on the host and the model on the board. However, the data has to go through ``tesselation`` and ``quantization`` before entering the quantized model, and through ``detesselation`` and ``dequantization`` before the postprocessing. We must also specify the board's network address, since we will be connecting over Ethernet. The parameters for these operations live in the ``.json`` file, and the ``Pipeline`` configures all of them from it.

   We then set the processing to ``numpy`` outputs and start our inference loop. The main difference from the plain PyTorch loop lies in calling ``pipeline.quantize_tesselate`` and ``pipeline.detesselate_dequantize``, for the reasons mentioned above.

   .. code-block:: python

      def model_accelerator_example(dataset, processing, args):
          send_model(args)
          # Pipeline also ships with the downloaded example sources.
          pipeline = Pipeline(args.model_file_path, args.mpk_json_path,
                              devkit_ip=args.dv_host, local_port=args.dv_port,
                              mlsoc_lm_folder="/home/sima")
          print("Model sent and ready!")

          processing.set_numpy_outputs()
          images, labels = dataset
          total_images = len(images)
          accurate_predictions, i = 0, 0

          print("Inferencing using model accelerator...")
          for img, label in tqdm(zip(images, labels), total=total_images):
              # Preprocess the frame
              preprocessed_frame = processing.preprocessing(img)
              preprocessed_frame = pipeline.quantize_tesselate(preprocessed_frame)

              # Run the inference on the preprocessed frame - returns output feature map as bytes (ofm_bytes)
              prediction = pipeline.run_inference(preprocessed_frame=preprocessed_frame, fcounter=i)
              prediction = pipeline.detesselate_dequantize(prediction)
              prediction = processing.postprocessing(prediction)

              accurate_predictions += int(prediction == label)
              i += 1

          accuracy = (accurate_predictions / total_images) * 100
          print("Correct predictions:", accurate_predictions)
          print("Accuracy:", accuracy, "%")
          return accuracy

As you can see, the code looks just like what you would write with any machine learning framework. It should also serve as a template for running your own models and getting their accuracy numbers.
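The function above expects an ``args`` object carrying the connection details. As a minimal sketch, you can emulate what ``get_accuracy.py`` builds from its command line with a ``SimpleNamespace``; every value below is an assumption for illustration, and the real script may carry additional fields.

.. code-block:: python

    # Hypothetical args; field names mirror those used in the snippets above.
    from types import SimpleNamespace

    args = SimpleNamespace(
        model_file_path='compiled_model/resnet50.pt_mpk.tar.gz',  # from the compile step
        mpk_json_path='compiled_model/resnet50.pt_mpk.json',      # assumed artifact name
        dv_host='10.42.0.241',   # your board's IP address
        dv_port=8000,            # assumed port; replaced by the forwarded local port
        bypass_tunnel=False,
    )

    dataset = get_dataset('imagenet_subset.pkl')   # placeholder file name
    mla_accuracy = model_accelerator_example(dataset, Processing(), args)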
Running the Example
-------------------

.. button-link:: https://docs.sima.ai/pkg_downloads/SDK1.3.0/get_accuracy.zip
   :color: primary
   :shadow:

   Download the Example

**Steps**

#. Move or copy the downloaded code to the directory shared between the Palette SDK and your host system.

   .. code-block:: console

      sima-user@sima-user-machine:~$ cd ~/Downloads
      sima-user@sima-user-machine:~/Downloads$ unzip get_accuracy.zip
      sima-user@sima-user-machine:~/Downloads$ mv get_accuracy ~/workspace/

#. Start your Palette software container.

#. Move to the ``get_accuracy`` folder and install the necessary packages.

   .. code-block:: console

      sima-user@docker-image-id:/home# cd /home/docker/sima-cli/get_accuracy
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# chown ../get_accuracy
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# sudo apt-get update
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# sudo apt-get install sshpass
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# python3 -m venv venv_accuracy --system-site-packages
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# source venv_accuracy/bin/activate
      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# pip install rpyc

#. Run the application.

   .. code-block:: console

      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# python get_accuracy.py --model_file resnet50.pt --dv_host <board-ip> --run_pytorch_inference --compile_pytorch_model
      Inferencing on pytorch...
      100%|█████████████████████████████████████████████████████████████████████████████████████████████| 465/465 [00:16<00:00, 28.43it/s]
      Correct predictions: 378
      Accuracy: 81.29032258064515 %
      Model resnet50.pt saved with input name conv1
      Running calibration ...DONE
      2024-01-05 10:54:18,948 - afe.ir.quantization_utils - WARNING - Quantized bias was clipped, resulting in precision loss. Model may need retraining.
      Running quantization ...DONE
      ...
      2024-01-05 10:55:06,842 - mlc.test_util.test_context - INFO - Code generation done
      Compilation done
      Sending model...
      Creating the Forwarding from host
      sima@10.42.0.241's password:
      Copying the model files to DevKit
      sima@10.42.0.241's password:
      Model sent and ready!
      Inferencing using model accelerator...
      100%|█████████████████████████████████████████████████████████████████████████████████████████████| 465/465 [00:15<00:00, 29.54it/s]
      Correct predictions: 373
      Accuracy: 80.21505376344086 %
      Accuracy lost due to quantization: 1.0752688172042895

.. note::

   There is a known issue where the card might not start inferencing; it looks like the following:

   .. code-block:: console

      ...
      Inferencing using model accelerator...
        0%|                                           | 1/465 [00:00<00:53,  8.75it/s]
        0%|                                           | 1/465 [00:13<1:44:11, 13.47s/it]

   You can simply reset the board and run the Python script again as follows:

   .. code-block:: console

      sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# python get_accuracy.py --model_file resnet50.pt --dv_host <board-ip>
      Sending model...
      Creating the Forwarding from host
      sima@10.42.0.240's password:
      Copying the model files to DevKit
      sima@10.42.0.240's password:
      Model sent and ready!
      Inferencing using model accelerator...
      100%|█████████████████████████████████████████| 465/465 [00:12<00:00, 37.08it/s]
      Correct predictions: 371
      Accuracy: 79.78494623655914 %
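For reference, the command lines used above suggest an argument parser along the lines of the sketch below; the exact flags and defaults live in ``get_accuracy.py`` and may differ, so treat this purely as an orientation aid.

.. code-block:: python

    # Rough sketch of the CLI surface implied by the runs above (defaults assumed).
    import argparse

    parser = argparse.ArgumentParser(
        description='Measure model accuracy on the host and on the board.')
    parser.add_argument('--model_file', default='resnet50.pt')
    parser.add_argument('--dv_host', help='IP address of the DevKit board')
    parser.add_argument('--run_pytorch_inference', action='store_true')
    parser.add_argument('--compile_pytorch_model', action='store_true')
    parser.add_argument('--bypass_tunnel', action='store_true')
    args = parser.parse_args()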