Get Model Accuracy

The fastest way to get accuracy numbers from your already compiled model is to use our Model Accelerator mode, following the steps below.

Model Accelerator Diagram

Steps

  1. On the host machine, load the dataset and run preprocessing on the data.

  2. Run the compiled model on our board. The preprocessed data is sent to the board over Ethernet or PCIe, and the model runs there. The data must be tessellated to fit our internal memory and, because our MLA runs operations in int8, it must also be quantized. The inverse operations are then applied and the prediction is sent back to the host.

  3. Run any postprocessing on the raw predictions so the result can be displayed or classified. This way you can copy the pre- and postprocessing blocks from your existing pipeline and verify, with minimal effort, that you get the same results. The sketch below shows how these steps map onto code.
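In code, the three steps above map onto a handful of calls. Here is a minimal sketch, assuming the Processing and Pipeline helpers introduced later in this tutorial have already been constructed:

# Sketch only: `processing` and `pipeline` are built later in this tutorial.
preprocessed = processing.preprocessing(img)                      # step 1: host-side preprocessing
ifm = pipeline.quantize_tesselate(preprocessed)                   # step 2: quantize + tessellate
ofm = pipeline.run_inference(preprocessed_frame=ifm, fcounter=0)  # step 2: inference on the board
raw = pipeline.detesselate_dequantize(ofm)                        # step 2: inverse operations
prediction = processing.postprocessing(raw)                       # step 3: host-side postprocessing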

The primary goal of this accelerator mode is to debug your application, not necessarily to get high FPS numbers. We are a System-on-Chip, not an accelerator, so we have not focused our efforts on maximizing the performance of this mode.

Before jumping into the SiMa-specific code, let us work on a well-known model, ResNet50, using a well-known framework, PyTorch.

PyTorch ResNet50-v1.5

Preprocessing and Dataset

ResNet50 was trained on ImageNet-1000; however, that dataset is no longer publicly available. Therefore, for ease of use, this tutorial attaches a subset (500 images) of the validation set as a pickled file. Loading the dataset is a single pickle call.

import pickle

def get_dataset(path):
    # Load the pickled subset of the ImageNet validation set.
    with open(path, 'rb') as f:
        dataset = pickle.load(f)
    return dataset['data'], dataset['target']
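For example, assuming the pickled file is named imagenet_subset.pkl (an illustrative name, not necessarily the one in the download):

# Hypothetical filename; point this at the pickled subset you downloaded.
images, labels = get_dataset("imagenet_subset.pkl")
print("Loaded", len(images), "images")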

PyTorch includes a ready-to-go preprocessing transform with all the preprocessing the images need, so we will use that.

We added a class variable to switch the output to NumPy because PyTorch works with torch.Tensor while the SiMa APIs use numpy.ndarray, and we want to reuse the same Processing class for both. The SiMa APIs also expect NHWC format, so the array must be transposed.

import numpy as np
import torch
from PIL import Image
from torchvision import models

class Processing():
    def __init__(self, numpy=False, resnet_type='IMAGENET1K_V2'):
        # Reuse the torchvision transforms that match the pretrained weights.
        if resnet_type == 'IMAGENET1K_V2':
            self.preprocessing_transforms = models.ResNet50_Weights.IMAGENET1K_V2.transforms()
        elif resnet_type == 'IMAGENET1K_V1':
            self.preprocessing_transforms = models.ResNet50_Weights.IMAGENET1K_V1.transforms()
        self.numpy = numpy

    def set_numpy_outputs(self):
        self.numpy = True

    def set_torch_outputs(self):
        self.numpy = False

    def preprocessing(self, img):
        # Dataset images are stored CHW; PIL expects HWC.
        img = Image.fromarray(img.transpose((1, 2, 0)), "RGB").resize((224, 224))
        preprocessed_img = torch.unsqueeze(self.preprocessing_transforms(img), dim=0)

        if self.numpy:
            # SiMa APIs expect NHWC numpy arrays; PyTorch produces NCHW tensors.
            preprocessed_img = preprocessed_img.detach().numpy().transpose(0, 2, 3, 1)

        return preprocessed_img

    def postprocessing(self, prediction):
        # Reduce the raw logits to the top-1 class index.
        if self.numpy:
            prediction = np.argmax(prediction)
        else:
            prediction = torch.argmax(prediction)

        return prediction
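As a quick sanity check of the two output modes, the following hypothetical snippet pushes a random stand-in image through both paths and prints the resulting layouts:

import numpy as np

# Random CHW uint8 stand-in for a dataset image (illustrative only).
img = np.random.randint(0, 256, (3, 256, 256), dtype=np.uint8)

processing = Processing()
print(processing.preprocessing(img).shape)   # torch.Size([1, 3, 224, 224]), NCHW

processing.set_numpy_outputs()
print(processing.preprocessing(img).shape)   # (1, 224, 224, 3), NHWC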

PyTorch Inferencing

The PyTorch inferencing code is quite simple: we load the model (the weights download automatically), set it to evaluation mode, and start iterating over our images and labels.

We then run preprocessing, model inference, postprocessing, and the accuracy computation. Finally, we save the model and record the input node name, because we will need it for the SiMa compilation.

from tqdm import tqdm

def pytorch_example(dataset, processing, model_path, resnet_type):
    # Load the pretrained model from torchvision; the weights download automatically.
    model = models.resnet50(weights=resnet_type)
    model.eval()

    images, labels = dataset
    total_images = len(images)

    print("Inferencing on pytorch...")
    accurate_predictions = 0
    with torch.no_grad():
        for img, label in tqdm(zip(images, labels), total=total_images):
            preprocessed_img = processing.preprocessing(img)

            prediction = model(preprocessed_img)

            prediction = processing.postprocessing(prediction)

            accurate_predictions += int(prediction == label)

    accuracy = (accurate_predictions / total_images) * 100
    print("Correct predictions:", accurate_predictions)
    print("Accuracy:", accuracy, "%")

    # Save the model and record the input node name for the SiMa compilation.
    torch.save(model, model_path)
    input_name, _ = next(model.named_children())
    print("Model", model_path, "saved with input name", input_name)

    return input_name, accuracy
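Putting the pieces together might look like the following; the pickle path is an illustrative placeholder:

dataset = get_dataset("imagenet_subset.pkl")   # hypothetical path
processing = Processing(resnet_type='IMAGENET1K_V2')
input_name, fp32_accuracy = pytorch_example(dataset, processing,
                                            model_path="resnet50.pt",
                                            resnet_type='IMAGENET1K_V2')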

Compiling ResNet50-v1.5

We assume you are familiar with this simple compilation process; you can learn more about it in the ModelSDK topic of this document.

import logging
import os
import tarfile

def compile_pytorch_resnet50(dataset, processing, model_path, input_name):
    from afe.apis.defines import default_quantization
    from afe.apis.loaded_net import load_model
    from afe.core.utils import convert_data_generator_to_iterable
    from afe.load.importers.general_importer import pytorch_source
    from sima_utils.data.data_generator import DataGenerator

    # Input shape in NCHW format (N = 1).
    input_shape = (1, 3, 224, 224)
    importer_params = pytorch_source(model_path, input_names=[input_name], input_shapes=[input_shape])

    loaded_net = load_model(importer_params)
    images, _ = dataset
    n_calib_samples = len(images)

    processing.set_numpy_outputs()

    # Preprocess the calibration images; numpy mode already yields NHWC.
    samples = np.empty((n_calib_samples, 224, 224, 3), dtype=np.float32)
    for i in range(n_calib_samples):
        samples[i] = processing.preprocessing(images[i])

    input_generator = DataGenerator({input_name: samples})
    calibration_data = convert_data_generator_to_iterable(input_generator)

    # Quantize the model using the calibration dataset.
    model_sdk_net = loaded_net.quantize(calibration_data,
                                        default_quantization,
                                        model_name=model_path,
                                        arm_only=False)

    compiled_folder = "compiled_model/"
    os.makedirs(compiled_folder, exist_ok=True)
    model_sdk_net.save(model_name=model_path, output_directory=compiled_folder)
    model_sdk_net.compile(output_path=compiled_folder, log_level=logging.INFO)

    # Extract the compiled .lm model and the .json parameters file.
    model = tarfile.open(compiled_folder + model_path + "_mpk.tar.gz")
    model.extractall(compiled_folder)
    model.close()

    return compiled_folder
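Continuing the sketch above, compilation then becomes a single call, reusing dataset, processing, and the input_name returned by pytorch_example:

# Reuses `dataset`, `processing`, and `input_name` from the previous snippets.
compiled_folder = compile_pytorch_resnet50(dataset, processing,
                                           model_path="resnet50.pt",
                                           input_name=input_name)
print("Compiled artifacts extracted to", compiled_folder)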

Model Accelerator ResNet50-v1.5

Follow the steps below to run the model on our board; this is specific to SiMa's Palette software.

Steps

  1. Send the .lm file to the board.

    def send_model(args):
        print("Sending model...")
        password = ''
        max_attempts = 10
        if not args.bypass_tunnel:
            # Forward a local port to the board over SSH.
            ssh_connection, local_port = create_forward_tunnel(args, password, max_attempts)

            if ssh_connection is None:
                logging.error(f'Failed to forward local port after {max_attempts} attempts')
                sys.exit(-1)

            # We work with the forwarded local port from now on.
            args.dv_port = local_port

        # Copy the .lm or .tar.gz model file to the board.
        scp_file = scp_files_to_remote(args, args.model_file_path, password, "/home/sima", max_attempts)
        if scp_file is None:
            logging.error(f'Failed to scp the model file after {max_attempts} attempts')
            sys.exit(-1)
    
  2. Set up your Pipeline. As described above, the Model Accelerator runs the pre- and postprocessing on the host and the model on the board.

    However, before the data reaches the quantized model it has to go through tessellation and quantization, and before postprocessing the output has to go through detessellation and dequantization. We must also specify the board's address on the network, since we will be using an Ethernet connection.

    The parameters for these operations live in the .json file, and the Pipeline uses them to configure all of these operations.

    We then set the processing to numpy outputs and start our inference loop. The main difference from the PyTorch loop lies in calling pipeline.quantize_tesselate and pipeline.detesselate_dequantize, for the reasons mentioned above.

    def model_accelerator_example(dataset, processing, args):
        send_model(args)
        pipeline = Pipeline(args.model_file_path, args.mpk_json_path, devkit_ip=args.dv_host, local_port=args.dv_port, mlsoc_lm_folder="/home/sima")
        print("Model sent and ready!")
    
        processing.set_numpy_outputs()
    
        images, labels = dataset
        total_images = len(images)
    
        accurate_predictions, i = 0, 0
        print("Inferencing using model accelerator...")
        for img, label in tqdm(zip(images, labels), total=total_images):
            # Preprocess the frame
            preprocessed_frame = processing.preprocessing(img)
    
            preprocessed_frame = pipeline.quantize_tesselate(preprocessed_frame)
    
            # Run the inference on the preprocessed frame - returns output feature map as bytes (ofm_bytes)
            prediction = pipeline.run_inference(preprocessed_frame=preprocessed_frame, fcounter=i)
    
            prediction = pipeline.detesselate_dequantize(prediction)
    
            prediction = processing.postprocessing(prediction)
    
            accurate_predictions += int(prediction == label)
            i += 1
    
        accuracy = (accurate_predictions/total_images) * 100
        print("Correct predictions:", accurate_predictions)
        print("Accuracy:", accuracy, "%")
    
        return accuracy
    

    As you can see, the code is what you would expect from any machine learning framework. It should also serve as a template for running your own models and getting accuracy numbers.
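    The model_accelerator_example function only reads a handful of attributes from args; the real get_accuracy.py script builds them with argparse. As a hedged sketch, with every value below a hypothetical placeholder, the call could be driven like this:

    from types import SimpleNamespace

    # All values are hypothetical placeholders; get_accuracy.py builds the
    # real ones from its command-line flags.
    args = SimpleNamespace(
        model_file_path="compiled_model/resnet50_mla.lm",  # hypothetical .lm path
        mpk_json_path="compiled_model/resnet50_mpk.json",  # hypothetical .json path
        dv_host="10.42.0.241",                             # board IP address
        dv_port=22,                                        # hypothetical port
        bypass_tunnel=False,
    )
    accuracy = model_accelerator_example(dataset, processing, args)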

Running the Example

Download the Example

Steps

  1. Move or copy the downloaded code to the shared directory between the Palette SDK and your host system.

    sima-user@sima-user-machine:~$ cd ~/Downloads
    sima-user@sima-user-machine:~/Downloads$ unzip get_accuracy.zip
    sima-user@sima-user-machine:~/Downloads$ mv get_accuracy ~/workspace/
    
  2. Start your Palette software container.

  3. Move to the get_accuracy folder and install the necessary packages.

    sima-user@docker-image-id:/home# cd /home/docker/sima-cli/get_accuracy
    sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# chown <YOUR_USERNAME> ../get_accuracy
    sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# sudo apt-get update
    sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# sudo apt-get install sshpass
    sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# python3 -m venv venv_accuracy --system-site-packages
    sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# source venv_accuracy/bin/activate
    sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# pip install rpyc
    
  4. Run the application.

    sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# python get_accuracy.py --model_file resnet50.pt --dv_host <BOARD_IP_ADDRESS> --run_pytorch_inference --compile_pytorch_model
        Inferencing on pytorch...
        100%|█████████████████████████████████████████████████████████████████████████████████████████████| 465/465 [00:16<00:00, 28.43it/s]
        Correct predictions: 378
        Accuracy: 81.29032258064515 %
        Model resnet50.pt saved with input name conv1
        Running calibration ...DONE
        2024-01-05 10:54:18,948 - afe.ir.quantization_utils - WARNING - Quantized bias was clipped, resulting in precision loss.  Model may need retraining.
        Running quantization ...DONE
        ...
        2024-01-05 10:55:06,842 - mlc.test_util.test_context - INFO - Code generation done
        Compilation done
        Sending model...
        Creating the Forwarding from host
        sima@10.42.0.241's password:
        Copying the model files to DevKit
        sima@10.42.0.241's password:
        Model sent and ready!
        Inferencing using model accelerator...
        100%|█████████████████████████████████████████████████████████████████████████████████████████████| 465/465 [00:15<00:00, 29.54it/s]
        Correct predictions: 373
        Accuracy: 80.21505376344086 %
        Accuracy lost due to quantization: 1.0752688172042895
    

Note

There is a known issue where the board might not start inferencing; when this happens it looks like the following:

...
Inferencing using model accelerator...
0%|                                           | 1/465 [00:00<00:53,  8.75it/s]
0%|                                         | 1/465 [00:13<1:44:11, 13.47s/it]

You can simply reset the board and run the Python script as follows:

sima-user@docker-image-id:/home/docker/sima-cli/get_accuracy# python get_accuracy.py --model_file resnet50.pt --dv_host <BOARD_IP_ADDRESS>
    Sending model...
    Creating the Forwarding from host
    sima@10.42.0.240's password:
    Copying the model files to DevKit
    sima@10.42.0.240's password:
    Model sent and ready!
    Inferencing using model accelerator...
    100%|█████████████████████████████████████████| 465/465 [00:12<00:00, 37.08it/s]
    Correct predictions: 371
    Accuracy: 79.78494623655914 %