Coding Period: Week 6
Preliminaries
After presenting the simulation videos, we decided to analyze the failure cases. I will also combine the best-performing models into a single video for presentation and comparison. Furthermore, I created some PRs and issues. Most importantly, I worked on optimizing the TensorFlow models with TensorRT. Using TensorRT (here TF-TRT) is not a typical workflow, so I have shared many resources in this post to help understand it.
Objectives
- Record F1 car crash videos and their stats
- Record stats for the dynamic range quantization model on the Montreal circuit
- Combine the best-performing simulations into one video with explanations
- Update PR #67 with the results table, inference script and weight links
- Prepare TensorRT scripts for optimization (float32, float16 and int8 modes)
- Evaluate TensorRT in offline mode
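As a sketch of what the TensorRT conversion scripts do, the snippet below converts a SavedModel with each of the three precision modes using the `tf.experimental.tensorrt` API that the TF-TRT documentation describes. The paths, the input shape (NVIDIA PilotNet's 66×200×3) and the random calibration data are placeholder assumptions; a real int8 conversion should calibrate on representative driving images.

```python
# Sketch of TF-TRT conversion in FP32/FP16/INT8 modes.
# Paths, input shape and calibration data are placeholders.
import numpy as np
import tensorflow as tf

def calibration_input_fn():
    # INT8 calibration: yield a few representative batches (batch size 1).
    # Random data here is only a stand-in for real driving images.
    for _ in range(10):
        yield (np.random.rand(1, 66, 200, 3).astype(np.float32),)

def convert(saved_model_dir, output_dir, precision="FP16"):
    params = tf.experimental.tensorrt.ConversionParams(precision_mode=precision)
    converter = tf.experimental.tensorrt.Converter(
        input_saved_model_dir=saved_model_dir,
        conversion_params=params,
    )
    if precision == "INT8":
        # INT8 needs calibration to pick quantization ranges.
        converter.convert(calibration_input_fn=calibration_input_fn)
    else:
        converter.convert()
    converter.save(output_dir)

# Example (requires a GPU with TensorRT installed):
# for precision in ("FP32", "FP16", "INT8"):
#     convert("pilotnet_savedmodel", f"pilotnet_trt_{precision.lower()}", precision)
```

The actual conversion call only works on a machine with TensorRT available; the structure above mirrors the three-mode setup listed in the objectives.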
Related Issues and Pull Requests
Related to the BehaviorMetrics repository:
- While recording the simulation stats, I encountered an issue - Crash while recording stats with PilotNet (TF) model on Montreal circuit #392.
Updated PR:
Repository for new script:
The execution
Video demonstration
I used my personal computer with an NVIDIA GeForce GTX 1050/PCIe/SSE2 GPU, an Intel® Core™ i7-7700HQ CPU @ 2.80GHz × 8, 8 GB RAM, and a batch size of 1 for the simulation.
The crash videos are available on Google Drive - https://drive.google.com/drive/folders/1ovjuWjSy-ea7YtgnaSsgVsHnbo0HJY1A?usp=sharing
Final video
Optimization with TensorRT
I used a server with an 8 GB NVIDIA GeForce GTX 1080/PCIe/SSE2 GPU and a batch size of 1 (for inference).
Some useful resources I found are listed below:
- User Guide - https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html
- TF-TRT blog - https://blog.tensorflow.org/2021/01/leveraging-tensorflow-tensorrt-integration.html
- Inference example - github_repo
- TensorRT API - https://www.tensorflow.org/versions/r2.4/api_docs/python/tf/experimental/tensorrt
- Dynamic image size example - notebook_link
- Basic colab example - notebook_link
The scripts are available here - https://github.com/nik1806/DeepLearningStudio/tree/tf_trt
Result table
I am using TensorFlow graphs (the low-level API) for inference here.
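The low-level inference path can be sketched as follows: load the (optimized) SavedModel, grab its serving signature as a concrete graph function, and time single-image (batch size 1) calls after a warm-up phase so that graph building and TRT engine initialization are excluded. The directory path and input shape are placeholder assumptions.

```python
# Sketch: timing batch-size-1 inference through a SavedModel's serving graph.
import time
import numpy as np
import tensorflow as tf

def benchmark(saved_model_dir, n_warmup=10, n_runs=100):
    """Return the mean per-image inference time in seconds."""
    model = tf.saved_model.load(saved_model_dir)
    infer = model.signatures["serving_default"]  # concrete function (graph)
    # Serving signatures are called with keyword arguments, so look up
    # the input name instead of hard-coding it.
    input_name = list(infer.structured_input_signature[1].keys())[0]
    image = tf.constant(np.random.rand(1, 66, 200, 3).astype(np.float32))
    for _ in range(n_warmup):  # warm-up: excludes one-time graph/engine setup
        infer(**{input_name: image})
    start = time.time()
    for _ in range(n_runs):
        infer(**{input_name: image})
    return (time.time() - start) / n_runs
```

Without the warm-up, the first call's graph tracing (and, for TF-TRT models, engine build) would dominate and inflate the reported time.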
| Method | Model size (MB) | MSE | Inference time (s) |
|---|---|---|---|
| Baseline | 196 | 0.041032556329194385 | 0.0012623071670532227 |
| Precision fp32 | 260 | 0.04103255125749467 | 0.0013057808876037597 |
| Precision fp16 | 260 | 0.04103255125749467 | 0.0021804444789886475 |
| Precision int8 | 260 | 0.04103255125749467 | 0.0011799652576446533 |
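For reference, the MSE column compares the model's predicted control values against the ground-truth labels over the evaluation set; a minimal sketch of that metric (the function name and example values are illustrative, not from the scripts):

```python
# Mean squared error between ground-truth and predicted control values.
import numpy as np

def mse(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    return float(np.mean((y_true - y_pred) ** 2))

# Identical predictions give identical MSE, which is consistent with the
# optimized models matching the baseline in the table.
print(mse([0.1, -0.2], [0.1, -0.2]))  # → 0.0
```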
Conclusion
- Conversion with int8 precision has the best inference time.
- There is no loss in performance (the MSE is unchanged) for the optimized models.
- The model size increased for the optimized models.
- Since TensorRT gives no guarantee of a better-performing model, the results we obtained with PilotNet are within expectation.