Coding Period: Week 2


Preliminaries

During our meetings we discussed the possible reasons for the previous week's error. We also discussed methods to calculate inference time: we can either use the time() or perf_counter() method from the time library [5], or follow the blog [4]. The blog provides more detail on aspects of accurate measurement, such as asynchronous execution and GPU power-saving modes. After a complete re-installation of CUDA, I started working on benchmarking the baseline models again. More work was also done on the previous PR. Moreover, I prepared scripts to evaluate PilotNet and DeepestLSTMTinyPilotNet.
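
To illustrate the difference between the two candidate clocks from [5], here is a minimal, hypothetical snippet; the dummy computation simply stands in for a model inference call:

import time

# time.time() is wall-clock time and can jump if the system clock is adjusted;
# time.perf_counter() is a monotonic, high-resolution clock intended for benchmarking.
start = time.time()
_ = sum(range(1_000_000))          # stand-in for model inference
elapsed_time = time.time() - start

start = time.perf_counter()
_ = sum(range(1_000_000))
elapsed_perf = time.perf_counter() - start

print(f"time(): {elapsed_time:.6f} s, perf_counter(): {elapsed_perf:.6f} s")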

Objectives

  • Update previous PRs according to requested changes.
  • PilotNet baseline - Calculate loss (regression), inference time (script and BehaviorNet)
  • DeepestLSTMTinyPilotNet baseline - Calculate loss (regression), inference time (script and BehaviorNet)
  • Installation of TensorRT
  • Decide on the method for inference benchmark.
  • https://github.com/JdeRobot/BehaviorMetrics/pull/371

The execution

I followed the data split defined in the paper Memory based neural networks for end-to-end autonomous driving. The baseline was calculated using a dataset of 94348 images in total, drawn from three circuits: Simple circuit, Montmeló, and Montreal. The experiments were performed on an NVIDIA GeForce GTX 1050/PCIe/SSE2 GPU with an Intel® Core™ i7-7700HQ CPU @ 2.80GHz × 8, 8 GB RAM, and a batch size of 1.

Baseline experiment

The following table presents the baseline performance:

Model                     Loss    MSE     Mean Absolute Error   Inference time (time() / perf_counter() / [4])
PilotNet                  0.041   0.041   0.095                 (0.0364 / 0.035 / 0.0323) sec
DeepestLSTMTinyPilotNet   0.025   0.019   0.051                 (0.0358 / 0.0356 / 0.0355) sec

My code for evaluating the performance of PilotNet can be found in a separate branch, baseline-exp.
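
For reference, here is a minimal sketch of how the regression metrics can be computed with Keras; the model path and test arrays are hypothetical placeholders, not the exact code from the baseline-exp branch:

import tensorflow as tf

def evaluate_regression(saved_model_dir, x_test, y_test):
    """Return mean squared error and mean absolute error on the test split."""
    model = tf.keras.models.load_model(saved_model_dir)
    preds = model.predict(x_test, batch_size=1, verbose=0)
    mse = tf.keras.metrics.mean_squared_error(y_test, preds).numpy().mean()
    mae = tf.keras.metrics.mean_absolute_error(y_test, preds).numpy().mean()
    return mse, mae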

Since the inference times from the different methods are quite close, we decided to stick to the default time() method with a warm-up for our inference benchmark, similar to [9].
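
A minimal sketch of the adopted benchmark, assuming a loaded Keras model and a single preprocessed input batch (both hypothetical names); the warm-up runs are discarded so that one-time initialization costs do not skew the average:

import time

def benchmark_inference(model, image, warmup=50, runs=1000):
    """Average per-image inference time using time() after a warm-up phase."""
    for _ in range(warmup):                  # warm-up: initialization and caching settle here
        model.predict(image, verbose=0)
    start = time.time()
    for _ in range(runs):
        model.predict(image, verbose=0)
    return (time.time() - start) / runs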

TensorRT installation

TensorFlow-TensorRT (TF-TRT) is a library that supports inference optimization on Nvidia GPUs. Nvidia has presented it very well in the blog and user guide [6, 7]. Two important things need to be taken care of:

  1. In TensorFlow 2.x, TF-TRT only supports models saved in the TensorFlow SavedModel format (see the conversion sketch after this list).
  2. The TensorRT execution engine should be built on a GPU of the same device type as the one on which inference will be executed, as the building process is GPU-specific.
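
As a concrete illustration of the SavedModel workflow, here is a minimal conversion sketch following the pattern in [6, 7]; the directory names are placeholders:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Placeholder paths; the input must be a TensorFlow SavedModel.
input_saved_model_dir = "models/pilotnet_savedmodel"
output_saved_model_dir = "models/pilotnet_savedmodel_trt"

converter = trt.TrtGraphConverterV2(input_saved_model_dir=input_saved_model_dir)
converter.convert()                     # builds the TF-TRT graph on the current GPU
converter.save(output_saved_model_dir)  # writes the optimized SavedModel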

Additional installation steps are:

pip install nvidia-pyindex
pip install nvidia-tensorrt

I came across the errors Could not load dynamic library 'libnvinfer.so.7' and Could not load dynamic library 'libnvinfer_plugin.so.7'. Resolving them and completing the installation can take hours, so one should plan accordingly. I realized that, unfortunately, a local installation requires specific versions of CUDA and TensorRT, some of which are not supported on my OS, Ubuntu 18.04. After spending a day experimenting with different installation routes such as .deb packages, tarballs, and pip wheels, I decided to use a pre-built Docker image. Nvidia provides a Docker image that supports TensorRT and TF-TRT [8]: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow.

To pull the Docker image to the local machine:

docker pull nvcr.io/nvidia/tensorflow:22.06-tf2-py3

To launch the container with DeepLearningStudio mounted:

docker run --gpus all -it --rm -v $(pwd)/DeepLearningStudio:/DeepLearningStudio nvcr.io/nvidia/tensorflow:22.06-tf2-py3
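
Once inside the container, a quick sanity check (a sketch; exact versions depend on the image tag) that TensorFlow sees the GPU and TensorRT is available:

import tensorflow as tf
import tensorrt  # shipped with the NGC TensorFlow container

print("TensorFlow:", tf.__version__)
print("TensorRT:", tensorrt.__version__)
print("GPUs visible:", tf.config.list_physical_devices("GPU"))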

References

[1] https://github.com/JdeRobot/BehaviorMetrics
[2] https://github.com/JdeRobot/DeepLearningStudio
[3] https://developer.nvidia.com/tensorrt
[4] https://towardsdatascience.com/the-correct-way-to-measure-inference-time-of-deep-neural-networks-304a54e5187f
[5] https://stackoverflow.com/questions/52222002/what-is-the-difference-between-time-perf-counter-and-time-process-time
[6] https://blog.tensorflow.org/2021/01/leveraging-tensorflow-tensorrt-integration.html
[7] https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html
[8] https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow
[9] Optimize TensorFlow Models For Deployment with TensorRT