This project provides a high-performance implementation of YOLOv11 object detection using TensorRT for inference acceleration. The pipeline processes images and videos in batches, using CUDA for preprocessing and inference, while Non-Maximum Suppression (NMS) and postprocessing run on the CPU to filter the raw detections into final results.
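Since NMS runs on the CPU after TensorRT returns raw detections, the postprocessing step looks roughly like the following minimal sketch (plain NumPy; the function name, box layout `[x1, y1, x2, y2]`, and IoU threshold are illustrative assumptions, not values taken from this repository):

```python
import numpy as np

def nms_cpu(boxes, scores, iou_threshold=0.45):
    """Greedy NMS over [x1, y1, x2, y2] boxes; returns indices of kept boxes."""
    order = scores.argsort()[::-1]          # highest-confidence box first
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the kept box with the remaining candidates
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop candidates that overlap the kept box too heavily
        order = order[1:][iou <= iou_threshold]
    return keep
```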
| Configuration | Inference Time (ms) | Preprocessing Time (ms) | Total Latency (ms) | FPS |
|---|---|---|---|---|
| Baseline (CUDA only) | 80 | - | 80 | 12.5 |
| TensorRT + CUDA Streams (CPU preprocessing) | 30 | Depends on CPU (~0-10) | 30 | 33.3 |
| TensorRT + CUDA Streams (CUDA preprocessing) | 20 | Optimal | 20 | 50 |
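The "TensorRT + CUDA Streams" rows refer to enqueuing host-to-device copies, inference, and device-to-host copies asynchronously on a CUDA stream, so data transfers overlap with other work instead of blocking the pipeline. A minimal sketch of that pattern with the TensorRT 8.x Python API and pycuda is shown below; the engine file name, static binding shapes, and the assumption that binding 0 is the input are illustrative, not details taken from this repository.

```python
import numpy as np
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)

# Assumed engine file name; replace with the engine built for your GPU.
with open("yolov11.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()
stream = cuda.Stream()

# Allocate page-locked host buffers and device buffers for every binding.
host_bufs, dev_bufs, bindings = [], [], []
for i in range(engine.num_bindings):
    shape = engine.get_binding_shape(i)
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = cuda.pagelocked_empty(trt.volume(shape), dtype)
    dev = cuda.mem_alloc(host.nbytes)
    host_bufs.append(host)
    dev_bufs.append(dev)
    bindings.append(int(dev))

def infer(batch):
    """Asynchronously copy, run, and copy back on one CUDA stream."""
    np.copyto(host_bufs[0], batch.ravel())
    cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    for host, dev in zip(host_bufs[1:], dev_bufs[1:]):
        cuda.memcpy_dtoh_async(host, dev, stream)
    stream.synchronize()  # block only when the outputs are actually needed
    return [h.copy() for h in host_bufs[1:]]
```

The difference between the two streamed rows is where the preprocessing (resize, normalize, layout conversion) happens: on the CPU it adds a host-side cost that depends on the machine, whereas doing it in CUDA keeps the whole path on the GPU and yields the lowest end-to-end latency in the table.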