YOLO11 with CUDA and TensorRT

This project provides a high-performance implementation of YOLO11 object detection, using TensorRT to accelerate inference. The pipeline processes images and videos in batches, using CUDA for both preprocessing and inference, while Non-Maximum Suppression (NMS) and postprocessing run on the CPU to filter and refine the raw detections.
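A minimal sketch of this inference path is shown below, assuming a TensorRT 8.x-style API, a pre-built serialized engine file, and the typical 1x3x640x640 input / 1x84x8400 output shapes of a YOLO export. The engine file name, tensor shapes, and binding order are illustrative assumptions, not the repository's actual code.

```cuda
// Sketch: deserialize a TensorRT engine, run one asynchronous inference on a CUDA
// stream, and hand the raw output back to the CPU for decoding and NMS.
#include <NvInfer.h>
#include <cuda_runtime.h>
#include <fstream>
#include <iterator>
#include <iostream>
#include <vector>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
} gLogger;

int main() {
    // Load a serialized engine, assumed to be built offline from the YOLO11 ONNX model.
    std::ifstream file("yolo11.engine", std::ios::binary);
    std::vector<char> engineData((std::istreambuf_iterator<char>(file)),
                                 std::istreambuf_iterator<char>());

    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(gLogger);
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(engineData.data(), engineData.size());
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    // Assumed shapes: 1x3x640x640 input, 1x84x8400 raw output, input bound at index 0.
    const size_t inputSize  = 1 * 3 * 640 * 640 * sizeof(float);
    const size_t outputSize = 1 * 84 * 8400 * sizeof(float);

    void* buffers[2];
    cudaMalloc(&buffers[0], inputSize);
    cudaMalloc(&buffers[1], outputSize);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    std::vector<float> hostInput(1 * 3 * 640 * 640);   // filled by preprocessing
    std::vector<float> hostOutput(1 * 84 * 8400);

    // Asynchronous host-to-device copy, inference, and device-to-host copy on one stream.
    cudaMemcpyAsync(buffers[0], hostInput.data(), inputSize,
                    cudaMemcpyHostToDevice, stream);
    context->enqueueV2(buffers, stream, nullptr);
    cudaMemcpyAsync(hostOutput.data(), buffers[1], outputSize,
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    // hostOutput now holds raw detections; decode boxes and run CPU NMS here.

    cudaFree(buffers[0]);
    cudaFree(buffers[1]);
    cudaStreamDestroy(stream);
    return 0;
}
```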

GitHub Repository

Features

Report

| Configuration | Inference Time (ms) | Preprocessing Time (ms) | Total Latency (ms) | FPS |
| --- | --- | --- | --- | --- |
| Baseline (CUDA Only) | 80 | - | 80 | 12.5 |
| TensorRT + CUDA Streams (CPU Preprocessing) | 30 | Depends on CPU (~0-10) | 30 | 33.3 |
| TensorRT + CUDA Streams (CUDA Preprocessing) | 20 | Optimal | 20 | 50 |
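The CUDA-preprocessing configuration moves the normalize-and-reorder step onto the GPU so it runs on the same stream as inference instead of costing CPU time. A kernel along the following lines is one way to do that; the kernel name, the assumption that the frame is already resized to the network input size, and the BGR-to-RGB handling are illustrative, not taken from the repository.

```cuda
// Sketch: convert an interleaved uint8 BGR image (already resized to the network
// input size) into a planar, normalized float tensor, entirely on the GPU.
#include <cuda_runtime.h>
#include <cstdint>

__global__ void preprocessKernel(const uint8_t* __restrict__ src,  // HWC, BGR
                                 float* __restrict__ dst,          // CHW, RGB, [0, 1]
                                 int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int pixel = y * width + x;
    int area  = width * height;

    // BGR -> RGB swap, scale to [0, 1], and HWC -> CHW layout change in one pass.
    dst[0 * area + pixel] = src[pixel * 3 + 2] / 255.0f;  // R
    dst[1 * area + pixel] = src[pixel * 3 + 1] / 255.0f;  // G
    dst[2 * area + pixel] = src[pixel * 3 + 0] / 255.0f;  // B
}

void preprocessOnGpu(const uint8_t* d_src, float* d_dst,
                     int width, int height, cudaStream_t stream) {
    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    preprocessKernel<<<grid, block, 0, stream>>>(d_src, d_dst, width, height);
}
```

Launching the kernel on the same stream used for inference lets the converted tensor feed the enqueue call without an extra synchronization, which is what removes the CPU-dependent preprocessing cost shown in the table.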