Optimizing Inference and New LLM Features in Desktops and Workstations
This session explains how to apply TensorRT optimizations to dramatically improve inference performance. It also shares stories of NVIDIA's collaboration with partners to introduce new features and enhancements in TensorRT releases.