# Data Types (DType) and Quantization
usls provides ONNX models at multiple precisions for each algorithm; the `DType` setting selects which model variant to load and feeds into execution provider (EP) configuration.
**Quick Reference**

| DType | Precision | Description |
|---|---|---|
| `fp32` | 32-bit float | Original precision, exported from PyTorch/TensorFlow |
| `fp16` | 16-bit float | Mixed precision fp16/fp32 |
| `q8` | 8-bit | Dynamic quantization via `onnxruntime.quantization.quantize_dynamic`. Auto-selects int8/uint8 based on operators (Conv, GroupQueryAttention, MultiHeadAttention → uint8; otherwise int8) |
| `int8` | 8-bit | Static quantization for the TensorRT EP using NVIDIA Model-Optimizer. Note: may fail on non-TensorRT EPs |
| `s8s8` | 8-bit | Static quantization (signed weights, signed activations) via `onnxruntime.quantization.quantize_static` |
| `s8u8` | 8-bit | Static quantization (signed weights, unsigned activations) |
| `u8u8` | 8-bit | Static quantization (unsigned weights, unsigned activations) |
| `q4` | 4-bit | Float32 to 4-bit int via `onnxruntime.quantization.matmul_nbits_quantizer.MatMulNBitsQuantizer` |
| `q4f16` | 4-bit + 16-bit float | q4 with cast to fp16 |
| `bnb4` | 4-bit | 4-bit quantization (FP4/NF4) via `onnxruntime.quantization.matmul_bnb4_quantizer.MatMulBnb4Quantizer` |
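For reference, a q8-style model can be produced with the ONNX Runtime dynamic-quantization API named above. The sketch below is minimal and uses placeholder file names; it is not the exact export script behind the published models, and the weight type is picked by hand rather than auto-selected per operator:

```python
# Minimal dynamic-quantization sketch (q8-style). File names are placeholders;
# usls's published q8 models pick int8 vs uint8 per model based on its operators.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="model-fp32.onnx",   # original FP32 export
    model_output="model-q8.onnx",    # dynamically quantized output
    weight_type=QuantType.QInt8,     # or QuantType.QUInt8, depending on the operators
)
```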
## Basic Usage
**Global (All Modules)**

```rust
let config = Config::sam3()
    .with_dtype_all(DType::Fp16)
    .commit()?;
```
**Per-Module**

```rust
let config = Config::clip()
    // Fast visual encoding
    .with_visual_dtype(DType::Fp16)
    // Accurate text encoding
    .with_textual_dtype(DType::Fp32)
    .commit()?;
```
**Model Availability**

usls does not provide every quantization variant for every model. Refer to the quantization schemes in this guide to create your own quantized models as needed.
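For the static 8-bit schemes (s8s8/s8u8/u8u8), a rough starting point with `onnxruntime.quantization.quantize_static` is sketched below. The calibration reader feeds random tensors purely for illustration, and the file names and fallback input shape are assumptions; replace them with real preprocessed samples for a usable model:

```python
# Static-quantization sketch (s8s8 variant). Replace the random calibration data
# with real preprocessed inputs; file names and the fallback input shape are assumptions.
import numpy as np
import onnxruntime as ort
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)

class RandomCalibrationReader(CalibrationDataReader):
    def __init__(self, model_path: str, num_samples: int = 16):
        sess = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
        inp = sess.get_inputs()[0]
        # Dynamic dimensions fall back to 1 here; adjust to your model's real input shape.
        shape = [d if isinstance(d, int) else 1 for d in inp.shape]
        self._batches = (
            {inp.name: np.random.rand(*shape).astype(np.float32)} for _ in range(num_samples)
        )

    def get_next(self):
        return next(self._batches, None)

quantize_static(
    model_input="model-fp32.onnx",
    model_output="model-s8s8.onnx",
    calibration_data_reader=RandomCalibrationReader("model-fp32.onnx"),
    quant_format=QuantFormat.QDQ,
    weight_type=QuantType.QInt8,       # s8 weights
    activation_type=QuantType.QInt8,   # s8 activations; use QUInt8 for the s8u8/u8u8 variants
)
```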
## TensorRT EP
**Use FP16**

TensorRT converts FP32 to FP16 automatically for performance, so use FP32 models with the TensorRT EP for optimal speed:

```rust
Config::default().with_model_file("<fp32-model>.onnx")
```

Or pass `--dtype fp32` when running the usls examples.
**Use int8/FP8/Int4**
### Method 1: NVIDIA Model-Optimizer
- Step 1: Quantize the FP32 ONNX model to an int8/FP8/Int4 QDQ ONNX model using NVIDIA Model-Optimizer (see the sketch after this list)
- Step 2: Feed the quantized QDQ ONNX model directly to the TensorRT EP
- Demo: Model-Optimizer onnx_ptq
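A rough sketch of Step 1 using Model-Optimizer's Python API follows. The parameter names and calibration-data format here are assumptions based on the onnx_ptq example and may differ between Model-Optimizer releases, so verify against the demo:

```python
# Hedged sketch: quantize an FP32 ONNX model to int8 QDQ with NVIDIA Model-Optimizer.
# Parameter names follow the onnx_ptq example and may vary by version (assumption);
# file names are placeholders.
import numpy as np
from modelopt.onnx.quantization import quantize

quantize(
    onnx_path="model-fp32.onnx",
    quantize_mode="int8",                    # or "fp8" / "int4"
    calibration_data=np.load("calib.npy"),   # preprocessed calibration batches (placeholder file)
    output_path="model-int8.qdq.onnx",
)
```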
### Method 2: ONNX Runtime Quantization
- Step 1: Provide the FP32 ONNX model and a calibration table (a calibration-table sketch follows the configuration below)
- Step 2: Configure TensorRT EP INT8 calibration:

```rust
Config::default()
    .with_<module>_int8(true)
    .with_<module>_int8_calibration_table_name()
    .with_<module>_int8_use_native_calibration_table()
```
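For Step 1, a calibration table can be generated with ONNX Runtime's calibration utilities. The sketch below is an assumption-heavy outline: the file names and the model's input name are placeholders, and newer onnxruntime releases rename `compute_range()` to `compute_data()`, so check the ONNX Runtime TensorRT quantization examples for your version:

```python
# Hedged sketch: build a TensorRT calibration table with ONNX Runtime's calibration tools.
# File names and the input name "images" are placeholders; newer onnxruntime releases
# replace compute_range() with compute_data().
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    CalibrationMethod,
    create_calibrator,
    write_calibration_table,
)

class NpyCalibrationReader(CalibrationDataReader):
    def __init__(self, npy_path: str, input_name: str = "images"):
        batches = np.load(npy_path)  # shape (N, C, H, W), already preprocessed
        self._it = ({input_name: b[None].astype(np.float32)} for b in batches)

    def get_next(self):
        return next(self._it, None)

calibrator = create_calibrator(
    "model-fp32.onnx",
    op_types_to_calibrate=[],                   # empty list = calibrate all supported op types
    augmented_model_path="augmented.onnx",
    calibrate_method=CalibrationMethod.Entropy,
)
calibrator.collect_data(NpyCalibrationReader("calib.npy"))
write_calibration_table(calibrator.compute_range())  # writes the calibration table files for the TensorRT EP
```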