A library for efficient LLM inference via low-bit quantization
FP8 per-tile scaled linear solver for consumer NVIDIA GPUs
JAX Scalify: end-to-end scaled arithmetic