CPU-reference-first Tabu Search Quadratic Knapsack solver with optional accelerator hooks
Shard-first late-interaction retrieval for ColBERT and ColPali style workloads with CPU/GPU modes, Triton MaxSim, BM25 hybrid search, durable CRUD/WAL, multimodal preprocessing, and base64-ready reference APIs.
High-performance late-interaction retrieval engine for on-prem AI. ColBERT/ColPali multi-vector search with Rust fused MaxSim, Triton GPU kernels, ROQ quantization, LEMUR routing, WAL-backed CRUD, and a FastAPI server — single machine, CPU or GPU.