🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
Slicing a PyTorch Tensor Into Parallel Shards
Easy way to efficiently run 100B+ language models without high-end GPUs