A lightweight Python tool that uses Optuna to tune llama.cpp flags: toward the optimal tok/s for your machine
A powerful shell powered by a locally running LLM (ideally Llama 3.x or Qwen 2.5)