Unified benchmarking and profiling framework for the JAX scientific ML ecosystem. It covers timing, GPU and energy monitoring, FLOP counting, roofline analysis, statistical testing, regression detection, and CI integration.
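As an illustration of the roofline analysis mentioned above, here is a minimal sketch of the standard roofline bound in plain Python. It does not use this framework's API. The function name `attainable_flops` and the hardware figures are hypothetical, chosen only to show the calculation.

```python
def attainable_flops(peak_flops: float, peak_bandwidth: float,
                     flops: float, bytes_moved: float) -> float:
    """Roofline bound in FLOP/s: min(peak compute, bandwidth * arithmetic intensity).

    peak_flops     -- hardware peak compute rate, in FLOP/s
    peak_bandwidth -- hardware peak memory bandwidth, in bytes/s
    flops          -- floating-point operations performed by the kernel
    bytes_moved    -- bytes the kernel moves to and from memory
    """
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    intensity = flops / bytes_moved  # arithmetic intensity, in FLOP/byte
    return min(peak_flops, peak_bandwidth * intensity)


# Hypothetical device: 1e13 FLOP/s peak compute, 1e12 bytes/s peak bandwidth.
PEAK_FLOPS = 1e13
PEAK_BANDWIDTH = 1e12

# Low-intensity kernel (2 FLOP/byte): limited by memory bandwidth.
memory_bound = attainable_flops(PEAK_FLOPS, PEAK_BANDWIDTH, flops=2e9, bytes_moved=1e9)

# High-intensity kernel (100 FLOP/byte): limited by peak compute.
compute_bound = attainable_flops(PEAK_FLOPS, PEAK_BANDWIDTH, flops=1e11, bytes_moved=1e9)
```

Comparing a kernel's measured throughput against this bound indicates whether it is limited by memory bandwidth or by compute.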
A new package that helps developers integration-test AI and LLM applications by validating structured outputs. It takes a user's test scenario or prompt as input, sends it to an LLM, and uses pattern