PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics.
ianarawjo
promptstats

Statistical analysis methods for comparing prompt and model performance in LLM evaluations.

1K 101 2
ianarawjo
evalstats

Statistical analysis methods for comparing prompt and model performance in LLM evaluations.

1K 101 2
ankurpand3y
judicator

Who evaluates the evaluator? Judicator audits LLM-as-a-Judge systems for 7 documented bias types. Zero config. Works with any LLM.

973 5 1
prompt-foundry
prompt-foundry-python-sdk

The prompt engineering, prompt management, and prompt evaluation tool for Python

438 8 0
Data from PyPI, GitHub, ClickHouse, and BigQuery