22 dependents

| Package description | Downloads/month |
| --- | --- |
| Open-source library for scalable, reproducible evaluation of AI models and bench... | 37K |
| A framework for evaluating language models - packaged by NVIDIA | 14K |
| Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) | 2K |
| LiveCodeBench - packaged by NVIDIA | 2K |
| Open AI simple evals - packaged by NVIDIA | 1K |
| IFBench: A challenging benchmark for precise instruction following | 1K |
| The tau2 package - packaged by NVIDIA | 698 |
| MTBench evaluator - packaged by NVIDIA | 320 |
| BigCode Evaluation Harness - packaged by NVIDIA | 308 |
| OpenCompass VLM Evaluation Kit - packaged by NVIDIA | 286 |
| Content safety evaluation tool - packaged by NVIDIA | 270 |
| the LLM vulnerability scanner | 242 |
| MMATH - packaged by NVIDIA | 212 |
| Humanity's Last Exam adaptation - packaged by NVIDIA | 212 |
| A benchmark that challenges language models to code solutions for scientific pro... | 200 |
| Evaluating tool-augmented LLMs in a conversational setting - packaged by NVIDIA | 195 |
| The Triton Inference Server provides an optimized cloud and edge inferencing sol... | 189 |
| Library for evaluating Large Language Models on CUDA code | 173 |
| Holistic Evaluation of Language Models (HELM) is an open source Python framework... | 137 |
| Professional domain benchmark for evaluating LLMs on Physics PhD, Chemistry PhD,... | 127 |
| Long context evaluations - packaged by NVIDIA NeMo Evaluator | 93 |
| Artificial Analysis Long Context Reasoning (AA-LCR) adaptation - packaged by NVI... | 72 |