PyPI Stats

Web Agents Python Packages

Python packages with the GitHub topic web-agents. Sorted by relevance, with stars and monthly downloads.
ServiceNow
agentlab

AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and reproducibility.

3K 574 112
OSU-NLP-Group
uground-demo-test

[ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents

1K 312 18
reacher-z
clawbench-eval

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

1K 162 10
reacher-z
claw-harness

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

1K 162 10
reacher-z
clawbench-harness

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

1K 162 10
reacher-z
nail-clawbench

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

1K 162 10
reacher-z
openclawbench

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

1K 162 10
reacher-z
clawbench-cli

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

773 162 10
reacher-z
claw-ai

ClawBench: Can AI Agents Complete Everyday Online Tasks? (alias of claw-bench)

252 162 10
reacher-z
task-harness

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

246 162 10
reacher-z
claw-eval

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

235 162 10
reacher-z
claw-agent

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

226 162 10
reacher-z
mcq-bench

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

216 162 10
reacher-z
harness-hub

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

215 162 10
reacher-z
life-bench

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

212 162 10
reacher-z
nail-eval

ClawBench: Can AI Agents Complete Everyday Online Tasks? (alias of claw-bench)

211 162 10
reacher-z
r2-harness

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

210 162 10
reacher-z
r2agent

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

210 162 10
reacher-z
nail-agent

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

209 162 10
reacher-z
nail-group

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

208 162 10
ServiceNow
doomarena-taubench

DoomArena is a Framework for Testing AI Agents Against Evolving Security Threats

208 58 6
reacher-z
nail-bench

ClawBench: Can AI Agents Complete Everyday Online Tasks? (alias of claw-bench)

207 162 10
reacher-z
harnessos

Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

206 162 10
reacher-z
everyday-bench

ClawBench: Can AI Agents Complete Everyday Online Tasks? (alias of claw-bench)

202 162 10
Data from PyPI, GitHub, ClickHouse, and BigQuery.