Set-of-Mark detection pipeline for macOS — Apple Vision, YOLO11, and VLM on MLX. Transforms screenshots into numbered element maps and structured JSON manifests.