`decon`, but with Python API bindings.
[ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation