
A living, open-source community to track and advance model agentic capabilities
EvalSys is an evolving open-source community that tracks and advances the agentic capabilities of models. We continuously release benchmarks, frameworks, datasets, toolchains, and models to help push the field forward, and we welcome collaborators from all backgrounds to join and contribute!