DS-STAR: A state-of-the-art versatile data science agent

...we present DS-STAR, a new agent designed to solve data science problems. DS-STAR introduces three key innovations: (1) a data file analysis module that automatically extracts context from varied data formats, including unstructured ones; (2) a verification stage in which an LLM-based judge assesses the plan’s sufficiency at each step; and (3) a sequential planning process that iteratively refines the initial plan based on feedback. This iterative refinement allows DS-STAR to handle complex analyses that draw verifiable insights from multiple data sources. We demonstrate that DS-STAR achieves state-of-the-art performance on challenging benchmarks such as DABStep, KramaBench, and DA-Code. It performs especially well on tasks involving diverse, heterogeneous data files.
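
The three components described above can be pictured as a simple control loop: describe the files, draft a plan, ask a judge whether the plan is sufficient, and refine it if not. The sketch below is only an illustration of that loop under stated assumptions, not DS-STAR's actual implementation. Every name in it (`solve`, `Result`, `describe`, `initial_plan`, `judge`, `refine`, `max_rounds`) is hypothetical, the toy callables stand in for what would be LLM calls, and passing the judge's feedback directly to the refinement step is an assumption about how the feedback is used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Plan = List[str]


@dataclass
class Result:
    context: List[str]  # one description per data file
    plan: Plan          # the plan when the loop stopped
    rounds: int         # number of judge calls made
    sufficient: bool    # True only if the judge accepted the plan


def solve(
    question: str,
    files: Dict[str, str],
    describe: Callable[[str, str], str],
    initial_plan: Callable[[str, List[str]], Plan],
    judge: Callable[[str, List[str], Plan], Tuple[bool, str]],
    refine: Callable[[Plan, str], Plan],
    max_rounds: int = 5,
) -> Result:
    """Hypothetical plan / verify / refine loop (all callables are assumptions)."""
    # (1) File analysis: turn each data file into a short textual description.
    context = [describe(name, content) for name, content in files.items()]
    # (3) Sequential planning starts from an initial plan.
    plan = initial_plan(question, context)
    for round_no in range(1, max_rounds + 1):
        # (2) Verification: the judge decides whether the plan is sufficient.
        ok, feedback = judge(question, context, plan)
        if ok:
            return Result(context, plan, round_no, True)
        # Not sufficient yet: refine the plan using the judge's feedback.
        plan = refine(plan, feedback)
    # Round budget exhausted; the last refinement was never verified.
    return Result(context, plan, max_rounds, False)


# Toy stand-ins for the LLM-backed components, for illustration only.
def toy_describe(name: str, content: str) -> str:
    return f"{name}: {len(content.splitlines())} lines"


def toy_initial_plan(question: str, context: List[str]) -> Plan:
    return ["load the files"]


def toy_judge(question: str, context: List[str], plan: Plan) -> Tuple[bool, str]:
    # Pretend a plan is sufficient once it has at least three steps.
    if len(plan) >= 3:
        return True, ""
    return False, f"add step {len(plan) + 1}"


def toy_refine(plan: Plan, feedback: str) -> Plan:
    return plan + [feedback]


result = solve(
    "How many lines are there in total?",
    {"fees.json": "{}\n", "notes.md": "a\nb\n"},
    toy_describe,
    toy_initial_plan,
    toy_judge,
    toy_refine,
)
print(result.plan, result.rounds, result.sufficient)
```

Keeping the components as plain callables makes the control flow easy to test without a model in the loop. In a real system each one would presumably wrap a model call plus code execution.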
