BixBench Benchmark Leaderboard

BixBench is a benchmark for real-world bioinformatics and computational biology data analysis. It evaluates AI models on multi-step scientific workflows that require code execution, statistical reasoning, and biological domain knowledge to interpret experimental data.

Leaderboard

Top model on the BixBench Benchmark Leaderboard (score from public evaluations).

  1. GPT-5.5: 80.5%

Models tracked

Models with bixbench in their evaluation profile.

  • No models linked yet.
