✪ Color Coding by Developer Region: Light orange marks European models, light blue marks U.S. models, light green marks domestic models, and light purple marks Chinese models.
✪ Percentage Labeling: Values above 50% are highlighted in green; values below 50% are highlighted in pink.
- Language Model Benchmarks / Small Models (13B and below)
- Language Model Benchmarks / Large Models (above 13B)