Test Results of the October 2025 Open Source Models

Introduction

The Artificial Intelligence Evaluation Center (AIEC) has been established to promote localized AI evaluation and third-party certification in Taiwan, thereby strengthening the development of trusted AI within the industry. The Center will periodically publish benchmark evaluation results for language models. In addition to adopting indicators based on the Chinese Language and Social Studies sections of the national high school entrance examination, AIEC also incorporates evaluation criteria reflecting Taiwanese values, aligning with global trends in AI sovereignty. These benchmarks serve as key references for developing locally adapted models or fine-tuning international models.

✪ Models are sorted by the region of the developing organization and color-coded: light orange represents European models, light blue represents U.S. models, light green represents local models, and light purple represents Chinese models.

✪ Explanation of percentage values: figures above 50% are marked in green; figures below 50% are marked in pink.
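For readers who want to reproduce the percentage highlighting in their own analysis, the rule above can be sketched as follows. This is a minimal illustration, not AIEC's tooling: the function name `highlight_color` is hypothetical, and because the legend does not say how a value of exactly 50% is marked, the sketch assumes it is left unmarked.

```python
def highlight_color(percent: float) -> str:
    """Return the legend color for a score given in percent (0-100)."""
    if percent > 50:
        return "green"   # figures above 50% are marked in green
    if percent < 50:
        return "pink"    # figures below 50% are marked in pink
    # Assumption: the legend does not cover exactly 50%, so leave it unmarked.
    return "unmarked"


if __name__ == "__main__":
    # Illustrative values only; these are not results from the worksheets.
    for score in (72.5, 50.0, 31.0):
        print(score, highlight_color(score))
```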

Language Model Benchmark / Small Models (13B and below)

For the small-model results, please refer to the “Small” worksheet in either of the files below: “Test Results of the October 2025 OpenSource Models(Large and Small Models).ods” or “Test Results of the October 2025 OpenSource Models(Large and Small Models).xlsx”.

Language Model Benchmark / Large Models (above 13B)

For the large-model results, please refer to the “Large” worksheet in either of the files below: “Test Results of the October 2025 OpenSource Models(Large and Small Models).ods” or “Test Results of the October 2025 OpenSource Models(Large and Small Models).xlsx”.

Downloads:

Test Results of the October 2025 OpenSource Models(Large and Small Models).ods

Test Results of the October 2025 OpenSource Models(Large and Small Models).xlsx