
Introduction

The Artificial Intelligence Evaluation Center (AIEC) has been established to promote localized AI evaluation and third-party certification in Taiwan, thereby strengthening the development of trusted AI within the industry. The Center will periodically publish benchmark evaluation results for language models. In addition to adopting indicators based on the Chinese Language and Social Studies sections of the national high school entrance examination, AIEC also incorporates evaluation criteria reflecting Taiwanese values, aligning with global trends in AI sovereignty. These benchmarks serve as key references for developing locally adapted models or fine-tuning international models.

Results are sorted by the region of the developing organization: light orange denotes European models, light blue U.S. models, light green local models, and light purple Chinese models.

Percentage values: scores above 50% are marked in green; scores below 50% are marked in pink.

✪ This test cycle includes 18 newly added models (3 small models and 15 large models).

  • Language Model Benchmark / Small Models (13B and below): please refer to the relevant files below.

  • Language Model Benchmark / Large Models (above 13B): please refer to the relevant files below.

Downloads:
2025年11月語言模型基準評測結果.ods (November 2025 Language Model Benchmark Evaluation Results)