
Latest News

The Artificial Intelligence Evaluation Center (AIEC) Releases Latest Language Model Evaluation Results, Strengthening Localized AI Capabilities and Trustworthy Development

The Artificial Intelligence Evaluation Center (AIEC), under the Ministry of Digital Affairs (MODA), released its latest evaluation results for domestic and international open-source language models on May 1. The assessment utilized three key metrics highly relevant to Taiwan—"Taiwanese Values," "General Scholastic Ability Test (GSAT) Chinese," and "GSAT Social Studies"—to examine the proficiency of current AI models in Traditional Chinese comprehension, local socio-cultural context, and domestic knowledge.

In this latest release, Accelerate Private Machine Intelligence Company (APMIC) became the first domestic developer to voluntarily authorize the public disclosure of its test results. This milestone signifies a strategic shift in Taiwan's AI industry, moving beyond a pure focus on functionality and performance toward a development model rooted in transparency, reliability, and verifiability.

While international Large Language Models (LLMs) such as Claude, Gemini, and ChatGPT have demonstrated remarkable capabilities in writing and translation, they occasionally struggle with nuances specific to Taiwan's legal systems, educational content, and cultural norms.

The Administration for Digital Industries (ADI) stated that the primary goal of the AIEC's localized evaluation is to provide a clear understanding of how AI models perform within a Taiwanese context. These results serve a dual purpose:

  • For Developers: Identifying specific areas for improvement and refinement.
  • For Enterprises & Users: Providing a concrete reference point when selecting AI products for professional or personal use.

Since October 2025, the AIEC has consistently published benchmark results, completing evaluations for 131 models to date. The findings highlight a crucial distinction: linguistic fluency does not equate to local understanding. Speaking Chinese is not the same as understanding Taiwan.

"Taiwan needs more than just 'smarter' AI; we need AI that understands local needs and responds to local contexts," the ADI emphasized.

The ADI further noted that APMIC's decision to lead in transparency reflects an industry trend where third-party evaluation is viewed as an essential part of product development. This openness not only helps users understand model capabilities but also bolsters corporate credibility in business partnerships, government procurement, and international markets.

The ADI encourages more domestic model developers, system integrators, and AI service providers to participate in testing and disclose their results. This transparency fosters a positive feedback loop:

  1. Validation: Great models are proven through rigorous testing rather than just marketing.
  2. Visibility: High-quality products gain the market recognition they deserve.
  3. Competitiveness: Verified reliability enhances Taiwan's standing in the global AI landscape.

Moving forward, the AIEC will continue to release test results for open-source models and strengthen its third-party evaluation capabilities. By helping the industry navigate AI governance trends and product optimization, the AIEC ensures that Taiwan is not merely a consumer of AI, but a driving force for trustworthy, verifiable, and deployable AI development.

News Source: Administration for Digital Industries (ADI) - The Artificial Intelligence Evaluation Center (AIEC) Releases Latest Language Model Evaluation Results, Strengthening Localized AI Capabilities and Trustworthy Development