Hunyuan-A13B: Revolutionizing Language Models with Dynamic Reasoning

AI News · Published 11 hours ago · niko

Tencent recently open-sourced its new language model, Hunyuan-A13B. The model stands out for its dynamic reasoning method, which adapts the depth of reasoning to the difficulty of the query.

Real-Time Reasoning Adjustment: Hunyuan-A13B can adjust its depth of reasoning in real time. For straightforward queries, it operates in a fast mode and responds rapidly. For complex problems, it starts a multi-step deep reasoning process. Users can also control this manually with commands such as “/think” and “/no_think”.
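
As a rough illustration of the manual switch, the sketch below prepends one of the two commands to a user prompt. The helper name `with_reasoning_mode` is hypothetical, and placing the command at the start of the message is an assumption; the article only names the commands themselves.

```python
def with_reasoning_mode(prompt: str, deep: bool) -> str:
    """Prefix a prompt with a reasoning command.

    Assumption: the command goes at the start of the user message.
    The command strings "/think" and "/no_think" come from the article;
    everything else here is illustrative.
    """
    command = "/think" if deep else "/no_think"
    return f"{command} {prompt}"


# A simple lookup can skip deep reasoning; a proof can request it.
quick = with_reasoning_mode("What is the capital of France?", deep=False)
careful = with_reasoning_mode("Prove that sqrt(2) is irrational.", deep=True)
```

The resulting strings would then be sent to the model as ordinary user messages through whatever client is in use.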

MoE Architecture and Efficiency: The model adopts a Mixture of Experts (MoE) architecture. Of its 80 billion total parameters, only 13 billion are active during inference, which improves efficiency. It also supports a 256,000-token context window, greatly enhancing its long-text handling.
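
To make the idea of "only some parameters are active" concrete, here is a toy sketch of top-k expert routing, a common way to build MoE layers. It is not Tencent's implementation: the gating scheme, the function names, and the tiny experts are all assumptions chosen for illustration. Only the 80 billion and 13 billion figures come from the article.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k):
    """Combine the outputs of only the selected experts; the rest never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


# Figures from the article: 13B of 80B parameters are active at inference.
ACTIVE_FRACTION = 13 / 80  # about 16% of the parameters

# Toy experts (hypothetical): each just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
y = moe_forward(10.0, experts, gate_scores=[0.0, 2.0, 1.0, -1.0], k=2)
```

With `k=2`, only two of the four toy experts are evaluated per input, which is the same principle that lets a large MoE model run with a fraction of its parameters active.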

Training Data: Hunyuan-A13B was trained on 20 trillion tokens. To improve its performance on scientific tasks, Tencent gathered 250 billion tokens from STEM fields, covering material from high-school to university level.

Performance in Exams: Tencent claims that the “Hunyuan-A13B-Instruct” version rivals leading models. On the 2024 American Invitational Mathematics Examination (AIME), Hunyuan-A13B achieved an accuracy of 87.3%, surpassing OpenAI o1’s 74.3%.
