I am a researcher on the Pre-training Team at Tencent Hunyuan. My current work focuses on improving the training stability of LLMs and exploring efficient pre-training.
I received both my B.S. and M.S. degrees from Beihang University, under the supervision of Prof. Qinghong Yang. My research interests lie in the interpretability of LLMs and efficient Transformer training. Outside of work, I share my life with a cat named Caiyuan.

Hunyuan Team
Technical Report 2025
We introduce Hunyuan-TurboS, a novel large hybrid Transformer-Mamba model featuring an adaptive long-short chain-of-thought mechanism that dynamically switches between rapid responses and deep reasoning modes, achieving state-of-the-art performance with significantly improved efficiency.

Hunyuan Team
Technical Report 2024
We introduce Hunyuan-Large, the largest open-source Transformer-based mixture-of-experts model, with 389 billion total parameters and 52 billion activated parameters, capable of handling up to 256K tokens, outperforming LLaMA3.1-70B across a wide range of benchmarks.

Zhongzhi Chen, Guang Liu, Bo-Wen Zhang, Fulong Ye, Qinghong Yang, Ledell Wu
Findings of the Association for Computational Linguistics (ACL) 2023
We present a simple and effective method to train a strong bilingual/multilingual multimodal representation model by replacing the text encoder in CLIP with XLM-R, achieving new state-of-the-art results on ImageNet-CN, Flickr30k-CN, COCO-CN and XTD.

Zhongzhi Chen, Xingwu Sun, Xianfeng Jiao, Fengzong Lian, Zhanhui Kang, Di Wang, Cheng-Zhong Xu
AAAI Conference on Artificial Intelligence (AAAI) 2024
We introduce Truth Forest, a method that enhances truthfulness in LLMs by uncovering hidden truth representations using multi-dimensional orthogonal probes, improving Llama-2-7B truthfulness from 40.8% to 74.5% on TruthfulQA.