Portrait
Zhongzhi Chen
Researcher, Tencent Hunyuan
About Me

I am currently a researcher on the Pre-training Team at Tencent Hunyuan. My current work focuses on improving training stability of LLMs and exploring efficient Pre-training.

I received both my B.S. and M.S. degrees from Beihang University, under the supervision of Prof. Qinghong Yang. My research interests lie in the interpretability of llms and efficient Transformer training. Outside of work, I share my life with a cat named Caiyuan.

Education
  • Beihang University
    Beihang University
    M.S. in Software Engineering
    Sep. 2021 - Jun. 2024
  • Beihang University
    Beihang University
    B.S. in Software Engineering
    Sep. 2017 - Jun. 2021
Experience
Honors & Awards
  • National Graduate Scholarship
    2023
  • Academic Excellence Scholarship
    2022
  • Outstanding Undergraduate Graduate Award
    2021
  • Outstanding Student Leader Award
    2020
  • National College Innovation and Entrepreneurship Competition Award
    2020
News
2026
๐ŸŽ‰ Hy3-preview is released! Read the blog post here.
2025
๐Ÿš€ The 7B model I trained is released! Check it out on Hugging Face.
Selected Publications
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

Hunyuan Team

Technical Report 2025

We introduce Hunyuan-TurboS, a novel large hybrid Transformer-Mamba model featuring an adaptive long-short chain-of-thought mechanism that dynamically switches between rapid responses and deep reasoning modes, achieving state-of-the-art performance with significantly improved efficiency.

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

Hunyuan Team

Technical Report 2025

We introduce Hunyuan-TurboS, a novel large hybrid Transformer-Mamba model featuring an adaptive long-short chain-of-thought mechanism that dynamically switches between rapid responses and deep reasoning modes, achieving state-of-the-art performance with significantly improved efficiency.

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Hunyuan Team

Technical Report 2024

We introduce Hunyuan-Large, the largest open-source Transformer-based mixture-of-experts model, with 389 billion total parameters and 52 billion activated parameters, capable of handling up to 256K tokens, outperforming LLaMA3.1-70B across a wide range of benchmarks.

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Hunyuan Team

Technical Report 2024

We introduce Hunyuan-Large, the largest open-source Transformer-based mixture-of-experts model, with 389 billion total parameters and 52 billion activated parameters, capable of handling up to 256K tokens, outperforming LLaMA3.1-70B across a wide range of benchmarks.

AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities
AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities

Zhongzhi Chen, Guang Liu, Bo-Wen Zhang, Fulong Ye, Qinghong Yang, Ledell Wu

Findings of the Association for Computational Linguistics (ACL) 2023

We present a simple and effective method to train a strong bilingual/multilingual multimodal representation model by altering the text encoder in CLIP with XLM-R, achieving new state-of-the-art on ImageNet-CN, Flicker30k-CN, COCO-CN and XTD.

AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities

Zhongzhi Chen, Guang Liu, Bo-Wen Zhang, Fulong Ye, Qinghong Yang, Ledell Wu

Findings of the Association for Computational Linguistics (ACL) 2023

We present a simple and effective method to train a strong bilingual/multilingual multimodal representation model by altering the text encoder in CLIP with XLM-R, achieving new state-of-the-art on ImageNet-CN, Flicker30k-CN, COCO-CN and XTD.

Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning
Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning

Zhongzhi Chen, Xingwu Sun, Xianfeng Jiao, Fengzong Lian, Zhanhui Kang, Di Wang, Cheng-Zhong Xu

AAAI Conference on Artificial Intelligence (AAAI) 2024

We introduce Truth Forest, a method that enhances truthfulness in LLMs by uncovering hidden truth representations using multi-dimensional orthogonal probes, improving Llama-2-7B truthfulness from 40.8% to 74.5% on TruthfulQA.

Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning

Zhongzhi Chen, Xingwu Sun, Xianfeng Jiao, Fengzong Lian, Zhanhui Kang, Di Wang, Cheng-Zhong Xu

AAAI Conference on Artificial Intelligence (AAAI) 2024

We introduce Truth Forest, a method that enhances truthfulness in LLMs by uncovering hidden truth representations using multi-dimensional orthogonal probes, improving Llama-2-7B truthfulness from 40.8% to 74.5% on TruthfulQA.

Full list on Google Scholar