I graduated from SYSU with a master's degree.
I am currently working in the Qwen team, responsible for Qwen's post-training and distillation.
In my free time, I like to write technical blogs on [WeChat Official Account: YeungNLP] and [Zhihu: 红雨瓢泼].
🔭 Experiences:
- Alibaba Qwen Team, responsible for Qwen's post-training and distillation. (from 2024-07 to now)
- Shopee, responsible for building NLP capabilities for customer service. (from 2022-04 to 2024-05)
- Tencent, responsible for building NLP capabilities for product understanding. (from 2021-06 to 2022-04)
⚙ Here are some of my public projects:
| Project | Description | Code |
|---|---|---|
| Firefly | One-stop training for LLMs. Some achievements: 1. firefly-llama2-13b ranked 3rd among all 13B models on the Open LLM Leaderboard, only 0.5 points behind 1st place. 2. firefly-llama-30b, trained on a single V100, ranked 10th among all 30B models on the Open LLM Leaderboard. 3. firefly-baichuan-13b has over 1.63 million downloads. 4. firefly-qwen1.5-en-7b-dpo scores 7.21 points higher than the official chat model. 5. firefly-gemma-7b scores 9.37 points higher than the official chat model. | |
| GPT2-chitchat | Chinese GPT2 for chitchat | |
| Firefly-LLaMA2-Chinese | Chinese Llama2 with an efficient and effective training method. | |
| LongQLoRA | An efficient and effective method for extending the context length of Llama2 to 8192 on a single V100. Technical Report | |
| CPM | Chinese composition model based on CPM | |
| CLIP-Chinese | Chinese CLIP model trained with 1.4 million image-text pairs | |
| ClipCap-Chinese | Chinese image caption model based on CLIP and Mengzi | |
| OFA-Chinese | Chinese multi-modal unified pre-training model | |
| LLMPruner | Prune the vocabulary of LLMs to save memory in training. | |



