Lightweight LLM Reproduction and Multimodal Extension
End-to-end small-parameter LLM training, fine-tuning, RL training, inference, and multimodal extension.
This project reproduces a compact decoder-only large language model pipeline end to end, from data preparation through training to deployment. It covers tokenizer construction, pretraining data organization, supervised fine-tuning, LoRA adaptation, reinforcement learning training, sampling, KV caching, and inference deployment.
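As an illustration of the sampling stage listed above, here is a minimal sketch of temperature and top-k sampling in plain Python. The function name `sample_next_token` and its parameters are hypothetical and are not taken from this project's code. A real implementation would normally operate on tensors rather than Python lists.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, rng=None):
    """Sample one token id from raw logits (hypothetical helper, not the project's API)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if rng is None:
        rng = random.Random()

    # Rank token ids by logit, highest first, and keep only the top_k candidates.
    ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        ids = ids[:max(1, top_k)]

    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [logits[i] / temperature for i in ids]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Draw one candidate id according to the resulting distribution.
    return rng.choices(ids, weights=probs, k=1)[0]
```

With `top_k=1` the function reduces to greedy decoding and always returns the index of the largest logit. Lower temperatures concentrate probability on the highest-scoring candidates, while higher temperatures flatten the distribution.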
The project also extends the text-only pipeline toward speech-vision-language multimodal modeling, including visual-encoder-to-LLM alignment, image-text instruction data organization, and prototypes for visual question answering (VQA) and image captioning.
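One common way to align a visual encoder with an LLM is to project patch features into the LLM's embedding width and prepend them to the text embeddings. The sketch below assumes that approach. The source does not state which alignment method this project uses, and all names here (`make_projector`, `project_patches`, `build_input_sequence`) are hypothetical. Plain Python lists stand in for tensors to keep the example self-contained.

```python
import random


def make_projector(vision_dim, llm_dim, seed=0):
    """Create a randomly initialised vision_dim x llm_dim weight matrix (assumed linear projector)."""
    rng = random.Random(seed)
    scale = vision_dim ** -0.5
    return [[rng.gauss(0.0, scale) for _ in range(llm_dim)] for _ in range(vision_dim)]


def project_patches(patch_features, weight):
    """Map each patch feature vector (length vision_dim) to an LLM-width vector."""
    llm_dim = len(weight[0])
    return [
        [sum(x * weight[i][j] for i, x in enumerate(patch)) for j in range(llm_dim)]
        for patch in patch_features
    ]


def build_input_sequence(image_tokens, text_embeddings):
    """Prepend projected image tokens to the text token embeddings."""
    return image_tokens + text_embeddings


# Toy shapes: 3 image patches of width 4, LLM embedding width 6, 2 text tokens.
patches = [[0.1 * (p + d) for d in range(4)] for p in range(3)]
text = [[0.0] * 6 for _ in range(2)]
weight = make_projector(vision_dim=4, llm_dim=6)
sequence = build_input_sequence(project_patches(patches, weight), text)
```

In this toy setup the combined sequence holds five vectors of width six, which the decoder would then consume as ordinary input embeddings. In training, the projector weights would be learned on image-text data rather than left at their random initial values.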