CV | Xiaoqiang Shi

Contact Information

Name	Xiaoqiang Shi
Professional Title	PhD Student
Email	shixiaoqiang21@mails.ucas.ac.cn

Professional Summary

PhD student in Computer Application Technology at the University of Chinese Academy of Sciences, working on multimodal large language models, agents, computer vision, and 3D Gaussian avatar reconstruction.

Education

2021 - Present

China
PhD candidate

University of Chinese Academy of Sciences

Computer Application Technology
- Enrolled in a combined master's-PhD program.
- Research interests include multimodal LLMs, agents, computer vision, and 3D Gaussian avatar reconstruction.
2017 - 2021

Liaoning, China

Bachelor of Engineering

University of Science and Technology Liaoning

Mechanical Design, Manufacturing and Automation

Research Experience

Publications

2026

3DGA: 3D Avatar Animation from Monocular Video via Deformable Gaussian Splatting

Pattern Recognition

Co-first-author work on monocular-video 3D avatar animation with deformable Gaussian Splatting. SCI Q1 Top, CCF B.
2023

BSSNet: A Real-Time Semantic Segmentation Network for Road Scenes Inspired from AutoEncoder

IEEE Transactions on Circuits and Systems for Video Technology

First-author work on real-time road-scene semantic segmentation with a focus on accuracy, speed, and model complexity. SCI Q1 Top, CCF B.
2026

CANVAS: Any-Shot Animatable 3D Gaussian Head Avatars from 1-K Images

Submitted to ACM Multimedia 2026

First-author work on any-shot animatable 3D Gaussian head avatars.
2026

HiFiAvatar: Prior-Guided Gaussian Appearance Learning for High-Fidelity Animatable Head Avatars from Monocular Video

Submitted to ACM Multimedia 2026

First-author work on high-fidelity animatable head avatars from monocular video.
2026

3DGA: Topology-Aware 3D Gaussian Head Avatars with Barycentric Parameterization and Layered Densification

Submitted to IEEE Transactions on Image Processing

First-author work on topology-aware 3D Gaussian head avatars. Submitted to an SCI Q1 Top / CCF A venue.

Projects

Lightweight LLM Reproduction and Multimodal Extension

Reproduced the full pipeline for a small-parameter decoder-only LLM, from pretraining through SFT/LoRA fine-tuning, reinforcement learning, and inference, then extended it to multimodal input.
- Covered tokenizer, data construction, training, fine-tuning, RL training, sampling, KV cache, and deployment.
- Extended the pipeline toward speech-vision-language multimodal modeling.
Coding Agent Prototype

Built a code-oriented LLM agent prototype that closes the loop from understanding the user's task to editing files, executing commands, verifying results, and repairing errors.
- Designed task parsing, context management, tool calling, and result feedback workflows.
- Supported code understanding, modification, validation, and iterative debugging.
3D Gaussian Human and Head Avatar Reconstruction

Developed a series of 3D Gaussian avatar methods for animatable human/head reconstruction and high-fidelity rendering.
- Worked on deformable Gaussian representations, topology constraints, barycentric parameterization, layered densification, and prior-guided appearance learning.
- Related work includes 3DGA, CANVAS, and HiFiAvatar.
BSSNet Real-Time Road Scene Semantic Segmentation

Designed a real-time semantic segmentation network for road scenes inspired by autoencoder structures.
- Balanced segmentation accuracy, inference speed, and model complexity.
- Published in IEEE TCSVT.

Skills

Large Language Models (Advanced): decoder-only Transformer, tokenizer, pretraining, SFT, LoRA, RL training, KV cache, sampling

Multimodal Models and Agents (Advanced): vision-language alignment, VQA, captioning, tool calling, task planning, memory, multi-turn interaction

Computer Vision and 3D Reconstruction (Advanced): semantic segmentation, object detection, 3D Gaussian Splatting, animatable avatars, high-fidelity rendering

Honors and Service

Languages

Chinese: Native

English: Professional working proficiency
