Projects
A showcase of selected research, open-source, and industrial AI/Robotics projects.
Natural Language-Based Object Understanding for Robots
Building robot perception systems that ground natural language in visual environments to enable intuitive human-robot interaction and reasoning.
- Constructed a contextual Knowledge Graph and RAG-based Vector Database.
- Generated QA pairs based on video shots and interaction history.
- Developed a time-sensitive hybrid RAG QA system for robot reasoning.
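One way to picture "time-sensitive hybrid" retrieval is a score that blends dense similarity, keyword overlap, and a recency decay. The sketch below is illustrative only: the function names, the weights, and the one-hour half-life are all hypothetical and are not taken from the project itself.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_overlap(query, text):
    """Fraction of query tokens that also appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def recency(age_seconds, half_life=3600.0):
    """Exponential decay: the score halves every `half_life` seconds (assumed value)."""
    return 0.5 ** (age_seconds / half_life)

def hybrid_score(query, query_vec, doc, now,
                 w_dense=0.5, w_sparse=0.3, w_time=0.2):
    """Weighted sum of dense, sparse, and recency terms (weights are assumptions)."""
    return (w_dense * cosine(query_vec, doc["vec"])
            + w_sparse * keyword_overlap(query, doc["text"])
            + w_time * recency(now - doc["timestamp"]))

def retrieve(query, query_vec, docs, now, k=3):
    """Return the k highest-scoring documents."""
    ranked = sorted(docs,
                    key=lambda d: hybrid_score(query, query_vec, d, now),
                    reverse=True)
    return ranked[:k]
```

With two otherwise identical memory entries, the more recent one ranks first, which is the behavior a robot answering questions about its latest interactions would likely want.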
SAMIF: Semantic-Aware Mutual Information Factorized Learning
Researched cross-modality shared and unique information factorization strategies to improve semantic segmentation across aligned data modalities.
- Designed, experimented with, and validated cross-modality factorized learning frameworks.
- Achieved state-of-the-art performance for multimodal semantic segmentation (2-3% improvement).
- Built a proof of concept that improved IoU (0.3% ↑) and F1 score (23% ↑) in action localization.
Small Multimodal LLM & Audio Tower Training
Developed small-scale multimodal language models with audio and vision towers, optimized through policy-optimization training.
- Designed an audio tower and LLM cross-modal continual learning pipeline.
- Achieved a 20.7% improvement by training with GRPO on synthetic data with Self-Reflection.
- Served image- and video-based QA demo applications internally.
Fast-Reasoning Modality-Aware Policy Optimization
Collaborated on online policy optimization methods that apply modality-aware weighting in Group Relative Policy Optimization (GRPO).
- Proposed a loss update and reward mechanism that applies modality-aware online weighting in GRPO.
- Improved answer accuracy after fast reasoning by predicting missed and hallucinated cases.
- Co-authored and submitted a paper on the resulting policy optimization framework.
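For readers unfamiliar with GRPO, the core idea is to score each sampled response relative to the other responses in its group. The sketch below computes those group-relative advantages and then applies a per-modality scale factor. The scaling step, the function names, and the weight table are hypothetical illustrations of what "modality-aware weighting" might look like; they are not the method from the submitted paper.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

def modality_weighted_advantages(rewards, modalities, weights):
    """Scale each advantage by a weight looked up from its modality.

    `weights` maps a modality name to a scale factor; modalities that are
    missing from the table default to 1.0. This weighting scheme is an
    assumption made for illustration.
    """
    advantages = group_relative_advantages(rewards)
    return [weights.get(m, 1.0) * a for a, m in zip(advantages, modalities)]
```

For example, `modality_weighted_advantages([1.0, 0.0, 1.0, 0.0], ["audio", "text", "text", "audio"], {"audio": 2.0})` doubles the magnitude of the advantages for the two audio samples while leaving the text samples unchanged.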
Optimized Body & Hand Pose Estimation
Developed state-of-the-art pose estimation algorithms and optimized them for edge-device integration.
- Trained 2D & 3D pose estimation models using Quantization-Aware Training (QAT).
- Designed RecycleNet, which re-trains synthesized hand mesh models.
- Successfully deployed body pose estimation models to LG Smart TVs.
Golf Pose Estimation & Action Localization
Developed and optimized golf swing pose estimation models and temporal action localization algorithms for portable smart devices.
- Deployed golf pose estimation models onto AI assistant golf devices.
- Created action localization pipelines and dedicated annotation tools for internal dataset curation.
Early Development Career
Frontend & full-stack development projects built before transitioning to AI research.
Deer — Electric Scooter Sharing App
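Contributed to frontend development of the mobile app for Deer, an electric scooter sharing service. Implemented the user map view and real-time scooter lookup and reservation features.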
da Vinci — Link Collection Service
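Member of the development and planning team at an in-house startup building a link collection service. Owned the Chrome extension (link saving) and the mobile app/web frontend, and contributed to backend development.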
KLUE — Course Review Platform
Course review site for Korea University. Contributed to planning, the frontend redesign, and admin page development.
SubjectArea — Shopping & News App
Shopping and news app. Contributed to admin page, backend, and server development.
StayTuned
Built frontend web tools for an e-commerce assistant.
BodyApp (슬기로on)
Built a demo of a tablet-only app for teaching anatomy and body structure, with anatomical visualizations and physiological descriptions.