Summer Undergraduate Research Fellowship (SURF)

Animation from Text and Pose
Research Problem
- Animation is hard and only a select few can create animations today.
Motivation
- Our goal is to enable those without professional software expertise to create animations.
Approach
- Our pipeline generates animations from text and human poses by employing two key components:
- Motion-Diffusion-Model (MDM): generating human skeleton sequences based on text.
- Follow Your Pose: Pose-Guided Text-to-Video model.
Applications
- Animation
Methodology
MDM
- Input text: human action text (e.g., “a person walks forward and stops”)
Pose-Guided Text-to-Video
- Generate the video based on reality video.
- Input text: Appearance and Background (e.g., “Iron man on the beach”)
Path Drawing Tools
- Enable user interaction through mouse movement.
- Implemented with OpenCV.
Results
Conclusion
- Generated video showed strong alignment with provided prompts and pose
- Interaction with the users by providing path drawing tools
Future Works
- Need to implement MDM for generating sequence of multiple human action within a single video.
- More complex human action need to be generated.
- Smoothness of the background should be improved.