Summer Undergraduate Research Fellowship (SURF)

SURF Symposium Poster

Animation from Text and Pose

Research Problem

  • Animation is hard and only a select few can create animations today.

Motivation

  • Our goal is to enable those without professional software expertise to create animations.

Approach

  • Our pipeline generates animations from text and human poses by employing two key components:
    1. Motion-Diffusion-Model (MDM): generating human skeleton sequences based on text.
    2. Follow Your Pose: Pose-Guided Text-to-Video model.

Applications

  • Animation

Methodology

screen reader text
Figure 1: Pipeline of Animation from Text and Pose

MDM

  • Input text: human action text (e.g., “a person walks forward and stops”)

Pose-Guided Text-to-Video

  • Generate the video based on reality video.
  • Input text: Appearance and Background (e.g., “Iron man on the beach”)

Path Drawing Tools

  • Enable user interaction through mouse movement.
  • Implemented with OpenCV.

Results

screen reader text
Figure 2: Characters and Background Variation
screen reader text
Figure 3: Captain America kicks with his right leg in the snow
screen reader text
Figure 4: An astronaut is dancing on the moon & Path drawing tool

Conclusion

  • Generated video showed strong alignment with provided prompts and pose
  • Interaction with the users by providing path drawing tools

Future Works

  • Need to implement MDM for generating sequence of multiple human action within a single video.
  • More complex human action need to be generated.
  • Smoothness of the background should be improved.
Hyungjun Doh
Hyungjun Doh
Master’s Student

My research interests are Human-AI interaction and its practical applications, with a specific focus on Extended Reality, Task Guidance Systems, and AI-infused interfaces.