THOR: Text to Human-Object Interaction Diffusion via Relation Intervention

1 ShanghaiTech University, 2 Shanghai AI Lab
THOR generates dynamic human-object interactions from textual descriptions.

Abstract

This paper addresses the challenging task of generating dynamic Human-Object Interactions from textual descriptions, named Text2HOI. While most existing works assume interactions with limited body parts or static objects, our task involves addressing the variation in human motion, the diversity of object shapes, and the semantic vagueness of object motion simultaneously. To tackle this, we propose a novel Text-guided Human-Object Interaction diffusion model with Relation Intervention (THOR). THOR is a cohesive diffusion model equipped with a relation intervention mechanism. In each diffusion step, we initiate text-guided human and object motion and then leverage human-object relations to intervene in object motion. This intervention enhances the spatial-temporal relations between humans and objects, with human-centric motion providing additional guidance for synthesizing consistent motion from text. To achieve more reasonable and realistic results, relation intervention loss is introduced at different levels of motion granularity.
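The per-step procedure described in the abstract (predict text-guided human and object motion, then use the human-object relation to intervene in the object motion) can be illustrated with a toy sketch. This is not the authors' implementation. Every name here (`denoise_step`, `intervene_object`, the three predictor callables, and `weight`) is hypothetical. The linear blend toward a human-anchored target is a stand-in assumption, chosen only to make the order of operations concrete. Motions are reduced to one 3D position per frame.

```python
from typing import Callable, List, Tuple

Vec = Tuple[float, float, float]
Motion = List[Vec]  # one 3D position per frame (a deliberate simplification)


def intervene_object(human: Motion, obj: Motion, offsets: Motion, weight: float) -> Motion:
    """Pull each predicted object position toward a human-anchored target.

    The target for a frame is the human position plus a predicted relative
    offset. The linear blend is an illustrative assumption, not the paper's rule.
    """
    out: Motion = []
    for h, o, r in zip(human, obj, offsets):
        # Human-anchored target position for this frame.
        target = tuple(hc + rc for hc, rc in zip(h, r))
        # weight = 0 keeps the text-only prediction; weight = 1 uses the target.
        blended = tuple((1 - weight) * oc + weight * tc for oc, tc in zip(o, target))
        out.append(blended)
    return out


def denoise_step(
    predict_human: Callable[[Motion, str], Motion],
    predict_object: Callable[[Motion, str], Motion],
    predict_relation: Callable[[Motion, Motion], Motion],
    noisy_human: Motion,
    noisy_object: Motion,
    text: str,
    weight: float = 0.5,
) -> Tuple[Motion, Motion]:
    """One hypothetical diffusion step with relation intervention."""
    # 1) Text-guided initial estimates of human and object motion.
    human = predict_human(noisy_human, text)
    obj = predict_object(noisy_object, text)
    # 2) Estimate the human-object relation from both predictions.
    offsets = predict_relation(human, obj)
    # 3) Intervene in the object motion using that relation.
    obj = intervene_object(human, obj, offsets, weight)
    return human, obj
```

In this sketch the human motion is left untouched and only the object motion is corrected, mirroring the abstract's statement that human-centric motion provides additional guidance for the object.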

Generating human object interactions from textual descriptions

The woman puts down the wood chair backwards.

The woman sits on the yoga ball and jumps clockwise.

The woman moves the stool with her feet.

The woman puts the medium box on the ground, then moves it with her foot.

The woman lifts the yogamat with both hands and turns around.

The man walks around with the wood chair in his left hand, then puts it down.

The woman throws the small box up with both hands while turning counterclockwise.

The man takes the backpack off his right shoulder and puts it on the ground.

The man swings the plastic container back and forth with his left hand.

A woman held a trashbin and made a motion to dump the trash.

The man holds the suitcase with both hands, keeping it flat and walking counterclockwise.