TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation

Korea Advanced Institute of Science and Technology
NeurIPS 2024

TL;DR

This project proposes Test-time Procrustes Calibration (TPC), a simple, model-agnostic method to improve diffusion-based human image animation when the reference image and target pose are misaligned in scale or rotation. TPC calibrates the reference image at test time to better match the target pose, maintaining high fidelity and consistency without retraining, making the system more robust in real-world scenarios.

Human image animation aims to generate a human motion video from a reference human image and a target motion video. Current diffusion-based image animation systems transfer human identity into the targeted motion with high precision, yet the quality of their outputs remains irregular. They reach optimal precision only when the physical compositions (i.e., scale and rotation) of the human shapes in the reference image and the target pose frame are aligned. Without such alignment, fidelity and consistency decline noticeably. This compositional misalignment is especially common in real-world environments, which poses a significant challenge to the practical use of current systems. To this end, we propose Test-time Procrustes Calibration (TPC), which makes diffusion-based image animation systems more robust by maintaining optimal performance even under compositional misalignment, as encountered in real-world scenarios. TPC provides a calibrated reference image to the diffusion model, enhancing its ability to understand the correspondence between the human shapes in the reference and target images. Our method is simple and can be applied to any diffusion-based image animation system in a model-agnostic manner, improving effectiveness at test time without additional training.
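The abstract does not spell out the calibration procedure, so the following is only a minimal sketch of the classical similarity Procrustes alignment that the name alludes to, not the authors' implementation. It assumes that 2D body keypoints are available for both the reference image and a target pose frame; the function names `procrustes_similarity` and `apply_similarity` and the keypoint arrays are hypothetical. The sketch estimates the scale, rotation, and translation that best map the reference keypoints onto the target keypoints in the least-squares sense.

```python
import numpy as np


def procrustes_similarity(ref_pts, tgt_pts):
    """Least-squares similarity transform mapping ref_pts onto tgt_pts.

    Both inputs are (N, 2) arrays of corresponding keypoints (an assumption
    of this sketch). Returns (scale, R, t) such that
    tgt ~= scale * ref @ R.T + t.
    """
    ref = np.asarray(ref_pts, dtype=float)
    tgt = np.asarray(tgt_pts, dtype=float)

    # Center both point sets so that only scale and rotation remain.
    mu_ref = ref.mean(axis=0)
    mu_tgt = tgt.mean(axis=0)
    ref_c = ref - mu_ref
    tgt_c = tgt - mu_tgt

    # Cross-covariance between target and reference (2 x 2).
    H = tgt_c.T @ ref_c
    U, S, Vt = np.linalg.svd(H)

    # Guard against reflections: force det(R) = +1.
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0, d])
    R = U @ D @ Vt

    # Optimal isotropic scale, then the translation that matches centroids.
    scale = float((S * np.diag(D)).sum() / (ref_c ** 2).sum())
    t = mu_tgt - scale * (R @ mu_ref)
    return scale, R, t


def apply_similarity(pts, scale, R, t):
    """Apply the estimated transform to an (N, 2) array of points."""
    return scale * np.asarray(pts, dtype=float) @ R.T + t


# Hypothetical keypoints: the target is the reference rotated by 30 degrees,
# scaled by 2, and shifted.
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
ref_kpts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.0, 3.0]])
tgt_kpts = 2.0 * ref_kpts @ R_true.T + np.array([3.0, -1.0])

scale, R, t = procrustes_similarity(ref_kpts, tgt_kpts)
aligned = apply_similarity(ref_kpts, scale, R, t)
```

In a pipeline of this kind, the estimated transform could in principle be used to warp the reference image (for example as an affine warp) before it is passed to the animation model; how TPC actually constructs and injects the calibrated reference is described in the paper, not here.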

Poster

BibTeX

@article{yoon2024tpc,
    title={{TPC}: Test-time {P}rocrustes Calibration for Diffusion-based Human Image Animation},
    author={Yoon, Sunjae and Koo, Gwanhyeong and Lee, Younghwan and Yoo, Chang},
    journal={Advances in Neural Information Processing Systems},
    volume={37},
    pages={118654--118677},
    year={2024}
}