Synthesized MetaHuman Dataset

A synthesized dataset for assessing face-reenactment deepfake generators

Eurecom

Description

The Synthesized MetaHuman dataset consists of realistic face video animations generated within the 3D environment of Unreal Engine, utilizing the MetaHuman asset available in the Quixel Bridge library.

The dataset comprises a diverse range of facial expressions, including amusement, anger, disgust, laughter, sadness, and surprise. It features 10 unique identities, with each identity exhibiting 20 distinct head or facial movements. All videos are rendered at a resolution of 1920×1080 pixels, ensuring a high level of visual quality and detail for the evaluation process.

[Figure: synthesized dataset]

Evaluation Protocol

Face reenactment aims to generate a synthesized video that animates the face in a source image based on the movements captured from a driving video, while preserving the identity conveyed by the source image.

The primary objective of the Synthesized MetaHuman dataset is to facilitate the evaluation of images generated by cross-reenactment algorithms, in which the source image and the driving video show different identities.

To evaluate images generated by cross-reenactment, use the protocol proposed in [1], depicted below.
[Figure: evaluation protocol]


The protocol involves two video sequences, denoted as A and B, representing distinct identities. For each frame, the head pose and expression are identical in both sequences. The evaluation protocol can be summarized as follows:

1. Select a frame from video sequence A as the source image.

2. Select a driving video sequence, comprising video frames of identity B, to animate the source image. The head pose and expression in all frames of the driving video correspond to those of the source face.

3. Input the source image and driving video frames into a face-reenactment method to generate a new video sequence representing source identity A. The generated sequence should reproduce the facial expressions and movements of the driving video sequence.

4. Assess the accuracy of the generated frames by comparing them to the ground-truth video (the corresponding frames of sequence A, which share the head pose and expression of the driving frames) using metrics such as SSIM, CSIM, LPIPS, AKD, FID, and FVD.
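
The four steps above can be sketched in Python as follows. This is an illustrative outline only, not the reference implementation from [1]. The function names `evaluate_cross_reenactment` and `global_ssim` and the `reenact` callable are hypothetical, and frames are assumed to be already loaded as NumPy arrays with values in [0, 255]. The metric is a simplified single-window SSIM computed over the whole frame; standard SSIM averages over local windows, so use an established implementation for reported results.

```python
from typing import Callable, Sequence

import numpy as np

Frame = np.ndarray  # assumed layout: H x W (x C), values in [0, 255]


def global_ssim(x: Frame, y: Frame, data_range: float = 255.0) -> float:
    """Simplified SSIM using one window that covers the whole frame."""
    if x.shape != y.shape:
        raise ValueError("frames must have the same shape")
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    # Stabilizing constants from the usual SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


def evaluate_cross_reenactment(
    video_a: Sequence[Frame],
    video_b: Sequence[Frame],
    reenact: Callable[[Frame, Sequence[Frame]], Sequence[Frame]],
    source_index: int = 0,
    metric: Callable[[Frame, Frame], float] = global_ssim,
) -> float:
    """Return the mean per-frame score of a reenactment method.

    video_a and video_b are frame-aligned sequences of identities A and B
    with identical head pose and expression in each frame.
    """
    if len(video_a) != len(video_b):
        raise ValueError("sequences A and B must have the same length")

    # Step 1: pick a frame of identity A as the source image.
    source = video_a[source_index]

    # Steps 2-3: drive the source image with the frames of identity B.
    generated = reenact(source, video_b)
    if len(generated) != len(video_a):
        raise ValueError("method must return one frame per driving frame")

    # Step 4: compare each generated frame with the ground-truth frame of A.
    scores = [metric(gen, gt) for gen, gt in zip(generated, video_a)]
    return float(np.mean(scores))
```

Any per-frame metric with the same two-frame signature can be passed as `metric`. Distribution-level metrics such as FID and FVD compare sets of frames or videos rather than frame pairs, so they would need a separate code path.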

References

[1] Husseini, Sahar, and Jean-Luc Dugelay. “A comprehensive framework for evaluating deepfake generators: Dataset, metrics performance, and comparative analysis.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.

[2] Husseini, Sahar, and Jean-Luc Dugelay. “Metahumans help to evaluate deepfake generators.” 2023 IEEE 25th International Workshop on Multimedia Signal Processing (MMSP). IEEE, 2023.

Contact

If you have any questions or requests regarding the Synthesized MetaHuman dataset, please contact Dr. Sahar Husseini (husseini@eurecom.fr) or Prof. Jean-Luc Dugelay (jld@eurecom.fr).