Face Verification with Challenging Imposters and Diversified Demographics Dataset
Face verification aims to distinguish between genuine and imposter pairs of faces, which include the same or different identities, respectively. The performance reported in recent years gives the impression that the task is practically solved. Here, we revisit the problem and argue that existing evaluation datasets were built using two oversimplifying design choices. First, the usual identity selection to form imposter pairs is not challenging enough because, in practice, verification is needed to detect challenging imposters. Second, the underlying demographics of existing datasets are often insufficient to account for the wide diversity of facial characteristics of people from across the world. To mitigate these limitations, we introduce the FaVCI2D dataset.

The FaVCI2D dataset includes identities from 153 countries. The total number of unique IDs is 52,411, of which 12,468 are used in genuine pairs. The total number of images is 64,879, with two images for each ID used in genuine pairs and one for each imposter-only ID. The complete versions of FaVCI2D, created with random and challenging imposter selection, each include a total of 24,936 pairs divided equally between genuine and imposter pairs. We target a balanced gender and geographic distribution. It was possible to obtain enough pairs for America, Asia and Europe but not for Africa. The dataset includes 3,708 genuine pairs (50% female, 50% male) for each of the first three regions and 1,344 for Africa (23.3% female, 76.7% male). The gender distribution of IDs in the entire dataset is 44% female, 56% male, which is the closest we can get to a perfect balance with reusable resources. Age-related information was found for 6,535 of the 12,468 genuine pairs.
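The counts quoted above can be cross-checked with a few lines of arithmetic. The short Python sketch below is only an illustration, not part of the dataset tooling; the variable names are hypothetical, and it assumes that each ID used in genuine pairs contributes exactly one genuine pair (which is what the matching figures of 12,468 IDs and 12,468 genuine pairs suggest).

```python
# Figures quoted in the dataset description (variable names are illustrative).
total_ids = 52_411
genuine_ids = 12_468          # IDs used in genuine pairs
images_per_genuine_id = 2
images_per_imposter_only_id = 1

# IDs that appear only in imposter pairs.
imposter_only_ids = total_ids - genuine_ids

# Total images: two per genuine ID, one per imposter-only ID.
total_images = (genuine_ids * images_per_genuine_id
                + imposter_only_ids * images_per_imposter_only_id)

# Genuine pairs per region, as listed in the description.
genuine_pairs_by_region = {
    "America": 3_708,
    "Asia": 3_708,
    "Europe": 3_708,
    "Africa": 1_344,
}
total_genuine_pairs = sum(genuine_pairs_by_region.values())

# Pairs are split equally between genuine and imposter pairs.
total_pairs = 2 * total_genuine_pairs

print(f"imposter-only IDs:   {imposter_only_ids}")
print(f"total images:        {total_images}")
print(f"total genuine pairs: {total_genuine_pairs}")
print(f"total pairs:         {total_pairs}")
```

Under these assumptions, the computed totals agree with the 64,879 images and 24,936 pairs reported in the description.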
To download the data, please follow this link.
For more details see:
- A. Popescu, L.-D. Ştefan, J. Deshayes-Chossart, & B. Ionescu. (2022). Face Verification with Challenging Imposters and Diversified Demographics. Winter Conference on Applications of Computer Vision (WACV).
If you plan to make use of the FaVCI2D dataset, or refer to its results, please acknowledge the work of the authors by citing the paper listed above.
This work was supported by the European Commission under the European Horizon 2020 Programme, grant number 951911 - AI4Media. It was made possible by the use of the FactoryIA supercomputer, financially supported by the Ile-de-France Regional Council.