Selfies & Videos Face Recognition Dataset

10k+ files and 1k+ people: 8 photos and 2 videos per ID

Check samples on Kaggle

Introduction

The Face Recognition Training Dataset: Selfies, Images & Videos is a comprehensive, production-ready collection designed for training robust temporal face analysis and liveness detection systems. With 1,000+ unique individuals represented through 8 photos, 2 videos (zoom-in and head turn), and historic archive photos, this dataset provides unmatched temporal diversity for next-generation AI models.

Built to Solve Real Production Failures

Synthetic datasets and lab-controlled collections fail to address three production challenges that only real-world temporal data can solve:

1. Temporal Recognition Gap (Age-Invariant Matching)
Access control systems, banking KYC, and identity verification face a common failure: people enrolled years ago look different today. Archive photos spanning multiple years enable training models that recognize individuals despite natural aging, weight changes, hairstyle evolution, and fashion trends, which synthetic data and same-day captures cannot replicate.

2. Sufficient Per-Identity Training Samples
Models trained on 1-3 photos per person tend to overfit to specific poses and lighting, causing high false-negative rates in production. With 5-10 diverse selfies plus videos per identity, our dataset provides the sample density needed for robust feature learning.

3. Structured Metadata for Production Training
Comprehensive metadata (age, gender, ethnicity, device OS/model) enables critical ML workflows: demographic stratification for bias testing, device-specific training splits, age-group analysis, and geographic diversity validation. Unlike raw image collections, structured metadata lets you build balanced training sets that match your production user demographics. 
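
The demographic stratification described above can be sketched as an identity-level split: people (not individual files) are assigned to train or test, so no identity leaks across the split, and each demographic stratum contributes a similar share to the test set. This is a minimal illustration only. The field names (`person_id`, `age`, `gender`), the age buckets, and the sample rows are assumptions made for the example, not the dataset's actual metadata schema.

```python
import random
from collections import defaultdict


def age_group(age):
    """Bucket an age in years into a coarse, illustrative group."""
    if age < 30:
        return "18-29"
    if age < 50:
        return "30-49"
    return "50+"


def stratified_identity_split(people, test_fraction=0.2, seed=0):
    """Split identities so that every (age group, gender) stratum sends
    roughly `test_fraction` of its people to the test set."""
    strata = defaultdict(list)
    for person in people:
        key = (age_group(person["age"]), person["gender"])
        strata[key].append(person["person_id"])

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train_ids, test_ids = set(), set()
    for key in sorted(strata):
        ids = sorted(strata[key])
        rng.shuffle(ids)
        n_test = round(len(ids) * test_fraction)
        test_ids.update(ids[:n_test])
        train_ids.update(ids[n_test:])
    return train_ids, test_ids


# Hypothetical metadata rows, invented purely for illustration.
people = [
    {"person_id": f"id_{i:04d}",
     "age": 20 + (i % 45),
     "gender": "female" if i % 2 else "male"}
    for i in range(100)
]

train_ids, test_ids = stratified_identity_split(people)
print(len(train_ids), len(test_ids))
```

Splitting by identity rather than by file matters for face recognition: if photos of the same person appear in both sets, test results overstate how well the model generalizes to unseen people.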

Composition

Parameter | Value
Total Participants | 1,000+ unique individuals
Total Files | 10,000+ files
Images per Person | 8 images (2 selfies + 6 archive photos)
Videos per Person | 2 videos (zoom-in and head turn)
Metadata Fields | Age, gender, ethnicity, device (OS/model)
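
After download, the per-person composition can be checked with a short script. The sketch below assumes a hypothetical layout in which the first folder of each relative path is the person's ID, and it assumes a particular set of file extensions. Neither is confirmed by the dataset documentation, so adjust both to the delivered structure.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumed extensions; the delivered files may use others.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}
VIDEO_EXTS = {".mp4", ".mov"}


def count_media_per_identity(paths):
    """Count images and videos per person, assuming the first path
    component is the person's ID (hypothetical layout)."""
    counts = {}
    for raw in paths:
        path = PurePosixPath(raw)
        ext = path.suffix.lower()
        if ext in IMAGE_EXTS:
            kind = "image"
        elif ext in VIDEO_EXTS:
            kind = "video"
        else:
            continue  # skip metadata files and anything unrecognized
        counts.setdefault(path.parts[0], Counter())[kind] += 1
    return counts


def incomplete_identities(counts, images=8, videos=2):
    """Return IDs that have fewer images or videos than expected."""
    return sorted(
        pid for pid, c in counts.items()
        if c["image"] < images or c["video"] < videos
    )


# Invented example listing: id_0002 is missing one photo.
paths = (
    [f"id_0001/photo_{i}.jpg" for i in range(8)]
    + ["id_0001/zoom_in.mp4", "id_0001/head_turn.mp4"]
    + [f"id_0002/photo_{i}.jpg" for i in range(7)]
    + ["id_0002/zoom_in.mp4", "id_0002/head_turn.mp4"]
)

counts = count_media_per_identity(paths)
print(incomplete_identities(counts))
```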

Multi-Modal Data Per Person

  • 5-10 diverse selfies captured across different environments and conditions
  • 2 videos per person: zoom-in and head turn
  • Historic photos providing temporal perspective on each identity
  • 10,000+ files in total for comprehensive model training

Production-Ready Quality

  • 1,000+ unique individuals with verified multi-temporal data
  • High-resolution images and videos suitable for commercial deployment
  • Ethically collected with proper consent and documentation
  • Clean, organized structure ready for immediate integration
  • Multi-device captures: smartphones (iOS/Android), webcams, various cameras

Source and collection methodology

The Selfies & Videos Face Recognition Dataset was collected through a structured, multi-stage process involving a diverse group of participants recruited from multiple geographic regions. All data collection followed strict ethical guidelines, with full informed consent obtained from each participant prior to any image capture.

Use cases and applications

  • Face Recognition & Detection. Train robust face recognition and detection models using diverse selfies per person to identify individuals across varying lighting, poses, and environmental conditions in security, surveillance, and access control applications.
  • KYC & Identity Verification. Automate customer identity verification by matching live selfies against official ID document photos for banking onboarding, fintech applications, and regulatory compliance with AML/KYC requirements.
  • Biometric Authentication. Implement secure facial biometrics for mobile device unlocking, payment authorization, multi-factor authentication, and physical access control systems that target sub-second response times and 99%+ accuracy.
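
Accuracy targets for verification systems such as those above are usually checked as a pair of error rates at a chosen decision threshold: the false non-match rate (genuine users rejected) and the false match rate (impostors accepted). The sketch below shows that calculation. The scores and the threshold are invented for illustration and do not describe any model trained on this dataset.

```python
def verification_error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FNMR, FMR) at a similarity threshold.

    genuine_scores:  similarity scores for same-person pairs
    impostor_scores: similarity scores for different-person pairs
    A pair is accepted as a match when its score >= threshold.
    """
    false_non_matches = sum(1 for s in genuine_scores if s < threshold)
    false_matches = sum(1 for s in impostor_scores if s >= threshold)
    fnmr = false_non_matches / len(genuine_scores)
    fmr = false_matches / len(impostor_scores)
    return fnmr, fmr


# Invented similarity scores, for illustration only.
genuine = [0.91, 0.84, 0.77, 0.58, 0.95]
impostor = [0.12, 0.33, 0.61, 0.27, 0.05]

fnmr, fmr = verification_error_rates(genuine, impostor, threshold=0.6)
print(f"FNMR={fnmr:.2f} FMR={fmr:.2f}")
```

Raising the threshold lowers the false match rate at the cost of more false non-matches, so the two numbers are best reported together.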

Application Example

Challenge | The model must match recent photos of people against older photos of them already stored in the database
Solution | An AI model trained on temporal data recognizes individuals despite aging, weight changes, and appearance variations
This Dataset | Archive photographs spanning multiple years reflect real changes in appearance over time
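
The matching step in this example can be sketched as a nearest-neighbor search over face embeddings: a recent photo is embedded, compared against each person's archive embeddings by cosine similarity, and accepted only if the best score clears a threshold. The embedding model itself is out of scope here. The three-dimensional vectors, the gallery contents, and the threshold below are toy assumptions standing in for real model outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_archive_match(probe, gallery, threshold=0.5):
    """Find the enrolled person whose archive embeddings best match the probe.

    gallery maps person_id -> list of archive-photo embeddings.
    Returns (person_id, score), or (None, score) if nothing clears the threshold.
    """
    best_id, best_score = None, -1.0
    for person_id, embeddings in gallery.items():
        # Use the closest archive photo for each person.
        score = max(cosine_similarity(probe, e) for e in embeddings)
        if score > best_score:
            best_id, best_score = person_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


# Toy embeddings standing in for the output of a face-embedding model.
gallery = {
    "id_0001": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],  # older archive photos
    "id_0002": [[0.0, 1.0, 0.0]],
}
probe = [0.8, 0.2, 0.1]  # embedding of a recent selfie

match_id, score = best_archive_match(probe, gallery)
print(match_id, round(score, 3))
```

Keeping several archive embeddings per person and taking the best score is one simple way to use multi-year photos at match time. Averaging embeddings per person is a common alternative.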

Download information

A sample version of this dataset is available on Kaggle. To request additional samples, use the form below.

Have a question?

We collect data from our internal team. All information is further verified by our specialists.

Once your enquiry has been sent, we will contact you to discuss the details and complete the necessary paperwork. The time needed to deliver the dataset depends on the specific request and any additional requirements.

Our unique selling point is providing legally clean datasets to our customers. We obtain consent from all participants to use their data for AI model development, and we can provide comprehensive reporting on the licensing, data collection, and privacy compliance of our datasets. Although approaches to regulating AI development and deployment differ across jurisdictions, we are able to serve global customers seeking to launch global AI products.

The price depends on your specific requirements. Please submit a request to receive a free consultation.

Contact us

Tell us about yourself, and get access to free samples of the dataset 

Contacts

UAE, Ajman

© 2022 – 2025 Copyright protected.