iBeta Level 1 Training Dataset

iBeta Level 1 Dataset

Comprehensive dataset for PAD with 30,000+ iBeta Level 1 attacks from 85+ IDs

Check samples on Kaggle

Introduction

The iBeta Level 1 Paper & Replay Attacks Dataset offers a comprehensive collection of presentation attacks for presentation attack detection (PAD), tailored for iBeta Level 1 testing. Beyond paper-based masks and printouts, it includes a diverse set of replay attacks: photo and video replays on smartphone and laptop displays under varying brightness levels, distances, and angles, reflecting real-world spoofing scenarios. Designed for researchers and developers working on liveness detection, this dataset provides broad coverage for training and validating anti-spoofing models across the attack types used in iBeta Level 1 testing

iBeta Level 1 dataset summary

The dataset combines two types of attack to deliver full iBeta Level 1 PAD coverage:

Paper attacks: 22,000+ paper-mask/print attacks from 80+ participants with a balanced mix of genders and ethnicities. Each sequence is captured on both iOS and Android, with varied viewpoints and multi-frame 10-second videos for active liveness detection
Replay attacks (combined, mobile + PC): Over 8,000 replay-display clips derived from selfies of 2,500+ people across two sources. Replays are performed on smartphone screens (iOS/Android), laptops, and desktop monitors/PC displays

Together, the corpus offers 22k+ paper and 8k+ replay attack videos across diverse people, devices, screens, lighting conditions, and capture geometries, providing broad Level 1 coverage for training and validating liveness/anti-spoofing models

Source and collection methodology

Data collection started with real-life selfies and short videos provided by participants, which were then used to produce two families of presentation attacks:

Paper attacks: print, cutout, cylinder, and 3D mask variations, recorded on both iOS and Android with controlled changes in angle, distance, and lighting
Replay attacks (mobile + PC): photos/videos of the same participants replayed on smartphone screens (iOS/Android) and desktop monitors. Replay clips (~5–12+ s) include slow camera motion, zoom-in/zoom-out phases, varied brightness, viewing angles, and distances; phone borders are hidden when applicable

All sequences contain explicit zoom-in and zoom-out segments to support active liveness detection and to simulate realistic spoofing attempts

How companies achieved iBeta with us

This dataset is ideal for teams focused on liveness detection and PAD model training. It is especially valuable for developers preparing their models for iBeta certification, as it includes a comprehensive set of the spoofing scenarios required for Level 1 testing

How industry leaders achieve superior liveness detection with our dataset

Technology company from Vietnam: iBeta Level 2 success

A Vietnam-based AI/Big Data firm coached by Axon Labs passed iBeta PAD Level 2 on the first try with 0% successful spoofs; the solution claims 99.9% face-recognition accuracy

Fintech company from Brazil: iBeta Level 1 success

One of the largest fintechs in Brazil approached us to prepare an active biometric authentication system for iBeta Level 1 certification

Digital Bank from Vietnam: iBeta Level 2 success

A digital bank from Vietnam asked Axon Labs to prepare its anti-spoofing model to pass iBeta Level 2 on the first attempt, and the goal was achieved

iBeta Level 1 dataset features

Participants & scale: 80+ paper-attack participants; 2,500+ selfie contributors for replay
Coverage: 22k+ paper attacks + 8k+ replay clips (combined)
Devices & screens: iOS/Android phones (15 models) + laptops/desktop monitors (PC)
Capture details: multi-frame videos (paper ~10 s, replay ~5–12+ s); explicit zoom-in/zoom-out for active liveness
Diversity: balanced mix of genders and ethnicities (Caucasian, Black, Asian) with varied lighting, angles, and distances

Why Axon Labs is better than competitors

One of our partners tested our dataset and a competitor’s dataset using their own liveness detection model while preparing for iBeta Level 1 certification. The results show a clear difference in difficulty between the two datasets. Both datasets were tested on a sample of approximately 200 attack attempts each, ensuring a fair comparison

• Our dataset presents a greater challenge for liveness detection models. The model frequently misclassified attack images as real (score near 0), which indicates that our spoofing techniques are more advanced and harder to detect

• The competitor’s dataset, on the other hand, was mostly detected as attacks (score near 1), except for a single type of attack where the model showed some uncertainty

This demonstrates that our dataset provides more value for training robust liveness detection models, as it exposes them to more deceptive and realistic attacks

Understanding the score:

Horizontal axis: score value (0 – model judges the frame as “live”, 1 – “spoof”). Dot color shows ground truth: green = genuine face, red = spoof attack
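The score convention above (0 = live, 1 = spoof) can be turned into error rates once a decision threshold is chosen. The following is a minimal sketch, not the partner's evaluation code: the function name `pad_error_rates`, the 0.5 threshold, and the sample scores are all assumptions made for illustration. It reports the share of attacks accepted as live (commonly called APCER) and the share of genuine faces rejected as spoofs (commonly called BPCER), pooled over all samples rather than broken down by attack type.

```python
from typing import Dict, Sequence


def pad_error_rates(
    scores: Sequence[float],
    is_attack: Sequence[bool],
    threshold: float = 0.5,  # assumed cut-off; a real system tunes this value
) -> Dict[str, float]:
    """Pooled error rates for scores where 0 means live and 1 means spoof."""
    if len(scores) != len(is_attack):
        raise ValueError("scores and is_attack must have the same length")

    attack_scores = [s for s, a in zip(scores, is_attack) if a]
    live_scores = [s for s, a in zip(scores, is_attack) if not a]

    # An attack is missed when its score falls below the threshold (judged live).
    missed = sum(1 for s in attack_scores if s < threshold)
    # A genuine face is rejected when its score reaches the threshold (judged spoof).
    rejected = sum(1 for s in live_scores if s >= threshold)

    apcer = missed / len(attack_scores) if attack_scores else float("nan")
    bpcer = rejected / len(live_scores) if live_scores else float("nan")
    return {"apcer": apcer, "bpcer": bpcer}


# Made-up scores for illustration only; these are not results from the dataset.
example_scores = [0.1, 0.9, 0.4, 0.8, 0.2, 0.7]
example_is_attack = [False, True, True, True, False, False]
rates = pad_error_rates(example_scores, example_is_attack)
print(rates)
```

In a plot like the one described, red dots to the left of the threshold count toward the first rate and green dots to the right count toward the second.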

By training on a more challenging dataset, models can significantly improve their spoof detection capabilities, making them more resilient against real-world threats

This dataset provides 5 variations of spoof attacks

Some of the spoof attacks in our dataset were tested on Doubango, a leading 3D liveness detection framework

Doubango performs advanced 3D liveness checks using a single 2D image and claims to outperform market leaders like FaceTEC, BioID, Onfido, and Huawei in both speed and accuracy

During testing, our attack images bypassed Doubango’s security checks, with the system generating green bounding boxes around the faces (indicating acceptance as “live” users). This confirms that the attacks were not flagged as spoofs, demonstrating their ability to trick even high-performance systems

These results highlight the quality of our dataset for training robust anti-spoofing models capable of defending against evolving threats in real-world scenarios

1. Real-life selfies & videos from participants

Genuine facial data collected in various lighting conditions and angles to ensure robust system evaluation

2. Print and Cutout paper attacks

Attackers use printed photos or cutout masks with eye and mouth holes to trick recognition systems

3. Cylinder attack to create volume effect

A printed face is wrapped around a cylindrical object to simulate a 3D structure. This method is effective in deceiving simple 2D detection algorithms

4. Paper attacks on Actor with head/eyes variations

A paper face is placed over a real person’s head to mimic real facial movement. Variations include blinking, head tilts, and expressions to test system resilience

5. 3D paper masks with volume-based elements such as a nose

High-quality 3D masks incorporate raised features such as a nose to enhance realism, making them more challenging for liveness detection algorithms

6. PC/Mobile Replay attacks

A pre-recorded video of a real face is played on a phone or laptop screen and captured by the camera as if it were a live user. Variations include different angles, distances, screen brightness, and glare to account for screen quality and reflections
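For training, the six categories listed above (genuine data plus five attack variations) usually need to be mapped to a binary live/spoof target. The sketch below shows one way to do that, using the same convention as the score plot (0 = live, 1 = spoof). The enum `Presentation`, the string values, the file names, and the in-memory manifest are hypothetical and do not describe the delivered file layout.

```python
from collections import Counter
from enum import Enum


class Presentation(Enum):
    """Hypothetical category names mirroring the six items listed above."""
    GENUINE = "genuine"
    PRINT_CUTOUT = "print_cutout"
    CYLINDER = "cylinder"
    PAPER_ON_ACTOR = "paper_on_actor"
    PAPER_MASK_3D = "paper_mask_3d"
    REPLAY = "replay"


def binary_label(kind: Presentation) -> int:
    """Return 0 for a live presentation and 1 for any spoof attack."""
    return 0 if kind is Presentation.GENUINE else 1


# Hypothetical manifest rows (file name, category); the names are placeholders.
manifest = [
    ("clip_0001.mp4", Presentation.GENUINE),
    ("clip_0002.mp4", Presentation.PRINT_CUTOUT),
    ("clip_0003.mp4", Presentation.CYLINDER),
    ("clip_0004.mp4", Presentation.REPLAY),
]

# Count live versus spoof samples to check class balance before training.
label_counts = Counter(binary_label(kind) for _, kind in manifest)
attack_kinds = [kind for kind in Presentation if binary_label(kind) == 1]
print(label_counts[0], label_counts[1], len(attack_kinds))
```

Keeping the fine-grained category alongside the binary label makes it possible to report results per attack type later instead of only as a pooled number.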

Download information

Sample data are available on Kaggle as three separate datasets: Paper Attacks (sample), Replay Attacks — Mobile (sample), and Replay Attacks — PC/Laptop (sample). Request full access or additional samples via the form below

Have a question?

Where does the data come from?

We collect data from our internal team. All information is further verified by our specialists

What is the process?

Once your enquiry has been sent, we will contact you to discuss the details and complete the necessary paperwork. The timing of receiving the dataset depends on the specific request and additional requirements

Can you help us meet dataset disclosure requirements, GDPR and other regulatory controls?

Our unique selling point is providing legally clean datasets to our customers. We obtain consent from all participants to use their data for AI model development. We are able to provide comprehensive reporting on the licensing, data collection, and privacy compliance of our datasets. Although approaches to regulating AI development and deployment vary across jurisdictions, we are able to serve global customers seeking to launch global AI products.

What makes this dataset suitable for iBeta certification?

The dataset follows iBeta testing protocols and includes diverse attack scenarios that mirror real-world spoofing attempts. It covers both passive and active liveness testing requirements with proper demographic representation and standardized capture conditions essential for certification preparation

What is the price of the dataset?

The price depends on your specific requirements. Please submit a request to receive a free consultation

Contact us

Tell us about yourself, and get access to free samples of the dataset

