The Hidden Data Trail Created by Online Age Checks

Quick Verdict

Online age checks have a legitimate purpose. They can prevent children from accessing pornography, gambling, alcohol sales and other age-restricted services. They can also help platforms separate child and adult experiences, apply stronger privacy settings to younger users and avoid processing data from children who are not permitted to use a service.

The concern is not simply that an age check exists. It is how much information is collected, which organisations receive it, whether the check can be linked to the site being visited, how long records are retained and whether the result is reused for advertising, profiling or unrelated identity checks.

Balanced conclusion

A system that confirms only “age requirement met” through a short-lived, unlinkable token can be relatively privacy-preserving. A system that sends a passport, selfie, bank information and browsing context to the same company creates a much larger and more consequential data trail.

Age Assurance, Estimation and Verification Are Not the Same

“Age assurance” is the broad term for methods used to determine whether someone belongs to an age or age-range category. It can include a simple declaration, an estimate based on a face or account history, or a formal verification using identity evidence.

Approach	What It Tries to Establish	Typical Confidence
Self-declaration	The user states a date of birth or confirms they are over a threshold.	Low
Age estimation	Technology estimates an age or range from a face, email history or behavioural signals.	Variable
Age verification	Evidence such as an ID, bank record or trusted credential confirms an age attribute.	Potentially high
Proof of age	A signed token confirms that a threshold is met without necessarily disclosing identity or birth date.	High when well implemented

Regulators do not treat every method as equally effective. For example, Ofcom lists open banking, photo-ID matching, facial age estimation, mobile-network checks, credit-card checks, digital identity services and certain email-based estimation systems as methods capable of being highly effective when implemented properly. Self-declaration alone is not considered highly effective.

The Data Trail Behind a Single Age Check

A well-designed website may receive only a yes-or-no result. That does not mean no data was processed elsewhere. The verification provider may still need enough information to make the decision, detect fraud and demonstrate that its system works.

Data Layer	Examples	Why It May Be Kept
Evidence supplied by the user	ID image, selfie, face video, phone number, card details, bank consent or digital credential.	To make the age decision, investigate fraud or handle a challenge.
Derived information	Estimated age, age band, confidence score, liveness result or “over 18” status.	To deliver the decision and monitor accuracy.
Technical metadata	IP address, browser type, device identifiers, operating system, timestamps and language.	Security, abuse prevention, troubleshooting and audit logging.
Transaction context	Which service requested the check, the required threshold and whether access was granted.	Billing, compliance evidence and dispute resolution.
Fraud signals	Repeated attempts, document reuse, emulator detection, suspicious network patterns or mismatched details.	To stop borrowed documents, bots and automated circumvention.
Operational records	Provider name, software version, model version, decision code and retention event.	Regulatory accountability, quality testing and incident investigation.

The most sensitive link

The greatest privacy concern is often not one individual data item. It is the ability to connect identity evidence with the fact that a person attempted to visit a particular adult, health, gambling, dating or politically sensitive service.

This is why role separation matters. A verifier may need to know enough to confirm age, while the website may need to know only that the threshold was met. Neither side necessarily needs the full picture.
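
As a rough illustration of that separation, the sketch below reduces identity evidence to a single yes-or-no attribute before anything reaches the website. The class names, field names and the decide function are hypothetical and chosen only for this example; a real verifier would also need to validate the document itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Evidence:
    """What the verifier sees (hypothetical fields for illustration)."""
    full_name: str
    birth_date: date
    document_number: str


@dataclass(frozen=True)
class SiteResult:
    """What the website receives: the outcome and nothing else."""
    threshold: int
    threshold_met: bool


def age_on(birth_date: date, today: date) -> int:
    """Return the number of whole years between birth_date and today."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)


def decide(evidence: Evidence, threshold: int, today: date) -> SiteResult:
    """Reduce identity evidence to a single yes-or-no attribute."""
    return SiteResult(threshold, age_on(evidence.birth_date, today) >= threshold)
```

The point of the design is that SiteResult has no field for a name, birth date or document number, so the website cannot store what it never receives.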

What Each Age-Check Method Can Reveal

Method	Potential Advantages	Potential Data Trail and Limitations
Photo-ID matching	Can provide strong evidence of exact age and document ownership.	May process a name, birth date, ID number, document image and selfie. Borrowed or forged documents remain possible, and central retention creates breach risk.
Facial age estimation	May confirm an approximate age without asking for a name or government document.	Processes a face image or video and a statistical output. Accuracy varies, especially near the threshold, and camera requests may create trust or phishing concerns.
Open banking	Can use an established regulated relationship and return only an age attribute when designed properly.	The bank or intermediary sees an authorised request. Users may be uncomfortable connecting banking credentials, and availability depends on account access.
Credit-card check	Uses familiar payment infrastructure and can indicate adult eligibility in some jurisdictions.	Creates payment-related metadata and may exclude adults without credit cards. A child may also use an adult’s card.
Mobile-network check	May verify whether adult restrictions are removed from a phone account without sharing full identity with the website.	Links the check to a phone number and network account. Shared family plans and inaccurate account settings can cause errors.
Email-based estimation	Can reduce friction when the email has a long-established history with adult services.	May infer age from where an address has appeared, creating cross-service linkage and profiling concerns. New or privacy-focused addresses may be misclassified.
Digital identity wallet	Can support selective disclosure, proving only “over 18” rather than sharing a full identity record.	Privacy depends on wallet design, issuer trust, token linkability and whether repeated presentations can be correlated.
Offline proof or age card	Can avoid uploading identity documents directly to an online service.	Requires distribution and governance, may be inconvenient or stigmatising, and can be transferred to another person.

Biometrics require careful wording

A face image is highly sensitive, but facial age estimation is not automatically the same as facial recognition. Estimation can be designed to predict an age range without identifying the person. The privacy risk rises if images are retained, reused, matched against identity documents or converted into persistent identifiers.

The Benefits of Effective Age Checks

The case for age assurance is not limited to blocking access. It can be used to change what a child sees, how an account is configured and which forms of data processing are permitted.

Reduced access to adult material: effective checks can make it harder for children to encounter pornography, gambling and age-restricted products.
Age-appropriate design: services can apply safer defaults, limit contact features or restrict targeted advertising for younger users.
Lower risk of unlawful child-data processing: a service that excludes underage users can avoid collecting information it may have no lawful basis to process.
Clearer accountability: an auditable system gives regulators and users more evidence than a simple “I am over 18” tick box.
Less disclosure is possible: modern credentials can prove a threshold without giving the website a full date of birth or identity document.
Reusable proof can reduce repeated uploads: a trusted wallet may allow one verification to support several services without sending a passport each time.

Important limitation

Age assurance is one safety control, not a complete child-protection system. It does not replace content moderation, safer defaults, parental support, reporting systems or action against harmful design practices.

Privacy and Security Risks

Age checks create risk when the system collects more information than is needed, retains it for too long or allows several parties to combine their records.

Risk	How It Arises	Possible Consequence
Data breach	A provider stores ID documents, selfies, account records or verification logs.	Identity theft, embarrassment, fraud or exposure of sensitive browsing interests.
Cross-site tracking	The same verifier, token or identifier is reused across several services.	Separate visits can be linked into a profile even when the websites do not know the user’s name.
Purpose expansion	Information gathered for age checking is later used for fraud scoring, advertising, identity proofing or law-enforcement requests.	The system becomes broader than users reasonably expected.
Phishing	Users become accustomed to uploading IDs, faces or payment information when a website asks.	Fraudsters can imitate legitimate age-check pages to steal sensitive information.
Chilling effect	A user believes a sensitive visit may become connected to their identity.	Adults may avoid lawful health, sexual, political or support resources.
Centralisation	A small number of identity providers handle checks for a large share of the internet.	A breach, outage or policy change affects many services and users at once.

Data minimisation reduces these risks but does not eliminate them. Even a yes-or-no result can become identifying when combined with an IP address, account login, timestamp or unique token.
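
To make the linkage risk concrete, the sketch below joins two hypothetical logs on IP address and a short time window. The record layout, the field names and the 60-second window are assumptions for illustration, not a description of any provider's logging.

```python
from datetime import datetime, timedelta


def link_records(verifier_log, site_log, window_seconds=60):
    """Pair verifier and site records that share an IP within a time window."""
    window = timedelta(seconds=window_seconds)
    matches = []
    for check in verifier_log:
        for visit in site_log:
            same_ip = check["ip"] == visit["ip"]
            close_in_time = abs(check["time"] - visit["time"]) <= window
            if same_ip and close_in_time:
                matches.append((check["name"], visit["site"]))
    return matches


# Hypothetical records: the verifier knows a name, the site knows only a visit.
verifier_log = [
    {"name": "Person A", "ip": "203.0.113.7", "time": datetime(2025, 1, 1, 12, 0, 0)},
]
site_log = [
    {"site": "adult-site.test", "ip": "203.0.113.7", "time": datetime(2025, 1, 1, 12, 0, 20)},
    {"site": "news-site.test", "ip": "198.51.100.4", "time": datetime(2025, 1, 1, 12, 0, 5)},
]
linked = link_records(verifier_log, site_log)
```

Neither log alone ties a named person to a sensitive site. Combined, the metadata does, which is why retention limits should cover IP addresses and timestamps as well as raw evidence.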

Real-World Security Incidents Involving Verification Data

Public evidence shows that verification records are attractive targets, but the incidents are not all equivalent. Some involved confirmed access to identity images. Others involved exposed credentials or development files where the provider says no customer data was reached. Keeping those categories separate is essential.

Incident	What Was Confirmed	Scale and Data Involved	Important Qualification
Discord third-party support breach, 2025	Confirmed exposure	Discord said approximately 70,000 users may have had government-ID photos exposed. Those images had been used to review age-related appeals. Potentially affected support data also included names, usernames, email or contact details, IP addresses, support messages, purchase history and limited card information such as payment type and the last four digits.	Discord said its own core systems were not breached; the compromised organisation was a former customer-support provider, 5CA. Discord later said its current age-assurance vendors were not involved in that incident.
Tea app verification-record breach, 2025	Confirmed exposure	Tea reported that attackers accessed about 72,000 images, including roughly 13,000 selfies and photo IDs submitted for account verification and 59,000 images from posts, comments and messages. Tea's California breach notice said affected verification images could contain a name, date of birth, driving-licence number, passport number or another government identifier.	Tea is a dating-safety platform, not a specialist age-assurance provider. The incident is relevant because its account-verification process collected the same high-risk documents and selfies often proposed for age checks.
AU10TIX legacy-credential event, 2024	Security event; no verified data breach	AU10TIX said inactive credentials associated with a retired log-management tool appeared in a public Telegram post after a personal-device compromise.	AU10TIX says the credentials were isolated from production, and an independent forensic review found no evidence of customer-data exposure, production access or misuse. It should not be described as a confirmed leak of verification records.
Persona source-map exposure, 2026	Exposed development files; no verified personal-data exposure	Researchers found publicly accessible frontend source maps on a non-production Persona subdomain. Persona disabled the subdomain and acknowledged that the source maps should not have been publicly accessible.	Persona says the environment contained no customer data, secrets or backend access and had never served federal customers. The event exposed readable frontend code, not identity documents or live verification records.

What the confirmed cases prove

The Discord and Tea incidents show why retention matters. Information originally supplied for a narrow verification purpose can remain in support systems, legacy storage or archived files long enough to become part of a later breach. A deletion promise is meaningful only when it covers copies, support attachments, backups, audit systems and older storage environments.
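
One way to picture a deletion rule that covers every copy is a purge that runs over all stores rather than only the main verification database. The store names, the record layout and the purge_expired helper below are hypothetical; real backups and ticketing systems usually need their own deletion mechanisms.

```python
from datetime import datetime, timedelta


def purge_expired(stores, now, max_age):
    """Delete expired records from every store and return counts per store."""
    removed = {}
    for name, records in stores.items():
        keep = [r for r in records if now - r["stored_at"] <= max_age]
        removed[name] = len(records) - len(keep)
        records[:] = keep  # mutate in place so callers see the purge
    return removed


now = datetime(2025, 6, 1)
old = {"file": "id-scan-1.jpg", "stored_at": datetime(2025, 1, 1)}
recent = {"file": "id-scan-2.jpg", "stored_at": datetime(2025, 5, 31)}

# Hypothetical stores: the same ID image can exist in several places.
stores = {
    "verification": [dict(old), dict(recent)],
    "support_attachments": [dict(old)],
    "backups": [dict(old), dict(recent)],
}
report = purge_expired(stores, now, max_age=timedelta(days=7))
```

A per-store report like this also gives an auditor something to check: a store that never reports deletions is either empty or outside the retention process.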

There is no complete public incident count

In a November 2025 Freedom of Information response, the UK Information Commissioner's Office said it could not produce a complete count of complaints, investigations or breach reports involving age- or identity-verification providers because its case systems do not record incidents at that level of detail. The absence of a central total should not be mistaken for evidence that incidents are rare.

What users can learn from these incidents

Third parties expand the attack surface: a platform can secure its main service while support contractors, identity vendors and archived storage remain exposed.
Support workflows can retain more than the main verification flow: an ID deleted by an automated verifier may still exist in an appeal ticket or manual-review system.
Legacy data is still sensitive: records collected under an old process can remain vulnerable years after the process changes.
Not every security report is a customer-data breach: exposed code, inactive credentials and test environments should be reported accurately rather than automatically described as stolen IDs.
Numbers need attribution: use figures confirmed by the affected organisation or regulator and clearly label larger attacker claims as unverified.

Evidence standard used here

A case is labelled a confirmed exposure only where the affected organisation, a regulatory notice or another authoritative source acknowledged unauthorised access to verification records. Researcher claims are included as security events when the provider disputes or limits the claimed impact.

Accuracy, Fairness and Exclusion

Stronger evidence can improve confidence, but it often increases friction and data collection. Less intrusive methods can be easier to use but may produce more errors.

People close to the threshold: facial estimation is most likely to be disputed when a 17-year-old and a 19-year-old appear similar.
Unequal model performance: systems must be tested across relevant ages, skin tones, disabilities, camera qualities and demographic groups.
Document access: not every adult has a current passport, driving licence, credit card or conventional bank account.
Shared accounts: phone plans, payment cards and household devices may belong to one person but be used by another.
Digital exclusion: people with older devices, limited connectivity or low technical confidence may struggle with camera and wallet systems.
Appeals: users need an accessible alternative when a system makes an incorrect decision.
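
A common way to handle uncertainty near the threshold is to accept an estimate only when it clears the threshold by a margin and to route everyone else to a different method. The sketch below assumes that approach; the five-year buffer and the outcome labels are illustrative values, not figures from any regulator or vendor.

```python
def route_estimate(estimated_age: float, threshold: int = 18,
                   buffer_years: float = 5.0) -> str:
    """Accept clear passes; send borderline estimates to another method."""
    if estimated_age >= threshold + buffer_years:
        return "pass"
    # An estimate alone never produces a refusal here: the user is offered
    # an alternative check, which also serves as the appeal route.
    return "needs_other_method"
```

A larger buffer reduces the chance of passing a 17-year-old but sends more genuine adults to a document or bank check, so the margin is a trade-off between error rates and friction.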

No method is perfect

A highly accurate system can still be unfair if it excludes adults who lack the required evidence. A privacy-preserving system can still be ineffective if children can bypass it easily. The appropriate method depends on the harm being addressed, the legal requirement and the user population.

What Privacy-Preserving Age Assurance Looks Like

European and UK regulators increasingly emphasise data minimisation, proportionality and separation between proof issuance and the website requesting the result. The European Commission is also developing anonymous proof-of-age systems that can confirm a threshold without disclosing a full identity.

Collect only the attribute needed: a site asking whether someone is over 18 should not automatically receive a name, address or exact birth date.
Separate the verifier from the website: the verifier can inspect evidence while the website receives only a signed outcome.
Prevent linkability: tokens should not allow the verifier or several websites to build a history of where the user presented proof.
Use short retention periods: raw IDs, selfies and videos should be deleted as soon as they are no longer necessary.
Process locally where practical: on-device estimation can reduce the transfer and central storage of face images.
Publish clear retention rules: users should know what is collected, why, who receives it and when it is deleted.
Provide alternative methods: no adult should be excluded solely because they lack one document, bank account or compatible device.
Support challenges: users need a straightforward way to correct an inaccurate decision.
Commission independent testing: security, privacy, fairness and accuracy claims should be evaluated by qualified third parties.
Prohibit secondary use: age-check data should not become an advertising identifier or general-purpose identity profile.

Selective disclosure

The privacy ideal is not “show the website your ID securely”. It is “prove the required age attribute without showing the website the rest of your ID”.
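
The sketch below shows the general shape of a short-lived token that carries only an age attribute, an expiry and a random nonce. It is a simplification under stated assumptions: the claim names are hypothetical, and it uses a shared-secret HMAC from the Python standard library, which means the website could mint tokens itself. Real systems typically use asymmetric signatures, and unlinkability against the issuer requires techniques such as blind signatures or zero-knowledge proofs that this example does not implement.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional


def issue_token(key: bytes, threshold: int, ttl_seconds: int = 300,
                now: Optional[float] = None) -> str:
    """Issue a token stating only that an age threshold is met."""
    issued = time.time() if now is None else now
    claims = {
        "age_over": threshold,                # the attribute, not a birth date
        "exp": int(issued) + ttl_seconds,     # short lifetime
        "nonce": secrets.token_hex(16),       # fresh per token, no stable user ID
    }
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + tag


def check_token(key: bytes, token: str, threshold: int,
                now: Optional[float] = None) -> bool:
    """Return True if the token is authentic, unexpired and meets the threshold."""
    current = time.time() if now is None else now
    try:
        body, tag = token.split(".")
    except ValueError:
        return False
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["age_over"] >= threshold and current < claims["exp"]
```

Because each token has a new nonce and no account identifier, two websites comparing tokens have nothing stable to match on. The issuer could still log nonces, which is the gap the stronger cryptographic designs aim to close.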

What Users Should Check Before Verifying Their Age

Who is performing the check? Look for the verifier’s legal name rather than trusting only the website’s branding.
What is being requested? A request for a full passport should have a stronger justification than a request for an anonymous age token.
Will the website receive identity data? A good notice should explain whether it receives only a result or also receives the underlying evidence.
How long is information retained? “We protect your privacy” is less useful than a specific deletion period.
Is the face image stored? Check whether processing is local, temporary or retained for model training and fraud detection.
Can the check be reused or linked? Reusable credentials should prevent unrelated services from correlating presentations.
Is another method available? A legitimate process should offer an alternative when the first method fails or is inaccessible.
Is the page genuine? Verify the domain before uploading an ID, opening a banking flow or activating a camera.
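
The domain check in the last item can be sketched as an exact hostname comparison. The allowlist and the verify.example.com hostname are placeholders; the point is that a substring match is not enough, because lookalike addresses often contain the genuine name.

```python
from urllib.parse import urlsplit


def is_expected_verifier(url: str, allowed_hosts) -> bool:
    """Return True only for HTTPS URLs whose hostname is exactly allowlisted."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in allowed_hosts


# Placeholder allowlist for illustration.
ALLOWED = {"verify.example.com"}
```

An address such as https://verify.example.com.evil.test/ starts with the genuine name but fails this check, because the full hostname is different.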

The legal position differs by country and by the type of service. Our earlier Utah VPN age-check guide explains one example of how location masking and age-verification duties can interact.

Frequently Asked Questions

Does an age-check website always see my identity?

No. A third-party verifier can inspect the evidence and send the website only a result such as “over 18”. Whether this happens depends on the design and privacy policy.

Is facial age estimation the same as facial recognition?

Not necessarily. Age estimation can analyse facial features to predict an age range without trying to identify the person. It becomes more identity-sensitive if the image is matched against an ID document, retained or converted into a persistent biometric identifier.

Can an age check be anonymous?

It can be designed so the website receives an anonymous or pseudonymous proof that an age threshold is met. Some party may still need to verify the user initially, but selective-disclosure and unlinkable-token designs can prevent the website from receiving the full identity.

Why do age-check providers keep logs?

Possible reasons include security, fraud prevention, accuracy monitoring, billing, regulatory evidence and handling disputes. The important questions are whether each record is necessary, how long it is retained and whether it can be linked to a sensitive visit.

Are credit-card checks reliable proof of adulthood?

They can be useful where only adults can obtain the relevant card, but they are not perfect. A child may use an adult’s card, and some adults do not have credit cards. The method also introduces payment-related data and phishing concerns.

Can age-check information be used for advertising?

It should not be repurposed without an appropriate lawful basis and clear information to the user. Privacy-preserving systems should prohibit advertising, profiling and unrelated identity uses.

Does deleting an ID image remove the entire data trail?

No. The provider may still retain a decision, timestamp, device information, fraud score or audit record. Deleting raw evidence is valuable, but users should also understand the retention of derived data and metadata.

Are online age checks good or bad?

They have genuine benefits and genuine risks. The outcome depends on necessity, effectiveness, proportionality, accessibility, security and whether the system proves only what is required.

Written by Martin Needs

Director at NeedSec LTD | Cybersecurity Expert | 10+ Years Experience

“The privacy question is not only whether an ID image is deleted. It is whether the remaining decision, token, device data and timestamp can still connect a person to a sensitive online activity.”

OSCP Certified | CSTL (Infra/Web) | Cyber Essentials Assessor | CompTIA PenTest+ | Digital Identity
