Testing Methodology — Streaming Content Safety

How We Test

We use browser automation (Chrome and Playwright) to simulate how real children interact with streaming platforms. Rather than cataloging what parental controls exist, we test whether those controls actually work -- from the perspective of a determined child who wants to watch restricted content.

Every test scenario models a realistic action a child might attempt: searching for a forbidden show, switching to an unrestricted profile, clicking a shared link, or trying to bypass a PIN. Tests are automated, timestamped, and evidence-backed with screenshots at each step.

Our primary adversary model is the “determined child” -- a tech-literate child aged 10-14 who has unsupervised access to a shared device where the streaming platform is already logged in. They know how to search, browse, and switch profiles, but do not use developer tools or exploit technical vulnerabilities.
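A scenario of the kind described above could be automated roughly as follows. This is a minimal sketch, not the project's actual harness: the `Step` and `ScenarioRun` types, the `run_search_scenario` function, the `base_url` argument, and the `input[type=search]` selector are all hypothetical, and session handling (the already-logged-in state the adversary model assumes) is omitted. It uses only ordinary UI actions, in keeping with the "determined child" model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class Step:
    action: str       # human-readable description of what the "child" did
    screenshot: str   # path to the evidence screenshot
    timestamp: str    # UTC time the step was recorded (ISO 8601)


@dataclass
class ScenarioRun:
    scenario_id: str
    profile: str
    steps: list[Step] = field(default_factory=list)

    def record(self, action: str, page, out_dir: str) -> None:
        """Take a screenshot of the current page and log a timestamped step."""
        index = len(self.steps) + 1
        path = Path(out_dir) / f"{self.scenario_id}_{index:02d}.png"
        page.screenshot(path=str(path))
        stamp = datetime.now(timezone.utc).isoformat()
        self.steps.append(Step(action, str(path), stamp))


def run_search_scenario(base_url: str, title: str, out_dir: str = "evidence") -> ScenarioRun:
    """Search for a restricted title the way a child would, capturing evidence."""
    # Imported lazily so the data types above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    run = ScenarioRun(scenario_id="SD-01", profile="TestChild7")
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(base_url)
        run.record("open home page", page, out_dir)
        # Hypothetical selector: real platforms need platform-specific locators.
        page.fill("input[type=search]", title)
        page.keyboard.press("Enter")
        run.record(f"search for {title!r}", page, out_dir)
        browser.close()
    return run
```

Keeping the evidence log separate from the browser driver means each step carries its own screenshot path and timestamp, whichever platform the scenario runs against.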

Test Categories

Each platform is tested across 9 categories, each weighted by severity. Higher weights indicate more critical attack vectors.

ID	Category	Weight	What it tests
PE-01	Profile Escape	5	Can a child switch to an unrestricted profile?
SD-01	Search & Discovery	5	Can mature content be found via search?
DU-01	Direct URL / Deep Link	3	Can mature content be accessed via direct URL?
KM-01	Kids Mode Escape	3	Can children escape the restricted browsing experience?
RL-01	Recommendation Leakage	4	Does mature content appear in recommendations?
CB-01	Cross-Profile Bleed	3	Does watch history bleed across profiles?
CG-01	Content Rating Gaps	2	Are content ratings displayed accurately?
PL-01	PIN/Lock Bypass	4	Can PIN/password protections be bypassed?
MF-01	Maturity Filter Effectiveness	4	Overall maturity filter effectiveness

Scoring Rubric

Each test scenario yields a score from 0 (best) to 4 (worst). Lower scores indicate stronger parental controls.

Score	Label	Meaning
0	Full Block	The platform completely blocks this attack vector. No restricted content is accessible.
1	Partial Block	The platform partially blocks access but leaks some information (e.g., content existence revealed).
2	Soft Barrier	A soft barrier exists (e.g., a warning dialog) but can be bypassed with a single click.
3	Unprotected	No protection. The child can access restricted content with no barriers.
4	Facilitated	The platform actively facilitates access to restricted content (e.g., recommending mature content on a kids profile).
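The rubric above is an ordered scale, so it can be represented as an integer enumeration. The sketch below is illustrative only; the `RubricScore` name and its `label` helper are assumptions, not part of any published tooling.

```python
from enum import IntEnum


class RubricScore(IntEnum):
    """Per-scenario score: 0 is the strongest protection, 4 the weakest."""
    FULL_BLOCK = 0
    PARTIAL_BLOCK = 1
    SOFT_BARRIER = 2
    UNPROTECTED = 3
    FACILITATED = 4

    @property
    def label(self) -> str:
        # "SOFT_BARRIER" -> "Soft Barrier", matching the table's Label column.
        return self.name.replace("_", " ").title()


# Example: a warning dialog that one click dismisses scores as a soft barrier.
observed = RubricScore.SOFT_BARRIER
print(int(observed), observed.label)
```

Because `IntEnum` members compare as integers, a higher value always means weaker protection, which keeps the later weighted arithmetic simple.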

Test Profiles

Each platform is tested from the perspective of three child profiles with different maturity restriction levels.

TestChild7

Child (7)

Kids profile

Simulates a 7-year-old on the most restricted Kids profile. Should only see TV-Y and G-rated content. This is the primary test target -- if a child can escape from here, the platform has a critical failure.

TestChild12

Child (12)

Kids / age-restricted profile

Simulates a 12-year-old with TV-PG/PG restrictions. Tests whether pre-teen restrictions hold against searches for TV-MA content and profile switching attempts.

TestTeen16

Teen (16)

Standard profile with restrictions

Simulates a 16-year-old with TV-14/PG-13 restrictions. Should block TV-MA, R, and NC-17 content. Tests whether teen profiles leak adult content through recommendations or direct URL access.
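The rating ceilings for the three profiles can be written down as data, which makes it easy to decide whether a title surfaced during a test counts as restricted. The sketch below assumes a simple linear ordering of the US TV and film rating scales and treats anything unrecognized as restricted; `PROFILE_CEILINGS` and `is_restricted` are hypothetical names.

```python
# Rating scales from least to most mature (assumed linear ordering).
TV_ORDER = ["TV-Y", "TV-Y7", "TV-G", "TV-PG", "TV-14", "TV-MA"]
MOVIE_ORDER = ["G", "PG", "PG-13", "R", "NC-17"]

# Highest rating each test profile should be able to see: (TV, film).
PROFILE_CEILINGS = {
    "TestChild7": ("TV-Y", "G"),
    "TestChild12": ("TV-PG", "PG"),
    "TestTeen16": ("TV-14", "PG-13"),
}


def is_restricted(profile: str, rating: str) -> bool:
    """Return True if content with this rating should be blocked for the profile."""
    tv_max, movie_max = PROFILE_CEILINGS[profile]
    if rating in TV_ORDER:
        return TV_ORDER.index(rating) > TV_ORDER.index(tv_max)
    if rating in MOVIE_ORDER:
        return MOVIE_ORDER.index(rating) > MOVIE_ORDER.index(movie_max)
    # Assumption: unrated or unknown ratings are treated as restricted.
    return True


print(is_restricted("TestChild7", "TV-MA"))   # True
print(is_restricted("TestTeen16", "PG-13"))   # False
```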

Grade Calculation

Each profile receives a letter grade (A+ through F) based on a three-step process:

1. Weighted average: Each test score (0-4) is multiplied by its category weight. The weighted sum is divided by the total weight to produce a weighted average (0.0 = perfect, 4.0 = worst).
2. Exponential penalty: The weighted average is normalized to 0-1 and raised to the power of 1.5, then mapped to a 0-100 scale on which higher is better. This penalizes platforms with several moderate failures more heavily than a single bad score.
3. Letter grade assignment: The penalized score maps to a letter grade via the thresholds below.

Grade	Score Range
A+	95-100
A	85-94
A-	80-84
B+	75-79
B	70-74
B-	65-69
C+	60-64
C	55-59
C-	50-54
D+	40-49
D	30-39
F	0-29
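The three steps can be sketched in a few lines. Several details here are assumptions rather than the published formula: one score per category, the 0-100 mapping taken as 100 × (1 − normalized^1.5) so that a perfect profile scores 100, and fractional scores graded by the lower bound of each range without rounding. The function names are hypothetical.

```python
# Category weights as listed under "Test Categories".
CATEGORY_WEIGHTS = {
    "PE-01": 5, "SD-01": 5, "DU-01": 3, "KM-01": 3, "RL-01": 4,
    "CB-01": 3, "CG-01": 2, "PL-01": 4, "MF-01": 4,
}

# Lower bound of each grade band; anything below 30 is an F.
GRADE_FLOORS = [
    (95, "A+"), (85, "A"), (80, "A-"), (75, "B+"), (70, "B"), (65, "B-"),
    (60, "C+"), (55, "C"), (50, "C-"), (40, "D+"), (30, "D"),
]


def weighted_average(scores: dict[str, int]) -> float:
    """Step 1: weight each 0-4 category score and divide by the total weight."""
    total_weight = sum(CATEGORY_WEIGHTS[c] for c in scores)
    weighted_sum = sum(scores[c] * CATEGORY_WEIGHTS[c] for c in scores)
    return weighted_sum / total_weight


def penalized_score(average: float) -> float:
    """Step 2: normalize to 0-1, raise to 1.5, map to 0-100 (higher is better).

    Assumption: the inversion to a higher-is-better scale is 100 * (1 - x**1.5).
    """
    normalized = average / 4.0
    return 100.0 * (1.0 - normalized ** 1.5)


def letter_grade(score: float) -> str:
    """Step 3: map the 0-100 score to a letter grade."""
    for floor, grade in GRADE_FLOORS:
        if score >= floor:
            return grade
    return "F"


# Example: a profile that scores 2 (Soft Barrier) in every category.
scores = {category: 2 for category in CATEGORY_WEIGHTS}
score = penalized_score(weighted_average(scores))
print(round(score, 1), letter_grade(score))
```

If the real mapping applies the exponent after inverting the scale instead, the resulting scores would be lower for the same inputs; only the `penalized_score` function would need to change.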

The platform's overall grade equals the worst profile grade. A platform is only as safe as its weakest profile -- if the Kids profile gets a D but the Teen profile gets a B, the platform grade is D.

Critical Failure Overrides

Certain test failures are so severe that they cap a profile's grade regardless of other scores. These are called Critical Failure Overrides (CFOs).

CFO-2: Zero-Auth Profile Escape

If a child on a Kids or restricted profile can switch to an unrestricted profile with zero authentication (no PIN, no password, no confirmation dialog), the profile's grade is capped at D regardless of how well other categories score. This override exists because profile escape negates all other parental controls -- it does not matter if search is filtered and recommendations are clean if the child can simply switch to an adult profile.

Grades affected by a CFO are marked with an asterisk (e.g., D*) in platform reports.
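The worst-profile rule and the CFO-2 cap described above can be expressed as two small functions. This is a sketch under one stated assumption: the asterisk is added only when the cap actually lowers a grade, so a profile already at D or F is left unchanged. The function names are hypothetical.

```python
# Grades from best to worst, matching the grade table.
GRADE_ORDER = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D", "F"]


def rank(grade: str) -> int:
    """Position in GRADE_ORDER (higher is worse); ignores a trailing CFO asterisk."""
    return GRADE_ORDER.index(grade.rstrip("*"))


def apply_cfo2(grade: str, zero_auth_escape: bool) -> str:
    """Cap a profile grade at D when a zero-authentication profile escape was found."""
    # Assumption: only grades better than D are changed and marked with "*".
    if zero_auth_escape and rank(grade) < rank("D"):
        return "D*"
    return grade


def platform_grade(profile_grades: dict[str, str]) -> str:
    """The platform's overall grade is its worst profile grade."""
    return max(profile_grades.values(), key=rank)


# Example from the text: Kids profile gets a D, Teen profile gets a B.
print(platform_grade({"TestChild7": "D", "TestTeen16": "B"}))  # D
print(apply_cfo2("A", zero_auth_escape=True))                  # D*
```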

Platforms Tested

Phase 1 of our streaming content safety research covers 3 major streaming platforms:

Netflix, Peacock, Prime Video

Additional platforms (Disney+, Max, Hulu, Paramount+, Apple TV+, YouTube) are planned for future phases.