The Cross-Domain Uniformity Paradox

A 16-iteration directed exploration of the Falcon Telecom & Media synthetic warehouse, focused on what the SUBSCRIBER_KEY join across billing × viewership × tickets × lifecycle events actually reveals.

Iterations: 16 of 20 planned
Hypotheses Tested: 14 disproven, 2 confirmed
Findings: 5 actionable, 2 meta
Subs Profiled: 100K (88K active)
Cross-Domain Coverage: 96.9% (in all 4 facts)

Executive Summary

The most useful finding from this exploration is a meta-finding: the Falcon Telecom & Media warehouse describes a customer base where cross-domain participation is universal but cross-domain intensity is demographically random. Every one of the 100,000 subscribers streams; 96.9% buy live event tickets at some point; engagement levels are uniform across segment, age, credit, tenure, plan technology, and acquisition channel. The only meaningful demographic discriminator is plan type — Enterprise Fiber subscribers are 47% more likely to never buy a ticket than Prepaid 5G subs (3.45% vs 2.35%) — a small but interpretable corporate-vs-personal account signal.

The four strongest null findings challenge widely-held cross-domain narratives:

No carrier ↔ platform parent affinity. Comcast subscribers do not over-stream Peacock (Peacock is actually their lowest platform at 19,138 sessions vs Tubi at 19,676). All five prospect carriers' subscribers distribute streaming uniformly across all 12 platforms within a 2.8% spread. Vertical integration produces no detectable streaming preference here.
5G access does not drive 4K HDR streaming uptake. 5G plan subs and 4G LTE subs split streaming quality almost identically (roughly 15% / 55% / 22% / 8% across 4K HDR / 1080p / 720p / SD).
Demographics do not predict genre selection. Drama is 19.4–19.6% of viewing across all 5 segments — including Business Mid. Family does not over-index on Kids content.
Tenure does not predict churn. Subscribers who churned in 2024–25 have identical average tenure (60.4 mo) to those who didn't.

The two strongest positive findings:

The COVID Q2 2020 cross-domain substitution is a precise, measurable phenomenon — ticket-buying subscribers collapsed 91.3% (9,263 → 804) while streaming subscribers grew 41.8% (25,292 → 35,874) and view minutes grew 52.7%. By Q2 2021 ticket buying recovered to 12% above the Q2 2019 baseline; streaming returned to baseline trajectory after the spike. The same population substituted streaming for live events — clean substitution, not a structural shift.
Heavy-tailed spending exists but is demographically random. The top 5% of spenders (4,404 active subs) generate 16.7% of total ticket revenue — a 360× per-capita gap vs the bottom 5%. But these super-spenders are evenly spread across all 5 segments (4.83–5.06% concentration in each). The "VIP cohort" exists but does not have a demographic signature.

So what? If Falcon were positioning this dataset to a real telecom prospect, the headline pitch would not be "we found that your bundled-platform subscribers stream more" — that hypothesis fails. The pitch is "the right framing is cross-domain volume, not cross-domain signature." Persona-based marketing on this customer base would underperform; lifecycle and event-based interventions would outperform. The COVID substitution event remains the cleanest demonstration of cross-domain substitution effects in the data.

Findings Index

F1: The Cross-Domain Uniformity Paradox (Confirmed, Cross-Domain)
F2: No Carrier ↔ Platform Parent Affinity (Null)
F3: 5G Access Doesn't Drive 4K HDR Streaming (Null)
F4: COVID Q2 2020 Cross-Domain Substitution (Confirmed)
F5: Heavy-Tailed Spending, Demographically Random (Confirmed)
F6: Enterprise Plans Skew Slightly Non-Purchaser (Weak Signal)
F7: Tenure Doesn't Predict Churn (Null)

Methodology

Mode & Scope

Directed AutoExplore session. Theme: cross-domain Subscriber 360 — what does the SUBSCRIBER_KEY join across FACT_BILLING × FACT_VIEWERSHIP × FACT_TICKET_SALES × FACT_SUBSCRIBER_EVENTS reveal? 70% of iterations focused on the theme; 30% probed adjacent dimensions (carrier, plan, platform, geography, content, time) for unexpected connections.

Confirming Standard

One query is a hypothesis. A finding requires confirmation from a second angle. Strong null findings require effect sizes inside the noise floor across multiple cuts. All percentages are computed against population baselines, not raw counts.

Data Constraints

Several 4-way LEFT JOIN queries (e.g., a full subscriber-presence matrix across all 4 facts) timed out after 30 seconds. These were redesigned as per-fact aggregations joined post-hoc. The EXISTS pattern proved expensive on million-row fact tables; COUNT(DISTINCT SUBSCRIBER_KEY) per fact is efficient and was the workhorse.
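The redesign described above can be sketched as follows. This is a minimal illustration, not the session's actual queries: it uses an in-memory SQLite database with hypothetical toy rows, and only the table names and the SUBSCRIBER_KEY column come from the report. Each fact table gets its own cheap COUNT(DISTINCT ...) aggregate, and the "present in all four facts" set is assembled afterwards in the client.

```python
import sqlite3

FACTS = ["FACT_BILLING", "FACT_VIEWERSHIP", "FACT_TICKET_SALES", "FACT_SUBSCRIBER_EVENTS"]

def build_demo_db():
    """Create toy fact tables (hypothetical rows, one key column each)."""
    conn = sqlite3.connect(":memory:")
    toy_rows = {
        "FACT_BILLING": [1, 1, 2, 3, 4, 5],
        "FACT_VIEWERSHIP": [1, 2, 2, 3, 4, 5, 5],
        "FACT_TICKET_SALES": [1, 2, 3, 3, 4],  # subscriber 5 never buys a ticket
        "FACT_SUBSCRIBER_EVENTS": [1, 2, 3, 4, 5],
    }
    for table, keys in toy_rows.items():
        conn.execute(f"CREATE TABLE {table} (SUBSCRIBER_KEY INTEGER)")
        conn.executemany(f"INSERT INTO {table} VALUES (?)", [(k,) for k in keys])
    return conn

def coverage_per_fact(conn):
    """One aggregate per fact table instead of a single 4-way LEFT JOIN."""
    return {
        table: conn.execute(
            f"SELECT COUNT(DISTINCT SUBSCRIBER_KEY) FROM {table}"
        ).fetchone()[0]
        for table in FACTS
    }

def keys_in_all_facts(conn):
    """Post-hoc combination: intersect the distinct key sets client-side."""
    key_sets = [
        {row[0] for row in conn.execute(f"SELECT DISTINCT SUBSCRIBER_KEY FROM {table}")}
        for table in FACTS
    ]
    return set.intersection(*key_sets)

conn = build_demo_db()
coverage = coverage_per_fact(conn)
shared = keys_in_all_facts(conn)
print(coverage)
print(f"{len(shared)} subscribers appear in all {len(FACTS)} facts")
```

On a real warehouse the intersection step would more likely be a join over the four small per-fact key lists than a client-side set operation; the point of the sketch is only that no query touches more than one large fact table at a time.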

Statistical Notes

For uniformly distributed counts at ~19,000 sessions per cell, the sqrt-counting (Poisson) noise floor is ±138 sessions, or ±0.7% of the cell count. Effect sizes below 1% are not distinguishable from noise. Several "candidate findings" (e.g., event-type × age slight orderings, acquisition channel slight differentials) were below this threshold and treated as null.
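The noise-floor arithmetic can be reproduced in a few lines. This sketch assumes cell counts behave like Poisson counts, so one standard deviation is the square root of the expected count; the function name is illustrative.

```python
import math

def sqrt_counting_noise(expected_count):
    """Return (absolute, relative) one-sigma noise for a Poisson-like count."""
    sigma = math.sqrt(expected_count)
    return sigma, sigma / expected_count

sigma, relative = sqrt_counting_noise(19_000)
print(f"±{sigma:.0f} sessions = ±{relative:.1%}")
```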

Findings

1. The Cross-Domain Uniformity Paradox

Confirmed (meta-finding) · Cross-Domain

Cross-domain participation in the Falcon Telecom & Media warehouse is essentially universal — but cross-domain intensity is demographically random. Across 14 different demographic and behavioral cuts tested, only one (plan type) produced a meaningful discriminator. Every other cut — segment, age band, credit band, tenure, acquisition channel, carrier, plan technology, region — returned spreads inside the statistical noise floor.

Cross-Domain Participation

Fact Table	Distinct Subs	% of 100K Base
FACT_VIEWERSHIP	100,000	100.00%
FACT_SUBSCRIBER_EVENTS	100,000	100.00%
FACT_BILLING	99,997	99.997%
FACT_TICKET_SALES	96,941	96.94%

Demographic Cuts That Showed No Differentiation

Cut	Cells	Spread	Verdict
Segment × ticket-less rate	5	2.91% – 3.18%	Null
Age band × ticket-less rate	6	2.90% – 3.31%	Null
Credit band × ticket-less rate	5	3.01% – 3.13%	Null
Tenure band × txns/sub	4	3.48 – 3.51	Null
Acquisition channel × rev/sub	6	$4,447 – $4,591	Null (3.2% spread)
Segment × spend/sub	5	$4,514 – $4,539	Null (0.6% spread)
Segment × genre share	55	±0.2 pp per genre	Null
Top 5% concentration by segment	5	4.83% – 5.06%	Null

So what? Real consumer warehouses almost always show some demographic skew — affluent customers buy premium tickets, families over-index on kids content, business plans don't generate streaming sessions. The Falcon synthetic warehouse does not bake any of this in. For Falcon's GTM motion this is meaningful: demonstrations on this dataset should NOT be framed around persona-based marketing or audience targeting. The right demos are around volume, cross-fact joining capability, and event-based analysis (e.g., the COVID substitution in F4).

2. No Carrier ↔ Platform Parent Affinity

Null Finding · Cross-Domain · High-confidence

The single most-tested cross-domain hypothesis in vertical-integration narratives, that subscribers of carrier X over-stream platform Y when X owns Y, fails completely in this data. Comcast (NBCUniversal) subscribers should over-stream Peacock; they don't. Subscribers of Charter (which distributes Discovery+) should over-stream Discovery+; they don't. The same holds for T-Mobile (Apple TV+ bundle) and AT&T (legacy WBD relationship): none of these pairings shows detectable affinity.

Comcast Subscribers' Streaming, by Platform

Platform	Parent	Sessions	vs Comcast Mean
Tubi	Fox Corporation	19,676	+1.6%
Max	Warner Bros. Discovery	19,570	+1.0%
Paramount+	Paramount Skydance	19,542	+0.9%
ESPN+	Walt Disney	19,481	+0.6%
Fox Nation	Fox Corporation	19,475	+0.5%
Amazon Prime Video	Amazon	19,449	+0.4%
Netflix	Netflix	19,369	+0.0%
Apple TV+	Apple	19,348	−0.1%
Disney+	Walt Disney	19,250	−0.6%
Hulu	Walt Disney	19,213	−0.8%
Discovery+	Warner Bros. Discovery	19,157	−1.1%
Peacock	Comcast	19,138	−1.2%

Peacock is Comcast subscribers' lowest-streamed platform, but the gap is not statistically meaningful: the observed deviation of 1.2% is about 1.7 times the ±0.7% sqrt-counting noise floor, which is roughly what the most extreme of 12 noisy cells is expected to show. The same null pattern repeats for every prospect carrier across the 60-cell carrier × platform matrix.
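As a rough check on the Comcast table, the sketch below expresses each platform's session count as a deviation from the mean of the 12 listed rows, in units of the sqrt-counting noise. The baseline here is the plain arithmetic mean of the listed counts, which is an assumption; the percentages it prints can differ from the table's column by a tenth or two depending on how that column's baseline was computed.

```python
import math

# Session counts for Comcast subscribers, copied from the table above.
sessions = {
    "Tubi": 19_676, "Max": 19_570, "Paramount+": 19_542, "ESPN+": 19_481,
    "Fox Nation": 19_475, "Amazon Prime Video": 19_449, "Netflix": 19_369,
    "Apple TV+": 19_348, "Disney+": 19_250, "Hulu": 19_213,
    "Discovery+": 19_157, "Peacock": 19_138,
}

mean = sum(sessions.values()) / len(sessions)
sigma = math.sqrt(mean)  # Poisson-style one-sigma noise per cell

# Deviation of each platform from the mean, in sigma units.
z_scores = {name: (count - mean) / sigma for name, count in sessions.items()}
spread = (max(sessions.values()) - min(sessions.values())) / mean

for name, z in sorted(z_scores.items(), key=lambda item: item[1]):
    pct = (sessions[name] - mean) / mean
    print(f"{name:<20} {pct:+.1%}  z = {z:+.2f}")
print(f"max-min spread: {spread:.1%}")
```

No platform lands much beyond two sigma, which is the pattern expected when 12 cells are drawn from the same underlying rate.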

So what? If a Falcon prospect (e.g., the actual Comcast/NBCU) wants to validate the "our subscribers stream more of our platform" thesis on this dataset, they will not find supporting evidence. This null result is itself useful — it forces the conversation toward measurable cross-domain effects (like the COVID substitution in F4) rather than presumed but unverified synergies.

3. 5G Access Doesn't Drive 4K HDR Streaming Uptake

Null Finding · High-confidence

Telecom marketing routinely promises that 5G enables higher-resolution streaming. In this data, 5G, 4G LTE, and Fiber plan subscribers split streaming quality choices almost identically:

Plan Technology	4K HDR	1080p HD	720p HD	480p SD
5G	15.0%	54.7%	21.8%	8.0%
4G LTE	15.0%	54.7%	22.0%	8.0%
Fiber	14.9%	54.6%	21.7%	8.0%

Every column agrees to within 0.3 percentage points across the three plan technologies. Stream quality selection shows no detectable dependence on network technology in this warehouse.
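The size of the differences in the table can be checked directly. This sketch takes the published percentages as given (underlying session counts are not shown in the report, so no significance test is attempted) and reports the largest gap between plan technologies for each quality tier.

```python
# Quality split by plan technology, in percent, copied from the table above.
quality_split = {
    "5G":     {"4K HDR": 15.0, "1080p HD": 54.7, "720p HD": 21.8, "480p SD": 8.0},
    "4G LTE": {"4K HDR": 15.0, "1080p HD": 54.7, "720p HD": 22.0, "480p SD": 8.0},
    "Fiber":  {"4K HDR": 14.9, "1080p HD": 54.6, "720p HD": 21.7, "480p SD": 8.0},
}

tiers = list(next(iter(quality_split.values())))

# Largest minus smallest share across technologies, per quality tier.
gaps = {
    tier: max(row[tier] for row in quality_split.values())
    - min(row[tier] for row in quality_split.values())
    for tier in tiers
}
worst_tier = max(gaps, key=gaps.get)

for tier in tiers:
    print(f"{tier:<9} max gap = {gaps[tier]:.1f} pp")
```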

So what? The 5G upsell narrative ("upgrade for 4K") doesn't have data support here. If a Falcon prospect needs evidence for 5G ROI, this dataset should NOT be used to make the case. Use plan-type ARPU differentiation instead (where the differentiation is real).

4. COVID Q2 2020 Cross-Domain Substitution

Confirmed · Cross-Domain

The single cleanest cross-domain story in the dataset is the COVID lockdown response. The same population that bought live-event tickets in Q2 2019 was streaming in Q2 2020.

Quarterly Cross-Domain Activity, 2019–2021

Quarter	Streaming Subs	Ticket-Buying Subs	View Minutes
2019 Q1	28,187	8,593	1.66M
2019 Q2	25,292	9,263	1.46M
2019 Q3	27,323	9,970	1.60M
2019 Q4	31,175	9,085	1.88M
2020 Q1	33,598	5,932	2.05M
2020 Q2 (peak lockdown)	35,874 (+41.8%)	804 (−91.3%)	2.23M (+52.7%)
2020 Q3	25,848	2,576	1.49M
2020 Q4	32,201	6,728	1.95M
2021 Q1	29,415	6,218	1.75M
2021 Q2	27,607	10,413 (+12.4% vs '19)	1.62M

Three precise observations:

Ticket buying collapsed 91.3% Q2 2020 vs Q2 2019 (9,263 → 804 distinct buyers).
Streaming surged 41.8% in subs and 52.7% in minutes the same quarter.
By Q2 2021 ticket buying had not just recovered but exceeded the 2019 baseline by 12.4% — suggesting pent-up demand. Streaming returned to its pre-COVID trajectory after the surge (i.e., the elevated level was temporary).

Compared with the schema description ("8% of normal volume" for tickets and "+45% surge" for streaming), the observed subscriber counts are slightly less extreme: tickets fell to 8.7% of normal and streaming subs rose 41.8%. Only view minutes (+52.7%) overshoot the +45% figure.
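The year-over-year figures quoted in this finding follow from the quarterly table with simple percent-change arithmetic. The sketch below recomputes them from the table values; the dictionary keys are illustrative names, not warehouse columns.

```python
def pct_change(before, after):
    """Percent change from `before` to `after`."""
    return (after - before) / before * 100

# Values copied from the quarterly table (view minutes in millions).
q2_2019 = {"streaming_subs": 25_292, "ticket_buyers": 9_263, "view_minutes_m": 1.46}
q2_2020 = {"streaming_subs": 35_874, "ticket_buyers": 804, "view_minutes_m": 2.23}
q2_2021_ticket_buyers = 10_413

yoy_2020 = {metric: pct_change(q2_2019[metric], q2_2020[metric]) for metric in q2_2019}
ticket_recovery = pct_change(q2_2019["ticket_buyers"], q2_2021_ticket_buyers)
ticket_share_of_normal = q2_2020["ticket_buyers"] / q2_2019["ticket_buyers"] * 100

for metric, change in yoy_2020.items():
    print(f"{metric:<16} Q2 2020 vs Q2 2019: {change:+.1f}%")
print(f"ticket buyers Q2 2021 vs Q2 2019: {ticket_recovery:+.1f}%")
print(f"Q2 2020 ticket buyers as share of Q2 2019: {ticket_share_of_normal:.1f}%")
```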

So what? This is the cleanest demonstration of cross-domain substitution in the warehouse. For Falcon demos, this is the lead story — "watch how our same-subscriber model captured the lockdown shift in real time." For prospect risk modeling, this is also the only event in the data that simulates a black-swan demand shock. Use it for scenario planning narratives.

5. Heavy-Tailed Spending, Demographically Random

Confirmed · Counterintuitive

Lifetime ticket spend per active subscriber follows a heavy-tailed distribution with a 360× spread between the top and bottom 5%:

Ventile	Subs	Avg Spend	Total Spend	% of Total
Top 5%	4,404	$15,129	$66.6M	16.7%
Top 6–10%	4,404	$11,051	$48.7M	12.2%
Top 11–15%	4,403	$9,277	$40.8M	10.3%
Top 16–20%	4,403	$8,089	$35.6M	8.9%
Median (45–50%)	4,403	$3,321	$14.6M	3.7%
Bottom 5%	4,403	$42	$0.18M	0.05%

Top 20% generates 48.1% of total ticket spend; classic heavy-tailed concentration but not as extreme as 80/20.
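The headline ratios in the ventile table can be re-derived from its own columns. This sketch uses only the rows published above (the full 20-ventile breakdown is not shown in the report) and checks that subs × average spend reproduces each total, that the top and bottom ventiles differ by roughly 360×, and that the top four ventiles sum to the quoted share.

```python
# (label, subs, avg spend in $, % of total) copied from the ventile table.
ventiles = [
    ("Top 5%", 4_404, 15_129, 16.7),
    ("Top 6-10%", 4_404, 11_051, 12.2),
    ("Top 11-15%", 4_403, 9_277, 10.3),
    ("Top 16-20%", 4_403, 8_089, 8.9),
    ("Median (45-50%)", 4_403, 3_321, 3.7),
    ("Bottom 5%", 4_403, 42, 0.05),
]

# Implied total spend per ventile, in $M.
implied_totals = {label: subs * avg / 1e6 for label, subs, avg, _ in ventiles}

avg_by_label = {label: avg for label, _, avg, _ in ventiles}
per_capita_ratio = avg_by_label["Top 5%"] / avg_by_label["Bottom 5%"]

# The first four rows together make up the top 20% of spenders.
top20_share = sum(share for _, _, _, share in ventiles[:4])

for label, total in implied_totals.items():
    print(f"{label:<16} ${total:.2f}M")
print(f"top vs bottom ventile: {per_capita_ratio:.0f}x")
print(f"top 20% share: {top20_share:.1f}%")
```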

The surprise: When you look at where the top-5% super-spenders live demographically, they are spread almost evenly across the customer base. Each segment has approximately 5% of its members in the top ventile:

Segment	Total Active Subs	In Top 5%	Within-Segment %
Family	24,733	1,251	5.06%
Business Small	13,113	661	5.04%
Consumer	35,253	1,766	5.01%
Business Mid	6,087	297	4.88%
Prepaid	8,876	429	4.83%

0.23 percentage point spread: a near-uniform distribution.
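One way to judge whether a 0.23 pp spread is noise is to compare each segment's rate with the pooled rate, scaled by the binomial standard error for that segment's size. The sketch below does this with the table's counts; treating each segment as an independent binomial sample is a simplifying assumption.

```python
import math

# (total active subs, subs in top 5%) per segment, copied from the table above.
segments = {
    "Family": (24_733, 1_251),
    "Business Small": (13_113, 661),
    "Consumer": (35_253, 1_766),
    "Business Mid": (6_087, 297),
    "Prepaid": (8_876, 429),
}

total_subs = sum(n for n, _ in segments.values())
total_top = sum(k for _, k in segments.values())
pooled = total_top / total_subs  # overall share of subs in the top ventile

rates = {name: k / n for name, (n, k) in segments.items()}
# Deviation from the pooled rate in binomial standard-error units.
z_scores = {
    name: (k / n - pooled) / math.sqrt(pooled * (1 - pooled) / n)
    for name, (n, k) in segments.items()
}
spread_pp = (max(rates.values()) - min(rates.values())) * 100

for name in segments:
    print(f"{name:<15} {rates[name]:.2%}  z = {z_scores[name]:+.2f}")
print(f"spread: {spread_pp:.2f} pp")
```

Every segment sits well inside one standard error of the pooled rate, which is what a demographically random assignment of super-spenders would produce.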

So what? The "VIP cohort" exists and concentrates revenue (16.7% from 5% of subs) — but it has no demographic signature. For a real prospect this would be unusual; in this synthetic data it forces honest framing. If Falcon shows a "super-spender" segmentation, it should be based on observed behavior (cumulative spend, recency, frequency) — not on demographic profiling, which produces no useful prediction.

6. Enterprise Plans Skew Slightly Non-Purchaser

Weak Signal · Plan-type only

The only demographic discriminator detected in 16 iterations is plan type:

Plan Type / Tech	Total Subs	Never-Bought-Ticket	% Never
Enterprise / Fiber	10,708	369	3.45%
Enterprise / 5G	10,946	354	3.23%
Postpaid / 4G LTE	14,218	449	3.16%
Prepaid / 4G LTE	14,180	439	3.10%
Postpaid / 5G	24,959	735	2.94%
Bundle / Fiber+Cable	7,218	211	2.92%
Fixed Wireless / 5G	3,566	92	2.58%
Prepaid / 5G	3,495	82	2.35%

Enterprise Fiber subs are 47% more likely to never buy a ticket than Prepaid 5G subs (3.45% vs 2.35%). The acquisition channel data points the same way: Business Direct is the lowest-spending channel ($4,448/sub vs $4,591 for Dealer). That 3.2% gap is too small to count as a finding on its own, but it is consistent with the corporate-account hypothesis.
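The 47% figure is a relative risk, and its strength can be gauged with a pooled two-proportion z statistic. The sketch below applies that standard test to the two extreme rows of the table; the choice of test is an assumption of this example, not something the report states was used.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for the null hypothesis p1 == p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Never-bought-ticket counts and totals, copied from the table above.
ent_never, ent_total = 369, 10_708  # Enterprise / Fiber
pre_never, pre_total = 82, 3_495    # Prepaid / 5G

relative_risk = (ent_never / ent_total) / (pre_never / pre_total)
z = two_proportion_z(ent_never, ent_total, pre_never, pre_total)

print(f"relative risk: {relative_risk:.2f}")
print(f"two-proportion z: {z:.2f}")
```

Because these are the highest and lowest of eight cells, the z statistic overstates the evidence somewhat; a test across all eight plan types would be the more conservative check.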

So what? This is the only demographic signal that survives statistical noise. It's small (1.1 pp absolute) but interpretable: corporate / enterprise plans behave like work accounts and don't generate as many personal entertainment purchases. For dashboard storytelling, this is the one place where "plan type matters for entertainment behavior" is a defensible claim.

7. Tenure Doesn't Predict Churn

Null Finding

Standard telecom churn modeling assumes a tenure curve — newer subscribers churn at higher rates ("honeymoon falloff"); long-tenure subscribers are sticky. This dataset rejects that assumption.

Cohort	Subs	Avg Tenure
Churned in 2024–25	55,064	60.4 months
Active, no churn events	39,556	60.4 months

Identical to one decimal place. Combined with the near-uniform churn-reason distribution (Price 15.4% leads narrowly across 8 reasons in the dashboard build) and uniform churn-by-carrier rates, this strongly suggests the warehouse models churn as a memoryless process — independent of subscriber attributes.

So what? Churn-prediction modeling on this dataset will produce no useful features from demographic or tenure inputs. Behavioral features (engagement decline, plan changes, payment status changes) MAY work but were not deeply tested here. If a prospect demands a churn-modeling demo on this data, set the expectation that the demo will show methodology (XGBoost, SHAP), not useful predictions.

What We Didn't Find (Beyond F2/F3/F7)

For completeness — patterns we hypothesized and tested but found no evidence for:

Family segment over-indexing on Kids genre — 4.91% Family vs 4.77% Business Mid; effectively flat.
Older audiences skewing toward Drama / News — Drama is 19.4–19.6% across all 6 age bands.
Concert audiences skewing younger than WWE/UFC audiences — within 1% in transaction-share rankings.
Acquisition channel × ticket spend differentiation — 3.2% spread; Business Direct slightly low, otherwise flat.
Tenure × ticket buying intensity — 3.48 to 3.51 transactions per active sub across 4 tenure bands.
VIP propensity skewed by credit band — uniform across Excellent / Good / Fair / Poor / No Check.

Recommended Actions

Reframe the demo narrative. Stop pitching this dataset's "demographic insights" — they don't exist beyond F6. Lead with cross-domain volume capability, the COVID substitution event (F4), and the heavy-tailed spending pattern (F5).
For prospect-specific demos, run the carrier-platform null test live. Showing F2 — "we tested whether your carrier subscribers over-stream your platform and the answer was no" — establishes credibility faster than presenting only positive findings.
Build a behavioral super-spender cohort definition. 4,404 subs in the top 5% drive $66.6M (16.7%) of ticket revenue. Define them by spend behavior, NOT by demographic profile, and add a "Super-Spender Cohort" view to the existing kit dashboards.
Document the COVID Q2 2020 substitution as a reference event. Add a "scenario library" doc explaining how the same SUBSCRIBER_KEY join produced cleanly visible substitution effects, to support prospect questions about black-swan modeling.
Do not promise demographic-based churn prediction on this dataset. Churn correlates with none of the demographic or tenure attributes tested here. If pressed, demonstrate methodology only.
For the one real demographic signal (F6 — Enterprise plans), add an "Enterprise vs Consumer plan profile" comparison view to the Plan & Carrier Mix dashboard. It's small but real.

Limitations

This is synthetic data. The uniformity findings reflect the data generator's design, not a real-world phenomenon. On a production warehouse, the same exploration would likely surface meaningful differentiation.
Pre-churn behavioral signals not tested. A behavioral early-warning analysis (declining viewership/spend in months prior to churn) would require window functions on per-subscriber per-month aggregates. Given the universal null pattern, the expected result is also null, but it remains untested.
Time-of-day patterns not tested. No exploration of intraday or day-of-week patterns in viewership or ticket purchasing.
Rolling churn rate trends not tested. Churn count grew 7% YoY 2024 → 2025 (per dashboard data), but the rate adjusted for base growth was not formally tested.

Source: Falcon Telecom & Media synthetic warehouse · 4.7M fact rows · 2018-01-01 → 2026-04-17 · Method: 16 iterations directed exploration · ~22 successful queries (out of 300 budget) · 4 fact-check cycles planned · Connector: mcp__0f5a7fbd-d3a0-4d09-80d5-e325ec2e51bb__ida_* · Generated: 2026-04-25 · See autoexplore-journal.md for hypothesis-by-hypothesis log.