Interpreting Crosstabs

When analyzing survey cross-tabs, a residual t-test identifies cells whose values deviate significantly from what we would expect. These deviations surface meaningful insights that are easy to miss when scanning large tables. This article explains how residual t-tests work and how to interpret the results.

What is a Residual T-test?

A residual t-test measures the difference between observed and expected frequencies in a survey data cross-tabulation and evaluates whether the deviation is statistically significant.

Residual: The difference between the observed value (actual data) and the expected value (derived from the overall row and column totals).
T-test: A statistical test applied to determine whether the residuals are significantly high or low compared to what random variation would suggest.

In MX8 Labs' cross-tabs:

Green bars represent significantly higher-than-expected values (positive residuals).
Red bars indicate significantly lower-than-expected values (negative residuals).
Blue bars mark values that are not statistically significant: they fall within the expected range.

How Does it Work?

1. Calculate Expected Values:

Each cell's expected value is computed from the row and column totals. For instance, if a platform's general visibility rate is 60% across services, you'd expect around 60% visibility for any specific service unless other factors influence it.

2. Compare Observed vs. Expected:

The t-test checks whether the observed value differs from the expected value by more than what can be attributed to random chance. This difference is quantified as a residual.

3. Highlight Significant Results:

If the residual is significant at the configured threshold (set in the project or report; typically a 95% confidence level, that is, a p-value < 0.05), the cell is highlighted:
Green for significantly high (positive) residuals.
Red for significantly low (negative) residuals.

4. Account for Context:

Even slight differences can be flagged as significant in rows or columns where little random variation is expected around the expected value, for example where base sizes are large.
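
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not MX8 Labs' implementation: the article does not give the exact formula, so the sketch assumes the adjusted standardized residual commonly used for contingency tables and compares it with 1.96, the two-sided normal critical value for p < 0.05. The function names (`adjusted_residuals`, `highlight`) and the counts are hypothetical.

```python
import math


def adjusted_residuals(table):
    """Return (expected counts, adjusted standardized residuals) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected, residuals = [], []
    for i, row in enumerate(table):
        exp_row, res_row = [], []
        for j, observed in enumerate(row):
            # Step 1: expected count from the row and column totals.
            e = row_totals[i] * col_totals[j] / n
            # Step 2: scale the raw residual by its estimated standard error.
            se = math.sqrt(e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            exp_row.append(e)
            res_row.append((observed - e) / se)
        expected.append(exp_row)
        residuals.append(res_row)
    return expected, residuals


def highlight(z, critical=1.96):
    """Step 3: map a standardized residual to a highlight color."""
    if z > critical:
        return "green"
    if z < -critical:
        return "red"
    return "blue"


# Hypothetical counts: rows are two respondent segments,
# columns are "saw the ad" and "did not see the ad".
counts = [
    [60, 40],
    [30, 70],
]
expected, residuals = adjusted_residuals(counts)
colors = [[highlight(z) for z in row] for row in residuals]
print(expected)
print(colors)
```

In this hypothetical table the first segment's 60 observed against 45 expected is flagged green, and the second segment's 30 against the same expectation is flagged red. A table with a single row or column would give a zero standard error, which this sketch does not handle.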

Example: Ad Visibility by Video Platform

Let's analyze this cross-tab to understand these concepts better.

Cross-tab table showing ad visibility by video platform

Key Observations:

1. Amazon Prime Video and YouTube (Total):

YouTube's visibility score of 81% is marked green, indicating it is statistically higher than expected. This means YouTube's visibility score outperforms the benchmark calculated from the other services.
Conversely, Amazon Prime Video's 42% is marked red, meaning it is statistically lower than expected.

2. By First Video Platform:

On Instagram, YouTube's visibility is green at 86%, significantly higher than expected for Instagram users. However, Disney+'s visibility, at 49% (blue), is not statistically different from expectations, even though it appears close in percentage to others in the same column. This indicates subtle variations in expected rates that affect significance.
On Snapchat, YouTube's visibility is marked blue at 78%, which aligns with the expected rate for Snapchat users. In contrast, Amazon Prime Video's 14% (red) visibility is unexpectedly low, likely because Snapchat users do not favor it for video discovery.

3. Netflix on TikTok:

TikTok users show significantly lower-than-expected ad visibility for Netflix (31%, red). Despite Netflix's broad popularity, it appears TikTok users engage less with ads on Netflix compared to other platforms. This insight may help tailor marketing strategies.

Why Do Some Similar-Looking Numbers Differ?

It's common to see two adjacent cells with close percentages, one marked as significant and the other not. This happens because a residual t-test compares each cell with its own expected value, derived from that cell's row and column, rather than comparing raw percentages with each other. For example:

A 43% ad visibility for Discovery+ on Facebook (blue) is not statistically significant because it aligns with Facebook's average ad visibility expectations.
However, a slightly higher percentage, like YouTube's 62% on the same platform (green), is significant because YouTube's expected visibility on Facebook was lower.

This underscores the importance of understanding the statistical framework rather than relying solely on visual comparison.
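
To see why the expected value matters more than the raw percentage, consider a simplified sketch. It assumes a one-sample z-test of an observed proportion against its expected proportion, which is a stand-in for the platform's test rather than its documented formula. The function name `proportion_z` and all figures are hypothetical.

```python
import math


def proportion_z(observed, expected, n):
    """z-score of an observed proportion against its expected proportion."""
    standard_error = math.sqrt(expected * (1 - expected) / n)
    return (observed - expected) / standard_error


# Two hypothetical cells with the same observed 45% and the same base of 200,
# but different expected values.
z_close = proportion_z(observed=0.45, expected=0.43, n=200)
z_far = proportion_z(observed=0.45, expected=0.35, n=200)

for label, z in [("expected 43%", z_close), ("expected 35%", z_far)]:
    flag = "significant" if abs(z) > 1.96 else "not significant"
    print(f"observed 45%, {label}: z = {z:.2f} ({flag})")
```

Both cells show 45%, yet only the one whose expected value was 35% clears the 1.96 threshold.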

Significance in Weighted Data

When your data is weighted — which is the default for MX8 Labs Research Platform reports — the significance tests run on the effective sample size of each cell, not the raw respondent count. The effective sample size is smaller than the raw count whenever weights are unequal, and it reflects how much independent information the cell actually carries.

In practice, this means a heavily weighted cell will show fewer significant highlights than the raw numbers alone would suggest. That's intentional: it avoids overclaiming confidence when a handful of respondents are doing a lot of the weighted work. If significance looks sparser than you'd expect, check the weighting diagnostics for the report — low weighting efficiency is usually the reason. See Weighting methodology for the full detail.
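
The article does not state which formula is used for effective sample size. A widely used definition is Kish's approximation, the squared sum of the weights divided by the sum of the squared weights, and the sketch below assumes it. The function name `effective_sample_size` and the weights are hypothetical.

```python
def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical cell of 100 respondents.
equal_weights = [1.0] * 100
unequal_weights = [0.5] * 80 + [3.0] * 20  # 20 respondents carry most of the weight

n_equal = effective_sample_size(equal_weights)
n_unequal = effective_sample_size(unequal_weights)
efficiency = n_unequal / len(unequal_weights)

print(n_equal)     # 100.0
print(n_unequal)   # 50.0
print(efficiency)  # 0.5
```

With equal weights the effective sample size matches the raw count. With the unequal weights above, the same 100 respondents carry the information of about 50, so a test run on that cell needs a larger deviation to reach significance.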

Key Takeaways:

Residual t-tests add context: They show where observed data significantly deviates from expectations, helping identify where performance exceeds or falls short.
Colors simplify interpretation: Green (higher), red (lower), and blue (as expected) offer a quick visual guide to significant insights.
Subtle differences matter: Statistical significance is relative to the expected values for each row or column, not just the raw percentages.

By leveraging residual t-tests, survey analysts can move beyond surface-level observations and uncover deeper, actionable insights. For example, platforms like YouTube may prioritize advertising on Instagram based on its standout performance, while Netflix may need strategies to improve TikTok engagement.

Switching Between the Table and Chart Views

Every cross-tab can be viewed as either a table or a chart. The Table / Chart toggle in the top right of the report switches between the two views without changing the underlying data, filters, or series configuration.

The chart view is most useful when you want to see the shape of a distribution at a glance, share a finding in a meeting, or drop a chart straight into a deck. The table view remains the right choice when you need to read exact percentages or scan residual t-test highlights across many cells.

For single-select questions, the chart view renders as a 100% stacked bar with one row per variable value. Each series segment is labeled with its percentage, so you can compare the composition of each group without losing precision:

Cross-tab chart view with stacked rows showing brand choice by annual income

For multi-select questions, where respondents can choose more than one option and percentages add up to more than 100%, the chart renders as stacked columns — one column per option with series segments broken out by the row variable. This makes it easy to see which options over-index for a particular segment:

Cross-tab chart view with stacked columns showing brand discovery by age

The chart view respects the same Variable, Series, and Charts for selectors as the table view, so you can pivot across rows and columns without leaving the chart. Use the Download button at the top right to export the current view as an image for use in presentations.