SPSS Export Format

1. Overview

The SPSS format presents data in the same wide structure as the Wide format export (one row per respondent, one column per question) but also includes additional metadata for each variable. This makes it the preferred choice for researchers using SPSS, Stata, or other statistical tools that can read SPSS files.

MX8 Labs supports two SPSS file formats:

  • ZSAV (.zsav) - The compressed, modern SPSS format. Recommended for current versions of SPSS (v21 and later) and most modern statistical tools. Produces smaller files that are faster to download and transfer.
  • SAV (.sav) - The legacy SPSS format. Use this option if you are working with older versions of SPSS or third-party tools that do not support the compressed ZSAV format.

Both formats contain identical data and metadata; the only differences are file size and compatibility.
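If a file has lost its extension, its header can indicate which of the two formats it is. The following is a minimal sketch, assuming the publicly documented SPSS system-file layout in which the first four bytes are `$FL2` for .sav and `$FL3` for ZLIB-compressed .zsav. The function names are hypothetical and not part of any MX8 Labs tooling.

```python
# Magic bytes at the start of an SPSS system file (assumption: ASCII header,
# as described in the PSPP system file format documentation).
_MAGIC = {b"$FL2": "sav", b"$FL3": "zsav"}


def format_from_header(header):
    """Return "sav", "zsav", or None from the leading bytes of a file."""
    return _MAGIC.get(bytes(header[:4]))


def detect_spss_format(path):
    """Open a file and classify it by its first four bytes."""
    with open(path, "rb") as fh:
        return format_from_header(fh.read(4))


print(format_from_header(b"$FL3@(#) SPSS DATA FILE"))  # zsav
```

A result of None means the file is not a recognizable SPSS system file, for example a CSV export saved with the wrong extension.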

2. File Structure & Layout
  • Each row corresponds to one respondent.
  • Each column corresponds to one question or metadata field.
  • Variable names follow the same convention as Wide format (e.g., V001_respondent_id, V002_Recent_Restaurant_Visit).
Example (first 5 columns):

| V001_respondent_id | V002_Recent_Restaurant_Visit | V003_Age | V004_Gender | V005_Ethnicity |
| --- | --- | --- | --- | --- |
| 0097ac15-868c-6608-25fc-c0fe2cd884a8 | Yes | 35 | Female | White |
| 012a5d8b-9dc4-2132-782d-73742be4088f | Yes | 21 | Female | White |
| 022b6b43-6304-2dc5-0830-10d98bd7dee3 | Yes | 18 | Male | White |
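When processing the export in a script, it can help to separate the numeric prefix of a variable name from its descriptive part. The sketch below assumes the pattern seen in the examples above ("V", a numeric index, an underscore, then a name); the helper `split_variable_name` is hypothetical.

```python
import re

# Pattern assumed from the example names: "V", digits, "_", then the name.
_VAR_NAME = re.compile(r"^V(\d+)_(.+)$")


def split_variable_name(name):
    """Split 'V002_Recent_Restaurant_Visit' into (2, 'Recent_Restaurant_Visit')."""
    match = _VAR_NAME.match(name)
    if match is None:
        raise ValueError(f"not a V###_ style variable name: {name!r}")
    return int(match.group(1)), match.group(2)


columns = ["V003_Age", "V001_respondent_id", "V002_Recent_Restaurant_Visit"]
# Order the columns by their numeric index.
ordered = sorted(columns, key=lambda c: split_variable_name(c)[0])
print(ordered[0])  # V001_respondent_id
```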
3. Key Columns
  • V001_respondent_id - Unique identifier for each respondent.
  • Question columns (V###_...) - Same columns as in the Wide format, with SPSS metadata attached (variable labels, value labels, measurement levels).
  • Weight column (e.g., V171_weight) - Statistical weight for the respondent.
  • Timing columns (e.g., V172_start_time, V173_end_time) - Survey start and completion timestamps.
4. Data Representation
Single-choice questions

Stored as one column per question with the selected answer recorded.

Multi-choice questions

Each option is represented as a separate column. A value of -1 means the option was not selected; any other value indicates the order in which the respondent selected that option (1 = selected first, 2 = selected second, 3 = selected third, etc.). This preserves the sequence of selections, which can be useful for analyzing which options respondents considered first.

If the respondent selects an exclusive option (such as "None of the above" or "Prefer not to say") after previously selecting one or more non-exclusive options, the exclusive option is recorded with a value reflecting the order in which it was clicked, and all previously selected non-exclusive options are deselected at that point (reset to -1). Only the exclusive option will appear as selected in the final data.

The export file does not include multiple response sets. Consider defining them in SPSS (MRSETS) to simplify reporting on multi-choice questions.
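The selection-order coding can be read back in two ways: as an ordered list of the options a respondent picked, or as simple selected / not selected flags. The sketch below illustrates both for a single respondent. It assumes the raw codes are still present in the data (some readers convert flagged missing values such as -1 to NaN by default), and the column names and helper functions are hypothetical.

```python
def _is_selected(rank):
    """A selected option holds a rank of 1 or more; -1, None, and NaN do not."""
    return isinstance(rank, (int, float)) and rank >= 1


def selection_order(option_values):
    """Return option names sorted by the order in which they were selected."""
    chosen = {name: rank for name, rank in option_values.items() if _is_selected(rank)}
    return sorted(chosen, key=chosen.get)


def to_dichotomies(option_values):
    """Collapse selection order into 1 (selected) / 0 (not selected) flags."""
    return {name: int(_is_selected(rank)) for name, rank in option_values.items()}


# Hypothetical respondent row for one multi-choice question with three options.
row = {"V010_Pizza": 2, "V011_Burgers": -1, "V012_Sushi": 1}
print(selection_order(row))  # ['V012_Sushi', 'V010_Pizza']
print(to_dichotomies(row))   # {'V010_Pizza': 1, 'V011_Burgers': 0, 'V012_Sushi': 1}
```

The 0/1 form discards the ordering but is the shape most tools expect when counting how many respondents chose each option.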

Numeric questions

Stored directly as numeric values. "Don't know" responses are coded as -999 and flagged as an SPSS missing value, so they are automatically excluded from means, medians, and other numeric summaries.
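The missing-value flag matters mostly when the data leaves SPSS-aware tools, for example after conversion to another format or when a reader is configured to keep the raw codes. The following sketch shows how the -999 sentinel would distort a mean if it were not excluded; the sample values and the helper name are hypothetical.

```python
from statistics import mean

DONT_KNOW = -999  # sentinel code for "Don't know" described above


def mean_excluding_dont_know(values):
    """Average numeric answers, skipping the -999 sentinel and empty (None) cells."""
    valid = [v for v in values if v is not None and v != DONT_KNOW]
    return mean(valid) if valid else None


ages = [35, 21, DONT_KNOW, 18]  # hypothetical answers
print(mean(ages))                      # -231.25, distorted by the sentinel
print(mean_excluding_dont_know(ages))  # average of 35, 21, and 18 only
```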

Open-end questions

SPSS exports include only the coded values for open-end questions, not the raw verbatim responses. This keeps the .sav / .zsav file aligned with a statistical analysis workflow where categorical codes are more useful than free text.

  • Each coded open-end appears as a labeled numeric variable, with value labels mapping each code to its human-readable category.
  • Until the open-end field has been closed (i.e., coding is finalized for the field), all responses in that variable are marked as "To be classified" in the export. Once the field is closed, the export reflects the final code assignments.
  • If you need the raw, uncoded verbatim responses, download the data in CSV or one of the other non-SPSS formats, which include the full response text.

⚠️ Note: Because raw open ends and recoded variables are excluded from the SPSS format, use the Long or CSV formats when you need verbatim text or recoded variables alongside the coded values.
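Before analyzing a coded open-end, it may be worth checking whether the field has been closed yet. The sketch below flags a variable whose responses all still carry the "To be classified" label. The numeric codes and category names are hypothetical; only the placeholder label comes from the description above.

```python
TO_BE_CLASSIFIED = "To be classified"  # placeholder label described above


def coding_is_pending(codes, value_labels):
    """True if every non-empty response still maps to the placeholder label."""
    labels = {value_labels.get(code) for code in codes if code is not None}
    return labels == {TO_BE_CLASSIFIED}


# Hypothetical value labels for one coded open-end variable.
value_labels = {0: "To be classified", 1: "Price", 2: "Service"}
print(coding_is_pending([0, 0, None, 0], value_labels))  # True
print(coding_is_pending([1, 2, None, 1], value_labels))  # False
```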

5. Metadata Provided

The SPSS format includes additional metadata that makes analysis easier:

  • Variable labels - Full question text (e.g., "What is your gender?").
  • Value labels - Mappings of codes to human-readable labels (e.g., 1=Male, 2=Female, 3=Non-Binary).
  • Measurement levels - Nominal, ordinal, scale, etc., depending on the question type.
  • Missing value definitions - Explicitly marked missing values (e.g., -1 = Not selected).
  • Variable types - Numeric, string, date/time.

This metadata ensures the dataset is analysis-ready in SPSS and other statistical software.
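Outside SPSS, the same metadata can be inspected programmatically. The following is a sketch assuming the third-party pyreadstat package, whose `read_sav` function reads both .sav and .zsav files and returns a data frame together with a metadata object. The file name, helper names, and the stand-in values are hypothetical; the stand-in only mimics the metadata attributes so the example runs without a file.

```python
from types import SimpleNamespace


def load_spss(path):
    """Read a .sav or .zsav export and return (DataFrame, metadata)."""
    import pyreadstat  # third-party package, imported lazily

    # user_missing=True keeps sentinel codes such as -1 in the data instead of
    # converting them to NaN, and fills in the missing-range metadata.
    return pyreadstat.read_sav(path, user_missing=True)


def describe_variable(meta, name):
    """Collect the label, value labels, measure, and missing ranges of a variable."""
    return {
        "label": meta.column_names_to_labels.get(name),
        "value_labels": meta.variable_value_labels.get(name, {}),
        "measure": meta.variable_measure.get(name),
        "missing": meta.missing_ranges.get(name, []),
    }


# Stand-in for the metadata object (hypothetical values).
# With a real export you would use: df, meta = load_spss("export.zsav")
demo_meta = SimpleNamespace(
    column_names_to_labels={"V004_Gender": "What is your gender?"},
    variable_value_labels={"V004_Gender": {1: "Male", 2: "Female", 3: "Non-Binary"}},
    variable_measure={"V004_Gender": "nominal"},
    missing_ranges={"V004_Gender": [{"lo": -1, "hi": -1}]},
)
info = describe_variable(demo_meta, "V004_Gender")
print(info["label"], "->", info["value_labels"][2])  # What is your gender? -> Female
```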

6. Missing & Special Values
  • -1 typically denotes unselected or non-applicable options.
  • Empty cells may represent skipped questions.
  • "Prefer not to say" appears as a standard category.
  • All missing values, including any negative sentinel codes (e.g., -1, -2, -99), are flagged as SPSS missing values in the export. As a result, SPSS, Stata, and other compatible tools automatically exclude these values from calculations such as means, frequencies, and cross-tabs, and you do not need to define missing values manually.
7. Weighting
  • Apply the weight column in analysis to ensure results reflect the target population.
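To illustrate what applying the weight column means arithmetically, here is a minimal sketch of a weighted mean and a weighted share. The sample values and function names are hypothetical, and the sketch assumes sentinel codes have already been removed from the values.

```python
def weighted_mean(values, weights):
    """Weighted average, skipping respondents with an empty (None) value."""
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no valid weighted observations")
    return sum(v * w for v, w in pairs) / total_weight


def weighted_share(answers, weights, category):
    """Weighted proportion of respondents whose answer equals `category`."""
    total = sum(weights)
    return sum(w for a, w in zip(answers, weights) if a == category) / total


ages = [35, 21, 18]                     # hypothetical V003_Age values
genders = ["Female", "Female", "Male"]  # hypothetical V004_Gender values
weights = [1.2, 0.8, 1.0]               # hypothetical V171_weight values

print(round(weighted_mean(ages, weights), 2))               # 25.6
print(round(weighted_share(genders, weights, "Female"), 3))  # 0.667
```

Within SPSS itself, the equivalent step is to switch weighting on with the weight variable before running the analysis.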
8. Best Practices
  • Use the built-in metadata in SPSS/Stata to reduce manual labeling.
  • Leverage variable labels to quickly identify questions.
  • Use value labels to decode numeric response values.
  • Match V### codes with reporting_id from the Long format if you need to cross-reference.
9. Stacked Exports

The SPSS format supports stacked exports, where the data is organized by the tags assigned to each question in the survey editor. In a stacked export, each respondent row is repeated for every tag group, and only the variables belonging to that tag are included alongside the respondent identifier and metadata variables (weight, timing). All SPSS metadata — variable labels, value labels, and measurement levels — is preserved for each variable in the stacked output.

To generate a stacked export, select the Stacked option in the download dialog. The resulting file will contain a Tag variable indicating which tag group each row belongs to.

This is particularly useful when you want to run separate analyses on different sections of a survey (e.g., "Brand Awareness" vs. "Purchase Intent") without manually subsetting the data.
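Once a stacked file is loaded, the Tag variable can be used to separate the rows into one group per tag. The sketch below shows this with plain dictionaries standing in for rows; the question column names and values are hypothetical, and only the Tag variable name comes from the description above.

```python
from collections import defaultdict


def split_by_tag(rows, tag_column="Tag"):
    """Group stacked-export rows by the value of their Tag variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[tag_column]].append(row)
    return dict(groups)


# Hypothetical stacked rows: each respondent appears once per tag group.
rows = [
    {"V001_respondent_id": "r1", "Tag": "Brand Awareness", "V020_Aware": 1},
    {"V001_respondent_id": "r1", "Tag": "Purchase Intent", "V030_Intent": 4},
    {"V001_respondent_id": "r2", "Tag": "Brand Awareness", "V020_Aware": 2},
    {"V001_respondent_id": "r2", "Tag": "Purchase Intent", "V030_Intent": 5},
]
by_tag = split_by_tag(rows)
print(sorted(by_tag))                  # ['Brand Awareness', 'Purchase Intent']
print(len(by_tag["Brand Awareness"]))  # 2
```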

10. When to Use SPSS Format
  • For researchers working in SPSS, Stata, or other statistical software.
  • When you want metadata-rich survey data without needing to import a separate codebook.
  • For advanced statistical analysis requiring labeled variables and categories.
  • When using the stacked option, for analyzing tagged question groups with full variable and value labels intact.