Server-Side File Delivery & Format Requirements

This article covers everything you need to know about preparing and uploading exposure data files for an S3 snapshot source. If you haven't created your S3 snapshot source yet, start with Creating an S3 Snapshot Exposure Source.

Delivery method

You upload .csv.gz files directly into MX8 Labs' S3 bucket. During onboarding, MX8 Labs grants your AWS IAM role PutObject permission into a dedicated prefix (folder) within the bucket. Once that's in place, you can push files on your own schedule.
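As a rough illustration of the upload step, the sketch below sends one file with a single PutObject request. The bucket name and prefix are hypothetical placeholders (MX8 Labs supplies the real values during onboarding), and the helper names `build_key` and `upload_exposure_file` are made up for this example. The client is passed in as an argument so the function can be exercised without AWS access.

```python
from pathlib import Path

# Hypothetical placeholders: the real bucket and prefix come from onboarding.
BUCKET = "example-mx8-bucket"
PREFIX = "exposures/your-company/"


def build_key(prefix: str, local_path: str) -> str:
    """Join the dedicated prefix and the local file name into an S3 object key."""
    return prefix.rstrip("/") + "/" + Path(local_path).name


def upload_exposure_file(s3_client, local_path: str,
                         bucket: str = BUCKET, prefix: str = PREFIX) -> str:
    """Upload one .csv.gz file with a single PutObject call and return its key."""
    if not local_path.endswith(".csv.gz"):
        raise ValueError("expected a .csv.gz file")
    key = build_key(prefix, local_path)
    with open(local_path, "rb") as body:
        # put_object is the boto3 S3 client method that maps to the PutObject API.
        s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return key
```

With boto3 installed and credentials for the onboarded IAM role, passing `boto3.client("s3")` as `s3_client` would perform the real upload. A single `put_object` call is used here because it relies only on the PutObject permission described above.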

Delivery frequency

Daily delivery works best. Multiple drops per day are fine too — there's no limit on how often you upload. Choose a cadence that aligns with your data pipeline.

File format

Every file you upload must follow these specifications:

  • File type: .csv.gz (gzip-compressed CSV)
  • Encoding: UTF-8
  • Line endings: Unix (\n)
  • Delimiter: comma (,)
  • Header row: required
  • Row structure: one row per respondent exposure
  • Max file size: 2 GB compressed; split larger datasets across multiple files

Required columns

Your CSV must include columns that map to the following required fields. Your column names do not need to match MX8 Labs' field names — you specify the mapping when creating the source.

| Required Field | Description |
| --- | --- |
| exposed_ip_address or hashed_ip_address | Plaintext IPv4/IPv6 address, or a hashed IP. Provide one or the other. |
| uid | Your user identifier, up to 128 characters. Appended to respondent records during matching. |
| brand | The brand name the user was exposed to, up to 128 characters. |
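A pre-upload check along these lines can catch rows that break the field rules before they leave your pipeline. This is a minimal sketch, not MX8 Labs' own validation: the function name `validate_row` is hypothetical, treating empty values as errors is an assumption, and the 128-character limit is counted with Python's `len`, which may differ from how the platform counts.

```python
import ipaddress

MAX_LEN = 128  # limit stated for both uid and brand


def validate_row(ip: str, uid: str, brand: str, ip_is_hashed: bool = False) -> list[str]:
    """Return a list of problems found in one exposure row (empty if none)."""
    problems = []
    if not ip:
        problems.append("missing IP value")
    elif not ip_is_hashed:
        try:
            ipaddress.ip_address(ip)  # accepts both IPv4 and IPv6 literals
        except ValueError:
            problems.append(f"not a valid IPv4/IPv6 address: {ip!r}")
    # Hashed IPs are not checked further: the hash format is not specified here.
    for name, value in (("uid", uid), ("brand", brand)):
        if not value:
            problems.append(f"missing {name}")
        elif len(value) > MAX_LEN:
            problems.append(f"{name} longer than {MAX_LEN} characters")
    return problems
```

For example, `validate_row("203.0.113.7", "u-001", "Acme")` returns an empty list, while an out-of-range address such as `"999.1.1.1"` produces one problem entry.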

You can include additional columns beyond these; MX8 Labs ignores any column that isn't mapped. When setting up the source, map each required field to the corresponding column in your files.
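The format rules and required columns can be combined into one small writer. The sketch below uses only the Python standard library; the function name `write_exposures` and the column names in the usage note are hypothetical, since your own column names are mapped when the source is created.

```python
import csv
import gzip


def write_exposures(path: str, rows: list[dict], fieldnames: list[str]) -> None:
    """Write rows as a gzip-compressed, UTF-8, comma-delimited CSV with a header row."""
    # newline="" stops Python from translating line endings, so the csv writer
    # emits exactly the lineterminator given below (Unix "\n").
    with gzip.open(path, "wt", encoding="utf-8", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames,
                                delimiter=",", lineterminator="\n")
        writer.writeheader()   # the header row is required
        writer.writerows(rows)  # one row per respondent exposure
```

A call such as `write_exposures("exposures.csv.gz", [{"ip": "203.0.113.7", "user_id": "u-001", "brand_name": "Acme"}], ["ip", "user_id", "brand_name"])` produces a two-line file: the header, then one exposure row. Setting `lineterminator` explicitly matters because the `csv` module defaults to `\r\n`.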

File naming

There's no required naming convention. That said, we recommend including a timestamp in each filename (e.g., exposures_2026-03-10T14-00.csv.gz) to avoid accidentally overwriting a previous upload and to make troubleshooting easier.
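One way to generate names in the style of that example is sketched below. The helper `timestamped_name` is hypothetical, and defaulting to UTC is an assumption rather than a stated requirement.

```python
from datetime import datetime, timezone
from typing import Optional


def timestamped_name(now: Optional[datetime] = None, stem: str = "exposures") -> str:
    """Build a name such as exposures_2026-03-10T14-00.csv.gz."""
    if now is None:
        now = datetime.now(timezone.utc)  # assumption: UTC keeps names consistent across hosts
    # Hyphens replace colons in the time part, matching the example name above.
    return f"{stem}_{now.strftime('%Y-%m-%dT%H-%M')}.csv.gz"


print(timestamped_name(datetime(2026, 3, 10, 14, 0)))  # exposures_2026-03-10T14-00.csv.gz
```

Minute resolution means two files created in the same minute would share a name, so pipelines that upload more often than that may want to add seconds or a sequence number.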

Duplicates and late data

If MX8 Labs receives duplicate rows (same UID and IP on the same day), it keeps the row from the file with the latest timestamp. If you discover errors in a previously uploaded file, upload a corrected version; the newer file's data takes precedence.
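To reason about which row survives, the rule can be modeled locally. The sketch below is an illustration under stated assumptions, not MX8 Labs' implementation: each row is a dict carrying a hypothetical `file_ts` field (the timestamp of the file it came from), and "same day" is taken to mean the calendar date of that timestamp, which this article does not define precisely.

```python
from datetime import datetime


def dedupe(rows: list[dict]) -> list[dict]:
    """Keep one row per (uid, ip, day), preferring the latest file timestamp."""
    latest: dict[tuple, dict] = {}
    for row in rows:
        ts: datetime = row["file_ts"]
        key = (row["uid"], row["ip"], ts.date())
        # Replace the stored row only if this one comes from a newer file.
        if key not in latest or ts > latest[key]["file_ts"]:
            latest[key] = row
    return list(latest.values())
```

Under this model, uploading a corrected file later the same day replaces the earlier rows for the same UID and IP, while rows from a different day are kept separately.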

What not to include

Do not include PII beyond the specified fields. This means no email addresses, device IDs, or personal names. The only identifying information in your files should be the UID, brand, and IP address (plaintext or hashed).
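A simple scan can help catch one of these cases, email addresses, before a file is uploaded. This is a heuristic sketch only: the function name `columns_with_emails` and the pattern are illustrative, and the check does not detect device IDs or personal names.

```python
import re

# Loose pattern: something@something.something, with no spaces or extra "@".
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def columns_with_emails(rows: list[dict]) -> set[str]:
    """Return the names of columns in which any value looks like an email address."""
    flagged = set()
    for row in rows:
        for column, value in row.items():
            if isinstance(value, str) and EMAIL_RE.search(value):
                flagged.add(column)
    return flagged
```

Any column this returns is worth dropping or reviewing before the file is written.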

Next steps