Setting up a Twilio Text-to-Caller respondent source

Twilio Text-to-Caller combines two channels in one respondent source. You first reach respondents by SMS with the survey's opening question, and once they reply, the platform places an AI-powered voice survey call to complete the interview. This approach pairs the high open rates of text with the richer, conversational data you get from a Twilio Voice survey.

The source builds on interactive Twilio text, so the SMS contact handling, opt-out behavior, and delivery window all work the same way. The difference is what happens after a respondent replies: instead of continuing over SMS, the platform calls them.

How respondents trigger the call

The respondent answers the first question by replying to the SMS. That reply does two things at once: it records their answer to question one, and it triggers the follow-up call for the rest of the survey. The call is placed as long as the reply meets three conditions:

It comes from a number on your contact list. The reply must come from a phone number in the file you uploaded (or an active test number). Replies from numbers that aren't in the source are ignored, and no call is placed.
The number hasn't opted out. A number that previously opted out is blocked and won't be called.
The reply isn't an opt-out keyword. If the message is a STOP-style word (STOP, STOPALL, UNSUBSCRIBE, CANCEL, END, or QUIT, case-insensitive), or anything Twilio flags as an opt-out, the respondent is unsubscribed and sent your stop-confirmation message instead of being called.
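
The three conditions above can be sketched as a small decision function. This is only an illustration of the rules as described here, not the platform's implementation: the Source class, the handle_reply function, the returned labels, and the order of the checks are all assumptions.

```python
from dataclasses import dataclass, field

# Opt-out keywords listed in this article; matching is case-insensitive.
OPT_OUT_KEYWORDS = {"STOP", "STOPALL", "UNSUBSCRIBE", "CANCEL", "END", "QUIT"}


@dataclass
class Source:
    """Hypothetical stand-in for the contact state of a Text-to-Caller source."""
    contacts: set[str] = field(default_factory=set)      # uploaded numbers (E.164)
    test_numbers: set[str] = field(default_factory=set)  # active test numbers
    opted_out: set[str] = field(default_factory=set)     # previously unsubscribed


def handle_reply(source: Source, from_number: str, body: str,
                 twilio_flagged_opt_out: bool = False) -> str:
    """Return 'ignore', 'blocked', 'unsubscribe', or 'call' for an inbound SMS."""
    # Condition 1: the reply must come from an uploaded or active test number.
    if from_number not in source.contacts | source.test_numbers:
        return "ignore"
    # Condition 2: a number that previously opted out is never called.
    if from_number in source.opted_out:
        return "blocked"
    # Condition 3: a STOP-style reply unsubscribes instead of triggering a call.
    if twilio_flagged_opt_out or body.strip().upper() in OPT_OUT_KEYWORDS:
        source.opted_out.add(from_number)
        return "unsubscribe"
    # Otherwise the reply is the answer to question one and triggers the call.
    return "call"
```

For example, a reply of "stop" from a listed number returns 'unsubscribe', while any other reply from the same number (before it opts out) returns 'call'.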

Because replying is how respondents both answer question one and opt into the call, write your first question so it doubles as the invitation: make it clear that texting back an answer starts the call, and that replying STOP opts them out.

How the survey is delivered: text first, then voice

A Text-to-Caller survey is delivered across two channels, and it is important to design for this split:

Question one is asked and answered by text. When the source goes live, the platform sends the first question of your survey as the opening SMS. It is not a separate, free-text invitation. Whatever you write as question one is exactly what the respondent reads, and the answer they text back is recorded as their response to it.
Every question after that is handled by voice. The respondent's reply triggers the call, and the survey continues by voice from question two onward. Question one is not repeated on the call.

This has real consequences for how you write the survey, so think carefully about the respondent's experience:

Question one has to work as a standalone text question. It is the respondent's first impression, and they answer it by typing a reply, so keep it short and clear, and make sure it can be answered in a short text message. Avoid long preambles, on-screen-only instructions, or anything that assumes the respondent can see a list of options to pick from.
Mind the handoff from text to voice. The respondent answers one question by typing and then, moments later, the rest by speaking. Make sure question one and question two read naturally as a sequence across that change of channel, so the call doesn't feel like it starts over.
Design the rest of the survey for voice. As with any Twilio Voice survey, media-based content such as images, videos, and display text is not presented over the call. Keep questions and answer options easy to follow when heard rather than read.
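
As a rough illustration of the split, the sketch below separates a question list into the opening SMS and the voice questions, and flags two things worth reviewing in question one. The function names are hypothetical, and the 160-character threshold (one GSM-7 SMS segment) is an assumed rule of thumb for "short", not a platform limit.

```python
SMS_SINGLE_SEGMENT = 160  # one GSM-7 SMS segment; used only as a rough guide


def split_by_channel(questions: list[str]) -> tuple[str, list[str]]:
    """Question one goes out by SMS; everything after it is asked by voice."""
    if not questions:
        raise ValueError("the survey needs at least one question")
    return questions[0], questions[1:]


def opening_sms_warnings(question_one: str) -> list[str]:
    """Return review notes for question one (an empty list means none)."""
    warnings = []
    if len(question_one) > SMS_SINGLE_SEGMENT:
        warnings.append("longer than one SMS segment")
    if "STOP" not in question_one.upper():
        warnings.append("does not mention replying STOP to opt out")
    return warnings
```

A check like this only covers length and the STOP mention. Whether question one and question two read naturally across the change of channel still needs a human read-through.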

Before you start

Advanced respondent sources must be enabled in your account.
You need a Twilio Account SID, Auth Token, and a Twilio phone number that is enabled for both SMS and voice. Text-to-Caller uses the same number to send messages and to place (or receive) calls, so a Messaging Service SID is not supported here.
Your contact numbers must be in E.164 format (example: +15551234567). See E.164 phone number format for the formatting rules and common country prefixes.
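
If you want to check numbers before setting up the source, a structural E.164 test is short. This is a minimal sketch: the is_e164 helper is hypothetical, and it checks only the shape of the number (a plus sign, a non-zero leading digit, and at most 15 digits in total), not whether the number exists or can receive SMS.

```python
import re

# "+" followed by 2 to 15 digits, where the first digit is not 0.
E164_PATTERN = re.compile(r"\+[1-9][0-9]{1,14}")


def is_e164(number: str) -> bool:
    """Return True if the string has the structural shape of an E.164 number."""
    return bool(E164_PATTERN.fullmatch(number))
```

With this check, +15551234567 passes, while 5551234567 (no plus sign) and "+1 555 123 4567" (spaces) do not.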

Step 1: Create a new Twilio Text-to-Caller source

Go to Sources, click Add Respondent Source, and select Twilio Text to Caller.

Fill in these Twilio-specific fields, along with the base fields for the source:

Field (UX label)	What to enter	Required
Twilio account SID	Your Twilio Account SID	Yes
Twilio from number	SMS- and voice-enabled Twilio number in E.164, used to send texts and place calls	Yes
Twilio auth token	Your Twilio Auth Token	Required on create, optional on update
Stop confirmation message	Auto-reply sent after STOP/unsubscribe	Optional
Messages per second	Outbound SMS rate limit for this source	Yes (min 1)
Voice name	The AI voice used during the survey call (alloy, ash, ballad, coral, echo, sage, shimmer, verse, marin, cedar)	Yes
Incoming calls	Allow respondents to call the Twilio number to start the survey	Optional (default on)
Background noise	Play subtle background ambience during calls for a more natural, call-center feel	Optional (default on)
Calls per second	Maximum outbound survey calls to start per second for this source	Yes (min 1)
Completion url	Redirect base URL after survey completion	Optional
Identity column	Contact file identity column name (default: Identifier)	Optional

Text-to-Caller sources currently support the US market.
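
The required fields and minimums in the table can be expressed as a quick pre-flight check. This sketch is illustrative only: the TextToCallerSettings class, its attribute names, and the validate function are hypothetical and do not correspond to a platform API.

```python
from dataclasses import dataclass

# Voice names listed in the table above.
VOICE_NAMES = {"alloy", "ash", "ballad", "coral", "echo",
               "sage", "shimmer", "verse", "marin", "cedar"}


@dataclass
class TextToCallerSettings:
    """Hypothetical container mirroring the fields in the table."""
    account_sid: str
    from_number: str
    auth_token: str
    voice_name: str
    messages_per_second: int
    calls_per_second: int
    incoming_calls: bool = True        # default on
    background_noise: bool = True      # default on
    stop_confirmation_message: str = ""
    completion_url: str = ""
    identity_column: str = "Identifier"


def validate(settings: TextToCallerSettings, creating: bool = True) -> list[str]:
    """Return a list of problems; an empty list means the settings look complete."""
    errors = []
    if not settings.account_sid:
        errors.append("Twilio account SID is required")
    if not settings.from_number:
        errors.append("Twilio from number is required")
    # The auth token is required on create and optional on update.
    if creating and not settings.auth_token:
        errors.append("Twilio auth token is required on create")
    if settings.voice_name not in VOICE_NAMES:
        errors.append(f"unknown voice name: {settings.voice_name}")
    if settings.messages_per_second < 1:
        errors.append("Messages per second must be at least 1")
    if settings.calls_per_second < 1:
        errors.append("Calls per second must be at least 1")
    return errors
```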

Step 2: Configure Twilio webhooks

A Text-to-Caller source exposes three URLs in the Twilio setup panel. Copy each one into the matching place in the Twilio Console:

SMS inbound webhook URL: set as the inbound webhook for the phone number so respondent replies reach the platform.
SMS status callback URL: set as the status callback so the platform receives SMS delivery updates.
Voice inbound webhook URL: set as the voice webhook for the phone number so the survey call and any call-backs are handled.

Save your changes in Twilio after pasting each URL.
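
If you prefer to script this instead of using the Twilio Console, the two inbound webhooks can be set on the phone number through Twilio's REST API (the IncomingPhoneNumbers resource, with its SmsUrl and VoiceUrl parameters). The sketch below only builds the request and does not send it. The build_webhook_request helper is hypothetical, the POST webhook method is an assumption, and the SMS status callback is not covered here, so set it as the setup panel instructs.

```python
import base64
import urllib.parse
import urllib.request


def build_webhook_request(account_sid: str, auth_token: str, phone_number_sid: str,
                          sms_inbound_url: str,
                          voice_inbound_url: str) -> urllib.request.Request:
    """Build (but do not send) the request that sets both inbound webhooks."""
    endpoint = (
        f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}"
        f"/IncomingPhoneNumbers/{phone_number_sid}.json"
    )
    form = urllib.parse.urlencode({
        "SmsUrl": sms_inbound_url,      # SMS inbound webhook URL from the setup panel
        "SmsMethod": "POST",            # assumption: the platform expects POST
        "VoiceUrl": voice_inbound_url,  # Voice inbound webhook URL from the setup panel
        "VoiceMethod": "POST",          # assumption: the platform expects POST
    }).encode("ascii")
    # Twilio's REST API uses HTTP Basic auth with the Account SID and Auth Token.
    credentials = base64.b64encode(f"{account_sid}:{auth_token}".encode()).decode()
    return urllib.request.Request(
        endpoint,
        data=form,
        method="POST",
        headers={"Authorization": f"Basic {credentials}"},
    )
```

Passing the returned object to urllib.request.urlopen would apply the change. The phone number SID (it starts with PN) is shown on the number's page in the Twilio Console.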

Step 3: Choose your voice settings

The Voice name dropdown selects the AI voice used during the call. Pick the one that best fits the tone and audience of your survey.

Leave Background noise on for most studies. It plays a low level of ambient sound under the call so the conversation feels like it is coming from a staffed call center rather than a silent line, which respondents tend to find more natural. It is enabled by default.

Incoming calls lets a respondent call the Twilio number back and start the survey themselves if they miss the outbound call after opting in. Inbound calls are routed through the Voice inbound webhook URL from Step 2. The Incoming calls setting is enabled by default.

Step 4: Upload contacts

Use Upload contacts file.
Supported file types: .csv, .csv.gz, .xlsx. For .xlsx, the platform reads the workbook's active sheet. You do not need to format the data as an Excel table.
Include the configured Identity column (default Identifier) and populate it with phone numbers in E.164 format (e.g. +15551234567). Numbers that are not in E.164 format are rejected. See E.164 phone number format for the formatting rules and common country prefixes.
The values in the identity column must be unique within the file. If any number appears more than once, the upload will be rejected. De-duplicate before uploading.
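
Because a single bad or duplicated number causes a rejection, it can help to check a .csv file locally before uploading. The sketch below is a hypothetical pre-upload check, not the platform's validator. It assumes a plain CSV with a header row and reports rows whose identity value is not E.164-shaped or repeats an earlier row.

```python
import csv
import io
import re

E164 = re.compile(r"\+[1-9][0-9]{1,14}")


def check_contacts(csv_text: str, identity_column: str = "Identifier") -> list[str]:
    """Return a list of problems found in the contact file (empty if none)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if identity_column not in (reader.fieldnames or []):
        return [f"missing identity column {identity_column!r}"]
    problems = []
    seen = set()
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        number = row[identity_column] or ""
        if not E164.fullmatch(number):
            problems.append(f"row {row_number}: {number!r} is not E.164")
        elif number in seen:
            problems.append(f"row {row_number}: duplicate {number!r}")
        else:
            seen.add(number)
    return problems
```

An empty result means every identity value is E.164-shaped and unique. Extra columns such as first-party data are ignored by this check.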

You can also include additional columns of first-party data alongside the identity column. Any column header that matches a question code, or that exactly matches the name of a stored variable in your survey, will be auto-matched. The value is pre-filled into the survey at runtime and is available in reporting. Matching to stored variables is exact and case-sensitive, so spell the column header identically to the stored variable.

Step 5: Test your survey

Send a safe test before going live:

Confirm your test number is not already in your real contact list.
Trigger a test from the source. You will receive the opening SMS containing question one, and once you reply, the platform will call you and run through the rest of the survey as a respondent would experience it.

Step 6: Go live

Use the Status action Go Live.
Use Pause, Restart, or Complete as needed during fielding.

Reporting

Reporting works the same as for any other respondent source. Standard reports, crosstabs, and data exports include Text-to-Caller responses. Because everything after question one is conducted over voice, media-based content such as images, videos, and display text is not presented to respondents.