Voice Messages - botBrains Docs

Audio support enables your AI agent to understand spoken input, allowing customers to communicate through voice messages for a more natural and efficient support experience.

Why Audio Input Matters

Voice message capabilities transform how customers interact with your AI:

Convenience - Customers can communicate while walking, driving (hands-free), or in situations where typing is inconvenient.
Complex Explanations - Users can verbally explain nuanced problems that would require lengthy text descriptions, making it easier to articulate complex issues.
Accessibility - Voice input particularly benefits users with visual impairments, mobility limitations or conditions like arthritis, or limited literacy skills in their preferred language.
Natural Communication - Speaking feels more natural than typing for many users, leading to more conversational and detailed queries.
Mobile Optimization - Voice is often the preferred input method on mobile devices where typing is cumbersome.

How Audio Input Works

Customer records - User presses microphone button and speaks their question
Audio capture - Browser or app captures audio stream
Speech-to-text - Audio is accurately transcribed to text
AI processing - The AI understands spoken language, identifies tone, and processes verbal descriptions
Text response - The AI generates a text response, which users with visual impairments can have read aloud by a screen reader

Audio input is currently one-way: customers can send voice messages, but the AI responds with text rather than audio.
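
The flow above can be sketched as a small pipeline. This is only an illustration under stated assumptions: `VoiceMessage`, `Transcript`, `Transcriber`, `Responder`, and `handleVoiceMessage` are hypothetical names, not part of the botBrains API. Both steps are modeled as synchronous functions for brevity, whereas real speech-to-text and AI calls would be asynchronous.

```typescript
// Hypothetical types for illustration; not part of any botBrains SDK.
interface VoiceMessage {
  audio: Uint8Array; // captured audio bytes
  mimeType: string;  // container format reported by the recorder, e.g. "audio/webm"
}

interface Transcript {
  text: string;
  language: string;  // detected language code, e.g. "en"
}

type Transcriber = (message: VoiceMessage) => Transcript;
type Responder = (transcript: Transcript) => string;

interface TextReply {
  kind: "text";      // replies are always text because audio input is one-way
  transcript: string;
  body: string;
}

// Runs the steps in order: transcribe the audio, then generate a text reply.
function handleVoiceMessage(
  message: VoiceMessage,
  transcribe: Transcriber,
  respond: Responder
): TextReply {
  if (message.audio.length === 0) {
    throw new Error("Voice message contains no audio");
  }
  const transcript = transcribe(message);
  return { kind: "text", transcript: transcript.text, body: respond(transcript) };
}
```

Passing the transcriber and responder in as parameters keeps the sketch independent of any particular speech-to-text or AI provider.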

Multilingual Support

Audio input works across 24+ languages, enabling broader accessibility and international customer support:

Automatic language detection
Matches your configured agent language
Supports multiple languages if your agent is multilingual

Supported Channels

Website Widget:

Microphone button available in chat interface
Browser-based audio capture
Works on desktop and mobile browsers
Requires microphone permissions from browser

Mobile SDKs:

Native audio capture on iOS and Android
Optimized for mobile device microphones
Background noise reduction

Integration Channels:

Zendesk
Salesforce
Slack
Other integrated platforms (where supported)

Use Cases

Customer Support

Hands-Free Assistance:

Customer: [Speaking while driving] "I need to check the status of
my order. I ordered a blue backpack last week."

AI: [Text response] "I can help you check your order status.
Could you provide your order number or the email address used
for the purchase?"

Complex Problem Descriptions:

Customer: [Voice message] "My washing machine is making this weird
grinding noise during the spin cycle, and it's been getting louder
over the past week. It only happens when I use the heavy-duty setting."

AI: [Text response] "Based on your description, the grinding noise
during the spin cycle on heavy-duty mode could indicate a few issues..."

Mobile-First Scenarios

On-the-Go Support:

Customers walking or commuting
Users without a keyboard handy
Quick questions while multitasking

Accessibility Priority:

Users with visual impairments using voice + screen readers
Mobility-limited customers who find typing difficult
Elderly users more comfortable with speaking

Configuration

Enable audio input for your deployment:

Go to Deploy → [Your Deployment] → Settings
Enable Audio Input
Configure language support
Test with sample voice messages
Deploy to production

Guidance Considerations

Train your AI to handle voice-specific scenarios:

"When customers describe problems verbally, they may include
more context and emotion than typed messages. Pay attention to
tone and ask clarifying questions if the description is unclear."

"For users with accessibility needs, ensure responses are
clear and well-structured for screen reader compatibility."

Best Practices

Clear Instructions

Guide users on how to use voice input:

"Click the microphone icon and speak your question clearly"
"Press and hold to record, release to send"
"Speak naturally - I'll transcribe and understand your message"

Fallback Options

Always provide text input as an alternative:

Some users prefer typing
Audio may not work in all environments (noisy locations, poor connectivity)
Privacy concerns in public spaces
Some situations require written documentation

Audio Quality Tips

Inform users about optimal recording conditions:

"For best results, speak in a quiet environment"
"If your message wasn't transcribed correctly, you can try again
or type your question instead"
"Hold your device's microphone close to your mouth"

Response Formatting

Structure responses for clarity when read by screen readers:

Use clear, concise sentences
Organize information with bullet points
Avoid complex formatting that may not read well audibly
Include important information at the beginning

Technical Considerations

Privacy and Security

Audio Data Processing:

Audio temporarily processed for transcription
Text is stored, audio typically not retained
Configurable data retention policies
GDPR and privacy law compliance

User Consent:

Microphone permission required from browser
Clear privacy notices in your deployment
User control over when to enable microphone

Safety Measures

Content Moderation:

Malware and virus scanning for uploaded files
Content moderation detecting inappropriate content
Reporting mechanisms for policy violations
Team protection from inappropriate material

Performance

Transcription Quality:

Accurate transcription across 24+ languages
Tone and emotion detection
Handles various accents and speaking styles
Background noise reduction (especially on mobile)

Latency:

Speech-to-text processing adds 1-3 seconds
Overall conversation feels natural
Optimized for real-time interaction

Bandwidth:

Audio streaming requires stable connection
Automatic quality adjustment
Fallback to text on poor connections

Browser Compatibility

Requirements:

Modern browsers (Chrome, Firefox, Safari, Edge)
HTTPS required for microphone access
Mobile browser support
Permissions must be granted by user
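
The requirements above can be checked before showing a microphone button. The following is a minimal sketch, not botBrains widget code: `AudioEnvironment` and `checkAudioSupport` are hypothetical names, and the environment is passed in as plain values so the logic can run outside a browser.

```typescript
// Minimal view of the browser environment that this check needs.
interface AudioEnvironment {
  isSecureContext: boolean; // true when the page is served over HTTPS (or from localhost)
  hasGetUserMedia: boolean; // true when navigator.mediaDevices.getUserMedia is available
}

type AudioSupport =
  | { supported: true }
  | { supported: false; reason: "insecure-context" | "no-capture-api" };

// Reports whether microphone capture can be attempted at all.
// The user must still grant permission when capture actually starts.
function checkAudioSupport(env: AudioEnvironment): AudioSupport {
  if (!env.isSecureContext) {
    return { supported: false, reason: "insecure-context" };
  }
  if (!env.hasGetUserMedia) {
    return { supported: false, reason: "no-capture-api" };
  }
  return { supported: true };
}
```

In a browser, the two fields could be filled from `window.isSecureContext` and from a check that `navigator.mediaDevices` exists and exposes `getUserMedia`. When the result is unsupported, the chat can simply hide the microphone button and keep text input available.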

Frequently Asked Questions

Microphone not working

Possible causes:

Browser blocked microphone permission
No microphone connected
Microphone used by another application
Not using HTTPS connection
Browser doesn’t support audio input

Solutions:

Check browser permission settings
Grant microphone access when prompted
Close other apps using the microphone
Ensure the website uses HTTPS
Try a different browser
Test microphone with other applications
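
Several of these causes surface in the browser as named errors when `navigator.mediaDevices.getUserMedia` rejects. The sketch below maps the standard error names to user-facing hints. `microphoneErrorHint` and the hint wording are illustrative assumptions, not messages produced by botBrains.

```typescript
// Maps the `name` of a getUserMedia rejection to a hint for the user.
// The error names are standard DOMException names; the hint text is an example.
function microphoneErrorHint(errorName: string): string {
  switch (errorName) {
    case "NotAllowedError":
      // Permission was denied by the user or by browser settings.
      return "Microphone permission was blocked. Check your browser permission settings.";
    case "NotFoundError":
      // No audio input device is available.
      return "No microphone was found. Connect one and try again.";
    case "NotReadableError":
      // The device exists but could not be read, often because another app holds it.
      return "The microphone could not be read. Close other apps that may be using it.";
    case "SecurityError":
      // Media capture is disabled for this page.
      return "Microphone access is disabled for this page.";
    default:
      return "Voice input is unavailable. Try a different browser or type your question instead.";
  }
}
```

A caller would typically pass `error.name` from the rejected promise to this function and display the returned hint next to the text input, so the customer always has a fallback.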

Poor transcription quality

Possible causes:

Background noise interference
Speaking too quietly or too fast
Strong accent or dialect
Poor microphone quality
Unstable internet connection

Solutions:

Move to quieter environment
Speak clearly and at moderate pace
Position microphone closer to mouth
Use headset microphone for better quality
Check internet connection stability
Try typing complex terms or proper nouns

Voice message not sending

Possible causes:

Internet connection interrupted
Audio file size too large
Browser compatibility issue
Microphone permission revoked

Solutions:

Check internet connection
Try shorter voice messages
Refresh page and grant permissions again
Update browser to latest version
Use text input as fallback

Language not recognized correctly

Possible causes:

Speaking multiple languages in one message
Language settings mismatch
Strong accent or dialect

Solutions:

Stick to one language per message
Verify language settings in deployment
Speak more clearly or try text input
Contact support if issues persist

Example Scenarios

Technical Support

Customer scenario: User experiencing software error while commuting
Voice message: “Hey, I’m getting an error message when I try to log in. It says ‘Connection timeout’ or something like that. I’m on my phone using cellular data.”
AI processes: Transcribes message, identifies error type, considers context (mobile, cellular)
AI response: “I can help with that login error. The ‘Connection timeout’ message usually occurs when there’s a network connectivity issue. Since you’re on cellular data, let’s try these steps…”

Product Information

Customer scenario: User shopping while holding items
Voice message: “I’m looking at this blue backpack. Does it have a laptop compartment? And is it waterproof?”
AI processes: Transcribes questions, identifies product context, prepares detailed response
AI response: “Let me provide details about that blue backpack. Yes, it features a dedicated laptop compartment that fits devices up to 15 inches…”

Order Status

Customer scenario: User checking order while driving (hands-free)
Voice message: “Can you tell me where my order is? I ordered it last Tuesday. The order number is… uh… B dash 1 2 3 4 5.”
AI processes: Transcribes order inquiry, extracts order number, handles uncertainty (“uh”)
AI response: “I found your order B-12345 from last Tuesday. It’s currently in transit and expected to arrive on Thursday…”
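
The order-number handling in this scenario, where the spoken "B dash 1 2 3 4 5" becomes B-12345, can be illustrated with a small normalization step. This is a hedged sketch of the idea only: `normalizeSpokenId`, the filler list, and the symbol table are assumptions, and the actual agent may extract identifiers quite differently.

```typescript
// Words to drop: hesitation sounds that speech-to-text may transcribe literally.
const FILLER_WORDS = ["uh", "um", "er", "erm"];

// Spoken words that stand for a symbol in an identifier.
const SPOKEN_SYMBOLS: { [word: string]: string } = {
  dash: "-",
  hyphen: "-",
  minus: "-",
};

// Turns a spelled-out identifier such as "B dash 1 2 3 4 5" into "B-12345".
function normalizeSpokenId(spoken: string): string {
  const parts: string[] = [];
  for (const rawToken of spoken.split(/\s+/)) {
    // Strip punctuation attached to the token, e.g. the ellipsis in "uh…".
    const token = rawToken
      .replace(/^[^A-Za-z0-9]+|[^A-Za-z0-9]+$/g, "")
      .toLowerCase();
    if (token === "" || FILLER_WORDS.indexOf(token) !== -1) {
      continue;
    }
    const isSymbolWord = Object.prototype.hasOwnProperty.call(SPOKEN_SYMBOLS, token);
    parts.push(isSymbolWord ? SPOKEN_SYMBOLS[token] : token);
  }
  return parts.join("").toUpperCase();
}
```

The function expects only the identifier portion of the transcript. Locating that portion inside a longer sentence is a separate extraction problem that this sketch does not address.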

Next Steps

Now that you understand audio support:

Vision - Enable image understanding for visual problems
Documents - Allow customers to attach files and documents
Escalations - Configure handoff when voice descriptions need human review
Deploy - Set up audio-enabled deployments
Website Integration - Add audio input to your website widget

Voice message capabilities create more natural, accessible customer interactions. Enable audio input strategically based on your users’ needs and usage contexts.
