Why Chat Monitoring Is Different
Chat conversations operate under unique constraints that distinguish them from traditional support tickets:
- Immediate response expectations - Users expect replies within seconds, not minutes. A 30-second delay that’s acceptable in email feels glacial in chat.
- Synchronous dialogue patterns - Chat conversations happen in real time, creating natural back-and-forth exchanges. Users don’t batch all their questions into one message; they ask, wait, respond, and iterate.
- Shorter attention spans - Users abandon chat conversations quickly if they don’t get immediate value. A slow or unhelpful initial response loses the user entirely.
- Context within single sessions - Chat users typically resolve their entire question in one continuous session, unlike ticket workflows that span days with multiple agents.
- Informal communication style - Chat favors conversational, friendly language over formal support terminology. Tone and brevity matter enormously.
- Peak hour sensitivity - Chat volume spikes during business hours and specific events. Your AI must handle sudden surges without degradation.
These differences demand specialized metrics that capture response speed, conversation flow quality, and user engagement patterns specific to real-time interactions.
Accessing Chat Performance Metrics
Navigate to Analyze → Metrics and select the General view. The General view provides comprehensive chat analytics across all real-time conversation channels.
Filtering for Chat Channels
Use the channel filter to focus specifically on chat-based conversations:
- Website Chat (Web channel) - Your website chat widget conversations. Filter to “Web” to see only these interactions.
- Slack (Slack channel) - Workspace conversations from your Slack integration. Essential for internal support or community chat.
- WhatsApp (WhatsApp channel) - WhatsApp Business conversations. Combines both synchronous and asynchronous patterns.
To analyze only chat performance, select these channels and exclude email, ticketing, and other asynchronous channels from your view.
Note: While WhatsApp conversations can be asynchronous, many users treat them like real-time chat. Monitor your WhatsApp response time patterns to determine which monitoring approach fits your use case.
Key Chat Performance Metrics
Response Time Analysis
Unlike ticketing systems where first response time is measured in hours, chat response time matters in seconds.
What to measure:
- Time from user message to AI’s first response
- Average response latency across conversations
- 95th percentile response time (catching slowest responses)
- Response time during peak hours vs. off-peak
How to measure it manually (a small scripted version of the same calculation follows these steps):
- Navigate to Analyze → Conversations
- Filter to your chat channels (Web, Slack, WhatsApp)
- Open 20-30 conversations
- Manually calculate time between user message and AI response
- Average the results to establish your baseline
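If you prefer to script the measurement instead of averaging by hand, here is a minimal sketch; the latency values are placeholders standing in for the per-response delays you noted while reviewing conversations:

```python
# Minimal sketch: average and 95th-percentile response time from a manual sample.
# The latency values below are placeholders; replace them with the delays (in
# seconds) you measured between each user message and the AI's reply.
import statistics

latencies_seconds = [3.2, 4.8, 2.1, 11.5, 6.0, 3.9, 22.4, 4.4, 5.1, 7.3]

average = statistics.mean(latencies_seconds)
p95 = statistics.quantiles(latencies_seconds, n=20)[-1]  # 95th percentile cut point

print(f"Average response time: {average:.1f}s")
print(f"95th percentile: {p95:.1f}s")
```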
Benchmarks:
- 0-5 seconds average: Excellent chat performance
- 5-10 seconds average: Good, room for optimization
- 10-20 seconds average: Fair, likely impacting satisfaction
- 20+ seconds average: Poor, investigate latency issues urgently
Conversation Resolution Rate
What it measures: Percentage of chat conversations that successfully resolve the user’s question without escalation or abandonment.
User Satisfaction in Chat (CSAT)
What it measures: Customer Satisfaction score specific to chat conversations, calculated as the percentage of 4-5 star ratings.
Benchmarks:
- 85%+ = Excellent (users love the chat experience)
- 75-85% = Good (solid chat satisfaction)
- 65-75% = Fair (chat experience needs improvement)
- Below 65% = Poor (fundamental chat quality issues)
If CSAT is low but resolution rate is high, likely causes:
- AI is answering but not conversationally
- Responses too formal or lengthy for chat
- Tone feels robotic or unhelpful
- Missing pleasantries (greetings, acknowledgment)
If CSAT is low and resolution rate is also low, likely causes:
- Knowledge gaps preventing answers
- AI escalating too frequently
- Users frustrated by lack of help
If CSAT is high but resolution rate is low, likely causes:
- AI is friendly but not helpful
- Good tone but missing information
- Users appreciate effort despite not resolving issue
Handoff Timing and Patterns
What it measures: When and why chat conversations escalate from AI to human agents.
Access via the Handoff Chart in the General view. This visualization shows escalation triggers and timing patterns.
Chat-specific handoff analysis: Look at when immediate handoffs (within the first 1-2 messages) occur:
- Business hours (9 AM - 5 PM): Expected for complex questions
- After hours/weekends: May indicate AI confidence issues when team isn’t available
- Specific days: Monday mornings, Friday afternoons may have patterns
How to reduce avoidable handoffs:
- Filter conversations by escalation trigger
- Identify most common escalation reasons
- For each reason, determine if escalation was necessary or avoidable
- Add knowledge or refine guidance to reduce avoidable escalations
- Adjust escalation rules if escalating too early or too late
Chat Abandonment Rate
What it measures: Percentage of chat conversations where the user leaves before their question is resolved or escalated.
How to calculate: Navigate to Analyze → Conversations, filter to:
- Channels: Web, Slack, WhatsApp (your chat channels)
- Status: Unresolved
- Date range: Last 7-30 days
Then divide the resulting unresolved count by your total chat conversations for the same period; a scripted version of this calculation follows the benchmarks below.
Benchmarks:
- 0-10%: Excellent (minimal abandonment)
- 10-20%: Good (reasonable abandonment for difficult questions)
- 20-35%: Fair (significant abandonment, investigate causes)
- 35%+: Poor (losing too many users mid-conversation)
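The calculation itself is a single division; a minimal sketch with placeholder counts:

```python
# Minimal sketch: abandonment rate from the filtered counts described above.
# Both numbers are placeholders; substitute your own unresolved and total chat
# conversation counts for the same channels and date range.
unresolved_chat_conversations = 38
total_chat_conversations = 240

abandonment_rate = unresolved_chat_conversations / total_chat_conversations * 100
print(f"Abandonment rate: {abandonment_rate:.1f}%")  # 15.8% with these placeholders
```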
Peak Hour Coverage
What it measures: AI performance during highest-traffic periods compared to off-peak times.
Why this matters for chat: Unlike ticketing where AI handles weekend overflow, chat AI must maintain quality during peak weekday hours when:
- Traffic volume is 3-5x higher than off-peak
- Users expect instant responses
- Multiple concurrent conversations strain resources
- Human agents are busy and can’t easily take handoffs
Identifying your peak hours: review conversation volume by day of week and hour.
- Darker colors indicate higher volume
- Identify your busiest 3-4 hour blocks
Comparing peak vs. off-peak performance (a comparison sketch follows this list):
- Select specific date ranges during peak hours
- Note CSAT, Resolution Rate, and involvement levels
- Export conversations for detailed analysis
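If you export conversations for this comparison, a sketch along these lines can summarize peak vs. off-peak performance. The file name and the hour, resolved, and rating columns are assumptions about your export format, not a documented schema:

```python
# Minimal sketch: compare peak vs. off-peak resolution and CSAT from an export.
# Assumed columns: "hour" (0-23), "resolved" (0/1), "rating" (1-5 or blank).
import csv
from statistics import mean

PEAK_HOURS = set(range(9, 13))  # substitute your own busiest 3-4 hour block

def summarize(rows):
    resolution = mean(int(r["resolved"]) for r in rows) * 100
    ratings = [int(r["rating"]) for r in rows if r["rating"]]
    csat = sum(1 for x in ratings if x >= 4) / len(ratings) * 100 if ratings else None
    return resolution, csat

with open("chat_conversations.csv", newline="") as f:
    rows = list(csv.DictReader(f))

for label, subset in (("Peak", [r for r in rows if int(r["hour"]) in PEAK_HOURS]),
                      ("Off-peak", [r for r in rows if int(r["hour"]) not in PEAK_HOURS])):
    if not subset:
        continue
    resolution, csat = summarize(subset)
    csat_text = "n/a" if csat is None else f"{csat:.0f}%"
    print(f"{label}: resolution {resolution:.0f}%, CSAT {csat_text}")
```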
Warning signs during peak hours:
Slower responses:
- AI latency increases during high traffic
- May indicate infrastructure scaling issues
- Consider optimizing data provider queries
- Review caching strategy for common questions
Lower resolution:
- Resolution rate drops during peaks
- More escalations during busy periods
- May indicate human agents too busy to take handoffs gracefully
- Could suggest AI hesitates to handle questions during peaks
Reduced AI involvement:
- Peak times see more “Not Involved” conversations
- Agents handling conversations manually without AI
- May indicate team doesn’t trust AI during critical periods
- Review guidance for peak hour scenarios
Using the General View for Chat Analysis
The General view dashboard provides several charts specifically valuable for chat monitoring:
Conversation Status Chart
What it shows: Stacked area chart of resolved, unresolved, and escalated conversations over time.
Chat-specific usage:
- Track daily resolution trends for chat channels
- Identify days when resolution rate dips (investigate those days)
- Correlate status changes with deployments or knowledge updates
- Monitor whether unresolved conversations (abandonment) are growing
How to read it:
- Apply chat channel filters (Web, Slack, WhatsApp)
- Set date range to last 30 days
- Look for trends: Is green (resolved) growing? Is yellow (unresolved) shrinking?
- Click specific dates to view conversations from that day
AI Involvement Rate Chart
What it shows: Pie chart showing autonomous, public, private, and not-involved distribution.
The ideal involvement distribution depends on your deployment goals. If autonomous involvement is lower than expected:
- Chat is deployed for questions too complex for full automation
- Knowledge gaps preventing autonomous handling
- Guidance too conservative, escalating unnecessarily
- Review public involvement conversations for improvement opportunities
If the “Not Involved” share is high:
- Team handling many chats manually without AI assistance
- Integration may not be triggering AI for all conversations
- Agents may be disabling AI during busy periods
- Review deployment settings and team training
Message Volume Chart
What it shows: Area chart with message count and conversation count over time.
Chat-specific metric to derive: average messages per conversation (message count divided by conversation count), tracked over time.
User Sentiment Chart
What it shows: Distribution of positive, neutral, and negative sentiment in user messages.
Why sentiment matters more in chat: Real-time chat captures emotional reactions immediately:
- Users express frustration faster in chat than email
- Positive sentiment validates good chat experience
- Sentiment shift mid-conversation indicates AI performance
How to use it (a correlation sketch follows these steps):
- Filter to chat channels
- Note baseline sentiment distribution
- Filter to negative sentiment conversations
- Review whether negative sentiment correlates with:
- Long response times
- Unresolved status
- Many-message conversations
- Specific topics or times
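To check those correlations across more conversations than you can read, a sketch like this works over an export. The column names (sentiment, status, message_count, avg_response_seconds) are assumptions; adjust them to whatever your export actually contains:

```python
# Minimal sketch: compare negative-sentiment conversations against the rest.
import csv
from statistics import mean

with open("chat_conversations.csv", newline="") as f:
    rows = list(csv.DictReader(f))

groups = {
    "Negative sentiment": [r for r in rows if r["sentiment"] == "negative"],
    "Everything else": [r for r in rows if r["sentiment"] != "negative"],
}

for label, subset in groups.items():
    if not subset:
        continue
    avg_latency = mean(float(r["avg_response_seconds"]) for r in subset)
    avg_messages = mean(int(r["message_count"]) for r in subset)
    unresolved = mean(r["status"] == "unresolved" for r in subset) * 100
    print(f"{label}: {avg_latency:.1f}s responses, "
          f"{avg_messages:.1f} messages, {unresolved:.0f}% unresolved")
```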
Real-Time Performance Monitoring
Unlike ticketing where daily review suffices, chat performance benefits from more frequent monitoring:
Daily Quick Check (5 minutes)
Every morning:
- Open Analyze → Metrics, General view
- Filter to chat channels (Web, Slack, WhatsApp)
- Review yesterday’s performance:
- CSAT score: Did it drop from baseline?
- Resolution rate: Any significant change?
- Total conversations: Traffic volume normal?
- Check Conversation Status Chart: Any unusual spikes in escalations or unresolved?
Red flags to act on (a threshold-check sketch follows this list):
- Sudden CSAT drops (5+ percentage points): Investigate immediately
- Resolution rate degradation: May indicate knowledge issue or technical problem
- Volume spikes: Ensure AI is handling load without quality loss
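If you want the morning check to be mechanical, a small threshold script can flag these conditions. The baseline figures and the specific thresholds are illustrative, not product defaults:

```python
# Minimal sketch: flag red-flag conditions against a rough baseline.
baseline = {"csat": 82.0, "resolution": 74.0, "conversations": 120}
yesterday = {"csat": 75.5, "resolution": 73.0, "conversations": 310}  # placeholders

alerts = []
if baseline["csat"] - yesterday["csat"] >= 5:
    alerts.append("CSAT dropped 5+ points - investigate immediately")
if baseline["resolution"] - yesterday["resolution"] >= 5:  # 5 points is an example threshold
    alerts.append("Resolution rate degraded - check for knowledge or technical problems")
if yesterday["conversations"] > 2 * baseline["conversations"]:  # 2x is an example threshold
    alerts.append("Volume spike - confirm quality is holding under load")

print("\n".join(alerts) if alerts else "No red flags today")
```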
Weekly Deep Dive (30-45 minutes)
Every Monday or Tuesday:
1. Review previous week’s chat metrics:
- Filter date range to past 7 days
- Note CSAT, resolution rate, involvement rate
- Compare to previous week’s performance
2. Analyze low-rated conversations:
- Navigate to Analyze → Conversations
- Filter: Chat channels, Rating 1-2 stars, Last 7 days
- Read 10-15 conversations
- Document common issues (response quality, speed, knowledge gaps)
3. Check abandonment patterns:
- Filter: Chat channels, Status: Unresolved, Last 7 days
- Review 10-15 abandoned conversations
- Note where users are leaving (after how many messages?)
- Identify topics with high abandonment
4. Review escalation triggers:
- Filter: Chat channels, Status: Escalated, Last 7 days
- Check Handoff Chart for escalation reasons
- Determine if escalations were necessary or avoidable
- Plan knowledge improvements to reduce avoidable escalations
5. Track improvements:
- Review conversations for topics you recently improved
- Verify that changes are having desired effect
- Document what’s working to replicate success
Real-Time Chat Monitoring (During Critical Periods)
For product launches, marketing campaigns, or other high-stakes events, set up active monitoring:
- Keep Analyze → Conversations open and filtered to chat channels
- Refresh every 15-30 minutes to see new conversations
- Spot-check recent conversations for quality
- Watch for unusual patterns (sudden escalation spike, low ratings)
- Define thresholds for manual intervention (e.g., 3 consecutive 1-star ratings)
- Assign team member to monitor metrics during event
- Prepare to adjust AI deployment if quality issues arise
- Have backup plan to route conversations to humans if AI struggles
Conversation Flow Analysis
Chat conversations reveal how well your AI handles real-time dialogue dynamics:
Multi-Turn Dialogue Quality
What to analyze: How AI performs across multiple back-and-forth exchanges.
Access the data:
- Navigate to Analyze → Conversations
- Filter to resolved chat conversations
- Use Conversation Length Chart to filter to 5-10 message conversations
- Review 20-30 examples
Context Retention Analysis
What it measures: Whether AI maintains conversation context across messages.
How to test:
- Filter to chat conversations with 6+ messages
- Read user follow-up questions
- Check if AI responses acknowledge previous context
- Look for phrases indicating whether context is retained or lost:
- “As I mentioned earlier…” (good, maintaining context)
- “I don’t have information about that” (bad, after already discussing topic)
- User repeating question multiple times (bad, AI not understanding)
Common context problems:
- AI treats each user message as independent query
- Follow-up questions reference information from earlier in conversation
- AI doesn’t connect related questions across messages
- Pronouns and references not resolved (user says “What about the other option?” AI doesn’t know what option)
How to improve context retention:
- Ensure guidance emphasizes conversation continuity
- Add examples of multi-turn conversations to training
- Review knowledge structure for related topics that should link together
- Test conversation memory by asking follow-ups in Preview Panel
Greeting and Closing Quality
First impressions matter in chat.
Opening message analysis:
- Filter to chat conversations
- Review first AI message in 30-50 conversations
- Evaluate:
- Does AI greet user warmly?
- Is tone conversational, not formal?
- Does AI acknowledge the question clearly?
- Is first response fast (ideally under 5 seconds)?
User Engagement Metrics
Beyond success rates, engagement metrics show whether users trust and value the chat experience:
Repeat User Rate
What it measures: Percentage of chat users who return for multiple conversations.
How to calculate (the arithmetic is sketched in code below):
- Navigate to Analyze → Metrics, General view
- Note “Unique Users” count for selected time period
- Note “Conversations” count for same period
- Calculate: Conversations per User = Conversations / Unique Users
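A minimal sketch of that arithmetic, with placeholder counts read off the General view:

```python
# Minimal sketch: conversations per user as a proxy for repeat usage.
unique_users = 180   # placeholder "Unique Users" figure
conversations = 265  # placeholder "Conversations" figure for the same period

print(f"Conversations per user: {conversations / unique_users:.2f}")
```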
A healthy repeat rate:
- Indicates users find chat valuable (come back)
- Lower customer acquisition cost (retention vs. new users)
- Trust in AI capability (wouldn’t return if first experience was poor)
A low repeat rate may mean:
- First-time experience may not be compelling enough
- Users solving problem once and never need help again (could be good!)
- Users avoiding chat after poor initial experience (needs investigation)
Follow-Up Question Rate
What it measures: Percentage of conversations where the user asks follow-up questions after the initial answer.
How to analyze (a small calculation sketch follows):
- Use the Conversation Length Chart to see message distribution
- Calculate percentage of conversations with 3+ user messages
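A sketch of that calculation over an export; the user_message_count column is an assumed field name:

```python
# Minimal sketch: share of conversations with 3+ user messages.
import csv

with open("chat_conversations.csv", newline="") as f:
    counts = [int(row["user_message_count"]) for row in csv.DictReader(f)]

follow_up_rate = sum(1 for c in counts if c >= 3) / len(counts) * 100
print(f"Conversations with 3+ user messages: {follow_up_rate:.1f}%")
```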
A higher follow-up rate can be healthy:
- Users engaged in productive dialogue
- AI handling complex questions through conversation
- Good multi-turn performance
Or it can signal problems:
- Users not getting clear answers initially
- Having to ask same question multiple ways
- AI not understanding or answering directly
- Review these conversations for knowledge or guidance issues
A lower follow-up rate can be healthy:
- AI providing clear, complete answers immediately
- Users getting what they need quickly
- Efficient resolution (ideal for chat)
Or it can signal disengagement:
- Users abandoning after poor first answer
- Not bothering to ask follow-ups
- Lost confidence in AI after initial interaction
Survey Completion Rate
What it measures: Percentage of users who complete satisfaction surveys when offered.
How to analyze (the formula is sketched in code below):
- Review the Conversation Rating Chart in General view
- Note “Abandoned” bar (surveys offered but not completed)
- Calculate: Survey Completion = Rated / (Rated + Abandoned) × 100%
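A minimal sketch of the formula, with placeholder counts from the Conversation Rating Chart:

```python
# Minimal sketch: survey completion rate.
rated = 96              # placeholder count of completed ratings
abandoned_surveys = 64  # placeholder count of offered-but-not-completed surveys

completion = rated / (rated + abandoned_surveys) * 100
print(f"Survey completion rate: {completion:.0f}%")  # 60% with these placeholders
```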
Why low completion matters:
- CSAT score based on small sample may not represent true satisfaction
- Missing feedback from users who abandoned or had neutral experience
- Hard to identify problems without representative ratings
How to improve completion:
- Time survey appropriately (immediately after resolution, not after idle time)
- Keep survey simple (1 question: star rating, optional comment)
- Frame survey helpfully: “How did I do?” not “Rate your experience”
- Don’t offer survey if conversation was very short (1-2 messages)
Chat Quality Indicators
Beyond metrics, qualitative signals reveal chat experience quality:
Response Relevance
What to check: Does the AI actually answer the question asked?
How to evaluate (a small tally sketch follows):
- Read 20-30 random chat conversations
- For each, ask: “Did the first AI response directly address the user’s question?”
- Calculate: Relevance Rate = Directly Relevant Responses / Total
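A tiny tally sketch for the manual review; the 1/0 values are placeholders for your own judgments:

```python
# Minimal sketch: relevance rate from a manual review.
# 1 = first AI response directly addressed the question, 0 = it did not.
review = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1]

print(f"Relevance rate: {sum(review) / len(review) * 100:.0f}%")  # 75% with these placeholders
```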
Common relevance problems:
- AI provides related but not directly relevant information
- AI answers a similar question, not the one asked
- AI gives generic response when user wants specific information
- AI misunderstands due to typos or informal phrasing
How to improve relevance:
- Add examples of actual user questions to guidance
- Include common phrasings and variations in knowledge
- Adjust guidance to prioritize direct answers over comprehensive explanations
- Test with real user questions in Preview Panel
Response Conciseness
What to check: Are AI responses appropriately brief for chat?
How to evaluate:
- Read 30-50 chat conversations
- Count sentences in typical AI responses
- Note which responses feel too long
Signs responses are too long:
- Users asking follow-ups that were already answered (didn’t read full response)
- Users saying “too much info” or “just tell me X”
- High abandonment after AI’s first long response
- Many follow-up questions because initial answer buried key info
How to improve conciseness:
- Add guidance: “Keep responses to 2-3 sentences for simple questions”
- Show examples of concise vs. verbose answers
- Instruct AI to offer more detail rather than providing it upfront
- Remove unnecessary preambles (“Thank you for contacting us today…”)
Tone Appropriateness
What to check: Does AI match the conversational, friendly tone users expect from chat?
How to evaluate:
- Read 20-30 chat conversations
- Note formal or robotic phrasing
- Check if tone matches your brand voice
How to improve tone:
- Add tone guidance: “Respond conversationally, like a helpful colleague”
- Provide good/bad example pairs in guidance
- Test tone by reading responses out loud—if you wouldn’t say it to a friend, it’s too formal
- Review low-rated conversations for tone complaints
Best Practices for Chat Monitoring
Establish Chat-Specific Baselines
Don’t compare chat to ticketing performance; track chat metrics separately in a simple weekly log (a scripted version of this log follows the list):
- Week of [date]
- Chat CSAT: X%
- Chat resolution: Y%
- Avg messages per conversation: Z
- Notes: What changed this week?
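One way to keep that log is a small script that appends a row per week; the file name and column names are just a suggestion:

```python
# Minimal sketch: append this week's chat baseline to a CSV log so chat
# performance is tracked separately from ticketing.
import csv
from pathlib import Path

log_path = Path("chat_baselines.csv")
this_week = {
    "week_of": "2024-05-06",              # placeholder date
    "chat_csat_pct": 81.0,                # placeholder metrics
    "chat_resolution_pct": 72.5,
    "avg_messages_per_conversation": 4.2,
    "notes": "Added 3 billing snippets",
}

write_header = not log_path.exists()
with log_path.open("a", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(this_week))
    if write_header:
        writer.writeheader()
    writer.writerow(this_week)
```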
Weekly Chat Review Ritual
Monday morning routine (15-20 minutes):
1. Check weekend chat performance:
- Filter metrics to Saturday-Sunday
- Note CSAT and resolution rate
- Compare to weekday performance
- Check if AI maintained quality without team available
2. Review last week’s improvements:
- Did knowledge additions improve specific topics?
- Did guidance changes affect tone or conciseness?
- Track one improvement at a time to measure impact
3. Identify this week’s focus:
- Pick 1-2 specific issues to address
- Set measurable goal (e.g., “Reduce ‘Billing’ topic abandonment from 25% to 18%”)
- Plan specific actions (add 3 billing snippets, update guidance)
End-of-week check:
- Quick metrics check for the week
- Note any significant changes or incidents
- Plan weekend monitoring if needed
- Document week’s performance for next Monday review
Conversation Sampling Strategy
You can’t read every conversation. Use strategic sampling.
Daily samples (5-10 conversations):
- 3 most recent conversations (spot-check quality)
- 2 lowest-rated from today (identify acute issues)
- 2-3 random conversations (avoid bias)
Weekly samples (about 50 conversations; a sampling sketch follows this list):
- 10 lowest-rated (understand dissatisfaction)
- 10 escalated (identify avoidable escalations)
- 10 unresolved/abandoned (understand where users leave)
- 10 highly-rated (learn what’s working well)
- 10 random (representative sample)
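If you export a conversation list, a sketch like this can assemble the weekly sample automatically. The id, rating, and status columns are assumed field names:

```python
# Minimal sketch: build the weekly ~50-conversation review sample from an export.
import csv
import random

with open("chat_conversations.csv", newline="") as f:
    rows = list(csv.DictReader(f))

rated = sorted((r for r in rows if r["rating"]), key=lambda r: int(r["rating"]))
sample = {
    "lowest_rated": rated[:10],
    "highest_rated": rated[-10:],
    "escalated": [r for r in rows if r["status"] == "escalated"][:10],
    "unresolved": [r for r in rows if r["status"] == "unresolved"][:10],
    "random": random.sample(rows, min(10, len(rows))),
}

for bucket, items in sample.items():
    print(bucket, [r["id"] for r in items])
```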
Monthly samples:
- Deep dive across all rating levels
- Segment by topic (30 conversations per top topic)
- Segment by channel if using multiple chat platforms
- Track improvement in previously problematic areas
Correlate Chat Metrics with Business Outcomes
Connect chat performance to business goals, such as customer acquisition.
Optimization Strategies for Chat
Speed Optimization
If response times exceed 10 seconds, investigate causes:
- Check data provider sync status (slow providers delay responses)
- Review knowledge retrieval performance (too many sources slow search)
- Test during peak vs. off-peak (infrastructure scaling issue?)
- Check external API timeouts (integration dependencies)
Abandonment Reduction
If abandonment rate exceeds 25%:
Step 1: Identify abandonment points. Filter to unresolved conversations and read 30-50:
- Do users leave after the 1st AI response? (Answer quality issue)
- Do users leave after 3-4 messages? (Follow-up knowledge gaps)
- Do users leave after long delays? (Response time issue)
Step 2: Address the highest-abandonment topics (a per-topic tally sketch follows these steps). For each topic:
- Read 20 conversations about that topic
- Note what information users needed but didn’t get
- Add that information to knowledge base
- Update guidance for better handling of that topic
- Monitor next week’s abandonment rate for that topic
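To find the highest-abandonment topics in the first place, a per-topic tally over an export helps; topic and status are assumed column names:

```python
# Minimal sketch: abandonment rate per topic, highest first.
import csv
from collections import Counter

with open("chat_conversations.csv", newline="") as f:
    rows = list(csv.DictReader(f))

totals = Counter(r["topic"] for r in rows)
abandoned = Counter(r["topic"] for r in rows if r["status"] == "unresolved")

for topic in sorted(totals, key=lambda t: abandoned[t] / totals[t], reverse=True)[:5]:
    rate = abandoned[topic] / totals[topic] * 100
    print(f"{topic}: {rate:.0f}% abandoned ({abandoned[topic]}/{totals[topic]})")
```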
CSAT Improvement
If chat CSAT is below 70%, follow this diagnosis process:
1. Filter to low ratings:
- Conversations with 1-2 star ratings
- Last 30 days
- Chat channels only
2. Categorize complaints:
- Speed issues: “Too slow,” “Took forever,” “Waited too long”
- Accuracy issues: “Wrong answer,” “Didn’t help,” “Incorrect information”
- Tone issues: “Rude,” “Unhelpful,” “Robotic,” “Formal”
- Completeness issues: “Not enough detail,” “Vague,” “Didn’t answer my question”
3. Quantify each category: count how many low-rated conversations fall into each bucket and address the largest categories first (a simple keyword tally is sketched after this process).
4. Address root causes:
- Accuracy issues → Add knowledge, improve data provider quality
- Tone issues → Refine guidance, add tone examples
- Speed issues → Optimize response time (see Speed Optimization above)
- Completeness issues → Ensure answers directly address questions
5. Measure improvement:
- Track CSAT weekly after changes
- Read follow-up low-rated conversations to verify issues resolving
- Target 3-5 percentage point improvement per month
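For the quantification step, a rough keyword tally over exported survey comments can give you category counts. The keyword lists and sample comments are illustrative only; manual reading remains the source of truth:

```python
# Minimal sketch: count complaint categories in low-rated conversation comments.
from collections import Counter

categories = {
    "speed": ["too slow", "took forever", "waited"],
    "accuracy": ["wrong", "didn't help", "incorrect"],
    "tone": ["rude", "robotic", "unhelpful", "formal"],
    "completeness": ["vague", "not enough detail", "didn't answer"],
}

comments = [                                  # placeholders for exported comments
    "Took forever to get an answer",
    "The answer was wrong",
    "Felt robotic and didn't answer my question",
]

tally = Counter()
for comment in comments:
    text = comment.lower()
    for category, keywords in categories.items():
        if any(k in text for k in keywords):
            tally[category] += 1

print(tally.most_common())
```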
Escalation Optimization
If escalation rate exceeds 20% for chat, determine whether escalations are appropriate: read 30-50 escalated conversations and separate appropriate escalations (don’t try to eliminate these) from avoidable ones.
To reduce avoidable escalations:
1. Add missing knowledge:
- Create snippets for questions AI incorrectly escalates
- Ensure common questions have direct answers
2. Adjust escalation rules:
- Review trigger keywords (too sensitive?)
- Increase AI confidence threshold for autonomous handling
- Add more specific escalation criteria (not just keywords)
3. Improve guidance:
- Instruct AI to attempt answering before escalating
- Provide examples of when to escalate vs. when to try
- Add guidance for handling follow-up questions
Troubleshooting Common Chat Issues
Response Time Suddenly Increased
Symptoms: Average response time jumped from 5 seconds to 20+ seconds.
Diagnosis steps:
- Check deployment status (recent changes to profile or knowledge?)
- Review data provider sync health (any providers failing or timing out?)
- Test in Preview Panel (is delay affecting all questions or specific topics?)
- Check for infrastructure issues (external APIs down?)
- Review traffic volume (sudden spike causing congestion?)
Fixes:
- Roll back recent profile changes if timing correlates
- Disable slow or failing data providers temporarily
- Contact support if infrastructure issue
- Scale up resources if traffic spike is sustained
CSAT Dropped Suddenly
Symptoms: CSAT was 80%, now 65%, with no obvious change.
Diagnosis steps:
- Compare date ranges: When exactly did the drop occur?
- Check for deployments: Did profile change around that time?
- Read recent low-rated conversations: What specific complaints?
- Check if drop is topic-specific: Filter by topic, see if isolated
- Review team changes: New agents affecting handoffs?
Fixes:
- If after deployment: Review changes, roll back if necessary
- If topic-specific: Add knowledge for that topic urgently
- If tone-related: Adjust guidance for better tone
- If unclear: Keep monitoring, may be temporary anomaly
High Abandonment Rate
Symptoms: 40%+ of conversations ending unresolved.
Diagnosis steps (a message-count breakdown sketch follows this list):
- Calculate abandonment by message count (where are users leaving?)
- Check response times (slow responses causing abandonment?)
- Read abandoned conversations (what questions not being answered?)
- Filter by topic (specific topics causing abandonment?)
- Compare to previous period (sudden change or gradual trend?)
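For the message-count breakdown, a quick histogram over exported unresolved conversations shows where users drop off; the file name and message_count column are assumptions:

```python
# Minimal sketch: how many messages deep users were when they abandoned.
import csv
from collections import Counter

with open("abandoned_chats.csv", newline="") as f:
    counts = Counter(int(r["message_count"]) for r in csv.DictReader(f))

for messages in sorted(counts):
    print(f"{messages:>2} messages: {'#' * counts[messages]} ({counts[messages]})")
```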
Fixes:
- Immediate triage: Add knowledge for top 3 abandonment topics
- Speed fixes: Optimize slow data providers
- Quality fixes: Improve initial response relevance
- Follow-up fixes: Add knowledge for common follow-up questions
Users Repeatedly Asking Same Question
Symptoms: Multi-message conversations where the user asks the same question 2-3 times.
Diagnosis:
- AI not directly answering the question
- AI providing related info instead of specific answer
- AI’s answer buried in long response (user didn’t see it)
- AI misunderstanding question due to phrasing
Solutions:
1. Improve answer directness:
- Update guidance: “Answer the specific question asked in first sentence”
- Add examples of direct vs. indirect answers
2. Add knowledge variations:
- Create snippets with multiple phrasings of same question
- Include common follow-up formulations
3. Improve conciseness:
- Shorten responses so key info isn’t buried
- Lead with direct answer, offer more detail after
Chat Works Well in Testing, Poorly in Production
Symptoms: Preview Panel conversations look great, but real user CSAT is low.
Common causes:
1. Test questions don’t match real user questions:
- Testing with well-formed questions
- Real users ask with typos, informal language, context
- Solution: Test with exact user questions from conversation history
2. Testing during off-peak, production during peak:
- Response time fine in testing, slow during traffic
- Solution: Test during actual peak hours
3. Testing individual messages, not full conversations:
- Single-message test looks good
- Multi-turn dialogue breaks down
- Solution: Test full conversation flows, not isolated Q&A
4. Different audiences or channels:
- Testing Web channel, production includes Slack with different expectations
- Solution: Test across all active channels
Next Steps
Now that you understand chat performance monitoring:
- Review Conversations - Read actual chat conversations to understand metrics in context
- General Metrics Dashboard - Access the complete General view for chat analytics
- Analyze Topics - Identify which chat topics need improvement
- Improve Answers - Use chat insights to refine knowledge and guidance
- Test Your AI - Create test suites specifically for chat scenarios
- Ticketing Performance - Compare chat performance to ticketing metrics