Every AI support vendor claims their product will "transform your customer experience." Most of them are, at best, overselling what their product actually does. Here's how to cut through the hype.
Let's talk about how we talk about AI support
The term "AI agent" has become meaningless. Vendors apply it to everything from basic FAQ chatbots to sophisticated systems that can actually resolve complex issues. This isn't just confusing. It presents a real challenge for buyers trying to make informed decisions.
Most of the content published about "AI agents for customer support" is vendor-written and designed to generate leads, not help you evaluate solutions. We're trying to do something different with this guide: a framework for actually understanding what's on the market, what questions to ask, and how to figure out what you need. Yes, we sell an AI concierge that does support, but we'll spare you the hard pitch. If you walk away from this article feeling that a competitor is a better fit for your needs, that's fine. Better-informed decisions are better for everyone.
A taxonomy for AI support tools
The market breaks down into four distinct categories, and understanding the difference before you start your search will save you months.
FAQ chatbots
What they do: Search your knowledge base, surface relevant articles, summarize content for customers.
Best for: Simple, high-volume queries where self-service is acceptable. Think "what are your hours?" or "how do I reset my password?"
Limitations: Can't take action. Can't follow complex procedures. When a customer needs something done, these tools hit a wall.
Who's here: Many of the tools calling themselves "AI agents" are actually sophisticated FAQ chatbots. If a vendor can't clearly explain what actions their AI takes beyond answering questions, you're probably looking at a chatbot with better marketing.
Copilots
What they do: Draft responses for human agents to review and send.
Best for: Organizations very early in AI adoption who want a safety blanket. Teams that aren't ready to let AI interact directly with customers (e.g. in a heavily regulated sector where direct AI interaction feels too risky).
Limitations: The efficiency gains are marginal. Research suggests productivity improvements of around 14%, with smaller gains for more experienced agents. A major potential downside of copilots is that they risk creating roles where humans are just rubber-stamping AI output. If the copilot is good enough to trust, why have a human in the loop?
The hard truth: Copilots feel safer than they are. At scale, human agents aren't carefully reviewing every draft. They're clicking send to keep up with volume and hit their goals.
Deflection-first AI
What they do: Attempt to answer everything, then escalate or divert whatever they can't handle.
Best for: Organizations that prioritize volume reduction above all else.
Limitations: This is where a large chunk of the market sits, and it's where the problems start. If your AI attempts to answer 100% of tickets and successfully resolves 40%, you've created 60 failed interactions. That's a 60% frustration rate.
Vendors love deflection-first approaches because they can charge per conversation and claim high "engagement" numbers. But engagement is not the same as resolution. A customer who gives up in frustration counts as "deflected" in the metrics.
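The gap between "deflected" and "resolved" is easy to see with a toy calculation. The sketch below uses hypothetical conversation outcomes (the labels and counts are illustrative, not from any real deployment) to show how the same ten tickets produce a flattering deflection rate and a much less flattering resolution rate:

```python
from collections import Counter

# Hypothetical outcomes for 10 AI-handled conversations:
# "resolved"  = customer confirmed the issue was fixed
# "abandoned" = customer gave up without escalating
# "escalated" = handed off to a human agent
outcomes = ["resolved", "resolved", "resolved", "resolved",
            "abandoned", "abandoned", "abandoned",
            "escalated", "escalated", "escalated"]

counts = Counter(outcomes)
total = len(outcomes)

# Deflection counts anything that never reached a human -- including
# customers who simply gave up in frustration.
deflection_rate = (counts["resolved"] + counts["abandoned"]) / total

# Resolution counts only conversations the AI genuinely solved.
resolution_rate = counts["resolved"] / total

print(f"Deflection rate: {deflection_rate:.0%}")   # 70%
print(f"Resolution rate: {resolution_rate:.0%}")   # 40%
```

Same conversations, same AI: a vendor optimizing for deflection reports 70%, while the number customers actually experience is 40%.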
Action-taking concierges
What they do: Actually resolve issues by taking action within your systems. Process refunds. Update accounts. Coordinate with third parties. Ship replacement products.
Best for: Complex, high-stakes environments where resolution, rather than deflection, is the goal.
Limitations: These systems require deeper integration with your systems, which in turn demands a more thoughtful implementation. They also need ongoing iteration rather than set-and-forget.
The key difference: These systems don't just tell customers how to solve problems. They solve them. Consider a customer whose card is declined while traveling after they book an airport transfer. A FAQ chatbot explains fraud detection policies, leaving the customer frustrated. An action-taking concierge blocks the suspicious transactions, creates a new virtual card, coordinates with the taxi company to update payment details, and arranges physical card delivery to the next hotel — all while keeping the customer informed about what's happening.
The design philosophy that matters
The difference between these categories isn't primarily technological. They all use LLMs. They all have access to similar underlying capabilities. The difference is design philosophy.
Are you optimizing for avoiding customers (deflection) or for helping them (resolution)?
This isn't a semantic distinction. It shapes every decision a vendor makes: how they price, what they measure, how they train their AI, what features they build. Vendors optimizing for deflection build systems that try to handle everything and succeed at some of it. Vendors optimizing for resolution build systems that know their limits and focus on actually solving problems.
The term "AI concierge" is instructive here. A concierge solves your problem. A concierge at a hotel doesn't hand you a FAQ sheet about restaurants. They make you a reservation.
Characteristics of an AI support concierge
What does this look like in practice? What are the characteristics of a concierge that differentiate it from the other types of AI agent on the market?
Action-taking, not answer-giving. An AI concierge doesn't just tell customers how to request a refund or agents how to have a difficult conversation. It processes the refund and handles the difficult conversation.
Multi-channel orchestration. The concierge can call the merchant, text the customer, email the vendor. Simultaneously if needed.
Self-awareness. It knows what it doesn't know. It escalates more complex issues cleanly with full context, not after trapping customers in frustrating loops.
Judgment calls. For businesses that empower human agents to make exceptions, an AI concierge can exercise the same discretion.
Don't ignore the complexity curve
Support tickets follow a power law distribution. Your easiest 50% of tickets might take 5 minutes each. Your hardest 5% might take 50 minutes. The ability to climb that complexity curve is what actually unlocks value from AI deployments.
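To make the curve concrete, here's a back-of-the-envelope sketch using the figures above. The 45% middle tier at 15 minutes per ticket is our assumption to complete the distribution; the point is how handle time concentrates, not the exact numbers:

```python
# Hypothetical ticket mix per 100 tickets: (label, ticket count, minutes each).
# The middle tier is an assumed value; the 50%/5min and 5%/50min tiers
# come from the example in the text.
tiers = [
    ("easiest 50%", 50, 5),
    ("middle 45%",  45, 15),
    ("hardest 5%",   5, 50),
]

total_minutes = sum(count * mins for _, count, mins in tiers)

for label, count, mins in tiers:
    share = count * mins / total_minutes
    print(f"{label}: {share:.0%} of total handle time")
# easiest 50%: 21% of total handle time
# middle 45%: 57% of total handle time
# hardest 5%: 21% of total handle time
```

Under these assumptions, the hardest 5% of tickets consume as much agent time as the easiest 50% combined. An AI that only handles the easy half saves you roughly a fifth of the workload.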
Most AI support tools top out quickly. They handle the simple stuff (password resets, order status, basic FAQs) and escalate everything else. That's fine if your support is mostly simple stuff. But for companies with complex products, regulated industries, or high-stakes customer relationships, the simple tickets aren't where the pain is.
If you're evaluating AI support, ask yourself: where does my support team actually spend their time? If the answer is complex, multi-step issues that require judgment and system access, you need AI that can climb the complexity curve. If the answer is high volumes of simple, repetitive questions, a basic FAQ solution might be enough.
What to actually measure
Metrics that matter
Resolution rate (not deflection rate). Did the problem actually get solved? This requires defining what "solved" means for your business and tracking whether customers come back with the same issue.
CSAT on AI-handled tickets. Are customers satisfied with AI interactions specifically? This is different from overall CSAT. You need to isolate how AI is performing.
Escalation quality. When AI hands off to a human, does the human have full context? Clean escalations that set up humans for success are a sign of well-designed AI. Messy handoffs that frustrate both customers and agents are a sign of AI that's in over its head.
Ratio of good to bad AI interactions. This is the real measure of quality. If your AI handles 1,000 tickets and 600 are genuinely resolved while 400 are frustrated customers, that's a 60% success rate. If your AI handles 500 tickets and 450 are genuinely resolved, that's 90%. The second scenario is better even though the total volume handled by AI is lower.
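The two scenarios above can be compared directly. This is a minimal sketch (the helper function and both scenarios are illustrative, not a real measurement framework) showing why the higher-volume deployment is the worse one:

```python
def success_profile(handled: int, resolved: int) -> tuple[float, int]:
    """Return (success_rate, frustrated_customers) for an AI deployment."""
    frustrated = handled - resolved
    return resolved / handled, frustrated

# Scenario A: AI attempts everything.
rate_a, frustrated_a = success_profile(handled=1000, resolved=600)
# Scenario B: AI sticks to what it can actually solve.
rate_b, frustrated_b = success_profile(handled=500, resolved=450)

print(f"A: {rate_a:.0%} success, {frustrated_a} frustrated customers")
print(f"B: {rate_b:.0%} success, {frustrated_b} frustrated customers")
```

Scenario A "handles" twice the volume but produces eight times as many frustrated customers (400 vs 50). If each failed interaction costs you goodwill, volume is the wrong thing to maximize.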
Metrics that mislead
Deflection rate. A customer who gives up in frustration counts as "deflected." A customer who finds a workaround without AI help counts as "deflected." This metric can quickly be gamed into meaninglessness.
AI engagement rate. Optimizing for AI touching more tickets incentivizes bad experiences. The goal isn't for AI to touch everything. It's for AI to succeed at what it touches.
Per-conversation pricing. When vendors charge per conversation regardless of outcome, they're incentivized to have AI attempt everything. Your incentives and their incentives diverge.
The question to ask yourself
"If my AI attempts 100 tickets and resolves 40, are the 60 failures creating enough frustration to offset the 40 successes?"
For most businesses, the answer is yes. Failed AI interactions don't just fail to help. They actively damage customer relationships and make future AI interactions harder because customers learn to immediately ask for humans.
Implementation realities vendors will gloss over
1) Prompting is coaching, not programming
You don't configure an AI agent once and walk away. You iterate. You review conversations, identify gaps, refine instructions, and improve over time. Think of it like managing a team member who learns fast but needs ongoing feedback.
Vendors who promise "set it and forget it" are either misleading you or building something that doesn't actually work well. The good news is that AI feedback loops are much faster than human training loops. You can make meaningful improvements in days, not months.
2) You don't need perfect documentation first
The old advice for outsourcing ("get your house in order before you hand off") doesn't apply to AI the same way. AI feedback loops are fast enough to iterate your way to good processes.
Start with what you have. Deploy on a limited, but meaningful, scope. See where the AI struggles. Improve your documentation based on real gaps, not theoretical ones. This is actually more efficient than trying to anticipate everything upfront.
3) Integration depth matters more than AI sophistication
An AI with access to your systems will outperform a "smarter" AI that can only read help articles. If your AI can look up order status, process refunds, update account details, and coordinate with third parties, it can actually solve problems. If your AI can only search your help center and summarize articles, it's limited to answering questions.
When evaluating vendors, ask what integrations they support and how deep those integrations go. "We integrate with Shopify" could mean they can read order data, or it could mean they can process refunds, update shipping addresses, and cancel orders. The difference is everything.
4) Testing and evaluation are non-negotiable
Any serious deployment needs robust testing before going live. You should be able to run test conversations, validate how the AI handles specific scenarios, and audit decisions before customers see them.
Vendors who don't offer strong testing and evaluation tools are either hiding something or don't understand the stakes. In regulated industries especially, the ability to explain why AI made specific decisions isn't optional.
The competitive landscape
| Vendor | Positioning | Lorikeet assessment |
|---|---|---|
Intercom Fin | FAQ automation + RAG built into Intercom ecosystem | Strong for Intercom users with basic needs. Struggles with complex procedures. Good if you're already on Intercom and your support is mostly straightforward. |
Zendesk AI | AI bolted onto comprehensive ticketing | Deep Zendesk integration, but AI capabilities are less sophisticated than specialized players. You're buying the ecosystem, not best-in-class AI. |
Decagon | "Agentic AI" for enterprise | Well-funded (~$35M ARR per Sacra). Claims agentic capabilities but head-to-heads suggest gaps in complex scenarios. |
Sierra | Celebrity CEO, consumer brand focus | ~$104M ARR per Sacra. Strong marketing, targets large consumer brands. Less focused on complex B2B use cases. |
Ada | Pre-LLM legacy with updates | "Coach, don't code" messaging is solid. Mixed reviews on complex use cases. Strong foundation but showing age in some areas. |
Forethought | Deflection-focused AI | Leads with impressive-sounding metrics but definitions are fuzzy. Deflection-first philosophy means high engagement, variable resolution. |
Salesforce Agentforce | Enterprise AI within Salesforce | Scale advantage and deep ecosystem integration. Less specialized than pure-play support AI. You're buying the platform, not the best support AI. |
What about Lorikeet?
We should be transparent: we wrote this guide and we sell AI support software. Here's our honest self-assessment.
Where we're strong:
Complex, high-stakes environments (healthcare, fintech, regulated industries)
Multi-step workflows that require judgment
Multi-channel orchestration — our AI can call vendors, text customers, and email partners simultaneously
Voice support that actually takes action, not just answers questions
Companies with high CX standards who won't accept mediocre AI interactions
Where we might not be the right fit:
Simple eCommerce with basic FAQ needs — we might be more than you need
Organizations that want set-and-forget with no iteration — that's not how this works
Teams that measure success by deflection rate rather than customer outcomes — we're not optimized for that metric
What we've seen in head-to-heads:
Flex compared us to Decagon directly and chose Lorikeet. They saw 2x CSAT improvement and 50% faster resolution.
Magic Eden switched from Intercom Fin and saw CSAT jump from 45% to 74%.
Linktree evaluated Fin, Decagon, and Assembled AI before choosing us. Auditability and control were the deciding factors.
Arbor Health saw [TODO: confirm Arbor metrics] after switching to Lorikeet for their regulated healthcare workflows.
We're not claiming we're best for everyone. We're claiming we're best for a specific type of customer: complex businesses with high standards who care about resolution, not deflection.
Questions to ask any vendor
Before you sign anything, get clear answers to these questions:
What's your definition of "resolved" and how is that measured? If they can't give you a clear, auditable definition, be skeptical of their metrics.
Can we see actual conversation logs from customers similar to ours? Demos are choreographed. Real conversations reveal real capabilities.
What happens when the AI can't help? How does escalation work? The handoff experience matters as much as the AI experience. Bad escalations create frustrated customers and frustrated agents.
How do you price, and how does that align with our goals? Per-conversation pricing misaligns incentives. Per-resolution pricing is better but requires clear resolution definitions. Understand what you're paying for.
What testing and evaluation tools do you provide? If we can't test before deploying and audit after, we're flying blind.
What actions can your AI actually take, beyond answering questions? Get specific. "Integrates with Shopify" isn't the same as "can process refunds in Shopify."
Who are your customers in our industry, and can we talk to them? References matter. If they can't connect you with similar customers, ask why.
The mindset shift
AI support isn't about avoiding customers. It's about serving them better at scale.
The vendors who understand this are building fundamentally different products from those who don't. They measure resolution, not deflection. They focus on quality of interactions, not quantity. They invest in action-taking capabilities, not just answer-giving.
When you're evaluating AI support, the question isn't "which vendor has the best AI?" The question is "which vendor's design philosophy matches what we're trying to achieve?"
If you want to reduce support costs by making it harder for customers to get help, there are tools for that. If you want to deliver better support at scale by actually solving customer problems, there are tools for that too. They're not the same tools, even if they all call themselves "AI agents."
Book a call
See what Lorikeet is capable of








