Voice-Based AI Agents: Insights from Cypher 2024

Discover how Sarvam AI's voice-based AI agents enable scalable, multilingual, and affordable customer interactions.

Session

At Cypher 2024, Vivek Raghavan, Co-Founder of Sarvam AI, described how voice-based AI agents can transform customer engagement at scale. Raghavan discussed Sarvam AI’s mission to make generative AI accessible to billions of users, particularly in India, by focusing on small models, voice-led interactions, and organizational sovereignty. The session covered applications of voice-based AI in customer service, commercial transactions, and multilingual support.


Core Concepts

Small Models for Scalability

  • While large models like GPT are powerful, Raghavan emphasized that for tasks requiring repetitive or frequent use at scale, small models are more efficient.
  • Small models offer lower latency, consume less energy, and can handle high-frequency tasks, making them ideal for daily interactions at a population scale.

Voice-Led AI for India

  • India is a country where voice-based communication is integral, especially across diverse regional languages.
  • Sarvam AI has developed voice agents capable of understanding and responding in 10 Indian languages, allowing businesses to reach a wider audience with seamless voice interactions.
  • These agents are designed to work across channels, including phone calls, WhatsApp, and in-app interfaces.

Organizational Sovereignty

  • Raghavan highlighted the importance of organizational sovereignty, where businesses can maintain control over their data, compute, and AI models.
  • By using small models, organizations can deploy AI systems within their own infrastructure, ensuring data privacy and reducing dependency on third-party AI services.

Challenges and Solutions

Creating Scalable Voice Agents

  • Traditional generative AI solutions often focus on text-based interactions. Sarvam AI is building solutions that use voice as the primary mode of interaction.
  • These voice agents are multilingual by default, supporting seamless transitions between Indian languages in a single conversation.

Reducing Costs with Small Models

  • The cost of deploying large AI models for voice-based interactions can be prohibitive, especially for high-frequency tasks.
  • Sarvam AI’s focus on small, task-specific models allows businesses to deploy voice-based agents at a cost as low as ₹1 per minute, making AI-powered interactions affordable at scale.
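To make the per-minute figure concrete, here is a minimal back-of-the-envelope sketch in Python. Only the ₹1 per minute rate comes from the session. The call volume, average call length, and 30-day month are hypothetical inputs chosen for illustration, and the function name is invented for this example.

```python
def monthly_voice_cost(calls_per_day: int,
                       avg_minutes_per_call: float,
                       rate_inr_per_minute: float = 1.0,
                       days: int = 30) -> float:
    """Estimate monthly spend in rupees for a voice agent billed per minute.

    The default rate of 1.0 reflects the "as low as ₹1 per minute" figure
    quoted in the session; every other input is a hypothetical assumption.
    """
    total_minutes = calls_per_day * avg_minutes_per_call * days
    return total_minutes * rate_inr_per_minute


# Hypothetical workload: 10,000 calls a day, 3 minutes each.
estimate = monthly_voice_cost(calls_per_day=10_000, avg_minutes_per_call=3.0)
print(f"Estimated monthly cost: ₹{estimate:,.0f}")
```

Under these assumed inputs the estimate is ₹900,000 per month. Actual pricing would depend on the deployment and on terms the session did not detail.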

Multimodal AI Interactions

  • Sarvam AI’s voice agents support multimodal interactions, combining voice, text, images, and even UI elements within platforms like WhatsApp.
  • This allows businesses to offer complex, interactive experiences, such as product exploration or transaction completion, using voice commands augmented with visual and text-based content.


Implementation Insights

Voice AI Across Multiple Channels

  • Sarvam AI’s voice agents can be deployed across phone calls, WhatsApp, and apps, ensuring that businesses can engage with users through their preferred channels.
  • These agents can conduct tasks ranging from product recommendations to booking appointments, handling voice and text inputs interchangeably.

Orchestrating Multiple AI Models

  • Sarvam’s architecture involves orchestrating multiple AI models that handle speech recognition, translation, and response generation.
  • This orchestration ensures that voice-based interactions remain fast and contextually accurate, delivering real-time responses without the need for large, slow models.
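The orchestration idea can be sketched as a chain of small, single-purpose stages. The Python below is an illustrative sketch only, not Sarvam AI's architecture or API: the class, the stage signatures, and the stub functions are all hypothetical. It also assumes that translation goes through a pivot language around the response model, and it leaves out speech synthesis of the reply; the session did not describe either detail.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; none of these are real Sarvam AI interfaces.
Transcriber = Callable[[bytes, str], str]    # (audio, language) -> text
Translator = Callable[[str, str, str], str]  # (text, source, target) -> text
Responder = Callable[[str], str]             # (prompt) -> reply


@dataclass
class VoicePipeline:
    """Chains speech recognition, translation, and response generation."""
    transcribe: Transcriber
    translate: Translator
    respond: Responder
    pivot_language: str = "en"  # assumed working language of the responder

    def handle_turn(self, audio: bytes, user_language: str) -> str:
        text = self.transcribe(audio, user_language)
        needs_translation = user_language != self.pivot_language
        if needs_translation:
            text = self.translate(text, user_language, self.pivot_language)
        reply = self.respond(text)
        if needs_translation:
            reply = self.translate(reply, self.pivot_language, user_language)
        return reply


# Toy stand-ins so the sketch runs without any real models.
def fake_transcribe(audio: bytes, language: str) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, source: str, target: str) -> str:
    return f"[{source}->{target}] {text}"


def fake_respond(prompt: str) -> str:
    return f"reply to: {prompt}"


pipeline = VoicePipeline(fake_transcribe, fake_translate, fake_respond)
print(pipeline.handle_turn(b"namaste", "hi"))
```

Because each stage is an ordinary callable, one model can be swapped for another without touching the rest of the chain, which is one way a set of small models could stand in for a single large one.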

Low-Code Agent Deployment

  • Sarvam AI provides a low-code platform that allows businesses to create and deploy voice agents with minimal technical expertise.
  • Businesses can create fully functional voice agents in just 15 minutes, with the option to deploy them in 10 languages, reducing the time and complexity of AI integration.


Industry Impact

Widespread Adoption of Voice-Based AI

  • Voice agents are making it easier for businesses to engage with customers who prefer verbal interactions, especially in regions where low literacy or limited digital literacy is a barrier.
  • The ability to deploy AI agents that speak local languages opens up new opportunities for businesses to connect with rural and regional populations.

Commercial Transactions Through Voice

  • Sarvam AI’s agents can facilitate commercial transactions, including payments, through voice interfaces, transforming how businesses conduct sales and customer service.
  • This capability is particularly useful in sectors like e-commerce, healthcare, and customer support, where fast, frictionless interactions are critical.

Future of Sovereign AI

  • As businesses become more concerned about data privacy, the concept of sovereign AI, where organizations maintain full control over their data and AI models, is gaining traction.
  • Sarvam AI’s solutions align with this trend, offering businesses the ability to own and control their AI infrastructure while maintaining high efficiency.


Conclusion

  • Vivek Raghavan’s session at Cypher 2024 highlighted the transformative potential of voice-based AI agents in India and beyond.
  • By focusing on small, efficient models, multilingual capabilities, and organizational sovereignty, Sarvam AI is enabling businesses to deploy scalable, affordable AI solutions that meet the needs of diverse populations.
  • The future of AI in India lies in voice-led interactions, and Sarvam AI is at the forefront of this shift, making generative AI accessible to billions.
