
Sesame, the conversational AI startup founded by Oculus veterans, has secured $250 million in a Series B funding round and simultaneously opened access to the beta version of its voice-first AI app on iOS.
This marks a step forward in Sesame’s mission to bring highly expressive, natural-sounding voice AI to everyday users, while also setting the stage for the company’s future production of smart glasses hardware.
The funding round, led by Sequoia Capital and Spark Capital, values Sesame at over $1 billion, reflecting investor confidence in the company’s approach to building emotionally intelligent, AI-powered voice assistants.
Sesame’s appeal lies in its fundamentally different approach to voice AI. Unlike traditional voice assistants that convert text to speech through a rigid pipeline, Sesame’s proprietary Conversational Speech Model (CSM) generates speech directly from both text and audio tokens, enabling natural conversational flow with swift response times.
According to Sesame, CSM “leverages the history of the conversation to produce more natural and coherent speech.”
Sesame’s technology captures rhythm, emotion, and expressiveness in real time, which allows the model to imitate behaviours that make interactions feel remarkably human: it can laugh, pause mid-sentence, adjust its tone dynamically, and respond to emotional cues.
When Sesame released a demo of its two popular voice AI models, Maya and Miles, in February, the company emphasized its mission of building a fully independent voice AI assistant that aims to be remarkably human-like.
“At Sesame, our goal is to achieve “voice presence” – the magical quality that makes spoken interactions feel real, understood, and valued,” the company wrote in a post. “We are creating conversational partners that do not just process requests; they engage in genuine dialogue that builds confidence and trust over time. In doing so, we hope to realize the untapped potential of voice as the ultimate interface for instruction and understanding.”
In the first few weeks after the demo release, more than one million people engaged with Maya and Miles, with many praising how fun the experience was.
Sequoia Capital also noted in its investment announcement that the experience of spending hours talking to both models “was unlike anything we’d used before,” and that this was a deciding factor in the venture capital (VC) firm’s decision to partner with Sesame and co-lead this Series B funding round.
“Sesame’s conversational layer felt different. It doesn’t just translate LLM output into audio – it generates speech directly, capturing the rhythm, emotion, and expressiveness of real dialogue,” the company wrote. “The voices felt alive – engaging, witty, even surprising. It was fun. Maya and Miles both have character and personality.”
Additionally, Sesame aims to embed its AI companion into lightweight, fashion-forward smart glasses designed for all-day wear. Sequoia emphasized that Sesame’s eyewear will be stylish enough to wear even without AI functionality turned on.
The $250 million will allow Sesame to build glasses that feature high-quality audio and an AI companion that “observes the world alongside you,” creating a multimodal experience that combines voice with environmental awareness.
The startup is led by Brendan Iribe, Oculus co-founder and former CEO, alongside Ankit Kumar, the former CTO of AR startup Ubiquity6.
For now, beta testers of the voice-first AI app are required to maintain confidentiality about their experiences, which suggests Sesame may be interested in carefully controlling the narrative around its product as it scales toward wider availability.
As AI spreads into every area of users’ lives and becomes more conversational and human-like, Sesame’s approach offers a glimpse into a future where devices listen, understand, and respond much like the companions they are meant to be.
In other words, the convergence of AI, wearables, and conversational interfaces stands to reshape consumer tech.
