Today, we introduced Alexa+, the next generation of Alexa. Rebuilt from the ground up, this new version of Alexa uses a state-of-the-art architecture that automatically connects a variety of large language models (LLMs), agentic capabilities, services, and devices at scale. This makes Alexa+ much more conversational, smarter, personalized, and capable of getting more things done for customers.
With an undertaking of this scale, the team had to solve many technical challenges along the way. Here are five of the biggest advances we made to unlock Alexa+ and deliver our next-generation assistant, powered by generative AI.

We built an all-new architecture to connect to tens of thousands of services and devices

LLMs are great for conversations, but they don’t inherently support APIs, which are the core protocols for getting things done outside of a chat window and in the real world for customers—things like finding and booking an appointment or ordering your groceries. To augment the native capabilities of LLMs, we built an all-new architecture to orchestrate APIs at scale.
This architecture is what will let customers quickly and seamlessly connect with services they already use in their daily life: GrubHub, OpenTable, Ticketmaster, Yelp, Thumbtack, Vagaro, Fodor’s, Tripadvisor, Amazon, Whole Foods Market, Uber, Spotify, Apple Music, Pandora, Netflix, Disney+, Hulu, Max, smart home devices from companies like Philips Hue and Roborock, and so much more.
On top of that, we enabled LLMs not only to integrate with APIs, but to string together multiple such calls in a row. That lets us build on LLMs’ natural strength in free-form conversation and make them even more useful by handling multifaceted requests.
The result is an experience more like one you’d have with a real-life personal assistant. For example, you can ask Alexa+ for help making a lunch reservation at your favorite restaurant and to share that plan with your friend, and Alexa+ will not only book that reservation but also send a text message to your requested contact.
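To make the idea concrete, here is a minimal sketch of that kind of multi-step orchestration: a planner decides on a sequence of API calls, and an executor runs them in order. The tool names, planner logic, and data below are entirely hypothetical stand-ins; the real Alexa+ architecture is far more sophisticated.

```python
# Sketch of multi-step API orchestration with hypothetical tools.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ToolCall:
    name: str
    args: dict

def book_reservation(restaurant: str, time: str) -> str:
    # In a real system this would call a partner API (e.g. a reservation service).
    return f"Booked {restaurant} at {time}"

def send_text(contact: str, message: str) -> str:
    # Placeholder for a messaging API call.
    return f"Texted {contact}: {message}"

TOOLS: dict[str, Callable[..., str]] = {
    "book_reservation": book_reservation,
    "send_text": send_text,
}

def plan(request: str) -> list[ToolCall]:
    # Stand-in for the LLM planner: a real planner would derive this
    # sequence of calls from the natural-language request.
    return [
        ToolCall("book_reservation", {"restaurant": "Luigi's", "time": "12:30"}),
        ToolCall("send_text", {"contact": "Sam", "message": "Lunch at Luigi's, 12:30!"}),
    ]

def orchestrate(request: str) -> list[str]:
    # Execute each planned call in order, collecting results.
    return [TOOLS[call.name](**call.args) for call in plan(request)]
```

The key design point is the separation of concerns: the model only has to emit a plan, while deterministic code handles the actual API execution.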

We built systems to deliver accurate, real-time info

One of the biggest challenges with LLMs is that despite their brilliance, they can be unpredictable. They often give different answers to the same questions, and sometimes even hallucinate. That can work in some instances for chat, but not when you’re controlling devices in your home or asking about what’s happening in the news. So, we built all-new systems to help Alexa+ leverage grounding techniques when answering customer questions.
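A toy illustration of the grounding idea: retrieve text from a trusted source and instruct the model to answer only from it, rather than relying on the model's parametric memory. The retrieval function, data, and prompt wording here are invented for illustration.

```python
# Toy grounding sketch: constrain answers to retrieved source text.
ARTICLES = {
    "weather": "Forecast: sunny, high of 72F.",
    "news": "City council approved the new transit plan today.",
}

def retrieve(query: str) -> str:
    # Stand-in for a real retrieval system over vetted sources.
    for topic, text in ARTICLES.items():
        if topic in query.lower():
            return text
    return ""

def grounded_prompt(query: str) -> str:
    context = retrieve(query)
    # Instructing the model to answer only from supplied context
    # reduces the chance of hallucinated facts.
    return (
        f"Answer using ONLY this source:\n{context}\n"
        f"If the source doesn't cover it, say so.\nQuestion: {query}"
    )
```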
We also partnered with world-class news sources including the Associated Press, Reuters, The Washington Post, TIME, Forbes, Business Insider, Politico, USA TODAY, publications from Condé Nast, Hearst, and Vox, and more than 200 additional outlets, so Alexa+ can deliver accurate, real-time news and information. This gives Alexa+ an incredible depth of knowledge that keeps growing.

We minimized latency

Customers expect Alexa to be fast, yet there’s inherent tension when balancing accuracy and speed.
To manage that tradeoff, we built a sophisticated routing system using state-of-the-art models from Amazon Bedrock—including Amazon Nova and Anthropic Claude—instantly matching each customer request with the best model for the task at hand, balancing all the requirements of a crisp, conversational experience.
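In spirit, a router like this classifies each request and sends it to the cheapest, fastest model that can handle it well. The rules and model labels below are invented placeholders, not Alexa+'s actual routing logic.

```python
# Simplified sketch of request routing: pick the smallest model
# that can handle the task, so simple requests stay fast.
def route(request: str) -> str:
    words = request.lower().split()
    # Short device commands go to a small, low-latency model.
    if any(w in words for w in ("on", "off", "play", "pause")):
        return "fast-small-model"
    # Long or multi-step requests go to a larger reasoning model.
    if len(words) > 12 or "plan" in words:
        return "large-reasoning-model"
    # Everything else gets a balanced default.
    return "balanced-model"
```

A production router would itself be model-based rather than rule-based, but the tradeoff it manages is the same: accuracy where it matters, speed everywhere else.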

We keep Alexa’s personality and personalize responses to you

Customers have long told us that they and their families love Alexa's personality. From the very beginning, we built our AI assistant to be smart, considerate, empathetic, and inclusive—and to have a sense of humor. So, we optimize each model built into our architecture to ensure it reflects Alexa’s personality.
We also want Alexa+ to grow with customers—to personalize the experience based on things you’d want Alexa to remember, like your favorite music artists, books you want to read, or the types of food you just don’t like. Alexa+ does this, both implicitly and explicitly, by matching common patterns, occasionally asking to confirm your preferences, and recalling specific facts you ask Alexa to remember. The underlying system then incorporates these affinities so Alexa+ can deliver the most relevant responses for each request—meaning, the more you use Alexa+, the better your experience will get.

We added agentic capabilities

For Alexa+ to be an incredibly useful AI assistant, we couldn’t limit the experience to only work with APIs that exist today. Not every company has a ready-built set of externalized APIs, or at least not one that can handle every action a customer would like to take.
So, we added agentic capabilities, essentially teaching Alexa+ to navigate the digital world as a person would. That means a customer can make a request, and Alexa+ can navigate to a provider’s website and complete the task on the customer’s behalf.
Developers who want to build experiences for Alexa+ can learn more in the Amazon Developer blog.
Alexa+ isn't just another AI chatbot. It’s our next-generation AI assistant that is much more conversational, smarter, personalized, and capable of getting even more things done for customers. We can’t wait for you to try it.