Make your web apps hands-free with Amazon Nova Sonic
Graphical user interfaces have carried the torch for decades, but today’s users increasingly expect to talk to their applications. Amazon Nova Sonic is a state-of-the-art foundation model from Amazon Bedrock that helps enable this shift by providing natural, low-latency, bidirectional speech conversations over a simple streaming API. Users can collaborate with applications through voice and embedded intelligence rather than merely operating them.
In this post, we show how we added a true voice-first experience to a reference application, the Smart Todo App, turning routine task management into a fluid, hands-free conversation.
Rethinking user interaction through collaborative AI voice agents
Significant usability enhancements are often deprioritized, not because they aren’t valuable, but because they’re difficult to implement within traditional mouse-and-keyboard interfaces. Features like intelligent batch actions, customized workflows, or voice-guided assistance are frequently debated but deferred because of UI complexity. This is about voice as an additional, general-purpose interaction mode, not a replacement for device-specific controls or an accessibility-only solution. Voice enables new interaction patterns, and it also benefits users of assistive technologies, such as screen readers, by offering an additional, inclusive way to interact with the application.
Amazon Nova Sonic goes far beyond one-shot voice commands. The model can plan multistep workflows, call backend tools, and keep context across turns so that your application can collaborate with users.
The following table shows voice interactions from different application domains, such as task management, CRM, and help desk.
| Voice interaction (example phrase) | Intent / goal | System action / behavior | Confirmation / UX |
|---|---|---|---|
| Mark all my tasks as complete. | Bulk-complete tasks | Find user’s open tasks → mark complete → archive if configured | All 12 open tasks are marked complete. |
| Create a plan for preparing the Q3 budget: break it into steps, assign owners, and set deadlines. | Create multistep workflow | Generate plan → create tasks → assign owners → set deadlines → surface review options | Plan created with 6 tasks. Notify owners? |
| Find enterprise leads in APAC with ARR over $1M and draft personalized outreach. | Build targeted prospect list and draft outreach | Query CRM → assemble filtered list → draft personalized messages for review | Drafted 24 personalized outreach messages. Review and send? |
| Prioritize all P1 tickets opened in the last 24 hours and assign them to on-call. | Triage and assign | Filter tickets → set priority → assign to on-call → log changes | 12 P1 tickets prioritized and assigned to the on-call team. |
Amazon Nova Sonic understands the intent, invokes the required APIs, and confirms the results, with no forms required. This helps create an environment where productivity is multiplied and context becomes the interface. It’s not about replacing traditional UI; it’s about unlocking new capabilities through voice.
The sample application at a glance
With the Smart Todo reference application, users can create to-do lists and manage notes within those lists. The application offers a focused yet flexible interface for task tracking and note organization. With the addition of voice, the application becomes a hands-free experience that unlocks more natural and productive interactions. In the Smart Todo App, users can say:
- “Add a note to follow up on the project charter.”
- “Archive all completed tasks.”
Behind each command are focused actions, such as creating a new note, organizing content, or updating task status, executed through speech in a way that feels natural and efficient.
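One way to picture the link between a spoken command and a focused action is a small tool registry that the backend consults when the model requests a tool. The following is a minimal sketch, not the sample app's implementation: the tool names (`createNote`, `archiveCompletedTasks`), the `Task` and `TodoStore` shapes, and the in-memory store are all hypothetical stand-ins for the app's real APIs and data store.

```typescript
// Hypothetical in-memory model; the real app keeps its data in Amazon DynamoDB.
interface Task {
  id: number;
  title: string;
  completed: boolean;
  archived: boolean;
}

interface TodoStore {
  tasks: Task[];
  notes: string[];
}

// A tool handler receives the arguments the model extracted from speech
// and returns a short confirmation that can be spoken back to the user.
type ToolHandler = (store: TodoStore, args: Record<string, string>) => string;

// Hypothetical tool names; each maps one voice intent to one focused action.
const toolRegistry: Record<string, ToolHandler> = {
  createNote: (store, args) => {
    const text = args["text"] ?? "";
    store.notes.push(text);
    return `Added note: ${text}`;
  },
  archiveCompletedTasks: (store) => {
    let count = 0;
    for (const task of store.tasks) {
      if (task.completed && !task.archived) {
        task.archived = true;
        count += 1;
      }
    }
    return `Archived ${count} completed task(s).`;
  },
};

// Dispatch a tool request by name; unknown tools get a safe fallback message.
function runTool(
  store: TodoStore,
  toolName: string,
  args: Record<string, string> = {}
): string {
  const handler = toolRegistry[toolName];
  if (handler === undefined) {
    return `Unknown tool: ${toolName}`;
  }
  return handler(store, args);
}
```

With this shape, “Archive all completed tasks.” would resolve to `runTool(store, "archiveCompletedTasks")`, and the returned string doubles as the spoken confirmation.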
How Amazon Nova Sonic bidirectional APIs work
Amazon Nova Sonic implements a real-time, bidirectional streaming architecture. After a session is initiated with `InvokeModelWithBidirectionalStream`, audio input and model responses flow simultaneously over an open stream:
- Session start – Client sends a `sessionStart` event with model configuration (for example, temperature and topP).
- Prompt and content start – Client sends structured events indicating whether upcoming data is audio, text, or tool input.
- Audio streaming – Microphone audio is streamed as base64-encoded audio input events.
- Model responses – As the model processes input, it streams the following responses asynchronously:
  - Automatic speech recognition (ASR) results
  - Tool use invocations
  - Text responses
  - Audio output for playback
- Session close – Conversations are explicitly closed by sending `contentEnd`, `promptEnd`, and `sessionEnd` events.
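The lifecycle above can be sketched as an ordered list of JSON events for a single audio turn. This is a minimal sketch under stated assumptions: the event names `sessionStart`, `contentEnd`, `promptEnd`, and `sessionEnd` and the temperature and topP settings come from the description above, while the remaining event names (`promptStart`, `contentStart`, `audioInput`) and every field name (`inferenceConfiguration`, `promptName`, `contentName`, and so on) are assumptions to verify against the Amazon Nova Sonic documentation. The function only builds payloads; sending them over `InvokeModelWithBidirectionalStream` is left out.

```typescript
// One event on the stream, for example { event: { sessionStart: {...} } }.
type SonicEvent = { event: Record<string, unknown> };

interface TurnOptions {
  promptName: string; // assumed identifier tying events to one prompt
  contentName: string; // assumed identifier for this audio content block
  temperature: number;
  topP: number;
  audioChunksBase64: string[]; // microphone audio, already base64-encoded
}

function buildAudioTurnEvents(opts: TurnOptions): SonicEvent[] {
  const { promptName, contentName } = opts;
  const events: SonicEvent[] = [];

  // Session start: carries the model configuration.
  events.push({
    event: {
      sessionStart: {
        inferenceConfiguration: { temperature: opts.temperature, topP: opts.topP },
      },
    },
  });

  // Prompt and content start: announce that audio input follows.
  events.push({ event: { promptStart: { promptName } } });
  events.push({
    event: { contentStart: { promptName, contentName, type: "AUDIO", role: "USER" } },
  });

  // Audio streaming: one event per base64-encoded chunk.
  for (const content of opts.audioChunksBase64) {
    events.push({ event: { audioInput: { promptName, contentName, content } } });
  }

  // Session close: end the content, then the prompt, then the session.
  events.push({ event: { contentEnd: { promptName, contentName } } });
  events.push({ event: { promptEnd: { promptName } } });
  events.push({ event: { sessionEnd: {} } });
  return events;
}

// Returns the event's name (its single top-level key), or "" if there is none.
function eventName(e: SonicEvent): string {
  return Object.keys(e.event)[0] ?? "";
}
```

In a live session the events are sent as they are produced rather than collected up front, and model responses arrive in between; building the list in one place simply makes the required ordering easy to see.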
Nova Sonic Architecture Diagram
You can use this event-driven approach to interrupt the assistant (barge-in), enable multi-turn conversations, and support real-time adaptability.
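Barge-in usually needs cooperation from the client: when the user starts speaking over the assistant, audio that is queued but not yet played should be dropped. The sketch below shows one way to do that with a playback queue. It is an illustration rather than the sample app's code: the `PlaybackQueue` class and the `ModelEvent` shapes (`audioOutput`, `textOutput`, `interrupted`) are hypothetical simplifications of whatever the stream actually delivers.

```typescript
// Hypothetical, simplified shapes for events arriving from the model stream.
type ModelEvent =
  | { kind: "audioOutput"; chunk: string } // base64 audio to play
  | { kind: "textOutput"; text: string } // transcript or response text
  | { kind: "interrupted" }; // the user spoke over the assistant

class PlaybackQueue {
  private chunks: string[] = [];

  enqueue(chunk: string): void {
    this.chunks.push(chunk);
  }

  // Returns the next chunk to play, or undefined when the queue is empty.
  next(): string | undefined {
    return this.chunks.shift();
  }

  // Barge-in: drop everything not yet played and report how much was dropped.
  clear(): number {
    const dropped = this.chunks.length;
    this.chunks = [];
    return dropped;
  }

  get length(): number {
    return this.chunks.length;
  }
}

// Route each incoming event: queue audio, record text, flush on interruption.
function handleModelEvent(
  queue: PlaybackQueue,
  transcript: string[],
  ev: ModelEvent
): void {
  switch (ev.kind) {
    case "audioOutput":
      queue.enqueue(ev.chunk);
      break;
    case "textOutput":
      transcript.push(ev.text);
      break;
    case "interrupted":
      queue.clear();
      break;
  }
}
```

Keeping the queue separate from the audio player means an interruption only has to empty one array; audio that arrives afterward belongs to the assistant's next response and queues normally.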
Solution architecture
For this solution, we use a serverless application architecture pattern, where the UI is a React single-page application integrated with backend web APIs running on server-side containers. The Smart Todo App is deployed using a scalable and security-aware AWS architecture that’s designed to support real-time voice interactions. The following image provides an architecture overview of the AWS services working together to support the bidirectional streaming needs of a voice-enabled application.

Key AWS services include:
- Amazon Bedrock – Powers real-time, bidirectional speech interactions through the Amazon Nova Sonic foundation model.
- Amazon CloudFront – A content delivery network (CDN) that distributes the application globally with low latency. It routes `/` (root) traffic to the React application hosted on an Amazon S3 bucket, and `/api` and `/novasonic` traffic to the Application Load Balancer.
- AWS Fargate for Amazon Elastic Container Service (Amazon ECS) – Runs the backend containerized services for WebSocket handling and REST APIs, capable of supporting long-lived bidirectional streams.
- Application Load Balancer (ALB) – Forwards web traffic: `/api` (HTTPS REST API calls) to the backend ECS services that handle the Smart Todo App APIs, and `/novasonic` (WebSocket connections) to the ECS services that manage real-time voice streaming with Amazon Nova Sonic.
- Amazon Virtual Private Cloud (Amazon VPC) – Provides network isolation and security for backend services. The public subnets host the Application Load Balancer (ALB), and the private subnets host the ECS Fargate tasks running the WebSocket and REST APIs.
- NAT Gateway – Allows Amazon ECS tasks in private subnets to more securely connect to the internet for operations such as reaching Cognito JWT token verification endpoints.
- Amazon Simple Storage Service (Amazon S3) – Hosts the React frontend for user interactions.
- AWS WAF – Helps protect the Application Load Balancer (ALB) from malicious traffic and enforces security rules at the application layer.
- Amazon Cognito – Manages authentication and issues tokens.
- Amazon DynamoDB – Stores application data such as to-do lists and notes.
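The path-based routing described for CloudFront and the ALB can be summarized as a small pure function. This is an illustrative sketch only: the real routing is configured in CloudFront and on the load balancer by the CDK project, and the target labels (`s3-frontend`, `ecs-rest-api`, `ecs-voice-websocket`) are hypothetical names, not resource identifiers from the sample.

```typescript
// Hypothetical labels for the three destinations described in the list above.
type Target = "s3-frontend" | "ecs-rest-api" | "ecs-voice-websocket";

// True when the path is exactly the prefix or sits underneath it,
// so "/api" and "/api/todos" match but "/apiary" does not.
function hasPrefix(path: string, prefix: string): boolean {
  return path === prefix || path.indexOf(prefix + "/") === 0;
}

function routeRequest(path: string): Target {
  if (hasPrefix(path, "/novasonic")) {
    return "ecs-voice-websocket"; // WebSocket voice streaming
  }
  if (hasPrefix(path, "/api")) {
    return "ecs-rest-api"; // HTTPS REST API calls
  }
  return "s3-frontend"; // everything else is the React app
}
```

For example, `routeRequest("/api/todos")` yields `"ecs-rest-api"`, while `routeRequest("/index.html")` falls through to `"s3-frontend"`.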
The following image illustrates how user requests are served with support for low-latency bidirectional streaming.
Request Workflow
Deploying the solution
To evaluate this solution, we provide sample code for the Smart Todo App, available in the GitHub repository.
The Smart Todo App consists of multiple independent Node.js projects, including a CDK infrastructure project, a React frontend application, and backend API services. The deployment workflow makes sure that the components are correctly built and integrated with AWS services like Amazon Cognito, Amazon DynamoDB, and Amazon Bedrock.
Prerequisites
Deployment steps
- Clone the next repository:
- For first-time deployment, use the next automated script:
This script will:
- Install the dependencies using npm (Node package manager)
- Build the components and container image using the locally installed Docker engine
- Deploy the infrastructure using CDK (CDK Bootstrap ==> CDK Synth ==> CDK Deploy)
- Update environment variables with Amazon Cognito settings
- Rebuild the UI with updated environment variables
- Deploy the final infrastructure (CDK Deploy)
Verifying deployment
After deployment is successful, complete the following steps:
- Entry the Amazon CloudFront URL offered within the CDK outputs.
Note: The URL shown in the image is for reference only; each deployment gets a unique URL.
Successful deployment screenshot
- Create a new user by signing up using the Create Account section.
Create User and Log in
- Test the voice functionality to verify the integration with Amazon Nova Sonic. The following image illustrates a conversation between the signed-in user and the Amazon Bedrock agent. The AI agent is able to invoke existing APIs, and the UI is updated in real time to reflect the agent’s actions.
Granting microphone access to the application
Voice interaction in Smart Todo App
Clean up
You can remove the stacks with the following command.
Next steps
Voice isn’t just an accessibility add-on; it’s becoming the primary interface for complex workflows.
It turns out talking is faster than clicking, especially when your app talks back.
Try these resources to get started.
- Sample Code repo – A working Amazon Nova Sonic integration you can run locally. See how real-time voice interactions, intent handling, and multistep flows are implemented end to end.
- Amazon Nova Sonic hands-on workshop – A guided lab that walks you through deploying Amazon Nova Sonic in your AWS account and testing voice-native features.
- Amazon Nova Sonic docs – Provides API reference, streaming examples, and best practices to help you design and deploy voice-driven workflows.
- Contact your AWS account team to learn more about how AI-driven solutions can transform your operations.
About the authors
Manu Mishra is a Senior Solutions Architect at AWS, specializing in artificial intelligence, data and analytics, and security. His expertise spans strategic oversight and hands-on technical leadership, where he reviews and guides the work of both internal and external customers. Manu collaborates with AWS customers to shape technical strategies that drive impactful business outcomes, providing alignment between technology and organizational goals.
AK Soni is a Senior Technical Account Manager with AWS Enterprise Support, where he empowers enterprise customers to achieve their business goals by offering proactive guidance on implementing innovative cloud and AI/ML-based solutions aligned with industry best practices. With over 19 years of experience in enterprise application architecture and development, he uses his expertise in generative AI technologies to enhance business operations and overcome existing technological limitations.
Raj Bagwe is a Senior Solutions Architect at Amazon Web Services, based in San Francisco, California. With over 6 years at AWS, he helps customers navigate complex technological challenges and specializes in cloud architecture, security, and migrations. In his spare time, he coaches a robotics team and plays volleyball. He can be reached at X handle @rajesh_bagwe.