JSNation US 2024

Web Speech API Insights

Ana Rodrigues

Introducing the Web Speech API

The Web Speech API lets developers integrate speech capabilities directly into web applications. It has two primary components: speech recognition and speech synthesis. Our focus here is on speech recognition, which lets developers use spoken input for tasks such as form input and continuous dictation.

Despite its potential, this API presents certain challenges, notably its inconsistent browser support. While some browsers like Chrome use a server-based recognition engine, others have limited or no support, which can be a hurdle for universal application.

Understanding Browser Support Challenges

The journey with the Web Speech API is not without its obstacles. One of the biggest hurdles is browser compatibility. Chrome, for instance, utilizes a server-based recognition engine, meaning audio is sent to a web service for processing. This limits offline functionality and raises privacy concerns.

On the other hand, browsers like Firefox have yet to implement this feature fully, citing privacy and data processing concerns. This inconsistency in support makes it challenging to create a universally accessible application, as developers must account for varying levels of functionality across different browsers.
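Because support varies this much, a common first step is to detect the recognition constructor before relying on it. The following is a minimal sketch, not a definitive implementation: `getSpeechRecognition` and `createRecognizer` are hypothetical helper names, and the `scope` parameter is an assumption added so the logic can be tried outside a browser.

```javascript
// Minimal feature-detection sketch. `scope` is an illustrative assumption:
// it defaults to globalThis but can be replaced with a stub object.
function getSpeechRecognition(scope = globalThis) {
  // Some browsers, Chrome among them, expose the constructor under a
  // webkit prefix, so both names are checked.
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

// Hypothetical helper: returns a configured recognizer, or null when the
// browser offers no speech recognition and a fallback UI is needed.
function createRecognizer(scope = globalThis, lang = "en-US") {
  const Recognition = getSpeechRecognition(scope);
  if (!Recognition) return null;
  const recognizer = new Recognition();
  recognizer.lang = lang;            // language used for recognition
  recognizer.interimResults = true;  // report partial results while speaking
  return recognizer;
}
```

When `createRecognizer()` returns `null`, the page can fall back to a plain text input instead of failing silently.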

Real-World Applications and Limitations

Despite its limitations, the Web Speech API has found use in several applications. A notable example is Google Translate's microphone function, which allows users to speak into an input field and see the text translated in real time.

However, the API's reliance on server-based recognition engines means it can't be used offline, and only browsers backed by large corporations with access to extensive data sets can leverage these capabilities fully. This creates a gap between the potential of the API and its real-world applicability.

Experimenting with Fun Projects

The exploration of the Web Speech API can lead to engaging projects. One such example is creating a gamified karaoke experience in a browser. By using speech recognition to match lyrics being sung to the displayed lyrics, it's possible to create a fun, interactive experience.
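The matching step can be illustrated with a small scoring function. This is a sketch under stated assumptions rather than the approach used in the talk: `normalize`, `scoreLine` and `isLineSung` are hypothetical names, and the 0.6 threshold is an arbitrary illustrative value.

```javascript
// Lowercase the text, drop punctuation, and split it into words.
function normalize(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s']/gu, " ")
    .split(/\s+/)
    .filter(Boolean);
}

// Fraction of the expected lyric's words that appear in the transcript.
// Word order is ignored, which keeps the sketch tolerant of small
// recognition errors.
function scoreLine(expectedLyric, transcript) {
  const expected = normalize(expectedLyric);
  if (expected.length === 0) return 0;
  const heard = new Set(normalize(transcript));
  const hits = expected.filter((word) => heard.has(word)).length;
  return hits / expected.length;
}

// Hypothetical rule: a line counts as sung when enough words were heard.
// The default threshold is an assumption, not a tuned value.
function isLineSung(expectedLyric, transcript, threshold = 0.6) {
  return scoreLine(expectedLyric, transcript) >= threshold;
}
```

In a browser, the transcript would come from the recognizer's `result` event; here it is passed in as a plain string so the scoring can be tried on its own.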

However, this is not without its quirks. Speech recognition stops after a period of inactivity to conserve resources. Developers can work around this by adding event listeners that restart recognition when it ends, but on mobile devices this becomes annoying, because a notification sound plays each time the microphone's status changes.
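The restart workaround can be sketched as a small wrapper around a recognizer. This is an illustrative sketch only: `keepListening` is a hypothetical helper name, and it assumes it is given an object with the `start()`, `stop()` and `end` event of the `SpeechRecognition` interface.

```javascript
// Keeps a recognizer running by restarting it whenever it ends, until the
// caller explicitly stops the session.
function keepListening(recognition) {
  let wanted = true; // false once the caller asks to stop

  recognition.continuous = true;
  recognition.addEventListener("end", () => {
    // The browser ended recognition (for example after silence);
    // restart unless the stop was requested by the caller.
    if (wanted) recognition.start();
  });
  recognition.start();

  return {
    stop() {
      wanted = false;
      recognition.stop();
    },
  };
}
```

Tracking the `wanted` flag matters: without it, calling `stop()` would trigger the `end` listener and start recognition again.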

Building a Simple Demo

To see the Web Speech API in action, a simple demo can be created. For instance, voice navigation in a kitchen setting can be useful when your hands are occupied. By using voice commands to scroll through a recipe, users can interact with the page without touching the device.

This demo highlights the API's potential for hands-free interaction, although it requires fine-tuning to ensure accurate recognition and response to commands, especially in noisy environments or with non-native accents.
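A voice-scrolling demo of this kind can be split into two parts: turning a transcript into a command, and applying that command to the page. The sketch below is one possible shape, not the demo from the talk. The phrases, the `COMMANDS` table, and the `parseCommand` and `applyCommand` helpers are all assumptions made for illustration.

```javascript
// Hypothetical command table: each action lists the phrases that trigger it.
const COMMANDS = [
  { action: "top", phrases: ["back to top", "go to top"] },
  { action: "down", phrases: ["scroll down", "next step"] },
  { action: "up", phrases: ["scroll up", "previous step"] },
];

// Returns the first action whose phrase occurs in the transcript, or null.
function parseCommand(transcript) {
  const heard = transcript.toLowerCase().trim();
  for (const { action, phrases } of COMMANDS) {
    if (phrases.some((phrase) => heard.includes(phrase))) return action;
  }
  return null;
}

// Applies an action to a window-like object; `win` is a parameter so the
// function can be tried with a stub instead of a real browser window.
function applyCommand(action, win) {
  const step = win.innerHeight * 0.8; // scroll most of one screen at a time
  if (action === "down") win.scrollBy({ top: step, behavior: "smooth" });
  if (action === "up") win.scrollBy({ top: -step, behavior: "smooth" });
  if (action === "top") win.scrollTo({ top: 0, behavior: "smooth" });
}

// In a browser, the two parts could be wired to a recognizer like this:
// recognition.addEventListener("result", (event) => {
//   const latest = event.results[event.results.length - 1];
//   const action = parseCommand(latest[0].transcript);
//   if (action) applyCommand(action, window);
// });
```

Matching on short fixed phrases keeps the commands clear and direct, and unrecognized speech is simply ignored rather than guessed at.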

Potential and Future Directions

There's significant potential in the Web Speech API, but it's not quite there yet for mainstream use. The technology's imperfections are apparent, but it offers a great experimental platform for developers. Many fun demos and projects highlight its capabilities, even if they're not perfect.

Developers interested in voice interfaces should consider designing with accessibility in mind. This means avoiding vague content, ensuring voice commands are clear and direct, and testing how synthesized speech sounds across different devices and contexts.

Conclusion

The Web Speech API offers intriguing possibilities for integrating speech recognition into web applications. While challenges like inconsistent browser support and server-based processing exist, the API remains an exciting tool for experimentation. Developers can learn a lot by building with these APIs, exploring voice interface design, and contributing to the growth of this technology.

This talk was presented at JSNation US 2024.

FAQ

Ana is a frontend developer at the agency Hattar and a member of the IndieWeb community. She spends her free time blogging and experimenting with web technologies.

The talk is about creating a gamified karaoke experience in a browser using the Web Speech API, focusing on speech recognition and its challenges and potential.

The Web Speech API is a browser API that includes speech recognition and speech synthesis functionalities. It is used for applications like form input, continuous dictation, and control.

Ana faced challenges with browser support: the Web Speech API is not supported by all browsers and often requires server-based processing, which raises privacy concerns and rules out offline use.

The purpose of Ana's karaoke project is to create a more interactive and gamified karaoke experience using web technologies, particularly the Web Speech API, to enhance user engagement.

Limitations include lack of support in some browsers, reliance on server processing in others, the inability to work offline, and privacy concerns.

Alternative projects include Tony Edwards' talk on using the Web Speech API for jotting down rhymes and Stephanie Eckles' 12 Days of Web Dev Challenge. There are also polyfills and projects like Mozilla's Common Voice.

Ana advises that side projects don't need to be monetized or become open source to be valid. Building "useless" things can be fun and educational, and it's okay to create for personal satisfaction.

While it may not be widely used at work, the Web Speech API can be utilized for accessibility, voice interfaces, and experimental projects that explore the capabilities of speech recognition in web applications.

Ana mentions The Rasmus in a personal anecdote that inspired her karaoke project: only one song by The Rasmus was available at karaoke, and it wasn't her favorite.

21 min

21 Nov, 2024
