Let’s Build a TV Spatial Navigation

In this talk, I'll take you through my journey as I joined the team supporting our Smart TVs application and share my experience learning one of the most overlooked but essential pieces of functionality we have.

Video Summary and Transcription
The video delves into spatial navigation, highlighting the challenges of implementing it in smart TV applications. It explains the need for a spatial navigation library due to the diverse operating systems of smart TVs. The talk explores how to build a web application for the user interface, which simplifies maintenance but loses native platform support for spatial navigation. The lack of browser support for spatial navigation is addressed, with a proposal in progress. The video discusses using the getBoundingClientRect method to select the next node after a key press, filtering nodes by direction and main axis, and selecting the closest one based on distance. The NavigationEngine class is updated with the handleNavigation method to implement this logic. The video also covers using a useFocusRef hook function to manage focus without static IDs, and mentions challenges like complex arrangements and circular navigation. It mentions using React and React Router DOM for the demo application, with navigational nodes registered and key events from the TV control handled by the engine. The video encourages developers to explore open-source projects for spatial navigation, like the one from Norigin Media, and to get involved in building a community around smart TV application development.

FAQ

The topic of Sergio Avalos's talk is Spatial Navigation for smart TV applications.

Sergio Avalos is a software engineer at Spotify, working on the team behind the Spotify client that runs on smart TVs.

Spatial Navigation is a term used to describe the process of navigating a TV interface using the directional keys on a TV remote control.

Using IDs for navigational elements can be error-prone, difficult to work with dynamic views, and adds extra information unrelated to the application logic.

As of 2023, browser support for Spatial Navigation is still a work in progress. There is a proposal in draft, but it is not yet implemented.

Yes, there is an open-source project provided by Norigin Media, released in 2019, but it wasn't available when Spotify's smart TV application was initially developed.

Sergio Avalos suggests using a hook function that returns a callback for setting the reference of the HTML element and managing focus without relying on static IDs.

Some advanced challenges in Spatial Navigation include handling non-matrix layouts, managing focus on pop-ups, and implementing circular navigation for convenience.

Developers can use the library linked in Sergio Avalos's presentation to start building smart TV applications without developing Spatial Navigation logic from scratch.

A library for Spatial Navigation is needed because smart TVs have different operating systems, and using a web application for the user interface can lose native platform support, including Spatial Navigation.

1. Introduction to Spatial Navigation

Short description:

Welcome to the talk on Spatial Navigation. We'll be discussing the challenges of implementing spatial navigation for TV controls and why a library is needed. The market for smart TVs has multiple brands with their own operating systems, making it necessary to have native applications for each. However, to simplify maintenance, we built a web application for the user interface. Unfortunately, this approach resulted in the loss of native platform support for spatial navigation. Although there is a proposal to provide this functionality in browsers, it is still a work in progress.

Welcome, everyone. Thank you very, very much for joining this talk.

My name is Sergio Avalos, and we're going to be talking about Spatial Navigation. But rather than just talking, we're going to be building it.

I'm a software engineer at Spotify, and about a year ago I joined the team behind the Spotify client that runs on your smart TV. That means that for this talk, we're not going to be talking about mobile, nor are we going to be talking about desktop. And most importantly, we're not going to be talking about the mouse. Instead, we're going to be talking about the TV control, that gadget that I bet all of you have in your living rooms.

Spatial navigation is nothing more than a fancy name for what you do with the TV control when you press the directional keys, the arrow keys, to move between applications and select one of them. That got me very curious when I joined the team, because I was surprised that one had to create a library for that. So I decided to dig into the code and I was fascinated. Not because the code was amazing, I mean, the code was fine, but because I felt that it was a very interesting problem to solve. So that's what this talk is about. I want to share with you how I learned about this library, and what better way to learn than just building it ourselves?

But in case you wonder, because that was my first impression, why do we need to build a library for spatial navigation? I mean, isn't it a utility that should be provided by the platform? And the answer is yes, totally, if you're building a native application. Let me try to explain.

The market for smart TVs is quite segmented: there are many brands and each of them runs its own operating system. That means that you need to have a native application running on each of them. But, just to make our lives easier and reduce the maintenance costs, we decided to build the user interface as a web application that can be loaded in each of the native apps. That gave us the great benefit of shipping the same code to all these native applications. But it came at the cost of losing the support from the native platform, which in this case includes spatial navigation.

Then, I was thinking, okay, okay. But, the year is 2023. Shouldn't that be provided by the browser? I mean, the browser, nowadays, is a very sophisticated piece of software. And, the answer is not yet. It's a work in progress. There is a proposal. It's still a draft for building this functionality, but it's not there yet.

2. Improving the Approach to Spatial Navigation

Short description:

We need to continue waiting. Are there any open-source projects we could use? Norigin Media released one in 2019, but our application is older. Let's start building it. Wrap each navigational element with an ID and tell it where to go. This approach has caveats: it is difficult with dynamic views, prone to mistakes, and adds extra information. Let's improve this approach by developing the extra logic to connect the TV control with our application.

We need to continue waiting. Then, I was thinking, okay, okay. But are there any open-source projects out there that we could use? And, actually, there is one. Thank you very much, Norigin Media, for providing this. However, they released it in 2019, and our application is a little bit older than that. So we didn't have anything like it back then.

Having answered that question, let's start building it. If I ask you off the top of your head, by intuition, how would you do it? I don't know about you, but for me, the simplest approach that I could come up with, and I think I read it on a blog from Norigin Media, and even from Netflix, is this: you wrap each of what I call navigational elements, the elements that the user can interact with, with an ID to identify them, and then you tell each one where to go. Take for example the sidebar of our application, the Spotify application. Each of these elements is just a link to the home view, the search, and so on. Like I explained before, you wrap them with an ID, and in that wrapper you tell them where to go. So if you are on the search and you go up, then you tell it: go to this ID, which is the home.

That approach actually gets the job done, but it obviously has a few caveats, as I imagine you can anticipate. One is that it is difficult to work with dynamic views. Think, for example, of recommendations: the developer doesn't know what they are going to get. Also, it's error-prone, because it is the developer's role to add these IDs manually, so mistakes can happen. We're humans. And finally, it just adds extra information that is not related to the application. Like I said, this is just a utility that should be invisible to the application layer. So let's improve this approach.
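As a rough illustration of the ID-based approach just described, here is a minimal sketch in TypeScript. The IDs (`home`, `search`, `library`), the `sidebarLinks` table, and the `nextFocusId` function are hypothetical names made up for this example; they are not taken from the Spotify code or from any library.

```typescript
// Hypothetical direction names and element IDs, chosen only for illustration.
type Direction = "up" | "down" | "left" | "right";

// Each navigational element lists, by hand, the ID to move to per direction.
const sidebarLinks: Record<string, Partial<Record<Direction, string>>> = {
  home: { down: "search" },
  search: { up: "home", down: "library" },
  library: { up: "search" },
};

// Returns the ID to focus next, or the current ID when no neighbour is declared.
function nextFocusId(currentId: string, direction: Direction): string {
  return sidebarLinks[currentId]?.[direction] ?? currentId;
}
```

Every entry in the table has to be written and kept in sync by hand, which is where the caveats above come from: a dynamic list of recommendations has no fixed table to write.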

For this presentation, I built a very small application that basically has just two views. There is a welcome view. You click on one of the boxes, and you go to another view that renders a surprise. There you have the go back link, and then you come back to the beginning. It works perfectly well with the mouse, but it doesn't work with the TV control. So this is exactly what we're going to do. We're going to develop the extra logic that we need to connect the TV control with our very simple application.

3. Demo Application and Spatial Navigation Logic

Short description:

In the demo application, we have the index page with the router configuration for the welcome page and the surprise page. Each view is a React component built from smaller components, such as the question box and the go back link. We register navigational nodes, listen to events from the TV control, and select the next element based on the direction. We create a NavigationEngine class to handle this logic and make it available to the app using a context provider. The API for setting the HTML element reference is straightforward, with a focus function.

Demo application. OK. Awesome. So I'm just going to go very briefly through the source code of the demo app. We have the index page that you get from the Create React App script. Inside of it, we have the component for our application, which is just the router configuration for going to the welcome page and the surprise page. For this, I'm using the React Router DOM library.

And each of these views is just another React component. For example, for the welcome page, we have an array of 10 empty elements, and we only use it to render 10 instances of the question box component. And for the surprise view, we have hard-coded the links of the images that will be displayed randomly, along with a go back link. Finally, there are these two components, the question box and the go back link. They do nothing more than use the Link component from React Router. The question box renders the question box image, and the go back link just renders its children, which is the text that says go back.

Okay. Let's jump into the logic of the spatial navigation. First, we register all the navigational nodes. Then we listen to the events coming from the TV control. From there, we select the element that we should go to, depending on the direction. And finally, we update the cursor, meaning the next element that should be focused.

If I put everything in a diagram so it's crystal clear, for steps 1 to 3, you can see that each of the question boxes is going to be registered on a class called NavigationEngine with the method registerNode. We add an event listener for onKeyDown that will call the handleNavigation method of this class.

All right, step number one. Let's create the NavigationEngine class, which has a private variable called nodes, one method for adding nodes to this private variable, and another one for removing them. Then we go back to the index script, where we instantiate this NavigationEngine class. For the purpose of this talk, we make it available globally. We also make it available inside our app using a context provider. I hope you don't believe that we write directly to the window object in production. That's only for this presentation.

Then we go back to the navigational element, because I wanted to show you first the API that I encountered. I felt it was super simple. It's just a hook function that returns a callback for setting the reference of the HTML element that you're rendering. And then there is focus, as simple as that. You don't need to think about IDs.
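Step number one could look roughly like the sketch below. `NavigationEngine` and `registerNode` are named in the talk; the `NavNode` shape, the `unregisterNode` name for the removal method, and the `nodeCount` debugging helper are assumptions made for this example. The element is typed as `unknown` so the snippet runs outside a browser; in the real app it would be the HTML element reference.

```typescript
// Assumed node shape: an ID plus the reference to the rendered element.
interface NavNode {
  id: string;
  element: unknown; // stand-in for the HTML element reference
}

class NavigationEngine {
  // Registered navigational nodes, keyed by ID.
  private nodes = new Map<string, NavNode>();

  // Adds a node; registering the same ID again replaces the earlier entry.
  registerNode(node: NavNode): void {
    this.nodes.set(node.id, node);
  }

  // Hypothetical name for the removal method mentioned in the talk.
  unregisterNode(id: string): void {
    this.nodes.delete(id);
  }

  // Debugging helper (an assumption), similar to inspecting the nodes variable.
  nodeCount(): number {
    return this.nodes.size;
  }
}
```

A single instance would then be created in the index script and handed to the components, for example through a context provider as described above.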

4. UseFocusRef Hook and TV Control Integration

Short description:

To use the useFocusRef hook function, you need to create a reference value with a callback, generate a unique ID, and obtain an instance of the navigation engine. The registerNode method is called when the component is mounted, and the node is removed when it's unmounted to avoid memory leaks. We can debug the navigational nodes to ensure they are registered correctly. We add an event listener to the document to listen to key presses and call the handleKeyEvent callback function. We use a map to define directional keys and integrate them with our internal values in the app.

That's the only thing that you need to do. How do you build this useFocusRef hook function? Well, first, you create a reference value, with the callback to set this reference. Then you generate a unique ID. And finally you obtain the instance of the navigation engine using the context provider. And with the help of the useEffect function, every time the component is mounted, we're going to call the method called registerNode. And when it's unmounted, we're going to remove the node to avoid memory leaks.
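The register-on-mount, remove-on-unmount lifecycle can be sketched without React, as a function shaped like a useEffect callback: it does the mount work and returns the cleanup. This is not the actual useFocusRef implementation. `FocusRegistry`, `generateId`, `mountFocusable`, and `unregisterNode` are hypothetical names, and inside a real hook the body of `mountFocusable` would sit in useEffect.

```typescript
// Assumed minimal view of the navigation engine used by the hook.
interface FocusRegistry {
  registerNode(id: string, element: unknown): void;
  unregisterNode(id: string): void;
}

let nextId = 0;

// Generates the ID internally, so components never have to declare one.
function generateId(): string {
  nextId += 1;
  return `focusable-${nextId}`;
}

// Registers the element and returns a cleanup function that removes it again,
// which is what prevents stale nodes (and memory leaks) after unmounting.
function mountFocusable(registry: FocusRegistry, element: unknown): () => void {
  const id = generateId();
  registry.registerNode(id, element);
  return () => registry.unregisterNode(id);
}
```

Because the ID is generated inside the helper, the component only hands over its element reference, which is what makes the API free of static IDs.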

Cool. We are now going to debug this. We want to make sure that all the navigational nodes are registered, and if we look at the nodes variable we see that we have 10. We click on one of them and then we have only one, so it's refreshing. We can even inspect it and see that the reference points to the HTML element. We go back and then again we have 10. So it's working. Let's go on to step number two.

Listen to the TV control. In our application component, we add an event listener to the document, so that every time any key is pressed we call a callback function called handleKeyEvent. To build that callback, we use a method that distinguishes whether the key you are pressing is one of the directional ones, the arrows, and just for this step we're going to console.log it so we're able to debug it. To recognize these directional keys, we already have a map where we define what a directional key is. That map is basically the integration between our internal values in the app, what we define as being up, for example, and the values coming from the platform. In this case our platform is a desktop, but the values can change depending on whether you're actually running on a smart TV or on a gaming console, for example.
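A minimal sketch of that key map for the desktop case follows. The `ArrowUp`, `ArrowDown`, `ArrowLeft`, and `ArrowRight` strings are the standard `KeyboardEvent.key` values in desktop browsers; the internal `Direction` values, `desktopKeyMap`, `toDirection`, and the second parameter of `handleKeyEvent` are assumptions for this example. The event is typed as `{ key: string }` so the snippet does not depend on the DOM.

```typescript
type Direction = "up" | "down" | "left" | "right";

// Platform values on the left, internal values on the right.
// On a TV or console only this map would need to change.
const desktopKeyMap = new Map<string, Direction>([
  ["ArrowUp", "up"],
  ["ArrowDown", "down"],
  ["ArrowLeft", "left"],
  ["ArrowRight", "right"],
]);

// Returns the internal direction, or undefined for non-directional keys.
function toDirection(key: string): Direction | undefined {
  return desktopKeyMap.get(key);
}

// Shaped like a keydown listener; forwards only directional keys.
function handleKeyEvent(
  event: { key: string },
  onDirection: (direction: Direction) => void,
): void {
  const direction = toDirection(event.key);
  if (direction !== undefined) {
    onDirection(direction);
  }
}
```

In the browser this could be attached with `document.addEventListener("keydown", (event) => handleKeyEvent(event, console.log))` for the debugging step, and later pointed at the engine's handleNavigation method.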

5. Integration Logic and Node Selection

Short description:

This part focuses on the integration logic for the native component of the application. We explore how to select the next node after a key press by using the getBoundingClientRect method to obtain the dimensions and coordinates of each node. We then filter the nodes by direction and main axis, and select the closest one based on distance. The NavigationEngine class is updated with the handleNavigation method to implement this logic. Finally, we update the cursor, going back to the initial diagram.

This is the part of the integration logic that the native application needs to know about, but we're not going to do it for this presentation.

So let's go and see if the events of the TV control are being registered. And yeah, we can see here that when I press the down key, it tells me that it's the down arrow key. I go to the left, and then we have left.

Now we can go on to the most fun part of this code, which is selecting the node after the user presses the key. In case you were wondering why we need the reference, it's because we can call the method getBoundingClientRect, which gives you the dimensions and also the coordinates, relative to the viewport, of where the element is rendered. That means that if you take all the nodes and call this method on each one, you get all the information that you need to build that logic, so at this point you can forget about the application or whatever it renders.

With this information we can decide exactly where the focus should go. So how do you choose the next node? First you filter all the nodes by the direction, then you filter by the main axis, and finally you pick the closest one by distance.

Let's go step by step. Imagine that we're talking about another matrix, a bigger one, five by five. You are in the middle and the key that is pressed goes to the right. Then you keep the last two columns. If you are pressing up, then you keep the first two rows. From those, you choose by the main axis, so if you are going to the right, you choose those nodes that are between the top edge and the bottom edge of the current node. In the same way, if you are going up, you choose those between the left edge and the right edge. Finally, once you have narrowed it down to those two, you pick the closest one by distance.

How does it look in code? We go back to the NavigationEngine class that we defined before, we add the handleNavigation method, and we do this step by step. First we filter by the direction, and we do that with the help of this dictionary that already has the predefined functions that you need to filter all those nodes. Then you do exactly the same, but filtering by the main axis, and then you pick the closest element. And we're pretty much done. Now we can go back to step number two, remove the console.log, and call the handleNavigation method.

Cool. Let's see if this works. If I press down, then we can see the element that was selected. So we're in this corner. If we press left, then we have this one. If we press down, then you know what's going to happen, right? Awesome. So it's working, but it's not yet visually selecting the element that is supposed to be selected. Let's do that. Step number four, update the cursor. So we go back to the diagram that I showed at the beginning and we're going to update it.
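The three selection steps described in this section (filter by direction, filter by main axis, pick the closest by distance) can be sketched as follows. This is one possible reading of the talk, not the code shown on stage. The `Rect` interface keeps only four of the fields that getBoundingClientRect() reports, and `inDirection`, `overlapsOnAxis`, `center`, and `selectNext` are hypothetical names. Distance is measured between rectangle centres, which is an assumption.

```typescript
type Direction = "up" | "down" | "left" | "right";

// Subset of what getBoundingClientRect() returns, in viewport coordinates.
interface Rect { left: number; top: number; right: number; bottom: number; }
interface NavNode { id: string; rect: Rect; }

// Step 1: dictionary of predicates keeping nodes that lie in the pressed direction.
const inDirection: Record<Direction, (from: Rect, to: Rect) => boolean> = {
  up: (from, to) => to.bottom <= from.top,
  down: (from, to) => to.top >= from.bottom,
  left: (from, to) => to.right <= from.left,
  right: (from, to) => to.left >= from.right,
};

// Step 2: keep nodes between the current node's edges (same row for
// left/right, same column for up/down).
function overlapsOnAxis(direction: Direction, from: Rect, to: Rect): boolean {
  return direction === "left" || direction === "right"
    ? to.top < from.bottom && to.bottom > from.top
    : to.left < from.right && to.right > from.left;
}

function center(rect: Rect): { x: number; y: number } {
  return { x: (rect.left + rect.right) / 2, y: (rect.top + rect.bottom) / 2 };
}

// Step 3: among the remaining candidates, pick the closest one.
function selectNext(current: NavNode, nodes: NavNode[], direction: Direction): NavNode | undefined {
  const from = current.rect;
  const origin = center(from);
  let best: NavNode | undefined;
  let bestDistance = Infinity;
  for (const node of nodes) {
    if (node.id === current.id) continue;
    if (!inDirection[direction](from, node.rect)) continue;
    if (!overlapsOnAxis(direction, from, node.rect)) continue;
    const point = center(node.rect);
    const dx = point.x - origin.x;
    const dy = point.y - origin.y;
    const distance = Math.sqrt(dx * dx + dy * dy);
    if (distance < bestDistance) {
      best = node;
      bestDistance = distance;
    }
  }
  return best;
}
```

When no node passes both filters, for example when pressing left in the leftmost column, the function returns undefined and the focus can simply stay where it is.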

6. Adding Subscribers and Final Remarks

Short description:

So every time a node is registered, we add a subscriber and execute callbacks to notify all subscribers. We update the useFocusRef hook function to keep track of the focused element's state. We demonstrate the functionality and mention the challenges of complex arrangements, annoying pop-ups, and circular navigation. We provide links to a repo and a library for further exploration. We encourage getting in touch and building a community around smart TV application development.

So every time a node is registered, we're going to add a subscriber. The node can say: hey, if anything happens, please let me know. Notify me. Let's do this in code. If we go back to the NavigationEngine class, we add one method for adding those subscriptions and another one, called notifyAllSubscribers, for executing all those callbacks. And we call this method right after we have found the next element.

We update the useFocusRef hook function. We replace the value that we previously defined, isFocused, with a new hook that keeps the state of whether that element is focused or not. And again, with the help of the useEffect function, every time the component is mounted, first we check initially: hey, am I focused or not? We update the state. And we also subscribe, so that whenever the navigation engine calls handleNavigation, I just check: hey, is it me who is focused? And if so, we update the state.
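The subscription mechanism can be sketched on its own as below. `notifyAllSubscribers` is the name used in the talk; the `FocusNotifier` class, `subscribe`, `setFocused`, and the listener signature are assumptions for this example. In the talk these methods live on the NavigationEngine class itself.

```typescript
// Assumed listener signature: receives the ID of the newly focused node.
type FocusListener = (focusedId: string) => void;

class FocusNotifier {
  private subscribers = new Set<FocusListener>();
  private focusedId: string | undefined;

  // Returns an unsubscribe function, convenient as an effect cleanup.
  subscribe(listener: FocusListener): () => void {
    this.subscribers.add(listener);
    return () => {
      this.subscribers.delete(listener);
    };
  }

  // Would be called by handleNavigation right after the next node is found.
  setFocused(id: string): void {
    this.focusedId = id;
    this.notifyAllSubscribers();
  }

  // Executes every registered callback with the current focused ID.
  private notifyAllSubscribers(): void {
    const id = this.focusedId;
    if (id === undefined) return;
    this.subscribers.forEach((listener) => listener(id));
  }
}
```

Inside the hook, each listener would compare the notified ID with its own and update its focused state accordingly.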

Cool, we're pretty much done. Let's just do a quick demonstration. So it's going down, it's going down, left, left, we click on OK, and again. So perfect, it's working. I was afraid of the demo.

So this is just the beginning, because trust me, from here it just gets a lot more complicated and a lot more fun. For example, your application is not going to be a perfectly shaped matrix. Instead, you're going to deal with a more complex arrangement where you have the sidebar or different columns, so you really need to deal with those corner cases. What about those annoying and inconvenient pop-ups that ask you: hey, do you want to buy premium? This is an interesting case because you need to restrict the focus to just two elements, accept or buy. Although all those navigation nodes are still behind the pop-up, you just need to focus on those two. And finally, one of my favorite ones is called circular navigation, where you want to relax the constraints in some areas. For example, in the sidebar menu, if the user is hitting down, down, down, down, and reaches the bottom, just for convenience we want to wrap around to the top. Trust me, this is a total game-changer.

But we're not going to continue because we're out of time. If you feel curious and you want to continue the party, here is the link to the repo that I used for this presentation, so we can continue creating. On the other hand, if you feel inspired and say, I want to create my first application for a smart TV, there is the link to the library that you can use so you don't have to build this from scratch. However, if you've been thinking, oh, this is way too complicated, you can do it in a much easier way, please get in touch. My personal goal for this talk is to create a small community, because building an application for a smart TV is not easy. So let's get in touch. Let's help each other. But that's it for now. Muchas gracias. Thank you very much. Here you have again the QR codes of the links. And I really hope to hear from you.

Sergio Avalos
18 min
23 Oct, 2023
