Video Summary and Transcription
This is a presentation on accessibility and screen readers. The speaker discusses the evolution of screen readers and how they adapted to graphical user interfaces. Accessibility APIs and the accessibility tree are introduced, allowing programs to construct a text database used by assistive technologies. The accessibility tree may vary across browsers and platforms, excluding elements that are not relevant to assistive technologies. The ARIA hidden state and element properties play a role in determining the accessibility of elements, and the accessible name can be derived from text content or specified using ARIA attributes.
1. Introduction to Accessibility and Screen Readers
In this talk, What's the Accessibility Tree?, the speaker, Mathilde, shares some background on the importance of accessibility and the evolution of screen readers from text-based operating systems to graphical user interfaces. She explains how screen readers adapted to the complexity of graphical user interfaces by constructing a text database called an off-screen model.
Nice to see you all, especially since I know it's 5 o'clock, I know it's late, and we had a big day today, so I'm happy to see this room is full and that you all look awake. So thanks for being here.
So let's start. I made a mistake on this first slide. I hope it's not a bad sign for the rest of the presentation. The name of this presentation is What's the Accessibility Tree? Hi, my name is Mathilde, I'm a front-end developer and an accessibility professional. I left my job at Shopify a couple of weeks ago, so I don't work there anymore. You can hear from my accent that I'm from France originally, but I live in Madrid in Spain.
So the topic of today is What's the Accessibility Tree? and we'll cover that, but before, I'd like to go a bit back in time and give some context on why the accessibility tree is an interesting topic. So on this slide, we can see a picture of an 18-key square keypad with nine digits, the letters A, B, C, D, and help and stop keys. And out of curiosity, can you raise your hand if you know what this device is? I see no hands. Actually, I'm not surprised. If I had seen any hand up, I would have been pretty amazed. This device is from 1988, and it's a screen reader keypad that was developed by IBM. This keypad was coupled with screen reader software as part of one of the first screen reading systems developed to allow people who are visually impaired or blind to access computers.
But did you ever wonder how screen readers worked back then? In a very simple way, in a text-based operating system like MS-DOS, it was fairly easy for screen readers to access the characters that were presented on the screen, and all they had to do was convert this text into speech. But as we know, computers evolved fast, and already by the end of the 80s, text-based operating systems were being replaced by graphical user interfaces. And it became much more complicated for screen readers because a graphical user interface doesn't just render characters. It renders pixels, and the information presented on the screen is much more complicated than it was. For example, text can belong to different windows, but the screen reader should only read the text in the currently selected window. You have different types of text. You have menus. You have items. You have buttons. And on top of that, you have elements that are purely visual, like icons. So how did screen readers adapt to that? Well, this is a picture of an article called Making the GUI Talk, so making the graphical user interface talk, written in 1991. And it explains a new approach they had been developing at IBM to make screen readers work with graphical user interfaces. The idea was to construct some kind of text database that models what's displayed on the screen. This database was called an off-screen model, and in the article, the author explains that in the off-screen model, assistive technologies had to make assumptions about the role of things based on how they are drawn on the screen. For example, if a text has a border or background, then it's probably selected.
2. Accessibility APIs and the Accessibility Tree
Accessibility APIs were introduced in the late 90s, allowing programs and applications to construct the text database used by assistive technologies. Accessibility APIs provide a tree representation of the interface, with objects that describe UI elements, their properties, roles, states, and events. The accessibility tree is a separate representation of the DOM that focuses on accessibility-related information. Browsers generate the accessibility tree using the render tree, which is based on the DOM minus hidden elements. DevTools can be used to view the accessibility tree.
Or if there's a blinking insertion bar around it, the user can probably enter text. I'm pretty sure you can imagine how complex these systems were and how difficult this was to maintain, and there was some ambiguity. On top of that, every time a new version of the user interface came out, screen readers had to ensure the off-screen model was still accurate. Drinking time, sorry. So what's a better solution? Well, in the late 90s, accessibility APIs were introduced, and the role of accessibility APIs was to allow programs and applications to construct the text database that assistive technologies were previously building themselves. Concretely, they allow operating systems to describe UI elements as objects with names and properties, so that assistive technologies can access those objects. And I put accessibility APIs in the plural form because there are many, and they are platform-specific. You have some for Mac, you have some for Windows, you have some for Android, and they're pretty much standards.
So you might wonder what an accessibility API looks like. In practice, it's a tree representation of the interface. For example, on the Mac, an app would be an application object that contains two children: a menu bar, the thing that's at the top, and a window that contains the application itself. Then the menu bar would also contain, as children, all the menu items, and the window would contain a left bar, a close button, search inputs, et cetera. For each UI element, developers can typically set roles: is this a window, is this a menu? They can set properties. What's the position of this window? What's its size? They can set states. Is this menu item selected or not? And other things that are useful for developers, like events. Okay, so now we know all about screen readers and accessibility APIs, but what about the accessibility tree then? Well, the accessibility tree is a separate representation of the DOM that focuses on accessibility-related information. It is built by the browser, which then passes this information to the platform accessibility API. So in summary, the accessibility tree makes the link between your HTML, the platform accessibility API, and assistive technology.
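As a rough illustration of the tree described above, here is a minimal TypeScript sketch. The AccessibleObject shape, its field names, the role strings, and the sample values are all assumptions made for this example; real platform accessibility APIs define their own vocabularies and are not modeled here.

```typescript
// Hypothetical shape of one object in a platform accessibility tree.
// Field names are illustrative only; real platform APIs differ.
interface AccessibleObject {
  role: string;
  name?: string;
  properties?: Record<string, number | string>;
  states?: Record<string, boolean>;
  children: AccessibleObject[];
}

// The Mac example from the talk: an application with a menu bar and a window.
const app: AccessibleObject = {
  role: "application",
  name: "Example App",
  children: [
    {
      role: "menu bar",
      children: [
        { role: "menu item", name: "File", states: { selected: false }, children: [] },
        { role: "menu item", name: "Edit", states: { selected: true }, children: [] },
      ],
    },
    {
      role: "window",
      name: "Main window",
      properties: { x: 0, y: 25, width: 800, height: 600 },
      children: [
        { role: "button", name: "Close", children: [] },
        { role: "search field", name: "Search", children: [] },
      ],
    },
  ],
};

// Walk the tree depth-first, roughly as an assistive technology might query it.
function dumpTree(node: AccessibleObject, depth: number = 0): string[] {
  const indent = new Array(depth + 1).join("  ");
  const label = node.name ? node.role + ' "' + node.name + '"' : node.role;
  let lines: string[] = [indent + label];
  node.children.forEach((child) => {
    lines = lines.concat(dumpTree(child, depth + 1));
  });
  return lines;
}

console.log(dumpTree(app).join("\n"));
```

The point of the sketch is only that every UI element becomes a queryable object with a role, optional name, properties, and states, instead of something a screen reader has to guess from pixels.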
So now let's look at how browsers generate this tree. If you think of a typical page that's meant to be interacted with using a screen, a mouse and a keyboard, a typical critical rendering path would look like this. The browser creates the DOM and the CSS object model, then it creates the render tree, which is basically the DOM minus the elements that are hidden by CSS. When the render tree is ready, the browser can determine the exact size and position of each element, which is the layout step, and then it paints the nodes on the screen one by one. After that, the user can interact with the page with their mouse, keyboard, et cetera. Now, looking at the experience from the point of view of a user of assistive technology, it's different. The layout and paint steps aren't relevant here because they're not something that assistive technologies will leverage. However, the browser will use the render tree, the tree without the hidden elements, to construct the accessibility tree. Then the accessibility tree will pass the information to the platform accessibility API, which can be queried by assistive technologies like screen readers. Okay, so let's look at what the tree looks like. So how do we show the accessibility tree? The best way to get a grasp of it is to look at it using DevTools. I'm going to use Chrome DevTools in this presentation.
3. The Accessibility Tree and Its Variations
The accessibility tree may vary slightly across different browsers and platforms. In Chrome, you can view the accessibility tree by opening the DevTools and switching to the accessibility tree view. The accessibility tree is similar to the DOM, but excludes elements that are not relevant to assistive technologies. Elements hidden by CSS or with the hidden attribute, as well as elements with aria-hidden="true", are excluded from the accessibility tree.
However, because the tree is built by the browser and because it's built for the platform API, you have a slightly different tree if you look at another browser or another platform. These are not huge differences though.
So on Chrome, the first thing to do is to open the DevTools. Then in the bottom panel, the one where you have the Styles tab, you have an Accessibility tab. There, make sure that the "Enable full-page accessibility tree" checkbox is checked. And now on the Elements tab of the top panel, the one in which you usually inspect the DOM, there's a button called "Switch to Accessibility Tree view". When you click this button, it shows you the full-page accessibility tree, right in the panel where the DOM usually is.
So let's see it in action. Here's a super small HTML page. We can see it has a main tag. It has a couple of headings. It has an ordered list, a footer, and a link inside. And if you look at the accessibility tree for this page, we can see that it looks pretty similar to the DOM. We see the main, the headings, the list items. There are some little differences. The footer is called contentinfo and the anchor tag is called link. But overall, the structure is pretty similar. We also notice those two objects that say ignored. Here they refer to the HTML and body tags.
And actually, they're not the only elements that are not included in the tree. Because the accessibility tree is meant for assistive technology, it excludes all elements that are not relevant to assistive technologies. For example, elements that have a layout function and not a semantic function, like the divs or the spans that you use only for styling. It also excludes the elements that are in the HTML but are not visible. Those can be elements hidden by CSS properties like display: none or visibility: hidden, or by the hidden HTML attribute. You also have images with an empty alt attribute, which are not exposed. Until now, it shouldn't be confusing and it shouldn't be a surprise. Now, it gets a bit more complex when we get into ARIA. The accessibility tree won't include elements with aria-hidden="true", nor their children. And by the way, if you're not sure what the purpose of aria-hidden is, it's exactly this.
4. Understanding ARIA Hidden and Element Properties
The aria-hidden state is used to indicate whether an element is exposed to an accessibility API. Purely decorative or duplicated content are common exclusions. The accessibility tree includes objects with various properties such as role, name, and description. These properties are useful for screen readers in properly announcing elements and providing additional information. Semantic elements in HTML usually have an implied role, but custom roles can also be set with caution. Manually setting a role attribute overrides the native HTML role and affects the properties inherited by the element.
If I quote from MDN, the aria-hidden state indicates whether the element is exposed to an accessibility API. In general, you will want to give this attribute to elements that are purely decorative, to duplicated content, or to collapsed elements, for example in a drop-down menu or something like that. And the last common exclusion is elements with role none or presentation. Those two roles are synonyms. It's like aria-hidden, except that only the node itself is not going to be exposed. The children are still going to be part of the tree. And once again, for these ARIA attributes, hiding the element from the accessibility tree is the primary purpose of the attribute.
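The exclusion rules mentioned so far can be sketched as a small pruning function. This is a simplified model under stated assumptions, not how any browser implements it: SimpleElement, AxNode, cssHidden, and buildAxNodes are hypothetical names, the input is a plain object rather than the real DOM, and the handling of layout-only divs and spans is left out.

```typescript
// Simplified stand-in for a DOM element; this is not the real DOM API.
interface SimpleElement {
  tag: string;
  attrs?: Record<string, string>;
  cssHidden?: boolean; // assumption: true when display: none or visibility: hidden applies
  text?: string;
  children?: SimpleElement[];
}

// Hypothetical accessibility tree node for this sketch.
interface AxNode {
  tag: string;
  text?: string;
  children: AxNode[];
}

// Returns the accessibility nodes that a given element contributes (zero or more).
function buildAxNodes(el: SimpleElement): AxNode[] {
  const attrs: Record<string, string> = el.attrs || {};

  // Hidden by CSS, by the hidden attribute, or by aria-hidden="true":
  // the element and its whole subtree are dropped.
  if (el.cssHidden || attrs["hidden"] !== undefined || attrs["aria-hidden"] === "true") {
    return [];
  }
  // Images with an empty alt attribute are not exposed.
  if (el.tag === "img" && attrs["alt"] === "") {
    return [];
  }

  let children: AxNode[] = [];
  (el.children || []).forEach((child) => {
    children = children.concat(buildAxNodes(child));
  });

  // role="none" / role="presentation": the node itself is not exposed,
  // but its children stay in the tree.
  if (attrs["role"] === "none" || attrs["role"] === "presentation") {
    return children;
  }
  return [{ tag: el.tag, text: el.text, children: children }];
}

const page: SimpleElement = {
  tag: "main",
  children: [
    { tag: "h1", text: "Title" },
    { tag: "p", text: "Duplicate", attrs: { "aria-hidden": "true" }, children: [{ tag: "a", text: "Link" }] },
    { tag: "ul", attrs: { role: "presentation" }, children: [{ tag: "li", text: "Item" }] },
    { tag: "img", attrs: { alt: "" } },
    { tag: "div", cssHidden: true, text: "Not rendered" },
  ],
};

const tree = buildAxNodes(page);
console.log(JSON.stringify(tree, null, 2));
```

In this sample, the aria-hidden paragraph disappears together with its link, while the presentational list disappears but its list item is kept as a direct child of main.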
So going back to the elements that are included in the tree, each element is actually an object that can have a bunch of properties. In Chrome DevTools, you can inspect those properties in the Accessibility panel. Objects in the accessibility tree can have a role, a name, a description and other properties. So let's now look at a couple of mappings between HTML and accessibility tree objects. A note on the layout of the slide: on the left, you can see the HTML code, and on the right, the corresponding object in the accessibility tree. For example, a button element will be presented in the accessibility tree as an object with the role button and a name taken from the text content, here, toggle menu. This information is useful for screen readers because they can now announce the element properly, like toggle menu, button. At the bottom of the screen is a screenshot of the output of VoiceOver, which is the Mac's default screen reader. It represents what the screen reader would say out loud. On top of the name and the role, elements with certain roles can have additional states and properties. For example, a checkbox can have a checked state, or a heading can have a level property. Not only is this information used when the element is announced, like here in the case of a heading, where the text of the heading will be preceded by heading level 1, but it can also be used by screen readers in other parts of their user experience. For example, screen readers typically allow users to navigate the page by headings. So by using the proper markup, our heading is going to be included in this menu as well.
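To make the role, name, and state idea concrete, here is a minimal sketch of how an announcement and a headings menu could be derived from accessibility objects. The AxObject shape and the exact announcement wording are assumptions for illustration; actual screen readers such as VoiceOver choose their own phrasing and ordering.

```typescript
// Hypothetical accessibility object with the properties mentioned in the talk.
interface AxObject {
  role: string;
  name: string;
  level?: number; // headings only
  checked?: boolean; // checkboxes only
}

// Builds an announcement string; the wording is illustrative, not VoiceOver's.
function announce(obj: AxObject): string {
  if (obj.role === "heading" && obj.level !== undefined) {
    return "heading level " + obj.level + ", " + obj.name;
  }
  if (obj.role === "checkbox") {
    return obj.name + ", " + (obj.checked ? "checked" : "unchecked") + ", checkbox";
  }
  return obj.name + ", " + obj.role;
}

// Collects heading names, similar in spirit to a "navigate by headings" menu.
function headingsMenu(objects: AxObject[]): string[] {
  return objects.filter((o) => o.role === "heading").map((o) => o.name);
}

const pageObjects: AxObject[] = [
  { role: "heading", name: "Recipes", level: 1 },
  { role: "button", name: "Toggle menu" },
  { role: "checkbox", name: "Vegetarian only", checked: true },
  { role: "heading", name: "Desserts", level: 2 },
];

pageObjects.forEach((o) => console.log(announce(o)));
console.log(headingsMenu(pageObjects).join(", "));
```

The same objects feed both functions, which mirrors the point above: properly exposed roles and levels serve the announcement and the navigation features at once.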
Going further into the role property, semantic elements usually have an implied role. There are a couple of exceptions, but more or less it's a one-to-one mapping between HTML tags and roles. For example, the nav tag has a navigation role and an ordered list has a list role. Another way for screen readers to leverage this information is, for example, if you have a list, the screen reader will say how many elements are in this list and what the position of the current element is. So now what happens if we manually set a role attribute on an HTML element? For example, here, we change our UL to have the role tablist and the list items to have the role tab. The value of the role attribute will take precedence over the native HTML role of the element. It also means that, for example, the list items will inherit all the properties that would normally be given to a tab element and not to a list item. There are good reasons to set custom roles on HTML elements like this. Building a widget like tabs is a good example, but in general, it's recommended to be a bit careful when doing that because it can have unexpected consequences.
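The precedence of an explicit role over the implied one can be sketched as a lookup with an override. The IMPLICIT_ROLES table below is a deliberately tiny, simplified subset and computeRole is a hypothetical helper; real mappings have conditions this sketch ignores (for example, an anchor only maps to link when it has an href, and a footer only maps to contentinfo in certain contexts).

```typescript
// Simplified subset of tag-to-role mappings; real rules have more conditions.
const IMPLICIT_ROLES: Record<string, string> = {
  nav: "navigation",
  main: "main",
  ol: "list",
  ul: "list",
  li: "listitem",
  button: "button",
  a: "link", // assumption: the anchor has an href
  footer: "contentinfo", // assumption: the footer is scoped to the page body
  h1: "heading",
};

// Hypothetical helper: an explicit role attribute wins over the implied role.
function computeRole(tag: string, roleAttr?: string): string | undefined {
  if (roleAttr) {
    return roleAttr;
  }
  return Object.prototype.hasOwnProperty.call(IMPLICIT_ROLES, tag)
    ? IMPLICIT_ROLES[tag]
    : undefined;
}

console.log(computeRole("ul")); // implied role
console.log(computeRole("ul", "tablist")); // explicit role overrides
console.log(computeRole("li", "tab")); // explicit role overrides
console.log(computeRole("div")); // no implied role in this table
```

Once the override applies, the element is treated as a tablist or tab everywhere downstream, which is why list-specific behavior such as item counts no longer applies to it.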
5. Understanding Accessible Names and ARIA Attributes
The accessible name identifies user interface elements. It can be derived from the text content, but some elements have optional or custom accessible names using ARIA attributes. The accessible name is determined by user agents based on a ranked list of sources, with ARIA attributes taking precedence over text content. However, using ARIA attributes for names can have translation issues.
Now let's look at the name property, also called the accessible name. First, the name is meant to identify user interface elements. Lots of elements simply take the accessible name from their text content. That's the case for headings, for example, but also elements like buttons or links. Not all elements have a name, though. Typically elements that are decorative don't, as well as elements that just render text, like paragraphs.
For some elements, the accessible name is optional. For example, navigation. By default, a navigation is not named and is just announced as navigation, but you can give it a name using aria-label or aria-labelledby, and this name will be used by screen readers. For example, they would announce a nav with aria-label="Main menu" as main menu, navigation. This sometimes has use cases: let's say you have a lot of menus on your page, you might want to label them.
And you might wonder when the accessible name comes from the text content and when it comes from ARIA attributes. Well, the accessible name is determined by user agents from a list of possible sources that are ranked by order of preference. This is called accessible name computation. As a general rule, the accessible name will be determined in the following order. First the element referenced by aria-labelledby, then the aria-label, then the text content, which, by the way, includes the content of the ::before and ::after pseudo-elements, and then the title attribute. This is a bit of a simplification. There's a bit more depth to this. For example, an input element will use its label as the accessible name, and images will use the content of the alt attribute. But the most important thing here is that ARIA attributes always override the element's accessible name. There are lots of use cases where this is perfectly fine. But it can also bring issues. A common pitfall, for example, is that the content of HTML attributes isn't always translated. So, if you put the label of your button into an aria-label instead of into the content of the element, and a person uses an automatic translator, they may not get the translation for this specific element.
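The simplified ranking above can be sketched as a fallback chain. This is not the full accessible name computation from the specification: NamedElement, accessibleName, and the byId lookup are hypothetical, referenced elements contribute only their plain text, and the special cases for inputs, images, and pseudo-element content are left out.

```typescript
// Simplified stand-in for an element; this is not the real DOM API.
interface NamedElement {
  id?: string;
  attrs?: Record<string, string>;
  text?: string;
}

// Hypothetical helper following the simplified order from the talk:
// aria-labelledby, then aria-label, then text content, then title.
function accessibleName(el: NamedElement, byId: Record<string, NamedElement>): string {
  const attrs: Record<string, string> = el.attrs || {};

  // 1. Text of the elements referenced by aria-labelledby, in order.
  const labelledby = attrs["aria-labelledby"];
  if (labelledby) {
    const parts: string[] = [];
    labelledby.split(/\s+/).forEach((id) => {
      const ref = byId[id];
      const text = ref && ref.text ? ref.text.trim() : "";
      if (text !== "") {
        parts.push(text);
      }
    });
    if (parts.length > 0) {
      return parts.join(" ");
    }
  }

  // 2. aria-label.
  const label = (attrs["aria-label"] || "").trim();
  if (label !== "") {
    return label;
  }

  // 3. Text content.
  const text = (el.text || "").trim();
  if (text !== "") {
    return text;
  }

  // 4. title attribute as the last resort.
  return (attrs["title"] || "").trim();
}

const elementsById: Record<string, NamedElement> = {
  "menu-title": { id: "menu-title", text: "Main menu" },
};

console.log(accessibleName({ text: "Save" }, elementsById));
console.log(accessibleName({ text: "X", attrs: { "aria-label": "Close" } }, elementsById));
console.log(
  accessibleName({ attrs: { "aria-labelledby": "menu-title", "aria-label": "Other" } }, elementsById)
);
```

The second call shows the pitfall described above: the visible text "X" is replaced by the aria-label value, so whatever is in the attribute is what assistive technology users get.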