diff --git a/docs/accessibility.md b/docs/accessibility.md index c0c57797e44864..b12e90894f7294 100644 --- a/docs/accessibility.md +++ b/docs/accessibility.md @@ -1,515 +1,10 @@ -# Accessibility Overview +# Accessibility -Accessibility means ensuring that all users, including users with disabilities, -have equal access to software. One piece of this involves basic design -principles such as using appropriate font sizes and color contrast, -avoiding using color to convey important information, and providing keyboard -alternatives for anything that is normally accomplished with a pointing device. -However, when you see the word "accessibility" in a directory name in Chromium, -that code's purpose is to provide full access to Chromium's UI via external -accessibility APIs that are utilized by assistive technology. +* [Accessibility Overview](accessibility/overview.md) -**Assistive technology** here refers to software or hardware which -makes use of these APIs to create an alternative interface for the user to -accommodate some specific needs, for example: +## Chrome OS -Assistive technology includes: - -* Screen readers for blind users that describe the screen using - synthesized speech or braille -* Voice control applications that let you speak to the computer, -* Switch access that lets you control the computer with a small number - of physical switches, -* Magnifiers that magnify a portion of the screen, and often highlight the - cursor and caret for easier viewing, and -* Assistive learning and literacy software that helps users who have a hard - time reading print, by highlighting and/or speaking selected text - -In addition, because accessibility APIs provide a convenient and universal -way to explore and control applications, they're often used for automated -testing scripts, and UI automation software like password managers. 
- -Web browsers play an important role in this ecosystem because they need -to not only provide access to their own UI, but also provide access to -all of the content of the web. - -Each operating system has its own native accessibility API. While the -core APIs tend to be well-documented, it's unfortunately common for -screen readers in particular to depend on additional undocumented or -vendor-specific APIs in order to fully function, especially with web -browsers, because the standard APIs are insufficient to handle the -complexity of the web. - -Chromium needs to support all of these operating system and -vendor-specific accessibility APIs in order to be usable with the full -ecosystem of assistive technology on all platforms. Just like Chromium -sometimes mimics the quirks and bugs of older browsers, Chromium often -needs to mimic the quirks and bugs of other browsers' implementation -of accessibility APIs, too. - -## Concepts - -While each operating system and vendor accessibility API is different, -there are some concepts all of them share. - -1. The *tree*, which models the entire interface as a tree of objects, exposed - to assistive technology via accessibility APIs; -2. *Events*, which let assistive technology know that a part of the tree has - changed somehow; -3. *Actions*, which come from assistive technology and ask the interface to - change. - -Consider the following small HTML file: - -```
-<html>
-<head>
-  <title>How old are you?</title>
-</head>
-<body>
-  <label for="age">Age</label>
-  <input id="age" type="number" name="age" value="42">
-  <div>
-    <button>Back</button>
-    <button>Next</button>
-  </div>
-</body>
-</html>
-``` - -### The Accessibility Tree and Accessibility Attributes - -Internally, Chromium represents the accessibility tree for that web page -using a data structure something like this: - -``` -id=1 role=WebArea name="How old are you?" - id=2 role=Label name="Age" - id=3 role=TextField labelledByIds=[2] value="42" - id=4 role=Group - id=5 role=Button name="Back" - id=6 role=Button name="Next" -``` - -Note that the tree structure closely resembles the structure of the -HTML elements, but slightly simplified. Each node in the accessibility -tree has an ID and a role. Many have a name. The text field has a value, -and instead of a name it has labelledByIds, which indicates that its -accessible name comes from another node in the tree, the label node -with id=2. - -On a particular platform, each node in the accessibility tree is implemented -by an object that conforms to a particular protocol. - -On Windows, the root node implements the IAccessible protocol and -if you call IAccessible::get_accRole, it returns ROLE_SYSTEM_DOCUMENT, -and if you call IAccessible::get_accName, it returns "How old are you?". -Other methods let you walk the tree. - -On macOS, the root node implements the NSAccessibility protocol and -if you call [NSAccessibility accessibilityRole], it returns @"AXWebArea", -and if you call [NSAccessibility accessibilityLabel], it returns -"How old are you?". - -The Linux accessibility API, ATK, is more similar to the Windows APIs; -they were developed together. (Chrome's support for desktop Linux -accessibility is unfinished.) - -The Android accessibility API is of course based on Java. The main -data structure is AccessibilityNodeInfo. It doesn't have a role, but -if you call AccessibilityNodeInfo.getClassName() on the root node -it returns "android.webkit.WebView", and if you call -AccessibilityNodeInfo.getContentDescription() it returns "How old are you?". 
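As a rough illustration of the Windows, macOS, and Android paragraphs above, the following C++ sketch records what a client would get back when it asks the root node of the example page for its role, and which call returns the node's primary text. `Platform`, `RootRoleOn`, and `NameAccessorOn` are hypothetical names invented for this example, not Chromium classes or functions. The strings are the ones quoted in the text, returned here as plain text rather than as the platforms' real constants.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified types for illustration only; not Chromium classes.
enum class Platform { kWindows, kMac, kAndroid };

// What a client sees when it asks the root web-area node for its role.
// Values are the symbolic names quoted in the text, returned as strings.
std::string RootRoleOn(Platform platform) {
  switch (platform) {
    case Platform::kWindows:
      return "ROLE_SYSTEM_DOCUMENT";    // via IAccessible::get_accRole
    case Platform::kMac:
      return "AXWebArea";               // via accessibilityRole
    case Platform::kAndroid:
      return "android.webkit.WebView";  // via getClassName(); no role concept
  }
  return "";
}

// The platform call that returns the node's primary accessible text
// ("How old are you?" for the example page's root node).
std::string NameAccessorOn(Platform platform) {
  switch (platform) {
    case Platform::kWindows:
      return "IAccessible::get_accName";
    case Platform::kMac:
      return "accessibilityLabel";
    case Platform::kAndroid:
      return "AccessibilityNodeInfo.getContentDescription";
  }
  return "";
}
```

The point of the sketch is only that one internal node maps to a different role value and a different accessor on each platform.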
- -On Chrome OS, we use our own accessibility API that closely maps to -Chrome's internal accessibility API. - -So while the details of the interface vary, the underlying concepts are -similar. Both IAccessible and NSAccessibility have a concept of a role, -but IAccessible uses a role of "document" for a web page, while NSAccessibility -uses a role of "web area". Both IAccessible and NSAccessibility have a -concept of the primary accessible text for a node, but IAccessible calls -it the "name" while NSAccessibility calls it the "label", and Android -calls it a "content description". - -**Historical note:** The internal names of roles and attributes in -Chrome often tend to most closely match the macOS accessibility API -because Chromium was originally based on WebKit, where most of the -accessibility code was written by Apple. Over time we're slowly -migrating internal names to match what those roles and attributes are -called in web accessibility standards, like ARIA. - -### Accessibility Events - -In Chromium's internal terminology, an Accessibility Event always represents -communication from the app to the assistive technology, indicating that the -accessibility tree changed in some way. - -As an example, if the user were to press the Tab key and the text -field from the example above became focused, Chromium would fire a -"focus" accessibility event that assistive technology could listen -to. A screen reader might then announce the name and current value of -the text field. A magnifier might zoom the screen to its bounding -box. If the user types some text into the text field, Chromium would -fire a "value changed" accessibility event. - -As with nodes in the accessibility tree, each platform has a slightly different -API for accessibility events. On Windows we'd fire EVENT_OBJECT_FOCUS for -a focus change, and on Mac we'd fire @"AXFocusedUIElementChanged". -Those are pretty similar. 
Sometimes they're quite different - to support -live regions (notifications that certain key parts of a web page have changed), -on Mac we simply fire @"AXLiveRegionChanged", but on Windows we need to -fire IA2_EVENT_TEXT_INSERTED and IA2_EVENT_TEXT_REMOVED events individually -on each affected node within the changed region, with additional attributes -like "container-live:polite" to indicate that the affected node was part of -a live region. This discussion is not meant to explain all of the technical -details but just to illustrate that the concepts are similar, -but the details of notifying software on each platform about changes can -vary quite a bit. - -### Accessibility Actions - -Each native object that implements a platform's native accessibility API -supports a number of actions, which are requests from the assistive -technology to control or change the UI. This is the opposite of events, -which are messages from Chromium to the assistive technology. - -For example, if the user had a voice control application running, such as -Voice Access on Android, the user could just speak the name of one of the -buttons on the page, like "Next". Upon recognizing that text and finding -that it matches one of the UI elements on the page, the voice control -app executes the action to click the button id=6 in Chromium's accessibility -tree. Internally we call that action "do default" rather than click, since -it represents the default action for any type of control. - -Other examples of actions include setting focus, changing the value of -a control, and scrolling the page. - -### Parameterized attributes - -In addition to accessibility attributes, events, and actions, native -accessibility APIs often have so-called "parameterized attributes". -The most common example of this is for text - for example there may be -a function to retrieve the bounding box for a range of text, or a -function to retrieve the text properties (font family, font size, -weight, etc.) 
at a specific character position. - -Parameterized attributes are particularly tricky to implement because -of Chromium's multi-process architecture. More on this in the next section. - -## Chromium's multi-process architecture - -Native accessibility APIs tend to have a *functional* interface, where -Chromium implements an interface for a canonical accessible object that -includes methods to return various attributes, walk the tree, or perform -an action like click(), focus(), or setValue(...). - -In contrast, the web has a largely *declarative* interface. The shape -of the accessibility tree is determined by the DOM tree (occasionally -influenced by CSS), and the accessible semantics of a DOM element can -be modified by adding ARIA attributes. - -One important complication is that all of these native accessibility APIs -are *synchronous*, while Chromium is multi-process, with the contents of -each web page living in a different process than the process that -implements Chromium's UI and the native accessibility APIs. Furthermore, -the renderer processes are *sandboxed*, so they can't implement -operating system APIs directly. - -If you're unfamiliar with Chrome's multi-process architecture, see -[this blog post introducing the concept]( -https://blog.chromium.org/2008/09/multi-process-architecture.html) or -[the design doc on chromium.org]( -https://www.chromium.org/developers/design-documents/multi-process-architecture) -for an intro. - -Chromium's multi-process architecture means that we can't implement -accessibility APIs the same way that a single-process browser can - -namely, by calling directly into the DOM to compute the result of each -API call. For example, on some operating systems there might be an API -to get the bounding box for a particular range of characters on the -page. In other browsers, this might be implemented by creating a DOM -selection object and asking for its bounding box. 
- -That implementation would be impossible in Chromium because it'd require -blocking the main thread while waiting for a response from the renderer -process that implements that web page's DOM. (Not only is blocking the -main thread strictly disallowed, but the latency of doing this for every -API call makes it prohibitively slow anyway.) Instead, Chromium takes an -approach where a representation of the entire accessibility tree is -cached in the main process. Great care needs to be taken to ensure that -this representation is as concise as possible. - -In Chromium, we build a data structure representing all of the -information for a web page's accessibility tree, send the data -structure from the renderer process to the main browser process, cache -it in the main browser process, and implement native accessibility -APIs using solely the information in that cache. - -As the accessibility tree changes, tree updates and accessibility events -get sent from the renderer process to the browser process. The browser -cache is updated atomically in the main thread, so whenever an external -client (like assistive technology) calls an accessibility API function, -we're always returning something from a complete and consistent snapshot -of the accessibility tree. From time to time, the cache may lag what's -in the renderer process by a fraction of a second. - -Here are some of the specific challenges faced by this approach and -how we've addressed them. - -### Sparse data - -There are a *lot* of possible accessibility attributes for any given -node in an accessibility tree. For example, there are more than 150 -unique accessibility API methods that Chrome implements on the Windows -platform alone. We need to implement all of those APIs, many of which -request rather rare or obscure attributes, but storing all possible -attribute values in a single struct would be quite wasteful. 
- -To avoid each accessible node object containing hundreds of fields, the -data for each accessibility node is stored in a relatively compact -data structure, ui::AXNodeData. Every AXNodeData has an integer ID, a -role enum, and a couple of other mandatory fields, but everything else -is stored in attribute arrays, one for each major data type. - -``` -struct AXNodeData { - int32_t id; - AXRole role; - ... - std::vector<std::pair<AXStringAttribute, std::string>> string_attributes; - std::vector<std::pair<AXIntAttribute, int32_t>> int_attributes; - ... -}; -``` - -So if a text field has a placeholder attribute, we can store -that by adding an entry to `string_attributes` with an attribute -of ui::AX_ATTR_PLACEHOLDER and the placeholder string as the value. - -### Incremental tree updates - -Web pages change frequently. It'd be terribly inefficient to send a -new copy of the accessibility tree every time any part of it changes. -However, the accessibility tree can change shape in complicated ways - -for example, whole subtrees can be reparented dynamically. - -Rather than writing code to deal with every possible way the -accessibility tree could be modified, Chromium has a general-purpose -tree serializer class that's designed to send small incremental -updates of a tree from one process to another. The tree serializer has -just a few requirements: - -* Every node in the tree must have a unique integer ID. -* The tree must be acyclic. -* The tree serializer must be notified when a node's data changes. -* The tree serializer must be notified when the list of child IDs of a - node changes. - -The tree serializer doesn't know anything about accessibility attributes. -It keeps track of the previous state of the tree, and every time the tree -structure changes (based on notifications of a node changing or a node's -children changing), it walks the tree and builds up an incremental tree -update that serializes as few nodes as possible. - -In the other process, the unserialization code applies the incremental -tree update atomically. 
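The idea of sending only what changed and applying it atomically can be sketched in C++ as follows. This is a simplified illustration under stated assumptions, not Chromium's tree serializer: `NodeData`, `Tree`, `Update`, `BuildUpdate`, and `ApplyUpdate` are hypothetical names, and the sketch finds changes by diffing two snapshots, whereas the text describes a serializer driven by change notifications.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified node; `data` stands in for all attributes.
struct NodeData {
  int id;
  std::string data;
  std::vector<int> child_ids;
};

bool operator==(const NodeData& a, const NodeData& b) {
  return a.id == b.id && a.data == b.data && a.child_ids == b.child_ids;
}

using Tree = std::map<int, NodeData>;  // unique integer node ID -> node

// An incremental update: only new or changed nodes, plus removed IDs.
struct Update {
  std::vector<NodeData> changed;
  std::vector<int> removed;
};

// Sender side: compare the current tree with what the receiver last saw.
// (Assumption: snapshot diffing replaces the notification-driven walk.)
Update BuildUpdate(const Tree& last_sent, const Tree& current) {
  Update update;
  for (const auto& entry : current) {
    Tree::const_iterator it = last_sent.find(entry.first);
    if (it == last_sent.end() || !(it->second == entry.second))
      update.changed.push_back(entry.second);
  }
  for (const auto& entry : last_sent) {
    if (current.count(entry.first) == 0)
      update.removed.push_back(entry.first);
  }
  return update;
}

// Receiver side: apply the update to a copy, then swap it in, so a reader
// of `cache` never observes a half-applied update.
void ApplyUpdate(Tree& cache, const Update& update) {
  Tree next = cache;
  for (int id : update.removed) next.erase(id);
  for (const NodeData& node : update.changed) next[node.id] = node;
  cache.swap(next);
}
```

Copying the whole cache before swapping keeps the sketch short. A real implementation would presumably avoid that copy, but the property illustrated is the same: unchanged nodes are never re-sent, and the receiver moves from one consistent tree to the next.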
- -### Text bounding boxes - -One challenge faced by Chromium is that accessibility clients want to be -able to query the bounding box of an arbitrary range of text - not necessarily -just the current cursor position or selection. As discussed above, it's -not possible to block Chromium's main browser process while waiting for this -information from Blink, so instead we cache enough information to satisfy these -queries in the accessibility tree. - -To compactly store the bounding box of every character on the page, we -split the text into *inline text boxes*, sometimes called *text runs*. -For example, in a typical paragraph, each line of text would be its own -inline text box. In general, an inline text box or text run contains a -sequence of text characters that are all oriented in the same direction, -in a line, with the same font, size, and style. - -Each inline text box stores its own bounding box, and then the relative -x-coordinate of each character in its text (assuming left-to-right). -From that it's possible to compute the bounding box -of any individual character. - -The inline text boxes are part of Chromium's internal accessibility tree. -They're used purely internally and aren't ever exposed directly via any -native accessibility APIs. - -For example, suppose that a document contains a text field with the text -"Hello world", but the field is narrow, so "Hello" is on the first line and -"world" is on the second line. Internally Chromium's accessibility tree -might look like this: - -``` -staticText location=(8, 8) size=(38, 36) name='Hello world' - inlineTextBox location=(0, 0) size=(36, 18) name='Hello ' characterOffsets=12,19,23,28,36 - inlineTextBox location=(0, 18) size=(38, 18) name='world' characterOffsets=12,20,25,29,37 -``` - -### Scrolling, transformations, and animation - -Native accessibility APIs typically want the bounding box of every element in the -tree, either in window coordinates or global screen coordinates. 
If we -stored the global screen coordinates for every node, we'd be constantly -re-serializing the whole tree every time the user scrolls or drags the -window. - -Instead, we store the bounding box of each node in the accessibility tree -relative to its *offset container*, which can be any ancestor. If no offset -container is specified, it's assumed to be the root of the tree. - -In addition, any offset container can contain scroll offsets, which can be -used to scroll the bounding boxes of anything in that subtree. - -Finally, any offset container can also include an arbitrary 4x4 transformation -matrix, which can be used to represent arbitrary 3-D rotations, translations, -scaling, and more. The transformation matrix applies to the whole subtree. - -Storing coordinates this way means that any time an object scrolls, moves, or -animates its position and scale, only the root of the scrolling or animation -needs to post updates to the accessibility tree. Everything in the subtree -remains valid relative to that offset container. - -Computing the global screen coordinates for an object in the accessibility -tree just means walking up its ancestor chain and applying offsets and -occasionally multiplying by a 4x4 matrix. - -### Site isolation / out-of-process iframes - -At one point in time, all of the content of a single Tab or other web view -was contained in the same Blink process, and it was possible to serialize -the accessibility tree for a whole frame tree in a single pass. - -Today the situation is a bit more complicated, as Chromium supports -out-of-process iframes. (It also supports "browser plugins" such as -the `<webview>` tag in Chrome packaged apps, which embeds a whole -browser inside a browser, but for the purposes of accessibility this -is handled the same as frames.) - -Rather than a mix of in-process and out-of-process frames that are handled -differently, Chromium builds a separate independent accessibility tree -for each frame. 
Each frame gets its own tree ID, and it keeps track of -the tree ID of its parent frame (if any) and any child frames. - -In Chrome's main browser process, the accessibility trees for each frame -are cached separately, and when an accessibility client (assistive -technology) walks the accessibility tree, Chromium dynamically composes -all of the frames into a single virtual accessibility tree on the fly, -using those aforementioned tree IDs. - -The node IDs for accessibility trees only need to be unique within a -single frame. Where necessary, separate unique IDs are used within -Chrome's main browser process. In Chromium accessibility, a "node ID" -always means that ID that's only unique within a frame, and a "unique ID" -means an ID that's globally unique. - -## Blink - -Blink constructs an accessibility tree (a hierarchy of [WebAXObject]s) from the -page it is rendering. WebAXObject is the public API wrapper around [AXObject], -which is the core class of Blink's accessibility tree. AXObject is an abstract -class; the most commonly used concrete subclass of it is [AXNodeObject], which -wraps a [Node]. In turn, most AXNodeObjects are actually [AXLayoutObject]s, -which wrap both a [Node] and a [LayoutObject]. Access to the LayoutObject is -important because some elements are only in the AXObject tree depending on their -visibility, geometry, linewrapping, and so on. There are some subclasses of -AXLayoutObject that implement special-case logic for specific types of Node. -There are also other subclasses of AXObject, which are mostly used for testing. - -Note that not all AXLayoutObjects correspond to actual Nodes; some are synthetic -layout objects which group related inline elements or similar. - -The central class responsible for dealing with accessibility events in Blink is -[AXObjectCacheImpl], which is responsible for caching the corresponding -AXObjects for Nodes or LayoutObjects. 
This class has many methods named -`handleFoo`, which are called throughout Blink to notify the AXObjectCacheImpl -that it may need to update its tree. Since this class is already aware of all -accessibility events in Blink, it is also responsible for relaying accessibility -events from Blink to the embedding content layer. - -## The content layer - -The content layer lives on both sides of the renderer/browser split. The content -layer translates WebAXObjects into [AXContentNodeData], which is a subclass of -[ui::AXNodeData]. The ui::AXNodeData class and related classes are Chromium's -cross-platform accessibility tree. The translation is implemented in -[BlinkAXTreeSource]. This translation happens on the renderer side, so the -ui::AXNodeData tree now needs to be sent to the browser, which is done by -sending [AccessibilityHostMsg_EventParams] with the payload being serialized -delta-updates to the tree, so that changes that happen on the renderer side can -be reflected on the browser side. - -On the browser side, these IPCs are received by [RenderFrameHostImpl], and then -usually forwarded to [BrowserAccessibilityManager] which is responsible for: - -1. Merging AXNodeData trees into one tree of [BrowserAccessibility] objects, - by linking to other BrowserAccessibilityManagers. This is important because - each page has its own accessibility tree, but each Chromium *window* must - have only one accessibility tree, so trees from multiple pages need to be - combined (possibly also with trees from Views UI). -2. Dispatching outgoing accessibility events to the platform's accessibility - APIs. This is done in the platform-specific subclasses of - BrowserAccessibilityManager, in a method named `NotifyAccessibilityEvent`. -3. Dispatching incoming accessibility actions to the appropriate recipient, via - [BrowserAccessibilityDelegate]. 
For messages destined for a renderer, - [RenderFrameHostImpl], which is a BrowserAccessibilityDelegate, is - responsible for sending appropriate `AccessibilityMsg_Foo` IPCs to the - renderer, where they will be received by [RenderAccessibilityImpl]. - -On Chrome OS, RenderFrameHostImpl does not route events to -BrowserAccessibilityManager at all, since there is no platform screenreader -outside Chromium to integrate with. - -## Views - -Views generates a [NativeViewAccessibility] for each View, which is used as the -delegate for an [AXPlatformNode] representing that View. This part is relatively -straightforward, but then the generated tree must be combined with the web -accessibility tree, which is handled by BrowserAccessibilityManager. - -## WebUI - -Since WebUI surfaces have renderer processes as normal, WebUI accessibility goes -through the blink-to-content-to-platform pipeline described above. Accessibility -for WebUI is largely implemented in JavaScript in [webui-js]; these classes take -care of adding ARIA attributes and so on to DOM nodes as needed. - -## The Chrome OS layer - -The accessibility tree is also exposed via the [chrome.automation API], which -gives extension JavaScript access to the accessibility tree, events, and -actions. This API is implemented in C++ by [AutomationInternalCustomBindings], -which is renderer-side code, and in JavaScript by the [automation API]. The API -is defined by [automation.idl], which must be kept synchronized with -[ax_enums.idl]. 
- -[AccessibilityHostMsg_EventParams]: https://cs.chromium.org/chromium/src/content/common/accessibility_messages.h?sq=package:chromium&l=75 -[AutomationInternalCustomBindings]: https://cs.chromium.org/chromium/src/chrome/renderer/extensions/automation_internal_custom_bindings.h -[AXContentNodeData]: https://cs.chromium.org/chromium/src/content/common/ax_content_node_data.h -[AXLayoutObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/accessibility/AXLayoutObject.h -[AXNodeObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/accessibility/AXNodeObject.h -[AXObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/accessibility/AXObject.h -[AXObjectCacheImpl]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/accessibility/AXObjectCacheImpl.h -[AXPlatformNode]: https://cs.chromium.org/chromium/src/ui/accessibility/platform/ax_platform_node.h -[AXTreeSerializer]: https://cs.chromium.org/chromium/src/ui/accessibility/ax_tree_serializer.h -[BlinkAXTreeSource]: https://cs.chromium.org/chromium/src/content/renderer/accessibility/blink_ax_tree_source.h -[BrowserAccessibility]: https://cs.chromium.org/chromium/src/content/browser/accessibility/browser_accessibility.h -[BrowserAccessibilityDelegate]: https://cs.chromium.org/chromium/src/content/browser/accessibility/browser_accessibility_manager.h?sq=package:chromium&l=64 -[BrowserAccessibilityManager]: https://cs.chromium.org/chromium/src/content/browser/accessibility/browser_accessibility_manager.h -[LayoutObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/layout/LayoutObject.h -[NativeViewAccessibility]: https://cs.chromium.org/chromium/src/ui/views/accessibility/native_view_accessibility.h -[Node]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/dom/Node.h -[RenderAccessibilityImpl]: https://cs.chromium.org/chromium/src/content/renderer/accessibility/render_accessibility_impl.h 
-[RenderFrameHostImpl]: https://cs.chromium.org/chromium/src/content/browser/frame_host/render_frame_host_impl.h -[ui::AXNodeData]: https://cs.chromium.org/chromium/src/ui/accessibility/ax_node_data.h -[WebAXObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/public/web/WebAXObject.h -[automation API]: https://cs.chromium.org/chromium/src/chrome/renderer/resources/extensions/automation -[automation.idl]: https://cs.chromium.org/chromium/src/chrome/common/extensions/api/automation.idl -[ax_enums.idl]: https://cs.chromium.org/chromium/src/ui/accessibility/ax_enums.idl -[chrome.automation API]: https://developer.chrome.com/extensions/automation -[webui-js]: https://cs.chromium.org/chromium/src/ui/webui/resources/js/cr/ui/ +* [ChromeVox for Developers](accessibility/chromevox.md) +* [ChromeVox on Desktop Linux](accessibility/chromevox_on_desktop_linux.md) +* [Updating brltty braille drivers](accessibility/brltty.md) +* [Updating the patts speech synthesis engine](accessibility/patts.md) diff --git a/docs/accessibility/brltty.md b/docs/accessibility/brltty.md new file mode 100644 index 00000000000000..94fd260ff3ee39 --- /dev/null +++ b/docs/accessibility/brltty.md @@ -0,0 +1,65 @@ +# BRLTTY in Chrome OS + +Chrome OS uses the open-source [BRLTTY](http://mielke.cc/brltty/) +library to provide support for refreshable braille displays. + +We typically ship with a stable release build of BRLTTY plus some +cherry-picked patches. + +## Updating BRLTTY or adding a patch + +First, follow the public +[Chromium OS Developer Guide](http://www.chromium.org/chromium-os/developer-guide) to check out the source. +At a minimum you'll need to create a chroot. +You do not need to build everything from source. +You do need to start the devserver. + +Next, flash your device to a very recent test build. 
Internally at Google +you can do this with the following command when the dev server is running, +where CHROMEBOOK_IP_ADDRESS is the IP address of your Chromebook already +in developer mode, and $BOARD is your Chromebook's board name. + +```cros flash ssh://CHROMEBOOK_IP_ADDRESS xbuddy://remote/$BOARD/latest-dev/test``` + +The BRLTTY files can be found in this directory: + +```third_party/chromiumos-overlay/app-accessibility/brltty``` + +The first thing you'll need to do is edit the ebuild symlink to change the +revision number. The real file is something like brltty-5.4.ebuild, +but the revision will be something like brltty-5.4-r5.ebuild. You'll need +to increment it. + +To increment it from r5 to r6, you'd do something like this: + +``` +rm brltty-5.4-r5.ebuild +ln -s brltty-5.4.ebuild brltty-5.4-r6.ebuild +git add brltty-5.4-r6.ebuild +``` + +The changes we make are all patches against a stable release of brltty. +To add a new patch, put it in the files/ directory and reference it in +brltty.bashrc. + +Once you're done adding patches or making other changes, flash it to your +device like this: + +``` +emerge-$BOARD brltty +cros deploy CHROMEBOOK_IP_ADDRESS brltty +``` + +After that, reboot your Chromebook and verify that brltty works. + +To upload a change, use repo, something like this: + +``` +repo start <branch_name> . +git commit -a + BUG=chromium:12345 + TEST=Write what you tested here +repo upload . +``` + +Note that you shouldn't need to run cros_workon. diff --git a/docs/accessibility/chromevox.md b/docs/accessibility/chromevox.md new file mode 100644 index 00000000000000..ebc6f6f5ac0719 --- /dev/null +++ b/docs/accessibility/chromevox.md @@ -0,0 +1,131 @@ +# ChromeVox (for developers) + +ChromeVox is the built-in screen reader on Chrome OS. It was originally +developed as a separate extension but now the code lives inside of the Chromium +tree and it's built as part of Chrome OS. + +To start or stop ChromeVox on Chrome OS, press Ctrl+Alt+Z at any time. 
+ +## Developer Info + +Code location: ```chrome/browser/resources/chromeos/chromevox``` + +Ninja target: it's built as part of "chrome", but you can build and run +chromevox_tests to test it (Chrome OS target only - you must have target_os = +"chromeos" in your GN args first). + +## Developing On Linux + +ChromeVox for Chrome OS development is done on Linux. + +See [ChromeVox on Desktop Linux](chromevox_on_desktop_linux.md) +for more information. + +## ChromeVox Next + +ChromeVox Next is the code name we use for a major rewrite of ChromeVox that +uses the automation API instead of content scripts. The code is part of +ChromeVox (unique ChromeVox Next code is found in +chrome/browser/resources/chromeos/chromevox/cvox2). + +ChromeVox contains all of the Classic and Next code in the same codebase; it +switches its behavior dynamically based on the mode: + +* Next: as of version 56 of Chrome/Chrome OS, this is the default. ChromeVox uses new key/braille bindings, earcons, speech/braille output style, the Next engine (Automation API), and other major/minor improvements. +* Next Compat: in order to maintain compatibility with some clients of the ChromeVox Classic JS APIs, some sites have been whitelisted for this mode. ChromeVox will inject classic content scripts, but expose a Next-like user experience (like above). +* Classic: as of version 56 of Chrome/Chrome OS, this mode gets enabled via a keyboard toggle Search+Q. Once enabled, ChromeVox will behave like it did in the past including keyboard bindings, earcons, speech/braille output style, and the underlying engine (content scripts). +* Classic Compat: for some sites that require Next while running in Classic, ChromeVox will use the Next engine but expose a Classic user experience (like above). + +Once it's ready, the plan is to retire everything other than Next mode. 
+ +## ChromeVox Next + +To test ChromeVox Next, click on the Gear icon in the upper-right of the screen +to open the ChromeVox options (or press the keyboard shortcut Search+Shift+O, O) +and then click the box to opt into ChromeVox Next. + +If you are running m56 or later, you already have ChromeVox Next on by +default. To switch back to Classic, press Search+Q. + +## Debugging ChromeVox + +There are options available that may assist in debugging ChromeVox. Here are a +few use cases. + +### Feature development + +When developing a new feature, it may be helpful to save time by not having to +go through a compile cycle. This can be achieved by setting +```chromevox_compress_js``` to 0 in +chrome/browser/resources/chromeos/chromevox/BUILD.gn, or by using a debug build. + +In a debug build or with chromevox_compress_js off, ChromeVox uses the unflattened files in the +Chrome out directory (e.g. out/Release/resources/chromeos/chromevox/). Now you +can hack directly on the copy of ChromeVox in out/ and toggle ChromeVox to pick +up your changes (via Ctrl+Alt+Z). + +### Fixing bugs + +The easiest way to debug ChromeVox is from an external browser. Start Chrome +with this command-line flag: + +```out/Release/chrome --remote-debugging-port=9222``` + +Now open http://localhost:9222 in a separate instance of the browser, and debug the ChromeVox extension background page from there. + +Another option is to use Emacs jade (available through M-x +package-list-packages). + +It also talks to localhost:9222 but integrates more tightly into Emacs. + +Another option is to use the built-in developer console. Go to the +ChromeVox options page with Search+Shift+O, O; then, substitute the +“options.html” path with “background.html”, and then open up the +inspector. + +### Running tests + +Build the chromevox_tests target. 
To run +lots of tests in parallel, run it like this: + +```out/Release/chromevox_tests --test-launcher-jobs=20``` + +Use a test filter if you only want to run some of the tests from a +particular test suite - for example, most of the ChromeVox Next tests +have "E2E" in them (for "end-to-end"), so to only run those: + +```out/Release/chromevox_tests --test-launcher-jobs=20 --gtest_filter="*E2E*"``` + +## ChromeVox for other platforms + +ChromeVox can be run as an installable extension, separate from a +linux Chrome OS build. + +### From source + +chrome/browser/resources/chromeos/chromevox/tools has the required scripts that pack ChromeVox as an extension and make any necessary manifest changes. + +### From Webstore + +Alternatively, the webstore has the stable version of ChromeVox. + +To install without interacting with the webstore UI, place the +following json block in +/opt/google/chrome-unstable/extensions/kgejglhpjiefppelpmljglcjbhoiplfn.json + +``` +{ +"external_update_url": "https://clients2.google.com/service/update2/crx" +} +``` + +If you're using the desktop Linux version of Chrome, we recommend you +use Voxin for speech. Run chrome with: “google-chrome +--enable-speech-dispatcher” and select a voice provided by the speechd +package from the ChromeVox options page (ChromeVox+o, o). As of the +latest revision of Chrome 44, speechd support has become stable enough +to use with ChromeVox, but still requires the flag. + +In the ChromeVox options page, select the flat keymap and use sticky +mode (double press quickly of insert) to emulate a modal screen +reader. diff --git a/docs/accessibility/chromevox_on_desktop_linux.md b/docs/accessibility/chromevox_on_desktop_linux.md new file mode 100644 index 00000000000000..fc6850c2841aee --- /dev/null +++ b/docs/accessibility/chromevox_on_desktop_linux.md @@ -0,0 +1,110 @@ +# ChromeVox on Desktop Linux + +## Starting ChromeVox + +On Chrome OS, you can enable spoken feedback (ChromeVox) by pressing Ctrl+Alt+Z. 
+ +If you have a Chromebook, this gives you speech support built-in. If you're +building Chrome from source and running it on desktop Linux, speech and braille +won't be included by default. Here's how to enable them. + +## Compiling the Chrome OS version of Chrome + +First follow the public instructions for +[Chrome checkout and build](https://www.chromium.org/developers/how-tos/get-the-code). + +Create a GN configuration with "chromeos" as the target OS, for example: + +```> gn args out/cros``` + +...in editor, add these lines: + +``` +target_os = "chromeos" +is_component_build = true +is_debug = false +``` + +Note: Only ```target_os = "chromeos"``` is required; the others are recommended +for a good experience but you can configure Chrome however you like otherwise. +Note that Native Client is required, so do not put enable_nacl = false in +your file anywhere! + +Now build Chrome as usual, e.g.: + +```ninja -C out/cros chrome``` + +And run it as usual to see a mostly-complete Chrome OS desktop inside +of a window: + +```out/cros/chrome``` + +By default you'll be logged in as the default user. If you want to +simulate the login manager too, run it like this: + +```out/cros/chrome --login-manager``` + +You can run any of the above under its own X session (avoiding any +window manager key combo conflicts) by doing something like + +```startx out/cros/chrome``` + +## Speech + +If you want speech, you just need to copy the speech synthesis data +files to /usr/share, where they would be on a Chrome OS device: + +``` +git clone https://chromium.googlesource.com/chromiumos/platform/assets +sudo cp -r assets /usr/share/chromeos-assets +``` + +Next, move to that directory and unzip the NaCl executables.
You only need +to do the one for your host architecture: + +``` +cd /usr/share/chromeos-assets/speech_synthesis/patts +unzip tts_service_x86-64.nexe.zip +``` + +Finally, fix the permissions: + +``` +sudo chmod oug+r -R /usr/share/chromeos-assets +``` + +**Be sure to check permissions of /usr/share/chromeos-assets, some + users report they need to chmod or chown too, it really depends + on your system.** + +After you do that, just run "chrome" as above +(e.g. out/cros/chrome) and press Ctrl+Alt+Z, and you should hear it +speak! If not, check the logs. + +## Braille + +ChromeVox uses extension APIs to deliver braille to Brltty through +libbrlapi and uses Liblouis to perform translation and +backtranslation. + +Once built, Chrome and ChromeVox will use your machine’s running +Brltty daemon to display braille if ChromeVox is running. Simply +ensure you have a display connected before running Chrome and that +Brltty is running. + +Testing against the latest releases of Brltty (e.g. 5.4 at time of +writing) is encouraged. + +For more general information, see [ChromeVox](chromevox.md) + +# Using ChromeVox + +ChromeVox keyboard shortcuts use Search. On Linux that's usually your +Windows key. If some shortcuts don't work, you may need to remove +Gnome keyboard shortcut bindings, or use "startx", as suggested above, +or remap it. + +* Search+Space: Click +* Search+Left/Right: navigate linearly +* Search+Period: Open ChromeVox menus +* Search+H: jump to next heading on page diff --git a/docs/accessibility/overview.md b/docs/accessibility/overview.md new file mode 100644 index 00000000000000..c0c57797e44864 --- /dev/null +++ b/docs/accessibility/overview.md @@ -0,0 +1,515 @@ +# Accessibility Overview + +Accessibility means ensuring that all users, including users with disabilities, +have equal access to software. 
One piece of this involves basic design +principles such as using appropriate font sizes and color contrast, +avoiding using color to convey important information, and providing keyboard +alternatives for anything that is normally accomplished with a pointing device. +However, when you see the word "accessibility" in a directory name in Chromium, +that code's purpose is to provide full access to Chromium's UI via external +accessibility APIs that are utilized by assistive technology. + +**Assistive technology** here refers to software or hardware which +makes use of these APIs to create an alternative interface for the user to +accommodate some specific needs, for example: + +Assistive technology includes: + +* Screen readers for blind users that describe the screen using + synthesized speech or braille +* Voice control applications that let you speak to the computer, +* Switch access that lets you control the computer with a small number + of physical switches, +* Magnifiers that magnify a portion of the screen, and often highlight the + cursor and caret for easier viewing, and +* Assistive learning and literacy software that helps users who have a hard + time reading print, by highlighting and/or speaking selected text + +In addition, because accessibility APIs provide a convenient and universal +way to explore and control applications, they're often used for automated +testing scripts, and UI automation software like password managers. + +Web browsers play an important role in this ecosystem because they need +to not only provide access to their own UI, but also provide access to +all of the content of the web. + +Each operating system has its own native accessibility API. 
While the +core APIs tend to be well-documented, it's unfortunately common for +screen readers in particular to depend on additional undocumented or +vendor-specific APIs in order to fully function, especially with web +browsers, because the standard APIs are insufficient to handle the +complexity of the web. + +Chromium needs to support all of these operating system and +vendor-specific accessibility APIs in order to be usable with the full +ecosystem of assistive technology on all platforms. Just like Chromium +sometimes mimics the quirks and bugs of older browsers, Chromium often +needs to mimic the quirks and bugs of other browsers' implementation +of accessibility APIs, too. + +## Concepts + +While each operating system and vendor accessibility API is different, +there are some concepts all of them share. + +1. The *tree*, which models the entire interface as a tree of objects, exposed + to assistive technology via accessibility APIs; +2. *Events*, which let assistive technology know that a part of the tree has + changed somehow; +3. *Actions*, which come from assistive technology and ask the interface to + change. + +Consider the following small HTML file: + +``` +<html> +<head> + <title>How old are you?</title> +</head> +<body> + <label for="age">Age</label> + <input id="age" type="number" name="age" value="42"> + <div> + <button>Back</button> + <button>Next</button> + </div> +</body> +</html> +``` + +### The Accessibility Tree and Accessibility Attributes + +Internally, Chromium represents the accessibility tree for that web page +using a data structure something like this: + +``` +id=1 role=WebArea name="How old are you?" + id=2 role=Label name="Age" + id=3 role=TextField labelledByIds=[2] value="42" + id=4 role=Group + id=5 role=Button name="Back" + id=6 role=Button name="Next" +``` + +Note that the tree structure closely resembles the structure of the +HTML elements, but slightly simplified. Each node in the accessibility +tree has an ID and a role. Many have a name. The text field has a value, +and instead of a name it has labelledByIds, which indicates that its +accessible name comes from another node in the tree, the label node +with id=2. + +On a particular platform, each node in the accessibility tree is implemented +by an object that conforms to a particular protocol. + +On Windows, the root node implements the IAccessible protocol and +if you call IAccessible::get_accRole, it returns ROLE_SYSTEM_DOCUMENT, +and if you call IAccessible::get_accName, it returns "How old are you?". +Other methods let you walk the tree. + +On macOS, the root node implements the NSAccessibility protocol and +if you call [NSAccessibility accessibilityRole], it returns @"AXWebArea", +and if you call [NSAccessibility accessibilityLabel], it returns +"How old are you?". + +The Linux accessibility API, ATK, is more similar to the Windows APIs; +they were developed together. (Chrome's support for desktop Linux +accessibility is unfinished.) + +The Android accessibility API is of course based on Java. The main +data structure is AccessibilityNodeInfo. It doesn't have a role, but +if you call AccessibilityNodeInfo.getClassName() on the root node +it returns "android.webkit.WebView", and if you call +AccessibilityNodeInfo.getContentDescription() it returns "How old are you?".
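
As a rough illustration of the per-platform differences described above, the sketch below records how the same internal root node (role=WebArea, name="How old are you?") would appear through each API. The `Platform` enum, the `PlatformView` struct, and `DescribeRootNode` are hypothetical names invented for this example and are not Chromium classes; only the accessor names and returned values are taken from the text above.

```cpp
#include <string>

// Hypothetical enum for this sketch; not a Chromium type.
enum class Platform { kWindows, kMac, kAndroid };

// What an assistive technology would observe for the root node of the page.
struct PlatformView {
  std::string role_accessor;  // call that reports the role (or class name)
  std::string role_value;     // value returned for the root of a web page
  std::string name_accessor;  // call that reports the primary accessible text
};

// Returns the platform-specific view of one and the same internal node.
// The strings are the ones quoted in the surrounding documentation.
inline PlatformView DescribeRootNode(Platform platform) {
  switch (platform) {
    case Platform::kWindows:
      return {"IAccessible::get_accRole", "ROLE_SYSTEM_DOCUMENT",
              "IAccessible::get_accName"};
    case Platform::kMac:
      return {"[NSAccessibility accessibilityRole]", "AXWebArea",
              "[NSAccessibility accessibilityLabel]"};
    case Platform::kAndroid:
      // Android has no role; the class name plays a similar part.
      return {"AccessibilityNodeInfo.getClassName()", "android.webkit.WebView",
              "AccessibilityNodeInfo.getContentDescription()"};
  }
  return {};  // unreachable for valid enum values
}
```

The point of the table is only that one internal node fans out into differently named roles and accessors; a real implementation exposes native objects rather than strings.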
+ +On Chrome OS, we use our own accessibility API that closely maps to +Chrome's internal accessibility API. + +So while the details of the interface vary, the underlying concepts are +similar. Both IAccessible and NSAccessibility have a concept of a role, +but IAccessible uses a role of "document" for a web page, while NSAccessibility +uses a role of "web area". Both IAccessible and NSAccessibility have a +concept of the primary accessible text for a node, but IAccessible calls +it the "name" while NSAccessibility calls it the "label", and Android +calls it a "content description". + +**Historical note:** The internal names of roles and attributes in +Chrome often tend to most closely match the macOS accessibility API +because Chromium was originally based on WebKit, where most of the +accessibility code was written by Apple. Over time we're slowly +migrating internal names to match what those roles and attributes are +called in web accessibility standards, like ARIA. + +### Accessibility Events + +In Chromium's internal terminology, an Accessibility Event always represents +communication from the app to the assistive technology, indicating that the +accessibility tree changed in some way. + +As an example, if the user were to press the Tab key and the text +field from the example above became focused, Chromium would fire a +"focus" accessibility event that assistive technology could listen +to. A screen reader might then announce the name and current value of +the text field. A magnifier might zoom the screen to its bounding +box. If the user types some text into the text field, Chromium would +fire a "value changed" accessibility event. + +As with nodes in the accessibility tree, each platform has a slightly different +API for accessibility events. On Windows we'd fire EVENT_OBJECT_FOCUS for +a focus change, and on Mac we'd fire @"AXFocusedUIElementChanged". +Those are pretty similar. 
Sometimes they're quite different - to support +live regions (notifications that certain key parts of a web page have changed), +on Mac we simply fire @"AXLiveRegionChanged", but on Windows we need to +fire IA2_EVENT_TEXT_INSERTED and IA2_EVENT_TEXT_REMOVED events individually +on each affected node within the changed region, with additional attributes +like "container-live:polite" to indicate that the affected node was part of +a live region. This discussion is not meant to explain all of the technical +details but just to illustrate that the concepts are similar, +but the details of notifying software on each platform about changes can +vary quite a bit. + +### Accessibility Actions + +Each native object that implements a platform's native accessibility API +supports a number of actions, which are requests from the assistive +technology to control or change the UI. This is the opposite of events, +which are messages from Chromium to the assistive technology. + +For example, if the user had a voice control application running, such as +Voice Access on Android, the user could just speak the name of one of the +buttons on the page, like "Next". Upon recognizing that text and finding +that it matches one of the UI elements on the page, the voice control +app executes the action to click the button id=6 in Chromium's accessibility +tree. Internally we call that action "do default" rather than click, since +it represents the default action for any type of control. + +Other examples of actions include setting focus, changing the value of +a control, and scrolling the page. + +### Parameterized attributes + +In addition to accessibility attributes, events, and actions, native +accessibility APIs often have so-called "parameterized attributes". +The most common example of this is for text - for example there may be +a function to retrieve the bounding box for a range of text, or a +function to retrieve the text properties (font family, font size, +weight, etc.) 
at a specific character position. + +Parameterized attributes are particularly tricky to implement because +of Chromium's multi-process architecture. More on this in the next section. + +## Chromium's multi-process architecture + +Native accessibility APIs tend to have a *functional* interface, where +Chromium implements an interface for a canonical accessible object that +includes methods to return various attributes, walk the tree, or perform +an action like click(), focus(), or setValue(...). + +In contrast, the web has a largely *declarative* interface. The shape +of the accessibility tree is determined by the DOM tree (occasionally +influenced by CSS), and the accessible semantics of a DOM element can +be modified by adding ARIA attributes. + +One important complication is that all of these native accessibility APIs +are *synchronous*, while Chromium is multi-process, with the contents of +each web page living in a different process than the process that +implements Chromium's UI and the native accessibility APIs. Furthermore, +the renderer processes are *sandboxed*, so they can't implement +operating system APIs directly. + +If you're unfamiliar with Chrome's multi-process architecture, see +[this blog post introducing the concept]( +https://blog.chromium.org/2008/09/multi-process-architecture.html) or +[the design doc on chromium.org]( +https://www.chromium.org/developers/design-documents/multi-process-architecture) +for an intro. + +Chromium's multi-process architecture means that we can't implement +accessibility APIs the same way that a single-process browser can - +namely, by calling directly into the DOM to compute the result of each +API call. For example, on some operating systems there might be an API +to get the bounding box for a particular range of characters on the +page. In other browsers, this might be implemented by creating a DOM +selection object and asking for its bounding box. 
+ +That implementation would be impossible in Chromium because it'd require +blocking the main thread while waiting for a response from the renderer +process that implements that web page's DOM. (Not only is blocking the +main thread strictly disallowed, but the latency of doing this for every +API call makes it prohibitively slow anyway.) Instead, Chromium takes an +approach where a representation of the entire accessibility tree is +cached in the main process. Great care needs to be taken to ensure that +this representation is as concise as possible. + +In Chromium, we build a data structure representing all of the +information for a web page's accessibility tree, send the data +structure from the renderer process to the main browser process, cache +it in the main browser process, and implement native accessibility +APIs using solely the information in that cache. + +As the accessibility tree changes, tree updates and accessibility events +get sent from the renderer process to the browser process. The browser +cache is updated atomically in the main thread, so whenever an external +client (like assistive technology) calls an accessibility API function, +we're always returning something from a complete and consistent snapshot +of the accessibility tree. From time to time, the cache may lag what's +in the renderer process by a fraction of a second. + +Here are some of the specific challenges faced by this approach and +how we've addressed them. + +### Sparse data + +There are a *lot* of possible accessibility attributes for any given +node in an accessibility tree. For example, there are more than 150 +unique accessibility API methods that Chrome implements on the Windows +platform alone. We need to implement all of those APIs, many of which +request rather rare or obscure attributes, but storing all possible +attribute values in a single struct would be quite wasteful. 
+ +To avoid each accessible node object containing hundreds of fields, the +data for each accessibility node is stored in a relatively compact +data structure, ui::AXNodeData. Every AXNodeData has an integer ID, a +role enum, and a couple of other mandatory fields, but everything else +is stored in attribute arrays, one for each major data type. + +``` +struct AXNodeData { + int32_t id; + AXRole role; + ... + std::vector<std::pair<AXStringAttribute, std::string>> string_attributes; + std::vector<std::pair<AXIntAttribute, int32_t>> int_attributes; + ... +}; +``` + +So if a text field has a placeholder attribute, we can store +that by adding an entry to `string_attributes` with an attribute +of ui::AX_ATTR_PLACEHOLDER and the placeholder string as the value. + +### Incremental tree updates + +Web pages change frequently. It'd be terribly inefficient to send a +new copy of the accessibility tree every time any part of it changes. +However, the accessibility tree can change shape in complicated ways - +for example, whole subtrees can be reparented dynamically. + +Rather than writing code to deal with every possible way the +accessibility tree could be modified, Chromium has a general-purpose +tree serializer class that's designed to send small incremental +updates of a tree from one process to another. The tree serializer has +just a few requirements: + +* Every node in the tree must have a unique integer ID. +* The tree must be acyclic. +* The tree serializer must be notified when a node's data changes. +* The tree serializer must be notified when the list of child IDs of a + node changes. + +The tree serializer doesn't know anything about accessibility attributes. +It keeps track of the previous state of the tree, and every time the tree +structure changes (based on notifications of a node changing or a node's +children changing), it walks the tree and builds up an incremental tree +update that serializes as few nodes as possible. + +In the other process, the deserialization code applies the incremental +tree update atomically.
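
The idea behind incremental updates can be sketched in a few lines. This is a deliberately simplified illustration, not Chromium's tree serializer: `SketchNode`, `SketchTree`, `SketchUpdate`, `SketchSerializer`, and `ApplyUpdate` are all hypothetical names, and the sketch assumes nodes are only added or changed (it ignores deletion and reparenting, which the real class has to handle).

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; not Chromium's classes.
struct SketchNode {
  int32_t id;
  std::string data;  // opaque payload; the serializer never inspects it
  std::vector<int32_t> child_ids;
};

// The source tree in the sending process, keyed by unique node ID.
using SketchTree = std::map<int32_t, SketchNode>;

// An incremental update: only the nodes that must be (re)sent.
struct SketchUpdate {
  std::vector<SketchNode> nodes;
};

class SketchSerializer {
 public:
  explicit SketchSerializer(const SketchTree* tree) : tree_(tree) {}

  // Call when a node's data or child list changed. Serializes that node plus
  // any descendants the receiver has never been sent, and nothing else.
  SketchUpdate SerializeChanges(int32_t changed_id) {
    SketchUpdate update;
    SerializeNode(changed_id, /*force=*/true, &update);
    return update;
  }

 private:
  void SerializeNode(int32_t id, bool force, SketchUpdate* update) {
    const bool known = sent_ids_.count(id) > 0;
    if (known && !force)
      return;  // the receiver already has this subtree, so skip it
    const SketchNode& node = tree_->at(id);
    update->nodes.push_back(node);
    sent_ids_.insert(id);
    for (int32_t child_id : node.child_ids)
      SerializeNode(child_id, /*force=*/false, update);
  }

  const SketchTree* tree_;
  std::set<int32_t> sent_ids_;  // IDs already sent to the receiving process
};

// Receiving side: applies every node of one update within a single call.
inline void ApplyUpdate(const SketchUpdate& update, SketchTree* cache) {
  for (const SketchNode& node : update.nodes)
    (*cache)[node.id] = node;
}
```

In this sketch, serializing the root of a three-node tree for the first time sends all three nodes, while a later change to a single leaf sends only that leaf. That mirrors the goal stated above of serializing as few nodes as possible.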
+ +### Text bounding boxes + +One challenge faced by Chromium is that accessibility clients want to be +able to query the bounding box of an arbitrary range of text - not necessarily +just the current cursor position or selection. As discussed above, it's +not possible to block Chromium's main browser process while waiting for this +information from Blink, so instead we cache enough information to satisfy these +queries in the accessibility tree. + +To compactly store the bounding box of every character on the page, we +split the text into *inline text boxes*, sometimes called *text runs*. +For example, in a typical paragraph, each line of text would be its own +inline text box. In general, an inline text box or text run contains a +sequence of text characters that are all oriented in the same direction, +in a line, with the same font, size, and style. + +Each inline text box stores its own bounding box, and then the relative +x-coordinate of each character in its text (assuming left-to-right). +From that it's possible to compute the bounding box +of any individual character. + +The inline text boxes are part of Chromium's internal accessibility tree. +They're used purely internally and aren't ever exposed directly via any +native accessibility APIs. + +For example, suppose that a document contains a text field with the text +"Hello world", but the field is narrow, so "Hello" is on the first line and +"world" is on the second line. Internally Chromium's accessibility tree +might look like this: + +``` +staticText location=(8, 8) size=(38, 36) name='Hello world' + inlineTextBox location=(0, 0) size=(36, 18) name='Hello ' characterOffsets=12,19,23,28,36 + inlineTextBox location=(0, 18) size=(38, 18) name='world' characterOffsets=12,20,25,29,37 +``` + +### Scrolling, transformations, and animation + +Native accessibility APIs typically want the bounding box of every element in the +tree, either in window coordinates or global screen coordinates.
If we +stored the global screen coordinates for every node, we'd be constantly +re-serializing the whole tree every time the user scrolls or drags the +window. + +Instead, we store the bounding box of each node in the accessibility tree +relative to its *offset container*, which can be any ancestor. If no offset +container is specified, it's assumed to be the root of the tree. + +In addition, any offset container can contain scroll offsets, which can be +used to scroll the bounding boxes of anything in that subtree. + +Finally, any offset container can also include an arbitrary 4x4 transformation +matrix, which can be used to represent arbitrary 3-D rotations, translations, +scaling, and more. The transformation matrix applies to the whole subtree. + +Storing coordinates this way means that any time an object scrolls, moves, or +animates its position and scale, only the root of the scrolling or animation +needs to post updates to the accessibility tree. Everything in the subtree +remains valid relative to that offset container. + +Computing the global screen coordinates for an object in the accessibility +tree just means walking up its ancestor chain and applying offsets and +occasionally multiplying by a 4x4 matrix. + +### Site isolation / out-of-process iframes + +At one point in time, all of the content of a single Tab or other web view +was contained in the same Blink process, and it was possible to serialize +the accessibility tree for a whole frame tree in a single pass. + +Today the situation is a bit more complicated, as Chromium supports +out-of-process iframes. (It also supports "browser plugins" such as +the `<webview>` tag in Chrome packaged apps, which embeds a whole +browser inside a browser, but for the purposes of accessibility this +is handled the same as frames.) + +Rather than a mix of in-process and out-of-process frames that are handled +differently, Chromium builds a separate independent accessibility tree +for each frame.
Each frame gets its own tree ID, and it keeps track of +the tree ID of its parent frame (if any) and any child frames. + +In Chrome's main browser process, the accessibility trees for each frame +are cached separately, and when an accessibility client (assistive +technology) walks the accessibility tree, Chromium dynamically composes +all of the frames into a single virtual accessibility tree on the fly, +using those aforementioned tree IDs. + +The node IDs for accessibility trees only need to be unique within a +single frame. Where necessary, separate unique IDs are used within +Chrome's main browser process. In Chromium accessibility, a "node ID" +always means that ID that's only unique within a frame, and a "unique ID" +means an ID that's globally unique. + +## Blink + +Blink constructs an accessibility tree (a hierarchy of [WebAXObject]s) from the +page it is rendering. WebAXObject is the public API wrapper around [AXObject], +which is the core class of Blink's accessibility tree. AXObject is an abstract +class; the most commonly used concrete subclass of it is [AXNodeObject], which +wraps a [Node]. In turn, most AXNodeObjects are actually [AXLayoutObject]s, +which wrap both a [Node] and a [LayoutObject]. Access to the LayoutObject is +important because some elements are only in the AXObject tree depending on their +visibility, geometry, linewrapping, and so on. There are some subclasses of +AXLayoutObject that implement special-case logic for specific types of Node. +There are also other subclasses of AXObject, which are mostly used for testing. + +Note that not all AXLayoutObjects correspond to actual Nodes; some are synthetic +layout objects which group related inline elements or similar. + +The central class responsible for dealing with accessibility events in Blink is +[AXObjectCacheImpl], which is responsible for caching the corresponding +AXObjects for Nodes or LayoutObjects. 
This class has many methods named +`handleFoo`, which are called throughout Blink to notify the AXObjectCacheImpl +that it may need to update its tree. Since this class is already aware of all +accessibility events in Blink, it is also responsible for relaying accessibility +events from Blink to the embedding content layer. + +## The content layer + +The content layer lives on both sides of the renderer/browser split. The content +layer translates WebAXObjects into [AXContentNodeData], which is a subclass of +[ui::AXNodeData]. The ui::AXNodeData class and related classes are Chromium's +cross-platform accessibility tree. The translation is implemented in +[BlinkAXTreeSource]. This translation happens on the renderer side, so the +ui::AXNodeData tree now needs to be sent to the browser, which is done by +sending [AccessibilityHostMsg_EventParams] with the payload being serialized +delta-updates to the tree, so that changes that happen on the renderer side can +be reflected on the browser side. + +On the browser side, these IPCs are received by [RenderFrameHostImpl], and then +usually forwarded to [BrowserAccessibilityManager] which is responsible for: + +1. Merging AXNodeData trees into one tree of [BrowserAccessibility] objects, + by linking to other BrowserAccessibilityManagers. This is important because + each page has its own accessibility tree, but each Chromium *window* must + have only one accessibility tree, so trees from multiple pages need to be + combined (possibly also with trees from Views UI). +2. Dispatching outgoing accessibility events to the platform's accessibility + APIs. This is done in the platform-specific subclasses of + BrowserAccessibilityManager, in a method named `NotifyAccessibilityEvent`. +3. Dispatching incoming accessibility actions to the appropriate recipient, via + [BrowserAccessibilityDelegate]. 
For messages destined for a renderer, + [RenderFrameHostImpl], which is a BrowserAccessibilityDelegate, is + responsible for sending appropriate `AccessibilityMsg_Foo` IPCs to the + renderer, where they will be received by [RenderAccessibilityImpl]. + +On Chrome OS, RenderFrameHostImpl does not route events to +BrowserAccessibilityManager at all, since there is no platform screenreader +outside Chromium to integrate with. + +## Views + +Views generates a [NativeViewAccessibility] for each View, which is used as the +delegate for an [AXPlatformNode] representing that View. This part is relatively +straightforward, but then the generated tree must be combined with the web +accessibility tree, which is handled by BrowserAccessibilityManager. + +## WebUI + +Since WebUI surfaces have renderer processes as normal, WebUI accessibility goes +through the blink-to-content-to-platform pipeline described above. Accessibility +for WebUI is largely implemented in JavaScript in [webui-js]; these classes take +care of adding ARIA attributes and so on to DOM nodes as needed. + +## The Chrome OS layer + +The accessibility tree is also exposed via the [chrome.automation API], which +gives extension JavaScript access to the accessibility tree, events, and +actions. This API is implemented in C++ by [AutomationInternalCustomBindings], +which is renderer-side code, and in JavaScript by the [automation API]. The API +is defined by [automation.idl], which must be kept synchronized with +[ax_enums.idl]. 
+ +[AccessibilityHostMsg_EventParams]: https://cs.chromium.org/chromium/src/content/common/accessibility_messages.h?sq=package:chromium&l=75 +[AutomationInternalCustomBindings]: https://cs.chromium.org/chromium/src/chrome/renderer/extensions/automation_internal_custom_bindings.h +[AXContentNodeData]: https://cs.chromium.org/chromium/src/content/common/ax_content_node_data.h +[AXLayoutObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/accessibility/AXLayoutObject.h +[AXNodeObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/accessibility/AXNodeObject.h +[AXObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/accessibility/AXObject.h +[AXObjectCacheImpl]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/accessibility/AXObjectCacheImpl.h +[AXPlatformNode]: https://cs.chromium.org/chromium/src/ui/accessibility/platform/ax_platform_node.h +[AXTreeSerializer]: https://cs.chromium.org/chromium/src/ui/accessibility/ax_tree_serializer.h +[BlinkAXTreeSource]: https://cs.chromium.org/chromium/src/content/renderer/accessibility/blink_ax_tree_source.h +[BrowserAccessibility]: https://cs.chromium.org/chromium/src/content/browser/accessibility/browser_accessibility.h +[BrowserAccessibilityDelegate]: https://cs.chromium.org/chromium/src/content/browser/accessibility/browser_accessibility_manager.h?sq=package:chromium&l=64 +[BrowserAccessibilityManager]: https://cs.chromium.org/chromium/src/content/browser/accessibility/browser_accessibility_manager.h +[LayoutObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/layout/LayoutObject.h +[NativeViewAccessibility]: https://cs.chromium.org/chromium/src/ui/views/accessibility/native_view_accessibility.h +[Node]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/dom/Node.h +[RenderAccessibilityImpl]: https://cs.chromium.org/chromium/src/content/renderer/accessibility/render_accessibility_impl.h 
+[RenderFrameHostImpl]: https://cs.chromium.org/chromium/src/content/browser/frame_host/render_frame_host_impl.h
+[ui::AXNodeData]: https://cs.chromium.org/chromium/src/ui/accessibility/ax_node_data.h
+[WebAXObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/public/web/WebAXObject.h
+[automation API]: https://cs.chromium.org/chromium/src/chrome/renderer/resources/extensions/automation
+[automation.idl]: https://cs.chromium.org/chromium/src/chrome/common/extensions/api/automation.idl
+[ax_enums.idl]: https://cs.chromium.org/chromium/src/ui/accessibility/ax_enums.idl
+[chrome.automation API]: https://developer.chrome.com/extensions/automation
+[webui-js]: https://cs.chromium.org/chromium/src/ui/webui/resources/js/cr/ui/
diff --git a/docs/accessibility/patts.md b/docs/accessibility/patts.md
new file mode 100644
index 00000000000000..5a7fce52891bea
--- /dev/null
+++ b/docs/accessibility/patts.md
@@ -0,0 +1,77 @@
+# The Chrome OS PATTS speech synthesis engine
+
+Chrome OS comes with a speech synthesis engine developed internally at Google
+called PATTS. It's based on the same engine that ships with all Android devices.
+
+## Building from source
+
+This is for Googlers only.
+
+Visit [http://go/chrome-tts-blaze](http://go/chrome-tts-blaze)
+for instructions on how to build the engine from source and get the
+latest voice files.
+
+When debugging, start Chrome from the command line and set the
+NACL_PLUGIN_DEBUG environment variable to 1 to print log messages to stdout.
+
+## Updating
+
+First, follow the public
+[Chromium OS Developer Guide](http://www.chromium.org/chromium-os/developer-guide) to check out the source.
+At a minimum you'll need to create a chroot.
+You do not need to build everything from source.
+You do need to start the devserver.
+
+Next, flash your device to a very recent test build. Internally at Google
+you can do this with the following command while the devserver is running,
+where CHROMEBOOK_IP_ADDRESS is the IP address of your Chromebook (already
+in developer mode) and $BOARD is your Chromebook's board name.
+
+```cros flash ssh://CHROMEBOOK_IP_ADDRESS xbuddy://remote/$BOARD/latest-dev/test```
+
+Before you can make changes to PATTS, the first thing you need to do
+(from the chroot) is run cros_workon on the two relevant ebuilds:
+
+```
+cros_workon --board=$BOARD start chromeos-assets
+cros_workon --board=$BOARD start common-assets
+```
+
+Next, make sure you're in the platform/assets directory and run
+```repo start``` to create a branch.
+
+```
+cd platform/assets
+repo start .
+```
+
+
+The PATTS data files can be found in this directory:
+
+```platform/assets/speech_synthesis/patts```
+
+When updating the files, the Native Client files (nexe) need to be zipped.
+
+Replace all of the files you need to update, commit them using git,
+then from the chroot, run:
+
+```
+emerge-$BOARD common-assets
+cros deploy CHROMEBOOK_IP_ADDRESS common-assets
+```
+
+Note that you need to run cros_workon on both chromeos-assets and
+common-assets. You will be changing files in chromeos-assets, but
+to get the changes onto your device, you need to emerge and deploy
+common-assets.
+
+After that, reboot your Chromebook and verify that speech works.
+
+To upload the change, use repo upload, something like this:
+
+```
+git commit -a
+  BUG=chromium:12345
+  TEST=Write what you tested here
+repo upload .
+```
diff --git a/third_party/WebKit/LayoutTests/accessibility/readme.md b/third_party/WebKit/LayoutTests/accessibility/readme.md
index ab9236d798411d..5dfce1dd9fdc74 100644
--- a/third_party/WebKit/LayoutTests/accessibility/readme.md
+++ b/third_party/WebKit/LayoutTests/accessibility/readme.md
@@ -2,11 +2,13 @@
 
 ## General Info on LayoutTests: Building and Running the Tests
 
-See https://chromium.googlesource.com/chromium/src/+/master/docs/testing/layout_tests.md for general info on how to build and run LayoutTests.
+See [Layout Tests](/docs/testing/layout_tests.md) for general
+info on how to build and run layout tests.
 
 ## Old vs. New
 
 There are two styles of accessibility layout tests:
+
 * Using a ```-expected.txt``` (now deprecated)
 * Unit-style tests with assertions
 
@@ -22,4 +24,3 @@ The code that implements the bindings is here:
 * ```components/test_runner/web_ax_object_proxy.cc```
 
 You'll probably find bindings for the features you want to test already. If not, it's not hard to add new ones.
-