Photo of a hand on a remote scrolling through TV settings

On July 18, 2024, the FCC released a Report and Order (FCC 24-79) implementing a "readily accessible" requirement for closed captioning display settings on a range of video devices, allowing users to customize caption font size, type, color, position, opacity, and background to suit their readability needs and viewing preferences. The Order addresses the difficulties many users, particularly those who are deaf or hard of hearing, face due to complex navigation, inconsistent device interfaces, limited customization options, and inadequate support, and it responds to widespread complaints about how hard closed captioning settings can be to reach.

The Order identifies four elements the FCC will consider in deciding whether display settings are "readily accessible," and with which manufacturers of covered apparatus and multichannel video programming distributors (MVPDs) must comply: proximity, ensuring the settings are easy to navigate to; discoverability, making them straightforward to find; previewability, allowing users to see changes in real time; and consistency and persistence, maintaining user settings across devices and sessions.

FCC Commissioner Anna Gomez stated, "Ensuring that those who are deaf and hard of hearing can locate and adjust closed caption settings is essential to their being able to meaningfully access and enjoy video programming. While this is a milestone to be proud of, as technology continues to advance, it is crucial that manufacturers prioritize the inclusion of accessibility features into product development from the beginning. Accessibility by design."

The discussions and rulings on these matters emphasize the FCC's commitment to improving accessibility in communications technologies, ensuring that closed captioning features are more user-friendly and customizable. Hopefully, these changes will be implemented sooner rather than later, so more people can enjoy the benefits of closed captions.

Photo of a vintage TV displaying an alert from the Emergency Alert System

The Federal Communications Commission’s (FCC) Public Safety and Homeland Security Bureau (PSHSB) issued a Public Notice to remind Emergency Alert System (EAS) participants of their obligation to ensure that EAS alerts are accessible to persons with disabilities.

The Federal Emergency Management Agency (FEMA), in coordination with the FCC, will conduct a nationwide Emergency Alert System (EAS) and Wireless Emergency Alert (WEA) test on October 4, 2023.

The Public Notice also reminded EAS Participants that they must file ETRS Form Two after the nationwide EAS test and no later than October 5, 2023, and ETRS Form Three on or before November 20, 2023. For TV stations, to be visually accessible, EAS text must be displayed as follows (as it relates to closed captioning):

“At the top of the television screen or where it will not interfere with other visual messages (e.g., closed captioning),” and “without overlapping lines or extending beyond the viewable display (except for video crawls that intentionally scroll on and off the screen)…”

This is in addition to another FCC Public Notice which states:

"Individuals who are Deaf or Hard of Hearing. Emergency information provided in the audio portion of programming also must be accessible to persons who are deaf or hard of hearing through closed captioning or other methods of visual presentation, including open captioning, crawls or scrolls that appear on the screen. Visual presentation of emergency information may not block any closed captioning, and closed captioning may not block any emergency information provided by crawls, scrolls, or other visual means."

As EAS alerts are expected to become more common in the future, this is something we in the captioning industry will be prepared for, and we will do our part to make the experience better for viewers.

Photo of a video conferencing platform

On June 8, 2023, the Federal Communications Commission (FCC) released a Report and Order and Notice of Proposed Rulemaking aiming to further ensure accessibility for all individuals in video conferencing services. The action establishes that, under Section 716 of the Twenty-First Century Communications and Video Accessibility Act of 2010 (CVAA), video conferencing platforms commonly used for work, school, healthcare, and other purposes fall under the definition of "interoperable video conferencing service."

Under Section 716 of the CVAA, providers of Advanced Communications Services (ACS) and equipment manufacturers are required to make their services and equipment accessible to individuals with disabilities, unless doing so is not achievable. ACS includes interoperable video conferencing services such as Zoom, Microsoft Teams, Google Meet, and BlueJeans. The FCC previously left the interpretation of "interoperable" open, but in this latest Report and Order it adopted the statutory definition without modification, encompassing services that provide real-time video communication to enable users to share information.

In the Notice of Proposed Rulemaking, the FCC seeks public comments on performance objectives for interoperable video conferencing services, including requirements for accurate and synchronous captions, text-to-speech functionality, and effective video connections for sign language interpreters.

The FCC's actions on this item are an important step toward ensuring that people with disabilities have equal access to video conferencing services. The Report and Order will help make video conferencing more accessible and promote greater inclusion and participation of people with disabilities.

Photo of a conference call on Zoom

On October 11, 2022, the Federal Communications Commission (FCC) released the latest CVAA biennial report to Congress, evaluating current industry compliance with Sections 255, 716, and 718 of the Communications Act of 1934. The biennial report is required by the Twenty-First Century Communications and Video Accessibility Act (CVAA), which amended the Communications Act of 1934 to include updated requirements for ensuring the accessibility of "modern" telecommunications to people with disabilities.

FCC rules under Section 255 of the Communications Act require telecommunications equipment manufacturers and service providers to make their products and services accessible to people with disabilities. If such access is not readily achievable, manufacturers and service providers must make their devices and services compatible with third-party applications, peripheral devices, software, hardware, or consumer premises equipment commonly used by people with disabilities.

Accessibility Barriers

Despite major design improvements over the past two years, the report reveals that accessibility gaps still persist and that industry commenters are most concerned about equal access on video conferencing platforms. The COVID-19 pandemic has highlighted the importance of accessible video conferencing services for people with disabilities.

Zoom, BlueJeans, FaceTime, and Microsoft Teams have introduced a variety of accessibility feature enhancements, including screen reader support, customizable chat features, multi-pinning features, and "spotlighting" so that all participants know who is speaking. However, commenters have expressed concern over the compatibility of screen share and chat features with screen readers, along with the platforms' synchronous automatic captioning features.

Although many video conferencing platforms now offer meeting organizers synchronous automatic captioning to accommodate deaf and hard-of-hearing participants, the Deaf and Hard of Hearing Consumer Advocacy Organizations (DHH CAO) pointed out that automated captioning sometimes produces incomplete or delayed transcriptions, and even if slight delays in live captions cannot be avoided, these delays may cause "cognitive overload." Comprehension can be further hindered if a person who is deaf or hard of hearing cannot see the faces of speaking participants, for "people with hearing loss rely more on nonverbal information than their peers, and if a person misses a visual cue, they may fall behind in the conversation."

Automated vs. Human-generated Captions

At present, the automated captioning features on these conference platforms have an error rate of 5-10%. That's 5-10 errors per 100 words spoken, and with the average English speaker's conversational rate of 150 words per minute, you're looking at the possibility of over a dozen errors every minute.
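To put that claim in concrete terms, here is the rough arithmetic behind it as a quick sketch (the 150 words-per-minute rate and the 5-10% error range are the figures assumed above, not measurements from any specific platform):

```python
# Rough estimate of caption errors per minute from an assumed word error
# rate and speaking rate (values taken from the figures quoted above).
speaking_rate_wpm = 150          # average conversational English rate

for error_rate in (0.05, 0.10):  # 5% and 10% word error rates
    errors_per_minute = error_rate * speaking_rate_wpm
    print(f"{error_rate:.0%} error rate: ~{errors_per_minute:.0f} caption errors per minute")

# Prints roughly 8 errors per minute at 5% and 15 errors per minute at 10%.
```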

Earlier this year, our team put Adobe's artificial intelligence (AI) powered speech-to-text engine to the test. We tasked our most experienced Caption Editor with using Adobe's auto-generated transcript to create and edit captions that meet the quality standards of the FCC and the deaf and hard-of-hearing community on two types of video clips: a single-speaker program and one with multiple speakers.

How did it go? Take a look: Human-generated Captions vs. Adobe Speech-to-text

In the July '21 release of Premiere Pro, Adobe introduced its artificial intelligence (AI) powered speech-to-text engine to help creators make their content more accessible to their audiences. The extensive toolset allows users to edit, stylize, and export captions in all supported formats straight from the sequence timeline of a Premiere Pro project. A three-step process of auto-transcribing, generating, and stylizing captions, all within a platform already familiar to its users, delivers a seamless experience from beginning to end. But how accurate is the final product?

Today, AI captions, at their best, have an error rate of 5-10% (roughly 90-95% accuracy), a marked improvement over the 80% accuracy we saw just a few years ago. High accuracy is crucial for the deaf and hard-of-hearing audience, as each error adds to the possibility of confusing the message. To protect all audiences that rely on captioning to understand programming on television, the Federal Communications Commission (FCC) set a detailed list of quality standards that all captions must meet to be acceptable for broadcast back in 2015. Preceding those standards, the Described and Captioned Media Program (DCMP) published its Captioning Key manual over 20 years ago, and it has since been a valuable reference for captioning of both entertainment and educational media targeted to audiences of all age groups. Simply having captions present on your content isn't enough; they need to be accurate and best replicate the experience for all audiences.

Adobe's speech-to-text engine has been one of the most impressive our team has seen to date, so we decided to take a deeper look and run some tests. We tasked our most experienced Caption Editor with using Adobe's auto-generated transcript to create and edit captions that meet the quality standards of the FCC and the deaf and hard-of-hearing community on two types of video clips: a single-speaker program and one with multiple speakers. Our editor used our Pop-on Plus+ caption product for these examples, our middle-tier quality captions that fulfill all quality standard requirements but are not always 100% free of errors.

Did using Adobe’s speech-to-text save time, or did it create more work in the editing process than needed? Here’s how it went…

In-depth comparison documents that evaluate the captions cell-by-cell are available for download here:

Single Speaker Clip

In this example, we used the perfect scenario for AI: clear audio, a single speaker at an optimal words-per-minute (WPM) speaking rate, and no sound effects or music.

The captions contained the following issues that would need to be corrected by a Caption Editor:

Here's the clip with Adobe's speech-to-text captions overlaid on the top half of the video, and ours on the bottom half.

Multiple Speaker Clip

For the next clip, we went with a more realistic example of television programming featuring multiple speakers, an area where AI is known to struggle, particularly with identifying who is speaking. This clip also features someone with a pronounced accent, commentators speaking over one another, and proper names of athletes, all of which our editors take the time to research and understand.

The same errors detailed in the single-speaker example are present throughout, among the other difficulties we expected it to have. In fact, there were so many errors that our editor was unable to use the transcript from Adobe and instead started from scratch using our own workflow.

Here's a sample of the first 9 cells of captions, showing what Adobe transcribed, the notes from our Caption Editor, and how the captions should look (a "/" marks a line break within a caption cell).

Adobe's Automated SRT Caption File | Issue | Formatted by Aberdeen
"something / you are never seen in your life, correct?" | No speaker ID. | "(Pedro Martinez) / It's something you have / never seen in your life,"
 | "Correct" is spoken by new speaker. | "(Matt Vasgersian) / Correct!"
"So it's." | Missing text. | "So it's--so it's MVP / of the year!"
"So we're all watching something / different. OK" | | "(Pedro) / We're all watching / something different."
"He gets the MVP." | | "Okay, he gets the MVP."
"I'd be better off." | Completely misunderstood music lyrics. | "♪ Happy birthday to you ♪"
"Oh, you, you guys." | | "(Matt) / You guys."
"Let me up here to dove into the opening / night against the Hall of Fame." | Merged multiple sentences together. | "Just left me up here to die."
 | | "You left me up here to die / against the hall of famer."

Take a look at the clip, again with Adobe's speech-to-text on the top and Aberdeen's on the bottom.

In-depth comparison documents that evaluate the captions cell-by-cell are available for download here:

The Verdict

Overall, the quality of the auto-generated captions exceeded expectations, and we found them to be in the top tier of speech-recognition engines available. The timing and punctuation were particularly impressive. However, when doing a true comparison to the captioning work that we would consider acceptable, AI does not meet Aberdeen’s broadcast quality standard.

Aberdeen's post-production Caption Editors are detail-oriented and grammar-savvy, and they always strive to portray every element of the program with 100% accuracy so that the viewer misses nothing. Editing and correcting the single-speaker clip took our most experienced Caption Editor a 5:1 ratio in time; that is, for every minute of video, it took five minutes to clean up the transcript and captions. Assuming your team is educated in the proper timing of caption cells, line breaks, and grammar, a 30-minute program may take over 2.5 hours to bring up to standards even with a usable transcript. In the second example, the transcript was unusable and would have taken more time to clean up than it would to transcribe from scratch; at least double that timeline.
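As a quick sketch of how that ratio scales, here is the back-of-the-envelope arithmetic (the 5:1 figure comes from our single-speaker test above; assume harder, multi-speaker content pushes the ratio higher):

```python
# Estimate caption cleanup time from video length and an editing ratio.
# The 5:1 ratio is the figure observed on our single-speaker clip; it is
# an assumption, not a guarantee, and multi-speaker content runs higher.
def cleanup_hours(video_minutes: float, edit_ratio: float = 5.0) -> float:
    """Return the estimated hours needed to edit auto-generated captions."""
    return (video_minutes * edit_ratio) / 60

print(cleanup_hours(30))        # 2.5 hours for a 30-minute program at 5:1
print(cleanup_hours(30, 10.0))  # 5.0 hours if the ratio doubles
```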

Consider all of the above when using this service. Do you have the time and resources to train your staff to edit auto-generated captions and bring them up to the appropriate standards? How challenging might your content be for the AI? Whenever and however you make the choice, make sure you deliver the best possible experience to your entire audience.

In the history of our planet, littering is a relatively new problem. It was around the 1950s when manufacturers began producing a higher volume of litter-creating material, such as disposable products and packaging made with plastic. Much like the boom of manufacturers creating more disposable packaging, new video content is being pushed out to streaming platforms in incredible volumes every day.

Along with all this new video content, there are noticeable similarities between littering and a prevalent problem in our industry: inaccessible media, specifically poor captioning quality. Instead of food wrappers, water bottles, plastic bags, or cigarette butts, it's misspellings, missing punctuation, missing words, or the wrong reading rate (words per minute on the screen) that affect readability.

The motives behind littering and choosing poor-quality captioning are similar, and they generally boil down to one of the following: laziness or carelessness, lenient enforcement, or the presence of litter already in the area. Both are selfish acts: one person takes the easy route by discarding their trash wherever they please, or, in the case of captioning, by choosing the quickest and cheapest option available without any regard for quality. If the organizations enforcing the guidelines and standards relax their efforts, many people will not follow them. And the sight of other content creators getting away with inaccessible media will, no doubt, encourage others to take the same route.

In The Big Hack’s survey of over 3,000 disabled viewers, four in five disabled people experience accessibility issues with video-on-demand services. “66% of users feel either frustrated, let down, excluded or upset by inaccessible entertainment.” In fact, “20% of disabled people have canceled a streaming service subscription because of accessibility issues.” It’s clear: inaccessible media is polluting video content libraries.

Viewers who do not use closed captions may not always think about how poor-quality captions affect the users who do, just as the consequences of littering for the community and animals that share the Earth's ecosystem are often overlooked. Education and awareness are important tools in reducing the problem. If we allow it to become commonplace, bad captioning, much like litter, will wash into the "ocean" of online video content and become permanent pollution in our video "ecosystem."

So, what can we do about it before it's too late? Much as with littering, we can start with community cleanups. Let content creators know that you value captioning and would enjoy their content more if captions were present and accurately represented the program to all viewers. Find their websites and social media pages and contact them; make them aware. And if it's on broadcast television, let the FCC know.

Clean communities have a better chance of attracting new businesses, residents, and tourists, and the same goes for the online video community. Quality captioning is your choice; for the sake of the video community, please evaluate the quality of work done by the captioning vendors you're considering and don't just go for the cheapest and quickest option. Help keep the video community clean.

Internet Closed Captioning Quality

There’s a growing trend on social media and sites like Reddit and Quora to showcase captioning errors from television and numerous online platforms. As accessibility laws tighten and the quality standards for captioning on broadcasts become more rigorous, how do these bloggers have so much fuel for their posts on captioning errors? It is a simple question with many complicated answers.

Live television programming is captioned in real time either by machines or by humans working on a stenotype machine (like those used in courtrooms), so the captions tend to lag slightly behind and, inevitably, include some paraphrasing and errors. While the Federal Communications Commission requires American television stations' post-production captions to meet certain standards, the Internet remains vastly unregulated. Video-sharing websites like YouTube have struggled to provide accessible captions. Despite YouTube's recent efforts to improve accessibility, its captions continue to disappoint viewers, especially those in the deaf and hard-of-hearing community.

In a 2014 article in The Atlantic called "The Sorry State of Closed Captioning," Tammy H. Nam explains why machines cannot create the same experience humans can. She posits, "Machine translation is responsible for much of today's closed-captioning and subtitling of broadcast and online streaming video. It can't register sarcasm, context, or word emphasis." By using machines instead of human writers and editors, sites like YouTube are not providing the same viewing experience to the deaf and hard of hearing as they are to their other patrons. Humans can understand which homophone to use based on context. There is an enormous difference between the words soar and sore, air and heir, suite and sweet. Humans can also determine when a noise is important to the plot of a story and include it in the captions so that a non-hearing viewer won't miss critical details. In the same Atlantic article, deaf actress Marlee Matlin says, "I rely on closed captioning to tell me the entire story…I constantly spot mistakes in the closed captions. Words are missing or something just doesn't make sense." Accessible closed captions should follow the spoken dialogue and important sounds exactly, so that viewers stay immersed in the story. Having to decipher poor captions takes the viewer out of the flow of the story and creates a frustrating experience.

YouTube introduced its own automatic captioning software for creators in 2010. The software is known for its incomprehensible captions. Deaf YouTuber and activist Rikki Poynter made a video in 2015 highlighting the various ways in which YouTube's automatic captions are inaccessible. She wrote a 2018 blog post explaining her experience with the software: "Most of the words were incorrect. There was no grammar. (For the record, I'm no expert when it comes to grammar, but the lack of punctuation and capitalization sure was something.) Everything was essentially one long run-on sentence. Captions would stack up on each other and move at a slow pace." For years, Rikki and other deaf and hard-of-hearing YouTube users had to watch videos with barely any of the audio accurately conveyed. Although her blog post highlights the ways in which YouTube's automatic captions have improved since 2015, she writes, "With all of that said, do I think that we should choose to use only automatic captions? No, I don't suggest that. I will always suggest manually written or edited captions because they will be the most accurate. Automatic captions are not 100% accessible and that is what captions should be." The keyword is accessible. When captions do not accurately reflect the spoken words in videos, television shows, and movies, those stories and that information are inaccessible to the deaf and hard of hearing. Missing words, incorrect words, poor timing, and captions covering subtitles or other important graphics all take the viewer out of the experience or leave out information critical to fully understanding and engaging with the content. Until web resources like YouTube take their deaf and hard-of-hearing viewers' complaints seriously, they will continue to alienate them.

So, what can we do about poor closed captioning on the web? Fortunately, the Internet is also an amazing tool that allows consumers and users to have a voice in the way they experience web content. Deaf and hard-of-hearing activists like Marlee Matlin, Rikki Poynter, and Sam Wildman have been using their online platforms to push for better web closed captions. Follow in their footsteps and use the voice that the web gives you. Make a YouTube video like Rikki Poynter, or write a blog post like Sam Wildman's "An Open Letter to Netflix Re: Subtitles."

The Internet is a powerful platform in which large companies like Google can hear directly from their consumers. If you would like to see the quality of closed captions on the web improve, use your voice. Otherwise, you'll continue to see memes like this one...

Last month, the FCC amended a few sections of Title 47 CFR 79.1, the rule pertaining to closed captioning of televised video programming. The amendments, specifically to 79.1(g)(1) through (9) and (i)(1) through (2), along with the removal of (j)(4), follow up on the reallocation of responsibilities between Video Programmers and Video Program Distributors first proposed back in early 2016. The updates reflect the final decisions on how a compliance ladder will operate when handling consumer complaints about closed captioning quality.

The ruling focuses on two different scenarios based on how the consumer makes a complaint. The FCC recommends filing all complaints within 60 days of the problem, either directly with the FCC or with the Video Program Distributor (VPD) responsible for delivering the program to the consumer. Depending on how the complaint is filed, the review and the steps taken to correct the issue should follow the process below.

The final deadline of the FCC's Twenty-First Century Communications and Video Accessibility Act (CVAA) rules pertaining to online video is quickly approaching. On July 1, 2017, video clips (both straight-lift and montage clips) of live and near-live television programming (such as news or sporting events) will need to observe the required turnaround times for posting online with captions.

Live & Near-live Programming

Live programming is defined as programming shown on TV substantially simultaneously with its performance. When the Commission evaluates compliance with captioning standards for live programming, there's an understanding that live captioning cannot be perfect: there's a human element to live captioning and no opportunity to review and edit captions in a live setting. Therefore, a little leeway is provided given the nature of live programming.

Near-live programming, which is programming that is performed and recorded within 24 hours prior to when it is first aired on television, is evaluated under the same standards applied to live programming. Although the FCC encourages measures to be taken prior to the program’s airing to improve its captioning quality, it’s understood that the window of time to make those corrections is very limited.

Revisiting the Internet Captioning Rules

The CVAA rules require video programming distributors that show programming on TV to caption the video clips of that programming that they post on their own websites or applications ("apps"). Currently, the video clips rules do not apply to third-party websites or apps.

It’s also important to remember that consumer-generated media (e.g., home videos) shown on the Internet are not required to be captioned unless they were shown on TV with captions.

Further reading from the FCC: Captioning of Internet Video Programming & Twenty-First Century Communications and Video Accessibility Act (CVAA).

Repurposing caption files for the web can be as simple as reformatting and a quick file conversion. After all, the videos have already been transcribed; it's just a matter of matching your video player's specifications for web play-out. To learn more about getting your Internet clips compliant, please contact us.
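To give a sense of how light that conversion can be, here is a minimal sketch of one common step, turning a broadcast-style SRT caption file into WebVTT for a web player (the file names are hypothetical, and a real workflow would also check the player's styling and positioning support):

```python
# Minimal SRT-to-WebVTT conversion sketch: WebVTT needs a "WEBVTT" header
# and uses '.' instead of ',' before the milliseconds in cue timestamps.
import re

TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})")

def srt_to_vtt(srt_path: str, vtt_path: str) -> None:
    with open(srt_path, encoding="utf-8") as src:
        text = src.read()
    text = TIMESTAMP.sub(r"\1.\2 --> \3.\4", text)   # fix timestamp separators
    with open(vtt_path, "w", encoding="utf-8") as dst:
        dst.write("WEBVTT\n\n" + text)

# Hypothetical file names for illustration only.
srt_to_vtt("program_clip.srt", "program_clip.vtt")
```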

NRB 17 Learning Arena FCC Regulations

Earlier this month at NRB Proclaim 17, we hosted a brief talk with broadcasters about the most up-to-date FCC rules pertaining to closed captioning. Although it was intended as a review of rules already in place, it showed that many of them are still unknown, or unclear, to many broadcasters.

The talk was hosted at the Learning Arena, located in the vendor exhibit hall. The goal of the Learning Arena is to foster true interaction between exhibiting companies and convention participants to share and connect, highlighting relevant education and training.