Posted by Alyssa Dunagan, Sales Manager at Aberdeen Broadcast Services

In the modern era dominated by automated phone systems and AI-driven chatbots, genuine human interactions in online customer service are becoming a rarity. At Aberdeen, we recognize the significance of delivering an exceptional client experience, and we believe that availability and prompt response time play pivotal roles in achieving this goal. Our commitment to clear, effective, and timely communication is deeply ingrained in our company's core values.

As the Sales Manager at Aberdeen, I can assure you that we prioritize team availability around the clock. Whether you prefer phone calls, text messages, or emails, we have you covered. The “Chat With Us” feature on our website is regularly available throughout the day, providing the convenience of quick and personalized responses from a human, not a chatbot.

Client Satisfaction Through Quick Response Times

Our success in delivering quick responses has been remarkable, setting us apart from our competitors. Studies by InsideSales and Harvard Business Review have shown that responding to new inquiries within five minutes increases the chances of connection by 100 times. Astonishingly, only a minority of companies respond within an hour, and many take longer than 24 hours or never respond at all. At Aberdeen, we take pride in our average response time of 7 minutes, with many inquiries addressed within 2 to 3 minutes. This not only makes a positive first impression but also builds strong and lasting relationships with our valued clients.

Our ultimate goal is to differentiate ourselves through an exceptional customer experience and a thorough understanding of your needs. By promptly responding to inquiries and providing customized proposals, we ensure a seamless sales process and create a great working relationship with you. At Aberdeen, we are committed to your satisfaction and success.

But don’t take our word for it. Browse any of our 100+ positive client testimonials across all major business review platforms: https://aberdeen.io/testimonials/.

Photo of a video conferencing platform

On June 8, 2023, the Federal Communications Commission (FCC) released a Report and Order and Notice of Proposed Rulemaking aimed at further ensuring accessibility for all individuals in video conferencing services. The action establishes that, under Section 716 of the Twenty-First Century Communications and Video Accessibility Act of 2010 (CVAA), video conferencing platforms commonly used for work, school, healthcare, and other purposes fall under the definition of "interoperable video conferencing service."

Under Section 716 of the CVAA, providers of Advanced Communications Services (ACS) and equipment manufacturers are required to make their services and equipment accessible to individuals with disabilities, unless doing so is not achievable. ACS includes interoperable video conferencing services such as Zoom, Microsoft Teams, Google Meet, and BlueJeans. The FCC previously left the interpretation of "interoperable" open, but in this latest report, it adopted the statutory definition without modification, encompassing services that provide real-time video communication to enable users to share information.

In the Notice of Proposed Rulemaking, the FCC seeks public comments on performance objectives for interoperable video conferencing services, including requirements for accurate and synchronous captions, text-to-speech functionality, and effective video connections for sign language interpreters.

The FCC's actions on this item are an important step toward ensuring that people with disabilities have equal access to video conferencing services. The Report & Order will help to make video conferencing more accessible and promote greater inclusion and participation of people with disabilities.

This article was co-written with the help of both ChatGPT and Google Bard as a demonstration of the technology discussed in this article. You can also read along with Aberdeen's President, Matt Cook, in the recording below. But not really: this is Matt's voice, cloned by AI from a short clip of him speaking.

Artificial Intelligence (AI) has revolutionized numerous industries, and its influence on language-related technologies is particularly remarkable. In this blog post, we will explore how AI is transforming closed captioning, language translation, and even the creation of cloned voices. These advancements not only enhance accessibility and inclusion but also have far-reaching implications for communication in an increasingly globalized world.

AI in Closed Captioning

Closed captioning is an essential feature for individuals who are deaf or hard of hearing, enabling them to access audiovisual content. Traditional closed captioning methods rely on human transcriptionists; however, AI-powered speech recognition algorithms have made significant strides in this field.

Using deep learning techniques, AI models can more accurately transcribe spoken words into text, providing real-time closed captioning. The output does not yet meet the FCC guidelines for broadcast, but it is often good enough for other situations where the alternative is to have no closed captions at all. These models continuously improve their accuracy by analyzing large amounts of data and learning from diverse sources. As a result, AI has made closed captioning more accessible, enabling individuals to enjoy online videos with greater ease.

Our team is working hard to develop and launch AberScribe, our new AI transcript application powered by OpenAI, sometime in mid-2024. From any audio/video source file, the AberScribe app will create an AI-generated transcript that can be edited in our online transcript editor and exported into various caption formats. AberScribe will also have added features for creating other AI-generated resources from that final transcript, such as summaries, glossaries of terms, discussion questions, interactive worksheets, and many more. The possibilities are endless.

Sign up to join the waitlist and be one of our first users: https://aberdeen.io/aberscribe-wait-list/

AI-Driven Language Translation

Language barriers have long hindered effective communication between people from different linguistic backgrounds. However, AI-powered language translation has emerged as a game-changer, enabling real-time multilingual conversations and seamless understanding across different languages.

Machine Translation (MT) models, powered by AI, have made significant strides in accurately translating text from one language to another. By training on vast amounts of multilingual data, these models can understand and generate human-like translations, accounting for context and idiomatic expressions. This has empowered businesses, travelers, and individuals to engage in cross-cultural communication effortlessly.

In addition to written translation, AI is making headway in spoken language translation as well. By combining speech recognition with technologies like neural machine translation (NMT) and speech synthesis, AI systems can listen to spoken language, translate it in real time, and produce synthesized speech in the desired language. This breakthrough holds immense potential for international conferences, tourism, and fostering cultural exchange.

Cloned Voices and AI

The advent of AI has brought about significant advancements in speech synthesis, allowing for the creation of cloned voices that mimic the speech patterns and vocal identity of individuals. While cloned voices have sparked debates regarding ethical use, they also present exciting possibilities for personalization and accessibility.

AI-powered text-to-speech (TTS) models can analyze recorded speech data from an individual, capturing their vocal characteristics, intonations, and nuances. This data is then used to generate synthetic speech that sounds remarkably like the original speaker. This technology can be immensely beneficial for individuals with speech impairments, providing them with a voice that better aligns with their identity.

Moreover, cloned voices have applications in industries like entertainment and marketing, where celebrity voices can be replicated for endorsements or immersive experiences. However, it is crucial to navigate the ethical considerations surrounding consent and proper usage to ensure that this technology is used responsibly.

Conclusion

Artificial Intelligence continues to redefine the boundaries of accessibility, communication, and personalization in various domains. In the realms of closed captioning, language translation, and cloned voices, AI has made significant strides, bridging gaps and enhancing user experiences. As these technologies continue to evolve, it is vital to strike a balance between innovation and ethical considerations, ensuring that AI is harnessed responsibly to benefit individuals and society as a whole.

Photo of a hand on a remote scrolling through a video library

Been tasked with figuring out how to implement closed captions in your video library? The process can be overwhelming at first. While evaluating closed captioning vendors, it’s good to understand the benefits of captioning, who your audience is, what to consider when it comes to quality, and what to expect from a vendor.

There are several things that an organization should consider and evaluate before choosing a closed captioning vendor. Some of the most important factors include:

Benefits of Closed Captioning

Overall, closed captioning is a valuable tool that can benefit a wide range of audiences. It makes videos more accessible, engaging, and comprehensible for everyone.

Evaluating Vendors

By considering these factors, organizations can choose a closed captioning vendor that will meet their needs and provide a high-quality service:

What to Expect in the Process

Use these tips when evaluating closed captioning vendors and you’ll ensure that your videos are accessible to everyone and that they provide a positive experience for all viewers.

In 2022, just days before winning the primary to become the Democratic candidate for the Senate in Pennsylvania, John Fetterman suffered a stroke. Like many stroke victims, he experienced a loss of function that persisted long after his recovery, including lingering auditory processing issues that made it challenging for him to understand spoken words. In interviews in the months that followed, John Fetterman relied on closed-captioning technology to help him comprehend reporters' questions and to assist in his debate against his general-election opponent, Dr. Mehmet Oz.

After he was elected to serve in the US Senate, closed-captioning devices were installed both at his desk and at the front of the Senate chamber to facilitate his understanding of his colleagues as they spoke on the Senate floor. John Fetterman serves on several committees, including the Committee on Agriculture, Nutrition, and Forestry; the Committee on Banking, Housing, and Urban Affairs; the Committee on Environment and Public Works; the Joint Economic Committee; and the Special Committee on Aging. Closed-captioning has proven invaluable, benefiting both John Fetterman and his constituents in Pennsylvania, and its utility extends well beyond enabling him to watch TV at night or understand reporters.

With the assistance of closed-captioning technology, John Fetterman has been able to serve the people of Pennsylvania at the highest levels of government. During a hearing with the Senate Special Committee on Aging, Fetterman himself expressed gratitude for the transcription technology on his phone, stating, "This is a transcription service that allows me to fully participate in this meeting and engage in conversations with my children and interact with my staff." He later added, "I can't imagine if I didn't have this kind of bridge to allow me to communicate effectively with other people."

Captioning and transcription efforts extend well beyond being a mere requirement for broadcasting a program. As captioning technology continues to advance, an increasing number of individuals, like John Fetterman, will have the opportunity to participate in public life, even at the highest levels of government. They will serve others, even as transcription and captioning technology serves them.

Take a look at his setup in action here. Dedicated monitors with real-time captions displayed are becoming an increasingly popular setup at live events. Alternatively, explore the convenience of live captioning on mobile phones, making captions accessible from any seat in the venue. Either option is easily achievable — contact one of our experts to find out more.

Photo of a conference call on Zoom

On October 11, 2022, the Federal Communications Commission (FCC) released the latest CVAA biennial report to Congress, evaluating the current industry compliance as it pertains to Sections 255, 716, and 718 of the Communications Act of 1934. The biennial report is required by the 21st Century Communications and Video Accessibility Act (CVAA), which amended the Communications Act of 1934 to include updated requirements for ensuring the accessibility of "modern" telecommunications to people with disabilities.

FCC rules under Section 255 of the Communications Act require telecommunications equipment manufacturers and service providers to make their products and services accessible to people with disabilities. If such access is not readily achievable, manufacturers and service providers must make their devices and services compatible with third-party applications, peripheral devices, software, hardware, or consumer premises equipment commonly used by people with disabilities.

Accessibility Barriers

Despite major design improvements over the past two years, the report reveals that accessibility gaps still persist and that industry commenters are most concerned about equal access on video conferencing platforms. The COVID-19 pandemic has highlighted the importance of accessible video conferencing services for people with disabilities.

Zoom, BlueJeans, FaceTime, and Microsoft Teams have introduced a variety of accessibility feature enhancements, including screenreader support, customizable chat features, multi-pinning features, and “spotlighting” so that all participants know who is speaking. However, commenters have expressed concern over screen share and chat feature compatibility with screenreaders, along with the platforms’ synchronous automatic captioning features.

Although many video conferencing platforms now offer meeting organizers synchronous automatic captioning to accommodate deaf and hard-of-hearing participants, the Deaf and Hard of Hearing Consumer Advocacy Organizations (DHH CAO) pointed out that automated captioning sometimes produces incomplete or delayed transcriptions. Even where slight delays in live captions cannot be avoided, those delays may cause “cognitive overload.” Comprehension can be further hindered if a person who is deaf or hard of hearing cannot see the faces of speaking participants, for “people with hearing loss rely more on nonverbal information than their peers, and if a person misses a visual cue, they may fall behind in the conversation.”

Automated vs. Human-generated Captions

At present, the automated captioning features on these conference platforms have an error rate of 5-10%. That’s 5-10 errors per 100 words spoken. With the average English speaker conversing at about 150 words per minute, you’re looking at roughly 8 to 15 errors a minute.
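
The per-minute figure follows directly from the two numbers quoted above. As a rough sketch (the function name is ours, and real error rates vary with audio quality and speaker):

```python
def errors_per_minute(error_rate: float, words_per_minute: float) -> float:
    """Expected number of misrecognized words per minute of speech."""
    return error_rate * words_per_minute


WPM = 150  # average conversational rate cited in the text

low = errors_per_minute(0.05, WPM)   # 5% error rate
high = errors_per_minute(0.10, WPM)  # 10% error rate
print(f"{low:.1f} to {high:.1f} errors per minute")
```

Over a one-hour meeting, the same arithmetic works out to several hundred errors.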

Earlier this year, our team put Adobe’s artificial intelligence (AI) powered speech-to-text engine to the test. We tasked our most experienced Caption Editor with using Adobe’s auto-generated transcript to create & edit the captions to meet the quality standards of the FCC and the deaf and hard of hearing community on two types of video clips: a single-speaker program and one with multiple speakers.

How did it go? Take a look: Human-generated Captions vs. Adobe Speech-to-text

Delivering content to broadcast outlets is a critical final step in Aberdeen Broadcast Services' AberFast Transcoding & Station Delivery service. However, this process is more than just a digital file delivery service; it encompasses a comprehensive preparation of digital files before their final delivery.

On September 15, 2022, Matt Cook, the President of Aberdeen Broadcast Services, hosted an informative 30-minute webinar. This event provided an exclusive look into the AberFast service, offering a detailed walkthrough of the entire process, from the initial upload of content to its final broadcast. Matt Cook delved deep into various aspects of the service, including meticulous audio and video quality control, correction methods, and the intricacies of inserting graphics. He also explained the complexities involved in standards and resolution conversions, the nuances of transcoding, and the various methods of delivery.

Furthermore, the webinar introduced attendees to the station and client portals that Aberdeen offers. These portals are designed to give clients and stations real-time updates on their projects, ensuring transparency and efficiency in the process.

Download the Video, Transcript, and Slides

In 2021 alone, AberFast successfully delivered over 61,000 digital files, establishing itself as a reliable provider of broadcast-ready video files to broadcasting outlets globally. This webinar revealed the intricate processes and attention to detail that enable Aberdeen to deliver such high-quality, broadcast-ready content consistently.

Open captions and closed captions are both used to provide text-based representations of spoken dialogue or audio content in videos, but they differ in their visibility and accessibility options.

Here's the difference between closed and open captions:

Open Captions

Closed Captions

| Feature | Open Captions | Closed Captions |
| --- | --- | --- |
| Visibility | Permanently embedded in the video | Separate text track that can be turned on or off |
| Accessibility | Cannot be turned off | Can be turned on or off by the viewer |
| Applications | Wide audiences, noisy environments | Diverse audiences, compliance with accessibility regulations |
| Creation | Added during video production | Generated in real-time or embedded manually during post-production or uploaded as a sidecar file |

Both open and closed captions serve the purpose of making videos accessible to individuals who are deaf or hard of hearing, those who are learning a new language, or those who prefer to read the text alongside the audio.

The choice between open or closed captions depends on the specific requirements and preferences of the content creators and the target audience.
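
To make the idea of a closed-caption sidecar file concrete, here is a minimal sketch that builds the text of an SRT file, one common sidecar format. The cue text and timings are invented for illustration, and the helper names are ours rather than part of any captioning tool.

```python
from datetime import timedelta


def srt_timestamp(t: timedelta) -> str:
    """Format a timedelta as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = t // timedelta(milliseconds=1)  # whole milliseconds as an int
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{ms:03}"


def to_srt(cues) -> str:
    """Render (start, end, text) cues as numbered SRT blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)  # a blank line separates consecutive cues


# Hypothetical cues, for illustration only.
cues = [
    (timedelta(seconds=0), timedelta(seconds=2, milliseconds=500),
     "Welcome to the program."),
    (timedelta(seconds=2, milliseconds=500), timedelta(seconds=5),
     "Captions can be switched\non or off by the viewer."),
]
print(to_srt(cues))
```

Saved alongside the video with an .srt extension, text like this stays separate from the picture, which is what lets the viewer turn the captions on or off.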

In the July ‘21 release of Premiere Pro, Adobe introduced its artificial intelligence (AI) powered speech-to-text engine to help creators make their content more accessible to their audiences. Their extensive toolset allows their users to edit, stylize, and export captions in all supported formats straight out of the sequence timeline of a Premiere Pro project. A 3-step process of auto-transcribing, generating, and stylizing captions all within the platform already familiar to its users delivers a seamless experience from beginning to end. But how is the accuracy of the final product?

Today, AI captions, at their best, have an error rate of 5-10% (that is, 90-95% accuracy), much improved over the 80% accuracy we saw just a few years ago. High accuracy is crucial for the deaf and hard-of-hearing audience, as each error adds to the possibility of confusing the message. To protect all audiences that rely on captioning to understand programming on television, the Federal Communications Commission (FCC) set a detailed list of quality standards back in 2015 that all captions must meet to be acceptable for broadcast. Preceding those standards, the Described and Captioned Media Program (DCMP) published its Captioning Key manual over 20 years ago, and it has since been a valuable reference for captioning both entertainment and educational media targeted to audiences of all age groups. Simply having captions present on your content isn’t enough; they need to be accurate and best replicate the experience for all audiences.
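
This post does not say how the error rate was measured. One widely used metric is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the machine transcript into the reference, divided by the number of reference words. The sketch below is a minimal version that assumes simple whitespace tokenization and lowercase matching; it ignores punctuation handling, speaker IDs, and timing, all of which matter for caption quality. The sample sentences are invented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between ref[:i-1] and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Invented example: one wrong word out of four.
print(word_error_rate("the game starts tonight", "the game start tonight"))
```

An error rate of 5-10% corresponds to a result between 0.05 and 0.10 from a function like this.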

Adobe’s speech-to-text engine has been one of the most impressive that our team has seen to date, so we decided to take a deeper look at it and run some tests. We tasked our most experienced Caption Editor with using Adobe’s auto-generated transcript to create & edit the captions to meet the quality standards of the FCC and the deaf and hard of hearing community on two types of video clips: a single-speaker program and one with multiple speakers. Our editor used our Pop-on Plus+ caption product for these examples, which is our middle-tier quality level: captions that fulfill all quality standard requirements but are not always 100% free of errors.

Did using Adobe’s speech-to-text save time, or did it create more work in the editing process than needed? Here’s how it went…

In-depth comparison documents that evaluate the captions cell-by-cell are available for download here:

Single Speaker Clip

In this example, we used the perfect scenario for AI: clear audio, a single speaker at an optimal words-per-minute (WPM) speaking rate, and no sound effects or music.

The captions contained the following issues that would need to be corrected by a Caption Editor:

Here’s the clip with Adobe’s speech-to-text captions overlaid on the top half of the video, and ours on the bottom half.

Multiple Speaker Clip

For the next clip, we went with a more realistic example of television programming where there are multiple speakers, an area where AI is known to struggle and has difficulties identifying the speakers. This clip also features someone with a pronounced accent, commentators speaking over one another, and proper names of athletes – all of which our editors take the time to research and understand.

The same errors detailed in the single-speaker example are present throughout, among the other difficulties we expected it to have. In fact, there were so many errors that our editor was unable to use the transcript from Adobe and started from the beginning using our own workflow.

Here’s a sample of the first 9 cells of captions, with what Adobe transcribed in the first column, notes from our Caption Editor in the second, and how it should look in the third. A slash marks a line break within a caption cell.

| Adobe’s Automated SRT Caption File | Issue | Formatted by Aberdeen |
| --- | --- | --- |
| something / you are never seen in your life, correct? | No speaker ID. | (Pedro Martinez) / It's something you have / never seen in your life, |
| | “Correct” is spoken by new speaker. | (Matt Vasgersian) / Correct! |
| So it's. | Missing text. | So it's--so it's MVP / of the year! |
| So we're all watching something / different. OK | | (Pedro) / We're all watching / something different. |
| He gets the MVP. | | Okay, he gets the MVP. |
| I'd be better off. | Completely misunderstood music lyrics. | ♪ Happy birthday to you ♪ |
| Oh, you, you guys. | | (Matt) / You guys. |
| Let me up here to dove into the opening / night against the Hall of Fame. | Merged multiple sentences together. | Just left me up here to die. |
| | | You left me up here to die / against the hall of famer. |

Take a look at the clip. Again, with Adobe's speech-to-text on the top and Aberdeen on the bottom.

The Verdict

Overall, the quality of the auto-generated captions exceeded expectations, and we found them to be in the top tier of speech-recognition engines available. The timing and punctuation were particularly impressive. However, when doing a true comparison to the captioning work that we would consider acceptable, AI does not meet Aberdeen’s broadcast quality standard.

Aberdeen's post-production Caption Editors are detail-oriented and grammar savvy and always strive to portray every element of the program with 100% accuracy so that the viewer misses nothing. Our most experienced Caption Editor needed five times the running time of the single-speaker clip to edit and correct it; that is, for every minute of video, it took 5 minutes to clean up the transcript and captions. Assuming your team is educated in the proper timing of caption cells, line breaks, and grammar, a 30-minute program may take 2.5 hours or more to bring up to standards when the transcript is usable. In the second example, the transcript was unusable and would have taken more time to clean up than it did to transcribe from scratch. Double that timeline now.
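
The 5:1 ratio makes it easy to estimate editing time for your own programs. The sketch below treats the ratio as a constant, which is an assumption: it came from a single clean, single-speaker clip, and harder content will take longer. The function name is ours.

```python
def editing_minutes(program_minutes: float, edit_ratio: float = 5.0) -> float:
    """Estimated minutes needed to clean up an auto-generated transcript."""
    return program_minutes * edit_ratio


minutes = editing_minutes(30)  # a 30-minute program at the 5:1 ratio
print(f"{minutes:.0f} minutes, or {minutes / 60:.1f} hours")
```

Passing a higher ratio, for example `editing_minutes(30, edit_ratio=10)`, models the doubled timeline described for the multiple-speaker clip.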

Consider all of the above when using this service. Do you have the time and resources to train your staff to know how to edit auto-generated captions and get them up to the appropriate standards? How challenging may your content be for the AI? Whenever and however you make the choice, make sure you deliver the best possible experience to your entire audience.

This article is current as of February 4th, 2022.

A few months ago, Zoom announced that auto-generated captions (also known as live transcription) were now available for all Zoom meeting accounts. The development has been a long-awaited feature for the deaf and hard-of-hearing community.

As popular and ubiquitous as Zoom has become, it can be overwhelming to understand its multiple meeting features and options, especially with regard to closed captioning. Here at Aberdeen Broadcast Services, we offer live captioning services with our team of highly trained, experienced captioners with the largest known dictionaries in the industry. CART (Communication Access Realtime Translation) captioning is still considered the gold standard of captioning (See related post: FranklinCovey Recognizes the Human Advantage in Captioning). Our team at Aberdeen strives to go above and beyond expectations with exceptional captioning and customer service.

Whether you choose to enable Zoom’s artificial intelligence (AI) transcription feature or integrate a 3rd-party service, like Aberdeen Broadcast Services, the following steps will help ensure you’re properly setting up your event for success.

Step 1: Adjust Your Zoom Account Settings

To get started, you'll need to enable closed captioning in your Zoom account settings.

Scroll down to the “Closed captioning” options.

In the top right, enable closed captions by toggling the button from grey to blue to “Allow host to type closed captions or assign a participant/3rd-party service to add closed captions.”

Below is a detailed description of the additional three closed captioning options here in the settings...

Allow use of caption API Token to integrate with 3rd-party Closed Captioning services

This feature enables a 3rd-party closed captioning service, such as Aberdeen Broadcast Services, to caption your Zoom meeting or webinar using captioning software. The captions from a 3rd-party service are integrated into the Zoom meeting via a caption URL or API token that sends its captions to Zoom. For a 3rd-party service such as Aberdeen to provide captions within Zoom, this feature must be enabled.
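
For readers curious what "sends its captions to Zoom" looks like in practice, here is a rough sketch of how captioning software might prepare one caption for delivery. It assumes the caption URL accepts an HTTP POST with a plain-text body plus `seq` (a running counter) and `lang` query parameters; treat those details as assumptions and check Zoom's current third-party captioning documentation before relying on them. The URL and function name below are hypothetical, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_caption_request(caption_url: str, text: str, seq: int,
                          lang: str = "en-US") -> Request:
    """Prepare (but do not send) a POST carrying one line of caption text.

    Assumption: the endpoint expects `seq` and `lang` appended to the
    caption URL and the caption itself as a text/plain body.
    """
    separator = "&" if "?" in caption_url else "?"
    url = f"{caption_url}{separator}{urlencode({'seq': seq, 'lang': lang})}"
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


# Hypothetical caption URL, for illustration only.
request = build_caption_request(
    "https://example.com/closedcaption?id=123&signature=abc",
    "Welcome, everyone.",
    seq=1,
)
print(request.get_method(), request.full_url)
```

In a real integration, the captioner's software handles this for you; the only thing the host needs to supply is the caption URL (API token).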

Allow live transcription service to transcribe meeting automatically

As mentioned earlier in this post, auto-generated captions or AI captions became available to all Zoom users in October 2021. Zoom refers to auto-generated captions as its live transcription feature, which is powered by automatic speech recognition (ASR) and artificial intelligence (AI) technology. While not as accurate as a live captioner, ASR is an acceptable way to provide live captions for your Zoom event if you are not able to secure one. If you will be having a live captioner through a 3rd-party service in your meeting, do NOT check “Allow live transcription service to transcribe meeting automatically.”

Unless you expect to use Zoom’s AI live transcription for most of your meetings, it is best to uncheck or disable live transcription as Zoom’s AI auto-generated captions will override 3rd-party captions in a meeting if live transcription is enabled.

Allow viewing of full transcript in the in-meeting side panel

This setting gives the audience an additional option to view what is being transcribed during your Zoom meeting or webinar. In addition to viewing captions as subtitles at the bottom of the screen, users will be able to view the full transcript on the right side of the meeting.

Check or enable this feature to provide additional options for accessibility.

Save Captions

The meeting organizer or host can control who has permission to save a full transcript of the closed captions during a meeting. Enabling the Save Captions feature grants that permission to every participant in the meeting.

Transcript options from 3rd-party services may vary. At Aberdeen Broadcast Services, we provide full transcripts in a variety of formats to fit your live event or post-production needs. For more information, please see our list of captioning exports or contact us.

Step 2: Start your meeting and obtain the caption URL (API token)

Once the webinar or meeting is live, the individual assigned as the meeting host can acquire the caption URL or API token.

As the host, copy the API token by clicking on the Closed Caption or Live Transcript button on the bottom of the screen and selecting Copy the API token, which will save the token to your clipboard.

By copying the API token, you will not need to assign yourself or a participant to type. Send the API token to your closed captioning provider to integrate captions from a 3rd-party service into your Zoom meeting. We ask that clients provide the API token at least 20 minutes, and no more than 24 hours, before an event to avoid any captioning issues.
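
If you schedule events programmatically, the handoff window above is simple to check. This is a minimal sketch; the function and constant names are ours, and the example times are invented.

```python
from datetime import datetime, timedelta

EARLIEST_HANDOFF = timedelta(hours=24)   # no earlier than 24 hours before the event
LATEST_HANDOFF = timedelta(minutes=20)   # at least 20 minutes before the event


def token_handoff_ok(event_start: datetime, token_sent: datetime) -> bool:
    """True if the token was sent inside the requested window."""
    lead_time = event_start - token_sent
    return LATEST_HANDOFF <= lead_time <= EARLIEST_HANDOFF


# Invented example: event at 2:00 PM, token sent at 1:15 PM the same day.
event = datetime(2022, 2, 4, 14, 0)
print(token_handoff_ok(event, datetime(2022, 2, 4, 13, 15)))
```

A token sent 10 minutes before the start, or two days ahead, would fail the check.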

Step 3: Test Your Captions and Go!

Once the API token has been activated within your captioning service, the captioner will be able to test captions from their captioning software.

A notification in the Zoom meeting should pop up at the top saying “Live Transcription (Closed Captioning) has been enabled.” and the Live Transcript or Closed Caption button at the bottom of the screen will appear for the audience. Viewers can now choose Show Subtitle to view the captions.

Viewers will be able to adjust the size of captions by clicking on Subtitle Settings...

Can you also caption breakout rooms in Zoom?

Yes! Captioning multiple breakout rooms occurring at the same time is possible using the caption API token to integrate with a 3rd-party service, such as Aberdeen Broadcast Services. Zoom's AI live transcription option is currently not supported in multiple Zoom breakout rooms, which is why it is important to consult with live captioning experts to make that happen. Contact us to learn more about how it works.

Enjoy this post? Email sales@abercap.com for more information or feedback. We look forward to hearing your thoughts!