VIDEO TO VOICE is looking to level the playing field for the blind and visually impaired. The German firm has created "Frazier", a modern editor that uses text to speech technology. Here's why this ground-breaking software is in great demand...
We are watching more videos than ever nowadays. Or should that be reading videos? Whether it’s scrolling through social media or catching up on our favourite Netflix series, we now mostly follow the subtitles, particularly when we need to keep the sound off.
We’re so accustomed to seeing subtitles, it is easy to forget why they were introduced in the first place beyond simple convenience or understanding foreign-language films. The technology was originally promoted to ensure audio-visual media is accessible to deaf and hearing-impaired audiences. For example, all English programming in the United States now requires closed captions while 80% of the UK’s television content is subtitled. A big win for the millions of people with hearing loss who understand English, but what about those who cannot see?
Unfortunately, blind and visually impaired people don't enjoy the same levels of access to audio-visual content.
[Image: Pass the remote: Information at your fingertips, but not for all]
In order to follow on-screen action, low-vision viewers require a voiceover that narrates what is happening during natural pauses in the dialogue. This is known as an audio description. While audio description is a highly useful service, only 4% to 11% of productions in Europe are broadcast with one.
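To make the idea of "natural pauses in the dialogue" concrete, here is a minimal Python sketch that scans a list of dialogue timings and returns the gaps long enough to hold a line of narration. It is an illustration only, not a description of how Frazier or any other tool works: the `Cue` class, the `find_pauses` function, and the two-second `min_gap` threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    """One stretch of spoken dialogue, in seconds (hypothetical structure)."""
    start: float
    end: float


def find_pauses(cues, programme_end, min_gap=2.0):
    """Return (start, end) gaps of at least min_gap seconds with no dialogue.

    min_gap is an assumed threshold for this sketch, not an industry standard.
    """
    pauses = []
    cursor = 0.0  # end of the latest dialogue seen so far
    for cue in sorted(cues, key=lambda c: c.start):
        if cue.start - cursor >= min_gap:
            pauses.append((cursor, cue.start))
        # max() keeps the cursor correct when cues overlap
        cursor = max(cursor, cue.end)
    if programme_end - cursor >= min_gap:
        pauses.append((cursor, programme_end))
    return pauses


dialogue = [Cue(0.0, 4.0), Cue(4.5, 9.0), Cue(15.0, 20.0)]
print(find_pauses(dialogue, programme_end=30.0))
# prints [(9.0, 15.0), (20.0, 30.0)]
```

In this made-up example, the half-second gap between the first two lines is too short to use, so a describer would have the six seconds from 9.0 to 15.0 and the final ten seconds to work with.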
In light of these grim statistics, Christian David and Lukas Pajonczek are looking to redress the balance. Seeing massive potential in a long-neglected market, the Berlin-based techies set up VIDEO TO VOICE with the aim of making audio description as widely available as subtitles through text to speech technology. In their view, there are several reasons why the film and TV industry has shied away from audio description until now.
First, production can be a costly undertaking, since the process requires the involvement of manuscript writers, project managers, voice talent, and sound engineers. Human labour aside, recording studio and computer rental adds to the costs. So if a production company has to stick to a limited budget, providing an audio description is usually seen as a low priority.
Next, the traditional audio description process can be time-consuming. From manuscript writing to final export, all production stages require detailed planning and coordination. In particular, audio editing can be extremely time-intensive when dealing with longer content – UK communications regulator Ofcom estimates that a two-hour film can take up to 60 hours to prepare.
Finally, EU regulations have done little to improve accessibility for the blind and visually impaired. Soft legal wording meant service providers in member states weren’t compelled to provide an audio description for their productions. However, changes that came into effect on 19 September 2020 strengthened the wording of the relevant article, requiring production companies to increase their audio description output.
In 2017, VIDEO TO VOICE set about designing software that would resolve these issues in a single user-friendly solution.
[Image: Easy navigation: What the English version of Frazier looks like on your computer screen with all information clearly presented]
From the outset, the German tech company was committed to collaborating with academic institutions such as the University of Hildesheim throughout development. By the end of the year, they had created the first prototype of “Frazier”, a modern editor that uses text to speech audio description technology.
Text to speech technology converts written text into a synthesised spoken voice. First developed in the 1970s with the Kurzweil Reading Machine, text to speech has recently benefited from improved voice quality and wider availability thanks to advancements in artificial intelligence. Frazier itself provides a highly impressive range of 300 voices in 40 languages – a larger selection than that provided by Google Cloud. This ensures the audio description creates the right mood for the production and fits the audience’s needs.
Most importantly for media service providers, text to speech audio description technology slashes production costs. For one, Frazier makes expensive recording studios and talent fees a thing of the past, since a manuscript author can work on an entire audio description from home on their own computer.
Frazier also brings about shorter turnaround times by skipping tedious steps such as voice recording and audio editing. Even so, existing manuscripts created in a traditional audio description workflow don’t go to waste, as they can be imported into the tool. Through Frazier's user-friendly navigation, the user can create an audio description that’s ready to broadcast in a few simple steps.
Finally, Frazier simplifies translation and localisation processes. While English currently leads the way in the world of audio description, this shouldn't mean other languages get left behind. Therefore, VIDEO TO VOICE has integrated DeepL machine translation technology into its software to break down language barriers.
[Image: Polyglot: English is currently the market leader for audio description, but other languages will need to catch up]
As EU law will soon require continuous improvement in accessibility, text to speech technology provides the best solution for getting media service providers up to speed. In Frazier, VIDEO TO VOICE has developed an award-winning editor for creating barrier-free videos in an efficient and cost-effective way.
With 30 million blind and visually impaired people living in Europe today, Frazier also opens the gateway to an untapped consumer market that could pay huge dividends. It is easy to forget that people who fall into this category enjoy watching TV and going to the movies, just like anyone else. And as the film industry faces billion-dollar losses due to the coronavirus pandemic, continuing to ignore such a large potential customer base could be a costly mistake.
VIDEO TO VOICE's goal of bringing audio description's availability in line with that of subtitling is certainly an ambitious one. Yet with innovative text to speech audio description technology such as Frazier, media service providers now have the perfect platform for boosting media accessibility for the blind and visually impaired.