Podcast transcripts: who's in control?
A transcript - a text-version of your podcast’s audio - is a good, useful thing.
There are a number of reasons why podcasters would produce transcripts of each show. Some podcasters do it as a service to their listeners.
“I’ve always resisted transcripts on the basis that listeners might stop listening when, as a podcaster, you really want people to hear and appreciate your audio,” said Naomi Fowler, who produces The Taxcast, a podcast focusing on tax justice and financial transparency. “But I asked my listeners if they’d find it useful via the monthly email I send them, and many replied and said yes. I can see from my email analytics that a sizable number of them do click on it.”
There are benefits, she says, in doing so. “They might reference my podcast in their own work, and having everything written down helps them do that more easily. It also helps them retain facts which they’ll find useful at a later stage: I do pack a lot in each month.”
Particularly for podcasts in different languages, transcripts help. Lory Martinez, the founder of Parisian podcast producer Studio Ochenta, explains: “We do multilingual podcasts and we found that the transcripts are super important in listener experiences. Every Ochenta original show has transcripts for both SEO and accessibility for listeners. Many listeners of foreign language versions of our shows use them as a guide to listen along to.”
Transcripts make podcasts available to a potentially wider audience, as Kevin Finn from Buzzsprout told us. “At Buzzsprout we believe that all podcasts should provide transcripts. Making podcast content accessible to those with a hearing impairment is simply the right thing to do.” Buzzsprout has been ahead of other podcast hosts, providing its podcasters with transcription tools for almost three years.
But, as Lory Martinez said, the SEO benefits are potentially great for podcasters.
“Transcripts make podcasts indexed and found in search engines more easily,” says Louis-Guillaume Kan-Lacas, the founder of Choses à Savoir, a podcast network in France with more than 3m listens every month. “We have been offering transcripts for all of our podcasts on www.chosesasavoir.com since the launch of our network 5 years ago. For sure it is an extra workload. No doubt about that. But it’s definitely worth it.”
How well do transcripts work for search engines? Raz Kaplan, the Marketing Manager of Audioburst, tells us that they’re working very well. “We’ve recently added visible automated transcripts into our audio search platform in order to test that exact issue. After 2 months we can say with full confidence that we’ve received 80% more visitors from Google during our test period. Therefore, transcripts are definitely a must-have. For podcasters who own a website, we recommend they edit and add the transcripts to their own website as blog posts.”
Transcripts can be produced by automation, with tools available from Amazon, Otter and others.
Listen Notes, a podcast search engine, used to offer transcripts for purchase. They stopped providing this service in May. Wenbin Fang, the company’s founder and CEO, tells us that it was a money-losing business. “Considering the cost for the speech to text api, server, and customer support (human time), the demand was not strong enough to justify the existence of the service.”
The automated services are okay, but not perfect. Here’s a short script, taken from our podcast of May 5 2020:
The latest from our newsletter at Podnews.net
[[IRA GLASS CLIP: “This American Life is delivered to public radio stations by PRX”]]
…and it’s won the first ever Pulitzer Prize for Audio Reporting, a journalism award founded in 1917. The award was for their episode The Out Crowd, with Molly O’Toole of the Los Angeles Times and Emily Green, a freelancer with Vice News. The judges described it as “revelatory, intimate journalism”. The other finalists were Ear Hustle and NPR’s White Lies.
And here’s what an automated service posted, on a third-party website:
The latest from our newsletter and Pod News Dot net maximize radio stations by PR ads. And it’s one the first ever Pulitzer Prize for reporting a journalism award founded in one thousand nine seventeen. The award was for their episode. The out crowd with Malia tool if the Los Angeles Times and Emily Green a freelance with vice news the judges described it as revelatory intimate journalism. The finalists were ear hustle and NPR’s white lies.
Apple uses automated speech-to-text technology like this for searching; Google has experimented with it, too; and Spotify’s new terms and conditions in May 2020 also give it the right to transcribe your podcast in this way. There are clear benefits: a search for ‘journalism award’ would have found the audio above, even if we didn’t use that phrase in our episode notes. However, reading the transcript wouldn’t have told you who won the prize.
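To make that trade-off concrete, here is a minimal sketch in Python of a phrase search over the automated transcript quoted above. The function names (`normalise`, `contains_phrase`) are ours, purely for illustration, and this is not how Apple, Google or Spotify actually index audio; it only shows why a flawed transcript can still be searchable while getting names wrong.

```python
import re

# The automated transcript quoted above, copied as posted (errors included).
AUTOMATED = (
    "The latest from our newsletter and Pod News Dot net maximize radio "
    "stations by PR ads. And it's one the first ever Pulitzer Prize for "
    "reporting a journalism award founded in one thousand nine seventeen. "
    "The award was for their episode. The out crowd with Malia tool if the "
    "Los Angeles Times and Emily Green a freelance with vice news the judges "
    "described it as revelatory intimate journalism. The finalists were ear "
    "hustle and NPR's white lies."
)


def normalise(text: str) -> str:
    """Lower-case the text and reduce it to words separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def contains_phrase(transcript: str, phrase: str) -> bool:
    """Return True if the phrase appears as whole words in the transcript."""
    # Padding with spaces stops a phrase matching part of a longer word.
    return f" {normalise(phrase)} " in f" {normalise(transcript)} "
```

With this sketch, `contains_phrase(AUTOMATED, "journalism award")` is true, so the episode is findable; but `contains_phrase(AUTOMATED, "This American Life")` and `contains_phrase(AUTOMATED, "Molly O'Toole")` are both false, because the winner's name and the reporter's name never made it into the text.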
None of Apple, Google or Spotify publishes the result, but automated transcriptions are posted on some websites. Many podcasters welcome them. “I think that adding something like that would broaden the search results, and has to be good, right?” Forrest Kelly, the producer of The Best 5 Minute Wine Podcast, tells us.
However, automated transcripts are mostly correct but partially wrong, and in some cases they’re not suitable to publish on a website. For complex legal reporting, an incorrect transcript on a website could expose the podcaster to legal trouble.
“Anything that is a made-up name or company name is likely to be misinterpreted by the bots: and could lead to confusion,” says Cody Boyce, the founder and CEO of Crate Media.
Listen Notes’s Wenbin Fang noted that not everyone was happy with his transcription service. “Some podcasters were very angry,” he told us. “They thought we were selling their contents via transcripts - though it was a one time fee we had to pay a 3rd party speech to text API”.
Some podcasters offer transcripts as a Patreon benefit; others send them on request if a listener emails and asks. So, for some producers, transcripts have a value of their own, and automated transcripts made available elsewhere raise concerns about IP and copyright.
“I understand the desirability of transcripts and how helpful they are, especially for accessibility for news or interview podcasts. However, when it’s original work, and someone else is providing them without approval, it gets a bit murky,” says producer Anna Priestland, who wrote and produced Letters of Love in WW2.
“If you have a narrative podcast, even one with actors or interviews, you normally have a perfect script anyway. But with automated transcripts… it’s like I wrote a play, and someone posted my script online without asking, and got it all badly wrong,” she added.
“We know automatic transcripts have their flaws,” says Audioburst’s Raz Kaplan. “We’re constantly working to improve our engine’s quality (we’ve just deployed a learning algorithm that updates week-by-week.) To answer that need for improved transcriptions, we offer podcasters the ability to edit and proof transcripts within Audioburst’s Creators interface.”
He added: “We know Google is using non-visible transcripts as a resource in their search algorithms, but unfortunately, non-Google search engines and indexes do need to visibly display the transcript for that content to be correctly indexed by their bots. We don’t think podcasters should be held captive by tech giants like Google and Apple, so we’re working within their rules to best serve the podcasting community.”
The unwritten rules of using podcast RSS feeds are generally understood. Podcasters want people to link to their work, but not to alter that work in any way. Editing or rehosting the audio, or altering the episode notes, will typically anger podcasters, who feel their own creative work is being interfered with.
Some podcasters find automated transcripts a welcome way to help listeners find their shows, but others are asking for a bit more control.
“It would be convenient if the players allowed you to upload your own transcript for your episodes in a particular format, perhaps as part of your RSS feed,” says Crate Media’s Cody Boyce.
Kevin Finn agrees. “Ideally, podcasters would have a way to provide their own transcripts. Buzzsprout is currently working to provide transcription links in RSS feeds, and we’re working with a few partners already”, says Finn.
In an ideal world, podcasters would be able to opt out of automated transcripts entirely, too.
“Adding the capability to not publish transcripts by podcasters who opt-out is an interesting feature that we’ll take up with our Product team, albeit with the realization that we think this is likely to have a negative effect on the podcaster’s SEO ranking. Of course that should be their call,” says Audioburst’s Raz Kaplan.
An RSS extension for transcripts would enable podcasters to point to a transcription file, or to opt out of any published transcript, but this extension would need to be respected by podcast platforms. A standard format for transcripts, too, would need to be agreed upon.
How should we format transcripts? How should we link to them? And how can we stop automated transcripts being available should we wish?
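As one way of picturing an answer, here is a minimal sketch in Python of what such an RSS extension might look like. Everything specific in it is hypothetical: the namespace URI, the `tx:transcript` and `tx:automated` element names, and the helper functions are invented for illustration and are not part of any agreed standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, for illustration only; no such standard is implied.
TX_NS = "https://example.com/ns/transcript"
ET.register_namespace("tx", TX_NS)


def build_item(title, audio_url, transcript_url=None,
               transcript_type="text/plain", allow_automated=True):
    """Build an RSS <item> carrying the podcaster's transcript preferences."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    # length="0" is a placeholder; a real feed gives the file size in bytes.
    ET.SubElement(item, "enclosure",
                  {"url": audio_url, "type": "audio/mpeg", "length": "0"})
    if transcript_url is not None:
        # Point platforms at the podcaster's own, corrected transcript.
        ET.SubElement(item, f"{{{TX_NS}}}transcript",
                      {"url": transcript_url, "type": transcript_type})
    if not allow_automated:
        # Opt out of third-party automated transcripts being published.
        ET.SubElement(item, f"{{{TX_NS}}}automated").text = "no"
    return item


def read_transcript_policy(item):
    """What a platform respecting the extension would read from an <item>."""
    transcript = item.find(f"{{{TX_NS}}}transcript")
    automated = item.find(f"{{{TX_NS}}}automated")
    opted_out = (automated is not None
                 and (automated.text or "").strip().lower() == "no")
    return {
        "transcript_url": transcript.get("url") if transcript is not None else None,
        "allow_automated": not opted_out,
    }
```

A feed item built with `build_item("Episode 1", "https://example.com/ep1.mp3", transcript_url="https://example.com/ep1.txt", allow_automated=False)` would tell a platform where the official transcript lives and ask it not to publish its own. The sketch only works, of course, if platforms choose to call something like `read_transcript_policy` and honour the result.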
Podcasting has historically lacked a cross-industry standards group; any improvements in the industry have been driven by Apple, which set standardised categories and artwork size.
But Apple only standardises what it feels benefits Apple; and the lack of any leadership from an independent best practice group may be inhibiting podcasting’s growth.
In the meantime, transcripts are important; and posting them remains a useful exercise. Producer Anna Priestland is in no doubt.
“The next series I do, I want to support accessibility so I will strongly consider providing transcripts. But if I do, they will come from me, not a third party automated system. They need to be correct.”
James Cridland is the Editor of Podnews, a keynote speaker and consultant. He wrote his first podcast RSS feed in January 2005; and also launched the first live radio streaming app for mobile phones in the same year. He's worked in the audio industry since 1989.