If commodities are most valuable when they’re scarce, then time surely has to be the most prized of all – and there’s little more time-consuming for journalists than the tedious process of transcription.
Throughout this year’s conference season, where a pervasive theme has been the role of technology in the future of journalism, I’ve been ruminating on the thought that if there are tasks in the newsroom that can be automated – thereby freeing up more time for journalists to focus on the aspects of their job that actually require human input – surely there’s a strong argument to let tech do its thing?
Trint, a startup founded by journalist Jeff Kofman, is seeking to do just that by providing a tech-enabled alternative to hours spent playing back recordings from a dictaphone. We were delighted to speak to him to find out a little more about this particular pain point and get his views on the evolving role of tech and AI in journalism more generally.
Jeff, like practically every other journalist on the planet, you’ve spent countless hours over the years transcribing interviews. Was it this experience that inspired Trint?
Oh completely. I was a broadcast journalist on Canadian and American TV for more than 30 years. I figure I have spent thousands of hours of my life transcribing, so I’ve lived this pain point. I always hated the process of transcribing my interviews, news conferences, etc., but it’s what you have to do to find your quotes and your content. Or it was, until we launched Trint.
If you haven’t experienced this kind of workflow, it’s probably hard to imagine that it even exists, but as any journalist will tell you, it’s incredibly time-consuming. When I first started talking to investors, a lot of people looked at me oddly, because I don’t think people outside of journalism understand that part of the process. I’d find people would kind of rumple their brow and say: “You mean you have to type it out manually? In today’s world?” To which I’d reply: “That’s why we’re talking. I think it’s crazy that we have to do that.”
With Trint, a tedious task that you thought you had to do yourself is suddenly something that can be done for you using artificial intelligence and machine learning.
Is the incredulity to do with the fact that those people just assumed that the technology was already in place?
I put it this way: when I began in television news in the 1980s in Toronto (where I’m originally from), I started in a local TV newsroom on manual typewriters. We shot our stories on ¾” analog video with cameras that were so big they required two people to carry them: one held the camera, another held the recording deck. The Associated Press wires came in on scrolls of paper on teletypes that went clickety-clack. I held a little mini-cassette recorder in my hand to record my interviews, and on the way back to the station I would hit play and hit stop and scribble out the quotes or the sound bites. Fast forward 30 years and all of what I’ve described is now ancient history in terms of technology, but that transcription workflow, until Trint, was the same: it was still play, stop, type and rewind; play, stop, type. So I think people just don’t understand that amidst all of the revolutions in technology that we’ve seen in the last generation, that part of the workflow – which is essential – hasn’t really changed.
As you know, until you get your content and get it right, nothing can happen. It becomes a bottleneck, a constraint, a delay and a cost, and all of those things are what Trint is aimed at curing.
So can you give us a brief history of the evolution of Trint?
I had been working in journalism for a long time, during which I’d covered all manner of stories [including the Arab Spring, for which he won an Emmy] and wasn’t so keen on the push towards celebrity and felt like I needed a change, so I decided to take a buyout from my job as ABC News London Correspondent. I wasn’t sure what I was going to do.
By chance I met some developers who had done some very interesting experiments in transcription and saw the potential to collaborate and take that experiment much further. We began working on December 1st 2014 and we spent a couple of weeks contacting journalists and technical managers I knew in newsrooms around North America and Europe and basically said here’s what the transcription workflow is and here’s the challenge.
Those conversations laid the groundwork for Trint, and one newsroom – CNN – told us that they’d love us to solve the problem but also said that they’d never use automated speech-to-text until they knew it wouldn’t burn them. They would only accept transcripts they could trust.
That has been the problem with transcription software, hasn’t it? It isn’t always terribly reliable…
Exactly! That’s why “Transcripts You Can Trust” became the rallying cry at Trint. You can’t be burned with Trint transcripts because we tether the source audio to the text on the screen, so if there’s an error in the text you can hear it and correct it instantly.
To be clear, we don’t claim our transcripts are perfect. It’s more helpful to think of a transcript as a first draft and, providing the audio quality of the recording is reasonably good, it should take away about 75% of the transcription workflow.
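For readers curious how tethering audio to text can work in principle, here is a minimal sketch in Python. It is an illustration only, not Trint’s actual implementation: the `Word` record, the `word_at` and `correct` helpers and the sample timings are all hypothetical. The assumption is that each transcribed word carries the start and end time of the audio it came from, so the text can follow playback and a corrected word stays linked to its audio.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    """One transcribed word plus the span of audio it came from (hypothetical format)."""
    text: str
    start: float  # seconds into the recording
    end: float    # seconds into the recording


def word_at(words: List[Word], position: float) -> Optional[int]:
    """Return the index of the word being spoken at `position` seconds, or None."""
    starts = [w.start for w in words]
    # Find the last word that starts at or before the playback position.
    i = bisect_right(starts, position) - 1
    if i >= 0 and position < words[i].end:
        return i
    return None  # silence, or a position outside the recording


def correct(words: List[Word], index: int, new_text: str) -> float:
    """Replace a mis-heard word but keep its timing; return where to replay from."""
    words[index].text = new_text
    return words[index].start


# Made-up first draft in which the last word was mis-heard.
transcript = [
    Word("transcripts", 0.0, 0.8),
    Word("you", 0.8, 1.0),
    Word("can", 1.0, 1.2),
    Word("trussed", 1.2, 1.7),
]
```

In this sketch, a player would call `word_at(transcript, position)` as the audio plays to highlight the current word, and `correct(transcript, 3, "trust")` to fix the error while keeping the link back to the audio.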
How long did the building and testing take?
We built a prototype quite quickly, in about three months. When we tested it, we found that we had lots of assumptions that were being challenged and we had to rethink the user experience. We were working on it through 2015 and had our first real working beta in about a year. When we launched the open beta in early ’16, it instantly went viral, which was extraordinary. I was hoping we’d get about a thousand people on the system within three months or so, but we got them – I kid you not – in about three hours. If I had any doubt that there was an appetite for a solution, that reassured me we had tapped into a real need.
We launched the product in September 2016, and now we’re building an enterprise product. We’re really focused on helping bigger media organizations integrate it into their workflow.
Are you finding any resistance to incorporating it into workflows or newsrooms?
One of the interesting challenges, in news organizations in particular, is that it’s just been assumed that transcription is part of a reporter’s job, so we’re having to re-educate people about that and essentially ask them to create a new line item in their budget. For some older organizations that’s kind of scary and there has been – predictably – some resistance.
Surely once they see how it frees up journalists’ time that changes, though?
Yes, absolutely. We hear this constantly: journalists themselves are saying that they’re able to do more stories in a week because they’re not locked away in a room transcribing. I think things like Trint have to be validated by users, and that’s a reasonable thing, but it also speaks to genuine effectiveness.
AI, augmented journalism and tech in journalism have been widely discussed this year on the conference circuit. What’s your perspective on that?
Right, I’ve seen that too. People sometimes tease me and say that I’m going to put journalists out of jobs, but I think they’re missing the point. The reality is – as anyone in media knows – that there are fewer reporting jobs than there used to be. My goal is to let people who still have jobs focus on being journalists, not stenographers.
At conferences we give out these little sugar candies that look like pills and call them ‘Transcription Relief’. It’s our little bit of fun, but the underlying point is a serious one: it does ease the workflow pain. AI and machine learning don’t have to be your enemy.
One other thing that occurred to me was that in addition to being a transcription tool, Trint also has an application regarding searchability on the web, and I know this is something you’ve talked about elsewhere. Can you explain a little more about this?
Sure. If I look at my own work at ABC, CBS or CBC, my old stories and raw field interviews sit in video form on a shelf, or in disc form on a shelf, or in digital form on a server somewhere. There’s so much information there, but the reality is that none of that stuff is accessible, because there’s no way to search it easily.
In an interview like this, for example, which is going to be 20, 30 or 40 minutes long, the chances are you’re only going to use a bit of it. Imagine if a few years later you vaguely recall me saying something about Norway. If it’s just on tape, the only way to get that quote would be to listen back to it and, as we’ve said, with the time pressures journalists work with, chances are you’re not going to have time to do that. The concept of audio recordings not being searchable is called Dark Data. With Trint you can simply run a search for the word “Norway” and instantly find it, hear it and decide if you want to use that reference. We like to say: Trint Sheds Light on Dark Data.
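The “Norway” example can be sketched in a few lines of Python. This is a hedged illustration of the general idea rather than a description of how Trint works: the archive layout, the file names and the `normalise`, `build_index` and `search` helpers are all assumptions made for the example. It supposes that every recording already has a transcript with a start time for each word, and builds an index from each word to the places it was spoken.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical archive: recording name -> list of (word, start time in seconds).
Archive = Dict[str, List[Tuple[str, float]]]


def normalise(text: str) -> str:
    """Lower-case a word and strip punctuation so "Norway," matches "norway"."""
    return re.sub(r"\W+", "", text).lower()


def build_index(archive: Archive) -> Dict[str, List[Tuple[str, float]]]:
    """Map each normalised word to every (recording, start time) where it occurs."""
    index = defaultdict(list)
    for recording, words in archive.items():
        for text, start in words:
            key = normalise(text)
            if key:
                index[key].append((recording, start))
    return index


def search(index: Dict[str, List[Tuple[str, float]]], term: str) -> List[Tuple[str, float]]:
    """Return every place the term was spoken, ready to be played back."""
    return index.get(normalise(term), [])


# Made-up transcripts standing in for years of archived recordings.
archive = {
    "interview_a.wav": [("I", 0.0), ("visited", 0.3), ("Norway,", 0.9), ("twice", 1.6)],
    "press_conf_b.wav": [("Oslo", 0.0), ("is", 0.4), ("in", 0.6), ("Norway", 0.8)],
}
index = build_index(archive)
hits = search(index, "Norway")
```

Each hit pairs a recording with a time offset, which is what would let a journalist jump straight to the moment the word was said instead of listening to the whole tape.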
As form evolves online – by which I mean the increasing use of podcasts and video to present news stories – I can see the problem of Dark Data getting worse…
That’s true. And of course, aside from the digital arms of print media, there’s also broadcast media. It’s my view that Trint makes it easy to shed light on dark data because it’s so transparent and user-friendly. If you think of a news organization like the BBC, having searchable video content could be an enormous opportunity to surface content, create new content and monetise it: every word spoken on video or audio over the last century could be searchable.
Well, it would be a wealth of information. The question is, would it be too much information?
Well, I think that becomes a function of the search engine and the quality of the interface. We have some ideas to make it workable, but clearly we’d need to evolve them and test them.
Trint clearly straddles the world of journalism and tech. How do you feel that relationship is evolving industry-wide?
That’s an interesting question.
We’re in this fast-moving landscape which is affecting how we communicate. The challenge for journalism is to figure out how to get real content through it. It’s fine to do listicles and cute animal pop-ups on Facebook and other social media, but for me the concern is that if we don’t find a way to actually get people engaged in content about the world we live in and how we interact, then we’re going to end up in a situation where we have leadership that is vacuous and unfocused and trivializes important things.
Digital media allows us, in 140 characters, to simplify complex problems that simply shouldn’t be simplified to that degree. I think that Trint’s role in this media landscape is to free journalists up from tasks that can be automated so they can use their time to produce stories that need to be reported in a way that reaches and resonates with their intended audience.