While this development might initially strike fear into the heart of any already beleaguered journalist worried about their job and the world around them, on closer inspection it seems that this program might have the potential to get journalists back to doing what they do best: analyzing and commenting on the news, rather than simply providing information about it. Content Insights writer, Em Kuntze, caught up with Claude to find out a little more.
Can you start off by telling me a little about Syllabs and how you created this program?
Well, the company is 10 years old and we’re experts in semantics: we develop a lot of technologies around things like web mining, text mining and natural language generation. The idea for robotic writing came about from our customers and at first we said no, but when the fifth client inquired we realized there was a need for it.
How long did it take to create the program?
Well, I think we’re on the 10th version now. The first prototype, which came out in 2011, took only three months to create because we already had a lot of the technology available to us from the text-mining side of the business; we had the linguistic knowledge to create the generator and the writing engine. We produced a major version last year, which transformed our writing engine and programming engine, so we have a specialized programming language used by our linguists to configure the robots.
Does it work across different languages?
Right now, we are able to produce text in French, English and Spanish, and adding a new language is quite easy. There are of course exceptions. Chinese, for instance, is quite complicated for the analysis part, but for the generation it is quite simple, and languages such as Finnish and Hungarian, which are linguistically so unique, are considerably more of a challenge.
What kind of news stories or reporting are you finding this particularly useful for?
It’s not a matter of stories, it’s a matter of opportunities. In order to be able to generate text, you need several different things. First, you need to be able to model what you’d be talking about. So, while you can do this for something like the results of a football match, it’s quite difficult to talk about love and politics – these are complicated concepts. You must give the machine the context and tell it how to talk about something.
The second problem is the data. If you don’t have the data, the machine can’t write anything.
The third problem – and usually people don’t think about this – is to work out how viable it is to develop a robot for these types of subjects. If you want to create 10 or 50 articles or texts, it’s usually far cheaper to write them manually than to use a robot. If you want to write thousands of texts, then a robot would be far cheaper.
So there are three considerations: the model, the data and the economics.
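The economics point can be illustrated with a small break-even calculation. This is only a sketch: the function name and all cost figures are hypothetical and are not taken from Syllabs. It assumes a one-off set-up cost for the robot, a per-text cost for each approach, and asks how many texts are needed before the robot becomes the cheaper option.

```python
import math


def break_even_texts(setup_cost, robot_cost_per_text, manual_cost_per_text):
    """Return the smallest number of texts for which the robot is strictly
    cheaper than manual writing, or None if it never is.

    All three inputs are hypothetical costs in the same currency unit.
    """
    saving_per_text = manual_cost_per_text - robot_cost_per_text
    if saving_per_text <= 0:
        # The robot saves nothing per text, so it never pays back its set-up cost.
        return None
    # Robot is cheaper when setup_cost < n * saving_per_text.
    return math.floor(setup_cost / saving_per_text) + 1


# Made-up figures: 20,000 to configure the robot, 50 per hand-written text.
print(break_even_texts(20000, 0, 50))
```

With these invented numbers, 10 or 50 texts are far cheaper to write by hand, while thousands of texts favor the robot, which matches the rule of thumb given in the interview.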
Does that mean that this is well suited to events that are happening at the same time? You mentioned football results. Presumably the model can be extrapolated over many, many football matches that are taking place over the course of a day…
Over the course of the day, or during the year, or during many years to come. You know that with football matches you’ll have thousands and thousands of articles to produce, because you have something like five leagues per country and you have several countries, and you have the Champions League – you have a lot of different leagues, so in this case it’s interesting to have the robot writing on matches.
The usual themes you want to use robots on are sports, elections, weather forecasts, financial information – there are dozens.
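As a rough illustration of why structured results such as football scores lend themselves to automation, here is a minimal template-based sketch. It is not Syllabs' writing engine, which the interview describes as being configured through a specialized programming language; the `MatchResult` type, the `write_report` function and the team names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    """Hypothetical structured record for one football match."""
    home: str
    away: str
    home_goals: int
    away_goals: int


def write_report(match):
    """Turn one structured result into a one-sentence report."""
    if match.home_goals == match.away_goals:
        return f"{match.home} and {match.away} drew {match.home_goals}-{match.away_goals}."
    if match.home_goals > match.away_goals:
        winner, loser = match.home, match.away
        high, low = match.home_goals, match.away_goals
    else:
        winner, loser = match.away, match.home
        high, low = match.away_goals, match.home_goals
    return f"{winner} beat {loser} {high}-{low}."


# The same model can be applied to every match in a day's fixture list.
for result in [MatchResult("Team A", "Team B", 2, 1), MatchResult("Team C", "Team D", 0, 0)]:
    print(write_report(result))
```

Once the model exists, each additional match costs almost nothing to write up, which is what makes high-volume subjects attractive.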
Do you think that the ability to report on hyper-local news fills a gap in the market?
Clearly, yes. To find the information about your municipality, especially if it’s a small one, is quite difficult. You have local and regional media and regional media have information for each of the towns, but they don’t have the resources to report on everything in every district. There are over 36,000 municipalities in France and it is, of course, impossible to have a journalist writing about each of these.
During the election in France we reported the results municipality by municipality. Sainte-Reine de Bretagne, a small hamlet with just 35 inhabitants, had its own report. This was 500 words in length, with graphics, and three people visited the site over 24 hours. This doesn’t sound like a lot, but it was roughly 10% of the inhabitants. Imagine 10% of the French population visiting your site in the same time frame. That’s more interesting, isn’t it?
Very, especially because we’re seeing a lot of these news agencies struggling with traffic in the wake of ad blocking and paywalls. Can robot writing help with things like traffic, engagement and reach?
When you’re a generalist like Le Monde or the BBC, you can only reach a person in a specific location with general information. So, yes, there’s certainly a potential problem with reach. But as a business you also have to be visible on the web, and you need a lot of content to do that. That’s the second point.
There’s enormous potential in linking all the information about individual towns: so you’d have available, not only the text you produce about the election, but also a dozen other pieces of information and articles about each individual place, which you can then use to show some kind of story about the location, which is of great interest to people.
What about the issue of trust? Does that follow on from that of local representation?
Well, the problem of trust isn’t robot-specific. Trust has to do with the information you have available to you. So, if you have relevant and reliable information in your data, then you produce text which is reliable. The results for that election we just talked about were made available by the French Home Office, so that’s about as reliable as you can get.
Often people say that robot journalism isn’t biased, but that’s not really true because you can have a biased robot if you program it to be biased, knowingly or unknowingly. So I don’t think it’s robot-specific: it all comes back to the initial programming.
You mention the elections. I’m just thinking of the timescales that you’re working with. How quickly can these algorithms produce pieces of news? Could they be used in a breaking news cycle?
Yes, of course. With the robots you have two parts: you have the set-up phase and then you have the user phase. The set-up phase can take some time – a week to 10 weeks, depending on the project – and this is why you have to produce a lot of text to make them profitable. But then, once they are configured, the production is almost real time. When the data is made available you can have the text on the website within a second. So, absolutely it can be used for breaking news, and, for instance, the Los Angeles Times uses it for earthquakes. Did you know that?
No, I’m not sure I did…
Well, a journalist programmed a robot. He plugged it into an institution which gives information in real time about earthquakes and if the reading is more than, I don’t know, 5 on the Richter Scale, then a report is automatically produced on the website. Robot journalists can be used for breaking news, definitely.
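The rule described here is essentially a threshold trigger. The sketch below is an illustration only and is not the Los Angeles Times' code: the `quake_report` function, its wording and the place name are hypothetical, and the threshold of 5 is the interviewee's offhand figure rather than a confirmed setting.

```python
def quake_report(place, magnitude, threshold=5.0):
    """Return a one-sentence report if the reading exceeds the threshold,
    otherwise None (no report is published).

    The default threshold of 5.0 is an assumption taken from the
    interviewee's rough figure.
    """
    if magnitude <= threshold:
        return None
    return f"A magnitude {magnitude:.1f} earthquake was recorded near {place}."


# Hypothetical readings arriving from a real-time data feed.
for place, magnitude in [("Exampleville", 4.2), ("Exampleville", 5.6)]:
    report = quake_report(place, magnitude)
    if report is not None:
        print(report)
```

Because the check and the sentence are both fixed in advance, the report can appear as soon as the reading does.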
So, again it’s all in the setup of those parameters?
What about the balance between robot journalists and traditional journalism? I can imagine there’s probably a degree of skepticism or fear depending on the journalist: the notion that robotic journalism might replace the human version. What are your thoughts on this?
There are a lot of opportunities for media outlets, this much is clear. And for journalists, too. The problem we have now is that media outlets want to cover as many stories as possible, so journalists are finding they have too much to write. What they’re producing is information, when they should be doing what they as journalists do best: analyzing, interviewing, investigating – using their expertise to comment and ask questions.
Robot writing can cover the information and a journalist can add his or her expertise on top of that.
Do you think that the media industry is ready to embrace that kind of collaboration?
I was at a conference recently and the chairman asked me how I thought robot journalists could find a place in the media, which isn’t known to be naturally innovative. But I think that assessment is wrong. The media is usually very innovative. They were the first to use the internet, the first to use semantic analysis. We’re seeing it now with some companies experimenting with things like virtual reality and 360-degree filming. So, in fact, they are often the first to do something, but more as a test. It’s more of a challenge to integrate it into the newsroom so that journalists and robots are collaborating in a useful and sustainable way.