One of the first companies to try and automate fact-checking now says “there is no market for fact-checking” — at least, not as you and I know it.
Paris-based Trooclick launched its plug-in last June, promising to check the facts in IPO stories against Securities and Exchange Commission filings, and against other articles. The original business plan was to make traders the prime audience, and eventually transform the plug-in into an add-on for Bloomberg or Dow Jones terminals. Trooclick was one of just a handful of efforts to automate the fact-checking process, some of which I highlighted for the Columbia Journalism Review.
The plug-in worked well, CEO Stanislas Motte says — so well that, in a way, it killed itself off. “The algorithm worked and didn’t find any errors,” Motte says. The reason, he says with hindsight, is that companies know their words are being scrutinized by regulators, and don’t dare to make misstatements.
As a result, Motte now concludes, “There is no market for fact-checking, especially on financial and business news.” That savvy business audience already knows to trust a limited number of sources — and they can usually spot the important errors themselves, Motte says.
From errors to omissions
After the success-cum-failure of the plug-in, Trooclick began thinking about where the problem really lies. Its conclusion: “The real problem is not on errors but on omission… Big speakers prefer to use omission rather than errors,” Motte says. The way to combat that problem, he says, is by presenting different points of view and facilitating debate.
So in December, Trooclick announced a new product, with quite a different tack. The Opinion-Driven Search Engine uses the same natural language processing as its predecessor (technology due to receive a US patent on March 27) to scour news articles, blog posts and tweets. But instead of comparing facts against a reference, the new site categorizes quotes and paraphrases attributed to executives, analysts and journalists. (Trooclick describes all these statements as “quotes,” but in reality they include paraphrasing too.) These “quotes” are designated positive, negative or neutral, and the site displays lists of the positive and negative statements, side by side. Soon Trooclick hopes to move beyond “positive” and “negative” to perhaps three or five points of view on a given topic.
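To make that categorization step concrete, here is a minimal sketch in Python. It is not Trooclick's method, which I have not seen in detail; it swaps in a naive cue-word count just to show the shape of the task. The cue-word lists, the function names and the sample statements are all invented for illustration.

```python
# Toy polarity sorter: an illustrative stand-in, NOT Trooclick's algorithm.
# The cue words below are hypothetical and chosen only for this example.
POSITIVE_CUES = {"growth", "gain", "strong", "welcome", "opportunity"}
NEGATIVE_CUES = {"loss", "strike", "risk", "cancel", "decline"}


def classify(statement: str) -> str:
    """Label a statement by counting which kind of cue word it contains more of."""
    words = {w.strip(".,!?\"'").lower() for w in statement.split()}
    positive_hits = len(words & POSITIVE_CUES)
    negative_hits = len(words & NEGATIVE_CUES)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "neutral"


def group_by_polarity(statements):
    """Sort (speaker, text) pairs into positive, negative and neutral lists."""
    groups = {"positive": [], "negative": [], "neutral": []}
    for speaker, text in statements:
        groups[classify(text)].append((speaker, text))
    return groups


# Invented sample statements, not taken from the Trooclick site.
SAMPLE = [
    ("Analyst A", "We see a strong opportunity in this market."),
    ("Executive B", "The strike could cancel hundreds of flights."),
    ("Reporter C", "The airline plans to announce details next month."),
]

if __name__ == "__main__":
    for label, items in group_by_polarity(SAMPLE).items():
        print(label.upper())
        for speaker, text in items:
            print(f"  {speaker}: {text}")
```

A counter this crude has no notion of who benefits from a statement, so it can only ever label the wording, never the point of view behind it.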
A viewpoint summary will be another key ingredient in Trooclick’s new recipe. With a huge chunk of readers never making it past the headline, Trooclick sees it as important to quickly summarize the major viewpoints on an issue in the first couple lines of each entry.
“Everything will be focused to give you the synthesis very quickly,” Motte says. “Today… on our website you can find 20, 30, 40 quotes [on an issue]. This is boring and maybe no one reads it. But this is only the beginning.”
The company, which has about $2 million in funding from its founders and France’s Banque Publique d’Investissement (Bpifrance), is considering two business models for the product. One is a white-label offering to social media or search giants, such as LinkedIn or Yahoo. The second is a b-to-b-to-b approach, in which a customer could use Trooclick technology to provide its own client companies with easily digestible media monitoring.
The company is aiming for some major advances in a very short time frame. In about three weeks the website will add the ability to filter stories by the person being quoted. That is a key move, Motte says, because he wants to start emphasizing speakers over news outlets. Motte says he’s “80% confident” that by June Trooclick will have developed a capability to reliably detect and categorize three to five families of opinion for each topic, along with the functionality to summarize those opinions in a couple of sentences.
And then it’s on to politics: by the end of this year, Motte wants Trooclick gearing up to tackle the 2016 US presidential election. By early 2016, Trooclick aims to analyze 50,000 news articles a day, on business, politics and other topics.
That seems a big leap for a product that still stumbles at times with classifying “positive” and “negative.” For example, here are some of the quotes Trooclick catalogued for the story, “Ryanair plans to offer low-cost flights between Europe and the U.S.”:
The circled comment is not exactly positive… just sort of informational.
Here’s one from the story, “Lufthansa pilots to go on strikes on Wednesday”:
That might be positive if you side with the pilots and want to see their strike having an effect. For a lot of parties, I’d call this a negative.
I have no way of knowing how widespread the errors are, and I do see a lot of quotes that Trooclick has catalogued correctly. But seeing these errors does make me wonder if the company’s timetable is a little optimistic.
Motte acknowledges, “One of the biggest challenges for us is error rate,” though he won’t say what the site’s rate is. “If you are at 80% it’s great. The objective is to be more than 80%.”
Is Trooclick right for politics?
The move into politics is also surprising, given Motte’s views about fact-checking. “Speakers, companies, even politicians prefer omission [to making misstatements],” he says.
I just can’t buy that, given the 64 active fact-checking operations around the world, 22 of them in North America, and the frequency with which they find politicians making statements worthy of “Pants on Fire” or “Four Pinocchio” ratings. (Full disclosure: I’m a consultant for the American Press Institute’s Fact-Checking Project, so I arguably have a stake in seeing political fact-checking succeed.)
Where does this leave automated fact-checking?
Trooclick sales and marketing assistant Darcee Meilbeck says she does still think of the company’s work as fact-checking:
“In the last seven to eight months, yes, we have gone through a pivot… We realized that fact-checking is more than just true and false. That’s the story I’ve told people — we realize fact-checking isn’t just black and white. There is bias elimination that comes into it as well. That’s, I think, where we fit in at the moment.”
I’d say Trooclick’s new direction is an intriguing play at helping people to graze on news more intelligently, but I’d hesitate to use the phrase “fact-checking” when no actual facts are being checked.
I must admit I was disappointed to see the company’s shift away from an automated tool that compared news reports to official reference sources. That disappointment could well be driven by my own over-optimism rather than any realistic sense of what such a tool could achieve today, technologically or financially, and it’s not meant as criticism of the interesting work that Trooclick has turned to.
But while I may have been slightly dewy-eyed about what automated fact-checking can achieve now, I still think that for the long term, this target is both achievable and necessary. The battle against misinformation is going to require a combination of automation, leveraging of big data and some kind of social media or browser add-on, for the simple reason that most of us don’t go looking for verification, and even those of us who are verification junkies can’t possibly verify everything we read. So the media ecosystem needs fact-checks that seek their readers out, rather than the other way round; and even better, fact-checks that seek out the lies that human journalists don’t have the bandwidth to pursue.
In my CJR piece I very briefly highlighted a few tools and research projects that might fill that role. I didn’t know which would pan out and I’m not sure anyone does yet. But in the demise of Trooclick’s fact-checking plug-in, there’s an opportunity to formulate a couple of hypotheses:
- Business journalism isn’t crying out for fact-checking, in the way that political or science journalism is. Reasons include the less contentious nature of the content and lower personal and ideological investment by readers.
- Automated fact-checking — especially the natural language processing component — is really, really hard. Maybe too hard, given the current state of the art, for a small start-up to handle. It’s possible such technology just isn’t ready for commercial roll-out yet — and the volume of research required to fine-tune it would be easier to carry out in academia or at huge companies like Google.
I welcome my readers’ thoughts on these theories, as well as their own prognoses for the future of automated fact-checking.
Stepping away from automated fact-checking for a moment, it’s also worth considering the role of crowdsourced verification — if only because of two high-profile launches in recent weeks. They are Fiskkit, a platform for commenting on the news, which won the Social Impact award at the 2015 Launch Festival startup conference; and Grasswire, a platform that invites the public to fact-check breaking news stories.
I wouldn’t close off crowdsourcing as an avenue to explore. If nothing else, I think Fiskkit’s combination of in-line annotation, logical fallacy tags, “respect” button (an outcome of the University of Texas’s Engaging News Project) and comments makes a good bid to be the forum for civil discourse that Facebook never was, and probably never could be. What I’m not sure it adds up to is good fact-checking. Wikipedia has shown us how far crowd-sourcing facts can take you, which is pretty far indeed, up to a point. I’ll be very surprised to see any crowd-sourced effort beat that track record.
Cross-posted to Medium.