Branching out in data journalism

By Meghan Hoyer, Michelle Minkoff and Troy Thibodeaux

Three members of AP’s data journalism team provide a behind-the-scenes look into how they work with the wider editorial team, discover high-impact stories and create new types of content for AP customers.

Building interactive data visualizations that allow readers to delve deep into stories. Devising an automatic alert system for election reporters to track campaign advertisement buys. Helping a reporter craft a FOIA request for documents and data from a government source.

[Photo: AP interactive newsroom technology editor Troy Thibodeaux]

That’s all part of our job as AP’s data journalism team. We’re a growing group of tech-savvy journalists who build newsroom tools for managing massive data and document dumps, crunch the numbers to discover high-impact stories and create new types of content for AP customers.

Introduced by the interactive department in 2013, our group began as a newsroom developer team of four coders and an editor, with a mission to apply cutting-edge technologies in the newsroom. The primary focus was building robust, data-driven interactives, such as the maps displaying U.S. election results.

Since then, the data journalism team has grown to include both news application developers and data analysts who work with AP reporters and editors around the world. We still collaborate with colleagues in interactives to create engaging, data-driven items on topics ranging from environmental hazards to traffic congestion, but our team’s data journalism now appears across all of AP’s distribution channels and platforms, with stories, videos and photos all informed by data analysis.

AP is now positioned as an industry leader in data journalism. The full power of code and quantitative analysis complements AP’s newsgathering and production every day.

[Photo: Members of the data journalism team at AP’s headquarters in New York.]

We have data journalists located across the U.S., with staffers in New York, San Francisco, Los Angeles, Phoenix, New Orleans and Washington. To keep in touch, our group relies on practices more common in technology startups, such as meeting daily via video chat to provide quick status updates on major projects and ask for help when needed.

We strive to be fully integrated into AP’s reporting team, conducting data interviews with government and private agencies and developing sources in places responsible for keeping the data we use. We frequently participate in brainstorming sessions with reporters and editors, looking for data angles in a story or discovering an interesting trend or outlier that leads to more on-the-ground reporting. Vetting data points, bulletproofing the final article, interactive or video, and ensuring the data supports AP’s claims are also significant parts of our role.

In September 2015, Dan Kempton provided the data support for an AP story about oilfield wastewater spills. After 14 months of compiling data, he found more than 21,000 spills totaling around 175 million gallons of wastewater had occurred over a six-year period. This “Only on AP” report was the first time data was compiled from the 11 states that produce more than 90 percent of U.S. onshore oil.

Data has long been a valuable product for AP, most evidently in election results.

Over the last two years, the data behind our stories and interactives has been provided as an additional content stream for members and customers, allowing them to use AP’s analyses and numbers to write local versions of major national investigations.

Throughout 2015, as part of a project coordinated with the Associated Press Media Editors, AP investigated issues surrounding U.S. infrastructure. Along with the state government team and the national investigative team, our team acquired local figures to help explore the status of the U.S. water supply, power grid and transportation system.

In December 2015, Serdar Tumogoren helped compile and analyze data about the U.S. power grid, finding that while severe weather has become the leading cause of major outages, many utilities are not investing enough to protect themselves. His findings contributed to a story that appeared on more than 100 front pages during Christmas week.

The response from members has been overwhelmingly positive. News organizations know the value data can bring to their work, but they frequently lack the technical skill to wrangle large data sets. Short-staffed newsrooms don’t have the time to collect, vet and understand data in a way that makes it easy to use.

The data packages AP provides remove the labor-intensive part of the operation, leaving only clean, well-documented figures and clear guidelines for using them accurately.

In addition to analysis work, our team is building a set of newsroom tools to help manage the growing deluge of documents and data. Justin Myers, our news automation editor, has created systems that alert reporters about breaking news on their beat as well as pipelines that turn data feeds into wire-ready content.

This year, we’re working to build a library of evergreen data sets to provide a constant stream of story ideas and leads.

This data collection effort is providing a cornerstone for future products that can help customers more easily access the AP data they need.

These efforts will be supported over the next two years by a $400,000 Knight Foundation grant awarded in recognition of the value AP’s data journalism brings to a wide range of news organizations.

As a component of the grant, which is helping fund the hires of Larry Fenn and Angeliki Kastanis, AP will also work on a data distribution platform and draft a new data journalism section in the 2016 edition of “The Associated Press Stylebook.”

Our team’s skills and continuous work to improve data output are positioning AP as an industry leader on the frontier of data journalism, continuing the long AP tradition of innovation and discovery.
