If you’re a local news publisher, you’ve most likely heard about the new AP DataKit. The free, open source command-line tool was released earlier this month. Although AP DataKit was made specifically for newsrooms and journalists, many publishers of local news sites and online magazines are still unsure about how to take full advantage of it.
AP DataKit was developed to make it easier for journalists and editors to collaborate on data journalism projects. It does this by standardizing data and keeping projects organized. The tool itself works off a basic framework, composed of a core product and a few plugins for managing data files.
Although AP DataKit was created by the Associated Press, it’s actually available for any newsroom to use. AP DataKit is not exclusive to newsrooms that subscribe to the Associated Press services, making this tool particularly attractive to local, independent publishers.
In interviews, Associated Press Data Journalism Team Editor Troy Thibodeaux has said the purpose of AP DataKit is to help journalists be more consistent in the way they work. With the tools in place, anyone on a newsroom’s team can jump on a project and get going right away.
How, exactly, does that work in the real world? AP DataKit gives all newsroom projects the same basic structure. This applies mainly to standardized data practices, but also to the structure of big newsgathering projects. Essentially, AP DataKit handles the day-to-day details involved in putting together large-scale, collaborative data journalism projects. The tool automatically does things like setting up connections between apps and standardizing naming conventions.
The goal here, for the Associated Press and the newsrooms around the country that will soon be using AP DataKit, is to develop a tool that makes it easier for journalists and editors to work together on projects.
If you’re an online publisher, you might be asking what problems AP DataKit is trying to fix. After all, newsrooms have been using tools like Slack and Basecamp to collaborate on projects for years. As just one example, more than 250 newsrooms have recently joined an initiative to collaborate on climate change coverage ahead of the UN climate summit. How could a tool like AP DataKit help them?
When dozens or more journalists get together to work on a project, they’re usually working remotely. Different newsrooms use their own programming languages and techniques to gather, sort, and analyze data. As a result, important details can get lost in translation. Key pieces of the stories they’re working on can go uncovered, as well.
By standardizing data and automatically handling things like naming conventions and setting up connections between apps, AP DataKit is making it easier for journalists to work together.
Here are three examples of ways that local publishers can use AP DataKit:
- Using “cookiecutter” templates, journalists can quickly create projects during ongoing breaking news events.
- In larger projects, editors can use AP DataKit to keep each part of their projects separated, so data, code, configuration, and documentation are never intermingled.
- With automated data and code syncing, publishers don’t have to worry about manual data storage or creating nightly backups.
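To make the idea of keeping data, code, configuration, and documentation apart more concrete, here is a minimal Python sketch of a standardized project skeleton. It is an illustration only: the folder names, the `create_skeleton` helper, and the example project slug are all hypothetical, and this is not the layout that DataKit's own templates produce.

```python
import tempfile
from pathlib import Path

# Hypothetical layout for illustration only; NOT DataKit's actual template.
# Each top-level folder holds exactly one kind of material.
LAYOUT = {
    "data": "Raw and processed data files.",
    "code": "Analysis scripts and notebooks.",
    "config": "Project settings and credentials placeholders.",
    "docs": "Methodology notes and data dictionaries.",
}


def create_skeleton(root: Path, slug: str) -> Path:
    """Create the same folder structure for every project under `root`."""
    project = root / slug
    for folder, purpose in LAYOUT.items():
        target = project / folder
        target.mkdir(parents=True, exist_ok=True)
        # A short README tells collaborators what belongs in each folder.
        (target / "README.txt").write_text(purpose + "\n", encoding="utf-8")
    return project


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        project = create_skeleton(Path(tmp), "example-breaking-news-project")
        for path in sorted(project.iterdir()):
            print(path.name)
```

Because every project starts from the same structure, a colleague joining late knows where to look for the data and where to find the notes that explain it.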
How to Install AP DataKit
Hopefully by now you see the value in using AP DataKit. So how do you get started using the tool? Begin by following these steps for quick installation:
- Install Python 3 on your computers.
- Install datakit-project, the tool’s most popular plugin.
- Choose a Python or R project structure template, or customize your own template.
- Start your project. (On the command line, running datakit project create will create a project with a standardized file structure.)
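Before starting a first project, it can help to confirm that the prerequisites from the steps above are in place. The short Python sketch below checks for Python 3 and for the command-line tool. It assumes the installed command is named `datakit`; the `preflight` helper is a hypothetical name used for illustration and is not part of AP DataKit.

```python
import shutil
import sys


def preflight(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems found; an empty list means ready to go."""
    problems = []
    # Step 1 of the install list: Python 3 must be available.
    if version_info[0] < 3:
        problems.append("Python 3 is required.")
    # Assumption: the tool is exposed on the PATH as a command named `datakit`.
    if which("datakit") is None:
        problems.append(
            "`datakit` command not found; install the datakit-project plugin first."
        )
    return problems


if __name__ == "__main__":
    issues = preflight()
    if issues:
        for issue in issues:
            print("Not ready:", issue)
    else:
        print("Ready: try creating a project from the command line.")
```

The two parameters exist only so the check can be exercised without changing the machine it runs on; calling `preflight()` with no arguments inspects the current environment.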
Although it’s not required, you may want to grab additional plugins to manage and store data files. Plugins are also available to sync code to GitLab and GitHub.
You can view more tutorials and read more details about how to get started using AP DataKit here.