And the nominees are… The Washington Times


The Washington Times is nominated for a Data Journalism Award in the category Data-driven investigation local/regional for ‘Seniority salaries bulk up D.C.’s payroll’. Journalist Luke Rosiak: “An analysis of five years of public employee rosters for Washington, D.C. showed that political decisions made decades ago were having a major impact on the present for taxpayers, as a hiring boom in the early 1980s had led to a workforce in which a disproportionate number had 30 years’ seniority. Many employees remained in lower-level positions but saw raises for tenure for decades, leading to janitors making $100,000. Other important but stressful jobs such as social workers, meanwhile, saw high turnover.” An interview about the making of a nominated data journalism production.

What inspired you to make Seniority salaries bulk up D.C.’s payroll?
To not just report the past, but predict the future, which I think is one thing that’s cool about some data work. Our report was the first to note that the city is sure to see an unprecedented wave of retirements in the coming years, and that administrators will have to make decisions about whether to replace those workers and whether to retool job descriptions. (The story showed many job descriptions were still based on a 1980s way of doing business.) To inform taxpayers and promote accountability among high-paid civil servants.

Did you work by yourself or in a team?
Myself, because sometimes unfortunately it can be frustrating to get people in other departments to have the same priorities as you; because my newspaper doesn’t have the luxury of large staff; and because I can, having at least passing skill in writing, data analysis and interactive features.

How did you get hold of the data you needed?
Information on employee salaries and start dates for recent years was obtained under a FOIA request and reconciled with older data from an archived PDF document, which I had to extract into a CSV using regular expressions and Python. Google Refine was used to standardize job titles across the two years so that I could compare them.
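For readers curious what that extraction step can look like, here is a minimal sketch in Python. It is not the code used for the story: the line layout, the column names, the `ROW_PATTERN` expression and the sample rows (with placeholder names and made-up figures) are all assumptions for illustration, and it presumes the PDF has already been converted to plain text.

```python
import csv
import io
import re

# Assumed layout of one roster line after the PDF has been dumped to text:
#   LAST, FIRST   JOB TITLE   MM/DD/YYYY   $12,345.67
# Columns are assumed to be separated by two or more spaces.
ROW_PATTERN = re.compile(
    r"^(?P<name>[A-Z][A-Za-z'\-]+, [A-Za-z'\- ]+?)\s{2,}"
    r"(?P<title>.+?)\s{2,}"
    r"(?P<start_date>\d{2}/\d{2}/\d{4})\s+"
    r"\$(?P<salary>[\d,]+(?:\.\d{2})?)\s*$"
)

FIELDS = ["name", "title", "start_date", "salary"]


def rows_from_text(text):
    """Return one dict per line that matches the assumed roster layout."""
    rows = []
    for line in text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match is None:
            # Page headers, footers and blank lines are skipped.
            continue
        row = match.groupdict()
        # Drop thousands separators so the salary can be read as a number.
        row["salary"] = row["salary"].replace(",", "")
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample input; the names and figures are placeholders.
SAMPLE_TEXT = """\
DISTRICT OF COLUMBIA EMPLOYEE ROSTER   Page 1
DOE, JANE   SOCIAL WORKER   03/14/2009   $52,300.00
ROE, RICHARD   CUSTODIAL WORKER   06/01/1982   $61,250.50
"""

if __name__ == "__main__":
    print(to_csv(rows_from_text(SAMPLE_TEXT)))
```

Lines that do not match the pattern are silently dropped here, so in practice it is worth counting the skipped lines and eyeballing a few of them before trusting the resulting CSV.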

Which tools were used in making the production?
Python coding to extract data; the R statistics language to analyze it (that’s a hard one and one I don’t usually use/need); reporting and writing; JavaScript for the searchable database.

How long did you work on this production?
One week, not including the time spent negotiating for and waiting for a FOIA response. (FOIA = American public records request law)

Were there any bumps in the road while making Seniority salaries bulk up D.C.’s payroll?
We don’t do too many interactive features, so putting up a searchable database was hard because while I could easily create it on my own computer, I had to figure out how to jam it into our content management system and deal with people in different departments and convince managers I wasn’t breaking anything. All very stupid and frustrating problems that come from larger organizations and a slowness to adapt.

Do you have a useful tip for starting data journalists?
I should note that 99% of the time I do data analysis in Microsoft Access. People talk down about it sometimes, but sometimes the more in the weeds of technology you get, the further you get away from the most real, obvious story. Access lets you quickly ask question after question of the data, and it can handle very large data sets.

Remember that the most important part of data journalism is journalism. Don’t get so bogged down in technical aspects that you forget you’re trying to distill and present interesting information to laymen. Developing *well-rounded* journalists who are fluent in high-level data work is one of the most critical issues facing the field; otherwise, a disconnect results where programmers enthusiastically attempt to make tools that practicing journalists don’t actually find useful or that have accuracy concerns, and reporters are unable to even recognize cases where computer strategies could benefit their work. Compared to an extended back-and-forth between researcher/programmer and writer, personal experience shows that having one person work on both the analysis and reporting of a story also provides dramatically more flexibility and power in storytelling. You develop that wide range of skills by exercising them on a single-staffer enterprise like this.

‘And the nominees are…’ is a series of interviews with the journalists behind the entries at the Data Journalism Awards Shortlist. The Data Journalism Awards (DJA) competition is the first international contest recognising outstanding work in the field of data journalism worldwide. The Data Journalism Awards were organised by the Global Editors Network, in collaboration with the European Journalism Centre and supported by Google. (All interviews in this series were conducted through e-mail.)