‘Who cracks six figures – or more – among Ontario’s civil servants?’ got The Globe and Mail on the Data Journalism Awards Shortlist. An interview with Stuart Thompson: ‘Data journalism naturally dovetailed with my other interests in multimedia journalism and programming, but I’ve really committed myself to it over the last year.’

Presenting your nominated production in an elevator pitch, what would you say?
Our interactive lets readers sort and search the Sunshine List, a list of public-sector employees earning more than $100,000 per year. Readers can find specific people or specific organizations. The real kicker is that they can compare this year’s results to last year’s and see how benefits and salaries have increased or decreased.
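As a rough illustration of the year-over-year comparison described here, the following Python sketch matches each record in the current list to the previous year's entry and computes the salary change. The field names (`name`, `employer`, `salary`), the matching key, and the function name `compare_years` are assumptions made for this example; the interview does not describe how the Globe and Mail's tool matched records, and the tool itself was written in JavaScript.

```python
def compare_years(current, previous):
    """Attach a year-over-year salary change to each current record.

    Records are matched on (name, employer), which is an assumption:
    people who changed employers or share a name would need extra handling.
    """
    previous_by_key = {(r["name"], r["employer"]): r for r in previous}
    compared = []
    for record in current:
        old = previous_by_key.get((record["name"], record["employer"]))
        # None marks people who were not on the previous year's list.
        change = None if old is None else round(record["salary"] - old["salary"], 2)
        compared.append({**record, "salary_change": change})
    return compared
```

A record with no match in the previous year keeps a `salary_change` of `None`, so a table can show it as a new entry rather than as a zero increase.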

What inspired you to make ‘Who cracks six figures’?
This is a popular news item each year, but we never saw the data used in an interactive way. Since people like searching for specific names or ministries, we knew we wanted to present the entire data set in a way that was easy to manipulate and search. Other tools allowed this, but they used clunky search programs that returned limited results and offered no context. We sat down and decided to do a searchable table that worked on the browser side.

Did you work by yourself or in a team?
The project came from our Data Bureau, mostly from Laura Blenkinsop, Alisa Mamak and me. We decided on the original concept and I was tasked with figuring out how it could work. I also discovered I could add the historical data layer during development.

How did you get hold of the data you needed?
The data is released publicly, but only in a big HTML table across several pages on the government website. There’s no easy way to search or sort. We built a scraper that could pull data from any year since 1992. We wanted to get our interactive online as fast as possible on the release day, and we succeeded, scraping all the data and integrating it into our interactive in about three hours.
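To give a sense of what scraping an HTML table involves, here is a minimal Python sketch that turns table markup into a list of rows using only the standard library. It is not the Globe and Mail's scraper, whose language and design the interview does not describe; the class name `TableParser` and the helper `parse_table` are made up for this example, and fetching the pages themselves is left out.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling `parse_table` once per page and concatenating the results would give one flat list of rows, with the header row repeated wherever the source pages repeat it.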

Which tools were used making this production?
The tool was built using JavaScript and two libraries: jQuery and SlickGrid. The foundation was provided by SlickGrid and we built a number of sorting and optimization functions to deal with a dataset this large. We also added the historical data layer.

How long did it take to make ‘Who cracks six figures’?
It took about two weeks (off and on) to complete the scraper and a week or so to create the interactive.

Were there any bumps in the road?
The scraper took a lot of work because it had to accommodate unforeseen changes to the layout of the data. We built a number of fallbacks and safety measures to handle a variety of columns and rows. We ended up scraping several years of data to test and refine our process.
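One common way to cope with layouts that shift from year to year is to locate columns by their header text rather than by position, and to fall back to an empty value when a cell is missing. The Python sketch below shows that idea under stated assumptions: the header variants in `ALIASES` are invented for illustration and are not the government site's actual column names, and the interview does not say which fallbacks the team really used.

```python
import re

# Hypothetical header variants; the real pages may use different labels.
ALIASES = {
    "employer": {"employer", "organization"},
    "name": {"name", "surname"},
    "salary": {"salary", "salary paid"},
    "benefits": {"benefits", "taxable benefits"},
}


def map_columns(header):
    """Map each known field to the index of the first matching header cell."""
    mapping = {}
    for index, label in enumerate(header):
        key = " ".join(label.lower().split())  # normalize case and spacing
        for field, names in ALIASES.items():
            if key in names and field not in mapping:
                mapping[field] = index
    return mapping


def parse_money(text):
    """Convert a string such as '$101,000.00' to a float, or None if unusable."""
    cleaned = re.sub(r"[^0-9.]", "", text)
    try:
        return float(cleaned)
    except ValueError:
        return None


def normalize_row(row, mapping):
    """Build a record from a row, tolerating short rows and odd values."""
    record = {}
    for field, index in mapping.items():
        value = row[index].strip() if index < len(row) else ""
        if field in ("salary", "benefits"):
            record[field] = parse_money(value)
        else:
            record[field] = value
    return record
```

Because the mapping is rebuilt from each page's header, a year that reorders or renames columns only requires adding a new alias, not rewriting the row-handling code.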

Do you have a useful tip for starting data journalists?
Data journalism is simply journalism. Your source might be a spreadsheet instead of an official, but the techniques are the same. Young people who want to become data journalists should focus on writing long features, investigative pieces and storytelling. Then supplement that with a solid foundation in statistics, analytics and database management. I wouldn’t worry so much about programming; any medium-sized or large paper will have qualified programmers on staff, especially if it has a data journalist as well.

‘And the nominees are…’ is a series of interviews with the journalists behind the entries on the Data Journalism Awards Shortlist. The Data Journalism Awards (DJA) competition is the first international contest recognising outstanding work in the field of data journalism worldwide. The Data Journalism Awards were organised by the Global Editors Network, in collaboration with the European Journalism Centre and supported by Google. (All interviews in this series were conducted through e-mail.)
