DataHarvest+ 2018: presentations, tools and datasets

A list of resources (presentations, datasets, tools, etc.) mentioned during the DataHarvest+ 2018 European investigative journalism conference. This list is incomplete and will probably be updated later.

Investigative reporting

  • Strategies to find personal information, by Marcus Lindemann, slides.
  • Gadgets for investigative reporting, by Marcus Lindemann, slides.
  • Can journalism networks help investigations under authoritarian regimes? The case of Turkey, by Craig Shaw, Sebnem Arsu, and Efe Kerem Sozeri, slides.
  • Don’t fear the robots: 5 reasons to welcome automation, by Leila Haddou and Max Harlow, slides.

Data journalism

Tools presentations


All Python materials will be collected in this GitHub repo. For the time being, use these links:


  • CSVMatch, a command line tool to find (fuzzy) matches between two CSV files, by Max Harlow.
  • Reconcile, a command line tool to enrich data by doing batch lookups against online services, by Max Harlow.
  • Tabula, a tool for liberating data tables stored in PDFs.
  • Flourish for polished, beautiful data visualisations.
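
To give a feel for what fuzzy matching between two CSV files means in practice, here is a minimal Python sketch. It is not how CSVMatch itself works: it uses the standard library's difflib.SequenceMatcher as a stand-in similarity measure, and the function name fuzzy_match, the 0.8 threshold and the sample names are all assumptions made up for illustration.

```python
import csv
import io
from difflib import SequenceMatcher


def fuzzy_match(rows_a, rows_b, field, threshold=0.8):
    """Pair rows whose `field` values are similar enough.

    Hypothetical helper for illustration only; the threshold is an
    arbitrary assumption, not a value taken from CSVMatch.
    """
    matches = []
    for a in rows_a:
        for b in rows_b:
            # ratio() returns a similarity score between 0.0 and 1.0.
            score = SequenceMatcher(
                None, a[field].lower(), b[field].lower()
            ).ratio()
            if score >= threshold:
                matches.append((a[field], b[field], round(score, 2)))
    return matches


# Two small in-memory CSV files stand in for real files on disk.
csv_a = "name\nJohn Smith\nMaria Garcia\n"
csv_b = "name\nJon Smith\nPeter Jones\n"

rows_a = list(csv.DictReader(io.StringIO(csv_a)))
rows_b = list(csv.DictReader(io.StringIO(csv_b)))

matches = fuzzy_match(rows_a, rows_b, "name")
for left, right, score in matches:
    print(left, "~", right, score)
```

Comparing every row with every other row is fine for small files but slows down quickly on large ones, which is one reason to reach for a dedicated tool for real datasets.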