Sugarcube

Sugarcube is a tool that helps journalists, non-profits, academic researchers, and human rights organizations conduct investigations using online, publicly available sources. It lets users make local copies of these sources and automate data processing. Sugarcube has been instrumental in the preservation workflow of the Syrian Archive, collecting documentation of human rights violations from 12 platforms and over 10,000 sources.

Background

Sugarcube is a modular tool. It is made up of many individual tasks, each carried out by a small piece of code that Sugarcube calls a "Plugin". Plugins can be configured to your needs (e.g. a DuckDuckGo search plugin can be configured to search for a specific term or list of terms) and strung together in an automated sequence, which Sugarcube calls a "Pipeline".

Each task in the Pipeline produces data (e.g. a list of search results or screenshots of tweets), which is then passed to the next Plugin in the Pipeline, and so on. What comes out at each stage depends on how you have set up the Pipeline: search results, for example, could be written automatically to a text file, a CSV file, or a Google spreadsheet. They could then be stored in a database or compared against previous results.

[plugin1] -> output -> [plugin2] -> output -> [plugin3] -> output
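
The chaining idea in the diagram above can be sketched in a few lines of code. This is a conceptual illustration only, not Sugarcube's actual API: the names `make_search_plugin`, `dedupe_plugin`, `make_csv_export_plugin` and `run_pipeline` are hypothetical, and the "search" step fabricates placeholder results instead of querying a real search engine. It assumes each plugin is a function that takes a list of records and returns a list of records.

```python
import csv
import io
from functools import reduce
from typing import Callable, Dict, List, Optional, TextIO

# Hypothetical types for illustration; these are NOT Sugarcube's real API.
Record = Dict[str, str]
Plugin = Callable[[List[Record]], List[Record]]


def make_search_plugin(terms: List[str]) -> Plugin:
    """Configure a stand-in search plugin with a list of terms."""
    def plugin(records: List[Record]) -> List[Record]:
        # A real plugin would query a search engine. Here we fabricate
        # one placeholder result per configured term.
        found = [{"term": t, "url": f"https://example.org/?q={t}"} for t in terms]
        return records + found
    return plugin


def dedupe_plugin(records: List[Record]) -> List[Record]:
    """Drop records whose URL has already been seen, keeping the first."""
    seen = set()
    unique = []
    for record in records:
        if record["url"] not in seen:
            seen.add(record["url"])
            unique.append(record)
    return unique


def make_csv_export_plugin(out: TextIO) -> Plugin:
    """Configure an export plugin that writes records as CSV to `out`."""
    def plugin(records: List[Record]) -> List[Record]:
        writer = csv.DictWriter(out, fieldnames=["term", "url"])
        writer.writeheader()
        writer.writerows(records)
        # Pass the records through unchanged so later plugins can use them.
        return records
    return plugin


def run_pipeline(plugins: List[Plugin],
                 records: Optional[List[Record]] = None) -> List[Record]:
    """Feed the output of each plugin into the next one, in order."""
    return reduce(lambda data, plugin: plugin(data), plugins, records or [])


# [search] -> output -> [dedupe] -> output -> [csv export] -> output
buffer = io.StringIO()
pipeline = [
    make_search_plugin(["alpha", "alpha", "beta"]),
    dedupe_plugin,
    make_csv_export_plugin(buffer),
]
results = run_pipeline(pipeline)
```

Because every step has the same shape (records in, records out), steps can be reordered, removed, or swapped without changing the others, which is what makes a configured pipeline repeatable.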

Once you've configured your Pipeline, it becomes an automated, repeatable workflow. One command sets the whole thing in motion.

Sugarcube was developed in close collaboration with a number of organisations, academics and activists in order to refine the tool and adapt it to the needs of those actively conducting data-based investigations.