A visual programming language for the Hub allows users to collaboratively and visually construct the data processing steps required to combine or remix data sets.
Privacy & Sharing
The Hub will provide privacy controls for each project as well as for the data sets within it. Users can keep their data sets private or make them public as the project progresses through different stages of completion. They can also open up their data sets to the community so that other users can fork them for further remixing, just like code on GitHub.

Community
The Hub aims to provide community spaces where new, public remixed data sets are discussed and critiqued as they are developed. Additionally, we aim to curate offline components, in the form of hackathons and immersive exhibitions, to improve awareness about the implications of data for justice.

The Hub's main objective is data co-creation and distribution. Our focus is therefore to ensure that data sets are interoperable; easy to modify, compile, remix, adapt and redistribute without technological barriers; and flexible, with fair use restrictions. The Hub's design will support data synchronisation and the reconciliation of conflicting data changes, and will maintain a strong change history and version control system. Creators and users will be required to submit machine-readable data sets with associated metadata, to maximise sharing and reuse of data and to increase its impact in the wider community. The Hub will enable fast search and discovery of data set content and metadata, both through the platform and through an API.

The user community will be encouraged to categorise data sets and improve the quality of their metadata through automatically targeted calls to action, sent to users with expertise in the domains of the data sets in question. A community-wide task board leverages collaborative filtering to drive consensus on which data sets need to be worked on. This enables the community to build consensus, create interest and ownership, and drive the scope of work.
A council of moderators ensures regular review of the uploaded data sets, alongside features such as rating/voting, abuse reporting, and commenting on each uploaded data set to highlight issues and discuss resolutions.
The community-wide task board leverages collaborative filtering to drive consensus on which data sets need to be worked on.
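As an illustration of how collaborative filtering might surface task-board items, here is a minimal sketch of user-based filtering using Jaccard similarity over upvote sets. The function names, the shape of the `votes` mapping and the sample data set names are all hypothetical; the proposal does not specify an algorithm, and a real task board could weigh votes quite differently.

```python
from collections import defaultdict


def jaccard(a: set, b: set) -> float:
    """Overlap between two vote sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_tasks(votes: dict, user: str) -> list:
    """Rank task-board items the user has not voted on yet.

    `votes` maps a user name to the set of items that user upvoted
    (a hypothetical structure, assumed for this sketch). Each candidate
    item is scored by summing the similarity of every other user who
    upvoted it, so items favoured by like-minded users rank first.
    """
    mine = votes.get(user, set())
    scores = defaultdict(float)
    for other, theirs in votes.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0.0:
            continue  # no shared interests, so this user adds no signal
        for item in theirs - mine:
            scores[item] += similarity
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Calling `suggest_tasks(votes, "asha")` returns a list of `(item, score)` pairs, which the task board could show to that user as data sets they may want to help with.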
Keeping the Hub updated
The platform aims to leverage both the community and technology to keep the data on the platform relevant. It will drive consensus building, encourage user feedback, and bring under-utilised data sets to the attention of the community or project. The Hub will feature a community-wide task board with collaborative filtering that helps the community prioritise and determine which data sets need to be worked on or created. We envision a homepage that promotes such practices through editorials on interesting, creative and empirical data-driven research projects, featured active projects, calls for collaboration, featured data sets and featured ongoing discussions.

The Hub will be equipped to detect when a source data set has been updated and to inform the relevant users and projects, so that they can take the necessary action. In the longer term, we are interested in exploring automated updating of source data that changes periodically.

In addition to the above, we aim to design a number of features and initiatives to enable wider community growth. Here are some:
A community-wide task board with voting to drive consensus around topics of interest, priority and urgency that could benefit from the community working together to produce new data sets
Workshops, trainings and hackathons on data literacy in the context of how it is enabled via the Hub, as well as topic-based hackathons to create new data sets. A workshop methodology on how to run Hub workshops in your own city
Training materials and support to encourage interested users to take initiatives to raise awareness of data literacy, responsible use, privacy and security of data
An active editorial component of the Hub to inform the community of active or featured projects, discussions, and pressing needs of the community through the home page and other sections of the Hub
A reputation system that will allow contributors to be seen and appreciated by the community
A visualisation system that will allow users to showcase their output data set through visual data-stories
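The source-update detection mentioned under "Keeping the Hub updated" could be sketched as follows: store a content fingerprint for each source data set, compare it against a fresh copy, and produce a notice for each subscriber when the two differ. This is a minimal sketch under stated assumptions; `SourceRecord`, `check_for_update` and the notice format are hypothetical names, and the proposal does not say how the Hub would actually detect changes or deliver notifications.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(data: bytes) -> str:
    """SHA-256 hex digest of the raw data set content."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class SourceRecord:
    """Hypothetical record the Hub might keep for each source data set."""
    name: str
    last_fingerprint: str
    subscribers: list = field(default_factory=list)


def check_for_update(record: SourceRecord, current_data: bytes) -> list:
    """Return one notice per subscriber if the source content has changed.

    The stored fingerprint is refreshed on a change, so the same update
    is only reported once.
    """
    new_fingerprint = fingerprint(current_data)
    if new_fingerprint == record.last_fingerprint:
        return []  # content unchanged, nobody to notify
    record.last_fingerprint = new_fingerprint
    return [
        f"{user}: source data set '{record.name}' has been updated"
        for user in record.subscribers
    ]
```

Hashing the whole file only detects that something changed, not what changed; reconciling the differences would still fall to the change history and version control features described earlier.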
How can you be involved?
If you are an organisation or individual interested in collaborating with us or commenting on the proposal, please get in touch with us at firstname.lastname@example.org or email@example.com.

This proposal was a finalist at the Agami Data for Justice Challenge 2019. The concepts described in this document are under Creative Commons License CC BY-NC-SA 4.0 International.