Black Corpus

  • Project: Black Corpus
  • Year: 2018 (ongoing)
  • Team: Ayodamola Tanimowo Okunseinde, Nikita R Huggins, Nicole Lloyd
  • Role: Concept, Design, Technology

Black Corpus is a research laboratory that seeks to address bias in machine learning systems while creating tools and artworks that make machine learning environments more accessible. Focused on Black and African Diasporic language, text, vernacular, and writing, the collective produces and analyzes alternative machine learning datasets and creates related artworks that are meaningful and expressive. Its methods include identifying unique modes of communication internal to Black and African Diasporic communities, analyzing language structures and syntax, using machine learning tools to attempt to extract meaning from text, and creating digital and physical artworks that promote diversity in the machine learning field. The collective also aims to teach machine learning tools and methods to underrepresented communities, develop related art and technology curricula, and archive datasets and assets that may be used as research material.

Machine learning (ML) uses statistical techniques to give computers the ability to "learn" from data and thus improve performance on a specific task without being explicitly programmed. Though the field is growing rapidly, BIPOC groups remain underrepresented in it. This lack of representation in the programming of ML algorithms, the collection of datasets, and the development of ML-based products is deleterious: it diminishes the field's efficacy and its potential positive social and economic impact. It can also manifest in products and systems that are biased against, and dangerous to, underrepresented groups because they do not fully take these groups into account.