Commons:Structured data/About/Why

Structured Data on Commons aims to benefit the Wikimedia community in its broadest sense, and many re-users outside the Wikimedia ecosystem, including:

  • the communities of Commons and Wikimedia contributors;
  • cultural institutions (GLAMs) who publish their collections via Commons;
  • scientific and research organisations interested in working with large metadata datasets as a basis for research;
  • smaller and larger re-users of free media files online (from bloggers to large publishers);
  • smaller and larger developer communities (from app and web developers to the builders of search engines and operating systems).

Impact and benefits for the Wikimedia movement

The project will affect the Wikimedia movement in the following ways:

  1. Wikimedia Commons becomes much friendlier and more usable for developers. Structured Commons provides a new infrastructure of fine-grained APIs and other machine-readable endpoints, so that developers both within and outside the Wikimedia community can create consistent, reusable and reliable software for editing, reusing and analyzing Commons media and its associated data. Without structured data, such tools rely on short-term solutions that break or produce bad data when MediaWiki core changes or when the volunteer community updates wikitext or categories.
  2. When it becomes easier to search Wikimedia Commons - in multiple languages! - Wikimedia contributors can more effectively illustrate Wikimedia projects such as Wikipedia. Without structured data, Wikipedians need to know English, need to know the category system on Commons well, and/or need to know the specific terms with which the files are described by uploaders, in order to be able to find suitable illustrations on Commons.
  3. Structured data allows for easier and simpler partnerships with content providers, especially knowledge institutions and organizations with media collections (such as cultural institutions or GLAMs). Without structured data, mass uploads of larger sets of well-described media files to Commons are technically complicated, even with relatively user-friendly tools like Pattypan. With structured data, the precise and complex metadata of files in institutional databases can more easily be integrated into Commons, also on a large scale.
  4. Categories (not yet implemented) and metadata can be created multilingually, so that volunteers with different language skills can work together more easily, and files can be found in languages other than English. Multilingual categories have been a long-standing request from the Commons community.
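
To make the multilingual point above more concrete, here is a minimal sketch in Python. It assumes that a file's structured data is available as a JSON entity with a `labels` map keyed by language code, which is the general shape of Wikibase entities. The sample entity, its `M0` identifier, its captions and the helper `pick_caption` are hypothetical and serve only as illustration.

```python
from typing import List, Optional

# Hypothetical sample entity (not a real Commons file). The "labels" map
# keyed by language code follows the general shape of Wikibase entities.
SAMPLE_ENTITY = {
    "id": "M0",
    "labels": {
        "en": {"language": "en", "value": "A red bicycle"},
        "nl": {"language": "nl", "value": "Een rode fiets"},
    },
}


def pick_caption(entity: dict, preferred: List[str]) -> Optional[str]:
    """Return the first caption available in the preferred languages.

    `preferred` is an ordered list of language codes, most wanted first.
    Returns None when no caption exists in any of those languages.
    """
    labels = entity.get("labels", {})
    for lang in preferred:
        label = labels.get(lang)
        if label and label.get("value"):
            return label["value"]
    return None


if __name__ == "__main__":
    # A Dutch-speaking contributor sees the Dutch caption first.
    print(pick_caption(SAMPLE_ENTITY, ["nl", "en"]))
    # A French-speaking contributor falls back to English.
    print(pick_caption(SAMPLE_ENTITY, ["fr", "en"]))
```

With captions stored per language, a tool can show each contributor a description in a language they read, instead of depending on English file names or category names.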

Impact and benefits for other organizations

  1. With structured data, Wikimedia Commons gains a large and highly valued new advantage for partner organizations who donate media: it finally becomes possible to follow and review changes made to 'their' media on Commons, such as improvements to and translations of the metadata. Once Wikimedia Commons has refined, structured APIs, institutions can also import these changes back into their own catalogues. In this way, the Wikimedia community not only receives materials from GLAMs around the world, but can also give back, in the form of improved and updated metadata in a clean and consistent format.
  2. Structured data also makes Wikimedia Commons more attractive for knowledge institutions around the world, because a structured environment aligns much better with the advanced metadata in the specialized repositories that such institutions have built over the last decades. Better search and findability of media on Commons also provides a greater incentive to share collections there. Without structured data, the main incentive for institutions to upload to Commons is the volume of Wikipedia page views from pages that contain their media files. By improving Commons itself and expanding the ways people can search for and reuse images, we greatly expand the usefulness of Commons, including for files that are not used as illustrations on Wikipedia.
  3. Many knowledge organizations, especially in regions like South and Southeast Asia, Latin America and Africa, don't have support from online cultural aggregators like Europeana, Trove and DPLA, and sometimes don't even have the technical capacity to host their own digitized collections. Especially with structured data, Wikimedia Commons can fill this gap, becoming a de facto hosting platform and aggregator for cultural media across the world - a reliable venue for sharing cultural heritage content in free formats and under free licenses.

Impact and benefits for re-use of Commons media across the web

  1. Structured data on Commons makes it easier to dynamically re-use and embed Wikimedia Commons content with proper attribution: because the data behind media is provided in a structured form, via detailed APIs, many content management systems and platforms (such as Drupal and WordPress) can develop embed tools and plugins that help their end users use media from Commons while complying correctly with its licensing.
  2. The vocabulary for describing media files (such as creators, institutions, depicted people, places, animals, plants, buildings, historical events…) is drawn from Wikidata. There, these concepts are linked with the wider internet via identifiers. This allows for cross-internet discovery of relationships between media files - a foundational principle of the semantic web and Linked Open Data.
  3. With structured data, the content on Wikimedia Commons can be archived more faithfully and more consistently by the Internet Archive and other digital archiving services, assuring the longevity of that content even if Wikimedia projects disappear. Digitally archiving media files becomes easier and more precise when their associated metadata is properly structured.
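
As an illustration of how the Wikidata-based vocabulary links media to the wider web, the following Python sketch collects the Wikidata items that a file is stated to depict and turns them into Wikidata page URLs. It assumes the file's structured data is available as a JSON entity whose `statements` map is keyed by property ID, with `P180` being the "depicts" property. The sample entity, its `M0` identifier and the helper `depicted_item_urls` are hypothetical; `Q42` is used only as a well-known example item ID.

```python
from typing import List

# Base URL of item pages on Wikidata.
WIKIDATA_PAGE_URL = "https://www.wikidata.org/wiki/"

# P180 is the Wikidata property "depicts".
DEPICTS = "P180"

# Hypothetical sample entity (not a real Commons file). The layout of
# "statements" is an assumption modelled on Wikibase statement JSON.
SAMPLE_ENTITY = {
    "id": "M0",
    "statements": {
        "P180": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P180",
                    "datavalue": {
                        "type": "wikibase-entityid",
                        "value": {"entity-type": "item", "id": "Q42"},
                    },
                }
            },
            # A statement with an unknown value carries no datavalue.
            {"mainsnak": {"snaktype": "somevalue", "property": "P180"}},
        ]
    },
}


def depicted_item_urls(entity: dict) -> List[str]:
    """Return Wikidata page URLs for every item the file depicts.

    Statements without a concrete item value are skipped.
    """
    urls = []
    for statement in entity.get("statements", {}).get(DEPICTS, []):
        datavalue = statement.get("mainsnak", {}).get("datavalue")
        if not datavalue:
            continue
        item_id = datavalue.get("value", {}).get("id")
        if item_id:
            urls.append(WIKIDATA_PAGE_URL + item_id)
    return urls


if __name__ == "__main__":
    for url in depicted_item_urls(SAMPLE_ENTITY):
        print(url)
```

Because each depicted concept resolves to a Wikidata item, a re-user or archiving service can follow the identifiers recorded there to related resources elsewhere on the web.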

Users and User Stories

 
Schematic of the main user groups of Structured Commons

Roughly, nine groups or types of people use Structured Commons. Each type has distinct needs and workflows.

The needs of these users are outlined in user stories.

 
Outline of user stories, version September 29, 2017

The nine user types are:

  • Viewers: They mainly visit Commons to find particular free files. They most likely never edit. Within this type are Remixers (people who will download, edit, and remix the images in their own new creations) and Embedders (bloggers, reporters, etc. who embed unaltered work on other pages).
  • Casual Uploaders: They actively (on average at least once every month) upload one image at a time (which may or may not be their own). These are amateurs who probably take most photos with their phones.
  • Batch Uploaders: They upload 20 or more images at a time using a batch upload tool. They are typically, but not always, associated with a GLAM project.
  • Wikimedia Enhancers: They are users on various wiki projects who search for images to use on those projects (Wikipedia, Wikivoyage, etc.).
  • Photographers: They are professionals or semi-professionals who actively upload their own images to Commons. They generally use DSLR or mirrorless cameras but may use phones in a pinch.
  • Editors: They actively edit media information on Commons for the sake of accuracy, completeness, or maintaining site quality.
  • Curators: They actively categorize, group, and label images to make things organized and easy to find. They may also be involved in picking featured, quality, and valued images.
  • Tool Builders: They are volunteer developers who write and release software to supplement Commons functionality or fill in functional gaps.
  • Admins: They are users with special abilities to enforce the rules, and they act primarily as enforcers of the site's copyright policies and social norms. They may or may not actively donate media.

See also