Commons:Structured data/Media search/Future

Media Search has new features that take advantage of structured data on both Wikidata and Commons. The Structured Data development team believes that structured data provides many opportunities to increase the exposure and discoverability of the valuable files on Commons. However, structured data on Commons is still relatively new. Collective efforts by the community, Wikimedia Deutschland, and the Wikimedia Foundation have made strides, but there’s still much to discover.

Some of those discoveries have included frustrations with the capabilities of our existing data and its accompanying tool sets. The development teams have been disappointed that we couldn’t do more with our existing systems, and we know the community has felt that disappointment as well. For example, we can’t yet include all photos with depicts=“Macaw” in search results for “bird”, because the two concepts are very difficult to connect via Wikidata statements. We will continue to explore ways to make these connections in search, and we remain committed to pursuing multiple options to make the most of the resources we have within a reasonable time frame.

With that said, here’s a list of the technical and logistical approaches to search on Commons that we’re working on now or will work on in the near future:

  1. Computer-Aided Tagging facilitated by machine vision. This provides some of the data we use to return more relevant search results. We continue to monitor its usage and make improvements as we learn from data and user experiences, such as the recent update to boost the priority of images with fewer than two categories in the popular queue.
  2. Using the Wikidata taxonomy tree to expand search results where applicable. We’ve been doing some technical experiments, and there are some cases where this could work because the topic space is consistently modeled (dog and cat breeds, for example) and the query isn’t resource-intensive. However, there are many other topic spaces where results are noisy or coverage has large gaps. We’re working on guidelines for when to use the ontology and when to use something else, and on performant ways to do either.
  3. The Structured Data Across Wikimedia program is exploring ways to extend structured data work in Commons and across other wikis. We are excited to explore the ideas proposed in the grant. The grant proposal is aspirational, and primarily serves as a loose framework for our work in the next three years. The grant also predates the inclusion of AbstractWiki/Wikilambda into Wikimedia projects. This and other new projects will require us to adapt and restructure our plans as we go, but the first scheduled target of that grant plan (enhancing search) is the foundation of our work on Media Search.
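As a rough illustration of the taxonomy-tree idea in item 2, the sketch below expands a search concept to its subclasses before matching files by their depicts values. It is not how Media Search is implemented: the graph, the file names, and the functions `build_children_index`, `expand_concept`, and `search` are all hypothetical, and a small in-memory dictionary stands in for Wikidata. The `max_depth` and `max_terms` limits are assumptions meant to show one way of keeping an expansion from becoming resource-intensive.

```python
from collections import deque

# Hypothetical toy "subclass of" edges (child -> parents). Not real Wikidata data.
SUBCLASS_OF = {
    "macaw": ["parrot"],
    "parrot": ["bird"],
    "bird": ["animal"],
    "beagle": ["dog"],
    "poodle": ["dog"],
    "dog": ["animal"],
}

# Hypothetical files mapped to their depicts values.
FILES = {
    "Macaw.jpg": ["macaw"],
    "Beagle.jpg": ["beagle"],
    "Sparrow.jpg": ["bird"],
}


def build_children_index(subclass_of):
    """Invert child -> parents edges into a parent -> children index."""
    children = {}
    for child, parents in subclass_of.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)
    return children


def expand_concept(concept, children, max_depth=3, max_terms=50):
    """Return the concept plus its descendants, breadth-first.

    Expansion stops max_depth levels below the concept, or once
    max_terms terms have been collected, whichever comes first.
    """
    seen = {concept}
    order = [concept]
    queue = deque([(concept, 0)])
    while queue and len(order) < max_terms:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for child in sorted(children.get(node, [])):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append((child, depth + 1))
                if len(order) >= max_terms:
                    break
    return order


def search(concept, files, children, **limits):
    """Return files whose depicts values include the concept or a descendant."""
    terms = set(expand_concept(concept, children, **limits))
    return sorted(name for name, depicts in files.items() if terms & set(depicts))


CHILDREN = build_children_index(SUBCLASS_OF)

if __name__ == "__main__":
    print(expand_concept("bird", CHILDREN))
    print(search("bird", FILES, CHILDREN))
```

In this toy graph a search for “bird” reaches the macaw photo because macaw, parrot, and bird are linked by an unbroken chain of edges. Where a topic space is modeled inconsistently, a missing or misplaced edge would leave files out or pull unrelated ones in, which is the noise and coverage problem described in item 2.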