User talk:Jheald



Sjoerd de Bruin (talk) 19:40, 12 October 2016 (UTC)

Mapping Case study

It's been a long time, but I wanted to reach back out to you about your work on the British Library map geocoordinate project at the GLAM-Wiki 2015 conference, described at:

When last we talked, I was working for the Wikipedia Library program, so it was out of my scope to work with you to document the project.

However, since April, I have been responsible for supporting GLAM-Wiki more broadly. One of the efforts I am working on is documenting the knowledge we have about working with digitized collections. I would like to feature the Mechanical Curator project within the portal as a good example of data enrichment once collections are digitized (see the first draft of the portal at ).

Would you be interested in working with me on a case study? Astinson (WMF) (talk) 16:38, 19 October 2016 (UTC)

@Astinson (WMF): Hi. I've been rather away from Wiki for the last couple of months, so I'm aware there's a bit of maintenance needed on the geocoordinate side (as well as starting to get the maps uploaded to Commons, which I've still not got underway). Hope to get a bit back into the saddle sometime in the next few weeks.
Most of the Mech Curator images currently up are due to heroic work by user Metilsteiner, who did a lot of picture cropping, regrading, and description.
Not sure how well the full manual method scales -- it can be quite hard. The other approach is "load everything up and they will come", of which Fae has been a particularly successful exponent. He also has a very nice suite of dashboard scripts to track who has then added enrichment, and what.
Haven't read your portal yet, but he is certainly someone you should also be talking to. Jheald (talk) 22:23, 19 October 2016 (UTC)
Thanks for the reference to @Metilsteiner:: I would love to work with the two of you to describe the process and strategy used for describing the content. I think these kinds of structured campaigns to update metadata on large swaths of content make a lot of sense (and are very different from Fae's approach), because the principal focus is on enhancing the media, instead of just integrating the content into Commons. Even if it's not replicable, I want to make sure that we document the philosophy and approach, so that other folks can design structures for similar crowdsourcing projects. The case study could look like this Argentine Digitization Project or this Catalan Libraries Project. The focus is on showing folks what the possibilities look like, not necessarily what is repeatable (one of the problems I am noticing at the moment is that most of our affiliates who do GLAM work don't have a sense of the scale or energy needed to refine the uploaded media, or of how to organize that). Astinson (WMF) (talk) 14:35, 20 October 2016 (UTC) — Preceding unsigned comment added by Jan Dittrich (WMDE) (talk • contribs) 12:55, 26 October 2016 (UTC)

Talking about Structured Data on Commons

Hello James,

To improve functionality for searching, organizing and maintaining media on Commons, the Wikidata team wants to enable the use of structured data on Commons. For this, we would like to learn from community members like you how you do these tasks on Commons.

If you would like to participate and support us, please let me know (here on the talk page or via mail to jan.dittrich AT) and we will set up a date together to talk for about 30 minutes. I often use Google Hangouts for this; however, I am open to any other possibility (e.g. Skype, WebRTC…).

Kind Regards, --Jan Dittrich (WMDE) (talk) 15:40, 24 October 2016 (UTC)

@Jan Dittrich (WMDE): Hi Jan. Thanks for getting in touch. I've been a bit focussed on other things than wiki for the last few months. If I can have some time to get back up to speed with how things have been developing (and to get back into the issues), then I would be very happy to have a chat, or join a group discussion. But I do need some time to do some homework first! Jheald (talk) 19:13, 25 October 2016 (UTC)
Hi James, Thanks for your kind reply! My main interest lies in your workflows and goals you want to achieve (So right now, I imagine parts of the conversation to be like »I often need to upload this and that images and they should be used … however, it is difficult because …«). If you feel that this is something you want/could do without the homework (which sounded more technical), it should pose no problem in case you skip it. But do whatever you think is best.
PS.: While writing this answer my browser crashed and when posting this afterwards it affected the whole page, not just the section. I restored it. Sorry for the brief mess :-( --Jan Dittrich (WMDE) (talk) 13:00, 26 October 2016 (UTC)
Hey James :) I don't think anything has really changed from what you already know so it should be fine without doing any additional homework if you want. --Lydia Pintscher (WMDE) (talk) 15:13, 16 November 2016 (UTC)

Gallery / Category having standard templates

After reading through your js discussion to d:Wikidata:Project chat; rereading Commons:Wikidata/Commons-Wikidata sitelinks for the first time in a while; doing gallery <-> WD maintenance (noting I am generally not a gallery person and still don't know why we have galleries); I am wondering whether there needs to be a standard template that sits in all galleries and maybe (another?) in all categories. If we did that then we can address some of the issues about linking now that we have the ability to suck related links/interwikis.

Having a bot that cycles through and appends or prepends a template into pages in either of those two namespaces seems to be a particularly easy task. I know that this would not be an issue-free process, however, we have seen header templates work at other sites, we know that infoboxes are now accepted practice, so why aren't we looking at something auto-parameter'd to help us in this task set. Waiting for WD to meet the needs here will be tiresome, and we need to leverage what we do have. Of course it would be even better to have something in common.js that just auto-populated templates per namespace, though that may upset people too much.

@Multichill: as another agent provocateur and all round thinker on the subject matter.

Am I missing something bleeding obvious here?  — billinghurst sDrewth 04:57, 23 January 2017 (UTC)

I'm not sure how happy I should be with the "agent provocateur" designation.....
I see all the steps we're doing as an evolution leading to more structured data on Commons. User:Jheald/wdcat.js is a typical client side approach to get things more connected, adding templates is more a server side approach.
Some day in the future we'll have structured data and we can use that to describe files instead of having categories. We're not there yet so in the meantime we can still work on improving categories towards next generation categories. On Wikidata we already have the properties to relate categories and topics: category's main topic (P301), topic's main category (P910), category combines topics (P971) and Commons category (P373) to fill the gap of not having a category item. You could create a Lua-based template that can be added to a gallery/category that displays some useful information in your local language based on Wikidata. Deploy that on a small number of pages, get feedback, improve and expand. Multichill (talk) 14:15, 23 January 2017 (UTC)
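As a rough illustration of how those properties chain together, here is a minimal Python sketch (not the Lua template itself). Only the property numbers come from the comment above; the item IDs, the category name and the `EXAMPLE_CLAIMS` dictionary are made-up stand-ins for data that would really be read from Wikidata, and the function names are hypothetical.

```python
# Property numbers are the ones named above; everything else is hypothetical.
P301 = "P301"  # category's main topic
P910 = "P910"  # topic's main category
P373 = "P373"  # Commons category


def topic_for_category(item_id, claims):
    """Follow P301 from a category-like item to its article-like item.

    Returns the item itself when it has no P301 claim.
    """
    return claims.get(item_id, {}).get(P301, item_id)


def commons_category(item_id, claims):
    """Return the P373 value for an item, also checking items linked via P910/P301."""
    own = claims.get(item_id, {})
    if P373 in own:
        return own[P373]
    for prop in (P910, P301):
        linked = own.get(prop)
        if linked is not None and P373 in claims.get(linked, {}):
            return claims[linked][P373]
    return None


# Made-up example data: one category-like item and one article-like item.
EXAMPLE_CLAIMS = {
    "Q_CATEGORY": {P301: "Q_TOPIC"},
    "Q_TOPIC": {P910: "Q_CATEGORY", P373: "Example category"},
}

if __name__ == "__main__":
    print(topic_for_category("Q_CATEGORY", EXAMPLE_CLAIMS))
    print(commons_category("Q_CATEGORY", EXAMPLE_CLAIMS))
```

A template along these lines would start from whichever item the page is linked to and hop across P301 or P910 to reach the item that actually holds the data of interest.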

Time for some bot-driven templating?

Hi @Billinghurst:, thanks for dropping by!
A couple of things I want to respond to. First the idea of bot-driven templating more generally.
One of the things that Village Pump post advertising the wdcat.js script was certainly trying to do was to get people to think about adding Wikidata-link templates to Commons; and one of the questions that did occur to me was whether the time may have come to fire up the bots, and template all-out as many Commons cats as we think we can describe with Wikidata.
I do very much think that the best thing we can be doing now as a community to prepare the way for structured data is to be plumbing in as many Commons <-> Wikidata links as we can. Two things we can do towards that are getting more templates in, and making those templates more Wikidata-driven. I think it would be good to get to the point that such templates are so widespread and so established that people then start to ask why their favourite category doesn't have one, and then go ahead and plumb it in themselves. Given that we have something like 1.5 million categories with P373s, there's clearly big potential for a bot drive to add such templates. If that then could mobilise people to identify more Wikidata matches, ie more P373s, that is what I think is then going to be so helpful when we start trying to topic-tag the subjects and attributes of individual images.
But there is a thing that gives me pause, which is this:
Suppose we have a P373 from a Wikidata item for (say) a painter "Fred Jones" pointing to a particular Commons Category. We could get a bot to simply slam a {{Creator possible}} on the category. But is that the right thing to do, without a human eyeballing?
What worries me is that {{Creator possible}} makes the categorical statement that "This is a biographical category related to a single person..." But do we know that that is actually true of all the items in the category? Or that it will continue to be?
Thinking about eg User:Fae's outstanding upload from the Wellcome Library collection, one of the things that I think we all know about big uploads is that Commons categorisation is bloody hard. One of the few things it is possible to (reasonably confidently) auto-identify with a view to categorisation are names. I think Fae was quite right to follow that possibility, and so any image metadata that mentioned a Fred Jones in any context would get added to Category:Fred Jones -- regardless of whether it was our painter or not. (@Fae: that may be an over-simplification, please correct if so).
So I wonder if we shouldn't treat all categories that are defined just by Forename/Surname as actually diffusion/disambiguation categories, and insist that when we do indeed have a category which is indeed "a biographical category related to a single person", we should demand that that category should have more than just a Forename/Surname name. This thought in part is inspired by the Art UK project (formerly called Your Paintings), that we track with Wikidata property d:P1367. When they re-booted their site last autumn from Your Paintings to Art UK, they took the opportunity to change the majority of their artist identifiers, to specifically include dates as accurately as they had them, eg martini-simone-c-12841344, rather than simone-martini. I wonder if we shouldn't do the same; so that before doing any big bot run to make categories with {{Creator possible}}, perhaps we should first move the content that we are confident is by that creator from eg "Category:Fred Jones" to eg "Category: Fred Jones (1847-1903)", and make this kind of naming the standard convention for single-person biographical categories on Commons.
Any thoughts (@Fae, Multichill:) ? Is this suggestion for more detailed category names for identified individuals a proposal worth trying to push through into a Commons standard? Jheald (talk) 23:20, 24 January 2017 (UTC)
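For concreteness, the proposed naming could be sketched in a few lines of Python. This is only an illustration of the suggestion above, not an existing Commons tool: the function name is hypothetical, and returning `None` when a date is missing is an assumption standing in for "leave this one for a human to look at".

```python
def dated_category_name(name, birth_year=None, death_year=None):
    """Build a name such as 'Category:Fred Jones (1847-1903)'.

    Assumption: if either year is unknown, return None so that the bare
    Forename/Surname category stays a diffusion/disambiguation category
    until a human has checked it.
    """
    if birth_year is None or death_year is None:
        return None
    return f"Category:{name} ({birth_year}-{death_year})"


if __name__ == "__main__":
    print(dated_category_name("Fred Jones", 1847, 1903))
    print(dated_category_name("Fred Jones", 1847))
```

A bot run would then only add {{Creator possible}} to categories for which this returns a name, and would skip the rest.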
One other aspect to think about: User:Multichill is right to draw attention to the distinction between templates and wdcat.js as classic client-side/server-side approaches to achieve the same thing.
The catch is that SPARQL endpoints are some of the most resource-expensive servers you can run. Each wdcat.js SPARQL call is taking about 0.17 seconds of server time to run -- for each Commons category a user is loading. Which is fine, so long as the only users are the intersection of people who are die-hard Commons and die-hard Wikidata volunteers. But it's not something that (I believe) would scale, at least not in its present form, to provide for every casual Wikipedia user who happens to click through to a Commons category. Of course if one did want to go down that server-side route, there are efficiencies that could be made, eg maybe replacing the general-purpose SPARQL call with a dedicated lookup just for this property. And doing things centrally via a central server/client approach means of course that changes can be made without having to update millions of templates.
But I suspect that, for mass use, and certainly for the time being, templates probably are the easiest way to go to bring the joys of Wikidata information to the masses, rather than trying to roll out something like wdcat.js more widely, however useful I think it is. Jheald (talk) 01:21, 25 January 2017 (UTC)
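To put a rough number on that scaling worry, here is a back-of-envelope Python sketch. The 0.17 seconds per call is the figure quoted above; the daily view counts passed in are purely hypothetical inputs, not measured traffic, and the function name is made up for the example.

```python
SECONDS_PER_CALL = 0.17   # server time per wdcat.js SPARQL call, as quoted above
SECONDS_PER_DAY = 86_400


def average_busy_workers(category_views_per_day, seconds_per_call=SECONDS_PER_CALL):
    """Average number of query workers kept fully busy by this load.

    One SPARQL call per category view is assumed, with no caching.
    """
    return category_views_per_day * seconds_per_call / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical traffic levels, for illustration only.
    for views in (10_000, 1_000_000, 100_000_000):
        print(views, round(average_busy_workers(views), 2))
```

The load grows linearly with page views, which is why a per-view SPARQL call looks fine for a small group of volunteers but not for every casual reader; a template evaluated on the server and cached avoids paying that cost on each view.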
And of course, the more that we can write templates to draw on the central Wikidata store

The category <-> category-item, gallery <-> article item thing

I think this is the other prong of what you were asking about, @Billinghurst:. Some thoughts to follow. Jheald (talk) 23:22, 24 January 2017 (UTC)

First up, it seems a very long time ago since I wrote Commons:Wikidata/Commons-Wikidata sitelinks, and I wouldn't be nearly as strong today with the statement

"if the structure is to remain stable, predictable and traversable, it is essential that the category-to-category and article-to-article rule is observed."

The facts on the ground are that this simply isn't being observed; and probably (I suspect) actually it may not matter very much.

Taking the facts on the ground first, the most recent stats I know of are these from December 2015. (I'm meaning to update the table, but for various reasons I was pretty much away-from-Wiki for about the last 6 months, so there are a few things on my to-do list to bring up to date). The take-away, I think, is that there really hasn't been much consistent following of the category-to-category and article-to-article rule; also, looking at historical trends at that time, the strongest sitelink growth was overwhelmingly in Commonscat<->Article sitelinks, rather than Commonscat<->Category or Gallery<->Article.

The reasons for this are probably straightforward enough to understand, and (IMO) may be something that just needs to be accepted.

So, on to the second question, are they actually a problem, these cross-namespace links?

Contrary to what I wrote in December 2014, I suspect that actually they are probably not really a problem.

Now that we have "arbitrary access", there is no over-riding need for a page to be sitelinked to a Wikidata item in order for a template to work. So long as the Q-number is specified for the relevant data-item, a template can draw from anywhere. In fact this is just as well, because for a {{Creator}} template or {{Authority control}} template sitting on a category, it's the article-like wikidata item rather than the category-like wikidata item that is more likely to hold the information of interest.

(more to follow) Jheald (talk) 23:57, 24 January 2017 (UTC)

I have posted some updated numbers at Wikidata Village Pump. Over the last year the trend in new sitelinks between Commons categories and Wikidata has been almost 4 to 1 in favour of links to article-like items over links to category-like items. I've suggested over there that "perhaps the time has come to accept this as mostly harmless". Jheald (talk) 23:49, 26 January 2017 (UTC)

Category -> query without the query

Category:Grade I listed buildings in Bedfordshire -> Category:Grade I listed buildings in Bedfordshire (Q8497784) -> list related to category (P1753) -> Grade I listed buildings in Bedfordshire (Q5591762), which has the following data in is a list of (P360):

  • instance of (P31) -> building (Q41176)
  • located in the administrative territorial entity (P131) -> Bedfordshire (Q23143)
  • heritage status (P1435) -> Grade I listed building (Q15700818)

which looks an awful lot like the "wdt:P131+ wd:Q23143 ; wdt:P1435 wd:Q15700818 ; wdt:P31?/wdt:P279* wd:Q41176" you have on the category. With a bit of Lua magic you don't have to put a query on every category. That would be awesome. Multichill (talk) 15:34, 23 February 2017 (UTC)

Put it at Commons_talk:Structured_data/Overview#Experimental_.22category_contains.22_template, seems to be a more central venue. Multichill (talk) 15:38, 23 February 2017 (UTC)
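The mapping from those "is a list of" (P360) qualifiers to a query pattern can be sketched as below. This is a Python illustration of the idea rather than the Lua module being suggested; the function name and the `PATH_OVERRIDES` table are hypothetical, with the two property paths copied from the pattern quoted above and a plain `wdt:` predicate assumed for every other property.

```python
# Property paths copied from the quoted pattern; any property not listed
# here is assumed to map to a plain wdt: predicate.
PATH_OVERRIDES = {
    "P31": "wdt:P31?/wdt:P279*",
    "P131": "wdt:P131+",
}


def pattern_from_qualifiers(qualifiers, subject="?item"):
    """Turn (property, item) pairs into a single SPARQL triple pattern."""
    parts = []
    for prop, value in qualifiers:
        path = PATH_OVERRIDES.get(prop, f"wdt:{prop}")
        parts.append(f"{path} wd:{value}")
    return f"{subject} " + " ; ".join(parts) + " ."


# The three qualifiers listed above for Q5591762.
BEDFORDSHIRE_GRADE_I = [
    ("P31", "Q41176"),
    ("P131", "Q23143"),
    ("P1435", "Q15700818"),
]

if __name__ == "__main__":
    print(pattern_from_qualifiers(BEDFORDSHIRE_GRADE_I))
```

Because the pattern is derived from data already on the list item, the category page itself would not need to carry a hand-written query.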

Science-related coordination


We had some intersections in en.wikipedia, namely (surprisingly) in w:Talk:Qere and Ketiv, as well as around w:Spinor and, possibly, articles on physical constants and units. Could you, when time permits, review this draft, please? I would like to see your comments on the associated talk page. Incnis Mrsi (talk) 13:27, 3 July 2017 (UTC)

Structured Data on Commons Newsletter, July 19, 2017

Welcome to the newsletter for Structured Data on Wikimedia Commons! You can update your subscription to the newsletter. Do inform others who you think will want to be involved in the project!

Structured Data on Wikimedia Commons?

The millions of files on Wikimedia Commons are described with a lot of information or (meta)data. With the project Structured Data on Wikimedia Commons, this data is structured more, and is made machine-readable. This will make it easier to view, search (also multilingually), edit, organize and re-use the files on Commons.

In early 2017, the Sloan Foundation funded this project (see documentation). Development takes place in 2017–2020. It involves staff from the Wikimedia Foundation and Wikimedia Deutschland (WMDE) and many volunteers. To achieve this, Wikibase support is added to Wikimedia Commons. Wikibase is the technology that is also used for Wikidata.

Recent developments: groundwork

  • A new and crucial technical step (federation) now makes it possible to reference data from one Wikibase website in another. Because of this, it will be possible to use Wikidata's items and properties to describe media files on Commons.
  • Another important piece of groundwork is under development: so-called Multi-Content Revisions. This feature allows structured data to be stored alongside wiki text, so that one wiki page can contain several types of content.

Team updates

  • Amanda Bittaker was hired as Program Manager for Structured Data on Wikimedia Commons. Amanda will take care of the overall management of the project.
  • Sandra Fauconnier (known as Spinster in her volunteer capacity) is the new Community Liaison. She will support the collaboration between the communities (Commons, Wikidata, GLAM) and the product development teams at the Wikimedia Foundation and Wikimedia Deutschland.
  • We have open positions for a UX designer and a Product Manager!

Talking with communities and allies

  • Long-term feedback from GLAMs. Besides the Wikimedia community, many external cultural and knowledge institutions (GLAMs - Galleries, Libraries, Archives and Museums) are interested in Structured Data on Commons and are willing to provide feedback on the long-term plans for the project. Alex Stinson, GLAM strategist at the Wikimedia Foundation, is currently in contact with Europeana, DPLA, the Smithsonian and the National Archives of the United States. Alex is also looking for other GLAM institutions who might be able to advise on the long term. If you know of an institution or partner that may be appropriate for consultation, do get in touch with Alex.
  • Jonathan Morgan, design researcher, is starting to work on two projects:

What comes next?

  • The Structured Data on Commons team meets in the week after Wikimania to lay the groundwork for the next steps. This includes new backend development and design work, for better and more clear integration of the structured data in pages on Wikimedia Commons.
  • The project's information pages on Wikimedia Commons will receive a long overdue update in the upcoming months. The team will also work on more and better communication channels. Feedback, wishes and tips are welcome at the project's general talk page.

Get involved

Many greetings from SandraF (WMF) (talk), Community Liaison for this project! 13:55, 19 July 2017 (UTC)

Structured Commons newsletter, October 25, 2017

Welcome to the newsletter for Structured Data on Wikimedia Commons! You can update your subscription to the newsletter. Do inform others who you think will want to be involved in the project!

Community updates
Things to do / input and feedback requests
Presentations / Press / Events
Audience at Structured Commons design discussion, Wikimania 2017
Team updates
The Structured Commons team at Wikimania 2017

Two new people have been hired for the Structured Data on Commons team. We are now complete! :-)

  • Ramsey Isler is the new Product Manager of the Multimedia team.
  • Pamela Drouin was hired as User Interface Designer. She works at the Multimedia team as well, and her work will focus on the Structured Commons project.
Partners and allies
  • We are still welcoming (more) staff from GLAMs (Galleries, Libraries, Archives and Museums) to become part of our long-term focus group (phabricator task T174134). You will be kept in the loop of the project, and receive regular small surveys and requests for feedback. Get in touch with Sandra if you're interested - your input in helping to shape this project is highly valued!

Design research is ongoing.

  • Jonathan Morgan and Niharika Ved have held interviews with various GLAM staff about their batch upload workflows and will finish and report on these in this quarter. (phabricator task T159495)
  • At this moment, there is also an online survey for GLAM staff, Wikimedians in Residence, and GLAM volunteers who upload media collections to Wikimedia Commons. The results will be used to understand how we can improve this experience. (phabricator task T175188)
  • Upcoming: interviews with Wikimedia volunteers who curate media on Commons (including tool developers), talking about activities and workflows. (phabricator task T175185)

In Autumn 2017, the Structured Commons development team works on the following major tasks (see also the quarterly goals for the team):

  • Getting Multi-Content Revisions sufficiently ready, so that the Multimedia and Search Platform teams can start using it to test and prototype things.
  • Determine metrics and metrics baseline for Commons (phabricator task T174519).
  • The multimedia team at WMF is gaining expertise in Wikibase, and unblocking further development for Structured Commons, by completing the MediaInfo extension for Wikibase.
Stay up to date!

Warmly, your community liaison, SandraF (WMF) (talk)

Message sent by MediaWiki message delivery - 14:26, 25 October 2017 (UTC)

Re: Categories on Commons

Hello Jheald! Just wanted to give you a heads up that I have seen your response on the Village Pump. I'm actually taking a few days off at the moment, so response will be slower, but it's on the radar. Thanks! SandraF (WMF) (talk) 08:52, 6 November 2017 (UTC)

Structured Commons focus group update, December 11, 2017

Hello! You are receiving this message because you signed up for the community focus group for Structured Commons :-)

Later this week, a full newsletter will be distributed, but you are the first to receive an update on new requests for feedback.

Three requests for feedback
  1. We received many additions to the spreadsheet that collects important Commons and Wikidata tools. Thank you! Now, you can participate in a survey that helps us understand and prioritize which tools and functionalities are most important for the Wikimedia Commons and Wikidata communities. The survey runs until December 22. Here's some background.
  2. Help the team decide on better names for 'captions' and 'descriptions'. You can provide input until January 3, 2018.
  3. Help collect interesting Commons files, to prepare for the data modelling challenges ahead! Continuous input is welcome there.

Warmly, your community liaison SandraF (WMF) (talk)

Message sent by MediaWiki message delivery (talk) - 16:40, 11 December 2017 (UTC)

Structured Commons newsletter, December 13, 2017

Welcome to the newsletter for Structured Data on Wikimedia Commons! You can update your subscription to the newsletter. Do inform others who you think will want to be involved in the project!

Community updates
Things to do / input and feedback requests
A multi-licensed image on Wikimedia Commons, with a custom {{EthnologyItemMHNT}} Information template. Do you also know media files on Commons that will be interesting or challenging to model with structured data? Add them to the Interesting Commons files page.
Presentations / Press / Events
Presentation about Structured Commons and Wikidata, at WikidataCon in Berlin.
  • Sandra presented the plans for Structured Commons during WikidataCon in Berlin, on October 29. The presentation focused on collaboration between the Wikidata and Commons communities. You can see the full video here.
Partners and allies
  • We are still welcoming (more) staff from GLAMs (Galleries, Libraries, Archives and Museums) to become part of our long-term focus group (phabricator task T174134). You will be kept in the loop of the project, and receive regular small surveys and requests for feedback. Get in touch with Sandra if you're interested - your input in helping to shape this project is highly valued!
  • Research findings from interviews and surveys of GLAM project participants are being published to the research page. Check back over the next few weeks as additional details (notes, quotes, charts, blog posts, and slide decks) will be added to or linked from that page.
  • The Structured Commons team has written and submitted a report about the first nine months of work on the project to its funders, the Alfred P. Sloan Foundation. The 53-page report, published on November 1, is available on Wikimedia Commons.
  • The team has started working on designs for changes to the upload wizard (T182019).
  • We started preliminary work to prototype changes for file info pages.
  • Work on the MediaInfo extension is ongoing (T176012).
  • The team is continuing its work on baseline metrics on Commons, in order to be able to measure the effectiveness of structured data on Commons. (T174519)
  • Upcoming: in the first half of 2018, the first prototypes and design sketches for file pages, the UploadWizard, and for search will be published for discussion and feedback!
Stay up to date!

Warmly, your community liaison, SandraF (WMF) (talk)

Message sent by MediaWiki message delivery - 16:32, 13 December 2017 (UTC)

Structured Commons - Design feedback request: Multilingual Captions

Hello! You are receiving this message because you signed up for the community focus group for Structured Data on Wikimedia Commons.

The Structured Data on Commons team has a new design feedback request up for Multilingual Captions support in the Upload Wizard. Visit the page for more information about the potential designs. Discussion and feedback is welcome there.

On a personal note, you'll see me posting many of these communications going forward for the Structured Data project, as SandraF transitions into working on the GLAM side of things for Structured Data on Commons full time. For the past six months she's been splitting time between the two roles (GLAM and Community Liaison). I'm looking forward to working with you all again. Thank you, happy editing. Keegan (WMF) (talk) 15:09, 24 January 2018 (UTC)

Structured Data feedback - What gets stored where (Ontology)


There is a new feedback request for Structured Data on Commons, regarding what metadata from a file gets stored where. Your participation is appreciated.

Happy editing to you. Keegan (WMF) (talk) 22:58, 15 February 2018 (UTC)

First structured licensing conversation on Commons


The first conversation about structured copyright and licensing for Structured Data on Commons has been posted, please come by and participate. The discussion will be open through the end of the month (March). Thank you. -- Keegan (WMF) (talk) 17:26, 16 March 2018 (UTC)