Structured Data on Commons Newsletter - Fall 2018 edition

Welcome to the newsletter for Structured Data on Wikimedia Commons! You can update your subscription to the newsletter. Do inform others who you think will want to be involved in the project!

Community updates
Things to do / input and feedback requests


Since the last newsletter:

Presentations / Press / Events
Partners and allies
  • The info portal on Structured Commons now includes a section on GLAM (Galleries, Libraries, Archives and Museums).
  • We are currently planning the first GLAM pilot projects that will use structured data on Wikimedia Commons. One project has already started: the Swedish Heritage Board is researching and developing a prototype tool to provide improved metadata (translations, data additions...) from Wikimedia Commons back to the source institution. Read the project brief.
  • The documentation for batch uploads of files to Wikimedia Commons will be improved in 2019, as part of preparing for Structured Data on Wikimedia Commons. To prepare, the GLAM team at the Wikimedia Foundation wants to understand better which types of documentation you already use, and how you like to learn new GLAM-Wiki skills and knowledge. Fill in a short survey to provide input!
Stay up to date!

-- Keegan (WMF) (talk)

Message sent by MediaWiki message delivery - 17:58, 7 December 2018 (UTC)

Captions in January

The text of the previous message from today says captions will be released in November. January is the correct month. My apologies for the potential confusion. -- Keegan (WMF) (talk) 20:43, 7 January 2019 (UTC)

Structured Data - file captions coming this week (January 2019)

My apologies if this is a duplicate message for you, it is being sent to multiple lists which you may be signed up for.

Hi all, following up on last month's announcement...

Multilingual file captions will be released this week, on either Wednesday, 9 January or Thursday, 10 January 2019. Captions are a feature to add short, translatable descriptions to files. Here are some links you might want to follow before the release, if you haven't already:

  1. Read over the help page for using captions - I wrote the page outside Commons because captions are available to any MediaWiki user, so feel free to host/modify a copy of the page here on Commons.
  2. Test out using captions on Beta Commons.
  3. Leave feedback about the test on the captions test talk page, if you have anything you'd like to say prior to release.

Additionally, there will be an IRC office hour on Thursday, 10 January with the Structured Data team to talk about file captions, as well as anything else the community may be interested in. Date/time conversion, as well as a link to join, are on Meta.

Thanks for your time, I look forward to seeing those who can make it to the IRC office hour on Thursday. -- Keegan (WMF) (talk) 21:09, 7 January 2019 (UTC)

File:Vietnam. Three Fighter Squadron 161 (VF-161) F-4D Phantom II fighter aircraft from the attack aircraft carrier USS Midway (CVA-41) and three Corsair II attack aircraft from the attack aircraft carrier USS America ((...) - NARA - 558517.gif

Hi, I don't know what your aim with this bot is, but most NARA files have been uploaded already in TIF-quality. I see no reason to upload the same files in low GIF-quality. I would be glad if you could either upload high-res files or at least exclude low quality duplicates of existing files. Thank you. Cobatfor (talk) 17:43, 11 January 2019 (UTC)

@Cobatfor: I am aware that this is unfortunately an issue. I had access to a small number of high-resolution TIFFs several years ago and uploaded them all. This was maybe 100,000 files. There are currently over 50 million files in the NARA catalog, so it is not the case that most have been uploaded in TIFF already. The current bot is uploading directly from the catalog, unlike the TIFF originals that were stored on a drive. There may be a small number of duplicates resulting from this process, and I would like to clean that up eventually. It is difficult to exclude these beforehand, because Wikimedia Commons does not have structured data (can't easily query on the identifier field to detect if it exists), and it is not really possible to programmatically determine that a version of a file already exists on Commons with the bot we have. I will need to write a different script to flag any items with the same identifier in order to deduplicate. Also, it's hard to tell with your example, but for many of these, the GIF is not just a low-resolution version of the TIFF. The TIFF is the master scan file, while the GIF may have had color levels adjusted, been cropped, or other edits made prior to being made catalog-ready. I have actually had trouble getting them deleted in the past, because Commons admins will not speedy-delete a duplicate if there has been any editing done. I have been required to write a deletion request for each one, which costs me a lot more time, and makes it less of a priority for me. Dominic (talk) 15:24, 30 January 2019 (UTC)
Thanks for the explanation. It shows one of the difficulties that I see with bot uploads; another is, e.g., categorization. But it also sheds light on the duplication issue, where I have had similar experiences. Cheers Cobatfor (talk) 15:52, 30 January 2019 (UTC)
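
For illustration only, the deduplication script mentioned above could start from something like the following sketch. It is not the bot's actual code. It assumes that uploaded file titles end in " - NARA - <identifier>.<extension>", as the titles on this page do, and the function name and sample titles are hypothetical. It only flags candidates: as noted above, a GIF may be an edited derivative of the TIFF rather than an exact duplicate, so each group still needs human review.

```python
import re
from collections import defaultdict

# Assumed title pattern, based on names such as "... - NARA - 558517.gif":
# the identifier is the run of digits just before the file extension.
NARA_ID = re.compile(r" - NARA - (\d+)\.\w+$")


def flag_shared_identifiers(titles):
    """Group file titles by NARA identifier and return only the
    identifiers that appear on more than one file."""
    by_id = defaultdict(list)
    for title in titles:
        match = NARA_ID.search(title)
        if match:  # titles without a recognizable identifier are skipped
            by_id[match.group(1)].append(title)
    return {nara_id: files for nara_id, files in by_id.items() if len(files) > 1}


# Hypothetical sample titles, for demonstration only.
sample_titles = [
    "File:Example scan - NARA - 558517.gif",
    "File:Example scan - NARA - 558517.tif",
    "File:Another example - NARA - 558973.jpg",
]

for nara_id, files in flag_shared_identifiers(sample_titles).items():
    print(nara_id, files)
```

Grouping on the identifier in the title is a workaround for the lack of a queryable identifier field; it will miss any file whose title was truncated or renamed so that the identifier no longer appears at the end.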

US National Archives bot down? and request

I notice the US National Archives bot hasn't uploaded anything since October 2018. Has it been deactivated? Also, are there any plans to upload .jpg versions of the many .tiff master files? I know .tiff files are preferred for file fidelity, but .jpg are more convenient for displaying on Wikimedia sites. Also, may I request a bot-assisted upload of the NARA series Gerald R. Ford White House Photographs, compiled 08/09/1974 - 01/20/1977? The corresponding Commons category only has about 80 images, while the NARA collection appears to have over 1,000. Note that some previously uploaded files that lack complete bot-generated metadata, e.g. this one, are more difficult to categorize. Thanks! (Update, I just read your user page, and understand if you can't contribute or respond right now due to the ongoing Federal shutdown. All the best! Take care.)--Animalparty (talk) 00:07, 15 January 2019 (UTC)

@Animalparty: As you guessed, I have been furloughed and I am catching up on my work inbox and other messages now. Thank you for your patience! The NARA upload bot is not really operating continuously; I operate it when I have the time and resources to do so, and sometimes when I do run it, I run into issues that I need to fix. I am trying to upload on a more regular basis, but it's been a while, between the holidays and the shutdown. I can certainly prioritize that series, and let you know when I get to it. Also, regarding the TIFFs and JPGs, I uploaded a large number of TIFFs early on in our project, because we had them stored on a hard drive in the office. For these, there should usually be a JPG version already, unless there were some that were missed. For all the rest of the uploads, and future ones going forward, all I will be able to upload is the files from the online catalog. There are a very few TIFFs, but most of these are JPGs (or some low-quality GIFs, unfortunately). If you have been working with NARA images, or plan to, I would love to hear more about what you're working on. Thanks! Dominic (talk) 15:00, 30 January 2019 (UTC)

File:"Dance to the Talking Drum", 1963 - NARA - 558973.jpg

File:"Dance to the Talking Drum", 1963 - NARA - 558973.jpg has been listed at Commons:Deletion requests so that the community can discuss whether it should be kept or not. We would appreciate it if you could go to voice your opinion about this at its entry.

If you created this file, please note that the fact that it has been proposed for deletion does not necessarily mean that we do not value your kind contribution. It simply means that one person believes that there is some specific problem with it, such as a copyright issue.

Please remember to respond to and – if appropriate – contradict the arguments supporting deletion. Arguments which focus on the nominator will not affect the result of the nomination. Thank you!

Buckaroo bob 91 (talk) 00:37, 26 February 2019 (UTC)
