Commons:Village pump/Proposals/Archive/2023/10

Unified license for government websites of Ukraine

Recently, when visiting government websites of Ukraine, almost everywhere at the bottom of the pages you can find the following description: All site materials are available under a Creative Commons Attribution 4.0 International license, unless otherwise noted. Is Wikimedia Commons required to have a single template for sites with the gov.ua domain, while the list of resources will be clearly monitored by administrators? MasterRus21thCentury (talk) 18:27, 4 October 2023 (UTC)

Leonore Template ?

Hello, I quite often use the Gallica Template to source my uploads. Is there anything like that for the Léonore Database? If not, could this be done? Thanks in advance. William C. Minor (talk) 05:14, 15 October 2023 (UTC)

 
We have templates for all types of sources (e.g. France-related ones), but I couldn't find one for Léonore. If this is a source we regularly use, it certainly makes sense to create a template. --rimshottalk 07:18, 3 November 2023 (UTC)

Increase of file size limit on Commons for future-proof purposes

Hey folks!

The current file size limit is 4 GiB (approx. 4.3 gigabytes), see COM:MAXSIZE. I want to propose an increased file size limit. The limit was increased in April 2016 from 2 to 4 GiB.

Since then, file sizes have grown due to higher video resolutions.

Here are some examples of when files exceed the 4 GiB threshold:

  • 4K YouTube videos after 25-35 minutes
  • FHD DSLR/DSLM videos after 8-15 minutes
  • 4K DSLR/DSLM videos after 2.5-8 minutes
  • 8K DSLR/DSLM videos after 1.25-4 minutes
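As a rough cross-check of the durations listed above, the time until a recording reaches the limit can be estimated from its bitrate. This is only a minimal sketch: it assumes a constant bitrate, and the bitrates in the loop are hypothetical values chosen for illustration, not figures given in this discussion. The function name `minutes_until_limit` is likewise made up for this example.

```python
GIB = 2**30  # bytes in one gibibyte

def minutes_until_limit(bitrate_mbit_s: float, limit_bytes: int = 4 * GIB) -> float:
    """Minutes of constant-bitrate video before the file reaches limit_bytes."""
    bytes_per_second = bitrate_mbit_s * 1_000_000 / 8
    return limit_bytes / bytes_per_second / 60

# 4 GiB expressed in decimal gigabytes (the "approx. 4.3 gigabytes" above)
limit_in_gb = 4 * GIB / 1e9

# Hypothetical bitrates, for illustration only
for mbit in (20, 100, 400):
    print(f"{mbit} Mbit/s: about {minutes_until_limit(mbit):.1f} minutes until 4 GiB")
```

Real cameras and encoders use variable bitrates, so actual durations will scatter around such estimates.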

Videos, for example, exceed the 4 GiB size limit quite quickly, but high-resolution scans of 3D objects from organizations like the Smithsonian Institution may also be larger than the limit (and file splitting is very problematic there). I have a large aerial image of Munich that is also too large right now, but offers many details. Over time, more and more files will come into conflict with this limit as cameras etc. become more capable. I would like to propose an increase to 32 or 64 GiB if possible. Once colored meshes are available on Commons, a higher file size limit would also be much appreciated.

What do you think?

Greetings and thank you a lot, --PantheraLeo1359531 😺 (talk) 17:17, 9 October 2023 (UTC)

This has already been requested multiple times, but till now the WMF team did not work on a solution for the current technical limitations. GPSLeo (talk) 18:19, 9 October 2023 (UTC)
Thank you for mentioning it, I hope this issue will be addressed soon :) --PantheraLeo1359531 😺 (talk) 20:21, 9 October 2023 (UTC)
  Comment It was recently made possible to upload files up to 5 GB. I don't know when this will be live. See phab:T191805. Yann (talk) 17:29, 2 November 2023 (UTC)
While the file size limit is likely to increase a bit in the short term (a year?) to 5 GB (as mentioned by Yann), a further rise is very unlikely to happen any time soon. Reasons:
1. It costs a TON of money. Big file handling is expensive. Much more expensive than text. Not just in storage (originals, backups), but also in network cost (all files have to be moved over the internal network), hardware costs (plain server capacity). There is currently 440TB of originals, 22TB of original video. And then a not exactly known amount of derivatives of those originals (thumbnails and smaller versions of the big videos). In many ways, video is already creating a 'disproportionate' impact, compared to how much it is used by the users.
2. Anything over 5 MB basically cannot be sent to a browser. Therefore you need to thumbnail, postprocess etc. So CPU, and yet more data storage to store the results of that. Specifically, in the above example you are speaking about things like 8K, which not even YouTube is doing (publicly). And that is for a good reason. Handling such big files essentially requires purpose-designed hardware to do the decoding and encoding (GPU, or like YouTube, who design and manufacture their own hardware for this). It cannot really be done with the generic compute power that we have available within Wikimedia. I advise watching this video by LTT, where they comment on last year's YouTube experiment of charging for 4K. LTT runs their own video website called https://www.floatplane.com so they have some experience with the cost of high-res video.
3. Media handling is unresourced by WMF. As in, there are 0 engineers dedicated to improving and modernizing the multimedia stack. The only effort spent is on keeping the current multimedia stack alive.
4. It is pretty hard to be WordPress AND the Internet Archive AND YouTube AND Thingiverse AND Wolfram Alpha on a shoestring budget. Wikimedia always has been 10+ years behind these websites and is likely to remain so. We can't scale like them, because we don't have a narrow focus, and because expertise at the bleeding edge is very expensive. Just waiting 10 years is the affordable way to get there.
5. The entire infrastructure for file storage currently has a 5 GB limit. Files above that size cannot be addressed without major re-architecting of the storage layer used by Wikimedia. This re-architecting is unlikely to happen due to the earlier mentioned points.
The best solution for this is still to host these on a separate archive site, and upload a smaller more websuitable version to Commons. —TheDJ (talkcontribs) 19:54, 2 November 2023 (UTC)
Thank you very much for giving insights into this issue! The 5 GiB limit is a good thing to hear! Maybe we get a solution some time that is a compromise between resources and upload limits :) --PantheraLeo1359531 😺 (talk) 18:27, 4 November 2023 (UTC)
Otherwise we have to ask Google for server (technology) support ;D --PantheraLeo1359531 😺 (talk) 18:31, 4 November 2023 (UTC)
TheDJ, thanks from me, too, for your insightful statement. It continues to frustrate me that the WMF, with its heaps of donor money to spend, is allocating so few resources and so little investment to Commons. In my view, "In many ways, video is already creating a 'disproportionate' impact, compared to how much it is used by the users" is just one side of the coin: The very poor support of video on Commons isn't encouraging use - but it could be very valuable and an alternative to increasingly heavily commercialized platforms such as YouTube (which is now starting to ban adblockers). Commons is the only truly free media platform where users don't pay with their personal data or with money for use, and I think it needs strengthening. The WMF has the money, more than enough, it's only a matter of willingness. A "shoestring budget" is not an inevitability - it's well known that the WMF's assets have risen each year by many millions. I certainly would not ask Google for any technology support, though, as I think this is one of the corporations we should distance ourselves from. Gestumblindi (talk) 10:26, 9 November 2023 (UTC)
Millions is still a shoestring when it comes to video. This is another example of people having no idea how much organisations like Google spend on stuff like this. Start thinking in billions. —TheDJ (talkcontribs) 11:49, 9 November 2023 (UTC)
Compare with w:Vimeo. Half a billion revenue and still 80 million of losses in a year. So we'd need almost 600 million of donations a year to do what they do. And it's the only thing they do, they don't have to worry about anything but video. —TheDJ (talkcontribs) 11:56, 9 November 2023 (UTC)
Well, I would be happy with a far more limited offering than YouTube or Vimeo have, I don't think we need 4K or even 8K; all I ask for is a reasonably smooth upload and use of medium-sized, medium-quality video (I think Full HD - 1080p - shouldn't be too much to ask?), and we don't have even that, it's all very rickety. Gestumblindi (talk) 19:24, 9 November 2023 (UTC)
Personally, I think 5 GiB is enough for Commons. Our purpose is education, not entertainment. We don't need 8K videos to explain how mitochondria work. 480p works fine. 1080p is probably overkill. And 4320p (8K) is just totally unnecessary. What we need is better quality video content, not better video resolution. Nosferattus (talk) 16:46, 9 November 2023 (UTC)
Take a good look at 480p versus 1080p, say https://upload.wikimedia.org/wikipedia/commons/thumb/0/00/Grand_bassin_octogonal_Jardin_des_Tuileries_003.jpg/640px-Grand_bassin_octogonal_Jardin_des_Tuileries_003.jpg (428p) versus https://upload.wikimedia.org/wikipedia/commons/thumb/0/00/Grand_bassin_octogonal_Jardin_des_Tuileries_003.jpg/1280px-Grand_bassin_octogonal_Jardin_des_Tuileries_003.jpg (857p) versus https://upload.wikimedia.org/wikipedia/commons/0/00/Grand_bassin_octogonal_Jardin_des_Tuileries_003.jpg (2755p) File:Grand bassin octogonal Jardin des Tuileries 003.jpg. There's a smudge in the top middle in 428p that turns out to be a bird. Even going to 857p makes it much easier to see the details of the Ferris wheel and the people and the statues. If you want to stick with mitochondria, compare https://upload.wikimedia.org/wikipedia/commons/c/cf/Aging_Phenotype_by_mtDNA_Mutation_in_mice_Edgar_et_al._2009.png (1027p) to https://upload.wikimedia.org/wikipedia/commons/thumb/c/cf/Aging_Phenotype_by_mtDNA_Mutation_in_mice_Edgar_et_al._2009.png/528px-Aging_Phenotype_by_mtDNA_Mutation_in_mice_Edgar_et_al._2009.png (480p), where text and details are nigh illegible. File:Aging Phenotype by mtDNA Mutation in mice Edgar et al. 2009.png --Prosfilaes (talk) 17:22, 9 November 2023 (UTC)
I think it depends on the video content how important resolution is. For simple animations, FHD is certainly enough. For historical (modern) events for example, 4K or even 8K is probably really appreciated, for documentation purposes. --PantheraLeo1359531 😺 (talk) 19:31, 12 November 2023 (UTC)
Addition: Is 8K unnecessary? Not always. 8K first of all allows cropping to distinct areas in the video. And let's take a video of a collapsing building, recorded in 8K, we can extract single images with a resolution of many full-frame cameras. If we take pictures from YouTube in Full HD, we usually have a quite low level of detail. --PantheraLeo1359531 😺 (talk) 19:38, 12 November 2023 (UTC)

Total size of uploads

While we are here, is the total size (of uploads smaller than 5 GB) a problem? Are we safe for the time being, or is there a chance that WMF soon will not be able to host the entire volume of the files (say photos, not videos), which grows significantly every day? --Ymblanter (talk) 21:12, 9 November 2023 (UTC)

In my opinion, I see no problem regarding hosting. Right now we have at least ca. 480 TB. It is common for modern datacenters to have 1.5 petabytes or more. With a yearly growth of ca. 80 TB in 2022, it will take time until Commons hits this threshold. I could also imagine WMF has some reserves, even if the amount of data grows even faster. 1 petabyte could be reached with 50x 20 TB hard disk drives. Probably, the disks in use are smaller in capacity. If Commons acquires several terabytes at once, this could be challenging, but I assume that we're safe. --PantheraLeo1359531 😺 (talk) 19:35, 12 November 2023 (UTC)
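The estimate above can be written out as a short calculation. This is a minimal sketch that takes the figures from the comment at face value (ca. 480 TB stored, ca. 80 TB of growth per year, a 1.5 PB threshold, 20 TB disks) and assumes linear growth with no redundancy or backup copies, which is a simplification. The function names are hypothetical.

```python
import math

def years_until_full(capacity_tb: float, current_tb: float, growth_tb_per_year: float) -> float:
    """Years until current_tb reaches capacity_tb, assuming constant yearly growth."""
    return (capacity_tb - current_tb) / growth_tb_per_year

def disks_needed(capacity_tb: float, disk_tb: float) -> int:
    """Number of disks of size disk_tb needed to hold capacity_tb (no redundancy)."""
    return math.ceil(capacity_tb / disk_tb)

years = years_until_full(capacity_tb=1500, current_tb=480, growth_tb_per_year=80)
disks = disks_needed(capacity_tb=1000, disk_tb=20)

print(f"About {years:.2f} years until 1.5 PB at 80 TB per year")
print(f"{disks} disks of 20 TB for 1 PB")
```

If growth accelerates or several copies of each file are kept, the available headroom shrinks accordingly.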
Thanks, makes sense to me. Ymblanter (talk) 18:29, 15 November 2023 (UTC)
Flickr has a database which is vastly greater than Commons' and I see no problem with it. Юрий Д.К 06:35, 15 November 2023 (UTC)
But Flickr, after it was bought by SmugMug, decided that it cannot keep expanding, that the database was already too big, and that users must pay if they want to be above the (pretty moderate) limit. Ymblanter (talk) 18:29, 15 November 2023 (UTC)
According to this website, flickr has 10 billion images. We are far away from that :) --PantheraLeo1359531 😺 (talk) 19:19, 20 November 2023 (UTC)
Even more than that — according to the Flickr Foundation presentation at GLAMWiki last week, it's more like 50 billion! Sam Wilson 00:14, 21 November 2023 (UTC)
That's really a lot --PantheraLeo1359531 😺 (talk) 17:56, 23 November 2023 (UTC)
If I'm interpreting [1] right, I think we are currently at 821 TiB (presumably one of those lines is the eqiad data center and the other is codfw, but I'm just guessing). Presumably there might be multiple copies of media stored to guard against hardware failure. Generally I would suggest not worrying about server capacity as long as you are doing something useful to the mission unless someone from the WMF SRE team specifically says to worry. After all, the reason we have servers is so they are used. Bawolff (talk) 04:02, 24 November 2023 (UTC)
I'm not so worried about storage space myself, but the upload wizard doesn't seem to be very reliable and I don't think people should be able to upload extremely large files until there's at least a way to resume uploads. Otherwise there's really no point: I can't even upload a 200 MB image as someone with fast cable internet because it just times out. --Adamant1 (talk) 04:10, 24 November 2023 (UTC)
@Adamant1: I suggest you try using User:Rillke/bigChunkedUpload.js (doc at User talk:Rillke/bigChunkedUpload.js and help at Help:Chunked upload).   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 13:20, 25 November 2023 (UTC)
Thanks for the suggestion. I'll have to do that. --Adamant1 (talk) 13:39, 25 November 2023 (UTC)
@Adamant1: You're welcome, but you forgot to ping me.   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 13:42, 25 November 2023 (UTC)
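For readers unfamiliar with the idea behind the chunked upload suggested above: a large file is sent in pieces, each tagged with its byte offset, so that an interrupted transfer can continue from the last confirmed offset instead of starting over. The following is only a sketch of that general idea on the client side. It is not how User:Rillke/bigChunkedUpload.js is implemented, it performs no network requests, and the function names `iter_chunks` and `resume_from` are hypothetical.

```python
import io
from typing import BinaryIO, Iterator, Tuple

def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, data) pairs, reading from the stream's current position."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = stream.tell()
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield offset, data
        offset += len(data)

def resume_from(stream: BinaryIO, chunk_size: int, confirmed_offset: int) -> Iterator[Tuple[int, bytes]]:
    """Skip the bytes already confirmed by the server and yield the remaining chunks."""
    stream.seek(confirmed_offset)
    yield from iter_chunks(stream, chunk_size)

# Illustration with an in-memory "file" of 10 bytes and 4-byte chunks
payload = bytes(range(10))
all_chunks = list(iter_chunks(io.BytesIO(payload), 4))
# Pretend the first chunk (4 bytes) was confirmed before the connection dropped
remaining = list(resume_from(io.BytesIO(payload), 4, 4))
```

In a real upload each chunk would be sent to the server together with its offset; that part is deliberately left out here.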