Commons:Restricted uploads

Even the source files for the Wikimedia Foundation's own outreach materials currently sit on a non-WMF server, because Wikimedia Commons does not support files created with Scribus, a free/open source desktop publishing application.

Wikimedia Commons should permit upload of insecure, possibly proprietary file formats to make sure that source files for media objects can always be added. Such content cannot be used in articles, only linked to, and no thumbnail would be shown on the image description page. Due to issues of security and/or misuse (e.g. for file sharing), upload of such files must be restricted to trusted users. This would be implemented with a new user right, restricted-upload. By default, this user right would be given only to administrators, but over time, it could be expanded to other users who have a well-understood need to upload such files.
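
The gating rule described above could be sketched roughly as follows. This is an illustration in Python, not MediaWiki code: only the right name restricted-upload comes from this proposal, while the extension set, the group table, and the function may_upload are hypothetical.

```python
# Hypothetical sketch of the proposed upload gate; not actual MediaWiki code.

# Assumed example set of formats that are already unrestricted.
UNRESTRICTED_EXTENSIONS = {"png", "jpg", "svg", "ogg", "pdf"}

# Rights per group; by default only administrators hold restricted-upload.
GROUP_RIGHTS = {
    "user": {"upload"},
    "sysop": {"upload", "restricted-upload"},
}


def may_upload(filename, groups):
    """Return True if a member of `groups` may upload `filename`."""
    rights = set()
    for group in groups:
        rights |= GROUP_RIGHTS.get(group, set())
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if extension in UNRESTRICTED_EXTENSIONS:
        return "upload" in rights
    # Any other format counts as restricted and needs the new right.
    return "restricted-upload" in rights
```

Expanding the right to other trusted users later would then only require adding it to another entry of the group table.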

Rationale and process

There are many files for which source files cannot currently be uploaded:

  • PDF documents made with Scribus, OpenOffice, etc.
  • 3D renders made with Blender, POV-Ray, etc.
  • 3D structures of molecules in the chemical markup language made e.g. with Avogadro
  • Movie timelines created with PiTiVi, Cinelerra, etc.
  • Raw camera media such as Adobe DNG files
  • Archive files such as ZIP files containing multiple related source files (such as an HTML document and accompanying graphics)
  • Source code used to generate mathematical plots such as fractals

Source files are essential for reuse. Continued non-availability of source files is a major issue for the long-term re-usability of files uploaded to Wikimedia Commons. For example, a user may render a PDF as a PNG in order to upload it; if a small change later needs to be made (such as adding or removing text, or changing a color slightly) and the original contributor is unavailable, the image must be needlessly re-created from scratch.

Unfortunately, any new file type supported by Wikimedia projects has to undergo security review to ensure that it does not allow execution of arbitrary code. Additionally, it is desirable to develop specialized handlers (for thumbnail generation, multi-page navigation, etc.) when new file types are added, which requires considerable effort. As a result, making new file types available has been a very slow process.

We should therefore permit users who enjoy a sufficient level of trust to upload files in any format. Users with restricted upload permission would act on behalf of other users needing to upload files in such formats to Wikimedia Commons, through a page like Commons:Restricted uploads/Requests. Uploaders would bear the responsibility of ensuring that the uploads adhere to the restricted uploads review criteria below.

Restricted uploads review criteria

  1. The upload must be source media for another non-restricted file already present on Wikimedia Commons. If the non-restricted file is deleted, the corresponding restricted file must be deleted as well.
    • Rationale: Some users may wish to upload a restricted file, then tell readers to "click here to download file X," as a way of circumventing our advocacy of free formats. This is not permitted.
  2. The uploader must open and examine the file in an application supporting it to ensure that it represents what it claims to. If the user does not own such an application, they must request assistance from someone who does.
  3. The uploader should confirm that the file is of reasonable size considering the file type and the content.
    • Rationale: This is to ensure that the file contains no hidden or irrelevant content.
  4. The uploader must use virus scanning software to scan the content for viruses or other malware. A good cross-platform option is a web-based scanner such as VirSCAN, which accepts file uploads and runs a variety of scanning programs on them.
  5. If it is possible to convert the file either from a restricted format to a non-restricted file format, or from a proprietary to a free and unencumbered format, using an automated process with no loss of information or substantial increase in file size, the uploader should do so before uploading it. If the conversion requires manual effort, the restricted upload should be tagged with an appropriate template (like {{Convert to SVG}}).
    • Rationale: This is in line with our mission of supporting free formats. However, be wary of conversions that appear to be lossless but are not entirely so: a conversion must preserve all content and all metadata. Conversions to JPEG, OGG, SVG, DNG, or OpenOffice formats are generally not lossless.

How would a restricted upload be rendered?

Thumbnails are not rendered for restricted uploads. In order to avoid user confusion, the file description page for a restricted upload would link to Commons:Restricted uploads for an explanation of the process for uploading files of the same type. Whenever a user requests to download a restricted upload, they would first have to read a security warning explaining that it may contain malicious content. This warning can be disabled in a user's preferences.

Feedback on future direction

The ability to upload files in almost arbitrary formats would make it clearer which free/open source formats are most important to support as unrestricted files in the future. For example, if we only get 5 .blend uploads per year through such a system, .blend support may not be a priority. If we get 1000 .odt uploads, .odt support may be significantly more important. It would also motivate development of conversion tools from popular proprietary formats.
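
A tally of this kind could be produced with a few lines of Python. The sketch below is only an illustration: the function name tally_by_extension and the sample file names are made up, and a real count would come from the upload log.

```python
from collections import Counter


def tally_by_extension(filenames):
    """Count restricted uploads per lowercase file extension."""
    counts = Counter()
    for name in filenames:
        if "." in name:
            counts[name.rsplit(".", 1)[-1].lower()] += 1
    return counts


# Hypothetical upload log, for illustration only.
uploads = ["scene.blend", "report.odt", "notes.ODT", "letter.odt"]
for extension, count in tally_by_extension(uploads).most_common():
    print(extension, count)
```

Formats at the top of such a list would be the first candidates for security review and dedicated handlers.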

Alternatives and their respective drawbacks

  • Alternative 1: Allow uploads from anyone, but identify certain file types as potentially insecure, and show the user a security warning before making the file available for download. Do not permit in-line use of the file. The most obvious drawback is that this would allow Wikimedia Commons to be abused for file sharing. Files containing malware may remain undetected for weeks. Users may attempt to circumvent our advocacy of free formats by uploading needlessly proprietary media and then linking readers to it. Finally, file types that need to be downloaded to be inspected would be harder to patrol for illegal or inappropriate content, so the restriction to trusted users may also have benefits from a vetting perspective, establishing an intentional "trust barrier".
  • Alternative 2: Do nothing / continue to slowly add support for new file formats. The problem with this approach is that we keep adding new files whose sources may never be recovered because users leave, hard disks crash, etc. Indeed, even just the number of free/open source applications and output formats that can provide useful content for Wikimedia Commons seems to be growing significantly faster than our ability to support them.
  • Alternative 3: Only implement an improved ZIP file handler (or alternative archive format): display contents of ZIP files, and render thumbnails of contained media in unrestricted formats. ZIP files could then be uploaded by all users (or further restricted if desired), and patrolled thanks to the provided directory information. Any unsupported source format (plus dependencies) could be uploaded and referenced as ZIP. Drawback: Users could misunderstand this as a method to do batch upload. Management of individual files suffers. It might be desirable to support extraction (resulting in individual files on the server) by admins.
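
The directory listing that Alternative 3 relies on could be produced roughly as in the following Python sketch, which uses the standard zipfile module. The function name list_archive and the set of thumbnailable extensions are assumptions for illustration; this is not an existing MediaWiki handler.

```python
import io
import zipfile

# Assumed example set of unrestricted formats that could be thumbnailed.
UNRESTRICTED_EXTENSIONS = {"png", "jpg", "svg"}


def list_archive(data):
    """Return (name, size_in_bytes, can_thumbnail) for each file in a ZIP.

    Only the archive's directory information is read; nothing is extracted.
    """
    entries = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            extension = info.filename.rsplit(".", 1)[-1].lower()
            entries.append(
                (info.filename, info.file_size, extension in UNRESTRICTED_EXTENSIONS)
            )
    return entries
```

Patrollers could review such a listing on the file description page without downloading the archive, although the sizes and names it reports are taken from the archive's own metadata and would still need to be checked against the actual contents.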