Commons:Bots/Requests/File Upload Bot (Magnus Manske)

Given that now there is some user authentication, that Magnus is responsive on the more minor issues, and that there has been substantial discussion elsewhere (which was satisfactorily resolved), this discussion is done. The bot is still running, and the required changes have been made or will be made. – Mike.lifeguard | @en.wb 15:44, 25 May 2008 (UTC)

File Upload Bot (Magnus Manske) - removal of bot flag requested

Operator: User:Magnus Manske

I request that this bot's right to upload images from Wikipedias be removed. It is often used to upload to Commons images that are copyvios or fair use images, or that have doubtful or unaccepted licenses. --ALE! ¿…? 10:59, 13 March 2008 (UTC)

For a list of problems with this bot, please read User talk:File Upload Bot (Magnus Manske). --ALE! ¿…? 11:05, 13 March 2008 (UTC)
For a list of good contributions of this bot, please see its contribution page or its gallery page. --Magnus Manske 12:00, 13 March 2008 (UTC)

Discussion

  1. The bot (which is not really a bot) does check for license tags on the original Wikipedia. So, something on en.wikipedia tagged as "fair use" (or not tagged at all) can't be uploaded through the bot. It needs to have a license tag compatible with Commons. That means that (presumably) at least two people thought it was OK: the one who copied it to Commons, and the one who tagged it as OK in the first place. That is one more human than for most other uploads we get.
  2. The bot is also used for similar, semi-automated transfers from Flickr. As with the Wikipedia transfer, it checks the license automatically, and only transfers appropriately CC-licensed images. Did you check whether the perceived problem stems not from Wikipedia, but from Flickr images?
  3. As my bot talk page clearly states, if you see a suspicious image uploaded by my bot, just delete it. There's no need to warn it or talk to it ;-)
  4. Besides the bot's "user contributions", its uploads from Wikipedia also automatically end up in a subcategory of Category:Files moved to Commons requiring review - another quick and easy way to check for bad images.
  5. If that's all as great as I say, it should show up in some statistic. So I took a look at the bot data:
    • The talk page lists ~90 "warnings" for the last ~30 days. That's ~3 bad images/day (not counting those that were silently deleted).
    • The contributions page for the last 2000 edits (=uploads) goes back less than 8 days. That's >250 images/day.
So, that's (by very rough estimate) more than 80 good images per bad one. In my book, that counts as a successful operation...

I hope that clears things up a bit. --Magnus Manske 11:26, 13 March 2008 (UTC)
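
The rough estimate in point 5 above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic using the approximate figures quoted in the comment; the variable names are illustrative, and, as the comment itself notes, silently deleted images are not counted, so the real ratio may be lower.

```python
# Approximate figures quoted in point 5 of the comment above.
warnings_listed = 90       # "~90 warnings" on the bot's talk page
warning_window_days = 30   # "for the last ~30 days"
uploads_listed = 2000      # last 2000 contributions (= uploads)
upload_window_days = 8     # "goes back less than 8 days", so a lower bound on the rate

bad_per_day = warnings_listed / warning_window_days      # about 3 bad images per day
uploads_per_day = uploads_listed / upload_window_days    # at least 250 uploads per day

# Good uploads per reported bad upload (silently deleted images are not counted).
good_per_bad = (uploads_per_day - bad_per_day) / bad_per_day

print(f"bad images per day:      ~{bad_per_day:.0f}")
print(f"uploads per day:         >{uploads_per_day:.0f}")
print(f"good images per bad one: ~{good_per_bad:.0f}")
```

With these inputs the ratio comes out at roughly 82 good images per reported bad one, which is consistent with the "more than 80" stated in the comment.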

Based on my experience with this bot, I prefer to keep it without a bot flag so that human oversight covers more of its uploads. The error rate is too high in my opinion: too many derivative works and freedom of panorama violations from Flickr, and transfers from projects where people don't care much about copyright. It would also be a good idea to keep a log of upload requesters. --EugeneZelenko 14:08, 13 March 2008 (UTC)
Also, the original description pages on the source wikis usually get deleted so quickly that checking the correctness of the license is impossible. --ALE! ¿…? 14:59, 13 March 2008 (UTC)
This is a problem with the English Wikipedia... files that are PD-US get transferred here and then quickly deleted there, even though there's supposed to be a wait time. This means valuable files that can be hosted just fine over there get hosted nowhere, because Commons deletes them for not being PD in country of origin. That problem, however, is not just specific to this bot. -N 17:13, 13 March 2008 (UTC)
Indeed. The bot is used by people to very quickly violate copyright, by for example transferring files from the English Wikipedia (such as WWII photos) that are not PD in country of origin, or otherwise not allowed. -N 17:13, 13 March 2008 (UTC)
I too would prefer to keep this account enabled to upload, but without the bot flag set, so that humans have a better chance to review what is done. Perhaps some pushback to en:wp about not deleting things so fast is in order too. ++Lar: t/c 18:48, 13 March 2008 (UTC)
  •  Comment This account does not have a bot flag, just "bot" in the name. Technically, this discussion should not be made here because there is no flag for a bureaucrat to remove. I suggest COM:AN. Patrícia msg 17:46, 13 March 2008 (UTC)
  • We do sometimes discuss nonbot bots (things that don't necessarily have the bot flag) here, it's not as rigid as how the WP:BAG does things... so it's cool that discussion is here in my view. ++Lar: t/c 18:48, 13 March 2008 (UTC)
  • Agree; discussion can remain here. Perhaps a pointer from there to here though. If there are problems with images being deleted too fast on enwiki (or other projects) then that's a) that project's problem and b) should be dealt with not by disabling this bot, but rather by educating the admins who are causing the problems. Part of the problem is quick deletions, but apparently there's a lack of understanding of the copyright issues, as pointed out by N. It doesn't look to me like there's any reason to disable the bot. Keep in mind that uploads are categorized per-wiki for easy review. If people aren't doing that, the bot's not to blame. – Mike.lifeguard | @en.wb 19:17, 13 March 2008 (UTC)
    • Hence the "technically" ;), so fine by me then too. I'd like to point out that this is a recurring issue [1] and that the account has been used for vandalism [2] [3]. This is a problem. I want to pour loads of "assume good faith" on the users that eventually use this bot to upload files but, as someone said before, File Upload Bot (Magnus Manske) basically acts like a public account. I am aware that the uploaded files all go into special categories that can be scanned for bad uploads, etc etc, but the users that are (ab)using the account to upload copyvios or badly sourced material cannot be held responsible: if they don't put their names on a special field that is not required to be filled in, we won't know who transwikied the media. We could expect some control from the admins that delete the media on the originating wikis (usually a speedy deletion because the "file is already in Commons") but let's be realistic, many times that simply doesn't happen. So it's a wacky system. Yes, more "good" files than "bad" ones are uploaded, but personally I still don't feel comfortable in letting this account upload more files under the present conditions. I beg Magnus to address this issue with another perspective than "it uploads more "good" files than "bad" ones"; some sort of log of who is using this account, as suggested by Eugene, is a minimum requirement, imho, and I'd be happy with a rather low daily upload limit (some hundreds of images, perhaps something between 200 and 500), to allow better uploading review. Patrícia msg 22:30, 13 March 2008 (UTC)
      • I think you raise a lot of valid concerns. But rather than asking for the upload privs of this account to be taken away, let's first ask Magnus (an exceedingly clever chap, everyone knows that) to see if there are things that can be done to address the concerns. For one, why not force use of a valid wiki account name somehow rather than making the name of the uploader be optional? Any way to do that? For another, what about putting a banner on the original image's description on the original wiki asking that it not be deleted until ___ (where maybe ___ is "the image has been reviewed as OK on commons which you can check by clicking on this link", or something???) There are others too, let's make a list... ++Lar: t/c 23:21, 13 March 2008 (UTC)
        • There is currently no way for TS tools to verify identities, so giving a username is not actually secure. But I'd support making it a required field, and verifying that the account exists on Commons and(?)/or(?) on the "other" wiki. This would still allow "impersonation" of a sort, but it's better than nothing. We could perhaps also require the specified username to be autoconfirmed. – Mike.lifeguard | @en.wb 06:37, 14 March 2008 (UTC)
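
The checks suggested in the last comment (required username, account must exist, optionally autoconfirmed) could look roughly like the sketch below. Everything in it is hypothetical: the `Account` type, the `check_requester` function, and the two lookup callables are illustrative stand-ins and not part of CommonsHelper or any real API. The open "and/or" question is resolved here as "or" purely as an assumption. As the comment points out, none of this proves identity; it only rejects empty or non-existent names.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Account:
    """Hypothetical minimal account record."""
    name: str
    autoconfirmed: bool


# Hypothetical lookup: returns the account, or None if it does not exist.
Lookup = Callable[[str], Optional[Account]]


def check_requester(
    username: str,
    lookup_commons: Lookup,
    lookup_source: Lookup,
    require_autoconfirmed: bool = True,
) -> Tuple[bool, str]:
    """Return (allowed, reason) for a transfer request made under `username`."""
    name = username.strip()
    if not name:
        return False, "username is required"

    # Assumption: existing on EITHER wiki is enough ("or" rather than "and").
    candidates = (lookup_commons(name), lookup_source(name))
    accounts = [a for a in candidates if a is not None]
    if not accounts:
        return False, "no such account on Commons or the source wiki"

    if require_autoconfirmed and not any(a.autoconfirmed for a in accounts):
        return False, "account is not autoconfirmed"

    # Note: this does not verify that the requester owns the account.
    return True, "ok"
```

For a quick trial, the lookups can be plain dictionary `get` methods, for example `check_requester("Alice", {"Alice": Account("Alice", True)}.get, {}.get)`.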
The problem isn't this tool. The problem is bad licenses on the WP language versions from which images get uploaded. You can misuse any tool. --Matthiasb 15:47, 16 March 2008 (UTC)
At NL.wiki, there is a lot of human review of the uploads done by this bot. The people who delete the local images check what's going on at Commons. I use this tool a lot. If I make a mistake, I normally find a note on my local NL talk page from the moderator who checked whether the local image could really be deleted. More than half of my uploads are easy to recognize, because they have an edit from my account within a few minutes after uploading, for example to add a category. Please keep this tool enabled, because disabling it would discourage a lot of well-meaning people from transferring local images to Commons. GijsvdL 00:04, 18 March 2008 (UTC)
It is true that any tool can be misused, but making it work only for someone with a Commons username, as was required before, wouldn't hurt anyone and would avoid abuse. CommonsHelper has worked like that in the past, so it's not even a question of whether it's possible, because it is. Patrícia msg 21:31, 24 March 2008 (UTC)

The Bot is a very helpful tool to establish a minimum correctness of transfers. If other projects' admins are not checking their transfers before deleting the local copy it's their problem, we (=Magnus) can cope with that only to a limited extent. Regards, Code·is·poetry 16:20, 2 April 2008 (UTC)

It's not their problem. It becomes a problem for Commons, because the files end up here, and the admins on other projects won't really have to care much about it. I'd be happy to see CommonsHelper reverted to the version where a valid Commons username was required to proceed with the upload, thus giving responsibility to the user making the transfer. But well, as I said somewhere above, there is no bot flag to remove, so as far as I'm concerned, we can archive this discussion. Magnus does a wonderful job of coming up with all sorts of great tools and ideas, I just wish we could do something about this tool in particular to make everybody happy. Patrícia msg 12:06, 8 April 2008 (UTC)
Why should it be our problem? We delete insufficiently described files. If they want to use it, they have to ensure that it is valid on Commons. Code·is·poetry 12:19, 8 April 2008 (UTC)
GijsvdL: Unfortunately, Commons admins are unable to see the previous file history after transwiking the image to Commons. It would be so helpful if Commons admins could have read-only access to deleted media in other wikis, but that's a whole different story. So, no, if you don't put a username, we won't know who did the transfer. I agree that prohibiting the bot from operating is not the way to go.
Codeispoetry: True, you have a very good point there. Unfortunately, that doesn't stop misuse of the tool, we still have to clean up after it. As Matthiasb pointed out, any tool can be misused, but most of them allow accountability for who is misusing them. But well, we clean, that's what we're here for. Patrícia msg 14:39, 8 April 2008 (UTC)
I wonder if it's possible to separate the various tools which use this account so each one has its own account. That might give us a better idea of where the problem lies, and if blocking is ever necessary, it'd limit the collateral damage to the users of only one tool. Also, it'd clarify that the uploads are actually not Magnus'. Magnus: comments? – Mike.lifeguard | @en.wb 16:12, 11 April 2008 (UTC)
From my point of view, the only problem with Magnus' bot is that it misses the original description and sources when the original language isn't English. Don't leave a human job to a bot. --Herrick 08:07, 18 April 2008 (UTC)
You missed another problem: people who request uploads via the bot are not accountable for their actions. And this issue should be addressed. --EugeneZelenko 14:53, 23 April 2008 (UTC)
But if their description and source date are correct and this bot misses them, it's Magnus' failure. --Herrick 07:52, 29 April 2008 (UTC)

Proposal regarding anonymity

This is something I have brought up before elsewhere. I have seen sockpuppets, vandals, and just lazy people upload images via this service. I am far less concerned about the automatic transfer of images than I am about the anonymity. It is absolutely essential that this bot mark who allowed the transfer. A system like that of User:Flickr upload bot would be great (e.g., must first create page; 1-2 images per day for non-autoconfirmed; 6-8 for autoconfirmed; infinite for admins and trusted users - presumably added to a database after requesting such status)

As for the bot status, I would agree the status ought to be removed so that its uploads show up on the recent changes/files list. Patstuart (talk) 06:06, 25 April 2008 (UTC)
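
The tiered daily limits proposed above can be sketched as a small lookup. This is only an illustration of the proposal, not how User:Flickr upload bot or CommonsHelper actually works: the class names, the `DAILY_LIMITS` table, and the `may_upload` function are hypothetical, and the caps use the upper end of the suggested ranges (2 and 8) as an assumption.

```python
from typing import Dict, Optional

# Hypothetical daily caps, taken from the upper end of the proposed ranges.
# None means "no cap" (admins and trusted users).
DAILY_LIMITS: Dict[str, Optional[int]] = {
    "non_autoconfirmed": 2,  # proposal: 1-2 images per day
    "autoconfirmed": 8,      # proposal: 6-8 images per day
    "trusted": None,         # proposal: unlimited for admins and trusted users
}


def may_upload(user_class: str, uploads_today: int) -> bool:
    """Return True if a user of this class may make one more upload today."""
    if user_class not in DAILY_LIMITS:
        # Unknown classes are rejected rather than treated as unlimited.
        return False
    limit = DAILY_LIMITS[user_class]
    return limit is None or uploads_today < limit
```

Rejecting unknown classes by default is a design choice made here for safety; the proposal itself does not say how such cases should be handled.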

The account doesn't have a bot flag... I think that what you are proposing (and with which I fully agree) has been raised above, but mildly ignored by Magnus (at least he hasn't given more feedback, which is a pity). I'm not sure if there's much more to discuss. Sorry if I sound bitter. Patrícia msg 09:44, 25 April 2008 (UTC)
In fact, I'm not so sure there isn't. We can of course tell Magnus to either improve the bot, or it won't be used at all. This is not terribly radical: unauthorized bots are against policy to begin with, and any human with such a history of vandalism and copyright violations would probably be blocked. Patstuart (talk) 20:44, 25 April 2008 (UTC)
This is not the English Wikipedia, bots are allowed to run here unauthorized. -- Bryan (talk to me) 14:10, 28 April 2008 (UTC)
(should have commented sooner, sorry!) We don't have a BAG, but we do have a community which has spoken out against some bots being considered approved.... I've seen bots fail to get authorized, and thus not be allowed to run. If there is a strong consensus to withdraw the sanction of a bot, it would get, and stay, blocked. Whether or not it had a bot flag. That's my view anyway. ++Lar: t/c 19:55, 19 May 2008 (UTC)