Commons:Bots/Requests/TaxonBot

TaxonBot (talk · contribs)

Operator: Doc Taxon (talk)

Bot's tasks for which permission is being sought: removing Template:Uncategorized from file pages (namespace 6) that are already categorized in existing categories

Automatic or manually assisted: automatic

Edit type (e.g. Continuous, daily, one time run): continuous

Maximum edit rate (e.g. edits per minute): 5 edits per minute

Bot flag requested: (Y/N): Y

Programming language(s): tcl/tk

Doc Taxon (talk) 18:30, 9 December 2014 (UTC)
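The stated maximum of 5 edits per minute amounts to at least 12 seconds between edits. As a rough illustration only (this is not taken from TaxonBot, and the proc names `minIntervalMs` and `throttle` are hypothetical), such a limit could be enforced like this:

```tcl
# Hypothetical helper: smallest pause, in milliseconds, that keeps the
# bot at or below the given number of edits per minute.
proc minIntervalMs {editsPerMinute} {
    expr {int(ceil(60000.0 / $editsPerMinute))}
}

# Hypothetical helper: sleep until the next edit is allowed and return
# the current time, to be passed in as lastEditMs on the next call.
proc throttle {lastEditMs editsPerMinute} {
    set wait [expr {$lastEditMs + [minIntervalMs $editsPerMinute] - [clock milliseconds]}]
    if {$wait > 0} {
        after $wait
    }
    return [clock milliseconds]
}
```

A caller would keep the returned timestamp and call `throttle` once before each save.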

Discussion

Edits performing the task described above have already been made; see Special:Contributions/TaxonBot. The job ran faultlessly, and it helps to clean up and correct the file pages (namespace 6). Thank you, -- Doc Taxon (talk) 18:30, 9 December 2014 (UTC)

@Multichill: is CategorizationBot doing this task? --Steinsplitter (talk) 18:34, 9 December 2014 (UTC)
If so, then it is not actually doing this job. -- Doc Taxon (talk) 18:37, 9 December 2014 (UTC)
Addendum: I have bot flags on two other Wikipedia projects, de:WP and gd:WP. -- Doc Taxon (talk) 18:37, 9 December 2014 (UTC)
Doc Taxon is a valued bot operator at dewiki, so although I blocked the bot for now, I support this request. It should be checked first, though, what is going wrong in cases like [1]. --Krd 18:50, 9 December 2014 (UTC)
I was able to find more such erroneous entries; this is not the only one. However, they only affect files uploaded with UploadWizard. -- Doc Taxon (talk) 21:14, 9 December 2014 (UTC)
I think it would be a good idea to keep {{Uncategorized}} if the categories come from templates. For example, File:Radiostation Veluwe.jpg definitely needs at least a place category. --EugeneZelenko (talk) 15:12, 10 December 2014 (UTC)
It's possible to build such an exception into the bot, but does it really make sense? The file will be categorized anyway. A better approach would be to move the category link out of the template to the bottom of the file page. That function could also be built into the bot and makes more sense. -- Doc Taxon (talk) 01:01, 11 December 2014 (UTC)
In such cases, categorization by template is the same as no category. --EugeneZelenko (talk) 15:10, 11 December 2014 (UTC)
Okay, but it's possible to copy these categories to the bottom of the page source so that the file page is categorized explicitly. Alternatively, such template categorizations could be excluded. Which makes more sense? I will implement whichever it is. Is there a third opinion on this? -- Doc Taxon (talk) 23:03, 11 December 2014 (UTC)
I think it would be a good idea to keep the template if the page doesn't have categories specified explicitly. --EugeneZelenko (talk) 14:47, 12 December 2014 (UTC)
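The rule suggested here (keep the template unless the page text itself names a category) could be sketched roughly as follows. This is only an illustration, not part of TaxonBot: the proc names `explicitCategories` and `hasExplicitCategory` are hypothetical, and the regular expression only recognizes plain `[[Category:...]]` links written directly in the wikitext.

```tcl
# Hypothetical helper: return the category names that appear in the
# wikitext as [[Category:Name]] or [[Category:Name|sortkey]] links.
# Links such as [[:Category:Name]] (with a leading colon) are ignored,
# because they only link to a category and do not categorize the page.
proc explicitCategories {wikitext} {
    set cats {}
    foreach {full name} [regexp -all -inline -nocase -- \
            {\[\[\s*Category\s*:\s*([^\]\|]+)} $wikitext] {
        lappend cats [string trim $name]
    }
    return $cats
}

# Hypothetical helper: true if at least one category is given explicitly.
proc hasExplicitCategory {wikitext} {
    expr {[llength [explicitCategories $wikitext]] > 0}
}
```

Under this rule, a page whose only categories are added by a transcluded template would keep its {{Uncategorized}} tag.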
Steinsplitter: Cleaning up uncategorized templates is something I do every once in a while. I haven't done it for ages, so I wouldn't mind at all if someone picks this up.
Doc Taxon: are you using a database query or are you looking at all images? Can you share your source? What I did was run a database query for images that had the uncategorized template and were also in a non-hidden category. Then I looped over these images, double-checked whether each one actually had a real, existing (non-hidden) category added to it, and, if so, removed the uncategorized template.
Krd: that is also the answer to your question: Category:Charles Richard Crane didn't exist so that's why the bot tagged it. Multichill (talk) 23:45, 13 December 2014 (UTC)
Krd Multichill: Category:Charles Richard Crane has existed since January 2014, so the bot tagged it correctly. Yes, the bot only checks for non-hidden, existing categories, for all images in namespace 6.
Here's the source:
 #!/usr/bin/tclsh8.6
 source api2.tcl
 set lang commons ; source langwiki.tcl
 source procs.tcl
 set token [login [set wiki $dewiki]]
 set fl [open c-uncat.out r]
 set data [read $fl]
 close $fl
 set offset "6|[dict get [join [dict values [lindex [regexp -all -inline -line -- {^edit.+?$} $data] end]]] pageid]"
 puts \n$offset
 cont {ret1 {
   global wiki query get
   foreach item [embeddedin $ret1] {
      set title [dict get $item title]
      set pages [get [post $wiki {*}$query / titles $title / prop categories / clshow !hidden / format json] query pages]
      set cats [dict values [regexp -all -inline {\{(Category:.+?)\}} $pages]]
      states {
         step {
            set idx [lsearch $cats {*needing cat*}]
            if {$idx != -1} {
               set cats [lreplace $cats $idx $idx]
               goto step
            }
         }
      }
      set catinfos {}
      set l [llength $cats]
      foreach cat $cats {
         set catpages [get [post $wiki {*}$query / titles $cat / prop categoryinfo / format json] query pages]
         lappend catinfos [lindex $catpages 0]
      }
      set catinfo [llength [lsearch -all $catinfos -1]]
      if {$l > $catinfo} {
         set content [content [post $wiki {*}$get / titles $title]]
         set uncattype [dict values [regexp -nocase -inline -- {\{\{(Uncat.+?)\|} $content]]
         regsub -nocase -- {(\n\{\{Uncategorized.+?\}\})} $content {} ccontent
         regsub -all -- {(\n..)} $ccontent \n\n ccontent
         regsub -all -- {(\}\}\[\[)} $ccontent \}\}\n\[\[ ccontent
         puts \n$title\n[edit $title "Template:$uncattype obsolete" {} / text $ccontent]
      }
   }
 }} {*}$embeddedin / eititle Template:Uncategorized / einamespace 6 / eicontinue $offset

-- Doc Taxon (talk) 03:11, 25 December 2014 (UTC)
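The filtering step in the script above (ignore maintenance categories, ignore categories that do not exist, then remove the template only if something is left) can be shown in isolation. The following is a simplified sketch that needs no API access; the proc name `realCategories` and the `existing` list, which stands in for the `categoryinfo` lookup, are assumptions made for illustration, and the sample category names are example data.

```tcl
# Hypothetical pure-Tcl version of the filtering step. `existing` replaces
# the API lookup and simply lists the category pages that exist.
proc realCategories {cats existing} {
    set kept {}
    foreach cat $cats {
        # Maintenance categories such as "Media needing categories" do not count.
        if {[string match {*needing cat*} $cat]} continue
        # Red-linked (nonexistent) categories do not count either.
        if {$cat ni $existing} continue
        lappend kept $cat
    }
    return $kept
}

# Example data: one maintenance category, one real one, one red link.
set cats [list \
    {Category:Media needing categories as of 9 December 2014} \
    {Category:Charles Richard Crane} \
    {Category:No such category}]
set existing [list {Category:Charles Richard Crane}]

# The template would only be removed if this list is non-empty.
set real [realCategories $cats $existing]
```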

Oh, I undid this mistake. I had already fixed this error a long time ago, as can be seen in the bot source above. The problem no longer occurs. -- Doc Taxon (talk) 17:24, 10 January 2015 (UTC)
@Doc Taxon: I have unblocked the bot so that you can show us where you are up to. Please limit your test edits to ~30 edits for now. Thanks. --99of9 (talk) 00:47, 16 January 2015 (UTC)
@99of9: Oh thank you! I'll be in touch... -- Doc Taxon (talk) 12:42, 16 January 2015 (UTC)
I am starting the 30 edits now ... -- Doc Taxon (talk) 20:50, 30 January 2015 (UTC)
@99of9: Hi, I have made the requested 30 edits on 30/31 January, but the edit on the 16th was an error. So I debugged it and rewrote the script more professionally. So voilà, the next trial edits ran without any errors, and the script did what it is supposed to do. There should be no remaining problem with it. -- Doc Taxon (talk) 04:01, 31 January 2015 (UTC)
@99of9: ping is the right template, -- Doc Taxon (talk) 04:03, 31 January 2015 (UTC)

This is the changed and much better script, already tested:

 #!/shared/tcl/bin/tclsh8.6
 source api2.tcl
 set lang commons ; source langwiki.tcl
 source procs.tcl
 # Walk all file pages (namespace 6) that transclude Template:Uncategorized
 cont {ret1 {
    global wiki query get parse
    foreach item [embeddedin $ret1] {
       set i -1
       # Collect the non-hidden categories of the page
       set cats [dict values [regexp -all -inline {\{(Category:.*?)\}} [get [post $wiki {*}$query / titles [
          set title [dict get $item title]
       ] / prop categories / clshow !hidden] query pages]]]
       # Skip pages that use a template matching "Check cat"
       if ![string match -nocase {*Check cat*} [get [post $wiki {*}$parse / page $title / prop templates] parse templates]] {
          # Drop maintenance categories and categories that do not exist
          foreach cat $cats {
             incr i
             if {[string match {*needing cat*} $cat] ||
                 [string match -nocase {*Invalid SVG*} $cat] ||
                 [lindex [get [post $wiki {*}$query / titles $cat / prop categoryinfo] query pages] 0] == -1} {
                set cats [lreplace $cats $i $i]
                incr i -1
             }
          }
          # Edit only if at least one real category is left
          if {[llength $cats] > 0} {
             regexp -nocase -- {\{\{(Uncat.*?)\|} [set content [content [post $wiki {*}$get / titles $title]]] -- uncattype
             # Remove the template, collapse triple newlines, and put adjacent markup on separate lines
             regsub -nocase -- {(\n\{\{Uncategorized.*?\}\})} $content {} ccontent
             regsub -all -- {(\n{3})} $ccontent \n\n ccontent
             regsub -all -- {(\}\}|\]\])(\[\[|\{\{)} $ccontent \\1\n\\2 ccontent
             # Print a preview and wait for Enter before saving
             puts \n...\n$title\n$cats\n$uncattype\n\n$ccontent ; gets stdin
             puts \n$title\n[edit $title "Template:$uncattype obsolete" {} / text $ccontent]
          }
       }
    }
 }} {*}$embeddedin / eititle Template:Uncategorized / einamespace 6 / eicontinue 6|$offset
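The three `regsub` calls in the script can be tried on their own, without the API helpers. The sketch below reuses the same patterns on a made-up piece of wikitext; the proc name `stripUncategorized` and the sample page text are assumptions for illustration only.

```tcl
# Hypothetical stand-alone version of the text cleanup step.
proc stripUncategorized {content} {
    # Remove the first Uncategorized template call that starts a line.
    regsub -nocase -- {\n\{\{Uncategorized.*?\}\}} $content {} out
    # Collapse every run of three newlines into two.
    regsub -all -- {\n{3}} $out "\n\n" out
    # Put directly adjacent closing and opening markup on separate lines.
    regsub -all -- {(\}\}|\]\])(\[\[|\{\{)} $out "\\1\n\\2" out
    return $out
}

# Made-up sample page text.
set before "{{Information}}\n{{Uncategorized|year=2014|month=December|day=9}}\n\[\[Category:Foo\]\]\[\[Category:Bar\]\]"
set after [stripUncategorized $before]
```

With this input, the template line disappears and the two category links end up on separate lines.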
Looks like a useful task. Doc Taxon is a trusted bot operator on dewiki and I think he can also help on Commons:Bots/Work requests. --Steinsplitter (talk) 08:36, 31 January 2015 (UTC)
99of9: What do you say about it now? -- Doc Taxon (talk) 12:53, 3 February 2015 (UTC)
They look good to me. I suggest changing the edit summary from "Template:Uncategorized obsolete" to "Template:Uncategorized no longer applicable", because the former technically means that the template itself is out of date. --99of9 (talk) 23:36, 3 February 2015 (UTC)
Please include the first real (non-hidden) category in the edit summary. That makes debugging easier (for example if someone unhides a category and your bot comes along). Yes, flag it. Multichill (talk) 23:41, 3 February 2015 (UTC)
Oh okay, both of the last two suggestions are simple to implement; I'll do it promptly. -- Doc Taxon (talk) 02:11, 4 February 2015 (UTC)
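The two edit-summary suggestions above could be combined along these lines. This is a guess at one possible wording, not the summary TaxonBot actually uses; the proc name `editSummary` and the exact text after "no longer applicable" are hypothetical.

```tcl
# Hypothetical helper: build an edit summary that names the template
# variant and the first real (non-hidden) category found on the page.
proc editSummary {uncattype cats} {
    set first [lindex $cats 0]
    return "Template:$uncattype no longer applicable; file is in $first"
}

set summary [editSummary Uncategorized [list {Category:Charles Richard Crane} {Category:Portraits}]]
```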

If there are no further opinions on this bot, I propose we accept and flag it. --99of9 (talk) 23:36, 3 February 2015 (UTC)

Approved --99of9 (talk) 04:34, 5 February 2015 (UTC)
Thank you very much indeed -- Doc Taxon (talk) 19:50, 5 February 2015 (UTC)