User:DrTrigon/file-metadata

This is the test protocol and log task for file-metadata.

Set up a VirtualBox VM with the osboxes.org image Kubuntu_14.04.3-64bit.7z (see http://www.osboxes.org/kubuntu/), then boot the disk in VirtualBox (a change of the disk UUID may be needed), log in and open a terminal ('Term'/Konsole):

$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get autoremove
$ sudo apt-get install python-pip

Insert the Guest Additions medium in VirtualBox:

$ cd /media/osboxes/VBOXADDITIONS_4.3.36_105129
$ sudo ./VBoxLinuxAdditions.run
$ sudo shutdown -r now

Enable the bidirectional clipboard.

To avoid repeating these steps after every reset, I exported the VM as 'Wikimedia GSoC Testing.ova' (OVF 1.0 including manifest) and created a snapshot 'plain/unused'. Either can be used to jump back to a fresh (reset) machine.

Installation of file-metadata(-wikibot)

User:AbdealiJK/file-metadata, https://github.com/drtrigon/file-metadata-wikibot

$ sudo pip install invoke
$ wget https://raw.githubusercontent.com/drtrigon/install-file-metadata-wikibot/master/tasks.py

You can add --yes after each task to install automatically without asking questions. The currently preferred install method for testing the most recent code is github (see User:DrTrigon/file-metadata#github), or Docker for the master branch.

system package management

https://phabricator.wikimedia.org/T136985#2398708

$ invoke install_file_metadata_spm install_pywikibot

I had to use --upgrade to get the most recent version of Pillow. The packages cmake, libboost-python-dev and liblzma-dev are needed for dlib; libjpeg-dev and libz-dev are needed to compile Pillow.

pip

https://phabricator.wikimedia.org/T136985#2398652

$ invoke install_file_metadata_pip install_pywikibot

Although virtualenv is installed, it is not used for this procedure; everything gets installed system-wide or into the home directory.
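
To confirm where packages will end up, the following is a minimal sketch that checks whether the running Python interpreter is inside a virtualenv. It is not part of tasks.py; the function name in_virtualenv is hypothetical. It assumes the usual convention that classic virtualenv sets sys.real_prefix, while the standard venv module makes sys.base_prefix differ from sys.prefix.

```python
import sys


def in_virtualenv():
    """Return True if this interpreter runs inside a virtualenv or venv."""
    # Classic virtualenv sets sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv (and newer virtualenv releases) make base_prefix differ from prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("install prefix:", sys.prefix)
```

Run with the same interpreter that pip uses; if it reports no active virtualenv, pip installs go to the system or the home directory as described above.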

github

$ invoke install_file_metadata_git install_pywikibot

or, to install without asking questions:

$ invoke install_file_metadata_git --yes install_pywikibot --yes
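
Because --yes has to follow each task separately, a small helper can assemble the command line when scripting repeated test installs. This is only a sketch: build_invoke_command is a hypothetical name, and the task names are the ones used on this page.

```python
def build_invoke_command(tasks, assume_yes=False):
    """Return the argv list for running the given invoke tasks."""
    argv = ["invoke"]
    for task in tasks:
        argv.append(task)
        if assume_yes:
            # --yes applies per task, so it is repeated after every task name.
            argv.append("--yes")
    return argv


if __name__ == "__main__":
    cmd = build_invoke_command(
        ["install_file_metadata_git", "install_pywikibot"], assume_yes=True)
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.call() unchanged.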

For info about virtualenv see User:DrTrigon/file-metadata#pip.

Docker

$ invoke install_docker --yes

Then follow the instructions at https://github.com/drtrigon/file-metadata-wikibot#installation.

Testing / Usage

https://phabricator.wikimedia.org/T136985#2401611, https://gist.github.com/AbdealiJK/a94fc0d0445c2ad715d9b1b95ec2ba03, https://gist.github.com/drtrigon/0002517ea812cc707e6ea2ecaf23d9b3

In order to run the bot(s) use:

# wikibot-filemeta-log -search:'eth-bib' -limit:5 -logname:test -dry
# wikibot-filemeta-simple -cat:SVG_files -limit:5

wikibot-create-config needs to be run once at the beginning; the bot scripts will point this out.

repository                     test travis  test travis config  additional
docker-file-metadata-wikibot   test         .travis.yml         Dockerfile
install-file-metadata-wikibot  test         .travis.yml         tasks.py

Developing

https://phabricator.wikimedia.org/T136985#2403700

Method 1: For most contributors

  1. Fork the repository into your own account.
  2. Clone your fork, and create a branch if you'd like to modify the code in a branch (that is normally easier later down the line).
  3. Modify the code and commit it (the usual workflow).
  4. Push the code to your fork (to the appropriate branch).
  5. Create a pull request against my repository's master branch from your fork's appropriate branch.

Pywikibot/PAWS (docker)

osboxes@osboxes:~$ sudo docker pull yuvipanda/pawsuser
osboxes@osboxes:~$ sudo docker run -it yuvipanda/pawsuser bash
Welcome to PAWS!
Please behave responsibly
Getting Started: https://www.mediawiki.org/wiki/Manual:Pywikibot/PAWS
Questions? Need help? Find us on #pywikibot on IRC on freenode!
File bugs at https://phabricator.wikimedia.org/maniphest/task/create/?projects=PAWS
@PAWS:~$ export JPY_USER=DrTrigon
DrTrigon@PAWS:~$ pwb.py login
[...]

How to use it is described in: https://www.mediawiki.org/wiki/Manual:Pywikibot/PAWS

wikitech.wikimedia.org / ToolsLab

Install into a virtual environment:

$ ssh -i ~/.ssh/id_rsa drtrigon@login.tools.wmflabs.org
drtrigon@tools-bastion-03:~$ virtualenv venv
drtrigon@tools-bastion-03:~$ source venv/bin/activate
(venv)drtrigon@tools-bastion-03:~$ pip install -U pip
(venv)drtrigon@tools-bastion-03:~$ git clone https://github.com/pywikibot-catfiles/file-metadata.git
(venv)drtrigon@tools-bastion-03:~$ pip install -r ~/file-metadata/test-requirements.txt
(venv)drtrigon@tools-bastion-03:~$ pip install ~/file-metadata --upgrade

Test with:

(venv)drtrigon@tools-bastion-03:~$ cd ~/file-metadata/
(venv)drtrigon@tools-bastion-03:~/file-metadata$ pip install -e .
(venv)drtrigon@tools-bastion-03:~/file-metadata$ python -m pytest --cov
===================== 3 failed, 102 passed, 6 skipped in 177.52 seconds ======================
...(the OpenCV tests fail because OpenCV has to be installed via conda)...
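
For logging test runs in this protocol, the pytest summary line can be turned into numbers. The following is a minimal sketch, assuming the summary has exactly the format shown above (newer pytest versions may print the duration differently); parse_pytest_summary is a hypothetical helper, not part of file-metadata.

```python
import re

# Summary line copied from the test run above.
SUMMARY = ("===================== 3 failed, 102 passed, 6 skipped "
           "in 177.52 seconds ======================")


def parse_pytest_summary(line):
    """Return (counts, duration) parsed from a pytest summary line."""
    # Collect "<number> <status>" pairs for the statuses seen on this page.
    counts = {status: int(number)
              for number, status
              in re.findall(r"(\d+) (failed|passed|skipped)", line)}
    # The duration is optional; None means it was not found.
    match = re.search(r"in ([\d.]+) seconds", line)
    duration = float(match.group(1)) if match else None
    return counts, duration


if __name__ == "__main__":
    print(parse_pytest_summary(SUMMARY))
```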

Run the bot:

(venv)drtrigon@tools-bastion-03:~$ cd ~/file-metadata/file_metadata/wikibot/
(venv)drtrigon@tools-bastion-03:~/file-metadata/file_metadata/wikibot$ pip install git+https://gerrit.wikimedia.org/r/pywikibot/core.git#egg=pywikibot
(venv)drtrigon@tools-bastion-03:~/file-metadata/file_metadata/wikibot$ python generate_user_files.py
(venv)drtrigon@tools-bastion-03:~/file-metadata/file_metadata/wikibot$ python log_bot.py -search:'eth-bib' -limit:5 -logname:test -dry

Run the full, long job on the grid:

(venv)drtrigon@tools-bastion-03:~$ cp file-metadata/file_metadata/wikibot/pywikibot.lwp ~/
(venv)drtrigon@tools-bastion-03:~$ cp file-metadata/file_metadata/wikibot/user-config.py ~/
(venv)drtrigon@tools-bastion-03:~$ python $HOME/file-metadata/file_metadata/wikibot/log_bot.py -search:'eth-bib' -limit:5 -logname:eth-bib -dry
(venv)drtrigon@tools-bastion-03:~$ jsub -o $HOME/log_bot_eth-bib.log -j y -N log_bot_eth-bib python $HOME/file-metadata/file_metadata/wikibot/log_bot.py -search:'eth-bib' -logname:eth-bib
(venv)drtrigon@tools-bastion-03:~$ job -v log_bot_eth-bib

Output in the log:

[...]
SSLError: Can't connect to HTTPS URL because the SSL module is not available.
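
This message typically appears when the Python interpreter running the job cannot import its ssl module. A minimal diagnostic sketch follows; ssl_available is a hypothetical helper, and submitting it through jsub with the same interpreter as the bot is an assumption about where the failure occurs, not something verified here.

```python
import sys


def ssl_available():
    """Return True if this interpreter can import the ssl module."""
    try:
        import ssl  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Print the interpreter path to see which Python the job actually uses.
    print("python:", sys.executable)
    print("ssl module available:", ssl_available())
```

If the script reports that ssl is available on the bastion but not on the grid, the interpreter or its libraries differ between the two hosts.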

Run the bot in screen:

drtrigon@tools-bastion-03:~$ screen bash
drtrigon@tools-bastion-03:~$ source venv/bin/activate
(venv)drtrigon@tools-bastion-03:~$ python ~/file-metadata/file_metadata/wikibot/log_bot.py -search:'eth-bib' -logname:eth-bib
CTRL + A then D (for detach)
drtrigon@tools-bastion-03:~$ screen -r
CTRL + A then D (for detach)
drtrigon@tools-bastion-03:~$ exit

Remove the virtual environment:

(venv)drtrigon@tools-bastion-03:~$ deactivate
drtrigon@tools-bastion-03:~$ rm -rf venv/
drtrigon@tools-bastion-03:~$ rm -rf file-metadata/

FAQ

  • If wget has issues accessing raw.githubusercontent.com, reconfigure the VM guest and change its network setting from Network Address Translation (NAT) to bridged networking.

Related Pages