User:DrTrigon/file-metadata
This is the test protocol and log task for file-metadata.
Set up a VirtualBox VM with the osboxes.org image Kubuntu_14.04.3-64bit.7z (see http://www.osboxes.org/kubuntu/), then boot the disk in VirtualBox (a change of the disk UUID may be needed), log in and open 'Term'/Konsole:
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get autoremove
$ sudo apt-get install python-pip
Insert the Guest Additions medium in VirtualBox:
$ cd /media/osboxes/VBOXADDITIONS_4.3.36_105129
$ sudo ./VBoxLinuxAdditions.run
$ sudo shutdown -r now
Enable the bidirectional clipboard.
To avoid repeating these steps on every reset, I exported the VM as 'Wikimedia GSoC Testing.ova' (OVF 1.0 including manifest) and created a snapshot 'plain/unused'. Either can be used to jump back to a fresh (reset) machine.
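The snapshot, export and reset can also be done from the host command line. A minimal sketch, assuming the VM is registered under the name 'Wikimedia GSoC Testing' (an assumption; check `VBoxManage list vms`) and is powered off when restoring. The `run` helper and the function names are made up for illustration; with DRY_RUN=1 the commands are only printed.

```shell
# VM name as registered in VirtualBox (assumption, adjust to your setup).
VM_NAME="${VM_NAME:-Wikimedia GSoC Testing}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Take the 'plain/unused' snapshot of the freshly prepared VM.
take_snapshot() {
    run VBoxManage snapshot "$VM_NAME" take "plain/unused"
}

# Export the VM as an OVF 1.0 appliance including a manifest.
export_vm() {
    run VBoxManage export "$VM_NAME" --output "Wikimedia GSoC Testing.ova" --ovf10 --manifest
}

# Jump back to the fresh state (the VM must be powered off).
reset_vm() {
    run VBoxManage snapshot "$VM_NAME" restore "plain/unused"
}
```

For example, `DRY_RUN=1 reset_vm` only shows the restore command, while a plain `reset_vm` executes it.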
Installation of file-metadata(-wikibot)
User:AbdealiJK/file-metadata, https://github.com/drtrigon/file-metadata-wikibot
$ sudo pip install invoke
$ wget https://raw.githubusercontent.com/drtrigon/install-file-metadata-wikibot/master/tasks.py
You can add --yes after each task/parameter to install automatically without asking questions. The currently preferred install method for testing the most recent code is User:DrTrigon/file-metadata#github, or docker for the master branch.
system package management
https://phabricator.wikimedia.org/T136985#2398708
$ invoke install_file_metadata_spm install_pywikibot
I had to use --upgrade in order to get the most recent version of pillow. cmake, libboost-python-dev and liblzma-dev are needed for dlib; libjpeg-dev and libz-dev are needed for pillow compilation.
pip
https://phabricator.wikimedia.org/T136985#2398652
$ invoke install_file_metadata_pip install_pywikibot
Although virtualenv is installed, it is not used for this procedure; everything gets installed into the system or into the home directory.
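To see beforehand where a pip install would end up, the current shell can be checked for an active virtualenv. A small sketch, relying on the fact that virtualenv's activate script exports VIRTUAL_ENV; the function name `install_target` is made up for illustration.

```shell
# Report where `pip install` would put packages in the current shell.
# virtualenv's activate script exports VIRTUAL_ENV; if it is empty or
# unset, no environment is active.
install_target() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        printf 'virtualenv: %s\n' "$VIRTUAL_ENV"
    else
        printf 'system or home directory\n'
    fi
}
```

Calling `install_target` before running the invoke tasks shows whether an environment is active.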
github
$ invoke install_file_metadata_git install_pywikibot
or for installation w/o asking questions
$ invoke install_file_metadata_git --yes install_pywikibot --yes
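Since --yes has to follow every task, a small helper can build the argument list. This is only a sketch; `with_yes` is a made-up name and not part of tasks.py.

```shell
# Print the given invoke task names with --yes appended after each one.
with_yes() {
    out=""
    for task in "$@"; do
        # Separate entries with a single space, no leading space.
        out="${out:+$out }$task --yes"
    done
    printf '%s\n' "$out"
}
```

It could be used as `invoke $(with_yes install_file_metadata_git install_pywikibot)`, which expands to the command line shown above.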
For info about virtualenv see User:DrTrigon/file-metadata#pip.
Docker
- Dockerfiles: https://github.com/pywikibot-catfiles/file-metadata/tree/ajk/docker, https://github.com/pywikibot-catfiles/docker-file-metadata
- Images: https://hub.docker.com/r/pywikibotcatfiles/file-metadata/tags/, https://hub.docker.com/r/pywikibotcatfiles/docker-file-metadata/tags/, https://hub.docker.com/r/drtrigon/catimages-gsoc/tags/
- https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/
- https://docs.docker.com/engine/tutorials/dockerrepos/
- https://github.com/wsargent/docker-cheat-sheet
- e.g. build container from Dockerfile yourself:
$ sudo docker build -t gsoc_catimages_0.1.0.dev99999999999999 .
$ sudo docker tag gsoc_catimages_0.1.0.dev99999999999999 drtrigon/gsoc_catimages_0.1.0.dev99999999999999
- e.g. install Docker via the invoke task:
$ invoke install_docker --yes
then follow instructions on https://github.com/drtrigon/file-metadata-wikibot#installation.
Testing / Usage
https://phabricator.wikimedia.org/T136985#2401611, https://gist.github.com/AbdealiJK/a94fc0d0445c2ad715d9b1b95ec2ba03, https://gist.github.com/drtrigon/0002517ea812cc707e6ea2ecaf23d9b3
In order to run the bot(s) use:
# wikibot-filemeta-log -search:'eth-bib' -limit:5 -logname:test -dry
# wikibot-filemeta-simple -cat:SVG_files -limit:5
Running wikibot-create-config is needed once at the beginning; the bot scripts will point this out.
test | travis test | travis config | additional |
---|---|---|---|
docker-file-metadata-wikibot | test | .travis.yml | Dockerfile |
install-file-metadata-wikibot | test | .travis.yml | tasks.py |
Developing
https://phabricator.wikimedia.org/T136985#2403700
Method 1: For most contributors
- You fork the repository into your own account.
- Clone your fork, and create a branch if you'd like to modify code in a branch (that's normally easier later down the line)
- Modify the code and commit it (the normal workflow)
- Push the code to your fork (to the appropriate branch)
- Create a Pull Request to my repository's master branch using your fork's appropriate branch
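The steps above can be sketched as two shell helpers. The fork URL, branch name and working directory are placeholders, and the function names and the `run` helper are made up for illustration; with DRY_RUN=1 the git commands are only printed. The Pull Request itself is then created on GitHub.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Clone your fork and create a branch to work in.
# $1: URL of your fork, $2: branch name, $3: working directory
start_branch() {
    run git clone "$1" "$3" || return 1
    run git -C "$3" checkout -b "$2"
}

# After modifying the code: commit all changes to tracked files and
# push the branch to your fork.
# $1: branch name, $2: working directory, $3: commit message
publish_branch() {
    run git -C "$2" commit -a -m "$3" || return 1
    run git -C "$2" push origin "$1"
}
```

A typical sequence would be `start_branch <fork-url> my-fix wd`, editing files in wd, then `publish_branch my-fix wd "Describe the change"`.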
Possibly, the following are good references:
Pywikibot/PAWS (docker)
osboxes@osboxes:~$ sudo docker pull yuvipanda/pawsuser
osboxes@osboxes:~$ sudo docker run -it yuvipanda/pawsuser bash
Welcome to PAWS!
Please behave responsibly
Getting Started: https://www.mediawiki.org/wiki/Manual:Pywikibot/PAWS
Questions? Need help? Find us on #pywikibot on IRC on freenode!
File bugs at https://phabricator.wikimedia.org/maniphest/task/create/?projects=PAWS
@PAWS:~$ export JPY_USER=DrTrigon
DrTrigon@PAWS:~$ pwb.py login
[...]
How to use it is described in: https://www.mediawiki.org/wiki/Manual:Pywikibot/PAWS
See also:
- https://phabricator.wikimedia.org/project/members/1648/
- https://hub.docker.com/r/yuvipanda/pawsuser/
- https://github.com/yuvipanda/paws/blob/master/singleuser/Dockerfile, https://github.com/yuvipanda/paws/blob/c367a711b60e20e6730277bc3faf5f16dee915a4/singleuser/Dockerfile
- https://wikitech.wikimedia.org/wiki/Tools_Kubernetes
wikitech.wikimedia.org / ToolsLab
According to:
- https://wikitech.wikimedia.org/wiki/User:AbdealiJK/file-metadata
- https://github.com/pywikibot-catfiles/docker-file-metadata/blob/master/Dockerfile.ubuntu
- http://docs.python-guide.org/en/latest/dev/virtualenvs/
- https://phabricator.wikimedia.org/T136985#2409894 ('eth-bib')
- https://github.com/pywikibot-catfiles/file-metadata/tree/master/file_metadata/wikibot
- https://github.com/drtrigon/catimages-gsoc/blob/master/tasks.py
- https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Grid#Submitting_simple_one-off_jobs_using_.27jsub.27
- https://commons.wikimedia.org/w/index.php?title=Special%3APrefixIndex&prefix=User%3ADrTrigon%2Flogs
- https://wikitech.wikimedia.org/wiki/Screen
Install into a virtual environment:
$ ssh -i ~/.ssh/id_rsa drtrigon@login.tools.wmflabs.org
drtrigon@tools-bastion-03:~$ virtualenv venv
drtrigon@tools-bastion-03:~$ source venv/bin/activate
(venv)drtrigon@tools-bastion-03:~$ pip install -U pip
(venv)drtrigon@tools-bastion-03:~$ git clone https://github.com/pywikibot-catfiles/file-metadata.git
(venv)drtrigon@tools-bastion-03:~$ pip install -r ~/file-metadata/test-requirements.txt
(venv)drtrigon@tools-bastion-03:~$ pip install ~/file-metadata --upgrade
Test with:
(venv)drtrigon@tools-bastion-03:~$ cd ~/file-metadata/
(venv)drtrigon@tools-bastion-03:~/file-metadata$ pip install -e .
(venv)drtrigon@tools-bastion-03:~/file-metadata$ python -m pytest --cov
===================== 3 failed, 102 passed, 6 skipped in 177.52 seconds ======================
...(the opencv tests fail, hence opencv has to be installed via conda)...
Run bot:
(venv)drtrigon@tools-bastion-03:~$ cd ~/file-metadata/file_metadata/wikibot/
(venv)drtrigon@tools-bastion-03:~/file-metadata/file_metadata/wikibot$ pip install git+https://gerrit.wikimedia.org/r/pywikibot/core.git#egg=pywikibot
(venv)drtrigon@tools-bastion-03:~/file-metadata/file_metadata/wikibot$ python generate_user_files.py
(venv)drtrigon@tools-bastion-03:~/file-metadata/file_metadata/wikibot$ python log_bot.py -search:'eth-bib' -limit:5 -logname:test -dry
Run full long job on grid:
(venv)drtrigon@tools-bastion-03:~$ cp file-metadata/file_metadata/wikibot/pywikibot.lwp ~/
(venv)drtrigon@tools-bastion-03:~$ cp file-metadata/file_metadata/wikibot/user-config.py ~/
(venv)drtrigon@tools-bastion-03:~$ python $HOME/file-metadata/file_metadata/wikibot/log_bot.py -search:'eth-bib' -limit:5 -logname:eth-bib -dry
(venv)drtrigon@tools-bastion-03:~$ jsub -o $HOME/log_bot_eth-bib.log -j y -N log_bot_eth-bib python $HOME/file-metadata/file_metadata/wikibot/log_bot.py -search:'eth-bib' -logname:eth-bib
(venv)drtrigon@tools-bastion-03:~$ job -v log_bot_eth-bib
output to log:
[...] SSLError: Can't connect to HTTPS URL because the SSL module is not available.
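A quick way to narrow this down is to check whether a given Python interpreter can import the ssl module at all. A minimal diagnostic sketch; `has_ssl_module` is a made-up helper name, and which interpreter the grid job actually picks up is not covered by it.

```shell
# Return 0 if the given Python interpreter can import the ssl module.
# $1: interpreter to test (defaults to `python` from PATH)
has_ssl_module() {
    "${1:-python}" -c 'import ssl' >/dev/null 2>&1
}
```

For example, `has_ssl_module "$HOME/venv/bin/python" || echo "ssl module missing"` tests the interpreter inside the virtual environment; the same check can be submitted as a job to test the interpreter on the grid nodes.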
Run bot by screen:
drtrigon@tools-bastion-03:~$ screen bash
drtrigon@tools-bastion-03:~$ source venv/bin/activate
(venv)drtrigon@tools-bastion-03:~$ python ~/file-metadata/file_metadata/wikibot/log_bot.py -search:'eth-bib' -logname:eth-bib
CTRL + A then D (for detach)
drtrigon@tools-bastion-03:~$ screen -r
CTRL + A then D (for detach)
drtrigon@tools-bastion-03:~$ exit
Remove the virtual environment:
(venv)drtrigon@tools-bastion-03:~$ deactivate
drtrigon@tools-bastion-03:~$ rm -rf venv/
drtrigon@tools-bastion-03:~$ rm -rf file-metadata/
FAQ
- If you encounter issues with wget accessing raw.githubusercontent.com, you need to re-configure the VM guest and change the network settings: Network Address Translation (NAT) -> Bridged networking
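The network change can also be made from the host command line while the VM is powered off. A sketch under assumptions: the VM name and the host interface `eth0` are placeholders (see `VBoxManage list vms` and `VBoxManage list bridgedifs`), and the function names and `run` helper are made up for illustration.

```shell
# VM name and host interface to bridge to (both assumptions).
VM_NAME="${VM_NAME:-Wikimedia GSoC Testing}"
HOST_IF="${HOST_IF:-eth0}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Switch the first network adapter from NAT to bridged networking.
use_bridged_network() {
    run VBoxManage modifyvm "$VM_NAME" --nic1 bridged --bridgeadapter1 "$HOST_IF"
}

# Switch the first network adapter back to NAT.
use_nat_network() {
    run VBoxManage modifyvm "$VM_NAME" --nic1 nat
}
```

`DRY_RUN=1 use_bridged_network` prints the command without changing the VM.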
Related Pages
- install pywikibot: https://phabricator.wikimedia.org/T136985#2401192
- do not install pywikibot through pip (this is just doubling code and causing confusion)
- install pywikibot FROM GIT according to Manual:Pywikibot/Gerrit
- install test bot script: https://phabricator.wikimedia.org/T136985#2403216