Category talk:Photos from Fotopedia

Intro

moved from Commons:Village pump#Fotopedia

Hi. w:Fotopedia is shutting down on August 10, 2014 (in 9 days). They host photos, some of which are under free licenses (CC). Could those photos be retrieved to Commons? --Rinaku (t · c) 14:04, 1 August 2014 (UTC)

(The interface is atrocious!) I browsed a few “stories” for photos and found some that are "©" (copyrighted), others "㏄⃝🚹⃝" (cc-by), yet others "㏄⃝🚹⃝⤿⃝" (cc-by-sa). I could not find any central repository, index, listing, or anything. A typical photo url is: http://i.images.cdn.fotopedia.com/r_TTcQJLHZ0-KuVrNYDJaqs-hd/Countries_of_the_World/America/United_States/Garden_of_the_Gods.jpg… -- Tuválkin 19:37, 1 August 2014 (UTC)
Uhm, this is a bit embarrassing, but I have to ask: How did you manage to download a picture? I tried, but I didn't find out how to do that. --Rudolph Buch (talk) 20:14, 1 August 2014 (UTC)
Check something like «multimedia resources» under «page properties»; it should work for any browser (except maybe for MSIE, I don’t know, as I stopped using it in 1997) running on a proper computer (even on a Mac, I hope, but probably not on the kind of “device” this craptastic interface was designed for). -- Tuválkin 22:39, 1 August 2014 (UTC)
Such a beautiful interface, but at the same time so frustrating because it totally doesn't give you what you want. The best way I could find free images was via https://www.google.ca/search?as_st=y&tbm=isch&hl=en&as_q=&as_epq=&as_oq=&as_eq=&cr=&as_sitesearch=fotopedia.com&safe=images&tbs=sur:fmc . Bawolff (talk) 20:10, 1 August 2014 (UTC)
Fotopedia is closing down because a large proportion of its content is from other websites, e.g. many articles simply mirror existing Wikipedia articles. Of the approximately 1.5 million images, it appears that two-thirds may be from Flickr. Of the remainder, it seems that only about 22,000 images are both non-Flickr and have Commons-compatible licenses. It might be worth asking a bot operator if they could harvest these ones, but it would be difficult with barely a week left and many Wikimedians getting ready to party with Jimbo and the Gang very soon. Green Giant (talk) 21:29, 1 August 2014 (UTC)
From Green Giant’s filtered google search I grabbed the 1st 100 pairs of image url + page url. It is only worth uploading them if there is a license reviewer to attest to the original CC-BY and CC-BY-SA licenses before the site goes belly up. -- Tuválkin 23:07, 1 August 2014 (UTC)
Those 100 are uploaded now, added to the 98 already in Category:Photos from Fotopedia. Hardly dented those 20 thousand, sadly. A bot operation would be good. Those paltry 100, although grabbed via browser addon and uploaded with Vicuña, still took me a while to prepare: I probably missed the highest rez for some pics, a few stoopid filenames had to be allowed, and authorship still has to be pinpointed and categorization added. -- Tuválkin 03:09, 2 August 2014 (UTC)
Rudolph Buch, the download links are on the image page rather than the image itself. As an example, look at this page, and note there are two buttons labelled "Download" and "Actions". If you click Download it will open the full size image in a separate tab and then you can save it using your right-click menu. If you click Actions there is a dropdown menu, from which selecting "Download original" will again open the image in a separate tab and then it can be saved as above. Green Giant (talk) 21:39, 1 August 2014 (UTC)
Meanwhile: 98 images from Fotopedia had already been brought over to Commons: Special:Search/Fotopedia. -- Tuválkin 23:07, 1 August 2014 (UTC) (clarified. -- Tuválkin 05:20, 2 August 2014 (UTC))
Tuválkin: I am trying to review these, but the link to Fotopedia is incomplete. Regards, Yann (talk) 04:59, 2 August 2014 (UTC)
(No, you’re trying to review not these old ones, but the new ones, mentioned a couple lines above…) Many thanks! The mass upload was done with a generic link to the main page of Fotopedia as the source; I'm now updating it, as said. The whole list is at Category talk:Photos from Fotopedia#Recently uploaded. -- Tuválkin 05:08, 2 August 2014 (UTC)
OK, tell me when you are done updating the links, then I can help reviewing. Also I noticed that several files didn't have the right license (cc-by-sa instead of cc-by, or wrong number). Regards, Yann (talk) 12:36, 2 August 2014 (UTC)
I am 2/3 through the category reviewing images. Some remarks: as I said on your talk page, please do not add the direct link; it confuses the review script, and I don't think it adds anything. It makes an already tedious task take longer. Some of the licenses are wrong: cc-by-sa instead of cc-by, or wrong number. Some images do not have any link to Fotopedia. I am also renaming some images with meaningless names, and nominating for deletion some with a -nc or -nd license, and one out-of-scope self-portrait. Regards, Yann (talk) 11:11, 5 August 2014 (UTC)
Yesterday, on schedule, it went offline. There’s a placeholder page now, with final credits roll. -- Tuválkin 02:26, 12 August 2014 (UTC)

Filenames in Fotopedia

From what I learned so far, a “grabbable” direct link to a maximum resolution image in Fotopedia is something like

http://images.cdn.fotopedia.com/(author)-(photo)-original.jpg

The respective file information page includes useful things like its licensing, title, author name (and link) and usage; its generic url is

http://www.fotopedia.com/items/(author)-(photo)

Author pages’ generic url is

http://www.fotopedia.com/users/(author)

(Pretty simple, after all, and simpler than most such sites’ urls.) Given (photo) and (author), a competent bot could scrub this site of its 22000 compatibly licensed photos in a blink. That’s beyond my abilities, though. -- Tuválkin 13:23, 5 August 2014 (UTC)
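
A minimal Python sketch of the three URL patterns described above. The function names are made up for illustration, and the patterns are only what was observed in this discussion, not a documented Fotopedia API.

```python
# URL patterns as observed in this discussion (assumption: not an official API).
IMAGE_BASE = "http://images.cdn.fotopedia.com"
SITE_BASE = "http://www.fotopedia.com"


def image_url(author: str, photo: str) -> str:
    """Direct link to the maximum resolution image."""
    return f"{IMAGE_BASE}/{author}-{photo}-original.jpg"


def item_url(author: str, photo: str) -> str:
    """File information page (license, title, author name, usage)."""
    return f"{SITE_BASE}/items/{author}-{photo}"


def author_url(author: str) -> str:
    """Author page."""
    return f"{SITE_BASE}/users/{author}"
```

For example, `image_url("jmhullot", "Ryn5wakGVkc")` yields the direct link whose filename matches one of the uploads listed on this page.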

Some filenames (so far a couple dozen among almost a thousand) follow a different naming convention:
http://images.cdn.fotopedia.com/(uuid)-original.jpg
I don’t know why this is so, nor whether this is a more or less complete system of synonymy. So far a given image is either labelled like this or (most often) as given above. -- Tuválkin 20:52, 6 August 2014 (UTC)

Both (author) and (photo) are made up of 11 characters, repeatable and case sensitive, from a-z, A-Z, 0-9, plus hyphen and underscore. (The connecting hyphen may neighbour a code hyphen on either side.) -- Tuválkin 20:52, 6 August 2014 (UTC)

Some (author) values are not a jumble of 11 random characters but a human-readable, user-entered string, which can be shorter than 11 characters. -- Tuválkin 23:49, 6 August 2014 (UTC)
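
Since either code may itself contain hyphens, splitting a filename on the first hyphen is unreliable. The Python sketch below splits from the right instead, relying on (photo) always being 11 characters. It is only an illustration under assumptions: the function names are hypothetical, the uuid is assumed to be 32 lowercase hex digits (optionally preceded by an author name, as some filenames in the upload list on this page suggest), and the "-hd" suffix is taken from filenames seen on this page.

```python
import re

SUFFIX = re.compile(r"(?P<stem>.+)-(?P<size>original|hd)\.jpg")
PHOTO_ID = re.compile(r"[A-Za-z0-9_-]{11}")    # 11-character photo code
UUID_HEX = re.compile(r"[0-9a-f]{32}")          # assumed uuid shape


def parse_item_id(stem):
    """Split '(author)-(photo)' or '(uuid)' into (author, photo); None if unrecognised."""
    if UUID_HEX.fullmatch(stem):
        return (None, stem)
    # Assumed variant: author name followed by a uuid.
    if len(stem) > 33 and stem[-33] == "-" and UUID_HEX.fullmatch(stem[-32:]):
        return (stem[:-33], stem[-32:])
    # Usual form: the last 11 characters are the photo code, preceded by the connecting hyphen.
    if len(stem) > 12 and stem[-12] == "-" and PHOTO_ID.fullmatch(stem[-11:]):
        return (stem[:-12], stem[-11:])
    return None


def parse_filename(name):
    """Return (author, photo, size) for a Fotopedia-style filename, else None."""
    match = SUFFIX.fullmatch(name)
    if match is None:
        return None
    ids = parse_item_id(match.group("stem"))
    if ids is None:
        return None
    return (ids[0], ids[1], match.group("size"))
```

Note that `parse_filename("jmhullot-fSWmMhwz-pU-original.jpg")` keeps the inner hyphen as part of the photo code, which a left-to-right split would get wrong.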

Recently uploaded

Here’s the list of the urls for the 100 additional images uploaded today, out of about 22 thousand still to go in one week, before this site closes down. -- Tuválkin 02:48, 2 August 2014 (UTC)

  1. image url & image page + random context page    File:Chopper (motorcycle) by Suriya Donavanik.jpg
  2. image url & image page + random context page    File:Chiloé - Castro by S. Rossi.jpg
  3. image url & image page + random context page    File:Kaleŝoj.jpg
  4. image url & image page + random context page    File:Cepa volbo.jpg
  5. image url & image page + random context page    File:Blu'flav'verd.jpg
  6. image url & image page + random context page    File:lilins-2fa8ee8ffa95420e91b7c1e91244afc3-original.jpg
  7. image url & image page + random context page    File:Bibrka by Mykola Swarnyk.jpg
  8. image url & image page + random context page    File:Cyprus-Image stitching-Omodos.jpg
  9. image url & image page + random context page    File:Purico Complex.jpg
  10. image url & image page + random context page    File:mslorraine-0UWuC6xud4o-original.jpg
  11. image url & image page + random context page    File:Volcan Osorno and Saltos de Petrohue.jpg
  12. image url & image page + random context page    File:Laguna Miscanti by S. Rossi.jpg
  13. image url & image page + random context page    File:Reading The Night Sky; Is There Love In Space by Dean Kavanagh.jpg
  14. image url & image page + random context page    File:On The Roof, In The Old City by Dean Kavanagh.jpg
  15. image url & image page + random context page    File:Kajo postdompara.jpg
  16. image url & image page + random context page    File:Synevir by Mykola Swarnyk.jpg
  17. image url & image page + random context page    File:Hong Kong just before the storm.jpg
  18. image url & image page + random context page    File:Embalse el Yeso.jpg
  19. image url & image page + random context page    File:Gili Air Eastern coast looking at Lombok.jpg
  20. image url & image page + random context page    File:Heidarinia-1.jpg
  21. image url & image page + random context page    File:Iceland 000.jpg
  22. image url & image page + random context page    File:Iceland 001.jpg
  23. image url & image page + random context page    File:Iceland-List of waterfalls-Waterfall.jpg
  24. image url & image page + random context page    File:normlewis-9dc8b9cfc2383ad6688365c8881a2daa-hd.jpg
  25. image url & image page + random context page    File:normlewis-bf0e9f6afb5e85cf57764dc60e58db76-hd.jpg
  26. image url & image page + random context page    File:normlewis-e4c084455e73fad2211be81f2ce1bfc3-hd.jpg
  27. image url & image page + random context page    File:normlewis-f161664caff9938df6b2e84ecb23da7f-hd.jpg
  28. image url & image page + random context page    File:sippakorn-gp3jl3jSonc-hd.jpg
  29. image url & image page + random context page    File:Volcano Osorno.jpg
  30. image url & image page + random context page    File:Isla Damas.jpg
  31. image url & image page + random context page    File:Isola di Capraia by S. Rossi.jpg
  32. image url & image page + random context page    File:La Campana National Park by S. Rossi.jpg
  33. image url & image page + random context page    File:Salar de Tara by S. Rossi.jpg
  34. image url & image page + random context page    File:Serapias lingua hybrid.jpg
  35. image url & image page + random context page    File:Choppers (motorcycles) by Suriya Donavanik.jpg
  36. image url & image page + random context page    File:lilins-fd19a6d1a10a651564e475790b379e7e-hd.jpg
  37. image url & image page + random context page    File:Bitexco Financial Tower 000.jpg
  38. image url & image page + random context page    File:Citroen Traction Avant-Gia Long Palace.jpg
  39. image url & image page + random context page    File:Bitexco Financial Tower 001.jpg
  40. image url & image page + random context page    File:Bitexco Financial Tower 002.jpg
  41. image url & image page + random context page    File:Neĝa domo.jpg
  42. image url & image page + random context page    File:Neĝa arbo.jpg
  43. image url & image page + random context page    File:Pad'senfolia.jpg
  44. image url & image page + random context page    File:Ora horo.jpg
  45. image url & image page + random context page    File:CitroënAvant-FilleDerrière.jpg
  46. image url & image page + random context page    File:Saigon Central Post Office.jpg
  47. image url & image page + random context page    File:jmhullot-Ryn5wakGVkc-original.jpg
  48. image url & image page + random context page    File:Saigon Zoo and Botanical Gardens 000.jpg
  49. image url & image page + random context page    File:jmhullot-Zendtq A CI-original.jpg
  50. image url & image page + random context page    File:Saigon Central Post Office-District 1 Ho Chi Minh City-French Colonial.jpg
  51. image url & image page + random context page    File:Nub'timinda.jpg
  52. image url & image page + random context page    File:jmhullot-NDTUWQaCWyk-original.jpg
  53. image url & image page + random context page    File:Ben Thanh Market.jpg
  54. image url & image page + random context page    File:jmhullot-HSC1ZTzziwo-original.jpg
  55. image url & image page + random context page    File:jmhullot-MkD06Z3JQ78-original.jpg
  56. image url & image page + random context page    File:jmhullot-bn Dekq1vrE-original.jpg
  57. image url & image page + random context page    File:jmhullot-KiGOWZcTEtc-original.jpg
  58. image url & image page + random context page    File:jmhullot-Emkw U9K0j0-original.jpg
  59. image url & image page + random context page    File:Ho Chi Minh City Hall.jpg
  60. image url & image page + random context page    File:Saigon Zoo and Botanical Gardens 001.jpg
  61. image url & image page + random context page    File:Cu Chi tunnels.jpg
  62. image url & image page + random context page    File:Dam Sen Cultural Park.jpg
  63. image url & image page + random context page    File:Hotel Continental Ho Chi Minh City.jpg
  64. image url & image page + random context page    File:Dam Sen Cultural Park 001.jpg
  65. image url & image page + random context page    File:Dam Sen Cultural Park 002.jpg
  66. image url & image page + random context page    File:Ben Thanh Market 001.jpg
  67. image url & image page + random context page    File:Cu Chi tunnels 001.jpg
  68. image url & image page + random context page    File:Cu Chi tunnels 002.jpg
  69. image url & image page + random context page    File:Ho Chi Minh City Hall-List of city and town halls.jpg
  70. image url & image page + random context page    File:Monsoon-Cloud.jpg
  71. image url & image page + random context page    File:Dam Sen Cultural Park 003.jpg
  72. image url & image page + random context page    File:Ho Chi Minh City Hall 001.jpg
  73. image url & image page + random context page    File:Hotel Majestic Saigon.jpg
  74. image url & image page + random context page    File:Saigon Central Post Office-Gustave Eiffel.jpg
  75. image url & image page + random context page    File:Templo sunfala.jpg
  76. image url & image page + random context page    File:jmhullot-LVHxSCiTpls-original.jpg
  77. image url & image page + random context page    File:jmhullot-cgVG2irf6ts-original.jpg
  78. image url & image page + random context page    File:jmhullot-fSWmMhwz-pU-original.jpg
  79. image url & image page + random context page    File:jmhullot-kI3WlutbSa8-original.jpg
  80. image url & image page + random context page    File:jmhullot-uwbFz3dPjlI-original.jpg
  81. image url & image page + random context page    File:lilins-a0254bb0bb5b5d24738c2aecf598f667-original.jpg
  82. image url & image page + random context page    File:jmhullot-q7zHrlK1e0U-original.jpg
  83. image url & image page + random context page    File:Arbaro.jpg
  84. image url & image page + random context page    File:Padeto.jpg
  85. image url & image page + random context page    File:jmhullot-yB4GFgJp2Lc-original.jpg
  86. image url & image page + random context page    File:Iceland 002.jpg
  87. image url & image page + random context page    File:Volcanos Lascar left and Aguas Calientes right.jpg
  88. image url & image page + random context page    File:Chicago by Dragan Maksimovic.jpg
  89. image url & image page + random context page    File:Heidarinia-2.jpg
  90. image url & image page + random context page    File:Iceland 003.jpg
  91. image url & image page + random context page    File:Iceland-List of waterfalls.jpg
  92. image url & image page + random context page    File:Laguna Miniques.jpg
  93. image url & image page + random context page    File:Laredo Puerto 02.jpg
  94. image url & image page + random context page    File:Reflection after a rain.jpg
  95. image url & image page + random context page    File:ALMA by S. Rossi.jpg
  96. image url & image page + random context page    File:Drako 000.jpg
  97. image url & image page + random context page    File:Drako 001.jpg
  98. image url & image page + random context page    File:jmhullot-Fbg-6Mc3pXw-original.jpg
  99. image url & image page + random context page    File:jmhullot-fW9zl645xZ4-original.jpg
  100. image url & image page + random context page    File:Trg Slobode, Novi Sad.jpg

Second manual batch

I pre-selected and uploaded locally 656 more images (max res; earlier counts were 683, then 669); this set doesn’t include any evident off-scope, copyvio/FoP, or duplicate images. I’ll upload them with much better starting info, also taking Yann’s remarks into account. -- Tuválkin 15:38, 5 August 2014 (UTC)

A first experimental batch of 34 was uploaded just now; some great photos here! They need license review from an admin now (pinging User:Yann). They likewise need categorization, some cropping, etc., but that can also be done after Fotopedia.com goes belly up, so I’ll focus on uploading more for now: 635 to go. -- Tuválkin 04:36, 6 August 2014 (UTC)
There is some issue with the "Info" template. See File:Thai boys eating icecream.jpg. I think the name could be improved. I renamed this. And you should ask for the reviewer right. Regards, Yann (talk) 14:01, 6 August 2014 (UTC)
I know about the issue with the "Info" template; it is caused by me cramming more stuff in than Vicuña would let me without breaking it. I did make some improvements in the two most recent batches (mainly now lowercase template arguments, to avoid automated duplication) and I do plan to fix it up upon recategorizing; the cleanup is trivial, anyway. As for the reviewer right: cool, but it defeats the goal if one reviews one’s own uploads, especially when we’re pressed for time, I think. (Concerning filenames, we’re done talking.) -- Tuválkin 21:40, 6 August 2014 (UTC)
I ended up uploading 574 photos from this site; other people uploaded a few more, adding to the 98 we had from before the demise of Fotopedia was announced. Could have been (even) worse, in terms of salvaging, but read below about alternatives. -- Tuválkin 02:26, 12 August 2014 (UTC)

License washing?

Hi, seeing that many images on Fotopedia are copied from Flickr, I have some worries about license washing. See these cases: Commons:Deletion requests/File:Sylvain Lefebvre.jpg and Commons:Deletion requests/File:Ramos Casillas Copa del Rey 2011.jpg. Regards, Yann (talk) 17:51, 5 August 2014 (UTC)

Yes, these cases need to be singled out and shot on sight (maybe also this), but this doesn’t seem to be an egregious case of serial copyvio, surely not enough to blanket the site as a questionable source. Of course all images from delinquent uploaders should be scrutinized as a priority after a copyvio is detected. (I’m glad these were not among those uploaded by me, at this time of requiem for Fotopedia…) -- Tuválkin 05:03, 6 August 2014 (UTC)
Not the whole site, but all images which have been copied from Flickr. If an image is either not available on Flickr, or its license on Flickr is not acceptable, we should question it. Regards, Yann (talk) 15:12, 6 August 2014 (UTC)
So far I didn’t find any clear indication that an image is also available on Flickr. Regardless of anything else, though, as Fotopedia is closing in 3 days and Flickr is not, anything that can be grabbed from Flickr should be left out, because time is short. -- Tuválkin 21:43, 6 August 2014 (UTC)

An alternative approach?

Given that we have a list of 22'000 URLs that we'd like to copy, but currently no bot to do so... would it maybe be faster to save those 22'000 URLs to web.archive.org? I just did so for http://www.fotopedia.com/items/9MaaEZ8Hz0A-a2a7eY_MvkQ, and now archive.org even serves the full-res original file from its own copy. Just an idea... (Note: other items already exist at archive.org, for instance http://www.fotopedia.com/items/AW6O4d24noA-pTK2E6PugrM exists at [1], and the download link also works and serves the full-res from archive.org.) Lupo 22:17, 6 August 2014 (UTC)
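
A rough Python sketch of how such a bulk save could be scripted, assuming the Wayback Machine's "Save Page Now" form, where the target URL is appended to https://web.archive.org/save/. The function names and the delay between requests are illustrative assumptions; the archive's actual rate limits are not stated in this discussion.

```python
import time
import urllib.request

SAVE_PREFIX = "https://web.archive.org/save/"


def save_url(target: str) -> str:
    """Build the 'Save Page Now' request URL for one page."""
    return SAVE_PREFIX + target


def archive_all(targets, delay_seconds=5.0):
    """Request a snapshot of each URL in turn.

    Returns (url, HTTP status or error text) pairs. The pause between
    requests is an arbitrary courtesy value, not a documented limit.
    """
    results = []
    for target in targets:
        try:
            with urllib.request.urlopen(save_url(target), timeout=60) as response:
                results.append((target, response.status))
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results.append((target, str(exc)))
        time.sleep(delay_seconds)
    return results
```

At five seconds per request, 22'000 pages would take a little over 30 hours, so several parallel runs would have been needed to finish comfortably before the shutdown.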

Looks like a fine idea, but do we «have a list of 22'000 URLs»? I hope so, but I don’t. Who does? -- Tuválkin 23:43, 6 August 2014 (UTC)
Isn't that the Google search that Green Giant posted above? Lupo 05:42, 7 August 2014 (UTC)
Maybe, but I do not know how to extract what the browser gets from that http-req into a raw text list of urls. If someone knows, please post it here. -- Tuválkin 17:04, 7 August 2014 (UTC)
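
One possible answer, as a hedged Python sketch: save the results page as HTML, then scan its text for item-page URLs. It assumes the links appear in the page either literally or percent-encoded inside redirect parameters; the actual markup of the search results is not known from this discussion, and `extract_item_urls` is a made-up name.

```python
import re
from urllib.parse import unquote

# Item pages look like http://www.fotopedia.com/items/(author)-(photo)
ITEM_URL = re.compile(r"https?://www\.fotopedia\.com/items/[A-Za-z0-9_-]+")


def extract_item_urls(html_text: str) -> list:
    """Return the distinct item-page URLs found in saved HTML, in order of first appearance."""
    decoded = unquote(html_text)  # undo %3A%2F%2F style encoding in redirect links
    seen = {}
    for url in ITEM_URL.findall(decoded):
        seen.setdefault(url, None)  # dict keeps insertion order and drops repeats
    return list(seen)
```

Writing the result out with one URL per line gives the raw text list asked for above.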
Also: it might be worth trying to contact Fotopedia directly. Maybe they'd be interested in collaborating with WMF to salvage some of their users' content. Lupo 05:45, 7 August 2014 (UTC)
ArchiveTeam has a project for saving Fotopedia's content. --Gazebo (talk) 05:06, 8 August 2014 (UTC)
Yay! They saved it all, should be available soon. -- Tuválkin 05:54, 8 August 2014 (UTC)
It seems the images with potential to be hosted here may find another home, according to a Creative Commons blog posting of 9 August. --Odysseus1479 (talk) 06:59, 10 August 2014 (UTC)
The Creative Commons images (I don't know whether that means only the free CC ones or any CC) are at https://archive.org/details/2014.08.fotopedia-cc-export-collection (277 GB). The collection linked above is mostly meant for the Wayback Machine (WARC files).
They're mostly unfree though:
$ grep -Ec "/(publicdomain|by|by-sa)/" export-fotopedia-cc.tsv 
48373
$ grep -Evc "/(publicdomain|by|by-sa)/" export-fotopedia-cc.tsv 
258220
--Nemo 15:43, 15 August 2014 (UTC)
(ec) Excellent find. I downloaded http://archive.org/download/2014.08.fotopedia-cc-export-collection/fotopedia-tar.txt, the full list of all file names and sizes (and local timestamps and Unix flags). Too bad the HTML files are bundled with the JPGs: one needs to download that almost 300 GB blob to get a couple megabytes of information and from there list the (relatively few) ones with CC-by and CC-by-SA licenses. -- Tuválkin 19:00, 15 August 2014 (UTC)
Never mind: It is trivial to extract from export-fotopedia-cc.tsv that list of the (relatively few) photos with CC-by and CC-by-SA licenses. Someone savvy with grep and gz could do it. -- Tuválkin 20:50, 15 August 2014 (UTC)
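
A small Python sketch of that extraction, reusing the same pattern as the grep commands above. The column layout of export-fotopedia-cc.tsv is not described here, so the sketch keeps whole rows rather than picking fields; the function names are hypothetical, and the .gz handling is only a convenience in case the file is stored compressed.

```python
import gzip
import re

# Same pattern as the grep above: public domain, CC-BY and CC-BY-SA license URLs.
FREE_LICENSE = re.compile(r"/(publicdomain|by|by-sa)/")


def free_rows(lines):
    """Keep only the rows whose license URL is Commons-compatible."""
    return [line.rstrip("\n") for line in lines if FREE_LICENSE.search(line)]


def free_rows_from_file(path):
    """Read a plain or gzip-compressed TSV and return its freely licensed rows."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return free_rows(handle)
```

Because the pattern requires a slash right after "by" or "by-sa", rows with -nc or -nd variants such as "/by-nc/" or "/by-nc-sa/" are excluded.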
Doesn't seem to have happened yet though. :) Commons:Batch uploading/Fotopedia (thanks for creating, this discussion in Category_talk is not so easy to find). --Nemo 21:51, 19 September 2014 (UTC)