User talk:Danmichaelo

This talk page is automatically archived by ArchiveBot. Any sections older than 60 days are automatically archived. Sections without timestamps are not archived.


A (challenging) proposal for enhancing CropTool

Do you know how Internet Archive works with texts, how it is used as one of the best free djvu/pdf sources for Wikisource work, and the details of the rich set of derived files stored in every IA item? In brief, there is one high-definition image for every pdf/djvu page of an IA book, whereas the images inside both the djvu and the pdf are highly compressed. The high-definition images can be downloaded and cropped manually. Ask me for more only if you are really interested, though; it's a hard issue. --Alex_brollo Talk|Contrib 10:09, 18 August 2016 (UTC)

Let's discuss how the process could work. Would this be for djvu/pdf files that have already been transferred to Commons (I guess they have a link back to, or files that only live on In the latter case, I guess a user would just paste the URL from into CropTool and work from there. Do the URLs from include page numbers? – Danmichaelo (δ) 11:58, 18 August 2016 (UTC)
Things are a little more complex: the high-resolution page images are stored as jp2 files inside a zip. A dynamic request to the right IA server builds a jpg image from the jp2 and sends it back. Here is an example of an IA URL that returns a high-resolution jpg:
As you see, the URL needed is a call to a php script, not a static URL, and it is wrapped in a very complex, multi-server URL. :-( --Alex_brollo Talk|Contrib 13:39, 18 August 2016 (UTC)
Given the IA item (LingenosoIdalgoDon_chisciotteDellaManciaVol.2), it seems like you can get the rest (server, dir) from the metadata API: . What confuses me about this item, though, is that the 64 MB pdf file is marked as "original", while the 843 MB one is marked as "derivative". Any idea why? – Danmichaelo (δ) 18:57, 18 August 2016 (UTC)
Yes. IA items are uploaded by contributors as pdf files or as zipped images; those uploads are the "original" files. Both are normalized and somehow deskewed by IA's powerful servers to produce the "derivative" _jp2 images; these are the source for all subsequent processing (OCR, pdf, djvu...). The djvu is currently still derived but not published; it is used to extract the text and _djvu.xml, and the latter contains the "word mapping", i.e. the coordinates of words in the page image. I presume that the IA book viewer uses the _jp2 images + _djvu.xml to highlight searched words. The _jp2 images are "homologous" to the images wrapped in the djvu or pdf files, so that coordinates in the _jp2 images can be derived exactly from the jpg images coming from IA's djvu/pdf files. --Alex_brollo Talk|Contrib 08:48, 19 August 2016 (UTC)
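
A minimal sketch of the coordinate mapping mentioned in the last comment, for illustration only. It assumes that "homologous" means the two images show the same page and differ only by a uniform resize with no offset; nothing above confirms that this holds exactly for IA files. The names Box and scale_box are hypothetical and belong to neither CropTool nor IA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A crop rectangle in pixel coordinates (origin at the top-left corner)."""
    left: float
    top: float
    right: float
    bottom: float


def scale_box(box, from_size, to_size):
    """Map a box drawn on one rendering of a page onto another rendering.

    from_size and to_size are (width, height) tuples in pixels.
    Assumption: the two images differ only by a resize, so each axis
    is multiplied by its own scale factor and no offset is applied.
    """
    scale_x = to_size[0] / from_size[0]
    scale_y = to_size[1] / from_size[1]
    return Box(
        left=box.left * scale_x,
        top=box.top * scale_y,
        right=box.right * scale_x,
        bottom=box.bottom * scale_y,
    )
```

Under that assumption, a crop selected on a small djvu/pdf page image could be converted once and then applied to the high-resolution image of the same page.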

Another (easy) proposal for CropTool

Most Wikisource users of CropTool (two out of three by now, I presume... ;-) did you get any more feedback?) feel the need for one more input field to add one or more category names (it would be great to use the Commonist convention, i.e., if I remember right, category names separated by a | character).

Perhaps the best option would be a preview of the full description text that CropTool is going to upload, just as the IA upload bot does, allowing users to add or edit whatever they need.

PS: Thanks for the drop-down field for page numbers!--Alex_brollo Talk|Contrib 14:24, 19 August 2016 (UTC)

@Alex brollo: Right, a full description editor is easy to add, but where should it be placed in the user interface? (without making it too clumsy and crowded :)) Let me know if you have suggestions. A category field is easier to fit in, I guess. It's also possible I could get HotCat to work. Btw. I've added a GitHub issue on this as well. – Danmichaelo (δ) 20:48, 30 August 2016 (UTC)
Consider that wikisourcians will probably be the main users of the tool, and that they are not confused by crowded textareas :-) Could it be hidden in a collapsed field? But really, a field for categories might be sufficient; the best would be if added categories could be "remembered" while uploading different pages of the same book.--Alex_brollo Talk|Contrib 21:11, 30 August 2016 (UTC)
Yeah, I think a collapsed field could work. But I'm not sure if a single category field could be more efficient to work with sometimes? In any case I'll experiment a little with this, but it probably won't be in the next couple of weeks since I need to prioritize other projects for some time now. – Danmichaelo (δ) 21:26, 30 August 2016 (UTC)
Thanks. There's no hurry, the tool as it is now allows a tenfold increase in speed and consistency of uploads... :-) I added a "CropTool calling gadget" into la.source too. --Alex_brollo Talk|Contrib 17:01, 31 August 2016 (UTC)
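
A minimal sketch of how the category field discussed in this thread could be turned into wikitext, for illustration only. It assumes the "|" separator recalled (with some doubt) earlier in the thread; the function names parse_categories and categories_to_wikitext are hypothetical and are not part of CropTool. The [[Category:Name]] output is ordinary MediaWiki category syntax.

```python
def parse_categories(raw):
    """Split a '|'-separated string into a clean list of category names.

    Surrounding whitespace, empty entries, an optional 'Category:' prefix
    and duplicates are dropped; the original order is kept.
    """
    names = []
    for part in raw.split("|"):
        name = part.strip()
        if name.lower().startswith("category:"):
            name = name[len("category:"):].strip()
        if name and name not in names:
            names.append(name)
    return names


def categories_to_wikitext(names):
    """Render category names as one [[Category:...]] link per line."""
    return "\n".join(f"[[Category:{name}]]" for name in names)
```

Keeping the parsed list around between uploads would be one way to get the "remembered" categories asked for above when several pages of the same book are cropped in a row.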

DigitaltMuseum: DEXTRA Photo

Hi, I can't get {{DigitaltMuseum}} to work correctly for images from Norsk Teknisk Museum with inventory numbers that begin with "DEX_KF_". See for example File:Eldfisk_oljeplatform_(DEX_KF_000632).jpg, where the image link just leads to DigitaltMuseum's front page. - 4ing (talk) 07:17, 14 September 2016 (UTC)

@4ing: Hi, I think this is something KulturIT has to fix on their side, so I sent them an email. In my experience they take a very long time to reply, but we'll see. – Danmichaelo (δ) 20:36, 25 September 2016 (UTC)
@4ing: This time they replied quickly: "This is a bug that has been around for a long time. We are now working on a new version of DigitaltMuseum. We will make sure that we handle underscores there." So I guess we just have to be patient. – Danmichaelo (δ) 10:27, 28 September 2016 (UTC)
Thanks, it looks like this affects a great many of the file descriptions that contain this template. - 4ing (talk) 14:05, 28 September 2016 (UTC)
@4ing: Darn, I really should have seen this right away, but the template had the wrong institution code. When "DEXTRA Photo" is listed under "Institusjon", the code to use is "KFS". The link works now. A list of supported institution codes can be found at {{DigitaltMuseum}} (incomplete, but extended as needed :)) – Danmichaelo (δ) 08:31, 6 October 2016 (UTC)
Thank you, I'll go and correct the codes then! - 4ing (talk) 09:30, 6 October 2016 (UTC)