User:Bertspaan/maps

Structured Data for Maps

Goal:

Design a metadata format that:

  • captures the transformations needed to georectify (historical) maps;
  • captures the pixel mask that can be used to remove the non-cartographic parts of those maps.

This metadata format should work with Wikidata (d:Q15726418), Wikimedia Commons (File:Amsterdam1688.jpg) and IIIF ([1]).

This metadata format could use JSON Schema (https://json-schema.org/) to describe and verify metadata.
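
As a rough illustration of what such a schema could look like, the sketch below writes one as a Python dictionary and checks a document against it. The property names (`maps`, `gcps`, `image`, `world`, `mask`) follow the JSON structure proposed further down this page. The minimum of three GCPs and three mask vertices, and the name `MAPS_SCHEMA`, are assumptions made here for illustration. The `validate` function is a tiny hand-written checker that understands only the keywords used in this schema; in practice a full JSON Schema validator would take its place.

```python
# Hypothetical schema for the proposed structure. Property names come from
# the example on this page; minItems values are assumptions.
POINT = {"type": "array", "items": {"type": "number"}, "minItems": 2, "maxItems": 2}

MAPS_SCHEMA = {
    "type": "object",
    "required": ["maps"],
    "properties": {
        "maps": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["gcps"],
                "properties": {
                    "gcps": {
                        "type": "array",
                        "minItems": 3,  # assumption: enough for a first-order transform
                        "items": {
                            "type": "object",
                            "required": ["image", "world"],
                            "properties": {"image": POINT, "world": POINT},
                        },
                    },
                    "mask": {"type": "array", "minItems": 3, "items": POINT},
                },
            },
        }
    },
}

TYPES = {"object": dict, "array": list, "number": (int, float)}


def validate(instance, schema, path="$"):
    """Return a list of error strings. Supports only the keywords used above."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(instance, TYPES[expected])
        if expected == "number" and isinstance(instance, bool):
            ok = False  # bool is a subclass of int in Python, but not a JSON number
        if not ok:
            return [f"{path}: expected {expected}"]
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing '{key}'")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:
                errors += validate(instance[key], sub, f"{path}.{key}")
    if isinstance(instance, list):
        if len(instance) < schema.get("minItems", 0):
            errors.append(f"{path}: fewer than {schema['minItems']} items")
        if "maxItems" in schema and len(instance) > schema["maxItems"]:
            errors.append(f"{path}: more than {schema['maxItems']} items")
        if "items" in schema:
            for i, item in enumerate(instance):
                errors += validate(item, schema["items"], f"{path}[{i}]")
    return errors
```

Calling `validate(document, MAPS_SCHEMA)` returns an empty list for a well-formed document and one message per problem otherwise.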


Resources

Phabricator ticket

Sample maps

Documentation / Presentations

Wikidata / SDC properties and resources for maps

Existing properties

Property proposals

Data to be stored

In [Map Warper](http://maps.nypl.org/warper/), the following information is captured for each georectified map:

The following JSON structure could store the ground control points (GCPs) and mask:

{
  "maps": [
    {
      "gcps": [
        {
          "image": [420, 503],
          "world": [4.900, 52.162]
        },
        {
          "image": [1801, 1700],
          "world": [4.991, 52.362]
        },
        {
          "image": [1001, 1201],
          "world": [4.224, 52.962]
        }
      ],
      "mask": [
        [100, 2010],
        [4032, 2010],
        [4100, 300],
        [100, 200]
      ]
    }
  ]
}
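
To show how the GCPs in this structure relate pixels to coordinates, here is a minimal sketch that derives a first-order (affine) transformation from exactly three GCPs and applies it to a pixel. It assumes that `world` is ordered `[longitude, latitude]` and that three non-collinear points are given. The function names are hypothetical, and this is a stand-in for illustration only; it is not how GDAL or Map Warper compute their transformations.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a * x = b using Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("GCPs are collinear; no unique affine transform")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]  # replace one column with the right-hand side
        solution.append(det(m) / d)
    return solution


def affine_from_gcps(gcps):
    """Fit lon = a*x + b*y + c and lat = d*x + e*y + f through three GCPs."""
    if len(gcps) != 3:
        raise ValueError("this sketch handles exactly three GCPs")
    rows = [[g["image"][0], g["image"][1], 1.0] for g in gcps]
    lon = solve3(rows, [g["world"][0] for g in gcps])  # assumed: world[0] is longitude
    lat = solve3(rows, [g["world"][1] for g in gcps])  # assumed: world[1] is latitude
    return lon, lat


def pixel_to_world(coeffs, x, y):
    """Apply the fitted affine transform to one pixel position."""
    lon, lat = coeffs
    return (lon[0] * x + lon[1] * y + lon[2],
            lat[0] * x + lat[1] * y + lat[2])


gcps = [
    {"image": [420, 503], "world": [4.900, 52.162]},
    {"image": [1801, 1700], "world": [4.991, 52.362]},
    {"image": [1001, 1201], "world": [4.224, 52.962]},
]
coeffs = affine_from_gcps(gcps)
```

With more than three GCPs the system is overdetermined and would need a least-squares fit or a higher-order transformation, which this sketch does not cover.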

Or we could use a GeoJSON-based format, like so: https://commons.wikimedia.org/wiki/Data:Pigot_and_Co_(1842)_p2.138_-_Map_of_Lancashire.georef.map

  • A scanned map can consist of multiple maps (e.g. map sheets, inset maps). We can support this by allowing multiple georectified maps in a single JSON `maps` array.
  • Is it necessary to allow for other projections? The GeoJSON standard [only supports WGS 84](https://tools.ietf.org/html/rfc7946#section-4), but the arguments behind that decision matter less for this proposal: we're using GDAL anyway, and GDAL supports a wide range of map projections.
  • The mask polygon is different from a GeoJSON polygon: our mask does not support holes.
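
Because the mask is a single ring without holes, deciding whether a pixel belongs to the cartographic part of the scan needs only a plain point-in-polygon test. The sketch below uses the even-odd (ray casting) rule on image pixel coordinates. The function name `inside_mask` is hypothetical, and the mask is assumed to be a list of `[x, y]` pixel pairs as in the JSON example above.

```python
def inside_mask(mask, x, y):
    """Even-odd ray casting: True if pixel (x, y) lies inside the mask ring."""
    inside = False
    n = len(mask)
    for i in range(n):
        x1, y1 = mask[i]
        x2, y2 = mask[(i + 1) % n]  # the ring closes back to the first vertex
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


mask = [[100, 2010], [4032, 2010], [4100, 300], [100, 200]]
print(inside_mask(mask, 2000, 1000))  # a pixel in the middle of the map
print(inside_mask(mask, 50, 1000))    # a pixel in the left margin
```

Pixels for which the test returns `False` are the ones a tool would make transparent or crop away. Points lying exactly on the mask boundary are not handled specially in this sketch.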

Tools

When we've designed this metadata format, we can:


Currently, only the bounding box is stored as structured data:

https://commons.wikimedia.org/wiki/File:Helsingin_kartta_Nummelin_1876.png

https://docs.google.com/presentation/d/1OkJZVjF471LPywNSCtuTuvjUEBRdHMI0znas_a1Ms6k/edit#slide=id.g59a6f0df93_0_2

https://observablehq.com/@bertspaan/proposal-for-wikimania-2019-hackathon

Wikimedia representation

Source code

https://github.com/bertspaan/wikimania-hackathon-2019