Digitizing and How to Avoid It

Objectives of lecture:

  1. Digitizing as transformations
    1. Hardware measurement and representation
    2. Software control of relationships
  2. Practical issues of making databases
    1. Huge expenditures on conversion
    2. Data Quality issues
  3. Avoiding Digitizing: <Finding someone who has already done it>

(Chapter 3)

GIS depends on data. In this early era of GIS adoption, that data comes mostly from the 'conversion' of existing maps. There are some nasty side effects of this process. The position of a symbol on a map (the representation) may be controlled by the original measurement of the corresponding 'feature', but it may have been 'displaced' as a part of map compilation, generalization, etc. [the sum of all the operations performed after the original measurement].

The measurement frameworks used in traditional maps (such as contours) may not match the measurement framework desired - thus transformations are the core of the conversion process.


Digitizing hardware:

Vector tracing

normally using a hand-held cursor on a digitizing table. The cursor is NOT a mouse. A mouse uses relative coordinates (the screen pointer doesn't move when you pick the mouse up). The digitizer cursor measures absolute location on the surface: when you plunk it down, the pointer jumps to that position.

Point mode / stream mode; accuracy is mostly controlled by the skill of the operators and by line weights, though the hardware may be accurate to 0.005 or 0.003 inch.
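To get a feel for what those hardware figures mean on the ground, here is a small arithmetic sketch. The map scale of 1:24,000 is an assumed example (the lecture does not name a scale), and the function name is made up for illustration.

```python
INCH_TO_METRE = 0.0254  # exact, by definition of the inch


def ground_error_metres(table_error_inch, scale_denominator):
    """Ground distance represented by a given error measured on the map sheet."""
    return table_error_inch * scale_denominator * INCH_TO_METRE


# 1:24,000 is an assumed example scale, not a value from the lecture.
for err in (0.005, 0.003):
    metres = ground_error_metres(err, 24_000)
    print(f"{err} inch on a 1:24,000 sheet is about {metres:.2f} m on the ground")
```

At that assumed scale the hardware limit works out to roughly 2 to 3 metres on the ground, before operator skill and line weight are taken into account.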

Raster scanning

originally a single sensor on a drum, now a push-broom CCD array. The map image is converted into measurement by pixel; image processing produces vector measurements (edge detection, line following).

All digitizers produce coordinates in their local space (integers).

USGS GIS poster


Digitizing Software

Registration connects known points on the map image to their intended location. Hidden in most software: a 'fit' computed between measurements (more on this in lecture on changing coordinate reference systems).
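One common form of this 'fit' is a least-squares affine transformation from table coordinates to map coordinates. The sketch below is an illustration under that assumption, not the method of any particular package; the function names and the control-point values are hypothetical.

```python
from math import sqrt


def solve3(m, v):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [list(row) + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or duplicated")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(table_pts, map_pts):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f."""
    n = len(table_pts)
    if n < 3 or n != len(map_pts):
        raise ValueError("need at least three matched control points")
    # Centre the table coordinates to keep the normal equations well conditioned.
    cx = sum(x for x, _ in table_pts) / n
    cy = sum(y for _, y in table_pts) / n
    rows = [(x - cx, y - cy, 1.0) for x, y in table_pts]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in (0, 1):  # k = 0 fits map X, k = 1 fits map Y
        atb = [sum(r[i] * m[k] for r, m in zip(rows, map_pts)) for i in range(3)]
        a, b, c = solve3(ata, atb)
        coeffs.append((a, b, c - a * cx - b * cy))  # undo the centring
    return coeffs


def apply_affine(coeffs, pt):
    """Transform one table point into map coordinates."""
    (a, b, c), (d, e, f) = coeffs
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)


def rms_error(coeffs, table_pts, map_pts):
    """Root-mean-square distance between fitted and intended positions."""
    total = 0.0
    for t, m in zip(table_pts, map_pts):
        px, py = apply_affine(coeffs, t)
        total += (px - m[0]) ** 2 + (py - m[1]) ** 2
    return sqrt(total / len(table_pts))


# Hypothetical registration tics: digitizer counts and intended map coordinates.
tics_table = [(100, 100), (2100, 100), (2100, 1600), (100, 1600)]
tics_map = [(500050.0, 4000050.0), (501050.0, 4000050.0),
            (501050.0, 4000800.0), (500050.0, 4000800.0)]
fit = fit_affine(tics_table, tics_map)
print("RMS error of fit:", rms_error(fit, tics_table, tics_map))
```

With more than three control points the RMS error gives a rough check on the registration: a large value suggests a mis-keyed tic or a distorted sheet.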

Topological structure:
either the digitizing operator or the software can create the relationships. Software for 'planar enforcement' is essentially the overlay engine, applied in a different way.
(Remember the introduction to topology back in the Representation lecture.)

Different strategies:

Verification and Quality Control:

either a lot of visual inspection, or use software to verify relationships (integrity constraints) expected to occur - the topological model for geometry; attributes verified by other relationships (completeness from list of all objects) or brute force inspection.
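As a sketch of what a software integrity check might look like, the following counts how many arc ends meet at each node and reports nodes touched by only one end. It assumes arcs are stored as lists of (x, y) pairs and that ends closer than the rounding precision count as the same node; the function name and the sample coordinates are hypothetical.

```python
from collections import Counter


def find_dangles(arcs, precision=3):
    """Return nodes where exactly one arc end arrives (dead-end nodes)."""
    ends = Counter()
    for arc in arcs:
        for x, y in (arc[0], arc[-1]):
            # Rounding stands in for a snapping tolerance (an assumption).
            ends[(round(x, precision), round(y, precision))] += 1
    return sorted(node for node, count in ends.items() if count == 1)


# Hypothetical boundary network: a square split by a divider that stops short.
arcs = [
    [(0, 0), (5, 0)], [(5, 0), (10, 0)],
    [(10, 0), (10, 10)],
    [(10, 10), (5, 10)], [(5, 10), (0, 10)],
    [(0, 10), (0, 0)],
    [(5, 0), (5, 9.8)],  # should have reached the node at (5, 10)
]
print(find_dangles(arcs))
```

In a network of polygon boundaries every flagged node deserves a look; in other networks (roads, streams) some dead ends are legitimate, so the check only narrows down where to inspect.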

Examples of Digitizing Errors

[Figures: digitizing errors at dead-end nodes; digitizing errors attaching identifiers]

Errors at dead-end nodes:

a) Missing line
This case occurs by leaving out Lake Kariba, or coding it as 'water' not 'international boundary'.
Undershoot
Overshoot

Errors attaching identifiers:

a) Multiple identifiers
from undershoots or missing lines.
b) No identifier
from simple omission.
c) Extraneous linework
creates unlabelled polygons.
(Entering a railroad into the 'country' boundary network can create extra 'countries'.)
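The identifier errors above can be caught by counting the identifiers that fall inside each polygon. The sketch below assumes polygons are simple rings of (x, y) vertices and identifiers are stored as label points; the ray-casting test and all names and coordinates are illustrative, not taken from any particular GIS.

```python
def point_in_polygon(pt, ring):
    """Ray-casting test: is pt inside the ring of (x, y) vertices?"""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through pt
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def check_identifiers(polygons, labels):
    """Report, for each polygon, whether it holds exactly one identifier."""
    report = []
    for ring in polygons:
        ids = [ident for ident, pt in labels if point_in_polygon(pt, ring)]
        if not ids:
            status = "no identifier"
        elif len(ids) > 1:
            status = "multiple identifiers"
        else:
            status = "ok"
        report.append((status, ids))
    return report


def square(x0):
    """Hypothetical 10 x 10 polygon with its lower-left corner at (x0, 0)."""
    return [(x0, 0), (x0 + 10, 0), (x0 + 10, 10), (x0, 10)]


polygons = [square(0), square(10), square(20)]
labels = [("A", (5, 5)), ("B", (6, 6)), ("C", (25, 5))]
for status, ids in check_identifiers(polygons, labels):
    print(status, ids)
```

A polygon with two identifiers usually points to a missing line or undershoot between them, while one with none is either a simple omission or a sliver created by extraneous linework.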


How to Avoid Digitizing:

You can avoid digitizing by finding a source for the digital database "somewhere" out there. This was a glimmer of hope a few years ago, and a near-reality these days.


Data Policies Worldwide

Traditional map-making institutions expect to play a leading role in future arrangements: the Ordnance Survey in the UK (drill down to see the copyright fees...) and, until recently, the US Geological Survey, which saw the future in National Cartographic Data Bases, an inventory created by digitizing existing "topographic" maps.
USGS has now changed its approach, with a mission much more directly oriented towards a digital future - now called the National Map. See also the Web GISData portal. [geography.usgs.gov not responding today, sorry]

Policies about public access, cost recovery, copyright and other legal issues vary widely.

This approach is fairly international but not the only possibility.

If digitizing is the current, transitional source of most GIS layers, the eventual source will be some network of cooperating systems. The USGS's role in the National Spatial Data Infrastructure demonstrates this concept, which builds on ideas originated by the National Research Council's Mapping Sciences Committee (the 1993 report was the second on the topic). [NSDI is a component of Al Gore's National Information Infrastructure - the infobahn.] The Federal Geographic Data Committee is in charge of developing the federal portion of NSDI, but it really involves "partnerships" with local government and all kinds of distributed actors (not a centralized single production system).

Spatial Data Infrastructure (Geospatial Data Framework) projects now abound, including an international group. The UK started up its own, among many copy-cats; it got turned into AskGiraffe.

In some places, the database will not just be populated with the tired old maps. France (IGN) plans to create BD-Topo as a digital database (completion advancing slowly). This database is similar in many respects to the City of Seattle Common Land Database, except that it is being updated...

[Resources on Data Marketplace and Digitizing]


Version of 14 November 2003