Error in maps: data quality and fitness for use

Objectives of lecture:

  • Data Quality: verifying the representation (and thus its measurements)
  • Alternative definitions of data quality
  • Components of Data Quality: Lineage, Positional Accuracy, Attribute Accuracy, Logical Consistency, Completeness
  • Other issues: time, role of semantic specification, indeterminate boundaries
  • Testing for Data Quality
  • Introduction to Geography 458

  • Views of Accuracy: London

    1747: Pine and Tinney

    Text at the bottom:

    To Martin Folkes, Esq. President of the Royal Society: This Plan of the Cities of London and Westminster and the Borough of Southwark, with the contiguous buildings; is humbly inscribed by his most humble servants John Pine and John Tinney.

    This Plan is taken from the large one, printed on 24 sheets of Imperial Paper from an actual Survey which was begun in March 1737, by Mr. John Rocque Land Surveyor, and engraved by Mr. John Pine Blue-Mantle Pursivant at Arms, and chief Engraver of Seals &c to his Majesty; and was finished and published in June 1747.

    The Space contained within the aforesaid Plan contains about 11500 Acres of Ground; and as it is laid down by a Scale of 200 Feet to an Inch, admits not only of an exact Description of all Squares, Courts, Alleys &c. in their true Proportions; but likewise of the Ground Plots of several Churches, Halls, publick Buildings, and considerable houses and Gardens with their true Names, and everything else that is necessary to render such a Work as compleat as possible.

    The following Certificate of Martin Folkes Esqr. and Peter Davall, Esqr. one of the Secretarys of the Royal Society, will it is presumed be a Satisfaction to the Publick of ye Care taken & Method followed in making this Survey.

    We cannot in Justice to Mr. Rocque and Mr. Pine refuse to acquaint ye Publick that they have in our Presence taken the true Bearings of a very great number of Steeples and other Remarkable Places, from different stations, and with excellent instruments of Mr. Sisson's & they have had the proportional Distances of a great many Points in different and very distant Parts of the Town computed trigonometrically, to which Calculations they have strictly confined their Map. They have also taken the best Methods they were able to make use of, for the adjusting to the true Scale. And whereas some of the Distances computed by Trigonometry, were found to differ somewhat, though not very considerably, from the same Distances before collected from the Mensuration of Streets; they have not thought much of the Trouble of drawing the main Plan over again, before they began to engrave; and which last they have deferred, til we could venture to give them this additional Recommendation, by which we assure those it may concern, that we are well satisfied ye Work will be carefully perform'd &c. therefore well deserves their Favour & Encouragement.

    London, July 24, 1742 M Folkes. P. Davall.

    Published according to Act of Parliament 20th May 1749 and Sold by the Proprieters

    [G5754.L7.1749.R6.1970]

     

    Satellite View of London

    Technical Data of Image

    This image was produced from data acquired by the Landsat 5 satellite on 21st October 1984. Landsat, which orbits the earth at a height of 705 km (440 miles), carries a sensor called Thematic Mapper which builds up a picture of the earth's surface by measuring the amount of light at specific wavelengths, that is reflected from the surface. This information is recorded as a series of numbers; each number represents the amount of light reflected in both visible and infra-red wavelengths for an area equivalent to 30 metres by 30 metres on the ground. As a Landsat image is composed from an array of numbers it is ideally suited to analysis using computer aided techniques. The digital data were converted into an image by Hunting Surveys and Consultants Limited using HIPAS, the computer based Hunting Image Processing and Analysis System.

    The original Landsat data have been geometrically corrected so as to be aligned to the UK National Grid and three of the seven wavelengths recorded by Landsat have been used to produce this coloured image. These three wavelengths are equivalent to visible red light and two infra-red wavelengths which are just outside the normal visible range. These data have been further enhanced using the computer to highlight several particular features, including the line of the new M25 motorway, the road network in suburban areas and differences between water in reservoirs (shown in purple) and the river Thames (shown in magenta).

    [G5754.L7A4.1984.H8]

    Bombing Map of London: Luftwaffe 1941

    [G5754.L7.1941.G4]

     


    Data Quality: Alternative definitions of data quality:

    1. Conformance to expectations: fulfilling arbitrary thresholds
    2. Following established procedures: as with geodetic standards
    3. Fitness for use: Truth in labelling (distinct roles of producer and consumer)

    Error: mathematically, the difference between a measured quantity and its `true' value; or, defined socially, the amount of imprecision permitted in statements, specifications, etc.


    Historically, most people think immediately of positional accuracy as the main issue in map accuracy.

    The National Map Accuracy Standard (NMAS) was adopted in the 1940s and is still the basis for US Geological Survey National Mapping Division standards; more recent content standards reflect broader thinking.

    What does NMAS leave out?


    Data Quality has become a part of the METADATA, though it is arguable which is a superset of the other... It is NOT easy to implement these new responsibilities.


    Components of Data Quality recognized by National Standard for Digital Data Quality

    1. LINEAGE
       • Narrative of source materials used and procedures applied to produce the product
       • Parameters of projections and transformations; decisions made and criteria used
    2. POSITIONAL ACCURACY
       • Usually the component identified with "accurate" maps: expected error in position
       • National Map Accuracy Standards: 90% of well-defined points within 0.02" at map scale
       • Inadequate for the range of map data
       • New standards (ASPRS) revise this approach but still deal with well-defined points
    3. ATTRIBUTE ACCURACY
       • Error in attribute value: continuous data treated like position (differences as distances)
       • Categories: reported as a misclassification matrix (omission and commission)
       • Testing procedure: random sample of points or exhaustive overlay
    4. LOGICAL CONSISTENCY
       • Degree to which the data fit the expected structure (e.g. topological model)
       • Tests based on internal evidence within the database
    5. COMPLETENESS
       • Exhaustiveness of coverage (are all counties shown? if one barn is shown, are all barns?)
       • Use of mapping rules: minimum width, area, etc.
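
    As a rough illustration of the National Map Accuracy Standards rule listed under positional accuracy, the sketch below converts the 0.02 inch map tolerance (the limit that applies at scales of 1:20,000 and smaller) to a ground distance and checks whether 90% of tested points fall within it. The function names, the 1:24,000 scale, and the sample errors are assumptions made for this example; they are not taken from the standard or from any real test.

```python
INCH_TO_M = 0.0254  # metres per inch


def nmas_ground_tolerance_m(scale_denominator, map_tolerance_in=0.02):
    """Ground distance (metres) equal to a tolerance measured on the map."""
    return map_tolerance_in * scale_denominator * INCH_TO_M


def passes_nmas(errors_m, scale_denominator, required_fraction=0.90):
    """True when at least required_fraction of the errors are within tolerance."""
    tolerance = nmas_ground_tolerance_m(scale_denominator)
    within = sum(1 for e in errors_m if e <= tolerance)
    return within / len(errors_m) >= required_fraction


# Hypothetical horizontal errors (metres) at ten well-defined points.
sample_errors_m = [3.1, 5.0, 8.2, 11.9, 2.4, 6.6, 9.8, 4.3, 7.7, 15.0]

tolerance_m = nmas_ground_tolerance_m(24000)
result = passes_nmas(sample_errors_m, 24000)
print(f"tolerance on the ground: {tolerance_m:.2f} m, passes: {result}")
```

    Note that the rule says nothing about how large the worst 10% of errors may be, nor about features that are not well-defined points.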



    See Resources for more materials on Standards for Data Quality

    What was left out of SDTS Data Quality

    Time

    At the time, the temporal reference of data was well-understood. The working group decided that time could not be tested separately from position and attribute. For some purposes, it might make sense to have a separate section dealing with time, even if the test cannot distinguish. ISO TC 211 seems to have done that...

    Semantic specification

    The SDTS standard assumes that the conversion from the world to the database is not particularly problematical. Accuracy can be simply tested when a test translates directly to the database. Recent work in France has demonstrated the utility of an intermediary concept between the "world" and the "database". They call it "terrain nominal", and translate it into English with phrases like "nominal ground" (horrible) or "abstract universe" (sounds pretty powerful). What they mean is not the Real World in all its complexity, but the world as seen through a particular approach, a particular set of rules. This concept clarifies how all tests are performed. They observe through a particular lens, a particular point of reference.

    Indeterminate Boundaries

    Many geographic phenomena have indistinct boundaries or have complex attributes. Testing a fuzzy object using sharp rules may give silly answers.
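
    One way to see the problem, offered only as a sketch: suppose canopy cover declines gradually across a forest edge, and a sharp rule labels any cell with at least 50% cover as forest. The transect values, the 50% threshold, and the two-point offset between surveys are all invented for illustration.

```python
def classify(cover_percent, threshold=50.0):
    """Sharp rule: a cell is 'forest' at or above the threshold, else 'open'."""
    return "forest" if cover_percent >= threshold else "open"


# Hypothetical canopy cover (%) along a transect crossing a gradual forest edge.
source_a = [90, 75, 58, 51, 49, 42, 25, 10]
# A second survey of the same cells that reads two percentage points lower.
source_b = [cover - 2 for cover in source_a]

labels_a = [classify(cover) for cover in source_a]
labels_b = [classify(cover) for cover in source_b]

# Cells where the two surveys receive different labels.
disagreements = [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]
print(disagreements)
```

    The two surveys differ by the same small amount everywhere, yet the sharp rule reports a classification disagreement only at the cell that happens to straddle the threshold.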


    TESTING

    Alternative approaches to testing data quality:

    Data quality issues are a constant in any kind of measurement. See water conditions disclaimer.


    Testing Procedures

    <see Resources for more examples>

  • Procedures for testing: all operations and transformations
  • Pairing multiple representations of the same object
  • Pairing by Overlay
  • Neighborhood Analysis
  • Practical aspects of Quality Assurance
  • Kinds of errors you can detect
  • How they might be fixed
  • Positional Accuracy Test:

    Measure each point twice, once from the data being tested and once from a source of higher accuracy. Tabulate the differences and report the standard deviation of the distance between the two versions.
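
    A minimal sketch of this test, assuming the points are already paired and expressed in the same planar coordinate system. The coordinates are made up, and the function names are hypothetical. The root-mean-square error is reported alongside the standard deviation as an additional summary; it is not named in the procedure above.

```python
import math
import statistics


def positional_differences(tested, reference):
    """Distance between each tested point and its higher-accuracy counterpart."""
    return [math.hypot(xt - xr, yt - yr)
            for (xt, yt), (xr, yr) in zip(tested, reference)]


def summarize(distances):
    """Mean, sample standard deviation, and RMSE of the paired distances."""
    rmse = math.sqrt(sum(d * d for d in distances) / len(distances))
    return {
        "mean": statistics.mean(distances),
        "stdev": statistics.stdev(distances),
        "rmse": rmse,
    }


# Hypothetical coordinates (metres): the same four points from two sources.
tested = [(100.0, 200.0), (153.0, 244.0), (300.0, 120.0), (410.0, 95.0)]
reference = [(103.0, 204.0), (150.0, 240.0), (300.0, 125.0), (410.0, 95.0)]

distances = positional_differences(tested, reference)
report = summarize(distances)
print(distances, report)
```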

    Categorical Coverages by Overlay

    exhaustive overlay of two sources (gives positional as well as attribute information)
    (for example Old Growth Mapping tests)
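
    The overlay idea can be sketched with two small aligned grids, tallying every cell by its class in each source to build a misclassification matrix. The grids, the class codes ("O" for old growth, "Y" for younger forest), and the function names are assumptions for illustration only and do not reproduce any actual test.

```python
from collections import Counter


def misclassification_matrix(reference, tested):
    """Count cells by (reference class, tested class) over two aligned grids."""
    counts = Counter()
    for ref_row, test_row in zip(reference, tested):
        for ref_cls, test_cls in zip(ref_row, test_row):
            counts[(ref_cls, test_cls)] += 1
    return counts


def omission_commission(counts, cls):
    """Omission and commission error rates for one class."""
    ref_total = sum(n for (r, _), n in counts.items() if r == cls)
    test_total = sum(n for (_, t), n in counts.items() if t == cls)
    correct = counts[(cls, cls)]
    omission = (ref_total - correct) / ref_total if ref_total else 0.0
    commission = (test_total - correct) / test_total if test_total else 0.0
    return omission, commission


# Hypothetical 3 x 4 grids; each character is one cell.
reference = ["OOYY", "OOYY", "OYYY"]
tested = ["OOOY", "OOYY", "YYYY"]

counts = misclassification_matrix(reference, tested)
omission_o, commission_o = omission_commission(counts, "O")
print(dict(counts), omission_o, commission_o)
```

    Omission here is the share of reference cells of a class that the tested source labels as something else; commission is the share of cells the tested source assigns to the class that the reference does not.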


    Data Quality is the subject of a WHOLE course (offered next Winter)


    Version of 25 November 2003