Configuring the User:
Divisions of Labor and Knowledge
in the Practice of GIS

 

Nicholas Chrisman, Geography
University of Washington
http://faculty.washington.edu/chrisman/

Presentation available:
http://faculty.washington.edu/chrisman/Present/4S2001.html

Full paper (draft form in .pdf)



Abstract

The idea of "configuring the user" was originally applied by Woolgar to the design of computer hardware. This paper contributes to the extension of the concept to the realm of software, with particular reference to geographic information systems (GIS). The software in use for GIS was originally developed in the research sector. Over the past 20 years, the private commercial sector has taken control of innovation. The economic importance of GIS has increased at exponential rates. Through these changes, the underlying mathematical models have changed very little. The research community continues to offer new ideas, but the commercial sector has seemed to stick with the well-known and understood.

Some of the reluctance to change may come from a process of configuring the user. This paper sets out some cases from the practice of GIS that give evidence of the division of labor and the division of knowledge implied in the choice of mathematical models applied to GIS software. In the case of map registration, the software adopts an optimal least-squares model that requires protection from "blunders", while a more robust technique would shift the division of labor. Many other examples pervade the practice of GIS, but this one demonstrates the division of power, labor and knowledge in a complex network.

This paper demonstrates that the lines between the "technical" and the "context" are very hard to discern in the ways that software comes to encode social relationships. Lazy programmers attempt to configure users, and uninformed users may not even notice their servitude. This is the start of a larger story of the complex negotiations in large socio-technical networks.



Outline of Presentation



Configuring the User

Steve Woolgar


Woolgar, S. 1991: Configuring the user: the case of usability trials. In A Sociology of Monsters: Essays on Power, Technology and Domination, ed. J. Law, pp. 58-97. London: Routledge.

Later work: extending to software


Mackay, H., Carne, C., Beynon-Davies, P. and Tudhope, D. 2000: Reconfiguring the User: Using Rapid Application Development. Social Studies of Science 30: 737-757.


Study of in-house software development (bank)
Argue for symmetrical configuration: users not passive, can configure designers...
Suggest actor-network analysis as basis



Geographic Information Technology

Geographic Information Systems (GIS)
One (bland) definition:
- A system of hardware, software, data, people, organizations and institutional arrangements for collecting, storing, analyzing and disseminating information about areas of the earth. (Dueker and Kjerne, 1989, Glossary of GIS terms, pp. 7-8)

My (longer-winded) definition:
- "The organized activity by which people
· measure aspects of geographic phenomena and processes;
· represent these measurements, usually in the form of a computer database, to emphasize spatial themes, entities, and relationships;
· operate upon these representations to produce more measurements and to discover new relationships by integrating disparate sources; and
· transform these representations to conform to other frameworks of entities and relationships.
These activities reflect the larger context (institutions and cultures) in which these people carry out their work. In turn, the GIS may influence these structures."
(Chrisman, 1997 Exploring GIS, p. 5)



GIS Industry

GIS software centrally controlled
single vendor (ESRI) controls over 50% of US (world?) market
(ESRI privately held by founder and spouse)
annual maintenance fee (not single sale price)
strong control on access to company (all calls through intermediaries at both ends)
lavish user conferences (11,000 attendees in 2001) to build loyalty, listen to concerns
Software design originally based on prototype from research sector (Harvard University), follows trends in larger industry (object-orientation, databases, web servers); fairly conservative (models evolve)



Coordinate Transformations

To integrate diverse sources, geometric registration required.

· digitizer device units => projection of map
· remote sensing image => map space
· air photograph => control points

Registration software: a kind of infrastructure, taken for granted.

Steps:
1. Input: coordinates of "tic points" in source and destination coordinates
2. Solve best fit transformation, using a geometric model

A BLACK BOX: encapsulates science, hides details.
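
What sits inside the box can be sketched in a few lines. The following is a minimal illustration of the two steps above, assuming an affine model fitted by ordinary least squares with NumPy; the tic coordinates, the "true" parameters, and the function names are hypothetical and do not come from any vendor's software.

```python
import numpy as np


def fit_affine(src, dst):
    """Ordinary least-squares affine fit mapping src (n, 2) onto dst (n, 2)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Design matrix: one row [x, y, 1] per tic point.
    design = np.column_stack([src, np.ones(len(src))])
    # Solve design @ params ~= dst; params is 3x2 (one column per output axis).
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return params


def apply_affine(params, pts):
    """Transform points (n, 2) with a 3x2 affine parameter matrix."""
    pts = np.asarray(pts, dtype=float)
    return np.column_stack([pts, np.ones(len(pts))]) @ params


# Step 1: hypothetical tic points in source (digitizer) coordinates...
tics_src = [[1.0, 1.0], [9.0, 1.2], [8.8, 7.1], [1.1, 6.9], [5.0, 4.0]]
# ...and in destination (map) coordinates, generated here from made-up parameters.
true_params = np.array([[100.0, 5.0], [-5.0, 100.0], [5000.0, 2000.0]])
tics_dst = apply_affine(true_params, tics_src)

# Step 2: solve the best-fit transformation and summarise the fit.
params = fit_affine(tics_src, tics_dst)
residuals = apply_affine(params, tics_src) - tics_dst
rmse = float(np.sqrt((residuals ** 2).sum(axis=1).mean()))
print(params)
print("RMSE:", rmse)
```

The single RMSE figure is usually all the user sees; every decision discussed in the following slides is already fixed inside `fit_affine`.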



Prying Open the Black Box

Decisions required:
· Geometric model:
Similarity, Affine, Projective, Piecewise ...
· Estimation method
Ordinary Least Squares, Weighted LS ...
· Control point data
Number of points, distribution ...
· Adequacy
Sufficient fit, blunders (outliers), accuracy ...

Each decision has a social component.
Variations in disciplinary training
Expectations about users assumed by software
Division of labor implicit in divisions of error



Number of Points

Estimation requirements
Affine, the most common, has 6 unknowns
no internal estimate of error with 3 points.
4 points: barest of minima
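
The arithmetic behind these counts, written out as a sketch (the symbols a to f, n and r are notation assumed here, not taken from the software documentation): each tic point supplies two equations, and the affine has six unknowns, so the redundancy r is what is left over for estimating error.

```latex
\begin{aligned}
x' &= a\,x + b\,y + c \\
y' &= d\,x + e\,y + f
\end{aligned}
\qquad\qquad
r = 2n - 6
\quad\Longrightarrow\quad
r\big|_{n=3} = 0, \qquad
r\big|_{n=4} = 2, \qquad
r\big|_{n=20} = 34
```

With n = 3 the fit passes exactly through every point, so all residuals are zero whatever the quality of the points.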

How the black box is presented:
"Select 4 widely spaced points common to maps A and B to be used as tics for A."
(ESRI, 1991, Map Projections and Coordinate Management: Concepts and Procedures, page 5-13)
One vendor only permitted 4 points. (PlanetOne, http://www.planetonegis.com - now defunct)

Careful practice would require 20 or 30 points, not just 4.
Software documentation keeps it simple, configures user.



Why the affine?

Affine fits a different X and Y scale.
If the sources really are in the same projection, the X and Y scales should be equal.
Parameters reported in a format that combines rotation and scaling in common values (user not informed)
Rotation, translation and scaling NOT adequate to convert a cylindrical projection to a conic (e.g. UTM to State Plane) [over a sufficiently large extent...]
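
The separate scales can be recovered from the fitted coefficients. The sketch below assumes the affine is written x' = a*x + b*y + c, y' = d*x + e*y + f (a notation chosen here for illustration); the function name and the two example parameter sets are hypothetical.

```python
import math


def describe_affine(a, b, d, e):
    """Summarise the linear part of x' = a*x + b*y + c, y' = d*x + e*y + f."""
    scale_x = math.hypot(a, d)  # length of the image of the source X unit vector
    scale_y = math.hypot(b, e)  # length of the image of the source Y unit vector
    rotation_deg = math.degrees(math.atan2(d, a))  # rotation of the X axis
    # Angle between the transformed axes; 90 degrees means no shear.
    cos_axes = (a * b + d * e) / (scale_x * scale_y)
    axis_angle_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_axes))))
    return scale_x, scale_y, rotation_deg, axis_angle_deg


# Hypothetical similarity transformation: scale 2, rotation 30 degrees.
theta = math.radians(30.0)
similarity = describe_affine(2 * math.cos(theta), -2 * math.sin(theta),
                             2 * math.sin(theta), 2 * math.cos(theta))

# Hypothetical paper map that shrank 2% in Y only.
shrunk = describe_affine(1.0, 0.0, 0.0, 0.98)

print(similarity)
print(shrunk)
```

A similarity constrains a = e and b = -d, leaving four unknowns instead of six; reporting `scale_x` and `scale_y` separately would let a user see when the two differ.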

Possible explanations:
Legacy effect: residual of hand calculator era (?) (though this would argue for similarity, not affine)
Differences in solutions insignificant (?)
Justified for printed maps whose paper might shrink more in one direction than another (?)
Software written without expertise from practice.
In any event: user has been configured.



Why Least Squares?

Simplicity:
Closed form equation: minimal software effort
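
For reference, the closed form in question, in notation assumed here: A is the n x 3 matrix with one row (x_i, y_i, 1) per tic point, B is the n x 2 matrix of destination coordinates, and P is the 3 x 2 matrix of affine parameters.

```latex
\hat{P} \;=\; \arg\min_{P} \,\lVert A P - B \rVert_F^{2}
\;=\; \left(A^{\top} A\right)^{-1} A^{\top} B
```

One matrix expression, no iteration, no decisions left to the program: this is the sense in which the software effort is minimal.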

Efficiency:
Best Linear Unbiased Estimator (BLUE) uses all information available to produce estimates
BUT
only has this property if the errors are
normal iid (independent and identically distributed)

Depends on Division of Error (3 parts like Gaul)
· Systematic
· Random
· Blunders (not handled!)

User work keeps software simple.



What is a Blunder?

Random Error:
defined as what Least Squares requires.
Numerical properties extend to procedures
- basically, random = good error

Blunder Detection:
User's responsibility (reversed digits, wrong object, ...)
Residuals "too big"
but Least Squares gives great weight to outliers, so residuals can be misleading.
With 4 points, blunder ensures wrong fit; more points required to be able to select.
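
A small numerical sketch of the four-point case, assuming an ordinary least-squares affine fit with NumPy; the coordinates and the size of the blunder are made up. One tic on a unit square is given a 10-unit error in X, and the residuals are inspected.

```python
import numpy as np


def ols_residuals(src, dst):
    """Residuals (observed minus fitted) of a least-squares affine fit."""
    design = np.column_stack([src, np.ones(len(src))])
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return dst - design @ params


# Four hypothetical tics at the corners of a unit square.
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
# Error-free destination coordinates from a made-up scale and shift.
dst = src * 100.0 + np.array([5000.0, 2000.0])
# A single blunder: 10 units in X at the first tic only.
dst[0, 0] += 10.0

res = ols_residuals(src, dst)
print(np.round(res[:, 0], 3))  # X residuals at the four tics
```

Every tic reports an X residual of the same magnitude (2.5 units, alternating in sign), so the residuals give no indication of which point carries the blunder. With two degrees of freedom spread over four points, the error is shared out rather than exposed.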

A blunder is a data point that should not be used in Least Squares estimation...
(and why use least squares?)



Alternatives

Old constraints (particularly calculation) no longer applicable.

Robust Statistics:
estimation less dependent on distributions

Least Median of Squares (LMS): one example
1. sample possible combinations of points
2. fit regression by least median of squares
3. select best solution from iterations of 1 & 2

Can tolerate up to 50% blunders
(high breakdown point)
No weighting functions or complex user intervention
[Shiahn-Wern Shyue, PhD, 1989, University of Washington]
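
The three steps can be sketched as follows. This is an illustration under stated assumptions, not the procedure from the dissertation cited above: it tries every triple of points rather than sampling, fits an exact affine through each triple, and keeps the fit with the smallest median squared residual. The data, the blunders, and the function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def design_matrix(pts):
    """One row [x, y, 1] per point."""
    pts = np.asarray(pts, dtype=float)
    return np.column_stack([pts, np.ones(len(pts))])


def fit_lms_affine(src, dst, min_det=1e-9):
    """Least-median-of-squares affine fit by exhaustive search over triples."""
    dst = np.asarray(dst, dtype=float)
    full = design_matrix(src)
    best_params, best_median = None, np.inf
    for triple in combinations(range(len(full)), 3):
        idx = list(triple)
        sub = full[idx]
        if abs(np.linalg.det(sub)) < min_det:
            continue  # (near-)collinear triple: no unique affine
        params = np.linalg.solve(sub, dst[idx])  # exact fit through 3 points
        sq_resid = ((full @ params - dst) ** 2).sum(axis=1)
        median = np.median(sq_resid)
        if median < best_median:
            best_params, best_median = params, median
    return best_params


# Hypothetical tics: 8 points, 3 of them (37.5%) carrying blunders.
src = np.array([[0, 0], [10, 1], [9, 8], [1, 9],
                [5, 4], [3, 6], [7, 2], [6, 7]], dtype=float)
true_params = np.array([[100.0, 5.0], [-5.0, 100.0], [5000.0, 2000.0]])
truth = design_matrix(src) @ true_params
dst = truth.copy()
dst[1] += [50.0, -30.0]
dst[4] += [-40.0, 60.0]
dst[6] += [25.0, 25.0]

lms_params = fit_lms_affine(src, dst)
ols_params, *_ = np.linalg.lstsq(design_matrix(src), dst, rcond=None)

# Worst disagreement with the blunder-free transformation, over all tics.
lms_error = float(np.abs(design_matrix(src) @ lms_params - truth).max())
ols_error = float(np.abs(design_matrix(src) @ ols_params - truth).max())
print("LMS:", lms_error, "OLS:", ols_error)
```

In this example the LMS fit reproduces the blunder-free transformation while the ordinary least-squares fit is pulled away from it, and no one had to flag the bad points by hand. The exhaustive search is affordable only for small point sets; sampling, as in step 1, is the usual way to keep the cost down.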



Redividing the Labor: Some lessons

Robust Statistics can tolerate blunders.
Models such as LMS offer a redefinition of error.

Research community can innovate.
But, unpublished PhD dissertations do not help get the word out...
Innovations have to make it into practice.

GIS users should expect better registration, but don't know to ask.

Software vendors should change.
The "most powerful" techniques are probably evaluated on inappropriate measures.
Power differences in socio-technical networks must be resisted and subverted.



Design of GIS Databases: layer cake

<INSERT Figure>

based on administrative logic more than technology



Alternative: Integrated Terrain Units (1960s)

Unit 1 (30% of area)
Land forms: Rugged hills with rounded summits; irregularly benched slopes often littered with boulders and with very frequent sandstone outcrops including low cliffs up to 30 ft high; fairly narrow flat-floored valleys 400-1000 ft deep
Soils: Mainly shallow coarse-textured skeletal soils and bare rock; in moist cool sites humic surface-soils; infrequently on interbedded shales or arkosic sandstones shallow podzolic soils (Binnie, Pokolbin); in stable sites coarse-textured earths
Vegetation: Shrub woodland of ironbark and gum 40-80 ft high, ironbarks common, with E. punctata, E. agglomerata, and E. oblonga, and with scattered or dense Callitris endlicheri, Casuarina torulosa, and Persoonia spp. below; shrubs usually abundant and mixed, Leguminosae common; ground cover poor, of grasses and herbs

Unit 2 (30% of area)
Land forms: Rugged hills margined by sandstone cliffs 50-500 ft high usually overlooking steep shaly slopes littered with boulders; cavernous weathering of the cliffs; narrow inaccessible valleys 500-2500 ft deep
Soils: Similar to unit 1; predominantly coarse-textured non-humic skeletal soils; probably more bare rock
Vegetation: As for unit 1, but with more herbs, shrubs, and non-eucalypt trees in ravines and at bases of cliffs

Unit 3 (35% of area)
Land forms: Stony, hilly plateaux with ridges and escarpments up to 200 ft high; very steep margins including cliffs up to 100 ft high; narrow gorges along the major rivers
Soils: Restricted observations; similar to units 1 and 2; deep yellow earth (Mulbring) in level, stable site on plateau
Vegetation: Shrub woodland of ironbark and gum 30 ft high, including E. punctata, E. trachyphloia, and stringybarks; ground cover poor; many non-eucalypts in ravines and at bases of cliffs

Unit 4 (<5% of area)
Land forms: Sandy alluvium occupying valley floors in unit 1; liable to frequent flooding and deposition of sand in middle and upper reaches
Soils: Restricted observations; deep sandy stratified alluvial regosols (Rouchel); sedimentation in valley bottoms frequent and calamitous owing to low soil stability on sandstone hills
Vegetation: Shrub woodland of ironbark and gum with an admixture of non-eucalypt trees, sometimes cleared and under pioneer grasses

collaborative work done in the field, single map product



Choice of database models

Overlay (layer cake) model wins
success based on more reasonable demands made on users
ability to survive in arms-length (and unexpected) transactions

Integrated Terrain Units
more scientifically defensible (but the units are negotiated for a given purpose)
Australian technique with wide adoption by international organizations (FAO)
major US proponent: the President of ESRI, who makes a huge business out of the layer-based approach...



Conclusions

Configuring the user seems to exist.
More like configuring work practices.
Not just another STS buzz-word?

The social can be followed in the technical choices.
map registration software HAS politics...

Divisions of labor are also divisions of knowledge
Power often involves time; who has priority; who gets to decide (see Rachel and Woolgar 1995 on the decision about the "technical")

Rachel, J. and Woolgar, S. 1995: The discursive structure of the socio-technical divide: the example of information systems development. The Sociological Review 43: 251-273.

 


Version of 30 October 2001