RASTER & VECTOR DATA
REPRESENTATION
Harikrishna M
Department of Civil Engineering
National Institute of Technology Calicut
RASTER DATA REPRESENTATION
INTRODUCTION
Raster model
Field-based model of geographic data representation
Best employed for continuous geographic data
Commonly known as tessellation model
Basic spatial data unit: Predefined area of space for which
attribute data are explicitly recorded
Regular or Irregular tessellation
Regular tessellation : triangular, square and hexagonal
Regular square or rectangular tessellation: Raster data model
Compatible with different types of hardware devices
Compatible with concepts and methods of bit-mapped images
Compatible with grid-oriented coordinate systems
Nature & Characteristics of Raster Data
Subdividing a geographic space into grid cells
Linear dimensions of each cell define the spatial resolution
of the data
Size of the smallest object in the geographic space to be
represented: Minimum Mapping Unit (MMU)
Grid size should be less than half the size of MMU
Each grid cell must contain a value
Value indicates the quantity or characteristic of the spatial
object or phenomenon that is found at that location of the
cell
Grid cell values can be used directly for computation or indirectly
as code numbers that reference an associated table
In raster database, values pertaining to different
characteristics in the same cell location are stored in
separate files (map layers)
Raster data processing involves use of multiple raster
files
Positions of spatial objects or phenomena in raster
model are represented only to the nearest cell
Representation does not always correspond to the
spatial object or phenomena in the real world
Individual cells that make up the real world object are
the entities
Identities of individual spatial objects are lost in raster
data model
Raster data are stored as a linear array of attribute
values
Location of each cell is implicitly defined by its row and
column numbers
Location of the cells can be computed when the data are
used for display and analysis
To translate a linear array storage to a two dimensional
display, information should be stored in the header section
of the data file
File Header contains information on
Number of bits used to represent the value in each cell
Number of rows and columns
Type of image
Legend
Coordinate transformation
File formats for raster data files vary depending on the
algorithm used for data compression
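A minimal sketch of how header information lets software turn a linear array back into a two-dimensional grid and compute cell locations. The header field names (nrows, ncols, x_origin, y_origin, cell_size) and the row-by-row layout starting at the top-left corner are assumptions made for illustration, not a specific file format.

```python
# Hypothetical header: field names and row-by-row layout are assumptions.
header = {"nrows": 3, "ncols": 4,
          "x_origin": 1000.0, "y_origin": 2000.0, "cell_size": 10.0}

# Cell values stored as one linear array, row by row from the top-left corner.
values = [1, 1, 2, 2,
          1, 3, 3, 2,
          4, 4, 3, 2]

def cell_value(row, col):
    """Return the value at (row, col) using only the header and the linear array."""
    return values[row * header["ncols"] + col]

def cell_centre(row, col):
    """Compute map coordinates of a cell centre; y decreases as the row number grows."""
    size = header["cell_size"]
    x = header["x_origin"] + (col + 0.5) * size
    y = header["y_origin"] - (row + 0.5) * size
    return x, y

print(cell_value(1, 2))   # 3
print(cell_centre(1, 2))  # (1025.0, 1985.0)
```

No coordinates are stored per cell: the position follows from the row and column numbers plus the header.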
Principles of Raster Data Compression
A single raster data file can contain several million grid cells
A black-and-white line map measuring 50 cm × 50 cm, when
scanned at a resolution of 25 micrometres (~1000 dpi), will
produce 400 million pixels
Actual size of file depends on bit depth: number of bits used
to represent the value of the pixel
Therefore, data compression is an important feature of
digital representation of raster data
Assumptions in raster data file compression
Cells representing areas of same entity type have identical values
Patterns of values tend to be spatially clumped
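The pixel count quoted above for the scanned map can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the bit depths of 1 and 8 are chosen as examples rather than taken from the slides.

```python
def pixel_count(side_cm, pixel_um):
    """Pixels in a square map of the given side, scanned at the given pixel size."""
    pixels_per_side = round(side_cm * 10_000 / pixel_um)  # 1 cm = 10,000 micrometres
    return pixels_per_side ** 2

def uncompressed_bytes(pixels, bit_depth):
    """Raw storage needed, ignoring any header and any compression."""
    return pixels * bit_depth // 8

n = pixel_count(50, 25)
print(n)                         # 400000000
print(uncompressed_bytes(n, 1))  # 50000000 bytes at 1 bit per pixel
print(uncompressed_bytes(n, 8))  # 400000000 bytes at 8 bits per pixel
```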
A number of algorithms are used to handle adjacent cells of
identical values
Run Length Coding
Adjacent cells along a row with the same value are treated as a group called a
“run”
Instead of repeatedly storing the same value for each cell, the value is stored
once, together with the number of cells that make the run
Simple to understand and easy to implement
Disadvantage: does not give a good compression ratio
A more efficient compression technique is wavelet compression
Multi-resolution Seamless Image Database (MrSID) gives
compression ratios of 15:1 to 20:1 for 8-bit grey-scale images
and 30:1 to 40:1 for 24-bit colour images
Original raster data (6 × 6 grid, stored row by row):
A A A A A A
A A A A B B
A A A B B B
B B B B B C
B B B B C C
B B B C C C
Total number of values = 36
Run-length encoded data: 10A2B3A8B1C4B2C3B3C
Total number of values = 18
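The 6 × 6 example can be reproduced with a short run-length coding sketch. It assumes the grid is read as one linear array, row by row, and that a run may continue from the end of one row into the next; the encoded string in the example implies this, since its first run (10A) covers the whole first row and part of the second. The function names are illustrative.

```python
from itertools import groupby

grid = [
    "AAAAAA",
    "AAAABB",
    "AAABBB",
    "BBBBBC",
    "BBBBCC",
    "BBBCCC",
]

def run_length_encode(cells):
    """Return (count, value) pairs, one per run of consecutive equal values."""
    return [(len(list(group)), value) for value, group in groupby(cells)]

def run_length_decode(runs):
    """Expand (count, value) pairs back into the original sequence."""
    return [value for count, value in runs for _ in range(count)]

cells = [c for row in grid for c in row]  # linear array, row by row
runs = run_length_encode(cells)
encoded = "".join(f"{count}{value}" for count, value in runs)

print(encoded)        # 10A2B3A8B1C4B2C3B3C
print(len(cells))     # 36 values before encoding
print(2 * len(runs))  # 18 values after encoding (a count and a value per run)
```

Decoding the runs returns the original 36 cells, so the coding is lossless.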
Quadtree Data Representation
Hierarchical tessellation model that uses grid cells of
variable sizes
Geographic space is divided by the process of recursive
decomposition
Instead of dividing the entire geographic space into grid
cells of the same size, quadtree model uses finer
subdivision of areas when finer detail occurs
Decomposition continues till a predefined maximum
number of iterations is reached, which determines the
minimum cell size that can be represented
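Recursive decomposition can be sketched as follows. This is a simplified illustration, not a production quadtree: it assumes a square grid whose side is a power of two, orders the four quadrants NW, NE, SW, SE (an assumed convention), and represents a mixed block by its top-left value once the maximum depth is reached.

```python
def build_quadtree(grid, row=0, col=0, size=None, depth=0, max_depth=8):
    """Return a single value for a uniform block, else a list of four sub-blocks.

    Sub-blocks are ordered NW, NE, SW, SE (an assumed convention). When
    max_depth is reached, a mixed block is represented by its top-left
    value, which loses detail.
    """
    if size is None:
        size = len(grid)
    first = grid[row][col]
    uniform = all(grid[r][c] == first
                  for r in range(row, row + size)
                  for c in range(col, col + size))
    if uniform or depth == max_depth:
        return first
    half = size // 2
    return [build_quadtree(grid, r, c, half, depth + 1, max_depth)
            for r, c in ((row, col), (row, col + half),
                         (row + half, col), (row + half, col + half))]

def count_leaves(node):
    """Number of cells actually stored in the tree."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1

grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]

tree = build_quadtree(grid)
print(tree)                # [0, 1, 0, [0, 1, 1, 1]]
print(count_leaves(tree))  # 7 stored cells instead of 16
```

Only the south-east quadrant, where finer detail occurs, is subdivided further.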
For a typical topographic map of 50cm × 50cm, it takes 11
iterations to reach a resolution of 0.5mm and 13 iterations to
reach a resolution of 0.12mm
No explicit storage of coordinates is necessary for the
quadtree model
Position of individual cells can be found using cell
identification number
Hierarchical numbering system of cell identification allows
location of each cell to be computed relative to map origin
Descriptive attributes associated with cells are stored as
feature codes of individual cells
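The sketch below illustrates the two ideas above: how cell size shrinks with each subdivision, and how a hierarchical cell identifier alone gives the cell position relative to the map origin. The numbering scheme (one digit per level, 0 = NW, 1 = NE, 2 = SW, 3 = SE, origin at the top-left corner) is an assumption for illustration. Note on counting: if each iteration is taken as one halving of the cell side, 10 halvings of a 500 mm map give about 0.49 mm and 12 give about 0.12 mm; the figures of 11 and 13 quoted above match if the undivided map is counted as the first level.

```python
def cell_side(map_side_mm, halvings):
    """Cell side length after halving the map side the given number of times."""
    return map_side_mm / 2 ** halvings

def locate(cell_id, map_side_mm):
    """Return (x_offset, y_offset, side) of a cell from its quadrant digits.

    cell_id holds one digit per level: 0=NW, 1=NE, 2=SW, 3=SE (an assumed
    numbering). Offsets are measured from the top-left map origin, x to the
    right and y downward.
    """
    x = y = 0.0
    side = float(map_side_mm)
    for digit in cell_id:
        side /= 2
        quadrant = int(digit)
        if quadrant in (1, 3):  # eastern quadrants
            x += side
        if quadrant in (2, 3):  # southern quadrants
            y += side
    return x, y, side

print(round(cell_side(500, 10), 3))  # 0.488
print(round(cell_side(500, 12), 3))  # 0.122
print(locate("30", 500))             # (250.0, 250.0, 125.0)
print(locate("12", 500))             # (250.0, 125.0, 125.0)
```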
Advantages
Data-storage and search techniques are well researched and
understood
Compatible with Cartesian coordinate system for cartographic
applications
Recursive sub-division of geographic space facilitates physically
distributed storage, economic use of memory and expedites
browsing operations
Allows variable spatial resolution to be represented in accordance
with degree of complexity of geographic surface
Disadvantages
More complex than the simple raster model
Relatively complex process to modify quadtree indices and tables
Best suited for
Areas where data is relatively homogeneous
Applications that require high-performance spatial search of the
database
VECTOR DATA REPRESENTATION
INTRODUCTION
Object-based approach
Best suited to represent discrete objects
Spatial objects are identified individually and represented
mathematically as coordinates
More complex than a raster data model
More difficult to implement
All vector models are built on 2 concepts
Decomposition of spatial objects into basic graphical elements
Use of topology (spatial relationships) to represent spatial objects in
addition to geometry (coordinates)
Nature & Characteristics of Vector Data
Represented using coordinates
Spatial objects are represented by one of the 3 basic
graphical elements: points, lines, polygons
When graphical elements representing an individual
identifiable real-world feature are logically grouped together,
a graphical entity is formed
Graphical elements are “dumb” graphics and graphical
entities are “intelligent” graphics
Spaghetti Data Model
Any polygons that lie adjacent to each other must be
made up of their own lines, or strands of spaghetti
In other words, each polygon must be uniquely defined
by its own set of X, Y coordinate pairs, even if the
adjacent polygons share the exact same boundary
information
Data that are collected but not structured are said to be
in the spaghetti data model
Vector data obtained by map digitisation are said to be
in this data model as they are not structured
Spaghetti data model stores graphical elements, but not
graphical entities, defined by strings of coordinates
Spatial relationships are not explicitly encoded within the
spaghetti model
Considerable redundancy in this data model, as
boundaries between adjacent polygons are stored twice
Vector data in this model should be properly structured
for use in GIS
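The redundancy in the spaghetti model can be seen in a small sketch. Two adjacent unit squares are each stored with their own coordinate string, and the code finds the boundary segment that ends up stored twice. The polygon names and coordinates are made up for illustration.

```python
# Two adjacent unit squares stored spaghetti style: each polygon has its own ring.
polygons = {
    "P1": [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
    "P2": [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)],
}

def segments(ring):
    """Yield each boundary segment with its endpoints sorted, so direction is ignored."""
    for a, b in zip(ring, ring[1:]):
        yield tuple(sorted((a, b)))

# Record which polygons store each segment.
seen = {}
for name, ring in polygons.items():
    for seg in segments(ring):
        seen.setdefault(seg, []).append(name)

duplicated = {seg: owners for seg, owners in seen.items() if len(owners) > 1}
print(duplicated)  # {((1, 0), (1, 1)): ['P1', 'P2']}
```

Nothing in the stored data says that P1 and P2 are neighbours; the shared boundary has to be discovered by comparing coordinates.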
Topological Data Model
Structured data built on the concept of topology
Among several topological data models, the most commonly
used is the arc-node data model
“Arc”: line segment & “Node”: end points of line segment
Arc-node model stores graphical elements and also
explicitly stores the spatial relations between graphical
elements and the relationships between arcs and their
respective nodes
Stored relationships allow graphical entities to be
constructed from basic graphical elements
Vector Data Model
Vector data should be properly linked to descriptive data
in geographic databases
This is achieved using unique feature identifiers (FID)
assigned to individual spatial objects
By using common FIDs, graphical and descriptive
elements of vector data are correctly cross-referenced
during database creation and spatial data processing
Usually an automated process, but linkage to descriptive
data is normally a manual process
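A minimal sketch of cross-referencing by feature identifier. The two dictionaries stand in for the graphical and descriptive parts of a geographic database; the FIDs, attribute names and values are hypothetical.

```python
# Graphical data: coordinates keyed by feature identifier (FID).
geometry = {
    101: [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)],
    102: [(4, 0), (8, 0), (8, 3), (4, 3), (4, 0)],
}

# Descriptive data: attribute records keyed by the same FIDs.
attributes = {
    101: {"land_use": "residential", "owner": "A"},
    102: {"land_use": "commercial", "owner": "B"},
}

def feature(fid):
    """Combine the geometry and the attributes that share one FID."""
    return {"fid": fid, "geometry": geometry[fid], **attributes[fid]}

# FIDs present on one side only point to a broken link between the two parts.
unlinked = set(geometry) ^ set(attributes)

print(feature(102)["land_use"])  # commercial
print(unlinked)                  # set()
```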
Representation of vector data is governed by scale of
input data
Possibility of representing vector data differently at
different scales is associated with 2 important concepts
Cartographic generalisation: whereby line and area objects
are represented by more coordinates at a larger scale than at
a smaller scale
Cartographic symbolisation: whereby vector data are
represented by different symbols that serve to visually
distinguish them from one another when the data is displayed
Vector data is stored as integers or floating-point
numbers
To avoid rounding errors that occur during data
processing, most software products store data as
double-precision floating-point numbers
However, this does not ensure precise representation of
spatial objects because
Precision of data storage does not always mean accurate
description of data
Boundaries of spatial objects are fuzzy rather than exact
entities
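The rounding effect can be illustrated by pushing a coordinate through single-precision storage. The easting value is a made-up example; Python floats are double precision, so the standard struct module is used here to simulate a 4-byte float.

```python
import struct

def as_single_precision(value):
    """Round-trip a double-precision float through a 4-byte float."""
    return struct.unpack("f", struct.pack("f", value))[0]

easting = 1234567.89  # hypothetical coordinate in metres
stored = as_single_precision(easting)

print(stored)                        # 1234567.875
print(abs(stored - easting) < 0.02)  # True: about 1.5 cm lost to rounding
```

Double precision removes this storage error, but it does not make the coordinate itself any more accurate than the survey or digitising that produced it.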
Concept of a Topological Map
Topology
Branch of mathematics that studies those properties of
geometric figures that are unchanged when the shape of a
figure is twisted, stretched, shrunk or distorted without
breaking
Field of geometry concerned with spatial relationships rather
than with rigid coordinates
Topological relationships for spatial objects were
proposed by Corbett (1979)
Topological relationships include 3 basic elements
Adjacency
Containment
Connectivity
Topological map
Map that contains explicit topological information on top of the
geographical information expressed in coordinates
All spatial entities are decomposed and represented in the form
of 3 basic graphical elements
Point entity is identified by a unique feature identifier
Line entity is identified by the line itself, as well as its nodes
Topological relationships among linear entities are computed
and stored by using the identifier of the nodes
Polygon entities are formed by using line entities and their
respective nodes
Once formed, polygon entities are individually identified by a
unique identification number
Topological relationships between polygon entities are
computed and stored by using adjacency information stored
with the line entities
Adjacency information includes nodes of the line and
identifiers of the polygons on the left and right of the line
Process of computing topological relationships: Topology
building
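A simplified sketch of how polygon adjacency can be derived from the information stored with line entities. Each arc carries its from-node, its to-node, and the identifiers of the polygons on its left and right. The arc, node and polygon identifiers are hypothetical (two squares sharing arc a2), and None stands for the area outside the map.

```python
# Arc table: nodes plus left/right polygon identifiers for every arc.
arcs = {
    "a1": {"from": "n2", "to": "n1", "left": "P1", "right": None},
    "a2": {"from": "n1", "to": "n2", "left": "P1", "right": "P2"},
    "a3": {"from": "n1", "to": "n2", "left": "P2", "right": None},
}

def build_adjacency(arcs):
    """Derive polygon-to-polygon adjacency from the left/right identifiers."""
    adjacency = {}
    for arc in arcs.values():
        left, right = arc["left"], arc["right"]
        for polygon in (left, right):
            if polygon is not None:
                adjacency.setdefault(polygon, set())
        if left is not None and right is not None and left != right:
            adjacency[left].add(right)
            adjacency[right].add(left)
    return adjacency

def boundary_arcs(arcs, polygon):
    """List the arcs that form the boundary of one polygon."""
    return sorted(name for name, arc in arcs.items()
                  if polygon in (arc["left"], arc["right"]))

print(build_adjacency(arcs))      # {'P1': {'P2'}, 'P2': {'P1'}}
print(boundary_arcs(arcs, "P1"))  # ['a1', 'a2']
```

The shared boundary a2 is stored once and referenced by both polygons, in contrast to the spaghetti model.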
Types of Topology
Uses of Topological Relationships in GIS
Functional areas where used
Data Input and Representation
Spatial Search
Construction of complex spatial objects from basic graphical
elements
Integrity checks in database creation
Data Input & Representation
Manual creation of a topological map is next to impossible
Using topological relationships, the method of spaghetti
digitising can be adopted for graphical data input
Arc-based digitising procedure wherein no particular
sequence in digitising needs to be followed
Storing adjacency information in the form of identifiers
removes the need to duplicate line data
When data are plotted to show all the polygons, lines
are plotted and not polygons
Spatial Search by Topological Relationships
Tedious to find manually the data for parcels surrounding a
particular parcel in a land title information system
Process requires first using an index map to obtain parcel
identification number (PID) of all neighbouring parcels and
retrieving data from appropriate records
In GIS, adjacency information can be used to identify all
polygons that share a common boundary and apply their
PIDs to retrieve descriptive data from database
Connectivity information can be used for spatial search
using line data
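Both kinds of search can be sketched with small lookup tables. The adjacency table is assumed to have been produced already by topology building; the parcel identifiers, owner names and road arcs are hypothetical.

```python
from collections import deque

# Adjacency information and descriptive records, both keyed by PID.
parcel_neighbours = {"P10": {"P11", "P12"}, "P11": {"P10"}, "P12": {"P10"}}
parcel_records = {"P10": {"owner": "A"}, "P11": {"owner": "B"}, "P12": {"owner": "C"}}

def neighbour_records(pid):
    """Retrieve descriptive data for every parcel sharing a boundary with pid."""
    return {n: parcel_records[n] for n in sorted(parcel_neighbours[pid])}

# Line data: each arc is given by its two end nodes.
road_arcs = [("n1", "n2"), ("n2", "n3"), ("n4", "n5")]

def connected_nodes(start, arcs):
    """Breadth-first search over arcs to find all nodes reachable from start."""
    links = {}
    for a, b in arcs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in links.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(neighbour_records("P10"))                  # {'P11': {'owner': 'B'}, 'P12': {'owner': 'C'}}
print(sorted(connected_nodes("n1", road_arcs)))  # ['n1', 'n2', 'n3']
```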
Construction of Complex Spatial Objects
Complex spatial objects are represented as complex
polygons in geographic databases
2 types of complex polygons
Those containing one or more holes (islands or enclaves)
Those made up of two or more polygons that are not
physically connected
Polygons can be constructed from vector lines using
topological information or a common identifier
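The two types of complex polygon can be illustrated with an area calculation. In this sketch a hole is subtracted from its outer ring, and physically separate parts are combined through a common identifier. The data layout (id, outer, holes) and the feature names are assumptions for illustration.

```python
def ring_area(ring):
    """Unsigned area of a closed ring by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

parts = [
    # A polygon containing one hole (an island).
    {"id": "lake",
     "outer": [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)],
     "holes": [[(4, 4), (6, 4), (6, 6), (4, 6), (4, 4)]]},
    # Two unconnected polygons that form one object through a common identifier.
    {"id": "park", "outer": [(20, 0), (25, 0), (25, 5), (20, 5), (20, 0)], "holes": []},
    {"id": "park", "outer": [(30, 0), (32, 0), (32, 2), (30, 2), (30, 0)], "holes": []},
]

def area_by_id(parts):
    """Sum part areas per identifier, subtracting holes from their outer ring."""
    totals = {}
    for part in parts:
        area = ring_area(part["outer"]) - sum(ring_area(h) for h in part["holes"])
        totals[part["id"]] = totals.get(part["id"], 0.0) + area
    return totals

print(area_by_id(parts))  # {'lake': 96.0, 'park': 29.0}
```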
Integrity Checks in Database Creation
Graphical data must not contain any topological
errors
During topology building process, the computer will
identify topological errors and flag them automatically
Data input operator can check the errors and decide
how they must be corrected
Corrections of some errors can be done using the
concept of tolerance or interactive manual editing
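Correction by tolerance can be sketched as snapping: line end points that fall within a chosen distance of an existing node are moved onto that node. This is a simple greedy illustration, not the procedure of any particular GIS package; the coordinates and the tolerance value are made up.

```python
import math

def snap_endpoints(points, tolerance):
    """Snap each point to the first earlier node lying within the tolerance."""
    nodes = []    # accepted node positions
    snapped = []  # result, in input order
    for p in points:
        for n in nodes:
            if math.dist(p, n) <= tolerance:
                snapped.append(n)  # close enough: reuse the existing node
                break
        else:
            nodes.append(p)        # no node nearby: keep the point as a new node
            snapped.append(p)
    return snapped

endpoints = [(0.0, 0.0), (5.0, 5.0), (5.02, 4.99), (9.0, 1.0)]
print(snap_endpoints(endpoints, tolerance=0.05))
# [(0.0, 0.0), (5.0, 5.0), (5.0, 5.0), (9.0, 1.0)]
```

A tolerance that is too large merges nodes that are genuinely distinct, which is why some errors are left to interactive manual editing.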