5.1 Introduction
This chapter investigates the automatic classification of macro morphological landforms using GIS and digital elevation models (DEM). In the past, manual methods have been used for classifying macro morphological landforms from contour maps. Hammond's (1954 and 1964) procedure has to a certain extent become the de facto standard. A process developed by Dikau et al. (1991), which automates Hammond's manual procedures using GIS, is applied to the study area. Although this produces a classification that has good resemblance to the landforms in the area, it has some problems. A new process is presented that partly solves these problems. Landform classification is very sensitive to the operational definition used and this will be demonstrated. An application of fuzzy set theory that uses the notion of entropy is used to present this sensitivity.
For landscape classification, landform should be classified by morphology rather than rock type, structure, age or origin. It is usually the morphology that gives the greatest visual impression to the general public. Usually the rock type or structure is not even seen from a reasonable distance as the land may be covered by trees or buildings. Landscape assessment is concerned with the present character rather than the genesis. Genetic concepts are useful for understanding the processes forming the landforms but do not necessarily describe the appearance of a landform. The aims of a visual landscape classification are different from those of a genetic geomorphological classification, and therefore a different approach is required.
Within the fields of geomorphology and hydrology, the
automatic mapping of morphological landforms has been of interest, for instance in
modelling erosion (Dikau et al., 1991), providing watershed information (Band, 1986), and
mapping land components (Dymond et al., 1995). A morphological landform classification has
long been of interest to climatologists for developing climate models - topoclimatology
(Geiger, 1971). Although these disciplines have a different purpose for landform
information compared to landscape research, the ideas and methods initiated are very
useful. In general, geomorphological classifications are based at the meso-relief,
micro-relief and nano-relief levels, while landscape classification needs to incorporate
macro-relief, and some elements of meso-relief (Linton, 1970). Dikau (1989) defines the
macro landform scale as landforms greater than 10 square km and less than 1000 square km
in area.
5.2 Manual classification
Hammond (1954 and 1964) has developed a
macro morphological landform classification that was applied to the whole
of North and South America. Wallace (1955) used Hammond's classification,
with a few modifications, to classify New Zealand's landforms. Hammond's
classification is very quantitative with clear, explicit definitions that
can be easily applied by other researchers. It is perhaps this quality
that explains why Hammond's classification has been so widely applied.
The classification scheme used by Hammond is presented in Figure
5.1. A combination of three important parameters was used to identify
different landforms. These were relative (local) relief, slope, and profile
type. Relative relief is the maximum difference in height over a certain
area. Hammond used a square grid measuring 9.65km (6 miles) across to
determine the search area. After experimenting with different grid sizes,
Hammond (1964) found that this size was
"neither too small as to cut individual slopes in
two and thus distort the determination of local relief, nor so large as to include areas
of excessive diversity" (p.17). Gentle slope is used to distinguish areas of relief and
non relief. He chose 8 percent inclination as the upper limit of gentle slope, justifying
this value by saying that it,
"falls within the range of inclination in which the
difficulty of machine cultivation increases rapidly, erosion of cultivated fields becomes
troublesome, easy movement of vehicles becomes impeded, and in general one becomes highly
conscious that he [sic] has a sloping surface to deal with" (p.17). He also noted that the Soil Conservation Service in the
U.S. had used this threshold. However, the method used to identify this critical gradient
is not explained by Hammond. As discussed in section 0, this is an elusive parameter to
define. Profile type is explained in more detail in section 0. It is a means for
expressing whether flat areas are above or below the surrounding terrain and so is used
for identifying tablelands.

Subsequent to Hammond's work other landform
classification schemes have been developed. Many are adaptations of
Hammond's work and Table 5.1 summarizes
three of these. Wallace (1955) has produced the only morphological
classification of landforms for the whole of New Zealand (refer to Figure
5.2). A 1:1,000,000 base map was used and this was completed
nearly forty years ago. As previously mentioned, Wallace used a method
based on Hammond's scheme. Wallace (1955) remarked regarding future developments
that he
"earnestly hoped that others with more advanced
concepts and better databases will work on a larger scale and reveal the inadequacies of
this early effort" (p. 27). Wallace did not explicitly calculate slopes because this
would have been too laborious. Today, such slope information is easily available because
automatic extraction of information from digital databases has advanced considerably.
These data would probably have been beyond Wallace's wildest hopes. Despite these
advances, which will be discussed and demonstrated in this chapter, there has been very
little further development in New Zealand with this type of morphological classification
since his attempt. This study will try to fulfil Wallace's hope.

The only other real initiative or discussion on
morphological landform classification in New Zealand since Wallace's effort has been in
response to the Protected Natural Areas (PNA) programme (Myers et al., 1987). The PNA
programme was instigated to satisfy the requirements of the Reserves Act (1977) which
established provisions for
"...the preservation of representable samples of all
classes of ecosystems and landscape...". A discussion on landform classification resulting from
this produced two papers: "Terrain evaluation for rapid ecological survey"
(Crozier and Owen, 1983); and "A landform classification for PNA surveys in Southern
Alps" (Whitehouse, Basher, and Tonkin, 1990). It appears that the main emphasis of
the PNA survey was the protection of ecosystems and, in particular, significant
representations of natural flora. As a result, there was no deliberation over visual
landscape assessment theory. Crozier and Owen's classification scheme is based on the work
of Wallace, which in turn can be traced back to the work of Hammond. The classification
scheme devised by Whitehouse et al. appears to have been the scheme adopted in the
PNA programme for the Southern Alps. This was genetically based, which means that landform
data collected for the PNA programme is not the most appropriate for a visual landscape
classification. Landform data from the PNA programme is also difficult to use because most
of it is not in digital format, and also the definitions of the different landform classes
are not precise enough. For example, "valley floor" is defined as, "the
comparatively broad, flat bottom of a valley". How broad is broad? With several
different field teams, there could be inconsistency between different areas.

There have been many publications that describe New
Zealand's landforms from a genetic perspective. A recent notable example is Soons and
Selby (1982) but this does not help much for the development of a landform classification
that needs to be morphological.

5.3 Automated classification
Computers have been used for extracting terrain
parameters from DEMs for at least the last twenty years. Collins (1975) discussed
different algorithms that could be used for identifying features such as tops of hills,
bottoms of depressions, watershed or depression boundaries and areas, storage potential of
watersheds, slope, and aspect. With the development of commercial GIS and national digital
databases (NDDB) in the mid 1980s, there has been a resurgence of interest in this field
(Dikau, 1989, Weibel, 1988, Weibel and Heller, 1991, Dikau et al., 1991, and Moore et al.,
1993). Significant advances have been made, and many processes for identifying these
parameters are now becoming standard functions within a GIS. Functions have been developed
for generalising extensive terrain surfaces using triangulated irregular networks (TIN)
(Midtbo, 1992). TIN and other algorithms have been used for generating DEMs from contours
(Weibel and Heller, 1993), and slope can be obtained easily from either a TIN or a DEM. It
is not the intention of this thesis to discuss in detail the mechanics of these functions
as many general GIS books do this (e.g. Aronoff, 1991). What is of interest in this thesis
is how these parameters can be used to identify different landforms.

Regarding landscape research, there have only been a few
published works on automatic landform classification. Barbanente et al. (1992) developed
routines for identifying ravines and cliffs automatically. These are not features that can
be justifiably included in a landscape classification because of the need to generalise.
Jackson (1990) used GIS to identify certain terrain parameters using what are now fairly
well known GIS functions. It is necessary now to determine more complex parameters and how
these parameters can be used for identifying landforms.

The identification of parameters (parameterization) is an
important first step in identifying landforms. These parameters are then used to develop
parametric signatures of different landforms (described as formalisation). Dikau (1989)
used this approach to identify plateaux, convex scarps, straight front slopes, concave
foot-slopes, scarp forelands, cuesta scarps, valleys and small drainage ways, and crests.
Many of these landform features are, however, at the nano to meso scale, which is too
detailed for a landscape classification that requires macro scale landforms.

Dikau, Brabb, and Mark (1991), in a very obscure
publication, developed automated routines that do identify macro landforms. The process
they developed automates Hammond's manual process nearly exactly and produces a similar
result, which they demonstrated on the landforms of the entire state of New Mexico in the
United States. Given that Hammond's classification has, to a certain extent, become the
standard approach for a morphological landform classification, this is a significant
development. In any classification, standardisation is important. The automated process
developed by Dikau et al. is therefore of particular relevance to this thesis and will be
discussed in detail.

5.3.1 Automating Hammond's classification scheme
Table 5.2
compares Hammond's scheme with the automated scheme developed by Dikau
et al. The main difference between the two approaches is the number of
classes identified and the method of generalization. The combination of
parameter classes that Hammond's classification identifies could provide
as many as 96 landform units, but it only identifies the more common landform
units, which totalled 45. Perhaps this was required for practical reasons.
The automated approach identifies all 96 landform units. Hammond's process
also merged areas smaller than 2072 square kilometres into adjacent areas,
so that the information could be generalized on to a 1:5,000,000 scale
map. The automated approach does not do this.

Another difference concerns the use of
spatial averaging windows. While a similar size square window was used
by Dikau et al. (9.8 km sides compared to Hammond's 9.65 km), the averaging
procedure was different. Hammond's approach moves the window along in
9.65km steps. This means that all the area within the window is generalised
to one landform type. With the automated approach a neighbourhood function
is used, as described in section 3.2.1.1,
and its window moves in 200m steps, where 200m is the raster cell length.
For each step, a generalization of the window was calculated and this
information was assigned to the focal cell (the cell in the centre of
the window). With Hammond's scheme, areas near the edge of the window
boundary could be easily generalised wrongly as information outside the
window boundary could be important to these areas but would not have been
considered. This problem is partly solved with the automated approach
using a neighbourhood focal function.

The basic procedures used in the automated
approach developed by Dikau et al. are described in Table
5.3, which identifies the three components required - slope,
relative relief, and profile type. Slope was calculated using a three
by three moving window on a DEM, and from each placement of the window,
the nine adjacent elevation points were used. Relative relief was calculated
using a 49 by 49 moving window on a DEM (200m cell size). For each window
placement, the difference between maximum and minimum elevation was used
as the measure of relative relief. Figure
5.3 illustrates how the profile type was identified. As mentioned
previously, profile type is used to determine whether the flat areas are
above or below the surrounding terrain and is used principally for identifying
tablelands. Three classes are distinguished: lowland gentle sloping, upland
gentle sloping, and not gentle sloping. Upland and lowland profiles are
identified by first calculating the maximum elevation within the moving
window. The height of the central grid cell is subtracted from this. If
this is less than half of the relative relief within the moving window,
then the central cell is identified as upland. Otherwise, the central
cell is lowland. The resulting upland and lowland coverage is then overlaid
with a slope coverage to identify upland and lowland gentle sloping areas.
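The upland and lowland rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the routine used by Dikau et al.: the function names are hypothetical, the window is a plain square of `half` cells either side of the focal cell (a 49 by 49 window corresponds to `half=24`), and windows are simply clipped at the edge of the grid.

```python
def window_values(dem, row, col, half):
    """Elevations inside the square window centred on (row, col).

    Assumption: the window is clipped where it runs off the grid.
    """
    rows = range(max(0, row - half), min(len(dem), row + half + 1))
    cols = range(max(0, col - half), min(len(dem[0]), col + half + 1))
    return [dem[r][c] for r in rows for c in cols]


def relative_relief(dem, row, col, half):
    """Maximum minus minimum elevation within the window."""
    values = window_values(dem, row, col, half)
    return max(values) - min(values)


def profile_class(dem, gentle, half):
    """Label each cell 'upland gentle', 'lowland gentle' or 'not gentle'.

    `gentle` is a grid of booleans (True where slope is below the threshold).
    """
    labels = []
    for r, dem_row in enumerate(dem):
        out_row = []
        for c, height in enumerate(dem_row):
            highest = max(window_values(dem, r, c, half))
            relief = relative_relief(dem, r, c, half)
            # Upland: the drop from the window maximum to this cell is
            # less than half of the relative relief; otherwise lowland.
            upland = (highest - height) < 0.5 * relief
            if not gentle[r][c]:
                out_row.append("not gentle")
            else:
                out_row.append("upland gentle" if upland else "lowland gentle")
        labels.append(out_row)
    return labels
```

For a toy grid with a plain at 0 m on the left and a flat surface at 200 m on the right, the flat cells on the right come out as "upland gentle" and those on the left as "lowland gentle", which is the distinction needed to recognise tablelands.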
The percentage of gentle sloping areas that are in lowland profiles is
then calculated using a focal neighbourhood function.

Once these three components have been identified
and classified, unique combinations are found by overlaying them. These
are listed in Table 5.4,
where the codes are the same as used in Hammond's scheme (refer to Figure
5.1). The subclasses are labelled using a capital letter,
a number, and a small letter. These represent the different components
used for identifying the subclasses. The capital letters from A
to D represent different slope classes, the numbers from 1
to 6 represent different relative relief classes, and the small
letters from a to d represent the different profile classes.
The combinations of the different classes identify the 96 different subclasses.
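The labelling scheme can be enumerated directly, which confirms the count of 96 (4 slope classes x 6 relative relief classes x 4 profile classes). The sketch below is illustrative only; the names and the zero-based class indices are assumptions made for the example.

```python
from itertools import product

SLOPE_CODES = "ABCD"      # capital letter: slope class
RELIEF_CODES = "123456"   # number: relative relief class
PROFILE_CODES = "abcd"    # small letter: profile class


def subclass_code(slope_class, relief_class, profile_class):
    """Build a subclass label such as 'B3c' from zero-based class indices."""
    return (SLOPE_CODES[slope_class]
            + RELIEF_CODES[relief_class]
            + PROFILE_CODES[profile_class])


# Every unique combination of the three components.
ALL_SUBCLASSES = ["".join(parts)
                  for parts in product(SLOPE_CODES, RELIEF_CODES, PROFILE_CODES)]
```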
Once the subclasses are identified, the landform classes and types are
determined by grouping the subclasses as shown in Table
5.4.

The database used by Dikau et al. for classifying the
landforms of New Mexico was a 100m grid DEM. This was used to generate a 200m grid DEM.
The software they used was a grid modelling system, an image processing system, and
ARC/INFO. The hardware they used was a Sun Sparc 2, Vax 4000, Microvax II, and Prime.

5.3.2 Automated classification of New Zealand's landforms
Given that Hammond's landform classification
scheme is reasonably well recognised and accepted, and also given that
this scheme has been previously automated, it was decided that an automated
process based on Hammond's scheme should be investigated for classifying
New Zealand's landforms. ARC/INFO, a Sun Sparc 10 workstation, and a 100m
contour database with spot heights were used. The contour database was
converted to a 200m grid DEM using ARC/INFO's TIN, and TIN to grid functions.
The process was thereafter similar to that developed by Dikau et al. (1991).
A range of neighbourhood functions, as discussed in section
3.2.1.1, were used, as well as a slope function within the
GRID module of ARC/INFO, and a classify function (CLASS). The same class
intervals, codes and labels were used as in Dikau et al. (1991). Figure
5.4 shows the different stages of the process for the Banks Peninsula
region. First a DEM was produced. From this, slope can be calculated,
which was then classed as less than or greater than (and equal to) 8 percent.
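A minimal sketch of this step follows. The text does not say which finite-difference scheme the GRID slope function applies, so the example assumes a Horn-style estimate over the three by three window; the function names are hypothetical and edge cells are simply left unclassified.

```python
import math


def slope_percent(dem, cell_size):
    """Slope in percent for interior cells; edge cells are left as None.

    Assumption: Horn-style weighted differences over the 3 x 3 window.
    """
    n_rows, n_cols = len(dem), len(dem[0])
    slope = [[None] * n_cols for _ in range(n_rows)]
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            nw, n, ne = dem[r - 1][c - 1], dem[r - 1][c], dem[r - 1][c + 1]
            w, e = dem[r][c - 1], dem[r][c + 1]
            sw, s, se = dem[r + 1][c - 1], dem[r + 1][c], dem[r + 1][c + 1]
            dz_dx = ((ne + 2 * e + se) - (nw + 2 * w + sw)) / (8 * cell_size)
            dz_dy = ((sw + 2 * s + se) - (nw + 2 * n + ne)) / (8 * cell_size)
            slope[r][c] = 100 * math.hypot(dz_dx, dz_dy)
    return slope


def is_gentle(slope_value, threshold=8.0):
    """True when slope is strictly below the threshold (8 percent by default)."""
    return slope_value is not None and slope_value < threshold
```

On a 200m grid, a surface rising 10m per cell has a slope of about 5 percent and is classed as gentle, while one rising 20m per cell (about 10 percent) is not.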
The "mean slope" was calculated by assigning the value 100 to
areas that were gentle sloping (< 8%) and the value zero where it was
not. A focal mean function with a neighbourhood analysis window (NAW) of 5600m radius was then used to calculate
the percentage of the neighbouring area that was gentle sloping. These
percentages, classed into intervals, define the "mean slope"
component. Relative relief was calculated from the DEM using a focal range
function and a NAW of 5600m. A circular pattern results because of the
influence of high points that affect the whole of the circular NAW. The
relative relief values were classed into six intervals. Profile was calculated
from the DEM by using a focal maximum function, and relative relief to
identify upland and lowland profiles. This was then combined with the
slope classes to identify the three profile classes. The profile component
is represented by "profile percent" classes, which describe
the percentage of gentle sloping areas that are in lowland profiles. The
spatial averaging procedure used to accomplish this was as follows. A
focal sum function counts the number of cells in the neighbourhood that
were gentle sloping, and also the number of cells classed as lowland gentle
sloping. From these values, the percentage of gentle slope areas that
are lowland can be calculated. Figure 5.5
shows the resulting landform classes for the study area. The processing
time was about two hours.

One difference between the process developed in this
study and that developed by Dikau et al. was the shape of the NAW. Dikau et al. used a
square window, while the process developed in this study uses a circle. A circle seems
more appropriate than a square, for the obvious reason that the extent of the boundary of
a circle will always be the same distance from the focal point, unlike a square. With the
latest GIS technology it is easy to use a circle as a moving window. Perhaps it was not a
viable option when Dikau et al. were developing their process. The radius used for the
search window in this study was calculated to be 5529m so that the area of the window
is the same as that of the 9.8 km square used by Dikau et al., and close to that of
Hammond's 9.65 km square: a square with 9.8 km sides covers 96.04 square km, and a circle
of equal area has a radius of sqrt(96.04 / pi) = 5.529 km. This radius is rounded to a
multiple of the cell size, which with a 200m cell size becomes 5600m.

The automated process produces a classification
(Figure 5.5) that resembles
the landforms of this area and is similar to Wallace's classification
of the same area. It is difficult to quantitatively compare these two
classifications since Wallace's (1955) classification is not available
digitally. Wallace classifies virtually all of Banks Peninsula's landform
as "low mountains". The automated approach identifies a significant
proportion of Banks Peninsula as "low mountains" as well, but
it also recognises that large parts of Banks Peninsula have flat areas,
either as broad spurs on the far eastern parts of Banks Peninsula, or
as valley floors. These flat areas have affected the classification and
have resulted in a proportion of Banks Peninsula being identified as "open
low mountains". The automated approach has also integrated plains
and hills to generate a class that is a composition of these classes.
As identified in the criteria given in section 2.9,
composition is important for landscape classification.

The automated process, however, does have
some problems. The first of these is the large regular shaped block in
the Canterbury plains identified as "flat or nearly flat plains"
in Figure 5.5. In reality there
is no significant visual difference in landform between this area and
the neighbouring areas on the Canterbury Plains. This area is the result
of difficulties in producing an accurate TIN when the contours are far
apart. Subsequently, this affects the slope calculation, which is important
for distinguishing classes. This problem could be resolved if more contours
or spot heights were added. A second problem with the automated approach
is the way classes change as the distance away from the areas of relief
increases. For example, in Figure 5.5
the area between the Canterbury Plains and Banks Peninsula has a series
of classes going from "plains" to "plains with hills"
to "plains with high hills" to "plains with low mountains"
to "low mountains". This reflects a progressive change in relative
relief towards Banks Peninsula and is not a particularly desirable result.
It is not how you would expect people to conceptualize the landforms in
this area. As discussed above, it is desirable to have a composition class
that incorporates the change from plains to mountains but this should
not be done with progressive zonation.

A third problem with this automated approach is that some
areas that are quite different in appearance are being classified the same. This is
particularly the case with areas classified as "open". Some areas are
"open" because they are at the interface between the plains and the mountains,
while other areas are also "open" because they are in a broad valley, or on flat
spurs. The process cannot distinguish between these different landforms. On the north
eastern side of Banks Peninsula an area is classified as "open low mountains"
and as previously noted this was because of the large flat spurs in this region. It does
not seem appropriate that this area should be classified the same as areas that are at the
interface between mountains and plains. The operational definition is unable to
distinguish some objects that are of micro or meso scale, such as flat spurs, from objects
that are of macro scale, such as plains. It is also for this reason that some areas are
classified as "tablelands" when they are just ordinary hills.

Related to this scale issue is slope. Slope is very
dependent on the scale at which it is measured, a matter that will become more apparent in
section 5.4.3 when the effects of cell size are examined. This process uses the same slope
criteria as Hammond (8 percent), but measures slope at a different scale, thereby, in
effect, adopting a different slope criterion. It is necessary to determine whether this
new slope criterion is appropriate. This issue regarding slope is discussed further in
section 5.4.4.

If it was thought to be appropriate that conical
volcanoes should be identified in the classification then this could in theory be included
in an automated process. Dikau (1989) shows how concave and convex surfaces (in any
direction) can be identified by using aspect and slope. It seems viable that conical
shapes could be identified by their convex surfaces in the horizontal direction, and,
possibly, concave surfaces in the vertical direction to develop a parametric signature of
conical shaped volcanoes. However, the issue is whether it is appropriate that volcanoes are
included in a landscape classification.

Although this automated classification has problems, it
nevertheless has important advantages over manual processes. These are that it is totally
explicit and that it can also be applied to large areas to produce results relatively
quickly. This automated approach can also be viewed as just the start of a process that
can evolve as better techniques develop. Because the process is explicit, one can analyse
and improve on it.

5.4 Sensitivity to operational definition
The automated approach developed by Dikau
et al. (1991) and then subsequently implemented in New Zealand is very
dependent on critical thresholds specified for different parameters. For
example, an eight percent slope threshold is used, and particular bounds
are chosen for the component class intervals. The process also uses a
neighbourhood analysis window that is defined by its radius. It would
be interesting to know the effect of changing these values. With GIS and
the use of macro programmes, it is possible to structure the process so
that different thresholds can be easily changed. The macro used to run
the landform classification process developed in this study contains variables
for all parametric thresholds. These variables were then defined at the
beginning by a separate sub-macro. As the processing time was only two
hours it was possible to produce many different classifications that were
the result of different parameter settings. Figure
5.6, Figure 5.7, and Figure
5.8 show, respectively, the effect of different slope thresholds,
relative relief class intervals, and NAWs on the resulting landform classification
(the relative relief class intervals are altered by dividing or multiplying
the class bounds by the factors shown in Figure
5.7). The amount of agreement (i.e. percentage of cells with
the same class) between the classification that uses 2 percent slope and
the classification that uses 14 percent slope is 21% for the Banks Peninsula
area. The agreement between classifications with relative relief decreased
by a factor of 4 and increased by a factor of 4 is 91%, and between a
NAW of 1,000m and 10,000m radius is 43%. These figures show that the resulting
classification is very dependent on how these parameters, especially slope
and the NAW, are defined. However, the sensitivity to these parameters
will depend on location.

The sensitivity analysis does not produce surprising
results. Given the way the process is structured, it is not surprising that if you change the
definition of gentle sloping from being less than 2 percent slope to less than 14 percent
slope, then there will be more "open mountains". By definition, in this
classification process, for an area to be classified "open" it must contain a
certain proportion of flat areas. With a 14 percent threshold, more areas will be identified
as gentle sloping, and therefore more area will be identified as "open". The
changes in relative relief levels have not affected the classification outcome
substantially for the Banks Peninsula region, but it is easy to conceive that changes in
relative relief classes could affect the outcome in certain locations where the topography
is close to being either a mountain or a hill.

The effect of different NAW radii on the classification
process is more complicated. It needs to be remembered that NAWs were used at many
different stages of the process. They are used to calculate the percentage of area that is
gentle sloping, the relative relief, and three times when calculating profile. The same
size NAW was used for all these operations. The radius of the NAW will affect the boundary
between areas of relief and no relief; consequently the distinction between the classes
"plains", and "plains with hills or mountains" changes with different
radii. With relative relief, the larger the NAW then the more likely that the difference
between the highest point and the lowest point will be greater. The size of the NAW also
affects the amount of generalisation. When the NAW radius is only 1000m, the
classification is more detailed than when the NAW radius is 10,000m. With a 1000m radius,
micro relief is being identified, such as flat spots on the eastern spurs that have been
identified as tablelands. As discussed previously, with landscape classification the
identification of macro landforms rather than micro landforms is important. Small flat
areas on spurs are not macro relief.

Figure 5.6,
Figure 5.7, and Figure
5.8 show 21 different landform classifications of the same
area. For each figure only one parameter has been altered and the others
have been held constant. If the combinational effect of changing several
parameters simultaneously was investigated, then virtually hundreds of
different classifications would be produced.

5.4.1 A definitive classification
When Hammond produced his landform classification, it
would not have been practical to investigate the effects of different operational
definitions. It would have been important that the definitions of different landforms be
chosen in advance and only these implemented, as this task would have been laborious enough. Now
with GIS technology, one can see that it is possible to investigate different parameter
thresholds. But it is still difficult to choose which operational definitions are
appropriate as it depends on whose conceptual model is being considered. For example, a
Dutch person will probably have a different definition of a mountain than a Nepalese. When
viewing landforms, some people may focus on small areas, while others may view more widely
and get an overall impression. As demonstrated, it is now possible to produce many
different conceptual models of landforms, but having hundreds of classifications is of
little use to research that needs a single frame of reference. A single classification
needs to be decided upon.

One way of choosing an appropriate classification
is to use the class that occurs most frequently (majority), for a given
cell, from a wide range of different classifications that represent many
different conceptual models. This can be easily implemented with GIS. The
more advanced GIS software can do this with one command. Although hundreds
of different conceptual models can be created, it seems that with ARC/INFO
(version 6.2) only 47 coverages could be incorporated in the majority
function. Figure 5.9 is the majority
of 45 different classifications. The following parameter settings were
used:
Five slope settings - 4, 6, 8, 10, and 12 percent;
Three relative relief settings - Hammond's, Hammond's divided by 2, and Hammond's multiplied by 2; and
Three NAW radii - 2400m, 5600m, and 8400m.

The combination of all these settings (5 x 3 x 3) produces 45
different classifications. It should be noted that when the majority function is used in
ARC/INFO and there is no clear majority (i.e. when two or more classes share the highest
frequency) for a particular cell, then no value is assigned to that cell. For Banks
Peninsula there were a few cells where this was the case, but where this happened the cell
value from Hammond's parameter settings was used instead. It should also be noted that a
cell size of 400m was used because of the amount of processing involved.

A majority classification could be used as a definitive
classification because it incorporates a wide range of conceptual models. However, a
majority classification is sensitive to the range of conceptual models chosen, and perhaps
a different range is more desirable. With GIS this majority calculation is very quick, so
different ranges of parameter settings could easily be experimented with. On the other hand
it could also be argued that Hammond's classification should be the definitive
classification as it has been in use since 1954 and has become a de facto standard.

5.4.2 An application of fuzzy set theory
As discussed in section 2.9,
landscapes are fuzzy entities, as they are based on human conceptualization
and this varies between different people. Fuzzy set theory provides a
means of presenting this fuzziness by providing information that shows
the degree of membership of different classes that exist for each cell.
Using the example presented in the previous section, membership is calculated
by comparing all the 45 different outcomes. For each class, a coverage
is created that shows the degree of membership (frequency of occurrence)
that exists for different cells. The membership of each class was calculated
by first generating grid coverages that consisted of only the value for
that class, for example a grid coverage that consisted only of 2 (2 corresponded
to "tablelands"). An "equal to" function was used
to count for each cell how many of the 45 different classifications equalled
this blank coverage value. This provided information on the membership
of that class. This process was repeated for all the classes. Figure
5.10 shows the results for the landform types. In this case there
are only five possible classes so this information can be easily presented.
When there are hundreds of different classes, which will be the case with
a landscape classification that consists of the unique combination of
four different attributes, then this information will not be easy to present
and would in fact be too much for anyone to assimilate.

One way of presenting this membership information for
easier assimilation is to use the notion of entropy (Wilson, 1970; Ashby, 1994). Entropy
provides information on the distribution of the membership of the different classes for a
given area (in this case a cell). It is implemented by first calculating for each class
the proportion of the 45 outcomes that are assigned to that class. Thus if a particular
cell is assigned to class A in 15 outcomes, the coverage for class A will show a value P
of 0.33 for that cell, while coverages for the other classes will show P values totalling
0.67. The entropy coverage is then created by combining these P values with the formula
for entropy (Eqn. 5.1). If the membership of one class is very high and the membership of
all the other classes is low then entropy will be low. If the memberships of all the
classes are fairly even and there is no class that stands out, then entropy will be high.
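
As an illustration only, the membership and entropy calculation for a single cell can be sketched in Python. The sketch does not reproduce Eqn. 5.1; it assumes the common form H = -sum(P ln P), and the function names and the example cell are hypothetical rather than part of the GIS implementation used in this study.

```python
import math
from collections import Counter

def membership(outcomes):
    """Proportion P of the outcomes assigned to each class for one cell."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {cls: n / total for cls, n in counts.items()}

def entropy(outcomes):
    """Entropy of one cell, assuming the form -sum(P * ln P)."""
    return sum(-p * math.log(p) for p in membership(outcomes).values())

# Hypothetical cell: class "A" in 15 of the 45 outcomes, class "B" in the rest.
cell = ["A"] * 15 + ["B"] * 30
p = membership(cell)   # P for class "A" is 15/45
h = entropy(cell)
```

Under this assumed form, the example cell has an entropy of about 0.64, whereas a cell on which all 45 outcomes agree has an entropy of zero.
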
Low entropy indicates a high degree of consensus between classifications, and a high
entropy value means there is very little consensus between classifications.

The entropy calculated from the 45 different landform
classifications generated for the Banks Peninsula area is shown in Figure 5.11.

The entropy values show that when the classes are general
there is more agreement, but as the classes become more specific there is less agreement.
It is interesting to speculate whether this reflects consensus in society. Are people more
likely to agree that a particular landform is a mountain but less likely to agree whether
the mountain is high or low? Entropy appears useful for evaluating landscape
classifications and their application. For instance, one use for a landscape
classification is as a frame of reference for psychophysical landscape assessment, as
discussed in section 2.5.1. It would be appropriate if the photos for the public
preference surveys were taken of areas where there is agreement over their classification.
Entropy provides this information.

Figure 5.11

The entropy values shown in Figure
5.11 are not specific to any one classification. They provide
general information about a particular area. However, it is possible to
provide consensus information that is specific to one classification.
If a definitive classification is agreed upon (and perhaps this will be
a majority classification) then it will be appropriate that consensus
information is obtained that is specific to that classification. This
can be done by again using the "equal to" function to count
how many of the 45 classifications equal a suggested definition for each
cell. If the majority classification, as shown in Figure
5.9, is accepted as the definitive classification then the
amount of agreement between this and the 45 different landform classifications
can be calculated. The result is shown in Figure
5.12. It can be argued that this approach (which will now be referred
to as the agreement model) is better than the use of entropy. The agreement
model is easier to understand and to implement within GIS. On the other
hand, entropy does provide additional information about all the other
possible classes that could be assigned to a given area.

This application of fuzzy set theory is simpler than that
used by Burrough (1989) and Burrough et al. (1992) for soil classification. Nevertheless,
it is still an effective application. Burrough et al.'s (1992) approach is more complex
because it considers the probability of the different parameter settings that produce the
possible outcomes, whereas in this study, the probability of the different parameter
settings is assumed to be equal. This assumption is necessary because it is not known what
the probability of the different settings should be. Perhaps some settings, such as 14
percent slope, are unlikely to agree with public perception, and this should be
incorporated in the process by assigning this parameter setting a low probability. This
application is simpler also because it uses simulation to determine membership rather than
complex mathematical calculations. It should be remembered that the results from these
fuzzy set theory applications, presented previously, do not express the statistical
probability of a class. The results can only be used as a relative indication of the
probability of different classes.

5.4.3 The effects of cell size on the classification process
The effects of using different cell sizes
on the process were also investigated, and produced some interesting results.
Figure 5.13 shows that different
cell sizes have a significant effect on the resulting landform classification.
Over the whole study area, the agreement between 200m and 500m cell size
for the landform classes was 90%, although for Banks Peninsula it was
only 61%. The reason for this effect of cell size was investigated by
visualizing, for each cell size, the individual stages of the process.
Figure 5.14 and Figure
5.15 show the process for 100m and 1000m cell sizes respectively.
It is apparent that it is the variation in the slope classes that is
causing most of the variation in the output. Figure
5.16 and Figure 5.17
show the effect of cell size on slope classes (70% agreement between 100m
and 1000m cell sizes for Banks Peninsula) and "mean slope" (54%
agreement between 100m and 1000m cell sizes for Banks Peninsula) respectively.
The reason for this variation in slope with different cell sizes becomes
apparent when the cells are examined in relation to the contours and TIN
lines (Figure 5.18). With this automated
process the DEM is produced from the TIN coverage. The DEM is then used
to determine slope by using a neighbourhood function that compares the
heights of the neighbouring cells and then calculates slope. From Figure
5.18, it is clear that as the cell size is increased the detail
in the topography is being lost. With a 100m cell size, non-macro topography
is being identified, such as flat spots on spurs and ridge tops, and small
steep sections. With the larger cell sizes, such topography is being lost
and it even appears that detail at the macro scale is being lost as well.
This difference is thus affecting the "mean slope" (Figure
5.17). This effect depends on the presence or absence of different
scales of topography, and whether this topography consists of flat objects
or steep objects. It illustrates the scale dependency of slope that Dymond
and Harmsworth (1994), and Moore et al. (1993) have also illustrated.
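
The loss of small-scale slope detail with increasing cell size can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the terrain is a hypothetical 2 percent ramp with 15m ripples every 400m, slope is taken from central differences between neighbouring cells rather than from the ARC/INFO slope function, and an 8 percent threshold separates flat from non-flat cells.

```python
import math

def elevation(x, y):
    """Hypothetical terrain: a 2% ramp plus 15 m ripples every 400 m."""
    return 0.02 * x + 15.0 * math.sin(2 * math.pi * x / 400.0)

def sample_dem(cell, extent=8000):
    """Sample the terrain on a square grid with the given cell size (m)."""
    n = extent // cell + 1
    return [[elevation(c * cell, r * cell) for c in range(n)] for r in range(n)]

def slope_percent(dem, cell):
    """Slope (percent) of interior cells from central differences."""
    rows, cols = len(dem), len(dem[0])
    out = []
    for r in range(1, rows - 1):
        row = []
        for c in range(1, cols - 1):
            dzdx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell)
            dzdy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell)
            row.append(100.0 * math.hypot(dzdx, dzdy))
        out.append(row)
    return out

def flat_fraction(cell, threshold=8.0):
    """Share of interior cells whose slope is below the threshold."""
    slopes = [s for row in slope_percent(sample_dem(cell), cell) for s in row]
    return sum(s < threshold for s in slopes) / len(slopes)

# Flat share of the same terrain at three cell sizes.
fractions = {cell: flat_fraction(cell) for cell in (100, 500, 1000)}
```

With this terrain roughly half of the interior cells are flat at a 100m cell size, because the ripples are resolved as alternating flat spots and steep sections, whereas at 500m and 1000m every cell falls below the threshold.
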

5.4.4 Slope - the elusive parameter
Slope is a critical parameter for identifying landforms
and is used in manual methods as well as in automated methods. Yet slope is difficult to
measure objectively. Measuring slope objectively using manual techniques in the field
usually requires that a scale be specified by choosing a particular slope length.
Calculating the mean slope using a slope length of one metre will give a different result
to using a slope length of one kilometre. It is also necessary to specify where these
slope lengths begin and finish. For practical reasons, manual methods for calculating the
mean slope of an area have not been explicit, and so it is difficult to automate these
using GIS.

A comparison was made between GIS generated slope
measurements and manual slope measurements for the whole of the study area. The LRI
contains manually measured slope information classed into intervals for areal units. The
LRI slope information was reclassed as flat if it contained a slope interval less than 12
percent, otherwise it was reclassed as non-flat. It was then stored as a 200m resolution
GIS layer. For comparison, a GIS generated slope coverage was produced from a 200m cell
size DEM. From this, a range of flat/non-flat coverages was produced based on the following
thresholds: 1, 2, 4, 6, 8, and 12 percent. These were then compared with the classified
LRI slope coverage, by calculating the amount of agreement (number of cells classified the
same). The agreements for the different slope thresholds were as follows:

Slope threshold (%)    Percentage agreement
1                      87
2                      88
4                      88
6                      88
8                      87
12                     84

These agreement figures appear to be quite high but they
actually reflect quite significant differences between manual and GIS slope measurements.
The analysis was done on very general slope classes (just two classes) and these classes
have a dramatic effect on the classification outcome. If two classifications were derived
for the study area and they both used a 12 percent threshold but one was based on the GIS
slope measurements and the other on the LRI data, then only 84% of the area in the
classifications would be in agreement (i.e. 16% would be different). This analysis shows
that it is unwise to take slope thresholds based on manual measurement and use them in
classifications based on GIS measurement. The GIS slope measurements used in this study
and by Dikau et al. (1991) are not flawed; they are just obtained differently.

If slope information from the LRI is used
in the process then the "mean slope" is relatively stable with
different cell sizes, as shown in Figure
5.19. There is 98% agreement between 100m and 1000m cell sizes for
Banks Peninsula. It is apparent from a comparison of Figure
5.17 and Figure 5.19 that
using LRI slope information provides a more stable result in relation
to cell size than using the DEM derived slope information. The slope information
in the LRI is obtained from field measurements that are determined at
a macro scale. This information is stored in a polygon coverage. Because
these polygons are large, detail is not lost when these polygons are converted
to grids, even with large cell sizes. The problem with using the LRI is
that the slope information for each areal unit is given as an interval.
If the terrain within the areal unit is variable then this slope interval
may be large. There can also be more than one slope interval given for
an areal unit. It can therefore be difficult to determine if the slope
of an areal unit is above or below the slope criterion. With the LRI data
it was assumed that an areal unit was "not flat" if it contained
a slope interval that extended above the critical slope threshold of 8%,
and because slope information is stored in intervals this resulted in
a 12% threshold being used. It should be noted that the LRI may be inconsistent
because of the difficulties in determining a totally explicit field method
for calculating slope, and that not all countries have access to such
databases.

As demonstrated in the previous section, the "mean
slope" determined from DEMs changes considerably when the cell size is changed. How
do we know what is the best cell size to use? Also, is it desirable to have a process that
is dependent on a particular cell size? What happens if an accurate DEM with 200m cell
size is not available? Alternative methods for automatically calculating slope were
therefore investigated.

Instead of calculating slope from a DEM
it is possible to derive slope from a TIN (based on the slope of the triangle
facets), and then convert this slope information directly to a grid coverage.
Figure 5.20 shows the effect
of different cell sizes on slope obtained directly from a TIN. There is
53% agreement in slope classes between 100m and 1000m cell size for Banks
Peninsula. There are some obvious differences between this figure
and Figure 5.16, where slope
is obtained directly from a DEM, especially with larger cell sizes. The
slope calculated directly from a TIN is still very sensitive to cell size
because of the effects of micro topography. The TIN identifies micro relief
objects but these are generalised when converted to a grid coverage. The
degree of generalisation depends on the cell size used. The use of a TIN
therefore does not solve the problem.

Another alternative method for
determining "mean slope" that reduces the effect of micro relief and is less
sensitive to changes in cell size is to first remove small flat areas from the slope class
grid before the "mean slope" is calculated (slope can be calculated from either
a DEM or directly from a TIN). Small flat areas can easily be identified by their size.
From the definition for macro landform size given by Dikau (1989), this threshold size
should be 10 square kilometres. Once identified, these flat areas can be converted to
non-flat areas. This approach is implemented in the following section.

5.5 A new automated landform classification process
As previously mentioned, Dikau et al.'s (1991)
classification process has certain problems: it produces a progressive
zonation when landform changes from plains to relief, it does not distinguish open valleys
from a plains-mountain interface, and it is affected by micro relief. A new process was
therefore developed that partly solves these problems. This process was developed using a
500m cell size to ensure the processing time was not too great. It will be demonstrated
that the outcome is not severely affected by cell size.

Figure 5.21 and Figure 5.22 show
the different steps in the first phase of the process, which in summary
produces three classifications of landform:
1) a set of six relief types, 2) a division of "flat" types into open valley
and plain, and 3) identification of a special class of tableland within
the "plain" type.

Starting with a DEM, a slope grid was derived as in
Dikau et al.'s (1991) process, and this was classified according to slope. However, a 4
percent threshold was used instead of an 8% threshold to distinguish the low gradient
cells. The reason for this is discussed later. Any small flat areas that were less than 10
square kilometres in size were then converted to non-flat areas to produce a "macro
slope classes" grid. The next three steps identified open valleys. An open valley is
a large flat area that has relief on opposite sides. This pattern was identified using an
expand and shrink sequence (as used for identifying indented coastlines in the previous
chapter). Areas identified as non-flat were expanded by 3000 metres (with a 500m cell size
this corresponds to six cells), and then shrunk by 3000m. The effect of these two steps
was that flat enclosed and semi-enclosed areas (open valleys) became non-flat. Open
valleys were then identified by using a conditional statement on the "macro-slope
classes" grid and the "shrunken" grid. That is, if a cell was flat in the
"macro-slope classes" grid and was not flat in the "shrunken" grid then it
was classed as an open valley. For an area to remain classified as an open valley, it also
had to be more than 10 square kilometres in size. A conditional statement was used for
this.

Relative relief was determined, as in Dikau et al.'s (1991)
process, by using a focal range function. For areas that were previously identified as
non-flat, the relative relief was classified into five classes to produce a relief type
grid. The relief classes were:

0-150m - Low hills
150-600m - Hills
600-900m - High hills
900-1500m - Mountains
Above 1500m - High mountains

These relative relief classes are slightly different to
those used by Dikau et al. They are intended to reflect how New Zealanders conceptualise
terrain in New Zealand, although there is no substantive evidence to suggest how this is.
The Banks Peninsula region is classified as high hills by Glasson (1991) in a visual
assessment study. A relative relief interval of 600-900m achieves this. Two mountain
classes are recognised, distinguishing the grander mountains, which often have permanent
snow and bare rock, from the others. It should be noted that flat cells defined by
gradient were maintained as flat areas even though some had high relative relief
neighbourhoods.

Tablelands were identified from upland and lowland
profiles and these profiles were identified in a similar way to Dikau et al.'s process.
However, the actual identification of tablelands was simpler than Dikau et al.'s because
"profile percent" classes were not used. Instead, if an area was upland and flat
in the macro-slope coverage, then it was identified as a tableland. No tablelands were
identified in the whole region using this process.

A coverage that has the potential to identify
eight morphological landform classes (five relief types, plains, open
valley, and tableland) was then produced by overlaying the maps of relief
types, open valleys, and tablelands. Figure
5.23 shows this for the whole study area. This landform components
map cannot be used in a landscape classification in this form because
it does not contain composition classes, but instead identifies the sharp
boundaries between different landform types (e.g. plains and mountains).
However, it could be used for other purposes (e.g. climate and hydrology
modelling).

Once the landforms had been conceptualised,
the second phase of the landform classification could commence. Landform
compositions were identified in a similar way to that used for the landcover
attributes. Each of the eight landform components was singled out into
an individual grid, with the value 100 assigned to cells where the particular
component is present, and the value zero where it is not. A focal mean
function, with a 3000m radius NAW, was then applied to each component
grid, and these mean values were placed into one of four class intervals
(the results are shown in Figure 5.24
and Figure 5.25). These eight
spatial influence grids were then overlaid to produce a new grid that
contained unique combinations of them (a vector representation is shown
in Figure 5.25). Since eight
grids were combined and each had the possibility of four different classes,
the combined grid had the possibility of 65,536 (4^8) unique classes. However,
there were only 613 unique combinations in the study area. Twenty-two
landform classes were then identified by querying this combined coverage.
The classes are listed in Table 5.5
under level 1, and the definitions used to identify them are described
in Appendix 4. The classes have been chosen because of their distinctiveness
in form, and to a certain extent reflect the classes used by past classifications.
Checks were made to ensure that the definitions were mutually exclusive
and exhaustive as described in section 4.2.3. Not all these landforms
existed in the study area. The resulting level 1 classification is shown
in Figure 5.26.

In deriving a landform component map, several
parameter thresholds had to be determined: a 4 percent slope, a 6000m maximum
valley width criterion, and, as already discussed, the various relative relief
classes. A slope of 4 percent was used for distinguishing flat and non-flat
areas. This differs from Hammond's 8 percent, which was also adopted by
Dikau et al. (1991). As discussed in section 5.4.4, using DEMs to derive slope
produces a different result compared to using field measurements. Therefore
it is likely that a different slope threshold is needed with automation
compared to Hammond's method. The effects of different slope thresholds
were investigated by implementing the process with each of them
(Figure 5.27). The amount
of agreement between the use of a 1% slope threshold and an 8% threshold
is 67%. With 8 percent, 7,528 more cells were classed as plains or open
valleys than with 1 percent. The opposite occurred for the classes containing
relief. Low hills and hills are virtually absent with 8 percent, and the
non relief classes extend well into areas that can be regarded as relief.
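
The grid operations behind the first phase described earlier in this section (removing small flat areas, then expanding and shrinking the non-flat areas to find open valleys) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the ARC/INFO implementation: the function names are hypothetical, patches are taken to be 4-connected, and the expand and shrink steps use a square window. At a 500m cell size, 10 square kilometres corresponds to 40 cells and 3000m to six cells; the small example uses smaller values so that it stays readable.

```python
from collections import deque

def patches(mask):
    """Yield each 4-connected patch of True cells as a list of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue, patch = deque([(r0, c0)]), []
            while queue:
                r, c = queue.popleft()
                patch.append((r, c))
                for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= rr < rows and 0 <= cc < cols
                            and mask[rr][cc] and not seen[rr][cc]):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            yield patch

def drop_small_flats(flat, min_cells):
    """Reclass flat patches smaller than min_cells as non-flat."""
    out = [row[:] for row in flat]
    for patch in patches(flat):
        if len(patch) < min_cells:
            for r, c in patch:
                out[r][c] = False
    return out

def expand(mask, k):
    """Grow True cells by k cells in every direction (square window)."""
    rows, cols = len(mask), len(mask[0])
    return [[any(mask[rr][cc]
                 for rr in range(max(0, r - k), min(rows, r + k + 1))
                 for cc in range(max(0, c - k), min(cols, c + k + 1)))
             for c in range(cols)] for r in range(rows)]

def shrink(mask, k):
    """Shrink True cells by k cells, i.e. grow the False cells instead."""
    inverted = [[not v for v in row] for row in mask]
    return [[not v for v in row] for row in expand(inverted, k)]

def open_valleys(flat_macro, k):
    """Flat cells that turn non-flat after expanding then shrinking relief."""
    non_flat = [[not v for v in row] for row in flat_macro]
    closed = shrink(expand(non_flat, k), k)
    return [[f and nf for f, nf in zip(frow, crow)]
            for frow, crow in zip(flat_macro, closed)]

# Hypothetical strip: relief, a 4-cell flat gap, relief, then a wide plain.
row = [False] * 3 + [True] * 4 + [False] * 3 + [True] * 10
flat = [row[:] for _ in range(3)]
macro = drop_small_flats(flat, min_cells=5)
valleys = open_valleys(macro, k=3)
```

In this strip the 4-cell gap between the two blocks of relief is flagged as open valley, while the wide plain on the right is not, because relief expanded by 3 cells cannot bridge it.
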
A comparison was made between the resulting slope classes
and the slope information in the LRI (similar to that shown in section 5.4.4 but this time
using a 500m cell size). As previously discussed, the LRI slope information is based on
areal units, slope is given in class intervals, and occasionally more than one interval is
given to an areal unit. Despite this, it still provides the best available representation
of slope for which a comparison can be made. A slope interval of 0-7 degrees (based on LRI
intervals of 1-3 and 4-7) was used to represent flat areas. The 4 percent threshold
produced a slope class grid that had the highest agreement with the LRI (91%). The slope
threshold of 1 percent and 8 percent both had agreements of only 88%. Four percent
therefore seems an appropriate threshold. Even when 4 percent was compared with the LRI
slope interval of 1-3 degrees, the agreement was still high (90%). Although hills are not
very well represented with a 4 percent threshold, it appears more suitable for identifying
the extent of open valleys.

A 6000m maximum valley width threshold was decided upon
by assessing the effects of different width criteria. Valley widths vary considerably and
topographic maps show that these can be 5000m in the Rangitata catchment. To be sure all
such valleys were identified, 6000m was decided upon (this was achieved by using an expand
and shrink of 3000m). If the maximum valley width criterion is set too high then some
large basins become identified as open valleys.

The landform classification can be easily
generalised by grouping different classes. This was done to produce six
different levels of generalisation. The way the different classes were
grouped is shown in Table 5.5.
Figure 5.28 shows graphically the
effect of different levels of generalisation. No keys are provided with
this figure to avoid clutter, but the colours are the same as used in
Figure 5.26 and the keys can be ascertained
by using this and Table 5.5. As with
the level 1 classes, the classes in levels 2-6 have been
chosen because of their distinctiveness in form. At the more general levels
this distinctiveness needs to be more apparent.

This new process produces a landform classification that
does not have the same problems as that developed by Dikau et al. (1991). The interface
between relief and plains is not identified as a progressive zonation, valley floors are
distinguished, and micro relief does not significantly alter the outcome. Cell size,
however, still affects the classification. There is 89% agreement between level 1
classifications based on 200m and 500m cell sizes. This is similar to the 90% found for
Dikau et al.'s landform classes. However, for a comparison between this new process and
Dikau et al.'s to be valid, it needs to be done at a similar level of generalisation. For
level 3, which has a similar number of classes as Dikau et al.'s landform types, there is
93% agreement between 200m and 500m cell sizes. Cell size is still affecting the calculation
of slope classes with this new process, despite the removal of small flat areas. Slope
classes particularly affect the boundaries of large open valleys that gradually get
steeper and therefore do not have a distinct boundary.

What this classification identifies as
open valleys perhaps does not agree with how most people conceptualize
valleys. The definition of an open valley as a large flat area that has
non-flat areas on opposite sides, is perhaps too simple. People often
associate rivers with valleys, so perhaps a river must be in the vicinity.
This could be incorporated in the classification process. Another issue
is that where there is an isolated hill surrounded by flat areas, the
flat area between the hill and a nearby non-flat area becomes identified
as a valley. This can be seen in Figure
5.26 on the edge of the Canterbury Plains. This is a problem
with the process. One may also think that the maximum width of a valley
should be determined by how high the surrounding relief is. For example,
in the head of the Rangitata catchment the relief is very high, so although
the flat areas are very wide (5 km), one still gets an impression of being
in a valley. If the surrounding relief had been only low hills then this
area perhaps would not be conceptualised as a valley. This problem could
be solved with context dependent definitions that take the relative relief
into account, but this makes the process more complicated.

As with the components discussed in chapter 4, the use of
a 3000m search radius for determining the spatial influence of different components can
also be questioned. There has been no cognitive research that can be used for determining
what spatial influence different components of the landscape have on people's
conceptualisation of the landscape. One could argue that this figure should not be
constant for landforms. Some components, such as high mountains, have more spatial
influence than other components, such as low hills. The use of context dependent search
radii could also be incorporated into the process.

5.6 Summary
Automating landform classification is an interesting
challenge. It produces classifications that have a good resemblance to manual methods, and
because definitions are explicit they can be easily identified, questioned, and improved.
This has been demonstrated with Dikau et al.'s (1991) process. Several problems were
encountered when applying it to the study area: it produced a progressive zonation when
landform changes from plains to mountains; it did not distinguish open valleys from a
plains-mountain interface; and it was affected by micro relief. Also, the same slope
threshold was used as Hammond's even though slope was measured differently. Although
automating existing quantitative manual processes is an important step in the evolution of
automation, definitions may need to be calibrated. This is the case with slope
measurements. The effects of scale and generalisation also need special attention.

Dikau et al.'s (1991) process can be improved by adopting
a 4% slope threshold, removing non macro relief, identifying open valleys using an
expand/shrink sequence, using different relative relief classes, and by using spatial
influence information of each component to identify landform compositions. A new process
has been developed that adopts these improvements. There are opportunities for improving
the process further with the use of more context dependent definitions, and the
identification of particular distinctive landforms such as conical volcanos.

The equation for entropy of a cell is: