Methods
Overview
Again, the three components of the IBCD (BCD-RICH, BCD-AREA, and BCD-POP) are derived from five indicators of BCD:
• number of languages
• number of ethnic groups
• number of religions
• number of bird and mammal species (combined)
• number of plant species
Each of the three parts of the IBCD gives equal weight to cultural and biological diversity. For example, a country’s overall BCD-RICH score is calculated as the average of its cultural diversity richness score (aggregated from the scores for languages, religions, and ethnic groups) and its biological diversity richness score (aggregated from the scores for bird/mammal species and plant species). The same holds true for BCD-AREA and BCD-POP.
When values for these indicators are ranked on a global basis, it becomes apparent that biocultural diversity is not evenly distributed. A few countries are megadiverse, with very large values; then the ranking rapidly diminishes to much lower values found in more typical countries. Because this makes comparisons among countries difficult, we used a common log scale to produce a linear distribution.
For example, the language indicator index for BCD-RICH is calculated as the log of the number of languages spoken in a country divided by the log of the number of languages spoken worldwide. The process was repeated for the other four indicators to derive BCD-RICH.
As noted above, to compensate for the fact that large countries tend to have greater biological and cultural diversity than small ones simply because of their greater area (or greater population), we calculated two additional diversity values for each country by adjusting first for land area (BCD-AREA) and second for population size (BCD-POP). This was done by measuring how much more or less diverse a country is in comparison with an expected value based on its area or population alone. The method used is a modified version of that used by Groombridge and Jenkins (2002). The same procedure was applied to each of the five indicators to derive BCD-AREA and BCD-POP.
The expected diversity was calculated using the standard formula for the species–area relationship log S = c + z log A where S = number of species, A = area, and c and z are constants derived from observation. Because the distributions of the five indicators against land area and population size are similar, we applied the same formula to indicators of cultural diversity. Hence, for BCD-AREA expected log Ni = c + z log Ai where Ni = number of languages, religions, ethnic groups, or species in country i, and Ai = area of country i. The same formula was used for BCD-POP, except that Pi (population
of country i) replaces Ai. To find the values of the constants c and z for each of the indicators, we scatter-plotted log Ni (where Ni = number of languages, religions, ethnic groups, or species in country i) against log Ai for all countries, and drew the best-fit
straight line through the points.
To calculate the deviation of each country from its expected value, we subtracted the
expected log Ni value from the observed log Ni value. The index is calibrated such that
the world, or maximum, value is set equal to 1.0, the minimum value is set equal to zero,
and the average or typical value is close to 0.5 (meaning no more or less diverse than expected
given a country’s area or population).
Next, we describe the data sources of the indicators and relevant caveats to the data.
Cultural diversity indicators
Number of languages. Language data are derived from the 2000 edition of Ethnologue,
the standard reference list of the world’s languages. Other global language
compilations are either outdated, only treat the world’s larger languages, or else utilize a classification system that is not widely accepted by professional
linguists, at least as of yet.
Ethnologue is a country-by-country listing of languages. There are well over 7,000
entries in the 2000 edition, representing 6,809 unique languages, both living and extinct.
Each entry gives a main name for the language; a unique three-letter code to distinguish
languages with identical or similar names; alternative names by which the language was
or may still be known, whether in the local vernacular or the professional linguistic
literature (with derogatory names flagged); the number of mother-tongue speakers, if
known; the names of other countries in which the language is spoken, if any; known
dialects; linguistic information, such as language family affiliation, linguistic typology,
the availability of dictionaries and grammars; ecological information on the environment
and subsistence type of the speakers (for some languages); and, because the publishers
are a Christian missionary organization, information on the availability of the Bible in the
language. Ethnologue includes information on deaf sign languages, non-deaf sign
languages, pidgins and creoles, a few widely used artificial languages (e.g., Esperanto),
ritual and auxiliary languages that do not have mother-tongue speakers, and many
recently extinct languages. Ethnologue also includes qualitative information about
language endangerment. For example, an entry may contain statements to the effect that
children are no longer learning the language, or that all fluent speakers are over 50 years
old, or that the language has so few mother-tongue speakers left that it is “nearly extinct.”
The languages listed under each country heading in Ethnologue are either endemic
languages confined to that country alone, or other, non-endemic languages that the
editors consider of sufficient linguistic, political, or social interest to warrant an
individual entry under that country. For example, the main entry for English is under the
United Kingdom. Other, abbreviated entries for English, usually containing information
pertinent to the country (such as dialects spoken there) are found for many other countries
in which English is spoken, with cross-references to the main U.K. entry. In addition to
the individual language entries under each country heading, there is an introductory
paragraph for the country which includes (where applicable) a list of immigrant
languages spoken there. Only those immigrant languages that are still spoken in the
country of origin and that do not display significant dialect differences with the original
form are listed in this introductory paragraph. The editors caution that, for many
countries, the list is incomplete and in certain cases may be incorrect. Nonetheless, we
tallied the number of languages listed in the introductory paragraph and added it to the
number of languages listed individually under the country’s entry to arrive at a total
number of languages spoken in each country. Because the introductory list of immigrant
languages is often incomplete, the total number of languages reported here for each
country may be an undercount of the true number. This undercount is probably quite
significant for countries with large immigrant populations, such as the USA, where the
Ethnologue introductory list is certainly incomplete. On the other hand, for smaller
countries with less diverse immigrant populations, the number of languages reported here
may be accurate or only a slight undercount.
Number of religions. Data on religions are from the World Christian Encyclopedia,
second edition, a widely cited source for the numbers of religions,
denominations, and adherents worldwide. The authors define “religion” as “a grouping of
persons with beliefs about God or gods, and defined by its adherents’ loyalty to it, by
their acceptance of it as unique and superior to all other religions, and by its relative
autonomy”. The compilers of World Christian Encyclopedia
have tracked information on 19 major religions and related religious categories (such as
“nonreligious” and “atheist”) for more than 20 years and have time-series data going back to 1900 (on a global level).
Number of ethnic groups. Data on ethnic groups are also taken from the World
Christian Encyclopedia, second edition, primarily because it gives
detailed breakdowns for over 200 countries. The authors continue a classification system
published in the first edition of the Encyclopedia. In this system, the
world’s peoples are divided into 432 primary ethnolinguistic groups, each with an
identifying code. These codes are then applied to different groups at the national level to
produce 12,583 distinct ethnic groups; population numbers for each of these groups are
given and form the backbone of the Encyclopedia’s ethnolinguistic analysis.
Biological diversity indicators
Number of bird/mammal species; number of plant species. Data on bird/mammal
species richness and on plant species richness, the two indicators of biological diversity
used for the IBCD, are taken from Global Biodiversity: Earth’s Living Resources in the
21st Century. It lists the total numbers of bird,
mammal, and plant species recorded in each country, as well as the number of endemic
and threatened species in each of these three groups. Marine species are excluded. The
total for birds includes only those species which are known to breed in a particular
country, and not the total number recorded, which would include a number of nonbreeding
migrant and vagrant species and so inflate the species richness. Birds and
mammals, but not reptiles, amphibians, fish or invertebrates, are used to represent the
entire animal kingdom because only these two taxonomic groups have been extensively
surveyed and recorded in most countries. The other groups are less well studied, and the
numbers recorded in many countries are liable to change. However, any species richness
data should be regarded as being provisional, as the totals tend to change over time with
new surveys or changes to taxonomic classification. The number of plant species
recorded in a country is likely to change more than the number of bird and mammal
species, particularly in tropical countries, but plants are included here to give a more
balanced picture of biodiversity than would be given by looking at birds and mammals
alone. Currently, 9,946 bird species, 4,763 mammal species and 250,876 plant species
have been recorded worldwide.
Calculating the IBCD components
The IBCD gives equal weight to cultural and biological diversity. A country’s IBCD
value therefore is calculated as the average of its cultural diversity (CD) and its
biodiversity (BD), or:
IBCD = (CD + BD)/2
In measuring a country’s cultural diversity (CD), equal weight is given to linguistic,
religious, and ethnic diversity. Therefore CD is calculated as the average of a country’s
language diversity (LD), religion diversity (RD), and ethnic group diversity (ED):
CD = (LD + RD + ED)/3
In measuring biodiversity (BD), equal weight is given to animal species diversity (using
birds and mammals as a proxy for all animal species) and plant species diversity.
Therefore BD is calculated as the average of a country’s bird and mammal species
diversity (MD), and plant species diversity (PD):
BD = (MD + PD)/2
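The nested averaging above can be sketched in Python as follows. This is a minimal illustration rather than the authors' own code; the function name `ibcd` and the sample scores are hypothetical.

```python
def ibcd(ld, rd, ed, md, pd):
    """Combine five indicator scores (each on a 0-1 scale) into one IBCD value."""
    cd = (ld + rd + ed) / 3   # cultural diversity: languages, religions, ethnic groups
    bd = (md + pd) / 2        # biodiversity: birds/mammals, plants
    return (cd + bd) / 2      # equal weight to culture and biology


# Hypothetical indicator scores, for illustration only.
score = ibcd(ld=0.6, rd=0.3, ed=0.6, md=0.7, pd=0.5)
print(round(score, 3))
```

One consequence of this design is that, because CD averages three indicators and BD averages two, each cultural indicator carries a weight of 1/6 in the final value while each biological indicator carries 1/4.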
To calculate CD and BD, we first took the logarithms of the richness values for each indicator in each country. For example, the number of languages (L) spoken in
Afghanistan is 49; log L = 1.69.
We used logarithms because BCD—whether measured using species, languages, ethnic
groups, or religions—is not distributed evenly around the world. If one ranks countries
according to the richness of any of the BCD indicators considered here, one gets a few
very large values recorded in the world’s megadiverse countries, rapidly diminishing to
lower values found in more typical countries. For example, 833 of the world’s 6,800
languages (12%) are spoken in just one country, Papua New Guinea. The average number
of languages spoken in the 229 countries and territories for which we have data is 45,
Mali being an average country in this respect. But only in a minority of the world’s
countries (50) are there 45 or more languages spoken, whereas in the majority of
countries (179) there are fewer than 45 languages spoken. In almost half of those
countries (85) there are fewer than 10 languages spoken. In other words, a few countries
hold a disproportionately large share of the world’s linguistic diversity. This is a well-known
statistical pattern, called a logarithmic distribution, and it applies equally well to
the distribution of species, ethnic groups, and religions among countries. To adjust for
this, we used a logarithmic scale, the common log scale, to rank countries in the index.
This results in a linear distribution of the index values.
Applying a common log scale essentially compresses a large range of values down to a
manageable range. For example, as we noted above, the maximum number of languages
spoken in one country is 833, the average number of languages spoken is 45, while
several countries share the minimum value of 1 language only. Taking the common logs
of these three numbers (log 833 = 2.92; log 45 = 1.65; log 1 = 0) gives us a scale of 2.92
to 0 instead of 833 to 1 (see examples in the table below). Because the common log scale
smooths out the skewed distribution, the values, when
compared with one another, fall into a much more even (linear) pattern.
Representative Log L values

| Country/Territory | No. languages spoken (L) | Log L | Language diversity index LD-RICH (log Li/log Lworld) |
| --- | --- | --- | --- |
| World | 6,800 | 3.83 | 1.000 |
| Papua New Guinea (highest) | 833 | 2.92 | 0.762 |
| Mali (average) | 45 | 1.65 | 0.431 |
| Bermuda (lowest) | 1 | 0.00 | 0.000 |
Calculating IBCD-RICH. To generate the raw richness component of the index (IBCD-RICH),
we compared each country’s value with the global richness value. For example,
staying with language diversity, the index is calculated as the log of the number of
languages spoken in a country divided by the log of the number of languages spoken
worldwide. The total number of languages currently spoken is 6,800 (log 6,800 = 3.83). Hence the formula we used is:
XX-RICH = log Ni / log Nworld
where XX = LD, RD, ED, MD, or PD;
Ni = number of languages, religions, ethnic groups, or species in country i; and
Nworld = the actual observed number of languages, religions, ethnic groups, or species in the world.
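As a check on the richness formula, the short Python sketch below recomputes log L and LD-RICH for the three example countries, using the world total of 6,800 languages quoted in the text. The function name `richness_index` is illustrative only.

```python
import math

WORLD_LANGUAGES = 6800  # world total quoted in the text


def richness_index(n_country, n_world):
    """XX-RICH = log10(N_i) / log10(N_world)."""
    return math.log10(n_country) / math.log10(n_world)


for name, n in [("Papua New Guinea", 833), ("Mali", 45), ("Bermuda", 1)]:
    log_l = math.log10(n)
    ld_rich = richness_index(n, WORLD_LANGUAGES)
    print(f"{name}: log L = {log_l:.2f}, LD-RICH = {ld_rich:.3f}")
```

Rounded to three decimals, the results match the LD-RICH values given for these countries (0.762, 0.431, and 0.000). Note that `math.log10` raises a `ValueError` for a count of zero; the text does not say how such a case would be handled.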
Calculating IBCD-AREA. To compensate for the fact that large countries tend to have a
greater cultural and biological diversity than small ones simply because of their greater
area, a second component of the IBCD adjusts the BCD value for each country by
accounting for its land area. This was done by calculating how much more or less diverse
a country is in comparison with an expected value based on its area alone. For example, if
they were typical for their respective land areas, one would expect about 36 or 37
languages to be spoken in Papua New Guinea and about 50 languages to be spoken in
Mali. Therefore, based on their areas, Papua New Guinea is vastly more diverse than one
would expect, whereas Mali is slightly less diverse than one would expect. The method is
a modified version of that used by Groombridge and Jenkins (2002).
The expected diversity of a country is derived from the species-area relationship, which comes from ecological theory:
log S = c + z log A
where S = number of species;
A = area; and
c and z are constants.
The formula simply states that the log of the number of species present in a country or
territory increases in proportion with the log of the area of the country or territory. The
constants c and z can be derived by observation. We have applied the same formula to
indicators of cultural diversity, hence:
expected log Ni = c + z log Ai
where Ni = number of languages, religions, ethnic groups, or species in country i;
Ai = area of country i; and
c and z are constants.
Strictly speaking, the species-area formula applies only to biodiversity. However, we
used it to make area adjustments for the cultural indicators because, as noted above, the
global distribution patterns of richness in cultural diversity and biodiversity are similar.
To find the values of c and z for each of the indicators used in the IBCD-AREA analysis,
we scatter-plotted log Ni against log Ai for all countries, and drew the best-fit straight line
through the scatter; z is the slope of the line and c is the point where it intersects the y-axis. To calculate the deviation of each country from its expected value we simply
subtracted the expected log Ni value from the observed log Ni value.
Deviation from expected value = log Ni – expected log Ni
or log Ni – (c + z log Ai)
This gives a series of values for each country where a score of 0 means that the country is
exactly as diverse as one would expect based on its area, a score of 1 means it is ten times
more diverse, a score of 2 means it is a hundred times more diverse, a score of -1 means
it is ten times less diverse, a score of -2 a hundred times less, and so on.
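The fitting and deviation steps can be sketched as follows. The text says only that a best-fit straight line was drawn; ordinary least squares is assumed here, and the country data are invented purely for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = c + z * x; returns (c, z)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx              # slope of the best-fit line
    c = mean_y - z * mean_x    # intercept on the y-axis
    return c, z


# Hypothetical (area in km2, number of languages) pairs, NOT real data.
countries = {"A": (1_000, 3), "B": (10_000, 20), "C": (100_000, 12), "D": (1_000_000, 90)}

log_area = [math.log10(a) for a, _ in countries.values()]
log_n = [math.log10(n) for _, n in countries.values()]
c, z = fit_line(log_area, log_n)

# Deviation = observed log N minus expected log N for each country.
deviation = {
    name: math.log10(n) - (c + z * math.log10(a))
    for name, (a, n) in countries.items()
}
for name, d in deviation.items():
    print(f"{name}: deviation = {d:+.2f} ({10 ** d:.2f} times the expected value)")
```

Because the deviation is a difference of base-10 logarithms, raising 10 to that power converts it back into a ratio of observed to expected richness, which is the "ten times more diverse" reading described above.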
The index is calculated such that the global value is equal to 1.0 and the minimum value
is zero. The global value for each of the five measures is also the maximum value, or, put
another way, the world as a whole is more diverse than any country, even after adjusting
for land area. The minimum value was selected by choosing a value below that of any
country. Hence the formula used to calculate a country’s area-adjusted diversity value for
each of the five indicators was:
XX-AREA = (Di - Dmin) / (Dmax - Dmin)
where
Di = observed log Ni – expected log Ni;
Dmin = a value below that of the least diverse country; and
Dmax = Dworld, the actual observed value for the entire world.
Examples of IBCD-AREA methodology using language diversity data

| Country | Area (A) [thousand km²] | Log A | No. languages spoken (L) | Log L | Expected log L value | Deviation from expected value (D) | Language diversity index LD-AREA |
| --- | --- | --- | --- | --- | --- | --- | --- |
| World | 136,605 | 8.14 | 6,800 | 3.83 | 2.33 | 1.50 | 1.00 |
| PNG | 463 | 5.67 | 833 | 2.92 | 1.56 | 1.36 | 0.954 |
| Turkmenistan | 488 | 5.69 | 37 | 1.57 | 1.57 | 0.00 | 0.525 |
| Greenland | 2,176 | 6.34 | 2 | 0.30 | 1.77 | -1.47 | 0.062 |
| Minimum Value | 1,000 | 6.00 | 1 | 0.00 | 1.67 | -1.67 | 0.000 |
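The rescaling step can be checked against the table with a few lines of Python. This is a sketch only: the function name `rescale` is illustrative, and the inputs are the two-decimal deviations printed in the table, so the results agree with the published index values only to within a few thousandths, presumably because the authors worked from unrounded figures.

```python
def rescale(d, d_min, d_max):
    """Map a deviation onto the 0-1 index scale: (D - Dmin) / (Dmax - Dmin)."""
    return (d - d_min) / (d_max - d_min)


D_MAX = 1.50   # deviation of the world as a whole, from the table
D_MIN = -1.67  # chosen minimum value, from the table

# Deviations (D) for the example countries, as printed in the table.
deviations = {"PNG": 1.36, "Turkmenistan": 0.00, "Greenland": -1.47}
for name, d in deviations.items():
    print(f"{name}: LD-AREA = {rescale(d, D_MIN, D_MAX):.3f}")
```

A country with a deviation of zero lands at -Dmin / (Dmax - Dmin), which is close to but not exactly 0.5 unless Dmin and Dmax are symmetric about zero.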
Calculating IBCD-POP. Finally, a third component of the index, IBCD-POP, compensates for the fact that more populous countries tend to have greater cultural
diversity than less populous ones because of greater population size. This was done in the same
way as compensating for area, by calculating deviation from an expected value based on
population size alone, using the formula:
expected log Ni = c + z log Pi
where Ni = number of languages, religions, ethnic groups or species in country i;
Pi = population of country i; and
c and z are constants.
To calculate c and z, we scatter-plotted log Ni against log Pi for all countries, and added
the best-fit straight line; z is the slope of the line and c is the point where it intersects the
y-axis. To calculate the deviation from the expected value we simply subtracted the
expected log Ni value from the observed log Ni value.
The formula used to calculate a country’s population-adjusted value for each of the five indicators was the same as that used to calculate the area-adjusted value:
XX-POP = (Di - Dmin) / (Dmax - Dmin)
where
Di = observed log Ni – expected log Ni;
Dmin = a value below that of the least diverse country; and
Dmax = Dworld, the actual observed value for the entire world.
However, unlike IBCD-AREA, for some component indicators the value of Dmax in
IBCD-POP was not equal to Dworld, and so an arbitrary maximum value was chosen at a
level greater than that of the most diverse country.
Examples of IBCD-POP methodology using language diversity data

| Country | Population (P) [thousand] | Log P | No. languages spoken (L) | Log L | Expected log L value | Deviation from expected value (D) | Language diversity index LD-POP |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Max value | 6,056,710 | 6.78 | 12,000 | 4.08 | 2.48 | 1.60 | 1.000 |
| PNG | 4,809 | 3.68 | 833 | 2.92 | 1.34 | 1.58 | 0.994 |
| Pakistan | 141,256 | 5.15 | 76 | 1.88 | 1.88 | 0.00 | 0.477 |
| Korea, DPR | 22,268 | 4.35 | 2 | 0.30 | 1.58 | -1.28 | 0.057 |
| Minimum Value | 10,000 | 4.00 | 1 | 0.00 | 1.46 | -1.46 | 0.000 |
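The population-adjusted calculation can be traced from the raw table columns in Python, as sketched below. The expected log L values are copied from the table rather than recomputed, since the constants c and z are not stated, and the variable names are illustrative. Judging from the figures, the Log P column is the logarithm of population expressed in thousands (for example, log 4,809 is about 3.68).

```python
import math

# (population in thousands, languages spoken, expected log L), copied from the table.
rows = {
    "PNG": (4_809, 833, 1.34),
    "Pakistan": (141_256, 76, 1.88),
    "Korea, DPR": (22_268, 2, 1.58),
}
D_MAX = 1.60   # arbitrary maximum, from the "Max value" row
D_MIN = -1.46  # chosen minimum, from the "Minimum Value" row

results = {}
for name, (pop_thousands, n_lang, expected_log_l) in rows.items():
    log_p = math.log10(pop_thousands)            # matches the Log P column
    d = math.log10(n_lang) - expected_log_l      # deviation from expected value
    results[name] = (d - D_MIN) / (D_MAX - D_MIN)  # LD-POP
    print(f"{name}: log P = {log_p:.2f}, D = {d:+.2f}, LD-POP = {results[name]:.3f}")
```

Because the expected values are rounded to two decimals, the computed indices agree with the table only to within a few thousandths. Here Dmax is the arbitrary ceiling described in the text rather than an observed world value.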
Missing data. If data are missing for a particular indicator (LD, RD, ED, MD, or PD)
within a component’s cultural diversity (CD) or biodiversity (BD) parts, the remaining
indicators are used to calculate that country’s IBCD. For example, if a country is missing
data for religion and ethnic groups, we used only languages to calculate the CD part of its
IBCD; if data are missing for plants, we used only birds/mammals to calculate its BD
part; and so on. However, a country must have at least one cultural diversity component
and one biodiversity component to calculate its IBCD. Notes to the tables show which
data are included for each country.
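The missing-data rule can be sketched as follows. This is a hypothetical illustration, not the authors' code: missing indicators are represented by `None`, and the function names and sample scores are invented.

```python
def mean_available(values):
    """Average the values that are present (not None); return None if all are missing."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def ibcd_with_gaps(ld=None, rd=None, ed=None, md=None, pd=None):
    """IBCD from whichever indicators are available.

    Requires at least one cultural indicator (LD, RD, ED) and at least one
    biological indicator (MD, PD), as the text specifies.
    """
    cd = mean_available([ld, rd, ed])
    bd = mean_available([md, pd])
    if cd is None or bd is None:
        raise ValueError("need at least one cultural and one biological indicator")
    return (cd + bd) / 2


# Hypothetical scores: religion, ethnic-group, and plant data are missing.
print(ibcd_with_gaps(ld=0.6, md=0.4))
```

Under this rule a country's CD or BD part may rest on a single indicator, so values for countries with gaps are not strictly comparable with those computed from all five indicators.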
Additional remarks on the methodology. As noted above, for some indicators the
maximum and minimum values used to fix the top and bottom of the scale are theoretical
rather than observed. The drawback of using theoretical maxima and minima is precisely
that they are theoretical; hence it can always be argued that they are arbitrary to some
degree. Also, it is possible that the posited values could be superseded by new
information or different interpretations of existing data. For example, the number of
languages reported in Ethnologue has increased from edition to edition, not because
previously unknown languages are found (although a few have been), but because the
editors have reclassified dialects as separate languages based on new information and
interpretations from field linguists. It has also been argued that there is a very large
number (possibly several thousand) of unreported deaf sign languages that need to be added to
the world’s total. Therefore, it is possible the theoretical
maximum number of languages used here will become obsolete when the next edition of
Ethnologue appears. If so, then one would have to go back and recalculate the language
data using the new, higher theoretical maximum.
Of course, the IBCD would also have to be recalculated if different datasets were chosen
for the indicators. For example, the ethnic group classification used here, from the 2001
edition of the World Christian Encyclopedia, results in a much higher number of ethnic
groups than is reported in some other sources in the anthropological literature. If, say,
data were drawn from the Human Relations Area Files instead, the results of the IBCD
might be completely different.