Principal Investigators:
David Harmon, M.S.
Jonathan Loh, M.Sc.

Text: © Terralingua 2004

Executive Summary

Background
Methods


Overview

Cultural diversity indicators
Biological diversity indicators
Calculating the IBCD components

Results
Discussion
Conclusion
Appendix
Data
Maps

References

For the full version of this report, including references and bibliography, please download the following .pdf: Full Text (.pdf)

 

Methods

Overview

Again, the three components of the IBCD (BCD-RICH, BCD-AREA, and BCD-POP) are derived from five indicators of BCD:

• number of languages
• number of ethnic groups
• number of religions
• number of bird and mammal species (combined)
• number of plant species

Each of the three parts of the IBCD gives equal weight to cultural and biological diversity. For example, a country’s overall BCD-RICH score is calculated as the average of its cultural diversity richness score (aggregated from the scores for languages, religions, and ethnic groups) and its biological diversity richness score (aggregated from the scores for bird/mammal species and plant species). The same holds true for BCD-AREA and BCD-POP.

When values for these indicators are ranked on a global basis, it becomes apparent that biocultural diversity is not evenly distributed. A few countries are megadiverse, with very large values; then the ranking rapidly diminishes to much lower values found in more typical countries. Because this makes comparisons among countries difficult, we used a common log scale to produce a linear distribution.

For example, the language indicator index for BCD-RICH is calculated as the log of the number of languages spoken in a country divided by the log of the number of languages spoken worldwide. The process was repeated for the other four indicators to derive BCD-RICH.

As noted above, to compensate for the fact that large countries tend to have greater biological and cultural diversity than small ones simply because of their greater area (or greater population), we calculated two additional diversity values for each country by adjusting first for land area (BCD-AREA) and second for population size (BCD-POP). This was done by measuring how much more or less diverse a country is in comparison with an expected value based on its area or population alone. The method used is a modified version of that used by Groombridge and Jenkins (2002). The process was carried out for each of the five indicators to derive BCD-AREA and BCD-POP.

The expected diversity was calculated using the standard formula for the species–area relationship log S = c + z log A where S = number of species, A = area, and c and z are constants derived from observation. Because the distributions of the five indicators against land area and population size are similar, we applied the same formula to indicators of cultural diversity. Hence, for BCD-AREA expected log Ni = c + z log Ai where Ni = number of languages, religions, ethnic groups, or species in country i, and Ai = area of country i. The same formula was used for BCD-POP, except that Pi (population of country i) replaces Ai. To find the values of the constants c and z for each of the indicators, we scatter-plotted log Ni (where Ni = number of languages, religions, ethnic groups, or species in country i) against log Ai for all countries, and drew the best-fit straight line through the points.

To calculate the deviation of each country from its expected value, we subtracted the expected log Ni value from the observed log Ni value. The index is calibrated such that the world, or maximum, value is set equal to 1.0, the minimum value is set equal to zero, and the average or typical value is approximately 0.5 (meaning no more or less diverse than expected given a country’s area or population).

Next, we describe the data sources of the indicators and relevant caveats to the data.


Cultural diversity indicators

Number of languages. Language data are derived from the 2000 edition of Ethnologue, the standard reference list of the world’s languages. Other global language compilations are outdated, treat only the world’s larger languages, or use a classification system that is not yet widely accepted by professional linguists.

Ethnologue is a country-by-country listing of languages. There are well over 7,000 entries in the 2000 edition, representing 6,809 unique languages, both living and extinct. Each entry gives a main name for the language; a unique three-letter code to distinguish languages with identical or similar names; alternative names by which the language was or may still be known, whether in the local vernacular or the professional linguistic literature (with derogatory names flagged); the number of mother-tongue speakers, if known; the names of other countries in which the language is spoken, if any; known dialects; linguistic information, such as language family affiliation, linguistic typology, the availability of dictionaries and grammars; ecological information on the environment and subsistence type of the speakers (for some languages); and, because the publishers are a Christian missionary organization, information on the availability of the Bible in the language. Ethnologue includes information on deaf sign languages, non-deaf sign languages, pidgins and creoles, a few widely used artificial languages (e.g., Esperanto), ritual and auxiliary languages that do not have mother-tongue speakers, and many recently extinct languages. Ethnologue also includes qualitative information about language endangerment. For example, an entry may contain statements to the effect that children are no longer learning the language, or that all fluent speakers are over 50 years old, or that the language has so few mother-tongue speakers left that it is “nearly extinct.”

The languages listed under each country heading in Ethnologue are either endemic languages confined to that country alone, or other, non-endemic languages that the editors consider of sufficient linguistic, political, or social interest to warrant an individual entry under that country. For example, the main entry for English is under the United Kingdom. Other, abbreviated entries for English, usually containing information pertinent to the country (such as dialects spoken there) are found for many other countries in which English is spoken, with cross-references to the main U.K. entry. In addition to the individual language entries under each country heading, there is an introductory paragraph for the country which includes (where applicable) a list of immigrant languages spoken there. Only those immigrant languages that are still spoken in the country of origin and that do not display significant dialect differences with the original form are listed in this introductory paragraph. The editors caution that, for many countries, the list is incomplete and in certain cases may be incorrect. Nonetheless, we tallied the number of languages listed in the introductory paragraph and added it to the number of languages listed individually under the country’s entry to arrive at a total number of languages spoken in each country. Because the introductory list of immigrant languages is often incomplete, the total number of languages reported here for each country may be an undercount of the true number. This undercount is probably quite significant for countries with large immigrant populations, such as the USA, where the Ethnologue introductory list is certainly incomplete. On the other hand, for smaller countries with less diverse immigrant populations, the number of languages reported here may be accurate or only a slight undercount.

Number of religions. Data on religions are from the World Christian Encyclopedia, second edition, a widely cited source for the numbers of religions, denominations, and adherents worldwide. The authors define “religion” as “a grouping of persons with beliefs about God or gods, and defined by its adherents’ loyalty to it, by their acceptance of it as unique and superior to all other religions, and by its relative autonomy”. The compilers of World Christian Encyclopedia have tracked information on 19 major religions and related religious categories (such as “nonreligious” and “atheist”) for more than 20 years and have time-series data going back to 1900 (on a global level).

Number of ethnic groups. Data on ethnic groups are also taken from the World Christian Encyclopedia, second edition, primarily because it gives detailed breakdowns for over 200 countries. The authors continue a classification system published in the first edition of the Encyclopedia. In this system, the world’s peoples are divided into 432 primary ethnolinguistic groups, each with an identifying code. These codes are then applied to different groups at the national level to produce 12,583 distinct ethnic groups; population numbers for each of these groups are given and form the backbone of the Encyclopedia’s ethnolinguistic analysis.


Biological diversity indicators

Number of bird/mammal species; number of plant species. Data on bird/mammal species richness and on plant species richness, the two indicators of biological diversity used for the IBCD, are taken from Global Biodiversity: Earth’s Living Resources in the 21st Century. It lists the total numbers of bird, mammal, and plant species recorded in each country, as well as the number of endemic and threatened species in each of these three groups. Marine species are excluded. The total for birds includes only those species which are known to breed in a particular country, and not the total number recorded, which would include a number of nonbreeding migrant and vagrant species and so inflate the species richness. Birds and mammals, but not reptiles, amphibians, fish or invertebrates, are used to represent the entire animal kingdom because only these two taxonomic groups have been extensively surveyed and recorded in most countries. The other groups are less well studied, and the numbers recorded in many countries are liable to change. However, any species richness data should be regarded as being provisional, as the totals tend to change over time with new surveys or changes to taxonomic classification. The number of plant species recorded in a country is likely to change more than the number of bird and mammal species, particularly in tropical countries, but plants are included here to give a more balanced picture of biodiversity than would be given by looking at birds and mammals alone. Currently, 9,946 bird species, 4,763 mammal species and 250,876 plant species have been recorded worldwide.

Calculating the IBCD components

The IBCD gives equal weight to cultural and biological diversity. A country’s IBCD value therefore is calculated as the average of its cultural diversity (CD) and its biodiversity (BD), or:

IBCD = (CD + BD)/2

In measuring a country’s cultural diversity (CD), equal weight is given to linguistic, religious, and ethnic diversity. Therefore CD is calculated as the average of a country’s language diversity (LD), religion diversity (RD), and ethnic group diversity (ED):

CD = (LD + RD + ED)/3

In measuring biodiversity (BD), equal weight is given to animal species diversity (using birds and mammals as a proxy for all animal species) and plant species diversity. Therefore BD is calculated as the average of a country’s bird and mammal species diversity (MD), and plant species diversity (PD):

BD = (MD + PD)/2
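
The three averaging formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' own code; the function names and the input values are hypothetical and are not taken from the report's data.

```python
from statistics import mean


def cultural_diversity(ld, rd, ed):
    """CD = (LD + RD + ED)/3: languages, religions, ethnic groups weighted equally."""
    return mean([ld, rd, ed])


def biodiversity(md, pd):
    """BD = (MD + PD)/2: bird/mammal and plant diversity weighted equally."""
    return mean([md, pd])


def ibcd(ld, rd, ed, md, pd):
    """IBCD = (CD + BD)/2: cultural and biological diversity weighted equally."""
    return (cultural_diversity(ld, rd, ed) + biodiversity(md, pd)) / 2


# Hypothetical indicator values for one country (illustration only).
score = ibcd(ld=0.6, rd=0.3, ed=0.6, md=0.8, pd=0.4)
print(f"IBCD = {score:.2f}")
```

Note that each cultural indicator carries one sixth of the final weight and each biological indicator one quarter, because the averaging is done within CD and BD before the two halves are combined.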

To calculate CD and BD, we first took the logarithms of the richness values for each indicator in each country. For example, the number of languages (L) spoken in Afghanistan is 49; log L = 1.69.

We used logarithms because BCD, whether measured using species, languages, ethnic groups, or religions, is not distributed evenly around the world. If one ranks countries according to the richness of any of the BCD indicators considered here, one gets a few very large values recorded in the world’s megadiverse countries, rapidly diminishing to lower values found in more typical countries. For example, 833 of the world’s 6,800 languages (12%) are spoken in just one country, Papua New Guinea. The average number of languages spoken in the 229 countries and territories for which we have data is 45, Mali being an average country in this respect. But only in a minority of the world’s countries (50) are there 45 or more languages spoken, whereas in the majority of countries (179) there are fewer than 45 languages spoken. In almost half of those countries (85) there are fewer than 10 languages spoken. In other words, a few countries hold a disproportionately large share of the world’s linguistic diversity. This is a well-known statistical pattern, known as a logarithmic distribution, and applies equally well to the distribution of species, ethnic groups, and religions among countries. To adjust for this, we used a logarithmic scale, the common log scale, to rank countries in the index. This results in a linear distribution of the index values.

Applying a common log scale essentially compresses a large range of values down to a manageable range. For example, as we noted above, the maximum number of languages spoken in one country is 833, the average number of languages spoken is 45, while several countries share the minimum value of 1 language only. Taking the common logs of these three numbers (log 833 = 2.92; log 45 = 1.65; log 1 = 0) gives us a scale of 2.92 to 0 instead of 833 to 1 (see examples in the table below). Because the common log scale smooths the skewed distribution, the values fall into a much more even, roughly linear pattern when compared with one another.

Representative Log L values

Country/Territory            No. languages   Log L   Language Diversity index
                             spoken (L)              LD-RICH (log Li/log Lworld)
World                        6,800           3.83    1.000
Papua New Guinea (highest)   833             2.92    0.762
Mali (average)               45              1.65    0.431
Bermuda (lowest)             1               0.00    0.000


Calculating IBCD-RICH. To generate the raw richness component of the index (IBCD-RICH), we compared each country’s value with the global richness value. For example, staying with language diversity, the index is calculated as the log of the number of languages spoken in a country divided by the log of the number of languages spoken worldwide. The total number of languages currently spoken is 6,800 (log 6,800 = 3.83). Hence the formula we used is:

XX-RICH = log Ni/log Nworld

where XX = LD, RD, ED, MD, or PD;
           Ni = number of languages, religions, ethnic groups, or species in country i; and
           Nworld = the actual observed number of languages, religions, ethnic groups, or species in the world.
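
As a quick check on the richness formula, the following Python sketch recomputes the language richness index from the counts quoted in the text (6,800 languages worldwide; 833 in Papua New Guinea, 45 in Mali, 1 in Bermuda). The function name `richness_index` is our own, chosen for illustration.

```python
import math


def richness_index(n_country, n_world):
    """XX-RICH = log Ni / log Nworld, using common (base-10) logarithms."""
    if n_country < 1 or n_world <= 1:
        raise ValueError("counts must be at least 1, and the world total above 1")
    return math.log10(n_country) / math.log10(n_world)


N_WORLD_LANGUAGES = 6800  # world total quoted in the text

for name, n_languages in [("Papua New Guinea", 833), ("Mali", 45), ("Bermuda", 1)]:
    print(f"{name}: LD-RICH = {richness_index(n_languages, N_WORLD_LANGUAGES):.3f}")
```

Rounded to three decimals, these come out to 0.762, 0.431, and 0.000, the same values reported for LD-RICH in the report's table of representative Log L values.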

Calculating IBCD-AREA. To compensate for the fact that large countries tend to have a greater cultural and biological diversity than small ones simply because of their greater area, a second component of the IBCD adjusts the BCD value for each country by accounting for its land area. This was done by calculating how much more or less diverse a country is in comparison with an expected value based on its area alone. For example, if they were typical for their respective land areas, one would expect about 36 or 37 languages to be spoken in Papua New Guinea and about 50 languages to be spoken in Mali. Therefore, based on their areas, Papua New Guinea is vastly more diverse than one would expect, whereas Mali is slightly less diverse than one would expect. The method is a modified version of that used by Groombridge and Jenkins (2002).

The expected diversity of a country is derived from the species-area relationship, which comes from ecological theory:

log S = c + z log A

where S = number of species;
          A = area; and
          c and z are constants.

The formula simply states that the log of the number of species present in a country or territory increases in proportion with the log of the area of the country or territory. The constants c and z can be derived by observation. We have applied the same formula to indicators of cultural diversity, hence:

expected log Ni = c + z log Ai

where Ni = number of languages, religions, ethnic groups, or species in country i;
          Ai = area of country i; and
          c and z are constants.

Strictly speaking, the species-area formula applies only to biodiversity. However, we used it to make area adjustments for the cultural indicators because, as noted above, the global distribution patterns of richness in cultural diversity and biodiversity are similar. To find the values of c and z for each of the indicators used in the IBCD-AREA analysis, we scatter-plotted log Ni against log Ai for all countries, and drew the best-fit straight line through the scatter; z is the slope of the line and c is the point where it intersects the y-axis. To calculate the deviation of each country from its expected value we simply subtracted the expected log Ni value from the observed log Ni value.

Deviation from expected value = log Ni – expected log Ni
                                            or log Ni – (c + z log Ai)

This gives a series of values for each country where a score of 0 means that the country is exactly as diverse as one would expect based on its area, a score of 1 means it is ten times more diverse, a score of 2 means it is a hundred times more diverse, a score of -1 means it is ten times less diverse, a score of -2 a hundred times less, and so on.
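
The line-fitting and deviation steps can be sketched as follows. Several assumptions are made here: the report says only that a best-fit straight line was drawn, so ordinary least squares is our choice of fitting method; the function names are hypothetical; and, because the full country dataset is not reproduced in this section, the fit is run on the rounded (log A, expected log L) pairs printed in the report's language-diversity example table for IBCD-AREA. Those expected values already lie on the authors' line, so the fit only recovers an approximation of their constants for languages.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = c + z*x; returns (c, z)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx            # slope of the line
    c = mean_y - z * mean_x  # intercept on the y-axis
    return c, z


def deviation(log_n, log_a, c, z):
    """Observed log Ni minus expected log Ni, i.e. log Ni - (c + z log Ai)."""
    return log_n - (c + z * log_a)


# Rounded values from the report's area example table (languages only):
# World, PNG, Turkmenistan, Greenland, Minimum Value.
log_area = [8.14, 5.67, 5.69, 6.34, 6.00]
expected_log_l = [2.33, 1.56, 1.57, 1.77, 1.67]

c, z = fit_line(log_area, expected_log_l)
print(f"c = {c:.2f}, z = {z:.2f}")

# Papua New Guinea: log L = 2.92, log A = 5.67 (from the same table).
d_png = deviation(2.92, 5.67, c, z)
print(f"PNG deviation = {d_png:.2f}, about {10 ** d_png:.0f} times the expected number")
```

With these inputs the slope comes out near 0.31 and the deviation for Papua New Guinea near 1.36, in line with the table. In the real analysis the line would be fitted to the observed log Ni values of all countries, not to a handful of expected values.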

The index is calculated such that the global value is equal to 1.0 and the minimum value is zero. The global value for each of the five measures is also the maximum value, or, put another way, the world as a whole is more diverse than any country, even after adjusting for land area. The minimum value was selected by choosing a value below that of any country. Hence the formula used to calculate a country’s area-adjusted diversity value for each of the five indicators was:

XX-AREA = (Di - Dmin)/(Dmax - Dmin)

where
          Di = observed log Ni – expected log Ni;
          Dmin = a value below that of the least diverse country; and
          Dmax = Dworld, the actual observed value for the entire world.
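
The rescaling step is a simple min-max normalization, sketched below in Python. The function name `scaled_index` is hypothetical; the Dmin and Dmax values and the three country deviations are the rounded figures printed in the report's language-diversity example table for IBCD-AREA.

```python
def scaled_index(d_i, d_min, d_max):
    """(Di - Dmin)/(Dmax - Dmin): maps Dmin to 0 and Dmax to 1."""
    if d_max <= d_min:
        raise ValueError("d_max must be greater than d_min")
    return (d_i - d_min) / (d_max - d_min)


# Rounded values from the area example table for languages.
D_MIN = -1.67  # chosen minimum, below the least diverse country
D_MAX = 1.50   # deviation for the world as a whole

for name, d in [("PNG", 1.36), ("Turkmenistan", 0.00), ("Greenland", -1.47)]:
    print(f"{name}: LD-AREA = {scaled_index(d, D_MIN, D_MAX):.3f}")
```

Because the inputs are rounded to two decimals, the results agree with the tabulated LD-AREA values (0.954, 0.525, 0.062) only to within a few thousandths. A deviation of zero lands slightly above 0.5 here because Dmin and Dmax are not symmetric about zero.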

Examples of IBCD-AREA methodology using language diversity data

Country         Area (A)          Log A   No. languages   Log L   Expected      Deviation from        Language diversity
                [thousand km2]            spoken (L)              log L value   expected value (D)    index LD-AREA
World           136,605           8.14    6,800           3.83    2.33          1.50                  1.00
PNG             463               5.67    833             2.92    1.56          1.36                  0.954
Turkmenistan    488               5.69    37              1.57    1.57          0.00                  0.525
Greenland       2,176             6.34    2               0.30    1.77          -1.47                 0.062
Minimum Value   1,000             6.00    1               0.00    1.67          -1.67                 0.000


Calculating IBCD-POP. Finally, a third component of the index, IBCD-POP, compensates for the fact that more populous countries tend to have greater cultural diversity than less populous ones simply because of their greater population size. This was done in the same way as compensating for area, by calculating deviation from an expected value based on population size alone, using the formula:

expected log Ni = c + z log Pi

where Ni = number of languages, religions, ethnic groups or species in country i;
           Pi = population of country i; and
           c and z are constants.

To calculate c and z, we scatter-plotted log Ni against log Pi for all countries, and added the best-fit straight line; z is the slope of the line and c is the point where it intersects the y-axis. To calculate the deviation from the expected value we simply subtracted the expected log Ni value from the observed log Ni value.

The formula used to calculate a country’s population-adjusted value for each of the five indicators was the same as that used to calculate the area-adjusted value:

XX-POP = (Di - Dmin)/(Dmax - Dmin)

where
         Di = observed log Ni – expected log Ni;
         Dmin = a value below that of the least diverse country; and
         Dmax = Dworld, the actual observed value for the entire world.

However, unlike IBCD-AREA, for some component indicators the value of Dmax in IBCD-POP was not equal to Dworld, and so an arbitrary maximum value was chosen at a level greater than that of the most diverse country.

 

Examples of IBCD-POP methodology using language diversity data

Country         Population (P)   Log P   No. languages   Log L   Expected      Deviation from        Language diversity
                [thousand]               spoken (L)              log L value   expected value (D)    index LD-POP
Max value       6,056,710        6.78    12,000          4.08    2.48          1.60                  1.000
PNG             4,809            3.68    833             2.92    1.34          1.58                  0.994
Pakistan        141,256          5.15    76              1.88    1.88          0.00                  0.477
Korea, DPR      22,268           4.35    2               0.30    1.58          -1.28                 0.057
Minimum Value   10,000           4.00    1               0.00    1.46          -1.46                 0.000
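
The whole population adjustment can be traced end to end for the example countries. The Python sketch below rests on assumptions that should be read carefully: the constants c and z are not stated in the report, so they are reconstructed from two rows of the example table (the Minimum Value and Max value rows, whose expected values lie on the authors' line); the Log P column appears to be the log of population expressed in thousands, and the code follows that convention; all function names are hypothetical.

```python
import math


def line_through(p1, p2):
    """Return (c, z) for the straight line y = c + z*x through two points."""
    (x1, y1), (x2, y2) = p1, p2
    z = (y2 - y1) / (x2 - x1)
    return y1 - z * x1, z


def ld_pop(pop_thousands, n_languages, c, z, d_min, d_max):
    """Population-adjusted language index for one country."""
    log_p = math.log10(pop_thousands)         # log of population in thousands
    expected = c + z * log_p                  # expected log L for this population
    d = math.log10(n_languages) - expected    # deviation from expected value
    return (d - d_min) / (d_max - d_min)      # rescale to the 0-1 range


# (Log P, expected log L) from the Minimum Value and Max value rows.
c, z = line_through((4.00, 1.46), (6.78, 2.48))
D_MIN, D_MAX = -1.46, 1.60  # deviations in those same two rows

examples = {
    "PNG": (4809, 833),
    "Pakistan": (141256, 76),
    "Korea, DPR": (22268, 2),
}
for name, (pop, langs) in examples.items():
    print(f"{name}: LD-POP = {ld_pop(pop, langs, c, z, D_MIN, D_MAX):.3f}")
```

Since c and z are rebuilt from rounded table entries, the output matches the tabulated LD-POP values (0.994, 0.477, 0.057) only approximately, to within a few thousandths.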

 

Missing data. If data are missing for a particular indicator (LD, RD, ED, MD, or PD) within the cultural diversity (CD) or biodiversity (BD) part of a component, the remaining indicators are used to calculate that country’s IBCD. For example, if a country is missing data for religion and ethnic groups, we used only languages to calculate the CD part of its IBCD; if data are missing for plants, we used only birds/mammals to calculate its BD part; and so on. However, a country must have at least one cultural diversity indicator and one biodiversity indicator for its IBCD to be calculated. Notes to the tables show which data are included for each country.
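
The missing-data rule can be expressed as an average over whichever indicators are present. The following Python sketch is one possible reading of the rule, not the authors' implementation; the function names are hypothetical, `None` is our convention for a missing indicator, and the input values are invented for illustration.

```python
def mean_available(values):
    """Average the indicators that are present; None if all are missing."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return sum(present) / len(present)


def ibcd_with_gaps(ld=None, rd=None, ed=None, md=None, pd=None):
    """IBCD from the available indicators.

    Returns None when either the cultural or the biological half
    has no data at all, since the index cannot then be calculated.
    """
    cd = mean_available([ld, rd, ed])  # cultural diversity part
    bd = mean_available([md, pd])      # biodiversity part
    if cd is None or bd is None:
        return None
    return (cd + bd) / 2


# Hypothetical country with only language and bird/mammal data.
print(ibcd_with_gaps(ld=0.4, md=0.6))
# Hypothetical country with no biodiversity data: no IBCD can be computed.
print(ibcd_with_gaps(ld=0.4, rd=0.5, ed=0.3))
```

One consequence of this rule is that a single available indicator stands in for its whole half of the index, so scores for countries with sparse data rest on less evidence than those with all five indicators.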

Additional remarks on the methodology. As noted above, for some indicators the maximum and minimum values used to fix the top and bottom of the scale are theoretical rather than observed. The drawback of using theoretical maxima and minima is precisely that they are theoretical; hence it can always be argued that they are arbitrary to some degree. Also, it is possible that the posited values could be superseded by new information or different interpretations of existing data. For example, the number of languages reported in Ethnologue has increased from edition to edition, not because previously unknown languages are found (although a few have been), but because the editors have reclassified dialects as separate languages based on new information and interpretations from field linguists. It has also been argued that there are a very large number, possibly several thousand, of unreported deaf languages that need to be added to the world’s total. Therefore, it is possible the theoretical maximum number of languages used here will become obsolete when the next edition of Ethnologue appears. If so, then one would have to go back and recalculate the language data using the new, higher theoretical maximum.

Of course, the IBCD would also have to be recalculated if different datasets were chosen for the indicators. For example, the ethnic group classification used here, from the 2001 edition of the World Christian Encyclopedia, results in a much higher number of ethnic groups than is reported in some other sources in the anthropological literature. If, say, data were drawn from the Human Relations Area Files instead, the results of the IBCD might be completely different.

Text © 1997-2009 Terralingua. All rights reserved.
Terralingua is a 501(c)(3) non-profit organization registered under U.S.A. tax laws (38-3291259).