This is the first of two background notes on what I mean by a merger of
quantitative and qualitative data analysis tools to improve decision-making in
the international development field. It explains my transition from being a
specialist in quantitative techniques at the World Bank and IMF. My apologies
for far exceeding the ground rule of keeping messages to three screens in
length.
It is now generally accepted that development involves more than helping poor
localities get more produced assets (PA), from dams and roads to tractors and
computers, than their current income allows. The new paradigm views PA in a
portfolio of assets broadly defined to include natural capital (NC, for land,
subsoil assets, etc.), human resources (HR, for what is invested in
individuals like education and health), and social infrastructure (SI, for
what is invested in helping people work together like culture, family,
institutions, etc.)--as well as PA. Whether (and if so which) PA are the
priority now depends on how they complement or substitute for other forms of
wealth. Optimum development means creating and maintaining all four forms of
wealth to maximize net worth.
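To make the portfolio view concrete, here is a minimal sketch in Python. The figures are placeholders chosen purely for illustration (they are not MEP estimates), and the names `portfolio`, `wealth_shares`, and `combined_share` are hypothetical.

```python
# Illustrative portfolio of the four forms of wealth, in arbitrary units.
# These numbers are assumptions for the example only, not MEP data.
portfolio = {"PA": 20.0, "NC": 15.0, "HR": 45.0, "SI": 20.0}


def wealth_shares(assets):
    """Return each asset class as a fraction of total wealth."""
    total = sum(assets.values())
    if total <= 0:
        raise ValueError("total wealth must be positive")
    return {name: value / total for name, value in assets.items()}


def combined_share(assets, names):
    """Return the combined share of the named asset classes."""
    shares = wealth_shares(assets)
    return sum(shares[name] for name in names)


if __name__ == "__main__":
    print(wealth_shares(portfolio))
    print(combined_share(portfolio, ["HR", "SI"]))
```

Comparing `combined_share(portfolio, ["HR", "SI"])` with `combined_share(portfolio, ["PA", "NC"])` is the kind of rough comparison the portfolio view invites, once all four classes are valued on a common basis.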
I demonstrated that it is possible to quantify these portfolios, at least
roughly at the national level, in the World Bank's Monitoring Environmental
Progress (MEP). The unwritten standard was the level of statistical neatness
associated with Stone's early work on national accounts, although my own
criterion was the level of precision accepted today for Angola's GNP
estimates. Such 'optimally inaccurate indicators' are crude but nonetheless
useful in focusing decision-makers. In this case, my intention was to
demonstrate that most national wealth lies in (HR+SI), not (PA+NC), even if
all agreed to valuation (weighting of disparate forms of wealth) by prevailing
market prices.
I think MEP did this. You can judge since it is online with underlying
datasets and notes on sources and methods at
http://www-esd.worldbank.org/html/esd/env/publicat/mep/mep.htm
A more readable and focused exposition, which unfortunately is not online, is
the monograph Sustainability and the Wealth of Nations by Ismail Serageldin,
World Bank Vice President for Environmentally Sustainable Development.
Articles on MEP can be found in most major newspapers (New York Times, Wall
Street Journal, Financial Times, etc.) on September 18-19, 1995 or soon
thereafter.
While I could not write this in MEP, I concluded that more refined measurement
of (PA+NC) would do little; the priority should be objectifying analysis of HR
& SI. This implies that funding should also be shifted, but that
depends on how various forms of wealth complement or substitute for each
other, as discussed in MEP. My priority for objective indicators of HR & SI
was (and is) to inform this resource allocation issue: Where will a marginal
allocation of resources do the most to create and maintain wealth, broadly
defined as PA+NC+HR+SI? More objective information on HR & SI is the
prerequisite for making such a choice.
Number-oriented databases can help objectify the role of HR & SI in creation
and maintenance of wealth. For example, I know of useful work at the World
Bank on monitoring and evaluating social capital (which I call SI to emphasize
its role as basic infrastructure). However, that seems the hard way to do so
given the vast amount of information already available in word-oriented
databases from observers who at least purport to be objective. Moreover, if
the words are from a stakeholder they are relevant to the success of a
proposed solution even if biased. Finally, I have spent too much time in the
bowels of national statistical offices to believe that tabulated datasets,
particularly at high levels of aggregation (e.g., GNP), are devoid of judgement
calls. The problem lies in managing qualitative information, whether presented
as word-oriented databases or appendages to number-oriented ones.
Indeed, the main challenge I faced while chief of the World Bank's comparative
analysis and data division was in managing the meta-data enveloping tabulated
datasets. I instituted procedures to streamline description of methodology
(which I call statements of expectations) while systematizing series- and
cell-specific footnotes (which I call statements of exceptions to
expectations). In principle, any stated exception to expectations ought to
affect the weight given to the qualified number in decision-making. I began
looking for tools to do so, which led me to a pioneer in multidimensional
analysis, Erik Thomsen (check out http://www.dimsys.com/ for his work). By his
standards, even if all our 'grey material' were distilled into statements of
exceptions, it would be too sparse to fuel inference engines yet so over-
defined that system closure would be impossible. Erik's book, OLAP Solutions,
includes one case study I sponsored (on Cost Benefit Analysis for
infrastructure investment) where we sought to resolve such problems by almost
fractal expansion of number-oriented databases. I concluded that such efforts
were a necessary but not sufficient condition for improved decision-making on
development projects.
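The idea that a stated exception should reduce the weight given to a number can be sketched as follows. This is only an illustration under assumptions of my choosing: the exception labels, the discount factors in `DISCOUNTS`, and the names `Cell`, `cell_weight`, and `weighted_mean` are all hypothetical and do not describe the procedures actually used at the Bank.

```python
from dataclasses import dataclass, field

# Hypothetical discount factors per type of exception (assumed values).
DISCOUNTS = {
    "estimate": 0.8,
    "break_in_series": 0.6,
    "non_standard_definition": 0.5,
}


@dataclass
class Cell:
    """One tabulated value plus its cell-specific statements of exceptions."""
    value: float
    exceptions: list = field(default_factory=list)


def cell_weight(cell, discounts=DISCOUNTS):
    """Start from full weight and discount once per stated exception."""
    weight = 1.0
    for exception in cell.exceptions:
        # Unknown exception types leave the weight unchanged in this sketch.
        weight *= discounts.get(exception, 1.0)
    return weight


def weighted_mean(cells, discounts=DISCOUNTS):
    """Average cell values, giving less influence to qualified numbers."""
    weights = [cell_weight(c, discounts) for c in cells]
    total = sum(weights)
    if total == 0:
        raise ValueError("all cells have zero weight")
    return sum(c.value * w for c, w in zip(cells, weights)) / total
```

A multiplicative discount is just one possible design; it has the convenient property that a number with several footnotes counts for less than a number with one.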
Since I also promoted open disclosure, what meta-data we could organize began
appearing in our data dissemination vehicles (Social Indicators of
Development, WDI, Atlas, etc.). Electronic dissemination via STARS and related
means came at my behest. Mining our internal files for relevant meta-data and
packaging it for electronic dissemination loomed ever larger as our number-
oriented databases became more accessible.
For me, the information technology scale tipped from quantitative to
qualitative data analysis tasks when we captured and disseminated what had
been confidential Country Briefs, repackaged as Trends in Developing Economies
(TIDE). This is a word-oriented database with an associated set of tabulated
data (appendices to each brief). Given my role in the Bank's financial
preference mechanism, I wanted to ensure conformity between what was written
in text and shown by tabulated data. This led me, still ignorant of CAQDAS, to
construct a hierarchical indexing system that reached out from the framework
suggested by the Bank's number-oriented databases to search word-oriented
TIDE. The result is available as a World Bank CD-ROM, World Data 1995. Users
who find a time series of interest in the number-oriented database
(potentially over 700 series per country, covering 35 years) can then
hyperlink to related country-specific texts in TIDE. I had hoped to release
our flawed but still useful approach to inference engines and system closure
as templates for a PC package the Bank was using (Javelin). However, the Bank
was not ready to admit its imperfections and I only managed to get Javelin
onto World Data 1995, without the templates.
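For readers who want a feel for how a hierarchical index can reach from a number-oriented database into text, here is a minimal sketch. It is not the World Data 1995 implementation: the topic names, series codes, search terms, and the functions `terms_for` and `linked_paragraphs` are all hypothetical.

```python
# Hypothetical hierarchy: topic -> series code -> search terms (all assumed).
INDEX = {
    "national_accounts": {"GNP_PC": ["gnp per capita", "income per head"]},
    "social": {"LIFE_EXP": ["life expectancy"]},
}


def terms_for(series_code, index=INDEX):
    """Walk the hierarchy and return (topic, search terms) for a series."""
    for topic, series in index.items():
        if series_code in series:
            return topic, series[series_code]
    raise KeyError(series_code)


def linked_paragraphs(series_code, brief_text, index=INDEX):
    """Return paragraphs of a country brief that mention the series."""
    _, terms = terms_for(series_code, index)
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p.strip() for p in brief_text.split("\n\n") if p.strip()]
    return [p for p in paragraphs if any(t in p.lower() for t in terms)]
```

Given a series code chosen in the number-oriented database, `linked_paragraphs` returns the passages a user would be taken to in the corresponding country text.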
Using a more powerful mainframe system I had developed (COMETS), I looked for
correspondence between numbers in the appendices and related textual
references, which usually rested on some simple algorithm such as comparing
current to past values or one country's values to another's. I looked for 'the
dog that didn't bark,' meaning the absence of textual references where a
standard algorithm seemed to suggest something worth mentioning. I also became
aware of 'the bark without a dog,' meaning assertions in the text for which
there was no substantiation from the number-oriented database. These divided
between assertions the number-oriented database was designed to substantiate
but didn't in a specific case (missing data) and those beyond the design
paradigm.
I used COMETS to summarize such findings into profiles describing the 'weight
of discussion.' Via internal management processes, I worked to iterate between
refocusing the number-oriented database to substantiate what was discussed in
word-oriented databases (broadened beyond TIDE) and expecting country teams to
write about things that appeared important in the number-oriented database
(even if only to say why they disbelieve the data and to explain steps being
taken to improve the empirical base).
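The comparison between what the numbers suggest and what the text discusses can be sketched roughly as below. This is not COMETS; it assumes a simple rule (a change of more than 10 percent between the last two observations counts as worth mentioning) and assumes that the topics mentioned in a text have already been mapped to series codes. The names `notable_series` and `compare_text_and_data`, the threshold, and the series codes in the usage example are all hypothetical.

```python
def notable_series(data, threshold=0.10):
    """Flag series whose latest value moved more than `threshold`
    (as a fraction) relative to the previous value."""
    flagged = set()
    for code, values in data.items():
        if len(values) < 2 or values[-2] == 0:
            continue  # not enough information to apply the rule
        change = (values[-1] - values[-2]) / abs(values[-2])
        if abs(change) > threshold:
            flagged.add(code)
    return flagged


def compare_text_and_data(data, designed, mentioned, threshold=0.10):
    """Classify mismatches between tabulated data and textual references.

    data:      series code -> list of values actually available
    designed:  set of series codes the database is designed to cover
    mentioned: set of series codes referred to in the text
    """
    flagged = notable_series(data, threshold)
    return {
        # The dog that didn't bark: notable in the numbers, absent from text.
        "silent_dogs": flagged - mentioned,
        # Bark without a dog, case 1: within the design but no data held.
        "missing_data": (mentioned & designed) - set(data),
        # Bark without a dog, case 2: outside the design paradigm.
        "beyond_design": mentioned - designed,
    }
```

For example, with data on `GNP_PC`, `LIFE_EXP`, and `EXPORTS`, a design that also covers `LITERACY`, and a text that mentions `GNP_PC`, `LITERACY`, and `LAND_TENURE`, a sharp fall in exports would appear under `silent_dogs`, literacy under `missing_data`, and land tenure under `beyond_design`.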
When I became Senior Advisor in the Environment Department to create MEP, the
IT (information technology) emphasis reverted to mainstream (prettier GUIs).
However, the foregoing was very much on my mind as I considered the problem of
objectifying analysis of HR & SI once MEP demonstrated the dominant role of
these forms in the creation and maintenance of wealth. In particular, I was
impressed by the interesting and seemingly objective but word-oriented work
being pulled together by the Bank's fledgling Social Policy team. It was at
that stage I became aware of CAQDAS and began playing with NUD-IST. Pulling
the pieces together implied a level of involvement in IT that I didn't think
could fairly be asked of a financial intermediary (which is all the World Bank
really is), so I left to try it on my own. FIND is the result, as I'll explain
in my second background note.
John C. O'Connor