> What is a classification good for?
There are two questions here:
- What is a given terminology, nomenclature or vocabulary good for?
- What is the classification of the terms in that vocabulary good for?
> One of its usual aims is to *group* information with a purpose.
Certainly, most of the existing vocabularies (e.g. ICD, SNOMED, ICPC,
IND, etc.) have been designed to meet the specific needs of
epidemiological study in a specific domain of medicine. Typically, the
terms in these vocabularies are relatively coarse-grained compared to
'real' clinical information, and the particular way in which the terms
are classified lends itself best to statistical analysis. Because each
vocabulary is tailored to a specific field of medicine (ICD to
morbidity, SNOMED to pathology), it has proved necessary to develop many
(more than 300) different vocabularies and associated classifications.
> We've been pretty happy with ICPC for epidemiological and consultation management purposes.
Not surprising, because that is what it was designed for.
> I still know little about Read but, by the discussions about it in GP-UK, it
> seems to me that codes are being created for classifying things to a great
> detail.
It's always a little difficult to talk about READ these days because you have to be
clear which version you mean. At the moment in the UK, only a handful of sites are
using the 'new, improved' READ 3; the majority of GPs are still using READ version 2.
The overall intention of READ, in any version, is to produce a vocabulary which is
more fine-grained than schemes such as ICD. For comparison, the section D41 from
ICD-9-CM reads:
D410 ACUTE MYOCARDIAL INFARCTION...
D410.0 ACUTE MYOCARDIAL INFARCTION OF ANTEROLATERAL WALL
D410.1 ACUTE MYOCARDIAL INFARCTION OF OTHER ANTERIOR WALL
D410.2 ACUTE MYOCARDIAL INFARCTION OF INFEROLATERAL WALL
D410.3 ACUTE MYOCARDIAL INFARCTION OF INFEROPOSTERIOR WALL
D410.4 ACUTE MYOCARDIAL INFARCTION OF OTHER INFERIOR WALL
D410.5 ACUTE MYOCARDIAL INFARCTION OF OTHER LATERAL WALL
D410.6 TRUE POSTERIOR WALL INFARCTION
D410.7 SUBENDOCARDIAL INFARCTION
D410.8 ACUTE MYOCARDIAL INFARCTION OF OTHER SPECIFIED SITES
D410.9 ACUTE MYOCARDIAL INFARCTION OF UNSPECIFIED SITE
D411 OTHER ACUTE AND SUBACUTE FORMS OF ISCHEMIC HEART DISEASE...
D411.0 POSTMYOCARDIAL INFARCTION SYNDROME
D411.1 INTERMEDIATE CORONARY SYNDROME
D411.8 OTHER ACUTE AND SUBACUTE FORMS OF ISCHEMIC HEART DISEASE
D412 OLD MYOCARDIAL INFARCTION
D413 ANGINA PECTORIS...
D413.0 ANGINA DECUBITUS
D413.1 PRINZMETAL ANGINA
D413.9 OTHER AND UNSPECIFIED ANGINA PECTORIS
D414 OTHER FORMS OF CHRONIC ISCHEMIC HEART DISEASE...
D414.0 CORONARY ATHEROSCLEROSIS
D414.1 ANEURYSM OF HEART...
D414.10 ANEURYSM OF HEART (WALL)
D414.11 ANEURYSM OF CORONARY VESSELS
D414.19 OTHER ANEURYSM OF HEART
D414.8 OTHER SPECIFIED FORMS OF CHRONIC ISCHEMIC HEART DISEASE
D414.9 CHRONIC ISCHEMIC HEART DISEASE, UNSPECIFIED
...whilst the corresponding section of READ version 2.0 (Jan 1995) reads:
G30.. Acute myocardial infarction+...
G300. Acute anterolateral infarction
G301. Other specified anterior myocardial infarction+...
G3010 Acute anteroapical infarction
G3011 Acute anteroseptal infarction
G301z Anterior myocardial infarction NOS+
G302. Acute inferolateral infarction
G303. Acute inferoposterior infarction+
G304. Posterior myocardial infarction NOS+
G305. Lateral myocardial infarction NOS+
G306. True posterior myocardial infarction+
G307. Acute subendocardial infarction+
G308. Inferior myocardial infarction NOS+
G30y. Other acute myocardial infarction+...
G30y0 Acute atrial infarction
G30y1 Acute papillary muscle infarction+
G30y2 Acute septal infarction
G30yz Other acute myocardial infarction NOS+
G30z. Acute myocardial infarction NOS+
G31.. Other acute and subacute ischaemic heart disease+...
G310. Postmyocardial infarction syndrome+
G311. Preinfarction syndrome+...
G3110 Myocardial infarction aborted+
G311z Preinfarction syndrome NOS
G31y. Other acute and subacute ischaemic heart disease+...
G31y0 Acute coronary insufficiency
G31y1 Microinfarction of heart
G31y2 Subendocardial ischaemia
G31yz Other acute and subacute ischaemic heart disease NOS+
G32.. Old myocardial infarction+
G33.. Angina pectoris...
G330. Angina decubitus...
G3300 Nocturnal angina
G330z Angina decubitus NOS
G331. Prinzmetal's angina+
G33z. Angina pectoris NOS...
G33z0 Status anginosus
G33z1 Stenocardia
G33z2 Syncope anginosa
G33zz Angina pectoris NOS
G34.. Other chronic ischaemic heart disease+...
G340. Coronary atherosclerosis+
G341. Aneurysm of heart+...
G3410 Ventricular cardiac aneurysm
G3411 Other cardiac wall aneurysm+
G3412 Aneurysm of coronary vessels
G3413 Acquired atrioventricular fistula of heart+
G341z Aneurysm of heart NOS
G34y. Other specified chronic ischaemic heart disease+...
G34y0 Chronic coronary insufficiency
G34y1 Chronic myocardial ischaemia
G34yz Other specified chronic ischaemic heart disease NOS+
G34z. Other chronic ischaemic heart disease NOS+
G3y.. Other specified ischaemic heart disease+
G3z.. Ischaemic heart disease NOS
As you can see from this example, READ version 2.0 bears (at least in this area of
medicine) a striking similarity to ICD: it is organised in a very similar way,
although on occasion it adds an extra level of detail. READ 2 has a wider scope
than ICD, including things such as Occupations. This scope has been further extended
by the Clinical Terms Project, an exercise in gathering new terms for the vocabulary
across a wide range of practical, clinical specialties.
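
To make the point about organisation concrete: in the READ 2 listing above, the
hierarchy is carried by the code string itself. Each level adds one character, and
unused positions are padded with dots, so G3010 sits under G301., which sits under
G30.. and so on. Here is a minimal sketch of how that might be exploited for
aggregation. It assumes only the 5-character, dot-padded form shown in the listing;
the function names are my own, and the chapter-level codes G3... and G.... are
inferred from the pattern rather than taken from the listing.

```python
from typing import List, Optional


def read2_parent(code: str) -> Optional[str]:
    """Return the parent of a 5-character, dot-padded code, or None at the top."""
    stem = code.rstrip(".")  # significant characters, e.g. "G301" for "G301."
    if len(code) != 5 or not stem or "." in stem:
        raise ValueError(f"not a 5-character READ 2 style code: {code!r}")
    if len(stem) == 1:
        return None  # a single significant character has no parent
    # Drop the last significant character and pad back out to 5 with dots.
    return stem[:-1].ljust(5, ".")


def read2_ancestors(code: str) -> List[str]:
    """Return every ancestor of a code, nearest first."""
    chain = []
    parent = read2_parent(code)
    while parent is not None:
        chain.append(parent)
        parent = read2_parent(parent)
    return chain


def falls_under(code: str, ancestor: str) -> bool:
    """True if code is the ancestor itself or sits anywhere beneath it."""
    return code == ancestor or ancestor in read2_ancestors(code)
```

With this, counting every record coded anywhere under 'Acute myocardial infarction'
is just a matter of testing falls_under(code, "G30..") for each record. The same
property is what ties navigation to the statistical classification: the only route
to a code is down the one hierarchy built into its spelling.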
> After a point, I ask, what is the difference between the code and
> the object it is supposed to describe? Is Read trying to convert *every*
> tiny expression into an individual coding string?
This would clearly be impossible. Enumerative schemes such as ICD or
READ 2.0 can (just!) be built up to a size of around 150,000
separate terms. Maintaining consistency of scope, methodology and
classification across that range of terms becomes increasingly
difficult, however. This is especially true if the inbuilt classification
intended for statistical analysis is also used as a means of navigation
to find the correct code in the first place. Users of ICD and of READ
2 often complain that the schemes are simultaneously too big (you
cannot find the right code, or even be sure whether or not it exists at
all) and too small (the level of detail available as discrete codes is
insufficient as the basis of a clinical, medico-legal record).
It is a common phenomenon of all enumerative schemes that many of their
users cannot find terms they expect to be present, whilst dedicated fans
of the scheme are always quick to point out that a code exists (cf. Jon
Rogers). The fact remains that the organisation of the terms is
clearly such that most mortal users do not find the coding scheme
usable. Any requirement for super-human memory or persistence in the
use of a coding scheme will mean that, ultimately, it may fail
completely or, at best, be used inconsistently. If the use of the
scheme in the first place was intended to support data aggregation,
then inconsistent usage also amounts to failure.
Nomenclatures sufficiently detailed to record clinically useful
information - satisfactory for medico-legal purposes and decision
support technologies - cannot be built using enumerative techniques.
For this reason, SNOMED, READ 3 and GALEN, to name a few, have been, or
are being, developed using compositional techniques. They make bigger,
more detailed terms by 'sticking' together smaller ones. A few
thousand basic terms can then, potentially, be combined to make several
billion more complex terms. Such massive generative capabilities,
however, present new challenges: how to control the generation so as
to rule out nonsense combinations (e.g. fracture + eyebrow) and, more
significantly, how to classify the resulting combinations - both for
aggregation and for navigation. The principal differences between these
three schemes on these engineering points are:
- SNOMED has no constraining mechanism or classification of compositions.
- READ 3 has some constraining mechanism and limited classification of
  compositions.
- GALEN has an advanced constraining mechanism linked to a formal and
  exhaustive classification.
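
As a toy illustration of what 'constraining' a compositional scheme means, here is
a minimal sketch. It is not how SNOMED, READ 3 or GALEN actually work: the terms,
the categories and the idea of a simple lookup table of permitted pairings are all
hypothetical, chosen only to show nonsense such as fracture + eyebrow being ruled
out at generation time.

```python
from itertools import product

# Hypothetical primitive terms. Each anatomical site is tagged with one
# coarse category; each disorder lists the categories it may apply to.
SITES = {"femur": "bone", "radius": "bone", "eyebrow": "hair", "scalp": "skin"}
DISORDERS = {
    "fracture": {"bone"},
    "laceration": {"skin"},
    "alopecia": {"hair", "skin"},
}


def is_sensible(disorder: str, site: str) -> bool:
    """A pairing is allowed only if the site's category is sanctioned."""
    return SITES[site] in DISORDERS[disorder]


def compose(constrained: bool = True) -> list:
    """Generate 'disorder of site' terms, optionally filtering nonsense."""
    return [
        f"{disorder} of {site}"
        for disorder, site in product(DISORDERS, SITES)
        if not constrained or is_sensible(disorder, site)
    ]
```

Here compose(constrained=False) yields all 3 x 4 = 12 pairings, including
'fracture of eyebrow', whereas compose() keeps only the 5 that the table sanctions.
The numbers grow quickly in the unconstrained case: as purely illustrative
arithmetic, 2,000 primitives taken in ordered triples would give 2000**3, i.e.
8 billion, raw combinations, which is why both the constraining mechanism and the
classification of what survives matter so much.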
GALEN has, until recently, been a purely research and engineering endeavour.
As a result, it covers a much smaller range of medicine than either of the
other two. A commercial GP data entry application based on GALEN technologies
is now nearing alpha-test, and was demonstrated at the VAMP stand at HC95.
GALEN is now working with the national coding and classification centres
in Holland, Sweden, France and Italy to develop new nomenclatures and
classifications of surgical procedures, although still in a research setting.
For more details see:
GALEN WWW site:
http://www.cs.man.ac.uk/mig/galen/
Or:
Putting the Clinical into Clinical Information Systems
JE Rogers DW Solomon
PHCSG Conference 1995
http://www.cs.man.ac.uk/mig/migGeneral/phcsg95.html
GALEN is a research project funded by the EEC Framework III and Framework IV.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr Jeremy Rogers MRCGP DRCOG DFFP MBChB
Clinical Research Fellow
Medical Informatics Group
Department of Computer Science
Manchester University, Oxford Road
Manchester, United Kingdom
M13 9PL
(+44) 161 275 6145 voice
(+44) 161 275 6932 fax
URL http://www.cs.man.ac.uk/mig/people/jeremy.html
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~