JISCMail - GP-UK Archives

GP-UK@JISCMAIL.AC.UK

Subject: Use of SGML for medical records. (was EDIFACT versus SGML)
From: Peter Johnson <[log in to unmask]>
Reply-To: [log in to unmask]
Date: Sun, 29 Dec 1996 22:23:28 GMT
[sorry about the length of this everyone]

Tom

A few statements up front.

I do not promote EMR's using solely codes (eg Read coded, or ICD, Snomed
etc) as the way forward. I don't know what the way forward is, but I do
recognise some of the requirements, which I have not yet seen met by any system.

My worries about the SGML debate [again I point out I have only been
involved in some of the UK debate about this, so it may not be true in the
US] are that the way it seems to be being taken forwards glosses over some
of the same areas that have been glossed over in existing (UK) EMR's, and
the fact that we avoided those areas has come back to haunt us. One of the
claimed advantages - the 'loose structure', which sounds very appealing up
front - turns out to be a major limitation when one tries to use the medical
record for other tasks, or in other geographical locations.

btw I should point out that the primary care EMR's in the UK that use the
Read codes usually use them in conjunction with free text, so they can be
seen as tags to sections of free text. The codes can act as semantic tags
and as terminology tags. It is this type of record I have tried to use for
decision support systems.
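
A minimal sketch of the kind of entry described above, where a code acts as a tag on a section of free text. The field names and the codes ("X001", "X002") are placeholders invented for illustration; they are not real Read codes and this is not how any particular UK system stores its records.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    code: str       # placeholder terminology code, not a real Read code
    rubric: str     # term attached to the code
    free_text: str  # the clinician's own words


def find_by_code(record, code):
    """Return every entry tagged with the given code."""
    return [entry for entry in record if entry.code == code]


record = [
    Entry("X001", "Cough", "dry cough for 2 weeks, worse at night"),
    Entry("X002", "Headache", "frontal, eased by paracetamol"),
    Entry("X001", "Cough", "cough settling, chest clear"),
]

coughs = find_by_code(record, "X001")
print(len(coughs))  # 2
```

The code makes the entries easy to find; whether two entries with the same code mean the same thing for a new task is a separate question, and the one at issue in the rest of this message.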

At 19:30 28/12/96 PST, you wrote: [PJ is >>, TL is >]

>>...How
>>can we create EMR's which are usable for purposes other than that
>>for which they were recorded?
>
>This is exactly where we see the contribution of SGML to be, in which the
>many loosely structured documents that contribute to the clinical record,
>by virtue of sufficient and appropriate markup, become the stakeholder
>neutral and task-neutral carriers of information, interpretable and
                                                   ^^^^^^^^^^^^^
>extractable by intelligent agents.
                ^^^^^^^^^^^^^^^^^^

And this is exactly where I see the problem. For markup to be sufficient,
the DTD would need to be very carefully defined to be as context free as
possible - in task, geography etc. Without this one cannot be confident
that one is interpreting the record in the same way as the author.

I see the central problem to be these shareable tags, which everyone must
understand and whose definition everyone must agree on in all contexts of
use. This isn't a problem that is solved by SGML - this is a terminology and
ontology problem. It
is a much bigger, more fundamental problem - and this is what seems to be
getting glossed over. This foundation needs to be in place before issues of
technical solutions become useful. Otherwise we're building castles on sand.

>>d) The problems facing EMR use in the UK  can be summarized as - we
>>lack standardized, closely controlled definitions of concepts used
>>in health care, which are independent of the context in which they
>>were recorded. This wasn't realized until many years use of loosely
>>structured EMR's, when people tried to use the data in these EMR's
>>for other purposes.

(I probably need to qualify this a bit - after re-reading it! I mean for
those concepts which we wish to share for other tasks. eg Diagnosis - used
for epidemiology, care planning etc, any observations, tests, any actions
taken - all of these concepts are used for many purposes in a medical
record, and the promise of shareable EMR's is that they can be used for many
more.)

>NOW HERE IS WHERE WE DIFFER! Which came first, categories or the
>natural variety of things? In organizing information, does one start
>systematically, or does one start otherwise? I stand with Stephen
>Gould who finds variation to be the sine qua non of life.[For
>example, Gould, Stephen, "Full House;" New York, Random House 1996.]
>He states: "Categories are human impositions upon nature (though nature's
>factuality offers hints and suggestions in return)." Everything I know from
>my background in pathology is that the disease categories to which we
>assign specimens and patients are convenient but flawed constructs.  No
>matter how well we refine them, they fool us.

I agree completely with this. *But* the reason why humans have invented
categories is so they can communicate with each other without
misunderstanding. The medical record is all about communicating. This is
especially what the SGML community are promising. Undoubtedly the best way
to communicate an experience is to expose the person with whom you want to
communicate to exactly the same experience - that way nothing of the natural
variation is lost (although as they have a different background their
experience may be slightly different). However, this is impractical, so we
choose to communicate abstractions which hopefully capture the majority (but
never all) of that experience. In human language the abstractions rely on
the recipient having the same background of understanding. The surrounding
context of use of the abstraction is essential for selecting the right
meaning of the abstraction. If we wish to use the medical record with
'intelligent agents' of lesser ability than a medically trained human,
and/or for different tasks, relying on the intelligent agent to be able to
recreate the experience will fail, when it is given some abstractions which
have been arrived at in a different context. What is worse is that no
'intelligent agent' so far created of which I am aware, can realise when
available data no longer fits its background model of the world, something
which humans are capable of.

So for non human interpreters of data, these abstractions need to be well
defined. Ideally they need to be as context independent as possible (or to
be qualified as valid only in a particular context), if we wish to be able
to use such concepts for other tasks.

>I borrow the following from
>Alfred North Whitehead: "In all systematic thought, there is a tinge of
>pedantry.  There is a putting aside of notions, of experiences, and of
>suggestions, with the prim excuse that of course we are not thinking of
>such things.  System is important.  It is necessary for the handling, for
>the utilization, and for the criticism of the thoughts which throng our
>experience.  But before the work of systemization can commence, there is a
>previous task -- a very necessary task if we are to avoid the narrowness
>in all finite systems... [Thus the framework] should never start from
>systemization.  Its primary stage can be termed assemblage." [Whitehead,
>Alfred N., "Modes of Thought," New York, Capricorn Books, 1939.]

I don't differ with you here. I have never said that systemisation should
come first. But note also in the above passage "System is important.  It is
necessary for the handling, for the utilization..."

You talk about 'intelligent agents' being able to use an SGML EMR.
Unfortunately, all existing intelligent agents I have seen are very
systematic, in the rationalistic tradition. As such they need a very
systematic definition of their domain as a precursor to their use. The trick
is to get the correct systemisation: defined systematically enough to
achieve context and task independence, but not so tightly as to be overly
restricting. But I don't know if this is possible (and I have considerable
doubts).


>SGML is an enabling technology that helps us avoid the narrowness of
>all finite systems.

This is OK provided nothing is claimed of the SGML version of the EMR (or
EHR) that is not claimed for the written record. i.e. no use for tasks other
than those in the mind of the person recording the record. No intelligent
agents that don't share exactly the same understanding of the domain and the
way the record was recorded. i.e. nothing but humans with the right
education, life experience and training. But then there is no advantage to
using it.

>>e) Those promoting SGML seem to believe it will solve such failures
>>of existing UK EMR's, but I see it making exactly the same mistakes,
>>unless there is some universally defined DTD to which every user
>>commits. (and thus an 'intelligent agent' can also commit)
>
>See above: "context free categories" are only context free because
>the context has been stripped off to make them fit some preconceived
>notion, and thus are subject to misinterpretation by omission -- a
>sin more difficult to correct than sins of commission.

See above. If an intelligent agent is to make decisions which may have
considerable impact, I would rather it made decisions solely on concepts
which it clearly understood. I don't want a non human 'intelligent agent'
making decisions based on concepts it has freely interpreted, thinking it
knows how to resolve the ambiguity of the record. This is very dangerous. It
is sins of interpretation which worry me.

{jump to discussion of loosely structured UK EMR's - free text and Read
codes.}

>>The particular hope was that the EMR could be used for
>>epidemiological purposes (not usually in the mind of the HCP at
>>point of entry).
>
>There are well known confusions that make this difficult. The HCP is
>generally in pursuit of a diagnosis or problem statement (which will
>suggest the predictable course of the patient and the most prudent
>therapy). The codes needed must reflect this (but may not). The
>epidemiologist wants to know how it turned out. To borrow from fox
>hunting, the epidemiologist is not interested in the chase, but in
>the kill. There are points in the clinical record where this
>information is approximated, and can be located if it is properly
>tagged.

I agree with the confusions. So how does tagging (which would also have to
be in the mind of the recorder) help this?

[snip]
>
>We would suggest that you have defined an EMR as the data document
>in hand, which has all of the deficiencies that you note.

On the contrary. I have tried specifically designed EMR's. I have tried
tagging as a solution. It solved none of the problems I mentioned - a) it
doesn't record missing distinctions which become necessary for use in a
different task, b) it doesn't make explicit what shift in meaning is
necessary when the context of use is different.

>Is exactly
>that sort of context (and more) that content and process oriented
>(not format oriented) SGML tagging is designed to accomplish --
>without predefining what the categories must be

As I mentioned in my earlier email, the problem isn't inferring what the
original context of entry was (this is usually easy); it is having insight
into what interpretations one must bring to bear on the record once the
context is known. For example, 'hypotension' in England, and in the US, is
not (unless extreme) considered a disease state that needs treating. But in
Germany, I believe, it is regarded as a condition which causes significant
morbidity, and it is treated.

The term 'Tired all the time' is an abstraction which means a great deal to
any UK clinician. US clinicians I have spoken to do not understand this
term. Tagging cannot help this, except to say 'this was recorded in the UK'
- it cannot make you understand the term any better.
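
The point can be sketched in a few lines, assuming XML as a stand-in for SGML markup. The element name `finding`, the attribute `recorded-in` and the rule table are all hypothetical; the rules merely paraphrase the examples above and are not clinical guidance. The markup hands the agent the context of recording, but the agent can only act where somebody has already defined what the term means in that context.

```python
import xml.etree.ElementTree as ET

# Hypothetical interpretation rules, paraphrasing the examples above.
# Illustrative only, not clinical guidance.
RULES = {
    ("hypotension", "UK"): "not treated unless extreme",
    ("hypotension", "DE"): "treated as a cause of morbidity",
}


def read_entry(markup):
    """Pull the term and its recording context out of a tagged entry."""
    node = ET.fromstring(markup)
    return node.text.strip().lower(), node.get("recorded-in")


def interpret(markup, rules=RULES):
    """Return an interpretation, or None when the agent has no rule."""
    term, country = read_entry(markup)
    return rules.get((term, country))


print(interpret('<finding recorded-in="UK">Hypotension</finding>'))
# not treated unless extreme
print(interpret('<finding recorded-in="UK">Tired all the time</finding>'))
# None
```

In the second case the tag is parsed perfectly well and the agent knows the entry was recorded in the UK, yet it is no nearer to knowing what the term means. Filling in the rule table is the terminology work, and the markup does none of it.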

>-- after all, less
>than four years ago in the US we would not have dreamed of the
>organizations that have evolved, and the skill mixes that they use.
>The categories that you suggest above have already become instant
>legacies in many parts of our country, because they do not meet the
>richness of the alternatives: was the encounter by a physician's
>assistant? was a therapy overruled by a secondary screener, and with
>what background? was a medication substituted with or without the
>agreement or knowledge of the prescribing physician? We would
>suggest that you do not have an EMR/EPR/EHR without such data.

An EMR is a contemporaneous record. The fact that a term in the record may
be legacy after a year (or a week) is irrelevant - it was considered to be
an accurate record at the time of recording. Indeed it is right that a term
should become a legacy term if its meaning should drift. It is a well known
problem of medicine (and KB design in general) that the meaning of a concept
may change over time. To take the obvious (dramatic shift) example - to say
someone was 'gay' 20 years ago in the medical record shouldn't be
interpreted in the same way as if it was recorded 10 years ago. But if an
agreed concept of homosexuality was available this problem would never arise.

[regarding UK EMR's]
>As I understand it, you have a labor intensive recording scheme aimed at
>one player in an increasingly integrated practice endeavor.  That is not
>an EHR/EPR or even an EMR, at least by our standards.  In a more
>interactive on-line scheme, the necessary data are accumulated in the
>course of doing business.  However, clearly, this is not yet the case
>except in a few settings, which nevertheless are proof of concept. (Not to
>mislead, none to my knowledge use SGML.)

In fact nothing could be further from the truth. Most primary care medical
records in the UK are recorded by many people involved in the care of
individuals, during the course of normal business, usually with the patient
present at the time of recording.

>>This is a well known problem - drug trial data collection for
>>example is specifically tied to that task, usually by using
>>questionnaires with tightly defined terms -
>
>Unless you have quite different physicians in practice than we have, labor
>intensive questionnaires that do not have a meaningful reward for the
>physician or user in question will not be filled out, except under duress.
>And under duress they will not be filled out except in a wasteful
>pro-forma way.

No, we have the same humans here. I did not mean to propose long data entry
questionnaires.

>A good hospital example, is the epidemiology of nosocomial
>infections, which presently uses especially tasked nurses as clerks to
>collect the data.  Much of it, however, would be derivable from a more
>integrated system.

This is an assumption you make which I am saying is wrong. *Some* of the
information may be available from an integrated system. Age, sex, for
example are pretty universal, task independent concepts. In my experience,
other items of information that are needed may appear to be present in the
record, but turn out to have been recorded for another use which affects
their validity for this new task. The big problem is knowing which items of
data in the record are safe to use, and which aren't.
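
As a rough sketch of what that check might look like, assuming each item carries the task it was recorded for: the class, the field names and the `TASK_INDEPENDENT` set are all invented for illustration. Only age and sex are placed in that set, because those are the only concepts named above as pretty universal.

```python
from dataclasses import dataclass


@dataclass
class Item:
    concept: str
    value: str
    recorded_for: str  # task in the recorder's mind (hypothetical field)


# Concepts assumed to mean the same whatever the task.
TASK_INDEPENDENT = {"age", "sex"}


def split_for_reuse(items, new_task):
    """Separate items safe to reuse from those needing human review."""
    safe, review = [], []
    for item in items:
        if item.concept in TASK_INDEPENDENT or item.recorded_for == new_task:
            safe.append(item)
        else:
            review.append(item)
    return safe, review


items = [
    Item("age", "54", "registration"),
    Item("smoker", "no", "insurance report"),
    Item("blood pressure", "150/95", "epidemiology"),
]
safe, review = split_for_reuse(items, "epidemiology")
print([i.concept for i in safe])    # ['age', 'blood pressure']
print([i.concept for i in review])  # ['smoker']
```

The filter itself is trivial. Everything rests on deciding which concepts belong in the task independent set, and that decision is the definitional work that has to be done first.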

> Then an evaluation could be made by a knowledgeable
>individual examining each case from a terminal using a DSS... with a call
>to the floor to resolve ambiguities.

I have been involved in devising systems to do this over the past 6 years or
so. It is not so easy. Most items of information need confirmation that they
are valid in this context of use, and then the user needs to see what the
context of recording the item was, so all the interpretation is being done
manually by a human agent who we presume has sufficient background in the
context of use and the context of recording. Where is the gain?

[snip]
>>What I am saying here is that in current medical records, the
>>context and definitions are made explicit prospectively - before the
>>record is made. The task - the way the record will be used, is
>>integrally bound with the record. The record cannot be considered
>>valid outside the task/context in which it was recorded.
>
>That is properly observed, and what a SGML markup, properly
>introduced, is intended to correct.

But how can it possibly correct this?

>>These two problems are fundamental to the hopes for EMR's.  SGML
>>offers no answer to them at all.
>
>Now just stop for a minute, and tell me why this is so... as it runs
>directly counter to our observations, but perhaps not on the same data..

The problems are as I put below:

>>Basically they come down to the questions:
>>
>>1) Is it possible to define the semantics of an EMR (say tags) and
>>the terms in a way which makes them task independent? i.e. they mean
>>the same if I am to use them for medico legal purposes, or for
>>decision support purposes, epidemiology, in primary care, in
>>secondary care? (etc.)
>>
>>2) If I can achieve 1), is it usable in practice, by the normal HCP?
>>
>>Unless both 1) and 2) can be answered, we have failed in being able
>>to use the EMR across care and geographical boundaries, for purposes
>>other than that in the mind of the HCP recording the data.
>
>Properly observed... This is the direction we are taking.

OK, so in that case, you are going to have to spend a considerable portion
of your time answering 1) above. In doing so, you will produce a task and
context independent terminology and semantics of medicine. An ontology of
the domain. This is such a big and problematic undertaking that I cannot see
why this isn't the focus of your efforts, as the effort required to achieve
this dwarfs any representation issues such as the choice of SGML.

[snip]
>>In fact making the suggestion that it [sgml] answers such problems, and
>>makes a medical record widely available outside the context (task,
>>geography, ethnic) of its recording, is downright dangerous, as
>>inferences may be falsely drawn in such circumstances.
>
>These are blatant statements for which you surely have some grounds,
>but I certainly see the matter differently. The ability to capture
>the relevant context by no means guarantees that this context will
>be captured,

I've answered this above - it isn't capturing the context that is important,
but knowing how to interpret the record given that context. The use of SGML
offers nothing here. This is the area I feel is being glossed over, but it is
of fundamental importance to the reuse of medical records.

>but this is exactly the enabling capability that one is
>looking for with SGML -- ... Given that the information has been brought
>together, nothing will ever prevent false inferences from being
>drawn from it.

But this is the very challenge of sharing the medical record. While it is a
difficult challenge, I do believe that it is possible to go some way down
the road to avoiding false inferences.

>To pick the example before us, starting from the same
>ISO standard, the inferences and conclusions that you draw about
>SGML are almost exactly the reverse of what we put forth as
>considered hypotheses (by no means yet properly tested). Without
>these tests, we will never know. (This is not to say that
>exploratory tests have not been done , which are encouraging.)

I'd be fascinated to see any tests of using the captured data for other
tasks, because what you are proposing is opposite to the direction being
taken by the AI KBS community on reusable and sharable knowledge bases.

>>Given this [opinion -- see above], it is my belief that using SGML
>>in itself will in no way forward the development of EMR's.
>
>We need testable hypotheses.  Without insight, dedication, and skilled
>use, no tool, neither a stethoscope nor a surgical knife, nor set of
>computing conventions can further the task set before it.  The real
>question is: does the tool in skilled hands make it easier?  Is the tool
>enabling?  Once this is accomplished, the experience can be consolidated
>for routine practice. All of medical development is like that.

I agree 100%

>>But deciding what tasks one wants to use an EMR for, and then
>>defining a DTD for all concepts one wishes to record, in a way which
>>means that all users accept the definitions, will use the tags, and
>>can do so in the time available, would be a major leap forward.
>
>But this precommitted focus contradicts your points above -- unless
>it means adding an onerous set of questions to given report form --
>or unless it suggests a means of (tentatively and automatically)
>collecting a broader range of information by a wider reach of
>transaction processing.

It doesn't contradict at all - because I suspect it would be very difficult
to use, as you suggest. But I would say it is a very good hypothesis to test.

>>But of course there are many other ways of doing this which are
>>arguably better, and SGML seems a strange technology to focus on -
>>it is just one possible solution of many.
>
>Please suggest some.

My comment here was meant in the context of Ontology construction -
essentially what you need to do to create the DTD mentioned above, and what
I see as the fundamental problem. Ontology construction has been much
studied in the AI and Philosophy communities, and tools and indeed medical
ontologies have been developed (or at least started upon).

>>SGML may well have a place in the communication and storage of
>>EMR's.
>
>Those are its strengths

but I want more than that. I want a record which we can use once it has been
communicated, without being fearful that I, or my intelligent agent, is
drawing false conclusions from the record. A genuinely shareable medical record.

I don't know the answers, but in my opinion (and this is with experience)
loosely structured data will not be sufficient. I agree that filtering the
EMR solely into codes is grossly distorting the record, so this will not do
either. I suspect we need both ... sounds suspiciously like SGML, I suspect
you will say. BUT only if the shareable Ontology of the domain is sorted
first, and then incorporated in the DTD. And the (ongoing) definition of
this, to me, is the major task, what needs focussing on, and it is being
glossed over.

Pete

---
Peter Johnson
[log in to unmask]
(+44) 1525 261432


