Dear all,
I'd like to describe a project we are currently working on. Perhaps you
will find our participation in your project helpful.
In 2000 we started improving our metainformation system, which consisted
merely of databases and contacts. It did not help much with organizing the
environmental information.
So we have been working on an environmental catalog consisting of:
- databases and the structure of the information that can be found in them
- expert and organisation directory
- authorised documents
- services, etc.
All these types share the same cataloguing scheme: a thematic thesaurus
(GEMET), a spatial one (our own) and a temporal/event one (still a simple
one). The thesauri include synonyms, and GEMET has been extended with
synonyms and specific new descriptors (ca. 10 000 - ) to increase search
accuracy.
We have been running a first version of a structured linguistic analyzer, a
"simple" gadget that uses word morphology to parse the user query against
all the thesauri and returns a more specific form of the query to the user.
After decomposition into theme/time/space, the query is used as a keyword
search in the catalog. If the search also returns an information source, we
ask that source for "signal information", i.e. a concise answer that does
not reveal the real data but only states whether information for the
specified theme/time/space is present.
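
As a rough illustration only (this is not the actual analyzer), the
decomposition and the "signal information" step described above could be
sketched as follows. The tiny thesauri, the catalog entry, the holdings and
all function names are assumptions made up for this example, and plain
lower-casing stands in for real morphological analysis.

```python
from dataclasses import dataclass

# Hypothetical miniature thesauri: term or synonym -> preferred descriptor.
THEME = {"air pollution": "air pollution", "smog": "air pollution", "noise": "noise"}
SPACE = {"prague": "Prague", "praha": "Prague", "brno": "Brno"}
TIME = {"2001": "2001", "2002": "2002"}


@dataclass(frozen=True)
class DecomposedQuery:
    theme: frozenset
    space: frozenset
    time: frozenset


def normalize(text):
    # Stand-in for morphological analysis: lower-case and strip punctuation.
    return " ".join(w.strip(".,;:?!") for w in text.lower().split())


def find_terms(text, thesaurus):
    # Whole-word (or whole-phrase) lookup of every thesaurus term in the text.
    padded = f" {text} "
    return frozenset(d for term, d in thesaurus.items() if f" {term} " in padded)


def decompose(query):
    text = normalize(query)
    return DecomposedQuery(find_terms(text, THEME),
                           find_terms(text, SPACE),
                           find_terms(text, TIME))


# Hypothetical catalog: each source is indexed with the same three facets.
CATALOG = [
    {"name": "Air quality database", "theme": {"air pollution"},
     "space": {"Prague", "Brno"}, "time": {"2001", "2002"}},
    {"name": "Noise map", "theme": {"noise"}, "space": {"Brno"}, "time": {"2002"}},
]

# Hypothetical holdings of a source: (theme, space, time) combinations it has data for.
HOLDINGS = {
    "Air quality database": {("air pollution", "Prague", "2001"),
                             ("air pollution", "Brno", "2002")},
    "Noise map": {("noise", "Brno", "2002")},
}


def search(q):
    # Keyword search: an empty facet in the query places no constraint.
    def hit(entry):
        return all(not wanted or bool(wanted & entry[facet])
                   for facet, wanted in (("theme", q.theme),
                                         ("space", q.space),
                                         ("time", q.time)))
    return [e["name"] for e in CATALOG if hit(e)]


def signal(source_name, q):
    # "Signal information": only whether matching data exists, never the data.
    def fits(record):
        theme, space, time = record
        return ((not q.theme or theme in q.theme)
                and (not q.space or space in q.space)
                and (not q.time or time in q.time))
    return any(fits(r) for r in HOLDINGS[source_name])
```

In this sketch a query such as "smog Brno 2001" still finds the air quality
database by keywords, but the signal step reports that nothing is present
for that exact combination, which is the distinction the catalog alone
cannot make.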
I know this concept has already been discussed broadly, and here I'd like
to thank Thomas Bandholtz, who unknowingly gave me this idea.
We have extended the concept somewhat and now expect it to bring its first
results.
We put the emphasis on the consistency of the whole
query/search/retrieval/presentation chain, and this is our most important
achievement.
Currently the individual parts of the project (i.e. portal, analyzer,
catalog, information broker and database connectors, thesauri) work more or
less, and we plan to stitch them together before September 02.
We are very interested in joining the club, since this project can bring
us new ideas and help us verify our goals.
To Stefan:
We have an unofficial Czech GEMET translation. We have been working for
more than a year on refining the thesaurus; many terms are duplicated
because of the nature of Czech expert language.
We plan to send it to you at the end of July?
All the best,
Jiri
"Bandholtz, Thomas" <[log in to unmask]>@JISCMAIL.AC.UK>
on 18.05.2002 13:30:26
Please reply to: "Dublin Core Metadata Initiative's Environment Special
Interest Group" <[log in to unmask]>
Sent by: "Dublin Core Metadata Initiative's Environment Special
Interest Group" <[log in to unmask]>
To: [log in to unmask]
Cc:
Subject: 6th Framework
Hello all,
I would like to draw your attention to taxonomies. Applied to DC, this means
a controlled vocabulary whose terms are assigned as values to the metadata
attributes.
You all know we have GEMET
http://www.mu.niedersachsen.de/cds/etc-cds_neu/library/select.html,
a thesaurus in 19 languages.
Those who were in Thun (CH) last year, or at Expo00 the year before, may
remember that I gave presentations about the thesaurus-based
auto-classification we use in the German Environmental Information Network
http://www.gein.de/index_en.html, and that I was asking for people who
could provide (or develop) the same (or a similar) auto-classification in
their own language.
Now I think the time has come. The EU is preparing the IST Programme in FP6
http://www.cordis.lu/ist/fp6/workshops.htm. This week I attended the
Knowledge Management Workshop in Luxemburg, and I think this is exactly the
kind of thing they are looking for.
I already contributed to the 6th Framework Consultation Meeting:
'Technologies for Major Societal Challenges' last year in Brussels, with a
proposal named "European Environmental Topic Map", and this has been
accepted as part of the subjects to be sponsored - not yet as a project.
The final call for proposals will be published in Q4 this year. I propose
to do the following:
1. Select a set of sample documents (test cases)
2. Have them translated into all languages (currently 19)
3. Re-organize & enhance GEMET as a topic map
4. Discuss some common classification methods
5. Find the language-specific challenges
6. Develop the (currently 19) language-specific text analysis modules
7. Apply them to the test cases: Each of the languages should result in
exactly the same GEMET descriptors.
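
To make the acceptance criterion in step 7 concrete, here is a minimal
sketch of such a cross-language check. Everything in it is an assumption
for illustration: the descriptor ids are made-up placeholders rather than
real GEMET identifiers, the term lists and sentences are invented, and
naive substring matching stands in for the language-specific text analysis
modules of step 6.

```python
# Hypothetical per-language term lists: surface term -> descriptor id.
# "D1" and "D2" are placeholders, not real GEMET identifiers.
TERMS = {
    "en": {"air pollution": "D1", "waste water": "D2"},
    "de": {"luftverschmutzung": "D1", "abwasser": "D2"},
    "cs": {"znečištění ovzduší": "D1", "odpadní voda": "D2"},
}

# One hypothetical test case: the same document translated into each language.
TEST_CASE = {
    "en": "Air pollution and waste water in industrial regions.",
    "de": "Luftverschmutzung und Abwasser in Industrieregionen.",
    "cs": "Znečištění ovzduší a odpadní voda v průmyslových oblastech.",
}


def classify(text, lang):
    # Stand-in for a language-specific analysis module: plain substring match.
    lowered = text.lower()
    return {descriptor for term, descriptor in TERMS[lang].items() if term in lowered}


def inconsistent_languages(test_case, reference="en"):
    # Return the languages whose descriptor set differs from the reference language.
    expected = classify(test_case[reference], reference)
    results = {lang: classify(text, lang) for lang, text in test_case.items()}
    return {lang: found for lang, found in results.items() if found != expected}
```

With the sentences above the check reports no deviations. Replacing the
Czech phrase "odpadní voda" by an inflected form such as "odpadní vody"
makes the naive matcher miss the descriptor, which is exactly the kind of
language-specific challenge step 5 is meant to uncover.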
What do you think about this?
I can imagine the reaction of my US parent company Schlumberger when I ask
for some funding for this project: they'll die laughing about us Europeans
going bananas with our 19 languages.
Well, let's give them an example of what will be "KM made in Europe"!
Cheers
Thomas Bandholtz
CM / KM Division Manager; XML Network Moderator
Competence Center Content Management
SchlumbergerSema
http://www.schlumbergersema.com
Kaltenbornweg 3
D50679 Köln / Cologne
Germany
+49 221 8299 264