Eric,
Thanks for the informative message. As your original message about the
qualifier ballot hinted at, the typology in the ballot builds on your
modelling of the qualifiers in RDF. All of us familiar with RDF could, I'm
reasonably certain, see the nature of the graphs that lie behind your
thinking.
That said, let me refine my sense of discomfort and clarify why I'm being
such a simpleton about all this.
As I believe I've said to you in the past, we've played this strange
bait-and-switch game with RDF and DCMES. We claim at one moment that the
underlying modeling of DCMES is based on RDF, but then at the next that
"real" RDF is too complex, and so we try to avoid its tools and modeling
formalisms.
I think you realize that I fully support the notion that complex layered
descriptions require well-thought-out formalisms. By layered, I mean
exactly what we are trying to do with the 'agent type' or 'object type'
proposal. What we are trying to represent is exactly what RDF does quite
well: the notion that the 'thing' sitting at the end of a property arc can be
a simple string or another resource (which itself has arcs originating from
it). And yes, that is distinctly different from the problem of what we call
'semantic refinement', which has a very different graph representation using
class hierarchy constructs.
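To make the two value shapes concrete, here is a minimal sketch of that graph, using plain Python tuples as (subject, predicate, object) arcs. All the names ('paper1', 'ex:name', and so on) are made up for illustration; they are not official DCMI or RDF terms.

```python
# Two shapes for the 'thing' at the end of a property arc, modeled as a
# set of (subject, predicate, object) triples. Identifiers are
# illustrative, not official DCMI or RDF names.
graph = set()

# Case 1: the value is a simple string.
graph.add(("paper1", "dc:title", '"The answer is 42"'))

# Case 2: the value is another resource, with arcs of its own.
graph.add(("paper1", "dc:creator", "person1"))
graph.add(("person1", "ex:name", '"Eric"'))
graph.add(("person1", "ex:affiliation", '"OCLC, Inc."'))

# The arcs originating from the value resource:
creator_arcs = [(p, o) for (s, p, o) in sorted(graph) if s == "person1"]
print(creator_arcs)
```

The point of the sketch is only that case 2 cannot be flattened into a string without losing the sub-structure, which is exactly the layering problem the 'object type' proposal is about.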
This is all very clear and well founded when we use the actual RDF formalism
and express qualification in those terms. However, the field gets
incredibly muddied when we then try to present this layman's view of RDF,
start making up terms such as 'object type', and pretend that this is
something applicable only to the creator, contributor, publisher trio and
is not, in fact, applicable to all properties in the DCMES.
I'm happy to express principles for qualification that extend uniformly
across all the DCMES elements. If we want to write up a document that says
"qualification of DCMES elements is based on the notions expressed in the
RDF data model", and then goes on to say that element semantics can be
sub-classed and that values can be either simple strings or other resources
(entities) that themselves have properties of their own, that's all fine.
But let's use the real terminology, and let's not duck the problem by
coming up with 'pretend RDF', as if obfuscating the complexity helps the
non-technical person understand it better. What happens instead is that we
confuse the experienced (us) and offer nothing to the inexperienced.
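As a rough sketch of what 'sub-classed' element semantics means in graph terms, again in plain Python with made-up names: a statement made with the refined property also answers a query for the broader one.

```python
# 'Semantic refinement' as a property hierarchy, in plain Python.
# 'dc:modified' refining 'dc:date' means a 'dc:modified' statement
# also counts as a 'dc:date' statement. All names are illustrative.
subproperty_of = {"dc:modified": "dc:date"}

triples = {("paper1", "dc:modified", "1999-02-03"),
           ("paper1", "dc:title", "The answer is 42")}

def query(prop):
    """Triples whose predicate is prop or a direct refinement of it."""
    return {(s, p, o) for (s, p, o) in triples
            if p == prop or subproperty_of.get(p) == prop}

# Asking for the broad 'dc:date' also surfaces the refined statement.
print(query("dc:date"))
```

This is the class-hierarchy construct in miniature: refinement lives in the schema (the `subproperty_of` table), not in the shape of the data itself, which is why it has a different graph representation from the layered-value case.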
As you know, I feel pretty strongly about how we got ourselves into this hole
in the DM working group when we decided not to use perfectly adequate RDF
formalisms and instead drifted into some funny graph structures that had
non-deterministic translations.
That all said, we still have to deal with two other outstanding problems if
we relent and use 'real RDF'.
- DCMES has all along tried to present itself as a somewhat simple and usable
alternative to complex things (e.g. MARC, AACR2). If we give in to the
notion that complex resource description is hard and needs real formalisms,
we might throw out this 'simple' baby (in the interest of saving the
'complex' bathwater). That is what lies behind my insistence that perhaps
we should restrict qualification in DCMES to 'simple' concepts that can be
expressed in relatively simple terms, as in my (and now Tom's) principles
document.
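For contrast, the 'simple' end of that spectrum, where every value is a flat string, is roughly what embedded DC looks like in practice. A hypothetical HTML fragment follows; the element names follow common DCMES usage, but the exact qualifier syntax shown is illustrative, not a DCMI recommendation.

```html
<!-- Flat DCMES description: every value is a simple string, and the
     only qualification is an encoding scheme. Illustrative only. -->
<meta name="DC.Title"   content="The answer is 42">
<meta name="DC.Creator" content="Eric Miller">
<meta name="DC.Date"    scheme="W3CDTF" content="1999-02-03">
```

Nothing here needs a graph formalism, which is precisely the argument for keeping this tier of qualification inside the principles document.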
- The 'will RDF ever arrive' problem. Here we are in the 21st century, and
there still ain't no practical, available tools for the standard practitioner
to use RDF. If we are going to push complex qualification, and if we are
going to say that doing so requires a good formal modelling tool (e.g., RDF),
then we owe it to our constituency to give them tools so they don't have to
hand-construct nodes and arcs.
Concluding this long stream of consciousness: we have this schizophrenia
between complexity and simplicity. As you state quite well in your message,
the RDF folks banged up against this and came up with a quite elegant
way of modelling this complexity. The DCMI seems to want to go even
further by saying we can 'dumb down' complexity to the point where
complexity is simple (note the horror at my use of the term 'data types' in
my original principles document). If we want Joe or Jane Sixpack librarian
to be able to create qualifiers then, yes, we must have principles that he
or she understands. But if we want to enable Joe or Jane to create those
qualifiers for more complex layered descriptions (the arcs pointing to
resources problem), then we need to give them the proper modelling tools to
do it. As you know, my bias leans heavily towards providing Joe and Jane the
simple principles and perhaps accepting the fact that complex descriptions
might be out of the domain of DCMES.
Carl
> -----Original Message-----
> From: Miller,Eric [mailto:[log in to unmask]]
> Sent: Monday, January 24, 2000 4:49 PM
> To: [log in to unmask]
> Subject: Simple grammars, RDF, object-types, and the meaning
> of life...
> ( long)
>
>
> Stu's grammar comment on last week's teleconference and Tom's
> recent thread
> on the subject remind me of a long (and somewhat painful :)
> series of RDF
> teleconferences on a similar subject. The gist of those
> teleconferences may
> be useful in helping show how elements, qualifiers and
> resources may fit
> together to make grammars that we (the collective 'we') may
> understand.
>
> The Dublin Core data model working group has agreed that RDF
> is the base
> model. RDF is about machine-understandable data. Basically,
> if the web had
> a simple, unambiguous grammar for expressing information,
> machines could
> "understand" this and (hopefully) make our human lives
> easier. The RDF data
> model is the basis for this grammar. This simple grammar provides the
> formal means for communities to write sentences, some of which
> are understood
> across communities, some of which may be understood only in a
> local context.
> Additionally, generic tools (creation, database, etc.) that
> understand this grammar can be built and used by all
> communities that
> speak in this particular manner.
>
> In the context of grammars... the RDF grammar is simply built on the
> following basic constructs: 'subject', 'predicate' and
> 'object'. The idea
> is that these constructs can form simple sentences which, when
> combined with
> other sentences, make more complex statements.
>
> The following is a simple example of this:
>
> A 'Paper' has a 'title' of 'string("The answer is 42")'.
>
> In this case,
>
> - 'Paper' is the 'subject'
> - 'title' is the 'predicate'
> - and the string "The answer is 42" is the 'object'.
>
> Combined with other sentences, we can make more descriptive
> statements:
>
> A 'Paper' has a 'creator' of 'Person'.
> A 'Person' has a 'name' of 'string("Eric")'.
> A 'Person' has an 'affiliation' of 'string("OCLC, Inc.")'.
>
> While this may sound a bit silly to humans (we generally
> don't talk this
> way), the unambiguous structure of these sentences is a huge
> step for the web
> and for interoperability across communities. RDF does not, however,
> define any semantics for these sentences' components
> ('subjects', 'predicates'
> and 'objects'), but rather provides the necessary capabilities for
> descriptive communities (such as Dublin Core) to fill these in.
>
> To this end, we (the Dublin Core) have started to do just
> this... In this
> grammar context, the DCES is a set of well defined, commonly
> understood
> 'predicates'. One way of viewing (at least one way I view)
> what the element
> working groups have delivered is as the 'names' and semantics
> for the rest of
> the components of the grammar. The goal is the
> ability to make
> simple sentences that are commonly understood among communities.
>
> Normalizing these deliverables took the better part of the
> holiday. Luckily
> for the editors, however, there was Andy's template, which (most,
> but not all) of
> the element working groups filled out. Andy's template
> provided the basis
> for the element working groups to normalize their work in a
> way that helped
> articulate which part of the grammar was being recommended.
>
> For example: One thing this template provided was the clear
> articulation of
> what "semantically refined" what. So for qualifiers that
> refine a particular
> element, we can say...
>
> A 'date' has a 'semantic refinement' of 'modificationDate'.
>
> And thus we would say:
>
> A 'Paper' has a 'modificationDate' of 'string("1999-02-03")'.
>
> and know 'modificationDate' is a specialization of 'date'.
>
> Another thing this template provided was the permissible
> "encoding scheme"
> for a particular element. For example:
>
> A 'date' has an 'encodingScheme' of 'W3CDTF'.
>
> And thus we would use this in conjunction with RDF to say
> something about the
> way the text was encoded:
>
> A 'Paper' has a 'date' of 'W3CDTF'.
> A 'W3CDTF' is a 'type' of 'encodingScheme'.
> A 'W3CDTF' has a 'value' of 'string("1999-02-03")'.
>
> Again, somewhat awkward for humans to say, but clear for machines to
> understand.
>
> In any case, for making unambiguous statements it is critical that
> everything has context. To this end, RDF suggests each (non-string)
> part of
> the grammar have a unique identifier. This means when we see
> 'date', for
> example, we (and machines) understand this is the 'date' as
> defined by the
> DCMI as opposed to some other kind of 'date' ("the small
> brown fruit" in the
> context of the produce industry comes to mind...). When you
> hear talk about
> namespaces, this is the problem they try to solve.
>
> In the above example, a namespace would be used to define
> where different
> parts of the grammar come from... in this case RDF (type,
> value) and Dublin
> Core (date, W3CDTF, encodingScheme).
>
> 'Paper' and 'Person', however, are names for 'subjects' I
> made up... I made
> these up because I needed them to communicate with members on
> this list. I
> (as well as a growing set of implementors) also, however,
> need these to
> communicate across applications. If we can agree on unique
> 'names' for
> basic concepts across applications, we've taken a huge step up the
> interoperability curve.
>
> Without 'names' for these concepts, however, we would have
> the following:
>
> A 'Resource' has a 'title' of 'string("The answer is 42")'.
> A 'Resource' has a 'creator' of 'Resource2'.
> A 'Resource2' has a 'name' of 'string("Eric")'.
> A 'Resource2' has an 'affiliation' of 'string("OCLC, Inc.")'.
>
> Now, if the previous examples sounded silly, the above one sounds even
> sillier. It's still useful, please don't get me wrong, just
> not as useful
> as we may like. The Dublin Core giving names (semantics) to
> these 'subjects'
> and 'objects' helps in the creation of sentences that make
> sense across
> communities. And I think we *all* want this.
>
> The Agent group has proposed a set of these named things that
> can be used as
> either 'subjects' or 'objects'. The Type group has proposed a set of
> strings (but if we give these things unique identity, they
> can be viewed as
> another such set). Regardless of what the DCMI calls the generic
> class of these
> things (right now we're using 'object-type' [1]), these
> components are needed
> sooner rather than later to communicate effectively among communities.
>
> --
> Eric Miller http://purl.oclc.org/net/eric
> Senior Research Scientist mailto:[log in to unmask]
> Office of Research phone:614.764.6109
> OCLC Online Computer Library Center fax:614.764.2344
>
> [1] Please note... 'object-type' is the *generic* name for
> this class of
> qualifier. This is not a final term nor definition, just
> something we (the
> editors) needed to differentiate this from the other kinds of
> qualifiers.
> 'Resource-type' or 'Class' is also possible if it makes more
> sense to the
> group. This term is *not* meant, however, for just the
> things being defined
> by the Agent working group, but rather (like 'semantic refinement' and
> 'encoding scheme') is a general term.
>
>