JISCMail - DC-ARCHITECTURE Archives

DC-ARCHITECTURE@JISCMAIL.AC.UK

Subject:
Schema.org Alignment Task Group *informal* telecon - TODAY
From:
Thomas Baker
Reply-To:
DCMI Architecture Forum
Date:
Mon, 9 Jan 2012 10:06:51 -0500
Dear all,

Here is the agenda for our *informal* call THIS MORNING -- not
for discussing the substance of Schema.org alignments but to discuss
the practicalities of publication.

Tom


Schema.org Alignment Task Group *informal* telecon

This agenda: http://wiki.dublincore.org/index.php/Schema.org_Alignment/Telecon_20120109_Agenda
Chair: Tom
Date: Monday, 2012-01-09
Time: 11:00 AM Eastern Std Time
Dial-in: +1-218-936-4141, participant Access Code 334034
IRC: irc://irc.freenode.net/#dcmi
Mailing list: http://www.jiscmail.ac.uk/lists/dc-architecture

Note: This will be an informal call to discuss practical solutions that need
to be put into place before we can take a decision on Schema.org alignments.
On Monday's call, we will _not_ discuss issues of substance - only of process
and practice.

1.  Publication of mappings

    Problem: we need a better set-up for collecting comments than wiki pages:
       http://wiki.dublincore.org/index.php/Schema.org_Alignment/Mappings
       http://wiki.dublincore.org/index.php/Schema.org_Alignment/Mappings_Details

    Note:
    -- Posting comments on specifics to dc-architecture will not scale
    -- Comments should be part of what is published, and comments should continue
       to be collected after publication

    Jon has proposed a way to put mappings, already in RDF, under version
    control with Git: https://github.com/jonphipps/Example-Map
    We could put these under: http://github.com/dublincore

    How could we provide a human-readable view of these mappings (e.g., using Lode
    or Parrot)?

2.  Process of deciding alignments

    The process of getting good-enough consensus on mappings does not need to
    be overly formal, e.g., with precise voting rules.  However: we do need to be
    clear about the informal process.

    Proposal:
    -- Basis for decision must be published with:
       Proposed alignment statements.
       Details of semantics of the properties and classes being aligned, as in:
       http://wiki.dublincore.org/index.php/Schema.org_Alignment/Mappings_Details

    -- A structured way to collect feedback and comments on mappings has been put
       into place -- not only in order to prepare the "vote" but also as a way
       of collecting feedback after publication, i.e., as a way of identifying
       alignments that may need to be revisited in the future.

    -- When the information and comment environment are in place, we hold a telecon
       in which we walk through the list, discuss any issues arising, and get approval
       for the alignments among the attendees of the call.

    -- We publish the telecon-approved alignments as a draft to the world, publicize
       the draft for a comment period of, say, two weeks, before declaring them
       officially "published".

3. Alignment "vote"

    Decide on a timetable for the above and set a rough date for a telecon in which
    we take a decision (which will be subject to a public comment period before publication
    as described above).

4. Issues tracker (time-permitting)

   See: wiki page about issue tracking at W3C [1].
   -- Tracker is great, but it is only available for use by W3C working groups [2]
   -- Bugzilla: powerful, but said to require investment in time to learn [3]
   -- RoundUp [4], a ten-year-old Python project, used to track Python and IETF projects
   -- Assembla [5] -- a cloud-based service to which DCMI would need to subscribe
   -- Jira [6], subscription required - said to be good for "document-oriented" issues

   [1] http://www.w3.org/wiki/TrackingIssues
   [2] http://www.w3.org/2005/06/tracker/
   [3] http://www.bugzilla.org/
   [4] http://roundup.sourceforge.net/
   [5] http://www.assembla.com/
   [6] http://www.atlassian.com/software/jira/




> Schema.org Alignment Task Group 2011-12-12 Telecon Report
>
> Chair:    Tom Baker
> Attended: Tom Baker, Dan Brickley, Stuart Sutton, Bernard Vatant, Ahsan Morshed, Jon Phipps,
>           Antoine Isaac, Kirsten Jeude, Corey Harper, Jane Greenberg, John Kunze, Ed Summers,
>           Diane Hillmann
> Date:     2011-12-12, Monday
> Agenda:   http://wiki.dublincore.org/index.php/Schema.org_Alignment/Telecon_20111212
> Note:     This report integrates some follow-up discussion after the meeting.
>
> ----------------------------------------------------------------------
> Links
> --  Wiki page for this Task Group
>     http://wiki.dublincore.org/index.php/Schema.org_Alignment
> --  Bernard Vatant's proposal
>     http://wiki.dublincore.org/index.php/Schema.org_Alignment/Mappings
> --  Bernard's proposal with details added
>     http://wiki.dublincore.org/index.php/Schema.org_Alignment/Mappings_Details
> --  DC-ARCHITECTURE mailing list
>     http://www.jiscmail.ac.uk/lists/dc-architecture.html
>
> ----------------------------------------------------------------------
> Background on Schema.org (Dan)
>
> Dan: http://schema.org/ is hosted at Google. Other search engines collaborate.
>
> One recent extension is "jobs" vocabulary, and vocabularies are brewing for
> medicine and television.  Doing as much of this work in public as possible.  We
> have created a Web Schemas interest group at W3C [1], with tools like an issues
> tracker, public mailing list, wiki.  Trying to figure out the social process
> for extensions.
>
> [1] http://www.w3.org/2001/sw/interest/webschema.html
>
> The vocabulary is maintained in a Google-specific format from which the OWL is
> generated -- and now also RDFa.  A machine-readable, versioned view may
> eventually be made available, e.g., as a big RDFa Lite file, and probably in
> a Mercurial repository at W3C, even if the actual site continues to be driven by
> the intermediary format.  There are scraped-from-html views of the schema
> extracted by the DERI+friends team over at schema.rdfs.org (a separate
> project), and an OWL/RDFS description of the vocabulary which was
> script-generated from the internal source files by Peter Mika. The basic
> approach is essentially RDFish, but not very picky about the kind of details
> that webmasters don't care about.
>
> The strongest driver has been simplicity, and a focus on trying to make fewer
> things webmasters might get wrong. So for example we pushed for the 'RDFa lite'
> profile of RDFa, which removed complex RDF detail. In RDFa Lite publishers
> aren't forced to think about the difference between rel="..." (for things)
> and property="..." (for strings) since this is a common cause of confusion.
>
> We also have a kind of semi-official mistakes tolerance strategy.  For example
> see http://schema.org/docs/datamodel.html:
>
>     "While we would like all the markup we get to follow the schema, in
>     practice, we expect a lot of data that does not. We expect schema.org
>     properties to be used with new types. We also expect that often, where we
>     expect a property value of type Person, Place, Organization or some other
>     subClassOf Thing, we will get a text string. In the spirit of "some data is
>     better than none", we will accept this markup and do the best we can."
>
> Schema.org does not try to document this flexibility formally in RDFS/OWL, but
> it does reflect the practicalities of this kind of very broad-participation use
> of structured data: lots of mistakes. This topic has somewhat haunted the
> history of Dublin Core over the years: we've tended to agonize about the gap
> between string-centric and thing-centric descriptions, and about how to move in
> a fluid way between the two idioms.
>
> Schema.org is using OWL instead of RDFS because some properties require the
> stronger semantics.
>
> There are a lot of things in the Schema.org vocabularies -- "Volcano",
> "Hairdresser"...  Integrating rNews.  Philosophy is not to push multiple
> namespaces onto authors, so the core is flat.  Single flat NS overlaps with
> other initiatives. But the intention is to avoid duplication. Want to say:
> "This part is based on collaboration with X".
>
> A possible model for collaboration with DC: "80% is already expressible." Couch
> in terms of markup for particular types of information, such as "cultural
> heritage".  Perhaps point to particular Web sites whose markup could be improved
> with these extensions/terms.
>
> Mappings can serve different purposes:
>
> 1. a social signal to those who don't 'live and breathe' standards that
>    the right people are talking to each other. So not to worry about
>    tabloid style "we shouldn't use DC because the search engines only
>    consume schema.org" too much. This is an issue, but we can do several
>    things to reduce the problem it causes.
>
> 2. as a 'documentation centre' resource for people working with data,
>    including machine tooling (e.g. we could write sparql CONSTRUCT
>    queries that map one idiom into another).
>
> 3. as a "here, this might be useful" offering to search engine
>    engineers in case they are interested (no promises...) in going beyond
>    schema.org-only markup and also parsing equivalent triple patterns
>    e.g. from RDFa / Microdata, even when different namespaces are used.
>
> 4. to help vocabulary development by identifying things expressible in
>    idioms from one community (eg. we could take Scholarly Works
>    scenarios, or cultural heritage examples...) and see how they look in
>    the other schema.
>
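Purpose 2 in the list above (machine tooling that maps one idiom into another) can be sketched roughly as follows. This is a minimal pure-Python stand-in for a SPARQL CONSTRUCT query, not a real query engine. The two property pairs in PROPERTY_MAP and the example.org data are assumptions chosen for illustration only; they are not agreed alignments.

```python
# Illustrative namespaces; the property pairs below are NOT agreed alignments.
DCT = "http://purl.org/dc/terms/"
SCHEMA = "http://schema.org/"

# Hypothetical mapping table: DC Terms property -> Schema.org property.
PROPERTY_MAP = {
    DCT + "title": SCHEMA + "name",
    DCT + "description": SCHEMA + "description",
}


def construct_schema_triples(triples, property_map=PROPERTY_MAP):
    """Return new triples with mapped predicates, in the spirit of CONSTRUCT.

    Triples whose predicate has no entry in the mapping are left out,
    much as a CONSTRUCT template only emits what its pattern matches.
    """
    return [(s, property_map[p], o) for (s, p, o) in triples if p in property_map]


if __name__ == "__main__":
    # Made-up sample data.
    data = [
        ("http://example.org/book/1", DCT + "title", "An Example Title"),
        ("http://example.org/book/1", DCT + "extent", "200 pages"),
    ]
    for triple in construct_schema_triples(data):
        print(triple)
```

Keeping the mapping in a plain table means a new version of the alignments only requires changing the data, not the code.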
> Since currently, the Schema.org sponsor search engines have committed only to
> consume Schema.org markup, and not DC, SKOS etc., this could be considered an
> unfortunate pressure on sites who are currently publishing Dublin Core. Getting
> these mappings in place is one step we can take to making that a less painful
> situation. It might be, for example, they choose to publish schema.org markup
> in RDFa, and more detailed RDF/XML using DC+SKOS+FOAF as Linked Data in other
> formats. Or maybe this time next year the search engines might be more
> pluralistic and consume other idioms. It's not clear what will happen. What is
> clear is that having search engines actually use structured data is making a
> lot of sites pay attention who otherwise wouldn't.
>
> If we channel use cases from DC -- working groups, workshops, conferences,
> personal connections... -- into Schema.org as specific scenarios that aren't
> currently addressed, they could perhaps be picked up by search engines. We
> could do this rather than focusing on whether Schema.org's partner search
> engines consume DC's namespace alongside schema.org.
>
> ----------------------------------------------------------------------
> Sources of the mappings
>
> For Schema.org terms, there is an official RDFS/OWL export linked from
> http://schema.org/docs/datamodel, i.e.: http://schema.org/docs/schemaorg.owl.
>
> Another version is maintained at schema.rdfs.org, i.e.:
> http://schema.rdfs.org/all.nt.
>
> Schema.org launched with expression in microdata. At some point, started to
> publish OWL, which is kept up to date. Schema.rdfs.org scraped from HTML.  The
> rdfs.org version may go away as better machine-readable versions are made
> available from Schema.org.
>
> ----------------------------------------------------------------------
> Publication of mappings.
>
> Corey: Human-readable version important because people have deployed DC and
> are using related formats. Help people understand how that relates to Schema.org.
> Antoine: +1
>
> Dan: Related example: http://blog.schema.org/2011/11/using-rdfa-11-lite-with-schemaorg.html ...
>
> Jane: Educational aspect.
> Stuart: +1
>
> Antoine: Use out-of-box tool for visualizing vocabularies. Use simple HTML generator.
>
> Bernard: Parrot?  http://ontorule-project.eu/parrot/parrot
>
> Dan: Publish in RDF/XML, NTriples, or RDFa.
>
> Antoine: other visualizers:
> -- http://pellet.owldl.com/ontology-browser/
> -- http://lode.sourceforge.net/
>
> Dan: See blog post in support of RDFa Lite (above). For mappings, not just
> term-by-term, but use cases, e.g., Linked Library - here in DC, here in
> Schema.org. People think in concrete terms.
>
> Jane: Important message.
>
> Dan: What's the easiest way to find, say, 15 mainstream but varied DC-based
> examples?  The only markup that search engines currently collectively agree to
> parse is Schema.org namespace. "Here is a structure in Schema.org - here is how
> to say it in DC". Here are the equivalent patterns - consume them if you'd
> like.  Could be useful to document how you 'say' schema.org things using other
> namespaces like DC.  Helpful to document the equivalences as we see them.
>
> General consensus: Creative Commons license CC0 is a good way to go.
>
> Tom:  RDF page, embed the mapping, w/explanatory notes about not having to
> choose one-or-the-other?
>
> Tom: This is a test balloon. If we were to do alignments on any sort of scale.
> We can do the mappings, can't keep it all updated, can't make ambitious
> promises regarding maintenance. Alignments are dynamic things. We can version.
> We can surface the versioning so folks can find previous version of mapping.
> We should not be too fussy about agreement.
>
> ======================================================================
> Mapping details - http://wiki.dublincore.org/index.php/Schema.org_Alignment/Mappings_Details
>
> Tom: Wanted to see the two side-by-side. Wanted to see classes, sub-classes,
> properties. Asking why the two are being maintained separately.
>
> Dan: Schema.org tends to accept strings where things are called for.
>
> Corey: Grounding in DCTERMS will set explicit ranges.
>
> Dan: "Expect this to be messy".
>
> Ed: Does that get reflected in OWL?
>
> Dan: No, the formal descriptions are reasonably tidy.  Suggest we not spend too
> much time trying to anticipate things that could go wrong.  Publishing
> machine-readable data is more important than worrying about which we should
> use.
>
> Antoine: +1
>
> General consensus: Consider these as mappings between "tidy representations"
> ("tidy" from a formal-semantic point of view) but recognize and anticipate that
> formal ranges may not be followed in practice.
>
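The consensus above (tidy formal ranges, messy practice) suggests that a consumer of the data should tolerate a text string where a thing is expected. A minimal sketch of such a tolerant reader follows. The function name normalize_value and the dict layout are hypothetical and not taken from Schema.org or DCMI documentation.

```python
def normalize_value(value):
    """Accept either a 'thing' (a dict with a name) or a bare text string.

    Hypothetical helper: returns a small dict in both cases so that
    downstream code does not need to care which idiom the publisher used.
    """
    if isinstance(value, str):
        # String-centric description: keep the text, mark it as plain text.
        return {"type": "Text", "name": value.strip()}
    if isinstance(value, dict):
        # Thing-centric description: keep the declared type if there is one.
        return {"type": value.get("type", "Thing"), "name": value.get("name", "")}
    raise TypeError("expected a string or a dict, got %r" % type(value).__name__)


if __name__ == "__main__":
    # Made-up sample values for a creator-like property.
    print(normalize_value("  Jane Doe "))
    print(normalize_value({"type": "Person", "name": "Jane Doe"}))
```

The formal mapping can then stay tidy while the reading code absorbs the mess.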
> Dan: Noting slight uncertainty re schema:Language rdfs:subClassOf
> dct:LinguisticSystem but let's move along.
>
> Corey: Open question about whether preference should be for equivalentClass /
> Property vs. subClass / Property
>
> Dan: I tried SELECT * WHERE {?x a <http://purl.org/dc/terms/LinguisticSystem>}
> in http://lod.openlinksw.com/sparql.  I tried same query in
> http://sparql.sindice.com/ ... found some more results.  Would be good to have
> such empirical data when deciding about mappings.
>
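For anyone wanting to repeat this kind of empirical check, here is a minimal sketch of how Dan's query could be sent to an endpoint. It only builds the request URL and does not contact any server. It assumes the endpoint accepts the standard SPARQL Protocol GET form with a "query" parameter, and the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The query quoted in the report above.
QUERY = "SELECT * WHERE {?x a <http://purl.org/dc/terms/LinguisticSystem>}"


def build_query_url(endpoint, query):
    """Build a SPARQL Protocol GET URL; the query text goes in 'query'."""
    return endpoint + "?" + urlencode({"query": query})


def query_from_url(url):
    """Recover the query text from a URL built by build_query_url."""
    return parse_qs(urlsplit(url).query)["query"][0]


if __name__ == "__main__":
    url = build_query_url("http://lod.openlinksw.com/sparql", QUERY)
    print(url)
    # Round-trip check: decoding the URL gives back the original query.
    print(query_from_url(url) == QUERY)
```

Running the same URL builder against several endpoints makes it easy to compare result counts for a class or property before deciding on a mapping.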
> Corey: It depends on whether the subClass/Property represents a more narrowly
> defined set in some way. Equivalence implies that the sets are the same.  My
> preference is to prefer Equivalent; it is more useful.
>
> Diane disagrees; subProperty relations may be more accurate.
>
> We agree to continue discussion on Equivalent vs subPropertyOf on the list.
>
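To make the difference under discussion concrete: a subPropertyOf statement lets a consumer infer in one direction only, while an equivalence works in both. The sketch below is a toy, single-pass illustration in pure Python, not a reasoner. The dct:title / schema:name pairing is used only because it was proposed on the call, and the "ex:" subjects are made up.

```python
DCT_TITLE = "http://purl.org/dc/terms/title"
SCHEMA_NAME = "http://schema.org/name"


def entail(triples, sub_property_of=(), equivalent_property=()):
    """Return the input triples plus those entailed by the given mappings.

    Single pass only, which is enough for flat one-step mappings.
    """
    # An equivalence behaves like subPropertyOf in both directions.
    rules = set(sub_property_of)
    for a, b in equivalent_property:
        rules.add((a, b))
        rules.add((b, a))
    result = set(triples)
    for s, p, o in triples:
        for narrower, broader in rules:
            if p == narrower:
                result.add((s, broader, o))
    return result


if __name__ == "__main__":
    data = [("ex:doc1", DCT_TITLE, "A"), ("ex:doc2", SCHEMA_NAME, "B")]
    # subPropertyOf: ex:doc1 gains a schema:name, ex:doc2 gains nothing.
    print(sorted(entail(data, sub_property_of=[(DCT_TITLE, SCHEMA_NAME)])))
    # equivalentProperty: both documents end up with both properties.
    print(sorted(entail(data, equivalent_property=[(DCT_TITLE, SCHEMA_NAME)])))
```

With the subproperty mapping, Schema.org-only data never turns into DC statements; with equivalence it does, which is the trade-off between usefulness and accuracy raised by Corey and Diane.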
> Ed: Wonders if an authority record describing a person is a bibliographic
> resource and if it's a creative work.  Probably not worth worrying about right now.
> Would be a fun conversation to have though; preferably over pints...
>
> Tom: Propose that dct:title be subPropertyOf schema:name.
>
> Dan: Aside: foaf:name has
>     <rdfs:subPropertyOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#label"/>
> (which OWL DL people don't like btw).
>
> Antoine: @danbri: btw what is the mapping of foaf:name in DC?
>
> Dan: Don't think we documented one yet.
>
> Corey: Issues coming up: schema:desc and dct:desc equivalence - restrictive vs
> open ranges.
>
> Antoine: @danbri: that looks like an argument for dc:title equivalent to
> schema:name ;_
>
> Dan: Yeah, they're all basically short and often lossy labeling properties
>
> Corey: What triggers assignment of subproperty versus equivalent?
>
> ----------------------------------------------------------------------
> Next steps
>
> Will schedule another call -- spend whole call on the specific alignments.
> Prepare for call w/ description of problems on the discussion list.  Week of
> January 9.
>
> Request from Bernard that we look through the two schemas more closely to see
> if the current mappings miss anything.  Things in DC that are not in
> Schema.org.
>
> Dan: DC can be thought of as a vocabulary, but also as a community
> well-grounded in practice. Most terms might be covered by Schema.org, but we
> could point out use cases that are not addressed by Schema.org - reflect into
> documentation work from the wider community.  Thinking in particular of the
> application-profile strand of DC thought.
>
> Dan: eg. where "mapping from DC" might be more than DC terms:
> http://www.ariadne.ac.uk/issue50/allinson-et-al/ (or any later successor...)

--
Tom Baker