JISCMail - DC-ARCHITECTURE Archives

Subject:  Schema.org Alignment Task Group 2011-12-12 Telecon - Report
From:     Thomas Baker <[log in to unmask]>
Reply-To: DCMI Architecture Forum <[log in to unmask]>
Date:     Tue, 20 Dec 2011 11:25:31 -0500

Schema.org Alignment Task Group 2011-12-12 Telecon Report

Chair:    Tom Baker
Attended: Tom Baker, Dan Brickley, Stuart Sutton, Bernard Vatant, Ahsan Morshed, Jon Phipps, 
          Antoine Isaac, Kirsten Jeude, Corey Harper, Jane Greenberg, John Kunze, Ed Summers, 
          Diane Hillmann
Date:     2011-12-12, Monday
Agenda:   http://wiki.dublincore.org/index.php/Schema.org_Alignment/Telecon_20111212
Note:     This report integrates some follow-up discussion after the meeting.

----------------------------------------------------------------------
Links
--  Wiki page for this Task Group
    http://wiki.dublincore.org/index.php/Schema.org_Alignment
--  Bernard Vatant's proposal
    http://wiki.dublincore.org/index.php/Schema.org_Alignment/Mappings
--  Bernard's proposal with details added
    http://wiki.dublincore.org/index.php/Schema.org_Alignment/Mappings_Details
--  DC-ARCHITECTURE mailing list
    http://www.jiscmail.ac.uk/lists/dc-architecture.html

----------------------------------------------------------------------
Background on Schema.org (Dan)

Dan: http://schema.org/ is hosted at Google. Other search engines collaborate.

One recent extension is a "jobs" vocabulary, and vocabularies are brewing for
medicine and television.  Doing as much of this work in public as possible.  We
have created a Web Schemas interest group at W3C [1], with tools like an issues
tracker, public mailing list, wiki.  Trying to figure out the social process
for extensions.

[1] http://www.w3.org/2001/sw/interest/webschema.html

The vocabulary is maintained in a Google-specific format from which the OWL is
generated -- and now also RDFa.  A machine-readable, versioned view may
eventually be made available, e.g., as a big RDFa Lite file, and probably in
a Mercurial repository at W3C, even if the actual site continues to be driven by
the intermediary format.  There are scraped-from-html views of the schema
extracted by the DERI+friends team over at schema.rdfs.org (a separate
project), and an OWL/RDFS description of the vocabulary which was
script-generated from the internal source files by Peter Mika. The basic
approach is essentially RDFish, but not very picky about the kind of details
that webmasters don't care about.

The strongest driver has been simplicity, and a focus on trying to leave fewer
things for webmasters to get wrong. So for example we pushed for the 'RDFa Lite'
profile of RDFa, which removed complex RDF detail. In RDFa Lite publishers
aren't forced to think about the difference between rel="..." (for things)
and property="..." (for strings) since this is a common cause of confusion.
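To make the point about a single attribute concrete, here is a minimal Python
sketch (not a conformant RDFa processor) that reads 'property' attributes from
a fragment of RDFa Lite markup and treats a value as a thing when the element
carries a link and as a string otherwise. The markup, the example.org URL and
the class name PropertyCollector are illustrative assumptions; nesting,
prefixes and the other RDFa attributes are deliberately ignored.

```python
from html.parser import HTMLParser

# Illustrative markup only: the page content and the URL are made up.
MARKUP = """
<div vocab="http://schema.org/" typeof="Book">
  <span property="name">An Example Book</span>
  <a property="url" href="http://example.org/book">details</a>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from 'property' attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        link = attrs.get("href") or attrs.get("src")
        if link:
            # A link-bearing element yields a "thing" (IRI) value.
            self.pairs.append((prop, ("iri", link)))
        else:
            # Otherwise the element's text becomes a string value.
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, ("text", data.strip())))
            self._pending = None

    def handle_endtag(self, tag):
        # Stop waiting for text once any element closes.
        self._pending = None


collector = PropertyCollector()
collector.feed(MARKUP)
collector.close()
for prop, value in collector.pairs:
    print(prop, value)
```

The publisher writes property="..." in both cases; whether the value is a
string or a thing falls out of the element it sits on.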

We also have a kind of semi-official mistakes tolerance strategy.  For example
see http://schema.org/docs/datamodel.html:

    "While we would like all the markup we get to follow the schema, in
    practice, we expect a lot of data that does not. We expect schema.org
    properties to be used with new types. We also expect that often, where we
    expect a property value of type Person, Place, Organization or some other
    subClassOf Thing, we will get a text string. In the spirit of "some data is
    better than none", we will accept this markup and do the best we can."

Schema.org does not try to document this flexibility formally in RDFS/OWL, but
it does reflect the practicalities of this kind of very broad-participation use
of structured data: lots of mistakes. This topic has somewhat haunted the
history of Dublin Core over the years: we've tended to agonize about the gap
between string-centric and thing-centric descriptions, and about how to move in
a fluid way between the two idioms.
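As a rough illustration of moving between the two idioms, a consumer might
wrap a bare string in a minimal thing-like record when a thing was expected.
This is only a sketch under stated assumptions: the function name
coerce_value, the dict layout and the sample name are hypothetical, and
nothing here describes how any search engine actually processes markup.

```python
# Hypothetical sketch: tolerate a plain string where a typed "thing" is expected.
# The dict layout and function name are illustrative, not a Schema.org API.

def coerce_value(value, expected_type):
    """Return a thing-like dict for `value`.

    If the publisher supplied a plain string where a Person, Place, etc.
    was expected, wrap it as a minimal node whose name is that string.
    """
    if isinstance(value, str):
        return {"type": expected_type, "name": value.strip(), "from_string": True}
    # Already structured: copy it, filling in the expected type only if missing.
    node = dict(value)
    node.setdefault("type", expected_type)
    node.setdefault("from_string", False)
    return node


# Made-up sample values for illustration.
author_as_string = coerce_value(" Jane Doe ", "Person")
author_as_thing = coerce_value({"type": "Person", "name": "Jane Doe"}, "Person")
print(author_as_string)
print(author_as_thing)
```

Keeping a flag such as from_string lets later processing remember that the
type was guessed rather than stated by the publisher.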

Schema.org is using OWL instead of RDFS because some properties require the
stronger semantics.

There are a lot of things in the Schema.org vocabularies -- "Volcano",
"Hairdresser"...  Integrating rNews.  Philosophy is not to push multiple
namespaces onto authors, so the core is flat.  Single flat NS overlaps with
other initiatives. But the intention is to avoid duplication. Want to say:
"This part is based on collaboration with X".

A possible model for collaboration with DC: "80% is already expressible." Couch
in terms of markup for particular types of information, such as "cultural
heritage".  Perhaps point to particular Web sites whose markup could be improved
with these extensions/terms.

Mappings can serve different purposes:

1. a social signal to those who don't 'live and breathe' standards that
   the right people are talking to each other. So not to worry about
   tabloid style "we shouldn't use DC because the search engines only
   consume schema.org" too much. This is an issue, but we can do several
   things to reduce the problem it causes.

2. as a 'documentation centre' resource for people working with data,
   including machine tooling (e.g. we could write SPARQL CONSTRUCT
   queries that map one idiom into another).

3. as a "here, this might be useful" offering to search engine
   engineers in case they are interested (no promises...) in going beyond
   schema.org-only markup and also parsing equivalent triple patterns
   e.g. from RDFa / Microdata, even when different namespaces are used.

4. to help vocabulary development by identifying things expressible in
   idioms from one community (e.g. we could take Scholarly Works
   scenarios, or cultural heritage examples...) and see how they look in
   the other schema.
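The machine-tooling purpose (point 2) can be sketched in a few lines of Python
that restate mapped DC properties in Schema.org terms, much as a CONSTRUCT
query would. The mapping table is an assumption: its two pairs are candidates
raised in discussion, not agreed results, and the function name to_schema_org
and the example.org data are hypothetical.

```python
# Hypothetical mapping table; the pairs are candidates under discussion,
# not agreed alignments.
DCT = "http://purl.org/dc/terms/"
SCHEMA = "http://schema.org/"

PROPERTY_MAP = {
    DCT + "title": SCHEMA + "name",
    DCT + "description": SCHEMA + "description",
}

# Roughly the same idea as a SPARQL CONSTRUCT query of the form:
#   CONSTRUCT { ?s <http://schema.org/name> ?o }
#   WHERE     { ?s <http://purl.org/dc/terms/title> ?o }


def to_schema_org(triples):
    """Return new triples restating mapped DC properties in Schema.org terms."""
    return [(s, PROPERTY_MAP[p], o) for (s, p, o) in triples if p in PROPERTY_MAP]


# Made-up input data for illustration.
triples = [
    ("http://example.org/doc1", DCT + "title", "An Example Title"),
    ("http://example.org/doc1", DCT + "creator", "Example Author"),
]
mapped = to_schema_org(triples)
for triple in mapped:
    print(triple)
```

Properties without an entry in the table (here dct:creator) are simply left
out, which mirrors the fact that a mapping can be partial.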

Since currently, the Schema.org sponsor search engines have committed only to
consume Schema.org markup, and not DC, SKOS etc., this could be considered an
unfortunate pressure on sites who are currently publishing Dublin Core. Getting
these mappings in place is one step we can take to making that a less painful
situation. It might be, for example, they choose to publish schema.org markup
in RDFa, and more detailed RDF/XML using DC+SKOS+FOAF as Linked Data in other
formats. Or maybe this time next year the search engines might be more
pluralistic and consume other idioms. It's not clear what will happen. What is
clear is that having search engines actually use structured data is making a
lot of sites pay attention who otherwise wouldn't.

If we channel use cases from DC -- working groups, workshops, conferences,
personal connections... -- into Schema.org as specific scenarios that aren't
currently addressed, they could perhaps be picked up by search engines. The
focus would be on this, rather than on whether Schema.org's partner search
engines consume DC's namespace alongside schema.org.

----------------------------------------------------------------------
Sources of the mappings

For Schema.org terms, there is an official RDFS/OWL export linked from
http://schema.org/docs/datamodel, i.e.: http://schema.org/docs/schemaorg.owl.

Another version is maintained at schema.rdfs.org, i.e.:
http://schema.rdfs.org/all.nt.

Schema.org launched with expression in microdata. At some point, started to
publish OWL, which is kept up to date. Schema.rdfs.org scraped from HTML.  The
rdfs.org version may go away as better machine-readable versions are made
available from Schema.org.

----------------------------------------------------------------------
Publication of mappings

Corey: Human-readable version important because people have deployed DC and
are using related formats. Help people understand how that relates to Schema.org.
Antoine: +1

Dan: Related example: http://blog.schema.org/2011/11/using-rdfa-11-lite-with-schemaorg.html ...

Jane: Educational aspect.
Stuart: +1

Antoine: Use out-of-box tool for visualizing vocabularies. Use simple HTML generator.

Bernard: Parrot?  http://ontorule-project.eu/parrot/parrot

Dan: Publish in RDF/XML, NTriples, or RDFa.

Antoine: other visualizers:
-- http://pellet.owldl.com/ontology-browser/
-- http://lode.sourceforge.net/

Dan: See blog post in support of RDFa Lite (above). For mappings, not just
term-by-term, but use cases, e.g., Linked Library - here in DC, here in
Schema.org. People think in concrete terms.

Jane: Important message.

Dan: What's the easiest way to find, say, 15 mainstream but varied DC-based
examples?  The only markup that search engines currently collectively agree to
parse is Schema.org namespace. "Here is a structure in Schema.org - here is how
to say it in DC". Here are the equivalent patterns - consume them if you'd
like.  Could be useful to document how you 'say' schema.org things using other
namespaces like DC.  Helpful to document the equivalences as we see them.

General consensus: Creative Commons license CC0 is a good way to go.

Tom:  RDF page, embed the mapping, w/explanatory notes about not having to 
choose one-or-the-other?

Tom: This is a test balloon, in case we were to do alignments on any sort of
scale. We can do the mappings, but can't keep it all updated and can't make
ambitious promises regarding maintenance. Alignments are dynamic things. We can
version. We can surface the versioning so folks can find previous versions of
a mapping. We should not be too fussy about agreement.

======================================================================
Mapping details - http://wiki.dublincore.org/index.php/Schema.org_Alignment/Mappings_Details

Tom: Wanted to see the two side-by-side. Wanted to see classes, sub-classes,
properties. Asking why the two are being maintained separately.

Dan: Schema.org tends to accept strings where things are called for.

Corey: Grounding in DCTERMS will set explicit ranges.

Dan: "Expect this to be messy".

Ed: Does that get reflected in OWL?

Dan: No, the formal descriptions are reasonably tidy.  Suggest we not spend too
much time trying to anticipate things that could go wrong.  Publishing
machine-readable data is more important than worrying about which we should
use.

Antoine: +1

General consensus: Consider these as mappings between "tidy representations"
("tidy" from a formal-semantic point of view) but recognize and anticipate that
formal ranges may not be followed in practice.

Dan: Noting slight uncertainty re schema:Language rdfs:subClassOf
dct:LinguisticSystem but let's move along.

Corey: Open question about whether preference should be for equivalentClass /
equivalentProperty vs. subClassOf / subPropertyOf.

Dan: I tried SELECT * WHERE {?x a <http://purl.org/dc/terms/LinguisticSystem>}
in http://lod.openlinksw.com/sparql.  I tried the same query in
http://sparql.sindice.com/ ... found some more results.  Would be good to have
such empirical data when deciding about mappings.
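For anyone wanting to repeat this kind of check, a minimal Python sketch of
how such a query is sent under the SPARQL Protocol: the query text goes in a
'query' parameter of a GET request. Only the URL is built here and no request
is made; whether the two addresses above are the exact endpoint URLs, or are
still available, is an assumption. The function name sparql_get_url is
hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The query quoted in the discussion above.
QUERY = "SELECT * WHERE {?x a <http://purl.org/dc/terms/LinguisticSystem>}"


def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL; the query goes in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Endpoint addresses as given in the discussion; they may no longer respond.
urls = [
    sparql_get_url(endpoint, QUERY)
    for endpoint in ("http://lod.openlinksw.com/sparql", "http://sparql.sindice.com/")
]
for url in urls:
    print(url)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(urls[0]).query)["query"][0]
print(decoded == QUERY)
```

Counting the rows returned for each candidate class or property would give
the sort of empirical data mentioned above.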

Corey: It depends on whether the subClass/Property represents a more narrowly
defined set in some way. Equivalence implies that the sets are the same.  My
preference is for Equivalent; it is more useful.

Diane disagrees; subProperty relations may be more accurate. 

We agree to continue discussion on Equivalent vs subPropertyOf on the list.
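The practical difference between the two choices can be shown with a small
Python sketch: a subPropertyOf axiom lets data flow in one direction only,
while an equivalence lets it flow both ways. This is a one-step illustration
under stated assumptions, not an RDFS or OWL reasoner; the function name
infer, the prefixed names used as plain strings, and the sample data are all
hypothetical.

```python
DCT_TITLE = "dct:title"
SCHEMA_NAME = "schema:name"


def infer(triples, sub_property_of=(), equivalent=()):
    """Add triples entailed by one-step subPropertyOf / equivalentProperty axioms."""
    implies = set(sub_property_of)
    for a, b in equivalent:
        # Equivalence behaves like subPropertyOf in both directions.
        implies.add((a, b))
        implies.add((b, a))
    result = set(triples)
    for s, p, o in triples:
        for narrower, broader in implies:
            if p == narrower:
                result.add((s, broader, o))
    return result


# Made-up data: one description in each idiom.
data = [("ex:doc1", DCT_TITLE, "A title"), ("ex:doc2", SCHEMA_NAME, "A name")]

with_sub = infer(data, sub_property_of=[(DCT_TITLE, SCHEMA_NAME)])
with_equiv = infer(data, equivalent=[(DCT_TITLE, SCHEMA_NAME)])

# With subPropertyOf, every dct:title is also a schema:name, but not the reverse.
print(("ex:doc1", SCHEMA_NAME, "A title") in with_sub)
print(("ex:doc2", DCT_TITLE, "A name") in with_sub)
# With equivalence, both directions hold.
print(("ex:doc2", DCT_TITLE, "A name") in with_equiv)
```

So choosing equivalence also asserts that every schema:name value is a
dct:title, which is the stronger claim that needs to be weighed.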

Ed: Wonders if an authority record describing a person is a bibliographic
resource and if it's a creative work.  Probably not worth worrying about right now.
Would be a fun conversation to have though; preferably over pints...

Tom: Propose that dct:title be subPropertyOf schema:name.

Dan: Aside: foaf:name has 
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#label"/> 
(which OWL DL people don't like btw).

Antoine: @danbri: btw what is the mapping of foaf:name in DC?

Dan: Don't think we documented one yet.

Corey: Issues coming up: schema:desc and dct:desc equivalence - restrictive vs
open ranges.

Antoine: @danbri: that looks like an argument for dc:title equivalent to
schema:name ;-)

Dan: Yeah, they're all basically short and often lossy labeling properties

Corey: What triggers assignment of subproperty versus equivalent?

----------------------------------------------------------------------
Next steps

Will schedule another call -- spend whole call on the specific alignments.
Prepare for call w/ description of problems on the discussion list.  Week of
January 9.

Request from Bernard that we look through the two schemas more closely to see
if the current mappings miss anything.  Things in DC that are not in
Schema.org.

Dan: DC can be thought of as a vocabulary, but also as a community
well-grounded in practice. Most terms might be covered by Schema.org, but we
could point out use cases that are not addressed by Schema.org - reflect into
documentation work from the wider community.  Thinking in particular of the
application-profile strand of DC thought.

Dan: e.g. where "mapping from DC" might be more than DC terms:
http://www.ariadne.ac.uk/issue50/allinson-et-al/ (or any later successor...)


-- 
Tom Baker <[log in to unmask]>
