On 10/8/15 2:15 AM, Thomas Baker wrote:
> Hi Karen,
>
> On Wed, Oct 07, 2015 at 01:25:33PM -0700, Karen Coyle wrote:
>> 1) There are two modes of operation for SHACL -- open shape and
>> closed shape (shape being a definition of a graph to be validated).
>> The default in SHACL is "open" which means that "extra" triples will
>> not be noted or treated as errors. "Closed" can be interpreted to
>> mean: "these triples and only these triples." I wanted to create
>> tests that demonstrate how these interact with other validation
>> rules -- mainly because I think this will be useful for training.
>
> So that I understand: Is the distinction between "open shape" and
> "closed shape" meant to apply at the level of individual SHACL
> expressions, or to entire sets of SHACL expressions? Is the choice made
> entirely by the data consumer, or is "open/closed" encoded in the
> individual SHACL expression? Apologies if this has already been covered
> here.
>
"Open" is the default, but "closed" can be coded in the shape itself. As
I understand it, it applies to the "focus node" -- that is, the set of
triples that is being validated.
So for this shape (open by default):
ex:myShape
    a sh:Shape ;
    sh:property [
        sh:predicate dct:creator ;
    ] ;
.
Each of these instances is "valid":
# Instances
# -------------------------------------------------------------------
ex:instance1
    sh:nodeShape ex:myShape ;
    dct:creator "A" ;
.
ex:instance2
    sh:nodeShape ex:myShape ;
    dct:creator "A" ;
    dct:title "Moby" ;
.
ex:instance3
    sh:nodeShape ex:myShape ;
    dct:title "Moby" ;
.
But when you close the shape,
ex:myShape
    a sh:Shape ;
    sh:property [
        sh:predicate dct:creator ;
    ] ;
    sh:constraint [
        a sh:ClosedShapeConstraint ;
        sh:ignoredProperties ( sh:nodeShape rdf:type ) ;
    ] ;
.
instances 2 and 3 are "invalid" because they have a property (dct:title)
that is not named explicitly in the shape definition. Note also that the
shape can list properties that are not to be validated
(sh:ignoredProperties). I copied this constraint and do not know if/when
such a statement is necessary, but rdf:type could appear in just about
any set of triples, so it is probably needed most of the time.
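
To make the difference concrete, here is a minimal Python sketch of my
reading of open versus closed. It is not SHACL's actual validation
algorithm: the dictionaries, the is_valid function and the way nodes are
modeled are all hypothetical, and the only thing it checks is whether a
closed shape sees an undeclared property.

```python
# Hypothetical, simplified model: each focus node is a dict that maps a
# predicate to its list of values. None of these names come from SHACL.
DECLARED = {"dct:creator"}               # properties named in ex:myShape
IGNORED = {"sh:nodeShape", "rdf:type"}   # mirrors sh:ignoredProperties

INSTANCES = {
    "ex:instance1": {"sh:nodeShape": ["ex:myShape"], "dct:creator": ["A"]},
    "ex:instance2": {
        "sh:nodeShape": ["ex:myShape"],
        "dct:creator": ["A"],
        "dct:title": ["Moby"],
    },
    "ex:instance3": {"sh:nodeShape": ["ex:myShape"], "dct:title": ["Moby"]},
}


def extra_properties(node):
    """Return the predicates that are neither declared nor ignored."""
    return sorted(set(node) - DECLARED - IGNORED)


def is_valid(node, closed):
    """Open: extra triples are not noted. Closed: any extra triple fails."""
    if not closed:
        # The example shape has no cardinality or value constraints,
        # so nothing can fail while the shape is open.
        return True
    return not extra_properties(node)


for name, node in INSTANCES.items():
    print(name, "open:", is_valid(node, closed=False),
          "closed:", is_valid(node, closed=True))
```

Under these assumptions all three instances pass the open shape, and
only instance1 passes the closed one.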
>> 3) I've only thought about very very simple SHACL cases; even though
>> I don't know how to express more complex cases in SHACL
>
> What is the current level of interest in the Data Shapes WG in providing
> a language that can easily be used for the simplest and most common use
> cases? Should we be worried that the WG is moving towards a
> specification that is unhelpfully over-engineered from the perspective
> of, say, cultural heritage requirements? I'm thinking that complexity
> is all well and good -- unless it actually gets in the way.
I, too, am worried about the complexity. I am also worried that all of
the examples that I have seen are atomistic. In my experience, there are
often dependencies in metadata -- "if this is a book, extent is in
pages; if this is music, extent is in minutes" -- and that kind of
thing. I suspect that once we get into real data we will have to write
many separate SHACL tests to cover the complexity of the data, and that
the SHACL statements will be huge.
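
The kind of dependency I mean can be stated in a few lines of ordinary
code. This is only a sketch of the rule itself, in Python rather than
SHACL; the field names ("type", "extent_unit") and the function name are
made up for illustration, and I am not claiming SHACL would express it
this way.

```python
# Hypothetical rule table: the resource type decides the extent unit.
EXPECTED_UNIT = {"book": "pages", "music": "minutes"}


def extent_ok(record):
    """Check the extent unit only when a rule exists for the record type."""
    expected = EXPECTED_UNIT.get(record.get("type"))
    if expected is None:
        return True  # no dependency defined for this type
    return record.get("extent_unit") == expected


print(extent_ok({"type": "book", "extent_unit": "pages"}))
print(extent_ok({"type": "music", "extent_unit": "pages"}))
```

The point is that the check on one property depends on the value of
another, which none of the atomistic examples I have seen cover.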
There is a spoken commitment to a simpler language but it hasn't been
added to SHACL. It is complicated by the fact that the simpler language
is in fact the ShEx implementation. I find this confusing. If ShEx is
the simpler language, why aren't we using ShEx? Plus ShEx and SHACL are
not mutually consistent, and I don't know how that will be resolved.
There are many mysteries.
kc
>
> Tom
>
--
Karen Coyle
http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600