Hi Corey,
I think we should keep the requirements distinct for the moment. R-171bis is still about validating stuff at the very surface of the content, while your ideas already start processing the content to see what's inside.
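A minimal Python sketch of that distinction, using the JPEG thumbnail case from the draft quoted below. It is only one possible reading of where the line falls: `FetchedResource`, `surface_checks` and `content_checks` are hypothetical names, not part of R-171bis or SHACL, and the response is assumed to have been fetched already.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FetchedResource:
    """Hypothetical container for an already-fetched HTTP response."""
    status: int
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def surface_checks(res, expected_type, expected_length=None):
    """Checks that only look at the status code and response headers."""
    problems = []
    # Header names are case-insensitive, so normalise them first.
    headers = {k.lower(): v for k, v in res.headers.items()}
    if res.status != 200:
        problems.append(f"unexpected HTTP status {res.status}")
    # Ignore parameters such as "; charset=..." after the media type.
    media_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if media_type != expected_type.lower():
        problems.append(f"unexpected content type {media_type!r}")
    if expected_length is not None:
        if headers.get("content-length") != str(expected_length):
            problems.append("unexpected content length")
    return problems


def content_checks(res, algorithm, expected_digest):
    """Checks that have to read the body itself."""
    digest = hashlib.new(algorithm, res.body).hexdigest()
    if digest != expected_digest.lower():
        return [f"{algorithm} digest mismatch"]
    return []
```

Under this reading, the first function stays within what R-171bis covers today (plus content length, if we chose to generalize), while the second already needs the bytes of the resource. Both return a list of problems, so an empty list means the check passed.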
Cheers,
Antoine
On 6/4/15 2:01 PM, Corey A Harper wrote:
> Also, looking again at R-171bis, it is currently very specific about validating resource content type. Do we want to generalize this to include validation of other resource characteristics? This could be written to include content length, cryptographic hash function, and even some of the XML or MARC validation functions, though I worry about watering down the requirement if we lose the current specificity.
>
> http://lelystad.informatik.uni-mannheim.de/rdf-validation/?q=node/455
>
> On Wed, Jun 3, 2015 at 3:33 AM, Antoine Isaac <[log in to unmask]> wrote:
>
> Hi Corey,
>
>
> Right. These are less about dereferencing and validating remote RDF and more about extension mechanisms. My understanding of the former is that it was mostly resolved. The W3C heard our use case, and was moving in the direction of allowing SHACL to define the extent of the RDF graph that it would validate, including some dereferencing.
>
>
>
> That's good to hear, but shouldn't we capture this requirement in our own set, if you think it's important?
> (and the fact that W3C is moving towards it doesn't guarantee they'll deliver the implementation you'd need)
>
>
> As to q=node/455, I can't get there right now, as the requirements DB seems to be down, but I know we've discussed confirming the HTTP status of a remote non-RDF resource before. I'm not sure we've discussed it as far as "confirming http response headers or validating _content_".
>
>
>
> It's online now. I think it does mention what you want.
>
> Cheers,
>
> Antoine
>
>
>
> On Tue, Jun 2, 2015 at 3:23 PM, Antoine Isaac <[log in to unmask]> wrote:
>
> Hi Corey,
>
> Thanks a lot!
>
> I won't dive into the discussion regarding SHACL, rather just ask if one of these requirements (in the second group) wouldn't in fact be one that we've identified at [1].
>
> I'm also unclear about whether this relates to our previous discussion. I thought the original case at [2] was one of activating (or not) validation of RDF shapes for URIs used as objects of statements in the description of the first resource (or set of resources) being validated. For example, validating (remotely served) SKOS concepts appearing in the metadata for a book. The cases you describe below are not really about RDF validation of such de-referenced RDF descriptions, are they?
>
> Cheers,
>
> Antoine
>
> [1] http://lelystad.informatik.uni-mannheim.de/rdf-validation/?q=node/455
> [2] https://etherpad.wikimedia.org/p/dcmi-ap-09-04-2015
>
> On 6/2/15 7:57 PM, Corey A Harper wrote:
>
> Thanks Karen, Antoine,
>
> I drafted the following as a starting point for a conversation. It needs wordsmithing, and it's not precisely in our Use Case -> Requirement format, but I think it captures my needs adequately to serve as a starting point.
>
> *** Draft W3C Discussion points ***
>
> I have a strong interest in being able to shell out from a SHACL constraint to any arbitrary code that handles specific validation requirements. Many of the validation requirements I want to handle relate to the nature of content in an LDP Non-RDF Source that's in the object position of a triple. A few of the example Use Cases that I have in mind for this kind of external validation:
>
> * I have a FITS [1] XML document containing technical metadata about my resource, and my constraint is that it has to validate against an XSD.
> * I have a link to a JPEG thumbnail in one property and an MD5 checksum in another. I need to validate that:
>   - The JPEG URL returns HTTP status 200.
>   - The response header says content-type: image/jpeg, content-length: 231.
>   - The content checksum matches a stored value for a named cryptographic hash function.
> * I have a MARC record (shudder) that needs to pass some validation routines managed as a Python program.
> * I have specific Ruby code for validating a Rails object that is generated from this RDF resource.
>
> I need to be able to do this with my arbitrary code, written in Python, Ruby, etc. My need is not for SHACL to do the validation. I have the code that does that. It isn't sparql, and it isn't Javascript. It _is_ shared and reproducible. My need is to communicate to others what code I'm using, and what validation needs are being met. I also want to communicate those validation needs in a human readable way, point to my reference implementation, and suggest to a data consumer or data provider how they might implement this in another language.
>
> Maybe I'm asking too much of SHACL here, but as it stands, the limited approach to extensions will render this unusable for the vast majority of my use cases.
>
> [1]http://projects.iq.harvard.edu/fits
>
> On Tue, Jun 2, 2015 at 1:45 PM, Antoine Isaac <[log in to unmask]> wrote:
>
> Hi Karen,
>
> Thanks a lot! This sounds like a good plan. Looking forward to seeing Corey's use cases!
>
> Cheers,
>
> Antoine
>
>
> On 6/2/15 7:09 PM, Karen Coyle wrote:
>
> Antoine, and all -
>
> Corey and I talked about this. Here's a quick summary of the status of things:
>
> - the W3C SHACL (that's the name of the standard) document currently allows extensions in SPARQL and the proposal is to also allow them in Javascript. Any other extension mechanism would need to be written as an addendum to the standard (as I understand it). The reason for this is complex, but has to do with a view of SHACL that has a contained "engine" concept that vendors can adhere to.
>
> - the original ShEx standard had a mechanism that allowed the requirements language to shell out to any arbitrary routine.
>
> The latter is closer to what is needed for many existing applications.
>
> Corey will write up some use cases that testify to the need, including de-referencing, format checking (e.g. XML documents, MARC documents). Among the motivations is to make explicit to third-parties the actual applications required to process the data. (He hopes to do this by the end of his day today.)
>
> We should discuss this, finalize a statement, and I will present that statement to the W3C group (ASAP, before they go down the SPARQL/JS road too far) as a DCMI use case.
>
> kc
>
> On 6/2/15 7:04 AM, Antoine Isaac wrote:
>
> Hi Corey, Karen,
>
> This is to ask you about the status of
>
> [
> ACTION: Corey and Karen to write up cases of validation with
> de-referencing or local caches, to be sent to W3C
>
> Corey: I thought we had done.
> ... I had drafted something during a meeting
> ... I thought it had been sent and they were not very interested
> ... I will ask Karen for confirmation.
> ]
>
> We still have R-171 [1] and R-171bis [2], but I believe none of these
> were what Corey was after the last time [3]. It would be a pity to lose
> an important requirement.
>
> Apparently Corey had written a short description at [4]
> [
> My question for the W3C group is whether their definition of "instance
> data" includes local caches of remote resources. Example of LCSH on
> id.loc.gov. Over 480,000 skos concepts represented, of which I may need
> 10,000 in a local system, so I will use a separate triplestore or
> something like Linked Data Fragments to cache. I have validation needs
> around dereferencing these and confirming their shape. I also
> potentially have a validation need on when my cache is invalid.
> -- Partial answer: there is great discussion about how the Shapes
> standard will define the extent of the graph over which validation will
> take place. There is also discussion about extension mechanisms, e.g.
> the ability to call arbitrary routines.
> ]
>
> Cheers,
>
> Antoine
>
> [1] http://lelystad.informatik.uni-mannheim.de/rdf-validation/?q=node/286
> [2] http://lelystad.informatik.uni-mannheim.de/rdf-validation/?q=node/455
> [3] https://etherpad.wikimedia.org/p/dcmi-ap-23-04-2015
> [4] https://etherpad.wikimedia.org/p/dcmi-ap-09-04-2015
>
>
>
>
>
> --
> Corey A Harper
> Metadata Services Librarian
> New York University Libraries
> 20 Cooper Square, 3rd Floor
> New York, NY 10003-7112
> 212.998.2479
> [log in to unmask]