Sorry, I now realize that I was interpreting "serialization" too
narrowly - and Tom's statement about "serializing DSP" makes sense. I
think the rest of this message is still coherent, however. Once again,
apologies for the length, not only of this message but of the entire
conversation on my part. I feel like we've got apples and oranges here,
and I, for one, am trying to overcome my own monoculture.

kc

On 8/16/12 9:54 AM, Karen Coyle wrote:
> I apologize for the length of this post. I just can't seem to make it
> more concise.
>
> On 8/16/12 6:41 AM, Thomas Baker wrote:
>
>> If a DSP is a set of templates with specified constraints -- a
>> Description Set template, which encloses one or more Description
>> Templates, each of which encloses one or more Statement Templates,
>> each of which is described with various Resource, Property, and Value
>> constraints, it is not immediately clear to me why one _couldn't_
>> simply say that the order of templates described in that Description
>> Set Profile document is meaningful. When serializing that DSP to RDF
>> triples, the order would be lost. But when serializing to another
>> document format, such as XML, or to an ISBD Publication String, I see
>> no reason the order could not be retained.
>
> Tom, I may be mistaken, but I think this still conflates the DSP and
> instance data. The DSP, in my mind, plays the role of an XML schema, but
> for DCAM-compliant data. (Which also means RDF-compliant, right? or
> wrong?) There's a difference between the templates as defined in the
> DSP, and the instance data, which is where repetition that is allowed in
> the DSP actually takes place.
>
> A DSP can specify that a statement is (or is not) repeatable, mandatory,
> etc. But the DSP itself is not serialized; it is the instance data that
> is serialized. So if a DSP provides for a statement that is "paragraph"
> and is repeatable, the repetition takes place in the instance data.
> A paragraph template of:
>
> paragraphTemplate
>   min=0, max=unbounded
>   - paragraphText (literal)
>     min=1, max=1
>
> just gives you an undistinguished group of paragraphs in instance data.
> The only way to maintain order is to wrap them in something like XML.
> But my interest is in triples.
>
> Where order matters, to maintain order in the instance data, the DSP
> would need to define a statement template something like:
>
> paragraphTemplate
>   min=0, max=unbounded
>   - paragraphOrder
>     min=1, max=1
>   - paragraphText
>     min=1, max=1
>
> The instance data would then be:
>
> paragraph
>   - "1"
>   - "First paragraph"
>
> paragraph
>   - "2"
>   - "Second paragraph"
>
> This is obviously do-able, and I believe would work in an RDF
> environment as these would be graphs. Another example would be tables of
> contents, each statement of which consists of:
>
> author, title, startPage
>
> This could be seen as:
>
> ToCDSPTemplate
>   min=0, max=1
>   ToCStatement
>     min=1, max=unbounded
>     - author
>       min=1, max=3
>     - title
>       min=1, max=1
>     - startPage
>       min=1, max=1
>
> This would give you a repeatable template for toc's, with three
> statements in the DSP. But generally you want to display toc's in order,
> so an ordering data element would be needed here, as would another
> ordering to keep the up-to-three authors in order. (This latter order is
> sometimes important.) So you would need a solution like the one for
> paragraphs.
>
> Much of what is in library data is repeatable elements, and in some
> cases order matters. In other cases, order does not matter and a display
> program can construct meaningful displays.
>
> Any time you have repeatable patterns where order matters, you will need
> an ordering mechanism. You could also define a serialization (like XML)
> that treats your "record" as a single string, thus maintaining the order
> of all that is within the string.
> I believe that the SES that Jon is
> proposing is conceptually like an XML document, in that it is a single
> string with meaningful parts and order of parts within it. The SES, as I
> read it, is a clever attempt to make strings into things.
>
> That said, I will go on record as saying that in terms of "converting"
> library cataloging documents (e.g. ISBD or MARC records) into linked
> data, I prefer the choice made by OCLC, which has added RDFa to its
> catalog data displays, and does not attempt to represent the entire
> catalog document as linked data. I think this is in keeping with the
> intention of linked data, which has been described as a way to define
> the data encapsulated in documents. The WorldCat RDFa is derived
> programmatically, and does not attempt to replicate the entire content
> of the catalog record. Moving from library cataloging (as it is done
> today) to linked data will be lossy, just as adding microformat data to
> HTML is. The catalog data that is created today is an artifact that
> dates back at least to 1830, and it really is time for libraries to
> re-conceptualize how they catalog in terms of "data" not "documents."
>
> Are we in a quagmire if we try to replicate all of library catalog data
> in RDF? There may be a solution, but I have serious doubts about the
> value and return on the effort. If you must drag library catalog data
> into the linked data space, the "pass them as strings" solution is not
> the worst. However, I would treat them as literals, not structured data,
> and let applications deal with any internal structure "up the stack."
>
> kc
>
>>
>> Tom
>>

-- 
Karen Coyle
http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
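
To illustrate the ordering point in the quoted message, here is a minimal Python sketch, not anything taken from the DSP or DCAM specifications. It treats instance data as an unordered set of (subject, property, value) statements, much as a graph of triples would be, and recovers display order only from the explicit paragraphOrder value. The node identifiers (`_:p1`, `_:p2`) and the helper name `ordered_paragraphs` are assumptions made up for the example.

```python
# Hypothetical instance data for the paragraphTemplate with paragraphOrder.
# A Python set has no inherent order, which stands in for a graph of triples.
triples = {
    ("_:p2", "paragraphOrder", "2"),
    ("_:p2", "paragraphText", "Second paragraph"),
    ("_:p1", "paragraphOrder", "1"),
    ("_:p1", "paragraphText", "First paragraph"),
}


def ordered_paragraphs(statements):
    """Rebuild display order from the explicit paragraphOrder values."""
    # Map each paragraph node to its numeric position and to its text.
    order = {s: int(o) for s, p, o in statements if p == "paragraphOrder"}
    text = {s: o for s, p, o in statements if p == "paragraphText"}
    # Sort the paragraph nodes by position, then emit their text.
    return [text[s] for s in sorted(order, key=order.get)]


paragraphs = ordered_paragraphs(triples)
print(paragraphs)
```

Without the paragraphOrder statements there would be nothing in the set to sort on, which is the "undistinguished group of paragraphs" case. The same pattern would presumably apply to ToC statements and to the up-to-three authors within each one.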