Hi Giuseppe,
How you encode that abbreviation depends entirely, as you say, on what you
want to be able to do with the encoding, and to a lesser extent on what
you imagine users of your published XML might want to do with it. Here are
a few examples that I have seen used (sans context, I'm afraid; I'm relying
on a combination of memory and imagination).
I would suggest that, at the very least, even if you're not interested
in the abbreviation at all, you encode it as:
(1) <expan>ἥκοντές</expan>
so that people can tell that it was abbreviated, even if they can't see
how or what the manuscript looks like. It's not much work for you, and
may be of great interest to someone else some time. If you wanted to be
a little more helpful, you could indicate which letters are on the
manuscript, and which you have resolved, for example:
(2) <expan><abbr>ἥκοντ</abbr><ex>ές</ex></expan>
If you want to go all the way and encode the ligatured tau as a separate
character, but without losing legibility and searchability of your text,
you might want to do something like:
(3) <expan><abbr>ἥκον<g ref="#taulig">τ</g></abbr><ex>ές</ex></expan>
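For (3) to resolve, the #taulig pointer needs a glyph declaration in the
header, along the lines of the <charDecl> strategy you mention. A minimal
sketch only: the xml:id just has to match the @ref above, while the wording
of the <desc> and the choice of plain tau as the standard mapping are my
assumptions, not something checked against your manuscript:

```xml
<encodingDesc>
  <charDecl>
    <!-- hypothetical declaration; xml:id matches ref="#taulig" in (3) -->
    <glyph xml:id="taulig">
      <!-- free-text description of the glyph (assumed wording) -->
      <desc>ligatured tau, as written at the end of the abbreviated ἥκοντές</desc>
      <!-- plain tau as the standard-character equivalent (assumed) -->
      <mapping type="standard">τ</mapping>
    </glyph>
  </charDecl>
</encodingDesc>
```

The τ inside the <g> in (3) already keeps the text legible and searchable;
the declaration simply records what that τ stands for on the page.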
But what I think I would prefer (and what is closer to EpiDoc practice) would be to
encode the tau as part of the abbreviation, as it is, and the
diacritics/ligatured part of the symbol as a separate abbreviation mark,
which will not be part of the expanded word, like so:
(4) <expan><abbr>ἥκοντ<am> ̉ ΅</am></abbr><ex>ές</ex></expan>
A bit simpler, and potentially less misleading than the last, would be a
variant of (2), with the tau additionally tagged as `hi` or `am` and given
a @rend attribute indicating the feature of this character that tells us
it's an abbreviation:
(5) <expan><abbr>ἥκον<am rend="hooked">τ</am></abbr><ex>ές</ex></expan>
Apologies to the list for this excessively technical post! ;-) If
there's any further discussion of this, perhaps we should take it to the
EpiDoc Markup list (lsv.uky.edu/archives/markup.html) where people get
off on this stuff...
Best,
Gabby
On 19/05/2016 17:27, Giuseppe G. A. Celano wrote:
> Hi Gabby and Usama,
>
> Thanks for the links. I am encoding a few Greek manuscripts (Plato), and
> I was trying to find a way to also encode abbreviations and ligatures as
> characters. Most of us are indeed interested in expanded forms and may
> therefore consider it good to either not signal the presence of an
> abbreviation/ligature at all or signal its presence, but maybe without
> specifying what the abbreviation looks like in the manuscript. For
> example, ἥκοντές can be in my manuscript an expanded form for ἥκοντ plus
> - on the final τ - a combination of (a sort of) combining comma above +
> diaeresis + acute accent. Since I want to also encode the abbreviation
> marks, I often have to spend some time to figure out which Unicode
> characters I could use for that. Sometimes this is easy/doable (as in my
> example), but other times I can only "describe" the
> abbreviation/ligature using the strategy of declaring characters in
> <charDecl/> and then refer to it within the text using <g/>.
>
> I was therefore curious to see how other people have dealt with these
> problems in their XML transcriptions. Thanks for the help!
>
> Best,
> Giuseppe
>
> Quoting Gabriel BODARD <[log in to unmask]>:
>
>> While Papyri.info have implemented some of the TEI and EpiDoc
>> mechanisms for recording abbreviations and other non-unicode symbols,
>> they use a fairly simplified method (symbols are not given when
>> expanded to a full word, e.g.). The fuller EpiDoc Guidelines on the
>> subject at
>> http://www.stoa.org/epidoc/gl/latest/trans-abbrevsymbol.html and
>> http://www.stoa.org/epidoc/gl/latest/trans-abbrevmark.html (while they
>> could do with some fuller examples and documentation) give a bit more
>> detail on the recommendations there.
>>
>> EpiDoc handling of ligatures
>> (http://www.stoa.org/epidoc/gl/latest/trans-ligature.html) and symbols
>> (http://www.stoa.org/epidoc/gl/latest/trans-symbol.html) are similarly
>> minimalist, but more complex examples would be very welcome.
>>
>> Would you be willing to share some of the examples behind (I suspect)
>> your question, Giuseppe?
>>
>> Best,
>>
>> Gabby
>>
>>
>> On 18/05/2016 11:53, Usama A. Gad wrote:
>>> Hi Giuseppe,
>>>
>>> You are most probably aware of papyri.info <http://papyri.info>, where
>>> you can find something like what you are looking for. Here is the link
>>> to the documentation
>>> http://www.papyri.info/editor/documentation?docotype=text .
>>>
>>> All the best,
>>> Usama
>>>
>>> On May 18, 2016 12:44 PM, "Giuseppe G. A. Celano"
>>> <[log in to unmask]
>>> <mailto:[log in to unmask]>> wrote:
>>>
>>> Dear all,
>>>
>>> I am looking for transcriptions of manuscripts (in TEI XML) where
>>> abbreviations and/or ligatures have been encoded. I am particularly
>>> interested to see which strategy has been adopted when no
>>> corresponding Unicode characters exist. Thank you for any links you
>>> can share!
>>>
>>> Best,
>>> Giuseppe
>>>
>>> --
>>> Universität Leipzig
>>> Institute of Computer Science, Digital Humanities
>>> Augustusplatz 10
>>> 04109 Leipzig
>>> Deutschland
>>> E-mail: [log in to unmask]
>>> <mailto:[log in to unmask]>
>>> E-mail: [log in to unmask] <mailto:[log in to unmask]>
>>> Web site 1: http://www.dh.uni-leipzig.de/wo/team/
>>> Web site 2: https://sites.google.com/site/giuseppegacelano/
>>>
>>
>> --
>> Dr Gabriel BODARD
>> Reader in Digital Classics
>>
>> Institute of Classical Studies
>> University of London
>> Senate House
>> Malet Street
>> London WC1E 7HU
>>
>> E: [log in to unmask]
>> T: +44 (0)20 78628752
>>
>> http://digitalclassicist.org/
>
--
Dr Gabriel BODARD
Reader in Digital Classics
Institute of Classical Studies
University of London
Senate House
Malet Street
London WC1E 7HU
E: [log in to unmask]
T: +44 (0)20 78628752
http://digitalclassicist.org/