Right, exactly. There are also relevant TEI-based projects (listed at
http://www.tei-c.org/Activities/Projects/) and datasets familiar to
linguists but perhaps not to classicists.
The idea is to have a convenient list that pulls together various projects, with
brief commentary on any theoretical assumptions adopted (e.g., whose
morphological taxonomy? How do you define lexeme?), or known
issues/problems, and with links that point to master raw data sets or
mirrors. (One can be forgiven for not being able to find something on
Perseus!)
Again, what I've supplied is just a short catalyst. I hope others will
contribute to what I know could eventually be a rather lengthy list, one
that would probably justify its own page.
jk
On 4/19/16, 7:17 AM, "The Digital Classicist List on behalf of Gabriel
Bodard" <[log in to unmask] on behalf of
[log in to unmask]> wrote:
>Presumably the morpho-syntactically tagged texts in (e.g.) the Perseus
>Treebank and INESS contain, inter alia, the lexico-morphological data
>you want? Lots and lots of Greek and Latin in both of those...
>
>On 18 April 2016 at 17:43, Kalvesmaki, Joel <[log in to unmask]> wrote:
>> I've recently needed to get well-curated lexico-morphological data for
>>ancient texts, in a variety of languages. Such data is not as easy to
>>find as I had hoped. So I've begun a new stub on the Digital Classicist
>>wiki:
>>
>>https://wiki.digitalclassicist.org/Morphological_parsing_or_lemmatising_Greek_and_Latin#Curated_Lexico-morphological_Data
>>
>> If you know of any LM data for corpora or individual texts in Latin,
>>Greek, etc. please contribute.
>>
>> Best wishes,
>>
>> jk
>> --
>> Joel Kalvesmaki
>> Editor in Byzantine Studies
>> Dumbarton Oaks
>> 202 339 6435
>
>
>
>--
>Dr Gabriel BODARD
>Reader in Digital Classics
>
>Institute of Classical Studies
>University of London
>Senate House
>Malet Street
>London WC1E 7HU
>
>E: [log in to unmask]
>T: +44 (0)20 78628752
>
>http://digitalclassicist.org/