
The metrical data in WordHoard could in principle be searchable by pattern, so that you could look for all verses that include the pattern 110-121 or 320-410-421-422-510. WordHoard has the category in its search terms, but for whatever reason we didn’t add the data. I once had the data in Microsoft Access, and it was quite useful.
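
A minimal sketch of what such a pattern search could look like, assuming the tagged words have been exported as (line id, word form, metricalShape) records. The record layout and the function names `find_by_shape` and `lines_with_any` are illustrative assumptions, not part of WordHoard; the sample rows are copied from the Iliad 1.1 example quoted later in this thread.

```python
# Hypothetical in-memory records: (line id, word form, metricalShape).
# The values are copied from the Iliad 1.1 example in this thread.
RECORDS = [
    ("IL.1.1", "μῆνιν", "110-121"),
    ("IL.1.1", "ἄειδε", "122-210-221"),
    ("IL.1.1", "θεὰ", "222-310"),
    ("IL.1.1", "Πηληϊάδεω", "320-410-421-422-510"),
    ("IL.1.1", "Ἀχιλῆος", "521-522-610-620"),
]


def find_by_shape(records, pattern):
    """Return (line id, word) pairs whose metricalShape equals the pattern."""
    return [(line_id, word) for line_id, word, shape in records if shape == pattern]


def lines_with_any(records, patterns):
    """Return the sorted ids of lines containing at least one requested shape."""
    wanted = set(patterns)
    return sorted({line_id for line_id, _word, shape in records if shape in wanted})


print(find_by_shape(RECORDS, "110-121"))
print(lines_with_any(RECORDS, ["110-121", "320-410-421-422-510"]))
```

With the full data set loaded in the same layout, the same two functions would answer the queries described above; a database index on the shape column would serve the same purpose at scale.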

 

Error correction via collation of different sets is a good idea. I won’t be able to contribute much (except the source data) since I’m very busy imitating the Red Crosse Knight trying to slay myriads of tiny dragons in the EEBO-TCP Cave of Error. Fortunately they don’t spawn new dragons, unlike the ‘Fake News’ that seems to be a modern and more frightening reincarnation of Spenser’s Cave of Error.

 

From: The Digital Classicist List <[log in to unmask]> on behalf of Helma Dik <[log in to unmask]>
Reply-To: The Digital Classicist List <[log in to unmask]>
Date: Tuesday, July 24, 2018 at 12:01 PM
To: The Digital Classicist List <[log in to unmask]>
Subject: Re: [DIGITALCLASSICIST] Seminar: Backoff Lemmatization for Ancient Greek with the Classical Language Toolkit

 

Martin's parses are incorporated at perseus.uchicago, which James Tauber and Giuseppe Celano are in a position to integrate with the Scaife viewer. I've made a minimal number of edits over the years, separating sheep from apples, urine from fair winds, and the like, to align things further with Liddell & Scott-based Perseus lemmatization.

 

I would be very interested in metrical data compared with morphology data for application to non-metrical texts (so as to disambiguate quantities of certain endings; e.g., acc pl -ας depending on pos and lemma). 

 

All best,

Helma


Helma Dik

Department of Classics

University of Chicago

 

 

On Tue, Jul 24, 2018 at 11:55 AM, David Chamberlain <[log in to unmask]> wrote:

I have a broader metrical tagging project at https://hypotactic.com. Most of Early and Classical Latin verse is done (including drama), Greek is underway (Homer and most other hexameter-based verse is done). Tagging is by syllable rather than word, in what I hope is fairly transparent HTML. Conversion of verse/line-based tagging to TEI (especially in drama where speakers change in the line) is not as easy as we might hope, but it is possible. Martin’s approach is easier than mine from this point of view, since TEI texts generally don’t identify syllables. I’m not tagging by POS, but it is not a huge challenge to combine such metrical data with the Perseids data (https://perseusdl.github.io/treebank_data/). If I may make so bold, I suggest that what would be really useful at this point would be to start comparing and error checking independent projects like this against each other (e.g. compare and correct the Perseids Homer POS data against Martin’s, unless Perseids perhaps used WordHoard already?). I’d be happy to collaborate.
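
A rough sketch of the comparison step, assuming both data sets have already been aligned to a shared token key (here line id plus word position) and mapped onto a common tag vocabulary. That alignment is the hard part and is not shown. The function name `collate`, the keys, and the toy tags are invented for illustration and are not drawn from either project.

```python
def collate(tags_a, tags_b):
    """Compare two {token key: tag} mappings.

    Returns (disagreements, only_in_a, only_in_b), each sorted by key.
    """
    shared = sorted(tags_a.keys() & tags_b.keys())
    disagreements = [
        (key, tags_a[key], tags_b[key]) for key in shared if tags_a[key] != tags_b[key]
    ]
    only_in_a = sorted(tags_a.keys() - tags_b.keys())
    only_in_b = sorted(tags_b.keys() - tags_a.keys())
    return disagreements, only_in_a, only_in_b


# Invented toy data: keys are (line id, word position), tags are placeholders.
project_a = {("IL.1.1", 1): "noun", ("IL.1.1", 2): "verb", ("IL.1.1", 3): "noun"}
project_b = {("IL.1.1", 1): "noun", ("IL.1.1", 2): "noun", ("IL.1.1", 4): "noun"}

disagreements, only_in_a, only_in_b = collate(project_a, project_b)
print(disagreements)
print(only_in_a, only_in_b)
```

The disagreement list is what a human corrector would review; tokens present in only one set usually point to tokenization or alignment differences rather than tagging errors.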

 

David Chamberlain

Department of Classics

University of Oregon

 

On July 24, 2018 at 8:19:01 AM, Martin Mueller ([log in to unmask]) wrote:

I just learned about the CLTK toolkit, and it made me wonder whether there are classicists out there who would have a use for the linguistically and metrically tagged data of Early Greek epic in WordHoard. Below is the example of the opening line of the Iliad. The tagging is not TEI, but could be converted into it easily enough. The morphosyntactic tagging is derived from Perseus, but has gone through a lot of manual correction, and it is, as these things go, pretty good. The metrical tagging uses a machine-friendly ad hoc notation, where the first number identifies the foot, the second the position in the foot, and the third tells you whether the second part consists of one or two syllables. The hyphen and space tell you whether metrical transitions occur within or between words.

 

Everything in WordHoard is in the public domain, and I’ll be happy to put the data on GitHub if there is a demand for them.

 

 

<wordHoardTaggedLine id="IL.1.1" n="1">
<w id="ege-101000101"
   lemma="μῆνις (n)"
   pos="4201"
   metricalShape="110-121">μῆνιν</w>
<punc> </punc>
<w id="ege-101000102"
   lemma="ἀείδω (v)"
   pos="1410021"
   metricalShape="122-210-221">ἄειδε</w>
<punc> </punc>
<w id="ege-101000103"
   lemma="θεά (n)"
   pos="5201"
   metricalShape="222-310">θεὰ</w>
<punc> </punc>
<w id="ege-101000104"
   lemma="Πηληϊάδης (np)"
   pos="2101"
   metricalShape="320-410-421-422-510">Πηληϊάδεω</w>
<punc> </punc>
<w id="ege-101000105"
   lemma="Ἀχιλλεύς (np)"
   pos="2101"
   metricalShape="521-522-610-620">Ἀχιλῆος</w>
</wordHoardTaggedLine>
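
A small sketch of how the tagging above might be read programmatically with Python's standard `xml.etree.ElementTree`. The sample is a shortened copy of two words from the example, given as a complete element so that it parses on its own. The helper names `decode_shape` and `words` are hypothetical, and splitting each three-digit code into separate digits is my reading of the notation as described in the message, not a documented WordHoard format.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example: two <w> elements inside a complete line element.
SAMPLE = """<wordHoardTaggedLine id="IL.1.1" n="1">
<w id="ege-101000101" lemma="μῆνις (n)" pos="4201" metricalShape="110-121">μῆνιν</w>
<punc> </punc>
<w id="ege-101000102" lemma="ἀείδω (v)" pos="1410021" metricalShape="122-210-221">ἄειδε</w>
</wordHoardTaggedLine>"""


def decode_shape(shape):
    """Split a shape such as '110-121' into (foot, position, third digit) triples.

    Assumption: every code is exactly three single digits.
    """
    return [tuple(int(digit) for digit in code) for code in shape.split("-")]


def words(xml_text):
    """Yield (form, lemma, pos, decoded shape) for each <w> element."""
    root = ET.fromstring(xml_text)
    for w in root.iter("w"):
        yield w.text, w.get("lemma"), w.get("pos"), decode_shape(w.get("metricalShape"))


for form, lemma, pos, shape in words(SAMPLE):
    print(form, lemma, pos, shape)
```

Each hyphen-separated code corresponds to one syllable of the word, so the length of the decoded list gives the syllable count as scanned.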

 


To unsubscribe from the DIGITALCLASSICIST list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=DIGITALCLASSICIST&A=1

 

