Hello,


I would say that in certain cases, stop word lists for Latin and Greek can be
useful. I work on the Res Gestae Divi Augusti and I miss them too. I am preparing digital editions of the text, from Mommsen to Scheid, focusing on the primary sources and seeking to bring out the historical features.

For a historical approach to this text, and to many others, it can be useful
to easily get rid of all the "little words" such as "et",
"ut" and many others, before running an automatic analysis of the
content or building some other, more complex representation.
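
To make the use case concrete, here is a minimal sketch in Python of this kind of filtering before a simple word count. The stop word list is a tiny hypothetical one invented for illustration (not an established resource), the function name is my own, and the sample is the opening clause of the Res Gestae; a real list would have to be compiled and checked against the corpus.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list for illustration only;
# it is not an established or complete Latin stop word list.
LATIN_STOPWORDS = {"et", "ut", "in", "ad", "cum", "non"}

def content_word_counts(text, stopwords=LATIN_STOPWORDS):
    """Count word forms in text, skipping any form found in stopwords."""
    # Lowercase, then keep runs of letters only (no digits or punctuation).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)

sample = ("Annos undeviginti natus exercitum privato consilio "
          "et privata impensa comparavi")
counts = content_word_counts(sample)
```

Note that this works on surface forms only: "privato" and "privata" are counted separately, and enclitics such as "-que" are not split off, so lemmatization would be a separate step.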

Marion Lamé
PhD student,
University of Aix-en-Provence, France
University of Bologna, Italy



----- Original Message ----
From: Hugh Cayless <[log in to unmask]>
To: [log in to unmask]
Sent: Monday, 25 August 2008, 4:11:13
Subject: Re: [DIGITALCLASSICIST] Stopwords for Latin?

I don't know of one, and I wonder whether anyone's ever seen a need  
for one.  Stopwords can help as a sort of performance optimization in  
search engines with a restricted set of use cases, but once you get  
beyond a certain domain limit, they just aren't useful (you can search  
for 'a' on Google for an example of what I mean).  Philologists are  
often very interested in words that might get dropped by a stopword  
list.  I might want to find particular uses of 'et' for example, and  
be very irritated if the results told me I couldn't.

I've implemented search engines a few times now and honestly never had  
a use for stopwords in the end for any of them.  I sort of don't  
believe in them anymore...so my question would be: what's the use  
case, and do you really need one?

Hope this helps,
Hugh

On Aug 24, 2008, at 6:28 PM, Neven Jovanović wrote:

> Hello,
>
> does anybody know where one could look for a list of stop words for
> Latin?
> I have seen an English stop words list on Perseus
> (http://www.perseus.tufts.edu/Texts/engstop.html), but have not been
> able to find anything similar for Latin.  Yes, the Dartmouth Dante
> (http://dante.dartmouth.edu/help.php) mentions a "stopword list" for
> Latin, but does not make it available.
>
> It seems that such a list is something that always gets compiled from
> scratch.  Perhaps a version of it, made freely available, could be a
> welcome contribution to the Digital Classicist wiki.
>
> (Not to mention a Greek stop word list...)
>
> Yours,
>
> Neven Jovanovic
>
> Zagreb, Hrvatska / Croatia


