----- Original Message ----
From: Hugh Cayless <[log in to unmask]>
To: [log in to unmask]
Sent: Monday, 25 August 2008, 4:11:13
Subject: Re: [DIGITALCLASSICIST] Stopwords for Latin?

I don't know of one, and I wonder whether anyone's ever seen a need
for one. Stopwords can help as a sort of performance optimization in
search engines with a restricted set of use cases, but once you get
beyond a certain domain limit, they just aren't useful (you can search
for 'a' on Google for an example of what I mean). Philologists are
often very interested in words that might get dropped by a stopword
list. I might want to find particular uses of 'et' for example, and
be very irritated if the results told me I couldn't.
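
To make the point above concrete, here is a minimal sketch of an inverted index built with and without stopword filtering. Everything in it is an assumption for illustration: the four-word stopword set is hypothetical (not a published Latin list), and the helper names `tokenize` and `build_index` and the three sample phrases are invented for the example.

```python
import re

# Hypothetical stopword set, for illustration only; not a published Latin list.
STOPWORDS = frozenset({"et", "in", "est", "non"})

# Tiny sample corpus (assumed for the example).
DOCS = {
    1: "Arma virumque cano",
    2: "Et tu, Brute",
    3: "Alea iacta est",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs, stopwords=frozenset()):
    """Map each token to the ids of the documents containing it.

    Tokens found in `stopwords` are never indexed, so they cannot be searched.
    """
    index = {}
    for doc_id, text in docs.items():
        for token in tokenize(text):
            if token not in stopwords:
                index.setdefault(token, set()).add(doc_id)
    return index


full_index = build_index(DOCS)
filtered_index = build_index(DOCS, STOPWORDS)

# With the full index, a search for 'et' finds document 2.
print(full_index.get("et", set()))
# With stopwords removed, the same search finds nothing.
print(filtered_index.get("et", set()))
```

The filtered index is smaller, which is the performance argument, but the search for 'et' can no longer be answered at all, which is the philologist's objection.
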
I've implemented search engines a few times now and honestly never had
a use for stopwords in the end for any of them. I sort of don't
believe in them anymore...so my question would be: what's the use
case, and do you really need one?
Hope this helps,
Hugh
On Aug 24, 2008, at 6:28 PM, Neven Jovanović wrote:
> Hello,
>
> does anybody know where one could look for a list of stop words for
> Latin?
> I have seen an English stop words list on Perseus
> (http://www.perseus.tufts.edu/Texts/engstop.html), but have not been
> able to find anything similar for Latin. Yes, the Dartmouth Dante
> (http://dante.dartmouth.edu/help.php) mentions "stopword list" for
> Latin, but does not make it available.
>
> It seems that such a list is something that always gets compiled from
> scratch. Perhaps a version of it, made freely available, could be a
> welcome contribution to the Digital Classicist wiki.
>
> (Not to mention a Greek stop word list...)
>
> Yours,
>
> Neven Jovanovic
>
> Zagreb, Hrvatska / Croatia