On 26/01/18 17:57, Aurélien Berra wrote:
I'm not sure I see your point here, Maurizio. We probably agree that there is no ideal stoplist. The lists should be corpus-based, implementing a statistical threshold (with or without a shared static core), and iterative, in relation to successive interests. Obviously, in an environment where the user cannot choose or update the stoplist, the default list can be designed in various ways. And techniques like phrase search introduce other approaches.
ehm... probably no point, just genuine curiosity about the completely new perspectives you open.
what i have in mind is that if i want, e.g., to study the expression/representation of time, adverbs of time, which in other cases can be treated as stopwords, become relevant words.
this implies - as i saw you did - that the (stop)words be categorized, to make it easier for the user to customize them further according to their own purposes.
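
a rough sketch of how the two ideas in this exchange could fit together: a corpus-based list built from a document-frequency threshold, combined with a categorized static core from which the user can release whole categories. everything named here (`STATIC_CORE`, `build_stoplist`, the 0.8 threshold, the sample texts) is invented for illustration and does not come from any particular tool.

```python
from collections import Counter

# Hypothetical static core, grouped by category so that a user can
# release a whole group (e.g. adverbs of time) back into the analysis.
STATIC_CORE = {
    "articles": {"the", "a", "an"},
    "time_adverbs": {"now", "then", "always", "never"},
}


def build_stoplist(docs, df_threshold=0.8, keep_categories=()):
    """Combine corpus-based stopwords with the categorized static core.

    A word is corpus-based if it occurs in at least `df_threshold`
    (a fraction) of the documents. Words in any category listed in
    `keep_categories` are never stopped, even if they are frequent.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        # Count each word once per document (document frequency).
        doc_freq.update(set(doc.lower().split()))

    corpus_based = {w for w, c in doc_freq.items() if c / n_docs >= df_threshold}

    core, kept = set(), set()
    for category, words in STATIC_CORE.items():
        if category in keep_categories:
            kept |= words
        else:
            core |= words

    return (corpus_based | core) - kept


# Toy corpus, assumed for illustration only.
docs = [
    "the cat sat then",
    "the dog never sat",
    "a bird sang the song now",
]

default_list = build_stoplist(docs)
time_study_list = build_stoplist(docs, keep_categories=("time_adverbs",))
```

with `keep_categories=("time_adverbs",)` the adverbs of time stay in the text as content words, while the rest of the list is unchanged. the threshold and the categories would be revised iteratively as the research question changes.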
best
maurizio

On 26 Jan 2018, at 14:21, maurizio lana <[log in to unmask]> wrote:
so my next question arises: can one practically define/identify the set of stopwords for one's own text(s)?


-- 

christmas 2017 - jesus, a son of refugees in flight