A useful tool is the Vocabulary Profiler. It’s not a list, but it will identify which words in a text are in the first 1,000 most frequent words (1K), the second thousand, and the Academic Word List too. It’s a great tool: very quick and colour-coded.

Vocabulary Profiler: http://www.er.uqam.ca/nobel/r21270/cgi-bin/webfreqs/web_vp.html Paste in your text to check its vocabulary profile.
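
For anyone curious about what a profiler of this kind does, here is a minimal sketch in Python. It is not the code behind the online tool. The band lists below (K1, K2, AWL) are tiny made-up placeholders, and the function names are hypothetical; a real profile would load the full frequency lists and would probably group word families rather than match exact word forms.

```python
import re
from collections import Counter

# Hypothetical, heavily truncated band lists, for illustration only.
BANDS = {
    "K1": {"the", "a", "is", "in", "of", "and", "to", "work", "people"},
    "K2": {"afford", "borrow", "crop"},
    "AWL": {"analyse", "data", "research"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def profile(text, bands=BANDS):
    """Label each token with its frequency band and report band shares."""
    tokens = tokenize(text)
    labelled = []
    for tok in tokens:
        # First band whose list contains the token, else "off-list".
        label = next(
            (name for name, words in bands.items() if tok in words),
            "off-list",
        )
        labelled.append((tok, label))
    counts = Counter(label for _, label in labelled)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    shares = {label: n / total for label, n in counts.items()}
    return labelled, shares


labelled, shares = profile("The people borrow data and zebras")
for tok, label in labelled:
    print(f"{tok:10} {label}")
print(shares)
```

The online tool shows the same kind of information with colours instead of labels.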

You can read about how the 1K list was produced at http://jbauman.com/aboutgsl.html (it’s based on West’s 1953 General Service List but was updated in 1995 in various ways), and there is also a link to the actual list in frequency order. There may be newer lists, but the profiler tool is good.

 

There are also lists on www.talkenglish.com that are based on corpus data.

 

Regards

 

Mary Osmaston

University of Central Lancashire


 

 

 

From: ESOL-Research discussion forum and message board [mailto:[log in to unmask]] On Behalf Of James Simpson
Sent: 31 March 2014 22:50
To: [log in to unmask]
Subject: Re: Corpus question

 

Hello all

There is a paper about a new General Service List of English by Brezina and Gablasova (2013), both at Lancaster, published in Applied Linguistics and available in open access format at

http://applij.oxfordjournals.org/content/early/2013/08/25/applin.amt018.full

This appears to be a different NGSL to the one Dominic refers to – it’s based on different (and bigger) corpora and seems to involve a different set of researchers.  I haven’t checked how similar the two are.

The word list itself is available (also free) through the applied linguistics site – click on ‘supplementary data’. It’s attached for info – the first 500 words are in red.

Cheers

James

 

The abstract:

The current study presents a New General Service List (new-GSL), which is a result of robust comparison of four language corpora (LOB, BNC, BE06, and EnTenTen12) of the total size of over 12 billion running words. The four corpora were selected to represent a variety of corpus sizes and approaches to representativeness and sampling. In particular, the study investigates the lexical overlap among the corpora in the top 3,000 words based on the average reduced frequency (ARF), which is a measure that takes into consideration both frequency and dispersion of lexical items. The results show that there exists a stable vocabulary core of 2,122 items (70.7%) among the four corpora. Moreover, these vocabulary items occur with comparable ranks in the individual wordlists. In producing the new-GSL, the core vocabulary items were combined with new items frequently occurring in the corpora representing current language use (BE06 and EnTenTen12). The final product of the study, the new-GSL, consists of 2,494 lemmas and covers between 80.1 and 81.7 per cent of the text in the source corpora.
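
To make the ARF measure mentioned in the abstract a little more concrete, here is a minimal Python sketch based on the usual definition of average reduced frequency as I understand it. It is not taken from the paper, and the authors' own implementation may differ. The function name and the toy positions are assumptions for illustration. The idea is that a word occurring f times in a corpus of N tokens scores close to f if its occurrences are evenly spread, and close to 1 if they are all bunched together in one place.

```python
def average_reduced_frequency(positions, corpus_size):
    """ARF of one word, given the token positions where it occurs.

    positions   -- 0-based token indices of each occurrence
    corpus_size -- total number of tokens in the corpus (N)
    """
    f = len(positions)
    if f == 0:
        return 0.0
    v = corpus_size / f  # average gap if occurrences were evenly spread
    pos = sorted(positions)
    # Gaps between consecutive occurrences...
    distances = [pos[i] - pos[i - 1] for i in range(1, f)]
    # ...plus the wrap-around gap from the last occurrence back to the first.
    distances.append(pos[0] + corpus_size - pos[-1])
    # Each gap counts for at most v, so long empty stretches are capped.
    return sum(min(d, v) for d in distances) / v


# Four occurrences spread evenly through 100 tokens, versus four in a row.
print(average_reduced_frequency([0, 25, 50, 75], 100))
print(average_reduced_frequency([0, 1, 2, 3], 100))
```

Both toy words have a raw frequency of 4, but the evenly spread one keeps a score of 4 while the clustered one drops to just over 1, which is how the measure combines frequency with dispersion.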

 

 

From: ESOL-Research discussion forum and message board [mailto:[log in to unmask]] On Behalf Of dominic mccabe
Sent: 31 March 2014 10:30
To: [log in to unmask]
Subject: Re: Corpus question

 

Hello All

 

In 1953 Michael West famously produced the General Service List of English, which aimed to provide a list of words that would enable a basic level of functionality. Of course this was very ESL/EFL and is quite high level. A new general service list of English was produced in 2013, and they still seem to be offering different forms of the almost 3,000 words (with and without definitions, for example). It can be found here:

 

http://www.newgeneralservicelist.org/

 

It is still too high for our pre-entry, Entry 1 and Entry 2 classes but okay for Entry 3, I think. Maybe British Council Nexus could do something clever with it using the CEFR, but maybe that wheel has already been invented.

 

Please find the list without definitions attached, if JISC lets us do that.

 

Cheers Dominic

 

On Sunday, 30 March 2014, 13:05, Diana Tremayne <[log in to unmask]> wrote:

I guess it depends what language you think the learners need, Dominic, i.e. for everyday life or to pass qualifications etc. (obviously there is an overlap here), but you could build a small corpus quite quickly by inputting text into a program like AntConc http://www.antlab.sci.waseda.ac.jp/antconc_index.html (it's free). So, for example, if you wanted to look at key language in reading assessments, you could use a range of texts and see what was most frequently used (this could be very useful for instructional language). Or you could input texts from materials (e.g. Skills for Life) and try that out, or find some relevant authentic materials. It could be done fairly quickly.
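
For anyone who would rather script this than use a program, the basic idea of a word frequency list can be sketched in a few lines of Python. This is not AntConc, just a rough illustration of the same kind of count. The function name and the two sample sentences are made up for the example; in practice you would read in your own assessment texts or course materials.

```python
import re
from collections import Counter


def word_frequencies(texts):
    """Count how often each word occurs across a collection of texts."""
    counts = Counter()
    for text in texts:
        # Lowercase, then keep alphabetic word tokens only.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


# Hypothetical instructional language, standing in for real assessment texts.
sample_texts = [
    "Read the text. Tick the correct box.",
    "Read the questions and circle the correct answer.",
]

for word, n in word_frequencies(sample_texts).most_common(5):
    print(word, n)
```

With a larger set of texts, the top of the list is mostly grammar words, so it can help to look further down or to filter those out before deciding what to teach.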

As part of my MA I'm thinking of building a corpus to help learners who want to progress to a vocational area so that we can spend more time on the language they need before they begin the course. I was thinking that you would really need to include a mixture of written texts and also spoken language that they would encounter in theory/practical sessions as the spoken language might sometimes be more of a challenge. I'll see how it goes and if anyone is interested I'll let you know!

best wishes

Diana Tremayne
Advanced Learning Practitioner / ESOL E2 Course Leader
Tel: 01422 357357 ext 9403

Calderdale College  Francis Street   Halifax   HX1 3UZ
01422 357357 email:[log in to unmask] www.calderdale.ac.uk



-----Original Message-----
From: ESOL-Research discussion forum and message board on behalf of Lesley Robinson
Sent: Fri 28/03/2014 18:15
To: [log in to unmask]
Subject: Re: Corpus question

Dear Philida

This is a fantastic resource which I didn't know about! - thank you.

Do you or does anyone else happen to know if anything similar has been produced for other languages, eg Italian?

Many thanks,
Lesley

Lesley Robinson

Sent from my iPad

On 28 Mar 2014, at 13:52, "Philida" <[log in to unmask]> wrote:

> Hi Dominic

> I think your best bet is English Profile http://www.englishprofile.org/ which has a searchable listing of vocabulary by CEFR level.

> Regards - Philida

> From: Hann, Naeema
> Sent: Friday, March 28, 2014 11:46 AM
> To: [log in to unmask]
> Subject: FW: Corpus question

> Hi Dominic and Everyone,

> I am not a corpus person myself but am lucky enough to share an office with someone who is a corpus linguist, Dr. Ivor Timmis, and here is what Ivor suggests.

> Best wishes,

> Naeema

> From: Timmis, Ivor S
> Sent: 28 March 2014 11:29
> To: Hann, Naeema
> Subject: RE: Corpus question

> Not as far as I am aware, Naeema. It is unusual to pre-determine the language level of the corpus in this way. There are corpora of learner language, but these are often written corpora and often at high levels. The best you could hope for would be corpus-informed materials, e.g. Touchstone or face2face at elementary level. CEF A1 indicators might be another useful point of reference, and may be the best bet.

> Ivor

> Dr Ivor Timmis,
> Reader in English Language Teaching,
> School of Languages,
> Headingley Campus,
> Leeds Metropolitan University,
> Leeds LS6 3QN
> 01138124707

> From: ESOL-Research discussion forum and message board [mailto:[log in to unmask]] On Behalf Of Dominic Clarke
> Sent: 28 March 2014 11:13
> To: [log in to unmask]
> Subject: Corpus question

> Hello all

> I am aware that various kinds of corpus have been created
> over the years (e.g. journalism, fiction, etc.). However, I wonder if
> any corpus has ever been created which focusses on the very
> basic language that many of the students I teach need. Many
> of my students are at the lower end of the level system and
> for various reasons seem unlikely to progress rapidly or beyond
> a certain level.

> Given that teaching time in the classroom is limited,
> it would seem useful to focus on the language that many of
> these students desperately need. Has any work been done
> on such a corpus?

> Regards

> Dominic
> *********************************** ESOL-Research is a forum for researchers and practitioners with an interest in research into teaching and learning ESOL. ESOL-Research is managed by James Simpson at the Centre for Language Education Research, School of Education, University of Leeds. To join or leave ESOL-Research, visit http://www.jiscmail.ac.uk/lists/ESOL-RESEARCH.html To contact the list owner, send an email to [log in to unmask]

>

>
> To view the terms under which this email is distributed, please go to http://disclaimer.leedsmet.ac.uk/email.htm