DATA MINING IN PRACTICE
Reading RSS Local Group Summer Meeting
Wednesday, 15 September, 1999
Department of Applied Statistics,
University of Reading
(abstracts provided below and in attachments):
10.30am Coffee
11.00 Keynote speech: David Hand (Imperial College)
Data mining for fun and profit
12.00 Simon Young (Equifax)
Using data to make informed decisions in the credit and marketing
industries
12.30 Lunch (please see below)
1.30 Franky de Cooman (SPS Belgium)
Why use data mining in the pharmaceutical sector?
2.00 Shanti Majithia (National Grid)
Short term electricity demand forecasting within the National Grid
Company
2.30 Tea
2.45 Alan Menius and Clive Bowman (Glaxo)
Data mining at Glaxo Wellcome
3.15 Ian Schagen (NFER)
Exploring OFSTED's School Inspection Database
3.45 Report and Election of Committee
4.00 Close
The talks will take place in Room GU01 in the Department of Meteorology
(usual room) at the Earley Gate entrance to the campus. Lunch, tea and
coffee will be in the Department of Applied Statistics. The meeting is
FREE OF CHARGE except for the cost of buffet lunch at £4.50 per head (money
will be collected on the day). If you would like to come along to the
meeting and lunch and you have not previously replied, please email Neil
Butler at [log in to unmask] with numbers on or before TUESDAY 7
SEPTEMBER (Note change from previous address).
We hope to see you there!
Abstracts (others provided in the attachments):
--------------------------------------------------------------------------
David Hand: Data mining for fun and profit
Data mining is defined as the process of seeking interesting or valuable
information within large data sets. This presents novel challenges and
problems, distinct from those typically arising in the allied areas of
statistics, machine learning, pattern recognition, or database science. A
distinction is drawn between the two data mining activities of model
building and pattern detection. Even though statisticians are familiar with
the former, the large data sets involved in data mining mean that novel
problems do arise. The second of the activities, pattern detection,
presents entirely new classes of challenges, some arising, again, as a
consequence of the large sizes of the data sets. Data quality is a
particularly troublesome issue in data mining applications, and this is
examined. The discussion is illustrated with a variety of real examples.
--------------------------------------------------------------------------
Ian Schagen: Exploring OFSTED's School Inspection Database
The OFSTED numerical database contains detailed judgements derived from the
inspection of schools in England on a four-year cycle. The NFER was
commissioned to explore ways of deriving useful insights from this database,
concentrating on secondary schools. At the school level, there are over 100
judgements per school, on a 7-point scale, with over 50 judgements for each
subject department per school. Lesson observations, on a 5-point scale, form
another stratum of data. Factor analysis techniques were used to derive
relevant factors at school and departmental level, which were then used in
the analysis of both lesson observations and public examination results as
outcomes.
--------------------------------------------------------------------------