
ACL 2009 NLPIR4DL: Workshop on text and citation analysis for scholarly 
digital libraries

Call for Papers

In recent years, interest in scholarly publications in electronic
forms has boomed, and several large-scale electronic digital libraries
and citation indices are now used every day by researchers. Current
digital libraries collect and allow access to digital papers and their
metadata (including citations), but largely do not attempt to analyze
the items they collect.

The goal of this workshop is to investigate how developments in
natural language processing and information retrieval techniques can
advance the state of the art in scholarly document understanding,
analysis and retrieval. Full document text analysis can help design
automatic summarization and sentiment detection methods, automated
recommendation and reviewing systems, and may provide data for
visualizing scientific trends and bibliometrics. Citation analysis
takes this a step further, adding scientific social network analysis
as another strand of evidence to enhance solutions to the above
challenges. Web-based digital libraries add download counts and Web
2.0 information such as tagging.

Aside from researchers, this workshop hopes to interest other
stakeholders, namely implementers, publishers and policymakers. Even
within computer science, many different scholarly sites exist -- ACM
Portal, IEEE Xplore, Google Scholar, PSU's CiteSeerX, MSRA's Libra,
Tsinghua's ArnetMiner, Trier's DBLP, UMass' Rexa, Hiroshima's PRESRI
-- and with this workshop we hope to bring a number of these
contributors together. Today's publishers continue to seek new ways to
be relevant to their consumers, in disseminating the right published
works to their audience. Formal citation metrics have become an
increasingly large factor in decision-making by universities and
funding bodies worldwide, which makes research on these topics, and on
better methods for measuring the impact of work, more pressing.

We invite stimulating and unpublished submissions on topics including
(but not limited to) full-text analysis, multimedia and multilingual
analysis and alignment, as well as citation-based NLP or IR. Specific
examples of fields of interest include:

* new information access methods for scientific papers
* automatic creation of reviews
* automatic qualitative assessment of submissions
* summarisation of scientific articles
* navigation, searching and browsing in scholarly DLs
* techniques for suggesting and recommending scholarly papers,
   reviewers, citations and publication venues
* information retrieval for scholarly text, e.g. citation-based IR
* topical modeling analysis
* network analysis and citation analysis in scholarly DLs
* citation function/motivation analysis
* novel bibliographic metrics
* niche search in scholarly DLs, e.g., survey paper finding and
   provenance tracing of algorithms
* knowledge discovery and analysis of the ancestry of ideas
* analyses of writing style in scholarly publications
* multilingual and multimedia analysis and alignment of scholarly works
* managing digital archives of linguistic corpora; federated access
* metadata and controlled vocabularies for resource description and
   discovery
* automatic metadata discovery, e.g., language identification
* data cleaning and data quality
* disambiguation issues in scholarly DLs using NLP or IR techniques.

Submission details:

Submissions should follow the standard ACL-IJCNLP paper submission
style; style files are available at:
http://www.acl-ijcnlp-2009.org/main/authors/stylefiles/

Important Dates:

May 1, 2009 Deadline for paper submissions
Jun 1, 2009 Notification of acceptances
Jun 7, 2009 Camera-ready copies due
Aug 7, 2009 ACL-IJCNLP 2009 Workshop

Program Committee:

* Colin Batchelor (Royal Society of Chemistry)
* Steven Bird (Univ. of Melbourne & Linguistic Data Consortium)
* Shannon Bradshaw (Drew University)
* Jason S Chang (National Tsing-hua Univ.)
* Robert Dale (Macquarie Univ.)
* Bonnie Dorr (Univ. of Maryland)
* Curtis Dyreson (Utah State Univ.)
* C Lee Giles (Pennsylvania State Univ.)
* Dan Jurafsky (Stanford Univ.)
* Noriko Kando (National Institute of Informatics, Japan)
* Dongwon Lee (Pennsylvania State Univ.)
* Elizabeth Liddy (Syracuse Univ.)
* Andrew McCallum (Univ. of Massachusetts)
* Qiaozhu Mei (UIUC)
* Hidetsugu Nanba (Hiroshima Univ.)
* Manabu Okumura (Tokyo Institute of Technology)
* Dragomir Radev (Univ. of Michigan)
* Anna Ritchie (Cambridge University)
* Mark Sanderson (Sheffield Univ.)
* John Swales (Univ. of Michigan)
* Jie Tang (Tsinghua Univ.)
* Michael Thelwall (Univ. of Wolverhampton)
* Howard White (Drexel Univ.)
* Bonnie Webber (Edinburgh Univ.)

Organizers:

Simone Teufel
University of Cambridge Computer Laboratory
William Gates Building, JJ Thomson Ave,
Cambridge CB3 0FD, United Kingdom.

Simone Teufel is a senior lecturer in the Computer Laboratory at
Cambridge University, where she has worked since 2001. Her main
research interests are in corpus-linguistic approaches to discourse
theory, and in the application of such information to summarisation,
information retrieval and citation analysis. She has a background in
computer science (1994 Diploma from the University of Stuttgart) and
in cognitive science (2000 PhD from Edinburgh University). She also
has experience in medical information processing and search, from a
postdoctoral stay at Columbia University, and in collocation
extraction, from a research post at Xerox Europe. Her latest research
interests include lexical acquisition, and the visualisation and
language generation of the analysis results of scientific articles.

Min-Yen Kan
AS6 05-12
Computing 1, Law Link
National University of Singapore

Min-Yen Kan is an assistant professor at the National University of
Singapore. His research interests include digital libraries and
applied natural language processing. Specific projects include work in
the areas of citation analysis, document structure acquisition, verb
analysis, and applied text summarization. Prior to joining NUS, he was
a graduate research assistant at Columbia University, and has interned
at various industry laboratories, including AT&T, IBM and Eurospider
Technologies in Switzerland.