Editor: 戴静菡, 2019-07-16
Adam Kilgarriff's Legacy to Computational Linguistics and Beyond

Roger Evans, Alexander Gelbukh, Gregory Grefenstette, Patrick Hanks, Miloš Jakubíček, Diana McCarthy, Martha Palmer, Ted Pedersen, Michael Rundell, Pavel Rychlý, Serge Sharoff, David Tugwell

University of Brighton, R.P.Evans@brighton.ac.uk
CIC, Instituto Politécnico Nacional, Mexico, gelbukh@gelbukh.com
Inria Saclay, ggrefen@gmail.com
University of Wolverhampton, patrick.w.hanks@gmail.com
Lexical Computing and Masaryk University, milos.jakubicek@sketchengine.co.uk
DTAL University of Cambridge, diana@dianamccarthy.co.uk
University of Colorado, martha.palmer@colorado.edu
University of Minnesota, tpederse@d.umn.edu
Lexicography MasterClass, michael.rundell@lexmasterclass.com
Lexical Computing and Masaryk University, pary@fi.muni.cz
University of Leeds, s.sharoff@leeds.ac.uk
Independent Researcher, dtugwell@googlemail.com

Abstract. This year, the CICLing conference is dedicated to the memory of Adam Kilgarriff, who died last year. Adam leaves behind a tremendous scientific legacy, and those working in computational linguistics, other fields of linguistics and lexicography are indebted to him. This paper is a summary review of some of Adam's main scientific contributions. It is not and cannot be exhaustive. It is written by only a small selection of his large network of collaborators. Nevertheless, we hope this will provide a useful summary for readers wanting to know more about the origins of work, events and software that are so widely relied upon by scientists today, and undoubtedly will continue to be so in the foreseeable future.

1 Introduction

Last year was marred by the loss of Adam Kilgarriff, who during the last 27 years contributed greatly to the field of computational linguistics¹, as well as to other fields of linguistics and to lexicography. This paper provides a review of some of the key scientific contributions he made. His legacy is impressive, not simply in terms of the numerous academic papers, which are widely cited in many fields, but also in terms of the many scientific events and communities he founded and fostered, and the commercial Sketch Engine software. The Sketch Engine has provided computational linguistics tools and corpora to scientists in other fields, notably lexicography [61,50,17], as well as facilitating research in other areas of linguistics [56,12,11,54] and our own subfield of computational linguistics [60,74].

¹ In this paper, natural language processing (NLP) is used synonymously with computational linguistics.

Adam was hugely interested in lexicography from the very inception of his postgraduate career. His DPhil² on polysemy and subsequent interest in word sense disambiguation (WSD) and its evaluation was firmly rooted in examining corpus data and dictionary senses with a keen eye on the lexicographic process [20]. After his DPhil, Adam spent several years as a computational linguist advising Longman Dictionaries on the use of language engineering for the development of lexical databases, and he continued this line of knowledge transfer in consultancies with other publishers until realizing the potential of computational linguistics with the development of his commercial software, the Sketch Engine. The origins of this software lay in his earlier ideas of using computational linguistics tools for providing word profiles from corpus data.

For Adam, data was key. He fully appreciated the need for empirical approaches to both computational linguistics and lexicography. In computational linguistics from the 90s onwards there was a huge swing from symbolic to statistical approaches; however, the choice of input data, in composition and size, was often overlooked in favor of a focus on algorithms. Furthermore, early on in this statistical tsunami, issues of replicability were not always appreciated. A large portion of his work was devoted to these issues;
