To register, click here

Regular rates (register after April 15):
Rutgers affiliate, not volunteering: $75
Academic: $200
Non-academic/industry: $400

Rutgers students volunteering: free!


Corpus Statistics with Open Source Tools

This course will focus on how to use descriptive and inferential statistics over linguistic data (corpora), both to validate linguistic theories or algorithms and to discover patterns of interest in corpora. It will review some basic notions of descriptive and inferential statistics and provide an overview of the main corpus analysis techniques, together with a hands-on tutorial on how to implement and visualize them using open source tools such as the Natural Language Toolkit (NLTK) Python library.

Course Website

Camilo Thorne
University of Mannheim



Copyright ©2015, Rutgers, The State University of New Jersey