Note: The news items in the sidebar are sorted alphabetically, not chronologically!
Natural language processing is a key technology in web search, information retrieval, social network analysis, machine translation, speech recognition, and many other applications. The course introduces students to methods for natural language processing, natural language understanding, and information retrieval.
- text processing and encoding
- string algorithms, edit distance
- statistical language models
- spell correction
- n-gram models
- word sense disambiguation
- Markov models, parts-of-speech tagging
- probabilistic grammars and parsing
- text alignment, clustering, text categorization
- statistical machine translation
- applications in speech recognition, handwriting recognition, and OCR
- language acquisition
- machine learning for NLP
- cognitive and psychological aspects of NLP
The course will combine a statistical, mathematical, and practical approach. Exercises will be in Python and will use some Python toolkits.
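To give a flavour of the kind of Python exercise this implies, here is a minimal sketch of one topic from the list above, edit distance. It is not taken from the course materials; the function name `edit_distance` and the unit cost for every operation are assumptions made for illustration.

```python
def edit_distance(source, target):
    """Levenshtein distance with unit costs for insert, delete, and substitute."""
    # At the start of row i, previous[j] is the distance between
    # source[:i-1] and target[:j]. Row 0 is the cost of inserting j characters.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning source[:i] into the empty string costs i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Only two rows of the dynamic-programming table are kept at a time, which is enough when only the distance, and not the alignment itself, is needed.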
Lectures: Wednesdays, 13:45 - 15:15 @ Room 48-462
Tutorials: every other Thursday, 17:00 - 18:30 @ Room 32-411 (see announcements on tutorial page)
Tutors: Mayce Al Azawi and Ludwig Schmidt-Hackenberg
Much of the material we cover is contained in this free e-book (also available from O'Reilly and Amazon in printed form):
You should also do background reading on your own, using sources like Wikipedia and Google Scholar when appropriate. In particular, after each lecture, look up online any important terms and ideas introduced in class.
- Dates: roughly between April 9th and 12th. There will be no exams at the end of the winter semester!
- Admission (Zulassung): You have to complete 50% of the assigned homework averaged over all handed out tasks.
- For the exam you have to finish all tasks and bring them with you!
- FAQ Oral Exams
Lecture 1 (17.10.12):
Lecture 2 (24.10.12):
- Worksheet*: Regular Expressions iPyNB, PDF
- Worksheet*: Regular Expressions and FSA iPyNB, PDF
PyDot is not available on the SCI terminals at the moment. We are talking to the SCI to get it installed. Update: working as of 30.11.12.
Lecture 7 (
Lecture 8 (5.12.12):
- NLPA - Markov Models iPyNB, PDF
- NLPA - HMM - OCR (see lecture 9)
- NLPA - Classification - Intro iPyNB, PDF (see below)
- NLPA - Classification - Intro iPyNB, PDF (updated)
- NLPA - Dialog Act Type classification iPyNB, PDF
- NLPA - Sentence Segmentation classification iPyNB, PDF
- NLPA - Classification tagging iPyNB, PDF
- NLPA - Classifier errors iPyNB, PDF
* These are IPython Notebook worksheets. To use them, run ipython notebook --pylab=inline
from the folder where you downloaded the files. You will need version 0.13 or later. IPython Notebook is installed on the SCI machines.
** You need to download tagutils.py and put it in the same folder as the worksheets in order to run these.
*** You need to download fstutils.py and put it in the same folder as the worksheets in order to run these.