Courses

Natural Language Computing

CSC401H (undergraduate) and CSC2511H (graduate)

Instructors: Gerald Penn, Sean Robertson, and Raeid Saqur

This course presents an introduction to natural language computing in applications such as information retrieval and extraction, intelligent web searching, speech recognition, and multi-lingual systems including machine translation. These applications will involve techniques such as n-grams, part-of-speech tagging, semantic distance metrics, indexing, entropy, hidden Markov models, and corpus analysis. Assignments will be completed in Python and MATLAB (with optional C/C++ modules at the student's discretion).

Prerequisites: CSC 207, 209, or 228; STA 247, 255, or 257; and either a CGPA of 3.0 or higher or a CSC subject POSt. MAT 223 or 240 is strongly recommended.

Note: CSC485/2501 and CSC401/2511 may be taken in either order.

Computational Linguistics

CSC485H (undergraduate) and CSC2501H (graduate)

Instructor: Gerald Penn

Computational linguistics and the understanding and generation of natural language by computer. Syntactic processing. Semantics and semantic interpretation. Pragmatics, pronouns, definite descriptions, discourse context. Machine translation.

Prerequisite: STA247H1/STA255H1/STA257H1 or familiarity with basic probability theory; CSC209H1 or proficiency in C++, Java, or Python.

Note: CSC485/2501 and CSC401/2511 may be taken in either order.

Recommended preparation: CSC324H1/CSC330H1/CSC384H1

Computational Models of Semantic Change

CSC2611H (graduate)

Instructor: Yang Xu

Words are fundamental components of human language, but their meanings tend to change over time, e.g., face ('body part' -> 'facial expression'), gay ('happy' -> 'homosexual'), mouse ('rodent' -> 'device'). Changes like these make it challenging for computers to learn accurate representations of word meanings, a task that is crucial to natural language systems. This course explores data-driven computational approaches to word meaning representation and semantic change. Topics include latent models of word meaning (e.g., LSA, word2vec), corpus-based detection of semantic change, probabilistic diachronic models of word meaning, and cognitive mechanisms of word sense extension (e.g., chaining, metaphor). The course involves a strong hands-on component that focuses on large-scale text analyses and seminar-style presentations.

Advanced Computational Linguistics

CSC2528H (graduate)

Instructor: Graeme Hirst

A seminar-style course that continues CSC 485/2501, and assumes the material presented therein. The course takes several topics of current research interest in computational linguistics, and studies them in depth. It emphasizes the interdisciplinary nature of computational linguistics. The interests of the class will determine exactly which topics are chosen. Auditors are welcome.

Prerequisite: CSC 401/2511 or 485/2501 or permission of instructor.

Discrete Mathematical Models of Sentence Structure

CSC2517H (graduate)

Instructor: Gerald Penn

Typed feature logic; mildly context-sensitive languages; parallel context-free grammars; tree-adjoining grammars; combinatory categorial grammar; pregroup grammars; tree transducers and tree-walking transducers.

Spoken Language Processing

CSC2518H (graduate)

Instructor: Gerald Penn

An introduction to working with speech in natural language processing systems. Topics include: articulatory and acoustic phonetics, prosody and information structure, introduction to digital signal processing of speech, automated speech recognition, text-to-speech synthesis, language models, dialogue modeling and dialogue systems. CSC2511H/401H1 is recommended (but not required) as a prerequisite.

The Computational Lexicon

CSC2520H (graduate)

Instructor: Suzanne Stevenson

A computational lexicon is a highly structured repository of the rich syntactic and semantic knowledge about individual words in a natural language processing system. Two key issues will be the focus of this seminar course: the representation of lexical information, and its automatic acquisition. Topics will include: the organization of meaning and syntax in the lexicon; the interface between lexical semantics and its syntactic realization; the predicate argument structure of verbs; corpus-based approaches to automatic lexical acquisition and semantic/syntactic annotation of words; linking of statistical models to linguistic models of lexical properties; unsupervised learning of lexical relations; resolution of lexical ambiguities in natural language processing. Research papers will primarily focus on relevant research in computational linguistics, but we will also discuss work in linguistic and cognitive models of the human lexicon, and ways in which the engineering and cognitive approaches can inform each other.

Cognitive Linguistics

CSC2540H (graduate)

Instructor: Suzanne Stevenson