
MCQLL Meeting, 11/11 — Bing’er Jiang

At this week’s MCQLL meeting (1:30-2:30pm Wednesday, November 11), Bing’er Jiang, a sixth-year PhD student in the McGill Linguistics Department, will present her work on the perceptual tonal space in Mandarin Chinese continuous speech. The talk abstract is below.

If you would like to join the meeting and have not already registered for the MCQLL mailing list, please do so ASAP using this form.

Abstract: This study examines the perceptual tonal space in Mandarin Chinese continuous speech and how various acoustic properties signalling the tonal contrast are represented in this space. Previous studies on Mandarin tones mainly focus on words produced in isolation, but there is little understanding of the perception of tones in continuous speech, where they are realized with more variability. We first evaluate the importance of three acoustic correlates (pitch, intensity, and duration) for the tonal contrast by using a set of tone classification models trained on broadcast news. Instead of model ablation, we use a novel method of data ablation, inspired by conventional perceptual experiments, to restrict the acoustic information the model can access. We further force the model to learn a low-dimensional representation, which can be seen as the model’s perceptual representation for tones. We find that the information for tonal distinction can be compressed into a two-dimensional space, and that the structure of the space corresponds to findings in the literature on human perception of isolated tones.

MCQLL Meeting, 11/4 — Emi Baylor

At this week’s MCQLL meeting (Wednesday, November 4th, 1:30-2:30pm), Emi Baylor, a master’s student at the McGill School of Computer Science and Mila, will be presenting her work on morphological productivity. Bio and talk abstract are below.

If you would like to attend the talk but are not already on the MCQLL listserv, please sign up at this link as soon as possible, as there is still a registration step that needs to be completed after that.

Bio: Emi Baylor is a master’s student at McGill Computer Science and Mila. She is interested in computational morphology, multilingual NLP, and low-resource languages, as well as the combination of all three.

Abstract: This work investigates and empirically tests theories of linguistic productivity. Language users are able to make infinite use of finite means, meaning that a finite number of words and morphemes can be used to create an infinite number of utterances. This is largely due to linguistic productivity, which allows language users to create and understand novel expressions through stored, reusable units. One example of a productive process across languages is plural morphology, which generalizes the use of plural morphemes in a language to novel words. This work investigates and empirically tests theories of how this generalization of forms is learned and carried out, using data from the complex German plural noun system.

MCQLL Meeting, 10/28 — Michaela Socolof

At this week’s MCQLL meeting (Wednesday, October 28th, 1:30-2:30pm), Michaela Socolof, a PhD student in the McGill Linguistics department, will be presenting her work on idioms and compositionality. Bio and talk abstract are below.

If you would like to attend the talk but are not already on the MCQLL listserv, please sign up at this link as soon as possible, as there is still a registration step that needs to be completed after that.

Bio: Michaela Socolof is a PhD student at McGill Linguistics. She is interested in syntax and semantics, with a focus on using computational tools to explore questions in these domains.

Talk: This work addresses the question of how idioms should be characterized. Unlike most phrases in language, whose meanings are largely predictable based on the meanings of their individual words, idioms have idiosyncratic meanings that do not come from straightforwardly combining their parts. This observation has led to the commonly repeated notion that idioms are an exception to compositionality that requires special machinery in the linguistic system. We show that it is possible to characterize idioms based on the interaction of two simple properties of language: the extent to which the word meanings are dependent on context and the extent to which the phrase is stored as a unit. We present computational approximations of these two properties, and we show that our measures successfully distinguish between idiomatic and non-idiomatic phrases.

MCQLL Meeting, 10/21 — Jacob Hoover

At this week’s MCQLL meeting (Wednesday, October 21st, 1:30-2:30pm), Jacob Louis Hoover, a PhD student at McGill and Mila, will present on the connection between grammatical structure and the statistics of word occurrences in language use. Abstract and bio are below.

If you would like to attend and have not already signed up for the MCQLL mailing list, please fill out this Google form ASAP to do so.

Bio: Jacob is a PhD student at McGill Linguistics / Mila. He is broadly interested in logic, mathematical linguistics, and the generative / expressive capacity of formal systems, as well as information theory, and examining what both human and machine learning might be able to tell us about the underlying structure of language.

Talk: There is an intuitive connection between grammatical structure and the statistics of word occurrences observed in language use. This intuitive connection is reflected in cognitive models and also in NLP, in the assumption that the patterns of predictability correlate with linguistic structure. We call this the “dependency-dependence” hypothesis. This hypothesis is implicit in the use of language modelling objectives for training modern neural models, and has been made explicit in some approaches to unsupervised dependency parsing. The strongest version of this hypothesis is to say that compositional structure is in fact entirely reducible to co-occurrence statistics (a hypothesis made explicit in Futrell et al. 2019). Investigating the mutual information of pairs of words using pretrained contextualized embedding models, we show that the optimal structure for prediction is in fact not very closely correlated with the compositional structure. We propose that contextualized mutual information scores of this kind may be useful as a way to understand the structure of predictability, as a system distinct from compositional structure, but also integral to language use.
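
For readers unfamiliar with word-pair mutual information, here is a minimal sketch of its simplest count-based form, pointwise mutual information (PMI) over sentence-level co-occurrence. This is not the method used in the talk, which estimates scores with pretrained contextualized models; the toy corpus and the function name `pmi_table` are assumptions made up for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus, invented for this sketch.
SENTENCES = [
    "the dog chased the cat",
    "the dog barked",
    "the cat slept",
    "a dog chased a ball",
]

def pmi_table(sentences):
    """Return PMI (in bits) for every pair of word types that share a sentence.

    Probabilities are estimated as the fraction of sentences containing
    a word (or both words of a pair).
    """
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        types = sorted(set(sentence.split()))  # each type counted once per sentence
        word_counts.update(types)
        pair_counts.update(combinations(types, 2))  # pairs in alphabetical order
    n = len(sentences)
    return {
        (w1, w2): math.log2((count / n) / ((word_counts[w1] / n) * (word_counts[w2] / n)))
        for (w1, w2), count in pair_counts.items()
    }

pmi = pmi_table(SENTENCES)
print(pmi[("chased", "dog")])  # positive: the pair co-occurs more often than chance
print(pmi[("cat", "dog")])     # negative: the pair co-occurs less often than chance
```

A positive score means the two words predict each other; the hypothesis discussed in the talk concerns whether such scores line up with syntactic dependencies.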

MCQLL, 10/7 — Mika Braginsky

At this week’s MCQLL meeting (Wednesday, October 7th, 1:30-2:30pm), Mika Braginsky, a graduate student in Brain and Cognitive Sciences at MIT, will discuss their work investigating linguistic productivity and child language acquisition. The talk abstract is below.

If you would like to attend and have not already signed up for the MCQLL mailing list, please fill out this Google form to do so.

Talk: In learning morphology, do children generalize from their vocabularies on an item-by-item basis, or do they form global rules on a developmental timetable? We use large-scale parent-report data to address this question by investigating relations among morphological development, vocabulary growth, and age. For three languages, we examine irregular verbs (e.g. go) and predict children’s correct inflection (went) and overregularization (goed/wented). Morphology knowledge relates strongly to vocabulary, more so than to age. Further, this relation is modulated by age: for two children with the same vocabulary size, the older is more likely to correctly inflect and overregularize, and the effect of vocabulary on morphology decreases with age. Lastly, correct inflection and overregularization rates rise in tandem over age, and vocabulary effects on them are correlated across items. Our findings suggest that morphology learning is strongly coupled to lexical learning and that correct inflection and overregularization are related, verb-specific processes.

MCQLL, 9/30 — Maya Watt

At this week’s MCQLL meeting (Wednesday, September 30th, 1:30-2:30), Maya Watt will be presenting her research on the rates of over-irregularization of English past-tense verbs. See below for the talk abstract and Maya’s bio.

If you would like to attend and have not already signed up for the MCQLL mailing list, please fill out this Google form.

TALK: In her talk, Maya will discuss the rates of over-irregularization of English past-tense verbs (e.g. believing the past tense of snow is snew instead of snowed). Such mistakes rarely happen in natural speech, so very little is known about the nuances of over-irregularization: do people tend to over-irregularize verbs of a particular inflectional class, or do the rates stay fairly similar across classes? Because capturing an instance of over-irregularization in natural speech is difficult, we decided to collect our data by implementing a lexical decision task (LDT) and launching it on Mechanical Turk. The assumption is that highly natural over-irregularized non-words (e.g. brang) will take longer to be judged as non-words than other, less natural non-words (e.g. screamt). The goal of this project is to provide some data and insight into language learning and productive morphology.

BIO: Maya is an undergraduate student in Linguistics and Computer Science. She’s interested in syntax, logic, and formal linguistics. Her research interests lie in the intersection of natural language and mathematics.

MCQLL, 9/23 — Emily Goodwin

This week at MCQLL (Wednesday 1:20-2:30), Emily Goodwin will present her ongoing work on systematic syntactic parsing. Abstract and bio are below. If you would like to join the mailing list and/or attend the meeting, please fill out this Google form as soon as possible.

ABSTRACT:
Recent work in semantic parsing, including novel datasets like SCAN (Lake and Baroni, 2018) and CFQ (Keysers et al., 2020), demonstrates that semantic parsers generalize well when tested on items highly similar to those in the training set, but struggle with syntactic structures that combine components of training items in novel ways. This indicates a lack of systematicity, the principle that individual words will make similar contributions to the expressions they appear in, independently of surrounding context. Applying this principle to syntactic parsing, we show that similar problems plague state-of-the-art syntactic parsers, even though they achieve human or near-human performance on randomly sampled test data. Moreover, generalization is especially poor on syntactic relations that are crucial for compositional semantics.

BIO:
Emily is an M.A. student in the McGill Linguistics department, supervised by Profs. Timothy J. O’Donnell and Siva Reddy, and by Dzmitry Bahdanau of ElementAI. She is interested in compositionality and systematic generalization in meaning representation.

MCQLL, 9/16 — Lightning Talks

As last week’s meeting was cancelled to make it easier for people to participate in the scholar strike, this week’s MCQLL meeting (1:30pm on Wednesday, September 16th) will be lightning talks by returning MCQLL lab members (that would have normally taken place last week). This will serve as an introduction to the type of work done at MCQLL, as well as provide a space to ask questions about our research and the lab in general.

Please make sure to register here beforehand so that you can get the meeting link. If you already registered last week, there is no need to register again; just join with the link in your registration confirmation email.

MCQLL, 9/9 — Lightning Talks

At this week’s MCQLL meeting (1:30pm on Wednesday, September 9th), there will be a series of lightning talks by returning MCQLL lab members. This will serve as an introduction to the type of work done at MCQLL, as well as provide a space to ask questions about our research and the lab in general.

Please make sure to register here beforehand so that you can get the meeting link.

The (tentative) meeting agenda is as follows:

  1. Announcements
  2. Lightning Talks
    • Clarifying questions here are fine, but please hold all discussion questions until the Q & A session.
  3. Q & A Session, including:
    • Discussion questions relating to the talks
    • Questions relating to the lab in general

Please don’t hesitate to reach out if you have any questions, comments, or concerns.

MCQLL meeting, 6/3 – Timothy J O’Donnell 

This week at the Montreal Computational and Quantitative Linguistics Lab meeting, Timothy O’Donnell will be presenting his Meditations on Compositional Structure, to make up for last week’s postponement. This presentation attempts to synthesize several threads of work in a broader framework. We meet at 2:30 via Zoom (if you are not on the MCQLL emailing list, please contact Emily Goodwin at emily.goodwin@mail.mcgill.ca for the meeting link).

 

MCQLL meeting, 5/13 — Bing’er Jiang 

The next meeting of the Montreal Computational and Quantitative Linguistics Lab will take place on Wednesday May 13th, at 2:30, via Zoom. Bing’er will present on Modelling Perceptual Effects of Phonology with Automatic Speech Recognition Systems. If you would like to participate but are not on the MCQLL or computational linguistics emailing list, contact emily.goodwin@mail.mcgill.ca for the Zoom link.

MCQLL meeting, 5/6 — Jacob Hoover

The next meeting of the Montreal Computational and Quantitative Linguistics Lab will take place on Wednesday May 6th at 2:30, via Zoom. Jacob Hoover will present an ongoing project on compositionality and predictability.

For abstract and more information see the MCQLL lab page. If you would like to participate but are not on the MCQLL or computational linguistics emailing list, contact emily.goodwin@mail.mcgill.ca for the Zoom link.

MCQLL meeting, 4/29 — Koustuv Sinha

The next meeting of the Montreal Computational and Quantitative Linguistics Lab will take place on Wednesday April 29th, at 2:30, via Zoom. Koustuv Sinha will present “Learning an Unreferenced Metric for Online Dialogue Evaluation” (ACL, 2020). For abstract and more information see the MCQLL lab page. If you would like to participate but are not on the MCQLL or computational linguistics emailing list, contact emily.goodwin@mail.mcgill.ca for the Zoom link.

MCQLL meeting, 4/22 – Spandana Gella

Spandana Gella, a research scientist at Amazon AI, will present on “Robust Natural Language Processing with Multi-task Learning” at this week’s Montreal Computational and Quantitative Linguistics Lab meeting. We are meeting Wednesday, April 22nd, at 2:00 via Zoom (to be added to the MCQLL listserv, please contact Jacob Hoover at jacob.hoover@mail.mcgill.ca).

Abstract:

In recent years, we have seen major improvements on various Natural Language Processing tasks. Despite the human-level performance of current models on benchmark datasets, recent studies have shown that these models are vulnerable to adversarial examples. The models have been shown to rely on spurious correlations that hold for the majority of examples; they suffer under distribution shift and fail on atypical or challenging test sets. Recent work has shown that large pre-trained models improve robustness to spurious associations in the training data. We observe that the superior performance of large pre-trained language models comes from their better generalization from a minority of training examples that resemble the challenging sets. Our study shows that multi-task learning with the right auxiliary tasks improves accuracy on adversarial examples without hurting in-distribution performance. We show that this holds true for the multi-modal task of referring expression recognition and the text-only tasks of natural language inference and paraphrase identification.

MCQLL meeting, 4/1 — Guillaume Rabusseau

The next meeting of the Montreal Computational and Quantitative Linguistics Lab will take place on Wednesday April 1st, at 1:00, via Zoom (meeting ID: 912 324 021). This week, Guillaume Rabusseau will present on “Spectral Learning of Weighted Automata and Connections with Recurrent Neural Networks and Tensor Networks”.
Abstract:
Structured objects such as strings, trees, and graphs are ubiquitous in data science, but learning functions defined over such objects can be a tedious task. Weighted finite automata (WFAs) and recurrent neural networks (RNNs) are two powerful and flexible classes of models which can efficiently represent such functions. In this talk, Guillaume will introduce WFAs and the spectral learning algorithm before presenting surprising connections between WFAs, tensor networks and recurrent neural networks.

Guillaume Rabusseau has been an assistant professor at Université de Montréal and at the Mila research institute since Fall 2018, and a Canada CIFAR AI (CCAI) chair holder since March 2019. His research interests lie at the intersection of theoretical computer science and machine learning, and his work revolves around exploring inter-connections between tensors and machine learning and developing efficient learning methods for structured data relying on linear and multilinear algebra.

MCQLL Meeting, 2/19 — Vanna Willerton

This week at MCQLL, Vanna Willerton will be discussing the (over)application of irregular inflection, and exploring how it can influence our understanding of morphological productivity. She will review existing studies, as well as a recent large-scale corpus study of child speech with Graham Adachi-Kriege, Shijie Wu, Ryan Cotterell, and Tim O’Donnell, and current work analyzing recent experimental results. As usual, we meet at 1:00 in room 117 of the McGill Linguistics building, and all are welcome!

MCQLL Meeting, 2/12 — Kushal Arora

The next MCQLL meeting will take place on Wednesday, February 12th, 1:00-2:30, in room 117. This week, Kushal Arora will be presenting the EMNLP tutorial by Joel Grus et al. from AllenAI, which discusses “effective methods and best practices for writing code for NLP research”. He may also share some of his own experience with writing code for research and using the AllenNLP framework.

MCQLL Meeting, 2/5 — Dzmitry Bahdanau

The next MCQLL meeting will take place on Wednesday, February 5th, 1:00-2:30, in room 117. This week, Dzmitry Bahdanau will present a recent paper from Google Brain: Keysers et al., 2020, “Measuring Compositional Generalization: A Comprehensive Method on Realistic Data”.

MCQLL Meeting, 1/15 — Alessandro Sordoni

MCQLL meets on Wednesdays at 1:00 in room 117. This week, Alessandro Sordoni will be discussing Natural Language Inference datasets such as MNLI and paraphrase corpora such as QQP. NLU models trained on those datasets usually become brittle when they are tested on examples that are still grammatically correct but slightly out-of-distribution (see the HANS and PAWS datasets). This talk presents preliminary results on how one can train state-of-the-art natural language understanding models on MNLI and QQP such that the resulting model is more robust when tested on OOD data. A useful starting point is this paper: https://arxiv.org/abs/1902.01007

MCQLL Meeting, 11/27 — Gemma Huang

This week at MCQLL, Gemma Huang will present her ongoing project on comparing phonotactic probability models.

Overview: Phonotactics is a branch of phonology that studies the restrictions on permissible combinations of phonemes. The adoption of words is governed by systematic intuitions about the likelihood of different combinations of sounds in a language. For example, between two hypothetical English words “blick” and “bnick”, “blick” is more likely to be accepted as an English word by a native speaker (Chomsky and Halle, 1965). Understanding such constraints on allowable sound sequences is crucial to understanding language productivity and second language acquisition. The focal point of the project is to explore and test phonotactic models from three classes: maximum entropy models, Bayesian models, and neural-network-based models. To compare these models, we will collect English speakers’ judgements on novel word forms.
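
To make the “blick” versus “bnick” contrast concrete, here is a minimal sketch of a phonotactic probability model. It is a plain bigram model with add-one smoothing, a simple baseline rather than any of the three model classes named above, and it rests on loud assumptions: the ten-word lexicon is invented for illustration, and letters stand in for phonemes.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; letters are used as stand-ins for phonemes.
LEXICON = ["black", "blink", "brick", "click", "slick",
           "snack", "knack", "block", "bleak", "stick"]

BOUNDARY = "#"  # marks the start and end of a word

def train_bigrams(words):
    """Count symbol bigrams, padding each word with boundary markers."""
    bigrams, contexts = Counter(), Counter()
    symbols = {BOUNDARY}
    for word in words:
        padded = BOUNDARY + word + BOUNDARY
        symbols.update(word)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, symbols

def log_prob(word, model):
    """Log probability of a word under the add-one-smoothed bigram model."""
    bigrams, contexts, symbols = model
    vocab_size = len(symbols)
    padded = BOUNDARY + word + BOUNDARY
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
        for prev, cur in zip(padded, padded[1:])
    )

model = train_bigrams(LEXICON)
blick = log_prob("blick", model)
bnick = log_prob("bnick", model)
print(blick > bnick)  # "bl" is attested in the toy lexicon, "bn" is not
```

Neither test word appears in the lexicon, so the difference in score comes entirely from the attested versus unattested sound sequences, which is the kind of graded judgement the project compares against speakers’ ratings.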

As usual, the meeting will be Wednesday from 14:30-16:00 in Room 117. A late lunch will be provided.
