
MCQLL, 9/23 — Emily Goodwin

This week at MCQLL (Wednesday 1:20-2:30), Emily Goodwin will present her ongoing work on systematicity in syntactic parsing. Abstract and bio are below. If you would like to join the mailing list and/or attend the meeting, please fill out this Google form as soon as possible.

ABSTRACT:
Recent work in semantic parsing, including novel datasets like SCAN (Lake and Baroni, 2018) and CFQ (Keysers et al., 2020), demonstrates that semantic parsers generalize well when tested on items highly similar to those in the training set, but struggle with syntactic structures that combine components of training items in novel ways. This indicates a lack of systematicity, the principle that individual words make similar contributions to the expressions they appear in, independently of surrounding context. Applying this principle to syntactic parsing, we show that similar problems plague state-of-the-art syntactic parsers, despite their human or near-human performance on randomly sampled test data. Moreover, generalization is especially poor on syntactic relations that are crucial for compositional semantics.
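For a concrete sense of the kind of split these benchmarks use, here is a minimal, purely illustrative sketch in the style of SCAN (the commands and actions below are made up, not taken from the actual dataset):

```python
# Illustrative SCAN-style split: every word in the test command appears in
# training, but the combination itself is never seen.
train = {
    "walk": "WALK",
    "walk twice": "WALK WALK",
    "jump": "JUMP",            # "jump" is seen, but never with a modifier
}
test = {
    "jump twice": "JUMP JUMP", # a systematic learner should generalize here
}

def is_novel_combination(command, train_commands):
    """True if every word of `command` occurs in training but the command itself does not."""
    seen_words = {w for c in train_commands for w in c.split()}
    return all(w in seen_words for w in command.split()) and command not in train_commands

for command in test:
    print(command, "-> novel combination:", is_novel_combination(command, train))
```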

BIO:
Emily is an M.A. student in the McGill Linguistics department, supervised by Profs. Timothy J. O’Donnell and Siva Reddy, and by Dzmitry Bahdanau of Element AI. She is interested in compositionality and systematic generalization in meaning representation.

MCQLL, 9/16 — Lightning Talks

As last week’s meeting was cancelled so that people could more easily participate in the Scholar Strike, this week’s MCQLL meeting (1:30pm on Wednesday, September 16th) will consist of the lightning talks by returning MCQLL lab members that would normally have taken place last week. This will serve as an introduction to the type of work done at MCQLL, as well as provide a space to ask questions about our research and the lab in general.

Please make sure to register here beforehand so that you can get the meeting link. If you already registered last week, there is no need to register again; just join with the link you received in your registration confirmation email.

MCQLL, 9/9 — Lightning Talks

At this week’s MCQLL meeting (1:30pm on Wednesday, September 9th), there will be a series of lightning talks by returning MCQLL lab members. This will serve as an introduction to the type of work done at MCQLL, as well as provide a space to ask questions about our research and the lab in general.

Please make sure to register here beforehand so that you can get the meeting link.

The (tentative) meeting agenda is as follows:

  1. Announcements
  2. Lightning Talks
    • Clarifying questions are fine here, but please hold all discussion questions until the Q & A session.
  3. Q & A Session, including:
    • Discussion questions relating to the talks
    • Questions relating to the lab in general

Please don’t hesitate to reach out if you have any questions, comments, or concerns.

MCQLL meeting, 6/3 – Timothy J O’Donnell 

This week at the Montreal Computational and Quantitative Linguistics Lab meeting, Timothy O’Donnell will be presenting his Meditations on Compositional Structure, to make up for last week’s postponement. This presentation attempts to synthesize several threads of work in a broader framework. We meet at 2:30 via Zoom (if you are not on the MCQLL mailing list, please contact Emily Goodwin at emily.goodwin@mail.mcgill.ca for the meeting link).

 

MCQLL meeting, 5/13 — Bing’er Jiang 

The next meeting of the Montreal Computational and Quantitative Linguistics Lab will take place on Wednesday May 13th, at 2:30, via Zoom. Bing’er will present on Modelling Perceptual Effects of Phonology with Automatic Speech Recognition Systems. If you would like to participate but are not on the MCQLL or computational linguistics emailing list, contact emily.goodwin@mail.mcgill.ca for the Zoom link.

MCQLL meeting, 5/6 — Jacob Hoover

The next meeting of the Montreal Computational and Quantitative Linguistics Lab will take place on Wednesday May 6th at 2:30, via Zoom. Jacob Hoover will present an ongoing project on compositionality and predictability.

For abstract and more information see the MCQLL lab page. If you would like to participate but are not on the MCQLL or computational linguistics emailing list, contact emily.goodwin@mail.mcgill.ca for the Zoom link.

MCQLL meeting, 4/29 — Koustuv Sinha

The next meeting of the Montreal Computational and Quantitative Linguistics Lab will take place on Wednesday April 29th, at 2:30, via Zoom. Koustuv Sinha will present “Learning an Unreferenced Metric for Online Dialogue Evaluation” (ACL 2020). For the abstract and more information, see the MCQLL lab page. If you would like to participate but are not on the MCQLL or computational linguistics mailing list, contact emily.goodwin@mail.mcgill.ca for the Zoom link.

MCQLL meeting, 4/22 – Spandana Gella

Spandana Gella, a research scientist at Amazon AI, will present on “Robust Natural Language Processing with Multi-task Learning” at this week’s Montreal Computational and Quantitative Linguistics Lab meeting. We are meeting Wednesday, April 22nd, at 2:00 via Zoom (to be added to the MCQLL listserv, please contact Jacob Hoover at jacob.hoover@mail.mcgill.ca).
Abstract:

In recent years, we have seen major improvements on various Natural Language Processing tasks. Despite these models’ human-level performance on benchmark datasets, recent studies have shown that they are vulnerable to adversarial examples: they rely on spurious correlations that hold for the majority of examples, suffer under distribution shift, and fail on atypical or challenging test sets. Recent work has shown that large pre-trained models improve robustness to spurious associations in the training data. We observe that the superior performance of large pre-trained language models comes from their better generalization from the minority of training examples that resemble the challenging sets. Our study shows that multi-task learning with the right auxiliary tasks improves accuracy on adversarial examples without hurting in-distribution performance. We show that this holds for the multi-modal task of Referring Expression Recognition and the text-only tasks of Natural Language Inference and Paraphrase Identification.

MCQLL meeting, 4/1 — Guillaume Rabusseau

The next meeting of the Montreal Computational and Quantitative Linguistics Lab will take place on Wednesday April 1st, at 1:00, via Zoom (meeting ID: 912 324 021). This week, Guillaume Rabusseau will present on “Spectral Learning of Weighted Automata and Connections with Recurrent Neural Networks and Tensor Networks”.
Abstract:
Structured objects such as strings, trees, and graphs are ubiquitous in data science, but learning functions defined over such objects can be a tedious task. Weighted finite automata (WFAs) and recurrent neural networks (RNNs) are two powerful and flexible classes of models which can efficiently represent such functions. In this talk, Guillaume will introduce WFAs and the spectral learning algorithm before presenting surprising connections between WFAs, tensor networks, and recurrent neural networks.

Bio: Guillaume Rabusseau has been an assistant professor at Université de Montréal and at the Mila research institute since Fall 2018, and a Canada CIFAR AI (CCAI) chair holder since March 2019. His research interests lie at the intersection of theoretical computer science and machine learning, and his work revolves around exploring inter-connections between tensors and machine learning and developing efficient learning methods for structured data relying on linear and multilinear algebra.
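As a rough illustration of the first object in the talk: a weighted finite automaton assigns a value to each string via an initial vector, one transition matrix per input symbol, and a final vector. A minimal sketch with made-up parameters (purely illustrative, not taken from the talk):

```python
import numpy as np

# A WFA with k states is given by an initial vector alpha, a transition
# matrix A_sigma per symbol sigma, and a final vector omega. The value of a
# string x1...xn is alpha^T A_x1 ... A_xn omega.
alpha = np.array([1.0, 0.0])                  # initial weights (made up)
omega = np.array([0.0, 1.0])                  # final weights (made up)
A = {
    "a": np.array([[0.5, 0.5], [0.0, 1.0]]),  # transition matrix for "a"
    "b": np.array([[1.0, 0.0], [0.2, 0.8]]),  # transition matrix for "b"
}

def wfa_value(word):
    """Compute the weight the WFA assigns to a string of symbols."""
    state = alpha
    for symbol in word:
        state = state @ A[symbol]
    return state @ omega

print(wfa_value("ab"))   # weight of the string "ab"
print(wfa_value("abb"))  # weight of the string "abb"
```

Spectral learning estimates such parameters from statistics over observed strings, while an RNN replaces the per-symbol linear maps with a nonlinear state update; the talk explores how the two relate.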

MCQLL Meeting, 2/19 — Vanna Willerton

This week at MCQLL, Vanna Willerton will be discussing the (over)application of irregular inflection, and exploring how it can influence our understanding of morphological productivity. She will review existing studies, a recent large-scale corpus study of child speech with Graham Adachi-Kriege, Shijie Wu, Ryan Cotterell, and Tim O’Donnell, and current work analyzing recent experimental results. As usual, we meet at 1:00 in room 117 of the McGill Linguistics building, and all are welcome!

MCQLL Meeting, 2/12 — Kushal Arora

The next MCQLL meeting will take place on Wednesday, February 12th, 1:00-2:30, in room 117. This week, Kushal Arora will be presenting the EMNLP tutorial by Joel Grus et al. from AllenAI on “effective methods and best practices for writing code for NLP research”. He may also add some of his own experiences with writing code for research and with the AllenNLP framework.

MCQLL Meeting, 2/5 — Dzmitry Bahdanau

The next MCQLL meeting will take place on Wednesday, February 5th, 1:00-2:30, in room 117. This week, Dzmitry Bahdanau will present a recent paper from Google Brain, Keysers et al., 2020, “Measuring Compositional Generalization: A Comprehensive Method on Realistic Data”.

MCQLL Meeting, 1/15 — Alessandro Sordoni

MCQLL meets on Wednesdays at 1:00 in room 117. This week, Alessandro Sordoni will be discussing Natural Language Inference datasets such as MNLI and paraphrase corpora such as QQP. NLU models trained on these datasets usually become brittle when tested on examples that are still grammatically correct but slightly out-of-distribution (see the HANS and PAWS datasets). This talk presents preliminary results on how one can train state-of-the-art natural language understanding models on MNLI and QQP such that the resulting model is more robust when tested on OOD data. A useful starting point is this paper: https://arxiv.org/abs/1902.01007

MCQLL Meeting, 11/27 — Gemma Huang

This week at MCQLL, Gemma Huang will present her ongoing project on comparing phonotactic probability models.

Overview: Phonotactics is the branch of phonology that studies restrictions on permissible combinations of phonemes. The adoption of words into a language is governed by systematic intuitions about the likelihood of different combinations of sounds. For example, between the two hypothetical English words “blick” and “bnick”, “blick” is more likely to be accepted as an English word by a native speaker (Chomsky and Halle, 1965). Understanding such constraints on allowable sound sequences is crucial to understanding language productivity and second language acquisition. The focal point of the project is to explore and test phonotactic models from three classes: maximum entropy models, Bayesian models, and neural-network-based models. To compare these models, we will collect English speakers’ judgements on novel word forms.
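To make “phonotactic probability” concrete, here is a minimal sketch of the simplest kind of baseline one might compare against: a smoothed bigram model over phoneme sequences trained on a toy word list. The lexicon, symbols, and smoothing constants below are invented for illustration and are not the project’s models.

```python
from collections import Counter
import math

# Toy training lexicon, written as sequences of rough phoneme symbols.
lexicon = [["b", "l", "i", "k"], ["b", "l", "a", "k"], ["s", "l", "i", "p"],
           ["n", "i", "k"], ["b", "a", "k"]]

# Count phoneme bigrams, with "#" marking word boundaries.
bigrams = Counter()
unigrams = Counter()
for word in lexicon:
    padded = ["#"] + word + ["#"]
    for prev, cur in zip(padded, padded[1:]):
        bigrams[(prev, cur)] += 1
        unigrams[prev] += 1

def log_prob(word, alpha=0.1, vocab_size=30):
    """Add-alpha smoothed bigram log-probability of a phoneme sequence."""
    padded = ["#"] + word + ["#"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab_size)
        total += math.log(p)
    return total

# "blick" scores higher than "bnick" because /bl/ is attested and /bn/ is not.
print(log_prob(["b", "l", "i", "k"]), log_prob(["b", "n", "i", "k"]))
```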

As usual, the meeting will be Wednesday from 14:30-16:00 in Room 117. A late lunch will be provided.

MCQLL Meeting, 11/6 — Yikang Shen and Zhouhan Lin

This week at MCQLL, Yikang Shen and Zhouhan Lin will present on previous and ongoing work.

Title: Ordered Neurons and syntactically supervised structural neural language models.

Abstract: In this talk, we will first present the ONLSTM model, a language model that learns syntactic distances through an extra master input gate and a master forget gate. In the second part of the talk, we will present a way of incorporating supervised syntactic trees into the neural language model, also through syntactic distances. Experimental results show that injecting supervised tree structure in this way helps the language model yield better results.
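For context, the central ingredient of the ONLSTM is the cumax activation (a cumulative softmax), which produces monotonically non-decreasing gate values so that higher-ranking neurons are overwritten less often than lower-ranking ones. Below is a minimal sketch of that activation and the two master gates; the shapes, inputs, and distance read-out are illustrative, not the authors’ code.

```python
import numpy as np

def cumax(x):
    """Cumulative softmax: a monotonically non-decreasing vector in [0, 1]."""
    e = np.exp(x - x.max())          # numerically stabilised softmax
    p = e / e.sum()
    return np.cumsum(p)

# Pre-activations for the two master gates (in the real model these come from
# learned affine maps of the current input and previous hidden state).
pre_forget = np.random.randn(8)
pre_input = np.random.randn(8)

master_forget = cumax(pre_forget)        # ~0 for low neurons, ~1 for high ones
master_input = 1.0 - cumax(pre_input)    # the mirror image, for writing

# One way to read a syntactic-distance-like quantity off the master forget
# gate: the expected number of dimensions that are being erased at this step.
distance_proxy = (1.0 - master_forget).sum()
print(master_forget.round(2), round(float(distance_proxy), 2))
```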

The meeting is Wednesday from 14:30-16:00 in Room 117. A late lunch will be provided.

MCQLL Meeting, 10/30 — Dima Bahdanau

This week at MCQLL, Dima Bahdanau presents recent work.

Title: CLOSURE: Assessing Systematic Generalization of CLEVR Models

Abstract: The CLEVR dataset of natural-looking questions about 3D-rendered scenes has recently received much attention from the research community. A number of models have been proposed for this task, many of which achieve very high accuracies of around 97-99%. In this work, we study how systematic the generalization of such models is, that is, to what extent they are capable of handling novel combinations of known linguistic constructs. To this end, we define 7 additional question families which test models’ understanding of similarity-based references (e.g., “the object that has the same size as …”) in novel contexts. Our experiments on the resulting CLOSURE benchmark show that state-of-the-art models often do not exhibit systematicity after being trained on CLEVR. Surprisingly, we find that the explicitly compositional Neural Module Network model also generalizes badly on CLOSURE, even when it has access to the ground-truth programs at test time. We improve the NMN’s systematic generalization by developing a novel Vector-NMN module architecture with vector-shaped inputs and outputs. Lastly, we investigate the extent to which few-shot transfer learning can help models that are pretrained on CLEVR to adapt to CLOSURE. Our few-shot learning experiments contrast the adaptation behavior of the models with intermediate discrete programs with that of the end-to-end continuous models.

The meeting is Wednesday from 14:30-16:00 in Room 117.

MCQLL Meeting, 10/23 – Kushal Arora

At the meeting of MCQLL this week, Kushal Arora will present his recent work with Aishik Chakraborty.

Title: Learning Lexical Subspaces in a Distributional Vector Space

Abstract: In this paper, we propose LexSub, a novel approach towards unifying lexical and distributional semantics. We inject knowledge about lexical-semantic relations into distributional word embeddings by defining subspaces of the distributional vector space in which a lexical relation should hold. Our framework can handle symmetric attract and repel relations (e.g., synonymy and antonymy, respectively), as well as asymmetric relations (e.g., hypernymy and meronymy). In a suite of intrinsic benchmarks, we show that our model outperforms previous post-hoc approaches on relatedness tasks and on hypernymy classification and detection, while being competitive on word similarity tasks. It also outperforms previous systems on extrinsic classification tasks that benefit from exploiting lexical relational cues. We perform a series of analyses to understand the behaviors of our model.
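To give a flavor of what “a subspace in which a relation should hold” might look like, here is a generic sketch of an attract/repel objective computed inside a learned projection. This illustrates the general idea only; it is not the LexSub objective, and the embeddings, projection, and margin are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, sub_dim, margin = 50, 10, 1.0

# Toy distributional embeddings (random stand-ins for pretrained vectors).
emb = {w: rng.normal(size=dim) for w in ["good", "great", "bad"]}

# One learned linear projection per lexical relation; here, a "synonymy" subspace.
P_syn = rng.normal(scale=0.1, size=(sub_dim, dim))

def subspace_dist(u, v, P):
    """Distance between two words after projecting into the relation's subspace."""
    return np.linalg.norm(P @ emb[u] - P @ emb[v])

# Attract term: synonyms should be close in the subspace.
attract = subspace_dist("good", "great", P_syn)
# Repel term: antonyms should be at least `margin` apart in the same subspace.
repel = max(0.0, margin - subspace_dist("good", "bad", P_syn))

print("illustrative subspace loss:", attract + repel)
```

In a full system one would learn the projections (and possibly fine-tune the embeddings) by minimizing such terms over many word pairs, one subspace per relation.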

This meeting will be held in room 117 at 14:30 on Wednesday.

MCQLL Meeting, 10/16 — Michaela Socolof

This week at MCQLL, Michaela Socolof will lead a discussion on the paper “A noisy-channel model of rational human sentence comprehension under uncertain input” by Roger Levy (abstract below). The setup will be somewhat different from the typical MCQLL presentations: Michaela will have some discussion topics prepared for the group, so please give the paper a read before the meeting to make the discussion more dynamic!

As usual the meeting is from 14:30-16:00 on Wednesday. A late lunch will be provided.

Abstract: Language comprehension, as with all other cases of the extraction of meaningful structure from perceptual input, takes place under noisy conditions. If human language comprehension is a rational process in the sense of making use of all available information sources, then we might expect uncertainty at the level of word-level input to affect sentence-level comprehension. However, nearly all contemporary models of sentence comprehension assume clean input, that is, that the input to the sentence-level comprehension mechanism is a perfectly formed, completely certain sequence of input tokens (words). This article presents a simple model of rational human sentence comprehension under noisy input, and uses the model to investigate some outstanding problems in the psycholinguistic literature for theories of rational human sentence comprehension. We argue that by explicitly accounting for input-level noise in sentence processing, our model provides solutions to these outstanding problems and broadens the scope of theories of human sentence comprehension as rational probabilistic inference.

Link to the paper: https://www.mit.edu/~rplevy/papers/levy-2008-emnlp.pdf
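The heart of the model is Bayesian inference over what the speaker intended, given a noisy percept: the posterior over intended sentences is proportional to the prior probability of each sentence times the probability that it would be perceived as the observed input. A minimal sketch with an invented two-sentence prior and a crude word-edit-distance noise model (not the paper’s actual noise model; the sentences are merely in the spirit of its examples):

```python
# Candidate intended sentences with a toy prior (probabilities are invented).
prior = {
    "the coach smiled at the player tossed the frisbee": 0.01,
    "the coach smiled as the player tossed the frisbee": 0.99,
}

def edit_distance(a, b):
    """Word-level Levenshtein distance."""
    a, b = a.split(), b.split()
    d = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)] for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return d[len(a)][len(b)]

def likelihood(observed, intended, noise=0.1):
    """Crude noise model: each word-level edit costs a factor of `noise`."""
    return noise ** edit_distance(observed, intended)

observed = "the coach smiled at the player tossed the frisbee"
posterior = {s: likelihood(observed, s) * p for s, p in prior.items()}
z = sum(posterior.values())
for s, p in posterior.items():
    print(f"{p / z:.3f}  {s}")
```

With these invented numbers, the higher-prior “as” reading wins even though the observed string literally contains “at”, which gives the flavor of the input-correcting inferences the paper discusses.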

MCQLL, 10/9 — Siva Reddy

At the meeting of MCQLL this week, Siva Reddy will discuss ongoing work on Measuring Stereotypical Bias in Pretrained Neural Network Models of Language.

A key ingredient behind the success of neural network models for language is pretrained representations: word embeddings, contextual embeddings, and pretrained architectures. Since pretrained representations are obtained from learning on massive text corpora, there is a danger that unwanted societal biases are reflected in these models. I will discuss ideas on how to assess these biases in popular pretrained language models.
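One common way to probe for such biases, sketched below purely for illustration (not necessarily the method Siva will present), is to compare the scores a masked language model assigns to contrasting completions of a template. This assumes the Hugging Face transformers package and a downloadable BERT checkpoint.

```python
# Generic bias probe: compare the scores a masked LM assigns to contrasting
# pronouns in an occupation template.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

template = "The nurse said that [MASK] would be back soon."
for prediction in fill(template, targets=["he", "she"]):
    print(prediction["token_str"], round(prediction["score"], 4))
```

A large, consistent gap between the two scores across many templates is one simple signal of stereotypical association in the pretrained model.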

This meeting will be in room 117 at 14:30 on Wednesday, October 9th.

MCQLL Meeting, 10/02 — Emi Baylor

At next week’s MCQLL meeting, Emi will present a research project investigating morphological productivity.

Emi will discuss morphological productivity by presenting on German plural nouns and what makes them uniquely suited to testing theories of productivity.

The meeting will be in room 117 at 14:30 on Wednesday.
