From clark.3664 at buckeyemail.osu.edu  Thu Feb  3 17:49:37 2022
From: clark.3664 at buckeyemail.osu.edu (Clark, Christian)
Date: Thu, 3 Feb 2022 22:49:37 +0000
Subject: [CaCL] Reading for 2/10
Message-ID:

Hi CaCLers,

For our next meeting, 2/10, we will discuss "One model for the learning of language" by Yang and Piantadosi (2022).

Link: https://www.pnas.org/content/119/5/e2021865119

Zoom link: https://osu.zoom.us/j/95536921111?pwd=TG9YdVZ0Wk45R2hCdHhTYk5ubkhIQT09

Abstract: A major goal of linguistics and cognitive science is to understand what class of learning systems can acquire natural language. Until recently, the computational requirements of language have been used to argue that learning is impossible without a highly constrained hypothesis space. Here, we describe a learning system that is maximally unconstrained, operating over the space of all computations, and is able to acquire many of the key structures present in natural language from positive evidence alone. We demonstrate this by providing the same learning model with data from 74 distinct formal languages which have been argued to capture key features of language, have been studied in experimental work, or come from an interesting complexity class. The model is able to successfully induce the latent system generating the observed strings from small amounts of evidence in almost all cases, including for regular (e.g., a^n, (ab)^n, and {a, b}+), context-free (e.g., a^n b^n, a^n b^(n+m), and xx^R), and context-sensitive (e.g., a^n b^n c^n, a^n b^m c^n d^m, and xx) languages, as well as for many languages studied in learning experiments. These results show that relatively small amounts of positive evidence can support learning of rich classes of generative computations over structures. The model provides an idealized learning setup upon which additional cognitive constraints and biases can be formalized.
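(Not from the paper, but in case it helps to make the formal-language classes in the abstract concrete before we meet: below is a minimal Python sketch of membership checkers for two of the example languages, a^n b^n and a^n b^n c^n, plus a toy generator of the kind of positive-only evidence the learner receives. All function names are invented for illustration.)

import random

def is_anbn(s):
    # a^n b^n: n a's followed by exactly n b's, for some n >= 1 (context-free).
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n

def is_anbncn(s):
    # a^n b^n c^n: equal-length runs of a's, b's, and c's, n >= 1 (context-sensitive).
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n

def sample_positive(language, k=5, max_n=4, seed=0):
    # Positive-only evidence: k strings drawn from the language, no negative examples.
    rng = random.Random(seed)
    samples = []
    for _ in range(k):
        n = rng.randint(1, max_n)
        if language == "anbn":
            samples.append("a" * n + "b" * n)
        elif language == "anbncn":
            samples.append("a" * n + "b" * n + "c" * n)
        else:
            raise ValueError("unknown language: " + language)
    return samples

if __name__ == "__main__":
    print(sample_positive("anbn"))                  # five strings from a^n b^n
    print(is_anbn("aaabbb"), is_anbn("aabbb"))      # True False
    print(is_anbncn("aabbcc"), is_anbncn("aabbc"))  # True False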
----
Christian Clark
Ph.D. Student
Department of Linguistics
The Ohio State University
https://christian-clark.github.io/

From cheung.179 at buckeyemail.osu.edu  Thu Feb 10 22:55:49 2022
From: cheung.179 at buckeyemail.osu.edu (Cheung, Willy)
Date: Fri, 11 Feb 2022 03:55:49 +0000
Subject: [CaCL] CaCL 2/17
Message-ID:

Hi CaCLers,

For our next meeting, 2/17, we will discuss Elazar et al. (2021), "Measuring and Improving Consistency in Pretrained Language Models."

Link: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00410/107384/Measuring-and-Improving-Consistency-in-Pretrained

Zoom link: https://osu.zoom.us/j/95536921111?pwd=TG9YdVZ0Wk45R2hCdHhTYk5ubkhIQT09

Abstract: Consistency of a model, that is, the invariance of its behavior under meaning-preserving alternations in its input, is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel, a high-quality resource of cloze-style query English paraphrases. It contains a total of 328 paraphrases for 38 relations. Using ParaRel, we show that the consistency of all PLMs we experiment with is poor, though with high variance between relations. Our analysis of the representational spaces of PLMs suggests that they have a poor structure and are currently not suitable for representing knowledge robustly. Finally, we propose a method for improving model consistency and experimentally demonstrate its effectiveness.

From oh.531 at buckeyemail.osu.edu  Sat Feb 19 17:29:28 2022
From: oh.531 at buckeyemail.osu.edu (Oh, Byung-Doh)
Date: Sat, 19 Feb 2022 22:29:28 +0000
Subject: [CaCL] 2/24: Grammar-Based Grounded Lexicon Learning
Message-ID:

Hello everyone,

Next week, we'll be discussing the following paper (apologies for the late email):

Grammar-Based Grounded Lexicon Learning
https://proceedings.neurips.cc/paper/2021/file/4158f6d19559955bae372bb00f6204e4-Paper.pdf

We present Grammar-Based Grounded Lexicon Learning (G2L2), a lexicalist approach toward learning a compositional and grounded meaning representation of language from grounded data, such as paired images and texts. At the core of G2L2 is a collection of lexicon entries, which map each word to a tuple of a syntactic type and a neuro-symbolic semantic program. For example, the word "shiny" has a syntactic type of adjective; its neuro-symbolic semantic program has the symbolic form λx.filter(x, SHINY), where the concept SHINY is associated with a neural network embedding, which will be used to classify shiny objects. Given an input sentence, G2L2 first looks up the lexicon entries associated with each token. It then derives the meaning of the sentence as an executable neuro-symbolic program by composing lexical meanings based on syntax. The recovered meaning programs can be executed on grounded inputs. To facilitate learning in an exponentially growing compositional space, we introduce a joint parsing and expected execution algorithm, which does local marginalization over derivations to reduce the training time. We evaluate G2L2 on two domains: visual reasoning and language-driven navigation. Results show that G2L2 can generalize from small amounts of data to novel compositions of words.
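(Not from the paper or its released code, just a rough Python sketch to make the lexicon-entry idea concrete: each word maps to a syntactic type plus an executable program, and the learned neural concept embeddings are replaced with hand-set toy vectors. All names and values below are invented for illustration.)

from dataclasses import dataclass
from typing import Callable, List

import numpy as np

@dataclass
class Object:
    name: str
    features: np.ndarray  # stand-in for a visual feature vector

@dataclass
class LexiconEntry:
    word: str
    syntactic_type: str  # e.g. "adjective"
    program: Callable[[List[Object]], List[Object]]  # executable meaning

def make_filter(concept_embedding: np.ndarray, threshold: float = 0.0):
    # Symbolic form: lambda x. filter(x, CONCEPT); the "neural" part is the
    # embedding used to score each object.
    def filter_fn(objects: List[Object]) -> List[Object]:
        return [o for o in objects
                if float(o.features @ concept_embedding) > threshold]
    return filter_fn

# Toy concept embeddings (in the paper these would be learned from grounded data).
SHINY = np.array([1.0, 0.0])
CUBE = np.array([0.0, 1.0])

lexicon = {
    "shiny": LexiconEntry("shiny", "adjective", make_filter(SHINY)),
    "cube": LexiconEntry("cube", "noun", make_filter(CUBE)),
}

scene = [
    Object("shiny_cube", np.array([0.9, 0.8])),
    Object("matte_ball", np.array([-0.5, -0.7])),
]

# "shiny cube": compose the two lexical meanings (adjective applied to noun).
shiny_cubes = lexicon["shiny"].program(lexicon["cube"].program(scene))
print([o.name for o in shiny_cubes])  # ['shiny_cube']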
Best,
Byung-Doh

=================
Byung-Doh Oh (he/him/his)
Ph.D. Student
Department of Linguistics
The Ohio State University

From clark.3664 at buckeyemail.osu.edu  Thu Feb 24 17:11:49 2022
From: clark.3664 at buckeyemail.osu.edu (Clark, Christian)
Date: Thu, 24 Feb 2022 22:11:49 +0000
Subject: [CaCL] Reading for 3/3
Message-ID:

Hi CaCLers,

Next Thursday we will discuss "Joint Universal Syntactic and Semantic Parsing" by Stengel-Eskin et al. (2021).

Paper link: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00396/106796/Joint-Universal-Syntactic-and-Semantic-Parsing

Zoom link: https://osu.zoom.us/j/95536921111?pwd=TG9YdVZ0Wk45R2hCdHhTYk5ubkhIQT09

Abstract: While numerous attempts have been made to jointly parse syntax and semantics, high performance in one domain typically comes at the price of performance in the other. This trade-off contradicts the large body of research focusing on the rich interactions at the syntax-semantics interface. We explore multiple model architectures that allow us to exploit the rich syntactic and semantic annotations contained in the Universal Decompositional Semantics (UDS) dataset, jointly parsing Universal Dependencies and UDS to obtain state-of-the-art results in both formalisms. We analyze the behavior of a joint model of syntax and semantics, finding patterns supported by linguistic theory at the syntax-semantics interface. We then investigate to what degree joint modeling generalizes to a multilingual setting, where we find similar trends across 8 languages.

----
Christian Clark
Ph.D. Student
Department of Linguistics
The Ohio State University
https://christian-clark.github.io/