Bookmarks for 2024-05-26

5 bookmarks were saved this day.

16.

Index of Lambek's papers

www.math.mcgill.ca/barr/lambek/pdffiles

This site contains Lambek's recent papers, nearly all written in this millennium. They are mostly undated, so I have sorted them by subject matter, which mostly breaks up into linguistics, physics, and category theory. Most of the category theory was done for the linguistics. Some of the papers are not complete, since I got them from his typist. Some were published (and I have provided the citations) and some were not. Among the unpublished, most are undated and a couple seem to be duplicates, doubtless revisions. If anyone figures out which is the latest version, I will try to remove the duplicates. If I was able to download the actual publication, it appears as a normal citation. If the citation is preceded by "Appeared in:", I was not able to download the actual publication, but I compiled and included the source file for what may not be the final version. Dates of writing or of appearance are provided when discernible. All in all, it is a remarkable amount of writing for someone aged between 75 and 90. According to MathSciNet, he had 19 publications between 2001 and 2013.

15.

Pregroups and natural language processing

www.math.mcgill.ca/barr/lambek/pdffiles/Natlangproc.pdf

A pregroup is a partially ordered monoid endowed with two unary operations called left and right adjunction. Pregroups were recently introduced to help with natural language processing, as we illustrate here by looking at small fragments of three modern European languages. As it turns out, the apparently new algebraic concept of a pregroup had been around for some time in recreational mathematics, although not under this name.
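
For reference, the adjunction laws mentioned in the abstract can be written out explicitly. The sketch below also shows the standard illustrative type assignment from Lambek's work (n for noun phrases, s for sentences, n^r s n^l for a transitive verb); the example sentence is mine, not from this paper.

```latex
% Pregroup axioms: a partially ordered monoid (P, \cdot, 1, \le) in which every
% element a has a left adjoint a^{\ell} and a right adjoint a^{r} satisfying
\[
  a^{\ell} \cdot a \;\le\; 1 \;\le\; a \cdot a^{\ell},
  \qquad
  a \cdot a^{r} \;\le\; 1 \;\le\; a^{r} \cdot a .
\]
% Toy type reduction (illustrative assignment: n = noun phrase, s = sentence,
% transitive verb = n^{r} s n^{\ell}); the contractions n \cdot n^{r} \le 1 and
% n^{\ell} \cdot n \le 1 reduce the whole string to the sentence type s:
\[
  \underbrace{n}_{\text{John}} \cdot
  \underbrace{(n^{r}\, s\, n^{\ell})}_{\text{likes}} \cdot
  \underbrace{n}_{\text{Mary}}
  \;\le\; 1 \cdot s \cdot 1 \;=\; s .
\]
```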

14.

Logic and Grammar

doi.org/10.1007/s11225-012-9426-7

Grammar can be formulated as a kind of substructural propositional logic. In support of this claim, we survey bare Gentzen-style deductive systems and two kinds of non-commutative linear logic: intuitionistic and compact bilinear logic. We also glance at their categorical refinements.
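
As one standard example of the grammar-as-substructural-logic view (not taken from this paper), a simple sentence can be parsed as a sequent in the Lambek syntactic calculus, a non-commutative intuitionistic system; the type assignment below is the usual textbook one.

```latex
% Assign "John" the type n and the intransitive verb "works" the type n\s.
% The sentence is grammatical iff the sequent  n, n\s |- s  is derivable,
% which follows from the axioms n |- n and s |- s by the left rule for \ :
\[
  \frac{\; n \vdash n \qquad s \vdash s \;}
       {\; n,\; n\backslash s \;\vdash\; s \;}\;(\backslash L)
\]
```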

13.

Mathematical Foundations for a Compositional Distributional Model of Meaning

arxiv.org/pdf/1003.4394v1

We propose a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types, for which we rely on the algebra of Pregroups, introduced by Lambek. This mathematical framework enables us to compute the meaning of a well-typed sentence from the meanings of its constituents. Concretely, the type reductions of Pregroups are 'lifted' to morphisms in a category, a procedure that transforms meanings of constituents into a meaning of the (well-typed) whole. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner product can be used to compare meanings of arbitrary sentences, as it is for comparing the meanings of words in the distributional model. The mathematical structure we employ admits a purely diagrammatic calculus which exposes how the information flows between the words in a sentence in order to make up the meaning of the whole sentence. A variation of our 'categorical model' which involves constraining the scalars of the vector spaces to the semiring of Booleans results in a Montague-style Boolean-valued semantics.
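
As a rough illustration of the compositional step, here is a minimal NumPy sketch in which noun meanings are vectors, a transitive verb is an order-3 tensor, and the sentence meaning is obtained by contracting away the noun indices, mirroring the pregroup reductions. The dimensions, the made-up word representations, and the `sentence_meaning` helper are all illustrative assumptions, not the paper's actual model.

```python
import numpy as np

# Toy 2-dimensional noun space -- purely illustrative values.
john = np.array([1.0, 0.0])
mary = np.array([0.0, 1.0])

# A transitive verb of pregroup type n^r s n^l is modelled as a tensor in
# N (x) S (x) N; here the sentence space S is also 2-dimensional.
likes = np.random.default_rng(0).normal(size=(2, 2, 2))

def sentence_meaning(subj, verb, obj):
    """Contract the verb tensor with the subject and object vectors.

    This mirrors the pregroup contractions n * n^r -> 1 and n^l * n -> 1:
    both noun indices of the verb tensor are summed out, leaving a single
    vector in the sentence space S.
    """
    return np.einsum('i,isj,j->s', subj, verb, obj)

def cosine(u, v):
    """Inner-product comparison of two sentence meanings."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

s1 = sentence_meaning(john, likes, mary)  # "John likes Mary"
s2 = sentence_meaning(mary, likes, john)  # "Mary likes John"
print(cosine(s1, s2))  # both live in one space, so they are directly comparable
```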

12.

A* CCG Parsing with a Supertag and Dependency Factored Model

arxiv.org/abs/1704.06936v1

Supertagging in lexicalized grammar parsing is known as almost parsing (Bangalore and Joshi, 1999), in that each supertag is syntactically informative and most ambiguities are resolved once a correct supertag is assigned to every word. Recently this property has been effectively exploited in A* Combinatory Categorial Grammar (CCG; Steedman (2000)) parsing (Lewis and Steedman, 2014; Lewis et al., 2016), in which the probability of a CCG tree y on a sentence x of length N is the product of the probabilities of its supertags (categories) c_i (a locally factored model):
P(y \mid x) = \prod_{i \in [1, N]} P_{\mathrm{tag}}(c_i \mid x).  (1)
By not modeling every combinatory rule in a derivation, this formulation enables us to employ efficient A* search (see Section 2), which finds the most probable supertag sequence that can build a well-formed CCG tree.
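
To make the factorization concrete, here is a small Python sketch of how a partial A* item could be scored under Eq. 1: the inside score sums the chosen supertags' log-probabilities, and the outside estimate upper-bounds the best possible completion. The log-probability table, the `outside_estimate` bound, and the `priority` function are illustrative assumptions, not the authors' implementation.

```python
import math

# log P_tag(c_i | x) for each word position, e.g. from a supertagger;
# the categories and numbers here are made up for illustration.
log_p_tag = [
    {'NP': math.log(0.7), 'N': math.log(0.3)},           # word 1
    {'(S\\NP)/NP': math.log(0.9), 'NP': math.log(0.1)},  # word 2
    {'NP': math.log(0.8), 'N': math.log(0.2)},           # word 3
]

def inside_score(assigned):
    """Log-probability of the supertags chosen so far. Eq. 1 is a product,
    so in log space it decomposes into a sum over words."""
    return sum(log_p_tag[i][c] for i, c in enumerate(assigned))

def outside_estimate(next_index):
    """A* heuristic: for each word not yet covered, take its best possible
    supertag log-probability. This never underestimates the true completion
    score, so best-first search remains optimal."""
    return sum(max(scores.values()) for scores in log_p_tag[next_index:])

def priority(assigned):
    """Agenda priority of a partial item covering the first len(assigned) words."""
    return inside_score(assigned) + outside_estimate(len(assigned))

print(priority(['NP']))                       # partial item covering word 1
print(priority(['NP', '(S\\NP)/NP', 'NP']))   # a complete supertag sequence
```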
Although much ambiguity is resolved by this supertagging, some ambiguity still remains. Figure 1 shows an example, where the two CCG parses are derived from the same supertags. Lewis et al.'s approach to this problem is to resort to a deterministic rule. For example, Lewis et al. (2016) employ the attach-low heuristic, which is motivated by the right-branching tendency of English and always prioritizes (b) for this type of ambiguity. Though it empirically works well for English, an obvious limitation is that it does not always derive the correct parse; consider the phrase "a house in Paris with a garden", for which the correct parse has the structure corresponding to (a) instead.
In this paper, we provide a way to resolve these remaining ambiguities under the locally factored model, by explicitly modeling bilexical dependencies as shown in Figure 1. Our joint model is still locally factored, so that an efficient A* search can be applied. The key idea is to predict the head of every word independently, as in Eq. 1, with a strong unigram model, for which we utilize the scoring model from recent successful graph-based dependency parsing on LSTMs (Kiperwasser and Goldberg, 2016; Dozat and Manning, 2016). Specifically, we extend the bi-directional LSTM (bi-LSTM) architecture of Lewis et al. (2016), which predicts the supertag of each word, to also predict the head of each word at the same time with a bilinear transformation.
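
A rough NumPy sketch of the bilinear head-prediction step described above: each word's bi-LSTM state is scored against every candidate head through a bilinear form, and the head distribution is a softmax over those scores. The variable names, dimensions, and random stand-in states are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

n_words, hidden = 5, 8                    # toy sentence length and state size
H = rng.normal(size=(n_words, hidden))    # stand-in for bi-LSTM outputs h_1..h_N
W = rng.normal(size=(hidden, hidden))     # bilinear transformation parameters

# scores[i, j] ~ plausibility that word j is the head of word i,
# computed as h_i^T W h_j for every pair of words.
scores = H @ W @ H.T

def head_distribution(i):
    """Unigram head model: P(head of word i = j | x), a softmax over all words.
    Each word's head is predicted independently, which keeps the joint model
    locally factored so that A* search still applies."""
    row = scores[i]
    e = np.exp(row - row.max())
    return e / e.sum()

print(head_distribution(2))   # distribution over candidate heads of word 3
```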
The importance of modeling structures beyond supertags is demonstrated by the performance gain of Lee et al. (2016), who add a recursive component to the model of Eq. 1. Unfortunately, this formulation loses the efficiency of the original one, since it needs to compute a recursive neural network every time it searches for a new node. Our model does not resort to recursive networks while still modeling tree structures via dependencies.
We also extend the tri-training method of Lewis et al. (2016) to learn our model with dependencies from unlabeled data. On English CCGbank test data, our model with this technique achieves 88.8% labeled and 94.0% unlabeled F1, which are the best scores reported so far.
Besides English, we provide experiments on Japanese CCG parsing. Japanese employs freer word order governed by case markers, so a deterministic rule such as the attach-low method may not work well. We show that this is actually the case: our method outperforms a simple application of Lewis et al. (2016) by a large margin, 10.0 points in terms of clause dependency accuracy.