Andras Kornai | Budapest University of Technology and Economics
Papers by Andras Kornai
Cognitive Technologies, Dec 7, 2022
Until this point, we concentrated on the lexicon, conceived of as the repository of shared linguistic information. In 8.1 we take on the problem of integrating real-world knowledge, nowadays typically stored in knowledge graphs as billions of RDF triples, with linguistic knowledge, stored in a much smaller dictionary, typically compressible to a few megabytes. We present proper names as point vectors (rather than the polytopes we use for common nouns and most other lexical entries), and introduce the notion of content continuations, algorithms that extend lexical entries to more detailed hypergraphs that can refer to technical nodes, such as Date, FloatingPointNumber, or Obligation (see 9.1), that are missing from the core lexicon. In classical Information Extraction, our goal is to abstract the triples from running text, and a great deal of effort is directed toward database population, finding new edges in a knowledge graph. After a brief discussion of this task, in 8.2 we deal with the inverse problem: given that we already have real-world knowledge, in fact orders of magnitude more than we have lexical knowledge, how can we bring it to bear on the acquisition problem? As we shall see, there are some striking successes involving not just one-shot but also zero-shot learning that rely on dynamic embeddings instrumentally. In 8.3 we turn to dynamic embeddings. We briefly outline the four major ideas that, taken together, define modern dynamic embeddings: the use of vectors, the use of subword units, the use of neural networks, and the use of attention, linking the latter to the idea of the representation space we introduced in 2.3. We propose a semi-dynamic embedding, DilBERT, which occupies a middle ground between fully static and fully dynamic embeddings, and enables careful study of the representations learned while sacrificing very little of the proven usefulness of dynamic embeddings.
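To make the geometric claim above concrete, here is a minimal Python sketch of the point-versus-polytope distinction: a common noun is modeled as a convex region cut out by halfspace constraints, and a proper name as a single point vector tested for membership. The matrices and the example words are hypothetical illustrations, not the book's actual lexicon.

import numpy as np

class Polytope:
    """A convex region {x : Ax <= b}, one halfspace per row of A."""
    def __init__(self, A, b):
        self.A = np.asarray(A, dtype=float)
        self.b = np.asarray(b, dtype=float)

    def contains(self, x, eps=1e-9):
        # A point vector falls under the concept iff it satisfies
        # every halfspace constraint (up to numerical tolerance).
        return bool(np.all(self.A @ np.asarray(x, dtype=float) <= self.b + eps))

# Hypothetical 2-d toy: 'city' as the unit square, 'Paris' as a point.
city = Polytope(A=[[1, 0], [-1, 0], [0, 1], [0, -1]], b=[1, 1, 1, 1])
paris = np.array([0.3, -0.2])
print(city.contains(paris))  # True: the name's point lies in the noun's region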
CSLI Publications eBooks, 2018
After a brief introduction to algebras, in 2.1 we begin with linear spaces (LSs) and Boolean algebras (BAs). In 2.2 we cover the basic notions of universal algebra, building up to, but not including, Birkhoff’s Theorem. Ultrafilters are introduced in 2.3. In 2.4 we turn to the propositional calculus, and the (lower) predicate calculus is discussed in 2.5. The elements of proof theory are sketched in 2.6, and some preliminaries from multivariate statistics (which are, for the most part, just linear algebra in new terminological garb) are discussed in 2.7. The material is selected for its utility in later chapters, rather than for internal cohesion, and is largely orthogonal to the standard logic prerequisites to compositional semantics covered, for example, in the two volumes of Gamut (1991). A first course in linear algebra (but not in multivariate statistics) is assumed.
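As a taste of the material in 2.3, the standard definition of an ultrafilter can be stated in a few lines; the notation below follows common usage and is not necessarily the book's:

\begin{definition}[Ultrafilter]
Let $B$ be a Boolean algebra with operations $\wedge, \vee, \neg$ and constants $0, 1$. A subset $U \subseteq B$ is an \emph{ultrafilter} iff
(i) $0 \notin U$;
(ii) if $a, b \in U$ then $a \wedge b \in U$;
(iii) if $a \in U$ and $a \leq b$ then $b \in U$;
(iv) for every $a \in B$, exactly one of $a$ and $\neg a$ belongs to $U$.
\end{definition}

For example, on the power-set algebra of a set $X$, the subsets containing a fixed point $x \in X$ form a (principal) ultrafilter.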
The analysis of geographic references in natural language text involves, at least conceptually, four distinct stages. Of course, implementations may vary greatly in how these stages are interleaved. The first conceptual stage is geographic entity reference detection: strings such as New York, the Amazon delta, LaGuardia, the San Diego-Tijuana border, [the] Brooklyn Bridge, a mile from downtown Manhattan, etc. are identified in the text (Rauch et al.). Second, contextual information gathering may help identify the type and approximate location of geographic entities: LaGuardia Airport vs. LaGuardia Community College, the town of Manhattan (population 44,831), etc. (Manov et al., Bilhaut et al.). Third is the actual disambiguation of the entity with respect to both type (New York City vs. New York State) and location (Orange County, California vs. Orange County, Florida) (Leidner et al., Waldinger et al., Li et al.).
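The stages listed above suggest an obvious pipeline shape. The following Python sketch shows one way to wire the first three together; the gazetteer entries, the context window, and the matching rule are hypothetical stand-ins, not the systems cited above.

import re

# Hypothetical gazetteer: surface name -> list of (type, lat, lon) readings.
GAZETTEER = {
    "LaGuardia": [("airport", 40.78, -73.87), ("college", 40.74, -73.94)],
    "Orange County": [("county", 33.70, -117.80), ("county", 28.50, -81.40)],
}

def detect(text):
    # Stage 1: geographic entity reference detection.
    return [m.group() for name in GAZETTEER
            for m in re.finditer(re.escape(name), text)]

def gather_context(text, name, window=30):
    # Stage 2: contextual information gathering around the mention.
    i = text.find(name)
    return text[max(0, i - window):i + len(name) + window].lower()

def disambiguate(name, context):
    # Stage 3: pick the reading whose type is echoed in the context,
    # falling back to the first (default) reading otherwise.
    for reading in GAZETTEER[name]:
        if reading[0] in context:
            return reading
    return GAZETTEER[name][0]

text = "Flights out of LaGuardia airport were delayed by fog."
for name in detect(text):
    print(name, disambiguate(name, gather_context(text, name)))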
Advanced Information and Knowledge Processing, 2008
MATHEMATICAL LINGUISTICS is the study of mathematical structures and methods that are of importance to linguistics. As in other branches of applied mathematics, the influence of the empirical subject matter is somewhat indirect: theorems are often proved more for their inherent mathematical value than for their applicability. Nevertheless, the internal organization of linguistics remains the best guide for understanding the internal subdivisions of mathematical linguistics, and we will survey the field following the traditional division of linguistics into → Phonetics, → Phonology, → Morphology, → Syntax, and → Semantics, looking at other branches of linguistics such as → Sociolinguistics or → Language Acquisition only to the extent that these have developed their own mathematical methods.
The notion of modality is almost inextricably intertwined with metaphysics, some kind of theory of what is real, what exists, and why (a theory of 'first causes'). At the center of the commonsensical theory is the real world, but the idea is that there exist, or at least there can exist, other worlds. This idea is most clearly supported by the commonsense notion that the world existed yesterday and will exist tomorrow, even if it will be slightly different from what it is like today. In 3.2 we already discussed that the current world V_n requires two modal versions: V_b for the past and V_a for the future, and in 6.1 we will considerably refine this idea by describing a nonstandard theory of the standard temporal modalities. A central question of metaphysics is the existence of the things that are co-present in all worlds, the things that do not change. Are these collected in another world? Are there other worlds to begin with, and if there are, in what sense are they real? In 6.2 we use the same technique to introduce an ideal world V_d where rules are kept, and investigate the real world in relation to this. In 6.3 we use an even simpler technique to bring epistemic modality into scope, and in 6.4 we reanalyze defaults.

6.1 Tense and aspect

In 3.2 we introduced the naive theory of time, and described how it requires at least one, and possibly two, somewhat imperfect copies of V_n to explicate word meaning. When we say that statements about these subworlds, and especially statements that involve more than one of these, have modal import, we rely on the broad family of theories known collectively as modal logic. (For a terse discussion, see S19:3.7; for a book-length one, see Blackburn, de Rijke, and Venema, 2001.)
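The world inventory just described lends itself to a toy Kripke-style rendering. In the Python sketch below, the facts assigned to V_n, V_b, V_a, and V_d and the accessibility relation are invented for illustration; only the four-world inventory comes from the text.

WORLDS = {
    "V_b": {"raining"},            # the past version of the current world
    "V_n": {"raining", "cloudy"},  # the current world
    "V_a": {"cloudy"},             # the future version
    "V_d": {"rules_kept"},         # the ideal world where rules are kept
}
ACCESS = {"V_n": ["V_b", "V_a", "V_d"]}  # which worlds V_n can 'see'

def necessarily(p, w="V_n"):
    # Box p: p holds in every world accessible from w.
    return all(p in WORLDS[v] for v in ACCESS.get(w, []))

def possibly(p, w="V_n"):
    # Diamond p: p holds in some world accessible from w.
    return any(p in WORLDS[v] for v in ACCESS.get(w, []))

print(possibly("raining"))     # True: it was raining in the past world V_b
print(necessarily("raining"))  # False: fails in V_a and V_d

Statements that mention more than one subworld (e.g. 'it was raining but will clear up') are exactly the ones this modal machinery is designed for.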
We owe the recognition of a deep connection between time, space, and gravity to the 20th century, but people have used language to speak about spatial and temporal matters long before the development of Euclidean geometry, let alone general relativity. Throughout this book, we approach problems through language use, in search of a naive theory that can be reasonably assumed to underlie human linguistic competence. Since such a theory predates all scientific advances, there is a great deal of temptation to endow it with some kind of deep mystical significance: if this is what humans are endowed with, this must be the 'true' theory of the domain. Here we not only resist this temptation (in fact we consider the whole idea of linguistics and cognitive science making a contribution e.g. to quantum gravity faintly ridiculous), but we will also steer clear of any attempt to bridge the gap between the naive and the scientific theory. The considerable difference between the two will no doubt have explanatory power when it comes to understanding, and dealing with, the difficulties that students routinely encounter when they try to learn the more sophisticated theories, but we leave this rich, if somewhat anecdotal, field for future study. In 3.1 we begin with the naive theory of space, a crude version of 3D Euclidean geometry, and in 3.2 we deal with time. The two theories are connected by the use of similar proximities (near/far), similar ego-centered encoding (here/there, before/now/later), and similar use of anaphora (Partee, 1984), but there are no field equations connecting the two, not even in vacuum. The shared underpinnings, in particular the use of indexicals, are discussed in 3.3. Finally, the naive theory of numbers and measurement is discussed in 3.4.