Ross W Gayler | Independent Researcher

Papers by Ross W Gayler

Research paper thumbnail of Lessons from a challenge on forecasting epileptic seizures from non-cerebral signals

Nature Machine Intelligence, Feb 13, 2024

The “My Seizure Gauge” competition explored the challenge of forecasting epileptic seizures using non-invasive wearable devices without an electroencephalogram. The organizers and the winning team reflect on their experiences.

Research paper thumbnail of ochRe - Australia themed colour palettes

Research paper thumbnail of rgayler/scorecal_CSCC_2019: v1.2 (with live execution of the notebook)

This is the R notebook and R project that generates the presentation, as given at Credit Scoring & Credit Control XVI, Edinburgh, UK, on 2019-08-30. A Docker image has been generated from this release of the project. It can be executed in RStudio running on a free cloud instance provided by mybinder.org. All user interaction is via a web browser, so the user can experiment with the code without needing to install any software locally. The project notebook can be executed by following all the instructions below, or by opening the "launch binder" link in a new browser tab and then following steps 2 and 3 of the instructions. INSTRUCTIONS 1. Open https://mybinder.org/v2/zenodo/10.5281/zenodo.3402938?urlpath=rstudio/ in a web browser (or click the "launch binder" link instead). 2. Once the RStudio instance is running, open the file scorecal_CSCC_2019.Rmd by clicking on the filename in the Files tab of the bottom-right pane. 3. Click the Knit button at the top of the scorecal_CSCC_2019.Rmd tab of the code editor (top-left) pane. This will execute the notebook and generate the presentation slides.

Research paper thumbnail of Compositional connectionism in cognitive science II: the localist/distributed dimension

Connection Science, Jun 1, 2011

Research paper thumbnail of Resilient Identity Crime Detection

IEEE Transactions on Knowledge and Data Engineering, Mar 1, 2012

Identity crime is well known, prevalent, and costly, and credit application fraud is a specific case of identity crime. The existing non-data-mining detection systems (business rules, scorecards, and known-fraud matching) have limitations. To address these limitations and combat identity crime in real time, this paper proposes a new multi-layered detection system complemented with two additional layers: Communal Detection (CD) and Spike Detection (SD). CD finds real social relationships to reduce the suspicion score and is tamper-resistant to synthetic social relationships; it is a whitelist-oriented approach over a fixed set of attributes. SD finds spikes in duplicates to increase the suspicion score and is probe-resistant for attributes; it is an attribute-oriented approach over a variable-size set of attributes. Together, CD and SD can detect more types of attacks, better account for changing legal behaviour, and remove redundant attributes. Experiments were carried out on CD and SD with several million real credit applications. The results support the hypothesis that successful credit application fraud patterns are sudden and exhibit sharp spikes in duplicates. Although this research is specific to credit application fraud detection, the concepts of resilience, adaptivity, and data quality discussed in the paper are general to the design, implementation, and evaluation of all detection systems.
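The "sharp spikes in duplicates" idea can be illustrated with a toy sketch. This is not the paper's SD algorithm; the sliding window, running-mean baseline, and threshold below are arbitrary assumptions chosen for illustration:

```python
from collections import deque

def spike_scores(values, window=5, threshold=2.0):
    """Toy spike detector (illustrative only, not the paper's SD layer).
    For each incoming attribute value, count its duplicates in a sliding
    window and flag a spike when that count exceeds `threshold` times the
    running mean of past duplicate counts."""
    recent = deque(maxlen=window)   # sliding window of recent values
    history = []                    # past duplicate counts (the baseline)
    scores = []
    for v in values:
        recent.append(v)
        dup = recent.count(v)       # duplicates of v in the current window
        baseline = sum(history) / len(history) if history else 1.0
        scores.append(1.0 if dup > threshold * baseline else 0.0)
        history.append(dup)
    return scores

# A sudden burst of duplicate applications triggers the spike flag.
flags = spike_scores(["a", "b", "c", "d", "a", "x", "x", "x", "x", "x"])
```

On this stream the flag stays at 0.0 through the varied values and flips to 1.0 only once the run of "x" values dominates the window.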

Research paper thumbnail of Compositional Connectionism in Cognitive Science

Research paper thumbnail of Compositional Connectionism in Cognitive Science. Papers from the AAAI Fall Symposium, Technical Report FS-04-03

Research paper thumbnail of Dynamic Sorted Neighborhood Indexing for Real-Time Entity Resolution

Journal of Data and Information Quality, Oct 23, 2015

Real-time Entity Resolution (ER) is the process of matching query records in subsecond time with records in a database that represent the same real-world entity. Indexing techniques are generally used to efficiently extract from the database a set of candidate records that are similar to a query record and that are to be compared with the query record in more detail. The sorted neighborhood indexing method, which sorts a database and compares records within a sliding window, has been successfully used for ER of large static databases. However, because it is based on static sorted arrays and is designed for batch ER that resolves all records in a database rather than those relating to a single query record, this technique is not suitable for real-time ER on dynamic databases that are constantly updated. We propose a tree-based technique that facilitates dynamic indexing based on the sorted neighborhood method, which can be used for real-time ER, and investigate both static and adaptive window approaches. We propose an approach to reduce query matching times by precalculating the similarities between attribute values stored in neighboring tree nodes. We also propose a multitree solution where different sorting keys are used to reduce the effects of errors and variations in attribute values on matching quality by building several distinct index trees. We experimentally evaluate our proposed techniques on large real datasets, as well as on synthetic data with different data quality characteristics. Our results show that as the index grows, no appreciable increase occurs in either record insertion or query times, and that using multiple trees gives noticeable improvements in matching quality with only a small increase in query time. Compared to earlier indexing techniques for real-time ER, our approach achieves significantly reduced indexing and query matching times while maintaining high matching accuracy.
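The dynamic sorted-neighborhood idea can be sketched as follows. A Python list with binary search stands in for the paper's tree structure, and the window width `w` is an arbitrary assumption; a balanced tree would be needed for true logarithmic-time inserts at scale:

```python
import bisect

class SortedNeighborhoodIndex:
    """Toy dynamic sorted-neighborhood index (illustrative sketch only).
    Records are kept sorted by a key; a query retrieves the candidates
    within `w` positions either side of the query key's insertion point."""

    def __init__(self, w=2):
        self.keys = []      # sorted list of keys (tree stand-in)
        self.records = []   # records, kept parallel to self.keys
        self.w = w

    def insert(self, key, record):
        """Insert a record at its sorted position (dynamic updates)."""
        i = bisect.bisect_left(self.keys, key)
        self.keys.insert(i, key)
        self.records.insert(i, record)

    def candidates(self, key):
        """Return candidate records in the window around the query key."""
        i = bisect.bisect_left(self.keys, key)
        lo, hi = max(0, i - self.w), min(len(self.records), i + self.w)
        return self.records[lo:hi]

idx = SortedNeighborhoodIndex(w=1)
for key, rec in [("smith", "r1"), ("smyth", "r2"),
                 ("jones", "r3"), ("brown", "r4")]:
    idx.insert(key, rec)

# A query key that matches no record exactly still lands near its
# lexical neighbours, which become the candidate set.
cands = idx.candidates("smithe")
```

The query "smithe" falls between "smith" and "smyth" in sort order, so both of their records are returned as candidates for detailed comparison.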

Research paper thumbnail of Similarity-Aware Indexing for Real-Time Resolution

Research paper thumbnail of rgayler/scorecal_CSCC_2019 v1.0

This is the R notebook and R project that generates the presentation, as given at Credit Scoring & Credit Control XVI, Edinburgh, UK, on 2019-08-30.

Research paper thumbnail of Composition connectionism in cognitive science : papers from the 2004 AAAI Symposium, October 21-24, Arlington, Virginia

Research paper thumbnail of Analogical Reasoning (video recording)

Video recording of the lecture "Analogical Reasoning" given on 2021-10-06 as Module 6 of Neuroscience 299: Computing with High-Dimensional Vectors at the Redwood Center for Theoretical Neuroscience, University of California, Berkeley.

Video DOI:10.5281/zenodo.5560797
Slides DOI:10.5281/zenodo.5552219
Source DOI:10.5281/zenodo.5561063

Research paper thumbnail of Analogical Reasoning

Slides for the lecture "Analogical Reasoning" given on 2021-10-06 as Module 6 of Neuroscience 299: Computing with High-Dimensional Vectors at the Redwood Center for Theoretical Neuroscience, University of California, Berkeley.

Research paper thumbnail of Vector Symbolic Architectures: A New Building Material for Artificial General Intelligence

Frontiers in Artificial Intelligence and Applications - Proceedings of the 2008 conference on Artificial General Intelligence, 2008

We provide an overview of Vector Symbolic Architectures (VSA), a class of structured associative memory models that offers a number of desirable features for artificial general intelligence. By directly encoding structure using familiar, computationally efficient algorithms, VSA bypasses many of the problems that have consumed unnecessary effort and attention in previous connectionist work. Example applications from opposite ends of the AI spectrum (visual map-seeking circuits and structured analogy processing) attest to the generality and power of the VSA approach in building new solutions for AI.
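The "directly encoding structure" claim can be illustrated with a minimal sketch of one VSA style (MAP-like binding by elementwise multiplication of random bipolar vectors; the role and filler names are invented for this example):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # high dimensionality makes random vectors quasi-orthogonal

def vec():
    """Random bipolar hypervector."""
    return rng.choice([-1.0, 1.0], size=D)

def bind(a, b):
    """MAP-style binding: elementwise multiply (its own inverse)."""
    return a * b

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Encode the structure {colour: red, shape: circle} in a single vector:
colour, red, shape, circle = vec(), vec(), vec(), vec()
record = bind(colour, red) + bind(shape, circle)  # bundling by addition

# Unbinding with the `colour` role recovers a noisy copy of `red`.
probe = bind(record, colour)
```

Here cosine(probe, red) comes out near 1/sqrt(2) while cosine(probe, circle) is near zero, so the filler can be identified by comparing the probe against a cleanup memory of known vectors.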

Research paper thumbnail of Vector Symbolic Architectures answer Jackendoff's challenges for cognitive neuroscience

Jackendoff (2002) posed four challenges that linguistic combinatoriality and rules of language present to theories of brain function. The essence of these problems is the question of how to neurally instantiate the rapid construction and transformation of the compositional structures that are typically taken to be the domain of symbolic processing. He contended that typical connectionist approaches fail to meet these challenges and that the dialogue between linguistic theory and cognitive neuroscience will be relatively unproductive until the importance of these problems is widely recognised and the challenges answered by some technical innovation in connectionist modelling. This paper claims that a little-known family of connectionist models (Vector Symbolic Architectures) are able to meet Jackendoff's challenges.

Research paper thumbnail of scorecal - Empirical score calibration under the microscope

Presentation given at the Credit Scoring & Credit Control Conference XVI in Edinburgh, UK. Abstract: Score calibration is the process of empirically determining the relationship between a score and an outcome on some population of interest, and scaling is the process of expressing that relationship in agreed units. Calibration is often treated as a simple matter and attacked with simple tools – typically, either assuming that the relationship between score and log-odds is linear and fitting a logistic regression with the score as the only covariate, or dividing the score range into bands and plotting the empirical log-odds as a function of score band. Both approaches ignore some information in the data. The assumption of a linear score to log-odds relationship is too restrictive, and score banding ignores the continuity of the scores. While a linear score to log-odds relationship is often an adequate approximation, the reality can be much more interesting, with...
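The banding tool mentioned in the abstract can be sketched on simulated data. The linear score-to-log-odds relation below is an assumption built into the simulation, not a claim from the talk:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a portfolio where, by construction,
# log-odds(good) = 0.05 * score - 2.
n = 200_000
scores = rng.uniform(0, 100, size=n)
p_good = 1.0 / (1.0 + np.exp(-(0.05 * scores - 2.0)))
good = rng.random(n) < p_good

# Banding approach: empirical log-odds per 10-point score band.
# 0.5 is added for smoothing so sparse cells cannot blow up the log.
bands = (scores // 10).astype(int)
band_log_odds = []
for b in range(10):
    in_band = bands == b
    g, m = good[in_band].sum(), in_band.sum()
    band_log_odds.append(float(np.log((g + 0.5) / (m - g + 0.5))))
```

On this simulated data the banded estimates track the true linear relation (for example, the 50–60 band sits near log-odds 0.75); on real portfolios the interest lies precisely in where the empirical estimates depart from a straight line.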

Research paper thumbnail of VSA, Analogy, and Dynamic Similarity

Presentation to be given at the Workshop on Developments in Hyperdimensional Computing and Vector Symbolic Architectures, 16 March 2020, Kirchhoff-Institute for Physics, Heidelberg University, Heidelberg, Germany. Extended Abstract: It has been argued that analogy is at the core of cognition [7, 1]. My work in VSA is driven by the goal of building a practical, effective analogical memory/reasoning system. Analogy is commonly construed as structure mapping between a source and target [5], which in turn can be construed as representing the source and target as graphs and finding maximal graph isomorphisms between them. This can also be viewed as a kind of dynamic similarity, in that the initially dissimilar source and target are effectively very similar after mapping. Similarity (the angle between vectors) is central to the mechanics of VSA/HDC. Introductory papers (e.g. [8]) necessarily devote space to vector similarity and the effect of the primi...

Research paper thumbnail of Veda Advantage

Entity resolution, also known as data matching or record linkage, is the task of identifying and matching records from several databases that refer to the same entities. Traditionally, entity resolution has been applied in batch mode and on static databases. However, many organisations are increasingly faced with the challenge of having large databases containing entities that need to be matched in real time with a stream of query records also containing entities, such that the best matching records are retrieved. Example applications include online law enforcement and national security databases, public health surveillance and emergency response systems, financial verification systems, online retail stores, eGovernment services, and digital libraries. A novel inverted index based approach for real-time entity resolution is presented in this paper. At build time, similarities between attribute values are computed and stored to support the fast matching of records at query time. The presen...
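The build-time precomputation can be sketched as follows. The first-letter-plus-length encoding is a crude stand-in for whatever blocking encoding the paper actually uses, and the similarity function and threshold are arbitrary assumptions:

```python
from collections import defaultdict
from difflib import SequenceMatcher

def encode(value):
    """Crude blocking key (stand-in for e.g. a phonetic encoding)."""
    return value[0] + str(len(value))

class SimilarityAwareIndex:
    """Toy similarity-aware inverted index (illustrative sketch only).
    At build time, pairwise similarities between attribute values that
    share a blocking key are precomputed, so query-time matching avoids
    on-the-fly string comparisons."""

    def __init__(self, threshold=0.7):
        self.blocks = defaultdict(set)     # blocking key -> attribute values
        self.postings = defaultdict(list)  # attribute value -> record ids
        self.sim = {}                      # precomputed pairwise similarities
        self.threshold = threshold

    def insert(self, value, rid):
        block = self.blocks[encode(value)]
        for other in block:                # precompute sims at build time
            s = SequenceMatcher(None, value, other).ratio()
            self.sim[frozenset((value, other))] = s
        block.add(value)
        self.postings[value].append(rid)

    def query(self, value):
        out = list(self.postings.get(value, []))   # exact matches first
        for other in self.blocks.get(encode(value), ()):
            if other != value and \
               self.sim.get(frozenset((value, other)), 0.0) >= self.threshold:
                out.extend(self.postings[other])   # stored lookups only
        return out

idx = SimilarityAwareIndex()
for value, rid in [("smith", "r1"), ("smyth", "r2"), ("stone", "r3")]:
    idx.insert(value, rid)

matches = idx.query("smith")   # exact match plus precomputed near-matches
```

Querying "smith" returns its own record plus "smyth" (precomputed similarity 0.8), while "stone", although in the same block, falls below the threshold.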

Research paper thumbnail of Compositional Connectionism in Cognitive Science II: The Localist/Distributed Dimension

Connection Science, 2011

In October 2004, approximately 30 connectionist and non-connectionist researchers gathered at an AAAI symposium to discuss and debate a topic of central concern in artificial intelligence and cognitive science: the nature of compositionality. The symposium offered participants an opportunity to confront the persistent belief among traditional cognitive scientists that connectionist models are, in principle, incapable of systematically composing and manipulating the elements of mental structure (words, concepts, semantic roles, etc.). Participants met this challenge, with several connectionist models serving as proofs of concept that connectionism is indeed capable of building and manipulating compositional cognitive representations (Levy and Gayler 2004).

The August 2010 workshop focused on what may now be the major issue in connectionism and computational cognitive neuroscience: the debate between proponents of localist representations (e.g. Page 2000), in which a single unit or population of units encodes one (and only one) item, and proponents of distributed representations, in which all units participate in the encoding of all items (see Plate 2002 for an overview). The aim of this workshop was to bring together researchers working with a wide range of compositional connectionist models, independent of application domain (e.g. language, logic, analogy, web search), with a focus on what commitments (if any) each model makes to localist or distributed representation. We solicited submissions from both localist and distributed modellers, as well as those whose work bypasses this distinction or challenges its importance. We expected vigorous and exciting debate on this topic, and we were not disappointed.

Research paper thumbnail of Credit Scoring and Data Mining

Slides for the presentation "Credit Scoring and Data Mining" by Ross Gayler, given at AusDM'09, Melbourne, on 2/12/2009. The opening slides define credit scoring as the predictive modelling of operational outcomes in mass-market credit, used to automate operational decision making, i.e. to make decisions on the basis of predicted outcomes.

Research paper thumbnail of Compositional memory for recognition of complex objects: A proposal

The problem of extracting an invariant representation from perceptual inputs has long been recognised (e.g. Lashley, 1942). More recently, various proposals have been made and implemented for recurrent connectionist systems that simultaneously settle on a mapping and retrieve an item from memory (Arathorn, 2002; Hinton, 1981; Olshausen, Anderson, & Van Essen, 1993). The mapping (which captures the variant aspect of the input) transforms the input into the cue that retrieves the item (which is the invariant representation) from memory.

Coming from a completely different direction, I have been developing a connectionist memory architecture to support high level cognition (Gayler, 2000). Surprisingly, this architecture is also based on simultaneous transformation and recognition and is abstractly isomorphic to the perceptual invariance architectures. The similarity between the perceptual and cognitive architectures suggests that there may be a fundamental unity between them. The difference lies in the details; the perceptual architectures use localist representations and a fixed palette of geometric transformations, while the cognitive architecture uses distributed connectionist representations capable of representing recursive structures, and transformations that are arbitrary structural substitutions. I propose that the architectures could be unified and devote the remainder of this presentation to exploring how this may enable the recognition of composite objects.

The perceptual architectures mentioned earlier recognise a single item at a time. They can be persuaded to attend to multiple items serially, but they do not allow for representation of the relations between items. These architectures do represent the relations between the elements (pixels or feature vectors) within an item, but these relations are fixed. Each item is recognised holistically and treated as atomic (having no internal compositional structure). Thus, multi-level composite items cannot be represented.

The representational advantage offered by the distributed approach is that transformations are “first-class” entities, having the same status as the content mapped by the transformations. This means that representations of transformations can be included in the representations of objects. In particular, two serially fixated items and the attentional transformation between the fixations could be represented on the same set of connectionist units used to represent just one item. Thus, it should be possible to represent complex entities as a network of components with transformations between them. This leads naturally to graph structures as representations of objects – a common choice in computer vision systems.

The process advantage of such an approach is that it should be possible to build a connectionist memory that simultaneously recalls multiple items while settling on mappings between them. These mappings would serve to unify the retrieved items into a representation of a novel composite object. Memory systems of this sort should be able to recognise novel compositions of familiar components as readily as they recognise the components themselves. The distributed connectionist implementation of this recognition process can be construed as an indirect implementation of Pelillo's (1999) approximate graph matching via replicator equations, by embedding his algorithm in a fixed high-dimensional vector space.
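Pelillo's replicator-equation approach can be illustrated directly on a toy maximum-clique problem (the graph, iteration count, and use of NumPy here are illustrative, not taken from the paper). Stable fixed points of the dynamics are the characteristic vectors of maximal cliques, via the Motzkin-Straus theorem:

```python
import numpy as np

# Toy graph: triangle {0,1,2} plus a pendant path 2-3-4.
# The unique maximum clique is {0,1,2}.
A = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

# Replicator dynamics on the simplex (Pelillo 1999):
# x_i <- x_i * (Ax)_i / (x . Ax). The update preserves sum(x) = 1,
# and the support concentrates on a maximal clique.
x = np.full(5, 1 / 5)          # start at the barycenter
for _ in range(200):
    Ax = A @ x
    x = x * Ax / (x @ Ax)

print(np.round(x, 3))          # support concentrates on the triangle {0,1,2}
```

The connectionist implementation sketched in the abstract embeds this kind of dynamics in a fixed high-dimensional vector space rather than using one unit per candidate match.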

Arathorn, D. W. (2002). Map-seeking circuits in visual cognition: A computational mechanism for biological and machine vision. Stanford, CA, USA: Stanford University Press.

Gayler, R. W. (2000). Multiplicative Binding, Representation Operators & Analogical Inference. Presented at Cognitive Science Conference. Melbourne, Australia.

Hinton, G. E. (1981). A parallel computation that assigns canonical object-based frames of reference. Proceedings of the Seventh International Joint Conference on Artificial Intelligence Vol. 2. Vancouver BC, Canada.

Lashley, K. S. (1942). The problem of cerebral organization in vision. Biological Symposia, 7, 301-322.

Olshausen, B. A., Anderson, C. H., & Van Essen, D. C. (1993). A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. The Journal of Neuroscience, 13, 4700-4719.

Pelillo, M. (1999). Replicator equations, maximal cliques, and graph isomorphism. Neural Computation, 11, 1933-1955.

Research paper thumbnail of Signal Detection for credit scoring practitioners

The ROC curve is useful for assessing the predictive power of risk models and is relatively well known for this purpose in the credit scoring community. The ROC curve is a component of the Theory of Signal Detection (TSD), a theory which has pervasive links to many issues in model building. However, these conceptual links and their associated insights and techniques are less well known than they deserve to be among credit scoring practitioners.

The purpose of this paper is to alert credit risk modellers to the relationships between TSD and common scorecard development concepts and to provide a toolbox of simple techniques and interpretations.
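As one small sketch of the TSD toolbox idea, the area under the ROC curve can be computed from the Mann-Whitney pair-counting identity, from which the Gini coefficient familiar to credit scoring practitioners follows directly (the function name and toy data below are invented for illustration):

```python
import numpy as np

def auc_gini(scores, labels):
    """AUC via the Mann-Whitney pair-counting identity:
    AUC = P(score_good > score_bad) + 0.5 * P(tie).
    Gini = 2 * AUC - 1, the form usually quoted in credit scoring."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    good = scores[labels == 1]   # label 1 = good outcome (repaid)
    bad = scores[labels == 0]    # label 0 = bad outcome (defaulted)
    diff = good[:, None] - bad[None, :]
    auc = np.mean(diff > 0) + 0.5 * np.mean(diff == 0)
    return auc, 2 * auc - 1

# Six applicants: three goods (650, 700, 610) vs three bads (600, 720, 580).
scores = [600, 650, 700, 720, 580, 610]
labels = [0,   1,   1,   0,   0,   1]
auc, gini = auc_gini(scores, labels)
print(auc, gini)  # 6 of the 9 good/bad pairs are correctly ordered
```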

Research paper thumbnail of VSA, Analogy, and Dynamic Similarity

Workshop on Developments in Hyperdimensional Computing and Vector Symbolic Architectures, 2020

It has been argued that analogy is at the core of cognition [7, 1]. My work in VSA is driven by the goal of building a practical, effective analogical memory/reasoning system. Analogy is commonly construed as structure mapping between a source and target [5], which in turn can be construed as representing the source and target as graphs and finding maximal graph isomorphisms between them. This can also be viewed as a kind of dynamic similarity, in that the initially dissimilar source and target are effectively very similar after mapping.

Similarity (the angle between vectors) is central to the mechanics of VSA/HDC. Introductory papers (e.g. [8]) necessarily devote space to vector similarity and the effect of the primitive operators (sum, product, permutation) on similarity. Most VSA examples rely on static similarity, where the vector representations are fixed over the time scale of the core computation (which is usually a single-pass, feed-forward computation). This emphasises encoding methods (e.g. [12, 13]) that create vector representations with the similarity structure required by the core computation. Random Indexing [13] is an instance of the vector embedding approach to representation [11] that is widely used in NLP and ML. The important point is that the vector embeddings are developed in advance and then used as static representations (with fixed similarity structure) in the subsequent computation of interest.
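A minimal sketch of how the primitive operators affect similarity, assuming bipolar (±1) hypervectors as in the MAP family of VSAs (the dimension and random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000

def rand_hv():                      # random bipolar hypervector
    return rng.choice([-1, 1], size=D)

def cos(a, b):                      # similarity = cosine of the angle
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

a, b = rand_hv(), rand_hv()
print(round(cos(a, b), 2))          # ~0: random hypervectors are near-orthogonal

s = np.sign(a + b + rand_hv())      # sum (bundling) preserves similarity
print(round(cos(a, s), 2))          # ~0.5: the bundle is similar to each input

p = a * b                           # product (binding) maps away
print(round(cos(a, p), 2))          # ~0: the bound vector resembles neither input

r = np.roll(a, 1)                   # permutation also maps away
print(round(cos(a, r), 2))          # ~0
```

Bundling three vectors (rather than two) keeps every component of the sum odd, so `np.sign` never produces a zero.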

Human similarity judgments are known to be context-dependent (see [3] for a brief review). It has also been argued that similarity and analogy are based on the same processes [6] and that cognition is so thoroughly context-dependent that representations are created on-the-fly in response to task demands [2]. This seems extreme, but it doesn't necessarily imply that the base representations are context-dependent, as long as the cognitive process that compares them is context-dependent. That can be achieved by having dynamic representations that are derived from the static base representations by context-dependent transforms (or any functionally equivalent process).

An obvious candidate for a dynamic transformation function in VSA is substitution by binding, because the substitution can be specified as a vector and dynamically generated (see Representing substitution with a computed mapping in [8]). This implies an internal degree of freedom (a register to hold the substitution vector while it evolves) and a recurrent VSA circuit to provide the dynamics to evolve the substitution vector.
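Substitution by binding can be sketched as follows, assuming self-inverse bipolar binding: several substitutions are bundled into one computed mapping vector, and applying that vector to a source item yields a noisy version of its substitute, recoverable by a cleanup memory (dimension, seed, and the number of substitutions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
D = 10_000
hv = lambda: rng.choice([-1, 1], size=D)
cos = lambda a, b: (a @ b) / D      # bipolar vectors have norm sqrt(D)

# Three substitutions x_i -> y_i, bundled into ONE mapping vector.
xs = [hv() for _ in range(3)]
ys = [hv() for _ in range(3)]
m = np.sign(sum(x * y for x, y in zip(xs, ys)))

# Binding is self-inverse for bipolar vectors, so applying m to x_0
# yields a noisy version of y_0 that a cleanup memory can identify.
out = m * xs[0]
sims = [cos(out, y) for y in ys]
print([round(s, 2) for s in sims])  # first entry ~0.5, others ~0
```

Because `m` is itself just a vector, it can be held in a register and evolved by a recurrent circuit, which is the sense in which the substitution is dynamic.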

These essential aspects are present in [4], which finds the maximal subgraph isomorphism between two graphs represented as vectors. This is implemented as a recurrent VSA circuit with a register containing a substitution vector that evolves and settles over the course of the computation. The final state of the substitution vector represents the set of substitutions that transforms the static base representation of each graph into the best subgraph isomorphism to the static base representation of the other graph. This is a useful step along the path to an analogical memory system.

Interestingly, the subgraph isomorphism circuit can be interpreted as related to the recently developed Resonator Circuits for factorisation of VSA representations [9], which have internal degrees of freedom for each of the factors to be calculated and a recurrent VSA dynamics that settles on the factorisation. The graph isomorphism circuit can be interpreted as finding a factor (the substitution vector) such that the product of that factor with each of the graphs is the best possible approximation to the other graph. This links the whole enterprise back to statistical modelling, where there is a long history of approximating matrices/tensors as the product of simpler factors [10].
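A two-factor resonator iteration in the spirit of [9] can be sketched as follows (codebook sizes, dimension, iteration count, and initialisation are illustrative, not the circuit from the paper verbatim). Each step unbinds the composite with the current estimate of the other factor, then projects the result back onto that factor's codebook:

```python
import numpy as np

rng = np.random.default_rng(2)
D, K = 10_000, 20
A = rng.choice([-1, 1], size=(K, D))   # codebook of candidate first factors
B = rng.choice([-1, 1], size=(K, D))   # codebook of candidate second factors

c = A[3] * B[7]                        # composite to be factored

# Resonator-style iteration: unbind with the current estimate of the
# other factor, then clean up by projecting onto the codebook
# (a weighted superposition of all candidates, squashed by sign).
a_hat = np.sign(A.sum(axis=0))         # start: superposition of all candidates
b_hat = np.sign(B.sum(axis=0))
for _ in range(20):
    a_hat = np.sign(A.T @ (A @ (c * b_hat)))
    b_hat = np.sign(B.T @ (B @ (c * a_hat)))

print(np.argmax(A @ a_hat), np.argmax(B @ b_hat))  # recovers indices 3 and 7
```

The search space here is K × K = 400 candidate factorisations, but the circuit never enumerates them; it settles on the answer through the recurrent dynamics, which is the property that makes the connection to the subgraph isomorphism circuit interesting.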

References
1. Blokpoel, M., Wareham, T., Haselager, P., van Rooij, I.: Deep Analogical Inference as the Origin of Hypotheses. The Journal of Problem Solving 11(1), 1–24 (2018)
2. Chalmers, D.J., French, R.M., Hofstadter, D.R.: High-level perception, representation, and analogy: A critique of artificial intelligence methodology. Journal of Experimental & Theoretical Artificial Intelligence 4(3), 185–211 (1992)
3. Cheng, Y.: Context-dependent similarity. In: Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence (UAI’90), pp. 27–30. Cambridge, MA, USA (1990)
4. Gayler, R.W., Levy, S.D.: A distributed basis for analogical mapping. In: Proceedings of the Second International Conference on Analogy (ANALOGY-2009), pp. 165–174. New Bulgarian University, Sofia, Bulgaria (2009)
5. Gentner, D.: Structure-mapping: A theoretical framework for analogy. Cognitive Science 7(2), 155–170 (1983)
6. Gentner, D., Markman, A.B.: Structure mapping in analogy and similarity. American Psychologist 52(1), 45–56 (1997)
7. Gust, H., Krumnack, U., Kühnberger, K.-U., Schwering, A.: Analogical Reasoning: A core of cognition. KI - Künstliche Intelligenz 1(8), 8–12 (2008)
8. Kanerva, P.: Hyperdimensional computing: An introduction to computing in distributed representation with high-dimensional random vectors. Cognitive Computation 1, 139–159 (2009)
9. Kent, S.J., Frady, E.P., Sommer, F.T., Olshausen, B.A.: Resonator Circuits for factoring high-dimensional vectors. http://arxiv.org/abs/1906.11684 (2019)
10. Kolda, T.G., Bader, B.W.: Tensor decompositions and applications. SIAM Review 51(3), 455–500 (2009)
11. Pennington, J., Socher, R., Manning, C.D.: GloVe: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543. Association for Computational Linguistics, Doha, Qatar (2014)
12. Purdy, S.: Encoding data for HTM systems. http://arxiv.org/abs/1602.05925 (2016)
13. Sahlgren, M.: An introduction to random indexing. In: Proceedings of the Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering (TKE 2005), Copenhagen, Denmark (2005)