Jason Eisner - Home Page (JHU)
Professor
ACL Fellow
My Work
- Publications and related materials
- Research students
- Full CV, short bio, and research summary
- Courses, tutorial materials, teaching statement, 2012 interview and 2019 interview about teaching
- Code, including the Dyna project
- Advice for research students and prospective students
- Workshop on Probabilistic Representations of Linguistic Meaning (2014)
What I Work On
Novel methods in NLP and ML, focusing on probabilistic modeling and inference in complex, structured, or ill-defined settings.
This often involves new machine learning methods; creative uses and modifications of large language models; probabilistic models of linguistic structure, human behavior, and machine behavior; and combinatorial algorithms and approximate inference.
I'm also into designing declarative specification languages backed by general efficient algorithms (and adaptive execution). This produces a coherent view of all of the modeling and algorithmic options, and accelerates the research of others.
Why?
- The questions: Large language models attempt to imitate typical human behavior. How can we combine this with disciplines for ensuring rational behavior, such as statistics, case analysis and planning, reinforcement learning, the scientific method, and probabilistic modeling of the world? How can we use this to support humans, including by integrating human preferences and expertise?
- The engineering motivation: Computers must learn to understand human language. A huge portion of human communication, thought, and culture now passes through computers. Ultimately, we want our devices to help us by understanding text and speech as a human would—both at the small scale of intelligent user interfaces and at the large scale of the entire multilingual Internet.
- The scientific motivation: Human language is fascinatingly complex and ambiguous. Yet babies are born with the incredible ability to discover the structure of the language around them. Soon they are able to rapidly comprehend and produce that language and relate it to events and concepts in the world. Figuring out how this is possible is a grand challenge for both cognitive science and machine learning.
- The disciplines: My research program combines computer science with statistics and linguistics. The challenge is to fashion statistical models that are nuanced enough to capture good intuitions about linguistic structure, and especially, to develop efficient algorithms to apply these models to data (including training them with as little supervision as possible, or making use of large pre-trained models).
What?
- Models: I've developed significant modeling approaches for a wide variety of domains in natural language processing—syntax, phonology, morphology, and machine translation, as well as semantic preferences, name variation, and even database-backed websites. The goal is to capture not just the structure of sentences, but also deep regularities within the grammar and lexicon of a language (and across languages). My students and I are always thinking about new problems and better models. For example, latent variables and nonparametric Bayesian methods let us construct a linguistically plausible account of how the data arose. Our latest models continue to include linguistic ideas, but they also include deep neural networks in order to fit unanticipated regularities and large pre-trained language models to exploit the knowledge implicit in large corpora.
- Algorithms: A good mathematical model will _define_ the best analysis of the data, but can we compute that analysis? My students and I are constantly developing new algorithms to cope with the tricky structured prediction and learning problems posed by increasingly sophisticated models. Unlike many areas of machine learning, we have to deal with probability distributions over unboundedly large structured variables such as strings, trees, alignments, and grammars. My favorite tools include dynamic programming, Markov chain Monte Carlo (MCMC), belief propagation and other variational approximations, automatic differentiation, deterministic annealing, stochastic local search, coarse-to-fine search, integer linear programming, and relaxation methods. I especially enjoy connecting disparate techniques in fruitful new ways.
- General paradigms: My students and I also work to pioneer general statistical and algorithmic paradigms that cut across problems (not limited to NLP). We are developing a high-level declarative programming language, Dyna, which allows startlingly short programs, backed by many interesting general efficiency tricks so that these don't have to be reinvented and reimplemented in new settings all the time (see the sketch after this list). We are also showing how to learn execution strategies that do fast and accurate approximate statistical inference, and how to properly train these essentially discriminative strategies in a Bayesian way. We have also developed other machine learning techniques and modeling frameworks of general interest, primarily for structured prediction and temporal sequence modeling.
- Measuring success: We implement our new methods and evaluate them carefully on collections of naturally occurring language. We have repeatedly improved the state of the art. While our work can certainly be used within today's end-user applications, such as machine translation and information extraction, we ourselves are generally focused on building up the long-term fundamentals of the field.
- In general, I have broad interests and have worked on a wide range of fundamental topics in NLP, drawing on varied areas of computer science. See my publications, CV, and research summary for more information; see also notes on my advising style.
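To give a flavor of the declarative style mentioned in the "General paradigms" bullet above, here is a toy sketch in Python rather than Dyna: it spells out, as an explicit dynamic program, the kind of one-line weighted-deduction rule that a Dyna program would state directly. This is only an illustration under assumed details; the grammar, probabilities, and function names are invented and are not taken from any of our systems.

```python
from collections import defaultdict

# Toy CNF grammar with rule probabilities (all names and weights invented for illustration).
# Binary rules: (parent, left_child, right_child) -> prob; lexical rules: (tag, word) -> prob.
binary_rules = {
    ("S",  "NP",  "VP"): 1.0,
    ("VP", "V",   "NP"): 1.0,
    ("NP", "Det", "N"):  0.6,
}
lexical_rules = {
    ("NP",  "Mary"): 0.4,
    ("V",   "saw"):  1.0,
    ("Det", "the"):  1.0,
    ("N",   "dog"):  1.0,
}

def inside_scores(words):
    """Weighted CKY: chart[i, j][X] = total probability that X derives words[i:j].
    This is the dynamic program that a declarative rule in the spirit of
        phrase(X,I,J) += rewrite(X,Y,Z) * phrase(Y,I,K) * phrase(Z,K,J)
    describes in a single line."""
    n = len(words)
    chart = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):                      # width-1 spans from the lexicon
        for (tag, word), p in lexical_rules.items():
            if word == w:
                chart[i, i + 1][tag] += p
    for width in range(2, n + 1):                      # build wider spans from narrower ones
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):                  # split point
                for (x, y, z), p in binary_rules.items():
                    chart[i, j][x] += p * chart[i, k][y] * chart[k, j][z]
    return chart

if __name__ == "__main__":
    chart = inside_scores(["Mary", "saw", "the", "dog"])
    print(chart[0, 4]["S"])   # inside probability of an S spanning the whole sentence
```

On the four-word toy sentence this prints the inside probability of an S covering the whole string, 0.4 × 0.6 = 0.24 (up to floating-point rounding). The point of the declarative approach is that the chart, the loop order, and other execution details are exactly the parts that should not have to be rewritten by hand for each new model.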
Courses
- 601.465/665 (fall): Natural Language Processing
- 601.765 (spring 2018, 2019): Machine Learning: Linguistic and Sequence Modeling
- 601.325/425/625 (spring 2017): Declarative Methods
- 601.865 (every semester): Selected Topics in Natural Language Processing
- LI 569 (Summer 2013): Intro to Computational Linguistics (at the LSA Linguistic Institute)
- 600.226 (Spring 2003, 2004): Data Structures
- 600.665 (Spring 2002): Statistical Language Learning
- 600.406 (Spring 2000): Finite-State Methods in Natural Language Processing, Part II
- 600.405 (Fall 2000): Finite-State Methods in Natural Language Processing, Part I
- CSC400 (Fall 2000): Problem Seminar (at U. of Rochester)
- CSC577 (Spring 2000): Statistical Learning of Natural Language (at U. of Rochester)
See also other tutorial material.
Personal
Undergraduates are often curious about their teachers' secret lives. In the name of encouraging curiosity-driven research, here are a few photos:
- John saw Mary killed Jason
- I, Jay, and Kay
- got chocolate?
- As the Astonishing Fuzzo
- As Prof. Henry Higgins in My Fair Lady (song video, with captions)
- As the Scarecrow in the Wizard of Oz (song video, with captions)
- As van Gogh
- Bike diving on a hot day
- ALS ice bucket challenge on a cold day
- With my grad students
- My kids Talia and Lev
- At the furniture store
- Groovin'
- Professing (thanks to a student who prompted DALL-E to generate this)
- Mug shot
- My great-great-grandfather's lack of street smarts
- My brain as of 1999 (sagittal, axial) - study carefully and you can skip my course (well, the first half).
- My face 10 years later, as drawn by Hongyuan Mei
And some non-photos:
- I own The Type/Token Distinction (that's right, I paid actual money for a philosophical idea, in cryptocurrency)
- First Workshop on Unnatural Language Processing (see also related work)
- The Grammar and the Sentence (a song about parsing) (see also All Malonely People)
- My Favorite Things (happy thoughts during Covid-19); Don'cha Go 'Way Mad; Where or When
- Read about the threat of NLP and global warming
- Why hamantaschen are better than latkes (animated PowerPoint with speaker notes) (see also related work)
If I had a geek code, it would be GCS/O/M/MU d-(+) s:- a+ C++$ ULS+(++) L++ P++ E++>+++ W++ N++ o+ K++ w@ !O V- PS++ PE- Y+ PGP b++>+++ !tv G e++++ h- r+++ y+++, but I disapprove of the feeping creaturism of these things.