Philip Resnik's Home Page

**Computational social science.** (Why? See my discussion of research and social impact.) The key question I'm exploring: what can the signal available in language use tell us about underlying mental aspects of the speaker/author, such as their ideology, emotional state, or the presence of mental disorders? My earlier work in this area has included topics such as sentiment analysis, persuasion, framing, and "spin", with interests in connections among lexical semantics, surface linguistic expression, and underlying internal state, as well as applications of unsupervised and semi-supervised methods. These days, from a linguistics perspective, I'm less focused on lexical semantics and more on computational pragmatics. From a methods perspective, I retain a longstanding interest in topic models because of their interpretability and their ability to incorporate pre-existing knowledge as informative priors, and, like everyone else, I'm interested in large language models -- though only as a useful tool, not as some grand answer to the question of how to achieve AI. (Although it may rapidly go out of date, to get some idea of my views, see this 2023 talk to a lay audience on what they should know about ChatGPT.)
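(For readers who haven't run into the "informative priors" idea: here's a rough sketch of what I mean, using gensim's LdaModel with a few made-up seed words purely for illustration -- this is not code from any of the projects mentioned here.)

```python
# Rough illustration only: gensim's LdaModel accepts an asymmetric topic-word
# prior (eta), so giving extra prior mass to a few seed words nudges particular
# topics toward categories you care about a priori. Seed words here are made up.
import numpy as np
from gensim import corpora
from gensim.models import LdaModel

docs = [
    ["insomnia", "sleep", "tired", "night"],
    ["vote", "election", "policy", "senate"],
    ["sleep", "anxiety", "night", "worry"],
    ["policy", "vote", "debate", "senate"],
]
dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]

num_topics = 2
eta = np.full((num_topics, len(dictionary)), 0.01)  # weak base prior everywhere
seed_words = {0: ["sleep", "insomnia"], 1: ["vote", "policy"]}  # hypothetical seeds
for topic, words in seed_words.items():
    for w in words:
        eta[topic, dictionary.token2id[w]] = 1.0  # extra (informative) prior mass

lda = LdaModel(corpus, id2word=dictionary, num_topics=num_topics,
               eta=eta, alpha="auto", passes=50, random_state=0)
for t in range(num_topics):
    print(t, lda.print_topic(t, topn=4))
```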

A primary application focus for me these days is applying computational models to identify linguistic signal related to mental health. See my page about research and social impact for discussion; for a good overview of my angle on this, see my invited talk, Analyzing social media for suicide risk using natural language processing (~30 min), at the AWS Machine Learning Summit. My article with suicide prevention experts also helps situate what I do in its broader context. In addition to suicidology, a primary area of research in collaboration with computational and medical school colleagues involves computational methods for identifying signal related to depression and schizophrenia; you, dear reader, are invited to contribute data to this effort here.

In addition to research on computational methods, I've been trying hard to improve the ability of the wider computational community to work with sensitive mental health data, including creation of the University of Maryland Reddit Suicidality Dataset, development of the UMD/NORC Mental Health Data Enclave (a joint project with NORC at the University of Chicago, sponsored in part by an Amazon AWS Machine Learning Research Award), and serving as co-founder and multiple-time organizer of the Workshops on Computational Linguistics and Clinical Psychology (CLPsych). Many of those pieces were brought together in the 2021 CLPsych Shared Task, where teams worked on prediction of suicide attempts using sensitive social media data within the UMD/NORC enclave. I also serve on the Scientific Advisory Committee for the Coleridge Initiative, a non-profit focused on data-driven policy decision-making, often involving sensitive data.

The other main area in which I'm applying these ideas is computational political science -- again, see the discussion on my page about research and social impact, and also take a look at some of my previous research. Recently, I was engaged in work with students Alexander Hoyle, Pranav Goel, and Rupak Sarkar, and collaborators Kris Miler and SoRelle Gaynor, on co-decisions, with the goal of using computational methods to better understand when and for what reasons individuals make the same versus different decisions.

With the same students, I also recently engaged in an NSF RAPID project focused on improving topic modeling methods for analysis of open-ended survey responses, with the more general and ambitious goal of revolutionizing survey methodology by making open-ends a first-class citizen in survey research. This work was tightly connected to the COVID-19 pandemic: it involved collaborating on COVID-related survey research using computational techniques with folks at the CDC's National Center for Health Statistics, the Pandemic Crisis Response Coalition, the NYU School of Nursing, and others. These days it has evolved into collaboration with Frauke Kreuter and folks connected with UMD's Joint Program in Survey Methodology.

**Computational psycholinguistics and neurolinguistics.** Over the past five years or so, I have been re-engaging more and more fully with my longstanding interests in computational approaches to cognitive questions, including psycholinguistics and, more recently, the computational neuroscience of language. During my 2018-2019 sabbatical, I began getting up to speed in computational cognitive neuroscience, and in Fall 2019 I began working with postdoc Shohini Bhattasali on applying computational models to neuroimaging data in order to better understand the physical basis of language comprehension and contextual influences on language (mis)understanding, in the context of a MURI project involving document understanding (Shohini is now on the Linguistics faculty at the University of Toronto). I also collaborated with Christian Brodbeck (along with Ellen Lau and Jonathan Simon) on neural representations of continuous speech and linguistic context, using computational models as a way of expressing cognitive hypotheses. I'm excited that these lines of work have begun to produce some interesting results; e.g., see here for recent work where Shohini and I introduced a new predictive measure, topical surprisal, and used it in an fMRI study to map the neural bases of broad and local contextual prediction during natural language comprehension. Right now I'm excited about new work with undergrad Chiebuka Ohams taking a serious look at predictive coding models and sentence understanding.
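(For readers unfamiliar with surprisal: the topical variant builds on the standard word-level notion below; the precise formulation of topical surprisal is in the paper, so take this only as the familiar background definition -- roughly, a word's processing cost is tied to how unpredictable it is given its preceding context.)

```latex
S(w_t) = -\log P(w_t \mid w_1, \ldots, w_{t-1})
```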

On the psycholinguistics side, I have long been interested in sentence processing, and particularly the interaction of top-down prediction and bottom-up evidence; my paper on left-corner parsing and psychological plausibility is an early example. As I've ramped back up in psycholinguistics, my work has involved looking at the interactions between syntactically mediated compositional processes and broader context, for which vector space representations (yes, including "deep learning", see below) offer some interesting modeling tools. Some initial papers related to this line of work include Ettinger, Phillips, and Resnik, Modeling N400 amplitude using vector space models of word representation (CogSci 2016) and Ettinger, Resnik, and Carpuat, Retrofitting sense-specific word vectors using parallel text (NAACL 2016). I also remain quite interested in the ways that ideas from (statistical) information theory may have a useful role to play in explaining why language works the way it does. (This is an idea I first began exploring in my dissertation [ps,pdf], back in 1993, and in the years since, a variety of people like John Hale, Roger Levy, and Florian Jaeger, among others, have done very interesting work in the same spirit.) My psycholinguistics interests have led to more recent collaboration with my colleague Colin Phillips and his student Hanna Muller, and Colin and I are now co-advising Linguistics PhD student Sathvik Nair, who has been doing some really interesting work on surprisal and reading-time data, plus some recent, very important work looking at whether the non-cognitive tokenization schemes widely used in large language models might be problematic when these models are used in cognitively oriented research; EMNLP Findings paper here. (Folks who are interested in computational models and psycholinguistics should also be talking with Naomi Feldman.)
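(To make the tokenization point concrete, here's a rough sketch -- my own illustration using GPT-2 via Hugging Face's transformers library, not the code from the paper -- of per-token surprisal. A long word typically gets split into several BPE pieces, each with its own surprisal, which is exactly what complicates word-level reading-time analyses.)

```python
# Rough illustration only (GPT-2 via Hugging Face transformers, not the paper's
# code): per-token surprisal, showing that BPE assigns surprisal to subword
# pieces rather than to the words that reading-time measures are aligned with.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

sentence = "The psycholinguist measured unpredictability."
enc = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits            # shape: (1, seq_len, vocab_size)
log_probs = torch.log_softmax(logits, dim=-1)

ids = enc["input_ids"][0]
# Surprisal of token t is -log p(token_t | preceding tokens); the first token
# has no left context in this simple setup, so start from position 1.
for t in range(1, ids.size(0)):
    surprisal = -log_probs[0, t - 1, ids[t]].item()
    print(f"{tokenizer.decode([int(ids[t])])!r:>20}  {surprisal:6.2f} nats")
# A word like 'unpredictability' typically shows up as several BPE pieces, each
# with its own surprisal, so word-level analyses need some aggregation scheme.
```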

Finally, I believe there are interesting ways to connect these cognitive modeling interests together with more application-oriented interests, particularly the computational social science interests discussed above. For a very big-picture look at how I'm thinking about this, take a look at my invited talk, Beyond Facts: The Problem of Framing in Assessing What is True, at the 2020 Fact Extraction and Verification workshop (FEVER 3).