Data Science at SPAN
At SPAN, I helped make smart electrical panels for the residential market. For my first year there I was a data scientist working on ML-driven products, anomaly detection, battery modeling and control, infrastructure, and general data science and analytics support. After a year I transitioned to managing part of the Data Science team, covering between 4 and 7 people.
Technologist in residence at USDR
U.S. Digital Response (USDR) is a nonprofit organization that helps governmental organizations at all levels respond quickly to technical issues by providing a network of pro bono technical expertise and modern technology. As the data lead, I am responsible for project sourcing and intakes, client relationships, volunteer interviewing and matching, project management, and implementation for data-related projects. I have worked on a wide range of themes, including nursing homes, redistricting, COVID dashboards, vaccination appointments, and equity issues in vaccine access.
Social network formation during and after college
We study the impact of going to college on people's social networks, using a large data set of networks from over 1,000 colleges in the U.S. We compare the structure of the resulting networks, the network formation processes, the role of homophily, heterogeneity across schools, connections between schools, and the long-range impact of networks formed during the college years. This project was done in collaboration with Bogdan State, John Levi Martin, and Lada Adamic.
The structure of U.S. college networks on Facebook. ICWSM, 2020 · paper
The dynamics of U.S. college network formation on Facebook. Unpublished, 2020 · paper
Persistence and change in structural signatures of tie formation over time. Social Networks, 2022 · paper
Choosing to grow a graph
We apply discrete choice models to study social network formation. A large number of existing network formation models (including preferential attachment, triadic closure, node fitness, and homophily) can be unified within a discrete choice framework. The ability to sample data, together with existing theory and software routines, makes it easy to fit these models to large graphs. Work on this project was done in collaboration with Johan Ugander, Austin Benson, and George Pakapol Supaniratisai.
Choosing to grow a graph: modeling network formation as discrete choice. WWW, 2019 · paper · code
Scaling choice models of relational social data. KDD, 2020 · paper · code
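To make the discrete choice framing of network formation concrete, here is a minimal sketch, not the code from the papers. It treats each new edge as a choice among candidate nodes under a conditional logit model whose utility is `theta * log(degree)`, so `theta = 1` corresponds to linear preferential attachment and `theta = 0` to uniform choice. The function names, the toy data, and the grid search are all illustrative assumptions.

```python
import math

def choice_probabilities(degrees, theta):
    """Conditional-logit probabilities over candidate nodes.

    Assumed utility of node j: theta * log(degree_j). Degrees must be
    positive. theta = 1 recovers linear preferential attachment;
    theta = 0 gives uniform choice.
    """
    utilities = [theta * math.log(d) for d in degrees]
    max_u = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - max_u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(choices, theta):
    """Sum of log-probabilities of the observed choices.

    `choices` is a list of (degrees, chosen_index) pairs: the degrees of
    the candidate nodes at the time of the choice, and which one was picked.
    """
    return sum(
        math.log(choice_probabilities(degrees, theta)[chosen])
        for degrees, chosen in choices
    )

# Toy, made-up data: three edge-formation events over candidate nodes.
toy_choices = [
    ([1, 2, 5], 2),
    ([1, 3, 6], 2),
    ([2, 3, 7], 1),
]

# Crude grid search for the maximum-likelihood theta (illustration only).
grid = [i / 10 for i in range(0, 31)]
theta_hat = max(grid, key=lambda t: log_likelihood(toy_choices, t))
```

Other formation mechanisms would enter the same way, as extra terms in the utility (for example, an indicator for sharing a neighbor with the chooser), and a real analysis would use a proper optimizer or an existing conditional logit routine instead of a grid.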
Cultural integration following corporate mergers
We trace the cultural integration of three firms by analyzing email content before and after their mergers, and explore how the patterns of cultural assimilation that individuals follow after a merger relate to their subsequent career outcomes. This project is part of the Computational Culture Lab, in collaboration with Anjali Bhatt, Amir Goldberg, and Sameer B. Srivastava.
Unpublished · related paper
A large-scale analysis of racial disparities in police stops across the United States
Product data science at Airbnb
I was at Airbnb from 2012 to 2016, as the first data scientist working on product. I worked with teams across the company, but my main focus was search. I also worked on data tools, including the experimentation platform and a system to share knowledge and findings within the company. During my time there I wrote a few external blog posts, on search, experimentation, and the knowledge repository.
Trust in the CouchSurfing community
I worked with CouchSurfing to study how trust is signaled and propagated in the community. We found linguistic markers that are highly predictive of distrust, and that trust relations are transitive enough to improve predictions of the valence of new ties. This work was done with Bogdan State, Ellery Wulczyn, and Chris Potts.
MSc thesis, 2012 · Poster at ICWSM 2012 · paper