LLM Writing Distortion

This repository accompanies the paper:
How LLMs Distort Our Written Language
by Marwa Abdulhai, Isadora White, Yanming Wan, Ibrahim Qureshi, Joel Z. Leibo, Max Kleiman-Weiner, Natasha Jaques.

Overview

We study how LLMs distort written language across three evaluation settings:

Human Evaluation: A randomized controlled trial in which participants wrote an argumentative essay with or without access to an LLM.
ArgRewrite Analysis: Using a dataset of 86 human-written argumentative essays with expert feedback, collected in 2021 (before the release of ChatGPT), we run a counterfactual analysis comparing human revisions to revisions produced by three frontier LLMs (gpt-5-mini, gemini-2.5-flash, claude-3.5-haiku) across five revision types (general, minimal, grammar, completion, expansion).
ICLR Analysis: An analysis of the strengths and weaknesses of 18k peer reviews from ICLR 2026 (a venue where 21% of reviews were found to be LLM-generated).

Repository Structure

llm_writing_distortion/
├── iclr_analysis/          # Diversity and homogenization analysis on ICLR peer reviews
├── argrewrite_analysis/    # Analysis of LLM-assisted revisions on the ArgRewrite corpus
├── ArgRewrite/             # ArgRewrite dataset files
├── NRC-Emotion-Lexicon/    # NRC Emotion Lexicon for sentiment and emotion scoring
├── human_evaluation/       # Human evaluation data and analysis notebooks
├── requirements.txt
└── README.md

Quick Start

Installation

We recommend setting up a clean conda environment:

git clone https://github.com/abdulhaim/llm_writing_distortion
cd llm_writing_distortion
conda create --name writing_distortion python=3.10
conda activate writing_distortion
pip install -r requirements.txt
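
After installation, a quick sanity check can confirm that the packages listed under Dependencies are importable in the active environment. The snippet below is a minimal sketch and is not part of the repository. The import names are assumptions based on the Dependencies list (scikit-learn is imported as `sklearn`), and `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Import names assumed from the Dependencies list; scikit-learn imports as "sklearn".
REQUIRED_MODULES = ["pandas", "numpy", "matplotlib", "sklearn", "sentence_transformers"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("All dependencies found." if not missing else f"Missing: {missing}")
```

If anything is reported as missing, rerunning the pip install step inside the activated conda environment is the first thing to try.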

Running the Analysis

Each analysis subdirectory contains Jupyter notebooks that can be run independently:

human_evaluation/: Open the notebook to inspect human evaluation results comparing human-written and LLM-assisted writing samples.
argrewrite_analysis/: Open the notebook to analyze content and emotional shifts across revision stages in the ArgRewrite corpus.
iclr_analysis/: Open the notebook to reproduce diversity metrics (lexical and semantic homogenization scores) across writing conditions for the ICLR 2026 peer reviews.
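
To give a feel for what a lexical homogenization score can look like, here is a minimal sketch that averages the Jaccard overlap between the vocabularies of every pair of texts in a group. This is an illustrative stand-in, not necessarily the metric used in the notebooks. The function names and the simple regex tokenizer are assumptions made for the example.

```python
from itertools import combinations
import re


def tokenize(text):
    """Lowercase a text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def mean_pairwise_jaccard(texts):
    """Average Jaccard overlap between the vocabularies of every pair of texts.

    Higher values mean the texts share more vocabulary (more homogeneous).
    Returns 0.0 when fewer than two texts are given.
    """
    vocabularies = [tokenize(t) for t in texts]
    scores = []
    for a, b in combinations(vocabularies, 2):
        union = a | b
        scores.append(len(a & b) / len(union) if union else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Compare two hypothetical groups of texts.
group_a = ["the cat sat", "the dog sat"]
group_b = ["the cat sat", "a bird flew away"]
print(mean_pairwise_jaccard(group_a), mean_pairwise_jaccard(group_b))
```

A semantic counterpart would replace the token sets with sentence embeddings (for example from sentence_transformers) and average pairwise cosine similarity instead. That variant is omitted here because it needs a downloaded model.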

Datasets

ArgRewrite-v2

The ArgRewrite-v2 corpus (Chen et al., 2022) contains 86 argumentative essays written by university students in 2021, each paired with expert feedback and a human-revised second draft. Because this dataset predates the release of ChatGPT, it enables a clean counterfactual comparison: what would a human have written versus what an LLM produces given the same essay and the same expert feedback.

NRC Emotion Lexicon

The NRC Word-Emotion Association Lexicon (Mohammad and Turney, 2013) maps English words to eight basic emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, trust) and two sentiments (positive, negative). We use it to quantify shifts in affective tone induced by LLM revisions relative to human edits.
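
Lexicon-based emotion scoring can be sketched as follows. This example assumes a word-level file layout of one `word<TAB>emotion<TAB>flag` triple per line, where a flag of 1 marks an association. The sample entries are toy values written for illustration, not copied from the real lexicon, and `parse_lexicon` and `emotion_profile` are hypothetical helper names rather than functions from this repository.

```python
from collections import Counter
import re

# Toy entries in the assumed "word<TAB>emotion<TAB>flag" layout.
# Illustrative only: these lines are not taken from the real lexicon.
SAMPLE_LEXICON = (
    "happy\tjoy\t1\n"
    "happy\tpositive\t1\n"
    "happy\tfear\t0\n"
    "threat\tfear\t1\n"
    "threat\tnegative\t1\n"
)


def parse_lexicon(raw):
    """Map each word to the set of emotions whose flag is 1."""
    lexicon = {}
    for line in raw.splitlines():
        parts = line.strip().split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        word, emotion, flag = parts
        if flag == "1":
            lexicon.setdefault(word, set()).add(emotion)
    return lexicon


def emotion_profile(text, lexicon):
    """Return, for each emotion, the share of tokens associated with it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    total = len(tokens) or 1  # avoid division by zero on empty text
    return {emotion: n / total for emotion, n in counts.items()}


lexicon = parse_lexicon(SAMPLE_LEXICON)
profile = emotion_profile("Happy happy threat day", lexicon)
```

Comparing the profile of a human revision with the profile of an LLM revision of the same essay, emotion by emotion, gives a simple view of the tonal shift.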

LIWC-22

The Linguistic Inquiry and Word Count (LIWC-22) tool (Boyd et al., 2022; Tausczik and Pennebaker, 2010) categorizes words across 90+ dimensions, including summary variables (analytic thinking, clout, authenticity), grammatical categories (pronouns, prepositions), and psychological processes (cognitive mechanisms, social processes). We use it to measure shifts in analytical thinking style and authenticity between human-written and LLM-edited text.

ICLR 2026 Peer Reviews

Peer reviews from ICLR 2026, with LLM-generation labels from the Pangram AI classifier. We analyze 18k reviews drawn from papers that received exactly one fully human-written and one fully LLM-generated review, ensuring unbiased sampling across conditions.
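
The matched-sampling rule described above can be sketched in a few lines. The input layout (tuples of paper ID, label, and review text) and the label strings "human" and "llm" are assumptions made for this example, not the format of the released data. Reviews with any other label are ignored here, which is also an assumption.

```python
from collections import defaultdict


def matched_papers(reviews):
    """Keep papers with exactly one human and exactly one LLM review.

    `reviews` is an iterable of (paper_id, label, text) tuples, where label
    is "human" or "llm" (hypothetical names for the classifier output).
    Returns {paper_id: (human_text, llm_text)}.
    """
    by_paper = defaultdict(lambda: {"human": [], "llm": []})
    for paper_id, label, text in reviews:
        if label in ("human", "llm"):  # other labels are ignored
            by_paper[paper_id][label].append(text)
    return {
        paper_id: (groups["human"][0], groups["llm"][0])
        for paper_id, groups in by_paper.items()
        if len(groups["human"]) == 1 and len(groups["llm"]) == 1
    }


# Hypothetical reviews: only p1 has exactly one review of each kind.
reviews = [
    ("p1", "human", "a"),
    ("p1", "llm", "b"),
    ("p1", "mixed", "g"),
    ("p2", "human", "c"),
    ("p2", "human", "d"),
    ("p2", "llm", "e"),
    ("p3", "llm", "f"),
]
pairs = matched_papers(reviews)
```

Pairing reviews at the paper level means both conditions are compared on the same submissions, so differences in review content are less likely to reflect differences in the papers themselves.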

Dependencies

pandas
numpy
matplotlib
scikit-learn
sentence_transformers

Citation

If you use this code or build on this analysis, please cite the associated work.