Studying how LLMs reason, decide, and take risks
I run behavioral experiments on language models, using game theory and causal inference to understand how prompts shape AI decision-making. My work connects economics, alignment, and empirical AI research.
LLM behavioral alignment through the lens of economics
Empirical study across 283 prompt permutations identifying which phrases cause LLMs to embody scenarios versus observe them, and how this shifts strategic behavior.
15,000+ observations showing LLMs are 73.2% more likely than humans to take risks when decisions affect others.
When LLMs play behavioral games, their strategic choices shift with the persona they adopt. This project maps which demographic dimensions drive those shifts.
Quasi-experimental analysis of zero-bail policy reform, using regression discontinuity (RDD) and difference-in-differences (DiD) designs to measure real-world outcomes of criminal justice interventions.
Tools and systems I've built
Conversational agentic RAG using PERMA+4 framework to optimize long-term STEM engagement. LLM orchestration + Discord.
Annotation software for YOLO training built on the Segment Anything Model (SAM). Bootstraps object detection with CLIP transformer embeddings.
Agent-based simulation of free market dynamics. Models competitive equilibrium, price discovery, and emergent behavior.
Serves 150M police stop records at 280 req/s. Python + Rust. Saved $200k/yr in API costs for criminal justice research.
Web interface for managing behavioral experiments. Annotation, survey, and data collection pipeline for MTurk and students.
Conversational game built on LLMs. Interactive dialogue exploring narrative generation and player-AI interaction.
Autonomous equipment rental on distributed ledgers. Raspberry Pi + IOTA. Won IOT2Tangle Hackathon.
Multi-agent bot swarm with a command-and-control (C&C) server. Recorded human mouse movements for realism. Docker isolation, custom scripting DSL.
Multiplayer educational capture-the-flag teaching web crawling. Matter.js + WebGL frontend, Flask backend.