deep ganguli

research scientist at anthropic

about me

I am a research scientist at Anthropic, where I started and lead the societal impacts team. Prior to that, I was the founding research director at the Stanford Institute for Human-Centered AI (HAI). I did my PhD in Computational Neuroscience at NYU and obtained my BS in Electrical Engineering and Computer Science (EECS) from Berkeley. For fun, I surf, play bass with my band, and read widely across the humanities and social sciences. Here's some of my research and some media coverage of it.

research

Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions
COLM (last author)
Apr 2025
Many-shot Jailbreaking
NeurIPS
Dec 2024
The Capacity for Moral Self-Correction in Large Language Models
arXiv (first author)
Feb 2023
Red Teaming Language Models to Reduce Harms
arXiv (first author)
Aug 2022
Constitutional AI: Harmlessness from AI Feedback
arXiv (middle author - evals)
Dec 2022
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
arXiv (middle author - evals)
Apr 2022

Measuring AI Agent Autonomy in Practice
Anthropic Research Blog (last author)
Feb 2026
Evaluating and Mitigating Discrimination in Language Model Decisions
NeurIPS Algorithmic Fairness Workshop (last author)
Dec 2024
Evaluating Feature Steering: A Case Study in Mitigating Social Biases
Anthropic Research Blog (last author)
Oct 2024
Towards Measuring the Representation of Subjective Global Opinions in Language Models
COLM (last author)
Oct 2024
Measuring the Persuasiveness of Language Models
Anthropic Research Blog (last author)
Apr 2024
Challenges in Evaluating AI Systems
Anthropic Research Blog (first author)
Oct 2023
Discovering Language Model Behaviors with Model-Written Evaluations
ACL (middle author - evals)
Jul 2023
Beyond the Imitation Game: Quantifying and Extrapolating the Capabilities of Language Models
TMLR (middle author - evals)
May 2022

Introducing Anthropic Interviewer: What 1,250 professionals told us about working with AI
Anthropic Research Blog (last author)
Dec 2025
How AI Is Transforming Work at Anthropic
Anthropic Research Blog (last author)
Dec 2025
Which Economic Tasks are Performed with AI? Evidence from Millions of Claude Conversations
arXiv (last author)
Feb 2025
Anthropic Economic Index: Understanding AI's Effects on the Economy Over Time
Anthropic Economic Index (last author)
Feb 2025

The Impact of Advanced AI Systems on Democracy
Nature Human Behaviour (middle author)
Oct 2025
Collective Constitutional AI: Aligning a Language Model with Public Input
FAccT (last author)
Jun 2024
Testing and Mitigating Elections-related Risks from AI
Anthropic Research Blog
Jun 2024
Opportunities and Risks of LLMs for Scalable Deliberation with Polis
arXiv (last author)
Jun 2023

What 81,000 People Want from AI
Anthropic Features (last author)
Mar 2026
How People Use Claude for Support, Advice, and Companionship
Anthropic Research Blog (last author)
Jun 2025
Predictability and Surprise in Large Generative Models
FAccT (first author)
Jun 2022
The AI Index 2021 Annual Report
arXiv
Mar 2021
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
arXiv (last author)
Feb 2021

Neural and Perceptual Signatures of Efficient Sensory Coding
arXiv (first author)
Feb 2016
Efficient Sensory Encoding and Bayesian Inference with Heterogeneous Neural Populations
Neural Computation (first author)
Oct 2014
Implicit Encoding of Prior Probabilities in Optimal Neural Populations
NeurIPS (first author)
Dec 2010

Clio: Privacy-Preserving Insights into Real-World AI Use
arXiv (last author)
Dec 2024
Starfish: Open Source Image Based Transcriptomics and Proteomics Tools
Journal of Open Source Software (first author)
Jun 2020
Druid: A Real-time Analytical Data Store
ACM SIGMOD
Jun 2014

media

It's their job to keep AI from destroying everything
The Verge
Dec 2025
Societal Impacts Team
What if We Could All Control AI?
The New York Times
Oct 2023
Collective Constitutional AI: Aligning a Language Model with Public Input
The 3 Most Important AI Innovations of 2023
Time Magazine
Dec 2023
Collective Constitutional AI: Aligning a Language Model with Public Input
What if Dario Amodei is Right About AI?
The Ezra Klein Show
Apr 2024
Measuring the Persuasiveness of Language Models
The Unpredictable Abilities Emerging From Large AI Models
Quanta Magazine
Mar 2023
Predictability and Surprise in Large Generative Models

AI hallucinations haunt users more than job losses
Financial Times
Mar 2026
What 81,000 People Want from AI
Does A.I. Need a Constitution?
The New Yorker
Mar 2026
Collective Constitutional AI: Aligning a Language Model with Public Input
I Used Anthropic's Interviewer Tool to Share My AI Complaints
ZDNet
Dec 2025
Introducing Anthropic Interviewer: What 1,250 professionals told us about working with AI
Anthropic Will Start Using AI to Interview Its Users About Their Experience with AI
The Verge
Dec 2025
Introducing Anthropic Interviewer: What 1,250 professionals told us about working with AI
Anthropic Turns Inward to Show How AI Affects Its Own Workforce
Inc
Dec 2025
How AI Is Transforming Work at Anthropic
AI is making the workplace lonelier
Axios
Dec 2025
How AI Is Transforming Work at Anthropic
Anthropic just analyzed 700,000 Claude conversations and found its AI has a moral code of its own
VentureBeat
Apr 2025
Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions
Exclusive: Anthropic's Index Tracks AI Economy
Axios
Feb 2025
Which Economic Tasks are Performed with AI? Evidence from Millions of Claude Conversations
How Claude Uses AI to Identify New Threats
Platformer
Dec 2024
Clio: Privacy-Preserving Insights into Real-World AI Use
Anthropic Leads Charge Against AI Bias and Discrimination With New Research
VentureBeat
Dec 2023
Evaluating and Mitigating Discrimination in Language Model Decisions
When Hackers Descended to Test AI, They Found Flaws Aplenty
The New York Times
Aug 2023
Red Teaming Language Models to Reduce Harms
Language Models Might be Able to Self-correct Biases—If You Ask Them
MIT Technology Review
Mar 2023
The Capacity for Moral Self-Correction in Large Language Models
OpenAI & Stanford Researchers Call for Urgent Action to Address Harms of LLMs Like GPT-3
VentureBeat
Feb 2021
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
Starfish Enterprise: Finding RNA Patterns in Single Cells
Nature
Aug 2019
Starfish: Open Source Image Based Transcriptomics and Proteomics Tools
