Shlok Channawar

Poking around inside language models.

vol. i

Current

—junior @ Penn State, applied data science
—researching mech interp + AI safety
—AI safety fellow @ BlueDot
—building interp tooling for finance

Mostly I'm trying to figure out what models actually represent inside, and whether we can steer those representations. When I'm not doing that: poker, chess, and pointing a camera at the night sky.

The Orion Nebula — one of my astrophotography shots

Research

Algoverse AI Research

AI Researcher

2025 – 2026 · Remote

—co-first authored "Look Before You Steer" on SAE feature steerability
—accepted to the ICML 2026 Mechanistic Interpretability Workshop

BlueDot Impact

AI Safety Fellow

2025 · Remote

—thinking carefully about alignment and what it takes to make models safe

Projects

my favorite

—extends Anthropic's Natural Language Autoencoders to investigate whether Qwen2.5-7B internally represents privacy violations before its outputs reveal them
—found that deflection is pre-committed before generation (AUC 0.89), while the leak signal emerges mid-output, around token 42
—probing LLM internals for contextual integrity

Predictions

—we'll understand a frontier model's internals before we can fully control them
—interpretability becomes a standard part of every serious safety case by 2030

Notes

—the inside of a model is more interesting than its outputs
—most of research is just asking a better question
—you learn the most by trying to break your own results
—i hate dave's hot chicken

Education

Penn State · College of IST

Applied Data Science

2023 – Present · State College, PA

—originally in mechanical engineering, switched to applied data science in fall 2025 after getting into AI research

Contact

State College, PA · originally Nagpur, India

email: shlokchannawar05@gmail.com

github: github.com/shlok1808

x: x.com/shlok_ch

linkedin: linkedin.com/in/shlok-channawar

chess: andrej_karpathys_hair

résumé: view · download