Thoughts

2025

  1. RL environment specs
  2. The inverse 80-20 rule
  3. Method-driven vs problem-driven research
  4. AI research is a max-performance domain
  5. ATLA is the ultimate benchmark
  6. Measurement is all you need
  7. AlphaEvolve is thought-provoking
  8. Binary-choice questions for AI research taste
  9. The craziest chain-of-thought
  10. The best hard-to-solve easy-to-verify benchmark
  11. Flavors of AI for scientific innovation
  12. Debugging-prioritized AI research
  13. When scientific understanding catches up with models
  14. Butterfly effect of AI researchers’ backgrounds
  15. Benchmarks quickly get saturated
  16. Deep browsing models
  17. Unstoppable RL optimization vs unhackable RL environment
  18. Dopamine cycle in AI research
  19. Find the right dataset

2024

  1. Biggest lessons in AI in past five years
  2. Solving hallucinations via self-calibration
  3. Cooking with AI mindset
  4. OpenAI o3
  5. Value of safety research
  6. RL all the time
  7. People who influenced me
  8. Transition to AI for science
  9. Information density & flow of papers
  10. CoT before and after o1
  11. SimpleQA
  12. The o1 paradigm
  13. Inspiring words from a young OpenAI engineer
  14. Levels and expectations
  15. Bet on AI research experiments
  16. History of Flan-2
  17. When I don’t sleep enough
  18. Thinking about history makes me appreciate AI
  19. Advice from Bryan Johnson
  20. Sora is like GPT-2 for video generation
  21. A typical day at OpenAI
  22. Yolo runs
  23. Uniform information density for CoT
  24. Inertia bias in AI research
  25. Compute-bound, not headcount-bound
  26. Magic of language models
  27. Why you should write tests
  28. Co-founders who still write code

2023

  1. Hyung Won
  2. Read informal write-ups
  3. Relationship board of directors
  4. Reinventing myself
  5. Good prompting techniques
  6. 10k citations
  7. Manually inspect data
  8. Language model evals
  9. Amusing nuggets from being an AI resident
  10. When to use task-specific models
  11. Benefits of pair programming
  12. Many great managers do IC work
  13. Why I’m 100% transparent with my manager
  14. My girlfriend is a reward model
  15. Better citation metrics than h-index
  16. My strengths are communication and prioritization
  17. Emergence (dunk on Yann LeCun)
  18. UX for researchers
  19. My refusal
  20. The evolution of prompt engineering
  21. Prompt engineering battle
  22. Incumbents don’t have a big advantage in AI research
  23. Potential research directions for PhD students
  24. Best AI skillset

2022

  1. Add an FAQ section to your research papers
  2. Prompt engineering is black magic
  3. What work withstands the bitter lesson
  4. A skill to unlearn
  5. Advice on choosing a topic