Research Methodology: Tips for Empirical Alignment Research
Based on Ethan Perez's post. Applies to highly experimental LLM alignment research (scalable oversight, adversarial robustness, chain-of-thought faithfulness, process-based oversight, model organisms of misalignment).
Core Principles
1. Hard Work Pays Off
- Run as many experiments as you can: tinker a lot and try many things
- Many reasonable-sounding ideas need testing, and sometimes the 5th or 20th thing is what works
- "Rich get richer" effect: the more you experiment, the better you get at picking the right experiments
- More things work when you try more things, which makes it easier to stay motivated
2. Rapid Iteration (Velocity)
- Reduce uncertainty at the fastest possible rate (Jacob Steinhardt)
- Get quick feedback and iterate on ideas rapidly
- With LLMs, you can reduce uncertainty very quickly, sometimes with a single message to GPT-4/Claude
- Can gain 1+ orders of magnitude (OOMs) more information per unit time by derisking ideas in the quickest way possible
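To make the "1+ OOMs" point concrete, here is a minimal arithmetic sketch. It is an illustration, not something taken from the original post: it assumes "information" can be roughly approximated by the number of ideas tested one at a time, and the time costs (an 8-hour run versus a 30-minute check) and the function names are hypothetical.

```python
import math


def ideas_tested(hours_available: float, hours_per_test: float) -> int:
    """Number of ideas that can be tested sequentially within a time budget."""
    return int(hours_available // hours_per_test)


def ooms_gained(slow_hours_per_test: float, fast_hours_per_test: float) -> float:
    """Orders of magnitude of speedup from switching to the faster test."""
    return math.log10(slow_hours_per_test / fast_hours_per_test)


# Hypothetical numbers, chosen only for illustration.
budget_hours = 40.0  # one working week
slow_test = 8.0      # e.g. a full training run per idea (assumed cost)
fast_test = 0.5      # e.g. a quick prompting check per idea (assumed cost)

print(ideas_tested(budget_hours, slow_test))       # ideas tested with the slow method
print(ideas_tested(budget_hours, fast_test))       # ideas tested with the fast method
print(round(ooms_gained(slow_test, fast_test), 2)) # speedup in orders of magnitude
```

Under these assumed costs, the fast check tests 16 times as many ideas in the same week, a bit over one order of magnitude. Real gains depend on how informative the quick test actually is.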
3. Avoid the "Swamp"
- The swamp = getting stuck when you really want technique X to work but nothing succeeds
- Solution: high velocity. Test as many ideas as possible per unit time until you escape
- For LLM research: you should rarely be stuck in a swamp
- If stuck, you've likely exhausted the low-hanging fruit on that problem/approach
- There's other low-hanging fruit elsewhere, so go pick that instead
4. Low-Hanging Fruit Abundance
- The field is moving so quickly that there's much more low-hanging fruit than in almost any other field
- Each new model capability is a chance to be opportunistic and explore what's now possible
- High velocity is useful for both:
  - Picking low-hanging fruit quickly
  - Getting through swamps when you must solve a particular problem
Collaboration & Communication
Strong Collaborator Qualities
Ethan puts 70% weight on "getting ideas to work quickly" as a criterion. Other qualities:
- Receptive to feedback
- Adds emotional energy rather than draining it
- Transparent/communicative about issues faced
- High-trust relationship where various topics can be discussed easily
- Notices and calls out room for improvement in collaboration
Communication Best Practices
- Overcommunicate: bring up issues during meetings or privately to nip problems in the bud
- Close mentorship may be the fastest path to becoming an expert in a domain
- Take agency: organize coworking, discussion groups, standups, or whatever else helps
- Get feedback from peers and iterate on project plans
Research Workflow
- Have a clear project plan with motivation and research goals
- List all experiments you can think of running
- Think about milestones and deliverables to stay accountable
- Know what tools are available: sharing tooling increases experimental velocity
Paper Writing
- See ethanperez.net/easy-paper-writing-tips/ for ML paper writing tips
- Write short paragraphs composed of short sentences
- Write comprehensive abstracts
- Seek feedback from a naive audience
- Distinguish confirmation from exploration in your writeup
Related Resources
- Tips and Code for Empirical Research Workflows
- Jacob Steinhardt's blog on reducing uncertainty
- Michael Bernstein's slides on velocity