- Two-column run selectors with experiment→run cascading dropdowns and URL state sync
- Config diff with color-coded same/changed/added/removed entries using key-level comparison
- Line-level LCS response diff with added/removed/same highlighting
- Score comparison with overlaid indigo/emerald bars per scorer
- Pick Winner buttons submit human_preference score via API
- Full RunCard detail view for each run side by side
- 15 tests added (5 diff helper unit tests + 10 component integration tests)
- App.test.tsx updated to mock experiments.list for ComparePage route