Cards first
How I balanced Blood Rogue with a unit of value and ten thousand simulated fights, and why “measure it” is the easy half of the lesson.
In my last post I wrote about boss design in Blood Rogue. This one is about the layer underneath bosses, which is where the game was actually broken: the cards. It’s also the post where the game stuff and the day-job stuff stop being separable, so I’ll just let them merge.
The symptom
For a long stretch, Blood Rogue was too easy in a way I couldn’t fix by turning knobs. I’d raise enemy HP and the fights got longer but not harder. I’d buff a boss and one class would shrug it off while another hit a wall. The numbers I changed never seemed to move the thing I wanted to move.
When I finally instrumented the combat engine and ran fights in bulk, the picture was unambiguous and a little embarrassing:
- The hero was stranding an average of 12.8 energy per fight, playing two or three cards a turn and then sitting on the rest, because a single card already won the turn.
- The Druid summoner build killed bosses while the hero literally stood still: one cheap summon out-damaged three additional plays.
- A particular two-class combo produced an 854-guard, 621-damage single turn: an entire boss deleted before it acted once.
- One outlier, Sorceress Lightning, sat at a 91% win rate and felt right. Its damage cards happened to be priced close to correct, so it was the only archetype playing the game as intended.
None of these are difficulty problems. They’re pricing problems. The game wasn’t too easy so much as mispriced, and difficulty was just where the symptom happened to surface.
A unit of value
You cannot tune a system you can’t measure, and you can’t measure one without a unit. So before touching anything I defined one: EVP, Effective Value Points, where 1 EVP is roughly 1 point of damage to a single enemy at the baseline class. Each point of guard is worth 0.8 EVP (defense is tactically narrower: it expires, damage doesn’t). Area damage is worth 1.4x, capped at three targets so it can’t bill you for enemies that aren’t there. A debuff is worth the damage it converts. Every effect in the game reduces to EVP.
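As a rough illustration of what "reduces to EVP" means in practice, here is a minimal Python sketch. It is not the game's actual code: the effect labels and the `card_evp` helper are made up for this example, and it only covers single-target damage and guard, since the exact formulas for area damage and debuffs aren't spelled out here.

```python
# Weights come from the definitions above; the labels are hypothetical.
EVP_PER_POINT = {
    "damage": 1.0,  # 1 EVP is roughly 1 point of single-target damage
    "guard": 0.8,   # guard expires, so each point is worth a bit less
}


def card_evp(effects):
    """Sum the EVP of a card described as (kind, amount) pairs."""
    return sum(EVP_PER_POINT[kind] * amount for kind, amount in effects)


# A hypothetical card: 6 damage plus 5 guard.
example_card = [("damage", 6), ("guard", 5)]
print(card_evp(example_card))  # 6 * 1.0 + 5 * 0.8
```

The point of the sketch is only that every effect gets a weight and a card is the sum of its effects. Anything with conditions or scaling would need its own rule.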
With a unit, a card has a price you can check. I stole the baseline wholesale from Slay the Spire, whose upgraded common attack, 9 damage for 1 energy, is the de-facto unit of value the whole game is calibrated around. From there the cost bands fall out: a 1-cost card should pay about 1x baseline, a 2-cost about 1.5-2x, a 3-cost about 2.5-3.5x, and a 0-cost is deliberately worse than a 1-cost, buying you tempo rather than raw value. A card that pays 30 EVP at 2 energy isn’t a treat; it’s a bug that auto-includes itself and collapses every deck into the same deck.
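The cost bands can be turned into a mechanical check. The sketch below is one way to do it in Python, under a few assumptions of mine: the 10% tolerance (standing in for "about"), the choice to cap a 0-cost just below a 1-cost, and the names `band_for` and `classify` are all illustrative rather than anything from the real tooling.

```python
BASELINE_EVP = 9.0  # the baseline above: 9 damage for 1 energy

# (low, high) multiples of baseline per energy cost, from the bands above.
COST_BANDS = {
    1: (1.0, 1.0),
    2: (1.5, 2.0),
    3: (2.5, 3.5),
}

TOLERANCE = 0.10  # assumption: how much slack "about" gets


def band_for(cost):
    """Return the (low, high) EVP band for an energy cost."""
    if cost == 0:
        # Assumption: "deliberately worse than a 1-cost" means the ceiling
        # sits just under baseline, with no floor.
        return 0.0, BASELINE_EVP * (1 - TOLERANCE)
    low, high = COST_BANDS[cost]
    return (low * BASELINE_EVP * (1 - TOLERANCE),
            high * BASELINE_EVP * (1 + TOLERANCE))


def classify(cost, evp):
    """Label a card as 'underpays', 'in band', or 'overpays' for its cost."""
    low, high = band_for(cost)
    if evp < low:
        return "underpays"
    if evp > high:
        return "overpays"
    return "in band"


print(classify(2, 30))  # the 30-EVP 2-cost from the text
print(classify(1, 9))   # a card priced exactly at baseline
```

Costs above 3 raise a `KeyError` here because no band is given for them.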
Run the actual cards through that table and the diagnosis writes itself: the over-paying 1-costs were the reason the hero stranded energy, the reason summons trivialized bosses, the reason one combo could end a fight in a turn.
Cards first, downstream second
Here’s the part that took me longest to accept, and the part that generalizes hardest. Enemy HP, gear bonuses, charm effects, potions: all of it is priced relative to what a card does. A +5 attack ring is a massive upgrade if a baseline card hits for 6 and an irrelevant rounding error if it hits for 20. Boss HP is “right” only relative to the hero’s real damage per turn, which is just card output averaged over a fight.
Which means there is a strict order of operations. Fix the cards first. If you tune enemy HP while the card baseline is wrong, you’re calibrating the whole game against a broken ruler, and every later pass is patching symptoms of the thing you refused to fix. Get the cards right and most of the downstream numbers can stay where they are. Get them wrong and you’ll re-tune the game forever, one whack-a-mole at a time.
To check the work, I built a simulation harness: thousands of headless fights per change, reporting win rate, fight length, and ending hero HP per class and per difficulty. The bar isn’t “did win rate move.” It’s “does this card pull the fight into the target window”: five to eight turns, hero ending somewhere around 40-60% HP. A card that ends fights in three turns at 80% HP is underpriced in the player’s favor, no matter how fun it looks in isolation.
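The target-window check is easy to sketch. The Python below is a simplified stand-in for the reporting side of a harness like this, not the real thing: `FightResult`, `summarize`, and `in_target_window` are hypothetical names, and averaging ending HP over wins only is my assumption about how to treat fights the hero loses.

```python
from dataclasses import dataclass
from statistics import mean

TURN_WINDOW = (5, 8)       # target fight length in turns
HP_WINDOW = (0.40, 0.60)   # target ending hero HP as a fraction of max


@dataclass
class FightResult:
    won: bool
    turns: int
    hero_hp_fraction: float  # ending HP / max HP


def summarize(results):
    """Aggregate a non-empty batch of simulated fights."""
    wins = [r for r in results if r.won]
    return {
        "win_rate": len(wins) / len(results),
        "avg_turns": mean(r.turns for r in results),
        # Assumption: ending HP is only meaningful for fights the hero won.
        "avg_end_hp": mean(r.hero_hp_fraction for r in wins) if wins else 0.0,
    }


def in_target_window(summary):
    """True if average length and ending HP both land in the window."""
    lo_t, hi_t = TURN_WINDOW
    lo_hp, hi_hp = HP_WINDOW
    return (lo_t <= summary["avg_turns"] <= hi_t
            and lo_hp <= summary["avg_end_hp"] <= hi_hp)


# Hand-written results, not real sim output: fast, comfortable wins.
too_easy = [FightResult(True, 3, 0.8) for _ in range(10)]
print(in_target_window(summarize(too_easy)))  # False: outside both windows
```

Note that a batch like `too_easy` has a perfect win rate and still fails the check, which is the whole reason the bar is the window rather than the win rate.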
The half of the lesson that isn’t “measure it”
If the post stopped here it’d be the usual metrics sermon: define a unit, instrument the system, let the data tell you the truth. That’s the easy half, and on its own it’s dangerous.
The hard half is that the data lies, routinely, and you have to know its failure modes before you act on it. Slay the Spire’s designers tell a story about a card that looked broken in the numbers purely because of where players got it, not because it was strong. The win rate was an artifact of acquisition, not power. My simulator has its own version of this: a low win rate can mean a card is weak, or it can mean my sim’s autoplayer doesn’t know how to use it. Before I “fix” any outlier I now have to establish why it’s an outlier: is this a real signal, or an artifact of how I’m measuring? A number you can’t explain is not yet evidence.
There’s a second trap the sim taught me: optimizing purely for the numbers produces boring. A deck that wins by playing one guard card a turn for fourteen turns can post a great win rate and be miserable to play. The metric said “winning.” The game said “solved and joyless.” Both burst and grind are degenerate; they’re just degenerate at opposite ends, and no single number catches both. The measurement tells you where to look, never what good feels like.
That distinction is the whole reason I’m careful, in my actual job, about how I talk about developer-productivity metrics. I lean on them hard (release frequency, weighted diff signals, shipped outcomes) because flying blind is worse. But the number is an instrument, not a verdict. The moment a team starts optimizing the proxy instead of the thing the proxy was standing in for, you’ve recreated the one-guard-card-for-fourteen-turns deck, except now it’s a quarter of someone’s roadmap.
I went into Blood Rogue to balance a game. What it actually sharpened was the discipline of holding two things at once: measure everything you can, and never fully trust the measurement. Cards first, but taste, still, last.