I think this snippet from one of David Sirlin's articles on balance might be useful for our discussion.
Why don't we come up with tier lists based on this model for whatever we're talking about at the moment, whether it be leaders, traits, civilizations, units, etc.? That way we can find some commonality of opinion and quickly focus on what needs to be targeted, without the conversation going everywhere.
Sareln, since you're the one programming this, would you like to suggest what we focus on? That way we can direct the discussion toward whatever is most relevant for you at the moment.
Quote: David Sirlin from his article - Balancing Multiplayer Games, Part 3: Fairness
The Tier List
During the balancing of Street Fighter, Kongai, and my card game called Yomi, I used a similar approach with playtesters. I think this approach doesn't really depend on the genre, and the key idea is managing the tier list.
The term "tier list" is, I think, a term from the fighting game genre. It means a ranking of how powerful each character is from highest to lowest, but it also accepts that such a list cannot be exact. Instead of ranking 20 characters from 1 to 20, the idea is to group them together into "tiers" of power. Remember that if a divine being handed you a 100% perfectly balanced game, players would still make tier lists. You should accept the existence of these lists from players as a given, and it's your job to manage this list.
In Kongai and Yomi, I even gave the players a template for the tier list that is most useful for me as a designer. First, I tell them to think of three tiers: top, middle, and bottom. Then I tell them about the two "secret tiers" that I hope are empty.
0) God tier (no character should be in this tier, if they are, you are forced to play them to be competitive)
1) Top tier (don't be afraid to put your favorite characters here. Being top tier does not necessarily mean any nerfs are needed)
2) Middle tier (pretty good, not quite as good as top)
3) Bottom tier (I can still win with them, but it's hard)
4) Garbage tier (no one should be in this. Not reasonable to play this character at all.)
My first goal of balancing is to get the god tier empty. Of course some character will end up strongest, or tied for strongest, and that is ok. But a "god tier" character is so strong as to make the rest of the game obsolete. We have to fix that immediately because it ruins the whole playtest (and the game). Also, the power level of anything in the god tier is so high, that we can't even hope to balance the rest of the game around it.
My next goal is to get rid of the garbage tier characters. They are so bad that no one touches them, and it's usually pretty easy to increase their power enough to get them somewhere between top, middle, and bottom. If they are somewhere in those three tiers (which gives you a lot of latitude actually), at least they are playable.
Public Tier Lists
I really like it when playtesters all see each other's tier lists. The debate this spawns is very useful for me to read (or overhear in person) and for the playtesters to sort out their ideas. Sometimes when someone put a character unusually high or low on the list, I dug deeper to find out that player really did know something most of the rest of us didn't. Other times, that player is just crazy and the rest of the testers are happy to point that out. It's also good to see what kind of consensus the testers come up with, like if they all rank a certain character as the worst, for example.
The biggest landmark moments in each of the games I balanced were when the tester communities consistently gave tier lists with no characters in the god tier or garbage tier. Once you've achieved that, the next goal is to compress the tiers. That means that you want the difference between the best and worst characters to be as small as possible. Notice that that means even if you have the same characters in the bottom tier that you did a month ago, you might have dramatically improved the game if all those "bad" characters are really only a hair worse than the tier above, rather than way worse.
Adjusting the Tiers
In all the games I balanced, I used the same approach of letting the top tier set the benchmark power-level. In Street Fighter, I already had an established top tier as a starting point from the previous game, but in Kongai and Yomi, it was somewhat accidental who ended up in the top tier. But early on, after the god tier was removed and it was pretty clear which characters / decks were top, I allowed that to be the target power level. In other words, the characters in that tier are "how the game is supposed to be." Again, I didn't plan exactly who would be here, but I accepted how it ended up and worked with it. So if the top tier is the target, it's the bottom tier you should adjust the most. If the top tier is the intended power level, you don't really want to mess up the good things you have going there. Instead, boost the bottom characters up and compress the tiers as much as you can, so you get the worst characters just barely below or equal to the best characters.
There are some psychological factors that I saw over and over again while making these adjustments. The first is that whenever I make a move or character worse (aka "nerfing"), players overreact. Sometimes that top tier creeps a little too high in power, or an otherwise average character ends up having something unexpected that's crazily good, or a character has a move that really reduces the strategy in the game and needs to lose that in exchange for gaining something else. There's lots of reasons for nerfs.
I'll use some made-up numbers to convey the general idea here. Imagine a move is at power level 9 out of 10, and that's just too good for that character. Time and time again, I saw that if I made the power level an 8 out of 10, playtesters would complain that the move was worthless and put the character down at least one tier. This happened consistently, and even in the cases where 8 out of 10 was still too powerful and it really needed to be a 7. For some reason, players in every game seem unable to grasp the concept that a top tier character who is made slightly worse can still be a top tier character.
This is one of the cases where I think you just can't listen to the playtesters. Ignore their first reactions to nerfs, let them play it more and get used to it, let them see if they can still be successful with the new version of the move, then take their feedback on that move or character more seriously.
The other psychological effect to know about is what happens when you increase a move's power. I learned about this from Rob Pardo's lecture on balancing multiplayer games at the Game Developers Conference, and I tried it on all the games I balanced, and I think Rob is right. He said that if you have a move that you're not really sure how to balance, make it too powerful. If you make it too weak, then you run the risk of no one using it at all. Then, when you slightly increase its power, none of the testers will notice or care. They already decided that move is weak. Then if you make it slightly more powerful still, they still won't care. Even when you inch it up past the reasonable level of power, it's hard to get it on people's radar and that makes it really hard to know how to tune the move.
Instead, Pardo said to start with the move too powerful. Then everyone will know about it and care about it. I did exactly this with T.Hawk, Fei Long, and Akuma in Street Fighter HD Remix, because I had trouble figuring out their power levels. Each one of those characters was the best character in the game at some point in development, and that meant I got lots of feedback from testers about these characters. It also gave me a sense of where the top of the scale even was. Sometimes my "too powerful" versions of a character would end up waaaaay too good, or sometimes just barely too good. By knowing where the upper limit was, it helped me pick appropriate power levels more quickly. That said, I did have to deal with the inevitable cries that follow all nerfs, but that just goes with the territory here.
Illusions in Tiers
Another point from Rob Pardo's speech on multiplayer games was not to balance the fun out of things. I'm very conscious of this as well. Don't just think about the game as some abstract set of numbers that has to line up. You also have to think about how people will perceive it and whether it's actually fun. Pardo said that he likes the player to feel like the tools they have are extremely powerful, even though they are actually fair.
An example of this in one of my games is Tafari, the Trapper in Kongai. Tafari's main ability is that the enemy cannot switch characters while fighting him. Switching characters is one of the game's main mechanics, so fighting him is like playing rock, paper, scissors with no rock. It seems, at first glance, ludicrously powerful. But from the start, I gave Tafari several weaknesses and he loses many fights if he ends up having to fight on even footing. He's best when you bring him in against an already-weak character to finish them off.
I knew Tafari was not too powerful. I tested him with many experts and they tended to rank him as middle tier once they got the hang of him. As we added new testers over time, probably nearly 100% of them claimed that Tafari was too strong. I refused to change him though and after a year of testing, the best players still ranked him as middle tier, while inexperienced players still ranked him as top. Tafari is an illusion.
I'm telling you this because you have to be very careful with feedback in cases where you intentionally made something feel more powerful than it actually is. It's a success if you can pull that off though, because Tafari makes the game more interesting, creates lots of debates, and at the end of the day, he is balanced.
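If we do end up posting tier lists, something like the following could tally them. This is only a rough sketch in Python, not anything from Sirlin's article or from the mod's code: the function names, the tester names, and the "Trait A/B/C" entries are all made up for illustration. It assumes each person submits a number from 0 (god) to 4 (garbage) for each item, takes the median vote, flags anything whose median lands in the god or garbage tier, and reports how far apart the votes are so we can spot the items we disagree on.

```python
from statistics import median_low

# Tier numbers follow the template quoted above: 0 = god ... 4 = garbage.
TIER_NAMES = {0: "God", 1: "Top", 2: "Middle", 3: "Bottom", 4: "Garbage"}


def summarize(tier_lists):
    """Combine tier lists; tier_lists maps tester name -> {item: tier number}."""
    votes = {}
    for ranking in tier_lists.values():
        for item, tier in ranking.items():
            votes.setdefault(item, []).append(tier)
    return {
        item: {
            # median_low always returns one of the actual votes (an int).
            "median": median_low(tiers),
            # Gap between the highest and lowest vote; large means disagreement.
            "spread": max(tiers) - min(tiers),
        }
        for item, tiers in votes.items()
    }


def needs_attention(summary):
    """Items whose median vote is god tier or garbage tier (fix these first)."""
    return sorted(item for item, s in summary.items() if s["median"] in (0, 4))


# Hypothetical submissions, purely for illustration.
example_lists = {
    "tester1": {"Trait A": 0, "Trait B": 2, "Trait C": 4},
    "tester2": {"Trait A": 0, "Trait B": 1, "Trait C": 3},
    "tester3": {"Trait A": 1, "Trait B": 2, "Trait C": 4},
}

if __name__ == "__main__":
    summary = summarize(example_lists)
    for item, s in sorted(summary.items()):
        print(item, TIER_NAMES[s["median"]], "spread:", s["spread"])
    print("Fix first:", needs_attention(summary))
```

The median is used rather than the average so that one unusually high or low vote doesn't drag an item into another tier; the spread is there to point at exactly those outlier votes, which are worth a closer look.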


). Within the point value categories I've given each a value of between 1-5, as requested. I'm also going to ignore civ-specific traits (including Tolerant).