taw's blog

The best kittens, technology, and video games blog in the world.

Friday, October 14, 2016

Design patterns for parsing binary files in Ruby

Sand Cat by Tanya Durrant from flickr (CC-ND)

A while ago I wrote about how to analyze binary file formats.

Video games have a ton of custom binary file formats, and to mod them I kept writing parsers, which were in a way very similar.

All parsers were very similar in form, but they varied in too many details to make it possible to extract any viable gem from that.

So let's write a parser.

Just read the whole file

Even if file is fairly big, it's probably still far easier to just read it all in one go, and save yourself from the pointless file I/O.

require "pathname"
class FooParser
  def initialize(path)
    @path = Pathname(path)
    @data = @path.open("rb", &:read)
    @ofs = 0

We're saving the @path mostly to make exception messages we'll throw more meaningful.

@data is all file's contents - and we read it in binary mode ("rb") to deal with binary vs Unicode and Windows silliness.

@ofs is current offset, as we'll be processing data mostly in linear way.

Some common helper functions

These aren't necessary for all parsers, but it's fairly common to see code like this:

class FooParser
  def bytes_left
    @data.size - @ofs

  def eof?
    @data.size == @ofs

  def fail!(message)
    raise "#{message} at #{@path}:#{@ofs}"

Such functions aren't really doing parsing work, they're mostly there for debugging and for sanity checks.

Get some data

Now let's get some data:

class FooParser
  def get(n)
    fail! "Trying to read past end of file" if bytes_left < n
    result = @data[@ofs, n]
    @ofs += n

Most files are byte-oriented, and going through a helper function to get the next few bytes and manage offset and end of file will save us complexity elsewhere.

This kind of function is also a great place for cases where file format is unusual somehow. Vast majority of formats are byte-oriented and fit in memory just fine, but if your format in too big, or uses individual bits, or does anything else weird, you can generally encapsulate such logic in get method - as long as @ofs uniquely describes current location in the file.

Get integers

Most formats are full of numbers and strings, and it's useful to write get methods for all of them. Let's start with Integers as they're easiest.

I generally use concise byte count convention (so get_i4 means get signed integer with 4 bytes), but more verbose bit-based naming (like get_i32 or even get_int32) might be more to your liking.

For most formats there's standard #unpack for them, so their unpackers are very simple - usually the only difference is right unpack string based on big vs small endianness of your format:

class FooParser
  def get_u1

  def get_u2

  def get_u4

  def get_i4

Very rarely file format uses more unusual number like 3-byte or variable length integer. Their get_* methods will look different from format to format, but you should generally be able to encapsulate them.

Here's an examples - for 24-bit numbers we can just pad them with extra byte (0 for unsigned, either 0 or 255 for signed), and use proper unpack string for 32-bit numbers:

class FooParser
  def get_u3

  def get_i3
    s = get(3)
    prefix = (s[0].ord >= 128 ? "\xFF".b : "\x00".b)

Usually you can either add some padding, or just read them byte at a time and do some math to get the right number.

Get floats

This is slightly more complicated for two reasons - first the most common format in files is single precision, but Ruby uses double precision. Unpacking floats generates questionable extra precision:

[0.1].pack("f").unpack("f")[0] #=> 0.10000000149011612

Especially if you plan to present the result to users, we often don't want that - we want the "nicest looking" double precision number which still decodes to the same single precision number.

Now I'm sure there's some more "precise" algorithm, but this prettification method worked well:
class Float
  def pretty_single
    result = round(6)
    return result if [self].pack("f") == [result].pack("f")

What we do here is round it to 6 decimal places, check if it equals self when converted to single precision, and if so, return rounded result, otherwise return self. It works with infinities, NaN, -0 and all other weirdness.

With that it's easy to write single precision float getter:
class FooParser
  def get_flt

Second problem is that Ruby unpack still lacks half-precision floats, which are extremely popular in GPU world, so any data destined to go to GPU often uses it - or just formats trying to save on space.

In such case, just get_u2 and do bit manipulation yourself.

Get strings

Strings are just as ubiquitous as numbers, but there's a few more formats, including:
  • character count (ofter u2, 1 byte characters), followed by ASCII characters - get(get_u2)
  • character count (ofter u2, 2 byte characters), followed by UTF-16 characters - get(get_u2*2).unpack("v*").pack("U*")
  • NUL-terminated ASCII / UTF-8 - result = ""; result << c while (c = get(1)) != "\x00"; result
  • Constant size ASCII field (here 256 1-byte characters), padded by NULs - get(256).sub(/\x00*\z/, "")
  • Constant size UTF-16 field (here 256 2-byte characters), padded by NULs - get(512).pack("v*").unpack("U*").sub(/\x00*\z/, "")
Just create appropriate get_str method based on whichever encoding your format uses.

It's not uncommon for a single file format to have more than one string encoding - in such case create multiple such methods, or make it accept some argument to switch between them.

Get individual data structures

The generic part of the parser is done, so now for any data structure in your format you need to write appropriate parsing method, returning some Hash, Array, or proper object depending on the structure:

class FooParser
  def get_vector3d
    x = get_flt
    y = get_flt
    z = get_flt
    {x: x, y: y, z: z}

For more concise code you can often shortcut this thank to ruby evaluating arguments left to right (like most languages nowadays, but there were dark times when undefined or backwards evaluation conventions existed too):

class FooParser
  def get_vector3d
    {x: get_flt, y: get_flt, z: get_flt}

Complete the parser

And the last part is completing the parser. If the whole file consists of same data structures every time, just write some kind of get_everything following the same pattern and you're done.

Even then I'd recommend adding something like fail! "Parsing done, but #{bytes_left} bytes left" unless eof? sanity check just to be sure format is what we expected it to be - extra garbage bytes generally indicate that file is not quite in formats expected.

Of course things are rarely quite as simple. It's very common for file formats to have structure like:
  • header starting with some magic
  • header then includes some global data and counts for data structures to follow
  • after that individual data structures follow, based on data in the header

Example parser

To avoid complexity of real formats, let's try a fictional format with kittens and puppies:
  • magic marker "CUTE"
  • u4 number of kittens
  • u4 number of puppies
  • each kitten has star rating (float), number of lives left (u4) and photo url (str)
  • each puppy has star rating (float) and photo url (str)
class FooParser
  def get_kitten
    { rating: get_flt, lives_left: get_u4, url: get_str}

  def get_puppy
    { rating: get_flt, url: get_str}
  def parse!     fail! "Wrong magic number" unless get(4) == "CUTE"     kitten_count = get_u4     puppy_count = get_u4     result = {       kittens: kitten_count.times.map{ get_kitten },       puppies: puppy_count.times.map{ get_puppy },     }     fail! "Parsing done, but #{bytes_left} bytes left" unless eof?     result   end end

Offset-based formats

OK, that's one easy file format, but what if instead of data structures following each other, header contains a list of offsets instead, and data structures are at positions indexed.

That's going to take some jumping around, so let's write a helper for that:

class FooParser
  def at_ofs(ofs)
    saved_ofs, @ofs = @ofs, ofs
    @ofs = saved_ofs

Then it's fairly straightforward. In formats like that you usually won't end at end of file when you finish parsing, so that step is not necessary. Let's say our format is:
  • magic marker "CUTE"
  • u4 number of kittens
  • that many times u4 offset to kitten data structure (relative to beginning of file)
  • (rest of the file, presumably containing kitten data)

class FooParser
  def parse!
    fail! "Wrong magic number" unless get(4) == "CUTE"
    kitten_count = get_u4
    kittens = kitten_count.times.map do
      at_offset(get_u4) { get_kitten }
    {kittens: kittens}

There are weirder data structures, like offsets relative to current location etc. You can generally keep all the offset mangling in helper functions and keep your parsing code high level.


This is more a debugging thing than something you'll need to do often in actual parsing, but sometimes in your pry session you'd like to see if what's following is a kitten without affecting current parser state. It's very easy:

class FooParser
  def lookahead
    saved_ofs = @ofs
    @ofs = saved_ofs

Then in your pry session just try lookahead{ get_kitten } and it will give you a kitten (or throw an exception).


Writing Ruby-based parsers following patterns described in the post have been far more convenient and flexible than anything else I tried, but there are limitations:
  • you need to have some clue about the format, or at least data structures at its beginning - there's some debugging helpers, but no autodetection, and the only "partial parsing" you can conveniently do is for beginning of the file - if you know some data structures in the middle of the file, it will still be hard to find them and partially decode them without knowing the rest of the format
  • it's inherently one-way - you'll need to write a separate encoder, even though some clever DSL might have given you 90% of it for free. I found that overhead of highly abstracting DSLs is generally much higher than just writing it twice, and it's relatively easy to test that they match by round trip testing - just grab some samples, and check if decoding followed by encoding gets you the sample back.
  • its performance is OK, but not amazing - we're working with very high level language, and doing byte manipulation with a lot of layers of abstraction above it - it's actually quite surprising how fast it works, but it's still nowhere near as fast as a parser written in very low level language
  • it's designed for full parsing in one go, not random access - this is mostly performance issue, but if you want to just read (let alone change) a single kitten in file with thousands, this pattern won't support you
  • if the format is crazy complex, you'll probably need a lot more sophisticated parser - this pattern is great for simple to moderately complex formats

Wednesday, October 12, 2016

How I made mtg.wtf fast

red kitten 01 by mathias-erhart from flickr (CC-SA)
Website speed was one of the most common complaints about mtg.wtf, so I decided to do something about it.

It won't be necessary, but if you want more background, you can read about architecture of mtg.wtf in my previous post.

Verify the problem

The site seemed "fast enough" to me, but multiple users complained about its performance, so presumably there was something real about it.

I still wanted to make sure. It's important to verify that the problem you're trying to fix is real and on your side, as to not waste time fixing problems which might be caused by someone's crappy mobile connection or ISP throttling.

I checked Rails logs, and indeed - timing I've seen there were pretty poor. There wasn't even much need for statistical analysis - very simple queries shouldn't routinely took ~1000ms.

Have a test suite

Before I started any performance work, I needed to have a test suite to validate that I'm not breaking anything. Fortunately mtg.wtf was as TDD as it gets, and had very comprehensive tests - even if I'm retrospectively somewhat regretting using minitest for that.

If you don't have good test suite, don't despair - for optimizations you just need to find a reasonably representative sample of inputs (which you can use logs for), and record current version's outputs for them. If something broke in process, you'll get different results, so you'll instantly know.

It's not as good as dedicated test suite, as your random sample probably lacks important edge cases a proper test suite would generally have, but you get 80% of value for 20% of effort, and you can add a few test for edge cases as you go.

Generate a benchmark

This was easy enough - I downloaded logs, extracted list of queries from them, took a random sample, and tested time per request.

The best thing about this set was that I could very easily generate new benchmarks for specific types of queries with just grep - so when optimizing let's say card type searches, I can look for t:, which will give a much more accurate result than just throwing everything at it. (assuming my optimizations don't affect other query types of course)

As secondary benchmark I also recorded time to run the test suite. I didn't expect it to be super sensitive measurement, but it's nice to have extra validation for zero effort, and I was going to run the tests a lot anyway.

A small complication for any kind of benchmarking is that you want your computer to not be overly busy, or it will contaminate the results. It doesn't need to be completely idle, as computers have multiple cores and you'll generally be using only one core, but I closed things like Chrome and used my other computer for sucht things for a while.


This is optional, and profiling tools in ruby are fairly weak and awkward, but I ran ruby-prof anyway to get some idea about code hot spots.

It would identify which parts of the code were taking the most time, but it's mostly there to generate some suggestions.

Precompute data

Shockingly the biggest hot spot was very simple query ConditionExact, which corresponds to querying for exact card - with queries like !goblin guide or !Far // Away.

The query was actually very slow, as it was going through entire database, and doing something like this:
  • go through the whole database, every card printing
  • take query, downcase, fix unicode, fix spacing
  • take card.name, downcase, fix unicode, fix spacing
  • if they're the same, it's a match
That was really slow, and also quite silly. Card names don't have spacing problems (like double spaces between words), and we're already fixing unicode silliness like the "Æ" ligature when we load the database. As for the query, we can precompute this data when we parse it and create Condition object.

It still needed the card name downcase step as card name was something like Goblin Guide, but it needed to match goblin guide, GOBLIN GUIDE and whichever other capitalization people use.

So the algorithm became:
  • go through the whole database, every card printing
  • take preprocessed query
  • take card.name, downcase
  • if they're the same, it's a match
That's still fairly inefficient, but it skips huge amount of pointless string manipulation.

Similar preprocessing was later applied to a lot of other conditions.

Separate common and uncommon cases

Going back to ConditionExact, it was actually two different queries:
  • exact card name - like !Goblin Guide
  • multipart card name - like !Far // Away
The multipart case was much more complex, as search engine represents every part of a multipart card as a separate object, so it needed somewhat complex logic to handle this.

So I moved all rare and complex code to ConditionExactMultipart and I just had the commond and simpler case left to optimize.

In a few other cases I set a flag if certain complex processing was needed. For example we support queries like o:"when ~ enters the battlefield", where "~" matches name of the card - so for example it matches Zealous Conscripts as it contains text "When Zealous Conscripts enters the battlefield, ...".

Such queries require more complex logic than most o: (card text) queries, which just look for same text every time, and can do more preprocessing.

In this case we set a flag in ConditionOracle constructor and then follow different logic based on that flag.

This optimization only makes sense when simpler condition is also the common one - if most cases are complex, you're not really gaining much by splitting them.

Use indexes

Most queries use ConditionSimple, which just tests card printings one at a time with match? method.

The alternative is for condition to process whole database and generate Set of results in one go.

These two modes are combined by he most common ConditionAnd to first get all Sets, take their intersection, and then filter them with each ConditionSimple one at a time.

Direct result Set generation was already used for a few queries already - like searching for cards printed in a specific edition, or specific block. I added that to format search as well, as f:standard is reasonably common phrase in queries which benefits a lot from this kind of early culling. Bigger formats like f:modern or f:commander didn't significantly benefit from this optimization, but they didn't seem to suffer either, so I applied it everywhere.

Of course the biggest beneficiary of direct access was ConditionExact which got a downcased card name index, and can now result results almost instantly.

I didn't add any more indexes as ruby is very memory hungry and keeping memory use low is very important. Other queries were either much less common or wouldn't benefit anywhere near as much.

Optimize memory use

I wanted to take a quick go at reducing memory use, unfortunately ruby memory profiling tools are still very poor, so I just did a few intuitive things like removing unnecessarily duplicated data.

I pondered going further, but speed improvement I got for random sample of queries were already huge. I didn't need anything complex like adding caching, getting bigger box, trying JRuby, or moving search engine to external service.

Verify the solution

Even though changes looked really good in benchmark mode, it's important to verify them in production. So I checked the results, and indeed vast majority of requests are fast now.

There's a few which are still somewhat slow - like for example someone trying to get 52nd page of result of all cards in EDH, but it's a reasonably safe bet that most of such weird requests are made by bots.

Overall, it's been a fairly straightforward and successful process.

Monday, October 10, 2016

Architecture of mtg.wtf search engine

There's nothing revolutionary here, it's a reasonably straightforward design. All the code discussed can be seen in its repository.

What are we searching?

We're searching "cards", but that's not specific enough, as there are two big complications:
  • Some cards have multiple printings
  • Some cards have multiple parts
So we need to decide if our searchable object is:
  • one part, one printing
  • one part, all printings
  • all parts, one printing
  • all parts, all printings
Far // Away is an example of multipart card

Vast majority of cards have only one part, and a lot of the time card's properties are the same between printings, so we could get away with other choices, or even inconsistent semantics, but I've chosen to operate on most granular "a single part of a single printing of a card".


Raw data comes mostly from mtgjson, but first step is converting into a more usable form, which is actually just more json. I keep both raw and processed json files in repository, mildly annoying github, but in this case it's best practice as tests can't run without that, so code is incomplete without those jsons.

Indexer does some transformations and validations, and groups data by properties which should be same between cards (like name, text, cost, color) and those which can vary (like artist, rarity, flavor text).

When I started the project mtgjson data was of high quality, but it got a lot worse since it got new maintainers, so now I need to do a lot more fixing (and reporting those bugs upstream did nothing).

Every now and then I feel like I should just fork mtgjson and fix all those issues, but it's a pile of javascript and duct tape, so I mostly hope they get their act together again.

Because indexer started as very simple script and accumulated hacks over time (mostly due to deteriorating mtgjson quality), it's the ugliest part of the codebase. It has no dedicated tests, but as the whole test suite is operating on real data, any bug in indexer or mtgjson is likely to be picked by tests down the line.

Loading data

The library loads data from processed json files and transforms them into in-memory representation of Ruby objects. There's no dedicated database or anything like that - we're doing everything directly.

There's currently 16686 cards and 31648 printings, so it's not tiny dataset, but it's not unreasonably huge for this fairly naive approach to work.

Query Parser

We parse any query with StringScanner - which is one of the least known parts of ruby standard library, generating Query object.

Query object contains a tree of Condition objects corresponding to the query. In addition to searching Query object is also responsible for sorting results (by name, most recent etc., with sort: operator) and for time travel.

Yes, time travel - I find it occasionally useful to sometimes search for results as if I was searching at particular date in the past - so cards not printed at that time are not in search results, and format legalities are changed accordingly. This isn't literally going into the past, as we're not trying to do anything like updating Oracle text to old wording (which wouldn't be all that helpful), but that's how you can search what to play when you somehow get to play flashback Innistrad Standard.

time: operator is on query level, so you can't do cute things like asking for cards legal in both Ravnica Standard and Return to Ravnica Standard.


Condition objects must implement #search(db) method which takes database and returns Set of matching card printings.

This probably seems like a weird interface, and most conditions (those returning true when asked #simple?) also implement secondary interface of #match?(card_printing) returning true or false and avoiding allocations of intermediate Sets.

ConditionAnd, ConditionOr, and ConditionNot use one or the other API depending on type of their subconditions.

The reason for this design is that some subqueries ask about other part of the card, or other printing. For example you can check for cards printed in both M10 (Magic 2010) and M11 (Magic 2011) by e:m10 alt:e:m11. Doing recursive search and checking all subconditions would be exponentially slow in the worst case, and this design is always linear is query size.

Searching for edition (e:) and block (b:) always return small number of results, and we already have to maintain set data, so we reuse it as index and return small result Sets from them. Other queries don't use any indexing and just check cards one by one.

For most conditions either all printings match a query or none do, so special case for conditions which don't deal with anything printing-specific could potentially speed up a lot of queries. On the other hand while some cards have crazy number of printings, average is just 1.9 printings/card, so it's hard to tell if such savings would be worth added complexity.

Of course more indexes and more sophisticated query optimizer would be possible - or even using some dedicated search engine like solr. This didn't seem necessary due to relatively small amount of data. Fortunately the search engine has very extensive tests, so if I ever feel like developing one, it would be reasonably easy to do so safely.

Spelling correction

Many Magic cards have names which are very easy to misspell.
How do I spell this exactly?
QueryParser deal with it - if search returns zero results, it sets fuzzy flag on Conditions and retries the whole search. A few conditions will autocorrect your query, and any such autocorrection are logged and displayed with search results.

This two step process (as well as diacritics stripping and very limited amount of stemming) deals with most misspelled searches with good usability while keeping false positives to very low level.

Results coalescence

At this point we get a sorted list of printings, which is not quite what we want to display - all printings of the same card need to be merged (which is just .group_by(&:name)) - and since we'll be displaying card picture from specific printing we also want to choose the most appropriate one.

The algorithm for that:
  • choose only from matching printings
  • prefer those with card picture (small number of rare promos don't have pictures) - that's a frontend consideration database has no idea about
  • otherwise use order returned by search engine, which is:
    • whichever order was explicitly requested if any
    • name in alphabetical order
    • Standard-legal sets first
    • most recent first
After that we paginate the results.

Future direction for mtg.wtf

The most common requests are:
  • improving website performance
  • adding card pricing information via some third party API
  • advanced search form
Advanced search form is a UI problem, for which I don't have a solution yet, as all advanced search forms are atrocious, leave most possible options out, or both.

Third party API for pricing ended up more complex than I thought, but I'll probably get there at some point.

Performance complaints probably mean that current architecture could use some improvements after all. It seems fast enough to me, but possibly that's not good enough.

Sunday, October 09, 2016

Solving sudoku with ruby and Z3

Solving Sudoku is a bit like FizzBuzz for constraint solvers, but since Z3 is very poorly known, and Z3 for ruby has few users other than me, I want to write a series of posts explaining various Z3 techniques, starting from very simple problems.

Parsing Sudoku

Before we get to Z3, let's read test data. A nice format for sudoku would be a text file like this:

_ 6 _ 5 _ 9 _ 4 _
9 2 _ _ _ _ _ 7 6
_ _ 1 _ _ _ 9 _ _
7 _ _ 6 _ 3 _ _ 9
_ _ _ _ _ _ _ _ _
3 _ _ 4 _ 1 _ _ 7
_ _ 6 _ _ _ 7 _ _
2 4 _ _ _ _ _ 6 5
_ 9 _ 1 _ 8 _ 3 _

Where _s represent empty cells, and numbers represent pre-filled cells.

Fairly straightforward code like this can parse it to 9x9 Array of Arrays:

File.read(path).strip.split("\n").map do |line|
  line.split.map{|c| c == "_" ? nil : c.to_i}

Getting us data structure like this:

[[nil, 6, nil, 5, nil, 9, nil, 4, nil],
 [9, 2, nil, nil, nil, nil, nil, 7, 6],
 [nil, nil, 1, nil, nil, nil, 9, nil, nil],
 [7, nil, nil, 6, nil, 3, nil, nil, 9],
 [nil, nil, nil, nil, nil, nil, nil, nil, nil],
 [3, nil, nil, 4, nil, 1, nil, nil, 7],
 [nil, nil, 6, nil, nil, nil, 7, nil, nil],
 [2, 4, nil, nil, nil, nil, nil, 6, 5],
 [nil, 9, nil, 1, nil, 8, nil, 3, nil]]

Z3 workflow

First, we need to get the solver:

solver = Z3::Solver.new

After that we feed it with a bunch of formulas with solver.assert formula.
How would we describe sudoku problem in plain English?
  • There are 9x9 integer variables
  • Each of them is between 1 and 9
  • They correspond to prefilled data, unless that data is nil
  • Each row contains distinct values
  • Each column contains distinct values
  • Each 3x3 square contains distinct values
Once we tell Z3 about that, we check if our formulas are solvable with solver.check == :sat, and if it so, get model with solver.model - I feel those two steps should probably be refactored into one, but let's leave it for now.

Model can be then accessed with model[z3_variable] syntax to get our answers. So let's get to it!

Creating variables

There's nothing special about Z3 variables, they're sort of like ruby Symbols with associated types. So we could build our formulas with names like Z3.Int("cell[4,8]") - such name is just for our personal use, and doesn't mean cells form an array or anything.

However, since we'll be referring to the same 9x9 variables all the time it's probably useful to save them to an Array of Arrays.

cells = (0..8).map do |j|
  (0..8).map do |i|

Setting possible values

All variables are between 1 and 9, which is very easy to tell the solver about:

cells.flatten.each do |v|
  solver.assert v >= 1
  solver.assert v <= 9

Setting prefilled variables

Telling solver that Z3 Int variable is equal to some ruby Integer is completely straightforward:

cells.each_with_index do |row, i|
  row.each_with_index do |var, j|
    solver.assert var == data[i][j] if data[i][j]

All rows contain distinct values

If we relied on basic operations, we might have to give Z3 a lot of inequalities per row, fortunately there's Z3.Distinct(a,b,c,...) formula, which handles this very common case.

It makes it really easy:

cells.each do |row|
  solver.assert Z3.Distinct(*row)

All columns contain distinct values

We can use Array#transpose to flip our multidimensional array, and do the same thing we did for rows:

cells.transpose.each do |column|
  solver.assert Z3.Distinct(*column)

All square contain distinct values

This is a bit more complex rearrangement, but it's all pure ruby, with Z3 getting very similar looking formula in the end:

cells.each_slice(3) do |rows|
  rows.transpose.each_slice(3) do |square|
    solver.assert Z3.Distinct(*square.flatten)
By the way, you should really take a look at Enumerable API, it contains a lot of useful methods which will save you a ton of time.

Get the model and print it

And we're basically done:

raise "Failed to solve" unless solver.check == :sat
model = solver.model
cells.each do |row|
  puts row.map{|v| model[v]}.join(" ")

Getting the answer we want:

8 6 3 5 7 9 2 4 1
9 2 5 3 1 4 8 7 6
4 7 1 8 2 6 9 5 3
7 1 4 6 8 3 5 2 9
6 8 9 7 5 2 3 1 4
3 5 2 4 9 1 6 8 7
1 3 6 2 4 5 7 9 8
2 4 8 9 3 7 1 6 5
5 9 7 1 6 8 4 3 2

Avoid temptation to optimize

Everything we did here was so ridiculously straightforward - we skipped even the most obvious shortcuts.

For example why the hell did we assert that cells are between 1 and 9 even for cells whose value we know perfectly well? And why did we even create variables for them if we know their value already? If we coded any kind of manual sudoku solver, we'd probably start with those.

It turns out Z3 really doesn't care about such optimizations, and will solve your problem just as efficiently either way - but by "optimizing" your code will become more complicated, and there's a good chance you'll make an error trying to "optimize".

That's not to say Z3 is made of magic, and that there are no cases where you can help it - but that's much more likely to be by restating a problem in a different way than by some microoptimizations.

Some pitfalls

One small pitfall to remember is that Z3 values - including those returned by solved models - override all operators including == so you can't check if for example model[Z3.Int("x")] == 5 - that will just create a Z3 Bool expression.

This is necessary behaviour for some more complex use cases, but for the simplest case you'll generally want to #to_s whatever you get out of the model and go on from there.

API is subject to change

Some details will probably change in future versions to simplify things - especially everything from solver.check onwards, which will probably receive a simpler API for the most common case, but existing one will still be supported as it's necessary for some more complex situations.

Integers variables constrained to specific range are another commonly used pattern which could use some shortcuts.

Z3 is crazy powerful

This was a very simple example. You could probably write a sudoku solver yourself - even if it'd take a lot more time, and would probably perform a lot worse than Z3.

This manual approach doesn't scale - doing much bigger and much more complex problems with a constraint solver is as literally fast as just writing the constraints, while doing it manually constraint list is barely a starting point, and naive approaches get exponentially slow almost right away.

Due to unfortunate accidents of programming history constraint solvers got relegated to an obscure niche, but they really ought to be a tool in every good programmer's toolkit, and once you learn them you'll find applications for them all the time.

Saturday, October 08, 2016

Z3 Constraint Solver library for Ruby

Weazel by Lcrward from flickr (CC-ND)

Wouldn't it be awesome if you could just describe a problem to the computer, like let's say a sudoku, and it would find the answer for you?

That was the premise of "logic programming" in 1970s and 1980s, which was also one of many cases where Japan tried to do something else than everybody else, and failed, but it's not a post about history.

Logic programming suffered a lot worse than other paradigms. Lisp, Smalltalk, Perl, and friends left rich legacy of features for future programming languages to mine, but Prolog variants were all one big dead end.

Which is a bit of a shame, as some problems are really best coded by giving computer a list of constraints, and telling it to have a go at it, and it's very difficult to write a decent custom algorithm for them.

Technically you could use constraint solver libraries even without logic programming, but they were mostly only available in obscure Lisp / Prolog dialects, hard to compile on modern systems, barely documented, and/or extremely limited.

This somewhat changed with Z3 by Microsoft Research, which even got rudimentary Python interface.

But while Python is tolerable, I got tired of all the self. nonsense, and I wanted a nice Ruby DSL, so I wrote Ruby bindings. You can install them as z3 gem but first you need z3 library itself (brew install z3 or whatever works on your system).

If you want a quick look, check examples folder. If you want longer explanation, let's keep going.

Basic model

The basic Z3 workflow is:
  • create solver with Z3::Solver.new
  • create a bunch of formulas, generally starting with variables like Z3.Int("a"), Z3.Bool("b") etc. - formulas are independent of any solver, they're basically symbol trees.
  • assert some facts about those variables, like solver.assert Z3.Int("a") == Z3.Int("b") + Z3.Int("c")
  • ask solver to check if your set of formulas is satisfiable with solver.check. If it returns :sat, you're good to go.
  • get Z3::Model with solver.model
  • ask model for value of various variables with queries like model[Z3.Int("a")] etc.


Let's go deeper, one step at a time. Values in Z3 all belong to "Sorts" which are sort of like types.

To create a Sort object, just instantiate its class, like Z3::BoolSort.new, then you can create relevant object like Z3::BoolSort.new.var("my_var") or Z3::IntSort.new.from_const(5). This seems like unnecessary indirection, and it sort of is for most common sorts, but there are fancier ones where it's necessary.

Some of the sorts are:
  • Z3::BoolSort.new - boolean variables
  • Z3::IntSort.new - integers, unlimited precision
  • Z3::RealSort.new - real numbers, unlimited precision 
  • Z3::BitvecSort.new(size) - bit vector sorts, of size bits
  • Z3::FloatSort.new(esize, ssize) - floating point sort of esize exponent bits and ssize significand bits
  • Z3::RoundingModeSort.new - floating point rounding mode sort
  • Z3::SetSort.new(elemement_sort) - set sort of elements of elemement_sort sort
  • Z3::ArraySort.new(key_sort, value_sort) - associative array sort with keys of key_sort, and values of value_sort
  • there are some more sorts in Z3 which are not yet implemented
For vast majority of uses you want Bool and Int sorts, for physics style calculations you want Real sort (generally don't use Floats for them). They have decently completely and nice APIs.

Bitvec and Float sorts should work reasonably well, but their APIs are somewhat awkward - for example a lot of Bitvec operations have separate signed and unsigned operations, so you need to do awkward things like Z3.Bitvec("a", 32).signed_gt(100) instead of more obvious but ambiguous Z3.Bitvec("a", 32) > 100.

Float APIs are even worse as a lot of operations need rounding mode passed - and most operations take rounding mode argument, and printing out results tries to use precise but unintuitive notation like 1.25B+5.

Sets, Arrays, and fancier stuff, are basically unfinished.

The obvious missing sort is any kind of finite domain integer, like Int[1..9] - in Z3 you need to create Int variable, and then tell the solver that it's within certain range, like solver.assert (Z3.Int("a") >= 1) & (Z3.Int("a") <= 9)


You want to give Z3 a bunch of formulas, and these generally start with variables like sort.var("name"). As Z3::BoolSort.new.var("name") is fairly long, shortcuts like Z3.Bool("name") are provided.

From that point, you can construct your formulas with fairly straightforward ruby - a + b == c means exactly what you'd expect if a is Z3.Int("a") and so on.

There are of course some complications:
  • ruby won't let us override !, &&, and || - so we use ~& and | for booleans - and they have different parsing priorities so you might need to add some parentheses
  • you can't directly check if a value is equal to something else, as foo == 0 will create a z3 Bool expression not give you true/false. It's a great case where some extra operators would be useful, but for now just .to_s whatever you want to extract or check.
  • some operations don't have easy operator equivalents, so other syntax is provided. For some common examples - to assert that a bunch of values are all different, you can use Z3.Distinct(a, b, c, ...), for z3 ternary use (bool_expression).ite(if_true, if_false).
  • we'll automatically convert ruby values like true or 42 to z3 expressions, but if you try to mix sorts without converting them appropriately you'll generally get an exception
  • ASTs are interned, so every Z3.Int("a") + 2 == Z3.Int("b") is going to be the same underlying object.

Library layers

Z3 is a huge library, and it really doesn't map to anything resembling a sensible ruby API, so there are many layers here:
  • We use ffi to create Z3::VeryLowLevel interface of raw C calls. You should never use it directly.
  • After that there's Z3::LowLevel interface which basically deals with context management, array arguments, and mapping Ruby object arguments (but not return values) to FFI pointers. You should also never use it directly.
  • Then there are legitimate Ruby objects. A lot of them are wrappers around C pointers, so using SomeClass.new(c_pointer) to create them directly is not really supported. These pointers can be accessed by attributes starting with underscore (like _ast, _sort etc.), but it's all for internal use only.
  • The main reason for this anal system is that if you mess anything up, you'll get a crash. For some things Z3 library raises error which we then turn into Z3::Exception, but other things just crash your program with segmentation fault. I added a lot of checks for operations which might end with segmentation fault so you get Z3::Exception instead, but I'm sure I missed some things.
  • Reference counting memory management z3 uses is not exactly compatible with ruby, so if you use it in a very long running process like Rails server, it might cause problems. It's usually going to be no worse than Symbol interning.
  • Default interface doesn't give any guarantees for running time. The underlying z3 library has ways to restrict solver check running time, but they're not exposed in any convenient way yet.

No semantic versioning

Z3 is huge library (I only described the basic), and the gem is not only incomplete, but many APIs are fairly poor.

Until I release 1.0, any version can freely break any API, so if you want to rely on a stable version, just depend on exact one - or help me complete it faster.

The regular flow with Bool, Int, Real and Solver shouldn't break too often, but Bitvec, Float etc. feel awkward and could use a nicer API.

Pull requests wellcome.

Lessons from making mtg.wtf mobile-friendly

When you are weary,  feeling small by Michael Taggart Photography from flickr (CC-NC)
mtg.wtf is currently the best search engine for Magic: the Gathering cards. It's reasonably successful and popular, so let's go through its history quickly.

It started as a wishlist of improvements I'd like to see for magiccards.info

Then once magiccards.info stopped being updated, I decided to write a replacement. I started by wring a bunch of tests for how I'd like it to work, mostly using data from mtgjson

After I had tests, I wrote a search engine as Ruby library with command line interface. I thought about making it a standalone gem at this point, but I'm not sure how useful people would find it, considering json data is out there already.

After that I added rudimentary Rails frontend with minimal styling and put it online.

At this point it was text only, so I had to find card scans. WotC only publishes somewhat low quality scans, which are usable enough, but I wished to get something better. For about 2/3 of cards I found those higher quality scans, but it was a fairly slow and manual process. That still leaves some rare promos there's just no source of any kind - it wasn't a huge deal, as few people search specifically for them, the search engine just needed to display card with good picture instead of most recent version (which might be weirdo promo).

Then I added some CSS on-hover animations. Most cards have regular layout, but there are some cards which are double-faced, have horizontal layout ("split" cards), or have content on top and tobbom ("flip" cards). Later they even added pairs of cards which merge ("meld") into one big card with halves printed on their back face. All those cases for decent usability required either completely different display layout, or some animation support, and CSS animations turned out to be pretty easy to add.


At this point I was getting feedback that it doesn't work too well on mobiles.

Well it's not too hard, we just need to add header telling mobile browsers to not be stupid:

  <meta content='width=device-width, initial-scale=1' name='viewport'>

Then I installed bootstrap, and replaced some existing layout code with bootstrap 12-column grid. It was surprisingly easy retrofit - other that bootstrap's use of .card colliding with my use of .card I don't remember any surprises. Markup looked a bit worse due to extra <div class="row">, <div class="col-md-8"> etc., but it's not too bad thanks to HAML.

Making it responsive

I have a bunch of mobile devices around, and testing on one or two devices is not too much trouble, but I wanted to make sure site will work well on everything, so I used Device Mode in Chrome Developer Tools to make sure everything works in wide range of sizes, portrait and landscape, added a bunch of responsive bootstrap tags, and I pushed the changes to the server.

  • For big screens site used desktop layout - 4 column for card picture, 6 column for card information, 2 columns for extra intformation
  • For medium screens layout was 6 column for pic, 6 column for information, and extras were hidden
  • For small screens layout was full row for card pic, and full row for card information underneath, with extras hidden
That was the most usable configuration in Chrome Device Mode.

Animation problems

There were two immediate problems - first Mobile Safari users reported that CSS animations are broken - it turned out that it required some prefixed properties for backface-visibility which nobody told Rails CSS autoprefixer about.

Second problem was that on very small screen sizes (where card was already full screen width) there was simply no space to do some of the animations - I ended up doing messy screen size queries in my CSS code matching Bootstrap size thresholds to turn them off.

And it was all wrong because "pixel" is a lie

I was quite happy with it, so I expected my users to be as well, but feedback I got was overwhelmingly negative. Users found the site unusable on mobiles, even though I crammed as much information as could reasonably fit in Chrome Device Mode.

The thing that got the most backlash was placing card text under card pic. But if I put tiny card picture next to text nobody could see anything, right? Oh wait, those mobile "pixel" numbers are total bullshit. That 100 "pixel" width is actually a lot more device pixels, so even though card looked like completely unreadable thumbnail in Chrome Device Mode, it looked just fine when I tried to use it on an actual phone.

And even if card picture was too small, so what? It's just as easy to pinch to zoom as it is to scroll down a bit - something that doesn't apply to desktops (where Cmd+ / Cmd- can zoom, but they don't focus on specific area, so they're much less useful).

This also means that 6 pic / 6 info layout on bigger devices was excessive - so I switched all non-desktop devices to 5 pic / 7 info as that's what looked best on actual mobiles and tablets I tried it on, even it looks horrible in Chrome Device Mode.

So now the site looked good on desktop (with big browser window), and on mobile devices with 2x or higher pixel density (of any size).

That doesn't cover using mobile devices with 1x pixel density (which fortunately basically nobody uses any more), or desktop browser with small window size (which happens, but worst case users can resize the window).

That leads to two big lessons:
  • Chrome Device Mode is horribly unrepresentative as it uses 1x not 2x pixel density
  • Bootstrap device width targeting is fairly unrepresentative as it ignores pixel density
Getting rid of whole-width card pic layout let me get rid of most CSS animation workarounds. There are still some for gallery view, like Nils Hamm's cards from Dragon's Maze where animations need to be aware of which column the card's in.

Soft Keyboard

Now it looked good, but I ran into one more problem. On desktop we want to start search box focused with HTML5 autofocus attribute, so user can start typing the query without clicking anything.

Now most good browsers understand opensearch annotations, so if you ever visited mtg.wtf, they'll remember it, so you can just type mtg<tab>your query from URL bar, and they'll get you straight to search results - but not everybody uses it this way.

But we don't want to autofocus on mobiles - as then huge soft keyboard appears covering half of results and making it all miserable. At least on some mobile browsers, others sensibly require click to show soft keyboard even with autofocus attribute.

Unfortunately I never found any way to query if device has real keyboard or not. So backup logic of autofocusing on home page (which has no results, so soft keyboard covering empty space is no big deal), but not on other pages (which have some result) ended up being a workable compromise.

It would be nice if someone figured out which browser/device combinations have this issue, and created some device specific autofocus library, as it can't be the only site affected by the problem.

Overall, it wasn't too bad

Even with all the surprises, it all went better than expected.

But imagine the world could have been different - once upon a time mobiles tried to create their own separate web with completely separate protocols and formats. How insane would it be if they succeeded?

Thursday, September 29, 2016

EU4 Colonial Kebab AAR

Post 1 - Originally published on Google+ on 2016-09-06 21:17:25 UTC

Colonial Kebab: Part 01: 1444-1461: Regaining the cores

Time to play some EU4, and I wanted to try the new easy westernization system Eastern/Anatolian tech countries get before it goes away; and enjoy corruption, disloyal estates, states, espionage, ZoC bugs, and the rest of the new stuff.

I also checked every modification in Fun and Balance, and divided them into yes, no, and maybe groups. For changes I wasn't sure of (most notably scaled truce timer and no-mana vassal integration) I'm playing with vanilla values but I might change them as campaign progresses.

There are two variants of this easy westernization - regular variant which requires getting Vienna or Prague; and an even easier variant which requires Danzig or Krakow which Muscovy and friends get, probably as compensation for getting nerfed so hard in HoI4.

There's a lot of countries to choose from in Eastern/Anatolian groups, but as I also want to do some colonial game on improved map of Africa, I picked Ottomans.

Campaign goals are:
• westernize by taking Vienna or Prague
• unify islam (which requires blobbing quite hard)
• get nice colonial and trade empire

First step is of course getting capital. Weirdly it seems that vanilla fixed rival selection, so I have a wide choice of countries to choose as my PP cows, including Byzantium. What they did not do is fix maximum range, so F&B range fix is still necessary outside Europe, but we're doing well.

After that, I had a long list of cores to reconquer, which annoyingly generates a shitton of AE.

Then I got to some mission-based conquests, and I accidentally 104% OE, and got -30 legitimacy, -1 stability etc. So annoying.

I also get myself a very big coalition - but the faster I expand the better access I get to faraway places nobody cares for. I need to expand towards Prague or Vienna, and also towards Indian Ocean ports, and that led to lovely coalition of:

• Knights, Ragusa, Hungary, Austria (HRE), Venice
• Golden Horde, Aq Qoyunlu, Qara Qoyunlu, Timurids
• (briefly) Najd and Shammar

And then there's a bunch of countries which would love to join but are stopped by truce timers.

Fortunately I have two vassals - Bosnia and Serbia (both on scuttage, hopefully that means enemies can't pass through their territory, but who knows how ZoC rules work), and some allies - Tunis, Hedjaz, Georgia, Crimea, Wallachia, and Bohemia.

Now I'm presumably supposed to wait for coalition to either attack or get bored. I don't think I'm going to expand in Europe all that much, and I have tech advantage over everyone - mostly thanks to my starting 5/5/6 rules who's not going to last, but also thanks to estates giving me 100 of each mana type every 20 years, and +1/+1/+1 from max power projection. As far as I can tell, AI basically doesn't press estate buttons ever.

I started going for exploration ideas. Not sure what to take next. I thought religious, but conversion is crazy easy now between +3% from max piety, +1 missionary from Jerusalem, and +2% in any province I give to right estate.


And so it begins

4 cores to go, but let's get some fresh territory instead

It's a reasonable start. Timurids and Golden Horde have absolutely zero business being in coalitions against me.

Tech map. That won't last long, as soon as my ruler dies, it will look much worse, but then we're hoping to get free westernization at paper tech level 10.

Post 2 - Originally published on Google+ on 2016-09-07 01:38:33 UTC

Colonial Kebab: Part 02: 1461-1473: First coalition war

It should surprise nobody that I got hit by a coalition war right away.

The whole list of countries is fairly retarded:
• Austria
• Aq Qoyunlu
• Qara Qoyunlu (with Trebizond)
• Timurids
• Knights
• Golden Horde (with Ryazan, Circassia)
• Venice (with Naxos, Corfu)
• Ragusa
• Hungary

And then some random allies of Austria - Poland (with Lithuania and Moldavia), Brandenburg, Florence, and The Palatinate. It's good Austria didn't call in their remaining ally England.

Unfortunately unlike in the old patches when we just needed to defend our capital to get ticking warscore, coalitions now have bullshit show superiority wargoal.

Timurids started by completely ignoring fort ZoCs, so forts are still total bullshit.

My K:D ratios were completely ridiculous, at least at first - mostly thanks to my sword tech being 6 vs theirs 3 or 4. Of course my allies did much worse, and most of them ended up peacing out separately. On sea the fight was about even - neither of us could achieve total dominance.

Eventually I got +40 warscore from battles, but not enough to get ticking warscore - and that wasn't good enough for Austria to accept even my concession of defeat.

After five years of fighting I got them to accept giving me fairly symbolic 129 gold - they got far more in reparations from my allies, but that was the end of the coalition.

After that I beat up the Mamluks a bit more, and second coalition got reestablished - of Venice, Genoa, Aq Qoyunlu, Qara Qoyunlu, and Timurids.

Interestingly I could threaten Aq Qoyunlu to return one of my cores, and that established fresh truce and kicked them out of the coalition.

Well, not that much I can do now. I just need to wreck the whole coalition again - Venice/Genoa have stronger total navy, but not by crazy much, and I could maybe call Tunis into the war to help.

Setting up some kind of vassal Iraq and Persia to start tearing apart Qara Qoyunlu and Timurids sound like a good idea.


Post 3 - Originally published on Google+ on 2016-09-07 16:03:53 UTC

Colonial Kebab: Part 03: 1473-1484: Colonial era begins

Strategic goals are:

• move towards Vienna and Prague for fast westernization decision - even if capturing either would be hard due to HRE
• extend my colonial range into Indian Ocean (longer term also towards West Africa and New World), capturing Mecca and Medina on the way
• setup Iraq and especially Persia before hordes start to fall apart

So it's time to attack second coalition and setup some Eastern vassals. Except while I was waiting for rebels to go away second coalition fell apart.

That's much better - instead of coalition war, I had a quick war with Qara Qoyunlu to setup vassal Iraq and Persia, then another quick war with Timurids to enlarge Persia to size where they're a bit butthurt over being vassals.

I got expansion as second idea group, so I'm basically flooded by sword mana and spending it on development, as it doesn't do anything useful. I should perhaps pick some military idea group as third, but then again they all suck.

I've been trading for maps and stealing maps a lot in addition to sending my explorer around. Finally I discovered uninhabited island of Mahe next to Madagascar, which will be our first colony.

It would probably be a good idea to expand towards Vienna / Prague, but European alliance networks are crazy dense, and everybody's reasonably up to date with military technology, so if I do I'll face painful coalition.


Oh no, my troops are pocketed! Wait, wrong game.

A bit of bordergore, and it will kill like 3000 bird mana to annex all my vassals when they get all their cores back, but it keeps coalitions away.

I released Persia as Shia, not sure if it was a good idea, or if it would be better to wait a couple years to convert them, and only then release.

And there's my first colony. Also look at all those stolen maps. Africa is the only exception - it was basically bugged as I could put explorer in Red Sea, tell him to explore West Africa Coast (in range from other side), and he'd explore out-of-range South African Coast just fine.

Post 4 - Originally published on Google+ on 2016-09-08 19:25:32 UTC

Colonial Kebab: Part 04: 1484-1505: At Gates of Vienna

I got Janissaries who give me nice army bonuses, but they keep demanding money under every pretext. Oh well, I can afford that.

My amazing ruler died, and that means I really need to quick westernize - unfortunately it's difficult to focus on the West, as for coalition management I'm trying to destroy everyone who thinks I'm expanding aggressively.

I annexed Bosnia and Serbia, and I plan to annex Iraq, but Persia reached the stage of Greater Butthurt (+50% liberty desire due to being over 300 development), so that could be problematic. At least they have Sunni rebels, so there's some small hope they'll flip. Probably not, they're just too big.

When I finally decided to do something about Hungary, my tech advantage was pretty much gone. I had maxed out manpower when war begun, but I burned through almost all of it. So maybe quantity ideas wouldn't be that bad at some point? Still that's mostly because I'm overflowing with sword mana, and have shortage of paper mana, and I'll have massive shortage of bird mana when I pay about 3000 to incorporate Persia and Iraq.

I'm next to Vienna now, and Austria is setting up coalition against me, but it doesn't seem like a terribly big coalition and if they attacked me maybe I should just take Vienna as compensation. The downside is mostly that they match me in technology, and I'm currently low on manpower.

Thinking globally I got a network of allies - France, Bohemia, Crimea, Tunis, Hejaz, Sind, Brunei - so if anybody attacks me I'll have someone to throw under the bus as a distraction at least. Especially France might be useful to keep Austria-led coalition busy for a while.

I colonized up to Cape and to some minor Indonesian Islands, but it's rather uneventful so far. I can't really funnel that money into my trade node, so it's just money sink, at least for now.

The game decided to punish me for having too few rivals while simultaneously offer zero possible choices. For fuck's sake, how old is this feature and it's still broken.


Alliances, most just to distract any potential coalition. Austria claimed Hungarian throne, so I wanted to act before they get a PU.

Wouldn't it be nice if Sunni zealots flipped Persia to Sunni?

We're almost at Vienna, unfortunately I'll still need paper tech 10 (I'm at 7/9/9 now), and zero separatism to be able to westernize. In a way defending from coalition attack might be the easiest way to get Vienna.

Post 5 - Originally published on Google+ on 2016-09-09 02:10:39 UTC

Colonial Kebab: Part 05: 1505-1518: Still at gates of Vienna

Another attempt at building coalition against me got nowhere, so I reduced autonomy everywhere and fought some rebels while waiting for mana to build up.

I also made Persia Sunni as I really don't want to deal with messy cleanup - which weirdly only increased their liberty desire by about 40% instead of promised 100%, presumably due to some undocumented cap.

Somehow I managed to diplovassalize Adal in Horn of Africa, and used my CBs against primitives to get a bit of the coast - the good stuff are gold mines all the way to the south, so I should probably get there someday.

Hedjaz was allied to both me and the Mamluks, so when I attacked Mamluks they chose their side - and weirdly a mission giving me free claims on half of Hedjaz popped up just then - maybe it was blocked by us being long term allies. On the upside I got Mecca and Medina out of it.

Venice tried to setup another coalition against me, so I attacked them in a non-coalition war, sunk their fleet, wiped out their armies, and got Ragusa and Rhodes.

I got to tech 8/10/10 so presumably at this point if I timed thing better I might be pressing free westernization button.

And now Bohemia is attacking Hungary who's allied with Austria and Bavaria. I'm not sure how I should respond - what's likely to happen is that I'll get nothing or get wrong provinces and even bigger coalition against me.

I think Bohemia is confused because I once marked a bunch of Hungarian provinces as vital interest, but forgot to reset that once I got my border with Vienna - so they think their offer to give me Hungarian land is what I want. Well...

If Austria is cobeligerent, then it would make sense for me to accept, and maybe I'd get Vienna. Otherwise I might be simply getting another long truce for no value. I'm not sure if I can check if they're cobeligerent or not in any way.

Current coalition against me is a weird mix of Bavaria, Pope, Sienna, and Timurids - but a lot more countries would join - like Austria, Poland, Hungary, Genoa, and Naples if they weren't currently at war, and then once truces expire Venice, and Hejaz. The strategy of eliminating anyone who thinks I'm aggressively expanding is keeping it all in check.

So maybe I should just attack Austria now. They're allied with Hungary, England, Poland, Aachen, Palatinate, Saxony, Milan, Trier, and Brabant, but at least that would not be a coalition war, so I can just wait out their allies and have Vienna in ten years tops.

Of course after Vienna coalition war is pretty much inevitable, but we'll get to it in due time.


Post 6 - Originally published on Google+ on 2016-09-10 03:04:44 UTC

Colonial Kebab: Part 06: 1518-1523: Viennese Kebab

Well, apparently I can't start wars until I accept or refuse "call to arms". That was sort of cheesy, but I liked this tactic - "sorry, we can't join your war, we're already fighting them".

Well, I asked Muscovy for an alliance (since Hejaz's slot is now empty), and got into three wars:

• Bohemia's offensive calls to arms against Hungary/Austria/Bavaria (Bohemia thinks I want some Hungarian land I forgot to unmark as special interest)
• Muscovy's defensive call against Poland/Lithuania/Moldavia/Sweden for PU over Ryazan
• Muscovy's offensive call against Golden Horde/Uzbek, because apparently if they're in one defensive war, they can also make you join their offensive wars or something...

I've been sort of motivated to accept by mission to get 50 prestige, and I'm at 47 now. Otherwise I'd seriously consider throwing my allies under the bus - a lot of my old allies like Wallachia, Georgia, Teutonic Order, Hejaz are wrecked, and Bohemia and Crimea only so-so. Then again, Tunis is doing quite nicely, so it's not all negative.

Unfortunately I couldn't even commit all my troops to them seriously, as my lands were infested by rebels - and thanks to all the wars and rebels I've been baiting with autonomy reduction I finally ran out of manpower - and it went somewhat negative during all those wars.

Oh and I did something crazy and even build one castle in Hejaz territory - and another one in Nubia - just to keep rebels in check. I still destroyed a lot more than that, but cutting their costs to reasonable level makes them actually meaningful, at least when ZoCs are not going crazy.

Somehow all the wars turned out far better than I expected - Bohemia even actually transferred control over Vienna to me and gave it to me in the peace deal. When they transferred Vienna, I gave them control over about 15 Hungarian and Austrian provinces I sieged, but they only took Poszony in all of that. I didn't even have that much war contribution - about 16% last time I checked, with Bohemia's HRE allies doing most of the fighting, and me doing mostly just sieges.

Well, now just 30 more years to get separatism to zero and we can westernize.

For that one province I got hit by a coalition all the way to Aragon and Brabant, plus of course their dependents. When they attacked, these were a total of 42 countries - and my up to this point allied France and Muscovy (whom I recently helped in two wars) abandoned me.

Meanwhile, colonial affairs were happening on their own, disregarding European politics. I annexed Medri Bahri, whatever that might be, attacked Ethiopia for Adal's cores, circumnavigated the globe (but got no promised +100 prestige and +40 naval tradition, as explorer got stuck somehow).

I used spare prestige I had to place my relatives on thrones of Persia and Adal, but that doesn't seem to affect diploannexation speed. I vaguely recalled that it did once, but I'm not really even so sure of that.

Well, now it's time to deal with coalition war.


A bunch of easy wars


Post 7 - Originally published on Google+ on 2016-09-10 06:18:28 UTC

Colonial Kebab: Part 07: 1523-1532: Second, Third, and Fourth Coalition Wars

Coalition outnumbered me massively on land, I had zero manpower, and of my European allies only Bohemia joined my side, so it was time to merc up hard.

At least Tunis was on my side, and I had truces with some maritime powers (with Venice still rebuilding their fleet after I sunk it), so at least on the seas thing seemed to be doing well.

The war was series of doomstack on doomstack battles, with my stacks outnumbered, as is usual since they introduced forts:

• I lost initial 71k vs 178k battle of Varsad. Seriously, which fucking century is this?
• I won 126k vs 181k battle of Ragusa. We lost 39k men, they lost 53k. This is going to be dumb war of attrition with me throwing money at it, and they throwing manpower.
• I lost 132k vs 241k battle of Bosnia. Seriously, what the fuck. After HoI4's sophisticated strategy, this throwing around of doomstacks feels ridiculous.

I got to paper tech 10, and I decided to pick some military ideas for my third - going for quantity, as quality apparently got nerfed (no more policy for free colonist). Quantity is probably still a better idea, but since when is this a hard game where we need to make optimal choices about everything?

Unfortunately after third doomstack battle Bohemia had enough, and gave away not only the only province they took it war against Hungary, but also two more provinces to Saxony.

Apparently max attrition is now only 5%, so it's totally reasonably to keep 218k troops in a province which can supply only 34k.

Fortunately I was getting crazy warscore from killing random small stacks and minor fleets.

• I lost 112k vs 223k battle of Lika, so doomstack battles are mostly going against me.

But it was not all yet - Pope, Sienna, and Memmingen, concerned about being left out of their coalition war, started simultaneous war. It was totally ridiculous.

The coalition was technically losing the war, they've been all near war exhaustion cap, and still showed zero interest in actually ending it. Weirdly war exhaustion to reasons for peacing out translated in weirdest ways from -4 to -40 for 20 WE.

At this point I decided that enough bullshit is enough, and increased WE cap from 20 to 50. They reached the cap in about one more year of fighting - and were forced to split their doom stacks to deal with rebels at home a bit more.

Of course none of that was even close to done, as Naxos, Corfu, and Timurids attacked me in third simultaneous coalition war. It mostly cost me allies who decided to fuck that shit and after third coalition war in a row just gave up upon me.

After I finally beaten the main coalition enough to get them to give me some reparations (due to AI weights there is no middle ground between "we'll accept nothing short of total surrender" and "actually, we'll give you money"), two small coalitions weren't a huge deal.

I let Pope get away with reparations, and focused on the last war, as I wanted to get something from Timurids - one province from Khiva and Afghanistan each, to have some fresh vassals to dismantle the bastards completely.

And unfortunately EU4 is being total shit again, as I can't take provinces neighbouring my vassals at all now, because they don't neighbour me directly. For fuck's sake.

After all that mess I reestablished alliances, with new setup being:
• Crimea, Sind, Morocco (replacing Tunis), Muscovy, Brunei
• vassals Persia, Adal, Majeerteen

Now it's time to abandon any further European expansion for a good while (except maybe cleaning up leftovers islands) and focus on Africa. I wanted to expand into Asia as well, but apparently I can't before I finish annexing Persia, which currently has ETA of 1582.

I was attacked by a total of 48 countries, but it seems only 17 (Hejaz could have joined but missed on all the action) are still bothered by my so called "aggressive expansion", so I'll probably be fine.

CK2 attrition can reach 40%/month if you do something as ridiculous as this. Forts dumbed down the game hard, and they even lowered attrition cap to just 5%/month.

At least coalition dealt with Catholic rebels. HRE is half Protestant already.

That's how bad it eventually got before I fixed WE cap.

Now that WE is a real mechanic, they were forced to send some troops to deal with rebels at home, and I was able to push them back a bit. I burned through 2000 of my ~3500 money, but actually gained a ton of manpower in those wars - so I was relying on mercs too much and on regular troops not enough.

Post 8 - Originally published on Google+ on 2016-09-10 19:01:19 UTC

Colonial Kebab: Part 08: 1532-1539: East African Kebab

European expansion was closed due to coalitions, Asian expansion was closed because I couldn't take any land Persia bordered but I didn't. So that left East Africa, where I had two diplovassalized countries to expand.

But there was one more thing - I somehow had one Georgia's core, and Circassia with no allies had the rest, so there was thing small border correction to do.

Somehow East African fought way better than they're supposed to, and I'm not sure why. I was two sword techs ahead, had better general, and even some bonuses, and they still managed to force me to withdraw a lot of times. Part of it is that I couldn't uncover uncolonized provinces through which they surprise attacked me, and part of it is that they're also Sunni with piety-based morale bonus I guess.

Not that it mattered in the end - both my vassals there got huge swaths of new territory. Now it's time to do my missions to conquer North Africa - Europeans will be a bit annoyed as it's somewhat close to them, but not too much.

I settled some colonies in Brazil and conquered 3 OPMs, so soon I'm going to spawn a colonial nation there.

Coalition against me has 17 members, which is a fairly good reason not to expand into Europe for now.


Post 9 - Originally published on Google+ on 2016-09-10 23:43:45 UTC

Colonial Kebab: Part 09: 1539-1550: Western Kebab

Venice somehow got pope to call crusade on me, but nothing actually happened. Coalition looks at me funny, but they do nothing.

So I continued expanding Georgia, returned to Persia their last core, held by Circassia, got some land from Oman, and beaten up a lot of African countries to setup Mutapa as third East African vassal.

In New World I have colonial nation in Brazil, I'm setting up another in La Plata region, and I'm looking for golden cities.

Separatism in Vienna ticked down to the point I could finally westernize, even if it's like two decides behind optimal time. Now I'm overflowing on all types of mana.

Except my PP fell under 50, and my only rival is the now-unified Commonwealth who's in a fairly big coalition. I don't think they have any intention of attacking me, but they're good at stopping my expansion there.


I need to attack Portugal with Confusing Color CB.

With coalition in one direction, I'm expanding in others. Someday I might want to try a showdown with Commonwealth, kicking Lithuania out of them.

Post 10 - Originally published on Google+ on 2016-09-11 13:24:12 UTC

Colonial Kebab: Part 10: 1550-1568: Religious Leagues

So with westernization done, let's revisit the goals:

• Unifying Islam - that's about half done actually - Sicily is going to be the hardest part, as Morocco already took Cordoba, so I just need to grab it from them

• Control over Mediterranean - collapse of HRE in Italy and reformation both reduce AE it would generate, but mostly I'm just following peripheral road

• Control over Indian Ocean - lack of Suez canal makes any major naval operations difficult, but at least East Africa and Middle East are mostly controlled; I'd need to either build second fleet (as one in Mediterranean is very important for keeping coalitions in check), or just march to India by land.

• Control over the New World - one tiny colonial nation already spawned, but a lot more to go - lack of 4th colonist from policies and much larger number of provinces will make it harder than it used to be

• Disbanding or otherwise disabling the HRE - I could maybe take part in league wars at some point

Well, religious leagues started so I joined the Protestant league to make all my enemies join the Catholic league, then I switched to Catholic league after 5 year timer, and they all got friendly attitude and left coalition. With so few members, the remaining countries all disbanded the coalition anyway.

There are probably ways to cheese that even harder.

I got into war with Genoa, and apparently their Black Sea provinces are worth crazy amount of AE. It's as if AE was calculated based on distance to capital, not to province itself.

While I was waiting for warscore to tick up, Pope, Mantua, and Genoa all invaded my South American colonies. So much or AI giving a shit about naval range.

I finished Arabia, and almost finished conquest of East Africa, with Madagascar being split between my vassals - it turns out I overlooked tiny island held by Kilwa (with center of trade to make it sillier), and they westernized so I lost my nice CB.

My ally Brunei westernized too. Meanwhile my first colonial nation is still Anatolian. Maybe it should get border with the second one and westernize off that?

I diplovassalized Khiva and Tlemcen. I planned to release Khiva and feed it Timurid lands ages ago, but couldn't due to silly rules about not being allowed to take land unless I border it. It's all fine now.

Commonwealth briefly became possible to rival, then not again, so at least I refilled PP a bit.

Next step is to continue expanding in Asia and Africa. I plan to take influence ideas, which give AE discount, more relations, more reputation (for faster diploannexation), diploannexation cost discount etc. Considering how ridiculously long it takes to annex Persia, and how high AE in Europe is (elsewhere it's fine), I pretty much need it.


Post 11 - Originally published on Google+ on 2016-09-12 11:50:14 UTC

Colonial Kebab: Part 11: 1568-1580: Colonial matters

Corruption mechanic is pretty awful. I have slider at zero, so I'm not paying anything to fight it. It reduces itself by -0.02/year per stability level, and -0.05/year for being up to date on paper tech, and another -0.05/year for bird tech. So as Western (or almost Western) power, you can completely ignore it. As rest-of-the-world, you're fucked.

I fought natives near New York to establish 4th colonial nation, unfortunately I got stack wiped as I forgot how ridiculously large numbers natives get. So I sent them army twice the size, and once I conquered my way around New York, I sent them down to Mexico for 5th CN.

Colonization by Europeans is fairly anemic. Portugal got Brazil as CN, but it got wrecked so hard in Europe (reduced to Azores and Cape Verde) it had to release it. Castile got wrecked a bit less, but doesn't seem interested in colonization. Britain setup a small CN in Caribbean.

In Indonesia I conquered most of the westernmost island, and with light ships I sent to protect my transports I thought why not privateer a little. The answer is that I got found out, and that made Brunei angry at me and break our alliance. And they westernized so I can't just annex them. Oh well.

I annexed Persia, setup bigger Khiva, switched alliance from Sind to Delhi (as Sind is a lovely target), and expanded Georgia's territory into Crimea - my AE in Europe got low enough that it's fine now.

I was so busy with colonial business I missed that league wars started. I literally only noticed when I got report of naval combat. Wait, natives don't have navies, so WTF? What's that second war I'm in? What? Ouch.

This is going to be annoying, I only joined the league to disband my coalition, I guess I should have left as soon as it was done.

This is going to hurt, as 2/3 of my armies are in various remote parts of the world - Indonesia, Madagascar, Mexico, Ethiopia, Tunis. I only have 14k troops on rebel duty in my core lands, and Russia (in opposing league) already invaded by three times as much.

I even sold most of my ships (probably shouldn't have bothered, it's way too much micro with current interface), as new ships were coming in next two bird techs - but I didn't replace them as I went all in influence ideas instead of getting bird techs. And of ships I have, 2/3 are likewise away.

Then again, our league has twice the numbers and three times the ships theirs has, so even with half-hearted effort on my side we'll probably do fine. And if Germans on both sides bleed before it ends, so be it.

HRE Religious Leagues War interfering with my colonization plans. If it wasn't for Russian invasion of Georgia I'd be able to mostly ignore the whole thing.

New York waits for coring, Itza getting conquered, after it's done I can setup one in Louisiana.

Not sure what kind of CBs I can have on the now-independent Brazil. Just setup colony, fabricate one claim, then take the rest no-claim?

Slow kebabization of Indonesia. New Zealand is somehow filthy rich.

Post 12 - Originally published on Google+ on 2016-09-13 03:57:31 UTC

Colonial Kebab: Part 12: 1580-1590: League Wars

I was technically at war with half of Europe, but I mostly ignored it and started a bunch of others - Sind (my former allies), Timurids, Golden Horde, Bashkiria, random natives.

My fleets definitely underperformed. It wasn't quite a total disaster, but often even "won" battles ended up with more of my ships being sunk than enemy's. Once I get two next bird technologies, I'll rebuild it all.

Catholic League won a total victory, with 5 heretic electors losing their status, a lot of OPMs getting released, a lot of heretics paying up to Holland, dead Germans everywhere, and a lot of ships sunk. I got one of Greek islands from Venice as a reward, but my contribution was like 4% or so.

It was my first serious fight with Russia, and they don't fight too well, but they have huge stacks.

I'm not sure if Catholic victory benefits me or not. My most obvious enemies are Protestant League countries like Russia and Aragon, and if Holland stacks electoral roll with its allies and remain emperor, that's better for me than Austria as emperor.

On the other hand, if HRE now turns mostly Catholic and emperor starts passing reforms, that can end up poorly.

As soon as the league war ended, I finished a ring of vassals in Central Asia:

• Georgia (no good expansion opportunities left)
• Golden Horde (+50% liberty desire as historical rival, but a lot of Russia-held cores)
• Khiva (no good expansion opportunities left)
• Afghanistan (a lot of cores left)
• Baluchistan (can expand with CB on Indian tech countries)

I still have more vassal slots, so I plan to setup a few more in South-East Asia.

And meanwhile, I'll keep incorporating native tribes into my colonial nations. I'd prefer to avoid too much overextension until my corruption goes back to zero.


That's a very different HRE from usual.

Vassal ring.

Indonesia and New World. Those scattered colonies need to get connected.

Post 13 - Originally published on Google+ on 2016-09-14 10:30:20 UTC

Colonial Kebab: Part 13: 1590-1601: Global Kebab

I got hit by regency council, so I reloaded monthly autosave as this mechanic is bullshit (also getting removed next patch). Then again. Third time it was only a year, so I let it happen.

I continued colonial expansion, setting Sunda as Indonesian vassal to eat all OE and corruption. Unfortunately I split my fleet into 3 parts (New World, Mediterranean, Indonesia), and Indonesian fleet got sunk by Brunei.

The empire has fairly independent theaters, due to slow transfer time between them:

• New World - setup Louisiana CN, ate a bit more of Mexican minors, took over most of Brazil
• Indonesia - setup Sunda as vassal, ate most minors and chunk of (westernized) Brunei - remaining countries are mostly westernized
• Mediterranean - returned 4 cores from Morocco to my vassal Tlemcen - I'll need to conquer prettty much all of Morocco to reunify Islam
• Central Asia - ate most of Timurids and spawned OPM Punjab

Because of shortage of bird mana and lack of peace my WE was constantly >6 until I got event that made Brazil CN pay for the wall, I mean for reducing WE.

That meant a lot of rebels, and that meant a lot of attrition from fighting them, and that meant more WE. Not a fun loop, but I'm not wasting bird mana on this before I catch up on tech.

At least I got all the key shipbuilding technologies and rebuild my fleet with new ship types.

What's left of Castile and Portugal is small enough to be vassalized, but I lack good CBs for that.

Current plans is to keep expanding in multiple directions a bit, that's pretty close to coalition-proof.


Confusing color almost removed. Brazil somehow allied Morocco, so they got cobeligerented into this.

There's still a few CNs to do, like Canada and California.

I'm expanding so slow Europe forgot to coalition me.

Sunda had like 200k rebels by now. Because vassal OE is fine.

Post 14 - Originally published on Google+ on 2016-09-16 05:44:45 UTC

Colonial Kebab: Part 14: 1601-1613: Wars on all fronts

Even one year without starting a new war was annoying so I instantly started a lot of them:

• Castile (getting their guarantor Aragon into it) to vassalize them
• Genoa for leftovers
• some natives, then some more natives
• Kokkand, which meant breaking my alliance with Delhi
• Ahmendagar, also allied with Delhi
• Russia, for Golden Horde's cores
• OPM Ladakh because why not
• Morocco for Castile's cores (unfortunately Aragon occupied a lot of them first) and Ifni
• and a huge number of rebels - seriously, rebels are totally retarded and they spawn with 6x 42k stacks at once when the biggest armies in the world are below 100k except mine which is just a bit more than that. I think there was like a million rebels during those wars.

The problem was that I had no way around that WE as I was annexing two big vassals so until that ended, I'd not have bird mana to reduce WE (or to tech up).

The other problem was that they were actually fighting back a lot harder than I expected them to - well, mostly because I tried to do it with half as much army as I originally planned for, as the other half was fighting rebels.

And so my manpower actually went down to 0.

Unfortunately while I was fighting Morocco for Castile's cores, it got wrecked by Aragon, who took most of them, so I'll have to take them from a relatively stronger country.

In India I definitely overextended, and I had to fight about 100k of rebels from Baluchistan (140% OE) in addition to my own. So I diplovassalized Malwa, who's getting next batch of land.

I need just 4 provinces (for myself or vassals) to pass Unify Islam, and 1 of them is held by Morocco, and the other 3 by Aragon, so it will probably happen in  not too distant future.


Post 15 - Originally published on Google+ on 2016-09-16 14:10:58 UTC

Colonial Kebab: Part 15: 1613-1623: Kebab in Cordoba

I wanted to keep pressing, going against Aragon and Delhi, and maybe some assorted natives, but still scaling it a bit back from previous decade's warmongering.

It was still a lot more painful than expected, so against Aragon I just outlasted their allies, then I could deal with just them alone. I wanted to invade Sicily, but then new strait rules let them cheat their way back so that didn't really work out.

In the need I fully recovered Castile's territory (except for small French bits), and now I only need 3 more provinces to Unify Islam - Ifni, Palermo, and Messina.

And since Portugal accepted diplovassalization, I'll recover another good chunk of relevant trade nodes.

My army underperformed somewhat - part of the reason is that there's pretty much constant crusade against me, which gives countries fighting me military penalties. Why can't I call jihad as a caliph? It seems unfair.

My fleet underperformed too against Aragon and friends. It seems that light ships got nerfed quite hard since early days.

I let ulema go over 80% influence for a while, since penalties didn't seem that bad - it turns out there's hidden -1 stability when that happens and hidden -3 stability to remove the penalty - which I'll only be able to press after I remove their influence.

That's surprising, as penalties for very low estate loyalty are rather modest, and penalties for very high estate influence didn't seem so bad on tooltip.


I needed some provinces in West India for Unify Islam, but once my armies were in place, I might as well continue expanding.

I'm surprised by how many provinces I was able to get in one war. And nobody cares any more while they all wanted to wreck me for Vienna.

Post 16 - Originally published on Google+ on 2016-09-23 16:25:24 UTC

Colonial Kebab: Part 16: 1623-1643: Unifying Islam

I just noticed that having fought in league wars gives +0.5/year army tradition and -5% sword tech cost for 100 years. I didn't expect to get that as kebab.

Also taking humanist ideas, as honestly everything I want is in bird section (espionage, diplomatic), but I'm constantly short on bird mana and I have a ton of big vassals to integrate.

Getting rid of ulema was painful - I lost 1 stab when it happened, 1 by event, and 3 to remove them.

The next problem is that my vassals have been growing, and a lot of them are whiny crybabies who need constant placating. I should probably figure out how to quickly check their development to keep it just a bit under 300.

And there was one more issue - I need to be 100% Muslim to pass unify islam, but I got religious center in Ceylon. Fortunately I can cheese it, declare wars on infidel OPMs, get them to pay money, and get 100% piety this way.

A few wars later, I got remaining two provinces in Sicily necessary to unify islam. The problem is that I need all my provinces to be Muslim, which means not only those two, but also any ongoing colony.

I didn't even get any massive coalition for it - just Commonwealth and assorted Indians, the rest of Europe doesn't really care at this point.

With Portugal, Castile, and Tlemcen being my vassals, that's very good level of control over Mediterranean. I'd like to get the rest of coastline at some point.

Level of control over Indian Ocean is pretty good too.

Post 17 - Originally published on Google+ on 2016-09-29 13:18:59 UTC

Colonial Kebab: Part 17: 1643-1653: Islam Unified

I was about to press the Unify Islam button, but I overlooked that small thing that I needed 200 of each mana type for it, and wy bird mana was not really that high due to all the vassal integrations.

After I got the mana, I couldn't press it because one of my colonies wasn't Sunni. I'm not totally sure what's going on, as all my new world colonies became Turkish Sunni instantly, and so did early Indonesian colonies, but new ones in Philippines trade node stayed Filipino and pagan - I converted them later, but they retained wrong culture. Is it because there's a trade company in the region (even though those provinces are not assigned to it)? That's my best guess at least.

Well, it was a matter of some more waiting - and sending next colonists to New World. With all major goals achieved (except disbanding HRE, but how many times can one do this), it's good time to finish the game.

So some modding conclusions:

• long truce timers are not too awful, in sense that everything else got slowed down to that speed, so you'll be waiting with or without them
• sword mana is overflowing, paper/bird mana are always in short supply
• I wish AI was more moddable, just being able to tweak some weights would be enough
• it's painful how expensive vassals are to integrate
• it's also painful how butthurt vassals become when over 100/300 development
• the whole state/territory anti-blobbing measure seems fairly pointless
• forts and ZoCs are as broken as ever
• estates are probbaly a good addition
• I don't have much desire for more EU4