taw's blog

The best kittens, technology, and video games blog in the world.

Tuesday, September 13, 2016

Adventures with Raspberry Pi: RGB Led Take 2

Once upon a time I tried to do software PWM to set different colors in an RGB LED. It failed.

LEDs are generally either on or off, with no intermediate states - so to get an LED at half intensity, you just turn it fully on for half the time, and blink it fast enough that the human eye won't be able to tell the difference.

The problem was that the blinking wasn't fast enough. So now it's time for the long overdue debugging.

First, what the hell is the gem doing? Apparently it's simply writing to files like /sys/class/gpio/gpio17/value. So what if we just write to this file in a loop, skipping the gem? It turns out that's also just not fast enough.
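For reference, the file-writing approach can be sketched roughly like this. It's a minimal sketch, not what the gem actually does internally: pwm_cycle is a hypothetical helper name, and it takes any writable IO object so the logic can be tried without a Pi. On real hardware it assumes GPIO 17 is already exported and set to output.

```ruby
require "stringio"

# One PWM period: write "1" for the first `duty` of `steps` slots, "0" for the rest.
# `io` is anything that responds to #write and #flush. On a Pi it would be the
# sysfs value file, e.g. File.open("/sys/class/gpio/gpio17/value", "w")
# (assumption: the pin is already exported and configured as output).
def pwm_cycle(io, duty, steps = 256)
  steps.times do |i|
    io.write(i < duty ? "1" : "0")
    io.flush
  end
end

# Dry run against an in-memory buffer instead of real hardware
buffer = StringIO.new
pwm_cycle(buffer, 128, 256)
on_slots = buffer.string.count("1")
puts "on for #{on_slots} of #{buffer.string.size} slots"
```

Every slot costs a write and a flush through the filesystem, which is exactly the overhead that makes this approach too slow.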

So fallback plan, let's get wiringPi library (git clone git://git.drogon.net/wiringPi) and write it in C:


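#include <stdio.h>
#include <stdlib.h>
#include <wiringPi.h>

int main(int argc, char **argv)
{
  // Need three intensity arguments, each 0-255
  if (argc < 4) {
    fprintf(stderr, "Usage: %s R G B\n", argv[0]);
    return 1;
  }

  int r = atoi(argv[1]);
  int g = atoi(argv[2]);
  int b = atoi(argv[3]);

  printf("Raspberry RGB %d %d %d blink\n", r, g, b);

  if (wiringPiSetup() == -1)
    return 1;

  // wiringPi pin numbers, not BCM GPIO numbers
  pinMode(0, OUTPUT); // R
  pinMode(2, OUTPUT); // G
  pinMode(3, OUTPUT); // B

  // Randomized PWM: each pin is on with probability (value + 1) / 256
  for (;;)
  {
    digitalWrite(0, rand() % 256 <= r);
    digitalWrite(2, rand() % 256 <= g);
    digitalWrite(3, rand() % 256 <= b);
  }
  return 0;
}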

Then compile with gcc -o rgbled rgbled.c -lwiringPi, and run like sudo ./rgbled 255 127 0
And it works!

Now obviously I don't want to write C programs for every trivial thing, so next step would presumably be using ffi interface to wiringPi instead of what PiPiper does with file-based interface.

Monday, September 12, 2016

Using Trello for GTD

I IZ SAD. NO CHEEZBURGER 4 ME by stratman² (2 many pix and busy) from flickr (CC-NC-ND)
The big problem with GTD is that no software solution really matches the ideal workflow, and using post-it notes for it has its own problems.

I tried a lot of different software solutions. For a fairly long time I tried using a bunch of files in a Dropbox folder for it. The big upside was how easy it was to integrate with it - just crontab a script to put a file into inbox if I need to be notified of something. But plain text files are a really poor format for anything.

So as another tool in the long list I tried Trello. Here's the setup.

GTD board

I have one main list with a lot of columns:
  • Today (5) - just a place to highlight whatever I'm currently working on, or plan to work on if top item gets blocked. It gets empty by either finishing things or moving them back to action lists about daily.
  • Next Actions - I don't really feel like there's much value in using crazy number of contexts, most of which would contain no or very few items most of the time, so most actions go here.
  • Code Me - There's pretty much the only context which is constantly filled and clearly distinct from non-code actions.
  • Waiting For - what I'm waiting on to happen. Trello has advantage over plain text files, as I can put links, dates etc.
  • Someday/Maybe - a fairly vague list of ideas
  • Projects to Plan - these are sort of next actions, any project with no obvious next action goes there; the idea is that they'd go to Projects list once more actionable. It could be seen as another next actions column with "Plan Me" context tag.
  • Projects - any projects bigger than one action go here. Actions and projects should generally be linked, but usually it's obvious enough that I don't bother. Trello doesn't have easy way of showing projects with no associated actions, so I wanted to write a script to tag them, but I never got to it (Trello API isn't too bad).
  • Done - any recently finished action or project
  • Areas of Responsibility - mostly for reference during reviews. Anything bigger than a project.

GTD Archive board

About once a week I move Done column there, and add a proper date. It's mostly a feel-good board, with fairly little functionality.

Trello labels

Any long running project or area of responsibility gets its own label, as labels are the only easy way to tag trello cards. I use Card Color Titles for Trello Chrome extension, as otherwise Trello labels are fairly useless (you can see before and after in that link).

The only other label is red "blocked" label, which can be quickly applied and unapplied to action cards.

Off-Trello parts

Once upon a time I used to have "Buy Me" list, but nowadays I just throw things into my Tesco groceries or Amazon basket right away, and actually buy them weekly or so - and things not purchasable in either are rare enough they can go into generic action list.

Inbox is still a Dropbox folder, mostly with plain text files, so existing crontab scripts can still use it.
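A script of that kind can be tiny. Here's a minimal sketch, assuming the inbox is just a directory of text files; drop_into_inbox, the file naming scheme and the example note are all hypothetical, not my actual scripts.

```ruby
require "fileutils"
require "tmpdir"

# Drop a plain text note into an inbox folder and return the path written.
# Appends rather than overwrites, in case a note with the same name exists.
def drop_into_inbox(inbox_dir, title, body, now: Time.now)
  FileUtils.mkdir_p(inbox_dir)
  slug = title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
  path = File.join(inbox_dir, "#{now.strftime('%Y-%m-%d')}-#{slug}.txt")
  File.write(path, body + "\n", mode: "a")
  path
end

# Example run against a temporary directory instead of a real Dropbox folder
Dir.mktmpdir do |dir|
  path = drop_into_inbox(dir, "Renew domain", "Domain expires next week", now: Time.new(2016, 9, 12))
  puts File.basename(path)
end
```

Pointing it at the real inbox folder and running it from crontab is all the integration there is.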

How well does it work?

It all sort of works, but it's not exactly amazing. I don't plan to return to plaintext files, but I'll probably try something else eventually.

It's really annoying that I can't use it when offline, for example in London Underground - Dropbox had far better online/offline integration.

Friday, September 02, 2016

Modern Times mod for Crusader Kings 2 - Reaper's Due release

Lucy by hehaden from flickr (CC-NC)
Here's new release of Modern Times mod, now updated for 2.6.1, and mostly containing bugfixes, such as American invasion no longer accidentally being theocracy.

By accident infectious diseases were all gone in previous versions of the mod. While this could accurately show modern medicine, it's more fun to keep them, so now you can get them all depending on your game rules choices.

You'll only get Black Death if you set it to random, as on historical settings it will be long gone. Minor diseases happen just as in vanilla.

There's no SARS / HIV / bird flu or anything like that.

Thursday, September 01, 2016

How to teach coding

Book cat by raider of gin from flickr (CC-BY)

I've been helping people learn coding for a while now. Here are some notes.

Free resources

  • there's a lot of free resources out there
  • nearly all of them are of poor quality
  • it's very difficult to make good resources for someone very different than you - and by the time you can write a tutorial you're long past the beginner phase
  • very often resources spend far too much time on pointless distractions, have huge difficulty spikes, present material in order where current lesson depends on something that will only be explained in the future etc. It's clear they're not adequately tested on actual beginners.

How to learn coding

There's absolutely no reason for anyone to ever do anything else than:
  • stay in-browser as much as possible
  • learn basics of HTML and CSS
  • learn basics of jQuery
  • only then progress to anything else
As far as I can tell that's the only way beginners can actually create something interesting and useful.

If you start by teaching people ruby or python, the best they can do is some completely artificial terminal programs like guess-a-number or such.

Even if someone needs to learn ruby/python, the best way is to first teach them web technologies, and then thanks to some framework like Ruby on Rails they can build something useful.

I'd very strongly recommend against teaching people "Javascript" as such. What people need is just bare minimum to be able to do simple jQuery style manipulations. Non-jQuery Javascript is better left for far later.

Tools

A lot of resources try to teach beginners how to use terminals, text editors like Atom, git, github etc. before they get to any coding. Crazy ones even try things like vim.

It's mindboggling why anybody would consider it appropriate to start with this. It's a massive distraction from the goal of learning programming and writing useful programs.

Fortunately there's a powerful environment even absolute beginners are comfortable with, and that's the browser.
  • repl.it - run simple program and repls in almost every programming language
  • codepen.io - experiment with HTML/CSS/Javascript and related technologies
  • most online courses have in-browser editors and tests
It's useful for every beginner to have a github account and to download Atom, but these shouldn't be the focus.

For people who use OSX, going off-browser is tolerable, but for people with Windows laptops that's huge amount of pain, so it's especially important to stay in-browser as much as possible.

Free resources reviews for web development

They're fairly good, and you can do a lot in-browser:
  • freecodecamp - this is the best beginner resource for web technologies I found - it covers a lot of content, it's well structured, and contains low amount of nonsense; there's a bunch of stuff that's "coming soon"
  • codecademy - it has a lot of content (web and non-web), but a lot of it has serious issues like random difficulty spikes and chapters with poor explanations
  • codebar tutorials - they're OK, but they suffer from having to download files and do everything locally - I found that in-browser lets beginners focus on the subject much better and be less confused by tooling
It's important that beginners can use minimum of unfamiliar tools for it, and mostly stay in-browser.

It's also great that github.io offers free and very easy to set up hosting for such apps.

Free resources for non-web development

I'm much less happy with these resources compared with web development resources:
  • ruby in 100 minutes - it seems to take people about twice as long. Whenever anyone wants to do it, I generally tell them to go through chapters 2, 3, 5, 7, 8, 6, 9, 10, 11 and use repl.it.
  • Learn Ruby the Hard Way - I don't like this book, as it teaches Ruby as if it was Python, which feels like it completely misses the point.
  • codewars - good practice for intermediate level if you set the filters correctly (8kyu only, unsolved only, sort by popularity), as the defaults are completely wrong for beginners. It's much more useful for people who can already program and simply want practice in new language.
  • try ruby - a nice in-browser introduction. It suffers from minor distractions like symbols (I wish ruby just killed them completely) and ruby 1.9 leftovers.
  • udacity - I've been generally rather unhappy with quality of that, and they completely ignore all reported errors
  • books - just not worth it for beginners - in-browser environment and immediate feedback are just far superior
  • everything that you need to download to solve like rubykatas, exercism etc. - they're ok, but best left for later
It's much harder to setup hosting for your ruby/python programs, and it usually costs money.

Free resources for tools

Tools I'd recommend teaching:
  • stay in browser as much as possible - that's what everybody already knows
  • browser's development tools - this is generally fairly straightforward progression from basic browser skills everybody already has
  • codepen.io - far easier to get started than creating a bunch of files and keeping them synchronized etc.
  • repl.it - this should be the default repl, not any kind of in-terminal irb/ipython/etc.
  • Atom - from what I've seen beginners have little trouble with it, unlike with some complex editors. It has ton of plugins, works on everything, and it's perfectly adequate for serious programming as well.
  • github - the browser side of it is reasonably approachable, terminal side much less so, and I'm not sure if there are any good client-side programs to make it easier.
  • github.io hosting - to keep people motivated
  • terminal basics - it's fairly painful, and I wish Atom did more of it, so terminal would be needed less.
  • git basics - it really pains me, as this is extremely unfriendly towards beginners, but there's no alternative, and at some point they'll need to learn it - at least there's immediate payoff in github and github.io.
Unfortunately I haven't found great tutorials for any of the tools.

Wednesday, August 31, 2016

Let's Play Hearts of Iron 4 as Poland

I played Poland once before, in 1.0 version. It was my first real campaign after short Iran game to figure out game controls, and mostly thanks to AI being horrible I conquered Germany by 1938.

Now AI is somewhat less dumb, so it would probably be harder. The rush Germany strategy probably still works, but I wanted to try some alternative, in case they ever make Germany too strong for Poland to take. (there's even stronger strategy of taking advantage of Sudetenland glitch, but this is exploit-free campaign)

The strategy I wanted to try:

  • rush revanchism focus to be able to fabricate at 10% world tension, before guarantee spam starts
  • conquer all 3 Baltic states for extra factories
  • give Germans Danzig when presented with ultimatum
  • focus exclusively on Soviet Union
  • after Soviet Union falls, get Danzig back, and while at it Berlin as well

The series also tries to answer the question of just how good the build of 6 mountaineers, 2 artillery, and 1 medium tank is, but the conclusion of that is only in the last episode.

Here's episode 1. The rest will be published once a day.

Sunday, August 28, 2016

Let's Play Crusader Kings 2 as Islamic State with Modern Times mod

Here's a fun campaign I played on twitch as ISIS in Modern Times 2016 start. It should hopefully suffer from fewer technical problems than my HOI4 Nationalist China campaign, which started with poor microphone positioning (but eventually got better).

The campaign was fairly short, as after death of the first caliph my backup system failed me, so I couldn't continue - but his life was definitely eventful, and it should be fairly fun to watch, and in a way this gives it some kind of closure.

It's all using Modern Times mod I wrote, which allows playing any time from 1815 Congress of Vienna to 2016 today. It's still on 2.5.2 so if you want to enjoy the diseases in Modern Times you'll have to wait a few days for mod to update.

The whole playlist is on youtube, with episodes coming once a day as usual.

Here's the first episode:

Enjoy!

Friday, August 26, 2016

Data loss postmortem

Flash Fail? by E V Peters from flickr (CC-NC)

I just lost a lot of data, and I'm extremely annoyed, to put it mildly.

Here's my backup setup:
  • OSX laptop as primary
  • Gaming Windows 7 box as secondary, with cygwin installed
  • (in the past I also had a few more boxes to which this system was extended)
  • status script automatically checks all boxes - every file or folder is inspected according to some set of rules:
    • system files are considered safe
    • all git repos are considered safe if they're pushed to master with no extra files
    • everything that's in Dropbox folder is treated as safe
    • for things too big for Dropbox there's a pair of backup drives - everything on them is considered safe as long as both drives contain the same files (for obvious performance reasons I'm only checking directory listings, not TBs of content)
    • symlinks pointing to safe locations are safe
    • there's a whitelist of locations to ignore, for various low value data, applications' folders with nothing I care about etc.
    • everything else is automatically flagged as TODO
  • to prevent data loss in shell, rm command is aliased away (safe trash is used), mv and cp are aliased to -i to prevent accidental overwriting, and I'm very strict about always using >> and never under any circumstances > in shell redirects
  • Dropbox offers 30 day undelete, so anything deleted locally can still be recovered
  • and just to be super extra sure, various cloud contents are snapshotted every now and then to backup drives; list of installed software is snapshotted etc.
  • phones, tablets etc. all sync everything with the cloud, and contain nothing valuable locally
  • MP3 player and Kindle are mirrored on Dropbox, and synchronized automatically by scripts whenever they're connected
This system is really good at dealing with hardware failures, system reinstalls, and random human error. All files are protected from single failure, and in some cases from multiple failures.
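The core of such a status script is just a chain of rules applied to each path. Here's a minimal sketch of that idea, covering only the Dropbox, symlink, and whitelist rules (git repos, system files, and backup drive comparison are left out). The classify method, its parameters, and the order in which rules are checked are hypothetical, not my actual script.

```ruby
require "fileutils"
require "tmpdir"

# Classify a path as :safe, :ignored or :todo, checking rules in order.
# `safe_roots` (e.g. the Dropbox folder) and `whitelist` are hypothetical
# configuration passed in by the caller.
def classify(path, safe_roots:, whitelist:)
  inside = ->(p, roots) { roots.any? { |root| p == root || p.start_with?(root + "/") } }
  return :ignored if inside.(path, whitelist)
  return :safe if inside.(path, safe_roots)
  if File.symlink?(path)
    # Symlinks are safe when they point into a safe location
    target = File.expand_path(File.readlink(path), File.dirname(path))
    return :safe if inside.(target, safe_roots)
  end
  :todo
end

# Try it on a throwaway directory tree
Dir.mktmpdir do |home|
  dropbox = File.join(home, "Dropbox")
  saves   = File.join(home, "saves")
  FileUtils.mkdir_p([dropbox, saves])
  FileUtils.touch(File.join(dropbox, "notes.txt"))
  FileUtils.touch(File.join(home, "stray.txt"))
  File.symlink(File.join(dropbox, "notes.txt"), File.join(home, "notes-link"))

  %w[Dropbox/notes.txt saves notes-link stray.txt].each do |name|
    status = classify(File.join(home, name), safe_roots: [dropbox], whitelist: [saves])
    puts "#{name}: #{status}"
  end
end
```

Note how the whitelist rule silently wins over everything else: anything under a whitelisted folder never shows up as TODO, no matter how valuable it is.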

Unfortunately there are two huge holes in the system:
  • configuration which doesn't live in user-accessible files - like /etc on OSX, Windows registry etc. This is less of an issue nowadays than it used to be.
  • the manually created whitelist of locations to ignore. You can guess where this leads.
It also offers limited protection from any kind of hacking or ransomware attack, but in any realistic threat model they're not terribly important.

Video Games

For casual gamers it's enough to just install games with Steam or whatever, and enjoy.

This unfortunately is absolutely unacceptable if you're into any kind of serious gaming. Steam autoupdates both game and all its mods, with no way to roll back, so if you had any kind of long running campaign, it will get wrecked.

As far as I can tell, that's what caused death of my Let's Play Civilization V as Germany series - I probably mindlessly pressed buttons to update 3UC/4UC mods, and that resulted in unfixable save game corruption.

So to protect against this, if possible I'm not playing using Steam - instead I install every version to separate folder. All versions of same game unfortunately share same user data folders, so if I ever want to go back I need to do some folder reshuffling, but as long as I don't run that game in Steam, mods won't get overwritten by newer versions, so I can safely play even campaign that takes months.

And I'm perfectly aware that for Paradox games it's possible to revert to previous versions as betas, but that does absolutely nothing whatsoever to deal with mods irreversibly autoupdating without my consent, and in HOI4 (and apparently Stellaris, but I never played that) it's even worse as mods are saved deep in Steam user data, so I had to write some script to even have a mod folder I can safely backup.

Now here's where first part of the problem begins - I added all folders with save games to the whitelist. This is mostly reasonable, as I don't need long term backups of them, and if I lose saves from campaigns I already finished, it's no big deal.

Unfortunately whitelist has no good way to tell them apart from saves (and mod folder) for any ongoing campaigns, so here's failure number one.

Uninstallers

I've noticed that I had way too many old versions of various games installed, so I decided to clean them up - there's zero risk in deleting installed applications, so it was a routine thoughtless operation.

While uninstalling some old version of Crusader Kings 2, just another confirmation popup happened, which I automatically replied with a yes, and then it deleted my whole user directory with all my saves and everything else.

This is unacceptable UX on so many levels:
  • Surprise popups should never ask to delete user data - it should either never happen, or be a checkbox user must explicitly choose. It is completely unacceptable.
  • if you ever actually delete user data, use system trash. It is completely unacceptable to use hard delete like it's 1980s and we learned nothing in last 30 years of computing.
If your software does it, just stop writing software until you learn better, because you're causing more harm than good.

So we had 3 failures in a row (one my fault, other two the fault of whoever wrote that uninstaller), but that was still sort of recoverable with undelete process which existed since days of DOS.

I downloaded some software for it - the first one was bait and switch bullshit which would display files it found, but wouldn't actually recover anything. If you write that kind of software, please just kill yourself, there's no hope for you.

Second, I found some legitimate recovery software. It recovered the files to a second drive, so I thought the 4th level of protection worked... and unfortunately they were all filled with zeroes. That confused me, but then I noticed that it was all on an SSD and the TRIM command was indeed enabled, which completes the explanation.

Next actions

Historical saves from my past campaigns were nice to have for testing some tools, but I don't care about them terribly much. Recovering settings and mod folder from scratch will take maybe an hour, as it contained a mix of mods from Steam Workshop, downloaded separately, and my own. Annoying, but not a big deal.

What I lost were mostly saves for my ongoing Let's Play CK2 as Islamic State [Modern Times mod] campaign I've been playing on twitch. It got up to the point where the first caliph died, and his underage son inherited Islamic State. It was still quite fun, and I have all the video saved, so I'm going to upload that to youtube soon enough - and in the meantime all 3 sessions are available on twitch.

Even after this loss, I still have 22GB of save files in my folder. If this was OSX, I could just move them to Dropbox and symlink back (size to value ratio is not great, but often doing this brute force is good enough), but that's not terribly reliable on Windows, so I'll probably just delete old ones manually, remove save folders from the whitelist and instead tell the script to copy them all over to Dropbox.

The upside is that this is the biggest data loss I had in something like 10 years. The only other incident was losing about two days' worth of git commits to one repository I apparently forgot to push before formatting an old laptop, which also annoyed me greatly.

Two incidents in a decade is pretty much nothing compared to the kind of massive data loss I suffered (to hardware failure) before that twice, and which made me the level of anal about backups you can see.

Monday, August 22, 2016

CSS is uniquely impossible to test

Cat by Adrian Midgley from flickr (CC-NC-ND)

Times change. Back when I started this blog ten years ago serious automated testing was something people had generally heard of, but very few actually did. It was very common for even big projects to have literally zero tests, or if they did they were token tests or at best some regression checks.

Then TDD's shaming campaign happened, and it was even more effective than shaming campaigns against smoking, and now not testing is the exception, and most fights are over what kind of testing is most appropriate.

It was mostly cultural change. Ruby or Java were pretty much just as testable 10 years ago as they are now, but underlying technology changed considerably as well. Just some of such changes:
  • Very low level languages like C/C++ where any bug just corrupts memory at random are extremely hard to test - they're far less popular than they used to be (and the ones that still exist usually have nonexistent or very shitty tests)
  • Languages like Perl which didn't even have working equality and had a lot of context dependence are much less popular - Perl was still possible to test, but it was a bit awkward
  • Headless browsers made it possible to reasonably test javascript
  • jQuery and greater compatibility between browsers made cross-browser javascript testing basically unnecessary
  • Web-based user interfaces are far easier to test than most native interfaces
  • Going all web made cross-OS testing unnecessary, and if you really need them VMs are far easier to setup than ever
  • Application logic in database paradigm mostly died out, and much easier to test application logic in application paradigm is clearly dominant now
  • Complex multithreading never got popular, and it's more common to have isolated services communicating over HTTP or other messaging
  • Cloud makes it much easier to replicate production setup in test environment for reliable system-level testing
  • All languages have a lot more testing libraries, so things like mocking network or filesystem communication which used to be massive pain to setup are now nearly trivial.
  • There are now ways to test with multiple browsers at once, even if it's still not quite as simple.
And yet, one technology from dark days before testing is still with us, and shows no sign of either going away or becoming testable. CSS.

Let's just cover a few things which in theory ought to be possible to validate automatically, but for which there are no good ways to do that:
  • Site works with no major glitches on different browsers. Any major difference should be flagged, but what counts as "major" difference would probably need somewhat complex logic in testing library.
  • Site looks reasonable on different screen sizes. There will be differences, and testing library would need to contain a lot of logic to determine what's fine and what's not. Some examples would be maximum/minimum element sizes, no content missing unless specifically requested to be hidden, no content cut by overflow, no horizontal scrollbars etc.
  • All CSS rules in your application.css are actually used. It seems everybody's CSS accumulates leftovers after every refactoring, and with some browser hooks it ought to be possible to automatically flag them.
  • When you do CSS animations, start and end state show what they ought to. Even disregarding transitions. Some kind of assertions like "X is fully visible and not covered by any other element or overflow: hidden", "Y cannot be seen" would be great, but they're not easy to do now.
As far as I can tell there's been minimal progress in ten years. There are some attempts at CSS testing, but their tests are far too low level, don't address real needs, and as a result nearly nobody uses them.

I don't have any solutions. Hopefully over next few years it will get better or we'll replace CSS with something more testable.

Sunday, August 21, 2016

Test coverage in ruby

IMG_5547 by rautiocination from flickr (CC-NC)

In highly dynamic language like ruby it ought to be common sense to aim at 100% test coverage. Any line without coverage can contain any kind of typo or other silliness and crash in production.

Statically typed languages have a minor advantage here, as compile-time type check is sort of a test. Unfortunately it has such a massive false positive rate and incurs such a massive penalty to expressiveness as to make the whole idea of static typing worthless. In early days of computing people seriously expected rich static type systems to be mathematical proofs of various properties of languages, but that went nowhere. Anyway, enough history.

In ruby 100% test coverage - every part of the program being executed at least once by some kind of rudimentary test - really ought to be the minimum goal. Just running the code and verifying that it returns something vaguely reasonable without crashing really isn't that much to ask for.

Unfortunately it seems to be far too common for people to see 100% coverage as some highly ambitious goal, and be satisfied with "good enough" numbers like let's say 80%.

Let's have a talk about all that.

Rails autoloader

A lot of ruby code uses Rails, which uses code autoloading in development for a good reason, but unfortunately it also uses it during testing when it really seriously shouldn't. Someone please submit that as a bug report, on the off chance it won't get EWONTFIXed.

This leads to a very nasty situation where simplecov reports that you have 100% test coverage, even though you completely forgot about a bunch of models and controllers. Not loaded means not included in coverage report at all, instead of being big red 0%.

Rails generators will automatically create empty test classes for you, so if you use generators you're going to be somewhat less affected by this problem, but most people end up creating a lot of files manually as well.

Anyway, if you use any kind of Rails project for which you want to get reasonable coverage report, you ought to put something like this in your spec/spec_helper.rb or equivalent, possibly guarded by if ENV["COVERAGE"].present? if you'd prefer to only generate coverage report when explicitly requested:

SimpleCov.at_exit do
  Rails.application.config.eager_load_paths |= Rails.application.config.autoload_paths
  Rails.application.eager_load!
  SimpleCov.result.format!
end

If you do that on an existing project, you might be in for a big surprise.

0 tests = 50% test coverage

Next time you think 80% coverage is good, try to run coverage report with all tests disabled. You'll probably get a figure around 50% coverage, from zero tests. Let's take a look at this example:

class HelloWorld
  def say_hello
    putts "Hello, world!"
  end
end

In meaningful sense it has only one line of code, third, which also happens to have nasty typo. As far as ruby and simplecov are concerned however, there are three lines, as that code is more or less just syntactic sugar for:

(HelloWorld = Class.new).instance_eval do
  define_method :say_hello do
    putts "Hello, world!"
  end
end

And lines 1-3 are all perfectly executable. Depending on how long your methods are, you'll get something close to 50% test coverage with no tests - just from the way ruby works.
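You can check this without simplecov, using the Coverage module from ruby's standard library. This is a minimal sketch: it writes the class to a temporary file (the file name is arbitrary), loads it without ever calling say_hello, and counts which lines ruby considers executed.

```ruby
require "coverage"
require "tmpdir"

source = <<~RUBY
  class HelloWorld
    def say_hello
      putts "Hello, world!"
    end
  end
RUBY

line_counts = Dir.mktmpdir do |dir|
  path = File.join(dir, "hello_world.rb")
  File.write(path, source)
  Coverage.start
  load path # only defines the class, say_hello is never called
  _, counts = Coverage.result.find { |file, _| file.end_with?("hello_world.rb") }
  counts
end

# nil means "not an executable line", 0 means "executable but never run"
executable = line_counts.compact
covered = executable.count { |hits| hits > 0 }
puts "#{covered} of #{executable.size} executable lines covered with zero tests"
```

The class and def lines count as executed just because the file was loaded, and only the line with the typo stays at zero.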

That means that "80% coverage" might very well be 60% of actual code, and that sounds a lot less appealing, doesn't it?
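The arithmetic behind that, as a quick sketch: if a share of lines gets executed just by loading the code, only the remainder tells you anything about tests. effective_coverage is a made-up helper name, and the 50% "free" share is the rough figure from above, not a measured constant.

```ruby
# Share of "real" lines covered, given the reported coverage and the share of
# lines executed just by loading the code. Both are fractions between 0 and 1.
def effective_coverage(reported, free)
  (reported - free) / (1 - free)
end

# Rational literals avoid floating point noise in the result
puts effective_coverage(0.8r, 0.5r).to_f # → 0.6
```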

ruby -w

I tend to dislike linters, as they mostly tend to enforce authors' misguided ideas on what is proper way to code, and have extremely high false positive rate for finding genuine problems.

For some reason every one of them has some ideas which are not just pointless, they result in actually horrible code. For example rubocop will whine if you use consistent double quotes and will demand awful mix of single and double quotes; and then it will whine even more if you consistently use trailing comma in multiline Array/Hash literals, instead forcing you to write ugly code with even uglier git diffs and pointless merge conflicts. The project should die in a fire, and its authors should feel ashamed of ever publishing such pile of crap.

One simple alternative to that is checking your program with ruby -w, which has a fairly low rate of false positives (whining about system *%W[some --command] is the only one I keep running into a lot), but can still find most simple typos, misindentation, and such.
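Here's a small sketch of what that looks like in practice. It runs ruby -wc (warnings plus syntax check only) on a snippet with a typo in a variable name and captures the warnings; the snippet and its method name are made up for illustration.

```ruby
require "open3"
require "rbconfig"
require "tempfile"

# "dicsount" is a typo for "discount", so the assigned variable is never used
source = <<~RUBY
  def discounted(price)
    dicsount = price * 0.1
    price - discount
  end
RUBY

warnings = Tempfile.create(["typo", ".rb"]) do |file|
  file.write(source)
  file.flush
  # -w enables warnings, -c only checks syntax without running the code
  _stdout, stderr, _status = Open3.capture3(RbConfig.ruby, "-wc", file.path)
  stderr
end

puts warnings
```

The unused variable warning points straight at the typo, without a single test being written.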

This kind of analysis is usually not counted as part of coverage report, but is useful supplement.

Error lines

Even in reasonably tested software it is very common to see lines which raise exceptions left uncovered.

There are two cases which look the same syntactically:
  • errors because something bad happened
  • errors because something supposed to be impossible happened
For example z3 gem had this code in a private method:

  def check_sat_results(r)
    case r
    when 1
      :sat
    when 0
      :unknown
    when -1
      :unsat
    else
      raise Z3::Exception, "Wrong SAT result #{r}"
    end
  end

z3 library is supposed to return -1, 0, or 1, so the whole raise line is just debugging aid and it's rather pointless to test it - but on an off chance gem author messed it up or z3 quietly changed its API, it's better to have this kind of line than just return nil quietly.

On the other hand if the same code was in some user-exposed API, then we should probably be checking that exception happens when we pass unexpected value.
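For the user-exposed case, such a check could look like this minimal minitest sketch. SatResultError is a stand-in for Z3::Exception so that the example runs without the z3 gem, and the test class is hypothetical, not taken from the gem's test suite.

```ruby
require "minitest/autorun"

# Stand-in for Z3::Exception so the sketch runs without the z3 gem
class SatResultError < StandardError; end

def check_sat_results(r)
  case r
  when 1 then :sat
  when 0 then :unknown
  when -1 then :unsat
  else raise SatResultError, "Wrong SAT result #{r}"
  end
end

class CheckSatResultsTest < Minitest::Test
  def test_known_values
    assert_equal :sat, check_sat_results(1)
    assert_equal :unknown, check_sat_results(0)
    assert_equal :unsat, check_sat_results(-1)
  end

  # This is the test that turns the raise line green
  def test_unexpected_value_raises
    error = assert_raises(SatResultError) { check_sat_results(2) }
    assert_equal "Wrong SAT result 2", error.message
  end
end
```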

These two situations are semantically different, but syntactically equivalent, so coverage reporter can't tell the difference between the two.

This only gets worse because...

Coverage is by line

So it's very easy to rewrite such "not covered" lines into code which doesn't execute but appears green in the coverage report. Like this:

  def check_sat_results(r)
    raise Z3::Exception, "Wrong SAT result #{r}" if r > 1 or r < -1
    case r
    when 1
      :sat
    when 0
      :unknown
    when -1
      :unsat
    end
  end

or this:

  def check_sat_results(r)
    {
      1 => :sat,
      0 => :unknown,
     -1 => :unsat,
    }[r] or raise Z3::Exception, "Wrong SAT result #{r}"
  end

Sometimes such code which doesn't get own line looks perfectly natural, other times it feels a bit convoluted.

Unfortunately there's no reason to expect that we'll consistently use this form for "can't happen" errors, but never for "legitimate" exceptions, so we'll have both false positives and false negatives here.
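You can see the effect with Ruby's built-in Coverage module, without involving simplecov at all. This is a sketch under a few assumptions: both methods are made up, ArgumentError stands in for Z3::Exception, and raise_line_counts is a hypothetical helper. Only valid values are ever passed in, so neither raise actually runs.

```ruby
require "coverage"
require "tmpdir"

# Two made-up versions of the same check; neither raise is ever executed
CHECK_SOURCE = <<~'RUBY'
  def multi_line(r)
    case r
    when 1 then :sat
    when 0 then :unknown
    when -1 then :unsat
    else
      raise ArgumentError, "Wrong SAT result #{r}"
    end
  end

  def one_line(r)
    raise ArgumentError, "Wrong SAT result #{r}" if r > 1 or r < -1
    { 1 => :sat, 0 => :unknown, -1 => :unsat }[r]
  end

  [1, 0, -1].each { |r| multi_line(r); one_line(r) }
RUBY

# Loads the source with line coverage enabled and returns
# { source line => execution count } for every line containing "raise"
def raise_line_counts(source)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "check_sat.rb")
    File.write(path, source)
    Coverage.start(lines: true)
    require path
    # Look the file up by basename, as the reported path may be a resolved one
    _file, data = Coverage.result.find { |file, _| File.basename(file) == "check_sat.rb" }
    counts = data[:lines]
    source.lines.each_with_index
          .select { |line, _index| line.include?("raise") }
          .map { |line, index| [line.strip, counts[index]] }
          .to_h
  end
end

RAISE_COUNTS = raise_line_counts(CHECK_SOURCE)
RAISE_COUNTS.each { |line, count| puts "#{count.inspect}\t#{line}" }
```

The raise on its own line should come back with a count of 0, while the one-liner gets a nonzero count just because its condition was evaluated.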

Scripts

Unix scripts, rake files, deploy scripts, and such are left untested so often they're usually not even included in coverage report. I wrote previously about some patterns for testing them, but even then they're not really the low hanging fruit of testing.

Even worse, when you test your scripts in a realistic way by actually executing them, simplecov makes you jump through quite a few hoops before it will show code executed via such scripts as covered.

What is code anyway?

Your codebase might contain files which are sort of code, but not completely so, like .haml/.erb files. They're not generally included in coverage reports, even though view logic is as much part of program logic as anything else, and complex views are fairly common, even if people frown at them.

You can move complex logic out of views into models, controllers, and helpers, but even when all that's left is relatively straightforward it would be useful to know it was run at least once during testing.

Automatically generated code

Metaprogramming is a big aspect of ruby, so it's fairly common for ruby programs to contain a lot of generated code.

It still deserves some kind of at least rudimentary testing, even if tests are generated as well.
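A sketch of what generated tests for generated code can look like. The Order class and its statuses are made up for illustration. The same list that drives define_method also drives the checks, so every generated method gets called at least once.

```ruby
# Hypothetical class with predicate methods generated by metaprogramming
class Order
  STATUSES = %i[pending paid shipped].freeze

  attr_reader :status

  def initialize(status)
    @status = status
  end

  # Generates pending?, paid?, and shipped?
  STATUSES.each do |name|
    define_method("#{name}?") { status == name }
  end
end

# Generated checks: every predicate is called against every status
Order::STATUSES.each do |actual|
  Order::STATUSES.each do |asked|
    result = Order.new(actual).public_send("#{asked}?")
    raise "#{asked}? is wrong for #{actual}" unless result == (actual == asked)
  end
end
puts "all generated predicates checked"
```

It's rudimentary, but it catches the typical metaprogramming failures like a misspelled method name or a closure capturing the wrong variable.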

Count all kinds of tests together for coverage report

Test coverage scale doesn't go from "no tests" to "great tests", it goes from "no tests" to "probably good enough tests", and it can't distinguish between "probably good enough" (at 100% or close to it) and tests which are actually great.

There's a temptation to check unit test coverage - counting only code tested directly, not "covered" accidentally by some high level integration tests.

That's a clever idea, but a small number of high level integration tests can provide a lot of value for relatively little effort, and a lot of code probably doesn't deserve much more testing than that.

An excessive number of low-level tests would not only be slow to write, but they'd need constant updates every time something gets refactored.

It's a red flag if code you expect to run in production doesn't get run even once during your tests, but you should leave yourself flexibility on what kind of testing is most appropriate.

Merging coverage reports from multiple runs

This is currently far more complex than it has any reason to be. Let's say your program has a bunch of scripts, and you want to run them as scripts, and still include them in the coverage report.

The easiest way to deal with it I found is injecting a small bit of code into them. So in your test you'd run your scripts as IO.popen("ruby -r./spec/coverage_helper #{executable_path}").read and spec/coverage_helper.rb would set up a per-script coverage report:

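# Does nothing unless the COVERAGE environment variable is set
if ENV["COVERAGE"]
  require "simplecov"
  # Name this run after the script being executed
  SimpleCov.command_name File.basename($0)
  SimpleCov.start do
    add_filter "/spec/"
  end
  SimpleCov.at_exit do
    # Remove verbosity
    $stderr = File.open(File::NULL, "w")
  end
end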

Those partial coverage reports are then automatically added together when you finish.
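On the test side, a small helper along these lines keeps the specs readable. It's only a sketch: run_script and its helper: parameter are hypothetical names, not part of simplecov. It uses the array form of IO.popen so paths with spaces don't need shell quoting, and RbConfig.ruby so the child process uses the same interpreter as the test suite. The child inherits the environment, so COVERAGE set for the test run is visible to the helper file too.

```ruby
require "rbconfig"

# Hypothetical spec helper: runs a script in a child Ruby process with an
# extra file preloaded via -r, and returns everything the script printed to stdout
def run_script(script_path, *args, helper: "./spec/coverage_helper")
  IO.popen([RbConfig.ruby, "-r#{helper}", script_path, *args], &:read)
end
```

Then a test is just a matter of calling something like run_script("bin/some_script", "--some-flag") and checking the returned output, where the script path and flag are placeholders for your own.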

Thursday, August 18, 2016

Map mode for Hearts of Iron 4 resistance and suppression system

If HoI4 developers insist on keeping the current resistance and suppression system, and I think they shouldn't, the bare minimum they need to do is making its map mode functional.

Unfortunately there's no way to make them do so, and no way to mod map modes in-game (one of the biggest gaps in moddability - like where are truce map modes?). So as the next best thing I grabbed my old map generation scripts (which I originally used for generating timelapse gifs for my EU4 campaigns) and adapted them to display suppression information for HoI4.

Here's alt-tab map mode for that, the way it ought to work.

Colors and numbers indicate how many 1cav divisions you need to suppress states, with correctly applied rounding etc. All colors for 15+ cavs are the same, as it felt unnecessary to have a whole scale for just a handful of provinces.

See my previous post for explanations of why 1cav is the best strategy and the math behind it all.

Gentlest:


Gentle (default):

Harsh:

Harshest:

This is pretty much what I'd like to see in game when you select resistance map mode. Not forcing math on tooltip - just show by colors and numbers how many units are needed to keep a state in good order.