Another calibration test

See also previous results.

I stumbled upon another calibration test:

This one is almost entirely based on distances between US cities. I don’t even know where all the states are, so this involved a lot of guessing. But apparently, I’m quite a bit better at this than I thought:

The test has several shortcomings:

  1. It’s almost entirely based on US cities, making it very US-centric. It would be better if it were based on world cities, say, the 100 largest cities in the world. Yes, I’m aware that calibration is largely independent of content knowledge, but it is not entirely independent.
  2. They don’t provide the raw data from the test, which would be interesting to look at.
  3. They don’t provide any numerical output. The graph above is not easy to interpret because it lacks, e.g., confidence bands. By the looks of it, I was spot on for the 80-90% band, but I think that result was based on only a few items, as I didn’t select that category often.
  4. They seem to choose city pairs at random, and some of the resulting questions are meaningless. I had one with the same city twice! I also had one with no question at all, just the TRUE and FALSE options.
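
To illustrate the third point, here is a rough sketch of the numerical output I would have liked to see: the hit rate per confidence band, with an approximate 95% Wilson score interval around it. The test does not export its data, so the input format (a list of stated-confidence percentages paired with whether the answer was right), the function names, and the example answers are all my own assumptions and made up for illustration.

```python
import math

def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a proportion (roughly 95% for z = 1.96)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the interval is uninformative
    p = hits / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

def calibration_table(answers):
    """Summarize (confidence_percent, was_correct) pairs per 10-point band.

    Bands are 50-60, 60-70, ..., 90-100; a stated 100% goes in the top band.
    Only bands with at least one answer are returned.
    """
    counts = {}  # band index -> [hits, n]
    for confidence, correct in answers:
        band = min((confidence - 50) // 10, 4)
        hits_n = counts.setdefault(band, [0, 0])
        hits_n[0] += 1 if correct else 0
        hits_n[1] += 1
    rows = []
    for band in sorted(counts):
        hits, n = counts[band]
        low, high = wilson_interval(hits, n)
        rows.append({
            "band": f"{50 + 10 * band}-{60 + 10 * band}%",
            "n": n,
            "hit_rate": hits / n,
            "ci_low": low,
            "ci_high": high,
        })
    return rows

if __name__ == "__main__":
    # Made-up answers, NOT my actual results from the test.
    example = [(55, True), (55, False), (65, True), (85, True), (85, True), (95, True)]
    for row in calibration_table(example):
        print(row)
```

With only a handful of items in a band, the interval comes out very wide, which is exactly why a point on the graph that looks spot on may not mean much.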

In general, it appears I’m underconfident.