Monthly Archives: May 2013

Old Solutions For New Problems

About a year ago, I wrote a quick little program to help the judging process for the 59 Days Of Code contest.  Please note that I need to emphasize the word ‘quick’ before I go any further, and that’s a word that should scare any professional programmer.  It’s alright to write a ‘quick’ prototype, but trying to do anything that needs to be production quality ‘quickly’ is a dangerous idea.  Testing is important, not just to catch the bugs you expect, but the bugs you don’t expect.

In the case of the program in question, we had an issue where somehow the data stream between device and server was corrupted.  We had all the original data stored in logs, but trying to sort through JSON strings stored in a log is a time-consuming, annoying process.  It’s technically doable, but as I rapidly realized on the day of the event, trying to do it on the fly simply isn’t practical.  I needed to have already written or found an application to help handle the process, and I hadn’t.

But restoring the data, while important, isn’t half as good as preventing the corruption in the first place.  Towards that goal, I hope to re-write that software this year to do two things differently.  Firstly, devices won’t download data from the server willy-nilly in an attempt to ‘just work’; they will download on request only.  Secondly, I need some way to make sure the data reaches the server intact.

The question, of course, was how.  I could have dug in and done some research on the problem, and oh boy oh boy are there a lot of solutions out there for it.  But when the opportunity to re-write the software was first discussed, I didn’t have the internet handy.  I had to rely on what I already knew, and for some reason verifying messages against accidental corruption isn’t something I really learned in school.

The quick solution I came up with, in about 5 seconds of thought while on the phone, is the point of this post.  I never learned anything about how to handle accidental corruption, but something else came to mind: when studying security, I learned several techniques to handle man-in-the-middle attacks.  Sure, the cryptographic portions of those techniques would be a pain to manage, but the basic underlying concept didn’t necessarily need to apply against an actual ‘attack’.  Having the packet become malformed between device and server is, in a very real sense, a ‘man in the middle’ attack by random chance.  And since I’m worried about random chance, not a deliberate attacker, I can skip the keys and signatures, and suddenly I have a solution.  Just take the basic message, generate a plain MD5 hash of it, send the hash along so the server can recompute and compare, and voila!  Message integrity checking made ‘easy’.
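To make the idea concrete, here is a minimal sketch in Python of what such a check might look like.  It is not the contest software: the envelope layout, the function names `wrap_message` and `unwrap_message`, and the `judge`/`score` fields are all made up for illustration.  It also only guards against accidental damage; an unkeyed MD5 offers no protection against someone deliberately altering the message.

```python
import hashlib
import json


def wrap_message(payload: dict) -> str:
    """Serialize the payload and attach an MD5 digest of the serialized text."""
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.md5(body.encode("utf-8")).hexdigest()
    # Hypothetical envelope: the body travels as a string next to its digest.
    return json.dumps({"body": body, "md5": digest})


def unwrap_message(raw: str) -> dict:
    """Return the payload, or raise ValueError if the message was damaged."""
    try:
        envelope = json.loads(raw)
        body, expected = envelope["body"], envelope["md5"]
        actual = hashlib.md5(body.encode("utf-8")).hexdigest()
    except (ValueError, KeyError, TypeError, AttributeError) as exc:
        raise ValueError("malformed envelope") from exc
    if actual != expected:
        raise ValueError("digest mismatch: message corrupted in transit")
    return json.loads(body)
```

The device would send `wrap_message({"judge": 7, "score": 9})`, and the server would call `unwrap_message` on whatever arrives, asking for a resend whenever it raises.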

The point of this, of course, is that I took an old solution for a different problem and re-purposed it for a new one.  The fact that, historically speaking, the solution for security probably evolved out of the solution previously used for verifying data integrity is simply a rather amusing joke on me.

The ability to take a new problem and apply an old solution is important.  In programming, old does not automatically mean bad.  Sure, A* pathfinding isn’t as cool or good as flow pathfinding; for a game with many units I’d choose to implement flow any day of the week and twice on Sundays!  But if I’m writing a GPS program, A* is probably still the better route to take, because my concerns regarding other ‘units’ (other cars on the road) can be better expressed by modifying the relative weight of various connections between nodes on my map.  More than that, some day I might find that the basis of the A* algorithm could be useful for something else entirely.  For what, I don’t know yet.
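As a rough illustration of the GPS case, here is a small A* sketch in Python where congestion is expressed purely as a multiplier on an edge's weight.  The graph, the heuristic values and the `traffic` parameter are invented for the example, and it assumes multipliers are never below 1.0 so that a heuristic which is optimistic for the empty road stays optimistic in traffic.

```python
import heapq


def a_star(graph, heuristic, start, goal, traffic=None):
    """Return (cost, path) for the cheapest route, or None if unreachable.

    graph:     {node: {neighbor: base_cost}}
    heuristic: {node: optimistic estimate of the remaining cost to goal}
    traffic:   {(node, neighbor): multiplier >= 1.0} for congested roads
    """
    traffic = traffic or {}
    frontier = [(heuristic[start], 0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale entry; a cheaper way to this node was found later
        for neighbor, base in graph[node].items():
            # Other cars on the road only change the weight of the connection.
            new_cost = cost + base * traffic.get((node, neighbor), 1.0)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                estimate = new_cost + heuristic[neighbor]
                heapq.heappush(frontier, (estimate, new_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical map: a fast road through B and a slower one through C.
roads = {"A": {"B": 4, "C": 5}, "B": {"D": 4}, "C": {"D": 5}, "D": {}}
remaining = {"A": 7, "B": 4, "C": 5, "D": 0}

clear_route = a_star(roads, remaining, "A", "D")
jammed_route = a_star(roads, remaining, "A", "D", traffic={("A", "B"): 2.0})
```

On the clear map the route goes through B; once the A to B connection is weighted as twice as expensive, the same unchanged algorithm prefers the route through C.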

The only knowledge that is ever wasted is knowledge you forget because it’s ‘useless’.  Maybe it doesn’t apply immediately, but keep it tucked away somewhere.  Maybe you can use it somewhere down the line, for something totally unrelated to its original source.

Bit Length: Making A=B cause A==B to evaluate false

I was hoping to make my first ‘real’ blog post something interesting and useful.

Instead, it’s going to be a laughing rant over the wonders of bit lengths.

Once upon a time, it was necessary to count every byte of data you put into something.  If an 8-bit number served your purpose, you used exactly 8 bits and tried to shave some memory out elsewhere.  You didn’t use 16 bits or more unless you really needed them, because every last bit was precious.

These days, we’re mostly past that.  We just say ‘int’ and are done with it, without really thinking about how wide it is.  After all, the default of 32 bits is huge, and while some applications can hit that limit, many, many more won’t.  (And with 64-bit OSes becoming more and more common, I would not be surprised to see the default int length become 64…).

So, when I created my Core Data schema, I picked a nice, sensible 16-bit length.  Given that I was transitioning from storing a version date to a version number, it made sense.  We’re not going to have hundreds and thousands of versions of an ad.

Unfortunately, the server side of the code hasn’t made that transition yet.  It is still based on dates, and the datetime values are reduced to a 32-bit Unix timestamp to transfer them to the device.

The trouble is that a 32-bit timestamp doesn’t fit in a 16-bit field; the upper bits are lost when it is stored.  So I just spent several hours debugging code which reduces, logically, to the following:


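// A is the 16-bit stored version; B is the 32-bit timestamp from the server
if (A != B)
{
    // Do STUFF
    A = B;  // truncated to 16 bits, so A still differs from B next time
}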

For some strange reason, I always got into the if block.  It was only after a long stretch of debugging that I discovered that the issue was the above code, at which point it didn’t take too long to figure out what was actually wrong: assigning the 32-bit B to the 16-bit A drops the upper bits, so A never equals B on the next comparison.  Just… frustrating to get there.
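For anyone who wants to see it happen, here is a small Python sketch of the same trap.  Python integers don’t overflow on their own, so the helper `store_as_int16` (a made-up name) mimics a signed 16-bit field by keeping only the low 16 bits.  That is an assumption about how the narrowing behaved, and the timestamp is just an example value from May 2013.

```python
def store_as_int16(value: int) -> int:
    """Mimic writing value into a signed 16-bit field: keep only the low 16 bits."""
    return ((value + 0x8000) & 0xFFFF) - 0x8000


server_version = 1368489600  # example 32-bit Unix timestamp (B)
stored_version = store_as_int16(server_version)  # the "A = B" step

# The "A != B" check on the next pass: still true, so the block runs again.
needs_update = stored_version != server_version

# A real version number small enough for 16 bits survives the round trip.
small_version_ok = store_as_int16(42) == 42
```

Either widening the stored attribute to 32 bits or sending a genuine small version number from the server would let the comparison settle.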