We’ve been confronted with the problem of working with a poorly documented API at work, again. The samples are overly complex, and from the simplest to the most advanced they all link with and use functions from the same files, which makes it very difficult to figure out what each individual part actually does and why it’s needed. I’ve had to deal with this issue before, but this time around it’s definitely worse. To break down this API we have decided to apply a method known as “Kristian’s Model”, which lets you quickly gain an understanding of complex samples.
Kristian’s Model is a five-step process which I will try to explain below. The original author has requested to remain anonymous.
1. Design comp.sh
If the existing build system is working against you, quickly replace it with a simple build script that compiles exactly the application you are trying to break down: no more, no less. I like GNU Make as much as the next guy, but when developers have too much spare time they tend to abuse its more powerful features to concoct Makefiles that are almost harder to understand than the application itself. Running make -n <target> prints the commands make would execute without actually running them, so you can quickly, if painfully, extract what is needed to build your target. Put these commands into comp.sh and prune them as needed.
2. Replace function calls with their actual code
Now that the build system is out of the way we can move on to the more interesting part, the actual code.
Breaking up large functions into smaller parts and giving those parts descriptive names is a very important skill for a programmer, but sadly it seems to be somewhat of a lost art in the world of API developers using C. It’s not at all uncommon to see functions spanning pages upon pages of unrelated code which would be prime candidates for being broken up in any sane developer’s mind. An example from one of the simpler (!) samples I have to work with is a main function which spans a whopping 1150 lines. It’s bad enough to see this in a regular application, but seeing it in a sample application which is meant to show how to use a moderately complex API makes me want to peel someone’s eyes out.
What makes matters worse is that someone, probably management, realized that having single functions span more pages than your average datasheet was a bad idea and forced the inexperienced programmers to break up their functions. The unfortunate result was that instead of one humongous main function where at least the execution flow was largely obvious, the code was broken up into still very large functions with generic names such as initialize_options (which initializes maybe 100 variables stored in one enormous struct) and apply_options, which uses far too small a set of sub-functions to apply this large number of options. When they ran out of generic function names they moved on to the next best thing: obscure names such as refresh_soft_cc and set_cp_bit (and no, they don’t make sense even if you know the context).
The second step goes against everything any good programmer should know: replace all non-stdc calls in your main function with the actual code of those functions. Iterate until you have nothing but stdc and API calls left. You will end up with one very large main, but at least the code flow will now be painfully obvious. This will help you in your next venture, step 3.
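To make the idea concrete, here is a minimal sketch of what one round of step 2 might look like. The helper names initialize_options and apply_options come from the sample described above, but their bodies, the options struct and the api_set_mode / api_set_rate calls are hypothetical stand-ins I made up for illustration; the real API will look different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the vendor API (not a real library). */
struct api_dev { int mode; int rate; };
static int api_set_mode(struct api_dev *d, int mode) { d->mode = mode; return 0; }
static int api_set_rate(struct api_dev *d, int rate) { d->rate = rate; return 0; }

/* Helpers as the sample might define them; the bodies are invented. */
struct options { int mode; int rate; };

static void initialize_options(struct options *o)
{
    o->mode = 1;
    o->rate = 48000;
}

static int apply_options(struct api_dev *d, const struct options *o)
{
    if (api_set_mode(d, o->mode) != 0)
        return -1;
    if (api_set_rate(d, o->rate) != 0)
        return -1;
    return 0;
}

/* Before step 2: the flow is hidden behind generically named helpers. */
int run_with_helpers(struct api_dev *d)
{
    struct options o;

    assert(d != NULL);
    initialize_options(&o);
    return apply_options(d, &o);
}

/* After step 2: only API calls remain, in the order they execute. */
int run_inlined(struct api_dev *d)
{
    struct options o;

    assert(d != NULL);

    /* was: initialize_options(&o); */
    o.mode = 1;
    o.rate = 48000;

    /* was: apply_options(d, &o); */
    if (api_set_mode(d, o.mode) != 0)
        return -1;
    if (api_set_rate(d, o.rate) != 0)
        return -1;
    return 0;
}
```

Both functions should leave the device in the same state; the inlined one is longer but reads top to bottom without jumping between files.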
3. Code reduction
Now that all the code is nicely packed into one large bundle, start removing anything whose removal doesn’t negatively affect the result of running the application. My personal favorite tool for this is #if 0 .. #endif, since it lets you quickly enable and disable code to analyze its effect. I also use it a lot in step 2, where I add an #else case with my replacement code until I can ascertain that it works like the original function call.
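A small sketch of both uses of #if 0 follows. The functions clear_name and prepare_name are hypothetical and exist only to show the pattern; they are not taken from the samples I am working with.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper from the sample under study. */
void clear_name(char *buf, size_t len)
{
    memset(buf, 0, len);
}

size_t prepare_name(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

#if 0   /* step 2: original call, flip to 1 to compare behaviour */
    clear_name(buf, len);
#else   /* replacement: the body of clear_name() pasted in */
    memset(buf, 0, len);
#endif

#if 0   /* step 3: removal candidate, re-enable if the result changes */
    strncpy(buf, "debug", len - 1);
#endif

    return strlen(buf);
}
```

Flipping a 0 to a 1 brings a block back instantly, which is much faster than digging deleted code out of version control while you are still experimenting.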
Once you have reduced the remaining code to a bare minimum of functionality you can finally start wrapping your brain around it.
4. Divide the code into objects
Not everyone likes object-oriented development, but to me it’s one of the most powerful tools I have as a programmer. It allows me to hide functionality behind clearly understandable concepts such as somePath.parent() and FileUtils::isDirectory(..), which in effect means that you can break large, complex problems into smaller and smaller objects. I’ve even been known to design prototypes in high-level object-oriented languages only to write the actual implementation in regular C, without using fancy struct constructs.
It should be obvious by now, but the fourth step is to divide your large lump of code into small, understandable objects. This process will help you gain a deeper understanding of what is actually going on behind the scenes while at the same time producing objects which you can use more or less unmodified in your applications.
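As an illustration of what one such unit might look like in regular C, here is a sketch of the somePath.parent() idea as a plain function. The name path_parent and its signature are my own assumptions, and it only handles '/' separators without trailing slashes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copies the parent directory of `path` into `out` (capacity `cap` bytes).
 * Returns 0 on success, or -1 if `path` has no '/' or `out` is too small.
 * Hypothetical helper: trailing slashes are not treated specially.
 */
int path_parent(const char *path, char *out, size_t cap)
{
    assert(path != NULL && out != NULL);

    const char *slash = strrchr(path, '/');
    if (slash == NULL)
        return -1;                      /* no separator, so no parent */

    /* Keep the leading "/" when the entry lives directly under the root. */
    size_t len = (slash == path) ? 1 : (size_t)(slash - path);
    if (len + 1 > cap)
        return -1;                      /* no room for the terminator */

    memcpy(out, path, len);
    out[len] = '\0';
    return 0;
}
```

With this in place, calling path_parent("/usr/lib/foo", out, sizeof out) leaves "/usr/lib" in out, and the caller never has to look at the pointer arithmetic again. The same approach applies to the API code: once a cluster of calls is understood, it gets one descriptive name.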
5. Implement all possible features
The final step could be considered optional, but it is helpful if you want to make sure that no stone is left unturned. By now you should be close to the peak of your understanding of the system you are breaking down, so this is probably the best time to wrap as much of the API as possible in easy-to-understand objects. Try to implement features that exercise all capabilities of the API, even if you don’t need them now or any time soon.
By now you should have a pretty complete understanding of the system you have to work with. Reading the remaining samples should be easier, since you are now fairly familiar with the domain-specific concepts and terminology and know what most, or maybe even all, of the API calls do.
You’ve walked down a long and winding path, but in your heart you know it had to be done. When you start out, try not to think about the long and tedious process and focus instead on the end result of your hard labor. I’ve tried keeping one or two highly interesting side projects to jump to whenever I feel trapped in a dark pit, but my experience is that it is better to grab the bull by the horns and just do it, as distractions will only cause you to lose focus.
My kudos go out to the guy who formalized this model, which made it much more quantifiable as a legitimate project task.