Monday, August 12, 2013

... just write *what* down?

In a previous post about the importance of reducing documentation-related technical debt in the near term, I said, "Just write it down."

But what's the "it" in "write it down"?

If someone vaguely suggests you need to document your code better -- what? How, where, when?

Here are a few ways to answer that question.

Explaining to yourself and others

Six months from now, when you come back to this project, you'll have to spend a morning figuring out your own stuff. Whatever you spend that morning figuring out is precisely what you need to write down now. For example: "Yes, a List<?> argument would have been more elegant here, but a bit downstream we're writing to a third-party API which takes List<Object>." Or: "The counterintuitive spelling of these log-file entries comes from the syntax expected by a log-parsing tool that is not in this package."

But ... six months from now hasn't happened yet! So why not leave the code as is, and do the documenting six months from now? Because right now, while you're mid-project, your brain cache is loaded with corporate value. If you wait, that value will expire from the cache.

Instead, think back to the previous project: whatever you had to figure out all over again when you returned to it is the kind of thing to write down for this one.

Or, ask someone else what's not clear to them. Say you spend half an hour explaining something to them. Whatever you say to them to make it clear ... that's what needs to go into code comments and/or a document.

Build up patterns

After a few projects you'll see the pattern of what kinds of things you end up explaining to yourself and others. Then on project n+1 you can go ahead and write those things down in the first place. 

My personal list of the kinds of things I end up writing down: 

* At the top of source files, a hyperlink to the project's wiki doc. And in the project's wiki doc, a link to the source-code package. Also, just a few sentences answering: Why does this code exist? Who was it written for? What does it connect to? Who consumes its data? Nothing fancy or elaborate is needed here, but providing this context is crucial. This is the first light at the beginning of the tunnel your reader is about to walk down -- so tell them what tunnel this one is, and where it goes.

* For the wiki doc, take the time to make an architecture/flow diagram or two. (Rough-draft with pencil and paper, then use the figure-draw feature of any office software and save as PNG or take a screenshot.) These are worth their weight in gold. Usually diagrams are better off in wiki docs, but occasionally ASCII art is feasible -- and it has the benefit of being able to go directly into the code.

* Use nitty-gritty comments for nitty-gritty code. Any subtle thing which, upon re-reading, causes me to puzzle for five minutes before the light bulb goes on over my head: that light-bulb insight must become a code comment.

Classic example: a subroutine with a complicated regexp for parsing some config-file contents. Or pct = `df -Ph /var/log`.split(/\n/)[1].split(/\s+/)[-2]. (Wait, what?) Simply copy and paste a sample line of the input, saying, "Here's the kind of thing we're parsing". For example:

# Sample input (we're taking the second-to-last field of line 2, the
# Use% column, without first checking for existence of the directory):
# $ df -Ph /var/log
# Filesystem      Size  Used Avail Use% Mounted on
# /dev/sda1       185G  137G   39G  78% /
used_percent = `df -Ph /var/log`.split(/\n/)[1].split(/\s+/)[-2]

This kind of thing takes seconds to paste in, and not much space, but it's a huge time-saver for the next person looking at that scary regexp or split/join/etc. (Especially when the input in question is something off a socket, something they can't just type for themselves.)
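As a minimal sketch of where this can lead, the df one-liner above could be wrapped in a small function that carries its sample input along with it. The function name used_percent_from_df and the choice to return nil on unexpected input are assumptions made for illustration, not part of the original snippet. Taking the command output as a string (rather than running df inside the function) means the sample line in the comment can double as a test case.

```ruby
# Parse the Use% column out of the output of `df -Ph <dir>`.
#
# Sample input (we take the second-to-last field of line 2):
#   Filesystem      Size  Used Avail Use% Mounted on
#   /dev/sda1       185G  137G   39G  78% /
#
# Returns the percentage as an Integer (78 for the sample above), or nil
# if the output does not have the expected shape, e.g. because df failed
# on a directory that does not exist.
def used_percent_from_df(df_output)
  data_line = df_output.split(/\n/)[1]
  return nil if data_line.nil?

  use_field = data_line.split(/\s+/)[-2]
  match = /\A(\d+)%\z/.match(use_field.to_s)
  match && match[1].to_i
end

# Hypothetical usage, with the directory from the example above:
# used_percent = used_percent_from_df(`df -Ph /var/log 2>/dev/null`)
```

The pasted sample costs a few comment lines and lets the next reader check the indexing ([1], then [-2]) by eye.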

* During a project, from design through implementation, testing, and ops, I always end up keeping a set of browser bookmarks and a text file of often-repeated commands I copy and paste from: paths to things, tools I wrote, tools I often use for the project, etc. Those bookmarks and the frequently used items from my cheat sheet must be copy/pasted into a document and/or code comment. These are also worth their weight in gold.
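To make the first item in the list above concrete, here is a rough sketch of what a context header at the top of a Ruby source file might look like. Everything specific in it is hypothetical: the file name, the wiki URL, the consumers named in the answers, and the threshold value are placeholders for illustration only.

```ruby
# disk_usage_check.rb  (hypothetical file name)
#
# Project doc (placeholder URL): https://wiki.example.com/disk-usage-check
#
# Why does this code exist?  To warn before the log partition fills up.
# Who was it written for?    The on-call rotation.
# What does it connect to?   Reads `df` output for /var/log.
# Who consumes its data?     The alerting scripts that call alert?.
module DiskUsageCheck
  # Percentage used at which we start alerting (hypothetical value).
  ALERT_THRESHOLD_PERCENT = 90

  # True when the given used-percentage is at or above the threshold.
  def self.alert?(used_percent)
    used_percent >= ALERT_THRESHOLD_PERCENT
  end
end
```

Four short answers are enough; the detail lives in the wiki doc the header links to.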

The key point here is that writing takes time and shouldn't be done for its own sake. Write down something which will save your co-workers more time than the time you're spending to write it. Be frugal with your team's time -- with foresight.

See also previous posts here, here, and here.

On-call support

What will you or your co-worker need at 3 a.m.?

* Ask yourself

* Ask them

* For a not-yet-deployed project, see what kinds of alarms / error conditions are going off, things that would result in a page if they were live. 

What about when it goes stale?

I'm asking you to do a dangerous thing. I like to use a junk-DNA metaphor: DNA has coding sequences (a mutation there is fatal) and junk sequences (a mutation there does nothing). [Disclaimer: apparently junk DNA does more than nothing. But I'm not a geneticist. It's my metaphor and I'm sticking to it.]

Code is like the coding sequences of DNA. Change an a to a q in the code and you've got a compile-time error or run-time bug -- which is actually a good thing since you'll find it right away. Comments are like junk sequences: change an a to a q in a comment (let alone something more significant) and the program runs as before. It's an informational time bomb which can blow up months or years later.

As time goes by, code changes, but comments often don't. Stale comments can be outright dangerous, leading someone to think something untrue. They can be worse than no comment at all. And keeping comments up to date as code gets refactored significantly can be an awful lot of work (and a pain), leading us perhaps not to bother in the first place.

This is an important criticism. 

So what do we write down? And how do we insure against future harm? 

* Some things needn't be written down. (Don't turn everything into a novel.) In particular, the living code should be self-describing; this is the coding sequence and it will stay current. (Name the boolean haveSeenHeader, not just flag.)

* Big-picture comments belong at the top of the file, or in a wiki page, and hopefully they'll stay true, or be easily found.

* Nitty-gritty comments belong right next to the nitty-gritty code. If that code gets changed then the comment should get found and go along with it. 

* When I'm in doubt, I find that a disclaimer of the form "As of this writing (July 2013), this is ..." is a good compromise: some information is better than none, yet you're giving notice that it may not always be true.
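Here is a small Ruby sketch that combines several of the points in the list above: a self-describing name (have_seen_header, the snake_case cousin of haveSeenHeader), a nitty-gritty comment sitting right next to the nitty-gritty line, and a dated disclaimer. The function data_rows and the input format it assumes are invented for illustration.

```ruby
# Return the whitespace-split data rows of df-style output, skipping any
# preamble and the header line.
#
# As of this writing (July 2013), the header line is assumed to start
# with "Filesystem". If the (hypothetical) upstream tool changes its
# header, this function needs revisiting.
def data_rows(lines)
  have_seen_header = false # not just `flag`
  rows = []
  lines.each do |line|
    unless have_seen_header
      # Everything up to and including the header line is preamble
      # (warnings, blank lines), so we skip it and only record whether
      # this line was the header itself.
      have_seen_header = line.start_with?("Filesystem")
      next
    end
    rows << line.split(/\s+/)
  end
  rows
end
```

If the loop is later refactored, the comment about the preamble sits on the lines that will be touched, so it is likely to be found and updated along with them.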

Things that deserve special attention

* Interfaces: APIs and config files. File formats you consume or produce.

* Clever algorithms. Clever anything.