EDIT: Hello from the future! This post is part of the old blog. That means it may be deprecated. However, I deemed it valuable enough to keep around. The new blog starts here!

Today I want to rant a lot about documentation. The emphasis will be on software documentation, but it holds for other kinds of documentation too. I chide because I love.

Although the Internet is now this untold well of data and nearly everything comes with 'documentation' attached to it, we've only reached level 'terrible'. That's just above level 'abysmal'. Let's be honest; most documentation is somewhat dysfunctional.

Act 1: A terrible experience with documentation

Example 1

Ok. Maybe you don't believe me. Let's do an experiment.

I write Python for my day-to-day work and at some point I need to see what the deal is with lists. I want to explore all the operations I can do with them, see when to use them or not and so on. I just want concise information on 'lists'.

Hey kids! You can follow along with my misadventure from home: fire up your favourite search engine, search for 'python list' and keep the results in a window on the side!

My first result is: 5. Data Structures — Python v.2.7.5. Here is a dramatic re-enactment of what ensues:

Me jubilant: Great, a link to the official documentation! I will be drinking from the very bosom of Python knowledge today!

clicks, reads first headings

Me: 'More on Lists'? There are a couple of functions here, but where is the "Less on Lists" part then?

looks around, checks the sidebar

Me: Only section 5.1 deals with lists and the rest seems to be dealing with data structures - wait, what is 'More on Conditions'? What does that have to do with data structures?

clicks, reads through

Me quizzical: Apart from one paragraph out of the seven, this has nothing to do with data structures really.

notices the footer

Me: Oh, this is the Python Tutorial. I get it... huh ... actually I don't. There is nothing 'tutorial-y' about this section. Like at all... Whatever.

looks back at the table of contents

Me: Previous and next topics are unrelated, let's look at the table of contents for the whole Python Tutorial. It does say 'More on Lists', so lists have to have been covered before.

finds 3.1.4. Lists

Me: It mentions 'len()'; it mentions slice... That's all good, but isn't there a concise page just on lists? It is kind of strange that there is no link to the canonical page about lists. Clearly, this isn't complete since there is a 'More on Lists' section. And who knows if that section is complete? Bah! I am looking at the tutorial, I shouldn't be expecting too much! It's for learning. It's not a reference! Let's refine the search.

regains hope, searches for 'python list functions', finds 5. Built-in Types — Python v2.7.5 documentation

Me: Ah, yes! This link comes from the library and not the tutorial. It is the real thing. The light to the shadow I encountered before. Come to me knowledge!

clicks, ctrl+f to find the list section

Me muttering: 5.6. Sequence Types — str, unicode, list, tuple, bytearray, buffer, xrange. Hmm, this is not a reference. This is an explanation of various things that apply to these various objects. Where is the section devoted solely to lists?

looks around

Me: Ooh, there is a link named 'list'.

clicks

Me: What. Is. This.

End scene.

Yep, there is no single page in the official documentation where everything about a Python list is, well, listed. Maybe I was asking for something bizarre and elusive. Or maybe not: http://effbot.org/zone/python-list.htm.

This link comes as close as I've seen to what I was looking for: a page solely concerned with Python lists that shows the things you can do with a list and that discusses implementation and performance details. It is only missing a reference of all the functions pertaining to a list. There is also the unfortunate fact that it dates back to 2006. Is it even accurate anymore? And why is this coming from a third party and not from the official documentation?
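For what it's worth, the interpreter itself can produce a rough version of that missing reference. What follows is only a minimal sketch of a workaround, not anything the official documentation offers: `public_methods` is a name I made up, and it relies on nothing but the built-ins `dir()` and `getattr()` plus whatever docstrings the list methods happen to carry.

```python
def public_methods(cls):
    """Return (name, summary) pairs for the public methods of cls.

    Hypothetical helper: the summary is simply the first line of each
    method's docstring, so it is only as good as those docstrings.
    """
    pairs = []
    for name in sorted(dir(cls)):
        if name.startswith("_"):
            continue  # skip dunder and private names
        attr = getattr(cls, name)
        if not callable(attr):
            continue  # keep methods only
        doc = (attr.__doc__ or "").strip()
        summary = doc.splitlines()[0] if doc else ""
        pairs.append((name, summary))
    return pairs


# Print a one-line-per-method cheat sheet for lists.
for name, summary in public_methods(list):
    print("{:<8} {}".format(name, summary))
```

Typing `help(list)` in an interactive session gives the long-form version of the same thing. Neither tells you when to use a list or how it performs, which is exactly the part a proper page would cover.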

Exhale.

Example 1.5

Now this is just one case of a terrible experience. I am sure you've had many more. I was going to refer to Canonical's MAAS (Metal As A Service) and Juju documentation, but I am a fair and balanced critic and it seems like they have improved their game since their latest release. It was slightly incoherent and disjointed before. There is still no explicit link from the MAAS documentation to the Juju one though. And the video is still basically a lie. You will need all those O'Reilly books. Trust me.

Example 2

Let's pick on Sublime Text a little bit instead. The old documentation is marked as deprecated, yet it is essentially the same as the new one. Except that the old one is more complete and accurate and applicable. Compare the snippets page of the old documentation to the one of the new documentation. The new documentation seems to be the result of a bad copy-paste. Thankfully the version these documents refer to is explicit. Oh... Oh, that's right... They're not versioned at all.

Actually it's not even the official documentation. It's the community one. The official one is much sparser on snippets - read: there is no section on snippets. It's honestly somewhat alarming and disappointing for a product you bought. Sure there is the forum, but don't get me started on forums as 'documentation'.

Argh!

Act 2: Break it down

There are a couple of reasons why these experiences with documentation were not optimal (and again I consider those to be well above the rest of the documentation that's out there). In my mind I attribute their failings to missing the mark on these three key components of great documentation:

  1. Targeted
  2. Referenceable
  3. Explanatory

Targeted

Documentation, like code, should be seen through the lens of a user story. Steve wants to know what a snippet is and how to trigger it. There should be a page for that. Alice wants to see how to write her own snippet. There should be a page explaining to her how to do it. Bob is interested in contributing to the core software. There should be documentation for him (whether that is by having the code speak for itself eloquently or having well-documented methods or whatever).

In each of these cases there is a clear goal that is achieved and more importantly there is an audience in mind. From top to bottom: there is the user, then there is the developer and then there is the contributor.

The way this is done varies. There is nothing preventing these different pages from being combined into a single section-divided page, or from being left as separate pages, as long as there is a single page that links to them all.

The Python tutorial targets the beginners. Great! The library should target the developers and the in-code documentation should target the contributors. It doesn't work if one has to travel to many disparate pages to satisfy a single user-story. That brings us to the next point.

Referenceable

Conciseness is of tremendous importance in documentation. If the users of your product are at the point where they have to look through documentation, you don't want to enrage them any more than they already are by splitting it across many pages. Nor do you want to go to the DOCUMENT-EVERYTHING extreme where there is so much documentation everywhere that the typical person is just drowning in it. You know the saying: "If you want to find a needle in a haystack, you don't help by adding more hay".

Try to bundle up all related documentation. Make it easy to interface with it. Ruby-Doc is a great example of that. It's short, sweet, coherent, efficient and ubiquitously accessible.

Preposterous statement coming in 3...2...1... The Linux man pages are kind of bad at this. Each man page is similar-but-not-quite-the-same. Examples? If you are lucky. Options ordering? There is no rule about consistency. Navigability? Hypertext came too late, let's stick to our guns!

Ok, I know. It is arrogant to think that this 40-year-old system, still in place, is flawed. Everything is explained somewhere and with time you understand why it was done that way... or you just become completely enthralled by Stockholm syndrome - your pick. Yet some part of me says that it is equally arrogant to believe that this system is the end-all-be-all of documentation after only a couple of decades. I am pretty sure we can do better.

In the end the content has to be king. That's my next point.

Explanatory

It might seem that I've only been discussing reactive documentation so far, so let me be clear that this is not the case. Quick aside:

Reactive documentation. Defn. Superficial documentation that seeks to address the practical implementation of knowledge while never addressing the theory behind it. See 'How-to documentation'.

The Targeted section might have given this impression, but true explanations should not be precluded from documentation. Explaining why things are the way they are, and explaining the theory and design decisions behind them, is absolutely crucial. What I am getting at is that the documentation should be complete.
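To make that less abstract, here is a small sketch of in-code documentation that records a design decision instead of just restating the signature. Everything in it is made up for illustration: `dedupe` is a hypothetical helper, not something from any project mentioned here. The only real machinery is the standard library's `doctest` module, which runs the example embedded in the docstring.

```python
import doctest


def dedupe(items):
    """Return the items with duplicates removed, keeping first occurrences.

    Why not just ``list(set(items))``? A set does not promise to keep
    the order in which items were seen, and callers of this
    (hypothetical) helper rely on that order.

    Known limitation: every item must be hashable.

    >>> dedupe([3, 1, 3, 2, 1])
    [3, 1, 2]
    """
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


if __name__ == "__main__":
    # Run the docstring example as a test: if the code and its
    # documentation drift apart, this reports a failure.
    doctest.testmod()
```

The what, the why and the known gap all sit in one place, and the example is checked by a machine rather than by a reader's patience.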

You really want to answer the Why of it all. This will actually answer many other questions down the road and it will be beneficial to you too. It keeps you honest. If your documentation is incomplete, tell us. If it should still be applicable in 2013, tell us. Keep this cartoon by the Oatmeal in mind.

With these 3 components in mind you can look at various documentation and see why they sometimes fail or succeed:

  • man pages: Super explanatory, kind of referenceable and kind of targeted
  • Wikipedia: Super explanatory, very referenceable, absolutely untargeted
  • and so on...

These components are probably not exhaustive but it's a start in understanding our documentation failings.

Act 3: Can-do spirit

I don't want to just rave and rant here. What can be done to improve the state of documentation? I am young and inexperienced in these matters so I am not sure.

In fact I am not sure, because I don't think it's just a matter of tools here. Python's official documentation is automatically generated via Sphinx. Doxygen is another great tool that does the job and there are others.

We have some tools. Are we maybe lacking the culture? Finding out something on your own is cool and praised, but formally contributing the discovery back to the original source seems less common. Documentation is almost always the entry point for new contributors to open-source projects. Weird. Shouldn't that actually require the documenter to be much more familiar with the code base than any new person could be?

There is a need for editors. People who will go through the generated documentation and create coherent use cases out of it. Making the documentation consistent and adequately self-referencing builds up the trust its users will have in it.

Documentation is oftentimes seen as an afterthought. It shouldn't be. You wrote all this code for a reason. Tell us! We, the public, will be glad to hear about it. We want to know the choices you had to make and the reasons behind them. It might be strange to hear, but we are mostly sympathetic creatures.

Anyway, next time you tell someone to RTFM, go read the manuals yourself and ask yourself why they had to ask the question in the first place. Maybe they are FM after all ;).