Less comments, more structure - a proposal for self explanatory code

Understanding source code is not easy, especially someone else's. But even your own hand crafted carefully designed algorithms eventually become a little less clearer to yourself after months, weeks or even days. And you do want everybody to see your genius, don't you?

All right, if it's not clear enough you'll write some comments, no big deal. What a bummer that sometimes you forget to change the comments along with the code when refactoring. And is this comment refering to the next line or the next 20 lines? Ups, there is a comment all alone, someone must have forgotten to move it with the code ... Ok, there must be a better way.

 

I hereby propose a new way of writing code and writing less comments (ok, like 90% of them). Here's some code to show the "before", not very elegant, not very structured. But at least it's got comments, right? (please don't check for 100% correctness or deeper meaning of this example, because it has neither): 

function handleRequest(req, res) {
  var id;
  var user;
  var name;

  // extract interesting params
  for (var i = 0; i < req.params; i++) {
    var paramsSplit = req.params[i].split("=");
    if (paramsSplit[0] === "id") {
      id = paramsSplit[1];
    }

    if (paramsSplit[0] === "name") {
      name = paramsSplit[1];
    }
  }

  // get user from DB or create new one
  if (id) {
    // we found id, so get the record from DB
    var dbConnection = createDBConnection();
    var userCollection = dbConnection.getCollection("user");
    user = collection.find({id: id});
  } else {
    // we didn't find id, so create a new object
    user = {
      id: createNewId(),
      name: name
    }
  }

  // respond with user
  res.setStatus(200);
  res.send(user);
  res.end();
}

 

To improve upon this, I change the flat function to a hierarchy of function calls. This way the different abstraction levels are clearly separated. Let's have a look at the first level of abstraction:

 

function handleRequest(req, res) {
  var user;

  var paramObject = extractParams();
  var user = getOrCreateUser();
  respondWithUser(user);
  ...
}

Notice that there are no comments, because the function names already explain what the functions are supposed to do. We can have a look at the implementation of the second function getOrCreateUser . You could write a comment to explain what the function does like "gets the user object or creates a new one". But there is no real need to do that, because the function name already contains that information.

  function getOrCreateUser() {
    if (paramObject.id) {
      return getUserFromDb();
    } else {
      return createNewUser();
    }

    function getUserFromDb() {
      var dbConn = createDBConnection();
      var users = dbConn.getCollection("user");
      var user = users.find({id: id});

      return user;
    }

    function createNewUser() {
      var user = {
        id: createNewId(),
        name: name
      }

      return user;
    }
  }
}

This function has two subfunctions which are called depending on paramObject.id being set or not. You could write a comment for that like "get the user object from database in case id is not set or create a new user otherwise". But there is no need for that, because the code already expresses this. 

 

The sub functions are quite easy to understand too. getUserFromDb contains of three function calls that don't need comments either. createNewUser creates an object literal and returns it, no sense in trying to comment this.

 

You can find the complete example at the end of this article.

 

The new version has lots of advantages:

  • Functions with descriptive names are part of the actual code that gets executed and are not separated from it like comments.
  • This way refactoring is easier and there is less risk of making mistakes.
  • The hierarchical structure makes the higher level concepts much clearer. Each level should stick to the same abstraction level though and you should avoid mixing them. 
  • If you want to go into details, you just have to jump to the definition of the particular function. In most modern IDEs this is possible with a shortcut like F12.
  • Speaking of IDEs, you can use their code folding feature (collapsing of functions) to hide the details and only show them when necessary. You usually can't do that with code that is separated by comments.
  • Functions are declared in the scope in which they are needed/called and should be written at the end of this scope (after the - sometimes invisible - return statement). Don't write them at the beginning of the scope, because readers will then have to get past them (the details) first, before getting to the actual higher level concept.

Essential for this code style is to write many functions with very little code in them. Usually you should not exceed the 10 lines of code limit, most of the time it should more be like 6 lines maximum.

 

People worrying about performance when writing many short functions, should take this into account:

  • Most of the time it is way more important to have readable and maintainable code than making the user wait for another 10ms. Nobody is going to notice a 10ms delay, but surely everybody is going to notice buggy and not maintainable code. 
  • Modern JavaScript engines are very good at optimizing script code. Usually there is only a small penalty when loading/compiling the code.
  • If you really think you found a noticeable performance problem due to too many short functions, you can always optimize later on. But make sure you really identify the source of the performance problem and only optimize this part. Otherwise you end up with "super fast" unreadable and unmaintainable 100 lines functions. Only optimize the parts that are provably slow, leave the rest.

Here the complete second (improved) example code:

function handleRequest(req, res) {
  var user;

  var paramObject = extractParams();
  var user = getOrCreateUser();
  respondWithUser(user);

  function extractParams() {
    var paramsObject = {};

    for (var i = 0; i < req.params; i++) {
      var paramsSplit = req.params[i].split("=");
      if (paramsSplit[0] === "id") {
        paramsObject.id = paramsSplit[1];
      }

      if (paramsSplit[0] === "name") {
        paramsObject.name = paramsSplit[1];
      }
    }

    return paramsObject;
  }

  function getOrCreateUser() {
    if (paramObject.id) {
      return getUserFromDb();
    } else {
      return createNewUser();
    }

    function getUserFromDb() {
      var dbConn = createDBConnection();
      var users = dbConn.getCollection("user");
      var user = users.find({id: id});

      return user;
    }

    function createNewUser() {
      var user = {
        id: createNewId(),
        name: name
      }

      return user;
    }
  }

  function respondWithUser() {
    res.setStatus(200);
    res.send(user);
    res.end();
  }
}


Kommentar schreiben

Kommentare: 14
  • #1

    Christopher Hiller (Samstag, 19 Dezember 2015 11:47)

    I'm not sure if this is clickbait or what?

    The comments in your first example explain "what" the code does, which is always going to be redundant, regardless of how many or few functions you have, or how concise things are. Once the code changes, the comment becomes a problem if it is not updated.

    I look at your second example and see little improvement, because in both examples it's completely unclear "why" the code even exists at all--and these are the questions code comments must answer.

    Comments should tell you what the code cannot. Arguing that we shouldn't comment code is silly.

    Here's hoping an inexperienced programmr doesn't read this post and think comments are evil and must be replaced with functions.

  • #2

    Stephan Haewß (Samstag, 19 Dezember 2015 12:35)

    Hi,

    thanks for commenting. I dont't see why this should be a clickbait, it is just a proposal for better code structuring. If you dont't agree, it's ok and we can discuss it here.

    The second example has 3 function calls on the first level:

    var paramObject = extractParams();
    var user = getOrCreateUser();
    respondWithUser(user);

    In my opinion it should be pretty clear what is happening here: first the parameters are extracted, then the user is fetched or created and the user object is returned in the response. The good thing here is, you dont't need comments, because the function names already explain everything you need to know on this level. The details of how the 3 functions are implemented are hidden. If you want to know the details you can jump to the definition of the functions.

    And that is pretty much my point, you hide the details by putting them in well named (!) functions and not by deviding the code by comments. Comments should not be used to structure the code, but only to explain something that can't be explained by the code itself. And most of the times your code should speak for itself, so in my opinion there should be very little comments at all.

    I think that especially inexperienced programmers should not learn to put lots of comments into the code instead of trying to structure it with functions.

    That's just my opinion and it work very well for me so far (and for others I am working with). Please feel free to comment again and we can discuss it in more detail.

    Cheers!

  • #3

    Stephan Haewß (Samstag, 19 Dezember 2015 13:49)

    Hi,
    I edited the article to hopefully make my point clearer (and made it less provocative ;).

    Cheers!

  • #4

    Chris Geirman (Samstag, 19 Dezember 2015 13:53)

    I have been doing this for years. one thing I like about it is that you can easily get an overview of the entire app, usually without scrolling. I couple this with very simple, very shallow flow logic, when needed so that the app's business logic is also clear.

    good post. thx

  • #5

    Christopher Hiller (Samstag, 19 Dezember 2015 20:45)

    You miss my point. "Code should have few comments at all because the code should be clear enough" makes sense in a vacuum, but please re-read what I wrote about the comments existing to tell developers what the code cannot. If you need an example I can provide one.

  • #6

    Stephan Haewß (Samstag, 19 Dezember 2015 20:50)

    If code really cannot speak for itself, commenting it is ok and necessary. Now the question is how often does that happen. And yes, to discuss it further, an example would be helpful.

  • #7

    Stephen Burgess (Sonntag, 20 Dezember 2015 01:54)

    Interesting article. I definitely like to have non-abbreviated and descriptive names for variables and methods as well. However, especially in ES6 with its classes, I actually like to comment most everything! I like to explicitly label @param {String} user.name and @return {Object} user. This way there is an explicit and consistent pattern of commenting, allowing for new people to dive into code and understand what's expected for an method. React gets this part right I think, but I even like to go redundant beyond that. Use minification and compression to optimize the code. Users don't need to see expanded code anyway, and if they do they'll go to GitHub. I completely agree with creating descriptive and precise naming, but I'm also in favor of using comments to type parameters and provided consistent labels in classes. Describing params, returns, methods, etc in the comments is more in line with the OO styles of Java and PHP frameworks like ZF2. Now that JavaScript ES6 has classes, that's a practice that translates over well I believe. Thanks for the article.

  • #8

    Stephan Haewß (Sonntag, 20 Dezember 2015 14:58)

    Thanks for commenting!
    That was really my point, don't structure your code with comments, use descriptive names and write short functions. Commenting on interfaces of modules, classes, etc. is OK of course, especially if you write APIs and generate documentation from it. But you have to be careful, that code and comments are always in sync, because otherwise readers and user of the API get confused.

    So even for modules and classes I would say it makes sense to reduce the amount of comments and use descriptive names for methods and their parameters. My point was more to aim for as little comments as possible and this way make the code speak for itself instead of commenting badly structured code (or bad interfaces/APIs).

    Cheers!

  • #9

    Mike (Mittwoch, 23 Dezember 2015 15:55)

    Isn't this the same as the Extract Method refactoring? http://refactoring.com/catalog/extractMethod.html

  • #10

    Stephan Haewß (Mittwoch, 23 Dezember 2015 20:43)

    Thanks for pointing this out. Yes, you can refactor code with long functions (with comments) by using the extract method refactoring. But you don't have to refactor, you can also write short nested functions in the first place. I use the "extract method" refactoring in WebStorm sometimes, but it puts the function at the top of the enclosing function not at the end, so I have to move it there.

    Cheers!

  • #11

    gotofritz (Donnerstag, 24 Dezember 2015 02:25)

    This is basically what the say in the classic book Clean Code, which every programmer ought to read.

  • #12

    Stephan Haewß (Donnerstag, 24 Dezember 2015 14:24)

    Your are right, it follows clean code guidelines. But there is one thing specific to JavaScript, you can nest function declarations which is not possible in other languages like Java or C#. This makes the code even clearer, because the details are hidden inside the enclosing function.

    Cheers!

  • #13

    Alain Van Hout (Sonntag, 27 Dezember 2015 15:07)

    Fundamentally, code should always reflect intention, typically through concepts or abstractions. That's why statically typed OOP languages (when used properly) at core revolve around building and using interfaces. More generally though, that means that any piece of code that performs a self-contained functionality should be segregated, be it in a separate method/function, a separate class, or a separate module/package/library. That will inherently prevent wall-of-text code as well as spaghetti code.

    What you describe above in relation to method/function naming is just one aspect of that, but it definitely does not bypass the need for comments (except for those who comments which were already pointless to begin with and/or where the result of poor application of separation of concern).

    Function/method/variable/class/package/module names should always have appropriate names, which reflect their intent (though as we all know, and Phil Karlton is famous for having said, there are only two hard things in Computer Science: cache invalidation and naming things). Comments on the other hand are not about what a piece of code does, but about how it does it (that's why design patterns have specific names) and why it does it like that and not differently.

  • #14

    Stephan Haewß (Sonntag, 27 Dezember 2015 20:12)

    The article is actually not about avoiding all kinds of comments, but the kind of comments that are used to structure the code and to describe what it does. In your words these are the "pointless" comments, but the problem is that not everybody sees them as "pointless". To help with that, one could try to replace the code that is commented (including the comment) with a function with a descriptive name. If that is not possible, the comment may be reasonable.

    But in my opinion more important than avoiding (useless) comments is writing short (less than 10 or even 6 loc) nested functions in general. My "before" example in the article could be even worse, if you remove the comments. So instead of introducing comments to improve it a little bit (which is the intuitive way of doing it for many programmers I guess), it is better to directly structure the code into short functions. In my experience, writing short functions that are really not longer than 6 -10 loc is kind of hard for many programmers, but it improves the code a lot.

    It is also true that naming is crucial and not easy. Many programmers don't "waste" their time with naming things correctly and rather write comments to explain the poor naming, which is a mistake in my opionion. So for me atm the two most important refactoring methods are "rename" and "extract method" (which are both available in WebStorm fortunately, though not a perfect fit).

    Cheers!