Code Generation

by Krishna on May 6, 2009

Justin Etheredge had a good post about using code generation only as the tool of last resort. He cites three problems:

  1. Maintenance – Generated code can be a maintenance nightmare. If your solution to a problem is to just generate reams of code, then you have created that much more code that you now need to maintain. […] Sadly, many of the problems that people use code generation for can be solved with a bit of reflection, a few abstract or virtual methods, and a sprinkling of interfaces. […]
  2. Flexibility – Generated code is often spit out by a custom piece of software or a template. This means that you either accept what is generated for you, or you have to customize the template. What happens when you need some piece of logic generated into a particular class? Well, you either edit your template to enable this case to be covered, or you create a new template for your class to enable this “one-off” solution. […]
  3. DRY – […] code generation can be the ultimate DRY violator if you aren’t using it for the right kind of code. Whenever I see someone use code generation for something that could be easily pushed into a base class or abstracted away into another chunk of functionality, I just want to scream! Code generation, when used, should be used for code which does not repeat. […]
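To make the first and third points concrete, here is a minimal sketch in Python of repeated logic pushed into a base class. The quoted post gives no code, so the names `Exporter`, `CsvExporter`, and `TsvExporter` are hypothetical and only illustrate the idea. Each variation becomes one overridden method rather than a generated copy of the whole class.

```python
from abc import ABC, abstractmethod


class Exporter(ABC):
    """Hypothetical base class: shared logic is written once, not generated per format."""

    def export(self, rows):
        # The common steps that every exporter would otherwise repeat.
        lines = [self.format_row(row) for row in rows]
        return "\n".join(lines)

    @abstractmethod
    def format_row(self, row):
        """The only part that varies between exporters."""


class CsvExporter(Exporter):
    def format_row(self, row):
        return ",".join(str(value) for value in row)


class TsvExporter(Exporter):
    def format_row(self, row):
        return "\t".join(str(value) for value in row)
```

Adding a new format here means writing one small subclass; there is no template to edit and no generated file to keep in sync.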

That last statement is the killer: “Code generation should be used for code which does not repeat.” Almost by definition, code generation is meant to generate repetitive code with a few variations here and there. Obviously, you cannot use a code generator to create code that never repeats, because what would be the point in that? You would write the actual code instead of writing the code generator.

So if the purpose of a code generator is to create repetitive code and avoid typing, the question is: What kind of repetitive code are we talking about? Is it repetition you can avoid through better design or by using a tool such as an ORM? Or is it something you cannot avoid because of the nature of the language or framework you are using?

Several years ago, many people would have been faced with the latter situation. For example, EJB required a lot of boilerplate code. It was better to generate it than to waste time chasing compiler errors caused by spelling mistakes. Sometimes, you needed a data layer that required writing similar database access code over and over. There were no good ways around it.

Nowadays the situation is different. We have more sophisticated ways of reducing the code that we need to write. The basic idea is to write as little code as you possibly can, because each line you write contains potential bugs. Therefore, code generators do more harm than good because they increase your code footprint.
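As one illustration of writing less code, the sketch below uses reflection over Python dataclasses so that a single `insert` function serves every record type, where a generator would emit one near-identical function per table. Everything here is an assumption made for the example: the `Customer` and `Product` types, the convention that the table name is the lowercased class name, and the in-memory SQLite database.

```python
import sqlite3
from dataclasses import dataclass, fields, astuple


@dataclass
class Customer:
    id: int
    name: str


@dataclass
class Product:
    id: int
    title: str
    price: float


def insert(conn, record):
    """Insert any dataclass instance; table and column names come from reflection."""
    # Assumed convention: table name is the lowercased class name.
    # These identifiers come from our own code, never from user input.
    table = type(record).__name__.lower()
    columns = [f.name for f in fields(record)]
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    conn.execute(sql, astuple(record))


# Usage: one function covers both tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE product (id INTEGER, title TEXT, price REAL)")
insert(conn, Customer(1, "Ada"))
insert(conn, Product(7, "Widget", 9.5))
```

A real project would more likely reach for an existing ORM than hand-roll this, but the point stands either way: the per-table code disappears instead of being generated.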

Another problem with code generators is that their developers try to keep them alive even when the generated code relies on outdated techniques. Given the pace of innovation in the industry, it is risky to invest in any code generator that forces you down a particular design path. And because code generators are written by the top dog in the programming house, their authors tend to be very defensive about them, which is detrimental to the organization.

I also hear of code generators being used as a means to enforce a specific coding standard. Good intention, but wrong implementation. The right way is to get developers to understand the coding conventions and learn to follow them, instead of keeping them ignorant or preventing them from learning on their own. Relying on a generator this way also undermines other quality initiatives such as code reviews.

I would say it is not necessary to rule out code generators completely. Never say never. Design well. See what you can leverage from existing solutions. At some point, once you have exhausted other options, you may need some automatic code generation. But it should never be the starting point for a solution.
