Do The Basic Research

by Krishna on January 28, 2010

I fail to comprehend how someone can write this (emphasis mine):

Overdesign is also very common with databases. People get hung up on rules like “third normal form”. Sometimes when you over-normalize a database is [sic] makes it impossible to query the data later on. If your application is going to require reports, sometimes its better to design the database with the reports in mind. In other words, let the reports dictate how the data is stored.

This kind of advice is insane, ridiculous stuff. Take a minute and read what the third normal form means. Not following the third normal form means essentially having duplicate data in different places in your database creating a serious nightmare in maintaining data integrity. What you gain in reporting, you will lose when you start writing code to ensure that data is consistent everywhere.

Normalizing a database does not make it “impossible” to query data. If you have many tables because of normalization, work is harder because you need multiple joins to do the trick. But if you know elementary SQL, you can get it done easily. On the other hand, if you haven’t normalized properly, you have very silly problems. For example, in the Wikipedia example, to find out how many players were born on the 1st of January, you have to write code that does not count the birthdays of multiple players twice. In a normalized database, it is an extremely simple query.

It is ironic that the author talks about “overdesign” while falling victim to premature optimization. First, understand what normalization is supposed to do and how it helps you. Learn SQL or use tools that can help you get your data out easily. Profile your application and find ways to improve performance. And only if everything fails, then go for denormalization. And that too, only after understanding its downsides and how to prevent them.

Read my previous article on the same subject.

Comments on this entry are closed.

Previous post:

Next post: