Shop Talk: 2022-02-14

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan

Notes: Questions and Topics

Upcoming Conferences

Our first topic of the night hit on two upcoming conferences. First is SQLbits, which is coming up March 8-12. Registration is open right now and there are in-person and virtual options available. Note that this is a paid conference. Tracy will be there.

Second, DevIntersection, which includes the SQL Server & Azure SQL Conference, has a $100 discount if you use the code AZUREDC. This is also a paid conference and is in-person. I will be there.

Rant One: Dimensional Modeling is Dead

Mala wound me up and I took it out on this poorly-reasoned article. The article includes a woeful misunderstanding of the purpose of normalization, makes unsupportable claims about storage and compute prices, claims that dimensional models are too hard for business users to understand while recommending that we use source systems for queries instead, and misunderstands the whole point of data lakes. Data lakes don’t replace data warehouses; they augment and help solve a different kind of problem.

In short, the gist of the article is, “Yes! You, too, can pay more money to be stupid!”

Non-Rant One: $FAMOUS_COMPANY

The above article is exactly what this article parodies. No rant on this one, only raves.

Rant Two: the Value of Normalization

Mala and Mike got me wound up a second time on the topic of normalization. We talked about this argument, that normalization isn’t used and isn’t useful. I tried to be a little less ranty here; whether or not I succeeded, you be the judge.

First, let me lay something out: normalization is a logical consequence of set theory and relational design. It isn’t a set of rules you bolt on afterward; it’s an understanding of the properties which underlie the fields of mathematics which underpin relational database technologies. It’s fundamentally about getting your data model correct. Now, a few points in increasing order of rantiness:

  • Normalization as formal logic is under-used in industry, so I agree to that extent with the tweeted argument.
  • It is also difficult to change this. For example, try getting a talk on normalization accepted. I’ve tried; it’s hard.
  • Where I start disagreeing is with the notion that “nobody” does normalization. I disagree with it in literal terms—I use it; therefore, that statement is not literally true. I also disagree with it in practical terms—formal application of the rules of normalization are under-used but informal application is pretty common. It could be used to better effect if people were trained in its mechanisms.
  • That’s where I start blaming academia: you’re not doing a good job of teaching normalization. People don’t understand the concepts because they have relatively few good examples. Therefore, we stop teaching it? Is that actually the right answer? Because the consequence is that the people you’re training end up with crappier data models as a result.

Leave a Reply

Your email address will not be published. Required fields are marked *