Shop Talk: 2022-04-25

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan
  • Mike Chrestensen
  • Tracy Boggiano

Notes: Questions and Topics

PASS Summit Discount Code

If you’re looking at going to PASS Summit in Seattle this year, use the discount code Azure175 to save $175 off the cost of registration.

Least Privilege and SQL Server

Mala started us off with a fun(?) question about whether anybody actually implements the principle of least privilege with SQL Server. We got into the discussion quite a bit but my short version is, the principle of least privilege is an aspiration. Solomon mentioned module signing as the answer. My counter-point is that yes, module signing is the right tool but least privilege is as much a people & process problem as it is a tooling problem. The tool allows you to get there if and only if you have the people and process in place to do so.

Bro, Do You Even GROUPING SETS?

After that, we talked about a topic from Mike: grouping options, including CUBE, ROLLUP, and GROUPING SETS. I like ROLLUP for hierarchical data, such as Country -> State -> City. I don’t much like CUBE and I’ve never found a really good non-trivial example for why I’d use it. GROUPING SETS, however, is a top-notch operator. Read on for more info here.

Hardware for Practicing SQL Server Operations

Mike also asked for some hardware recommendations for doing “real” query tuning work with SQL Server. We talked about physical hardware, cloud VMs, and simulating weak networks (a trick I learned from Kendra Little).

Remembering the Basics

Our last topic came from Tracy and is around how easy it is to forget the “basics” when troubleshooting, especially in new environments. We make a lot of assumptions around the state of a system and sometimes those assumptions come back to bite us. Tracy’s example was using a balanced power plan on a server running SQL Server, thereby throttling CPU. We talked about the importance of checklists on the air but I want to expound on my thoughts just a little bit more.

Checklists are critical for troubleshooting but so is the right mindset. That mindset starts with careful reading and paying attention to available signals. That means reading error messages and understanding the content, which can sometimes be enough right there. Error messages often don’t do a great job of explaining themselves so it probably isn’t enough. Knowing where to look for more information, especially in non-destructive ways, is also vital. Some of this is “in-built” in the sense that it’s attitude but it is attitude which you can train into people. The rest is knowledge that you gain over time, and that’s what gets converted into the checklist.

Shop Talk: 2022-04-11

The Recording

The Panelists

  • Kevin Feasel

Notes: Questions and Topics

DevIntersection and In-Person Events

The first topic of the evening was an after-action report of DevIntersection, as well as some discussion around in-person TriPASS events. We talked about why there won’t be a SQL Saturday Raleigh in 2022…but there might be in 2023. Regardless of when we go in-person again, I’ll still have the recording equipment available.

What Makes Data Science Teams Thrive?

The next topic was something Mala brought up. This survey about data science teams hits on some topics but really clangs on a big one.

Shop Talk: 2022-03-28

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan
  • Mike Chrestensen

Notes: Questions and Topics

Targeting LAPSUS$

Our first topic of the night was a review of Microsoft’s write-up of the LAPSUS$ group, including how the group infiltrates organizations, what they tend to do, and how they operate. It’s a really interesting dive with some solid recommendations on how to prevent attacks like this in the future.

Replication: Push or Pull?

A while back, Mark Gordon reached out and asked about replication models, specifically whether push or pull should be the default at organizations. My quick thoughts are as follows:

Push is a good default because it lets you centralize jobs and processes (though if you have a really large number of jobs, you might need a separate job server to manage all of it). This setup works best when machines are on the same domain and the same people own and operate all of the servers.

When those conditions don’t hold, pull may be a better option. For example, if the machines are on different domains and there is no trust relationship between the domains, you can give the subscribers a SQL authenticated account and let them retrieve data. This also works well when you don’t want the publisher to have access to remote subscriber networks and when the people operating subscribers are different than the publisher server operators.

Shop Talk: 2022-03-14

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan
  • Mike Chrestensen

Notes: Questions and Topics

SQLbits and Hybrid Conferences

Our first topic of the night was a after-action report from Mala, who spoke at SQLbits. We also talked a bit about how difficult running a hybrid conference is: you have the difficulties of an in-person conference with the difficulties of a remote conference and few economies of scope.

HA/DR for a Small Shop

Most of our episode’s theme came from a question from rednelo, who reached out after the prior episode and asked about disaster recovery options. The short story is that they’re a small shop with one production server hosting both SQL and web servers. A Windows patch ended up breaking connectivity between the web server and SQL Server and they eventually needed to take a downtime to recover.

Rednelo followed up by mentioning log shipping as an option and wanted to see if that was reasonable. Spoilers: yes, it can be, especially for smaller shops.

One of the resources I love is Brent Ozar’s HA/DR planning worksheet. The last update was in 2016 but it’s still pretty well valid, with the only major addition being distributed availability groups.

Shop Talk: 2022-02-28

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan

Notes: Questions and Topics

Upcoming Conferences

Our first topic of the night hit on two upcoming conferences. First is SQLbits, which is coming up March 8-12. Registration is open right now and there are in-person and virtual options available. Note that this is a paid conference. Tracy will be there.

Second, DevIntersection, which includes the SQL Server & Azure SQL Conference, has a $100 discount if you use the code AZUREDC. This is also a paid conference and is in-person. I will be there.

Goodbye Big Data Clusters and PolyBase Scale-Out

The entirety of today’s episode was a Viking funeral for Big Data Clusters and pre-2019 PolyBase, based off of this Microsoft post. In short, Big Data Clusters are dead. PolyBase is not dead. We also talked a bit about making bets and I stretched that metaphor like riding 2-7 off-suit to a full house.

Shop Talk: 2022-02-14

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan

Notes: Questions and Topics

Upcoming Conferences

Our first topic of the night hit on two upcoming conferences. First is SQLbits, which is coming up March 8-12. Registration is open right now and there are in-person and virtual options available. Note that this is a paid conference. Tracy will be there.

Second, DevIntersection, which includes the SQL Server & Azure SQL Conference, has a $100 discount if you use the code AZUREDC. This is also a paid conference and is in-person. I will be there.

Rant One: Dimensional Modeling is Dead

Mala wound me up and I took it out on this poorly-reasoned article. The article includes a woeful misunderstanding of the purpose of normalization, makes unsupportable claims about storage and compute prices, claims that dimensional models are too hard for business users to understand while recommending that we use source systems for queries instead, and misunderstands the whole point of data lakes. Data lakes don’t replace data warehouses; they augment and help solve a different kind of problem.

In short, the gist of the article is, “Yes! You, too, can pay more money to be stupid!”

Non-Rant One: $FAMOUS_COMPANY

The above article is exactly what this article parodies. No rant on this one, only raves.

Rant Two: the Value of Normalization

Mala and Mike got me wound up a second time on the topic of normalization. We talked about this argument, that normalization isn’t used and isn’t useful. I tried to be a little less ranty here; whether or not I succeeded, you be the judge.

First, let me lay something out: normalization is a logical consequence of set theory and relational design. It isn’t a set of rules you bolt on afterward; it’s an understanding of the properties which underlie the fields of mathematics which underpin relational database technologies. It’s fundamentally about getting your data model correct. Now, a few points in increasing order of rantiness:

  • Normalization as formal logic is under-used in industry, so I agree to that extent with the tweeted argument.
  • It is also difficult to change this. For example, try getting a talk on normalization accepted. I’ve tried; it’s hard.
  • Where I start disagreeing is with the notion that “nobody” does normalization. I disagree with it in literal terms—I use it; therefore, that statement is not literally true. I also disagree with it in practical terms—formal application of the rules of normalization are under-used but informal application is pretty common. It could be used to better effect if people were trained in its mechanisms.
  • That’s where I start blaming academia: you’re not doing a good job of teaching normalization. People don’t understand the concepts because they have relatively few good examples. Therefore, we stop teaching it? Is that actually the right answer? Because the consequence is that the people you’re training end up with crappier data models as a result.

Shop Talk: 2022-01-31

The Recording

The Panelists

  • Kevin Feasel
  • Tracy Boggiano
  • Mike Chrestensen

Notes: Questions and Topics

Upcoming Conferences

Our first topic of the night hit on two upcoming conferences. First is SQLbits, which is coming up March 8-12. Registration is open right now and there are in-person and virtual options available. Note that this is a paid conference. Tracy will be there.

Second, DevIntersection, which includes the SQL Server & Azure SQL Conference, has a $100 discount if you use the code AZUREDC. This is also a paid conference and is in-person. I will be there.

Columnstore and Rowstore

Mike came up with the theme for tonight’s episode, which is a discussion of columnstore databases. The context of this comes from a statement in 2013 that columnstore databases are 10-50 times faster than relational databases and that traditional relational databases would disappear in favor of columnstore databases.

In all fairness to the person who made the statement, columnstore databases are considerably faster for analytical queries than traditional rowstore indexing. What happened in reality, though, is that traditional relational databases gained columnstore capabilities, such as clustered columnstore indexes in SQL Server. As a result, we saw another instance of a new technology challenging the existing relational model and leading to a synthesis of the new technology into the existing paradigm.

As a quick side note, no matter how transformational or interesting your technology, it’s a bad bet to say that relational databases are going away. The most generous I can be here is saying that if you hang your hat on “traditional” you might win the technical point—technologies like SQL Server, Oracle, and PostgreSQL changed and incorporated columnstore capabilities. On the broader point, we’re nearly a decade later and relational databases are 7 of the top 10 on DB-Engines.

Shop Talk: 2022-01-17

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan
  • Mike Chrestensen

Notes: Questions and Topics

SQLbits Coming Up

Our first topic of the night hit on SQLbits, which is coming up March 8-12. Registration is open right now and there are in-person and virtual options available. Note that this is a paid conference.

Becoming a Data Analyst

Our main topic of the night came in from a chat question:

I want to become a data analyst. I’m currently working in a profession not related to tech. Is this a pipe dream?

This is a great question and we spend a lot of time on (hopefully!) practical tips on how to get into the world of data analysis. Mike, Mala, and I come at this from different angles and one of the things I reiterate a few times is that data work is a career people move into, not start in. Quite often, somebody starts in a different field, begins to do data analysis within that area, and eventually decides to specialize in it when they find out how fun it is.

Shop Talk: 2021-01-03

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan
  • Mike Chrestensen

Notes: Questions and Topics

Code Reviews

The main topic of this episode was around code reviews. We talked about some of the pain points when doing code reviews for database changes, how they differ from application code reviews, and also ended up going down some cryptographical rabbit holes.

Here’s the link to the T-SQL static code analysis checks Mala and I discussed, as well.