Shop Talk: 2020-05-04

The Recording

The Panelists

  • Kevin Feasel
  • Tracy Boggiano
  • Mala Mahadevan
  • Tom Norman

Notes: Questions and Topics

The Right PC for the Home Lab

Chris Voss e-mailed us with a great question. “What are the “home lab” setups like? What computers/specs does everyone have, and for what purposes? I’m talking about person systems rather than work. Part of why I ask is because I’m looking at new computers, so I’m asking basically every tech person to ensure I’m doing this right.”

Answer: This will depend on whether you want a desktop or a laptop. Or you could just have a server room in your basement…

Tom and Tracy went over some of the characteristics they look for in laptops, starting with 64 GB of RAM. I mentioned that I’d much rather have extremely fast disk for a single SQL Server installation, though both of them have multiple SQL Server instances running on multiple VMs, so the need for lots of RAM makes perfect sense.

If you want a resource for building a desktop machine, the folks at Logical Increments do an incredible job. The service is entirely free and I used it to build my machine learning and video processing desktop. You pick the price point and they give you several recommendations on hardware choices. One thing I would say is that I’d recommend going up a notch on drives—prosumer grade NVMe (like the Samsung Pro series) over consumer-grade SSD. Fill your motherboard’s NVMe slots first before using SSD or HDD.

Neil Hambly had some nice recommendations as well, including Overclockers for the UK folks and making sure that you swap out SSD every 18-24 months to eliminate the risk of a drive dying on you. Unlike hard disks, SSD doesn’t give you much as much warning before it dies out, and it can just suddenly drop off.

SSD vs NVMe

As a bit of kismet, Mark Gordon had e-mailed me earlier with a great follow-on question: “When it comes to storage for SQL Server, does NVMe offer an improvement over SSD?”

Answer: Oh, you bet it does. On stream, I read a tiny bit from this article on the differences between NVMe and SSD. The relevant portion is:

NVMe is not affected by the ATA interface constrictions as it sits right on the top of the PCI Express directly connected to the CPU. That results in 4 times faster Input/Output Operations Per Second (IOPs) rivaling the fastest SAS option out there. The seek time for data is ten times faster. NVMe can deliver sustained read-write speed of 2000MB per second, way faster than the SATA SSD III, which limits at 600MB per second. Here the bottleneck is NAND technology, which is rapidly advancing, which means we’ll likely see higher speeds soon with NVMe.

With SQL Server, you will notice the difference under load. NVMe is still nowhere near as fast as RAM, but it’s a lot closer than SSD (which is itself way closer than 15K spinning disk).

By the way, for the pedantic-minded, I am aware that NVMe disks are still SSD; when I say SSD, I mean SSD over SATA in the classic 2.5″ form factor.

Mark had some follow-up bits I can hit briefly here. He mentioned tempdb as a good candidate for fast disk and that’s a smart idea: tempdb should be on the fastest disk you can possibly get. Here’s a rough guide that I’m coming up with off the top of my head, ranking things in order of best to worst:

  • All direct-attached NVMe
  • All-flash SAN array
  • Tempdb on NVMe and all other databases on direct-attached SSD
  • All direct-attached SSD
  • SSD SAN array
  • Tempdb on direct-attached SSD and all other databases on HDD
  • Hard disks everywhere

There are other configurations that nestle in between some of these (e.g., direct-attached SSD for tempdb but SSD SAN for the rest is slightly better than SSD SAN array but slightly worse than all direct-attached SSD), but the general rule of thumb is that direct-attached beats SAN and that NVMe > SSD > HDD.

Learning T-SQL

John Fan Zhang asked for a good book to learn SQL.

My recommendation, to the point where I have purchased this book for one of my employees needing to learn T-SQL, is Itzik Ben-Gan’s T-SQL Fundamentals 3rd Edition. Itzik is brilliant and an outstanding teacher, and even if you have an advanced knowledge of T-SQL, you’ll still pick up things from his beginner-level book.

Mala also recommended Itzik’s training, available on demand.

Fun with RIGHT JOIN

Gabriel hit me up with this question before the stream began: “Is there a really a need to support and maintain RIGHT JOIN?”

Answer: Tracy says no, Tom says no, Mala says no, and Kevin says mostly no.

The thing about RIGHT JOIN is that it is usually confusing to people because it’s backwards from how we want to read. In English, we read left to right, top to bottom. We also work from the assumption that the most important stuff comes first. RIGHT JOIN violates this by making the latter table the “important” one. The other consideration here is that every RIGHT OUTER JOIN operation can be rewritten as a logically equivalent LEFT OUTER JOIN.

That said, I have personally run into a couple of cases where it made sense to use a RIGHT JOIN rather than switching to LEFT JOIN. These cases were mostly around complex join criteria with a combination of LEFT JOIN and INNER JOIN and one last RIGHT OUTER JOIN to catch the “I don’t have anything else” scenario. So I wouldn’t get rid of RIGHT OUTER JOIN, but if I see it in a code review, the first question is asking why this needs to be ROJ and cannot be a LOJ.

Aliases and Table Names

Finally, chat got off onto the tangent of aliases and table names. On this topic, @iconpro555 tossed us into the briar patch with “why not use 4 letter names because that is what you use for aliases anyway?”

As far as naming goes, my rule of thumb is: make it clear but not overly verbose. 4 characters is fine if a table is called dbo.Home and represents information about a home (location, square footage, tax appraisal, etc.). But don’t be afraid to add a few extra characters to a column name if it clarifies intent. One thing I really like to see is unit of measure. You show me a thing called TotalCPUTime, but is that in seconds? milliseconds? microseconds? This gets really annoying even with SQL Server DMVs because some of them are milliseconds and others microseconds.

Names are for developers, whether that’s the current developer or a future maintainer. Just like with the discussion about RIGHT OUTER JOIN, we are optimizing for developers rather than for the database engine. There are times when you need to optimize for the sake of the database engine rather than the developer, and that’s where you start adding copious notes clarifying your intent.

Events of Note

We wrapped up with one event of note because I forgot about the other one:

Leave a Reply

Your email address will not be published. Required fields are marked *