Author: Kevin Feasel

Kevin Feasel is a Microsoft Data Platform MVP and CTO at Envizage, where he specializes in data analytics with T-SQL and R, forcing Spark clusters to do his bidding, fighting with Kafka, and pulling rabbits out of hats on demand. He is the lead contributor to Curated SQL (https://curatedsql.com) and author of PolyBase Revealed (https://www.apress.com/us/book/9781484254608). A resident of Durham, North Carolina, he can be found cycling the trails along the triangle whenever the weather's nice enough.

Shop Talk: 2024-07-15

by Kevin Feasel

July 17, 2024

4:56 pm

Leave a comment on Shop Talk: 2024-07-15

Show Notes

The Recording

The Panelists

Kevin Feasel
Mala Mahadevan
Mike Chrestensen

Notes: Questions and Topics

Shop Talk: Now with Added Decibels!

Our first topic was me talking about some of the audio silliness I put up with for years because I didn’t know any better. I blame nobody but myself for hiding all of those decibels for so long. Things are better now, by which I mean I’m a lot louder. It also means that people who don’t speak English will understand me better. Because I’m louder.

Version-Specific Functionality

Mala brought up our primary topic for the day, a post on X from Grant Fritchey regarding version-specific functionality. My official ranking of responses is:

Build and optimize for what you have if it’s an internal app. If your systems are running SQL Server 2022, take advantage of what the product has. Don’t limit yourself to old versions of T-SQL for some silly reason. I specifically disagreed with one response talking about limiting T-SQL to “2008/2012”—which is itself a wild answer, considering that 2012 was a huge change in the T-SQL surface area. Think about extended window functions, TRY_CAST(), PARSE() and TRY_PARSE(), CONCAT(), OFFSET/FETCH, and the use of sp_describe_first_result_set. Those are available in 2012, not 2008.
If you are building something for internal use, don’t build “for portability” unless you actually port databases with regularity. Because almost no company actually does that. They stick with a platform for decades, and even if a database does move from one platform to another, it’s a one-time thing. You’re crippling your database capabilities in an attempt to be the lowest common denominator.
If you are building a third-party product as an ISV and you expect your database to be “real” (in the sense that you have sufficient data and number of queries per second that it couldn’t comfortably run on a Raspberry Pi), build for each platform you support. Don’t use generic ANSI SQL just because you’re supporting SQL Server, Oracle, PostgreSQL, and MySQL. Write and use T-SQL for SQL Server, PL/SQL for Oracle, etc. Take advantage of what’s in each platform and then have your application’s data layer sort out the differences so that your business objects can be the same regardless of the data platform technology.
If you are building a third-party product as an ISV and your database can comfortably run on a Raspberry Pi, go ahead and do whatever you want.

Normalization

Our next topic was a good article from Daniel Calbimonte on normalization. I covered this on Curated SQL, but wanted to get a bit further into detail on why I argue that atomicity is not actually a pre-requisite for 1st Normal Form. In Daniel’s defense, a lot of references on database design and normalization include atomicity as a pre-req, but I base my argument on CJ Date’s definition of 1NF and his critique that “atomic” doesn’t actually have a precise meaning, whereas each rule of normalization must have a precise mathematical definition.

I also touched upon a fantastic essay from Phil Factor about the entity-attribute-value anti-pattern. I don’t know that I’ve ever met Phil (though if you’re out there, let me know—I can keep a secret!) but this is a must-read essay.

Shop Talk: 2024-07-01

by Kevin Feasel

July 5, 2024

1:03 pm

Leave a comment on Shop Talk: 2024-07-01

Show Notes

The Recording

The Panelists

Kevin Feasel
Mike Chrestensen

Notes: Questions and Topics

SQL Saturday Survey

Our first topic was a SQL Saturday survey that Steve Rezhener put together and that Steve Jones has promoted. I do recommend filling out the survey, and you can also view the results. Some of the questions on the survey were a bit of a challenge to understand, so I think that has limited the number of responses (in fact, Anders quit about halfway through the survey for that reason). But the information is potentially useful for getting a pulse of where SQL Saturday attendees are in terms of their current capabilities, as well as an idea of what types of sessions might be interesting to include.

Along the way, I talked a little bit about designing surveys, took some potshots at DB2, learned from Solomon that some of my cheap shots at DB2 around 8-character column names are no longer valid, and dove into how ranking mechanisms work for sites like DB-Engines and why its results might so radically differ from the Stack Overflow survey. Populations matter a lot for surveying.

Paginated Queries

The other major topic came from Tracy, despite her not being on the show this time around. She wanted to talk a bit about pagination and OFFSET/FETCH in particular. I happen to have a talk that includes pagination techniques and went into detail on how each works and the expected performance profile for each one. OFFSET/FETCH is fine for systems where you don’t have a huge number of pages, or where people rarely go past the first few pages. But it performs steadily worse the farther in a person goes, as it requires scanning all of the offset records and then pulling in the fetch records. I also lamented how much of a challenge pagination is, and how there are so few really good patterns for it given how common a technique it is and how important it is for websites.

Shop Talk: 2024-06-17

by Kevin Feasel

June 20, 2024

12:40 am

Leave a comment on Shop Talk: 2024-06-17

Show Notes

The Recording

The Panelists

Kevin Feasel
Tracy Boggiano
Mike Chrestensen

Notes: Questions and Topics

30 SQL Server Security Checks in 30 Days

Our first topic was a series via Straight Path Solutions on 30 SQL Server security checks in 30 days. There are some good things to keep in mind while going through this series. That said, I’d highly recommend Solomon Rutzky’s Module Signing over anything to do with cross-database ownership chaining.

UNISTR() and || in Azure SQL Database

The second topic was a blog post about two new additions to T-SQL in Azure SQL Database: UNISTR() and ||. The || operator acts very similarly to the + operator and serves to concatenate two items together. || is the ANSI SQL standard, so that makes it a perfectly reasonable addition to the syntax, though I’ll use CONCAT() and FORMATMESSAGE() over || or + most days of the week.

The other addition is UNISTR(), which lets you escape Unicode characters. Solomon had a good explanation of why he’s not the biggest fan of this change.

SQL Server Linting

Our final major topic was thanks to Tracy: linting in SQL Server. Tracy recommended super-linter as a project for linting in a wide variety of languages. In this broader project, there are a pair of linters for relational databases, and Tracy’s preference is SQLFluff. We explained what the purpose of a linter is, looked at some linting rules, and got derailed on “usp_” and passive voice, as is my wont.

Shop Talk: 2024-06-03

by Kevin Feasel

June 6, 2024

6:04 pm

Leave a comment on Shop Talk: 2024-06-03

Show Notes

The Recording

The Panelists

Kevin Feasel

Notes: Questions and Topics

Vista Equity Writes off Pluralsight

Our first topic came from Brent Ozar’s Monday links newsletter and covers a report that Vista Equity is writing off the entire equity value in Pluralsight, which they bought for $3.5 billion a few years back. I ranted a bit about private equity and talked about some of the challenges that learning platforms have in our modern era. I really liked Pluralsight and was a subscriber for years, only cancelling the subscription after I moved to a totally different role and really didn’t have time to take any more courses.

Copilot+ Recall: A Bad Idea

Our other big topic was an amazing article covering a terrible idea. Kevin Beaumont has written a top-notch post describing the many ways that Copilot+ Recall is not ready for prime time. And one cynical take I read after broadcast summed up the problem quite nicely: if Copilot+ Recall does become generally available, you have to assume that any document you ever send to a user running Windows 11 (or later) will be stolen.

All the Copilots

Speaking of Copilots, we also talked about the various Copilots Microsoft has available right now, based on this Ginger Grant article. Ginger lists 10 of them and we might even have quibbles about there being additional ones (e.g., PowerPoint and Word being separate Copilots rather than one Office 365 Copilot) in practice if not in pricing.

Shop Talk: 2024-05-20

by Kevin Feasel

May 22, 2024

4:30 pm

Leave a comment on Shop Talk: 2024-05-20

Show Notes

The Recording

The Panelists

Kevin Feasel
Mike Chrestensen

Notes: Questions and Topics

SQL Saturday Richmond Recap

Our first topic was a quick recap of SQL Saturday Richmond. It was good to see the RTP area turnout, as well as some people we don’t get to see very often, like Anders. I also talked about how the SQL Saturday Richmond people gave me 2 hours and 15 minutes to talk about normalization and I still went long. It’s a mad world.

GPT-4o Released and Azure Regions

Our next topic was OpenAI’s release of GPT-4o, a cheaper, faster, multi-modal model. It’s also available in Azure OpenAI, though only two regions as of right now (East US and West US 3). That got us on a winding discussion of Azure regions, including a discussion of what Azure OpenAI resources are available in which region, how Sweden Central gets all the nice stuff, and then a dive into Azure regions as a whole. I brought up several references, including an article on Azure regions.

From there, I had to show the Nuke map, which gives you an idea of the effective zone of a nuclear explosion. Marshall brought up the Mars Bluff atomic weapon blast that affected his grandparents.

Back on Azure regions, I showed one of my favorite sites for pricing; CloudPrice.NET. This started as a discussion of relative region prices and then went into discussion of specific VM costs. Searching across regions is a great way to save money, assuming you have isolated resources and aren’t sending a whole lot of data cross-region.

LLMs and Data-Control Path Insecurity

Bruce Schneier named a security issue I’ve always had some challenge describing: data and control path admixture. This is a real challenge with LLMs today, though it is a solvable problem. It may just take some time to solve.

We also dug into a linked article on prompt injection and had a bit of fun with other topics tangentially related, as happens from time to time (by which I mean “every time”) on Shop Talk.

Shop Talk: 2024-05-06

by Kevin Feasel

May 9, 2024

1:53 pm

Leave a comment on Shop Talk: 2024-05-06

Show Notes

The Recording

The Panelists

Kevin Feasel
Mala Mahadevan
Mike Chrestensen

Notes: Questions and Topics

Happy Birthday, Happy Graduation Day

First up, we wished Tracy Boggiano a happy birthday, as one does. After that, Mala announced that she’s graduating from a Master’s program in a couple of weeks, so we wished her an early happy graduation day.

T-SQL Tuesday

I gave a sneak preview of this month’s T-SQL Tuesday, which I am hosting. The theme is, what is your favorite job interview question? I talked at some length about questions as the interviewee or candidate, because asking good questions is something a lot of candidates don’t think to do. You’re interviewing the company just as much as the company is interviewing you, after all.

Azure SQL Database Copilot

Mala brought up for discussion an article about the private preview for Azure SQL Database Copilot. Anders mentioned that his company is in the private preview and he was pretty happy with how it worked.

The Unfortunately-Named SUQL

Mala also shared a link to this article on the Structured and Unstructured Query Language, otherwise known as SUQL. I had far too much fun mocking the acronym. As an idea? I dunno, maybe it’ll work out great. But as a name? No, no, no, please talk to the marketing department first.

Anders Has a Song

Anders used Suno to generate a song for the DBA, lyrics, chords, and all. You can and should enjoy it for what it is.

Shop Talk: 2024-04-22

by Kevin Feasel

April 25, 2024

12:04 am

Leave a comment on Shop Talk: 2024-04-22

Show Notes

The Recording

The Panelists

Kevin Feasel
Mala Mahadevan
Mike Chrestensen

Notes: Questions and Topics

The EU AI Act

Our first topic came from Mala, covering the EU’s new Artificial Intelligence Act. We covered a press release on the topic and I engaged maximum cynical overdrive on this one. We also walked through a fairly recent article about the EU AI Act’s timeline, and I spent an inordinate amount of time talking about how Germans use passive voice far too often for my taste.

Upcoming Events

There are a few upcoming events that we highlighted:

SQL Saturday Richmond is coming up on May 18th
I did not mention it, but SQL Saturday Jacksonville will be on May 4th.
A new conference, called AI in Production, will take place in Asheville, NC on July 18th and 19th. The Call for Speakers is still open.
Mala promoted Mental Health and Wellness Day, coming up on May 10th.

GitHub Comments as a Malware Vector

Our final article of the night came from Bleeping Computer, where Lawrence Abrams wrote about an interesting malware distribution vector. The way this works is really tricky because GitHub repo owners won’t know about file uploads associated with their repositories. The attacker isn’t actually messing with repo code or anything of the like, but rather uploading files as part of comments, and the file upload happens even if the attacker doesn’t post the comment. From there, the attacker rides on the reputation of the GitHub repo or account owner to try to sneak one over on people.

Shop Talk: 2024-04-08

by Kevin Feasel

April 13, 2024

6:10 pm

Leave a comment on Shop Talk: 2024-04-08

Show Notes

The Recording

The Panelists

Kevin Feasel
Mala Mahadevan

Notes: Questions and Topics

Happy Disappearing Sun Day

There was an eclipse, though the Research Triangle area didn’t have a good view of it, so Mala and I riffed on it a little bit.

Hype Versus Reality: Direct Lake

Our first real topic was a great article by Marco Russo, with input from Kurt Buhler, on Direct Lake and where it actually fits in the Power BI world. I think it’s perfectly fair to talk up Direct Lake as a very nice capability, while still recognizing that it’s not a one-size-fits-all solution or a total replacement for anything currently in place.

How to Create a Legendarily Bad Database

Mala gave us the heads up on our other topic of the night, an article by David Tate on how to create a legendarily bad database. I was concerned that the advice would create a merely bad database, but I think David does a good job of taking it the extra mile and making me want to nuke the setup from orbit, because it’s the only way to be sure.

Shop Talk: 2024-03-25

by Kevin Feasel

March 28, 2024

8:10 pm

Leave a comment on Shop Talk: 2024-03-25

Show Notes

The Recording

The Panelists

Kevin Feasel
Mike Chrestensen
Mala Mahadevan
Special Guest Star Bart Vernaillen

Notes: Questions and Topics

Bart Talks Performance

Our first topic was a bit of an interview, one that Mala put together with Bart Vernaillen, a Belgian consultant who has developed a couple tools around shredding execution plans, finding good index candidates, and so on. He showed off a bit of his work and talked about potential future plans. If you want to learn more about his tools, reach out to him on LinkedIn and say TriPASS sent you.

Regex Support in Azure SQL DB

Microsoft has introduced a private preview of regular expression support in Azure SQL Database. Anders and I went back and forth a bit on the value of this, with Anders being strongly opposed and me being generally in favor.

Mike Likes Azure Data Studio

We wrapped the show up with a quick discussion of why Mike really like Azure Data Studio. If you haven’t used Azure Data Studio in the past, the discussion may be of interest to you. The product isn’t for everybody—if you’re a classic DBA, you’re going to find the product more limiting than SQL Server Management Studio. But for developers or people with hybrid roles, Azure Data Studio has matured into a good tool.

Shop Talk: 2024-02-26

by Kevin Feasel

February 28, 2024

11:35 am

Leave a comment on Shop Talk: 2024-02-26

Show Notes

The Recording

The Panelists

Kevin Feasel
Mike Chrestensen

Notes: Questions and Topics

Performance Tuning Large Execution Plans

Our first topic comes from Mike and is all about how to tune queries that result in large execution plans. I went through a couple of techniques but mostly focused on using Solarwinds Plan Explorer. This free tool is fantastic for analyzing SQL Server queries and I talk about some of its coolest functionality. I used an old Brent Ozar post as an example of a, uh, large execution plan.

Anders mentioned a pair of Grant Fritchey books that are also quite helpful. SQL Server 2022 Query Performance Tuning is a powerhouse of a book and I highly recommend it. And his SQL Server Execution Plans book is free from Red Gate, so grab a copy of that for free.

A Book Review

After that, I talked about a book review of Peter Carter’s book 100 SQL Server Mistakes and How to Avoid Them. I have a lengthier blog post that covers my full thoughts so check that out. Short version: I recommend buying the book, even though I don’t agree with everything in it.

ORDER BY Not in SELECT

Our next topic involved a bit of trivia: in SQL Server, you can have columns in your ORDER BY clause that do not appear in your SELECT clause. For a long time, this was not possible in Oracle, as Mike points out, but by Oracle 19C, you now have that ability, too.

Terminology Time

The final topic we hit was a bit of terminology, where I helped explain to Mike the difference between a few terms. As I noted on the video, I’m definitely not the poster child for terminological exactitude, but I did have the answers for this one. My answers are:

Clause — A primary part of a SQL statement or query. Examples include SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, etc.
Statement — An operation that does not return a value. For example, I can run an UPDATE statement. It updates my table but does not return anything. In T-SQL, CASE is not a statement because it returns something. In F#, case is not a statement because it returns something. In C#, case is a statement because does not return anything.
Expression — An operation that returns a value of a particular type. In T-SQL, CASE is an expression because you have to return something: that’s the THEN part of a CASE expression, as in, CASE WHEN <predicate> THEN <thing to return> END. Speaking of predicates:
Predicate — An expression that returns TRUE, FALSE, or (in T-SQL) UNKNOWN. The contents of your WHERE clause is a predicate, but so is each bit within your WHERE clause: T1.Col1 > 3 is a predicate, because, for each row, T1.Col1 is either greater than 3, less than 3, or NULL. And the combination of T1.COL1 > 3 AND T2.Col2 < 17 is a predicate: the whole returns one of true, false, or unknown. Butt so do each of the parts making up the whole predicate.