Welcome to Shop Talk, a regular Q&A-style broadcast hosted by the Triangle Area SQL Server Users Group (TriPASS).
After each episode airs, you will see the show notes as well as an embedded link below.
I spent a little bit of time gleefully recapping Data Architecture Day. We had 376 viewers over the course of the 11 sessions. If you missed any sessions or want to review something, I have put together a playlist for you.
I ran into some issues around SQL Server Machine Learning Services and processes (R and Python) running out of memory. The reason is that, by default, SQL Server creates an external resource pool with a limit of 20% of the allocated memory for external scripts. In my talk on Machine Learning Services in production, I cover how to change this value.
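As a rough sketch of what that change looks like: the 50% figure below is purely an illustrative assumption on my part, not a recommendation, so size it for your own workload.

```sql
-- Check the current limits on external resource pools.
SELECT
    name,
    max_memory_percent,
    max_cpu_percent
FROM sys.resource_governor_external_resource_pools;

-- Raise the memory cap for the default external pool.
-- 50 is an arbitrary example value.
ALTER EXTERNAL RESOURCE POOL "default"
WITH (MAX_MEMORY_PERCENT = 50);

-- Resource Governor changes do not take effect until you reconfigure.
ALTER RESOURCE GOVERNOR RECONFIGURE;
```

Keep in mind that whatever you give to external scripts is memory the database engine itself cannot count on, so this is a trade-off rather than a free win.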
@iconpro5555 asked during Data Architecture Day: “what are some approaches to versioning changes? do u mix. type 2 data with type 1?”
Answer: The reference here is to slowly changing dimensions in a Kimball style warehouse. If you want an excellent write-up, I recommend Simon Whiteley’s take on the matter. Here’s the script I used when putting together my answer:
USE tempdb
GO

-- Type 0: Nothing will ever change...right?
CREATE TABLE dbo.Type0Dim
(
    Name VARCHAR(50),
    DateOfBirth DATE
);
-- To insert: add a new row.
INSERT INTO dbo.Type0Dim(Name, DateOfBirth)
VALUES('Tony', '2001-01-01');
SELECT * FROM dbo.Type0Dim;
-- To update: we don't!
GO

-- Type 1: History? Who cares?
CREATE TABLE dbo.Type1Dim
(
    Name VARCHAR(50),
    DateOfBirth DATE
);
-- To insert: add a new row.
INSERT INTO dbo.Type1Dim(Name, DateOfBirth)
VALUES('Tony', NULL);
SELECT * FROM dbo.Type1Dim;
-- To update: simple update statement.
UPDATE dbo.Type1Dim
SET DateOfBirth = '2001-01-01'
WHERE Name = 'Tony';
SELECT * FROM dbo.Type1Dim;
GO

-- Type 2: History is a new row.
CREATE TABLE dbo.Type2Dim
(
    Name VARCHAR(50),
    FavoriteColor VARCHAR(30),
    IsCurrent BIT,
    PeriodBeginDate DATETIME2(3),
    PeriodEndDate DATETIME2(3)
);
-- To insert: add a new row.
INSERT INTO dbo.Type2Dim(Name, FavoriteColor, IsCurrent, PeriodBeginDate, PeriodEndDate)
VALUES('Tony', 'Green', 1, GETUTCDATE(), '9999-12-31');
SELECT * FROM dbo.Type2Dim;
-- To update: update the current row and add a new row.
DECLARE @Now DATETIME2(3) = GETUTCDATE();
UPDATE dbo.Type2Dim
SET IsCurrent = 0, PeriodEndDate = @Now
WHERE Name = 'Tony' AND IsCurrent = 1;
INSERT INTO dbo.Type2Dim(Name, FavoriteColor, IsCurrent, PeriodBeginDate, PeriodEndDate)
VALUES('Tony', 'Puce', 1, @Now, '9999-12-31');
SELECT * FROM dbo.Type2Dim;
GO

-- Type 3: History is the prior version.
CREATE TABLE dbo.Type3Dim
(
    Name VARCHAR(50),
    FavoriteColor VARCHAR(30),
    PriorFavoriteColor VARCHAR(30)
);
-- To insert: add a new row WITHOUT prior values.
INSERT INTO dbo.Type3Dim(Name, FavoriteColor)
VALUES('Tony', 'Green');
SELECT * FROM dbo.Type3Dim;
-- To update: set the prior value.
UPDATE dbo.Type3Dim
SET PriorFavoriteColor = 'Green', FavoriteColor = 'Puce'
WHERE Name = 'Tony';
SELECT * FROM dbo.Type3Dim;
GO

-- Type 4: the Princess is in another castle.
CREATE TABLE dbo.Type4Dim
(
    Name VARCHAR(50),
    FavoriteColor VARCHAR(30)
);
CREATE TABLE dbo.Type4DimHistory
(
    HistoryKey BIGINT IDENTITY(1,1) NOT NULL,
    Name VARCHAR(50),
    FavoriteColor VARCHAR(30),
    CreateDate DATETIME2(3)
);
-- To insert: add a row into the dimension *and* a row into the history.
INSERT INTO dbo.Type4Dim(Name, FavoriteColor)
VALUES('Tony', 'Green');
INSERT INTO dbo.Type4DimHistory(Name, FavoriteColor, CreateDate)
VALUES('Tony', 'Green', GETUTCDATE());
SELECT * FROM dbo.Type4Dim;
SELECT * FROM dbo.Type4DimHistory;
-- To update: update the dimension *and* add a new row into the history.
UPDATE dbo.Type4Dim
SET FavoriteColor = 'Puce'
WHERE Name = 'Tony';
INSERT INTO dbo.Type4DimHistory(Name, FavoriteColor, CreateDate)
VALUES('Tony', 'Puce', GETUTCDATE());
SELECT * FROM dbo.Type4Dim;
SELECT * FROM dbo.Type4DimHistory;
GO

-- Type 6: I just can't decide!
CREATE TABLE dbo.Type6Dim
(
    Name VARCHAR(50),
    FavoriteColor VARCHAR(30),
    PriorFavoriteColor VARCHAR(30),
    IsCurrent BIT,
    PeriodBeginDate DATETIME2(3),
    PeriodEndDate DATETIME2(3)
);
-- To insert: add a row without prior values.
INSERT INTO dbo.Type6Dim(Name, FavoriteColor, IsCurrent, PeriodBeginDate, PeriodEndDate)
VALUES('Tony', 'Green', 1, GETUTCDATE(), '9999-12-31');
SELECT * FROM dbo.Type6Dim;
-- To update: update the current row and add a new row.
DECLARE @Now DATETIME2(3) = GETUTCDATE();
UPDATE dbo.Type6Dim
SET IsCurrent = 0, PeriodEndDate = @Now
WHERE Name = 'Tony' AND IsCurrent = 1;
INSERT INTO dbo.Type6Dim(Name, FavoriteColor, PriorFavoriteColor, IsCurrent, PeriodBeginDate, PeriodEndDate)
VALUES('Tony', 'Puce', 'Green', 1, @Now, '9999-12-31');
SELECT * FROM dbo.Type6Dim;
GO

-- Clean up.
DROP TABLE IF EXISTS dbo.Type0Dim;
DROP TABLE IF EXISTS dbo.Type1Dim;
DROP TABLE IF EXISTS dbo.Type2Dim;
DROP TABLE IF EXISTS dbo.Type3Dim;
DROP TABLE IF EXISTS dbo.Type4Dim;
DROP TABLE IF EXISTS dbo.Type4DimHistory;
DROP TABLE IF EXISTS dbo.Type6Dim;
GO
One thing that I hit at the end of the discussion but want to mention here is that the types properly refer to attributes, not to dimensions. What I mean by this is, you can definitely have a dimension which combines types 0, 1, and 2 together. You may have some information which will never change, and so those attributes are type 0. Then, you may have some attributes for which we don’t care about history—those would be type 1 attributes. And some attributes might affect our fact tables, so we track the history as type 2.
Let’s say we have a Person dimension and a Sales fact. On the Person dimension, perhaps we have a DateJoined indicator. We never update that indicator, as the person joined on a specific date and that’s it. So DateJoined would be type 0. The person has a telephone number. We care about the current value of the telephone number, but the telephone number itself won’t tell us much about historical sales. At one point in time, a phone number change was typically indicative of moving to a different area, but between area code splits and mobile phones, your area code isn’t really a great indicator of your location any more. So a phone number might be a type 1 attribute: if it changes, we want to keep track of the latest value, but we don’t need to store a history of it. Favorite color, however, might be a type 2 attribute—if a person’s favorite color changes, that might affect their sales behavior as they shift from buying things in one color to buying things in a different color. In this case, we want to know when the person had a particular favorite color.
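To make the Person example concrete, here is a minimal sketch of one dimension which mixes all three attribute types. The table name, column names, and sample values are my own illustrative assumptions rather than anything from a real warehouse.

```sql
USE tempdb
GO
CREATE TABLE dbo.PersonDim
(
    PersonKey INT IDENTITY(1,1) NOT NULL PRIMARY KEY, -- surrogate key
    PersonID INT NOT NULL,                            -- business key
    DateJoined DATE NOT NULL,                         -- type 0: never updated
    PhoneNumber VARCHAR(20) NULL,                     -- type 1: overwrite in place
    FavoriteColor VARCHAR(30) NULL,                   -- type 2: new row per change
    IsCurrent BIT NOT NULL,
    PeriodBeginDate DATETIME2(3) NOT NULL,
    PeriodEndDate DATETIME2(3) NOT NULL
);

INSERT INTO dbo.PersonDim
    (PersonID, DateJoined, PhoneNumber, FavoriteColor, IsCurrent, PeriodBeginDate, PeriodEndDate)
VALUES (1, '2015-06-01', '919-555-0100', 'Green', 1, GETUTCDATE(), '9999-12-31');

-- Type 1 change: overwrite the phone number on every row for this person.
UPDATE dbo.PersonDim
SET PhoneNumber = '919-555-0199'
WHERE PersonID = 1;

-- Type 2 change: close out the current row, then add a new current row
-- which carries the type 0 and type 1 values forward unchanged.
DECLARE @Now DATETIME2(3) = GETUTCDATE();
BEGIN TRANSACTION;
UPDATE dbo.PersonDim
SET IsCurrent = 0, PeriodEndDate = @Now
WHERE PersonID = 1 AND IsCurrent = 1;

INSERT INTO dbo.PersonDim
    (PersonID, DateJoined, PhoneNumber, FavoriteColor, IsCurrent, PeriodBeginDate, PeriodEndDate)
SELECT PersonID, DateJoined, PhoneNumber, 'Puce', 1, @Now, '9999-12-31'
FROM dbo.PersonDim
WHERE PersonID = 1 AND IsCurrent = 0 AND PeriodEndDate = @Now;
COMMIT TRANSACTION;

SELECT * FROM dbo.PersonDim ORDER BY PeriodBeginDate;
GO
DROP TABLE IF EXISTS dbo.PersonDim;
GO
```

The fact table would join to PersonKey, so each sale points at the version of the person (and the favorite color) which was current at the time of the sale.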
This starts to sound complex, but it’s the kind of analysis which is critical for a solid data warehouse. This may also be part of why so many data warehousing projects fail!
Mala has a couple of book recommendations for us this week:
Tom asked for recommendations on hands-on labs. I shared some of my experiences and Neil Hambly brought up ideas as well. I don’t have a great answer, especially if you don’t do hands-on labs frequently. If you’re in the ML space, I like Azure Notebooks because it’s free and I don’t have to worry about the computers people bring in. We also talked about a few options for when notebooks aren’t a good solution: Docker containers, VMs on thumb drives, bringing your own hardware (as a trainer), and Azure VMs. Basically, unless you control the situation really well and pass out the hardware yourself, I would shy away from trying to use attendees’ machines directly for a hands-on lab and virtualize, containerize, or cloudify whatever I could.
We had 100% approval of Oxford commas in chat. It was beautiful.
Mark Gordon wanted to know what people use for working with Git.
Answer: Here’s what everybody uses:
git stash pop)
@johnfan14 asks, “How many presentations you do in a month on average? How many sessions you normally submit to a conference? What is the optimal numbers do you think?”
Answer: This really depends on the person and we spent some time covering the various criteria. The short answer is that it really depends on your budget and available time, as well as location. Being based in the southeastern United States and in a city with a reasonably good airport, all four of us have an easy time traveling to a variety of events.
I didn’t really say out loud my answer to the specific questions so I’ll do so here. Prior to this year, my presentation counts were 63 in 2019, 70 in 2018, 50 in 2017, 53 in 2016, and 19 in 2015. Some events had me give multiple presentations, so I wasn’t at 63 events in 2019, for example, but the answer is “quite a few.” But I’m an extreme outlier, so I didn’t want to skew things too much.
As far as session submissions, this depends on the conference rules, but when there is no limit, I submit 4. That way there’s a variety of topics (which makes me more likely to be selected) but not so much noise that an organizer needs to wade through a dozen submissions. If you have one good talk or two good talks, just submit those—you don’t need 4.
From @rporrata: “DB Documentation what tools have you used especially for redesigning systems outside of the usual suspects erwin/studio.”
Answer: This ended up blending together two sets of tools: database documentation tools and database architecture tools. I’m not a big fan of database architecture tools in general, so without further ado, here are links to the resources we came up with and that chat helped us out with.
Mark Hutchinson e-mailed before the show began and had a question for us: “I’m going to teach a couple of friends SQL. Just the DML subset of the language to start (Select, Insert, Update, Delete). Currently, we’re going to use MS Access, since the GUI can help do some of the heavy lifting at the early part of the course. For later, maybe a second course, I was thinking about introducing the students to a large database, such as SQL Server. What is the best free (or damned cheap) database? One of the students is out of work and I’m retired, so money is an issue.”
Answer: We’re SQL Server folks, so we’re going to be biased. But each one of us came up with SQL Server Developer Edition. It is 100% free for non-production use and easy to install. We debated the merits of other editions as well, so here’s the quick +/- on each:
My question is when we say “all expressions that appear in the same logical query processing phase are evaluated conceptually at the same point in time,” does that mean the query looks for all instances of FROM then all instances of WHERE, etc.?
How would this be processed?
SELECT c.CUSTOMERID
FROM CUSTOMER c
WHERE c.CUSTOMERID NOT IN
(
    SELECT ag.CUSTOMERID
    FROM ANGRYCUSTOMERS ag
)
Answer: Mike is talking about clause ordering, where the FROM clause parses before the WHERE clause, etc. And within the WHERE clause, all predicates are handled at the same time, meaning that if you have a WHERE clause with X=1 AND Y=2 AND Z=3, the database engine can take them in any order (regardless of the order in which you wrote them). From your perspective, all of those filters happen concurrently, so you can write them in any order and the database engine will (hopefully) pick the smartest path. This is why short-circuiting may not work in SQL the way you’d expect it to in an imperative language: because the optimizer doesn’t care about the order in which you write things and can shake up the order if it looks like a better path.
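Here is a small sketch of how the lack of guaranteed short-circuiting can bite you. The temp table and its values are made up for illustration.

```sql
CREATE TABLE #Readings
(
    RawValue VARCHAR(20)
);
INSERT INTO #Readings(RawValue)
VALUES ('10'), ('abc'), ('25');

-- Risky: nothing guarantees that the ISNUMERIC check runs before the CAST,
-- so depending on the plan, this can fail with a conversion error on 'abc'.
-- SELECT RawValue
-- FROM #Readings
-- WHERE ISNUMERIC(RawValue) = 1
--     AND CAST(RawValue AS INT) > 15;

-- Safer: TRY_CAST returns NULL for values which do not convert,
-- so the predicate is correct regardless of evaluation order.
SELECT RawValue
FROM #Readings
WHERE TRY_CAST(RawValue AS INT) > 15;

DROP TABLE #Readings;
```

The second query does not depend on the engine evaluating predicates in the order you wrote them, which is the property you want.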
In this particular instance, FROM CUSTOMER c will process first, and then we will get to the WHERE clause. Inside there, we have a NOT IN operator which operates on a sub-query, so we move to FROM ANGRYCUSTOMERS ag and then SELECT ag.CUSTOMERID. After that completes, we finally get to
We wrapped up with one event of note because I forgot about the other one:
Chris Voss e-mailed us with a great question. “What are the “home lab” setups like? What computers/specs does everyone have, and for what purposes? I’m talking about person systems rather than work. Part of why I ask is because I’m looking at new computers, so I’m asking basically every tech person to ensure I’m doing this right.”
Answer: This will depend on whether you want a desktop or a laptop. Or you could just have a server room in your basement…
Tom and Tracy went over some of the characteristics they look for in laptops, starting with 64 GB of RAM. I mentioned that I’d much rather have extremely fast disk for a single SQL Server installation, though both of them have multiple SQL Server instances running on multiple VMs, so the need for lots of RAM makes perfect sense.
If you want a resource for building a desktop machine, the folks at Logical Increments do an incredible job. The service is entirely free and I used it to build my machine learning and video processing desktop. You pick the price point and they give you several recommendations on hardware choices. One thing I would say is that I’d recommend going up a notch on drives—prosumer grade NVMe (like the Samsung Pro series) over consumer-grade SSD. Fill your motherboard’s NVMe slots first before using SSD or HDD.
Neil Hambly had some nice recommendations as well, including Overclockers for the UK folks and making sure that you swap out SSD every 18-24 months to eliminate the risk of a drive dying on you. Unlike hard disks, SSD doesn’t give you as much warning before it dies out, and it can just suddenly drop off.
As a bit of kismet, Mark Gordon had e-mailed me earlier with a great follow-on question: “When it comes to storage for SQL Server, does NVMe offer an improvement over SSD?”
Answer: Oh, you bet it does. On stream, I read a tiny bit from this article on the differences between NVMe and SSD. The relevant portion is:
NVMe is not affected by the ATA interface constrictions as it sits right on the top of the PCI Express directly connected to the CPU. That results in 4 times faster Input/Output Operations Per Second (IOPs) rivaling the fastest SAS option out there. The seek time for data is ten times faster. NVMe can deliver sustained read-write speed of 2000MB per second, way faster than the SATA SSD III, which limits at 600MB per second. Here the bottleneck is NAND technology, which is rapidly advancing, which means we’ll likely see higher speeds soon with NVMe.
With SQL Server, you will notice the difference under load. NVMe is still nowhere near as fast as RAM, but it’s a lot closer than SSD (which is itself way closer than 15K spinning disk).
By the way, for the pedantic-minded, I am aware that NVMe disks are still SSD; when I say SSD, I mean SSD over SATA in the classic 2.5″ form factor.
Mark had some follow-up bits I can hit briefly here. He mentioned tempdb as a good candidate for fast disk and that’s a smart idea: tempdb should be on the fastest disk you can possibly get. Here’s a rough guide that I’m coming up with off the top of my head, ranking things in order of best to worst:
There are other configurations that nestle in between some of these (e.g., direct-attached SSD for tempdb but SSD SAN for the rest is slightly better than SSD SAN array but slightly worse than all direct-attached SSD), but the general rule of thumb is that direct-attached beats SAN and that NVMe > SSD > HDD.
John Fan Zhang asked for a good book to learn SQL.
My recommendation, to the point where I have purchased this book for one of my employees needing to learn T-SQL, is Itzik Ben-Gan’s T-SQL Fundamentals 3rd Edition. Itzik is brilliant and an outstanding teacher, and even if you have an advanced knowledge of T-SQL, you’ll still pick up things from his beginner-level book.
Mala also recommended Itzik’s training, available on demand.
Gabriel hit me up with this question before the stream began: “Is there a really a need to support and maintain RIGHT JOIN?”
Answer: Tracy says no, Tom says no, Mala says no, and Kevin says mostly no.
The thing about RIGHT JOIN is that it is usually confusing to people because it’s backwards from how we want to read. In English, we read left to right, top to bottom. We also work from the assumption that the most important stuff comes first. RIGHT JOIN violates this by making the latter table the “important” one. The other consideration here is that every RIGHT OUTER JOIN operation can be rewritten as a logically equivalent LEFT OUTER JOIN.
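As a quick illustration of that equivalence, here is a sketch with two hypothetical temp tables. Both queries return every customer, with NULL for OrderID when the customer has no orders.

```sql
CREATE TABLE #Customer
(
    CustomerID INT NOT NULL,
    Name VARCHAR(50) NOT NULL
);
CREATE TABLE #Orders
(
    OrderID INT NOT NULL,
    CustomerID INT NOT NULL
);
INSERT INTO #Customer(CustomerID, Name)
VALUES (1, 'Tony'), (2, 'Maria');
INSERT INTO #Orders(OrderID, CustomerID)
VALUES (100, 1);

-- The "important" table (Customer) comes last.
SELECT c.CustomerID, c.Name, o.OrderID
FROM #Orders o
    RIGHT OUTER JOIN #Customer c
        ON o.CustomerID = c.CustomerID;

-- Same result, with the "important" table first.
SELECT c.CustomerID, c.Name, o.OrderID
FROM #Customer c
    LEFT OUTER JOIN #Orders o
        ON o.CustomerID = c.CustomerID;

DROP TABLE #Orders;
DROP TABLE #Customer;
```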
That said, I have personally run into a couple of cases where it made sense to use a RIGHT JOIN rather than switching to LEFT JOIN. These cases were mostly around complex join criteria with a combination of LEFT JOIN and INNER JOIN and one last RIGHT OUTER JOIN to catch the “I don’t have anything else” scenario. So I wouldn’t get rid of RIGHT OUTER JOIN, but if I see it in a code review, the first question is asking why this needs to be ROJ and cannot be a LOJ.
Finally, chat got off onto the tangent of aliases and table names. On this topic, @iconpro555 tossed us into the briar patch with “why not use 4 letter names because that is what you use for aliases anyway?”
As far as naming goes, my rule of thumb is: make it clear but not overly verbose. 4 characters is fine if a table is called dbo.Home and represents information about a home (location, square footage, tax appraisal, etc.). But don’t be afraid to add a few extra characters to a column name if it clarifies intent. One thing I really like to see is unit of measure. You show me a thing called TotalCPUTime, but is that in seconds? milliseconds? microseconds? This gets really annoying even with SQL Server DMVs because some of them are milliseconds and others microseconds.
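To show what I mean about the DMVs, here is a short sketch. The column aliases are my own naming suggestions; the underlying DMV columns are the real ones, and their units come from the documentation rather than from the column names.

```sql
-- sys.dm_exec_query_stats reports worker time in microseconds.
SELECT TOP (10)
    qs.execution_count AS ExecutionCount,
    qs.total_worker_time AS TotalWorkerTimeMicroseconds,
    qs.total_worker_time / 1000 AS TotalWorkerTimeMilliseconds
FROM sys.dm_exec_query_stats qs
ORDER BY qs.total_worker_time DESC;

-- sys.dm_exec_requests reports CPU time in milliseconds.
SELECT
    r.session_id AS SessionID,
    r.cpu_time AS CpuTimeMilliseconds
FROM sys.dm_exec_requests r;
```

Putting the unit in the alias costs a dozen characters and saves the next person a trip to the documentation.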
Names are for developers, whether that’s the current developer or a future maintainer. Just like with the discussion about RIGHT OUTER JOIN, we are optimizing for developers rather than for the database engine. There are times when you need to optimize for the sake of the database engine rather than the developer, and that’s where you start adding copious notes clarifying your intent.
We wrapped up with one event of note because I forgot about the other one:
Mala had a two-part question for us. When do you upgrade? And why do you upgrade?
Answer: Each of us has different opinions.
Mala would regularly wait for SP1 of a product before upgrading. With Microsoft eliminating regular service packs for SQL Server, she’s not quite sure what rule of thumb to follow.
Tom likes to push the envelope, preferring to upgrade quickly. Though he doesn’t hit each version—he tends to skip a version, e.g., 2016 to 2019. He does want to see compelling items in a version before upgrading.
Kevin likes to upgrade for upgrading’s sake. Or something like that… I have enjoyed being part of the Early Access Program for SQL Server and getting a chance to try out products under development. I pushed back a bit against the “Wait for SP1” argument, but one thing I failed to say during it is that if everybody waits for SP1, SP1 will still have a bunch of bugs. I am thankful for the people whose philosophy is “Someone’s got to find the bugs, and it might as well be me” and everybody who waits to upgrade should be as well.
From there, I derailed things onto my refusal to work for a company stuck on an old version of SQL Server, with no plan to upgrade (or a plan but no real desire to upgrade). Tom and Mala made me walk it back a bit.
Mike Lisanke wanted to know why we call the language SQL for SQL Server, Oracle, DB/2, Postgres, etc., and yet they’re all different languages.
We covered a lot in here, but the gist is that ANSI releases versions of the standard which companies subsequently adopt in part (and extend in part). I mentioned that there isn’t “an” ANSI SQL standard and Wikipedia has a nice table (about 2/5 of the way down) showing the different versions of ANSI SQL. I had guessed about the pre-89 versions and wasn’t quite right—there was only one pre-89 version, there wasn’t a 1997 version, and 2000 was 1999. Other than that the answer was fine! But there have been 10 iterations of the ANSI SQL standard.
We also talked about the origin stories of a few platforms, including Sybase/SQL Server, Oracle/Postgres, and MySQL/MariaDB. We also talked about coding for ANSI compliance. Tom likes that idea (or just using PolyBase—which I recommend!). I don’t care much for coding for ANSI compliance for most places because you lose chances to improve performance for a chimerical gain. The exception here is if you must write software which is cross-platform; then you’re stuck.
Tom mentions making use of hierarchyid in SQL Server. Then we started name-dropping books.
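If you have not seen hierarchyid before, here is a minimal sketch of what it looks like in practice. The org chart below is a made-up example, not something we covered on stream.

```sql
CREATE TABLE #Employee
(
    Node HIERARCHYID NOT NULL PRIMARY KEY,
    Name VARCHAR(50) NOT NULL
);

-- Paths are written root-first: '/1/1/' is the first child of '/1/'.
INSERT INTO #Employee(Node, Name)
VALUES
    (hierarchyid::GetRoot(), 'CEO'),
    (hierarchyid::Parse('/1/'), 'VP of Sales'),
    (hierarchyid::Parse('/1/1/'), 'Sales Rep'),
    (hierarchyid::Parse('/2/'), 'VP of Engineering');

-- Find everybody in the Sales branch.
-- Note that IsDescendantOf treats a node as a descendant of itself.
DECLARE @Sales HIERARCHYID = hierarchyid::Parse('/1/');
SELECT
    Node.ToString() AS NodePath,
    Node.GetLevel() AS Depth,
    Name
FROM #Employee
WHERE Node.IsDescendantOf(@Sales) = 1;

DROP TABLE #Employee;
```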
First up, I recommend Adam Machanic, et al’s, Expert SQL Server 2005 Development. I haven’t read it in a while and obviously the development surface area has changed in 15 years, but there is an excellent chapter on trees and hierarchies.
Mala and I both recommend Louis Davidson and Jessica Moss’s Pro SQL Server Relational Database Design and Implementation. I have an older edition, but I mentioned that it has the best explanation of 4th normal form that I’ve ever read.
I then pulled out my copy of Candace Fleming and Barbara von Halle’s Handbook of Relational Database Design. I consider it the best explanation of normalization I’ve ever seen in print (and thanks to Grant Fritchey for the recommendation!). Just don’t read the second half of the book unless you want a story of how to implement on ancient systems.
We wrapped up with one event of note:
I started us off with a topic of discussion: working from home. Mala and Tom both have significant experience with the topic and they share their thoughts. Stick around for a bit of ranting about Microsoft Teams. @thedukeny points out this highly-upvoted item to allow for multiple Teams accounts at the same time. Slack does it right, and Teams is painful.
Tom brought up desk-sharing, which I absolutely hate. On the plus side, it did remind me of a Dilbert strip from 25 years ago.
Chris Voss asked a question a while back and I finally got a chance to answer: Our team is starting the use of containers for local environments to test our database development, before deploying to the shared dev environment. Can anyone share their container strategies, and what are space considerations for local sandboxes? Would it make sense to put an entire application code base in the same container?
Answer: There are a few questions in here, so let’s take them in turn.
As far as space goes, Tom Norman pointed out that containers won’t save you space across machines: if you have a 500GB database you need on every developer’s laptop, even if that database is in a container, it’ll cost you 500GB of disk space per laptop. Kevin pointed out that the container savings is when you can layer your containers: if you have a bunch of applications using .NET Core, for example, you can reuse container layers so that you might have a couple dozen .NET Core apps which all use the same base layer, so that layer gets stored on disk once.
Does it make sense to put application code in the same container as database code? No, for the same reason that you wouldn’t put app code on the same server as your database. Keeping components isolated reduces system complexity and makes it easier to upgrade or swap out parts.
Mark Gordon raised a question about the telemetry service which derived from a weird account setup. Mark’s research led him to read up a bit on the telemetry service. We then had a bit of discussion about the telemetry service itself and I referenced a Brent Ozar post on the topic.
My personal opinion is that I’m fine with a telemetry service. I build telemetry in my applications and would expect the same from products like SQL Server. There are differing opinions on the topic, though.
We wrapped up with a few events of note:
Anders Pedersen starts us off with a doozy. When deleting a large number of rows, should we do this in one transaction or not?
Answer: Nope. Delete in batches, although this can take a while for enormous tables. If you’re retaining a tiny percentage of rows, then it might be easier to create a new table, migrate the data you want to keep to that table, drop the old table, and rename the new table back to the old name.
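A minimal sketch of the batching pattern follows. The table, the batch size of 10,000, and the two-year cutoff are all assumptions for illustration; tune the batch size to what your log and your blocking tolerance can handle.

```sql
-- Hypothetical table with some sample data to delete.
CREATE TABLE #BigTable
(
    ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CreateDate DATETIME2(3) NOT NULL
);
INSERT INTO #BigTable(CreateDate)
SELECT TOP (25000)
    DATEADD(DAY, -1 * CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % 1500 AS INT), GETUTCDATE())
FROM sys.all_columns a
    CROSS JOIN sys.all_columns b;

DECLARE @BatchSize INT = 10000;
DECLARE @Cutoff DATETIME2(3) = DATEADD(YEAR, -2, GETUTCDATE());
DECLARE @RowsDeleted INT = 1;

-- Each DELETE is its own small transaction, so locks and log usage
-- stay bounded. Stop once a pass deletes nothing.
WHILE (@RowsDeleted > 0)
BEGIN
    DELETE TOP (@BatchSize)
    FROM #BigTable
    WHERE CreateDate < @Cutoff;

    SET @RowsDeleted = @@ROWCOUNT;
END

SELECT COUNT(*) AS RemainingRows FROM #BigTable;
DROP TABLE #BigTable;
```

On a real table, you want an index which supports the WHERE clause; otherwise every batch scans the table to find its rows.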
If you’re using Enterprise Edition (or any edition from SQL Server 2016 SP1 onward), you can partition your tables by date and use partition switching.
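Here is a rough sketch of the switch-then-truncate approach. All object names, boundary values, and the single-filegroup layout are assumptions made to keep the example short; a real partitioning scheme would take more design work.

```sql
USE tempdb
GO
-- Yearly partitions. With RANGE RIGHT, partition 1 holds everything before 2019-01-01.
CREATE PARTITION FUNCTION pfOrderYear (DATE)
AS RANGE RIGHT FOR VALUES ('2019-01-01', '2020-01-01');
CREATE PARTITION SCHEME psOrderYear
AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.OrderFact
(
    OrderDate DATE NOT NULL,
    Amount DECIMAL(10,2) NOT NULL
) ON psOrderYear(OrderDate);

-- The switch target must be empty, have the same structure,
-- and live on the same filegroup as the source partition.
CREATE TABLE dbo.OrderFactStaging
(
    OrderDate DATE NOT NULL,
    Amount DECIMAL(10,2) NOT NULL
) ON [PRIMARY];

INSERT INTO dbo.OrderFact(OrderDate, Amount)
VALUES ('2018-05-01', 10.00), ('2019-05-01', 20.00), ('2020-05-01', 30.00);

-- Metadata-only operation: the 2018 rows leave the fact table immediately.
ALTER TABLE dbo.OrderFact SWITCH PARTITION 1 TO dbo.OrderFactStaging;
TRUNCATE TABLE dbo.OrderFactStaging;

SELECT * FROM dbo.OrderFact;
GO
DROP TABLE IF EXISTS dbo.OrderFactStaging;
DROP TABLE IF EXISTS dbo.OrderFact;
DROP PARTITION SCHEME psOrderYear;
DROP PARTITION FUNCTION pfOrderYear;
GO
```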
As part of deleting lots of data, we ended up talking about long-term archival storage of data. Tom brought up Stretch DB and I laughed. I laughed because Stretch DB is dead on arrival as soon as you look at the price.
If you aren’t made of money, there are a few other options. One I like is to use PolyBase for cold storage of data. Solomon Rutzky also recommended storing archival data on slow disk within SQL Server.
Mike Lisanke calls me out and says that magnetic storage has its place in the world.
To that I say, this is true. I want things as fast as possible, though, and faster storage is one of the easiest ways to make your SQL Server a lot faster. Spinning disk and tape are good for long-term backup storage. But they’re generally not for OLTP or even OLAP scenarios. Give me NVMe or even SSD any day of the week.
From Mike Lisanke, why do databases not have the concept of multi-level caching?
Answer: This answer is for SQL Server in particular; it may be different for other database technologies.
SQL Server has a buffer pool, where data is read into memory before it is returned. That’s one level of caching. From there, multi-level caching is more of an architecture decision: adding caching apps like Redis or using in-process cache in your app servers. That’s outside of the database but replaces database calls, so it effectively acts as another layer of caching.
Also, there is a concept of aggregations in SQL Server Analysis Services, where the engine creates pre-computed aggregations of slices of your data. That gives you a performance boost sort of like what caching does, and you can replicate this in the database engine with rollup tables.
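A rollup table can be as simple as the sketch below. The sales table and the daily grain are hypothetical; pick whichever grain your reports actually query.

```sql
CREATE TABLE #Sales
(
    SaleDate DATE NOT NULL,
    ProductID INT NOT NULL,
    Amount DECIMAL(10,2) NOT NULL
);
INSERT INTO #Sales(SaleDate, ProductID, Amount)
VALUES
    ('2020-05-01', 1, 10.00),
    ('2020-05-01', 1, 15.00),
    ('2020-05-01', 2, 7.50),
    ('2020-05-02', 1, 12.00);

-- Pre-computed aggregate at the day and product grain.
CREATE TABLE #SalesDailyRollup
(
    SaleDate DATE NOT NULL,
    ProductID INT NOT NULL,
    TotalAmount DECIMAL(18,2) NOT NULL,
    SaleCount INT NOT NULL,
    PRIMARY KEY (SaleDate, ProductID)
);
INSERT INTO #SalesDailyRollup(SaleDate, ProductID, TotalAmount, SaleCount)
SELECT SaleDate, ProductID, SUM(Amount), COUNT(*)
FROM #Sales
GROUP BY SaleDate, ProductID;

-- Reports read the small rollup table instead of re-aggregating the detail rows.
SELECT SaleDate, ProductID, TotalAmount, SaleCount
FROM #SalesDailyRollup
ORDER BY SaleDate, ProductID;

DROP TABLE #SalesDailyRollup;
DROP TABLE #Sales;
```

The trade-off is the same as with any cache: you need a process to keep the rollup in sync with the detail rows.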
Mala recommends Skype, as it is free and lets you save recordings. She also recommended checking out work from Doug Lane (for example, his gear to make technical videos—though that is a few years old) and Erik Darling.
Tom uses GoToWebinar but doesn’t do many recordings.
I use Whereby for streams and you can record on there. I use Camtasia for professional video editing and post-processing. OBS Studio is great for gonzo work when you don’t want post-processing. It’s also the software I use for streaming. Windows Video Editor is a thing but I have no experience with it so I don’t know how well it would work here. Adobe Premiere Pro is great if you can afford it.
Mala is currently going through SSIS training from Andy Leonard. Andy is an outstanding teacher and one of the best at SSIS. If you get a chance to learn from Andy, take it.
Tom is working on building an enclave in his environment so he can use Always Encrypted with enclaves.
John Fan Zhang had a lengthy question for us which I’m summarizing as: given a new stored procedure which inserts batches of rows into a table, I am seeing worse database performance as a result. What can I do about this? The table is a heap. Will a unique clustered index help?
Answer: My first thought is, check your storage. If you have cheap disk, get better disk performance and your problem probably goes away.
Inserting into heaps can be faster than inserting into tables with clustered indexes due to the hot page problem. This typically matters more when dealing with concurrent insertion rather than single batch operation. Still, in most cases, a clustered index will be faster for insert than a heap.
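If you want to try the clustered index route, the change itself is small. The table and index names here are hypothetical, and you would want to test against your own insert pattern before and after.

```sql
USE tempdb
GO
CREATE TABLE dbo.BatchTarget
(
    BatchTargetID INT NOT NULL,
    Payload VARCHAR(100) NOT NULL
);

-- A table without a clustered index shows up as a HEAP.
SELECT i.type_desc
FROM sys.indexes i
WHERE i.object_id = OBJECT_ID('dbo.BatchTarget');

-- Convert the heap into a clustered table.
CREATE UNIQUE CLUSTERED INDEX CIX_BatchTarget
ON dbo.BatchTarget (BatchTargetID);

-- The same query now reports CLUSTERED.
SELECT i.type_desc
FROM sys.indexes i
WHERE i.object_id = OBJECT_ID('dbo.BatchTarget');
GO
DROP TABLE IF EXISTS dbo.BatchTarget;
GO
```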
Mike Chrestensen asks, will using MERGE to insert data be faster than
Answer: No. Also, avoid MERGE. It has lots of bugs. It’s easy to end up with terrible performance. It’s generally slower than independent INSERT/UPDATE/DELETE operations.
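For reference, here is a sketch of the independent-statements approach for an upsert. The temp tables and columns are made up, and this is one common pattern rather than the only way to do it.

```sql
CREATE TABLE #Target
(
    ID INT NOT NULL PRIMARY KEY,
    FavoriteColor VARCHAR(30) NOT NULL
);
CREATE TABLE #Source
(
    ID INT NOT NULL PRIMARY KEY,
    FavoriteColor VARCHAR(30) NOT NULL
);
INSERT INTO #Target(ID, FavoriteColor) VALUES (1, 'Green');
INSERT INTO #Source(ID, FavoriteColor) VALUES (1, 'Puce'), (2, 'Blue');

BEGIN TRANSACTION;

-- Update the rows which already exist.
UPDATE t
SET FavoriteColor = s.FavoriteColor
FROM #Target t
    INNER JOIN #Source s
        ON t.ID = s.ID;

-- Insert the rows which do not exist yet.
INSERT INTO #Target(ID, FavoriteColor)
SELECT s.ID, s.FavoriteColor
FROM #Source s
WHERE NOT EXISTS
(
    SELECT 1
    FROM #Target t
    WHERE t.ID = s.ID
);

COMMIT TRANSACTION;

SELECT * FROM #Target ORDER BY ID;

DROP TABLE #Source;
DROP TABLE #Target;
```

If multiple sessions can upsert the same keys at once, you will also need to think about isolation level or locking hints; the sketch above assumes a single writer.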
Can SQL Server Management Studio 18.4 connect to SSIS 2017 and SSIS 2019?
Answer: Yes. As of SSMS 18, you can connect to Integration Services 2017 and 2019. For prior versions of Integration Services, you will need the same version of SSMS as SSIS.
Kevin’s mini-rant about Azure Data Studio shortcuts can be summed up in two GitHub issues: supporting Jupyter shortcuts and supporting Command Mode. Please upvote those by choosing a thumbs-up reaction if you want to see these in Azure Data Studio.
johnfan14: Can I ask a question？ The question is If we must modify the Orders table to meet the following requirements: 1. Create new rows in the table without granting INSERT permissions to the table. 2. Notify the sales person who places an order whether or not the order was completed. What should we create?
For the first part, my recommendation is to use certificate signing. Solomon Rutzky has an excellent tutorial on that. Solomon happened to be in chat and mentioned that ownership chaining can work as well for many circumstances.
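Here is a minimal sketch of the ownership chaining approach Solomon mentioned, since it is the shorter of the two. The table, procedure, and user names are assumptions for illustration. Certificate signing follows the same idea (callers get EXECUTE on the procedure and nothing on the table) but adds a certificate and a signature to the procedure.

```sql
USE tempdb
GO
CREATE TABLE dbo.Orders
(
    OrderID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CustomerID INT NOT NULL,
    Amount DECIMAL(10,2) NOT NULL
);
GO
-- The procedure and the table have the same owner, so SQL Server
-- skips the permission check on the table when the procedure runs.
CREATE PROCEDURE dbo.Orders_Insert
    @CustomerID INT,
    @Amount DECIMAL(10,2)
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.Orders(CustomerID, Amount)
    VALUES (@CustomerID, @Amount);
END
GO
-- A user with no table permissions at all.
CREATE USER SalesPerson WITHOUT LOGIN;
GRANT EXECUTE ON dbo.Orders_Insert TO SalesPerson;
GO
EXECUTE AS USER = 'SalesPerson';
-- Works, even though SalesPerson has no INSERT permission on dbo.Orders.
EXEC dbo.Orders_Insert @CustomerID = 1, @Amount = 10.00;
REVERT;

SELECT * FROM dbo.Orders;
GO
DROP PROCEDURE IF EXISTS dbo.Orders_Insert;
DROP TABLE IF EXISTS dbo.Orders;
DROP USER IF EXISTS SalesPerson;
GO
```

Ownership chaining breaks down with dynamic SQL and in some cross-database scenarios, which is where certificate signing tends to be the better fit.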
The answer to the second part is generally to use something like Service Broker. For more on that, I’d recommend Colleen Morrow’s series of posts on the topic.
We had a question in chat about using SQL Server in containers. Microsoft has some good documentation on how to get that going.
I also mentioned running Linux containers natively in Windows without emulation via Hyper-V. You can read more about that on the Docker website.