Show notes for the Triangle Area SQL Server User Group's Shop Talk program.
Author: Kevin Feasel
Kevin Feasel is a Microsoft Data Platform MVP and CTO at Envizage, where he specializes in data analytics with T-SQL and R, forcing Spark clusters to do his bidding, fighting with Kafka, and pulling rabbits out of hats on demand. He is the lead contributor to Curated SQL (https://curatedsql.com) and author of PolyBase Revealed (https://www.apress.com/us/book/9781484254608). A resident of Durham, North Carolina, he can be found cycling the trails along the triangle whenever the weather's nice enough.
Tracy and Mala started off with a quick review of SQLbits, with Mala mentioning that it was probably the best hybrid experience she’s had with a large conference.
Parameter Sensitive Plan Optimization
After that, Mala shared her thoughts on a new feature in SQL Server 2022 that she’s been trying out: parameter sensitive plan optimization. Jared mentioned some of the challenges with it but we also talked about how some of the criticism of this feature is a bit overblown.
40 Problems with a Stored Procedure
Mark Hutchinson got us to talk about this article from Aaron Bertrand involving a code review of a nasty piece of work. Aaron found 40 separate problems, so we went through and talked about each of them. I came in expecting to disagree with 10 or so, but I think I really only disagreed with 3-4. I was actually a little surprised by that, though then we had some fun pointing out the formatting problems in Aaron’s updated procedure. Sometimes what is best in life is to be just a little petty.
Mike and I had a mini-debate on ChatGPT. While we were talking about it, I included this explanation of ChatGPT. Personally, I am very pessimistic about the idea of using ChatGPT for anything other than enjoying the clever way in which it puts words together. It is a language model, not a truth model: there is no concept of truthfulness in its responses and there is no ghost in the shell. My skepticism comes from three places. First, I strongly agree with the thrust of Charlie Stross’s post about this being a rather fishy time for a bunch of ChatGPT-related endeavors to pop up, just in time to soak up money after the last bubble. Second, I’ve heard some really dumb ideas involving ChatGPT, like having it write academic papers or code. Third, I am a strong believer in the weak AI theory (quick note: I misspoke and said “hard” and “soft” AI when I meant “strong” and “weak” AI). As I mentioned in the video, I’m obviously not able to prove that there will never be a strong AI, but I’m quite skeptical of the notion and, if I had to put money on it, I would be more comfortable with the “never” bet than with a bet on it occurring within any specific time frame.
Mike, meanwhile, talked about some of the practical things he was using ChatGPT for, and he also accidentally exposed ChatGPT’s weakness around outdated information when asking a question about PASS Summit.
We had the great honor of having Kevin Kline on, so we spent most of the episode grilling him and Mala about the history of the SQL Server community and PASS as an organization. Both of them have such a great deal of knowledge about the organization and broader community, so if there was ever a good episode for me to lose my voice, this is the one.
Because there was nobody to stop me from spiraling, I started off the episode with some bad news:
We probably aren’t going to have a SQL Saturday Raleigh this year due to difficulty finding an appropriate venue. I had a bunch of places shoot us down or ghost us, so although I’m sure we could have found somewhere to host, we weren’t able to figure out where that place was in time.
I got the privilege of telling my employees that we were all being laid off as part of a reorganization plan.
Thoughts on Synapse
After that, I riffed for a while on a blog post by Eugene Meidinger covering how difficult it is to learn Azure Synapse Analytics for someone without classical warehousing or ETL experience. Earlier that day, Eugene, Carlos L. Chacon, and I interviewed someone (and I’m being a little cagey here just because the episode hasn’t come out yet so I don’t want to spoil too much) on this topic.
“Big Data” and Its Discontents
The final topic of the evening was a discussion of how “Big Data” platforms—the author’s experience is in BigQuery but I’d also include Hadoop and even things like the Azure Synapse Analytics dedicated SQL pool—have become less common over the past several years. I think the article makes a good number of points, particularly around the major increases in per-machine power we’ve seen over the past decade. There are a couple of parts where I think the author overplays his hand, but overall, the article is worth the read.
The first topic of the night was a couple of upcoming events the Shop Talk crew will be at. I’ll be at SQL Saturday Atlanta BI Edition on February 25th. Tracy will be in Wales for SQLbits in March and Mala will present remotely.
Laid Off? Andy Leonard Has Free Training for You
Andy Leonard has a generous offer for anyone who has been laid off recently: a full year of free access to his training catalog. Andy has a lot of great content and is a great person to learn from when it comes to data movement in SSIS or Azure Data Factory.
Implicit Conversions are Bad
Tracy authored a blog post recently on eliminating implicit conversions in Hibernate and JDBC. She wasn’t able to make the show but Mala and I talked about the topic and Solomon Rutzky reminded us that the most likely problem Tracy ran into involved collations and data type mismatches—with Windows collations, we wouldn’t see these issues.
Debugging T-SQL Code
Mala wanted us to talk about a recent Brent Ozar post on debugging T-SQL code. I agree with Brent that RAISERROR and table variables form a potent combination for error handling. I will, however, never pronounce it as “raise-roar.”
Code Commenting
We wrapped things up with a diversion around this Maelle Salmon post on code commenting, with an emphasis on R. I like the principles of it and it got me thinking about whether there are languages which are more or less comment-needy: in other words, are there some languages in which you absolutely need more comments and other languages in which you definitely don’t need more? As a first approximation, I went with math-heavy (and functional) programming languages as benefitting more from detailed comments, and I could see relatively more verbose languages like COBOL needing fewer explicit comments. I’m not sure this is actually correct, however; I’d have to think about it some more.
Because we talked about this during the last episode, here’s a quick update. We have booked all three groups (Advanced DBA, main meeting, and BI/Data Science) through July. The call for speakers is still up, however, and if you want to speak for our group, please submit one or more sessions.
Workplace “Red Flags”
A Kevin Kline tweet formed the basis of our first topic.
Mala and I shared some painful responses, though I cheated a bit and picked several situations in which I saw the red flag before taking the job.
“Big Data” Trends
We spent the rest of the episode taking a look at this Petr Nemeth article. We looked at and responded to each of Petr’s main trends. Some of them, I think, are reasonable; others have been a pipe dream for the past 15 years and I don’t foresee that changing.
The first topic of the night is that we are looking for speakers for the Advanced DBA and Business Intelligence / Data Science TriPASS meetings. These are (currently) remote-only, so all are welcome to submit sessions. The call for speakers is currently up and running.
SQL Saturday Raleigh Update
As a first step toward hosting SQL Saturday Raleigh in 2023, we started looking for a venue. The place which hosted us last time around is no longer doing weekend events and I’m currently 0 for 4 on locations. We have a few other irons in the fire and, assuming we can lock down a venue, will get to work on hosting SQL Saturday Raleigh. Our provisional date is April 15th but there’s no call for speakers or official announcement yet.
We had a chat question come in around normalizing addresses: that is, given some arbitrary string a user typed in, what is the “official” address? We recommended Melissa Data for this, as they handle files and have an API, as well as SSIS components. Other alternatives we kicked around were the Google Maps API and OpenStreetMap, both of which have APIs to support address lookup.
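As a rough illustration of the OpenStreetMap route, here is a minimal Python sketch that builds a lookup request against Nominatim, the public OpenStreetMap geocoding service. It was not shown on the episode. The function names and the User-Agent string are hypothetical, and the public endpoint has a usage policy (identify your application, keep request volume low), so treat this as a starting point, not a production address cleanser.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Public Nominatim search endpoint (OpenStreetMap's geocoder).
NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_lookup_url(raw_address, limit=1):
    """Build a free-form search URL for whatever the user typed in."""
    # Collapse stray whitespace before sending the string along.
    cleaned = " ".join(raw_address.split())
    params = {
        "q": cleaned,
        "format": "jsonv2",
        "addressdetails": 1,  # ask for a structured address breakdown
        "limit": limit,
    }
    return NOMINATIM_SEARCH + "?" + urlencode(params)


def lookup_address(raw_address):
    """Call the service and return the parsed JSON list of candidates.

    The User-Agent value is a placeholder; replace it with something
    that identifies your own application.
    """
    request = urllib.request.Request(
        build_lookup_url(raw_address),
        headers={"User-Agent": "address-lookup-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Each candidate that comes back includes a display name and, with `addressdetails` set, a structured address. Whether that counts as the "official" address depends on how good the OpenStreetMap data is for the area in question.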
PyTorch Compromise
Our final topic of the night involved PyTorch, a popular deep learning library for Python. It seems that, sometime shortly after Christmas, someone pulled off a supply chain attack on PyTorch, creating a malicious package with the same name as an internal PyTorch package. This only affected people who installed the nightly build between December 25th and December 30th and the PyTorch website has cleanup instructions, as well as more details. The specific nature of the attack was particularly interesting, as the attackers put a lot of effort into staying hidden.
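For a quick first pass at whether an environment might be affected, a Python sketch along these lines lists installed distributions and flags the dependency name involved. The package name `torchtriton` comes from the public advisory, not from the discussion above, and the helper names are hypothetical. Finding the name only means the environment deserves a closer look; the cleanup instructions on the PyTorch website remain the authority.

```python
import re
from importlib import metadata

# Assumption: "torchtriton" is the dependency name used in the attack.
# Confirm against the PyTorch advisory before relying on this list.
SUSPECT_PACKAGES = {"torchtriton"}


def normalize(name):
    """Normalize a distribution name (lowercase, separators collapsed)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_suspects(installed_names, suspects=SUSPECT_PACKAGES):
    """Return the sorted suspect names that appear among installed names."""
    wanted = {normalize(s) for s in suspects}
    present = {normalize(n) for n in installed_names if n}
    return sorted(present & wanted)


def installed_distribution_names():
    """Names of every distribution visible to the current interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs with no recorded name
            names.append(name)
    return names


if __name__ == "__main__":
    hits = find_suspects(installed_distribution_names())
    if hits:
        print("Found packages worth investigating:", ", ".join(hits))
    else:
        print("No suspect package names found in this environment.")
```

A name match does not prove compromise and a clean result does not rule out other problems, so treat this purely as a triage step.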
We’ve completed another round of TriPASS elections and the slate of candidates passed: Kevin Feasel as President, Rick Pack as VP of Marketing, and Mala Mahadevan as Treasurer. Thank you to any TriPASS member who voted.
The Siren Song of Reusable Queries
Our big topic for this episode was around reusable code and how much of a trap it can be in SQL Server. Thinking about ways to reuse code is great in most procedural languages but we cover in some detail why that plan can fall apart with common T-SQL constructs, including functions and views.
Resume Thoughts
The other topic we covered involved resumes. I looked at it from two angles: me as a hiring manager and me as a candidate. A couple of the big things I’m looking for:
Brevity. My resume is 1 page long and I’ve done a few things. Your resume is not a curriculum vitae: it’s not intended to be everything you’ve ever done, just items which are most relevant to the job at hand. As you gain more experience, it’s okay to leave off older jobs, especially when they aren’t directly relevant.
Impact. You worked at BigCo for 14 years but what did you do? Pick one or two major projects which had the biggest impact and give me concrete measures of how you made somebody’s life better.
Appropriate humility. If you call yourself an expert on something, be prepared: that’s a big target on your back. But at the same time, if you’ve written a book and delivered a 6-lecture series at Oxford on a topic, don’t underplay your level of knowledge. Finding the appropriate level is tough, especially when there aren’t clear, common delineations between levels of expertise in a given field.
Hit the HR bullet points. This isn’t something I look for as a hiring manager but it can prevent me from getting your resume. Be sure, when you customize your resume for a particular job, to include as many of the relevant keywords as possible, as automated HR systems act as gatekeepers here. If the job mentions T-SQL, SQL, database administration, query tuning, and database security, fit those in. You should still be able to keep it to 1 page of impact-driven statements, especially if you do include a “Key skills” section with a line or two of relevant skills that you demonstrate (even if between the lines) in your job experience section.
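To make the keyword point concrete, here is a small Python sketch that checks which keywords from a job posting appear in a resume. It is only an illustration: real HR systems are opaque, their matching rules are unknown, and the function name and whole-phrase matching rule are my own assumptions.

```python
import re


def keyword_coverage(resume_text, keywords):
    """Split keywords into (found, missing) lists.

    Matching is case-insensitive and on whole phrases. A hyphen counts as
    part of a word here, so "SQL" is not credited just because "T-SQL"
    appears. That rule is an assumption, not how any real system works.
    """
    text = resume_text.lower()
    found, missing = [], []
    for keyword in keywords:
        pattern = r"(?<![\w-])" + re.escape(keyword.lower()) + r"(?![\w-])"
        if re.search(pattern, text):
            found.append(keyword)
        else:
            missing.append(keyword)
    return found, missing


if __name__ == "__main__":
    posting_keywords = [
        "T-SQL",
        "SQL",
        "database administration",
        "query tuning",
        "database security",
    ]
    resume = "Wrote T-SQL for query tuning; handled database security audits."
    found, missing = keyword_coverage(resume, posting_keywords)
    print("Found:", found)
    print("Missing:", missing)
```

Anything in the missing list is a candidate for the “Key skills” section, provided you can honestly back it up in your job experience.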
The big news of this week was around SQL Server 2022 now being generally available. We caught up with some of the things Tracy is excited about (mostly on the administrative side) and Mala pointed out Bob Ward’s SQL Server 2022 workshop as a getting started guide.
Conference Wrap-Up
We got three different perspectives on two conferences in this section. Mala attended PASS Summit virtually, whereas Tracy was there in person. Both of them enjoyed the conference and it sounds like it was a lot of fun. TriPASS alumnus Tom Norman was also there so hopefully you had a chance to see him.
Meanwhile, I was in Orlando for Live! 360 so I shared some thoughts about that.
TriPASS Elections Now Open
We have three seats on the TriPASS board up for election this November: President (me), VP of Marketing (Rick Pack), and Treasurer (Mala). There’s only one candidate for each seat, so unless a rash of no confidence hits TriPASS members, we’ll have our board set for another year.
Voting is open and runs through the first Thursday in December. Terms will begin in February of 2023 and run for 2 years. If you are a TriPASS member, you will have received an e-mail providing instructions on how to vote. If you aren’t a TriPASS member, sign up at Meetup. I mean, it’s free, so what do you have to lose?
TriPASS Survey Results
Over the past month or so, we have been canvassing TriPASS members to fill out our semi-annual survey, which drives the direction of the organization. You’ve spoken and now you’ll have to deal with the fallout, getting what you deserve.
A quick summary of the survey results is as follows:
A large majority of people want a return to hybrid user group meetings, though nobody had a place they could offer up. We’ll start canvassing places to see who’s willing to host us and aim to return to hybrid meetings in spring of 2023.
Almost everybody who responded wants a SQL Saturday Raleigh 2023. The board still needs to meet to determine the feasibility of it but we’ll try to make it happen. Expect a call for volunteers assuming the basic groundwork is there.
TriPASS Call for Speakers
With the imminent(?) return to hybrid events, we’ll also want some in-person speakers. Our TriPASS call for speakers is officially open, so if you’re interested in presenting at a future TriPASS meeting, submit a session or three. We will give preference to people who can meet with us in person, though there will be slots available for remote speakers as well.
Thank You
As we round in on Thanksgiving, I wanted to give out a few thanks.
Thank you to Mala, Mike, and Tracy, who keep Shop Talk from being an hour of me talking about me. They limit it to a much more appropriate 55 minutes of me talking about me.
Thank you to the regulars who drop in and ask questions, try to derail my train of thought (usually with great success), and make the TriPASS water cooler a better place to be.
Thank you to everyone who has asked a question. As a quick reminder, if you do have any questions you would like us to answer on the air, you can always e-mail us. The address is shoptalk and the domain is tripass.org. Throw a little at symbol in there and you’re good to go.
We have three seats on the TriPASS board up for election this November: President (me), VP of Marketing (Rick Pack), and Treasurer (Mala). Nominations are open, so if you are a TriPASS member on the Meetup, reach out to me if you are interested in running for one of the board seats. Rick, Mala, and I are running but if you want to throw your hat into the ring, you are welcome to do so.
Voting will start on the 3rd Thursday in November and run through the first Thursday in December. Terms will begin in February of 2023 and run for 2 years.
TriPASS Survey
If you are an active TriPASS member, this is your last chance to fill out our semi-annual survey. It will be open for a few more weeks and this helps shape the next two years for the organization.
Data Platform Conferences
The main theme of tonight’s show was around data platform conferences, as both PASS Summit and Live! 360 are coming up next week. We covered a few topics:
How specific should data platform conferences be?
How difficult it can be for high-end people to learn much at conferences. Alternatively, why are there typically so few 300+ level sessions?
Some of the background on session selection and trying to balance competing needs.
How many times can Kevin say the word “ecumenical” on a broadcast?