- Kevin Feasel
- Tracy Boggiano
- Mala Mahadevan
- Tom Norman
Notes: Questions and Topics
GUIs for Git
Mark Gordon wanted to know what people use for working with Git.
Answer: Here’s what everybody uses:
- Tom — Visual Studio’s Git integration most of the time and Sourcetree for tricky parts.
- Mala — TortoiseGit
- Tracy — Visual Studio Code’s Git integration
- Kevin — Mostly Sourcetree, sometimes TortoiseGit, and sometimes just using the Git bash shell (for things like
git stash pop)
Presentations and Sessions
@johnfan14 asks, “How many presentations you do in a month on average? How many sessions you normally submit to a conference? What is the optimal numbers do you think?”
Answer: This really depends on the person and we spent some time covering the various criteria. The short answer is that it really depends on your budget and available time, as well as location. Being based in the southeastern United States and in a city with a reasonably good airport, all four of us have an easy time traveling to a variety of events.
I didn’t really say out loud my answer to the specific questions so I’ll do so here. Prior to this year, my presentation counts were 63 in 2019, 70 in 2018, 50 in 2017, 53 in 2016, and 19 in 2015. Some events had me give multiple presentations, so I wasn’t at 63 events in 2019, for example, but the answer is “quite a few.” But I’m an extreme outlier, so I didn’t want to skew things too much.
As far as session submissions, this depends on the conference rules, but when there is no limit, I submit 4. That way there’s a variety of topics (which makes me more likely to be selected) but not so much noise that an organizer needs to wade through a dozen submissions. If you have one good talk or two good talks, just submit those—you don’t need 4.
Database Documentation Tools
From @rporrata: “DB Documentation what tools have you used especially for redesigning systems outside of the usual suspects erwin/studio.”
Answer: This ended up blending together two sets of tools: database documentation tools and database architecture tools. I’m not a big fan of database architecture tools in general, so without further ado, links to resources we came up with and chat helped us out with.
- Visio allows you to reverse engineer database diagrams, though watch out: there were some editions which didn’t have this functionality.
- SSMS database diagrams are a low-end way of showing relationships. They barely function but work okay with small numbers of tables and columns.
- Lucidchart has a database design tool
- @mikec_07 recommended Aqua Data Studio
- Solomon Rutzky recommended Diagrams.Net, which is an old favorite of mine—it used to be called Draw.io.
- I’ve used SqlDBM a few times. It’s fine for building static models, though there are niceties I wish it had (like being able to prevent table overlap).
- Red Gate’s SQL Doc, which reads extended properties.
- ApexSQL Database documentation tool
- Neil Hambly brought up DOC xPress by SentryOne (formerly a Pragmatic Works product). Mala mentioned that this product works well for documenting SSIS packages.
Free Databases for Teaching SQL
Mark Hutchinson e-mailed before the show began and had a question for us: “I’m going to teach a couple of friends SQL. Just the DML subset of the language to start (Select, Insert, Update, Delete). Currently, we’re going to use MS Access, since the GUI can help do some of the heavy lifting at the early part of the course. For later, maybe a second course, I was thinking about introducing the students to a large database, such as SQL Server. What is the best free (or damned cheap) database? One of the students is out of work and I’m retired, so money is an issue.”
Answer: We’re SQL Server folks, so we’re going to be biased. But each one of us came up with SQL Server Developer Edition. It is 100% free for non-production use and easy to install. We debated the merits of other editions as well, so here’s the quick +/- on each:
- Developer Edition — Installation is really easy (basically next-next-next if you’re installing for a development box and don’t need custom configuration like you would for production) and you get the full SQL Server surface area.
- Express Edition — Installation is also easy and you can use Express Edition in production environments for free. It’s limited to 10GB per database, no more than 4 cores on 1 socket, and a limited amount of RAM, but if you wanted a zero-cost SQL Server in production, that’s an option.
- LocalDB — SQL Server Express LocalDB integrates really well with Visual Studio and lets you work with database projects. It’s a single-user variant of SQL Server with practically no configuration necessary. Some of the surface area is different in LocalDB, but that’s stuff you wouldn’t notice while learning SQL.
- Containerized Developer Edition — You can certainly get SQL Server as a Docker container. If you’re already familiar with Docker and have it set up on the training machines, then this is a good technique as it’s really easy to spin up a new container and blow it away if you mess it up too badly. But if you don’t have things configured already, it can turn into a mess where you spend so much time trying to set up your environment that you lose out on what you wanted to do.
- Azure SQL Database — Avoid installation altogether with Azure SQL Database. There is no free tier option anymore, but the Basic tier DTU model is $5 per month and there’s no installation or configuration at all.
My question is when we say “all expressions that appear in the same logical query processing phase are evaluated conceptually at the same point in time,” does that mean the query looks for all instances of FROM then all instances of WHERE, etc.?
How would this be processed?
SELECT c.CUSTOMERID FROM CUSTOMER c WHERE c.CUSTOMERID NOT IN (SELECT ag.CUSTOMERID FROM ANGRYCUSTOMERS ag )
Answer: Mike is talking about clause ordering, where the
FROM clause parses before the
WHERE clause, etc. And within the
WHERE clause, all predicates are handled at the same time, meaning that if you have a
WHERE clause with
X=1 AND Y=2 AND Z=3, the database engine can take them in any order (regardless of the order in which you wrote them). From your perspective, all of those filters happen concurrently, so you can write them in any order and the database engine will (hopefully) pick the smartest path. This is why short-circuiting may not work in SQL the way you’d expect it to in an imperative language: because the optimizer doesn’t care about the order in which you write things and can shake up the order if it looks like a better path.
In this particular instance,
FROM CUSTOMER c will process first, and then we will get to the
WHERE clause. Inside there, we have a
NOT IN operator which operates on a sub-query, so we move to
FROM ANGRYCUSTOMERS ag and then
SELECT ag.CUSTOMERID. After that completes, we finally get to
Events of Note
We wrapped up with one event of note because I forgot about the other one:
- Marcin Gminski has put together the Data Community Event Tracker, an indispensable tool for finding online data platform events.
- Data Architecture Day is Saturday, May 16th. The schedule is available and we have a jam-packed schedule.
- EightKB, a conference for internals, is open for registration.
- There are several virtual SQL Saturdays up and running as well.