Shop Talk: 2020-06-22 – Shop Talk with TriPASS

The Recording

The Panelists

Kevin Feasel
Tracy Boggiano
Mala Mahadevan
Tom Norman

Notes: Questions and Topics

Apple on ARM

@thedukeny starts us off by talking about rumors that Apple is moving away from Intel chips and toward ARM processors for its new line of MacBooks, and asks for our thoughts. @rporrata follows up with questions about SQL Server on ARM.

Answer: Answering the second question first, Azure SQL Edge runs on 64-bit ARM processors and x64 processors. I’ve used it in private preview and liked what was available. You can also get into the public preview right now.

On the point about Apple, I can see it happening for their lower-end devices, as we’ve seen a cottage industry of ARM-based Chromebooks out there, so we know it’s viable. You wouldn’t use them for heavy gaming, video editing, and the like, but they’d definitely offer battery life improvements over the current generation if you’re a casual user of the product.

Partitioned Tables

Tom dropped the next question on us: he is looking to use table partitioning to improve performance where he only needs recent data. Will this technique help him out?

Answer: Mala chimed in with the best answer: maybe, but probably not. For a detailed answer, check out Kendra Little’s video. For a deep dive on partitioning itself, Andrew Pruski has a great presentation on the topic.

GOTO and the Modern Era

Mala made me defend goto. Her question: when and where would you expect to use goto statements in code? What was Dijkstra’s argument against them based upon?

Answer: I enjoyed the dive into this topic, as it really tells a story of the history of computer science. If you want, you can easily read Dijkstra’s letter to the editor (whose title Dijkstra really didn’t like; a classic case of the editor getting to choose the title). But temper this with a quotation from Dijkstra a few years later:

Please don’t fall into the trap of believing that I am terribly dogmatical about [the go to statement]. I have the uncomfortable feeling that others are making a religion out of it, as if the conceptual problems of programming could be solved by a single trick, by a simple form of coding discipline!

Donald Knuth had a response a few years after the paper’s release which defended goto in specific circumstances.

Summarizing a couple of themes that I spent a lot of time on in the episode, it’s hard for those of us who never really worked with languages that predate structured programming to understand the debate. At the time Dijkstra was writing, common languages didn’t always have structural components like break, continue, return, switch, case, do, while, and sometimes not even else! In lieu of those structural keywords, programmers needed to use the tools available, and the biggest one was goto. Today, we hit the brakes when we see a single goto statement. But Dijkstra wasn’t really concerned about that; he was concerned about it being the blunt instrument programmers used even if there were better options available.
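To make that concrete, here is a minimal sketch in C of what a loop looks like when all you have is if and goto, next to the same logic written with while. The function names are hypothetical and only for illustration; this is not meant to reproduce code from any particular language of that era.

```c
#include <stddef.h>

/* Sum an array using only `if` and `goto`, roughly how a loop had to be
   spelled in a language without `while` or `for`. */
int sum_with_goto(const int *values, size_t count)
{
    int total = 0;
    size_t i = 0;

loop_top:
    if (i >= count)
        goto loop_done;   /* hand-rolled loop exit test */
    total += values[i];
    i++;
    goto loop_top;        /* hand-rolled jump back to the start */

loop_done:
    return total;
}

/* The same logic with a structured loop: the control flow is visible
   from the shape of the code rather than from following labels. */
int sum_with_while(const int *values, size_t count)
{
    int total = 0;
    size_t i = 0;

    while (i < count) {
        total += values[i];
        i++;
    }
    return total;
}
```

Both functions return the same result for the same input. The difference is in how much label-chasing the reader has to do, and that gets worse quickly as jumps multiply.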

Nowadays, it seems like the generally accepted exceptions to “don’t use goto” are:

switch cases in languages like C#, where you can go to a different case in the switch.
Breaking out of deeply nested loops, though in that case the question might be, why are you nested so deeply?
Ensuring that all code paths reach a certain destination for cleanup steps in languages without finally. For example, this might involve freeing memory, closing connections, and releasing handles.
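
As a rough illustration of the third case, here is a sketch in C, which has no finally. Every failure path jumps to a single cleanup label so that the file handle and the buffer are released exactly once. The function name count_leading_bytes and its behavior are hypothetical, made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads up to `max_bytes` from the start of `path`
   and returns how many bytes were read, or -1 on any failure. */
long count_leading_bytes(const char *path, size_t max_bytes)
{
    long result = -1;      /* assume failure until the work succeeds */
    FILE *file = NULL;
    char *buffer = NULL;
    size_t bytes_read = 0;

    file = fopen(path, "rb");
    if (file == NULL)
        goto cleanup;      /* nothing acquired yet, cleanup handles that */

    buffer = malloc(max_bytes);
    if (buffer == NULL)
        goto cleanup;      /* file is open, so cleanup must close it */

    bytes_read = fread(buffer, 1, max_bytes, file);
    if (ferror(file))
        goto cleanup;

    result = (long)bytes_read;

cleanup:
    /* Single exit point: runs on success and on every failure path. */
    free(buffer);          /* free(NULL) is a no-op */
    if (file != NULL)
        fclose(file);
    return result;
}
```

The alternative without goto is to repeat the fclose and free calls at each early return, which is where leaks tend to creep in when someone adds a new failure path later.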

The reason I’m not automatically critical of GOTO in SQL Server is that there is no FINALLY block in TRY/CATCH. But then again, it’s really uncommon that you’d need that construct.

Mala’s Book Corner

Mala recommended two books for us this week:

I can second both of these recommendations, having the paper copy of Joe’s book and a PDF of Kalen’s. The link to Kalen’s book lets you download it as a PDF for free.

Second Thoughts on Azure ML

The final thing I’m covering here is some second thoughts on Azure Machine Learning. The brief version of it is as follows.

When it came out, Azure ML felt like SSIS for machine learning. You dragged and dropped items, clicked the mouse a whole bunch of times, and ended up with a pretty-looking data flow to build a model. Data scientists tended to complain that they could do most of the work in Azure ML in about 6 lines of R or Python code, and that the visual interface made model comparison really clunky.

Later on, demos were still drag-and-drop, but I remember that instead of dragging and dropping various data cleanup components, they’d drop in an R/Python code block and put in those 6 lines of code. So that was better, but the visual interface was still too constraining and compute was rather expensive once you did the math.

I skipped Azure ML for a while (including the Studio phase) but came back to it as the result of a work project, and I have to say that it looks a lot better. The integration with Azure Container Instances and Azure Kubernetes Service is pretty nice, model registration competes with MLflow (in that it’s easier to maintain, though not as feature-rich), and I approve of AutoML for at least getting you 80% of the way there (though @thedukeny points out that in many cases, AutoML can do at least as well as a data science team). And pricing isn’t too bad: we have a moderately used web service (called approximately 16K times per day and pushing roughly 4 million rows) and would be out approximately $100 per month for 24/7 utilization of ACI. That’s a fair bit less than we’re paying now.

The bottom line is, if you ignored Azure ML over the past couple of years like I did, I recommend giving it another try to see if it might fit some of your needs.