Shop Talk: 2020-10-12

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan
  • Tom Norman

Notes: Questions and Topics

Announcements

We had two announcements. First, TriPASS board elections are coming up in mid-November. We have three positions up for election this year: President, VP of Marketing, and Treasurer. All TriPASS members in good standing are eligible to run, and the election will run from November 19th through December 3rd, for terms running from February 23, 2021 through February 26, 2023. If you’re interested, e-mail us (shoptalk at tripass dot org will work) and I can provide more details as needed.

Second, Mala brought up the Azure SQL Championship, so check that out.

Code Review Tools

Our first topic for the night was code review tools, focusing mostly on the database world. Mala led us through some of the research she has done to find an alternative to Crucible and Fisheye, and keyed us in on several tools, with an emphasis on tsqllint.

Xenographics

Over at Curated SQL, I posted about Xenographics, a website dedicated to…uncommon visuals. I enjoyed walking through several of them. This site includes some visuals I like as well as some which I can’t understand even after reviewing them.

Watch the video for the visuals we look at, but I wanted to take a moment and hit six characteristics I think make for a good visual. These characteristics are neither individually necessary nor jointly sufficient—a good visual need not exhibit all of them at once, and I won’t claim that this is the authoritative list of rules for quality visuals. That said, here they are:

  • Intuitive — A visual should be easy for a person to understand without much context. In some cases, you have the opportunity to provide additional context, be it in person or in a magazine; that lets you increase the complexity a bit. But some visuals are genuinely hard to parse, and if you don’t have the luxury of providing additional context, your viewer’s job becomes much harder.
  • Compact — Given two visuals, the one which can put more information into a given space without losing fidelity or intuitiveness is preferable. This lets you save more screen real estate for additional visuals and text. There are certainly limits to this philosophy, so consider it a precept with diminishing marginal returns.
  • Concise — Remove details other than what helps tell the story. This fits in with compactness: if you have unnecessary visual elements, removing them lets you reclaim that space without losing any fidelity. Also, remove unnecessary coloration, changes in line thickness, and other things which don’t contribute to understanding the story. Please note that this doesn’t mean removing all color—just coloration which doesn’t make it easier for a person to understand what’s happening.
  • Consistent — By consistency, what I mean is that the meaning of elements on the visual does not change within a run or between runs. Granted, this is more relevant to dashboards than individual visuals, but think about a Reporting Services report which uses default colors for lines on a chart. If you refresh the page and the colors for different indicators change, it’s hard for a person to build that mental link to understand what’s happening.
  • Glanceable — Concise and consistent visuals tend to be more glanceable than their alternatives. Glanceable means that you are able to pick out key information without needing to stare at the visual. Ideally, a quick glance at a visual tells you enough of what you need to know, especially if you have seen the same visual in prior states.
  • Informative — This last consideration is critical but often goes overlooked. The data needs to be useful and pertinent to users, describing the situation at the appropriate grain: it includes all of the necessary detail for understanding while eschewing unnecessary detail.

Shop Talk: 2020-10-05

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan

Notes: Questions and Topics

Announcements

We had two announcements. First, TriPASS board elections are coming up in mid-November. We have three positions up for election this year: President, VP of Marketing, and Treasurer. All TriPASS members in good standing are eligible to run, and the election will run from November 19th through December 3rd, for terms running from February 23, 2021 through February 26, 2023. If you’re interested, e-mail us (shoptalk at tripass dot org will work) and I can provide more details as needed.

Second, Mala brought up the Azure SQL Championship, so check that out.

So You Want to Learn about Database Development

Leslie sent in an e-mail (shoptalk at tripass dot org) asking for guidance or a syllabus for learning to become a database developer, so Mala and I tackled that. We broke it down into five categories and focused mostly on books. One book-buying tip I have: if you’re looking at older books, check out a website like ISBN.nu, as it aggregates across several book sellers to find the best price. I’ve used that site for about 15 years now. Most of the links I provide are to Amazon, but Amazon almost never has the cheapest price.

First up is T-SQL, as I think you should learn the language before trying to work through the rest of it. Carlos Chacon has a great book called From Zero to SQL in 20 Lessons, which was written to help someone with no experience get started. From there, I’d recommend Itzik Ben-Gan’s T-SQL Fundamentals or Kathi Kellenberger’s Beginning T-SQL. Mala brought up John Deardurff’s series on learning T-SQL as well as the SQL Quickstart Guide. Finally, once you’ve spent a couple of years building up those SQL skills, grab a copy of Grant Fritchey’s SQL Server 2017 Query Performance Tuning book, which is a massive tome full of great info. Or you can get the much smaller version from Red Gate.

After taking on T-SQL, learn about data modeling. The single best reference is the Handbook of Relational Database Design, released in 1989. The first half of the book is gold; the second half is only useful if you’re trying to implement this stuff on a late 1980s Oracle or Informix database… For something a bit more recent, check out Louis Davidson’s Pro SQL Server Relational Database Design. The 6th edition is coming out soon, but if you’re impatient, Louis and Jessica Moss teamed up on the 5th edition. I have an older edition, and I think Louis’s explanation of 4th normal form is the best I’ve ever read. Mala has a couple of recommendations as well: Information Modeling on Relational Databases and Database Design for Mere Mortals.

If you want to move from OLTP-style data modeling and into warehousing, Ralph Kimball’s Data Warehouse Toolkit is the book to get. The Kimball model is everywhere and this is the type of book you can come back to multiple times and learn more and more each time. If you’re interested in the Data Vault approach, check out Building a Scalable Data Warehouse with Data Vault 2.0. I haven’t read it, but people who are into Data Vault have recommended the book.

Of course, once you’ve learned a bit about warehousing, you might want to read up on ways to work with warehouses. If you’re looking at report-writing, Kathi Kellenberger’s Beginning SQL Server Reporting Services is a good start. From there, Paul Turley et al.’s Pro Microsoft SQL Server 2016 Reporting Services is the next step, but I’d probably only go that far if I were writing SSRS reports as a main part of the job.

Meanwhile, if you want to learn about Power BI, Marco Russo and Alberto Ferrari have you covered. They’ve released a number of books and also have plenty of videos available for free. I’d also recommend Phillip Seamark’s Beginning DAX with Power BI, though I don’t think it’s really a beginner book by any stretch of the imagination. Finally, Mala reminded me that I couldn’t finish a list of Power BI resources without mentioning Guy in a Cube.

Career Next Steps

During our coverage, @jpanagi1 asked us about career guidance:

my background – dont have compsci degree – engineering background – learned fundamentals of sql – recommended next steps? do i need more coege or certs? what are employeers looking for?

We spent some time talking through this specific case. To summarize, @jpanagi is currently working at a company which has teams working on interesting projects, and the question is how to get into one of those positions.

The simple form of our advice is, work with the team a bit on projects. If you have official projects, that’s easiest; if you can volunteer to help them out a bit, that can work too. If you have a reasonable manager, bring up that you’d like to do that kind of work in the medium term. A reasonable manager might not want to lose you, but there are several options here.

One is “in-house” work. Suppose that team is a report-writing and dashboarding team and they’re using some nice products in interesting ways. You might be able to build a dashboard specifically for your team with the same products, gaining experience while providing direct benefits to your team and your manager.

Another option is working part-time on projects with members of the report-writing and dashboarding team. It might not be directly for your team, but an 80-20 or 90-10 time split could get you some project experience and let you try out something new without your manager losing a person.

If your manager isn’t reasonable, things get harder. In that case, you might need to volunteer on the sly and work a few extra hours with that team after you get your regular work done. I’ve known people who were trained up “after hours,” where they’d stay late at work and learn from someone in another department. Smart managers at good companies formalize this kind of training, but don’t let that limit you otherwise.

Also, a bit of freelancing doesn’t hurt. Again, this is off-hours, but if you want to try out a different path, see if there’s something tangentially related to your team. Learn about the products and try them out. Then, if you come up with something interesting, you can bring it up with your manager. If you ultimately get locked out from doing this thing, well, at least you have a project for your resume, and you can decide whether the position you’re at is the right one.

Shop Talk: 2020-09-28

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan

Notes: Questions and Topics

Azure Active Directory Outage

Shortly before the episode began, Azure had a major outage in its Active Directory service. We used this as a springboard for discussion around what happens when you rely on a cloud service for your company’s infrastructure, but that cloud service goes down. The easiest answer is multi-cloud and cloud/on-premises hybrid scenarios, but those can get expensive and also cut you off from some of the benefits of Platform-as-a-Service offerings.

SOS_SCHEDULER_YIELD

Our next segment centered on a problematic server I was asked to look at. The two biggest waits on the server were CXPACKET and SOS_SCHEDULER_YIELD. The knee-jerk reaction when you see SOS_SCHEDULER_YIELD is to say that it’s a CPU problem, but Paul Randal explains why that is not necessarily the case.

From there, you want to see if the signal wait time is high—the rule of thumb I’ve seen is that if signal waits are more than about 20% of total wait time for SOS_SCHEDULER_YIELD, then there isn’t enough CPU to go around. In my scenario, signal waits were approximately 99.999% of total waits. So the knee-jerk reaction is to blame CPU, the more thoughtful reaction is to investigate further, and the result of the investigation was to blame CPU.
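If you want to run this check yourself, here’s a minimal sketch against sys.dm_os_wait_stats. Keep in mind that these numbers are cumulative since the last instance restart (or wait stats clear):

    -- Signal wait percentage for SOS_SCHEDULER_YIELD: high values suggest CPU pressure
    SELECT
        wait_type,
        wait_time_ms,
        signal_wait_time_ms,
        100.0 * signal_wait_time_ms / NULLIF(wait_time_ms, 0) AS signal_wait_pct
    FROM sys.dm_os_wait_stats
    WHERE wait_type = N'SOS_SCHEDULER_YIELD';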

If that’s the case, here are a few of the options potentially available:

  1. Tune some queries. Look for high-CPU queries, especially ones which run frequently. There are a few of those in my scenario, so every second of CPU you shave off per query call is a second which can go to another query and lessen the waits.
  2. Drop MAXDOP if it is too high and review your cost threshold for parallelism. I’m not talking about the knee-jerk reaction of “Parallelism is high, so drop MAXDOP to 1,” but if you have 32 cores and your MAXDOP is 32, that probably doesn’t make much sense for an OLTP system. As a warning, though, changing MAXDOP without reviewing poorly-performing queries first can lead to bigger problems: those queries are already struggling even with a high degree of parallelism available, so reducing it without tuning them can make them even slower. See the sketch after this list for checking and changing these settings.
  3. Increase CPU cores if you can. If your queries are looking good, it may just be that you have too much load on your server and adding more hardware can be the solution.
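Here is the sketch I promised in option 2: reviewing and changing the instance-level settings with sp_configure. The specific values below are illustrative assumptions, not recommendations; the right numbers depend on your core count and workload.

    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;

    -- Review the current values
    EXEC sp_configure 'max degree of parallelism';
    EXEC sp_configure 'cost threshold for parallelism';

    -- Example values only; test against your workload before changing production
    EXEC sp_configure 'max degree of parallelism', 8;
    EXEC sp_configure 'cost threshold for parallelism', 50;
    RECONFIGURE;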

SSIS and ADO.NET: A Rant from Someone Not Me

Mala spent far too long working through an issue with multi-subnet failover on Availability Groups and SSIS. We use OLEDB drivers very heavily when working with SSIS, as we appreciate work finishing at a reasonable pace. But the OLEDB drivers installed with SQL Server 2017 don’t include support for multi-subnet failover—the ADO.NET drivers, however, do. So Mala spent a lot of time trying to switch over to ADO.NET and cataloged some of the issues she ran into.

Furthermore, it turns out that the latest OLEDB driver (installed with SQL Server 2019 and available separately if you don’t have that version of SQL Server) does have support for multi-subnet failover, so we just needed to update drivers.
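Once you have a driver which supports it, enabling the feature is a connection string change. Here’s a rough sketch of an OLEDB connection string; the listener and database names are placeholders:

    Provider=MSOLEDBSQL;Data Source=tcp:MyAgListener,1433;Initial Catalog=MyDatabase;Integrated Security=SSPI;MultiSubnetFailover=Yes;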

Shop Talk: 2020-09-21

The Recording

The Panelists

  • Kevin Feasel
  • Tom Norman

Notes: Questions and Topics

Microsoft Ignite

Microsoft Ignite is this week. Tom and I spent some time talking about the sessions and themes in the data space. Tom mentioned some sessions he’ll attend. I’m going to check out the Azure SQL Edge session.

Kerberos with Tom

Tom is working through some Kerberos double-hop issues, so we talked a bit about SPNs. Tom’s question was whether we needed to register the SPN for just the SQL service account, or also for SSRS, SSIS, etc. My recollection was that you only needed to register for the SQL Server account. It looks like that’s correct.
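For reference, here’s roughly what registering and verifying an SPN for the database engine service account looks like with setspn; the host name, port, and account are placeholders:

    REM Register the SPN (-S checks for duplicates before adding)
    setspn -S MSSQLSvc/sqlhost.contoso.com:1433 CONTOSO\sqlsvc

    REM List the SPNs currently registered to the service account
    setspn -L CONTOSO\sqlsvc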

Innovate and Modernize Apps with Data and AI

I wanted to call out a Microsoft Cloud Workshop that I helped build: Innovate and Modernize Apps with Data and AI. It hits on a lot of Azure technologies, so if you’re interested in how some of these technologies can fit together in an event-driven architecture, check it out.

Conference Thoughts

We wrapped up tonight’s show with a quick discussion of virtual conferences.

Shop Talk: 2020-09-14

The Recording

The Panelists

  • Kevin Feasel
  • Tom Norman

Notes: Questions and Topics

Azure Labs

Tom has spent a lot of time recently with Azure Lab Services, so our first segment is an interview where we cover some of the details of Azure Labs, how it worked for Tom, how much it ended up costing him, and some of the things you can do with the product.

Tom is doing another training for O’Reilly on December 4th. Here is a link to his prior training, to give you an idea of the offering.

Refactoring Databases

Raymond brought us into the main topic of the evening: practices and planning for refactoring databases.

We go into it in some detail, but I wanted to share my notes as there were a couple of topics I did not have a chance to cover. What follows is a disjointed set of concepts which hopefully resemble advice in the aggregate.

First, refactoring is not performance tuning. Refactoring is a behavior-preserving code change intended to make maintenance easier; by its nature, it is not intended to make things faster or change behavior. I gave a few classic examples of refactoring in application code, as well as a couple of examples of database refactoring. In code, think about breaking a segment of code out into its own method/function, or separating two things into different classes. This doesn’t necessarily improve application performance, but it does make the code easier for developers to understand and maintain. On the database side, normalizing a set of tables can be an example of refactoring—you’re not necessarily changing the performance of the application, but instead making it easier to manage the query calls which insert, update, and retrieve information. Another example of database refactoring would be changing TEXT data types to VARCHAR(MAX).
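That last refactoring looks something like the following; the table and column names are hypothetical:

    -- Replace the deprecated TEXT data type with VARCHAR(MAX).
    -- The data and behavior stay the same; the column just becomes easier to work with.
    ALTER TABLE dbo.Notes ALTER COLUMN NoteBody VARCHAR(MAX) NULL;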

One key consideration is, do you have the downtime to perform refactoring? In other words, do you need the application running while your changes happen? If so, check out my talk on approaching zero-downtime code deployments.

I think the most important mindset to take when refactoring is to ask why you need to refactor. Make sure you have a good reason and it’s not just “It’s not my code, so it’s bad.” Understand that code has a story and the developer who wrote it knows more about the story than you do—even if you wrote the code, the you of two years ago was closer to the problem then than you are today. Chesterton’s Fence is a good mindset to have: before you change code, understand what it is doing and why it is there. If you can’t explain why the code looks the way that it does, be very hesitant about changing it.

Keep your database changes in line with your application deployments. Working with app devs reduces the risk that you’ll drop procedures still in use or make changes which break the code. Also, try to use stored procedures as interfaces whenever possible. They allow you to make changes much more cleanly than writing queries directly in the application or via an ORM.

Another thing to consider is whether to include a rollback process, or if forward is the only direction. This will depend on your (and your company’s) risk appetite. Only moving forward is a lot easier to develop against, but it does require foresight and is much higher risk. Rollback scripts force you to think about how to back out of a problem, but because they are (hopefully) almost never used, it’s a lot of extra development time. This will depend on the company and scenario—if you’re working on a real-time financial system, then you don’t have much of a choice. But if you’re working on a personal website or an in-house product with a small number of users, it may make more sense just to keep moving forward.

Whenever possible, have tests. A thorough set of database integration tests is great. But if all you have is a hand-created workbench of queries to run, that’s still better than nothing and can be sufficient. The goal is to have enough information to determine if there are issues with the rollforward phase as quickly as possible, ideally before users experience them.
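Even a hand-rolled check helps. Here’s a minimal sketch, assuming a hypothetical table: capture row counts and an aggregate checksum before the change, then confirm they match afterward:

    -- Run before and after the refactoring; the results should match.
    SELECT
        COUNT(*) AS TotalRows,
        CHECKSUM_AGG(CHECKSUM(*)) AS TableChecksum
    FROM dbo.CustomerOrders;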

Write your scripts to be re-runnable and put them through change management. Store the scripts in source control. There are a couple philosophies on what to store in source control: either the current state of the system (creation scripts for tables, stored procedures, etc.) or the change process (the set of database modification scripts run over time). I like a compromise approach of having the current state plus some history of changes.
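Re-runnable typically means guarding each change so that executing the script twice is harmless. A minimal sketch, with hypothetical table and column names:

    -- Safe to run any number of times: the column is added only if it is missing.
    IF NOT EXISTS
    (
        SELECT 1
        FROM sys.columns
        WHERE object_id = OBJECT_ID(N'dbo.Customer')
            AND name = N'PreferredName'
    )
    BEGIN
        ALTER TABLE dbo.Customer ADD PreferredName NVARCHAR(100) NULL;
    END;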

Finally, early refactoring can be painful, especially because you’re usually trying to untangle years of growth. Choose some easy targets to build momentum. Those tend to be tangential projects or small parts of a major database. Start building some of the muscle memory around rollback scripts (if needed), change control, and testing; that will make subsequent work that much easier.

Shop Talk: 2020-08-31

The Recording

The Panelists

  • Kevin Feasel
  • Tom Norman
  • Tracy Boggiano
  • Mala Mahadevan

Notes: Questions and Topics

The DBA as Gatekeeper

Our first topic was all about the DBA as a gatekeeper. Kenneth Fisher’s blog post inspired the discussion. We got into a fairly detailed discussion on what gatekeeping means, where it makes sense, and where it doesn’t. Mike in chat summed it up with an excellent analogy: gatekeepers guard gates, not walls. In other words, “No” can be a viable answer, but can’t be the only answer.

The Case for Heap Tables

From there, Mala gave us the second topic of the night: is there a place for heap tables? tg came in quickly with “Yes, in a museum.”

The general consensus is that yes, there are cases for heap tables, but they are rare: rare enough that we’re talking 1% or fewer of user tables across an environment. But there are some good cases:

  • Tiny tables which fit on one page, such as enumeration tables. There’s no benefit to a clustered index when you’ve only got one page.
  • “Write-only” tables, such as log tables, can potentially be faster as heaps than with clustered indexes. That’s not guaranteed, and the advice can vary based on the version of SQL Server we’re talking about (later versions make it less likely that you want a heap), but it’s possible. See the sketch after this list.
  • Temporary tables. It may sound like a bit of cheating for me to include these, but “temporary” can include “temporary permanent” tables: non-temp tables which you don’t expect to be around for very long and aren’t used in your application.
  • If I need to perform a full scan of the table every time I query it—if I actually need every record and don’t look for ranges or individual rows—then it might make sense to leave the table as a heap.
  • In Azure Synapse Analytics SQL pools, we have three options: clustered columnstore index, clustered index, and heap. Clustered columnstore indexes are recommended for fact-like tables (lots of numeric values, no long strings) with at least 60 million rows. Clustered indexes are recommended for cases when you perform a single-row lookup. And heaps are recommended in other cases: small data sets supporting scans rather than point lookups.
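To make the “write-only” case concrete, here’s a minimal sketch with a hypothetical log table. Because there is no primary key or clustered index, the table is a heap:

    CREATE TABLE dbo.AppLog
    (
        LogTime DATETIME2(3) NOT NULL
            CONSTRAINT DF_AppLog_LogTime DEFAULT SYSUTCDATETIME(),
        Severity TINYINT NOT NULL,
        Message NVARCHAR(MAX) NOT NULL
    );
    -- No PRIMARY KEY or clustered index defined, so inserts simply append to the heap.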

Mala’s Book Corner

Mala has two book recommendations for us:

Shop Talk: 2020-08-24

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan
  • Tracy Boggiano

Notes: Questions and Topics

Working with ORMs

Our primary topic for the evening was working with ORMs. We’re data platform specialists, so the feeling is generally fairly negative toward ORMs, but I tried to give the “pro” side a reasonable airing, as there certainly are valuable uses for them.

One of the key things about ORMs to consider is that there are two major varieties: lightweight ORMs (also known as micro-ORMs) and their heavyweight brethren. Two examples of micro-ORMs that I personally like are Dapper and FSharp.Data.SqlClient. These are small wrappers around ADO.NET which create simple objects from T-SQL statements and stored procedures. They save the development time of mapping result set outputs to .NET classes / record types without adding a lot of overhead. For that reason, I’m a big fan of using them.

On the other side, we have heavyweight ORMs like Entity Framework and NHibernate. These do a lot more and aim to create a single development experience in C#, so that developers don’t have to think in two languages. They also work well with fluent APIs like LINQ, translating those statements into SQL queries.

As far as performance goes, micro-ORMs are faster in most cases. Products like EF and NHibernate can generate some really nasty SQL and cause performance problems on complicated queries. But if you stick to fairly simple queries—especially simple insert, update, and delete operations—heavyweight ORMs can save you a good bit of time.

Tips for Creating a Presentation

The secondary topic for this evening was tips for creating a presentation. Mala, Tracy, and I have all put together presentations and so have several of the great people in chat. We talked about some ideas on how to get into presenting and shared a few stories of things that go wrong, with the expectation that hey, stuff happens but the presentation still works out in the end.

Mala’s Book Corner

Mala has two book recommendations for us:

Shop Talk: 2020-08-17

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan
  • Tracy Boggiano

Notes: Questions and Topics

Thoughts on PASS Pro Membership

Mala, Tracy, and I spent a good bit of time talking about PASS’s professional tier membership. It’s pretty common for trade organizations to require member dues—I brought up as examples the ACM and IEEE, and Tracy & Mala came up with other examples like Oracle’s official user groups. In this case, there’s no requirement: the free tier of PASS membership is the same as before.

We talked for a while about the question of whether to become a pro member. The really short version is that if you’re taking a hard look at membership benefits versus the annual cost, it’s not worth it today. But if you have goodwill built up for PASS, it’s worth it, as it helps keep PASS afloat.

Architectural Diagrams

The other key theme of the evening was architectural diagrams. Mala and Raymond both shared the same article on what makes for a good diagram, so that seems like a sign that it’s a good article.

My key judgments on what makes for a good architectural diagram are:

  • The diagram is concise. Show what you need but don’t include a lot of unnecessary detail. For example, if we’re talking about an ETL process in Azure, such a diagram might show a virtual machine pushing data through Azure Data Factory into an Azure Synapse Analytics SQL Pool, and from there into Azure Analysis Services and Power BI. At this high level, including the list of specific VNet settings is unnecessary. Even the set of data flows you’re creating through this process is unnecessary unless there’s a need—for example, if there are two processes and you need to differentiate them.
  • The diagram is at the appropriate level. Ideally, boxes or shapes in a diagram should be independent units, each of which is necessary for understanding the solution.
  • The diagram is built with the audience in mind. We can have multiple diagrams for different people, and diagrams aren’t (or at least shouldn’t be!) the only documentation available.

As far as diagramming tools, I brought up Diagrams.net and Diagrams as Code. We also received recommendations for PowerPoint, Visio, and LucidChart.

Mala’s Book Corner

Mala has two book recommendations for us:

Jobs

Shop Talk: 2020-08-10

The Recording

The Panelists

  • Kevin Feasel
  • Mala Mahadevan
  • Tracy Boggiano
  • Tom Norman

Notes: Questions and Topics

Transaction Modes

The first topic of the evening was around transaction modes, following from a blog post and video I released yesterday.

Not surprisingly, most of us use explicit transactions in many cases, particularly higher-risk scenarios. Tracy has a template which she fills in, and Mala follows a similar path to me: begin a transaction if it’s potentially scary or if you’re doing this in application procedures. Tom’s answer: it depends. Most of the time, Tom uses autocommit because he’s in easy mode, but when he kicks it up to dangerous mode or if he needs to wrap multiple tables in a single transaction, he’ll mark them as explicit.
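For anyone who wants a starting point, here’s a minimal sketch of an explicit transaction template along the lines we discussed; the table and values are hypothetical:

    BEGIN TRY
        BEGIN TRANSACTION;

        UPDATE dbo.Account SET Balance -= 100 WHERE AccountID = 1;
        UPDATE dbo.Account SET Balance += 100 WHERE AccountID = 2;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- If either statement fails, undo both
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
        THROW;
    END CATCH;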

We also got into a rat’s nest on nested transactions and savepoints. Nested transactions are a lie. Savepoints are mostly a lie.
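To illustrate the lie, here’s what actually happens when you nest BEGIN TRANSACTION:

    BEGIN TRANSACTION;      -- @@TRANCOUNT = 1
    BEGIN TRANSACTION;      -- @@TRANCOUNT = 2, but no new transaction actually begins
    COMMIT TRANSACTION;     -- merely decrements @@TRANCOUNT to 1; nothing is committed yet
    ROLLBACK TRANSACTION;   -- rolls back ALL of the work, regardless of nesting depth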

Maximum Number of Joins in a Query

Mala brought up a topic asking about the maximum number of joins in a query. Tom took a strong stand with “It depends.” Mostly it depends on how big the tables are.

@srutzky had the best answer: 20 to 30, but they must all be to views which have 20-30 joins. I’d add on that the views need to be nested views going at least three or four levels deep.

I don’t think there’s a real answer to the question. I’ve run into cases with 18-20 joins where the query plan just falls apart and taking one of those joins out (even to a minor table where it’s a simple nested loop lookup of a fairly small number of rows) makes the query perform a lot faster. But I’ve also worked with queries with more joins than that which worked quite nicely. At the end of the day, if you are able to generate a stable plan, that’s how many joins you can get away with.

As a bonus, I rant about the phrase “Normalize until it hurts, denormalize until it works.” This isn’t the 1980s; that phrase generally doesn’t apply to OLTP systems anymore and hasn’t for a good decade-plus. If you need to denormalize your tables to get queries to run efficiently, they probably weren’t really normalized in the first place.

Mala’s Book Corner

Mala is back with two book recommendations for us:

A Diversion on Technical Writing

We ended the broadcast with a discussion on the importance of technical writing and some of the difficulties around it. It started on the idea of writing a book, but we ended up focusing on the documentation itself. One thing I want to stress is just how difficult it is to get this documentation right, especially because we tend to take mental shortcuts and expect that others will know the context currently in our heads. I’m really bad at it and have to try hard to remember that the reader needs all of the relevant context. It’s particularly difficult because the reader will go from A to B to C to D, but I might have written it B, D, A, C, such that by the time I get to A, I forget that I needed to explain something to make B make sense.

Raymond also asked where you can store documents. It’s a tough problem, and we kicked it around a bit without settling on a firm answer.

Shop Talk: 2020-08-03

The Recording

The Panelists

  • Kevin Feasel
  • Tom Norman

Notes: Questions and Topics

Tracy the MVP

Congratulations to Tracy Boggiano for finally getting her MVP. It’s been a long time coming and I’m glad that she is getting the appropriate recognition for her community support.

Azure Data Studio Database Projects

Tom gave us his first thoughts on Azure Data Studio database projects, walking us through them with reference to a blog post by Wolfgang Strasser.

We then turned this into an extended discussion on the state of Azure Data Studio today.

A Rant on the XEvent Profiler

Tom and I then discussed Profiler. No, not that profiler; the other one. I hate the fact that Microsoft named this the XEvent Profiler because it really muddies the waters. The product itself is fine and is starting to give Extended Events a reasonable UI. But that name…

The biggest problem I have with the name is that it seems to be intentionally confusing, and as long as there are two tools called Profiler, that ambiguity will lead to confusion. “Oh, I heard from <insert name here> that Profiler is bad, so I’ll avoid this thing called XE Profiler. What’s an XE?” It would have been better to name it something different and make it easier for people to say something like “Avoid Profiler and use the SQL Server Performance Tracker instead.”

The product is fine; the name is not.

Licensing

We had several questions around licensing, and I’m bundling them here.

First, if you have questions about SQL Server licensing, Thomas Grohser did a talk for our group last month and he explains it better than I ever will.

We also talked about licensing in tools like Visual Studio Code, which has its own license based on MIT. We talked a bit about which licenses tend to pass muster in legal teams at organizations, as well as some of the ones which don’t.

I also talked about why I hate Oracle and the exact amount of my Oracle razzing which is real versus me being a troll.

Mala’s Book Corner, by Kevin

In this week’s edition of Mala’s Book Corner—er, Kevin’s Book Hovel—I recommended Spark in Action, 2nd Edition by Jean-Georges Perrin. Jean-Georges is local to us in the Triangle area and published a great book on Spark. The examples are a lot better than what I’ve seen in other Spark books and training materials, so if you’re interested in learning about Spark, get this book.