The Recording
The Panelists
- Kevin Feasel
- Mala Mahadevan
- Tracy Boggiano
Notes: Questions and Topics
Announcements
Our major announcement is that the TriPASS 2020 membership survey is open. If you haven’t filled it out yet, it will remain open for the next couple of weeks.
Data Saturdays
We spent the first segment of the show talking about the post-PASS future. We included a link to Data Saturdays, which is one step toward replacing SQL Saturdays.
I’d also recommend checking out Call for Data Speakers, a service hosted by Daniel Hutmacher to give data platform speakers and conferences a central location, even outside of the SQL Saturday/Data Saturdays paradigm.
Azure Databricks and Azure Synapse Analytics
We spent a good amount of time walking through the differing use cases of Azure Databricks and Azure Synapse Analytics. Microsoft has an architecture guide that covers them. One point where I differ: I don’t think HDInsight is worth using.
Ivana Pejeva has a great article on the topic. One thing about the article, though, is that it was written a few months back, so Azure Synapse Analytics has changed a bit since.
We also talked about competitors to Snowflake, which in the Azure cloud means Azure Synapse Analytics dedicated SQL pools.
Time Series Databases
@rnicrosoft asked a question about analytics when you have 800 million or so key-value pairs. On the transactional side, the solution is typically something like Cosmos DB, where you’re reading and writing single records at a time. But what happens when you need to perform analysis on the data?
One solution is to use something like Azure Synapse Link to pull that data from Cosmos DB into Azure Synapse Analytics and organize the data in a classic fact-dimension model.
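To make the fact-dimension idea concrete, here is a minimal sketch in plain Python of how raw key-value readings might be split into a dimension table and a fact table. It does not use Azure Synapse Link or any Azure API; the `Reading` shape, the function names, and the sample keys are all hypothetical and only illustrate the modeling step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    key: str      # hypothetical natural key, e.g. a device or metric name
    value: float  # the value stored under that key
    day: str      # ISO date on which the pair was written


def build_star_schema(readings):
    """Split raw key-value readings into one dimension table and one fact table."""
    key_dim = {}  # dimension: natural key -> surrogate integer id
    facts = []    # fact rows: (key_id, day, value)
    for r in readings:
        # Reuse the surrogate id if the key is known, otherwise assign the next one.
        key_id = key_dim.setdefault(r.key, len(key_dim) + 1)
        facts.append((key_id, r.day, r.value))
    return key_dim, facts


def total_by_key(key_dim, facts):
    """Aggregate the fact table and join back to the dimension for readable names."""
    names = {key_id: key for key, key_id in key_dim.items()}
    totals = {}
    for key_id, _day, value in facts:
        name = names[key_id]
        totals[name] = totals.get(name, 0.0) + value
    return totals
```

The point of the split is that analytical queries scan compact fact rows and join to small dimension tables, rather than reading records one key at a time.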
But another solution would be to store the data in a time series database like InfluxDB and visualize it in Grafana. Tracy and I have implemented monitoring with InfluxDB, Telegraf, and Grafana, and you can use the same stack for “normal” analytics as well.
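As a rough illustration of what a key-value pair looks like once it is shaped for a time series database, the sketch below formats one pair as a line of InfluxDB line protocol. The measurement name `kv_readings`, the tag name `key`, and the field name `value` are assumptions made for this example, and actually sending the line to InfluxDB (through a client library or Telegraf) is not shown.

```python
def escape_tag(text: str) -> str:
    """Escape the characters that line protocol treats as special in tag values."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement: str, key: str, value: float, timestamp_ns: int) -> str:
    """Format one key-value pair as a single line-protocol point.

    Assumes the measurement name contains no commas or spaces, so it is
    written without escaping. The timestamp is in nanoseconds since the epoch.
    """
    return f"{measurement},key={escape_tag(key)} value={float(value)} {timestamp_ns}"
```

Storing the original key as a tag keeps it indexed, so a Grafana panel can filter or group by key while charting the field over time.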