- Kevin Feasel
- Tom Norman
Notes: Questions and Topics
Tom has spent a lot of time recently with Azure Lab Services, so our first segment is an interview where we cover some of the details of Azure Labs, how it worked for Tom, how much it ended up costing him, and some of the things you can do with the product.
Tom is doing another training for O’Reilly on December 4th. Here is a link to his prior training, to give you an idea of the offering.
Raymond brought us into the main topic of the evening: practices and planning for refactoring databases.
We go into it in some detail, but I wanted to share my notes as there were a couple of topics I did not have a chance to cover. What follows is a disjointed set of concepts which hopefully resemble advice in the aggregate.
First, refactoring is not performance tuning. Refactoring is supposed to be a behavior-preserving code change intended to make maintenance easier. By its nature, it is not intended to make things faster or change behavior. I gave a few classic examples of refactoring in application code, as well as a couple of examples of database refactoring. In code, think about breaking a segment of code out into its own method/function, or separating two things into different classes. This doesn’t necessarily improve application performance, but it does make the code easier for developers to understand and maintain. On the database side, normalizing a set of tables can be an example of refactoring: you’re not necessarily changing the performance of the application, but instead making it easier to manage query calls to insert, update, and retrieve information. Another example of database refactoring would be changing
text data types to
One key consideration: do you have the downtime to perform refactoring, or does the application need to keep running while your changes happen? If it needs to keep running, check out my talk on approaching zero-downtime code deployments.
I think the most important mindset to take when refactoring is to ask why you need to refactor. Make sure you have a good reason and it’s not just “It’s not my code, so it’s bad.” Understand that code has a story and the developer who wrote it knows more about the story than you do—even if you wrote the code, the you of two years ago was closer to the problem then than you are today. Chesterton’s Fence is a good mindset to have: before you change code, understand what it is doing and why it is there. If you can’t explain why the code looks the way that it does, be very hesitant about changing it.
Keep your database changes in line with your application deployment. Working with app devs reduces the risk that you’ll drop procedures in use or make changes which break the code. Also, try to use stored procedures as interfaces whenever possible. They allow you to make changes much more cleanly than when the application writes queries directly or goes through an ORM.
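To make the interface idea concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. SQLite has no stored procedures, so a view stands in for the stable interface; the principle is the same, but this is a swapped-in technique rather than the stored procedure approach itself. All table, view, and function names (`orders_flat`, `order_summary`, `get_orders`, and so on) are hypothetical.

```python
import sqlite3


def create_original_schema(conn):
    # Denormalized starting point: the customer name repeats on every order.
    # The view order_summary is the only object the "application" queries.
    conn.executescript("""
        CREATE TABLE orders_flat (
            order_id INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL,
            amount REAL NOT NULL
        );
        INSERT INTO orders_flat (order_id, customer_name, amount)
        VALUES (1, 'Ann', 10.0), (2, 'Bob', 5.5), (3, 'Ann', 7.25);
        CREATE VIEW order_summary AS
            SELECT order_id, customer_name, amount FROM orders_flat;
    """)


def refactor_to_normalized(conn):
    # Normalize into customer + customer_order, then redefine the view so
    # callers still see the same columns and rows.
    conn.executescript("""
        BEGIN;
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL UNIQUE
        );
        INSERT INTO customer (customer_name)
            SELECT DISTINCT customer_name FROM orders_flat;
        CREATE TABLE customer_order (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
            amount REAL NOT NULL
        );
        INSERT INTO customer_order (order_id, customer_id, amount)
            SELECT o.order_id, c.customer_id, o.amount
            FROM orders_flat o
            JOIN customer c ON c.customer_name = o.customer_name;
        DROP VIEW order_summary;
        CREATE VIEW order_summary AS
            SELECT co.order_id, c.customer_name, co.amount
            FROM customer_order co
            JOIN customer c ON c.customer_id = co.customer_id;
        DROP TABLE orders_flat;
        COMMIT;
    """)


def get_orders(conn):
    # The application's only query; it is unchanged by the refactoring.
    sql = ("SELECT order_id, customer_name, amount "
           "FROM order_summary ORDER BY order_id")
    return conn.execute(sql).fetchall()
```

Calling `get_orders` before and after `refactor_to_normalized` should return the same rows, which is the point: the tables changed shape, but nothing on the calling side had to.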
Another thing to consider is whether to include a rollback process, or if forward is the only direction. Only moving forward is a lot easier to develop against, but it does require foresight and is much higher risk. Rollback scripts force you to think about how to back out of a problem, but because they are (hopefully) almost never used, they add a lot of extra development time. The right choice depends on your (and your company’s) risk appetite and on the scenario: if you’re working on a real-time financial system, then you don’t have much of a choice and need a way to roll back. But if you’re working on a personal website or an in-house product with a small number of users, it may make more sense just to keep moving forward.
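As a rough illustration of pairing a roll-forward script with its rollback, the sketch below uses Python's standard-library `sqlite3` module. The refactoring is a table rename with a compatibility view left behind under the old name. The names `cust`, `customer`, `ROLL_FORWARD`, `ROLL_BACK`, and `run_script` are hypothetical, and a real deployment would use whatever tooling your change management process already requires.

```python
import sqlite3

# Roll forward: give the table a clearer name and leave a view behind under
# the old name so existing callers keep working.
ROLL_FORWARD = [
    "ALTER TABLE cust RENAME TO customer",
    "CREATE VIEW cust AS SELECT id, name FROM customer",
]

# Rollback: undo the steps above in reverse order.
ROLL_BACK = [
    "DROP VIEW cust",
    "ALTER TABLE customer RENAME TO cust",
]


def run_script(conn, statements):
    """Run all statements in one transaction; undo all of them on failure.

    Assumes the connection was opened with isolation_level=None so that the
    explicit BEGIN/COMMIT/ROLLBACK here control the transaction.
    """
    conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise


def list_objects(conn):
    # Return (name, type) for every user-created table and view.
    sql = ("SELECT name, type FROM sqlite_master "
           "WHERE name NOT LIKE 'sqlite_%' ORDER BY name")
    return conn.execute(sql).fetchall()
```

Writing `ROLL_BACK` next to `ROLL_FORWARD` is where the extra development time goes, but it also forces the question of what "undo" means for each step before anything is deployed.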
Whenever possible, have tests. A thorough set of database integration tests is great. But if all you have is a hand-created workbench of queries to run, that’s still better than nothing and can be sufficient. The goal is to have enough information to determine if there are issues with the rollforward phase as quickly as possible, ideally before users experience them.
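A hand-created workbench can be as simple as a named set of queries whose results you capture before the change and compare afterward. The sketch below shows one way to do that with Python's standard-library `sqlite3` module; the `orders` table, the query names, and the helper functions are all hypothetical.

```python
import sqlite3

# A hypothetical workbench: named queries whose results should not change
# when a refactoring is applied.
WORKBENCH = {
    "order_count": "SELECT COUNT(*) FROM orders",
    "total_by_customer": (
        "SELECT customer_name, SUM(amount) FROM orders "
        "GROUP BY customer_name ORDER BY customer_name"
    ),
}


def snapshot(conn, queries):
    # Run every workbench query and keep its full result set.
    return {name: conn.execute(sql).fetchall() for name, sql in queries.items()}


def differences(before, after):
    # Names of the queries whose results changed between two snapshots.
    return sorted(name for name in before if before[name] != after.get(name))


def create_sample_data(conn):
    conn.executescript("""
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL,
            amount REAL NOT NULL
        );
        INSERT INTO orders (order_id, customer_name, amount)
        VALUES (1, 'Ann', 10.0), (2, 'Bob', 5.5), (3, 'Ann', 7.25);
    """)
```

In use, take a `snapshot` before the deployment, apply the change, take another, and treat any non-empty `differences` list as a signal to investigate (or roll back) before users notice.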
Write your scripts to be re-runnable and put them through change management. Store the scripts in source control. There are a couple of philosophies on what to store in source control: either the current state of the system (creation scripts for tables, stored procedures, etc.) or the change process (the set of database modification scripts run over time). I like a compromise approach of having the current state plus some history of changes.
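One way to read "re-runnable" is that each step checks whether its work is already done before doing it, so running the script a second time is harmless. A minimal sketch, again using Python's standard-library `sqlite3` module, with hypothetical names (`customer`, `email`, `ix_customer_email`, `migrate`):

```python
import sqlite3


def column_exists(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    # The table name is interpolated directly, so only pass trusted,
    # hard-coded names here.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def migrate(conn):
    # Every step is guarded, so the whole script can be run repeatedly.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer ("
        "customer_id INTEGER PRIMARY KEY, customer_name TEXT NOT NULL)"
    )
    # SQLite has no ADD COLUMN IF NOT EXISTS, so check first.
    if not column_exists(conn, "customer", "email"):
        conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS ix_customer_email ON customer (email)"
    )
    conn.commit()
```

Calling `migrate` twice against the same database should leave the schema in the same state as calling it once, which is the property that makes a failed or repeated deployment safe to retry.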
Finally, early refactoring can be painful, especially because you’re usually trying to untangle years of growth. Choose some easy targets to build momentum. Those tend to be tangential projects or small parts of a major database. Start building some of the muscle memory around rollback scripts (if needed), change control, and testing; that will make subsequent work that much easier.