Real-time Analytics over OLTP, good or bad?
Updated: Sep 5, 2018
With Microsoft adding columnstore technology to the same relational database engine from SQL Server 2016 onwards, you can now enable real-time analytics over an organization's most valuable operational systems, such as ERP, CRM, PLM, and accounting, without writing a single ETL package and with minimal effort. All you have to do is create columnstore indexes on the relevant tables; each columnstore index uses separate storage from the rowstore and can cache data, but it does consume compute and memory.
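As a minimal sketch, creating an updatable nonclustered columnstore index on an OLTP table looks like this (the table and column names are hypothetical; COMPRESSION_DELAY keeps hot, frequently updated rows out of the compressed columnstore for a while to reduce overhead on the operational workload):

```sql
-- Hypothetical OLTP table dbo.SalesOrders; pick the columns analysts query.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_SalesOrders
ON dbo.SalesOrders (OrderID, CustomerID, OrderDate, Amount)
WITH (COMPRESSION_DELAY = 10 MINUTES);  -- delay compression of recently modified rows
```

This is a one-statement change; no ETL pipeline or separate warehouse schema is involved.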
The first question that comes to mind after reading this is the impact on operational workloads and application performance. It is a valid concern: although the columnstore engine uses separate storage, it still consumes compute and memory on the same server. So what can you do?
If you want to implement this, then along with the columnstore indexes, use real-time (transactional) replication to sync the data to another SQL Server instance or an Azure SQL Database, and give analytical users access to the replica, not the primary. You create the columnstore indexes on the primary, and they get replicated to the reporting database along with the other database objects. It is a one-time setup; after that you can run analytical workloads by connecting with Power BI, Tableau, etc., without impacting the performance of your live applications.
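Once the replica is in place, analytical queries run against it and the columnstore index is picked up automatically by the optimizer. A hedged example of the kind of aggregation this setup is meant for, using the same hypothetical table as above (point your BI tool or connection string at the replica, not the primary):

```sql
-- Run against the reporting replica, not the production primary.
-- The batch-mode columnstore scan makes this aggregation cheap even on large tables.
SELECT
    CustomerID,
    YEAR(OrderDate) AS OrderYear,
    SUM(Amount)     AS TotalSales
FROM dbo.SalesOrders
GROUP BY CustomerID, YEAR(OrderDate)
ORDER BY TotalSales DESC;
```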
For more details on how to implement real-time operational analytics, see https://docs.microsoft.com/en-us/sql/relational-databases/indexes/get-started-with-columnstore-for-real-time-operational-analytics?view=sql-server-2017
This works well when you need to analyze data from one system at a time, but what if you need to combine data from multiple systems, including some data in Excel files? Stay tuned for the next blog in the series, where I will walk through near real-time analytics over multiple large datasets with no ETL.