How to build a great ‘self-service’ data solution in just a couple of hours

…usually with just one line of code.

Introduction

In my experience, quite a few companies find themselves in one of the two situations below:

A – No Self-Service data solutions
B – Self-Service solutions that exist but are hardly used – or only used by the people who built them

Scenario B usually happens because the model that supports drag ’n’ drop analysis is one (or more) of the following:

  • Way too big – so many fields that the data model is intimidating or overwhelming
  • Too slow – so many fields that the model grinds to a halt – and the user lacks the quick feedback required to navigate
  • Unintuitive – the user is confused by the model and isn’t sure what data they are looking at
  • Untrusted – the data could simply be wrong, and the risk of this grows as the model gets more complex

Even if your organisation isn’t in either situation and already has pretty good self-service BI, this approach can still be used as a tactical solution – for example, when you want to add reporting capabilities to a new system, or when your organisation acquires another company through M&A.

Before some of the modern tooling became available, building a new self-service solution meant something like an SSAS Multidimensional cube or a BusinessObjects universe – and you needed a developer who knew what they were doing to build it. However, modern BI / analytics tools such as Power BI and Tableau have been built with non-technical end-users in mind, and they can be used to put together a great self-service solution in just a couple of hours.

Where to start

To create a self-service data solution we are going to need some data.  And to build it in a couple of hours we can’t start from scratch.

So we are going to find an existing report to convert. Which report to choose should be fairly obvious, as nearly all companies have one or two reports that are used for a whole range of use-cases. They are usually very wide reports, built in something like SQL Server Reporting Services, with a lot of columns and parameters. These reports often get exported to .xlsx as data-dumps for manual analysis in Excel.

A brief comment on data modelling

With the tools available 10 years ago, to build effective self-service solutions you needed your data stored in a star-schema – and in my opinion, if you are building self-service BI solutions of any scale then you still need a star-schema.

Therefore, a lot of people tend to think of data as existing in one of two models:

Entity-relationship (normalised to BCNF or 4NF) for application databases, and star-schema for analytic databases.

But we are going to use a lesser-discussed data structure – a ‘flat dataset’, also known as a simple single table (in 1NF, if you want to make it sound more professional).

One of the best features of the star-schema is that the physical data structures closely match the conceptual data models that end-users hold in their minds.

And the beauty of tools such as Tableau and Power BI is that we can easily overlay that same mental model over a flat dataset to give the appearance and ease-of-use of a star without the additional effort of reshaping the data in a relational database.
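To make that concrete, here is a minimal sketch (the table and column names are hypothetical) of a star-schema being flattened into a single wide dataset – each row carries its dimension attributes alongside the measures, which is exactly the shape a wide report already produces:

    -- Hypothetical star-schema flattened into a single wide dataset.
    SELECT
        d.CalendarDate    AS [Order Date],
        c.CustomerName    AS [Customer],
        p.ProductCategory AS [Product Category],
        p.ProductName     AS [Product],
        f.Quantity        AS [Quantity],
        f.SalesAmount     AS [Sales Amount]
    FROM dbo.FactSales f
    JOIN dbo.DimDate     d ON d.DateKey     = f.OrderDateKey
    JOIN dbo.DimCustomer c ON c.CustomerKey = f.CustomerKey
    JOIN dbo.DimProduct  p ON p.ProductKey  = f.ProductKey;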

Back to building our dataset using our chosen report

The below is an example of a Power BI dataset I created a couple of years ago – which took me around 30 minutes – and the original report it came from.

I copied the Stored Procedure for the report and ran it with the date parameters set to 01/01/1900 and 31/12/2099 and the rest of the parameters set to “ALL”. This returned all the data available to that report.
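As a rough sketch – the procedure name and parameter list here are hypothetical – the call looked something like this:

    -- Hypothetical report procedure; widen every parameter to return everything.
    EXEC dbo.usp_SalesReport
        @StartDate = '19000101',   -- far past
        @EndDate   = '20991231',   -- far future
        @Customer  = 'ALL',
        @Region    = 'ALL';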

In a traditional SSRS report, you design it so that it doesn’t return too many rows, both so that it returns quickly and so that it only gives the user relevant data. From a technical viewpoint, it takes ages to render a lot of rows as HTML in the browser, and there isn’t much the user can do to interact with them.  However, tools like Tableau & Power BI are perfect for interacting with data and ‘slicing & dicing’ large datasets – and they allow you to create really effective slicers and filters in seconds.

Very little coding required

If you don’t have access to the database then you will need to run the report in a browser and export it to Excel, using the exported .xlsx as the source. Unfortunately, refreshing will be a bit clunky, and there are other limits to this approach such as report timeouts – but it’s worth mentioning that it can be done.

BUT… assuming you have access to the database, all you need to do is find the Stored Procedure for the report, add an INTO dbo.TableX clause to the final SELECT, and then run the Stored Proc or the report.  If you are scheduling it daily then you may have to add a DROP TABLE or a TRUNCATE/INSERT – but it’s all pretty simple really.
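A minimal sketch of that one-line change, using a hypothetical final SELECT (your report’s column list and joins will differ):

    -- Hypothetical final SELECT inside the report's stored procedure.
    SELECT  o.OrderDate,
            c.CustomerName,
            o.SalesAmount
    INTO    dbo.TableX        -- the one added line; creates and populates the table
    FROM    dbo.Orders o
    JOIN    dbo.Customers c ON c.CustomerId = o.CustomerId
    WHERE   o.OrderDate BETWEEN @StartDate AND @EndDate;

    -- For a scheduled daily refresh, drop the table first so that
    -- SELECT ... INTO can recreate it on each run.
    IF OBJECT_ID('dbo.TableX', 'U') IS NOT NULL
        DROP TABLE dbo.TableX;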

Finally, you may have a report that returns the data you need but only for, let’s say, a single customer. In this instance you can usually just comment out the WHERE customer = @customer clause and the SP will bring back everything you need.
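Using the same hypothetical query, that change is as simple as:

    SELECT  o.OrderDate, c.CustomerName, o.SalesAmount
    INTO    dbo.TableX
    FROM    dbo.Orders o
    JOIN    dbo.Customers c ON c.CustomerId = o.CustomerId;
    --WHERE  c.CustomerName = @customer   -- commented out to return all customers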

Finishing the self-service solution

Once you have your data stored either in a database table or an Excel spreadsheet, load up Power BI or Tableau and import the data. 
From here, give all the columns friendly names and group the columns into display folders until the dataset is nicely presented and intuitive to use.
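If you prefer to do the renaming once at the source instead of in the BI tool, a simple view with friendly aliases (hypothetical names again) achieves the same thing:

    -- Hypothetical view exposing the extracted table with user-friendly names.
    CREATE VIEW dbo.vSelfServiceSales AS
    SELECT  OrderDate    AS [Order Date],
            CustomerName AS [Customer],
            SalesAmount  AS [Sales Amount]
    FROM    dbo.TableX;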

It is also a good idea to improve the model by creating useful hierarchies, hiding unused columns and making sure numbers have the correct aggregation settings (Summarize By). The below is an example of where to find the relevant settings in Power BI.

And note: you shouldn’t have to worry about performance, as Power BI and Tableau compress imported data using columnar storage, so the model will be nice and responsive even with a few million rows.

The data in these models will be very familiar to the users who regularly run the reports, which means fewer barriers to adoption.

The other great thing is that the reports we are converting have usually been around for years, and their accuracy has been honed through many iterations – all of which makes testing much easier and builds trust in the data.

Summary

I wrote this post for anyone who works with reporting tools like Reporting Services, Crystal Reports, Tableau or Power BI – data analysts, business analysts, business intelligence developers and so on.

Although this post outlines the steps required, each organisation will be different – but hopefully the concepts can be understood and applied. Sometimes in the data world there is a misunderstanding that complexity = better; this is a nice, simple technique to breathe new life into some of your existing reporting. Hopefully you find it useful 🙂