Layered Architecture for Data Platforms: the place that turns data into insights
Data is often visualized so that it can be consumed in a consistent, understandable and user-friendly way. In part 6 of the series on the Layered Architecture for Data Platforms, we discuss different technologies and techniques to visualize data.
In the previous blogs about the Layered Architecture for Data Platforms we introduced the layered architecture for data platforms, dove deeper into the data sources and ingestion layer, discussed the processing layer, looked into the technologies to store data and described how the data can be analyzed. In this blog we look into the Visualization Layer, where the data is visualized in dashboards and reports.
Figure 1: Layers of a Data Platform
After the data is stored and/or analyzed, it can be visualized using dashboards and reports to turn the data into proper information which can help gain insights and drive business decisions.
When we talk about visualizing data, we make the distinction between the following four types of visualization:
- Dashboards
- Reports
- Self-Service BI
- Embedded Analytics
A business intelligence (BI) dashboard is a data visualization tool that displays information about key performance indicators (KPIs) or other business metrics. Dashboards usually represent the information using graphs, tables or KPI cards; their purpose is to provide the information at a glance and give a quick overview of the performance of the business. Dashboards are often highly configurable, so users can change how the data is visualized or which details are shown. A very popular feature of most dashboards is the ability to drill down from an overview of the data to a more detailed and granular view of a specific part of the data. Showing information visually has many benefits.
The flexibility of the dashboard and the possibility to drill down, however, have consequences for the dataset. The dataset should contain not only the data that is shown in the dashboard, but also all data that could possibly be shown based on the selections that the user can make. The possibility to drill down means that the dataset should contain not only the highly aggregated data but also all the details behind it.
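The consequence for the dataset can be sketched in a few lines of Python. This is a minimal illustration, not how any particular dashboard tool works: the `SALES` rows and the `aggregate` helper are hypothetical. It shows that both the overview and the drill-down view are computed from the same detail rows, which is why the detail has to be part of the dataset.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, product, units_sold).
SALES = [
    ("North", "Laptop", 10),
    ("North", "Phone", 25),
    ("South", "Laptop", 7),
    ("South", "Phone", 12),
]


def aggregate(rows, level):
    """Sum units per region ('region') or per region and product ('product')."""
    totals = defaultdict(int)
    for region, product, units in rows:
        key = region if level == "region" else (region, product)
        totals[key] += units
    return dict(totals)


# The overview the dashboard shows first.
overview = aggregate(SALES, "region")
# The drill-down view: only possible because the detail rows are available.
detail = aggregate(SALES, "product")

print(overview)
print(detail)
```

If the dataset held only the regional totals, the second call could not be answered; the totals can always be derived from the details, but not the other way around.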
Since database queries are often too slow to achieve the required response time, most dashboard tools have an in-memory engine that holds the complete dataset, so that changing selections or drilling down is very fast. In some cases this in-memory engine is not needed, for example when the database that stores the data is very fast (such as an in-memory database) or when an OLAP layer is used on top of the database. (See the blog about the Analytics layer to read more about the OLAP layer.) Because the amount of data keeps growing rapidly, it is sometimes a challenge to keep all required data in memory. Dashboards might therefore need to be divided into smaller dashboards, or advanced techniques must be applied to swap data between disk and memory.
Reports usually have one or more screens or pages with graphs or tables to represent the information. The information is static, so the person looking at the report cannot change how the data is shown or drill down into the details as they can with a dashboard. However, some reports let the user enter filter criteria before the report is opened to select which data will be shown.
Reports can be created either with the same tools as dashboards, but with less flexibility in what the user can do with them, or with special reporting tools that are used when the report needs to be pixel-perfect. A pixel-perfect report means that the developer of the report has control over how every single pixel is placed on the report. This is often required when the report needs to be printed. Another reason to use a special reporting tool is the functionality to manage and share the reports from a central place. Reporting tools support the distribution of reports in various ways, for example by sending them automatically by email to the users.
Reports do not have the same high performance requirements for the data as dashboards, because the data only needs to be retrieved when the report is generated. Often the report is generated at a different time than when it is viewed. For example, it is possible to generate the reports during the evening so that they are readily available during business hours the next day. Most reporting tools retrieve the data from the database only at the moment the report is generated.
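The idea of a static report that is filtered up front and generated ahead of time can be sketched as follows. This is a minimal illustration under stated assumptions: the `ORDERS` data, the `generate_report` function and the fixed run date are hypothetical and do not represent the behaviour of a specific reporting tool.

```python
from datetime import date

# Hypothetical source data that a reporting tool would query from the database.
ORDERS = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": 40.5},
]


def generate_report(rows, region, run_date):
    """Apply the filter criteria once and return a static text snapshot."""
    selected = [row for row in rows if row["region"] == region]
    total = sum(row["amount"] for row in selected)
    lines = [
        f"Sales report for {region}",
        f"Generated on {run_date.isoformat()}",
        f"Orders: {len(selected)}",
        f"Total amount: {total:.2f}",
    ]
    return "\n".join(lines)


# Generated in the evening, viewed the next day: the viewer only sees the snapshot.
snapshot = generate_report(ORDERS, "North", date(2024, 1, 15))
print(snapshot)
```

Because the data is read only inside `generate_report`, later changes to the source do not affect a snapshot that was already produced, which is why the performance requirements are lower than for an interactive dashboard.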
Self-service BI means that end users are able to develop graphs, reports or dashboards on pre-developed datasets. With self-service BI, users not only consume existing reports or dashboards, but also create their own and share them with other users.
Not all users are able to use self-service BI, as it requires extensive knowledge of the data, processes and business context. The small group of users that do have this knowledge and these capabilities are known as super-users. The datasets that they work on are often developed by the BI developers and differ from the datasets used for reports and dashboards in the following ways:
- The dataset for a dashboard / report is specific and limited to only what is necessary for the dashboard while the dataset for self-service BI should be as generic as possible so that it can be used to answer many different types of questions.
- The dataset for self-service BI should be “fool” proof. It should not be possible to do things that will crash the server(s), cause performance issues for other users or produce information that does not make sense.
- Within the dataset for self-service BI it should be very clear what each data point means and how it should be used.
A good practice is to describe the dataset in a data catalog that contains the (business) meaning, where the data is coming from and how it can be used. See also the next blog about Data Governance that describes the data catalog.
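What such a catalog description could contain can be sketched as a small data structure. This is only an illustration: the `CatalogEntry` fields, the `net_revenue` example and the `describe` helper are hypothetical, and real data catalog products have their own models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One documented data point in a self-service dataset (hypothetical model)."""
    name: str
    business_meaning: str
    source: str
    usage_notes: str


# Hypothetical catalog content for illustration only.
CATALOG = {
    "net_revenue": CatalogEntry(
        name="net_revenue",
        business_meaning="Invoiced revenue minus returns and discounts",
        source="ERP sales ledger (hypothetical)",
        usage_notes="Sum across products and periods; do not average",
    ),
}


def describe(field_name):
    """Return a one-line description, or flag the field as undocumented."""
    entry = CATALOG.get(field_name)
    if entry is None:
        return f"{field_name}: not documented in the catalog"
    return f"{entry.name}: {entry.business_meaning} (source: {entry.source})"


print(describe("net_revenue"))
print(describe("margin"))
```

Flagging undocumented fields explicitly, as the second call does, makes gaps in the catalog visible to super-users before they build on a data point they do not fully understand.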
Reports and dashboards created through self-service BI can be shared with other users, but those users should know that such a report or dashboard was not developed by the BI or IT department and as such can be of lower quality (it is not guaranteed that the figures are correct or that it has been thoroughly tested). A good way to do this is to mark the reports and dashboards that are developed by the BI / IT department as “Approved” so that users can easily see the difference.
There are some golden rules to follow when choosing the best tool and designing data visualizations:
Be relevant. Use a user-centric approach rather than a data-driven or technology-driven one, so that the visualization is relevant to the business problem. User engagement drives adoption!
Be specific. Use different solutions for different users and use cases. Differentiate between operational, tactical and strategic business needs resulting in different solutions with different levels of aggregation and data refresh cadences.
Be a layer. Don’t mix up the visualization layer with the other layers (such as the processing and analytics layers). Ensure that whatever can be done in the back-end, or in another layer, is done there and not in the visualization layer.
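The "Be a layer" rule can be illustrated with a small sketch in which the aggregation is done by the database and the visualization code only receives the result. The in-memory SQLite database, the `sales` table and the `units_per_region` function are assumptions made for this example; they stand in for whatever storage or analytics layer a platform actually uses.

```python
import sqlite3

# Hypothetical back-end: an in-memory SQLite database standing in for the storage layer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 10), ("North", 25), ("South", 7)],
)


def units_per_region(connection):
    """Let the back-end aggregate; the visualization layer only plots the result."""
    cursor = connection.execute(
        "SELECT region, SUM(units) FROM sales GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


chart_data = units_per_region(conn)
print(chart_data)
```

The visualization code never loops over individual sales rows; it asks the back-end for exactly the shape it needs, which keeps business logic out of the dashboard.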
Embedded analytics means that analytics capabilities are integrated into transactional systems (for example ERP or CRM systems) or into a web portal, so that they become part of the day-to-day operations and business processes.
Some examples of embedded analytics are:
- Data visualizations or graphs that are integrated in a web portal or application
- Reports that are executed from within the ERP or CRM application
- Benchmarking where current data is compared to external data within the application
- Mobile reporting where the data visualizations or graphs are shown on the mobile device
- Visual workflows that change transactional data from within the analytical application and then write the changes back to the source system.
While dashboards provide a more or less centralized overview of the data, embedded analytics is more integrated with the operational systems and as such is more tailored to a specific action or decision that needs to be made. Embedded analytics makes users more productive because it is integrated in the applications that they use every day, which improves both the user experience and the decisions that are made.
As described above, data visualization can be divided into four types (dashboards, reports, self-service BI and embedded analytics). We are seeing some trends in data visualization that favor certain methods and that influence how the visualization is consumed. These trends are outlined below:
- Reports and dashboards are consumed more often on mobile devices. When designing and developing reports and dashboards, it is therefore important to keep in mind that the screen is smaller and that a user interface that works well on a computer will not always be ideal when used on a mobile device.
- Visualization tools use advanced analytics to propose which data can best be visualized or which type of graph is most suitable for the selected data. Visualization tools also offer the possibility to ask questions about the data, for example “Show products sorted by the manufacturer” or “Show total units by manufacturer and categorize in a tree map”.
- Complex custom graphs are increasingly used to visualize data. When you want to use your own graphs, make sure that you use a dashboarding tool that supports custom-developed graphs. Often a library of custom-developed graphs is also available to expand the capabilities of the standard dashboarding tool. We notice that there is great reluctance among our clients to work with custom graphs from third parties. With free custom graphs you often have no support and no certainty that the graph remains available and does not break with an update. Paid custom graphs, on the other hand, result in higher costs, and the benefits of the better visualization often do not outweigh those costs.
- The next thing in visualizing data effectively is data storytelling. Data storytelling is an approach for communicating data insights which includes data, visualizations and also a narrative. A new way of data storytelling is video visualization where the data is presented in an infographic video.
- We see a clear desire among our clients to move more towards self-service BI instead of pre-developed reports and dashboards because:
  - BI departments get overwhelmed with requests for dashboards and reports as the need for insights from the data grows and diversifies.
  - More organizations are moving towards data-driven ways of working, where more employees are data savvy and therefore able to apply self-service.
  - The technology of visualization tools is developing rapidly, making it more user friendly for people to interact with the data.
In this blog we discussed the different types of data visualization, some golden rules to follow when designing data visuals and some trends we see in the market. The data visualization layer is dependent on the other layers of the layered architecture: before data can be visualized, it should be ingested into the data platform, processed, stored and analyzed. Often organizational changes are also needed to make decisions based on data. See the article “Today’s organizational challenge: From gut feeling to data-driven decision making” to read about the challenges and what actions can be taken to overcome them.
Deloitte’s Data Modernization & Analytics team helps clients with modernizing their data infrastructure to accelerate analytics delivery. The team can help you with designing and developing dashboards, reports, self-service BI datasets or embedded solutions and advise on how they will work together with the other layers in the data platform. Our next blog will be about the Security and Data Governance Layers. If you want to know more about how the data can be secured and managed, please read the next blog in our series about the Layered Architecture.