  • Harinder Saluja

SAP Analytics Cloud Smart Discovery

Machine Learning for Analysts

As an analyst, you want to get the most out of your data. You want to have access to the best techniques to really understand what’s going on in your business. This empowerment is at your fingertips with Smart Discovery in SAP Analytics Cloud. Smart Discovery offers you a viable way to use automated machine learning on top of your BI data, without losing precious analysis time on data preparation. Simply decide the business question you want to ask your data and let Smart Discovery analyse it for you, by running a machine learning algorithm. You can then explore the generated results to gain insights into your data.

Smart Discovery

In Q1 2021, a significant update to Smart Discovery was released. It helps you define the context of your business question more clearly, so that Smart Discovery can automatically prepare the data and produce better analysis results for you to explore. To make sure you’re happy with the business question you’ve defined, Smart Discovery now shows you a preview of the question before it starts its analysis. The three main benefits of Smart Discovery are:

  1. Bridges the knowledge gap between machine learning and BI, allowing analysts to easily use machine learning in BI.

  2. Automatically leverages data contained in existing BI models, eliminating manual data preparation.

  3. Maps the output cleanly to the business question, making it simple to refine and improve the question.

This article explains how to use Smart Discovery to explore your data and answer real business questions.

Specify the Business Question


Smart Discovery Settings

Smart Discovery helps you to understand the process that it’s using to analyse your data for you. It helps you to specify the right question, and quickly understand the generated results. You can refine the question by modifying the target or entity, filtering the dataset or excluding variables from the analysis. The Target is the measure or dimension you would like to know more about, such as Revenue or Customer movements. The Entity defines the dimension or dimensions that describe the object in the data you would like to know more about, for instance customer or product. The entity describes the key that identifies each instance of that object. Smart Discovery will aggregate the data to the level described by the entity.

Analysts normally require specialist data science knowledge to effectively apply machine learning techniques to business data. Some of the challenges they face are:

  • Selecting the correct machine learning technique for a particular problem.

  • Selecting and preparing the data.

  • Correctly interpreting the results.

Smart Discovery allows an analyst to simply specify the business question. Based on this question the correct predictive algorithm is selected and the BI data is automatically prepared to allow the predictive algorithm to be applied. Smart Discovery then produces results that are easy to understand. As the automatic data preparation allows machine learning to be applied directly to BI data, it is simple to refine the question, or you can always ask more than one business question. Explore your data from different angles by asking Smart Discovery to analyze the same target in relation to different entities, and it produces different results.

Confirm the Business Question


Smart Discovery analyses the data and generates content that gives you insight into how underlying variables influence a target in relation to an entity within a dataset. In this example, Smart Discovery automatically prepares the data and builds a predictive model to predict Gross Margin for Customer Name. From this predictive model, it extracts and generates content that helps the analyst understand Gross Margin.

A key issue when applying machine learning to BI data is that the data is not naturally structured in a way that allows machine learning to be applied. This can mean that the results generated by machine learning do not match the user’s expectation and can be misleading. When configuring Smart Discovery, you specify the question by selecting both the Target and the Entity. The Entity defines the object in the data you wish to explore. It is defined by one or more dimensions and, essentially, forms the key of the generated dataset. By specifying both, the data can be prepared to match the question, ensuring the generated output is safe and easily understood.

In this example, you specify the target as Gross Margin and the entity as Customer Name. The other dimensions in the data may play an important role in explaining the target and must be represented in the flattened dataset. Measures are aggregated based on their aggregation type at the entity level.

How a dimension is represented in the dataset depends on its relationship to the entity, as follows:

  1. If a dimension has a single value per entity, it will be included in the dataset as is, with its original name. The relationship in this case is many to one (m:1).

  2. If there is a unique value of a dimension for each value of the key, the dimension will not be included. In this case the relationship is one to one (1:1).

  3. If a dimension has multiple values per entity, a count of the distinct values will be included with a “Number of” prefix. The relationship in this case can be many to many (m:m) or one to many (1:m).

Smart Discovery automatically prepares a dataset that contains one row of data for each instance of the entity. For instance, if the selected dimension is customer ID, the dataset contains one row of data for each unique customer ID. Identifying the entity allows the automated machine learning to provide much more focused analysis.
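The three rules above can be sketched in a few lines of Python. This is only an illustration of the flattening logic as described in this article, not Smart Discovery's actual implementation. The function `flatten_to_entity`, the sample rows and the column names are all hypothetical, and every measure is assumed to use SUM as its aggregation type.

```python
from collections import defaultdict


def flatten_to_entity(rows, entity, measures, dimensions):
    """Return one row per entity value, following the three rules above.

    Hypothetical helper for illustration; assumes every measure uses SUM.
    """
    # Group the transaction-level rows by the entity key.
    groups = defaultdict(list)
    for row in rows:
        groups[row[entity]].append(row)

    # Decide how each remaining dimension is represented.
    plan = {}
    for dim in dimensions:
        values_per_entity = [{row[dim] for row in g} for g in groups.values()]
        if any(len(values) > 1 for values in values_per_entity):
            plan[dim] = "count"  # rule 3: several values per entity
        else:
            singles = [next(iter(values)) for values in values_per_entity]
            if len(set(singles)) == len(singles):
                plan[dim] = "drop"  # rule 2: 1:1 with the key
            else:
                plan[dim] = "keep"  # rule 1: m:1, keep original name

    flat = []
    for key, group in groups.items():
        out = {entity: key}
        for measure in measures:
            out[measure] = sum(row[measure] for row in group)
        for dim, action in plan.items():
            if action == "keep":
                out[dim] = group[0][dim]
            elif action == "count":
                out["Number of " + dim] = len({row[dim] for row in group})
        flat.append(out)
    return flat


# Hypothetical sample data, one row per sale.
rows = [
    {"Customer Name": "Acme", "Customer ID": "C001", "Region": "North",
     "Product": "Bikes", "Gross Margin": 100.0},
    {"Customer Name": "Acme", "Customer ID": "C001", "Region": "North",
     "Product": "Helmets", "Gross Margin": 50.0},
    {"Customer Name": "Bolt", "Customer ID": "C002", "Region": "South",
     "Product": "Bikes", "Gross Margin": 70.0},
    {"Customer Name": "Cyan", "Customer ID": "C003", "Region": "North",
     "Product": "Bikes", "Gross Margin": 30.0},
]

flat = flatten_to_entity(
    rows,
    entity="Customer Name",
    measures=["Gross Margin"],
    dimensions=["Customer ID", "Region", "Product"],
)
for record in flat:
    print(record)
```

In this sample, Region is kept as is (several customers share a region), Customer ID is dropped (it is unique per customer), and Product becomes a "Number of Product" count. On a very small dataset an m:1 dimension can look 1:1 by coincidence, so the classification here is only as reliable as the sample.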

Automatically Generated Story

Smart Discovery automatically prepares the data for the business question, analyzes the data and generates content for you. The process automatically builds a predictive model to predict the target. The insights provided on the Key Influencers, Unexpected Values and Simulation pages are based on this model.

It is important to note that the analysis is performed on a snapshot of the data at the time Smart Discovery is run and that the analysis is not updated automatically in response to updates to the data. All the content generated by Smart Discovery is dynamic and changes based on the underlying data.

The Overview page provides visualisations to summarise the results for your target dimension or measure in relation to your entity.


The Key Influencers page is generated from the predictive model. It lists, ranked from highest to lowest, up to 10 dimensions and measures that significantly impact the target. For each influencer, visualisations show the average target value and the distribution of the target for each value of a dimension, or for each binned value of a measure.
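As a rough illustration of what those per-influencer charts summarise, the sketch below computes the average target for each value of a dimension and for each bin of a measure. It does not reproduce how Smart Discovery ranks influencers or chooses bins: the function names, the sample records and the use of equal-width bins are assumptions made for this example only.

```python
from collections import defaultdict
from statistics import mean


def average_target_by_value(records, dimension, target):
    """Average target for each distinct value of a dimension."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[dimension]].append(rec[target])
    return {value: mean(values) for value, values in groups.items()}


def average_target_by_bin(records, measure, target, n_bins):
    """Average target per equal-width bin of a measure (binning is assumed)."""
    lo = min(rec[measure] for rec in records)
    hi = max(rec[measure] for rec in records)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all values identical: everything lands in bin 0
    groups = defaultdict(list)
    for rec in records:
        # The maximum value would fall just outside the last bin, so clamp it.
        index = min(int((rec[measure] - lo) / width), n_bins - 1)
        groups[index].append(rec[target])
    return {
        (lo + i * width, lo + (i + 1) * width): mean(values)
        for i, values in sorted(groups.items())
    }


# Hypothetical entity-level records (one per customer).
records = [
    {"Region": "North", "Discount": 0.0, "Gross Margin": 150.0},
    {"Region": "South", "Discount": 10.0, "Gross Margin": 70.0},
    {"Region": "North", "Discount": 20.0, "Gross Margin": 30.0},
    {"Region": "South", "Discount": 5.0, "Gross Margin": 90.0},
]

by_region = average_target_by_value(records, "Region", "Gross Margin")
by_discount = average_target_by_bin(records, "Discount", "Gross Margin", n_bins=2)
print(by_region)
print(by_discount)
```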

In this case there is a record in the data for every customer name. This record contains data at the customer level such as the aggregated Gross Margin for that customer and any dimension values that are unique for that customer.


The Unexpected Values page lists records in the data where the value predicted by the predictive model is very different to the actual value in the data. These values are significant because the predicted value is based on the patterns generally found in the data, so these records are exceptions to the general rule. In this example, the value of Gross Margin for these customer names is different from that predicted by the behaviour of the other customers. These customers may be interesting to the analyst as they may reveal special cases that require investigation or may show issues with the underlying data quality.
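The idea of comparing predicted and actual values can be sketched as follows. The predicted values here are made-up inputs rather than the output of a real model, and ranking by absolute difference is an assumption for illustration; the article does not state how Smart Discovery decides which records count as unexpected. The function `unexpected_values` and the field names are hypothetical.

```python
def unexpected_values(records, actual, predicted, top_n):
    """Return the top_n records with the largest gap between actual and predicted."""
    ranked = sorted(
        records,
        key=lambda rec: abs(rec[actual] - rec[predicted]),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical entity-level records with made-up predictions.
customers = [
    {"Customer Name": "Acme", "Actual": 150.0, "Predicted": 140.0},
    {"Customer Name": "Bolt", "Actual": 70.0, "Predicted": 200.0},
    {"Customer Name": "Cyan", "Actual": 30.0, "Predicted": 45.0},
]

outliers = unexpected_values(customers, "Actual", "Predicted", top_n=2)
for rec in outliers:
    gap = rec["Actual"] - rec["Predicted"]
    print(rec["Customer Name"], gap)
```

A negative gap means the customer's actual Gross Margin is lower than the general pattern would suggest, and a positive gap means it is higher.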

The influencers are listed with an indication of the relative impact the selected values have on the expected value. In this example, a customer with these properties has an expected Gross Margin of 2,278,419.


Summary

With Smart Discovery, users can apply automated machine learning to their BI data directly in SAP Analytics Cloud, without the need for any data science or machine learning expertise. By simply specifying your business question, you can benefit from insights generated by automated machine learning. Eliminating manual data preparation can be a game changer: you can quickly run an analysis, review the results, modify the settings and run a further analysis, iteratively building a better understanding of your business data. This simple process supports better decision making and story building while generating useful content.


