How to create your own predictive model

Why choose this solution?

As we discussed in the first blog article of our “predictive series”, we as analysts need to provide businesses with important insights: insights that drive decisions across the company by improving customer experience, identifying factors that affect performance and exposing hidden risks.

We have to work through many steps to build a single model that predicts business outcomes well. First, we need to clean and transform the data and choose the right algorithms. Then come training, tuning and testing, until our predictive model is finally ready to be deployed into business processes.

But just when we think we can move on, the data changes and our model needs to be updated. On top of that, other business decisions may call for other predictive models.

How can we keep up?

The answer is SAP BusinessObjects Predictive Analytics. We only need to feed in the data; the tool then automates model development, generating and testing a model for each decision. We can then deploy the model across the enterprise, where it continually supports decision-making.

Another important benefit of this solution is that it can reduce our timelines from weeks or even months to just days. And when the time comes to update the models based on our latest data, it does that too, so they stay relevant.

Creating a model and making predictions based on it

In this article we are going to create a predictive model, which we will use later for further predictions. I am going to show you how simple it is to build a successful model based on a meaningful dataset.

First, I used a public dataset from Kaggle, which is a good starting point for anyone who wants to get started with machine learning. I chose the Rossmann Store Sales data because it provides historical sales, over a couple of years, for 1,115 Rossmann stores.

You can find the dataset at the following link: https://www.kaggle.com/c/rossmann-store-sales

The task was to predict the “Sales” for each store for a specific year (e.g. 2015).

8 Steps to your predictive model

Step 1
Let’s assume that you have already installed SAP HANA Studio and SAP BusinessObjects Predictive Analytics on your computer. You also need a connection to an SAP HANA server to be able to use the HANA database and engine.

Step 2
Gather the data and load the CSV files into the SAP HANA database. To create our model, we used store.csv, which holds the information about the stores, and train.csv, which holds the historical data including the Sales (Fig 1.1).

SAP HANA Database

Fig 1.1

Step 3
Create a Calculation view for each dataset (Fig 1.2). Then, join them into another Calculation view, which you will use later in the Predictive Analytics tool. The final schema of the views is presented in Fig 1.3.

Calculation View

Fig 1.2

Final View

Fig 1.3
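
To make the join from Step 3 more concrete outside of HANA, here is a minimal Python sketch of the same idea: each historical sales row is enriched with the attributes of its store. It is not the Calculation view itself. The tiny inline CSV texts, their values and the helper names read_rows and join_on_store are made up for illustration, and the column names are assumed from the Kaggle data description.

```python
import csv
import io

# Tiny inline stand-ins for store.csv and train.csv.
# Values are made up; column names are assumed from the Kaggle description.
STORE_CSV = """Store,StoreType,Assortment
1,c,a
2,a,a
"""

TRAIN_CSV = """Store,Date,Sales,Open
1,2015-07-31,5263,1
2,2015-07-31,6064,1
1,2015-07-30,5020,1
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_store(train_rows, store_rows):
    """Inner-join each sales row with its store attributes on 'Store'."""
    stores = {row["Store"]: row for row in store_rows}
    joined = []
    for row in train_rows:
        store = stores.get(row["Store"])
        if store is not None:
            # Store attributes first, then the sales columns.
            joined.append({**store, **row})
    return joined


joined_rows = join_on_store(read_rows(TRAIN_CSV), read_rows(STORE_CSV))
for row in joined_rows:
    print(row)
```

Each printed row carries both the sales columns and the store columns, which is the shape the joined view hands over to the Predictive Analytics tool.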

Step 4
Now, let’s move on. When you open the SAP Predictive Analytics application, you will find the Expert Analytics toolset (Fig 1.4). Expert Analytics is a statistical analysis and data mining tool that lets you build predictive models and use them to make predictions about future events.

Expert Analytics Toolset

Fig 1.4

Step 5
Add a new dataset. There are two options. The first is Download from SAP HANA, as shown in Fig 1.5: the data is saved on your local machine and the model runs locally, too. The second is Connect to SAP HANA: your data is stored, and your model runs, you guessed it, in HANA.

Add new dataset

Fig 1.5

Step 6
Connect to your SAP HANA server using your credentials as shown in Fig 1.6.

Connection to SAP HANA

Fig 1.6

Step 7
Find the view that we created before, and press Create (Fig 1.7).

Create

Fig 1.7

Step 8
In the Designer screen we construct our predictive schema (Fig 1.8). First, we sample the data, keeping only the records from 2015. Then, in the partitioning block, we split the data into training, testing and validation sets. To predict the Sales, we use the Auto Regression linear algorithm, in which the output variable depends linearly on its own previous values and on a stochastic term. The final block, Model Statistics, is used to display the results.

Designer Screen

Fig 1.8
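
For readers who want to see the chain from Step 8 in code, the following Python sketch mirrors its three blocks: keep only the 2015 records, partition them, and fit a model in which each value depends linearly on the previous one, x[t] = c + phi * x[t-1] + noise. This is a plain first-order autoregressive fit chosen to match the description above; the tool's internal algorithm is not shown in this article and may differ. The 60/20/20 split, the synthetic sales values and all function names are assumptions made for illustration.

```python
from datetime import date, timedelta


def filter_year(rows, year):
    """Keep only (date, sales) pairs from the given year, in date order."""
    return sorted((d, s) for d, s in rows if d.year == year)


def partition(values, train=0.6, test=0.2):
    """Split chronologically into training, testing and validation parts."""
    n_train = int(len(values) * train)
    n_test = int(len(values) * test)
    return (values[:n_train],
            values[n_train:n_train + n_test],
            values[n_train + n_test:])


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] on one series."""
    prev, curr = series[:-1], series[1:]
    mean_prev = sum(prev) / len(prev)
    mean_curr = sum(curr) / len(curr)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_curr - phi * mean_prev
    return c, phi


# Synthetic daily sales (not real Rossmann data): a noise-free series
# that follows x[t] = 1000 + 0.8 * x[t-1], so the fit is easy to check.
sales = [4000.0]
for _ in range(9):
    sales.append(1000.0 + 0.8 * sales[-1])

rows = [(date(2014, 12, 31), 3900.0)]  # one record outside 2015
rows += [(date(2015, 1, 1) + timedelta(days=i), s) for i, s in enumerate(sales)]

series_2015 = [s for _, s in filter_year(rows, 2015)]
train_part, test_part, valid_part = partition(series_2015)

c, phi = fit_ar1(train_part)
# One-step-ahead forecast for the first testing value.
forecast = c + phi * train_part[-1]
print(c, phi, forecast, test_part[0])
```

Because the synthetic series has no noise, the fitted c and phi land very close to the values used to generate it. With real sales data the fit would only be approximate, and a higher model order or additional input variables would likely be needed.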

Results
After running this model, we got the following result (Fig 1.9). As expected, the predicted values (the green line) closely match the actual values (the blue line). On the Summary tab we can see the algorithm summary and the distribution of the data over the training, testing and validation sets (Fig 2.0).

Result

Fig 1.9

Result

Fig 2.0
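
Comparing the two lines by eye is a good first check. If you want a number for how closely predictions follow the actual values, two common choices are the root mean squared error (RMSE) and the mean absolute percentage error (MAPE). The sketch below is a generic Python illustration, not output from the tool: the figures are made up, and skipping days with zero actual sales in the MAPE is an assumption made here to avoid dividing by zero.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error over all value pairs."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error; pairs with a zero actual are skipped."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


# Made-up example values; the third day has zero actual sales.
actual = [5000.0, 5200.0, 0.0, 4800.0]
predicted = [5100.0, 5096.0, 50.0, 4800.0]

print("RMSE:", rmse(actual, predicted))
print("MAPE (%):", mape(actual, predicted))
```

The lower both numbers are, the closer the predicted line sits to the actual one.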

Conclusion

SAP BusinessObjects Predictive Analytics lets you transform complex patterns hidden in your data into actionable insights that can drive customer engagement and sales, improve operational performance and reduce risk across the organization. That can have a considerable impact on your bottom line. All of this can be highly automated, making it easier to maintain peak performance and manage hundreds of models.

My suggestion is that you seriously consider reimagining your digital enterprise with the SAP BusinessObjects Predictive Analytics tool.

In case you have any questions, please contact us! We will be happy to help.

Sources: https://www.youtube.com/watch?v=hSoYane8TSw

Author
Andrei Orbai, Associate SAP BI
Phone: +49 (0) 7031 714 660 0
Email: cluj@inspiricon.de