
Machine Learning with SAP Hana and SAP Fiori

What is Machine Learning and why is it important?

Well, first of all, it is currently a hype topic. So it is important to at least know what the rest of the world is talking about.

And the reason it is a hype is that machine learning is bringing huge advances in various fields. It not only gives computers the possibility to perform certain tasks, it also enables a computer to first learn the rules of performing a given task (learning from experience, i.e. from historical data).

Take the healthcare field, for example: machine learning algorithms are successfully used to spot signs of severe illnesses (breast cancer, for example) as early as possible and reduce the risk to patients.

Financial institutions also use machine learning algorithms for fraud detection and to combat money laundering. These algorithms are able to analyse millions of transactions and point out those that indicate suspicious patterns.

In the online security field, machine learning algorithms are used to track suspicious behaviour and detect privacy intrusions.

We should also not forget that we all use machine learning in our daily lives. Whether it is Siri we summon on our Apple device or Alexa on our smart speaker, whether we use social networks on the internet or Google Maps in our car, the core of these systems is powered by machine learning algorithms.

And in the daily operating business of companies, machine learning algorithms automate basic tasks that would otherwise be done manually, such as analysing invoices and orders to detect duplicates.

In the field of BI, one of the reasons machine learning is important is that it is part of the techniques used in predictive analytics. This gives employees the possibility to predict certain future results. Salespeople, for example, can predict their sales volume; managers can evaluate multiple predictions of how certain decisions might impact future results and base their decisions on these.

How does it work?

Let’s take a look at this quiz:

  • 2 → 4
  • 3 → 9
  • 5 → 25
  • 6 → ?

Now why have you been able to figure out that 36 is the right answer? Because you have recognized a pattern. And that is exactly what machine learning algorithms are doing. They are trained on sets of sample data where they learn to recognize patterns and match these patterns to the correct responses (they learn from experience). After the training we can query the algorithm for a response by providing it with a new set of data, and what we get back is (hopefully) an accurate response.

Machine learning algorithms are designed to work on problems much more complex than the quiz presented above, with a great number of input dimensions. This enables them to perform complex tasks like image or speech recognition, or forecasting potential sales results based on complex historical market data.

Machine Learning with SAP Hana and Fiori

Since machine learning is such a hot topic, it generates a lot of curiosity and desire to experiment, and that was also the case for us. We at Inspiricon became curious how this new field could bring added value to the areas we are already working in, which include BI, SAP Fiori and SAP HANA.

Well, it turns out SAP HANA already has pretty robust support for machine learning. SAP provides the SAP HANA Predictive Analysis Library (PAL), which offers the possibility to use machine learning algorithms and even build neural networks. Combining its power with SAP Fiori, it is possible to build some interesting applications in the field of predictive analytics. For example, we were able to build a small Fiori application to predict the daily and monthly sales figures for individual stores within a supermarket chain. The following illustration shows a rough overview of the application's architecture:

Architecture_Application

The Fiori application we developed is targeted at managers, who can explore the forecast up to the end of the year from within the application. We are also experimenting further with this scenario and investigating how to extend it with other features, such as the integration of what-if scenarios, so that one can investigate how certain management or marketing decisions (like promotions) influence the predicted sales:

Fiori_Predictive_Demo

Conclusion

Machine learning can already be tackled with a simple Hana backend!

While there are powerful tools out there like TensorFlow for neural networks or SAP Predictive Analytics, it is important to know that these are not strictly necessary in order to approach the topic. As explained above, SAP HANA already provides the means to build such solutions, and with SAP Fiori it is possible to build a UI application tailored to the specific scenario being implemented. The preliminary data analysis can be performed with powerful, freely available data analytics tools for Python (pandas) or R. So, with no additional licensing or infrastructure costs, this can be a very attractive approach, especially for smaller problems that do not require intensive data processing.

Which approach is finally chosen, however, depends on the specific use case and should be properly evaluated by the development team. The maintenance of the solution and the license costs will also be important factors for the owner of the solution and must be taken into account when making a decision.

Image sources: Inspiricon

Author
Gerald Iakobinyi-Pich Solution Architect
Phone: +49 (0) 7031 714 660 0
Email: cluj@inspiricon.de

Classic DataStore Object vs. Advanced DataStore Object

There have been many architecture-level changes in SAP BW/4HANA. One of these changes concerns data modeling.

In this article we will walk through the various features and capabilities of ADSOs, as well as explore how these capabilities help to optimize various tasks in your SAP BW environment.

First, we will talk about the classic DSO and its features. After that, I will present the differences between the classic DSO and the newly introduced ADSO.

DSO (Data Store Object)

What is DSO?

A DSO is a two-dimensional storage unit which mainly stores transaction data or master data at the lowest granularity. The data is stored at a detailed level.

Types of DSO

When creating a DSO, you must choose the type:

dso

When we create a DSO, the system sets the option 'SIDs Generation upon Activation' by default. This option can be found in the settings in the DSO's edit mode. If this option is checked, the system checks the SID values for all the characteristics in the DSO during activation. If a SID value for a characteristic does not exist yet, the system generates it. Generating the SIDs during activation improves query runtime performance, because the system does not have to generate SIDs at query runtime. SID values are always stored in the SID table of an InfoObject; using this SID, the attributes and texts of a master data InfoObject are accessed. The SID table is connected to the associated master data tables via the characteristic key.

The following table shows the properties and architecture of the different DSO types:

table-DSO-types

ADSO (Advanced Data Store Object)

The advanced DSO replaces all of these classic modeling objects:

BW4HANA-Modeling-Objects

Before we create an ADSO, we should know that it consists of 3 main tables:

  1. Inbound Table
    • Activation queue table for classic DSO
    • Uncompressed fact table of non-SAP HANA optimized InfoCube
    • All records are stored with a technical key
  2. Table of Active Data
    • Same as classic DSO, contains the current values after activation.  The key of the table is the DSO-Key (more about keys later)
    • Compressed fact table of non-SAP HANA optimized InfoCube
  3. Change Log
    • Same as classic DSO
    • Stores the differences between the inbound table and the active data table
    • Needed for delta generation

Important Steps in Creating an ADSO

We create an ADSO in the BW Modeling Tools (BWMT) in Eclipse, like all new objects (in BW 7.5 you still have the possibility to create the classic objects in SAP GUI; in BW/4HANA you can only create the new objects in the BWMT).

In the General tab you can configure activation settings and other properties. First, the user must enter a description. After that, we have the possibility to choose a model template. Via these templates, the ADSO can behave like one of the objects from classic BW:

template-adso

  • Acquisition Layer

In this layer you can create objects that cover the “write-optimized” use cases of the classic DSO. It is divided into 3 sub-layers:

  1. Data Acquisition Layer
    • Corresponds to a persistent staging area (PSA) and acts as an incoming storage area in BW for data from source systems
    • No use of Active Table, so activation is not needed
    • Requests will be loaded into and extracted from the inbound table
    • All the records in the Inbound Table contain a request transaction sequence number (TSN), data packet, and data record number
    • The inbound table (formerly the new data / activation queue table) is accessed when executing a BEx query and for extraction
    • Data doesn’t get aggregated
  2. Corporate memory with compression feature
    • Requests will still be loaded into the inbound table
    • Old requests that are no longer needed on detailed level can be compressed into the active data table.
    • To save memory space, the CM – compression ADSO doesn’t use a Change Log table, only an Inbound Table and an Active Data Table.
    • As soon as a load request is activated, the system loads the data into the Active Table and deletes it from the Inbound Table
    • If there are 2 records with the same key, BW/4HANA overwrites all the characteristics of the record with the characteristics of the most recently loaded record.
  3. Corporate memory with reporting option
    • One difference between this template and the “Corporate memory with compression feature” template is that the system does not erase data from the Inbound Table. Instead, the data also remains in the Inbound Table so that none of the technical information is lost.
    • The CM reporting template has no Change Log though
    • Another difference is that the data is not extracted from the Active Table but from the Inbound Table
    • Because the data remains in the Inbound Table after activation, these ADSOs are a good solution when you want to store data but save space by not using a Change Log
  • Propagation Layer
    • Provides a basis for further distribution and reuse of data
    • Corresponds to a standard DataStore object (classic)
    • Requests will be loaded into the inbound table
    • For reporting the user must activate the loaded requests
    • The data is then transferred into the active data table and the delta is stored in the change log
    • The change log is also used to roll back already activated requests
  • Reporting Layer
    • Used to perform queries for analysis
    • Corresponds to a standard InfoCube
    • The inbound table acts as “F”-table and the active data table as “E”-table
    • It does not have a Change Log. Without a change log table, the delta process cannot be carried out.
    • After activation, the Inbound Table is empty
    • The user reports on a union of the inbound table and the active data table
  • Planning Layer

It splits into 2 sub-types:

  1. Planning on Direct Update
    • Data is automatically loaded into the Active table, so no need for activation
    • It has no Change Log or Inbound Table
    • You can fill the Active table with an API
    • You can also load data into this type of ADSO using a DTP
    • There is only an overwrite option; no summation of key figures as in the planning on cube-like ADSO
  2. Planning on Cube-like
    • Has an Inbound Table and an Active Table
    • All characteristic fields are marked as key fields in the Active Table, which is a necessary requirement to make it suitable for planning.
    • There is only a summation option

Process of SID generation highly optimized for HANA

To optimize performance, in BW/4HANA it is possible to set this flag not only at InfoProvider level, but individually per characteristic of the DSO. The data integrity check is then only executed for the selected characteristics.

InfoObjects/Fields

As a new feature, you can use fields with simple data types instead of InfoObjects. To do so, go to the Details tab and click the Add Field button. Under Identify, you can specify in the “With” dropdown menu whether you want to use an InfoObject or a field for the definition.

InfoObject

In BW the user can choose whether to model with InfoObjects or with fields. Modeling with InfoObjects requires extra effort, but also brings a lot of advantages. Before you choose one of these options, you should consider the advantages and disadvantages of both modeling approaches.

In the following I will present some of the advantages and disadvantages of choosing to model with fields:

Advantages when using fields:

  • If the query contains fields, it can be processed key-based in SAP HANA
  • Using fields can enhance the flexibility and range of the data warehouse, when the data volume is small.

Disadvantages when using fields

  • The services for InfoObjects (attributes and hierarchies for example) are not available for fields.
  • Validity characteristics for DataStore objects (advanced) with non-cumulative key figures must be InfoObjects.
  • InfoObject attributes must be InfoObjects
  • A field-based key figure cannot have an exception aggregation
  • Planning queries on DataStore objects (advanced) are only supported with fields as read-only
  • If fields are used in the query, the InfoProviders can only be read sequentially
  • In the query on a CompositeProvider, not all data types for fields are supported (ex. maximum length for fields is 20 characters)

Defining Keys for an ADSO
In this tab, we also select which fields make up the key of our ADSO. To define a key, click on the Manage Keys button.

fields-adso

Key Fields

There are 2 types of keys: primary and foreign keys.

Advantages of using Key fields:

  • Key fields uniquely identify a record in a table.
  • Key Fields cannot be NULL
  • Used to link two tables
  • Main purpose of a foreign key is data validation.
  • Read Master Data: using the input field value as a key, you can read the value of a Characteristic attribute belonging to a specified Characteristic
  • Read from advanced DataStore: using the input field value(s) as a (compounded) key, you can read the data fields of a specified advanced DataStore Object (DSO)
  • Something to be aware of: if 2 records have the same key, BW/4HANA overwrites all the characteristics of the record with the characteristics of the most recently loaded record

Disadvantages of not using key fields:

  • Records are not uniquely identified => duplicate records are allowed
  • Performance is affected

Benefits of using an ADSO instead of a classic DSO:

  • Simplification of object types
    • Can behave like 4 Objects from the classic BW
  • Flexibility in data modeling
    • Modeling your ADSO using the Reporting Layer settings
  • Performance of data loads and activation is optimized for HANA as ADSO is a HANA native object.

Source of images: SAP SE, Inspiricon AG

Author
Roxana Hategan Associate
Phone: +49 (0) 7031 714 660 0
Email: cluj@inspiricon.de

Take advantage of ABAP Development Tools in Eclipse

When did it all start?

ABAP-1992

As time passed by, solutions have constantly improved and so did ABAP development tools.

Independent from the classical SAP GUI-based development environment, there is also the possibility to develop ABAP applications in Eclipse.

ABAP-2009

This brought a great development experience on board of an Integrated Development Environment (IDE).

ABAB-switch-to-Eclipse

In Eclipse you can use several tools in one program. Add-ons like SAP HANA Database Studio, ABAP Development Tools, SAPUI5, BW Modeling Tools and many more were incorporated, to the delight of every SAP consultant. In this article, we will focus on the installation and use of the ABAP Development Tools (ADT) within Eclipse. Before starting with the introduction, let me give you the background of our environments.

Prerequisites

  • Eclipse IDE – get a suitable installation that supports the desired tools

Note that support is no longer maintained for Mars version. It is recommended to use Oxygen or Neon.

  • JRE (Java Runtime Environment) 1.6 or higher
  • SAP GUI 7.2 or above
  • Operating system (OS): Windows 7, Mac, Linux


In the following procedures, we used the Eclipse Neon.3 release (4.6.3) and SAP GUI 7.4 on Windows 7 Professional, 64-bit.

Setting up the ABAP Development Tools (ADT)

Assuming you already have SAP GUI and Eclipse installed on your computer, let us explore the ABAP world. In order to install the plugin needed for ABAP development, go to the Help menu in the Eclipse window and choose Install New Software.

Install-new-software-eclipse-5

We now need to provide the location of the ABAP Development Tools (ADT). Press Add and enter the path to the required packages (for our Eclipse version this is SAP's tools update site, e.g. https://tools.hana.ondemand.com/neon).

select-site-location-eclipse-6

browsed-location-eclipse

Since Eclipse is a platform that serves as an integrated home for many add-ons, we will take advantage of this and download all the components needed to enjoy ABAP on HANA development.

filter-text-eclipse

All you have to do is to accept the license agreement and press Finish to start the installation. If you already have some of the components installed, an update will be performed instead.

installing-software-eclipse

After the packages are downloaded and installed, Eclipse must be restarted.

restart-eclipse

The Eclipse Welcome page should appear after the restart.

Do not worry if the content looks different. It usually varies from one product version to another and only presents an overview of the product and its features.

overview-eclipse

Now that the installation is complete, all we have to do is open the ABAP perspective. From the Window menu, go to Perspective → Open Perspective → Other.

open-abap-option

Select the ABAP option and then close the Welcome page.

select-abap-option-eclipse

So what does the ABAP perspective look like in Eclipse?

The ABAP tools are conveniently arranged on screen to ease your development work. Moreover, you are free to adapt the views according to your personal needs.

views-adaptation-abap

Before starting with the real coding, we have to make sure that services are activated in the ABAP backend system.

Logon to SAP GUI and go to transaction SICF. Type DEFAULT_HOST into Virtual Host field and press Execute.

activated-services-abap

There are two things that must be checked here:

1. Check for Services

Expand the default_host node and navigate to sap → bc → abap. Right click on items docu and toolsdocu and choose Activate Service from the context menu.

 

check-services-abap

2. Check for Links

The second thing to do here is to verify that the paths to some particular ABAP links are available for development.

Expand the default_host node and navigate to sap → public → bc → abap. Right-click on the items docu and toolsdocu and choose Activate Link from the context menu.

check-links-abap

You are ready to connect to the SAP system.

Now let’s get started with the ABAP development in Eclipse

As a first step into developing ABAP in Eclipse, I recommend starting with a simple program. To accomplish this, we will now create a „Hello World” project.

In order to create a new Project, navigate to File → New → ABAP Project.

new-project-abap

The ABAP backend must be configured for development purposes. Here we need to define the system connection details.

There are two ways to connect to a system:

1. Creating a new connection

In the dialog window that appears on the screen, click on new system connection.

create-new-connection-abap

Your system ID, application server name and instance number should be entered here.

server-name-abap

2. Choosing an existing system connection

If you have already connected to a system before, you just have to select it from the list of available connections.

existing-system-connection-eclipse

Type in the credentials you use in SAP GUI and press Next to log on and retrieve compatibility information from the backend system.

retrieve-compatibility-eclipse

If the connection was successfully established, type a meaningful name for your project and choose Finish.

project-naming-eclipse

Your newly created project should now appear in the Project Explorer view.

appearance-new-project-eclipse

In order to get started with coding, we need a program. Right-click on a package under the project name node and go to New → ABAP Program.

break-coding-abap

Choose a technical name for your program and a suitable description and go to the next step.

technical-name-abap-project

In this step, a transport request on which you want to save the program must be chosen; otherwise, you have to create a new request.

In our case, for the $TMP package, there is no need for selecting a transport request.

$TMP-package

The ABAP editor will open. Enter the ABAP code shown below or type your own message to be displayed on the screen.

type-message-display

Save and activate your program. Run it as an ABAP Application.

run-abap-application

We have now successfully created the ABAP „Hello World” project. Enjoy the magic!

Conclusion

Fortunately, IT professionals are cooperating and looking to optimize the ABAP stack for HANA and to jointly develop and exchange ABAP code in a modern IDE, taking into consideration that SAP HANA is a technology game changer.

In the near future, I will present a comparison of some distinct functionalities implemented in Eclipse and in SAP NetWeaver and how they look.

Stay tuned for more insights and developments in the ABAP world. I would be happy to read your questions and comments.

Source of images: SAP SE, Inspiricon AG

Author
Andra Atanasoaie Associate
Phone: +49 (0) 7031 714 660 0
Email: cluj@inspiricon.de

Comparison between the modelling in SAP BW and SAP BW/4HANA application

“We study the past to understand the present.” – William Lund

More and more customers approach us to learn more about BW 7.5 and BW/4HANA. All the more reason for us to start a new blog series to take a closer look at this subject. Let us start by examining the history of SAP BW and then move on to outlining the subject areas we will be covering over the coming weeks.

For this article, the past refers to the SAP BW modelling and the present to SAP BW/4HANA modelling.

Most organizations and individual users are still not sure what the differences between SAP BW (classic modelling) and SAP BW/4HANA are. The purpose of this article is to put things into perspective and to provide you with a clear answer on this topic.

SAP BW History – a short overview

Inspiricon_SAP-BW-History-Overview

What are the differences between SAP BW and SAP BW/4 HANA?

One of SAP’s main goals is to simplify the system. Consequently, it bundles together objects and processes and reduces the number of steps involved.

1.  Modelling Objects

A quick comparison between the modelling objects accessible in the classic SAP BW application and those in SAP BW/4HANA may help illustrate the level of modelling simplification accomplished.

sap_classic-sap-bw_sap-bw4hana_objects

In the upcoming articles in our series we will introduce you to the new Providers, starting with ADSOs.

2. Data Flows

The central entry point for modelling in SAP BW∕4HANA is the data flow. This defines which objects and processes are needed to transfer data from a source to SAP BW∕4HANA and cleanse, consolidate and integrate the data so that it can be made available for analysis and reporting. SAP BW∕4HANA is using a new integrated layer architecture (Layered Scalable Architecture – LSA++).

Classic SAP BW uses LSA, the predecessor of LSA++. This architecture is more restrictive and less flexible in how it handles the data.

Inspiricon_Comparison_Classic-BW_BW4HANA

One of the major benefits of using LSA++ is the reduction in the number of persistence layers. This has two effects:

For one, it improves data processing performance: You spend far less time saving and activating!

Second, this reduces the data volume. Given that storage space was not considered a critical factor, redundancies used to be deliberately introduced in the BW system to improve read performance. But with the advent of HANA, things changed profoundly. Main memory is expensive, both in terms of hardware when compared to hard disk storage, and in licensing terms, as HANA is licensed by main memory. Another benefit is that the reduction in “physical” layers allows for far more flexibility in system design.

3. Source Systems

SAP is also pursuing its simplification approach when it comes to the source systems.

SAP BW∕4HANA offers flexible ways of integrating data from various sources. The data can be extracted from the source, transformed, and loaded into the SAP BW system, or it can be accessed directly in the source for reporting and analysis purposes, without storing it physically in the Enterprise Data Warehouse.

sap-bw4hana-simplified-source-systems

a) SAP HANA Source System

  • this connectivity can be used for all other databases (e.g. Teradata, Sybase IQ, Sybase ASE).

b) SAP Operational Data Provisioning (ODP)

  • acts as the hub for all data flowing into BW from external sources
  • used exclusively with SAP Landscape Transformation (SLT), SAP ERP Extractor (SAP Business Suite), HANA Views and SAP BW.
  • The PSA no longer exists with the new ODP concept which provides a much faster extraction mechanism.

With those two connectivity types, data can be made available in batch mode, using real-time replication or direct access.

HANA views are automatically generated within the SAP HANA database after you activate the objects (e.g. ADSO, CompositeProvider).

4. Performance

As pointed out in connection with LSA++, data processing is much faster with HANA. While data flows were all about streamlining the architecture, there are also a number of tangible benefits in terms of technical performance:

In addition to what classic SAP BW provides, SAP BW/4HANA offers in-memory data warehousing:

  • No Aggregates or Roll-up Processes
  • No Performance specific Objects
  • Fewer Indexes
  • Faster Loading and Processing

SAP is going in the same direction with the ability to move transformations directly to the database, the so-called push-down.

The performance that SAP BW/4HANA offers is ensured by this algorithm push-down.

sap-bw4hana_algorithm-push-down

This is one of the subjects that we will be discussing in one of our next articles.

Source of images: SAP SE, Inspiricon AG

Author
Roxana Hategan Associate
Phone: +49 (0) 7031 714 660 0
Email: cluj@inspiricon.de

How To Use Predictive Analytics In Native Hana

In this article we will be focusing on HANA’s native predictive analytics capabilities, explaining step by step how to use the PAL (Predictive Analysis Library), which is part of the AFL (Application Function Library), to create a multiple linear regression model and how to use that model for predictions.

You want to get to know more about Predictive Analytics? Read our two other articles about this topic:

  1. Welcome to the World of Predictive Analytics
  2. How to Create Your Own Predictive Model

WHAT IS PAL?

We have previously discussed predictive analytics using SAP BusinessObjects Predictive Analytics. Another way to create trained models and predictions based on those models natively in HANA is by utilizing the PAL (Predictive Analysis Library). This method is more complex and requires the user to have more technical knowledge.

While the application offered by SAP is more user-friendly, implementing HANA PAL solutions offers greater possibilities for including the data in other applications by creating a process that is native to HANA.

The easy-to-use Predictive Analytics tool brings high costs, mainly caused by the license fee – in contrast to HANA PAL, where the user only needs the knowledge of how to use the PAL functions.

HOW DOES PAL WORK?

There is a great number of functions offered by HANA that can be implemented for different scenarios (clustering of data, classification, regression, time series forecasting, etc.).

In this article we will be focusing on one function from PAL: Multiple Linear Regression.

We will be using the same data (Rossmann dataset) that we utilized in the second article of our Predictive Analytics series (How to create your own predictive model).

The dataset is offered by the Rossmann store chain on Kaggle, a website where people and organizations can upload real datasets for competitions or for the purpose of helping people develop data science skills.

For a detailed view of the code you will need to build a multiple linear regression, we provide 2 documents that will help you get through it:

PREREQUISITES AND DATA HANDLING

Let us assume that you have already installed SAP HANA Studio on your computer. A SAP HANA server connection is also required to be able to use the HANA database and engine. It is recommended to have the latest version of the HANA database (2.0). In our example, both algorithms are supported by HANA 1.0 SPS 09 and HANA 2.0 SPS 01.

In our scenario we have imported the data found in train.csv and store.csv into our HANA database and exposed it as a calculation view.

In the train.csv file we have data regarding the sales of different stores and the dates when the sales were made, as well as information regarding promotions, customer numbers, day of the week, etc.

In the store.csv file we have master data for each store, such as the distance to the store’s competition, the type of the store, and the assortment of the store.

In figure 1.1 we can see the two data tables:

Fig. 1.1

In our HANA calculation view we join the two files in order to create a complete dataset, so that our algorithm can build an accurate model for a precise prediction (as seen in Fig. 1.2).

Fig 1.2

PAL ALGORITHM IMPLEMENTATION

In our scenario we will implement the prediction using multiple linear regression.

1. WHAT IS MULTIPLE LINEAR REGRESSION (MLR)?

Linear regression is an approach to modeling the linear relationship between a variable y, usually referred to as the dependent variable, and one or more variables x1, …, xn, usually referred to as the independent variables (the predictor vector). In linear regression, data is modeled using linear functions of the form y = b0 + b1·x1 + … + bn·xn, and the unknown model parameters b0, …, bn are estimated from the data. Such models are called linear models.

2. HOW TO CREATE AN MLR MODEL IN 7 STEPS

Step 1: Definition of parameters

In order to create a multiple linear regression model, the algorithm needs 2 input tables and 4 output tables, and we need to create the type definitions for these tables.

The parameter tables are the following:

a. Data – contains the columns used for training the model. The data must contain a primary key column (“ID”) that must be the first column defined. The second column must be the target/dependent column (the “Y variable” that we will be predicting). The rest of the columns are populated with the independent variables (X1, X2, …) that help the algorithm build an accurate model.

b. Significance – contains the coefficients of the model and their values (the higher the value, the more effect the coefficient has on the model)

c. Fitted – The fitted values (“Sales” in our case) of the target

d. Results – the results of the model (model accuracy and confidence)

e. Exported model – the model that will be used for future predictions.

f. Control – the parameters that specify how the algorithm should work (here we can enable and calibrate the elastic net penalties, specify the number of threads, choose the algorithm used to solve the least squares problem, etc.)

Step 2: Creating the Parameter Table

After creating the type definitions for the model, we have to create the table that will contain all the parameters, and we need to specify whether each table is an input or an output parameter (the Data and Control tables are input parameters and the rest are output).
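To make steps 1 and 2 more tangible, here is a minimal SQLScript sketch. All schema, type and column names (and the choice of independent variables) are our own assumptions for the Rossmann scenario; the exact structures of the output tables should be taken from the PAL documentation.

```sql
-- Step 1: table type definitions (names and columns are illustrative)
CREATE TYPE PAL_MLR_DATA_T AS TABLE (
    "ID"        INTEGER,   -- primary key, must be the first column
    "SALES"     DOUBLE,    -- target / dependent variable (Y)
    "PROMO"     DOUBLE,    -- independent variables (X1, X2, ...)
    "CUSTOMERS" DOUBLE
);
CREATE TYPE PAL_MLR_CONTROL_T AS TABLE (
    "NAME" VARCHAR(100), "INTARGS" INTEGER, "DOUBLEARGS" DOUBLE, "STRINGARGS" VARCHAR(100)
);
CREATE TYPE PAL_MLR_COEFF_T  AS TABLE ("VARIABLE_NAME" VARCHAR(100), "COEFFICIENT" DOUBLE);
CREATE TYPE PAL_MLR_FITTED_T AS TABLE ("ID" INTEGER, "FITTED" DOUBLE);
CREATE TYPE PAL_MLR_STATS_T  AS TABLE ("NAME" VARCHAR(100), "VALUE" VARCHAR(100));
CREATE TYPE PAL_MLR_MODEL_T  AS TABLE ("ROW_INDEX" INTEGER, "MODEL_CONTENT" NVARCHAR(5000));

-- Step 2: signature table that tells the wrapper generator which type is IN and which is OUT
CREATE COLUMN TABLE PAL_MLR_SIGNATURE (
    "POSITION" INTEGER, "SCHEMA_NAME" NVARCHAR(256), "TYPE_NAME" NVARCHAR(256), "PARAMETER_TYPE" VARCHAR(7)
);
INSERT INTO PAL_MLR_SIGNATURE VALUES (1, 'MY_SCHEMA', 'PAL_MLR_DATA_T',    'IN');
INSERT INTO PAL_MLR_SIGNATURE VALUES (2, 'MY_SCHEMA', 'PAL_MLR_CONTROL_T', 'IN');
INSERT INTO PAL_MLR_SIGNATURE VALUES (3, 'MY_SCHEMA', 'PAL_MLR_COEFF_T',   'OUT');
INSERT INTO PAL_MLR_SIGNATURE VALUES (4, 'MY_SCHEMA', 'PAL_MLR_FITTED_T',  'OUT');
INSERT INTO PAL_MLR_SIGNATURE VALUES (5, 'MY_SCHEMA', 'PAL_MLR_STATS_T',   'OUT');
INSERT INTO PAL_MLR_SIGNATURE VALUES (6, 'MY_SCHEMA', 'PAL_MLR_MODEL_T',   'OUT');
```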

Step 3: Wrapper Creation

The wrapper creates a procedure that uses the specified PAL function and the previously created parameter table. We will use this procedure later on to create the regression model based on our training data.
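A minimal sketch of the wrapper call, assuming the signature table above (on older HANA revisions the generator was called via SYSTEM.AFL_WRAPPER_GENERATOR instead):

```sql
-- Step 3: generate the training procedure for the PAL multiple linear regression function
CALL SYS.AFLLANG_WRAPPER_PROCEDURE_CREATE(
    'AFLPAL',              -- function area of the PAL
    'LRREGRESSION',        -- PAL function for multiple linear regression
    'MY_SCHEMA',           -- target schema (assumed name)
    'PAL_MLR_TRAIN_PROC',  -- name of the generated procedure (assumed name)
    PAL_MLR_SIGNATURE      -- signature table from step 2
);
```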

Step 4: Population of Data Table

First we create a new physical table with the same columns and column types as the data type definition from step 1, and then fill it with our training data (from the calculation view described above).

Step 5: Population of Control Table

In the control table the user specifies how the algorithm will work. It is similar to creating settings for your program. The following control settings can be specified:

a. THREAD_NUMBER -> specifies the number of threads (only used when algorithm 1, 5 or 6 is selected)

b. ALG -> specifies the algorithm used to solve the least squares problem (the numbers below are the values to set for ALG):

  1. QR decomposition (numerically stable, but fails when A is rank-deficient)
  2. SVD (numerically stable and can handle rank deficiency, but computationally expensive)
  4. Cyclical coordinate descent method to solve elastic net regularized multiple linear regression
  5. Cholesky decomposition (fast but numerically unstable)
  6. Alternating direction method of multipliers (ADMM) to solve elastic net regularized multiple linear regression; this method is faster than the cyclical coordinate descent method in many cases and is recommended.

c. VARIABLE_SELECTION -> “0” to include all variables, “1” for forward selection, “2” for backward selection

d. ALPHA_TO_ENTER -> P-value for forward selection

e. ALPHA_TO_REMOVE -> P-value for backward selection

f. ENET_LAMBDA -> Penalized weight

g. ENET_ALPHA -> elastic net mixing parameter

In our example we will use ALG 6 with elastic net penalties enabled (a method used for the optimization of the regression model) and we will have a thread number of 5.
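A minimal sketch of this control table, assuming the column layout from the type definition above; the two elastic net values are purely illustrative:

```sql
-- Step 5: control table with the settings used in our example
CREATE COLUMN TABLE PAL_MLR_CONTROL (
    "NAME" VARCHAR(100), "INTARGS" INTEGER, "DOUBLEARGS" DOUBLE, "STRINGARGS" VARCHAR(100)
);
INSERT INTO PAL_MLR_CONTROL VALUES ('THREAD_NUMBER', 5,    NULL, NULL);  -- 5 threads
INSERT INTO PAL_MLR_CONTROL VALUES ('ALG',           6,    NULL, NULL);  -- ADMM with elastic net
INSERT INTO PAL_MLR_CONTROL VALUES ('ENET_LAMBDA',   NULL, 0.01, NULL);  -- penalized weight (illustrative value)
INSERT INTO PAL_MLR_CONTROL VALUES ('ENET_ALPHA',    NULL, 0.5,  NULL);  -- mixing parameter (illustrative value)
```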

Step 6: Build output tables

Now all that remains to be done is to create the output tables based on the definitions.

Step 7: Create the model

Finally, we call the procedure that we created in the wrapper, view the results, and create a table in which we can compare the fitted values with the real values.
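A minimal sketch of the final call; the physical tables are assumed to have been created in steps 4 and 6 with the structures of the corresponding types, and the parameter order must match the signature table:

```sql
-- Step 7: train the model by calling the generated procedure
CALL MY_SCHEMA.PAL_MLR_TRAIN_PROC(
    PAL_MLR_DATA,     -- IN:  training data
    PAL_MLR_CONTROL,  -- IN:  control parameters
    PAL_MLR_COEFF,    -- OUT: coefficients / significance
    PAL_MLR_FITTED,   -- OUT: fitted values
    PAL_MLR_STATS,    -- OUT: model statistics
    PAL_MLR_MODEL     -- OUT: exported model for later forecasts
) WITH OVERVIEW;

-- compare fitted and real values
SELECT d."ID", d."SALES" AS "ACTUAL", f."FITTED" AS "PREDICTED"
FROM   PAL_MLR_DATA d
JOIN   PAL_MLR_FITTED f ON d."ID" = f."ID";
```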

Results

The first result is the coefficient table, which tells us the impact of each coefficient on the model; a value close to 0 means the corresponding variable has almost no effect on the created model.

Fig. 1.3

The next output table will show us the fitted values of the training model data:

Fig 1.4

The last table will give us the statistics of the created model

  • R2 – the explanatory power of the model; in our example our model reaches a precision of 89.996%.
  • F – The F value is the ratio of the mean regression sum of squares divided by the mean error sum of squares.
  • AIC – The Akaike Information Criterion (AIC) is an estimator of the relative quality of statistical models for a given set of data. Given a collection of models for the data, AIC estimates the quality of each model, relative to each of the other models. Thus, AIC provides a means for model selection.
  • BIC – Bayesian information criterion (BIC) or Schwarz criterion (also SBC, SBIC) is a criterion for model selection among a finite set of models; the model with the lowest BIC is preferred. It is based, in part, on the likelihood function and it is closely related to the AIC.
  • MSE – the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors or deviations—that is, the difference between the estimator and what is estimated.
  • RMSE – Root Mean Squared Error
  • MAE – The mean absolute error (MAE)
  • MAPE – the mean absolute percentage error (MAPE), also known as mean absolute percentage deviation (MAPD), is a measure of prediction accuracy of a forecasting method in statistics, for example in trend estimation.

Fig 1.5

The comparison between the fitted and real values can be displayed in a more user-friendly way by using a chart where you can compare the real value and the predicted/fitted value (the sales values in our case) over time, as shown below:

Fig 1.6

Fig. 1.7

3. HOW TO CREATE PREDICTIONS BASED ON THE MLR MODEL’S RESULTS

(1) Definition of parameters

Similarly to the model creation, we are required to create the parameter definitions, the difference being that the data table type must not contain the target variable.

(2) Forecasting Procedure Wrapper

The generated procedure is fed the parameter table specifications and structures and uses the ‘FORECASTWITHLR’ function.

(3) Input and output table creation and data generation and regression specification

Similarly to the model creation we will specify how the linear regression forecast function will handle the input data. The input data must not contain the target column.

(4) Coefficient table

The coefficient table must be filled with the data from the coefficient table produced by the model creation.

(5) Forecast Procedure Calling

All that remains to be done is to call the previously created procedure.

(6) View Results

View the predictions made by the forecast algorithm.
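As a rough sketch, the forecasting part then boils down to a second wrapper and a second call; again, all table and procedure names are assumptions and the parameter order must match the forecast signature table:

```sql
-- steps (1)-(2): wrapper for the forecast function (signature table built as before)
CALL SYS.AFLLANG_WRAPPER_PROCEDURE_CREATE(
    'AFLPAL', 'FORECASTWITHLR', 'MY_SCHEMA', 'PAL_MLR_FORECAST_PROC', PAL_MLR_FORECAST_SIGNATURE);

-- step (5): call the forecast procedure
CALL MY_SCHEMA.PAL_MLR_FORECAST_PROC(
    PAL_MLR_PREDICT_DATA,  -- IN:  new data without the target column
    PAL_MLR_COEFF,         -- IN:  coefficient table produced by the model creation
    PAL_MLR_FC_CONTROL,    -- IN:  control parameters for the forecast
    PAL_MLR_FORECAST       -- OUT: predicted values
) WITH OVERVIEW;

-- step (6): view the predictions
SELECT * FROM PAL_MLR_FORECAST;
```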

 

Sources:

https://help.sap.com/viewer/2cfbc5cf2bc14f028cfbe2a2bba60a50/2.0.00/en-US/eedc9094daf04419bc25f6ed097ac03b.html

https://help.sap.com/doc/86fb8d26952748debc8d08db756e6c1f/2.0.00/en-us/sap_hana_predictive_analysis_library_pal_en.pdf

Author
Gellert Peter-Siskovits Associate
Phone: +49 (0) 7031 714 660 0
Email: cluj@inspiricon.de

Who needs SAP Vora?!

What is SAP Vora good for anyway?

SAP Vora allows you to analyze structured and semi-structured data within an existing Hadoop cluster in a modern interface and to combine these data types with one another as well as with data from SAP HANA.

From a technical standpoint, this is an extension of the Apache Spark execution framework, which has long been established in the world of Hadoop.

This way, SAP Vora gives you a distributed in-memory query engine for both structured and semi-structured data in a Hadoop cluster.

How can I use SAP Vora to my advantage?

There are several options for you to benefit from SAP Vora; it goes without saying that SAP would like to see you use it in the cloud – thus keeping in line with its own cloud strategy:

On-premise:

  • By downloading the Developer Edition from the SAP Development Center (http://developers.sap.com), which is available free of charge for SAP partners
  • By downloading the Production Edition from the SAP Support Portal (https://support.sap.com)

Cloud-based:

  • By using the Developer Edition, which is free of charge for SAP partners, through Amazon Web Services (AWS) or SAP Cloud Appliance Library (SAP CAL)
  • By using the paid Production Edition through AWS
  • By using it as a service through SAP Big Data Services
  • By using a bring your own license (BYOL) model (SAP Cloud Platform, AWS)

The SAP Vora Developer Edition in AWS provides complete functionality and with just a few clicks, the environment can be custom-configured according to pre-established parameters.

The underlying Hadoop cluster is a Hortonworks distribution (HDP) with the corresponding tools/software solutions such as Spark, Ambari, Zeppelin, etc. and has a maximum of 4 nodes.

The variant offered by SAP through the SAP Cloud Appliance Library (CAL) is delivered as a pre-configured appliance with functionality that is very similar to AWS. It is best suited for anyone already using SAP CAL.

The Production Edition differs only in terms of upward scalability of the cluster and, of course, in terms of cost.

How does SAP Vora work?

Once you have made your decision regarding a deployment model (on-premises or cloud) you then go on to – depending on your choice – installation and configuration.

The installation process involves three steps:

  1. Determining the number of nodes required for Vora in the Hadoop cluster depending on your
    • availability requirements
    • sizing requirements (CPU, disk vs. RAM, control nodes vs. compute nodes, different sizing for each specific Spark engine, etc.)
    • expected data growth
  2. Deploying SAP Vora Manager Services on the required cluster nodes
  3. Configuring and starting the SAP Vora Services on the cluster nodes using the SAP Vora Manager UI

Once you have successfully completed the installation and configuration in a Hadoop cluster (the HDP, Cloudera and MapR distributions are supported), you can start using SAP Vora. In addition to the above-mentioned SAP Vora Manager for the more administrative side of things, end users are provided with a central GUI by means of a set of tools known as the SAP Vora Tools.

The following tools are available in the GUI:

  • Data Browser: view the contents of tables and views
  • SQL Editor: create and execute SQL statements
  • Modeler: create and modify tables and views
  • User Management: manage user accounts and access to the SAP Vora Tools

The end users can leverage the SAP Vora Tools to analyze data that differs in structure and data type found in the Hadoop cluster. In the next section, we will take a closer look at the analytics options.

What can I analyze with SAP Vora?

Vora enables you to interpret JSON documents, conduct time series and graph analytics, and use SQL to also analyze data that is conventionally structured in a relational way.

In doing so, Vora uses a specific Spark engine with optimized processing for each of the different types of analytics.

The “Doc Store” – NoSQL analytics of semi-structured JSON documents

Starting with version 1.3, SAP introduced the “Doc Store”. With it, you can store modified documents as schema-free tables, which in turn allows you to scale out and flexibly handle document fields (delete, add).

Once you have created a document store (= collection) based on JSON documents existing in the cluster in Vora, it serves as the basis for the creation of a view that can also be expanded with the familiar JSON expressions. This view is then stored in Vora’s own Doc Store and can be processed both in the table and the JSON format.

Time series analytics – leveraging efficient compression and data distribution

The Spark engine available for time series analytics exhibits its full strength when the underlying data is spread across as many cluster nodes as possible and can be efficiently compressed.

Based on the time series data stored in the cluster, a “time series table” is created within Vora, for which a unique column with time ranges (= range type) must exist. Along with various other options, you can also enter equidistance properties and additional compression parameters.

In order to be able to analyze time series data, you also need to create a view that can be enhanced with specific table functions (e.g. cross/auto correlation).

With this, you can then conduct the corresponding analyses such as regression, binning, sampling, similarity, etc.

Real-time graph analytics – analyzing very large graphs

Vora comes with its own in-memory graph database that was specifically developed for the real-time analysis of large graphs. Accordingly, the modelling of the graphs in the graphical metadata viewer is supported by a path expression editor.

With an in-memory engine available, it is capable of highly complex graph-related queries and you can count on the visualization of the graphs to be state of the art.

The graph analytics engine is particularly suited for supply chain management applications or to visualize elaborate organizational and project hierarchies or business networks.

Relational engine – using SQL to analyze relations

Last but not least, Vora also lets you use SQL to represent and query structured data in the cluster in the form of relational, column-based tables. This approach also uses in-memory data compression.

For relational data that does not need to be kept in memory, Vora also comes with a disk engine. It stores the data in a file on the local node on which the engine runs. As with the dynamic tiering option in HANA, you can also easily join the column-based relational disk tables with the in-memory tables.

Also worth mentioning

  • Once you have completed the registration in the registry, Vora also allows you to use SAP HANA tables along with any views and tables created in Vora. From Vora, you can also write data to SAP HANA.
  • The creation of both level and parent-child hierarchies and the use of joint fact tables is supported.
  • You can use currency translation (standard or ERP) in tables and views.
  • There are specific partitioning functions and types for each engine, that is, for the specific data structures created in Vora that allow you to optimally distribute or partition them in the cluster (hash, block, range).

What data sources and formats are currently supported?

With the SAP Vora Tools, you can process the following files in Vora:

  • .CSV
  • .ORC
  • .PARQUET
  • .JSON
  • .JSG

In addition to the standard data type HDFS and the ORC and PARQUET types (option (format “orc” / format “parquet”)), it is also possible to load the following additional types in the “CREATE TABLE” statement in Vora (see the sketch after this list):

  • Amazon S3 (option (storagebackend “s3”))
  • Swift Object (option (storagebackend “swift”))
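A very rough sketch of such a statement is shown below; the USING clause and the file path option differ between Vora versions, so only the format and storagebackend options named above should be taken literally:

```sql
CREATE TABLE SALES_S3 (STORE_ID INTEGER, SALES DOUBLE)
USING com.sap.spark.vora             -- data source / engine; version-dependent
OPTIONS (
  files "sales/2017/sales.parquet",  -- assumed option for the source file(s)
  format "parquet",                  -- "orc" or "parquet", as mentioned above
  storagebackend "s3"                -- or "swift" for Swift Object storage
);
```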

Conclusion and outlook

It is hardly surprising that SAP Vora’s main strength lies in the combination with SAP HANA, as this enables you to analyze relational data from HANA along with semi-structured data from your Hadoop cluster. What’s more, Vora gives you an array of analysis options (graphs, documents, etc.) combined into a single tool that would otherwise require you to rely on multiple tools (or different databases) from different Hadoop distributors or third-party vendors.

SAP is planning to support the transaction concept (ACID) in Vora to improve on its consistent data storage capabilities. For 2018, initial support for insert/update/delete statements is already in the works. SAP furthermore plans to add support for SQL Pass-through from SAP HANA to SAP Vora.

All friends of SAP BW will also be glad to hear that SAP plans to support DSOs beyond 2018.

If you’re an SAP partner, you can easily get started with the free Developer Edition to familiarize yourself with the subject – it’s the perfect place to learn more about its configuration and use cases.

Or you can just ask us – we’ll be happy to help!

Author
Andreas Keller Associate Partner
Phone: +49 (0) 7031 714 660 0
Email: info@inspiricon.de

How to Get More Insights With Text Analysis with SAP HANA

We all know that knowledge is power. Every day we accumulate information, we share it or we post it on social networks. Everything around us is pure information. What can we do with all this information? Simply put, we can store it as data.

Data can be classified in two categories: unstructured and structured data.

Figure1 Unstructured Structured Data

Figure 1. Difference between unstructured and structured Data

What is unstructured Data?

The phrase “unstructured data” usually refers to information that does not reside in a traditional row-column database. For example Facebook, Twitter or Emails.

The benefit of structured data is that it can be identified and processed by a machine. Once stored, the data is much easier to search, combine and filter for one’s own purposes.

What is structured Data?

Data that resides in a fixed field within a record or file is called structured data. For example a database.

So with Text Analysis we take unstructured data, transform it to structured data and analyze it.

What is Text Analysis?

The term text analytics describes a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for exploratory data analysis, research, or investigation.

Text Analysis powered by SAP HANA applies full linguistic and statistical techniques to extract and classify unstructured text into entities and domains.

With SAP HANA platform, you can gain real insights from your unstructured textual data. The platform provides search, text analysis, and text mining functionality for unstructured text sources.

There are several techniques used in text analysis in order to find a specific string or to perform linguistic searches (a short example follows the list below).

  • Full Text Indexing:
    A full-text index is a special type of token-based functional index that is built and maintained by the database’s full-text engine.
  • Full Text Search:
    Full text search is designed to perform linguistic (language-based) searches against text and documents stored in your database.
  • Fuzzy Search:
    Fuzzy search is the technique of finding strings that match a pattern approximately. It is a type of search that will find matches even when users misspell words or enter in only partial words for the search.
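As a small illustration of the last two techniques (both require a full-text index on the searched column), here is a minimal sketch with assumed table and column names:

```sql
-- linguistic search: also matches inflected forms of the search term
SELECT * FROM "FEEDBACK" WHERE CONTAINS("TEXT", 'run', LINGUISTIC);

-- fuzzy search: tolerates typos and partial words (0.8 = required similarity)
SELECT * FROM "FEEDBACK" WHERE CONTAINS("TEXT", 'servise', FUZZY(0.8));
```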

In this article we will talk about Full Text Indexing.

Full Text Indexing: Table $TA

Creating a full-text index with parameter TEXT ANALYSIS ON triggers the creation of a table named $TA_<indexname> containing linguistic or semantic analysis results.
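The generic form of such a statement looks roughly as follows; index, schema, table and column names are placeholders:

```sql
CREATE FULLTEXT INDEX "MY_TA_INDEX" ON "MY_SCHEMA"."MY_TABLE" ("TEXT_COLUMN")
  CONFIGURATION 'EXTRACTION_CORE'   -- one of the built-in configurations described below
  TEXT ANALYSIS ON;                 -- triggers creation of the table $TA_MY_TA_INDEX
```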

The set of columns in the table $TA is always the same regardless of the text analysis configuration used with the full-text index:

inspiricon-figure2-ta-table

Figure 2. Example for a TA Table

  • Key columns from source table (ID):
    The first columns in table $TA are a direct copy of the key columns from the source table.
  • TA_RULE:
    The rule that created the output. Generally, this is either LXP for linguistic analysis or Entity Extraction for entity and fact analysis.
  • TA_COUNTER:
    A unique sequential ID for each token extracted from the document.
  • TA_TOKEN:
    The term, entity, or fact extracted from the document.
  • TA_LANGUAGE:
    The language of the document.
  • TA_TYPE:
    The type of the token. In linguistic analysis, this is the part of speech. In semantic analysis, it is the entity type or fact. (‘noun’, ‘StrongPositiveSentiment’, ‘Person’)
  • TA_NORMALIZED:
    The normalized version of the token. Inflection is maintained, but capitalization and diacritics are removed. This column is null for entity extraction.
  • TA_STEM:
    The stemmed version of the token. This field is fully uninflected and normalized. If the stem is identical to the token, this column is null. It is also null for entity extraction.
  • TA_PARAGRAPH:
    The paragraph in the document that contains the token.
  • TA_SENTENCE:
    The sentence in the document that contains the token.
  • TA_CREATED_AT:
    Creation time of the record.
  • TA_OFFSET:
    Character offset from the beginning of the document.
  • TA_PARENT:
    The TA_COUNTER value of the parent of this token.

Built-in Configurations

SAP HANA has seven built-in configurations that determine how the text is analyzed and which output is produced:

  • LINGANALYSIS_BASIC:
    The most basic linguistic analysis: it tokenizes the text, but normalization and stemming are not applied, so the TA_NORMALIZED and TA_STEM columns remain empty.
  • LINGANALYSIS_STEMS:
    Normalizes and stems the tokens so the TA_NORMALIZED and TA_STEM fields will be populated.
  • LINGANALYSIS_FULL:
    Uses full linguistic analysis, so all the columns in the $TA_Table will be populated.
  • EXTRACTION_CORE:
    It extracts entities from the text. For example: people, places, URLs.
  • EXTRACTION_CORE_VOICEOFCUSTOMER:
    It extracts entities and facts to identify positive and negative emotions associated with the tokens.
  • EXTRACTION_CORE_ENTERPRISE:
    It extracts data for enterprise. For example: mergers, acquisitions, organizational changes, and product releases
  • EXTRACTION_CORE_PUBLIC_SECTOR:
    It extracts security-related data about public persons, events and organizations.

Creating a table and index

We will make a practical example for the EXTRACTION_CORE_VOICEOFCUSTOMER configuration, which identifies sentiments (positive or negative feelings):

We have to create a table, insert values and create an index.

Open a SQL Console and write the following command:

inspiricon-figure3-create-table

Figure 3. Create Table

 inspiricon-figure4-create-Index-insert-values

Figure 4. Create Index and insert values
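In case the screenshots above are hard to read, here is a minimal sketch of equivalent statements; table, column and index names as well as the sample sentences are our own assumptions:

```sql
-- a small demo table with a primary key (required for a full-text index)
CREATE COLUMN TABLE "TA_DEMO" (
  "ID"   INTEGER PRIMARY KEY,
  "TEXT" NVARCHAR(500)
);

INSERT INTO "TA_DEMO" VALUES (1, 'Anna likes the new store layout.');
INSERT INTO "TA_DEMO" VALUES (2, 'Tom enjoys the friendly service.');

-- sentiment extraction with the Voice-of-Customer configuration
CREATE FULLTEXT INDEX "TA_DEMO_IDX" ON "TA_DEMO" ("TEXT")
  CONFIGURATION 'EXTRACTION_CORE_VOICEOFCUSTOMER'
  TEXT ANALYSIS ON;

-- the analysis results are written to the generated table $TA_TA_DEMO_IDX
SELECT "TA_TOKEN", "TA_TYPE" FROM "$TA_TA_DEMO_IDX";
```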

The created table will look like this:

inspiricon-figure5-created-table

Figure 5. Table

Finally, we get the text analysis results. As you can see, “likes” and “enjoys” appear as a “weak positive sentiment”.

inspiricon-figure6-text-analysis

Figure 6. Text Analysis

This article was inspired from this blog: https://blogs.sap.com/2017/05/21/sap-hana-ta-text-analysis/

Author
Adelina Ramona Popa and
Lucian Tomeac
Associates SAP BI
Phone: +49 (0) 7031 714 660 0
Email: cluj@inspiricon.de

How to successfully use the Query Designer in Eclipse

We are very happy to present to you our first guest author: Jürgen Noe. He is Managing Partner of Jürgen Noe Consulting UG (haftungsbeschränkt) in Mannheim (www.juergen-noe-consulting.de). His article is about – as is his new book – Query Designer in Eclipse. Many thanks again Jürgen – and now enjoy the read!

 

Along with support for HANA, the introduction of SAP BW 7.4 also brought a silent interface revolution. Up until this SAP BW release, support in terms of software development environments (SDE) had been limited to the SAP GUI. But with the development of the HANA database, the Hasso Plattner Institute relied on Eclipse as an SDE from the get-go. This now gives developers of SAP applications two relevant development environments. When it comes to developing HANA database objects, for example, HANA Studio is the go-to environment, while traditional ABAP applications still require the SAP GUI.

But SAP had another surprise in store for us: What started out with support for the HANA platform only was eventually expanded with other tools for the development and customization of applications on HANA in Eclipse.

One of the tools here is BW-MT, short for BW Modelling Tools, which allows developers to completely move their typical BW customizing tasks to Eclipse. The creation of InfoProviders, really the entire ETL (extraction, transformation, load) process can now be carried out from start to finish in BW-MT.

The logical consequence was to recreate the central tool for creating BW queries in the Modelling Tools as well. This renders the good old-fashioned Query Designer as a standalone application within the context of the Business Explorer Suite (BEx) obsolete with all releases starting from SAP BW version 7.4.

A quick start to the Query Designer in Eclipse

Against this background, I wrote a book to describe the new functionalities offered by the Query Designer in Eclipse. The book titled “Schnelleinstieg in den Query Designer in Eclipse” and published by Espresso Tutorials in September of 2017 is available in German only.

Query Designer

Click here to purchase the book.

I would like to take this opportunity and use the following paragraphs to outline the book for you:

The book starts out with some basic information about SAP BW and Eclipse in general. In the Eclipse section of the book, I provide a short explanation of how Eclipse is structured and break down essential terms such as plug-in, view, and perspectives. Experienced Eclipse users can skip this chapter.

The third chapter summarizes the BW Modelling Tools. I explain how to call well-known transactions such as the Data Warehouse Workbench in Eclipse and how to create data flows, accompanied by an in-depth description of central views such as the Project Explorer view and the InfoProvider view.

 

Given the central role that the Project Explorer plays in Eclipse, the book includes a detailed walk-through of how to create a new project and work with it. After that, I will explain how to navigate to the InfoProvider view, which is shown in the following figure 1, in Project Explorer:

Infoprovider View

Figure 0.1 InfoProvider view

This view allows you to create global, reusable objects such as restricted or calculated key figures, structures, filters, but also variables. You can find them under the Reusable Components folder in figure 1.

Chapter four then features a detailed description and many screenshots of how to create the different reusable elements and an overview of the various setting options along with their impact. The ability to create reusable components from a central location is one of the reasons why I think switching from the old BEx version to the new Query Designer in Eclipse is worth your while. Gone are the times when you had to click your way through multiple windows in the BEx Query Designer in order to, for example, create a formula variable. What’s more, I also noticed major improvements in navigation and usability.

There is yet another area where BW-MT demonstrates its full strength: it has never been easier to jump from one object to another, change it, and view the changes in the initial object right away. Here’s an example: you realize that you need an additional key figure in the query. It used to be that you first had to create it in the Data Warehouse Workbench, add and assign it in the MultiProvider, and restart the Query Designer for it to register the change before you could insert it into the query. Now, you no longer have to deal with the inconvenience of jumping back and forth between different tools and transactions. With BW-MT, all that changes is the view in Eclipse! You simply switch from the Query Designer view to the master data view, where you create your key figure, and go on to the InfoProvider view to add it to your data model in the MultiProvider. Once you have saved it, you can switch right back to the Query Designer view.

And you can do all of this in parallel in a single tool, using multiple windows, however you see fit!

With Eclipse, you can view the changes to the MultiProvider right away. And if they do not show up immediately, simply hit refresh to have your new key figure available in your query. It has never been so easy!

A detailed look at the query properties

Surely, you are now asking yourself what the Query Designer view, which allows you to create, change, and delete queries, looks like. You can find the answer in figure 2:

Query Filter

Figure 2: Query definition (filter)

As you can see, the query definition is spread across multiple tabs. The General tab allows you to configure general query properties such as how to display repeated key values and much more.

Figure 2 shows the definition of a query filter. As with the BEx Query Designer, the fundamental query structure with columns, rows, and free characteristics stays the same. You can define this structure in the Sheet Definition tab. All of these configurations are carried out using the context menu, which lets you access all relevant functions in the respective views.

The Conditions tab allows you to specify conditions such as “show me all datasets with revenues of more than 1 million euros”.

Use the Exceptions tab to define any exceptions. These exceptions allow you to color code rows, columns or individual cells to highlight outliers or special circumstances.
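To make conditions and exceptions a little more tangible, here is a small, purely illustrative Python sketch of the underlying logic. The field names and the dictionary layout are invented for this example and have nothing to do with how BW stores or evaluates queries internally:

```python
# Purely illustrative: a toy, in-memory picture of a query definition.
# Names ("0CALMONTH", "REVENUE", ...) are invented examples, not a BW API.

query_definition = {
    "filter": {"0CALYEAR": "2017"},          # restricts all data up front
    "rows": ["0CALMONTH", "STORE"],          # drilldown in the rows
    "columns": ["REVENUE", "QUANTITY"],      # key figures in the columns
    "free_characteristics": ["REGION"],      # available for ad-hoc navigation
    "conditions": [
        # keep only result rows with more than 1 million euros of revenue
        {"key_figure": "REVENUE", "operator": ">", "threshold": 1_000_000},
    ],
    "exceptions": [
        # color-code cells whose revenue falls below 100,000 euros
        {"key_figure": "REVENUE", "operator": "<", "threshold": 100_000, "color": "red"},
    ],
}

def apply_conditions(result_rows, conditions):
    """Drop result rows that violate a condition (toy logic only)."""
    ops = {">": lambda a, b: a > b, "<": lambda a, b: a < b}
    return [
        row for row in result_rows
        if all(ops[c["operator"]](row[c["key_figure"]], c["threshold"]) for c in conditions)
    ]

sample = [{"REVENUE": 1_250_000, "QUANTITY": 40}, {"REVENUE": 800_000, "QUANTITY": 25}]
print(apply_conditions(sample, query_definition["conditions"]))
# -> only the first row survives the "more than 1 million euros" condition
```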

I’m very fond of the Dependency Structure tab, which provides you with an overview of all other queries in which the variables used in the query at hand are also used.

The Runtime Properties tab lets you configure the performance properties of the query, for example whether to use the delta cache process and many other properties that you are already familiar with from the transaction RSRT.

Chapter five of the book includes many screenshots and examples that serve to explain the various options provided by the different tabs and their respective impact.

So, what does the query result look like?

Once you have created your query, you will want to test and execute it. With BW-MT, the query result is presented in a separate view, as shown in figure 3.

Result

Figure 3: Query result view

You can navigate the query results freely, apply filters, and add or remove drilldowns, just like you did in the past. Once again, you will find everything you need in this view; there is no longer any need to have a Java server installed to produce web output or to switch to the BEx Analyzer to create Excel output.

For more complex queries, you may need two structures. In the old BEx Query Designer, this meant working with the cell editor. The cell editor was completely overhauled in the new Query Designer and now includes useful options such as copy & paste. It also eliminates the annoying roundtrips to the server to check the entries, which makes working with the cell editor that much faster. Take a look at the cell editor in figure 4:

Cell editor

Figure 4: Cell editor

Last but not least: the variables

The last item on our list is variables, which make your queries dynamic. The sixth chapter takes a closer look at them and uses screenshots and simple examples to demonstrate how to create all typical variable types.

The advantages of the new Query Designer in Eclipse:

  • A modern, user-friendly and future-proof interface
  • All existing BEx Query Designer functions can also be found in the new Query Designer in Eclipse
  • Seamless integration of BW data modelling in a single tool

My conclusion is a wholehearted recommendation to switch to the Query Designer in Eclipse along with BW-MT. It has never been so easy to create and test entire BW data models and queries. To me, the Query Designer in Eclipse is a big step towards the future!

Author
Jürgen Noe, Managing Partner, Jürgen Noe Consulting UG (limited liability)
Phone: +49 (0) 621 72963337
Email: juergen.noe@juergen-noe-consulting.de
Cognos Migration to SAP BW on HANA

How to leverage a standardized SAP solution to harmonize processes across the enterprise and ensure comparability

Hey there:

Today we would like to share with you yet another successful project in our portfolio. For this project, we were tasked with migrating an existing Cognos reporting solution to SAP BW on HANA with SAP BusinessObjects as the frontend.

The customer that sought our support specializes in the manufacturing and distribution of electronic components. As part of an international group, it is one of several subsidiaries, all of which rely on different ERP and business intelligence solutions. Faced with the group’s continued growth and the ever-increasing complexity that goes along with it, the proprietary ERP solution was reaching its limits both in terms of functionality and reporting capabilities.

THE PROJECT

Working from a newly introduced and standardized SAP ERP solution with SAP BW, Inspiricon was able to replace the existing Cognos reporting solution.

There were a number of reasons that motivated this decision:

  • For one, it would have been too costly and time consuming to keep developing the reporting solutions that, for the most part, had been developed in house and customized to specific needs.
  • Also, maintenance and support had become almost unmanageable.
  • What’s more, the group was no longer able to maintain the consistency of KPI purpose and content across its different companies, resulting in the inability to compare their processes and results.
  • The sheer number of different interfaces with other systems drove up costs and significantly increased the error rate. Reporting to the parent company, in particular, had become a tedious and time-consuming endeavor. It required even more interfaces, because this reporting was tailored exclusively to customers on the basis of a custom Cognos reporting solution that had been implemented on top of everything else.
  • Internal expertise regarding the existing solutions was limited to a small number of people inside the company. Losing this technical knowledge to employee turnover was an ever-present threat.

THE SOLUTION

Since another subsidiary was already using SAP products, the new ERP solution was already available in the group’s system landscape.

The project, however, was not just about bringing the ERP system up to speed; the group was also eager to take its BI system to the next level. And it wanted it to be an SAP solution, prompting the decision to replace the existing Cognos reporting solution with SAP Business Warehouse on HANA.

The project was launched on SAP BW 7.4 on HANA; a migration to SAP BW 7.5 on HANA was carried out during the project. Here, too, SAP BW was already being used by another subsidiary and was therefore readily available.

Products from the SAP BusinessObjects family were to be used for the frontend as well. These adjustments were intended to bring about the following changes and benefits:

  • Standardized data models that can be leveraged by every company in the group, allowing its globally dispersed service centers to provide cost-optimized support.
  • Consistent KPIs that guarantee comparability.
  • A smooth transition for users of the Cognos solution by means of a like-to-like migration.
  • Elimination of unnecessary interfaces.
  • Improved and accelerated flow of information along the value chain.
  • Increased responsiveness for better corporate management, which also improves competitiveness.

Our solution consists of Web Intelligence reports based on SAP BW on HANA along with standardized data provisioning from the operational SAP ERP system. The following illustration outlines the project’s structure:

Illustration Project Structure

By replacing its existing ERP and BI systems, our customer was able to improve its business. Not only can users now access reports and data much faster, substantially reducing time to decision, but they can also, to a certain degree, customize their reports in the context of a self-service BI. Another big plus is a significantly improved security posture due to centrally managed access authorizations. Last but not least, we were able to reduce license costs and therefore improve cost-effectiveness for the customer.

The introduction and standardization of SAP ERP and BI solutions typically require significant investments. By having our employees from our nearshoring location in Cluj-Napoca in Romania contribute to the project, we were able to drastically cut these costs. At the peak of the project, up to 6 colleagues from Cluj-Napoca were involved in the project. We coordinated through daily briefings during which our team communicated remotely both through video and audio.

Author
Claudio Volk, Member of the Management Board
Phone: +49 (0) 7031 714 660 0
Email: info@inspiricon.de
4 steps to create a universe

4 steps to create a universe – the Information Design Tool, part 2

To continue our journey toward a universe full of data, let us walk through the steps of this mission. The major steps in this process are:

  1. Create a new project in IDT (Information Design Tool).
  2. Create a connection between the HANA system and our project from IDT.
  3. Create a data foundation and a business layer.
  4. Publish to a repository.

First of all, we have to define some basic terms that you need to know when handling universes and the IDT; a short sketch after the list shows how they fit together.

  • Connection: defines how a universe connects to a relational or OLAP database. Local connections are stored as .CNX files, and secure connections are stored as .CNS files.
  • Data Foundation: a schema that defines the relevant tables and joins from one or more relational databases. The designer enhances the data foundation with contexts, prompts, calculated columns, and other SQL definitions. It can serve as the basis for multiple business layers.
  • Business Layer: represents the first draft of a universe. When the business layer is complete, it is compiled with the connections or connection shortcuts as well as the data foundation. It is then published and deployed as a universe.
  • Universe: the compiled file that includes all resources used in the definition of the metadata objects built in the design of the business layer.
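To see how these four building blocks fit together before we start clicking through the tool, here is a minimal, purely illustrative Python sketch. All class and attribute names are invented for this example; they are not part of any SAP SDK:

```python
# Illustrative only: a toy object model of the four IDT building blocks.
from dataclasses import dataclass
from typing import List

@dataclass
class Connection:
    name: str            # e.g. a secured .cns connection published to the repository
    kind: str            # "relational" or "OLAP"

@dataclass
class DataFoundation:
    connections: List[Connection]
    tables: List[str]
    joins: List[str]     # join conditions between the imported tables

@dataclass
class BusinessLayer:
    data_foundation: DataFoundation
    dimensions: List[str]
    measures: List[str]

@dataclass
class Universe:
    business_layer: BusinessLayer   # compiling and publishing the business layer yields the universe

conn = Connection(name="HANA_SALES.cns", kind="relational")
df = DataFoundation(connections=[conn], tables=["SALES", "STORE"],
                    joins=["SALES.STORE_ID = STORE.ID"])
bl = BusinessLayer(data_foundation=df, dimensions=["Store", "Month"], measures=["Revenue"])
universe = Universe(business_layer=bl)
print(universe.business_layer.data_foundation.connections[0].name)
```

Keep this picture in mind as we go through the four steps: each step adds one layer of this stack.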

Step 1: Create a new project in IDT

No matter whether your plan to discover the universe is big or small, you need to have a whole project in mind. Technically speaking, the journey to a universe starts with a new project. We can create one by opening IDT, going to File -> New, and selecting Project. The project will be your box full of ideas, a workspace where you can build your universe brick by brick.

The bricks are represented by specific objects like data foundation or business layer.

How to create a project in IDT.

Figure 1: How to create a project in IDT.

 

A project is the home for all resources used to build a universe. It is used to manage the local objects for the universe creation.

In a project, you can have multiple objects such as data foundations, business layers, and data source connections.


Figure 2: How to create a project: enter name.

 

We will enter the project name (in this case we use the name Test) and also set the project location in our workspace.

You also have the option to edit an existing project. To do so, go to File -> Open Project and select the project from the Local Projects area.

Another very useful feature within a project is resource locking: by locking a resource, you can inform other developers that you are working on it.

Step 2: Create a connection between the HANA system and our project from IDT.

Once the project is created, we have to assign a connection so that we can connect to a data source.

This connection defines how the source system provides data. There are two types of connections: relational and OLAP.

A Relational connection is used to connect to the database layer to import tables and joins.

An OLAP connection is used to connect to a multidimensional model such as an Information View in SAP HANA.

How to create a relational connection.

Figure 3: How to create a relational connection.

 

In order to create the connection, we have to enter the system details, the user name, and the password.
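If you prefer to see the same thing in code: the details IDT asks for here are the same ones any HANA client needs. The following is a minimal sketch using SAP's hdbcli Python driver; the host, port, user, and password are placeholders, and the snippet only tests connectivity, it does not replace the connection created in IDT:

```python
# Minimal connectivity sketch with SAP's hdbcli driver (pip install hdbcli).
# All connection values below are placeholders for your own system details.
from hdbcli import dbapi

conn = dbapi.connect(
    address="hana.example.com",   # hypothetical host name
    port=30015,                   # adjust the SQL port to your system
    user="IDT_DEMO",              # placeholder user
    password="********",          # placeholder password
)

cursor = conn.cursor()
cursor.execute("SELECT CURRENT_TIMESTAMP FROM DUMMY")  # simple connectivity test
print(cursor.fetchone())
cursor.close()
conn.close()
```

The actual IDT connection is, of course, created and secured through the tool and the repository; the sketch is only meant to demystify the parameters you enter in the wizard.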

After we have created the connection, we will test it and then we have to publish it to the repository to make it ready to use for our universe.

How to publish a connection.

Figure 4: How to publish a connection.

 

In order to publish the connection, you have to provide the BO Connection Parameters (repository, user name, and password).

Step 3: Create a data foundation and a business layer.

Once we have an available connection to the repository, we can proceed to create a Data Foundation for our universe. The Data Foundation layer gives you the opportunity to import tables and joins from different relational databases.

In the Information Design Tool, there are two types of Data Foundations: single-source enabled and multi-source enabled.

A single-source Data Foundation supports a single relational connection. It can be based on a local or a secured connection, so a universe designed on it can either be maintained locally or be published to the repository.

How to create a Data Foundation.

Figure 5: How to create a Data Foundation.

 

A multi-source enabled Data Foundation supports one or more relational connections. Connections can be added when you design the Data Foundation or at a later point. A multi-source enabled Data Foundation is designed on secured connections published in a repository.
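As a rough mental model, and nothing more, a Data Foundation can be pictured as a small description of tables and join conditions from which SQL is later generated. The sketch below uses invented table and connection names and greatly simplifies what the tool really does:

```python
# Toy illustration of what a data foundation describes: which tables to use
# and how they join. Table, column, and connection names are invented.
data_foundation = {
    "connections": ["HANA_SALES.cns", "HANA_MASTERDATA.cns"],  # multi-source enabled
    "tables": ["SALES", "STORE"],
    "joins": ["SALES.STORE_ID = STORE.ID"],
}

def to_sql(df, columns):
    """Build the kind of SELECT statement a universe query would generate (simplified)."""
    tables = ", ".join(df["tables"])
    where = " AND ".join(df["joins"]) or "1 = 1"
    return f"SELECT {', '.join(columns)} FROM {tables} WHERE {where}"

print(to_sql(data_foundation, ["STORE.NAME", "SALES.REVENUE"]))
# SELECT STORE.NAME, SALES.REVENUE FROM SALES, STORE WHERE SALES.STORE_ID = STORE.ID
```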

We have to define the Data Foundation’s technical name and click Next.

An interesting option in IDT is that you can create a universe on multiple sources. In the next screen, you can see a list of the connections available for creating the universe (both .cnx and .cns).

A .cnx connection is used when we do not want to publish the universe to a central repository (it can be used for a local universe).

If you want to publish the universe to a repository (local or central), you have to use a .cns (secured) connection.

Once we click the Finish button, we will have a data foundation.

What does a Data Foundation look like?

Figure 6: What does a Data Foundation look like?

 

Congratulations, you are already halfway to a universe of your own!

So far we have a project, a connection, and a data foundation. Now we need the other half of the universe: the Business Layer.

To do this, we go to the project menu, choose New, and select Business Layer. Here we have to select the connection and also the Data Foundation on which we want to create the Business Layer.

How to create a Business Layer.

Figure 7: How to create a Business Layer.

 

The Business Layer contains all classes and objects. You can also check the dimensions and measures that are defined in a universe. Publishing the business layer to the repository completes the creation of the universe.

Once you have created the business layer, you have to decide which fields should act as dimensions (attributes/characteristics) and which should act as measures (key figures).

To give the universe a better layout, we can create folders for the dimensions and one for the measures. By default, measures are also treated as attributes.
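If it helps to picture this split between dimensions and measures, the decision amounts to tagging every field with a role. Here is a tiny, purely illustrative Python sketch with made-up field names:

```python
# Illustrative sketch of the dimension/measure split; field names are invented.
fields = {
    "STORE_NAME": "dimension",   # attribute / characteristic
    "CALMONTH":   "dimension",
    "REVENUE":    "measure",     # key figure, will be aggregated
    "QUANTITY":   "measure",
}

dimensions = [name for name, role in fields.items() if role == "dimension"]
measures   = [name for name, role in fields.items() if role == "measure"]
print("Dimensions:", dimensions)
print("Measures:  ", measures)
```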

Step 4: Publish to a repository.

Our journey to a universe full of data is approaching the final step. We have all the elements and only one click left to the finish line. After an integrity check, we have to save and publish the universe to a repository.

How to publish the universe to a repository.

Figure 8: How to publish the universe to a repository.

This article is inspired by tutorials provided by Tutorials Point.

Author
Ionel-Calin Oltean, Consultant SAP BI/BW
Phone: +49 (0) 7031 714 660 0
Email: cluj@inspiricon.de