Process Mining with Alteryx and Celonis Snap

By Roman Stalder

29 July 2020

Do you want to know how process mining works, where it is used, what the benefits are and what requirements have to be met? Then you've come to the right place.

In the following, we answer these questions step by step using a real use case, "Order2Invoice", with anonymized data from a retail company. So that you can gain your own first experience afterwards, we use Celonis Snap, the free process mining tool from Celonis. Since process mining is not a plug-and-play technology and the data is usually not available in a suitable form, we use Alteryx Designer to prepare the data.

Regardless of whether you want to discover new processes, check conformity, or expand your process knowledge through process mining, you need at least a rough definition of which processes you want to analyze and which IT systems support them. The event data from the relevant IT systems must contain at least three things: a case ID, a timestamp and an activity description. If the requirements for data quality, integrity and data protection are also met, nothing stands in the way of using Celonis Snap, and the first analyses can be carried out.
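The three minimum requirements can be checked before any tool is involved. The following is only a minimal sketch in plain Python: the field names `case_id`, `activity` and `timestamp` and the sample events are assumptions for illustration and do not come from the use case data.

```python
from datetime import datetime

# Assumed field names for the three minimum requirements.
REQUIRED_FIELDS = ("case_id", "activity", "timestamp")


def find_invalid_events(events):
    """Return (index, reason) pairs for events that miss a required field
    or whose timestamp cannot be parsed as ISO 8601."""
    problems = []
    for index, event in enumerate(events):
        missing = [field for field in REQUIRED_FIELDS if not event.get(field)]
        if missing:
            problems.append((index, "missing: " + ", ".join(missing)))
            continue
        try:
            datetime.fromisoformat(event["timestamp"])
        except ValueError:
            problems.append((index, "unparseable timestamp"))
    return problems


# Hypothetical sample events, not taken from the original data.
sample_events = [
    {"case_id": "4711", "activity": "Order created", "timestamp": "2020-03-02T08:15:00"},
    {"case_id": "4711", "activity": "", "timestamp": "2020-03-03T10:00:00"},
    {"case_id": "4712", "activity": "Order created", "timestamp": "02.03.2020"},
]

print(find_invalid_events(sample_events))
```

A check like this does not replace process knowledge, but it catches records that Celonis Snap could not interpret as events.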

Step 1: Document actual processes

a.) Determine processes to be analyzed
b.) Define activities
c.) Define systems

Step 2: Organize and prepare log files or event data from the relevant IT systems

a.) Organize data or input files (.xlsx or .csv)
b.) Define Case-ID
c.) Define timestamp
d.) Prepare data
e.) Add activities defined in step 1
f.) Merge, anonymize and audit data
g.) Generate output files (.csv)

Step 3: Perform process analysis with Celonis Snap

a.) Upload CSV files
b.) Assign «Case ID», «Activity» and «Timestamp» manually
c.) Analyze processes


Use Case "Order2Invoice"

Step 1: Document actual processes

a.) Determine processes to be analyzed 

Procurement: Order2Invoice

Description: When a requirement for goods is reported, central purchasing procures the corresponding goods from the respective suppliers in coordination with finance and logistics.

b.) Define activities

In order to define the activities, we have modeled the process "Procurement: Order2Invoice" according to BPMN 2.0. This visual representation will be very helpful to us in the following definition of the timestamps.

c.) Define systems

  • ERM
  • EDI-System

Step 2: Organize and prepare log files or event data from the relevant IT systems

a.) Organize data or input files (.xlsx or .csv)

We were able to export tabular data from the relevant systems. These exports now serve as input files for data preparation with Alteryx Designer and for the subsequent process analysis with Celonis Snap.

b.) Define Case-ID 

The column "Order" is unique and present in every input file. We therefore use it as the case ID, one of the three minimum requirements that must be met in order to use Celonis Snap. Without such a shared identifier, it would not be possible to reconstruct the process from start to finish using the log data.
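Whether a column is suitable as a case ID can be checked for each input file. The sketch below assumes comma-separated files and uses small, made-up CSV snippets; only the column name "Order" comes from the use case.

```python
import csv
import io


def check_case_id(csv_text, column="Order"):
    """Report whether `column` exists and how many values are empty or duplicated."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if column not in (reader.fieldnames or []):
        return {"present": False, "empty": 0, "duplicates": 0}
    values = [(row[column] or "").strip() for row in reader]
    non_empty = [value for value in values if value]
    return {
        "present": True,
        "empty": len(values) - len(non_empty),
        "duplicates": len(non_empty) - len(set(non_empty)),
    }


# Hypothetical file contents for illustration only.
orders_csv = "Order,Order date\n1001,2020-03-02\n1002,2020-03-02\n"
invoices_csv = "Order,Invoice date\n1001,2020-03-20\n,2020-03-21\n"
deliveries_csv = "Delivery,Date\nD1,2020-03-10\n"

for name, text in [("orders", orders_csv), ("invoices", invoices_csv), ("deliveries", deliveries_csv)]:
    print(name, check_case_id(text))
```

A file in which the column is missing, or in which many values are empty, would break the link between its events and the rest of the process.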

c.) Define timestamp

In addition to a case ID, an event also requires a timestamp and an activity designation, i.e. when the event took place and what actually happened. Our input files contain a wide variety of date and time fields. To determine the correct timestamp, a good understanding of the process to be analyzed is indispensable. We recommend not rushing this step: if wrong timestamps are assigned to the activities, the process analysis is not only worthless, it can also lead to wrong conclusions.

By defining the case IDs and the time stamps from the five input files along the modeled process, we have implicitly already assigned the corresponding activities.
 

d.) Prepare data

Before we can explicitly assign the previously defined activities, we first prepare the data in Alteryx Designer. To illustrate this, we have also modeled this process and show it to you using the first input file as an example.

The input file can be dragged and dropped into the Alteryx workspace, where it is automatically turned into an "Input Data Tool" because of its CSV format. Next, a "Filter Tool" removes the empty values, and a "Select Tool" deselects all irrelevant columns and defines the case ID. Because the first source system provides timestamps for three activities within the process to be analyzed, the workflow is split accordingly. In each branch, a "Formula Tool" adds the activity defined in step 1, but more on that later. Finally, a "Select Tool" renames the differently named timestamp columns to one uniform name, and a "Filter Tool" removes records without a timestamp.
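For readers without Alteryx at hand, the logic of one such branch can be approximated in plain Python. This is a rough sketch, not the Alteryx workflow itself: the column names "Order date" and "Buyer" and the sample rows are assumptions.

```python
def prepare_branch(rows, case_column, timestamp_column):
    """Mimic one workflow branch: filter, select and rename."""
    prepared = []
    for row in rows:
        case_id = (row.get(case_column) or "").strip()
        timestamp = (row.get(timestamp_column) or "").strip()
        if not case_id:
            continue  # comparable to the first Filter Tool: no case ID
        if not timestamp:
            continue  # comparable to the last Filter Tool: no timestamp
        # Comparable to the Select Tool: keep two columns under uniform names.
        prepared.append({"case_id": case_id, "timestamp": timestamp})
    return prepared


# Hypothetical rows; "Buyer" stands for an irrelevant column.
raw_rows = [
    {"Order": "1001", "Order date": "2020-03-02", "Buyer": "n/a"},
    {"Order": "", "Order date": "2020-03-02", "Buyer": "n/a"},
    {"Order": "1002", "Order date": "", "Buyer": "n/a"},
]

print(prepare_branch(raw_rows, "Order", "Order date"))
```

Only the first row survives, because the other two lack either the case ID or the timestamp.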

e.) Add activities defined in step 1

The previously described process step "Add defined activities" is now illustrated below for the first input file.

This makes it easy to see which timestamp turns an event in the process into an activity, and how the activity designation is then integrated into the Alteryx workflow.

Since the process "Procurement: Order2Invoice" involves several input files and activity designations, this procedure has to be repeated for each of them.
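The idea of attaching an activity designation to each timestamp column can be sketched as a simple mapping. The column names and activity labels below are hypothetical placeholders; in the real workflow they result from the BPMN model and the source systems.

```python
# Hypothetical mapping from timestamp column to activity designation.
ACTIVITY_BY_COLUMN = {
    "Order date": "Order created",
    "Confirmation date": "Order confirmed",
    "Goods receipt date": "Goods received",
}


def to_events(rows, case_column="Order"):
    """Turn each filled timestamp column of a row into one event."""
    events = []
    for row in rows:
        for column, activity in ACTIVITY_BY_COLUMN.items():
            timestamp = (row.get(column) or "").strip()
            if timestamp:
                events.append(
                    {"case_id": row[case_column], "activity": activity, "timestamp": timestamp}
                )
    return events


# Hypothetical input row with one missing timestamp.
rows = [
    {
        "Order": "1001",
        "Order date": "2020-03-02",
        "Confirmation date": "2020-03-03",
        "Goods receipt date": "",
    }
]

for event in to_events(rows):
    print(event)
```

One row with two filled timestamp columns yields two events for the same case, which is the shape a process mining tool expects.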
 

The data from the individual input files have now been prepared and must now be merged and anonymized.

f.) Merge, anonymize and audit data

To do this, let's now take a look at the entire Alteryx workflow, which we have also visualized with a BPMN diagram.

Since this is somewhat difficult to follow, its structure is explained step by step in the video below.

Data preparation for process mining

Now let's move on to the actual merging of the data.

Because the data was harmonized in the previous step, the "Union Tool" can now merge the data sets without further effort. The "Find Replace Tool" then replaces the sensitive data with the anonymized data.

How is the data anonymized? First, it must be determined which data is to be anonymized at all. Then all duplicates are removed and references are generated. Then a corresponding reference value ("Supplier1", "Storage location1" & "Employee1" etc.) is assigned to each reference.
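The anonymization steps described above can be sketched as follows. This is an illustration of the principle, not the Alteryx implementation; the supplier names are invented, and the helper names are assumptions.

```python
def build_reference_table(values, prefix):
    """Remove duplicates (keeping first-seen order) and assign numbered references."""
    unique_values = list(dict.fromkeys(values))
    return {value: f"{prefix}{number}" for number, value in enumerate(unique_values, start=1)}


def anonymize(rows, column, prefix):
    """Replace the sensitive values in `column` with their reference values."""
    mapping = build_reference_table((row[column] for row in rows), prefix)
    anonymized = [{**row, column: mapping[row[column]]} for row in rows]
    return anonymized, mapping


# Invented sample data for illustration only.
rows = [
    {"case_id": "1001", "supplier": "Acme AG"},
    {"case_id": "1002", "supplier": "Beta GmbH"},
    {"case_id": "1003", "supplier": "Acme AG"},
]

anonymized_rows, reference_table = anonymize(rows, "supplier", "Supplier")
print(reference_table)
print(anonymized_rows)
```

Because the same original value always maps to the same reference, relationships in the data (for example, which cases share a supplier) remain analyzable. The reference table itself is sensitive and should be stored separately from the anonymized output.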

Here, the "Summarize Tool" simply checks the number of records before and after anonymization.
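A comparable audit can be expressed in a few lines of Python. The sketch assumes that both data sets are lists of events with an `activity` field and that anonymization should neither add nor drop records; the per-activity comparison goes slightly beyond a pure record count.

```python
from collections import Counter


def audit(before, after):
    """Compare record counts, overall and per activity, before and after anonymization."""
    counts_before = Counter(event["activity"] for event in before)
    counts_after = Counter(event["activity"] for event in after)
    return {
        "records_before": len(before),
        "records_after": len(after),
        "counts_match": counts_before == counts_after,
    }


# Hypothetical data: anonymization changed the supplier but kept every record.
before = [
    {"activity": "Order created", "supplier": "Acme AG"},
    {"activity": "Goods received", "supplier": "Acme AG"},
]
after = [
    {"activity": "Order created", "supplier": "Supplier1"},
    {"activity": "Goods received", "supplier": "Supplier1"},
]

print(audit(before, after))
```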

g.) Generate output files (.csv)

Use the "Output Data Tool" to write a CSV file to the defined destination.
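For comparison, the same output step in Python could look like the sketch below. It writes to an in-memory buffer so that it runs anywhere; the column names and sample events are assumptions. For a real file, the buffer would be replaced with something like `open(path, "w", newline="", encoding="utf-8")`.

```python
import csv
import io

# Assumed column names for the three minimum requirements.
FIELDS = ["case_id", "activity", "timestamp"]


def write_event_log(events, target):
    """Write events as CSV, sorted by case ID and timestamp."""
    writer = csv.DictWriter(target, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for event in sorted(events, key=lambda e: (e["case_id"], e["timestamp"])):
        writer.writerow(event)


# Hypothetical events for illustration only.
events = [
    {"case_id": "1001", "activity": "Goods received", "timestamp": "2020-03-10T08:00:00"},
    {"case_id": "1001", "activity": "Order created", "timestamp": "2020-03-02T08:00:00"},
]

buffer = io.StringIO()
write_event_log(events, buffer)
print(buffer.getvalue())
```

Sorting is not required by the CSV format, but a log ordered by case and time is easier to inspect by eye before uploading it.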

Step 3: Perform process analysis with Celonis Snap

a.) Upload CSV files & b.) Assign "Case ID", "Activity" and "Timestamp" manually

Upload in Celonis Snap

We have now seen how to upload a CSV file in Celonis Snap and manually assign the necessary parameters.

c.) Analyze processes 

In the following video, we will show you how to analyze case count, turnaround times, and rework rate in Process Explorer.
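To make the three key figures more tangible, here is a small sketch that computes them from an event log. The definitions are assumptions for illustration and may differ from those used in Celonis: turnaround time is taken as the span between the first and last event of a case, and a case counts as rework if any activity occurs more than once.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import median


def summarize(events):
    """Compute case count, median turnaround time in days and rework rate."""
    traces = defaultdict(list)
    for event in events:
        moment = datetime.fromisoformat(event["timestamp"])
        traces[event["case_id"]].append((moment, event["activity"]))
    if not traces:
        raise ValueError("event log is empty")

    durations_days = []
    rework_cases = 0
    for trace in traces.values():
        trace.sort()  # order the events of a case by time
        span = trace[-1][0] - trace[0][0]
        durations_days.append(span.total_seconds() / 86400)
        activity_counts = Counter(activity for _, activity in trace)
        if any(count > 1 for count in activity_counts.values()):
            rework_cases += 1

    return {
        "case_count": len(traces),
        "median_turnaround_days": median(durations_days),
        "rework_rate": rework_cases / len(traces),
    }


# Hypothetical event log: case 1002 receives goods twice.
event_log = [
    {"case_id": "1001", "activity": "Order created", "timestamp": "2020-03-02T08:00:00"},
    {"case_id": "1001", "activity": "Goods received", "timestamp": "2020-03-10T08:00:00"},
    {"case_id": "1001", "activity": "Invoice received", "timestamp": "2020-03-12T08:00:00"},
    {"case_id": "1002", "activity": "Order created", "timestamp": "2020-03-03T08:00:00"},
    {"case_id": "1002", "activity": "Goods received", "timestamp": "2020-03-05T08:00:00"},
    {"case_id": "1002", "activity": "Goods received", "timestamp": "2020-03-06T08:00:00"},
    {"case_id": "1002", "activity": "Invoice received", "timestamp": "2020-03-07T08:00:00"},
]

print(summarize(event_log))
```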

You have now seen what is required in advance before you can start with process mining. Celonis has a few process connectors up its sleeve and is constantly expanding them, but it is far from being a plug-and-play solution. So if you want to do process mining in a more complex application architecture, harmonizing the event data is essential.

Do you want to gain experience in process mining together with us? Or do you already use Celonis and want to unlock its full potential with Alteryx?

Then get in touch with us. We will be happy to address your individual needs and show you possible solutions.


Address

St. Jakobs-Strasse 3, 4052 Basel, CH

Phone number

+41 (0)61 551 0012

Linkedin

banian-ag

We look forward to hearing from you