Process mining

Description

The activity extracts process sequences from raw event data to support use cases such as flow visualization, performance analysis, and bottleneck detection.
It operates on a list of events, where each event belongs to a group and has a name, a start time, and an end time. The columns holding these four values (group, event, start time, and end time) are selected in the activity's configuration, so the activity can be adapted to different datasets.
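
As a rough illustration of this input shape, the sketch below models a single event and maps configurable column names onto it. The `Event` class, the `COLUMNS` mapping, and the `to_event` and `parse_time` helpers are hypothetical names chosen for this example; they are not part of the activity.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    group: str       # value of the configured "Event group" column
    name: str        # value of the configured "Event" column
    start: datetime  # value of the configured "Start time" column
    end: datetime    # value of the configured "End time" column


# Hypothetical mapping from the four logical roles to dataset column names.
COLUMNS = {"group": "caseId", "event": "eventId",
           "start": "startTime", "end": "endTime"}


def parse_time(value: str) -> datetime:
    # Older Python versions reject a trailing "Z", so swap it for an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def to_event(row: dict, columns: dict = COLUMNS) -> Event:
    # Read each logical field from whichever column the mapping names.
    return Event(
        group=str(row[columns["group"]]),
        name=str(row[columns["event"]]),
        start=parse_time(row[columns["start"]]),
        end=parse_time(row[columns["end"]]),
    )


row = {"eventId": "2673023124", "caseId": "166153460",
       "startTime": "2025-01-02T03:29:07.000Z",
       "endTime": "2025-01-02T03:29:21.000Z"}
event = to_event(row)
```

Changing only the `COLUMNS` mapping is enough to point the same logic at a dataset with different column names.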

The core logic begins by grouping all events based on their group identifier. Each group represents a single execution or instance of a process. Within each group:
Events are sorted by their start and end times.
Each event is assigned an index representing its position in the timeline.
The activity then establishes connections (or “edges”) between events based on their timing—typically from an earlier event to a later one.
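
The grouping, sorting, and indexing steps above can be sketched as follows. This is a minimal illustration rather than the activity's actual implementation: the dictionary keys (`group`, `name`, `start`, `end`, `index`) and the function name are assumptions, and timestamps are assumed to be ISO-8601 strings in one shared UTC format so that they sort chronologically as plain strings.

```python
from collections import defaultdict


def group_and_index(events):
    # Bucket events by their group identifier (one bucket per process instance).
    groups = defaultdict(list)
    for ev in events:
        groups[ev["group"]].append(ev)

    indexed = {}
    for group_id, members in groups.items():
        # Sort by start time, breaking ties on end time.
        ordered = sorted(members, key=lambda e: (e["start"], e["end"]))
        # Attach each event's position in the group's timeline.
        indexed[group_id] = [dict(e, index=i) for i, e in enumerate(ordered)]
    return indexed


events = [
    {"group": "c1", "name": "Assigned",
     "start": "2025-01-02T03:29:22Z", "end": "2025-01-02T03:29:32Z"},
    {"group": "c1", "name": "Received",
     "start": "2025-01-02T03:29:07Z", "end": "2025-01-02T03:29:21Z"},
    {"group": "c2", "name": "Received",
     "start": "2025-01-02T04:00:00Z", "end": "2025-01-02T04:00:05Z"},
]
result = group_and_index(events)
```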

Initial sequences are drawn from the first events in a group (those that start earliest), followed by connecting each event to its most recent predecessor that ended before it began. This captures a linear or sequential flow of execution.
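
One possible reading of this rule is sketched below for a single group. It rests on several assumptions: times are plain numbers, an event that ends exactly when another starts still counts as its predecessor (the description does not say whether the comparison is strict), and an initial edge is represented with `None` as its source. The function name is hypothetical.

```python
def link_to_predecessors(ordered):
    """ordered: list of (name, start, end) tuples sorted by (start, end)."""
    first_start = ordered[0][1]
    edges = []
    for i, (name, start, _end) in enumerate(ordered):
        if start == first_start:
            # Events that start earliest open a sequence.
            edges.append((None, name))
            continue
        # Earlier events that had already finished when this one began.
        finished = [p for p in ordered[:i] if p[2] <= start]
        if finished:
            # "Most recent" predecessor: the one with the latest end time.
            predecessor = max(finished, key=lambda p: p[2])
            edges.append((predecessor[0], name))
    return edges


ordered = [("B", 0, 5), ("A", 0, 10), ("C", 6, 8), ("D", 11, 12)]
edges = link_to_predecessors(ordered)
# edges == [(None, "B"), (None, "A"), ("B", "C"), ("A", "D")]
```

Note that `D` links to `A` rather than to `C`: `A` ended at 10, later than `C` at 8, so it is the most recent event to finish before `D` began.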

In parallel processes where multiple events may start around the same time, the logic also identifies and links these overlapping activities. Additionally, if certain events aren’t part of any sequence yet, the activity attempts to connect them to the next plausible event.
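
The description does not spell out how overlapping events are detected or linked. As an assumption-laden sketch, the standard interval-overlap test below shows one way to find candidate pairs within a group; the function name and the tuple layout are hypothetical, and the activity may use a different criterion.

```python
from itertools import combinations


def overlapping_pairs(events):
    """events: list of (name, start, end) tuples from a single group."""
    pairs = []
    for a, b in combinations(events, 2):
        # Two intervals overlap when each starts before the other ends.
        if a[1] < b[2] and b[1] < a[2]:
            pairs.append((a[0], b[0]))
    return pairs


events = [("Review", 0, 10), ("Test", 2, 8), ("Deploy", 10, 15)]
pairs = overlapping_pairs(events)
# pairs == [("Review", "Test")]
```

With the strict comparison used here, `Deploy` does not overlap `Review` because it starts at the exact moment `Review` ends, and zero-duration events never overlap anything.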

Finally, events that end last in their group are considered terminal points and linked accordingly to indicate the process end.
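
A minimal sketch of this last step, assuming numeric times and using `None` as a stand-in for the process end (the actual output representation of a terminal edge is not shown in this document):

```python
def terminal_edges(events):
    """events: list of (name, start, end) tuples from a single group."""
    # The latest end time in the group marks the end of the process instance.
    last_end = max(end for _name, _start, end in events)
    # Every event that ends at that moment is treated as a terminal point.
    return [(name, None) for name, _start, end in events if end == last_end]


events = [("Received", 0, 10), ("In Progress", 12, 40), ("Notify", 30, 40)]
terminals = terminal_edges(events)
# terminals == [("In Progress", None), ("Notify", None)]
```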

Each detected transition is stored with details such as the group, start and end timestamps, a descriptive sequence like “A->B”, and a count of how many times that transition occurred. This structured output helps in mapping the real execution path of processes for further analysis.
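
To illustrate the sequence string and the count, the sketch below tallies transitions with `collections.Counter`. Whether the activity counts per group or across all groups is not stated here; this example assumes a count across all groups, and the edge tuple layout and function name are hypothetical.

```python
from collections import Counter


def count_transitions(edges, separator="->"):
    """edges: iterable of (group, source, target) tuples."""
    # Build the descriptive sequence string and count identical ones.
    return Counter(f"{src}{separator}{dst}" for _group, src, dst in edges)


edges = [
    ("ticket-1", "Received", "Assigned"),
    ("ticket-1", "Assigned", "Resolved"),
    ("ticket-2", "Received", "Assigned"),
    ("ticket-2", "Assigned", "In Progress"),
]
counts = count_transitions(edges)
print(counts.most_common(1))  # [('Received->Assigned', 2)]
```

A transition that appears in only a few groups, such as `Assigned->Resolved` here, can point to a skipped step.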

Use case: In a customer support process, each support ticket logs events like Received, Assigned, In Progress, and Resolved. By applying the Process Mining activity, teams can extract the actual flow of events per ticket and identify delays or skipped steps, helping improve response time and service quality. The extracted sequences can then be visualized or analyzed further using dashboards.

Input

| Input Type | Required |
|---|---|
| Data | Required |

Output

| Output Type | Description |
|---|---|
| Data | Transformed data with sequences |

Configuration Fields

| Field | Description |
|---|---|
| Event group | Column representing the group identifier (e.g., case ID). |
| Event | Column that uniquely identifies an event. |
| Start time | Column indicating when an event starts. |
| End time | Column indicating when an event ends. |
| Column name for sequence Id | Name of the column to store the sequence string (e.g., A->B). |
| Sequence Id separator | Character or string used to join event IDs in a sequence (e.g., ->). |

Sample Input

| eventId | caseId | startTime | endTime |
|---|---|---|---|
| 2673023124 | 166153460 | 2025-01-02T03:29:07.000Z | 2025-01-02T03:29:21.000Z |
| 3470567566 | 166153460 | 2025-01-02T03:29:22.000Z | 2025-01-02T03:29:32.000Z |
| 1553158285 | 166153460 | 2025-01-02T03:29:56.000Z | 2025-01-02T03:29:56.000Z |
| 1553158285 | 166153460 | 2025-01-02T03:30:08.000Z | 2025-01-02T03:30:08.000Z |
| 86709890 | 166153460 | 2025-01-02T03:30:10.000Z | 2025-01-02T03:30:12.000Z |

Sample Configuration

| Field | Value |
|---|---|
| Event group | caseId |
| Event | eventId |
| Start time | startTime |
| End time | endTime |
| Column name for sequence Id | SequenceId |
| Sequence Id separator | -> |

Sample Output

| Case Id | Sequence Id | Start time | End time |
|---|---|---|---|
| 166153460 | ->2673023124 | | 2025-01-02T03:29:07.000Z |
| 166153460 | 2673023124->3470567566 | 2025-01-02T03:29:21.000Z | 2025-01-02T03:29:22.000Z |
| 166153460 | 3470567566->1553158285 | 2025-01-02T03:29:32.000Z | 2025-01-02T03:29:56.000Z |
| 166153460 | 1553158285->1553158285 | 2025-01-02T03:29:56.000Z | 2025-01-02T03:30:08.000Z |
| 166153460 | 1553158285->86709890 | 2025-01-02T03:30:08.000Z | 2025-01-02T03:30:10.000Z |
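
The sample rows suggest that each transition's start time is the predecessor's end time and its end time is the successor's start time. The sketch below reproduces the five sample output rows from the sample input under that reading. It is an approximation for a single group, not the activity's implementation: the `mine` function is hypothetical, placing the first row's only timestamp in the end-time position is an assumption, and timestamps are compared as strings, which works only because they share one ISO-8601 UTC format.

```python
rows = [  # (eventId, caseId, startTime, endTime) from the sample input
    ("2673023124", "166153460", "2025-01-02T03:29:07.000Z", "2025-01-02T03:29:21.000Z"),
    ("3470567566", "166153460", "2025-01-02T03:29:22.000Z", "2025-01-02T03:29:32.000Z"),
    ("1553158285", "166153460", "2025-01-02T03:29:56.000Z", "2025-01-02T03:29:56.000Z"),
    ("1553158285", "166153460", "2025-01-02T03:30:08.000Z", "2025-01-02T03:30:08.000Z"),
    ("86709890", "166153460", "2025-01-02T03:30:10.000Z", "2025-01-02T03:30:12.000Z"),
]


def mine(rows, separator="->"):
    """Return (caseId, sequence, start, end) tuples for one group."""
    ordered = sorted(rows, key=lambda r: (r[2], r[3]))
    first_start = ordered[0][2]
    out = []
    for i, (event, case, start, _end) in enumerate(ordered):
        if start == first_start:
            # Initial edge: no predecessor, so the start time is left empty.
            out.append((case, f"{separator}{event}", "", start))
            continue
        # Earlier events that had finished by the time this one started.
        finished = [p for p in ordered[:i] if p[3] <= start]
        if finished:
            pred = max(finished, key=lambda p: p[3])
            # The transition spans the predecessor's end to this event's start.
            out.append((case, f"{pred[0]}{separator}{event}", pred[3], start))
    return out


for record in mine(rows):
    print(record)
```

The difference between a record's end and start times is the waiting time between two events, which is the quantity a bottleneck analysis would typically aggregate.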