Find Text

Description

The Find Text activity extracts specific portions of text from selected columns based on a user-defined regular expression (regex). It is useful for pulling structured tokens such as keywords or codes out of unstructured text.

Use Case
Extract keywords such as product codes, IDs, or tags from a sentence or description field using regular expressions.

Input

Type	Description
Data	Input dataset containing text columns

Output

Type	Description
Transformed Data	Dataset with new columns holding the values extracted by the pattern.

Configuration Fields

Field Name	Required	Description
Columns To Find	Yes	Column(s) from which the text will be extracted using regex.
Pattern	Yes	Regular expression used to extract matching portions from the column text.
Output Columns Prefix	Yes	Prefix used when creating new output columns for extracted matches.
Include Original	No	If enabled, original columns will be included in the output.

Sample Input

ID	Description
1	This contains ABC and XYZ
2	Find CODE inside this text
3	No pattern matches here
4	Extract INFO and DATA points
5	SAMPLE test for extraction

Sample Configuration

Field	Value
Columns To Find	`Description`
Pattern	`([A-Z]{3,})`
Output Columns Prefix	`Column_`
Include Original	`Enabled`

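Explanation: This regex extracts every run of three or more consecutive uppercase letters (A-Z). It has no word-boundary anchors, so a run inside a longer mixed-case word also matches.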
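
To check what the sample pattern matches, it can be tried outside the activity. The sketch below uses Python's standard `re` module as a stand-in; the activity's own regex engine is not documented here and may differ in details, and the extra string `"refABCD-12"` is a made-up input added only to show a match inside a longer token.

```python
import re

# Same pattern as in the sample configuration.
PATTERN = re.compile(r"([A-Z]{3,})")

descriptions = [
    "This contains ABC and XYZ",
    "Find CODE inside this text",
    "No pattern matches here",
    "Extract INFO and DATA points",
    "SAMPLE test for extraction",
]

# findall returns every non-overlapping match, left to right.
matches = [PATTERN.findall(text) for text in descriptions]
for text, found in zip(descriptions, matches):
    print(f"{text!r} -> {found}")

# Hypothetical input: the uppercase run is embedded in a longer token.
embedded = PATTERN.findall("refABCD-12")
print(embedded)
```

Capitalized words such as "This" or "Find" are not matched, because they contain only one uppercase letter in a row.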

Sample Output

ID	Description	Column_1	Column_2
1	This contains ABC and XYZ	ABC	XYZ
2	Find CODE inside this text	CODE
3	No pattern matches here
4	Extract INFO and DATA points	INFO	DATA
5	SAMPLE test for extraction	SAMPLE
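
The table above can be reproduced with a small stand-alone sketch. This is not the activity's implementation: `find_text` is a hypothetical helper, it assumes Include Original is enabled as in the sample configuration, and it assumes (based only on the sample output) that one numbered column is created per match and that rows with fewer matches get empty cells.

```python
import re

def find_text(rows, column, pattern, prefix="Column_"):
    """Hypothetical helper mimicking the sample output; illustration only."""
    regex = re.compile(pattern)
    # With a single capturing group, findall returns a list of strings.
    found = [regex.findall(row[column]) for row in rows]
    # Assumption: the number of new columns equals the largest match count.
    width = max((len(m) for m in found), default=0)
    result = []
    for row, row_matches in zip(rows, found):
        out = dict(row)  # keep original columns (Include Original enabled)
        for i in range(width):
            # Assumption: missing matches are left as empty strings.
            out[f"{prefix}{i + 1}"] = row_matches[i] if i < len(row_matches) else ""
        result.append(out)
    return result

rows = [
    {"ID": 1, "Description": "This contains ABC and XYZ"},
    {"ID": 2, "Description": "Find CODE inside this text"},
    {"ID": 3, "Description": "No pattern matches here"},
    {"ID": 4, "Description": "Extract INFO and DATA points"},
    {"ID": 5, "Description": "SAMPLE test for extraction"},
]

output = find_text(rows, "Description", r"([A-Z]{3,})")
for row in output:
    print(row)
```

The helper handles patterns with exactly one capturing group; with several groups, `re.findall` returns tuples and the column layout would need a separate decision.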

Tip: Use capturing groups such as (\d{4}) to extract four-digit numeric codes, or #(\w+) to extract hashtags.
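
The two patterns above can be tried in Python's `re` module before configuring the activity. The input strings are invented for illustration, and the word-boundary variant `\b(\d{4})\b` is an optional refinement not mentioned above; whether the activity's regex engine supports `\b` is an assumption.

```python
import re

text = "Order 2024 shipped, ref 12345"

# (\d{4}) matches any four consecutive digits, including the first
# four digits of a longer number such as 12345.
four_digits = re.findall(r"(\d{4})", text)

# Optional refinement: word boundaries keep only stand-alone 4-digit codes.
exact_codes = re.findall(r"\b(\d{4})\b", text)

# #(\w+) captures the word characters that follow each '#'.
# In Python, findall returns the group content, so '#' is not included.
hashtags = re.findall(r"#(\w+)", "Launch day #release #v2_beta!")

print(four_digits)
print(exact_codes)
print(hashtags)
```

If longer numbers should not contribute partial matches, the anchored variant is the safer starting point.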