Description
The Find Text activity extracts specific portions of text from selected columns based on a user-defined regex pattern. It is useful for parsing structured tokens, keywords, codes, or patterns from unstructured text.
Use Case
Extract keywords such as product codes, IDs, or tags from a sentence or description field using regular expressions.
Input
Type | Description |
---|---|
Data | Input dataset containing text columns |
Output
Type | Description |
---|---|
Transformed Data | New columns with extracted values from patterns. |
Configuration Fields
Field Name | Required | Description |
---|---|---|
Columns To Find | Yes | Column(s) from which the text will be extracted using regex. |
Pattern | Yes | Regular expression used to extract matching portions from the column text. |
Output Columns Prefix | Yes | Prefix used when creating new output columns for extracted matches. |
Include Original | No | If enabled, original columns will be included in the output. |
Sample Input
ID | Description |
---|---|
1 | This contains ABC and XYZ |
2 | Find CODE inside this text |
3 | No pattern matches here |
4 | Extract INFO and DATA points |
5 | SAMPLE test for extraction |
Sample Configuration
Field | Value |
---|---|
Columns To Find | Description |
Pattern | ([A-Z]{3,}) |
Output Columns Prefix | Column_ |
Include Original | Enabled |
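Explanation: This regex captures every run of three or more consecutive uppercase letters (A-Z). Each match is written to its own output column, numbered in order of appearance (Column_1, Column_2); rows with fewer matches leave the remaining columns empty.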
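To see what the pattern does and does not match, it can be tried in Python's standard `re` module. This is only an illustration: the activity's own regex engine is not specified here, and the `WHOLE_WORD` variant is a hypothetical alternative, not part of the sample configuration.

```python
import re

# The pattern from the sample configuration.
PATTERN = re.compile(r"([A-Z]{3,})")

samples = [
    "This contains ABC and XYZ",
    "No pattern matches here",
    "ID AB ABCD",
    "USBport v2",
]

# Uppercase runs shorter than three letters are skipped; a run inside a
# longer mixed-case token (such as "USBport") is still captured.
for text in samples:
    print(text, "->", PATTERN.findall(text))

# Hypothetical stricter variant: \b anchors limit matches to whole
# uppercase words, so "USBport" no longer contributes a match.
WHOLE_WORD = re.compile(r"\b([A-Z]{3,})\b")
print(WHOLE_WORD.findall("USBport v2"))
```

If only standalone uppercase words are wanted, a word-boundary variant along these lines may be a closer fit than the bare character class.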
Sample Output
ID | Description | Column_1 | Column_2 |
---|---|---|---|
1 | This contains ABC and XYZ | ABC | XYZ |
2 | Find CODE inside this text | CODE | |
3 | No pattern matches here | | |
4 | Extract INFO and DATA points | INFO | DATA |
5 | SAMPLE test for extraction | SAMPLE | |
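The sample above can be approximated in plain Python as a rough sketch. The `find_text` function and its parameters are hypothetical stand-ins for the configuration fields, not the activity's actual implementation. It handles a single column, and it assumes that disabling Include Original drops only the searched column; the real behavior may differ.

```python
import re

def find_text(rows, column, pattern, prefix, include_original=True):
    """Hypothetical stand-in for the Find Text activity (single column)."""
    regex = re.compile(pattern)
    # With exactly one capture group, findall returns the captured strings.
    matches = [regex.findall(row[column]) for row in rows]
    # The widest row decides how many output columns are created.
    width = max((len(found) for found in matches), default=0)

    result = []
    for row, found in zip(rows, matches):
        if include_original:
            new_row = dict(row)
        else:
            # Assumption: only the searched column is dropped.
            new_row = {k: v for k, v in row.items() if k != column}
        for i in range(width):
            # Pad rows that have fewer matches with empty strings.
            new_row[f"{prefix}{i + 1}"] = found[i] if i < len(found) else ""
        result.append(new_row)
    return result

rows = [
    {"ID": 1, "Description": "This contains ABC and XYZ"},
    {"ID": 2, "Description": "Find CODE inside this text"},
    {"ID": 3, "Description": "No pattern matches here"},
    {"ID": 4, "Description": "Extract INFO and DATA points"},
    {"ID": 5, "Description": "SAMPLE test for extraction"},
]

result = find_text(rows, "Description", r"([A-Z]{3,})", "Column_")
for row in result:
    print(row)
```

Padding short rows with empty strings mirrors the blank cells in the Sample Output table.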
Use grouping patterns like (\d{4}) to extract numeric codes, or #(\w+) to extract hashtags.
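Both patterns can be checked quickly with Python's `re` module. The input strings below are made up for illustration, and the activity's regex engine may differ in details.

```python
import re

# Four consecutive digits. Note that this also captures the first four
# digits of a longer number such as 123456.
codes = re.findall(r"(\d{4})", "Order 2024 shipped, ref 123456")
print(codes)

# Hashtags: the capture group keeps the tag text without the leading '#'.
tags = re.findall(r"#(\w+)", "Launch day #release #v2_beta")
print(tags)
```

If only standalone four-digit codes are wanted, a stricter pattern such as \b(\d{4})\b may be a better fit.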