Description
The Filter On Bad Meaning activity helps clean and standardize your dataset by detecting and handling rows that contain undesirable data patterns—such as URLs, IP addresses, booleans, dates, and more—based on configurable meanings per column.
You can take various actions on these rows, including removing them, clearing the problematic cells, or flagging them for review.
Use case:
Useful for detecting and eliminating noise from survey results, logs, scraped data, or user inputs before applying analysis or transformations.
Input
Type | Description |
---|
Data | Dataset containing values to inspect. |
Output
Type | Description |
---|
Transformed Data | Filtered or annotated data based on detected bad meanings. |
Configuration Fields
Field Name | Required | Description |
---|
Meanings | Yes | Mapping of columns to the “bad meanings” to detect. Each column can have one or more bad meanings. Supported bad meanings: URL, Port, IP Address, Boolean, Text, Decimal, Integer, Date. |
Actions | Yes | Action to perform on rows that match any bad meaning: Remove matching rows; Clear content of matching cells; Keep matching rows; Flag rows; Clear content of non-matching cells. |
Flag Rows Action | No | Method used to flag matching rows, such as 0/1 or True/False. Visible only when the action is Flag rows. |
Flag Rows Column Name | No | Name of the new column that flags rows with bad meanings. Visible only when the action is Flag rows. |
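The five actions can be pictured with a short Python sketch. It is not the activity's implementation: `apply_action`, the action keys, and the `is_bad` predicate are hypothetical names. It also assumes that Clear content of non-matching cells only touches the configured columns, which the table above does not specify.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
# Hypothetical predicate: True when the cell in `column` counts as "bad".
IsBad = Callable[[str, str], bool]


def apply_action(rows: List[Row], columns: List[str], is_bad: IsBad,
                 action: str, flag_column: str = "BadDataFlag") -> List[Row]:
    """Apply one action to copies of `rows`; only `columns` are inspected."""
    result = []
    for row in rows:
        # Configured columns whose cell matches a bad meaning.
        bad = [c for c in columns if is_bad(c, str(row[c]))]
        new_row = dict(row)
        if action == "remove":                # Remove matching rows
            if bad:
                continue
        elif action == "keep":                # Keep matching rows
            if not bad:
                continue
        elif action == "clear_matching":      # Clear content of matching cells
            for c in bad:
                new_row[c] = ""
        elif action == "clear_non_matching":  # Assumed: configured columns only
            for c in columns:
                if c not in bad:
                    new_row[c] = ""
        elif action == "flag":                # Flag rows: 1 = bad, 0 = clean
            new_row[flag_column] = 1 if bad else 0
        else:
            raise ValueError(f"unknown action: {action}")
        result.append(new_row)
    return result


def looks_like_integer(column: str, value: str) -> bool:
    """Toy predicate for the demo: all-digit strings count as bad."""
    return value.isdigit()


rows = [
    {"ID": "1", "Value": "42"},
    {"ID": "2", "Value": "hello"},
]

removed = apply_action(rows, ["Value"], looks_like_integer, "remove")
cleared = apply_action(rows, ["Value"], looks_like_integer, "clear_matching")
flagged = apply_action(rows, ["Value"], looks_like_integer, "flag")
```

In this sketch the input rows are never modified. Each action returns new rows, so several actions can be compared against the same data.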
Sample Input
ID | Column1 | Column2 | OtherColumn |
---|
1 | http://example.com | TRUE | Value A |
2 | 192.168.1.1 | 42 | Value B |
3 | ValidText | FALSE | Value C |
Sample Configuration
Field Name | Value |
---|
Meanings | Column1 → [URL, IP Address], Column2 → [Boolean, Integer] |
Actions | Flag rows |
Flag Rows Action | Binary (0 for clean, 1 for flagged) |
Flag Rows Column Name | BadDataFlag |
Sample Output
ID | Column1 | Column2 | OtherColumn | BadDataFlag |
---|
1 | http://example.com | TRUE | Value A | 1 |
2 | 192.168.1.1 | 42 | Value B | 1 |
3 | ValidText | FALSE | Value C | 1 |
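As a rough illustration of how the sample configuration relates to the flag column, the following Python sketch checks each configured cell against a set of assumed detectors. The detector functions and their rules (for example, only `http` and `https` count as URLs) are assumptions made for this example, not the activity's actual matching logic. A row receives 1 as soon as any configured cell matches one of its bad meanings.

```python
import ipaddress
import re
from urllib.parse import urlparse


def is_url(value: str) -> bool:
    # Assumed rule: an http(s) scheme plus a host part.
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_ip_address(value: str) -> bool:
    # ipaddress.ip_address raises ValueError for anything that is not IPv4/IPv6.
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def is_boolean(value: str) -> bool:
    # Assumed rule: case-insensitive "true" / "false".
    return value.strip().lower() in ("true", "false")


def is_integer(value: str) -> bool:
    # Assumed rule: optional sign followed by digits only.
    return re.fullmatch(r"[+-]?\d+", value.strip()) is not None


# Hypothetical registry covering only the meanings used in the sample.
DETECTORS = {
    "URL": is_url,
    "IP Address": is_ip_address,
    "Boolean": is_boolean,
    "Integer": is_integer,
}

# The sample configuration: column -> bad meanings to detect.
MEANINGS = {
    "Column1": ["URL", "IP Address"],
    "Column2": ["Boolean", "Integer"],
}

rows = [
    {"ID": 1, "Column1": "http://example.com", "Column2": "TRUE", "OtherColumn": "Value A"},
    {"ID": 2, "Column1": "192.168.1.1", "Column2": "42", "OtherColumn": "Value B"},
    {"ID": 3, "Column1": "ValidText", "Column2": "FALSE", "OtherColumn": "Value C"},
]


def matched_meanings(row: dict) -> dict:
    """Return {column: [matched bad meanings]} for the configured columns."""
    matches = {}
    for column, meanings in MEANINGS.items():
        hits = [m for m in meanings if DETECTORS[m](str(row[column]))]
        if hits:
            matches[column] = hits
    return matches


# Flag rows action with the binary method: 1 = flagged, 0 = clean.
flagged = [{**row, "BadDataFlag": 1 if matched_meanings(row) else 0} for row in rows]

for row in flagged:
    print(row["ID"], matched_meanings(row), row["BadDataFlag"])
```

Columns that are not listed in the configuration, such as `OtherColumn`, are never inspected in this sketch.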
Combine this activity with Extract Pattern, Standardize Data, or Validate Data Type to build robust data quality workflows.