Filter on bad meaning

Description

The Filter On Bad Meaning activity helps clean and standardize your dataset by detecting and handling rows whose cells match a meaning you have marked as bad for that column (for example URLs, IP addresses, booleans, or dates). Bad meanings are configured per column.

You can take various actions on these rows, including removing them, clearing the problematic cells, or flagging them for review.

Use case:
Useful for detecting and eliminating noise from survey results, logs, scraped data, or user inputs before applying analysis or transformations.


Input

| Type | Description |
| --- | --- |
| Data | Dataset containing values to inspect. |

Output

| Type | Description |
| --- | --- |
| Transformed Data | Filtered or annotated data based on detected bad meanings. |

Configuration Fields

| Field Name | Required | Description |
| --- | --- | --- |
| Meanings | Yes | Mapping of columns to “bad meanings” to detect. Each column can have one or more bad meanings. Supported bad meanings: URL, Port, IP Address, Boolean, Text, Decimal, Integer, Date. |
| Actions | Yes | Action to perform on rows that match any bad meaning: Remove matching rows; Clear content of matching cells; Keep matching rows; Flag rows; Clear content of non-matching cells. |
| Flag Rows Action | No | Method used to flag matching rows, such as 0/1 or True/False. (Visible only if the action is Flag rows.) |
| Flag Rows Column Name | No | Name of the new column that flags rows with bad meanings. (Visible only if the action is Flag rows.) |
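
The five actions can be sketched in plain Python as follows. This is an illustration only, not the activity's implementation: `apply_action`, the action identifiers, and the `bad_columns` input (the set of columns in each row whose cell matched a bad meaning) are hypothetical names. Two behaviours are assumptions: "Keep matching rows" is taken to drop every row without a match, and "Clear content of non-matching cells" is taken to apply only to columns that have meanings configured.

```python
def apply_action(rows, bad_columns, action, checked_columns=(),
                 flag_column="BadDataFlag", flag_values=(0, 1)):
    """Apply one action to a list of row dicts (hypothetical helper).

    rows            -- list of dicts, one per row
    bad_columns     -- list of sets; bad_columns[i] names the columns of
                       rows[i] whose cell matched a bad meaning
    checked_columns -- columns that have meanings configured (assumed scope
                       of "Clear content of non-matching cells")
    flag_values     -- (value for clean rows, value for flagged rows)
    """
    clean_value, flagged_value = flag_values
    result = []
    for row, bad in zip(rows, bad_columns):
        row = dict(row)  # work on a copy so the input stays untouched
        if action == "remove_matching_rows":
            if bad:
                continue
        elif action == "keep_matching_rows":
            if not bad:
                continue
        elif action == "clear_matching_cells":
            for col in bad:
                row[col] = None
        elif action == "clear_non_matching_cells":
            for col in checked_columns:
                if col not in bad:
                    row[col] = None
        elif action == "flag_rows":
            row[flag_column] = flagged_value if bad else clean_value
        else:
            raise ValueError(f"unknown action: {action}")
        result.append(row)
    return result


# Small demonstration with made-up rows: only the first row has a match.
demo_rows = [
    {"ID": 1, "Column1": "http://example.com"},
    {"ID": 2, "Column1": "ValidText"},
]
demo_bad = [{"Column1"}, set()]

print(apply_action(demo_rows, demo_bad, "remove_matching_rows"))
print(apply_action(demo_rows, demo_bad, "flag_rows", flag_values=(False, True)))
```

Passing `flag_values=(False, True)` mirrors choosing a True/False method instead of 0/1 for Flag Rows Action.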

Sample Input

| ID | Column1 | Column2 | OtherColumn |
| --- | --- | --- | --- |
| 1 | http://example.com | TRUE | Value A |
| 2 | 192.168.1.1 | 42 | Value B |
| 3 | ValidText | FALSE | Value C |

Sample Configuration

| Field Name | Value |
| --- | --- |
| Meanings | Column1 → [URL, IP Address], Column2 → [Boolean, Integer] |
| Actions | Flag rows |
| Flag Rows Action | Binary (0 for clean, 1 for flagged) |
| Flag Rows Column Name | BadDataFlag |

Sample Output

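| ID | Column1 | Column2 | OtherColumn | BadDataFlag |
| --- | --- | --- | --- | --- |
| 1 | http://example.com | TRUE | Value A | 1 |
| 2 | 192.168.1.1 | 42 | Value B | 1 |
| 3 | ValidText | FALSE | Value C | 1 |

Rows 1 and 2 match a bad meaning in both configured columns (URL and Boolean, IP Address and Integer). Row 3 matches only Boolean in Column2, which is enough to flag the row.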
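
A minimal sketch of how flags could be computed for the sample configuration is shown below. The detection rules are assumptions, since the activity's actual rules are not documented here: URL means an http, https, or ftp address with a host; IP Address means anything accepted by Python's `ipaddress` module; Boolean means the text "true" or "false" in any case; Integer means an optionally signed run of digits. The helper names (`is_url`, `bad_columns`, `flag_rows`, and so on) are hypothetical, and row 2 of the sample is read as `192.168.1.1` in Column1 and `42` in Column2.

```python
import ipaddress
import re
from urllib.parse import urlparse


def is_url(value):
    """Assumed rule: http/https/ftp scheme plus a non-empty host."""
    try:
        parsed = urlparse(value)
    except ValueError:
        return False
    return parsed.scheme in ("http", "https", "ftp") and bool(parsed.netloc)


def is_ip_address(value):
    """Assumed rule: a valid IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def is_boolean(value):
    """Assumed rule: the text 'true' or 'false', case-insensitive."""
    return value.strip().lower() in {"true", "false"}


def is_integer(value):
    """Assumed rule: optional sign followed by ASCII digits only."""
    return re.fullmatch(r"[+-]?[0-9]+", value.strip()) is not None


DETECTORS = {
    "URL": is_url,
    "IP Address": is_ip_address,
    "Boolean": is_boolean,
    "Integer": is_integer,
}


def bad_columns(row, meanings):
    """Return the configured columns whose cell matches any of its bad meanings."""
    return {
        column
        for column, names in meanings.items()
        if any(DETECTORS[name](str(row[column])) for name in names)
    }


def flag_rows(rows, meanings, flag_column="BadDataFlag"):
    """Add a 0/1 flag column: 1 if any configured cell matches, else 0."""
    return [
        {**row, flag_column: 1 if bad_columns(row, meanings) else 0}
        for row in rows
    ]


meanings = {"Column1": ["URL", "IP Address"], "Column2": ["Boolean", "Integer"]}
rows = [
    {"ID": "1", "Column1": "http://example.com", "Column2": "TRUE", "OtherColumn": "Value A"},
    {"ID": "2", "Column1": "192.168.1.1", "Column2": "42", "OtherColumn": "Value B"},
    {"ID": "3", "Column1": "ValidText", "Column2": "FALSE", "OtherColumn": "Value C"},
]

for row in flag_rows(rows, meanings):
    print(row["ID"], sorted(bad_columns(row, meanings)), row["BadDataFlag"])
```

OtherColumn has no meanings configured, so its values are never inspected.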

Combine this activity with Extract Pattern, Standardize Data, or Validate Data Type to build robust data quality workflows.