Duplicate Record

The Duplicate Record check ensures that a specific column does not contain duplicate values. The rule evaluates the uniqueness of the values in the column against the configured success criteria.

Configuration fields

  • Operator: Defines the comparison operation. The options are:

    Greater than

    Less than

    Equal to

    Between (requires specifying a start and end range)

  • Value: The threshold value used for the success criteria. Required for the Greater than, Less than, and Equal to operators.

  • Value range: The start and end of the range. Required only when the Between operator is selected.

  • Threshold type: Indicates whether the Value or Value range is interpreted as a percentage or as an absolute count.

  • Allow null values: Determines whether null values are permitted.
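
The field requirements above can be sketched as a small validated configuration object. This is only an illustration: the class name `DuplicateCheckConfig`, its attribute names, the "Percentage" label, and the start-before-end check are assumptions made for the example, not the product's actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Operator names as listed on this page.
OPERATORS = {"Greater than", "Less than", "Equal to", "Between"}
# "Absolute Count" appears in the sample configuration; "Percentage" is an assumed label.
THRESHOLD_TYPES = {"Absolute Count", "Percentage"}


@dataclass
class DuplicateCheckConfig:
    """Hypothetical container for the configuration fields described above."""
    operator: str
    threshold_type: str
    allow_null_values: bool
    value: Optional[float] = None
    value_range: Optional[Tuple[float, float]] = None

    def __post_init__(self):
        if self.operator not in OPERATORS:
            raise ValueError(f"Unknown operator: {self.operator!r}")
        if self.threshold_type not in THRESHOLD_TYPES:
            raise ValueError(f"Unknown threshold type: {self.threshold_type!r}")
        if self.operator == "Between":
            # Between requires a start and end range.
            if self.value_range is None:
                raise ValueError("Between requires a value range")
            start, end = self.value_range
            if start > end:  # assumption: start must not exceed end
                raise ValueError("Range start must not exceed range end")
        elif self.value is None:
            # Greater than, Less than, and Equal to require a single value.
            raise ValueError(f"{self.operator} requires a value")


config = DuplicateCheckConfig(
    operator="Greater than",
    threshold_type="Absolute Count",
    allow_null_values=False,
    value=3,
)
```

Validating in `__post_init__` means an incomplete configuration, such as Between without a range, fails when it is created rather than when the rule runs.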

Success criteria

The success criteria are evaluated based on the number of distinct values in the column.

  • The number of distinct values across the column's rows is calculated.
  • The success condition is met if the distinct value count satisfies the configured operator and value (or value range).
  • For example, if the operator is Greater than and the value is 3, the column must have more than 3 distinct values to be within the threshold.
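
The evaluation described above can be sketched in a few lines of Python. The function names are hypothetical. Two behaviours are assumptions: nulls are excluded from the distinct count (inferred from the sample output on this page, where the Age column with one NULL reports a count of 3), and Between is treated as inclusive of both ends.

```python
import operator as op

# Map the operator names from this page to Python comparison functions.
COMPARATORS = {
    "Greater than": op.gt,
    "Less than": op.lt,
    "Equal to": op.eq,
}


def distinct_count(values):
    """Count distinct non-null values (null exclusion is an assumption)."""
    return len({v for v in values if v is not None})


def within_threshold(count, operator_name, value=None, value_range=None):
    """Return True if the distinct count satisfies the operator and value."""
    if operator_name == "Between":
        start, end = value_range
        return start <= count <= end  # assumption: inclusive bounds
    return COMPARATORS[operator_name](count, value)


names = ["Alice", "Bob", "Alice", "Charlie", "Alice"]
count = distinct_count(names)
print(count, within_threshold(count, "Greater than", value=3))  # prints: 3 False
```

Only the absolute count case is shown; how a percentage threshold is derived from the distinct count is not specified here, so it is left out of the sketch.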

Sample Input

| ID | Name    | Age  |
|----|---------|------|
| 1  | Alice   | 25   |
| 2  | Bob     | 30   |
| 3  | Alice   | 25   |
| 4  | Charlie | 40   |
| 5  | Alice   | NULL |

Sample configuration

  • Operator: Greater than
  • Value: 3
  • Threshold type: Absolute Count
  • Allow null values: False

Sample Output

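| Column Name | Rule Name            | Success Count | Within Threshold | Null Count |
|-------------|----------------------|---------------|------------------|------------|
| Name        | Name Duplicate Check | 3             | No               | 0          |
| Age         | Age Duplicate Check  | 3             | No               | 1          |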
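
As a cross-check, the sample output can be reproduced from the sample input with a short script. The `check_column` helper is hypothetical and hard-codes the sample's Greater than operator with an absolute count. It counts nulls separately from distinct values, which matches the figures shown. It does not model how the Allow null values setting changes the result, since this page does not specify that.

```python
# Sample input from this page; NULL is represented as None.
rows = [
    {"ID": 1, "Name": "Alice", "Age": 25},
    {"ID": 2, "Name": "Bob", "Age": 30},
    {"ID": 3, "Name": "Alice", "Age": 25},
    {"ID": 4, "Name": "Charlie", "Age": 40},
    {"ID": 5, "Name": "Alice", "Age": None},
]


def check_column(rows, column, threshold):
    """Hypothetical helper: apply 'Greater than <threshold>' to one column."""
    values = [row[column] for row in rows]
    null_count = sum(v is None for v in values)
    # Distinct non-null values; nulls are reported separately.
    success_count = len({v for v in values if v is not None})
    return {
        "Column Name": column,
        "Rule Name": f"{column} Duplicate Check",
        "Success Count": success_count,
        "Within Threshold": "Yes" if success_count > threshold else "No",
        "Null Count": null_count,
    }


results = [check_column(rows, column, threshold=3) for column in ("Name", "Age")]
for result in results:
    print(result)
```

Both columns have exactly 3 distinct non-null values, and 3 is not greater than 3, so both rows report No for Within Threshold.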