Duplicate Record

The Duplicate Record check measures how unique the values in a column are. It counts the distinct values in the column and compares that count with the configured success criteria.

Success criteria

The success criteria are evaluated based on the number of distinct values in the column.

  • The rule counts the distinct values across all rows of the column.
  • The success condition is met if the distinct value count satisfies the configured operator and value.
  • For example, if the operator is Greater than and the value is 3, the column must have more than 3 distinct values to be within the threshold. A column with exactly 3 distinct values is not within the threshold.
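The evaluation described above can be sketched in Python. This is a minimal illustration, not the product's implementation: the function name `within_threshold` and the `COMPARATORS` mapping are hypothetical, null handling is ignored, and the Between operator is left out because the text does not say whether its bounds are inclusive.

```python
import operator

# Hypothetical mapping from operator names to comparison functions.
COMPARATORS = {
    "Greater than": operator.gt,
    "Less than": operator.lt,
    "Equal to": operator.eq,
}

def within_threshold(values, op, value):
    """Return True if the distinct value count satisfies the operator."""
    distinct_count = len(set(values))
    return COMPARATORS[op](distinct_count, value)

# Four distinct values: more than 3, so the criteria are met.
print(within_threshold(["a", "b", "c", "d"], "Greater than", 3))  # True
# Two distinct values: not more than 3.
print(within_threshold(["a", "b", "a"], "Greater than", 3))       # False
```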

Configuration fields

  • Operator defines the comparison applied to the distinct value count. The options are Greater than, Less than, Equal to, and Between (which requires a start and end range).

  • Value is the threshold value used for success criteria. It is required for Greater than, Less than, and Equal to operators.

  • Value range is required only when the Between operator is selected. You need to specify the start and end range.

  • Threshold type indicates whether the Value or Value range is interpreted as a percentage or as an absolute count.

  • Allow null values determines if null values are permitted.
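The field requirements above can be expressed as a small validation sketch. The class `DuplicateCheckConfig`, its attribute names, and the default values are hypothetical and chosen only for illustration. The sketch encodes just the rules stated in the list: Value is required for Greater than, Less than, and Equal to, and Value range is required for Between.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SINGLE_VALUE_OPERATORS = ("Greater than", "Less than", "Equal to")

@dataclass
class DuplicateCheckConfig:
    """Hypothetical container for the rule's configuration fields."""
    operator: str
    value: Optional[float] = None
    value_range: Optional[Tuple[float, float]] = None
    threshold_type: str = "Absolute Count"  # or "Percentage"
    allow_null_values: bool = False

    def __post_init__(self):
        if self.operator == "Between":
            # Between needs a start and end range instead of a single value.
            if self.value_range is None:
                raise ValueError("Between requires a value range")
        elif self.operator in SINGLE_VALUE_OPERATORS:
            if self.value is None:
                raise ValueError(f"{self.operator} requires a value")
        else:
            raise ValueError(f"Unknown operator: {self.operator}")

config = DuplicateCheckConfig(operator="Greater than", value=3)
print(config)
```

Constructing `DuplicateCheckConfig(operator="Between")` without a range raises `ValueError` in this sketch.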

Sample Input

ID  Name     Age
1   Alice    25
2   Bob      30
3   Alice    25
4   Charlie  40
5   Alice    NULL

Sample rule configuration

  • Operator: Greater than
  • Value: 3
  • Threshold type: Absolute Count
  • Allow null values: False

Sample Output

Column Name  Rule Name             Success Count  Within Threshold  Null Count
Name         Name Duplicate Check  3              No                0
Age          Age Duplicate Check   3              No                1
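The sample output can be reproduced with a short Python sketch. It assumes that null values are excluded from the distinct count and reported separately, which is consistent with the Age row above (3 distinct values, 1 null). The function `check_column` and the dictionary keys are hypothetical names, not part of the product.

```python
# Sample input rows; None stands in for NULL.
rows = [
    {"ID": 1, "Name": "Alice", "Age": 25},
    {"ID": 2, "Name": "Bob", "Age": 30},
    {"ID": 3, "Name": "Alice", "Age": 25},
    {"ID": 4, "Name": "Charlie", "Age": 40},
    {"ID": 5, "Name": "Alice", "Age": None},
]

def check_column(rows, column, threshold):
    """Apply 'Greater than <threshold>' on the distinct non-null count."""
    values = [row[column] for row in rows]
    non_null = [v for v in values if v is not None]
    success_count = len(set(non_null))  # assumption: nulls are not counted
    return {
        "Column Name": column,
        "Success Count": success_count,
        "Within Threshold": "Yes" if success_count > threshold else "No",
        "Null Count": len(values) - len(non_null),
    }

results = [check_column(rows, column, 3) for column in ("Name", "Age")]
for result in results:
    print(result)
```

Both columns have exactly 3 distinct values, and 3 is not greater than 3, so neither is within the threshold.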