Duplicate Record
The Duplicate Record Check ensures that a specific column does not contain duplicate values. The rule verifies whether the values in a given column meet the configured success criteria, which are based on the uniqueness of those values.
Configuration fields
- Operator: Defines the comparison operation. Operator options:
  - Greater than
  - Less than
  - Equal to
  - Between (requires specifying a start and end range)
- Value: The threshold value used for the success criteria. Required for the Greater than, Less than, and Equal to operators.
- Value range: Required only when the Between operator is selected; specifies the start and end of the range.
- Threshold type: Indicates whether the Value or Value range is to be interpreted as a percentage or an absolute count.
- Allow null values: Determines whether null values are permitted.
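To make the dependencies between these fields concrete, the following is a minimal sketch of how the configuration could be modeled and validated. The class name `DuplicateRecordConfig`, its attribute names, and the string spellings of the operators and threshold types are hypothetical and chosen only for illustration; they are not the product's actual API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical spellings for the documented options.
OPERATORS = ("greater_than", "less_than", "equal_to", "between")
THRESHOLD_TYPES = ("absolute_count", "percentage")


@dataclass
class DuplicateRecordConfig:
    """Illustrative container for the configuration fields listed above."""

    operator: str
    threshold_type: str = "absolute_count"
    allow_null_values: bool = False
    value: Optional[float] = None
    value_range: Optional[Tuple[float, float]] = None

    def validate(self) -> None:
        """Raise ValueError if a field required by the operator is missing."""
        if self.operator not in OPERATORS:
            raise ValueError(f"unknown operator: {self.operator}")
        if self.threshold_type not in THRESHOLD_TYPES:
            raise ValueError(f"unknown threshold type: {self.threshold_type}")
        if self.operator == "between":
            # Between needs a (start, end) pair instead of a single value.
            if self.value_range is None:
                raise ValueError("value_range is required for 'between'")
        elif self.value is None:
            # Greater than, Less than, and Equal to need a single value.
            raise ValueError(f"value is required for '{self.operator}'")


config = DuplicateRecordConfig(operator="greater_than", value=3)
config.validate()  # passes: a single value is supplied for greater_than
```

The point of the sketch is only that Value and Value range are mutually dependent on the chosen operator, so a configuration can be rejected before any data is scanned.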
Success criteria
The success criteria are evaluated based on the number of distinct values in the column.
- For a column with N rows, the number of distinct values is calculated.
- The success condition is met if the distinct value count satisfies the given operator and value.
- For example, if operator is Greater than and value is 3, then the column must have more than 3 distinct values to be within the threshold.
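The evaluation described above can be sketched in a few lines of Python. The function names `distinct_count` and `within_threshold` and the operator strings are hypothetical. Two behaviors are assumptions rather than documented facts: null values (represented here by `None`) are left out of the distinct count, and the Between bounds are treated as inclusive. Only the absolute-count threshold type is shown.

```python
def distinct_count(values):
    """Count distinct non-null values; None stands in for NULL (assumption)."""
    return len({v for v in values if v is not None})


def within_threshold(count, operator, value=None, value_range=None):
    """Compare a distinct-value count against the configured threshold."""
    if operator == "greater_than":
        return count > value
    if operator == "less_than":
        return count < value
    if operator == "equal_to":
        return count == value
    if operator == "between":
        start, end = value_range
        return start <= count <= end  # inclusive bounds are an assumption
    raise ValueError(f"unknown operator: {operator}")


names = ["Alice", "Bob", "Alice", "Charlie", "Alice"]
count = distinct_count(names)
print(count, within_threshold(count, "greater_than", value=3))  # 3 False
```

With three distinct names and a Greater than threshold of 3, the check fails because 3 is not strictly greater than 3.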
Sample input
ID | Name | Age |
---|---|---|
1 | Alice | 25 |
2 | Bob | 30 |
3 | Alice | 25 |
4 | Charlie | 40 |
5 | Alice | NULL |
Sample configuration
- Operator: Greater than
- Value: 3
- Threshold type: Absolute Count
- Allow null values: False
Sample output
Column Name | Rule Name | Success Count | Within Threshold | Null Count |
---|---|---|---|---|
Name | Name Duplicate Check | 3 | No | 0 |
Age | Age Duplicate Check | 3 | No | 1 |
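As a cross-check, the rows above can be reproduced from the sample input with a short, self-contained Python sketch. The helper `check_column` is hypothetical. It assumes that Success Count is the number of distinct non-null values and that Null Count is reported separately, which is inferred from the Age row rather than stated by the rule's documentation. How Allow null values influences the result is not specified here, so the sketch does not model it.

```python
# Sample input table; None stands in for NULL.
rows = [
    {"ID": 1, "Name": "Alice", "Age": 25},
    {"ID": 2, "Name": "Bob", "Age": 30},
    {"ID": 3, "Name": "Alice", "Age": 25},
    {"ID": 4, "Name": "Charlie", "Age": 40},
    {"ID": 5, "Name": "Alice", "Age": None},
]


def check_column(rows, column, threshold=3):
    """Build one output row for the 'Greater than' / absolute-count case."""
    values = [row[column] for row in rows]
    non_null = [v for v in values if v is not None]
    success_count = len(set(non_null))  # distinct non-null values
    return {
        "Column Name": column,
        "Rule Name": f"{column} Duplicate Check",
        "Success Count": success_count,
        "Within Threshold": "Yes" if success_count > threshold else "No",
        "Null Count": len(values) - len(non_null),
    }


report = [check_column(rows, column) for column in ("Name", "Age")]
for line in report:
    print(line)
```

Both columns have exactly 3 distinct non-null values, so neither exceeds the threshold of 3, and both rows report Within Threshold as No.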