Distinct Check Rule
Description
The Distinct Check rule counts the distinct (unique) values in a specific column and verifies that the count meets the configured success criteria.
Configuration Fields
- Operator Options
  - `Greater than`
  - `Less than`
  - `Equal to`
  - `Between` (requires specifying a start and end range)
- Operator: Defines the comparison operation (`Greater than`, `Less than`, `Equal to`, or `Between`).
- Value: The threshold value used for the success criteria. Required for the `Greater than`, `Less than`, and `Equal to` operators.
- Value Range: Required only when the `Between` operator is selected; specifies the `start` and `end` of the range.
- Threshold Type: Indicates whether the `Value` or `Value Range` is interpreted as a percentage or as an absolute count.
- Allow Null Values: Determines whether null values are permitted.
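The dependencies between these fields (a single value for most operators, a range for `Between`) can be sketched as a small validation routine. This is a minimal illustration, not the product's actual schema: the class name `DistinctCheckConfig`, its attribute names, and the `"Percentage"` label are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

OPERATORS = {"Greater than", "Less than", "Equal to", "Between"}
# "Absolute Count" appears in the sample configuration; "Percentage" is an assumed label.
THRESHOLD_TYPES = {"Absolute Count", "Percentage"}


@dataclass
class DistinctCheckConfig:
    """Hypothetical container for the configuration fields described above."""
    operator: str
    threshold_type: str = "Absolute Count"
    value: Optional[float] = None
    value_range: Optional[Tuple[float, float]] = None
    allow_null_values: bool = False

    def validate(self) -> None:
        """Raise ValueError if the fields are inconsistent with the chosen operator."""
        if self.operator not in OPERATORS:
            raise ValueError(f"Unknown operator: {self.operator!r}")
        if self.threshold_type not in THRESHOLD_TYPES:
            raise ValueError(f"Unknown threshold type: {self.threshold_type!r}")
        if self.operator == "Between":
            # Between needs both ends of the range.
            if self.value_range is None:
                raise ValueError("Between requires a value range (start, end)")
            start, end = self.value_range
            if start > end:
                raise ValueError("Range start must not exceed range end")
        elif self.value is None:
            # Greater than, Less than, and Equal to need a single value.
            raise ValueError(f"{self.operator} requires a value")


# A valid configuration passes silently.
DistinctCheckConfig(operator="Greater than", value=3).validate()
```

Rejecting a reversed range is a design choice of this sketch; the rule itself may handle that case differently.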
Success Criteria
The success criteria are evaluated based on the number of distinct values in the column.
- If the column has N rows, the number of distinct values is calculated.
- The success condition is met if the distinct value count satisfies the given `operator` and `value`.
- For example, if `operator` is `Greater than` and `value` is `3`, then the column must have more than 3 distinct values to be within the threshold.
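The evaluation described above can be sketched in a few lines of Python. The function names `distinct_count` and `within_threshold` are hypothetical, `None` stands in for NULL, and two behaviors are assumptions rather than documented facts: nulls are excluded from the distinct count, and `Between` treats both bounds as inclusive. Only absolute counts are handled here.

```python
import operator as op


def distinct_count(values):
    """Count distinct non-null values (assumes nulls are excluded from the count)."""
    return len({v for v in values if v is not None})


def within_threshold(count, operator_name, value=None, value_range=None):
    """Return True if the distinct count satisfies the configured operator."""
    if operator_name == "Between":
        start, end = value_range
        return start <= count <= end  # inclusive bounds are an assumption
    compare = {
        "Greater than": op.gt,
        "Less than": op.lt,
        "Equal to": op.eq,
    }[operator_name]
    return compare(count, value)


column = ["a", "b", "a", "c", None]
count = distinct_count(column)
print(count)                                          # 3
print(within_threshold(count, "Greater than", value=3))  # False: 3 is not more than 3
print(within_threshold(count, "Greater than", value=2))  # True
```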
Sample Input
ID | Name | Age |
---|---|---|
1 | Alice | 25 |
2 | Bob | 30 |
3 | Alice | 25 |
4 | Charlie | 40 |
5 | Alice | NULL |
Sample Configuration
- Operator: Greater than
- Value: 3
- Threshold Type: Absolute Count
- Allow Null Values: False
Sample Output
Column Name | Rule Name | Success Count | Within Threshold | Null Count |
---|---|---|---|---|
Name | Name Distinct Check | 3 | No | 0 |
Age | Age Distinct Check | 3 | No | 1 |
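The figures in the sample output can be reproduced with a short script. This is an illustrative sketch only: `check_column` is a hypothetical helper, `None` stands in for NULL, and the script assumes nulls are left out of the distinct count and reported separately. How `Allow Null Values` otherwise affects the result is not modeled here.

```python
rows = [
    {"ID": 1, "Name": "Alice", "Age": 25},
    {"ID": 2, "Name": "Bob", "Age": 30},
    {"ID": 3, "Name": "Alice", "Age": 25},
    {"ID": 4, "Name": "Charlie", "Age": 40},
    {"ID": 5, "Name": "Alice", "Age": None},
]


def check_column(rows, column, threshold):
    """Apply a 'Greater than' distinct check with an absolute-count threshold."""
    values = [row[column] for row in rows]
    non_null = [v for v in values if v is not None]
    distinct = len(set(non_null))
    return {
        "Column Name": column,
        "Rule Name": f"{column} Distinct Check",
        "Success Count": distinct,
        "Within Threshold": "Yes" if distinct > threshold else "No",
        "Null Count": len(values) - len(non_null),
    }


for column in ("Name", "Age"):
    print(check_column(rows, column, threshold=3))
```

Both columns have exactly 3 distinct non-null values, which is not greater than 3, so neither is within the threshold.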