Column Boundary Validation
Column boundary validation is the process of ensuring that the data in each column of a dataset meets predefined criteria, such as correct data types, value ranges, and format constraints. It verifies that the values in each column fall within acceptable boundaries, which helps prevent errors and keeps data reliable for analysis and processing.
Rule configurations
Rule configuration in column boundary validation refers to setting the specific criteria that data in each column must meet, such as data type, range, or format. These rules define the boundaries that ensure the integrity and accuracy of the data within the column.
- Mode: Sets limits on data within a specific range for validation purposes. Options:
  - Minimum
  - Maximum
- Range: The valid interval of values a column can accept, defined by a minimum and maximum value. Data must fall within this range to be considered valid.
Success criteria
The success criteria for column boundary validation are met when the count or percentage of values that fall within the configured range satisfies the configured operator and threshold.
- The success condition depends on how the Mode and Range are configured.
- For example, in column boundary validation, the "Age" column may have a rule where the value must be between 18 and 60, ensuring no one below 18 or above 60 is entered.
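The Age example above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the function name `check_boundary` and the sample ages are hypothetical, and the bounds are assumed to be inclusive, which the text does not state explicitly.

```python
def check_boundary(values, minimum, maximum):
    """Split non-null values into those inside and outside [minimum, maximum].

    Assumption: both bounds are inclusive.
    """
    valid = [v for v in values if v is not None and minimum <= v <= maximum]
    invalid = [v for v in values if v is not None and not (minimum <= v <= maximum)]
    return valid, invalid


# Hypothetical Age values checked against the 18-60 rule.
ages = [17, 18, 35, 60, 61]
valid, invalid = check_boundary(ages, 18, 60)
print(valid)    # [18, 35, 60]
print(invalid)  # [17, 61]
```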
Configuration fields
- Operator: Defines the comparison operation. Options:
  - Greater than
  - Less than
  - Equal to
  - Between (requires specifying a start and end range)
- Value: The threshold value used for success criteria. Required for the Greater than, Less than, and Equal to operators.
- Value range: Required only when the Between operator is selected, specifying the start and end of the range.
- Threshold type: Indicates whether the Value or Value range is to be considered as a percentage or an absolute count.
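One way these fields might combine is sketched below. The function `within_threshold` and the string constants are hypothetical names chosen for illustration. The sketch also assumes that the threshold is compared against the number (or share) of values that passed the boundary check, and that the Between operator includes both ends of the range.

```python
def within_threshold(success_count, total_count, operator, threshold_type,
                     value=None, value_range=None):
    """Return True if the success count satisfies the configured criteria.

    Assumption: the threshold applies to passing values, either as an
    absolute count or as a percentage of all values in the column.
    """
    if threshold_type == "percentage":
        # An empty column is treated as 0% here (an assumption).
        observed = 100.0 * success_count / total_count if total_count else 0.0
    elif threshold_type == "absolute_count":
        observed = success_count
    else:
        raise ValueError(f"unknown threshold type: {threshold_type}")

    if operator == "greater_than":
        return observed > value
    if operator == "less_than":
        return observed < value
    if operator == "equal_to":
        return observed == value
    if operator == "between":
        start, end = value_range  # inclusive on both ends (assumption)
        return start <= observed <= end
    raise ValueError(f"unknown operator: {operator}")


# 3 of 5 values passed; the criterion is "greater than 50 percent".
print(within_threshold(3, 5, "greater_than", "percentage", value=50))  # True
# 4 values passed; the criterion is "between 3 and 5" as an absolute count.
print(within_threshold(4, 10, "between", "absolute_count", value_range=(3, 5)))  # True
```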
Sample Input
ID | Age | Profits |
---|---|---|
1 | 25 | 459 |
2 | 78 | 6495 |
3 | 58 | 12345 |
4 | 62 | 1576 |
5 | 49 | 3500 |
Sample rule configuration
- Mode: Age = Minimum, Profits = Maximum
- Range: Age = 1-60, Profits = 5000-15000
Sample success criteria configuration
- Operator: Greater than
- Value: 50%
- Threshold type: Percentage
Sample Output
Column Name | Rule Name | Success Count | Failure Count | Within Threshold | Null Count |
---|---|---|---|---|---|
Age | Column Boundary Validation | 3 | 2 | Yes | 0 |
Profits | Column Boundary Validation | 2 | 3 | No | 0 |
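
The counts above can be checked by hand against the sample input. The sketch below applies the sample ranges (Age 1-60, Profits 5000-15000) to the five sample rows and compares the share of passing values with the 50% threshold. It rests on several assumptions: the bounds are inclusive, the 50% value is read as a percentage of rows, and the Mode setting is not modeled because its effect on the check is not described here. The variable names are illustrative only.

```python
# Sample input rows from the table above.
rows = [
    {"ID": 1, "Age": 25, "Profits": 459},
    {"ID": 2, "Age": 78, "Profits": 6495},
    {"ID": 3, "Age": 58, "Profits": 12345},
    {"ID": 4, "Age": 62, "Profits": 1576},
    {"ID": 5, "Age": 49, "Profits": 3500},
]

# Sample rule configuration: inclusive (low, high) range per column.
ranges = {"Age": (1, 60), "Profits": (5000, 15000)}

report = {}
for column, (low, high) in ranges.items():
    values = [row[column] for row in rows]
    null_count = sum(v is None for v in values)
    success = sum(v is not None and low <= v <= high for v in values)
    failure = len(values) - null_count - success
    # Success criteria: share of passing values is greater than 50 percent.
    within = 100 * success / len(values) > 50
    report[column] = {
        "success": success,
        "failure": failure,
        "within_threshold": within,
        "null_count": null_count,
    }

for column, result in report.items():
    print(column, result)
```

Under these assumptions, Age has 3 passing and 2 failing values (60% passing, so within the threshold), and Profits has 2 passing and 3 failing values (40% passing, so outside it).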