Custom SQL
A Custom SQL check is a user-defined SQL query or script that you use to validate specific data quality rules or conditions in a dataset. You write SQL code to assess data quality aspects such as completeness, consistency, accuracy, uniqueness, and integrity.
Rule configuration
The rule configuration for a Custom SQL check involves writing specific SQL queries to validate or check data quality. These queries are designed to assess data based on custom criteria. You ensure accuracy and consistency by applying tailored validation logic directly through structured database queries.
Query: The SQL (Structured Query Language) code that you write to check or validate certain aspects of data quality.
Success criteria
The success criteria for Custom SQL in data quality ensure that the data meets specific conditions, such as non-null values exceeding a threshold (for example, 80 percent). You define whether the data is valid by excluding null or empty values and checking if it meets quality standards based on success and failure counts.
- The success condition depends on how you configure the Query.
- The success condition is met if the count satisfies the given operator and value.
Configuration fields
- Operator defines the comparison operation. You can select Greater than, Less than, Equal to, or Between (requires specifying a start and end range).
- Value is the threshold value used for the success criteria. This is required for the Greater than, Less than, and Equal to operators.
- Value range is required only when you select the Between operator. You must specify the start and end of the range.
- Threshold type indicates whether the Value or Value range is treated as a percentage or an absolute count.
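As a rough illustration of how these fields combine, the sketch below evaluates a success count against an operator and threshold. The function name `meets_success_criteria`, its parameters, the inclusive handling of Between, and the percentage formula (success count divided by total rows) are all assumptions made for illustration; the product's actual evaluation logic may differ.

```python
def meets_success_criteria(success_count, total_count, operator,
                           value=None, value_range=None,
                           threshold_type="absolute"):
    """Hypothetical helper: True if the success count satisfies the criteria."""
    # Assumption: a percentage threshold compares success_count / total_count * 100.
    if threshold_type == "percentage":
        observed = 100.0 * success_count / total_count if total_count else 0.0
    else:
        observed = success_count

    if operator == "Greater than":
        return observed > value
    if operator == "Less than":
        return observed < value
    if operator == "Equal to":
        return observed == value
    if operator == "Between":
        start, end = value_range
        # Assumption: both ends of the range are inclusive.
        return start <= observed <= end
    raise ValueError(f"Unknown operator: {operator}")
```

For example, `meets_success_criteria(4, 5, "Between", value_range=(3, 5))` evaluates to `True` under these assumptions.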
Sample Input
Table name: product_sales_data
ID | Product Name | Category |
---|---|---|
1 | Refrigerator | Appliances |
2 | Chair | Furniture |
3 | Smartwatch | Appliances |
4 | Laptop | Electronics |
5 | Smartphone | Electronics |
Sample rule configuration
Query
SELECT CASE WHEN T1.Category IN ('Appliances', 'Electronics') THEN 1 ELSE 0 END AS is_valid FROM product_sales_data AS T1
The query checks each product’s category in the product_sales_data table. If the category is either “Appliances” or “Electronics,” it returns 1 in the is_valid column; otherwise, it returns 0. This helps identify valid product categories based on predefined criteria.
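To see the flag for each row, the query can be tried against an in-memory SQLite copy of the sample table. This is only a sketch: SQLite and Python's `sqlite3` module stand in for whatever database the rule actually runs on, and the `ID` column and `ORDER BY` clause are added to the query here purely to make the per-row result readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE product_sales_data (ID INTEGER, "Product Name" TEXT, Category TEXT)'
)
conn.executemany(
    "INSERT INTO product_sales_data VALUES (?, ?, ?)",
    [
        (1, "Refrigerator", "Appliances"),
        (2, "Chair", "Furniture"),
        (3, "Smartwatch", "Appliances"),
        (4, "Laptop", "Electronics"),
        (5, "Smartphone", "Electronics"),
    ],
)

# Same CASE expression as the sample query; ID and ORDER BY are
# illustrative additions so each flag can be matched to its row.
rows = conn.execute(
    """
    SELECT T1.ID,
           CASE WHEN T1.Category IN ('Appliances', 'Electronics') THEN 1 ELSE 0 END AS is_valid
    FROM product_sales_data AS T1
    ORDER BY T1.ID
    """
).fetchall()

for product_id, is_valid in rows:
    print(product_id, is_valid)
```

With the sample data, only row 2 (Chair, Furniture) is flagged 0; the other four rows are flagged 1.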
Sample success criteria configuration
- Operator: Between
- Value range: Start = 3, End = 5
- Threshold type: Absolute Count
- Allow null values: Not Applicable
Sample Output
Column Name | Rule Name | Success Count | Failure Count | Within Threshold | Null Count |
---|---|---|---|---|---|
Data | Custom Sql Check | 4 | 1 | Yes | 0 |
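The counts in this output can be reproduced by wrapping the sample query in an aggregate and comparing the result with the Between range. The sketch below again uses an in-memory SQLite database as a stand-in. How the product itself derives the counts is not described here, and treating the range as inclusive is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only the columns the sample query needs are created here.
conn.execute("CREATE TABLE product_sales_data (ID INTEGER, Category TEXT)")
conn.executemany(
    "INSERT INTO product_sales_data VALUES (?, ?)",
    [
        (1, "Appliances"),
        (2, "Furniture"),
        (3, "Appliances"),
        (4, "Electronics"),
        (5, "Electronics"),
    ],
)

custom_query = """
SELECT CASE WHEN T1.Category IN ('Appliances', 'Electronics') THEN 1 ELSE 0 END AS is_valid
FROM product_sales_data AS T1
"""

# Wrap the custom query as a subquery and count rows flagged 1 and 0.
success_count, failure_count = conn.execute(
    f"SELECT SUM(is_valid), COUNT(*) - SUM(is_valid) FROM ({custom_query}) AS q"
).fetchone()

# Sample success criteria: Between, Start = 3, End = 5, absolute count.
start, end = 3, 5
within_threshold = start <= success_count <= end  # assumed inclusive

print(success_count, failure_count, "Yes" if within_threshold else "No")
```

Four of the five sample rows fall in the accepted categories, so the success count of 4 lies within the 3 to 5 range.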