Skip to content

Custom Sql Check Rule

Description

A Custom SQL check rule in data quality refers to a user-defined query or SQL script that is used to validate or verify specific data quality rules or conditions in a dataset. It involves writing SQL code to assess various data quality aspects such as completeness, consistency, accuracy, uniqueness, and integrity

Rule Configuration

The Rule configuration for Custom SQL check involves writing specific SQL queries to validate or check data quality. These queries are designed to assess data based on custom criteria, ensuring accuracy and consistency by applying tailored validation logic directly through structured database queries.

  • Query Query refers to the specific SQL (Structured Query Language) code that is written to check or validate certain aspects of data quality.

Success Criteria

The success criteria for custom SQL in data quality ensure that data meets specific conditions, such as non-null values exceeding a threshold (e.g., 80%). It defines whether the data is valid, excluding null or empty values, and checks if it meets quality standards based on success and failure counts.

  • The success condition depends on how the Query is configured.

  • The success condition is met if the count satisfies the given operator and value.

    Configuration Fields

    • Operator Options

      • Greater than

      • Less than

      • Equal to

      • Between (requires specifying a start and end range)

      • Operator Defines the comparison operation (Greater Than, Less Than, Equal To, or Between).

      • Value The threshold value used for success criteria. Required for Greater than, Less than, and Equal to operators.

      • Value Range Required only when the Between operator is selected, specifying the start and end range.

      • Threshold Type Indicates whether the Value or Value Range to be considered as percentage or an absolute count.

Sample Input

Table Name product_sales_data

IDProduct NameCategory
1RefrigeratorAppliances
2ChairFurniture
3SmartwatchAppliances
4LaptopElectronics
5SmartphoneElectronics

Sample Rule Configuration

Query

SELECT CASE WHEN T1.Category IN (‘Appliances’, ‘Electronics’) THEN 1 ELSE 0 END AS is_valid FROM product_sales_data AS T1

The query checks each product’s category in the product_sales_data table. If the category is either “Appliances” or “Electronics,” it returns 1 for is_valid column; otherwise, it returns 0. This helps identify valid product categories based on predefined criteria.

Sample Success Criteria Configuration

  • Operator Between
  • Value Range Start = 3, End = 5
  • Threshold Type Absolute Count
  • Allow Null Values Not Applicable

alt text

Sample Output

Column NameRule NameSuccess CountFailure CountWithin ThresholdNull Count
DataCustom Sql Check41Yes0