
Consistent Casing

In data quality, Consistent Casing refers to ensuring that text data is standardized in terms of letter casing (uppercase versus lowercase) across a dataset. This consistency is important for maintaining data integrity, as variations in casing (for example, “John Doe” versus “john doe” or “USA” versus “usa”) can lead to errors during data processing, matching, and analysis.

Rule configuration

A value is marked as a success when it matches the selected case type. If the value is unique and fits within the defined set, the rule is considered passed.

Case type is the casing pattern that a text value is expected to follow. Each pattern dictates how the characters in a string are capitalized or formatted, which helps ensure consistency and improves readability across datasets.

Upper Case means all letters are capitalized.

Lower Case means all letters are in lowercase.

Title Case means the first letter of each word is capitalized.

Sentence Case means only the first letter of the first word is capitalized.

Camel Case means the first word is lowercase, and each subsequent word starts with an uppercase letter without spaces.

Pascal Case is similar to camel case, but the first letter of the first word is also capitalized.

Kebab Case means words are in lowercase and separated by hyphens.

Snake Case means words are in lowercase and separated by underscores.
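The case types above can be sketched as simple pattern checks. This is a minimal illustration, not the product's implementation: the `CASE_CHECKS` table and the `matches_case` helper are hypothetical names, the patterns assume ASCII letters and digits, and words in Title Case and Sentence Case are assumed to be separated by single spaces.

```python
import re


def _full(pattern: str):
    """Build a checker that requires the whole string to match the pattern."""
    compiled = re.compile(pattern)
    return lambda value: compiled.fullmatch(value) is not None


# Hypothetical mapping from case type to a checker (ASCII-only assumption).
CASE_CHECKS = {
    # str.isupper/islower ignore digits and punctuation but need one cased letter.
    "Upper Case": str.isupper,
    "Lower Case": str.islower,
    "Title Case": _full(r"[A-Z][a-z0-9]*( [A-Z][a-z0-9]*)*"),
    "Sentence Case": _full(r"[A-Z][a-z0-9]*( [a-z0-9]+)*"),
    "Camel Case": _full(r"[a-z][a-z0-9]*([A-Z][a-z0-9]*)*"),
    "Pascal Case": _full(r"([A-Z][a-z0-9]*)+"),
    "Kebab Case": _full(r"[a-z0-9]+(-[a-z0-9]+)*"),
    "Snake Case": _full(r"[a-z0-9]+(_[a-z0-9]+)*"),
}


def matches_case(value: str, case_type: str) -> bool:
    """Return True when the value follows the given case type."""
    return CASE_CHECKS[case_type](value)


print(matches_case("DropDown", "Pascal Case"))  # True
print(matches_case("dropDown", "Pascal Case"))  # False
print(matches_case("dropDown", "Camel Case"))   # True
```

Note that some values satisfy more than one pattern under these assumptions (for example, “France” is both Title Case and Pascal Case), so the configured case type decides which check applies.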

Success criteria

The success condition depends on how the Case Type is configured.
For example, when Case Type is set to Pascal Case, only inputs where each word starts with an uppercase letter are valid: “DropDown” is valid, but “dropDown” is not.

Configuration fields

  • Operator defines the comparison operation. The options are:

    Greater than

    Less than

    Equal to

    Between (requires specifying a start and end range)

  • Value is the threshold used for success criteria. It is required for the Greater than, Less than, and Equal to operators.

  • Value range is required only when the Between operator is selected. You must specify a start and end range.

  • Threshold type indicates whether the Value or Value Range is interpreted as a percentage or as an absolute count.

  • Allow null values determines whether null values are permitted.

  • Check for match verifies if data values align with predefined standards, formats, or reference values. This helps ensure accuracy, consistency, and integrity.
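The way Operator, Value, Value range, and Threshold type combine can be sketched as a small function. This is an illustrative sketch under stated assumptions, not the product's logic: the function name `within_threshold`, the operator strings, and the parameter names are hypothetical; the threshold is assumed to be compared against the success count (or success percentage); and Between is assumed to include both ends of the range.

```python
from typing import Optional, Tuple


def within_threshold(
    success_count: int,
    total_count: int,
    operator: str,
    threshold_type: str,
    value: Optional[float] = None,
    value_range: Optional[Tuple[float, float]] = None,
) -> bool:
    """Compare the observed success measure with the configured threshold."""
    # Threshold type decides what is compared: a percentage or a raw count.
    if threshold_type == "percentage":
        observed = 100.0 * success_count / total_count if total_count else 0.0
    elif threshold_type == "absolute":
        observed = float(success_count)
    else:
        raise ValueError(f"unknown threshold type: {threshold_type}")

    # Between needs a start and end range (assumed inclusive here).
    if operator == "between":
        if value_range is None:
            raise ValueError("Between requires a value range")
        start, end = value_range
        return start <= observed <= end

    # The remaining operators need a single value.
    if value is None:
        raise ValueError(f"{operator} requires a value")
    if operator == "greater_than":
        return observed > value
    if operator == "less_than":
        return observed < value
    if operator == "equal_to":
        return observed == value
    raise ValueError(f"unknown operator: {operator}")


# 5 of 5 values succeed: 100% is greater than 75%.
print(within_threshold(5, 5, "greater_than", "percentage", value=75))  # True
# 2 of 5 values succeed: 40% is not greater than 75%.
print(within_threshold(2, 5, "greater_than", "percentage", value=75))  # False
```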

Sample Input

ID | Customer      | Country
1  | Fallon        | greatBritain
2  | FranklynFryer | France
3  | Kathleen      | unitedStates
4  | JudieGreen    | (null)
5  | JohnDoe       | France

Sample rule configuration

Case type: Pascal Case

Sample success criteria configuration

  • Operator: Greater than
  • Value: 75%
  • Threshold type: Percentage
  • Allow null values: False
  • Check for match: True


Sample output

Column Name | Rule Name               | Success Count | Failure Count | Within Threshold | Null Count
Customer    | Consistent Casing check | 5             | 0             | Yes              | 0
Country     | Consistent Casing check | 2             | 3             | No               | 1
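The sample output can be reproduced with a short script. This is a sketch under assumptions inferred from the sample rather than documented behavior: the 75% value is read as a percentage of all rows, a null value counts as a failure because Allow null values is False, and the Pascal Case pattern is limited to ASCII letters and digits. The `summarize` helper is a hypothetical name.

```python
import re

# Assumed Pascal Case pattern: one or more words, each starting uppercase.
PASCAL = re.compile(r"(?:[A-Z][a-z0-9]*)+")

# Sample input rows: (ID, Customer, Country). Row 4 has no country.
ROWS = [
    (1, "Fallon", "greatBritain"),
    (2, "FranklynFryer", "France"),
    (3, "Kathleen", "unitedStates"),
    (4, "JudieGreen", None),
    (5, "JohnDoe", "France"),
]


def summarize(values, min_percent=75.0):
    """Count successes, failures, and nulls for one column."""
    null_count = sum(1 for v in values if v is None)
    success = sum(1 for v in values if v is not None and PASCAL.fullmatch(v))
    # Nulls are not allowed in the sample configuration, so they fail.
    failure = len(values) - success
    percent = 100.0 * success / len(values)
    return {
        "success": success,
        "failure": failure,
        "within_threshold": percent > min_percent,  # Operator: Greater than
        "null": null_count,
    }


customer = summarize([row[1] for row in ROWS])
country = summarize([row[2] for row in ROWS])

print(customer)  # {'success': 5, 'failure': 0, 'within_threshold': True, 'null': 0}
print(country)   # {'success': 2, 'failure': 3, 'within_threshold': False, 'null': 1}
```

In the Country column, “greatBritain” and “unitedStates” fail because they start with a lowercase letter, and the null in row 4 accounts for the third failure.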