
Custom File Parsing

Description

The Custom File Parsing activity processes files according to a user-defined JSON template that specifies how data should be identified, extracted, and transformed. The template acts as a parsing blueprint: it outlines field names, positions, lengths, delimiters, data types, and any transformation rules. This makes the parsing flexible enough to adapt to a wide range of file formats, whether structured, semi-structured, or irregular in layout.

By separating parsing logic from the code and placing it in a JSON configuration, the parser becomes easily maintainable and reusable across different datasets without requiring code changes. This method is particularly useful for organizations dealing with varied data sources, evolving file structures, or client-specific formats. It supports precise data mapping, validation, and conversion into structured outputs such as JSON, CSV, or database records, enabling seamless integration into ETL workflows, automation processes, and reporting systems.
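
As a rough illustration of the template-driven idea described above, the Python sketch below parses one fixed-width line using rules read from a JSON document. The template keys (`fields`, `start`, `length`, `type`), the column widths, and the helper `parse_line` are assumptions made up for this example; they are not the activity's documented template schema or implementation.

```python
import json
from datetime import datetime

# Hypothetical template: key names and column widths are illustrative
# assumptions, not the product's actual template schema.
TEMPLATE_JSON = """
{
  "fields": [
    {"name": "CustomerID",   "start": 0,  "length": 7,  "type": "string"},
    {"name": "Name",         "start": 7,  "length": 15, "type": "string"},
    {"name": "PurchaseDate", "start": 22, "length": 10, "type": "date"},
    {"name": "Amount",       "start": 32, "length": 8,  "type": "float"},
    {"name": "Status",       "start": 40, "length": 9,  "type": "string"}
  ]
}
"""

# Map each template data type to a conversion function.
CONVERTERS = {
    "string": str,
    "float": float,
    "date": lambda s: datetime.strptime(s, "%Y-%m-%d").date().isoformat(),
}


def parse_line(line, template):
    """Slice one fixed-width line into a record as described by the template."""
    record = {}
    for field in template["fields"]:
        end = field["start"] + field["length"]
        raw = line[field["start"]:end].strip()
        record[field["name"]] = CONVERTERS[field["type"]](raw)
    return record


template = json.loads(TEMPLATE_JSON)

# Build a sample line whose column widths match the template above.
line = (
    "CUST001"
    + "Alice Johnson".ljust(15)
    + "2025-07-15"
    + "1250.50".rjust(8)
    + "Paid".ljust(9)
)

record = parse_line(line, template)
print(record)
```

Because the field layout lives in the JSON text rather than in `parse_line`, supporting a different file layout in this sketch only requires a different template.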

This activity is especially useful for:

  • Multi-format adaptability – Parses varied file types by simply updating the JSON template without changing code.
  • Client-specific data mapping – Handles unique field layouts for different clients or data providers.
  • Rapid onboarding of new formats – Quickly integrates new file structures into workflows by creating corresponding JSON templates.

By reading structured or semi-structured data from files based on a user-defined JSON template, this activity ensures accurate extraction, flexible field mapping, and seamless integration into modern data workflows.

Tip: Ensure the JSON template’s field definitions, positions, and rules align precisely with the file’s structure to prevent parsing errors or missing data.


Input

| Field | Description |
|---|---|
| File | Unstructured text file from the previous activity. |

Output

| Field | Description |
|---|---|
| Data | Structured data extracted using JSON-defined rules, formatted into a table. |

Configuration Fields

| Field | Description |
|---|---|
| Add Files | Defines custom file parsing rule(s) using a JSON file template to identify and extract data from files. |
| File Encoding | Specifies the character set used to read the input file correctly. |
| Line Separator | Defines the character(s) that mark the end of each record in the file. |
| Max row lookup count | Limits the number of rows processed during data preview or validation, defaulting to 10000. |

Sample Input

Not Applicable


Sample Configuration

| Field | Value |
|---|---|
| Add Files | File Selector for JSON template. |
| File Encoding | Windows 1252 |
| Line Separator | \r\n |
| Max row lookup count | 10000 |
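
To make the effect of these settings concrete, here is a minimal Python sketch of how an encoding, a line separator, and a row limit could be applied when reading a file. The function `read_records` and the sample bytes are hypothetical and only illustrate the concepts; how the activity applies these settings internally is not described here. Python's `cp1252` codec corresponds to the Windows-1252 character set.

```python
from itertools import islice


def read_records(raw, encoding="cp1252", line_separator="\r\n", max_rows=10000):
    """Decode raw bytes, split them into records, and keep at most max_rows.

    Hypothetical helper for illustration only.
    """
    text = raw.decode(encoding)
    # Drop the empty string left after the final separator.
    records = (r for r in text.split(line_separator) if r)
    return list(islice(records, max_rows))


# Made-up sample bytes: 0xE9 and 0xEF are "é" and "ï" in Windows-1252.
raw = b"A;caf\xe9\r\nB;na\xefve\r\n"

all_rows = read_records(raw)
first_row_only = read_records(raw, max_rows=1)
print(all_rows)
print(first_row_only)
```

Decoding the same bytes with a different encoding, or splitting on a different separator, would yield different records, which is why these settings need to match the input file.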

Sample Output

| CustomerID | Name | PurchaseDate | Amount | Status |
|---|---|---|---|---|
| CUST001 | Alice Johnson | 2025-07-15 | 1250.50 | Paid |
| CUST002 | Robert Smith | 2025-07-18 | 980.00 | Pending |
| CUST003 | Maria Lopez | 2025-07-20 | 4500.00 | Paid |
| CUST004 | David Williams | 2025-07-22 | 320.75 | Cancelled |