Custom File Parsing
Description
The Custom File Parsing activity processes files according to a user-defined JSON template that specifies how data is identified, extracted, and transformed. The template acts as a parsing blueprint, defining field names, positions, lengths, delimiters, data types, and transformation rules. This lets the parser adapt to a wide range of file formats, whether structured, semi-structured, or irregular in layout.
Because the parsing logic lives in a JSON configuration rather than in code, the same parser can be maintained and reused across datasets without code changes. This is particularly useful for organizations dealing with varied data sources, evolving file structures, or client-specific formats. The activity supports data mapping, validation, and conversion into structured outputs such as JSON, CSV, or database records, which can then feed ETL workflows, automation processes, and reporting systems.
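As a rough illustration of template-driven parsing, the sketch below reads one fixed-width record using a small JSON template. The template schema (`fields`, `name`, `start`, `length`, `type`) and the helper `parse_line` are hypothetical, invented for this example; the activity's actual template format may differ.

```python
import json
from datetime import date

# Hypothetical template: these key names are assumptions for illustration,
# not the activity's documented schema.
TEMPLATE_JSON = """
{
  "fields": [
    {"name": "CustomerID",   "start": 0,  "length": 7,  "type": "string"},
    {"name": "Name",         "start": 7,  "length": 15, "type": "string"},
    {"name": "PurchaseDate", "start": 22, "length": 10, "type": "date"},
    {"name": "Amount",       "start": 32, "length": 8,  "type": "float"}
  ]
}
"""

# Map each template type name to a Python conversion function.
CONVERTERS = {
    "string": str,
    "float": float,
    "date": date.fromisoformat,
}

def parse_line(line, template):
    """Slice one fixed-width line into typed fields as the template directs."""
    record = {}
    for field in template["fields"]:
        end = field["start"] + field["length"]
        raw = line[field["start"]:end].strip()
        record[field["name"]] = CONVERTERS[field["type"]](raw)
    return record

template = json.loads(TEMPLATE_JSON)

# Build a sample line whose column widths match the template above.
line = "CUST001" + "Alice Johnson".ljust(15) + "2025-07-15" + "1250.50".rjust(8)
record = parse_line(line, template)
print(record)
```

Supporting a new layout in this sketch means editing only `TEMPLATE_JSON`; `parse_line` stays unchanged, which is the separation of parsing logic from code described above.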
This activity is especially useful for:
- Multi-format adaptability: parses varied file types by updating the JSON template, without changing code.
- Client-specific data mapping: handles unique field layouts for different clients or data providers.
- Rapid onboarding of new formats: integrates new file structures into workflows by creating a corresponding JSON template.
Because files are read against a user-defined JSON template, extraction stays accurate, field mapping stays flexible, and the structured output can be passed directly to downstream data workflows.
Tip: Ensure the JSON template’s field definitions, positions, and rules align precisely with the file’s structure to prevent parsing errors or missing data.
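One way to catch such mismatches early is to check the template against a sample line before running the full parse. The sketch below assumes a hypothetical fixed-width template in which each field has `name`, `start`, and `length` keys; `check_template` is an illustrative helper, not part of the activity.

```python
def check_template(fields, sample_line):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    previous_end = 0
    # Walk the fields in positional order so overlaps are easy to spot.
    for field in sorted(fields, key=lambda f: f["start"]):
        end = field["start"] + field["length"]
        if field["start"] < previous_end:
            problems.append(f"{field['name']}: overlaps the preceding field")
        if end > len(sample_line):
            problems.append(f"{field['name']}: extends past the end of the line")
        previous_end = max(previous_end, end)
    return problems

# Hypothetical field definitions with two deliberate mistakes.
fields = [
    {"name": "CustomerID", "start": 0, "length": 7},
    {"name": "Name", "start": 5, "length": 15},     # starts inside CustomerID
    {"name": "Amount", "start": 30, "length": 20},  # runs past the sample line
]
sample_line = "CUST001" + "Alice Johnson".ljust(15) + "2025-07-15" + "1250.50".rjust(8)

issues = check_template(fields, sample_line)
for issue in issues:
    print(issue)
```

A check like this only verifies positions; it does not confirm that the extracted text has the expected data type.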
Input
Field | Description |
---|---|
File | Raw (not yet parsed) text file from the previous activity. |
Output
Field | Description |
---|---|
Data | Structured data extracted using JSON-defined rules, formatted into a table. |
Configuration Fields
Field | Description |
---|---|
Add Files | Selects the JSON template file that defines the custom parsing rule(s) used to identify and extract data from files. |
File Encoding | Specifies the character set used to read the input file correctly. |
Line Separator | Defines the character(s) that mark the end of each record in the file. |
Max row lookup count | Limits the number of rows processed during data preview or validation. Default: 10000. |
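To show how these three settings interact, the sketch below decodes raw bytes, splits them into records, and caps the number of rows returned. The function `read_records` and its parameter names are hypothetical, and how the activity applies these settings internally is an assumption. `cp1252` is Python's codec name for Windows-1252.

```python
from itertools import islice

def read_records(raw_bytes, encoding="cp1252", line_separator="\r\n", max_rows=10000):
    """Decode the file, split it into records, and keep at most max_rows of them."""
    text = raw_bytes.decode(encoding)
    # Skip empty records, such as the one left after a final separator.
    records = (r for r in text.split(line_separator) if r)
    return list(islice(records, max_rows))

# Sample bytes containing a non-ASCII character, encoded as Windows-1252.
raw = "CUST001|Café Ltd\r\nCUST002|Smith\r\nCUST003|Lopez\r\n".encode("cp1252")

rows = read_records(raw, max_rows=2)
print(rows)
```

Decoding the same bytes with a different encoding, for example UTF-8, would fail or garble the `é`, which is why File Encoding must match the file.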
Sample Input
Not Applicable
Sample Configuration
Field | Value |
---|---|
Add Files | File Selector for JSON template. |
File Encoding | Windows 1252 |
Line Separator | \r\n |
Max row lookup count | 10000 |
Sample Output
CustomerID | Name | PurchaseDate | Amount | Status |
---|---|---|---|---|
CUST001 | Alice Johnson | 2025-07-15 | 1250.50 | Paid |
CUST002 | Robert Smith | 2025-07-18 | 980.00 | Pending |
CUST003 | Maria Lopez | 2025-07-20 | 4500.00 | Paid |
CUST004 | David Williams | 2025-07-22 | 320.75 | Cancelled |