Split HTTP Query
Description
The Split HTTP Query activity is used to extract and convert HTTP query strings embedded within URLs into structured, column-based data. This is particularly useful in cases where user input, system requests, or web-tracking data are passed as URL query parameters and stored in a single column.
This activity scans each value in the selected column to detect standard query string structures (those that typically follow a ? in a URL), such as: https://domain.com/resource?key1=value1&key2=value2
Once detected, it parses each parameter (key-value pair) and creates new columns in the dataset, assigning values from the query string accordingly. This enables users to access and manipulate data like id, name, department, etc., as individual fields rather than dealing with them inside an encoded string.
Example Use Case:
You may have a dataset of employee links where each URL includes parameters like ?id=E003&name=Carlos+Gomez&department=Engineering. This activity will extract E003, Carlos+Gomez, and Engineering into separate columns named Key_id, Key_name, and Key_department, making the data easier to read, filter, or aggregate.
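The parsing step described above can be sketched in Python. This is only an illustration, not the activity's actual implementation: the function name split_query is hypothetical, values are left undecoded because the sample output on this page keeps + in place, and letting the last value win for a repeated key is an assumption.

```python
from urllib.parse import urlsplit


def split_query(value: str) -> dict[str, str]:
    """Return the query parameters of a URL or raw query string as a dict."""
    # For a full URL, take what follows '?'; otherwise treat the
    # whole value as a raw query string.
    query = urlsplit(value).query if "?" in value else value

    params: dict[str, str] = {}
    for pair in query.split("&"):
        if not pair:
            continue  # skip empty segments such as a trailing '&'
        key, _, val = pair.partition("=")
        # Values are kept as-is (no URL decoding), mirroring the sample output.
        # Assumption: for a repeated key, the last value wins.
        params[key] = val
    return params


params = split_query("?id=E003&name=Carlos+Gomez&department=Engineering")
# Add the prefix to each key to form the new column names.
columns = {"Key_" + key: val for key, val in params.items()}
print(columns)
# {'Key_id': 'E003', 'Key_name': 'Carlos+Gomez', 'Key_department': 'Engineering'}
```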
Input
Type | Description |
---|---|
Data | A dataset containing a column with HTTP URLs or raw query strings. Values in the column must follow the standard query string format, with key-value pairs separated by &. |
Output
Type | Description |
---|---|
Transformed Data | A dataset where the selected column’s query parameters are expanded into individual columns. |
Configuration Fields
Field Name | Description |
---|---|
Column Name | The column to be parsed. It should contain complete URLs or raw query strings containing key-value pairs separated by &. Only a single column can be selected. |
Prefix | A custom prefix for the newly created columns. Each extracted key will be appended to this prefix. For example, a key id with prefix Key_ becomes Key_id. This is useful to prevent naming collisions or to group similar fields. |
Include Original | Determines whether the original input columns are retained in the output. Enabled: keeps all the original columns, including the raw HTTP query. Disabled: outputs only the parsed query parameter columns. |
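How the three configuration fields interact can be approximated with a short Python sketch over rows represented as dictionaries. The function split_http_query and its parameter names are hypothetical and simply mirror the fields above; the activity's real behavior for edge cases (for example, rows whose query strings carry different keys) is not described here and is not modeled.

```python
from urllib.parse import urlsplit


def split_http_query(rows, column_name, prefix="", include_original=True):
    """Expand the query parameters found in one column into prefixed columns."""
    output = []
    for row in rows:
        value = row[column_name]
        # Accept either a complete URL or a raw query string.
        query = urlsplit(value).query if "?" in value else value

        parsed = {}
        for pair in query.split("&"):
            if pair:
                key, _, val = pair.partition("=")
                parsed[prefix + key] = val  # e.g. id -> Key_id

        # Include Original enabled: keep the input columns and append the new ones.
        # Include Original disabled: emit only the parsed columns.
        output.append({**row, **parsed} if include_original else parsed)
    return output


rows = [
    {
        "employee_id": "E001",
        "name": "John Doe",
        "http_query": "https://company.com/employee?id=E001&name=John+Doe&department=Sales",
    },
]

print(split_http_query(rows, "http_query", prefix="Key_", include_original=False))
# [{'Key_id': 'E001', 'Key_name': 'John+Doe', 'Key_department': 'Sales'}]
```

With include_original=True, each output row would carry employee_id, name, and http_query followed by the three Key_ columns.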
Sample Input
employee_id | name | http_query |
---|---|---|
E001 | John Doe | https://company.com/employee?id=E001&name=John+Doe&department=Sales |
E002 | Marie Dupont | https://company.com/employee?id=E002&name=Marie+Dupont&department=Marketing |
E003 | Carlos Gómez | https://company.com/employee?id=E003&name=Carlos+Gómez&department=Engineering |
Sample Configuration
Field | Value |
---|---|
Column Name | http_query |
Prefix | Key_ |
Include Original | Enabled |
Sample Output
employee_id | name | http_query | Key_id | Key_name | Key_department |
---|---|---|---|---|---|
E001 | John Doe | https://company.com/employee?id=E001&name=John+Doe&department=Sales | E001 | John+Doe | Sales |
E002 | Marie Dupont | https://company.com/employee?id=E002&name=Marie+Dupont&department=Marketing | E002 | Marie+Dupont | Marketing |
E003 | Carlos Gómez | https://company.com/employee?id=E003&name=Carlos+Gómez&department=Engineering | E003 | Carlos+Gómez | Engineering |