Description
The Split URL activity is designed to parse and decompose full URL strings into their constituent parts such as protocol, host, port, path, query string, and fragment. URLs often store critical information like identifiers, categories, or navigation paths, and this activity enables structured access to that data for analysis, filtering, or transformation.
By specifying the source column containing URLs and configuring the output column names for different components, users can efficiently extract web-related metadata for use in dashboards, reports, or downstream processing.
Example Use Case:
A dataset contains URLs accessed by users:
https://company.com:8080/employee?id=E001&name=John+Doe#profile
This activity will split it into:
- Protocol: https
- Host: company.com
- Port: 8080
- Path: /employee
- Query: id=E001&name=John+Doe
- Fragment: profile
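The decomposition above can be reproduced outside the tool for a quick sanity check. The following is a minimal sketch using Python's standard `urllib.parse` module. It is not the activity's own implementation, and the `components` dictionary and its keys are illustrative names chosen to mirror the list above.

```python
from urllib.parse import urlsplit

# Example URL taken from the use case above.
url = "https://company.com:8080/employee?id=E001&name=John+Doe#profile"

parts = urlsplit(url)

# Map the parsed pieces onto the component names used in this document.
# The keys are illustrative; the activity lets you choose the column names.
components = {
    "Protocol": parts.scheme,
    "Host": parts.hostname,
    "Port": parts.port,        # an int here; None when the URL has no port
    "Path": parts.path,
    "Query": parts.query,      # raw query string, not split into pairs
    "Fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name}: {value}")
```

Note that `urlsplit` returns the port as an integer and omits the leading `?` and `#` from the query and fragment. Whether the activity stores the port as a number or as text is not stated here.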
Input
Type | Description |
---|---|
Data Only | A column in the dataset must contain full URLs, including the scheme (such as http, https, or ftp). |
Output
Type | Description |
---|---|
Transformed Data | Adds new columns for the extracted components of the URL (protocol, host, port, path, query, and fragment). |
Configuration Fields
Field Name | Description |
---|---|
URL Column | Specifies the column containing the full URLs to be parsed. Only one column can be selected. |
Protocol Column Name | Column name to store the extracted URL protocol (e.g., http, https, ftp). |
Host Column Name | Column name to store the domain or IP address of the URL (e.g., example.com, 192.168.1.1). |
Port Column Name | Column name to store the port number, if present (e.g., 8080, 443). |
Path Column Name | Column name to store the path portion of the URL (e.g., /employee, /dashboard). |
Query Column Name | Column name to store the raw query string (the portion after ?), not parsed into key-value pairs. |
Fragment Column Name | Column name to store the fragment identifier (the portion after #), if any. |
Note: This activity does not split the query string into key-value pairs. Use the Split HTTP Query activity if you need that level of detail.
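To illustrate the difference between a raw query string and its key-value pairs, here is a small sketch using Python's standard library. It only demonstrates the concept and assumes nothing about how the Split HTTP Query activity is implemented or what its output looks like.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://company.com:8080/employee?id=E001&name=John+Doe#profile"

# What a URL split yields: the query as one unparsed string.
raw_query = urlsplit(url).query

# What a further query split would yield: individual key-value pairs.
# parse_qs also decodes "+" to a space and collects values in lists.
pairs = parse_qs(raw_query)

print(raw_query)
print(pairs)
```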
Sample Configuration
Field | Value |
---|---|
URL Column | url |
Protocol Column Name | Protocol |
Host Column Name | Host |
Port Column Name | Port |
Path Column Name | Path |
Query Column Name | Query |
Fragment Column Name | Fragment |
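As a rough picture of what this configuration does to a dataset, the sketch below applies the same column mapping to a list of row dictionaries in plain Python. The `config` dictionary, its keys, and the `split_url_row` helper are hypothetical names invented for this illustration, and the second URL is an assumed example. How the activity itself represents a missing port, query, or fragment is not documented here, so the `None` and empty-string values reflect `urllib.parse` behavior only.

```python
from urllib.parse import urlsplit

# Hypothetical representation of the sample configuration above.
config = {
    "url_column": "url",
    "protocol": "Protocol",
    "host": "Host",
    "port": "Port",
    "path": "Path",
    "query": "Query",
    "fragment": "Fragment",
}

def split_url_row(row, config):
    """Return a copy of the row with one new column per URL component."""
    parts = urlsplit(row[config["url_column"]])
    out = dict(row)  # keep the original columns, including the URL itself
    out[config["protocol"]] = parts.scheme
    out[config["host"]] = parts.hostname
    out[config["port"]] = parts.port        # None when no port is given
    out[config["path"]] = parts.path
    out[config["query"]] = parts.query      # "" when there is no query
    out[config["fragment"]] = parts.fragment  # "" when there is no fragment
    return out

rows = [
    {"url": "https://company.com:8080/employee?id=E001&name=John+Doe#profile"},
    {"url": "http://example.com/dashboard"},  # assumed example without port or query
]

result = [split_url_row(row, config) for row in rows]
for row in result:
    print(row)
```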
Sample Output