Split URL

Description

The Split URL activity parses full URL strings into their constituent parts: protocol, host, port, path, query string, and fragment. URLs often carry critical information such as identifiers, categories, or navigation paths, and this activity gives structured access to that data for analysis, filtering, or transformation.

By specifying the source column containing URLs and configuring the output column names for different components, users can efficiently extract web-related metadata for use in dashboards, reports, or downstream processing.

Example Use Case:
A dataset contains URLs accessed by users:
https://company.com:8080/employee?id=E001&name=John+Doe#profile
This activity will split it into:

  • Protocol: https
  • Host: company.com
  • Port: 8080
  • Path: /employee
  • Query: id=E001&name=John+Doe
  • Fragment: profile
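The same decomposition can be reproduced outside the tool with Python's standard library. This is only a minimal sketch of equivalent behavior, not the activity's actual implementation; the helper name `split_url` and the dictionary keys are assumptions chosen to mirror the list above.

```python
from urllib.parse import urlsplit

def split_url(url: str) -> dict:
    """Break a URL into the six components listed above (illustrative helper)."""
    parts = urlsplit(url)
    return {
        "Protocol": parts.scheme,    # text before "://"
        "Host": parts.hostname,      # domain or IP address, without the port
        "Port": parts.port,          # integer, or None when the URL has no port
        "Path": parts.path,
        "Query": parts.query,        # raw string after "?", left unparsed
        "Fragment": parts.fragment,  # text after "#"
    }

components = split_url("https://company.com:8080/employee?id=E001&name=John+Doe#profile")
for name, value in components.items():
    print(f"{name}: {value}")
```

Note that `urlsplit` returns the port as an integer and the host in lowercase, so a tool that stores every component as text may differ in those details.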

Input

| Type | Description |
|---|---|
| Data Only | A column in the dataset must contain full URLs (including schemes like http, https, ftp, etc.) |

Output

| Type | Description |
|---|---|
| Transformed Data | Adds new columns for the extracted components of the URL (e.g., protocol, host, path, query, etc.) |

Configuration Fields

| Field Name | Description |
|---|---|
| URL Column | Specifies the column containing the full URLs to be parsed. Only one column can be selected. |
| Protocol Column Name | Column name to store the extracted URL protocol (e.g., http, https, ftp). |
| Host Column Name | Column name to store the domain or IP address of the URL (e.g., example.com, 192.168.1.1). |
| Port Column Name | Column name to store the port number if present (e.g., 8080, 443). |
| Path Column Name | Column name to store the path portion of the URL (e.g., /employee, /dashboard). |
| Query Column Name | Column name to store the query string (portion after ?) without parsing key-value pairs. |
| Fragment Column Name | Column name to store the fragment identifier (portion after #, if any). |

Note: This activity does not split the query string into key-value pairs. Use the Split HTTP Query activity if you need that level of detail.
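To illustrate the distinction in the note, the sketch below contrasts the raw query string that Split URL stores with a key-value view of the same string. It uses Python's `parse_qs` purely for contrast; how the Split HTTP Query activity actually parses and names its output is not described here, so treat the second half as an assumption about the general idea rather than that activity's behavior.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://company.com:8080/employee?id=E001&name=John+Doe#profile"

# What the Query column holds after Split URL: one unparsed string.
raw_query = urlsplit(url).query
print(raw_query)

# A key-value view of the same string, shown only for contrast.
# parse_qs decodes "+" as a space and returns a list of values per key.
pairs = parse_qs(raw_query)
print(pairs)
```

The raw form keeps the original encoding (`John+Doe`), which matters if the query string is later compared or rebuilt verbatim.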


Sample Input

| employee_id | name | url |
|---|---|---|
| E001 | John Doe | https://company.com:8080/employee?id=E001&name=John+Doe&department=Sales#profile |
| E002 | Marie Dupont | http://marketing.com/employee?id=E002&name=Marie+Dupont&department=Marketing |
| E003 | Carlos Gómez | ftp://fileserver.com:21/download?id=E003&name=Carlos+Gómez&department=Engineering |

Sample Configuration

| Field | Value |
|---|---|
| URL Column | url |
| Protocol Column Name | Protocol |
| Host Column Name | Host |
| Port Column Name | Port |
| Path Column Name | Path |
| Query Column Name | Query |
| Fragment Column Name | Fragment |

Sample Output

| employee_id | name | url | Protocol | Host | Port | Path | Query | Fragment |
|---|---|---|---|---|---|---|---|---|
| E001 | John Doe | https://company.com:8080/employee?id=E001&name=John+Doe&department=Sales#profile | https | company.com | 8080 | /employee | id=E001&name=John+Doe&department=Sales | profile |
| E002 | Marie Dupont | http://marketing.com/employee?id=E002&name=Marie+Dupont&department=Marketing | http | marketing.com | | /employee | id=E002&name=Marie+Dupont&department=Marketing | |
| E003 | Carlos Gómez | ftp://fileserver.com:21/download?id=E003&name=Carlos+Gómez&department=Engineering | ftp | fileserver.com | 21 | /download | id=E003&name=Carlos+Gómez&department=Engineering | |
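The sample run can be approximated in a few lines of Python, which may help when checking expected output by hand. This is a sketch under stated assumptions: the `config` mapping and the `add_url_columns` helper are hypothetical stand-ins for the activity's configuration, and representing missing components as empty strings is inferred from the blank cells in the sample output.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the Sample Configuration:
# urlsplit attribute -> configured output column name.
config = {
    "scheme": "Protocol",
    "hostname": "Host",
    "port": "Port",
    "path": "Path",
    "query": "Query",
    "fragment": "Fragment",
}

rows = [
    {"employee_id": "E001", "name": "John Doe",
     "url": "https://company.com:8080/employee?id=E001&name=John+Doe&department=Sales#profile"},
    {"employee_id": "E002", "name": "Marie Dupont",
     "url": "http://marketing.com/employee?id=E002&name=Marie+Dupont&department=Marketing"},
    {"employee_id": "E003", "name": "Carlos Gómez",
     "url": "ftp://fileserver.com:21/download?id=E003&name=Carlos+Gómez&department=Engineering"},
]

def add_url_columns(row: dict, url_column: str = "url") -> dict:
    """Return a copy of the row with one new column per URL component."""
    parts = urlsplit(row[url_column])
    out = dict(row)  # keep the original columns, including the URL itself
    for attr, column in config.items():
        value = getattr(parts, attr)
        # Assumption: absent components (no port, no fragment) become empty strings.
        out[column] = "" if value is None else str(value)
    return out

output = [add_url_columns(r) for r in rows]
for r in output:
    print(r["employee_id"], r["Protocol"], r["Host"], r["Port"], r["Path"], r["Fragment"])
```

Rows E002 and E003 show the cases worth watching: a URL without an explicit port leaves the Port column empty rather than filling in the scheme's default, and a URL without `#` leaves Fragment empty.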