Download from S3
Description
The Download from S3 activity enables automated downloading of files from a specified directory in an Amazon S3 bucket.
This is useful when working with datasets or documents stored in cloud storage that need to be processed, archived, or analyzed in workflows.
It supports flexible filtering using regex and date ranges, optional encryption settings, and post-download cleanup actions like file deletion or moving files to another S3 directory.
Use case: Automatically retrieve .csv transaction logs from the S3 path /logs/transactions/, process them, and archive the originals in /logs/processed/.
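The post-download cleanup described above can be sketched as a small decision function. This is a minimal illustration, not the activity's actual implementation: the function name `plan_post_download` and its parameters are hypothetical, and the sketch assumes that moving takes precedence over deleting when both options are enabled.

```python
import posixpath


def plan_post_download(key, delete_after=False, move_after=False, move_directory=None):
    """Return (action, destination_key) for one downloaded S3 object key.

    Assumption: when Move After Download is enabled, the file is moved
    rather than deleted, even if Delete After Download is also enabled.
    """
    if move_after:
        if not move_directory:
            raise ValueError("Move Directory is needed when Move After Download is enabled")
        # Keep the original file name and place it under the destination path.
        file_name = posixpath.basename(key)
        return ("move", posixpath.join(move_directory, file_name))
    if delete_after:
        return ("delete", None)
    return ("keep", None)


# The use case above: archive originals under /logs/processed/.
print(plan_post_download("/logs/transactions/tx.csv",
                         move_after=True,
                         move_directory="/logs/processed/"))
# ('move', '/logs/processed/tx.csv')
```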
Input
Input Type | Description |
---|---|
Files | No upstream input required. Files are fetched directly from S3. |
Output
Output Type | Format | Description |
---|---|---|
Files | Binary | Files retrieved from S3 |
Metadata | JSON | Includes file name, size, and source location info |
Configuration Fields
Field Name | Required | Description |
---|---|---|
Connection | Yes | S3 connection with credentials and region for accessing the bucket. |
Working Directory | Yes | The S3 path (bucket/folder) where the files are located. |
Regex | No | A pattern to filter specific files for download (e.g., .*\.csv$ ). |
Files For | Yes | Specifies which files to fetch based on their timestamps. Options: All, Today, Yesterday, Today and Yesterday, Date Range. |
Start Date | Optional | Start of the custom range (only shown if “Date Range” is selected). |
End Date | Optional | End of the custom range (only shown if “Date Range” is selected). |
Delete After Download | No | If enabled, files will be removed from S3 after download. |
Move After Download | No | If enabled, downloaded files are moved to another S3 folder instead of being deleted. |
Move Directory | Optional | Destination path in S3 for moved files (shown only if Move is enabled). |
Use Encryption | No | Enables encryption handling while downloading files. |
Encryption Key Name | Optional | The name of the encryption key to use (shown only if encryption is enabled). |
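To make the interaction between the Regex and Files For fields concrete, the selection logic might look roughly like the sketch below. Several details are assumptions rather than documented behavior: the regex is applied to the file name (not the full S3 key), the timestamp is treated as a last-modified date, and Date Range is treated as inclusive on both ends. The function names `date_window` and `select_files` are hypothetical.

```python
import re
from datetime import date, timedelta


def date_window(files_for, today, start=None, end=None):
    """Return an inclusive (first_day, last_day) pair, or None for "All"."""
    yesterday = today - timedelta(days=1)
    if files_for == "All":
        return None
    if files_for == "Today":
        return (today, today)
    if files_for == "Yesterday":
        return (yesterday, yesterday)
    if files_for == "Today and Yesterday":
        return (yesterday, today)
    if files_for == "Date Range":
        if start is None or end is None:
            raise ValueError("Date Range needs both Start Date and End Date")
        return (start, end)
    raise ValueError(f"Unknown Files For option: {files_for!r}")


def select_files(files, regex=None, files_for="All", today=None, start=None, end=None):
    """Filter (file_name, modified_date) pairs by regex and date window."""
    today = today or date.today()
    window = date_window(files_for, today, start, end)
    pattern = re.compile(regex) if regex else None
    selected = []
    for name, modified in files:
        if pattern and not pattern.search(name):
            continue  # name does not match the Regex field
        if window and not (window[0] <= modified <= window[1]):
            continue  # timestamp falls outside the Files For window
        selected.append(name)
    return selected


files = [
    ("a.csv", date(2024, 7, 1)),
    ("b.csv", date(2024, 7, 4)),
    ("c.txt", date(2024, 7, 2)),
]
print(select_files(files, r".*\.csv$", "Date Range",
                   start=date(2024, 7, 1), end=date(2024, 7, 3)))
# ['a.csv']
```

Under these assumptions both filters must pass for a file to be downloaded, so a file that matches the regex but falls outside the date window is skipped.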
Sample Input
Not Applicable
Sample Configuration
Field | Value |
---|---|
Connection | S3 - MyDataLake |
Working Directory | /daily-dumps/ |
Regex | .*\.csv$ |
Files For | Date Range |
Start Date | 2024-07-01 |
End Date | 2024-07-03 |
Delete After Download | false |
Move After Download | true |
Move Directory | /archive/2024-07/ |
Use Encryption | true |
Encryption Key Name | finance-data-key |
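As a rough illustration of how the conditional fields relate to their toggles, the sample configuration above can be written as a dictionary and checked before use. This is a sketch under assumptions: the dictionary keys simply mirror the field names, `validate_config` is a hypothetical helper, and treating the conditional fields as needed once their toggle is enabled is an inference from the "only shown if" notes, not documented behavior.

```python
from datetime import date


def validate_config(config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # Fields marked "Yes" in the Required column.
    for field in ("Connection", "Working Directory", "Files For"):
        if not config.get(field):
            problems.append(f"{field} is required")
    # Assumption: the date fields are needed once "Date Range" is selected.
    if config.get("Files For") == "Date Range":
        start, end = config.get("Start Date"), config.get("End Date")
        for field, value in (("Start Date", start), ("End Date", end)):
            if not value:
                problems.append(f"{field} is needed when Files For is Date Range")
        if start and end and date.fromisoformat(start) > date.fromisoformat(end):
            problems.append("Start Date must not be after End Date")
    # Assumption: each toggle needs its companion field.
    if config.get("Move After Download") and not config.get("Move Directory"):
        problems.append("Move Directory is needed when Move After Download is enabled")
    if config.get("Use Encryption") and not config.get("Encryption Key Name"):
        problems.append("Encryption Key Name is needed when Use Encryption is enabled")
    return problems


sample = {
    "Connection": "S3 - MyDataLake",
    "Working Directory": "/daily-dumps/",
    "Regex": r".*\.csv$",
    "Files For": "Date Range",
    "Start Date": "2024-07-01",
    "End Date": "2024-07-03",
    "Delete After Download": False,
    "Move After Download": True,
    "Move Directory": "/archive/2024-07/",
    "Use Encryption": True,
    "Encryption Key Name": "finance-data-key",
}
print(validate_config(sample))
# []
```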
Sample Output
The following files are fetched from the configured S3 directory:
Remote File Name | File Name | Size | Download |
---|---|---|---|
file1.csv | file1.csv | 2 MB | [Download] |
file2.csv | file2.csv | 3 MB | [Download] |