
Download from S3

Description

The Download From S3 activity enables automated downloading of files from a specified directory in an Amazon S3 bucket.
This is useful when working with datasets or documents stored in cloud storage that need to be processed, archived, or analyzed in workflows.

It supports flexible filtering using regex and date ranges, optional encryption settings, and post-download cleanup actions like file deletion or moving files to another S3 directory.

Use case: Automatically retrieve .csv transaction logs from the S3 path /logs/transactions/, process them, and archive the originals in /logs/processed/.
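
To make the use case concrete, here is a minimal Python sketch of the equivalent steps at the S3 API level. It is not the activity's implementation. It assumes a boto3-style client that exposes `list_objects_v2`, `download_file`, `copy_object`, and `delete_object`; the function name `download_and_archive`, the bucket name, and the local directory are hypothetical. Prefixes are written without a leading slash, as object keys normally are.

```python
import os
import re


def download_and_archive(client, bucket, prefix, archive_prefix, dest_dir,
                         pattern=r".*\.csv$"):
    """Download matching objects under `prefix`, then move them to `archive_prefix`.

    `client` is assumed to behave like boto3.client("s3").
    Returns the list of file names that were downloaded.
    """
    matcher = re.compile(pattern)
    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []

    # list_objects_v2 returns at most 1000 keys per call;
    # a production job would paginate. This sketch reads one page only.
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    for obj in response.get("Contents", []):
        key = obj["Key"]
        name = key[len(prefix):]
        # Skip the folder marker itself, nested folders, and non-matching names.
        if not name or "/" in name or not matcher.search(name):
            continue

        client.download_file(bucket, key, os.path.join(dest_dir, name))

        # S3 has no native "move": copy to the archive location,
        # then delete the original object.
        client.copy_object(
            Bucket=bucket,
            Key=archive_prefix + name,
            CopySource={"Bucket": bucket, "Key": key},
        )
        client.delete_object(Bucket=bucket, Key=key)
        downloaded.append(name)

    return downloaded


# Hypothetical usage (requires boto3 and valid credentials):
#   import boto3
#   client = boto3.client("s3")
#   download_and_archive(client, "my-bucket", "logs/transactions/",
#                        "logs/processed/", "./incoming")
```

Passing the client in as a parameter keeps the sketch easy to test with a stand-in object and leaves credential handling to the caller, which is the role the Connection field plays for the activity.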


Input

| Input Type | Description |
| --- | --- |
| Files | No upstream input required. Files are fetched directly from S3. |

Output

| Output Type | Format | Description |
| --- | --- | --- |
| Files | Binary | Files retrieved from S3 |
| Metadata | JSON | Includes file name, size, and source location info |

Configuration Fields

| Field Name | Required | Description |
| --- | --- | --- |
| Connection | Yes | S3 connection with credentials and region for accessing the bucket. |
| Working Directory | Yes | The S3 path (bucket/folder) where the files are located. |
| Regex | No | A pattern to filter specific files for download (e.g., `.*\.csv$`). |
| Files For | Yes | Specifies which files to fetch based on their timestamps. Options: All, Today, Yesterday, Today and Yesterday, Date Range. |
| Start Date | Optional | Start of the custom range (shown only if "Date Range" is selected). |
| End Date | Optional | End of the custom range (shown only if "Date Range" is selected). |
| Delete After Download | No | If enabled, files are removed from S3 after download. |
| Move After Download | No | If enabled, moves downloaded files to another folder instead of deleting them. |
| Move Directory | Optional | Destination path in S3 for moved files (shown only if Move After Download is enabled). |
| Use Encryption | No | Enables encryption handling while downloading files. |
| Encryption Key Name | Optional | The name of the encryption key to use (shown only if Use Encryption is enabled). |
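
The combined effect of the Regex and Files For fields can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the activity's actual logic: it assumes the regex is applied to the file name, that date comparisons use the calendar date of each file's last-modified timestamp, and that a Date Range includes both its start and end day. The function names `date_window` and `select_files` are hypothetical.

```python
import re
from datetime import date, datetime, timedelta


def date_window(files_for, today, start=None, end=None):
    """Return an inclusive (first_day, last_day) pair, or None for 'All'."""
    if files_for == "All":
        return None
    if files_for == "Today":
        return (today, today)
    if files_for == "Yesterday":
        yesterday = today - timedelta(days=1)
        return (yesterday, yesterday)
    if files_for == "Today and Yesterday":
        return (today - timedelta(days=1), today)
    if files_for == "Date Range":
        if start is None or end is None:
            raise ValueError("Date Range needs both Start Date and End Date")
        return (start, end)
    raise ValueError(f"unknown Files For option: {files_for}")


def select_files(entries, regex=None, files_for="All",
                 today=None, start=None, end=None):
    """Filter (name, last_modified) pairs by name pattern and date window."""
    today = today or date.today()
    window = date_window(files_for, today, start, end)
    pattern = re.compile(regex) if regex else None

    selected = []
    for name, modified in entries:
        if pattern and not pattern.search(name):
            continue  # name does not match the Regex field
        if window and not (window[0] <= modified.date() <= window[1]):
            continue  # timestamp falls outside the Files For window
        selected.append(name)
    return selected


entries = [
    ("a.csv", datetime(2024, 7, 2, 10, 0)),
    ("b.txt", datetime(2024, 7, 2, 11, 0)),
    ("c.csv", datetime(2024, 6, 30, 9, 0)),
]
print(select_files(entries, r".*\.csv$", "Date Range",
                   start=date(2024, 7, 1), end=date(2024, 7, 3)))
```

Both filters must pass for a file to be selected, so leaving Regex empty and choosing All selects every file in the working directory.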

Sample Input

Not Applicable


Sample Configuration

| Field | Value |
| --- | --- |
| Connection | S3 - MyDataLake |
| Working Directory | /daily-dumps/ |
| Regex | `.*\.csv$` |
| Files For | Date Range |
| Start Date | 2024-07-01 |
| End Date | 2024-07-03 |
| Delete After Download | false |
| Move After Download | true |
| Move Directory | /archive/2024-07/ |
| Use Encryption | true |
| Encryption Key Name | finance-data-key |
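
The dependencies between fields in this sample can be checked with a short Python sketch. It is an illustration only: the dictionary layout and the `validate_config` function are hypothetical, and it assumes that a conditionally shown field (Start Date, End Date, Move Directory, Encryption Key Name) must be filled in once the option that reveals it is selected, which the field table does not state explicitly.

```python
from datetime import date

# The sample configuration above, expressed as a plain dictionary (hypothetical layout).
sample_config = {
    "Connection": "S3 - MyDataLake",
    "Working Directory": "/daily-dumps/",
    "Regex": r".*\.csv$",
    "Files For": "Date Range",
    "Start Date": "2024-07-01",
    "End Date": "2024-07-03",
    "Delete After Download": False,
    "Move After Download": True,
    "Move Directory": "/archive/2024-07/",
    "Use Encryption": True,
    "Encryption Key Name": "finance-data-key",
}


def validate_config(cfg):
    """Return a list of human-readable problems; an empty list means none found."""
    errors = []

    # Fields marked "Yes" in the Required column.
    for field in ("Connection", "Working Directory", "Files For"):
        if not cfg.get(field):
            errors.append(f"{field} is required")

    # Assumption: a Date Range needs both ends, in chronological order.
    if cfg.get("Files For") == "Date Range":
        start, end = cfg.get("Start Date"), cfg.get("End Date")
        if not start or not end:
            errors.append("Start Date and End Date are required for Date Range")
        elif date.fromisoformat(start) > date.fromisoformat(end):
            errors.append("Start Date must not be after End Date")

    # Assumption: enabling an option makes its dependent field mandatory.
    if cfg.get("Move After Download") and not cfg.get("Move Directory"):
        errors.append("Move Directory is required when Move After Download is enabled")
    if cfg.get("Use Encryption") and not cfg.get("Encryption Key Name"):
        errors.append("Encryption Key Name is required when Use Encryption is enabled")

    return errors


print(validate_config(sample_config))
```

Returning a list of problems rather than raising on the first one lets a caller report every misconfigured field in a single pass.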

Sample Output

The following files are fetched from the configured S3 directory:

| Remote File Name | File Name | Size | Download |
| --- | --- | --- | --- |
| file1.txt | file1.txt | 2 MB | [Download] |
| file2.csv | file2.csv | 3 MB | [Download] |