Unlike other Source types, Amazon S3 bucket Sources do not support the double asterisk (**) within the file expression. For example, the following file expression is not supported:
My_S3_Bucket_Name/CloudTrail/**/*.gz
Amazon S3 interprets wildcards in the file path differently than other Sources. It does not consider forward slashes the same way a traditional filesystem does; Amazon S3 considers them simply part of a path string. Using a single asterisk in S3 Source paths accomplishes the same goal as using two asterisks in other Sources.
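The difference between the two wildcard interpretations can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the collector's actual matching code; the helper names to_regex and matches are hypothetical, and the sketch handles only the * wildcard.

```python
import re

def to_regex(pattern: str, star_crosses_slash: bool) -> re.Pattern:
    """Translate a pattern whose only wildcard is '*' into a regex.

    star_crosses_slash=True  -> S3-style: '*' matches any characters,
                                including forward slashes.
    star_crosses_slash=False -> filesystem-style: '*' stops at '/'.
    """
    star = ".*" if star_crosses_slash else "[^/]*"
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(star.join(literal_parts))

def matches(pattern: str, key: str, star_crosses_slash: bool) -> bool:
    # fullmatch requires the whole key to match, not just a prefix.
    return to_regex(pattern, star_crosses_slash).fullmatch(key) is not None

key = "CloudTrail/2014/12/05/20141205.json.gz"

# S3-style: the slashes are just characters in the key string.
s3_style = matches("CloudTrail/*", key, star_crosses_slash=True)

# Filesystem-style: a single '*' cannot span the nested date folders.
fs_style = matches("CloudTrail/*", key, star_crosses_slash=False)

print(s3_style, fs_style)
```

Under the filesystem-style interpretation, each folder level would need its own wildcard (for example CloudTrail/*/*/*/*.gz), which is why other Sources rely on the double asterisk and S3 Sources do not need it.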
For example, CloudTrail logging generates a new folder every day, producing file paths that look like this:
My_S3_Bucket_Name/CloudTrail/2014/12/05/20141205.json.gz
To gather all logs under a directory structure that is constantly changing, use a single asterisk in the file path when creating your S3 Source. For example, an S3 Source file path set to My_S3_Bucket_Name/CloudTrail/* will collect everything under CloudTrail, including:
My_S3_Bucket_Name/CloudTrail/2014/12/05/20141205.json.gz
My_S3_Bucket_Name/CloudTrail/2013/11/04/20131104.json.gz
My_S3_Bucket_Name/CloudTrail/2012/10/03/20121003.json.gz
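As a quick sanity check before saving a Source, a candidate file path can be tested against a few sample object paths. The sketch below uses Python's standard fnmatch module, whose * also matches across forward slashes; it is assumed here to approximate the S3 Source behavior and is not the collector's own logic. The OtherLogs entry is a hypothetical path added for contrast.

```python
from fnmatch import fnmatchcase

# File path as it would be entered for the S3 Source in this article.
FILE_PATH = "My_S3_Bucket_Name/CloudTrail/*"

sample_paths = [
    "My_S3_Bucket_Name/CloudTrail/2014/12/05/20141205.json.gz",
    "My_S3_Bucket_Name/CloudTrail/2013/11/04/20131104.json.gz",
    "My_S3_Bucket_Name/CloudTrail/2012/10/03/20121003.json.gz",
    "My_S3_Bucket_Name/OtherLogs/2014/12/05/app.log",  # hypothetical, outside CloudTrail
]

# fnmatchcase is case-sensitive, so "Cloudtrail" and "CloudTrail" differ.
collected = [p for p in sample_paths if fnmatchcase(p, FILE_PATH)]

for path in collected:
    print(path)
```

Because the comparison is case-sensitive in this sketch, it also shows why the folder name should be typed with the same capitalization that appears in the bucket.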
For more information, see Amazon Path Expression.
Note: When you configure the Source's Collect From Time parameter, selecting All Time collects every log, which in some cases can amount to several years of logs. To avoid this, specify a date instead.
Comments
I am trying to configure Sumo Logic to look at my CloudTrail logs. It is asking for the S3 Bucket Name and a Path Expression. In your example above, My_S3_Bucket_Name/Cloudtrail/*, what would I enter in the S3 Bucket field, and what would I enter in the Path Expression field?