Problem:
I have created an S3 Source in Sumo Logic and am trying to collect logs from an S3 bucket, but I am not seeing any data in my Sumo Logic search.
Resolution:
Step 1
Open the Collector Status page (Sumo Logic -> Manage Data -> Collection -> Status) and verify whether any data is displayed in the histogram chart for your Collector. If you see data in this chart, go to Step 2.
Step 2
Run a search against your Collector using the "Use Receipt Time" option and verify whether any log messages appear. If you do see log messages, the issue may be a timestamp parsing problem, or Sumo Logic may still be catching up with the objects in your bucket.
If you are not seeing data after completing the above two steps, there are several possible causes of this issue.
1.) AWS S3 JSON access policy:
Check that you have assigned a valid S3 policy to your IAM user. We recommend assigning the JSON policy below to grant access to the S3 bucket contents. Make sure to replace the your_bucketname placeholders in the Resource section of the JSON policy with your actual S3 bucket name.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:ListBucketVersions",
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::your_bucketname/*",
        "arn:aws:s3:::your_bucketname"
      ]
    }
  ]
}
Note:
- All of the Action parameters shown above are required. Make sure to include both Amazon Resource Name (ARN) statements in the Resource section of the policy; both are needed to grant access to the bucket itself and to its contents.
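As a quick sanity check, you can confirm that the credentials used by the Source can actually list and read the bucket. The following is only a minimal sketch assuming boto3 and a placeholder bucket name (my-log-bucket); adjust it to your environment.

# Sketch: verify the IAM credentials used by the Source can list and read the bucket.
# "my-log-bucket" is a placeholder; credentials come from the default boto3 session.
import boto3
from botocore.exceptions import ClientError

bucket = "my-log-bucket"
s3 = boto3.client("s3")

try:
    # Exercises s3:ListBucket
    resp = s3.list_objects_v2(Bucket=bucket, MaxKeys=5)
    keys = [obj["Key"] for obj in resp.get("Contents", [])]
    print("ListBucket OK, sample keys:", keys)

    # Exercises s3:GetObject on the first object found, if any
    if keys:
        s3.get_object(Bucket=bucket, Key=keys[0])
        print("GetObject OK for", keys[0])
except ClientError as e:
    print("Access check failed:", e.response["Error"]["Code"])

If the check fails with an AccessDenied error, review the IAM policy above before moving on.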
2.) Wrong Path Expression:
If your AWS S3 JSON access policy looks correct, check the path expression supplied in your Source to ensure it matches the folder structure of your file objects. Some common path expression errors are as follows:
- The S3 bucket name is included in the path. The bucket name should not be included in the path expression unless it also exists as a folder within the bucket's folder structure.
- The path expression includes a leading forward slash. The path expression should not begin with a forward slash character.
- The path expression does not match the path of the file objects to collect. To collect all logs at a hierarchical level, you can use a portion of the source path along with a single asterisk as a wildcard to denote the rest of the path and file object names. Alternatively, to match all file objects in a bucket, you can supply a "*" as the path expression.
- The path expression contains multiple wildcard asterisks ("*"). Only one wildcard can be used in the path expression.
If you want to collect logs from a specific folder, specify that folder name followed by a * to denote the file objects within that path. For example, if your file objects are contained within a folder named test, you can define the path expression as test/* to collect all objects within the test folder. Note that this also includes any objects located in subfolders of that directory.
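To see which objects a given folder prefix actually covers, you can list the bucket under that prefix. This is a minimal sketch assuming boto3 and placeholder bucket and prefix names; the wildcard in the path expression is evaluated by Sumo Logic, so this only checks the folder portion of the expression.

# Sketch: list objects under the folder portion of a path expression such as "test/*".
# "my-log-bucket" and "test/" are placeholders.
import boto3

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket="my-log-bucket", Prefix="test/"):
    for obj in page.get("Contents", []):
        print(obj["Key"])

If nothing is printed, the prefix does not match any objects and the path expression needs to be adjusted.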
3.) There are no objects within your bucket with a last modified date later than the collection begin time setting.
If your AWS S3 JSON policy and path expression are correct, next check the configured collection begin time in Sumo Logic for that specific S3 Source and compare it with the last modified dates of the objects you expect to be collected from the S3 bucket.
Where you can find this: on the Sumo Logic Collection page, select your S3 Source -> Edit -> Collection should begin.
If the last modified dates of your S3 bucket objects are older than the collection begin time in Sumo Logic, you will need to recreate your S3 Source in Sumo Logic and set the collection begin time to a time earlier than the last modified time of the objects within your S3 bucket.
If you want to ingest all historical data from your S3 bucket into Sumo Logic, you can select All Time as the collection begin time from the drop-down option in Sumo Logic.
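A quick way to inspect the last modified times of the objects under your prefix is shown below, so they can be compared with the Source's collection begin time. This is a sketch assuming boto3 and placeholder bucket and prefix names.

# Sketch: print last modified timestamps so they can be compared with the
# "Collection should begin" setting. Bucket and prefix are placeholders.
import boto3

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket="my-log-bucket", Prefix="test/"):
    for obj in page.get("Contents", []):
        print(obj["LastModified"].isoformat(), obj["Key"])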
4.) SNS notification is not configured:
If SNS notification is not configured and the S3 bucket contains millions of objects, the scan can take a long time, possibly hours, or appear to run forever as it attempts to return the list of objects in the bucket.
If SNS notification is configured for the Source, new objects in the S3 bucket are collected by Sumo Logic quickly. SNS notification provides faster collection than the bucket scan method, especially for S3 buckets with millions of objects.
To configure SNS notification on the Source, please follow the steps in the documentation.
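The exact steps are in the Sumo Logic documentation, but at a high level the AWS side of the setup is: create an SNS topic, subscribe the Source's endpoint URL (shown on the S3 Source configuration page) to the topic over HTTPS, and send the bucket's ObjectCreated events to that topic. The following is only a sketch assuming boto3; the topic name, bucket name, and endpoint URL are placeholders, and the SNS topic access policy that allows S3 to publish to the topic is omitted.

# Sketch of the AWS side of SNS-based notification, assuming boto3.
# Topic name, bucket name, and the endpoint URL are placeholders; the topic
# access policy that allows the bucket to publish to the topic is omitted.
import boto3

sns = boto3.client("sns")
s3 = boto3.client("s3")

topic_arn = sns.create_topic(Name="sumo-s3-notifications")["TopicArn"]

# Subscribe the endpoint URL shown on the Sumo Logic S3 Source configuration page.
sns.subscribe(
    TopicArn=topic_arn,
    Protocol="https",
    Endpoint="https://example-sumologic-endpoint-url",
)

# Send ObjectCreated events from the bucket to the topic.
s3.put_bucket_notification_configuration(
    Bucket="my-log-bucket",
    NotificationConfiguration={
        "TopicConfigurations": [
            {"TopicArn": topic_arn, "Events": ["s3:ObjectCreated:*"]}
        ]
    },
)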
If you are still facing issues, please reach out to Sumo Logic Technical Support.