Strategy for collectors / sources?

Comments

2 comments

  • Michael Floyd
    Hi Dan, others may have stronger opinions on Sources, but given the amount of data, you'll want to manage your ingest so that you don't get throttled or hit your account cap. This article covers the considerations:
    http://help.sumologic.com/Manage/Ingestion_and_Volume/Manage_Ingestion
    Here are some best practices for local vs. centralized collection:
    https://help.sumologic.com/Start_Here/Getting_Started/Best_Practices%3A_Local_and_Centralized_Data_Collection
    Also, depending on your production environment, Installed Collectors have some benefits over Hosted Collectors. This article explains:
    http://help.sumologic.com/Send_Data/Hosted_Collectors/Compare_Installed_and_Hosted_Collectors
  • Brian Goleno
    Dan,

    With the Cloud APIs for Data Collection, such as the HTTP API for Logs that you are asking about, the "Collector" level container is not nearly as relevant as it is for Installed Collectors. At this time, it simply serves as an organizing container if you have multiple Sources defined, and for metadata inheritance. The metadata and configuration settings at the Collector level are inherited as overridable defaults by each of the Sources created beneath it.

    Another reason to organize multiple HTTP API Sources under a "Collector" is permissions management. We are actively developing an improved role-based permissions management capability for Collectors. You may find it easier to group similar HTTP API Sources under one Collector for that reason.

    Re: "Should I create separate sources at all, or just dump all of it into a single source?"

    This is a matter of organizational preference on your side. Sorry, I hate to give a weak answer, so let me expand a bit. There are a variety of reasons to create multiple HTTP API Sources under a Collector:

    * Wanting different Source Name or Source Category metadata applied automatically to different data streams. Alternatively, you can set SourceName and SourceCategory as HTTP headers on each request.
    * Needing to override the timezone for different data streams.
    * Creating multiple Sources with identical metadata configurations but different tokens, as part of a security token rotation model. Some users create a replica of a Source that has a different URL token, migrate their logging clients to the new URL, then delete the old Source, thereby rotating the tokens. Others create a wide range of tokens in order to minimize the exposure of any one token.

    Re: "If I should create multiple sources, should they be by host, by log type (app log / system log"

    This is related to the previous answer. If I were to divide them up, I would do it by log type, since that is more closely related to the metadata used. But that can be overridden with HTTP header values.

    A roadmap note: by end of year, we should have much broader support for user-defined metadata, and that should be settable in HTTP headers as well. You can also always inject metadata into the message itself and set Field Extraction Rules to cast it to specific fields.

    Regards,
    Brian Goleno
    Data Collection and Ingest Product Manager
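
    For illustration, the header override mentioned above might look roughly like the sketch below, which sends everything to a single HTTP Source and sets the metadata per request by log type. This is a minimal sketch under stated assumptions, not an official client: the URL is a placeholder, the category names and the helper functions are hypothetical, and the header names X-Sumo-Name and X-Sumo-Category are what the HTTP Source documentation lists to my knowledge, so check them against the current docs before relying on them.

```python
import urllib.request

# Placeholder only: copy the real upload URL (including its token)
# from the HTTP Source's settings page.
SOURCE_URL = "https://collectors.example.invalid/receiver/v1/http/REPLACE_WITH_TOKEN"

# Hypothetical mapping from log type to Source Category.
CATEGORY_BY_LOG_TYPE = {
    "app": "prod/app",
    "system": "prod/system",
}


def build_request(lines, log_type, source_name, url=SOURCE_URL):
    """Build (but do not send) a POST whose headers override Source metadata."""
    body = "\n".join(lines).encode("utf-8")
    headers = {
        "Content-Type": "text/plain",
        # Assumed header names; verify against the HTTP Source docs.
        "X-Sumo-Name": source_name,
        "X-Sumo-Category": CATEGORY_BY_LOG_TYPE[log_type],
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send(request, timeout=10):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

    A call such as `send(build_request(["line one", "line two"], "app", "web-01/app.log"))` would post two log lines tagged with the app category. Separate Sources per log type would achieve the same result without the headers; which one is easier to maintain is the organizational preference discussed above.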
