Elasticsearch is gaining momentum as the ultimate destination for log messages. There are two major reasons for this:
- You can store arbitrary name-value pairs coming from structured logging or message parsing.
- You can use Kibana as a search and visualization interface.
This is the second blog post in a six-part series on storing logs in Elasticsearch using syslog-ng. You’ll find a link to the next and previous parts in the series at the end of this post. You can also read the whole Elasticsearch series in a single white paper.
Logging to Elasticsearch the traditional way
Originally, you could only send logs to Elasticsearch via Logstash. But the problem with Logstash is that it is quite heavy-weight, as it requires Java to run, and most of it was written in Ruby. While the use of Ruby makes it easy to extend Logstash with new features, it uses too much resource to be used universally. It is not something to be installed on thousands of servers, virtual machines or containers.
The workaround for this problem is to use the different Beats data shippers, which are friendlier with resources. If you also need reliability and scalability, you also need buffering. For this purpose, you need an intermediate database or message broker: Beats and Logstash support Redis and Apache Kafka.
If you look at the above architecture, you’ll see that you need to learn many different software to build an efficient, reliable and scalable logging system around Elasticsearch. All of these software have a different purpose, different requirements and different configuration.
Logging to Elasticsearch made simple
The good news is that syslog-ng can fulfill all of these roles. Most of syslog-ng is written in efficient C code, so it can be installed even in containers without extra resource overhead. It uses PatternDB for message parsing, which uses an efficient Radix-tree based algorithm instead of resource-hungry regular expressions. Of course regexp and a number of other parsers are also available, implemented in efficient C or Rust code. The only part of the pipeline where Java is needed is when the central syslog-ng server sends the log messages to the Elasticsearch server. In other words, only the Elasticsearch destination driver of syslog-ng uses Java, and it uses the official JAR client libraries from Elasticsearch for maximum compatibility.
As syslog-ng has disk-based buffering, you do not need external buffering solutions to enhance scalability and reliability, making your logging infrastructure easier to create and maintain. Disk-based buffering has been available in syslog-ng Premium Edition (the commercial version of syslog-ng) for a long time, and recently also became part of syslog-ng Open Source Edition (OSE) 3.8.1.
How to get started with syslog-ng and Elasticsearch
The syslog-ng application comes with detailed documentation to get you started and help you fine tune your installation.
If you want to get started with parsing messages – replacing grok – see the following links:
In my next Elasticsearch blog, I talk about how to parse data with syslog-ng, store in Elasticsearch, and analyze with Kibana.
In my previous blog in the Elasticsearch series, I covered basic information about using Elasticsearch with syslog-ng and how syslog-ng can simplify your logging architecture.
Are you stuck?
If you have any questions or comments related to syslog-ng, do not hesitate to contact us. You can reach us by e-mail or even in real time via chat. For a long list of possibilities, check our contact page at https://www.syslog-ng.com/contact-sales/.