Using syslog-ng with the Elastic stack

6 Aug 2019

One of the most popular destinations of syslog-ng is Elasticsearch. Any time a new language binding was introduced to syslog-ng, someone implemented an Elasticsearch destination for it. For many years, the official Elasticsearch destination for syslog-ng was implemented in Java. With the recent enhancements to the http() destination of syslog-ng, a new, native C-based implementation called the elasticsearch-http() destination is available.

Why do so many people want to send their logs to Elasticsearch? There are many reasons:

it is an easy-to-scale and easy-to-search data store
it is NoSQL: any number of name-value pairs can be stored (Hello, message parsing!)
it works with Kibana, an easy-to-use data explorer and visualization solution for Elasticsearch

And why use syslog-ng on the sending side? There are very good reasons for that, too:

A single, high-performance and reliable log collector for all your logs, no matter if they come from network devices, local system or applications. Therefore, it can greatly simplify your logging architecture.
High speed data processor that parses both structured (JSON, CSV, XML) and unstructured (PatternDB) log messages. It can also anonymize log messages if required by policies or regulations, and reformat them to be easily digested by analyzers.
Complex filtering, to make sure that only important messages get through and that they reach the right destination.

This blog post is based on the Elasticsearch-specific parts of the syslog-ng workshop I gave recently at the Pass the SALT conference in Lille, France.

Before you begin

The elasticsearch-http() destination was introduced in syslog-ng version 3.21. To be able to use it, you need HTTP and JSON support enabled in syslog-ng. If you installed syslog-ng from a package, these features might be in separate sub-packages, so that you can avoid installing extra dependencies you do not need. Depending on your distribution, the necessary package might be called syslog-ng-http (Fedora/RHEL/CentOS), syslog-ng-curl (openSUSE/SLES) or syslog-ng-mod-http (Debian, Ubuntu). Recent versions of FreeBSD ports enable these features by default.

If it is not available as part of your Linux distribution, check our third-party binaries page for downloads, or build it yourself from source.

Obviously, you also need Elasticsearch to be installed. The example configuration is tested to work with Elasticsearch 7.X. The minimal differences between 7.X and earlier versions from the syslog-ng configuration point of view will be noted.

Learn how to install syslog-ng and Elasticsearch 7 on RHEL/CentOS, our most popular platforms: https://www.syslog-ng.com/community/b/blog/posts/syslog-ng-and-elasticsearch-7-getting-started-on-rhel-centos

Learning syslog-ng

If you are new to syslog-ng, you can learn about the basics, its major building blocks and configuration from my blog at https://www.syslog-ng.com/community/b/blog/posts/building-blocks-of-syslog-ng. It is the generic part of the syslog-ng workshop I gave at the Pass the SALT conference in Lille.

Once you are confident with the basic concepts, it will be easier to read the documentation. It is massive (well over 300 pages, covering even the smallest details of syslog-ng), and available at https://www.syslog-ng.com/technical-documents/list/syslog-ng-open-source-edition/

Elasticsearch

Originally, the official syslog-ng Elasticsearch driver was written in Java. It is still available, but will most likely be phased out once the new elasticsearch-http() destination is fine-tuned. The Java-based driver has several disadvantages: it cannot be included in Linux distributions, and it requires a lot more resources. The new elasticsearch-http() destination is a wrapper around the http() destination of syslog-ng, written as native C code. As it does not have any “esoteric” dependencies, it can be part of any Linux distribution. It is also a lot less resource-hungry than the Java-based destination; only in some cases of extreme load does it use more resources.

Below you can see a very basic configuration for syslog-ng, which saves logs locally and sends the same logs to Elasticsearch as well. This way you can easily check whether your logs arrive in Elasticsearch.

@version:3.21
@include "scl.conf"
source s_sys { system(); internal();};
destination d_mesg { file("/var/log/messages"); };
log { source(s_sys); destination(d_mesg); };
 
destination d_elasticsearch_http {
    elasticsearch-http(
        index("syslog-ng")
        type("")
        url("http://localhost:9200/_bulk")
        template("$(format-json --scope rfc5424 --scope dot-nv-pairs
        --rekey .* --shift 1 --scope nv-pairs
        --exclude DATE --key ISODATE @timestamp=${ISODATE})")
    );
};


log {
    source(s_sys);
    destination(d_elasticsearch_http);
    flags(flow-control);
};

The configuration above sends log messages to Elasticsearch using the new elasticsearch-http() destination. You need to set an index name and a URL. The type() option is also mandatory, but for Elasticsearch 7.X you should leave it empty. You can see that the Elasticsearch destination uses a complex template (namely, it uses JSON formatting and sends not only syslog data, but name-value pairs parsed from messages as well).
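To make the url() ending in _bulk less abstract, here is a minimal Python sketch of the newline-delimited format that the Elasticsearch bulk API accepts: an action line, followed by the document itself. It only illustrates the wire format and is not a description of what syslog-ng does internally. The helper name bulk_body and the sample field values are assumptions made up for this example.

```python
import json


def bulk_body(index, documents):
    """Build a request body for the Elasticsearch _bulk endpoint.

    Hypothetical helper for illustration: every document is preceded by
    an action line, and the whole body has to end with a newline.
    """
    lines = []
    for doc in documents:
        # Action line: index the following document into the given index.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the document, for example the output of $(format-json ...).
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample values are invented for the example.
    sample = {
        "@timestamp": "2019-08-06T14:30:00+02:00",
        "HOST": "localhost",
        "PROGRAM": "root",
        "MESSAGE": "this is a test message",
    }
    print(bulk_body("syslog-ng", [sample]), end="")
```

Each log message becomes one such pair of lines, which is why the template has to produce a single-line JSON document.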

Name-value pairs created by out-of-box parsers of syslog-ng start with a dot. When formatted into JSON, these initial dots are turned into underscores, which is problematic with Elasticsearch. In the template above, initial dots are simply removed. While it is OK in most cases, in your specific environment it might cause problems (namely, overwriting existing name-value pairs), so double check the name of your name-value pairs before using this template.
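One way to double check your names is to look for pairs that become identical once the leading dot is dropped. The Python sketch below does that for a list of names. It is only an illustration of the check: the function name find_rekey_collisions and the sample names are hypothetical, and you would have to collect the real names from your own logs.

```python
def find_rekey_collisions(names):
    """Return names that become identical once a leading dot is removed.

    The result maps each stripped name to the original names that
    would end up under it.
    """
    seen = {}
    collisions = {}
    for name in names:
        # Mimic "--rekey .* --shift 1": drop the first character of
        # names that start with a dot, leave the others untouched.
        stripped = name[1:] if name.startswith(".") else name
        if stripped in seen and seen[stripped] != name:
            collisions.setdefault(stripped, [seen[stripped]]).append(name)
        else:
            seen[stripped] = name
    return collisions


if __name__ == "__main__":
    # Hypothetical names: one parser-created pair clashes with a custom one.
    names = [".sudo.USER", "app.name", ".app.name"]
    print(find_rekey_collisions(names))
    # {'app.name': ['app.name', '.app.name']}
```

An empty result means that removing the initial dots is safe for the names you checked.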

Elasticsearch prefers the ISODATE date format over the traditional syslog date format, which is why the last line of the template excludes DATE and adds ISODATE under the name @timestamp.
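The difference between the two formats is easy to see side by side. The following Python sketch formats the same moment both ways. It is an illustration only: the helper names and the sample date are made up, and the traditional format is reproduced by hand here rather than taken from syslog-ng.

```python
from datetime import datetime, timedelta, timezone

# Month names are spelled out to keep the output independent of the locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def traditional_syslog_date(dt):
    """Format like a traditional syslog timestamp: no year, no time zone."""
    return f"{MONTHS[dt.month - 1]} {dt.day:2d} {dt:%H:%M:%S}"


def iso_date(dt):
    """Format as ISO 8601, including the year and the UTC offset."""
    return dt.isoformat()


if __name__ == "__main__":
    moment = datetime(2019, 8, 6, 14, 30, 0,
                      tzinfo=timezone(timedelta(hours=2)))
    print(traditional_syslog_date(moment))  # Aug  6 14:30:00
    print(iso_date(moment))                 # 2019-08-06T14:30:00+02:00
```

The traditional format carries neither the year nor the time zone, so the ISO 8601 form is the unambiguous choice for a date field.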

You can learn a lot more about configuring syslog-ng for Elasticsearch from the syslog-ng documentation. Here I would like to highlight two differences from Beats/Logstash:

If you want to feed a cluster of Elasticsearch nodes using syslog-ng, you have to list the nodes in the url() parameter. There is no automatic node discovery.
By default, syslog-ng sends all data to Elasticsearch as strings, which limits how the data can be used. You can use mapping on the Elasticsearch side, and you can also configure data types on the syslog-ng side: https://www.syslog-ng.com/technical-documents/doc/syslog-ng-open-source-edition/3.16/administration-guide/9#TOPIC-956418
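As a sketch of the Elasticsearch-side option, the Python code below builds an explicit mapping for Elasticsearch 7 and can send it when the index is created. Treat it as an example under assumptions rather than a recommended setup: the function names are hypothetical, the index name and URL are taken from the configuration above, and PID is only assumed to be a field worth storing as a number in your logs.

```python
import json
import urllib.request


def build_mapping(numeric_fields):
    """Return an Elasticsearch 7 index body that maps the given fields as integers."""
    # @timestamp holds the ISODATE value, so it is mapped as a date.
    properties = {"@timestamp": {"type": "date"}}
    for field in numeric_fields:
        properties[field] = {"type": "integer"}
    return {"mappings": {"properties": properties}}


def create_index(base_url, index, body):
    """Send PUT <base_url>/<index> with the mapping (needs a running Elasticsearch)."""
    request = urllib.request.Request(
        f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    mapping = build_mapping(["PID"])  # PID is an assumed example field
    print(json.dumps(mapping, indent=2))
    # Uncomment to create the index on a local test node:
    # create_index("http://localhost:9200", "syslog-ng", mapping)
```

The index has to be created this way before syslog-ng sends its first message, because the mapping of an existing field cannot be changed afterwards.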

Testing

The configuration above sends all system logs to the Elasticsearch destination as well, so you will most likely have some sample logs in Elasticsearch very soon. If your test machine does not produce any logs within a reasonable time frame, you can use the logger utility to send a few test messages:

logger this is a test message

Even without extra configuration, you can see results from message parsing in Elasticsearch. Recent (3.13+) versions of syslog-ng parse sudo log messages automatically. If you run any commands through sudo, you should see name-value pairs parsed from sudo messages.
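If you prefer to check the result without Kibana, you can query Elasticsearch directly. The Python sketch below searches the index for a phrase in the MESSAGE field. It assumes the index name and URL from the configuration above and a node without authentication. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def search_url(base_url, index, text):
    """Build a URI search request for a phrase in the MESSAGE field."""
    query = urllib.parse.urlencode({"q": f'MESSAGE:"{text}"'})
    return f"{base_url}/{index}/_search?{query}"


def messages_from(response):
    """Extract the MESSAGE field of every hit from a search response."""
    return [hit["_source"].get("MESSAGE") for hit in response["hits"]["hits"]]


def search(base_url, index, text):
    """Run the search against a live node and return the matching messages."""
    with urllib.request.urlopen(search_url(base_url, index, text)) as reply:
        return messages_from(json.load(reply))


if __name__ == "__main__":
    print(search_url("http://localhost:9200", "syslog-ng", "test message"))
    # With Elasticsearch running locally:
    # print(search("http://localhost:9200", "syslog-ng", "test message"))
```

If the returned list contains your test message, the whole pipeline from logger through syslog-ng to Elasticsearch works.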

What is next?

Learn what is new with Elasticsearch 7 and syslog-ng, and how to send geographical data from syslog-ng to Elasticsearch along the way: https://www.syslog-ng.com/community/b/blog/posts/syslog-ng-with-elastic-stack-7

If you have questions or comments related to syslog-ng, do not hesitate to contact us. You can reach us by email or you can even chat with us. For a long list of possibilities, check our contact page at https://syslog-ng.org/contact-us/. On Twitter I am available as @Pczanik.
