Sending logs to Humio using the elasticsearch-http() destination of syslog-ng

One of the most popular syslog-ng destinations is Elasticsearch. Humio, a log management provider, supports a broad range of ingest options and interfaces, including an Elasticsearch-compatible API. Last week, Humio announced Humio Community Edition, which provides the full Humio experience for free, with some limitations on daily ingestion and retention time. I tested the Community Edition, and it works perfectly well with syslog-ng.

If you come from the Humio side, you might wonder what syslog-ng is. It is an application for high-performance central log collection. Traditionally, syslog messages were collected centrally and saved to text files. Nowadays, syslog-ng acts more like a log management layer: it collects log messages from hosts and saves them for long-term storage, but it also forwards them to multiple destinations, like SIEMs and other log analysis solutions. This way, it is enough to collect log messages only once, and syslog-ng delivers the right log messages to the right destinations in the right format, after some initial processing.

Humio, available as a cloud service or self-hosted, is where you can send all your logs for storage and analysis. It has an easy-to-use interface for querying log messages, which can be extended with further analytics possibilities from the Humio marketplace.

From this blog, you can learn how to get started with Humio and syslog-ng. While Humio provides many other APIs for log ingestion, I focus on the elasticsearch-http() destination of syslog-ng, demonstrating that there is no vendor lock-in: the same driver works equally well for Elastic’s Elasticsearch, AWS’s OpenSearch and for Humio.

Before you begin

First of all, you need a Humio account. If you do not yet have one, go and register on the Humio website: https://www.humio.com/getting-started/community-edition/. My tests were made on the Community Edition, but everything should work on the full version just by modifying the URL of your Humio service in the syslog-ng configuration.

On the syslog-ng side, you need at least syslog-ng version 3.23, but as usual, I recommend using the latest version available. You can find up-to-date 3rd-party syslog-ng packages for many operating systems at https://www.syslog-ng.com/products/open-source-log-management/3rd-party-binaries.aspx.

Configuring syslog-ng

To configure syslog-ng, you need two pieces of data from the Humio web interface: the URL and the ingest token. The documentation only mentions that the host part of the URL depends on the location: cloud.humio.com for European customers and cloud.us.humio.com for US customers. I learned the hard way that there is also a third variant for the Community Edition: cloud.community.humio.com. So, in my case, the complete URL is: https://cloud.community.humio.com/api/v1/ingest/elastic-bulk

The other piece of data is the ingest token. You can find it under Settings. There is a default token, but you are advised to create a new token for each log source, so you can easily revoke access if a host or application is compromised.

You also need to check what the name of the log source is for local log messages in your syslog-ng configuration. If you use the default syslog-ng configuration on Ubuntu, it is called “s_src”.
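For orientation, the local source in the packaged Ubuntu configuration usually looks roughly like the sketch below. This is an illustration based on typical Debian and Ubuntu package defaults, not something taken from your system, so check your own syslog-ng.conf for the actual name and contents:

```
# Typical local source on Debian/Ubuntu (assumption: package defaults)
source s_src {
    system();    # platform-specific local log sources
    internal();  # messages generated by syslog-ng itself
};
```

If your configuration uses a different source name, use that name in the log path instead of s_src.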

If your Linux distribution of choice supports it, create a new configuration file with a .conf extension under the /etc/syslog-ng/conf.d/ directory. Otherwise, append the configuration to the end of syslog-ng.conf:

destination d_elastic_humio {
    elasticsearch-http(
        type("humio") # not used by humio, but required by plugin
        index("syslog-humio") # not used by humio, but required by plugin
        url("https://cloud.community.humio.com/api/v1/ingest/elastic-bulk")
        workers(2)
        batch-lines(200)
        user("syslog-ng") # not used by humio, can be whatever you want
        password("a555ef1c-XXX-YYY-ZZZ-63e6914e22e1")
        template("$(format-json --scope rfc5424 --scope dot-nv-pairs
        --exclude .journald.*
        --rekey .* --shift 1 --scope nv-pairs
        --exclude DATE  @timestamp=${ISODATE})")
    );
};

log {
    source(s_src);
    destination(d_elastic_humio);
    flags(flow-control);
};

As you can see, we use the elasticsearch-http() destination of syslog-ng. Not all the parameters required by this driver are actually used by Humio. Most likely, it would be possible to create a tailored destination driver for Humio, which does not ask for the parameters not used by Humio. However, my goal is to demonstrate that the same driver works unmodified everywhere.

There are three parameters that are not used by Humio, but required on the syslog-ng side: type(), index() and user(). Copy and paste them from here or make up some values yourself.

The url() is a required parameter and should be set as discussed previously.

The password() parameter of the elasticsearch-http() driver is required for Humio to work: this is where we put the ingest token.

By default, syslog-ng sends log messages one by one. This is not a problem when you have only a few log messages a second, but it is inefficient at high message rates. This is where the performance-related settings come in handy: workers() lets you configure multiple syslog-ng threads that send logs in parallel, while batch-lines() lets you send multiple log messages in a single batch. When your central syslog-ng server forwards thousands of messages a second to Humio, you need to experiment to find the optimal settings.
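As a starting point for such experiments, here is a minimal sketch of a variant of the destination with more aggressive batching. The destination name and all numeric values are illustrative assumptions, not measured recommendations. Besides workers() and batch-lines(), it uses the batch-bytes() and batch-timeout() options of the underlying http() driver, so that a batch is also flushed when it reaches a size limit or when messages arrive too slowly to fill it:

```
destination d_elastic_humio_tuned {
    elasticsearch-http(
        type("humio")          # not used by Humio, but required by the driver
        index("syslog-humio")  # not used by Humio, but required by the driver
        url("https://cloud.community.humio.com/api/v1/ingest/elastic-bulk")
        user("syslog-ng")      # not used by Humio
        password("YOUR-INGEST-TOKEN")  # placeholder for your ingest token
        workers(4)             # illustrative: four parallel sender threads
        batch-lines(500)       # illustrative: flush after 500 messages...
        batch-bytes(524288)    # ...or after about 512 KiB of data...
        batch-timeout(1000)    # ...or after 1000 ms, whichever comes first
    );
};
```

Change one value at a time and watch the message rate and the delay in the Humio interface to see which setting actually helps in your environment.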

The default template() sends only the fields of a traditional syslog message. However, syslog-ng often receives log messages in a structured format, or parses log messages to create name-value pairs from their important fields. Name-value pairs parsed from log messages are often used to decide which message goes where, for example to do real-time alerting to various messaging services. By using your own template, you can forward these name-value pairs to Humio as well.

Explaining this template in depth is outside the scope of this blog. However, let me quickly describe it. It creates a JSON-formatted message from name-value pairs. It includes all parts of a syslog message and all name-value pairs parsed from a message. Some of them are excluded (the name-value pairs of the systemd journal), and instead of the traditional DATE macro, we use the name and format expected by Elasticsearch.
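If you want to see what the template actually produces before sending anything to Humio, one option is to write the same JSON to a local file. The sketch below assumes a hypothetical destination name and file path. It reuses the template from the Humio destination and appends a newline, so that each message lands on its own line:

```
# Debug-only destination: destination name and file path are hypothetical
destination d_humio_json_debug {
    file("/var/log/humio-debug.json"
        template("$(format-json --scope rfc5424 --scope dot-nv-pairs
        --exclude .journald.*
        --rekey .* --shift 1 --scope nv-pairs
        --exclude DATE @timestamp=${ISODATE})\n")
    );
};

log {
    source(s_src);
    destination(d_humio_json_debug);
};
```

Remove this log path once you are happy with the output, as the file keeps growing for as long as the configuration is active.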

Testing

Save your configuration using the right URL, token and source and reload syslog-ng. Now you are ready for some testing. Even without doing anything more, you should already see a few messages in the Humio search interface.

You can also test the sending of the extra name-value pairs. Recent versions of syslog-ng parse sudo log messages automatically. Run a command through sudo. When you check the log message in the Humio query interface, you should see some new name-value pairs starting with “sudo.”.

-

If you have questions or comments related to syslog-ng, do not hesitate to contact us. You can reach us by email or even chat with us. For a list of possibilities, check our GitHub page under the “Community” section at https://github.com/syslog-ng/syslog-ng. On Twitter, I am available as @PCzanik.
