Kafka destination improved with template support in syslog-ng

The C implementation of the Kafka destination in syslog-ng has been improved in version 3.30. Support for templates in topic names was added as a result of a Google Summer of Code (GSoC) project. The advantage of the new template support feature is that you no longer have to use a static topic name. For example, you can include the name of your host or the application sending the log in the topic name.

From this blog you can learn about a minimal Kafka setup, configuring syslog-ng and testing syslog-ng with Kafka.

Before you begin

Template support in the C implementation of the Kafka destination first appeared in syslog-ng version 3.30. This version is not yet available in most Linux distributions, and even where it is available, Kafka support is not enabled. You can either compile syslog-ng 3.30 or later yourself, or use third-party package repositories to install it. You can learn more about them at https://www.syslog-ng.com/3rd-party-binaries. In most cases, Kafka support is not part of the base syslog-ng package, but is available as a subpackage. For example, in the (open)SUSE and Fedora/RHEL packages it is available in the syslog-ng-kafka package.
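For example, on openSUSE/SLES, installing could look something like this (a sketch: the exact package and repository names depend on your distribution and on the third-party repository you use):

sudo zypper install syslog-ng syslog-ng-kafka

On Fedora/RHEL, the equivalent would be:

sudo dnf install syslog-ng syslog-ng-kafka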

Kafka might be available for your Linux distribution of choice, but for simplicity’s sake, I use the binary distribution from the Kafka website. At the time of writing, the latest available version is kafka_2.13-2.6.0.tgz and it should work equally well on any Linux host with a recent enough (that is, 1.8+) Java. If you use a local Kafka installation, you might need to modify some of the example command lines.
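You can quickly check whether a suitable Java runtime is available:

java -version

The reported version should be 1.8 or newer (recent Java releases report versions such as 11 or 15).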

Downloading and starting Kafka

A proper Kafka installation is outside the scope of this blog. Here, we follow the relevant parts of the Kafka Quickstart documentation: we download the archive containing Kafka, extract it, and start its components. You will need network access and four terminal windows.

First, download the latest Kafka release and extract it. The exact version might already be different by the time you read this:

wget https://downloads.apache.org/kafka/2.6.0/kafka_2.13-2.6.0.tgz
tar xvf kafka_2.13-2.6.0.tgz

At the end, you will see a new directory: kafka_2.13-2.6.0

From now on, you will need the three extra terminal windows: first we start two separate daemons in the foreground to see their messages, and two more windows are required to send messages to Kafka and to receive them.

First, start zookeeper in one of the terminal windows. Change to the freshly created directory and start the application:

cd kafka_2.13-2.6.0/
bin/zookeeper-server-start.sh config/zookeeper.properties

Now you can start the Kafka server in a different terminal window:

cd kafka_2.13-2.6.0/
bin/kafka-server-start.sh config/server.properties

Both applications print lots of data on screen. Normally, the flood of debug information stops after a few seconds and the applications are ready to be used. If there is a problem, you are dropped back to the command line. In that case, you have to browse through the debug messages and resolve the problem.

Now you can do some minimal functional testing, without syslog-ng involved yet. This way you can make sure that access to Kafka is not blocked by a firewall or other software.

Open yet another terminal window, change to the Kafka directory and start a script to collect messages from a Kafka topic. You can safely ignore the warning message, as it appears because the topic does not exist yet.

cd kafka_2.13-2.6.0/
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic mytest
[2020-12-15 14:41:09,172] WARN [Consumer clientId=consumer-console-consumer-31493-1, groupId=console-consumer-31493] Error while fetching metadata with correlation id 2 : {mytest=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
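If you want to avoid this warning, you can also create the topic explicitly before starting the consumer. A sketch using the kafka-topics.sh script shipped with Kafka, assuming the broker listens on localhost:9092:

bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --topic mytest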

Now you can start a fourth terminal window to send some test messages. Just enter something after the “>” sign and hit Enter. Moments later, you should see what you just entered in the third terminal window:

bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic mytest
>bla
>blabla
>blablabla
>

You can exit with ^D.

Configuring syslog-ng

Now that we have checked that we can send messages to Kafka and pull those messages with another application, it is time to configure syslog-ng. As a first step, we simply send logs to Kafka and check the results.

If syslog-ng on your system is configured to include .conf files from the /etc/syslog-ng/conf.d/ directory, create a new configuration file there. Otherwise, append the configuration below to syslog-ng.conf.

destination d_kafka {
  kafka-c(config(metadata.broker.list("localhost:9092")
                   queue.buffering.max.ms("1000"))
        topic("mytest")
        message("$(format-json --scope rfc5424 --scope nv-pairs)"));
};

log {
  source(src);
  destination(d_kafka);
};

Note that the source name for local logs might be different in your syslog-ng.conf, so check it before reloading the configuration. The name “src” is used on openSUSE/SLES. As a safety measure, check your configuration with:

syslog-ng -s

While it cannot check whether you spelled the source name correctly, a quick syntax check makes sure that all the necessary syslog-ng modules are installed. If you see an error message about JSON or Kafka, install the missing modules.
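If your configuration does not define a source for local logs yet, you can add one yourself under the name used above. A minimal sketch, assuming the name “src”; the exact source drivers you need depend on your platform:

source src {
  system();
  internal();
};

Here system() collects local system logs and internal() collects messages generated by syslog-ng itself. Once your configuration passes the syntax check, reload or restart syslog-ng, for example with syslog-ng-ctl reload or through your service manager.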

Once you have reloaded syslog-ng, you should see logs arriving in your third terminal window in JSON format, similar to these:

blablabla
{"SOURCE":"src","PROGRAM":"syslog-ng","PRIORITY":"notice","PID":"5841","MESSAGE":"syslog-ng starting up; version='3.30.1.25.g793d7e4'","HOST_FROM":"leap152","HOST":"leap152","FACILITY":"syslog","DATE":"Dec 15 15:04:58"}
{"SOURCE":"src","PROGRAM":"systemd","PRIORITY":"info","PID":"1","MESSAGE":"Stopping System Logging Service...","HOST_FROM":"leap152","HOST":"leap152","FACILITY":"daemon","DATE":"Dec 15 15:04:57"}

Using a template in the topic name

To use a template in the topic name, the syslog-ng configuration needs two modifications. First of all, you need to modify the topic(). You also need to provide an additional parameter: fallback-topic(). Note that Kafka topic names can only contain ASCII alphanumeric characters, periods, underscores and dashes; anything else, such as letters with accent marks, is rejected. This is why you need a fallback topic: if a topic name cannot be used, the related message is saved to the topic named in fallback-topic(). You can find the modified configuration below:

destination d_kafka {
  kafka-c(config(metadata.broker.list("localhost:9092")
                   queue.buffering.max.ms("1000"))
        topic("mytest_$PROGRAM")
        fallback-topic("mytest")
        message("$(format-json --scope rfc5424 --scope nv-pairs)"));
};

log {
  source(src);
  destination(d_kafka);
};

Using this configuration, the name of the application sending the log is also included in the topic name. Once you reload syslog-ng, you will receive a lot fewer logs on the “mytest” topic. But, for example, postfix logs will still arrive there, as they include a slash in the application name. Alternatively, you can send a log with accent marks yourself. Being Hungarian is an advantage here, but German also has its own share of accented characters. For example, you can use “logger” to send logs, and its “-t” option sets the application name:

logger -t öt_szép_szűzleány_őrült_írót_nyúz bla

You will see the related message in the “mytest” topic:

{"SOURCE":"src","PROGRAM":"öt_szép_szűzleány_őrült_írót_nyúz","PRIORITY":"notice","PID":"6177","MESSAGE":"bla","HOST_FROM":"leap152","HOST":"leap152","FACILITY":"user","DATE":"Dec 15 16:21:01"}

By now, you should have logs from a couple of applications. Stop the application pulling logs from Kafka in the third terminal window, and list the available topics. You should see something similar to this:

bin/kafka-topics.sh --bootstrap-server localhost:9092 --list
__consumer_offsets
mytest
mytest_syslog-ng
mytest_systemd

When you start the collector script again with mytest_systemd as the topic, you will most likely not see any input for several minutes. The reason is that by default, the script only collects new messages. Check its built-in help to see how you can read earlier messages.
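For example, the console consumer has a --from-beginning option, which re-reads a topic from the start:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic mytest_systemd --from-beginning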

What is next?

This blog is enough to get you started and to learn the basic concepts of Kafka and the syslog-ng Kafka destination. On the other hand, it is far from production-ready. For that, you need a proper Kafka installation, and most likely the syslog-ng configuration also needs additional settings.

If you have questions or comments related to syslog-ng, do not hesitate to contact us. You can reach us by email or even chat with us. For a list of possibilities, check our GitHub page under the “Community” section at https://github.com/syslog-ng/syslog-ng. On Twitter, I am available as @Pczanik.
