Version 4 of syslog-ng works perfectly well in version 3 compatibility mode. However, if you want to use the syslog-ng 4 features, you need to be aware of some significant changes. If you have a simple configuration, like those shipped in Linux distributions, then rewriting the version string is most likely enough. If you use PatternDB or JSON parsing, any Python code, or an Elasticsearch or MongoDB destination, you need to look at the changes more closely.

From this blog you can learn about type support and how it can affect you, the changes in Python support, and some tips on how to prepare for the upgrade.

Type support

One of the major changes in syslog-ng 4 was the introduction of type support. In version 3, all data parsed from log messages was stored and handled as text (string). Even if you parsed an integer from a log message using PatternDB or the JSON parser, the type information was lost, and all data was handled as text. If you wanted to forward data with proper type information, you had to use type hinting in the syslog-ng destination driver.

Version 4 makes syslog-ng type-aware. Data is still stored as text (it is a syslog implementation, built to work with text data), but syslog-ng also stores type information alongside name-value pairs. This has two major advantages:

  • Comparisons actually work: 90 is no longer larger than 100, as it was before, when values were compared in alphabetical order instead of by numerical value.

  • Some destination drivers can use the type information to send name-value pairs to the destination in the correct format. For example, in the case of the Elasticsearch destination, searching, reporting, and dashboards work properly without manually creating mappings for data types.
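
The first point can be illustrated outside syslog-ng with a minimal Python sketch. It is not syslog-ng code; it only shows why comparing the values as text puts 90 after 100, while comparing them as integers gives the expected result.

```python
# Values as they would be handled without type information: plain text.
a, b = "90", "100"

# Text comparison goes character by character, and "9" sorts after "1".
text_says_a_is_larger = a > b

# With type information, the same values are compared numerically.
number_says_a_is_larger = int(a) > int(b)

print(text_says_a_is_larger)    # True
print(number_says_a_is_larger)  # False
```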

What does type support mean in practice? When syslog-ng is parsing a log message using PatternDB or the JSON parser, it can detect the type of data and store type info. You can also set the type using rewrite rules for name-value pairs coming from other parsers.

Today we use sudo logs as an example. Recent versions of sudo can log in JSON format, and syslog-ng can parse these logs automatically into name-value pairs. In the following examples we parse these logs using syslog-ng 3.37 and syslog-ng 4, and then reconstruct the log message in JSON format. While the differences are barely visible to the human eye, software working with the generated log messages will definitely notice them.

The first log path in the following configuration collects local log messages and stores them without any further processing or filtering into /var/log/messages. There is also a second log path which selects sudo logs, parses them using the JSON parser, and then creates a JSON formatted message from the result.

@version:3.37
@include "scl.conf"
source s_sys { system(); internal();};
destination d_mesg { file("/var/log/messages"); };
log { source(s_sys); destination(d_mesg); };
filter f_sudo {program(sudo)};
destination d_test {
  file("/var/log/sudo.json"
    template("$(format-json --scope nv_pairs --scope dot_nv_pairs --rekey .* --shift 1 --exclude *journal* --exclude MESSAGE --scope rfc5424)\n\n"));
};
log {
  source(s_sys);
  filter(f_sudo);
  destination(d_test);
};

The JSON template function has a few exclusions to make the log messages shorter: the MESSAGE macro, which contains the original JSON payload, and any journal macros are discarded. The two newlines at the end make the file more human readable.

With this configuration, when you run sudo, you will find the unparsed JSON formatted log message from sudo in /var/log/messages:

Mar 17 15:24:07 localhost sudo[35095]: @cee:{"sudo":{"accept":{"uuid":"9045212067-d7cb-4a8f-ab40-32dc5b7c6f","server_time":{"seconds":1679063047,"nanoseconds":415866617,"iso8601":"20230317142407Z","localtime":"Mar 17 15:24:07"},"submit_time":{"seconds":1679063047,"nanoseconds":399116920,"iso8601":"20230317142407Z","localtime":"Mar 17 15:24:07"},"submituser":"czanik","command":"/usr/bin/ls","runuser":"root","runcwd":"/home/czanik","ttyname":"/dev/pts/0","submithost":"localhost.localdomain","submitcwd":"/home/czanik","runuid":0,"columns":118,"lines":60,"runargv":["ls","/root/"],"runenv":["LANG=en_US.UTF-8","HOSTNAME=localhost.localdomain","TERM=xterm-256color","PATH=/home/czanik/.local/bin:/home/czanik/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin","MAIL=/var/mail/root","LOGNAME=root","USER=root","HOME=/root","SHELL=/bin/bash","SUDO_COMMAND=/usr/bin/ls /root/","SUDO_USER=czanik","SUDO_UID=1000","SUDO_GID=1000"]}}}

Now take a look at what syslog-ng can re-create from the original JSON log message using the version 3 configuration. JSON sudo logs are automatically parsed by syslog-ng: this is why you do not see a parser declaration in the above configuration. The JSON template function then tries to re-create the log message from the name-value pairs. To the untrained eye, the only difference is that syslog macros, like date or facility, are now also part of the JSON:

{"cee":{"sudo":{"accept":{"uuid":"9045212067-d7cb-4a8f-ab40-32dc5b7c6f","ttyname":"/dev/pts/0","submituser":"czanik","submithost":"localhost.localdomain","submitcwd":"/home/czanik","submit_time":{"seconds":"1679063047","nanoseconds":"399116920","localtime":"Mar 17 15:24:07","iso8601":"20230317142407Z"},"server_time":{"seconds":"1679063047","nanoseconds":"415866617","localtime":"Mar 17 15:24:07","iso8601":"20230317142407Z"},"runuser":"root","runuid":"0","runenv":"LANG=en_US.UTF-8,HOSTNAME=localhost.localdomain,TERM=xterm-256color,PATH=/home/czanik/.local/bin:/home/czanik/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin,MAIL=/var/mail/root,LOGNAME=root,USER=root,HOME=/root,SHELL=/bin/bash,\"SUDO_COMMAND=/usr/bin/ls /root/\",SUDO_USER=czanik,SUDO_UID=1000,SUDO_GID=1000","runcwd":"/home/czanik","runargv":"ls,/root/","lines":"60","command":"/usr/bin/ls","columns":"118"}}},"app":{"name":"cee"},"SOURCE":"s_sys","PROGRAM":"sudo","PRIORITY":"notice","PID":"35095","HOST_FROM":"localhost","HOST":"localhost","FACILITY":"authpriv","DATE":"Mar 17 15:24:07"}

However, if you take a more careful look, or if you use an app like jq to display the JSON message, you will catch a few more differences. The lists are flattened into single comma-separated strings, and the numbers are enclosed in quotes, meaning that they are forwarded as strings instead of numbers.

        "runuser": "root",
        "runuid": "0",
        "runenv": "LANG=en_US.UTF-8,HOSTNAME=localhost.localdomain,TERM=xterm-256color,PATH=/home/czanik/.local/bin:/home/czanik/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin,MAIL=/var/mail/root,LOGNAME=root,USER=root,HOME=/root,SHELL=/bin/bash,\"SUDO_COMMAND=/usr/bin/ls /root/\",SUDO_USER=czanik,SUDO_UID=1000,SUDO_GID=1000",
        "runcwd": "/home/czanik",
        "runargv": "ls,/root/",
        "lines": "60",
        "command": "/usr/bin/ls",
        "columns": "118"

Now update syslog-ng and your configuration to version 4. At first glance, you might not notice, but the JSON generated by syslog-ng 4 is very different from the JSON generated by syslog-ng 3:

{"cee":{"sudo":{"accept":{"uuid":"3d3e1f8c9b-d62e-4081-8fff-bc71e80e29","ttyname":"/dev/pts/0","submituser":"czanik","submithost":"localhost.localdomain","submitcwd":"/home/czanik","submit_time":{"seconds":1679063411,"nanoseconds":336097049,"localtime":"Mar 17 15:30:11","iso8601":"20230317143011Z"},"server_time":{"seconds":1679063413,"nanoseconds":696181185,"localtime":"Mar 17 15:30:13","iso8601":"20230317143013Z"},"runuser":"root","runuid":0,"runenv":["LANG=en_US.UTF-8","HOSTNAME=localhost.localdomain","TERM=xterm-256color","PATH=/home/czanik/.local/bin:/home/czanik/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin","MAIL=/var/mail/root","LOGNAME=root","USER=root","HOME=/root","SHELL=/bin/bash","SUDO_COMMAND=/usr/bin/ls /root/","SUDO_USER=czanik","SUDO_UID=1000","SUDO_GID=1000"],"runcwd":"/home/czanik","runargv":["ls","/root/"],"lines":60,"command":"/usr/bin/ls","columns":118}}},"app":{"name":"cee"},"SOURCE":"s_sys","PROGRAM":"sudo","PRIORITY":"notice","PID":"35167","HOST_FROM":"localhost","HOST":"localhost","FACILITY":"authpriv","DATE":"Mar 17 15:30:13"}

It is easier to spot the differences with some formatting:

        "runuser": "root",
        "runuid": 0,
        "runenv": [
          "LANG=en_US.UTF-8",
          "HOSTNAME=localhost.localdomain",
          "TERM=xterm-256color",
          "PATH=/home/czanik/.local/bin:/home/czanik/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin",
          "MAIL=/var/mail/root",
          "LOGNAME=root",
          "USER=root",
          "HOME=/root",
          "SHELL=/bin/bash",
          "SUDO_COMMAND=/usr/bin/ls /root/",
          "SUDO_USER=czanik",
          "SUDO_UID=1000",
          "SUDO_GID=1000"
        ],
        "runcwd": "/home/czanik",
        "runargv": [
          "ls",
          "/root/"
        ],
        "lines": 60,
        "command": "/usr/bin/ls",
        "columns": 118

Both lists and numbers are handled properly.
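
To see what a consumer of these logs would notice, here is a small Python sketch that decodes abbreviated fragments of the two outputs shown above and prints the type each field arrives as. The fragments are shortened copies of the examples, and the function name `field_types` is made up for this illustration.

```python
import json

# Abbreviated fragments of the version 3 and version 4 outputs shown above.
v3_output = '{"runuid":"0","runargv":"ls,/root/","lines":"60","columns":"118"}'
v4_output = '{"runuid":0,"runargv":["ls","/root/"],"lines":60,"columns":118}'


def field_types(raw):
    """Map each field name to the name of the Python type it decodes to."""
    return {key: type(value).__name__ for key, value in json.loads(raw).items()}


v3_types = field_types(v3_output)
v4_types = field_types(v4_output)

# Print the type of every field before and after the upgrade.
for key in v3_types:
    print(f"{key}: {v3_types[key]} -> {v4_types[key]}")
```

Any software that does arithmetic on `lines` or iterates over `runargv` behaves differently depending on which of the two formats it receives.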

How does type support affect you?

To learn more about comparisons, check the following blog from Balázs Scheidler, the original author of the syslog-ng project, where you can find many in-depth examples of how syslog-ng 4 makes comparisons a lot more useful: https://syslog-ng-future.blog/syslog-ng-4-progress-3-38-1-release/

When sending parsed log data from the PatternDB or the JSON parser to some of the destinations, you might run into problems. With earlier Elasticsearch versions, I found that if the type of a given name-value pair changed, Elasticsearch started to reject those messages. In my recent tests I did not have such problems: Elasticsearch accepted integer values where I had earlier stored string data. However, if you have any dashboards or analytics in Kibana, those might suddenly give unexpected results.

Some destinations work perfectly well in 3.X compatibility mode, but might fail when switched to version 4 mode. A fix for the MongoDB destination is under way, and hopefully will be part of the upcoming 4.2 release.

As you could see from the sudo example, you have JSON parsing enabled even if you did not explicitly turn it on in your configuration. Not all JSON is automatically parsed, only messages starting with the “@cee” mark, like sudo JSON logs.
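
The rule can be sketched in a few lines of Python. This is only an illustration of the idea, not how syslog-ng implements it; the function name `extract_cee_payload` is hypothetical, and the marker is written as "@cee:" as it appears in the sudo log above.

```python
import json

CEE_MARKER = "@cee:"


def extract_cee_payload(message):
    """Return the decoded JSON payload if the message starts with the marker, else None."""
    if not message.startswith(CEE_MARKER):
        # No marker: the message is treated as ordinary text, even if it is JSON.
        return None
    try:
        return json.loads(message[len(CEE_MARKER):])
    except json.JSONDecodeError:
        # The marker is present, but what follows is not valid JSON.
        return None


print(extract_cee_payload('@cee:{"sudo":{"accept":{"runuid":0}}}'))
print(extract_cee_payload('{"sudo":{"accept":{"runuid":0}}}'))
```

The first call prints the decoded dictionary, while the second prints None because the marker is missing.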

Check the 4.0 release notes to learn more about type support and whether items in your configuration are affected: https://github.com/syslog-ng/syslog-ng/releases/tag/syslog-ng-4.0.1

Python changes

Python support in syslog-ng received considerable changes in version 4. Luckily, as long as your version string is set to a 3.X value, your existing Python scripts work without any modifications. However, if you want to use the syslog-ng 4 features, you also need to rewrite your Python code.

In one of my previous blogs I showed a Python code snippet resolving IP addresses to host names:

python {

"""
very simple syslog-ng Python parser example
resolves IP to hostname
value pair names are hard-coded
"""

import socket

class SngResolver(object):
    def parse(self, log_message):
        """
        Resolves IP to hostname
        """

        ipaddr_b = log_message['apache.clientip']
        ipaddr = ipaddr_b.decode('utf-8')

        # try to resolve the IP address
        try:
            resolved = socket.gethostbyaddr(ipaddr)
            hostname = resolved[0]
            log_message['hostname.client'] = hostname
        except (OSError, UnicodeError):
            # resolution failed: leave the message unchanged
            pass

        # return True, otherwise the message is dropped
        return True

};

parser p_resolver {
    python(
        class("SngResolver")
    );
};

The change required is minimal: just an additional import and a rewritten class header:

[…]
import socket
from syslogng import LogParser

class SngResolver(LogParser):
    def parse(self, log_message):
[…]

The above example is for parsers; the changes are slightly different for other syslog-ng blocks. For more details, including other new possibilities, check the Python module documentation at: https://github.com/syslog-ng/syslog-ng/tree/master/modules/python-modules
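
Putting the pieces together, the updated parser class might look like the sketch below. Several parts are my assumptions and not taken from the original script: the fallback `LogParser` stand-in (so the class can be tried outside syslog-ng), the `as_text` helper (which accepts both bytes and str, because the type you receive may depend on the mode you run in), and the replaceable `lookup` attribute.

```python
import socket

try:
    from syslogng import LogParser
except ImportError:
    # Hypothetical stand-in so the sketch can be exercised outside syslog-ng.
    class LogParser(object):
        pass


def as_text(value):
    """Accept either bytes or str; which one arrives is left open here."""
    if isinstance(value, bytes):
        return value.decode('utf-8', errors='replace')
    return str(value)


class SngResolver(LogParser):
    # Assumption: the lookup is a class attribute so it can be swapped in tests.
    lookup = staticmethod(socket.gethostbyaddr)

    def parse(self, log_message):
        """Resolves IP to hostname; value pair names are hard-coded."""
        ipaddr = as_text(log_message['apache.clientip'])
        try:
            log_message['hostname.client'] = self.lookup(ipaddr)[0]
        except (OSError, UnicodeError):
            # resolution failed: leave the message unchanged
            pass
        # return True, otherwise the message is dropped
        return True
```

Outside syslog-ng, a plain dictionary can stand in for the log message, which makes it possible to check the logic with a fake lookup function before loading the class into a test instance.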

What is next?

Version 4 of syslog-ng works well in compatibility mode. However, to enjoy the new features, you have to switch the configuration to version 4 mode. If you do not parse your logs with the JSON or PatternDB parsers, and save only to local files or forward over the syslog protocol without JSON formatting, then you are most likely safe to change to version 4 mode without much further testing. Of course, this also means that you do not take any advantage of the new features.

If you use Python, or if you suspect that type support might affect your configuration, then I recommend using a test environment to test the syslog-ng configuration upgrade from version 3.X to 4. It might give you some extra work, but it also means a much shorter downtime in your production environment, and a much smaller chance of message loss or other surprises.

-

If you have questions or comments related to syslog-ng, do not hesitate to contact us. You can reach us by email or even chat with us. For a list of possibilities, check our GitHub page under the “Community” section at https://github.com/syslog-ng/syslog-ng. On Twitter, I am available as @PCzanik, on Mastodon as @Pczanik@fosstodon.org.
