New performance tuning possibilities in syslog-ng

On April Fools’ Day, I shared that syslog-ng can reach 7 million EPS. This lab result was possible in part thanks to a few performance enhancements coming in syslog-ng version 4.12.

How is 7 million EPS possible? Before diving deeper, let me repeat: 7 million EPS is just a lab test result, not (yet) reachable in the real world. However, the technologies enabling it are either already available on the development branch of syslog-ng, or have been available for ages, just not tested or promoted enough.

Before you begin

In the lab test where I reached 7 million EPS, I experimented with a few different parameters:

  • jemalloc

  • parallelize()

  • receive buffer size

You can test jemalloc and the receive buffer size with any syslog-ng version. However, you need the latest git snapshot (or syslog-ng version 4.12) to test batch_size(), the new option of parallelize().

The syslog-ng nightly packages now include the new, optimized parallelize(), as do my weekly git snapshot RPM packages.

Jemalloc

So, what is jemalloc? According to their website:

“jemalloc is a general purpose malloc(3) implementation that emphasizes fragmentation avoidance and scalable concurrency support.”

I did not test it in depth, but in my measurements, syslog-ng performance increased visibly under heavy load with many connections. There was no visible benefit with just one or a few TCP connections.

With the latest git version, you can compile jemalloc support into syslog-ng. However, you can also load it through the LD_PRELOAD environment variable, which works with earlier syslog-ng versions as well:

LD_PRELOAD=/usr/lib64/libjemalloc.so.2 syslog-ng -Fvde

Of course, the exact command might differ depending on your shell and OS, and it is good enough for testing only. For a production environment, you might need to set the variable in your init scripts or service files instead.

If you build syslog-ng yourself, you need to add jemalloc to your build environment and use the --enable-jemalloc option with configure.
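When testing with LD_PRELOAD, it is worth confirming that the library really ended up in the running process. The following is a minimal sketch for Linux, where /proc/<pid>/maps lists the files mapped into a process. The helper name maps_has_jemalloc is made up for this example, and the commented usage line assumes that pidof is installed and returns a single syslog-ng PID.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given memory-map listing
# mentions libjemalloc.
# $1: path to a maps file, for example /proc/<pid>/maps on Linux.
maps_has_jemalloc() {
    grep -q 'libjemalloc' "$1"
}

# Example use against a running daemon (assumption: pidof exists and
# prints exactly one PID, as when syslog-ng runs in the foreground):
# maps_has_jemalloc "/proc/$(pidof syslog-ng)/maps" && echo "jemalloc is loaded"
```

If the check fails, the preload path is probably wrong for your distribution; the library location used earlier is only one possibility.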

The new batch_size() option of parallelize()

The parallelize() option enables syslog-ng to process incoming logs in parallel, utilizing multiple CPU cores even if there is only a single incoming TCP connection. It can increase performance significantly, especially if there are only a few incoming connections and you have a complex configuration. However, a much larger increase is possible if you enable batch processing.

log {
  source(src);
  source(chroots);
  parallelize(batch_size(1000) workers(28));
  filter(f_messages);
  destination(messages);
};

As usual, the exact number of workers and batch size depend on your configuration, number of incoming connections, and hardware, like the number of CPU cores. My test box had 32 CPU cores but reached peak performance when the number of worker threads was slightly smaller.
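The observation that peak throughput came with slightly fewer workers than cores suggests a starting point for tuning. The sketch below is only that: the helper name suggest_workers and the default of four reserved cores are assumptions chosen to mirror the 32-core, 28-worker test box, not a syslog-ng recommendation.

```shell
#!/bin/sh
# Hypothetical helper: propose a starting value for workers().
# $1: number of CPU cores
# $2: cores to leave free for other threads (assumed default: 4)
suggest_workers() {
    cores=$1
    reserve=${2:-4}
    workers=$((cores - reserve))
    # Never go below a single worker on small machines.
    if [ "$workers" -lt 1 ]; then
        workers=1
    fi
    echo "$workers"
}

# Example use on Linux, where nproc (coreutils) reports the core count:
# suggest_workers "$(nproc)"
```

Treat the result as the first value to benchmark, then measure a few values around it with your own configuration.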

Socket receive buffer

I must admit that I was not aware that this option is also available for TCP, and previously used it only for UDP. However, the so-rcvbuf() option turned out to be useful for TCP as well and had a fantastic effect on syslog-ng. In my measurements, it made a larger difference than everything else combined. I configured a 256 MB buffer on the command line:

sysctl -w net.core.rmem_max=268435456

And configured the tcp() source accordingly:

tcp(ip("0.0.0.0") port(514) max_connections(128) so-rcvbuf(268435456));

As usual, a different setting might perform a lot better in your specific environment. My test machine has 128 GB of RAM; however, raising the buffer size to 1 GB did not bring a noticeable improvement in my environment.
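The byte value used in both commands is simply 256 × 1024 × 1024. On Linux, a socket buffer request larger than net.core.rmem_max is capped by the kernel, so the two numbers should be kept in sync. The sketch below does the arithmetic and the comparison; the helper names mib_to_bytes and rcvbuf_fits are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: convert MiB to bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper: succeeds when the kernel limit ($2) is large
# enough for the requested so-rcvbuf() value ($1).
rcvbuf_fits() {
    [ "$1" -le "$2" ]
}

want=$(mib_to_bytes 256)
echo "$want"   # prints 268435456

# On Linux, the current limit can be read without root:
# limit=$(cat /proc/sys/net/core/rmem_max)
# rcvbuf_fits "$want" "$limit" || echo "raise net.core.rmem_max to at least $want"
```

Note that a value set with sysctl -w lasts only until the next reboot; for anything beyond testing, it would have to go into your system's sysctl configuration files.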

Testing

I used sngbench, a synthetic benchmark, for my tests. While it is good enough to compare configurations and hardware, it runs locally and shows results that are unreachable in production. As far as your environment allows, you should test performance with production logs and configurations, or something very similar to those.

What is next?

What are your experiences? Share with us how jemalloc and parallelize() changed syslog-ng performance in your environment!

-

If you have questions or comments related to syslog-ng, do not hesitate to contact us. You can reach us by email or even chat with us. For a list of possibilities, check our GitHub page under the “Community” section at https://github.com/syslog-ng/syslog-ng. On Twitter, I am available as @PCzanik, on Mastodon as @Pczanik@fosstodon.org.
