What hardware should you use for a syslog-ng server? It is a frequent question with no definite answer. It depends on many factors: the number and type of sources, the number of logs, the way logs are processed, and so on. My experience is that for the majority of users even a Raspberry Pi would be enough. But of course, not for everyone.
Note: to avoid confusion and frustration, I omitted my syslog-ng benchmark results. The message rates from synthetic performance tests, with an optimal configuration and message load, are unachievable in the real world.
Before you begin
Do you really have to think about the hardware under your syslog-ng server? Having talked to thousands of syslog-ng users over the years, my assumption is that most of you are good to go with any hardware available to you. Most people I talked to reported that their syslog-ng server receives 10 to 50 messages a second. If you check my Raspberry Pi blog, you will see that even a Raspberry Pi 2, which is many years old, can handle a lot more messages than that, even with encrypted connections and message processing.
As mentioned earlier, most users could use a Raspberry Pi as a syslog-ng server without running into hardware limitations. Of course, using SD cards as storage is strongly discouraged due to reliability problems, but a USB HDD can work around this limitation.
In practice, most users run syslog-ng on x86 servers. An entry-level Xeon with a Gigabit Ethernet connection and some spinning rust (traditional HDD) can easily handle tens or even hundreds of thousands of messages a second, depending on sources, destinations, and processing in between. This is more than what most syslog-ng users will ever need.
As much as I love high performance servers, only very few syslog-ng users can push high-end servers to their limits. Depending on sources, processing, and destinations, you might not be able to utilize a good part of a many-core server. Talking about CPU: a few larger cores work better than many smaller cores. In my experience, turning off SMT on POWER 9 CPUs can increase peak performance considerably. Disabling HT on Intel boxes also improves performance, even if not so drastically.
It seems to me that cache size has a considerable effect on syslog-ng performance. For many years, POWER 9 CPUs by IBM were the fastest to run syslog-ng among the machines I had access to. Right now, my AMD Ryzen desktop CPU with 36MB of cache gives me the highest events per second rate in my synthetic benchmarks. POWER 10 and the latest AMD EPYC CPUs have even larger caches.
When you have some really high message rates, you can run into limitations where not even the strongest hardware can help you. Luckily, with a bit of extra work on the configuration side, some of these performance problems can be worked around.
On the source side, TCP scales better than UDP. Until recently, even if you had a 32 core monster CPU, only one of those cores was utilized while receiving log messages over UDP, as there was a single listener thread collecting messages. The so-reuseport() option improves this situation, as it allows multiple listeners on the same port. However, it only helps if you have many sources with a relatively even distribution of messages, as messages are distributed by source IP address and port. TCP connections can utilize the CPU better, but with the same caveat: only if there are many connections with similar message rates. A lot more messages can be handled by syslog-ng if you open multiple ports. This is true for both UDP and TCP, but unfortunately, it also means more planning and more time spent on configuring syslog clients.
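A minimal configuration sketch of the two approaches described above. The first source opens several UDP listeners on the same port using so-reuseport(), the second spreads clients across separate TCP ports. The source names, the port numbers, and the number of listeners are assumptions chosen for illustration. Each listener on the shared port is given its own persist-name(), so that syslog-ng can tell the otherwise identical sources apart.

```
# Several UDP listeners sharing one port. Incoming packets are spread
# across them based on source IP address and port, so this only helps
# with many senders of similar volume.
source s_udp_shared {
    network(transport("udp") port(514) so-reuseport(yes) persist-name("udp-1"));
    network(transport("udp") port(514) so-reuseport(yes) persist-name("udp-2"));
    network(transport("udp") port(514) so-reuseport(yes) persist-name("udp-3"));
    network(transport("udp") port(514) so-reuseport(yes) persist-name("udp-4"));
};

# Alternative: separate ports (example values). Each group of clients
# has to be configured to send to a different port.
source s_tcp_multi {
    network(transport("tcp") port(5140));
    network(transport("tcp") port(5141));
    network(transport("tcp") port(5142));
};
```

How many listeners or ports make sense depends on the number of senders and CPU cores, so treat the counts above as placeholders rather than recommendations.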
Any kind of parsing can slow down syslog-ng. When you have just a few thousand messages a second, it is not a problem: even a recent Raspberry Pi can handle that. However, with a higher message rate you should consider what you parse. Make sure that you use PatternDB, regexp-parser() or the Python parser only on logs where it is really necessary, and apply only the parser that is needed. You can filter logs on application name or use a separate source port, whichever is more convenient.
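As a sketch of such selective parsing, the log path below applies a PatternDB parser only to messages from a single application and passes everything else through unparsed. The application name (sshd), the pattern database path, and the source and destination definitions are assumptions for illustration. The if block also needs a reasonably recent syslog-ng version.

```
source s_net { network(transport("tcp") port(514)); };
destination d_all { file("/var/log/remote/messages.log"); };

# Hypothetical pattern database location
parser p_sshd { db-parser(file("/etc/syslog-ng/patterndb.d/sshd.xml")); };

log {
    source(s_net);
    # Only sshd messages pay the cost of PatternDB parsing,
    # all other messages skip the parser entirely.
    if (program("sshd")) {
        parser(p_sshd);
    };
    destination(d_all);
};
```

The same effect can be reached with a dedicated source port for the logs that need parsing, at the cost of extra client-side configuration.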
The file() and http()-based (like Splunk, Elasticsearch) destinations are fast. However, even if your hardware can process hundreds of thousands of messages a second, the SQL destination can only deal with a few thousand messages a second. Make sure that you use a destination which is both fast and well-supported by syslog-ng.
Right now, there is a hard-coded thread limit in syslog-ng: 64. This also means that high core count servers cannot utilize all their cores. Future versions of syslog-ng will have this limit removed. ARM pioneered high core count server CPUs, but IBM, AMD and Intel now all have processors with many cores. There are other performance improvements on the horizon as well, which can make sure that messages are processed in parallel and thus better utilize the underlying hardware.
What is next?
You can learn more about syslog-ng performance tuning in various chapters of the syslog-ng documentation: https://www.syslog-ng.com/technical-documents/list/syslog-ng-open-source-edition/ The syslog-ng Premium Edition (the commercial variant of syslog-ng) documentation also has a performance guideline. The current version is available at https://support.oneidentity.com/technical-documents/syslog-ng-premium-edition/7.0.29/performance-guideline-for-syslog-ng-premium-edition/ and parts of it are also valid for the Open Source Edition (OSE).
If you have questions or comments related to syslog-ng, do not hesitate to contact us. You can reach us by email or even chat with us. For a list of possibilities, check our GitHub page under the “Community” section at https://github.com/syslog-ng/syslog-ng. On Twitter, I am available as @PCzanik.