The MongoDB destination of syslog-ng will receive another performance update. Starting with the upcoming version 4.3, it will support bulk operations. Depending on the configuration settings, this may result in a more than 300% performance increase.

Before you begin

At the time of writing this blog, bulk support for the MongoDB destination is not yet available. It arrived soon after the 4.2 release and will be officially released as part of the 4.3 release. However, you can already try it using my RPM git snapshot packages, my FreeBSD ports repo, or the nightly Linux container images.

You also need a MongoDB server. Syslog-ng uses the mongo-c-driver to access MongoDB. Older driver versions do not support the latest server versions, and the latest drivers dropped support for older server versions. The syslog-ng packages use the mongo-c-driver bundled with the OS, which might limit your choice of MongoDB server version. Of course, this limitation does not apply if you compile the mongo-c-driver and syslog-ng yourself.

A bit of history

The first document store supported by syslog-ng was MongoDB. It was also the first destination to introduce a much more flexible way of working with name-value pairs. Still, the performance of the driver was not great, as it was single-threaded and did not support bulk operations.

Version 3.32 introduced support for multiple worker threads, and it also added support for templates in collection names. Depending on the hardware and the number of log messages, this already resulted in a major jump in performance and flexibility.

Bulk support was added right after the 4.2 release. This means that instead of sending messages one by one, bulk support groups messages together and sends them in batches, considerably speeding up storing messages to MongoDB.

Configuring syslog-ng

The following configuration is taken from the git commit message. You can use this as a starting point to test the new parameters:

@version: 4.2
@include "scl.conf"


# For feeding this source you can use e.g.
#
#		loggen -T -r 100000 -S 127.0.0.1 514 --reconnect
#
# (3 instances are well enough to provide around 120000 msgs/sec avg. output)
#
source s_remote0 {
	default-network-drivers(
		log-iw-size(3000000)
	);
};


# After installing mongodb
#
# - Create a database called syslog by entering the command:
#
#    mongosh syslog
#
# - Set up the database like so:
#
#    db.createCollection( "messages", { capped: true, size: 100000000 } )
#
# - Once the messages have been transmitted, to view use:
#
#    db.messages.find().pretty()
#
# - Drop messages
#	
#    db.messages.drop()
#
destination d_mongodb {
	mongodb(
		uri("mongodb://127.0.0.1:27017/syslog")
		#collection("messages_${R_YEAR}${R_MONTH}${R_DAY}")
		collection("messages")
		value-pairs(
			scope("selected-macros" "nv-pairs" "sdata")
		)
		#bulk(no)
		#bulk_unordered(yes)
		#write_concern(unacked)
	
		workers(3) # 3-4 workers can produce the best result in this test
	  
		batch-lines(10000)
		batch-timeout(1000)
	);		
};


log {
	source(s_remote0);

	destination(d_mongodb);

	flags(flow-control);
};

You can use templates in collection names together with bulk operations. However, bulk operations only speed up your MongoDB connection if the resolved collection name stays the same for a longer period of time. The daily template that is commented out in the example works fine, and including the hour should still work fine. On the other hand, if you include the host name in the template, enabling bulk operations might not have any noticeable performance impact.
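
As a rough sketch of the hourly variant, here is the destination from the example with the hour appended to the collection name. The destination name d_mongodb_hourly is made up for illustration, and the ${R_HOUR} macro is my addition to the commented-out daily template. Treat it as a starting point for your own tests, not as a tuned configuration:

```
destination d_mongodb_hourly {
	mongodb(
		uri("mongodb://127.0.0.1:27017/syslog")
		# The resolved name changes only once an hour, so consecutive
		# messages should mostly land in the same collection
		collection("messages_${R_YEAR}${R_MONTH}${R_DAY}${R_HOUR}")
		value-pairs(
			scope("selected-macros" "nv-pairs" "sdata")
		)
		workers(3)
		batch-lines(10000)
		batch-timeout(1000)
	);
};
```

A template containing ${HOST} would instead resolve to a different collection for each sending host, which is the case where bulk operations are unlikely to help much.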

Various extra parameters can influence how bulk operations work. In our measurements, enabling or disabling unordered message sending does not have a noticeable performance impact. Bypassing validation also has only a minimal impact. Besides the number of worker threads and enabling bulk operations, write concern seems to have the most impact on performance. The safest and default write concern is “acked”, but it is also the slowest. Using “unacked” might boost performance considerably, at the risk of message loss.
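
To make the trade-off concrete, the sketch below enables the write concern line that is commented out in the example configuration. The destination name d_mongodb_fast is hypothetical, and the option spelling follows the commit message. Use this only where losing some messages is acceptable:

```
destination d_mongodb_fast {
	mongodb(
		uri("mongodb://127.0.0.1:27017/syslog")
		collection("messages")
		value-pairs(
			scope("selected-macros" "nv-pairs" "sdata")
		)
		# Faster than the default "acked", but messages can be lost
		# because the server does not confirm the writes
		write_concern(unacked)
		workers(3)
		batch-lines(10000)
		batch-timeout(1000)
	);
};
```

When comparing results, change one parameter at a time (write concern, workers, batch size) so you can tell which of them is responsible for the difference you measure.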

What is next?

Bulk support in the syslog-ng MongoDB destination driver is not yet available in an official release, but is considered to be ready. If you use the MongoDB destination driver, I strongly recommend testing it. You should experiment with the number of worker threads and batch sizes. Any feedback is very welcome. Obviously, we would be very happy to hear about your success stories, but we also want to know if you had any problems, or if bulk support works at all for you.


If you have questions or comments related to syslog-ng, do not hesitate to contact us. You can reach us by email or even chat with us. For a list of possibilities, check our GitHub page under the “Community” section at https://github.com/syslog-ng/syslog-ng. On Twitter, I am available as @PCzanik, on Mastodon as @Pczanik@fosstodon.org.
