Creating an HTTP source for syslog-ng in Python

HTTP is quickly becoming the universal transport protocol of the Internet. Nowadays even DNS over HTTPS implementations are available. There is no HTTP source implemented in C for syslog-ng, but starting with syslog-ng version 3.18, you can write new source drivers for syslog-ng in Python. While performance is not as good as it would be using C, you gain flexibility and ease of implementation by using Python.

From this blog post you can learn how to create a basic HTTP source for syslog-ng in Python. The example code (written by syslog-ng developer László Várady) uses the Twisted framework from Twisted Matrix Labs.

Before you begin

The Python source was released with version 3.18.1 of syslog-ng. There is a good chance that your Linux distribution of choice does not have this version (or any Python support enabled at all) yet. Check https://www.syslog-ng.com/products/open-source-log-management/3rd-party-binaries.aspx for a list of third-party package sources for up-to-date packages. Python support is usually not included in the core package, but available as a sub-package instead. It is usually called syslog-ng-python, syslog-ng-mod-python or something similar, depending on the distribution you use.

There are two ways to store the code of your Python source: inline in your syslog-ng configuration or in a separate file. Including short code inline is easier, but there are advantages to storing even short Python code separately, not only longer or more complex code. For example, you can easily validate the code with pylint, have syntax highlighting, or use a dedicated Python IDE, like IDLE or Spyder3. In this blog post I store the code in a separate file. You can learn how to do that from my Python destination blog at https://www.syslog-ng.com/community/b/blog/posts/python-destination-getting-into-details.

You also need the Twisted framework installed. Luckily, many Linux distributions have it available in ready-to-use packages. For example, on Fedora Linux you can install the framework by using:

dnf install python3-twisted

My example code has only been tested with Python 3, but most likely also works with Python 2 with small modifications.

Syslog-ng configuration

source s_http {
    python(
      class("httpsource_v2.HTTPSource")
      options("port", "8081")
    );
};

destination d_http {
    file("/var/log/http");
};

log {
  source(s_http);
  destination(d_http);
};

As with any other Python bindings in syslog-ng, you have to refer to the Python code from the syslog-ng configuration. In this case, you have a source called s_http. The only mandatory option of the Python source is class(), where you name the Python class of your source driver. Here we store the Python code in an external file, so first we have to enter the file name without the extension (in my case it is “httpsource_v2”), followed by a dot and the name of the class.

You can also pass options to the Python code from the syslog-ng configuration. This part is optional, but helps to make your Python code more generic and reusable. Here we pass the port number on which the HTTP source listens.
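As a minimal sketch of how such an option can be used, the helper below turns the options into a Twisted endpoint description string, mirroring the logic of the init() method shown later in this post. The helper name endpoint_from_options and the DEFAULT_PORT constant are hypothetical, and the options are assumed to behave like a plain dictionary of strings.

```python
DEFAULT_PORT = "8080"  # same fallback value as the init() method in this post


def endpoint_from_options(options):
    """Return a Twisted endpoint description such as "tcp:8081".

    `options` is assumed to be a dict of strings, as passed via options().
    """
    port = options.get("port", DEFAULT_PORT)
    return "tcp:%s" % port


# Option set in the configuration versus option left out
print(endpoint_from_options({"port": "8081"}))
print(endpoint_from_options({}))
```

Keeping the default in one place makes it easier to reuse the same driver in several source definitions with different ports.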

In order to see the collected log messages from the Python source, you also need a destination and a log statement connecting the Python source with that destination. In the configuration above, the destination is a file called http under the /var/log/ directory.

Python source code

from twisted.web import server, resource
from twisted.internet import reactor, endpoints

from syslogng import LogSource
from syslogng import LogMessage

# Resource objects represent a single URL segment.
class HTTPSource(LogSource, resource.Resource):
    # URL segment is a leaf
    isLeaf = True

    def init(self, options):
        # if the port option is not set, use the default 8080
        if "port" in options:
            self.listen_port = "tcp:%s" % options["port"]
        else:
            self.listen_port = "tcp:%s" % "8080" # "tcp:8080"

        # Site Objects are responsible for creating HTTPChannel instances
        # to parse the HTTP request, and begin the object lookup process.
        self.site = server.Site(self)
        self.server = endpoints.serverFromString(reactor, self.listen_port)
        return True

    def run(self):
        self.server.listen(self.site)

        # The reactor is the core of the event loop within Twisted
        # (https://en.wikipedia.org/wiki/Reactor_pattern)
        # signal only works in main thread
        reactor.run(installSignalHandlers=False)

    def request_exit(self):
        reactor.callFromThread(reactor.stop)

    def render_GET(self, request):
        request.setResponseCode(501)
        return 'Not implemented'.encode('ascii')

    def render_POST(self, request):
        request_body = request.content.read()
        msg = LogMessage.parse(request_body, self.parse_options)
        self.post_message(msg)

        return 'Message received'.encode('ascii')

You can store the Python source code either inline in the syslog-ng configuration or in external files. In this case we store it in an external file.

The Python code starts with a number of imports:

  • The first two imports are for the Twisted framework.

  • The source driver has to be inherited from the syslogng.LogSource class.

  • The message you pass to syslog-ng from your Python code has to use the LogMessage API for log messages.

The HTTPSource() class is inherited from both a syslog-ng and a Twisted class. The first one is there, because the source driver has to be inherited from the syslogng.LogSource class. The second one is there to handle incoming requests. In this HTTP collector only the “/” page exists, so the isLeaf variable is set to True.

The init() method is optional, but very useful. It initializes your source driver using the options() defined in the syslog-ng configuration. In this case, it checks if “port” is set in the configuration and sets the default port value if it is not passed as an option. Some objects for the HTTP server are also created in this method.

If the init() method returns False, syslog-ng terminates. This way you can check if the options passed to init() are correct and prevent syslog-ng from starting if any problem occurs.
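The example code in this post does not validate the port option. A minimal sketch of such a check could look like the following. The function name is_valid_port is hypothetical and not part of the syslog-ng API; it only assumes that the option arrives as a string.

```python
def is_valid_port(value):
    """Return True if value is a decimal TCP port number between 1 and 65535."""
    try:
        port = int(value)
    except (TypeError, ValueError):
        # Not a number at all, for example "http" or a missing value
        return False
    return 1 <= port <= 65535


for candidate in ("8081", "http", "70000"):
    print(candidate, is_valid_port(candidate))
```

Inside init() you could return False when is_valid_port(options["port"]) fails, so that a mistyped port is caught at startup instead of when the listener is created.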

The run() method is mandatory. This is where your event loop runs. In the code above, it starts to listen on the assigned port and starts the “reactor” (the event loop of Twisted). The reactor event loop starts with the “installSignalHandlers=False” parameter, as signal handling only works in the main thread.

Finally, you also have to implement the mandatory request_exit() method. This is called when syslog-ng is reloaded or stopped. It is used to stop the run() method. In the sample code it calls the stop method of the reactor.
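The relationship between run() and request_exit() can be illustrated without Twisted or syslog-ng. The sketch below is only a stand-in: LoopingSource is a hypothetical class, not the syslogng.LogSource API, and it assumes that run() blocks in a worker thread while request_exit() is called from a different thread.

```python
import threading


class LoopingSource:
    """Stand-in for a source driver; not the syslog-ng LogSource class."""

    def __init__(self):
        self._stop = threading.Event()

    def run(self):
        # Blocks until request_exit() is called, like an event loop would.
        while not self._stop.is_set():
            # Placeholder for "wait for incoming data"
            self._stop.wait(0.01)

    def request_exit(self):
        # Safe to call from another thread; makes run() return.
        self._stop.set()


source = LoopingSource()
worker = threading.Thread(target=source.run)
worker.start()

source.request_exit()
worker.join(timeout=5)
print("run() returned:", not worker.is_alive())
```

In the Twisted-based code above, reactor.callFromThread(reactor.stop) plays the role of the event: it asks the event loop to stop from outside the thread in which it runs.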

The last two methods come from Twisted and are responsible for resource handling. The render_GET() method only answers an error code, as we expect the log messages through POST requests.

The render_POST() method first reads the body of an HTTP request, then it creates a very simple log message using the LogMessage API. You can learn more about the LogMessage API at http://support.oneidentity.com/technical-documents/syslog-ng-open-source-edition/3.18/administration-guide/source-read-receive-and-collect-log-messages/python-writing-server-style-python-sources/python-logmessage-api

Testing

As soon as you save your configuration and Python code, you are ready for testing. For more information about how to start syslog-ng when Python code is saved outside of the syslog-ng configuration, refer to the Python destination blog mentioned earlier in the Before you begin section. Once you start syslog-ng, you should be able to send log messages using curl:

curl -d "<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - 'su root' failed for lonvick on /dev/pts/8" -X POST http://localhost:8081/

In /var/log/http you should see the following line as many times as you posted it:

Oct 11 22:14:15 mymachine.example.com su: 'su root' failed for lonvick on /dev/pts/8
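If you prefer to test from Python instead of curl, the same POST request can be sketched with the standard library. The helper names build_request and send_log are hypothetical; the URL assumes the port 8081 from the configuration above and a syslog-ng instance running on the local machine.

```python
import urllib.request

URL = "http://localhost:8081/"  # port taken from the configuration above
MESSAGE = (
    "<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - "
    "'su root' failed for lonvick on /dev/pts/8"
)


def build_request(url, message):
    """Create a POST request carrying one log message as its body."""
    return urllib.request.Request(url, data=message.encode("utf-8"), method="POST")


def send_log(url, message):
    """Send the message and return the response body as text."""
    with urllib.request.urlopen(build_request(url, message), timeout=5) as response:
        return response.read().decode("ascii")


# Building the request does not need a running server
request = build_request(URL, MESSAGE)
print(request.get_method(), request.full_url)
```

With syslog-ng running, calling send_log(URL, MESSAGE) should return the text sent back by render_POST() in the example code.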

If you have questions or comments related to syslog-ng, do not hesitate to contact us. You can reach us by email or you can even chat with us. For a list of possibilities, check our GitHub page under the “Community” section at https://github.com/balabit/syslog-ng. On Twitter, I am available as @PCzanik.
