Endpoint visibility and monitoring using osquery and syslog-ng

Using osquery, you can ask questions about your machine in an SQL-like language. For example, you can query running processes, logged-in users, installed packages, and even syslog messages. You can run queries on demand or schedule them to run regularly. The results of scheduled queries are logged to a file.

From this post, you will learn how to

  • send log messages to osquery,
  • read osquery logs using syslog-ng,
  • parse the JSON-based log messages of osquery, so that selected fields can be forwarded to Elasticsearch or other destinations expecting name-value pairs.


You can easily perform all of these tasks using the syslog-ng log management solution.


Send log messages to osquery

To be able to query log messages, osquery needs to store them in its own database. It can collect syslog messages through a pipe, but only accepts messages in a specific format.
To configure osquery to accept syslog messages, you can either pass parameters to osqueryd on the command line or add them to a flags file. This file is usually /etc/osquery/osquery.flags and expects each parameter on a separate line.

Set the following parameters, then restart the osqueryd service:

--disable_events=false
--enable_syslog=true

Once you have restarted the osqueryd service, it is time to configure syslog-ng. If you do not have syslog-ng installed yet, install it from the repository of your distribution or find a package on the syslog-ng website. If you are not familiar with syslog-ng, check out the quickstart section in the documentation.

Add the following snippets to the syslog-ng.conf file:

# Reformat log messages in a format that osquery accepts
rewrite r_csv_message {
    set("$MESSAGE", value("CSVMESSAGE"));
    subst("\"", "\"\"", value("CSVMESSAGE"), flags(global));
};

template t_csv {
    template("\"${ISODATE}\",\"${HOST}\",\"${LEVEL_NUM}\",\"${FACILITY}\",\"${PROGRAM}\",\"${CSVMESSAGE}\"\n");
    template_escape(no);
};

# Sends messages to osquery
destination d_osquery {
    pipe("/var/osquery/syslog_pipe" template(t_csv));
};

# Stores messages sent to osquery in a log file, useful for troubleshooting
destination d_osquery_copy {
    file("/var/log/csv_osquery" template(t_csv));
};

# Log path to send incoming messages to osquery
log {
    source(s_sys);
    rewrite(r_csv_message);
    destination(d_osquery);
    # destination(d_osquery_copy);
};

The rewrite is needed to make sure that quotation marks in the message are escaped. The template reformats the messages as expected by osquery. Binaries provided by the osquery project expect syslog messages in this pipe. If you compiled osquery yourself, you might need to change the location. If you want to see which messages are sent to osquery, you can uncomment the “d_osquery_copy” destination in the log path. The “s_sys” source refers to your local log messages and might have a different name on your system (this example is from CentOS).
Note that you should not forward log messages from a central syslog-ng server to osquery, as it was designed with single machines in mind, both for performance and sizing. By default, osquery preserves the last 100,000 messages. If you have a larger network, 100,000 messages can arrive on a central syslog server in a matter of seconds.
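
To make the effect of the rewrite rule and the template easier to see, here is a minimal Python sketch of the same transformation. It is only an illustration, not part of the syslog-ng configuration: the function name to_osquery_csv and all sample values (host, program, message text) are made up for this example.

```python
def to_osquery_csv(isodate, host, level_num, facility, program, message):
    """Build one CSV line the way r_csv_message and t_csv do (illustrative only)."""
    # Counterpart of subst("\"", "\"\"", ...): double every quotation mark
    # in the message so it cannot end the quoted CSV field early.
    csv_message = message.replace('"', '""')
    fields = [isodate, host, str(level_num), facility, program, csv_message]
    # Counterpart of the t_csv template: wrap each field in quotation marks,
    # separate the fields with commas and finish the line with a newline.
    return ",".join(f'"{field}"' for field in fields) + "\n"


# Hypothetical sample values, not taken from a real log.
line = to_osquery_csv(
    "2017-03-01T10:15:30+01:00", "myhost", 6, "auth", "sshd",
    'Accepted password for "admin" from 192.0.2.10',
)
print(line, end="")
```

In the printed line, the quotation marks around admin come out doubled, while the other fields are only wrapped in quotation marks, just like in the template.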


Collect and parse osquery logs

By default, osquery stores all of its log messages under the /var/log/osquery directory. While configuring and debugging osquery, you can use the various .INFO and .WARNING files in this directory to figure out what went wrong. If you configured osqueryd to run periodic queries about your system, the results go to a file called osqueryd.results.log in the same directory. This file is in JSON format, which means that in addition to forwarding its content, syslog-ng can also parse the messages. This has several advantages:

  • you can create filters based on individual fields from the messages
  • you can limit which fields to store, or can create additional fields
  • if you want to store the messages in Elasticsearch, you can add the date in the required format, and send the messages to Elasticsearch directly from syslog-ng.


The following configuration reads the osquery log file, parses it, filters for a given event (not really useful here, just an example), and stores the results in Elasticsearch. For easier understanding, I broke the configuration into smaller pieces; you can find the full configuration at the end of this post. You can append it to your syslog-ng.conf or, in many Linux distributions and on FreeBSD, place it in a separate file under the etc/syslog-ng/conf.d directory.

First we need to read the osquery log file, and make sure that syslog-ng does not try to parse it as a syslog message:

source s_osquery {
    file("/var/log/osquery/osqueryd.results.log"
        program-override("osquery")
        flags(no-parse));
};

Next we need to parse the log messages with the JSON parser, so we have access to the name-value pairs of the JSON-based log messages. The prefix option makes sure that names parsed from JSON do not collide with names used by syslog-ng internally.

parser p_json {
    json-parser (prefix("osquery."));
};
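
If you want a feeling for what the parser produces, the following Python sketch imitates the result of the JSON parser with the "osquery." prefix. It is a rough model, not the syslog-ng implementation: it assumes that nested objects are flattened into dot-separated names, the helper name flatten is my own, and the sample line is a hypothetical one that only follows the general shape of an osquery result.

```python
import json


def flatten(obj, prefix):
    """Turn a (possibly nested) JSON object into prefixed name-value pairs."""
    pairs = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            # Assumption: nested objects become dot-separated names.
            pairs.update(flatten(value, name + "."))
        else:
            pairs[name] = str(value)
    return pairs


# Hypothetical result line, made up for this illustration.
sample = (
    '{"name": "pack_incident-response_kernel_modules", '
    '"hostIdentifier": "myhost", "action": "added", '
    '"columns": {"name": "ip_tables", "status": "Live"}}'
)

pairs = flatten(json.loads(sample), "osquery.")
for name in sorted(pairs):
    print(name, "=", pairs[name])
```

Because every name starts with "osquery.", a JSON field called name cannot be confused with anything syslog-ng uses internally, and the two different name fields in the sample end up as osquery.name and osquery.columns.name.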

Then we define a filter, in this case searching for messages related to loading Linux kernel modules. This is just an example: you can easily filter on any field, combine it with the in-list() filter to match against a list of values, and so on.

filter f_modules {
    "${osquery.name}" eq "pack_incident-response_kernel_modules"
};
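
The logic of such a filter is simple string matching on a parsed field. The Python sketch below models it, together with a list-based variant. It only illustrates the idea: the function names, the WANTED set, and the sample messages are all hypothetical, and in syslog-ng the list of values would normally live in a file.

```python
def f_modules(pairs):
    """Model of the filter: string equality on one parsed field."""
    return pairs.get("osquery.name", "") == "pack_incident-response_kernel_modules"


# Hypothetical list of query names to keep, standing in for a list file.
WANTED = {
    "pack_incident-response_kernel_modules",
    "pack_incident-response_open_sockets",
}


def f_wanted(pairs):
    """Model of a list-based filter: keep the message if the name is listed."""
    return pairs.get("osquery.name", "") in WANTED


# Made-up messages, already parsed into name-value pairs.
messages = [
    {"osquery.name": "pack_incident-response_kernel_modules", "osquery.action": "added"},
    {"osquery.name": "pack_osquery-monitoring_schedule", "osquery.action": "added"},
]

kept = [m for m in messages if f_modules(m)]
print(len(kept), "of", len(messages), "messages pass the filter")
```

Only the first sample message passes, because the comparison is an exact string match on the query name.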

We need to store the logs somewhere, so here we define an Elasticsearch destination. (You can read more about logging to Elasticsearch with syslog-ng here.)

destination d_elastic {
    elasticsearch2 (
        cluster("syslog-ng")
        client-mode("http")
        index("syslog-ng")
        type("test")
        flush-limit("1")
        template("$(format-json --scope rfc5424 --scope nv-pairs --exclude MESSAGE --exclude DATE --key ISODATE)")
    )
};

As a last step, we connect all of these building blocks together using a log statement. If you do not want to filter the forwarded logs, comment out the filter line:

log {
    source(s_osquery);
    parser(p_json);
    filter(f_modules);
    destination(d_elastic);
};

Here is the complete configuration for a better copy & paste experience:

source s_osquery {
    file("/var/log/osquery/osqueryd.results.log"
        program-override("osquery")
        flags(no-parse));
};

parser p_json {
    json-parser (prefix("osquery."));
};

filter f_modules {
    "${osquery.name}" eq "pack_incident-response_kernel_modules"
};

destination d_elastic {
    elasticsearch2 (
        cluster("syslog-ng")
        client-mode("http")
        index("syslog-ng")
        type("test")
        flush-limit("1")
        template("$(format-json --scope rfc5424 --scope nv-pairs --exclude MESSAGE --exclude DATE --key ISODATE)")
    )
};

log {
    source(s_osquery);
    parser(p_json);
    filter(f_modules);
    destination(d_elastic);
};


This was just a quick introduction to osquery and syslog-ng. These examples are good to whet your appetite, but you should read the official documentation of both tools if you plan to use them in production, because producing useful results requires more in-depth knowledge of syslog-ng and osquery.
