Tuesday, 4 December 2018

Trying to get a Bro Cluster to work

After developing an analyzer as a plugin, I had to write a number of Bro scripts to make my tool work; in the end there were four. We prepared a cluster infrastructure with two worker nodes, each with two network interfaces: one listening on two different VLANs and the other connected to the monitoring network. A combined manager and proxy node ran on the monitoring network only.

The installation was quite smooth. I used a 64-bit Debian Jessie and followed the instructions found on the Bro website. After installation, both workers and the manager seemed to work fine. Not exactly.

Briefly, on the manager node, Bro scripts should be copied to:

[1] /usr/local/bro/share/bro/site

...broctl should be configured by modifying the values in:

[2] /usr/local/bro/etc/broctl.cfg

...logs would be copied to:

[3] /mnt/logs/current

...given these scattered and deeply nested directories, I decided to create some symbolic links to them in my /home/bro directory:

root@Manager:/home/bro# ls -l
total 8
drwxr-xr-x 6 bro  bro  4096 Jun  7 13:38 aux_files
drwxr-xr-x 3 bro  bro  4096 May 31 13:30 bro
lrwxrwxrwx 1 root root   29 Jun  8 13:51 broctl.cfg -> /usr/local/bro/etc/broctl.cfg
lrwxrwxrwx 1 root root   17 Jun  8 13:50 logs -> /mnt/logs/current
lrwxrwxrwx 1 root root   29 Jun  8 13:50 site -> /usr/local/bro/share/bro/site

These are the steps to follow whenever a change has to be introduced in the configuration or the Bro scripts:

  1. Modify bro scripts located in [1]
  2. Modify (if necessary) broctl options in [2]
  3. Launch Bro Control (broctl)
  4. Run 'check'
  5. Run 'deploy'
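The last three steps can also be scripted. The following is only a minimal sketch: it assumes broctl lives at /usr/local/bro/bin/broctl (consistent with the install prefix used in this post), that it accepts a command as its first argument, and that it returns a non-zero exit status when 'check' fails. The function name check_then_deploy and the injectable run parameter are mine, not part of BroControl.

```python
import subprocess

# Assumed location, derived from the /usr/local/bro prefix used in this post.
BROCTL = "/usr/local/bro/bin/broctl"


def check_then_deploy(run=subprocess.call, broctl=BROCTL):
    """Run 'broctl check' and, only if it exits with 0, 'broctl deploy'.

    'run' is any callable that takes an argv list and returns an exit
    status; it defaults to subprocess.call. Returns the list of commands
    that were actually executed.
    """
    executed = []
    for command in ("check", "deploy"):
        argv = [broctl, command]
        executed.append(argv)
        if run(argv) != 0:
            break  # stop at the first failing step
    return executed
```

Calling check_then_deploy() on the manager node would run both commands for real; passing a fake run callable makes it possible to dry-run the sequence first.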

All broctl commands, which can be looked up using the 'help' command, are actually python scripts located in:

[4] /usr/local/bro/lib/broctl/Brocontrol

It's not difficult to understand what they are doing if you know a little bit of Python. For instance, the deploy and install commands are implemented in the control.py Python script.

Basically, when a 'deploy' is sent in broctl, the scripts in [1] are copied to:

[5] /home/bro/bro/spool/installed-scripts-do-not-touch/site

...both in the manager and all worker nodes.
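To picture what this copy step amounts to on a single node, here is a rough sketch. It only mimics the visible effect described above (everything under the site directory ends up under the install directory); it is not BroControl's own code, and the function name install_site_scripts is hypothetical. It needs Python 3.8 or later because of dirs_exist_ok.

```python
import shutil
from pathlib import Path


def install_site_scripts(site_dir, install_dir):
    """Copy every file and subdirectory of site_dir into install_dir.

    Illustration only: this reproduces the observable result of a
    'deploy' on one node, not the way BroControl implements it.
    Returns the relative paths of the files now present in install_dir.
    """
    site_dir, install_dir = Path(site_dir), Path(install_dir)
    # copytree creates missing parent directories of install_dir.
    shutil.copytree(site_dir, install_dir, dirs_exist_ok=True)
    return sorted(
        p.relative_to(install_dir).as_posix()
        for p in install_dir.rglob("*")
        if p.is_file()
    )
```

With the paths from this post, site_dir would be [1] and install_dir would be [5].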

Why these directories? Well, this is configured in the broctl.cfg file [2]. This file contains a number of variables that let us configure directories and Bro scripts, such as:

  1. PolicyDirSiteInstall: Directory where all Bro scripts are copied on both the worker and manager nodes. The default is ${SpoolDir}/installed-scripts-do-not-touch/site. In my case it is the directory mentioned in [5].
  2. SitePolicyPath: Colon-separated list of directories to search for site-specific policy (Bro) files. For each such directory, all files and subdirectories are copied to PolicyDirSiteInstall during a broctl 'install' or 'deploy'. The default is /usr/local/bro/share/bro/site.
  3. SitePolicyWorker: Space-separated list of Bro scripts that should run on EVERY worker node of the cluster. The default value is local-worker.bro, which is an empty Bro script. The scripts are copied to ALL nodes of the cluster when a 'deploy' or 'install' command is launched. If we want to run different scripts on each worker, the process consists of installing the script for one worker (on all nodes), then modifying the nodes.cfg file, modifying this variable and sending an 'update' command in broctl.
    sitepolicyworker=my-prot-worker-main.bro my-prot-aux-functions.bro

  4. SitePolicyManager: Space-separated list of Bro scripts that should run on the manager node. In my case this script manages the notices generated by the workers.
    sitepolicymanager=manager_notice_policy.bro

  5. LogDir: Location of the log directory where log files are archived at each rotation interval.
    LogDir = /mnt/logs
    

Of course, there are many other interesting options that can be modified for BroControl. Have a look at the BroControl documentation website.
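To double-check which values are actually in effect for the options discussed above, a small script can read broctl.cfg and expand ${...} references. This is a minimal sketch under a few assumptions: the file uses simple 'Key = value' lines, option names are case-insensitive (as the mixed spelling in this post suggests), and the SpoolDir value in the sample is inferred from the paths [5] and [7]. The function names parse_broctl_cfg and expand are hypothetical helpers, not BroControl functions.

```python
import re

# Hypothetical excerpt of a broctl.cfg, built from values quoted in this post.
SAMPLE = """
# site-specific settings
SpoolDir = /home/bro/bro/spool
PolicyDirSiteInstall = ${SpoolDir}/installed-scripts-do-not-touch/site
sitepolicyworker=my-prot-worker-main.bro my-prot-aux-functions.bro
LogDir = /mnt/logs
"""


def parse_broctl_cfg(text):
    """Parse 'Key = value' lines; keys are lower-cased for lookup."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and malformed lines
        key, value = line.split("=", 1)
        options[key.strip().lower()] = value.strip()
    return options


def expand(value, options):
    """Replace ${Name} with the value of option Name; leave unknown names as-is."""
    def lookup(match):
        return options.get(match.group(1).lower(), match.group(0))
    return re.sub(r"\$\{(\w+)\}", lookup, value)


options = parse_broctl_cfg(SAMPLE)
for name in ("policydirsiteinstall", "sitepolicyworker", "logdir"):
    print(name, "=", expand(options[name], options))
```

On a real manager node, the text would come from the file in [2] instead of SAMPLE.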

Using a bro analyzer as a plugin in a Bro Cluster

One of the problems I faced was that all the Bro scripts running on the worker nodes monitored a proprietary protocol for which I had developed an analyzer as a plugin. When run standalone, on either the manager or a worker node, the scripts and the proprietary analyzer worked fine. But when run as a cluster, the scripts did not seem to understand the proprietary network traffic. After a couple of checks I found that the plugin was not being loaded by Bro on the workers. Then I realized that the problem had to be the plugin directory where Bro was looking for the plugin. Soon I found two different options in the broctl.cfg file that could help me:

  • Env_Vars: A comma-separated list of environment variables (e.g. env_vars=VAR1=123, VAR2=456) to set on all nodes immediately before starting Bro. Node-specific values (specified in the node configuration file) override these global values.
    Env_Vars=BRO_PLUGIN_PATH=/home/bro/aux_files/plugin

  • SitePluginPath: Directories to search for custom plugins (i.e., plugins that are not included with broctl), separated by colons.
    SitePluginPath=/home/bro/aux_files/plugin
    

...but somehow, after deploying, it didn't work.
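To make the Env_Vars format concrete, this is how such a value breaks down into individual variables. It is only a sketch of the format as described in the option text quoted above (comma-separated NAME=value pairs); parse_env_vars is a hypothetical helper and says nothing about how broctl itself handles the option.

```python
def parse_env_vars(value):
    """Split a comma-separated list of NAME=value pairs into a dict."""
    result = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas
        # Split on the first '=' only, so values may contain '='.
        name, _, val = item.partition("=")
        result[name.strip()] = val.strip()
    return result


print(parse_env_vars("VAR1=123, VAR2=456"))
print(parse_env_vars("BRO_PLUGIN_PATH=/home/bro/aux_files/plugin"))
```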

A little bit of research led me to the /usr/local/bro/share/broctl/scripts/run-bro script, where I found that apparently only three environment variables (PATH, BROPATH and CLUSTER_NODE) were set, regardless of what I wrote in the broctl.cfg file:

...
echo "PATH=${PATH}" >.env_vars
echo "BROPATH=${BROPATH}" >>.env_vars
echo "CLUSTER_NODE=${CLUSTER_NODE}" >>.env_vars
echo "BRO_PLUGIN_PATH=${BRO_PLUGIN_PATH}" >>.env_vars
...

I just added the last line (the one for BRO_PLUGIN_PATH), et voilà, my analyzer was working fine!... Well, not so fine. Some things still didn't work: some events were not reported to the manager node, some log files were not created in the manager's LogDir, and a config file was not read.
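After a change like this, it is handy to verify that the variable really ends up in the .env_vars file that run-bro writes. The sketch below assumes the file contains one NAME=value pair per line, as the echo statements above suggest; where exactly the file is written is not shown in this post, so the path is left to the caller. Both function names are hypothetical.

```python
from pathlib import Path


def read_env_file(path):
    """Return the NAME=value pairs found in the file as a dict."""
    env = {}
    for line in Path(path).read_text().splitlines():
        name, sep, value = line.partition("=")
        if sep:  # ignore lines without '='
            env[name] = value
    return env


def plugin_path_is_set(path):
    """True when BRO_PLUGIN_PATH is present and non-empty in the file."""
    return bool(read_env_file(path).get("BRO_PLUGIN_PATH"))
```

An empty value (a line reading just BRO_PLUGIN_PATH=) counts as not set, which is what you would see if the variable never reached the environment of run-bro.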



The fact of cluster logging

My first move was to read the stderr.log and stdout.log files stored in the manager node BUT, somehow, even if I filled my scripts with 'print' instructions, the stdout.log was empty!...

...until I found the /usr/local/bro/share/bro/base/frameworks/cluster/nodes directory. Here you can find some redef options for the worker, manager and proxy nodes. And one of these options was:

## Don't do any local logging.
redef Log::enable_local_logging = F;

...setting this option to T (true) and deploying again made the workers start logging locally in the following directory on the worker nodes:

[7] /home/bro/bro/spool/worker-n/

There I found the reporter.log, which showed me that there was a mistake in one of my Bro scripts. Curiously, the reporter.log file on the manager was empty, which means that logs are NOT COPIED from the workers to the manager node. That's the reason why two of my log files did not appear on the manager node although they were in the workers' log directory [7].

The fact is, log files belong to the node where you find them. The stdout.log of a worker node shows the standard output of the Bro scripts running on that worker, while the stdout.log on the manager corresponds to what the manager policy script sends to standard output.

Regarding the protocol log files, we do find some network traffic messages on the manager, but not all of the messages received by the workers: only about 1 out of 10 messages received on the worker nodes is actually logged on the manager. This is something we still have to understand.
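A ratio like this "1 out of 10" can be estimated by counting the entries in the same log on a worker and on the manager. The sketch below assumes Bro's default ASCII log format, where header lines start with '#', and it simply compares counts; it does not match individual entries. With more than one worker, the worker counts would have to be summed first. The function names are hypothetical.

```python
def count_entries(lines):
    """Count data rows in a Bro ASCII log, skipping '#' header lines."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))


def logged_ratio(worker_lines, manager_lines):
    """Ratio of manager entries to worker entries (0.0 if the worker has none)."""
    worker_total = count_entries(worker_lines)
    if worker_total == 0:
        return 0.0
    return count_entries(manager_lines) / worker_total
```

For example, logged_ratio(open(worker_log), open(manager_log)) could be fed the my-prot.log files from [7] and from the manager's log directory.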



Opening a config file in the workers

My third problem was related to a config file that had to be opened by both the worker and the manager Bro scripts. I was trying to open it with something like:

Input::add_event([$source="config.cfg", $reader=Input::READER_RAW, $name="conf", $fields=Val, $ev=config_entry]);

The solution came when I realized that I had to use the full file path, which should be the same both in the worker and the manager nodes:

Input::add_event([$source="/home/bro/bro/spool/installed-scripts-do-not-touch/site/config.cfg", $reader=Input::READER_RAW, $name="conf", $fields=Val, $ev=config_entry]);

This made it work.
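A plausible explanation, which I state as an assumption rather than a confirmed fact, is that a relative source is resolved against the working directory of each bro process. The lsof output later in this post shows that a worker runs with /home/bro/bro/spool/worker-1 as its cwd, so "config.cfg" would be looked up there instead of in the site directory. The sketch below just illustrates that path arithmetic; resolve_source is a hypothetical helper and the manager cwd used in the example is a guess.

```python
import posixpath


def resolve_source(source, cwd):
    """Return the path a process running in cwd would open for source.

    Relative sources are joined to cwd; absolute ones are returned
    unchanged (posixpath.join discards cwd in that case).
    """
    return posixpath.normpath(posixpath.join(cwd, source))


# Relative name: ends up inside the worker's spool directory.
print(resolve_source("config.cfg", "/home/bro/bro/spool/worker-1"))

# Absolute name: the same file regardless of the node's working directory.
print(resolve_source(
    "/home/bro/bro/spool/installed-scripts-do-not-touch/site/config.cfg",
    "/home/bro/bro/spool/manager",  # hypothetical cwd of the manager
))
```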



The cluster bro processes structure

Now I had to be able to send my local worker log files to the manager. First of all I tried to understand the process structure.

In every worker node there are three bro processes:

root@Mario:/usr/local# ps -ef | grep bro
root      25139      1  0 Jun08 ?        00:00:00 /bin/bash /usr/local/bro/share/broctl/scripts/run-bro -1 -i eth0 -U .status -p broctl -p broctl-live -p local -p worker-1 local.bro broctl base/frameworks/cluster my-prot.bro broctl/auto
root      25145  25139 20 Jun08 ?        03:38:50 /usr/local/bro/bin/bro -i eth0 -U .status -p broctl -p broctl-live -p local -p worker-1 local.bro broctl base/frameworks/cluster my-prot.bro broctl/auto
root      25148  25145  0 Jun08 ?        00:00:09 /usr/local/bro/bin/bro -i eth0 -U .status -p broctl -p broctl-live -p local -p worker-1 local.bro broctl base/frameworks/cluster my-prot.bro broctl/auto

So the first process (the run-bro script) was the parent of one of the bro processes, which in turn forked the other one. The run-bro script mainly launches bro under nohup, passing along all the parameters it received:

...
mybro=${bro}
...
nohup "$mybro" "$@" &
...

So, the question was: why is the working bro process forking? The lsof command was very useful for that:

root@worker-1:/usr/local# lsof | grep 25145 | egrep -v "\.bro|\.so|pipe" | grep 25186
bro:      25145 25186        root  cwd       DIR                8,1      4096     656821 /home/bro/bro/spool/worker-1
bro:      25145 25186        root  rtd       DIR                8,1      4096          2 /
bro:      25145 25186        root  txt       REG                8,1 134900600    1315052 /usr/local/bro/bin/bro
bro:      25145 25186        root    1w      REG                8,1       188     655382 /home/bro/bro/spool/worker-1/stdout.log
bro:      25145 25186        root    2w      REG                8,1        97     655383 /home/bro/bro/spool/worker-1/stderr.log
bro:      25145 25186        root    4u     IPv4             152650       0t0        UDP 192.168.1.10:41394->192.168.1.5:domain 
bro:      25145 25186        root  202u     pack             152653       0t0        ALL type=SOCK_RAW
bro:      25145 25186        root  260r      REG                8,1       739     657284 /home/bro/bro/spool/installed-scripts-do-not-touch/site/config.cfg
bro:      25145 25186        root  272u     unix 0xffff88007a6abb80       0t0     152656 socket
bro:      25145 25186        root  273w      REG                8,1 283722769     656833 /home/bro/bro/spool/worker-1/my-prot.log
bro:      25145 25186        root  279w      REG                8,1    301214     656838 /home/bro/bro/spool/worker-1/communication.log
bro:      25145 25186        root  280w      REG                8,1     42471     656841 /home/bro/bro/spool/worker-1/weird.log
bro:      25145 25186        root  281w      REG                8,1       447     656844 /home/bro/bro/spool/worker-1/reporter.log



root@worker-1:/usr/local# lsof | grep 25148 | egrep -v "\.bro|\.so|pipe" 
bro       25148              root  cwd       DIR                8,1      4096     656821 /home/bro/bro/spool/worker-1
bro       25148              root  rtd       DIR                8,1      4096          2 /
bro       25148              root  txt       REG                8,1 134900600    1315052 /usr/local/bro/bin/bro
bro       25148              root    0u     IPv4             151842       0t0        TCP 192.168.1.10:35948->192.168.1.20:47762 (ESTABLISHED)
bro       25148              root    4u     IPv4             152650       0t0        UDP 192.168.1.10:41394->192.168.1.5:domain 
bro       25148              root  202u     pack             152653       0t0        ALL type=SOCK_RAW
bro       25148              root  260r      REG                8,1       739     657284 /home/bro/bro/spool/installed-scripts-do-not-touch/site/config.cfg
bro       25148              root  273u     unix 0xffff88007a6ab040       0t0     152657 socket
bro       25148              root  279u     IPv4             151845       0t0        TCP 192.168.1.10:57286->192.168.1.20:47761 (ESTABLISHED)
bro       25148              root  284u     IPv4             151850       0t0        TCP *:47763 (LISTEN)
bro       25148              root  285u     IPv6             151851       0t0        TCP *:47763 (LISTEN)

So the first bro process (the parent) is responsible for listening on the monitored interface and writing the local log files, while the second (the child) is responsible for communicating with the manager node (two different connections, to ports 47761 and 47762) and for listening for incoming connections (port 47763 here).
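The parent/child relationships read off the ps output above can also be extracted mechanically. This is a small sketch that assumes the usual 'ps -ef' column order (UID, PID, PPID, ...); the sample rows are shortened copies of the worker listing above, and the function names are hypothetical.

```python
# Shortened rows from the worker's 'ps -ef' output shown above.
SAMPLE = [
    "root 25139     1  0 Jun08 ? 00:00:00 /bin/bash /usr/local/bro/share/broctl/scripts/run-bro -1 -i eth0",
    "root 25145 25139 20 Jun08 ? 03:38:50 /usr/local/bro/bin/bro -i eth0",
    "root 25148 25145  0 Jun08 ? 00:00:09 /usr/local/bro/bin/bro -i eth0",
]


def parse_ps(lines):
    """Map each PID to its parent PID from 'ps -ef' rows."""
    parents = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            parents[int(fields[1])] = int(fields[2])
    return parents


def chain(pid, parents):
    """Return the ancestry of pid among the listed processes, oldest first."""
    path = [pid]
    # Walk upwards while the parent is itself one of the listed processes.
    while parents.get(path[-1]) in parents:
        path.append(parents[path[-1]])
    return path[::-1]


print(chain(25148, parse_ps(SAMPLE)))
```

The walk stops at the run-bro wrapper because its own parent (PID 1) is not part of the listing.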

In the manager node, there are two processes for the proxy and another two for the manager:

root@Manager:/home/bro/site# ps -ef | grep bro
root      38193      1  0 Jun08 ?        00:00:00 /bin/bash /usr/local/bro/share/broctl/scripts/run-bro -1 -U .status -p broctl -p broctl-live -p local -p manager local.bro broctl base/frameworks/cluster notice_policy.bro broctl/auto
root      38199  38193  3 Jun08 ?        00:37:58 /usr/local/bro/bin/bro -U .status -p broctl -p broctl-live -p local -p manager local.bro broctl base/frameworks/cluster notice_policy.bro broctl/auto
root      38201  38199  0 Jun08 ?        00:00:21 /usr/local/bro/bin/bro -U .status -p broctl -p broctl-live -p local -p manager local.bro broctl base/frameworks/cluster notice_policy.bro broctl/auto


root      38236      1  0 Jun08 ?        00:00:00 /bin/bash /usr/local/bro/share/broctl/scripts/run-bro -1 -U .status -p broctl -p broctl-live -p local -p proxy-1 local.bro broctl base/frameworks/cluster local-proxy broctl/auto
root      38242  38236  3 Jun08 ?        00:40:23 /usr/local/bro/bin/bro -U .status -p broctl -p broctl-live -p local -p proxy-1 local.bro broctl base/frameworks/cluster local-proxy broctl/auto
root      38243  38242  0 Jun08 ?        00:00:03 /usr/local/bro/bin/bro -U .status -p broctl -p broctl-live -p local -p proxy-1 local.bro broctl base/frameworks/cluster local-proxy broctl/auto

Netstat showed the open connections among them and the workers.

root@Manager:/home/bro/site# netstat -tupan | grep bro
tcp        0      0 0.0.0.0:47761           0.0.0.0:*               LISTEN      38201/bro       
tcp        0      0 0.0.0.0:47762           0.0.0.0:*               LISTEN      38243/bro       
tcp        0      0 192.168.1.20:47762         192.168.1.10:35948         ESTABLISHED 38243/bro       
tcp        0      0 192.168.1.20:47761         192.168.1.20:38523         ESTABLISHED 38201/bro       
tcp        0      0 192.168.1.20:38523         192.168.1.20:47761         ESTABLISHED 38243/bro       
tcp        0      0 192.168.1.20:47761         192.168.1.11:57110         ESTABLISHED 38201/bro       
tcp        0      0 192.168.1.20:47762         192.168.1.11:45263         ESTABLISHED 38243/bro       
tcp        0      0 192.168.1.20:47761         192.168.1.10:57286         ESTABLISHED 38201/bro       
tcp6       0      0 :::47761                :::*                    LISTEN      38201/bro       
tcp6       0      0 :::47762                :::*                    LISTEN      38243/bro       
udp        0      0 192.168.1.20:44445         192.168.1.5:53             ESTABLISHED 38242/bro       
udp        0      0 192.168.1.20:35156         192.168.1.5:53             ESTABLISHED 38199/bro  

So, to summarize:

  • The proxy (child process) has one connection to every worker node.
  • The manager (child process) has one connection to every worker node.
  • The proxy and the manager (both child processes) are connected to each other through a TCP socket (ports 47761 and 38523, in this case).
  • Both the proxy and the manager parent processes are mostly concerned with logging (they are responsible for writing logs to disk) and DNS resolution.
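The summary above was compiled by hand from the netstat output, but the grouping is easy to automate. The sketch below assumes the column layout of 'netstat -tupan' as shown in this post (protocol, queues, local address, foreign address, state, PID/program) and only looks at established IPv4 TCP rows; the sample rows are copied from the manager listing, and established_peers is a hypothetical helper.

```python
from collections import defaultdict

# ESTABLISHED rows copied from the manager's netstat output above.
SAMPLE = [
    "tcp 0 0 192.168.1.20:47762 192.168.1.10:35948 ESTABLISHED 38243/bro",
    "tcp 0 0 192.168.1.20:47761 192.168.1.20:38523 ESTABLISHED 38201/bro",
    "tcp 0 0 192.168.1.20:38523 192.168.1.20:47761 ESTABLISHED 38243/bro",
    "tcp 0 0 192.168.1.20:47761 192.168.1.11:57110 ESTABLISHED 38201/bro",
    "tcp 0 0 192.168.1.20:47762 192.168.1.11:45263 ESTABLISHED 38243/bro",
    "tcp 0 0 192.168.1.20:47761 192.168.1.10:57286 ESTABLISHED 38201/bro",
]


def established_peers(lines):
    """Group ESTABLISHED tcp rows by owning PID.

    Returns {pid: sorted list of (local_port, remote_address)}.
    """
    peers = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 7 or fields[0] != "tcp" or fields[5] != "ESTABLISHED":
            continue  # skip LISTEN, udp, tcp6 and header rows
        pid = fields[6].split("/")[0]
        if not pid.isdigit():
            continue  # owner hidden (netstat run without root)
        local_port = int(fields[3].rsplit(":", 1)[1])
        peers[int(pid)].append((local_port, fields[4]))
    return {pid: sorted(rows) for pid, rows in peers.items()}


for pid, rows in sorted(established_peers(SAMPLE).items()):
    print(pid, rows)
```

Here PID 38201 is the manager child and PID 38243 the proxy child, as in the listing above.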

Fine, but we still don't know where those proprietary logs are getting lost. I installed tcpdump, and it quickly showed me that the proprietary protocol logs were being sent from the worker to the manager node and received by the manager child process.

So it seemed the problem had to be in my manager policy script, which just implemented the Notice::policy hook...

hook Notice::policy(not:Notice::Info)
