# DCAP Central 4.1

## SonarGateway Systems Guide

SonarGateway is a set of packages and systems that insert data into DCAP Central. SonarGateway ingests data using two basic mechanisms: (1) receiving log streams that can be directed to syslog, via rsyslog, and (2) tailing and syncing complete log files to DCAP Central.

SonarGateway comprises the following modules:

1. SonarGateway - Responsible for receiving remote messages, normalizing them, transforming them, and inserting them into appropriate collections in DCAP Central.

2. SonarLogstash - A Logstash process that can write directly into SonarW, DCAP Central's data repository.

3. SonarOracleGateway - A process that runs on Oracle hosts and syncs the xml log file to DCAP Central.

4. SonarMaprGateway - A package that installs, verifies, and configures rsyslog on a MapR system.

5. SonarMaprAgent - A package that configures periodic sync of MapR log files from the MapR file system to a local file system, where rsyslog monitors it.

### SonarGateway

SonarGateway is a user-configurable rule-based input adapter for DCAP Central. It processes input in three general steps (which may collectively be referred to as "normalization"):

1. Normalizing a message into Key/Value pairs using a specified Event Format. (The Event Format defines basic text processing parameters such as delimiters and rules for valid Key names, etc.)

Step 1 specifies the syntax of the raw input data.

2. Transforming the Key/Value pairs into typed and renamed BSON Field/Value pairs. (This step determines "what" the data should look like.)

3. Dynamic, content-aware routing of the resulting BSONs to specific collections in a database. (This step determines "where" the data should reside.)

Steps 2 and 3 collectively specify the desired semantics of the data.

SonarGateway normalization flexibly adapts unstructured text input into meaningful collections of documents that can be further queried, analyzed, and visualized.
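As a conceptual sketch of the three steps above (in Python, with hypothetical helper names; SonarGateway itself is configured through JSON, not code):

```python
# A conceptual sketch of the three normalization steps
# (hypothetical helpers; not SonarGateway's actual implementation).

# Step 1: normalize raw text into Key/Value pairs per an Event Format.
# A toy space-delimited "key=value" format stands in for CEF etc.
def normalize(raw):
    return dict(pair.split("=", 1) for pair in raw.split())

# Step 2: transform Key/Value pairs into typed, renamed fields ("what").
def transform(event):
    translations = {"sev": ("Severity", int), "msg": ("Login Message", str)}
    out = {}
    for key, value in event.items():
        name, cast = translations.get(key, (key, str))
        out[name] = cast(value)
    return out

# Step 3: route the resulting document to a collection by content ("where").
def route(doc):
    return "mssql_sessions" if doc.get("db") == "MSSQL" else "all_sessions"

doc = transform(normalize("sev=5 db=MSSQL msg=hello"))
print(route(doc), doc)
```

In SonarGateway itself, step 1 is selected by the `event_format`, and steps 2 and 3 correspond to the Field Translations and Collection Selectors described later in this guide.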

#### SonarGateway Installation

SonarGateway is installed as part of the DCAP Central product suite. Once DCAP Central is installed, the SonarGateway service will be running and ready to accept incoming syslog data.

Separate network ports are used for the normalization of different Event Formats.

The normalization syntax for the Event Formats is fixed in source code. User-editable configuration files control the transformation and dynamic routing of content to specific collections and databases; in other words, by editing the configuration files the user can change the semantics of the data to match their requirements.

### Important

For Red Hat systems you will need to set SELinux to Permissive to allow SonarGateway to access the Event Format ports:

```shell
sudo bash
setenforce 0
getenforce
Permissive
```


#### SonarGateway rsyslog Configuration

For each Event Format, configuration files in two locations control how SonarGateway transforms and routes incoming content:

• `/etc/sonar/gateway/{mssql,oracle,qradar, ... etc ...}.json` - Event normalization and transformation are configured by one of these files. This file can be edited by the user; it controls the semantics mentioned above.

• `/etc/rsyslog.d/sonar/gateway/` - Network ports, format-specific parameters, and binary command-line options are configured by files in this directory. For most applications, these files do not need to be changed.

### Important

Each configuration change requires a restart of rsyslog before the changes will take effect:

```shell
sudo systemctl restart rsyslog
```


#### Configuration Validation

SonarGateway requires successful validation of the provided configuration file. If the configuration file does not validate, SonarGateway will not run and no events will be processed.

As an example, a basic Hello SonarGateway configuration that validates correctly will be created. This configuration file will then be built upon to demonstrate more SonarGateway functionality.

First, jump forward to the SonarGateway Log File section and increase the verbosity of the logging by enabling INFO level logs.

The JSON configuration is automatically validated every time the SonarGateway service is restarted. The validation checks for syntax errors in the JSON, that all required fields are present, and that any specified Sonar URIs can be connected to. As configurations of increasing complexity are created, it will be helpful to run this validation before starting the service and processing syslog events. To run this validation:
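As a rough illustration of the required-fields portion of that validation, consider the Python sketch below. This is an approximation, not the actual `sonargateway` code; the field names checked are taken from the examples in this guide, and the real binary also checks JSON syntax and connectivity to the `sonar_URI`.

```python
# Illustrative sketch of a required-field check, similar in spirit to
# what `sonargateway --validate` does (assumed behavior, not the real code).
import json

# Fields that every output_connection in this guide's examples carries.
REQUIRED = {"event_format", "target_db", "default_collection", "unique_label"}

def validate_config(text):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    try:
        objects = json.loads(text)
    except ValueError as e:
        return ["JSON syntax error: %s" % e]
    for obj in objects:
        conn = obj.get("output_connection")
        if conn is None:
            continue  # e.g. global_settings or a Field Translation object
        missing = REQUIRED - conn.keys()
        if missing:
            problems.append("output_connection missing: %s" % sorted(missing))
    return problems
```

A configuration that passes such a check can still fail the real validation if the database at `sonar_URI` is unreachable, as the next example shows.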

```shell
/usr/lib/sonar/gateway/sonargateway --validate --config /etc/sonar/gateway/helloworld.json --delay_depot_dir /var/lib/sonar/gateway
```

The following error is likely to appear:

```
2017-07-03 15:50:54,922 ERROR Connecting using default collection: instance
2017-07-03 15:50:54,922 ERROR Client error message: No suitable servers found (serverSelectionTryOnce set): [connection refused calling ismaster on '127.0.0.1:27119']
2017-07-03 15:50:54,922 ERROR SonarGateway failed to validate config file at: /etc/sonar/gateway/helloworld.json
```


SonarGateway cannot connect to the running SonarW instance. To fix this (assuming SonarW is running on port 27117 on the localhost), edit the configuration file /etc/sonar/gateway/helloworld.json. Note that the port number is 27119 in the sonar_URI JSON field, a database connection string in which you can specify user, password, etc. as per the SonarW documentation:

```json
[
    {
        "output_connection": {
            "event_format": {
                "standard": "CEF"
            },
            "sonar_URI": "mongodb://CN=admin@localhost:27119/admin? ... ",
            "target_db": "sonargateway",
            "default_collection": "instance",
            "unique_label": "instance_cef",
            "group_label": "instance_cef",
            "collection_selectors": {
                "select_by_key_value": {
                    "cs2": "MSSQL"
                }
            }
        }
    }
]
```


Change the sonar_URI to the correct URI for the running SonarW instance, save the file, and rerun the validation. It is likely that changing port 27119 to 27117 is the only change required before running the validation:

```shell
sudo /usr/lib/sonar/gateway/sonargateway --validate --config /etc/sonar/gateway/helloworld.json --delay_depot_dir /var/lib/sonar/gateway
```

If the configuration is successfully validated, the following message should appear:

```
2017-07-03 15:50:26,603 INFO No sonar_DR_URI specified in output_connection with unique label: instance. You may specify one in instance or in global_settings.sonar_DR_URI
2017-07-03 15:50:26,603 INFO sonargateway successfully validated config file at: /etc/sonar/gateway/helloworld.json
```

After successful validation, ensure that you can authenticate and connect with the given sonar_URI string. In the example below, the simpler URI string doesn't return success because it is missing user name, password, and authentication database. When these are added to the URI, it is possible to connect and authenticate:

```shell
##      vvvv sonar_URI string vvvv
mongo mongodb://127.0.0.1:27117 --eval "db.hostInfo()" | grep -F "\"ok\" : 1"
## didn't return "ok" : 1 -- we connected but did NOT authenticate

##      vvvvvvv correct sonar_URI string vvvvvvvvv
mongo mongodb://[user]:'[pass]'@[ip]:27117/[authdb] --eval "db.hostInfo()" | grep ok
"ok" : 1
## returned "ok" : 1 -- we connected and authenticated
```


### Important

Continue to edit the file, run the validation, and check the URI authentication from the command line, until the file validates and the db.hostInfo() check returns ok. SonarGateway will not run until all of these steps are successful.

Tail the logfile in a separate terminal window to monitor the status and progress of SonarGateway:

```shell
tail -f /var/log/sonar/gateway/sonargateway.log
```

#### Event Normalization

The /etc/sonar/gateway/*.json files are arrays of JSON objects. The most important objects are output_connections. These objects configure the normalization and final destination for JSON documents generated from the incoming syslog events.

As mentioned in the SonarGateway introduction, basic normalization is performed according to the event_format specified in the SonarGateway Configuration. Those formats are mostly fixed industry standards along with some customer-specific event formats.

SonarGateway can be configured to change the "what" and "where" semantics of the data via Field Translations and Collection Selectors. Again, we use "normalization" as convenient short-hand to describe the sequence of all of the steps.

In the next sections, complex normalization tasks will be demonstrated, starting with a basic Hello SonarGateway example.

#### Hello SonarGateway

A minimal output_connection object is shown below. This output_connection will attempt to normalize using the Common Event Format; if it is successful, it will insert the resulting JSON document into the collection instance in the database sonargateway:

"output_connection": {
"event_format": {
"standard": "CEF"
},
"target_db": "sonargateway",
"default_collection": "instance",
"unique_label": "instance",
"group_label": "instance",
"collection_selectors": {
"select_by_key_value": {
"cs2": "MSSQL",
}
}
}


Notes:

• event_format is one of the supported event formats.

• sonar_URI is the database connection string. Specify user, password, etc. as per the SonarW documentation.

• target_db – All collections created by the output_connection will be contained in this database.

• unique_label is a string to uniquely identify this output_connections object.

• group_label – Event processing stops for all subsequent output_connections with a matching group_label once one output_connection has successfully processed an event in that group. If you want all output_connections to attempt to process events, use the unique_label as the group_label.

• collection_selectors contains selection rules used to determine if the incoming event should be processed and what collection it should be inserted into.

Use ncat (yum install nmap-ncat on RedHat systems) to echo a syslog string in CEF format to port 10512 on the localhost. (To perform this task, sudo access is required.)

```shell
sudo bash
vi /etc/rsyslog.d/sonar/gateway/sonargateway.conf
```

Uncomment the helloworld.conf include line:

```
...
#$IncludeConfig /etc/rsyslog.d/sonar/gateway/rulesets/eventhub.conf
$IncludeConfig /etc/rsyslog.d/sonar/gateway/rulesets/helloworld.conf
#$IncludeConfig /etc/rsyslog.d/sonar/gateway/rulesets/horton.conf
...
```

Then edit the JSON configuration and restart rsyslog:

```shell
vi /etc/sonar/gateway/helloworld.json    ## change sonar_URI if necessary
systemctl restart rsyslog
```

Next, run the following command:

```shell
echo "Jun 2 12:32:03 test_machine sqlguard[19870]: CEF:0|Guardium|20011|5" | ncat localhost 10512
```

Note that port 10512 is used for these examples. You can freely edit the helloworld.conf file without interfering with any other configurations.

If you run the command `sudo tail -f /var/log/sonar/gateway/sonargateway.log`, the following messages should appear:

```
2017-06-30 02:09:38,338 INFO Successfully connected to default collection: instance
2017-06-30 02:09:38,339 WARNING SonarGateway Release v1.2.1-238-g919e1bc: Logging started/reconfigured, pid 10889
2017-06-30 02:09:38,339 INFO Looking for delayed messages in /var/lib/sonar/gateway/instance_cef_4716186055225792270.journal ...
2017-06-30 02:09:38,339 INFO No delayed messages in /var/lib/sonar/gateway/instance_cef_4716186055225792270.journal.recovering
2017-06-30 02:09:38,339 INFO Creating a collection buffer in namespace sonargateway.instance
2017-06-30 02:09:53,322 INFO Insert 1 bson to sonargateway.instance
```

In the mongo shell, the following should appear:

```
> use sonargateway
switched to db sonargateway
> show collections
instance
system.ingest
> db.runCommand({flush: "instance"})
{ "ok" : 1 }
> db.instance.findOne()
{
    "_id" : ObjectId("59e7ff0896cb25067166bff5"),
    "CEF Version" : 0,
    "Device Product" : "Guardium",
    "Device Vendor" : "IBM",
    "Device Version" : 10,
    "Name" : "Log all SQLs - Full SQL Template",
    "Severity" : "5",
    "Signature ID" : NumberLong(20011),
    "deviceAction" : "SQL_LANG",
    "applicationProtocol" : "MYSQL",
    "Server Type" : "MYSQL",
    "DB Protocol Version" : "10.0.0",
    "destinationPort" : NumberLong(47741),
    "destinationAddress" : "192.168.0.22",
    "destinationUserName" : "ROOT",
    "externalId" : NumberLong(-1),
    "message" : "select @@version_comment limit 1",
    "deviceReceiptTime" : ISODate("2017-06-02T16:32:02.951Z"),
    "sourceProcessName" : "MYSQL CLIENT",
    "sourcePort" : NumberLong(12160),
    "sourceAddress" : "192.168.0.22",
    "startTime" : ISODate("2017-06-02T16:32:02.951Z")
}
```

Note that the CEF standard has been applied while normalizing: for example, the input syslog string rt=1496421122951 has been translated to an ISODate, "deviceReceiptTime" : ISODate("2017-06-02T16:32:02.951Z"), in the final JSON document.

In the next section, the Hello SonarGateway JSON configurations will be expanded to add Field Translations. These translations rename and typecast matching syslog key/value pairs before adding them to the final JSON document.

#### SonarGateway Log File

If you run tail -f /var/log/sonar/gateway/sonargateway.log and ncat a syslog event to SonarGateway, the following messages should appear (without the line number indicators, e.g. Line 1 ---->):

```
Line 1 ----> 2017-06-30 02:09:38,338 INFO Successfully connected to default collection: instance
Line 2 ----> 2017-06-30 02:09:38,339 WARNING SonarGateway Release v1.2.1-238-g919e1bc: Logging started/reconfigured, pid 10889
Line 3 ----> 2017-06-30 02:09:38,339 INFO Looking for delayed messages in /var/lib/sonar/gateway/instance_cef_4716186055225792270.journal ...
Line 4 ----> 2017-06-30 02:09:38,339 INFO No delayed messages in /var/lib/sonar/gateway/instance_cef_4716186055225792270.journal.recovering
Line 5 ----> 2017-06-30 02:09:38,339 INFO Creating a collection buffer in namespace sonargateway.instance
Line 6 ----> 2017-06-30 02:09:53,322 INFO Insert 1 bson to sonargateway.instance
```

Note: SonarGateway will occasionally write other messages to this log file.

Line 1 is from the Configuration Validation.

Line 2 is from the running SonarGateway process and shows the build revision and process ID (pid), in this case 10889. If there are many `Logging started/reconfigured, pid ...` lines with different process IDs, this usually indicates a configuration error causing SonarGateway to exit and restart.
The pid in Line 2 should live until the service is manually restarted.

Lines 3 and 4 refer to journal files used to store syslog events in the event that SonarW is temporarily unavailable. Note that the location of these journal files can be changed via the --delay-depot-dir argument in /usr/lib/sonarw/run_sonargateway.sh.

Line 5 shows the database/collection namespace where the document will be inserted.

Line 6 shows that the normalized syslog BSON has been inserted into SonarW.

To view these lines, the verbosity of the logging may be increased by editing /etc/sonar/gateway/logging.conf:

```
* GLOBAL:
    FORMAT = "%datetime %level %msg"
    FILENAME = "/var/log/sonar/gateway/sonargateway.log"
    ENABLED = true
    TO_FILE = true
    TO_STANDARD_OUTPUT = true
    PERFORMANCE_TRACKING = false
    MAX_LOG_FILE_SIZE = 10485760 ## 10MB
    LOG_FLUSH_THRESHOLD = 1
* TRACE:
    ENABLED = true ## do not disable
* DEBUG:
    ENABLED = false
* FATAL:
    ENABLED = true
* ERROR:
    ENABLED = true
* WARNING:
    ENABLED = true
* INFO:
    ENABLED = false
```

To increase the verbosity of the logfile, enable INFO as follows:

```
* INFO:
    ENABLED = true
```

Note: Other parameters can be changed, e.g. the location of the logfile.

Next, save /etc/sonar/gateway/logging.conf and restart rsyslog:

```shell
systemctl restart rsyslog
```

Before this change, returning to tail -f /var/log/sonar/gateway/sonargateway.log and ncat-ing a syslog event to SonarGateway shows only the TRACE output:

```
2017-06-30 02:11:23,912 WARNING SonarGateway Release v1.2.1-238-g919e1bc: Logging started/reconfigured, pid 25103
```

#### Field Translations

SonarGateway uses Field Translations to perform tasks such as the following:

• rename input Keys to desired JSON field names

• type cast input Values to desired BSON types

There are other Translations available, described in the configuration reference below.
As described in the SonarGateway introduction, Field Translations determine "what" the final data looks like, Collection Selectors determine "where" the data reside, and the combination of "what" and "where" defines the semantics of the data.

In the previous example, the following field/values appeared in the final JSON:

```
"Severity" : "5"
"message" : "select @@version_comment limit 1"
```

The following Translations will be performed next:

• type cast Severity to an Integer

• rename message to "Login Message"

Open /etc/sonar/gateway/helloworld.json with a text editor to see the JSON from the Hello SonarGateway. Edit the file, adding the Field Translation JSON objects side by side with the output_connection:

```json
[
    {
        "output_connection": {
            "event_format": {
                "standard": "CEF"
            },
            "sonar_URI": "mongodb://CN=admin@localhost:27117/admin? ... ",
            "target_db": "sonargateway",
            "default_collection": "instance",
            "unique_label": "instance_cef",
            "group_label": "instance_cef",
            "redact_unmatched_fields": true,
            "collection_selectors": {
                "select_by_key_value": {
                    "cs2": "MYSQL"
                }
            }
        },
        "Severity": {
            "type": 18
        },
        "msg": {
            "rename": "Login Message"
        }
    }
]
```

After editing this file, open a terminal and enter the following commands:

```shell
sudo bash
systemctl restart rsyslog    ## for the configuration settings to take effect
echo "Jun 2 12:32:03 test_machine sqlguard[19870]: CEF:0|Guardium|20011|5" | ncat localhost 10512
```

Next, perform the following actions in the mongo shell:

```
> use sonargateway
switched to db sonargateway
> show collections
instance
system.ingest
> db.runCommand({flush: "instance"})
{ "ok" : 1 }
> db.instance.find().sort({_id:-1}).pretty()
{
    "_id" : ObjectId("595a62f8769f2b4b3a6c9cc2"),
    "Severity" : NumberLong(5),
    "Login Message" : "select @@version_comment limit 1"
}
{
    "_id" : ObjectId("59e7ff0896cb25067166bff5"),
    "CEF Version" : 0,
    "Device Product" : "Guardium",
    "Device Vendor" : "IBM",
    "Device Version" : 10,
    "Name" : "Log all SQLs - Full SQL Template",
    "Severity" : "5",
    "Signature ID" : NumberLong(20011),
    "deviceAction" : "SQL_LANG",
    "applicationProtocol" : "MYSQL",
    "Server Type" : "MYSQL",
    "DB Protocol Version" : "10.0.0",
    "destinationPort" : NumberLong(47741),
    "destinationAddress" : "192.168.0.22",
    "destinationUserName" : "ROOT",
    "externalId" : NumberLong(-1),
    "message" : "select @@version_comment limit 1",
    "deviceReceiptTime" : ISODate("2017-06-02T16:32:02.951Z"),
    "sourceProcessName" : "MYSQL CLIENT",
    "sourcePort" : NumberLong(12160),
    "sourceAddress" : "192.168.0.22",
    "startTime" : ISODate("2017-06-02T16:32:02.951Z")
}
```

The fields are translated as shown below:

```
"Severity" : NumberLong(5),
"Login Message" : "select @@version_comment limit 1"
```

The JSON for each of these is described below. For each Field Translation, the name of the object is the key name in the original syslog input. The type field specifies the final BSON type for the value. The rename value specifies the final field name in the output JSON.

#### Redacting and Ingesting

Two important details from the Field Translations section need further explanation. First, the following line in the output_connection configuration:

```
"redact_unmatched_fields": true
```

redact_unmatched_fields is not required and is false by default. However, it is a very useful configuration setting: when true, only fields that have matching Field Translations will be added to the final BSON. This allows for very precise output document structure and saves disk space by redacting unnecessary fields.

The second important detail is the need to flush the collection in the mongo shell:

```
> use sonargateway
switched to db sonargateway
> show collections
instance
system.ingest
> db.runCommand({flush: "instance"})
```

By default SonarGateway creates ingested collections. In the above example, even though SonarGateway has flushed the BSON, db.instance.count() will return 0 until the ingest queue is flushed using db.runCommand({flush: "instance"}) or until SonarW automatically flushes it.
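The rename, typecast, and redaction behavior described above can be sketched as follows. This is an illustrative Python approximation, not SonarGateway's actual code; BSON type 18 (the 64-bit integer) is modeled here with a plain Python int.

```python
# Illustrative sketch of Field Translations plus redact_unmatched_fields
# (assumed behavior based on the examples in this guide).
def apply_translations(event, translations, redact):
    out = {}
    for key, value in event.items():
        rule = translations.get(key)
        if rule is None:
            if not redact:          # unmatched fields survive only when
                out[key] = value    # redact_unmatched_fields is false
            continue
        name = rule.get("rename", key)
        # Type 18 is BSON int64; everything else passes through unchanged.
        out[name] = int(value) if rule.get("type") == 18 else value
    return out

translations = {"Severity": {"type": 18}, "msg": {"rename": "Login Message"}}
event = {"Severity": "5", "msg": "select @@version_comment limit 1",
         "app": "MYSQL"}

# With redact=True, only Severity and Login Message remain in the output.
print(apply_translations(event, translations, redact=True))
```

With `redact=False`, the unmatched `app` field would also be carried into the output document, mirroring the default behavior described above.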
See the SonarW documentation for more details. Collections are ingested by default, allowing a higher data input rate into SonarW to be sustained. Ingest can be turned off by collection name using global_settings.

Edit /etc/sonar/gateway/helloworld.json. Copy/paste the output_connection to create a second one; change the default_collection, unique_label, and group_label, and turn off ingestion for this collection. This is performed via the do_not_ingest field, which takes an array of collection names:

```json
[
    {
        "global_settings": {
            "do_not_ingest": [
                "session"
            ],
            "sonar_URI": "mongodb://CN=admin@localhost:27117/admin? ... ",
            "target_db": "sonargateway"
        }
    },
    {
        "output_connection": {
            "event_format": {
                "standard": "CEF"
            },
            "default_collection": "session",
            "unique_label": "session_cef",
            "group_label": "session_cef",
            "redact_unmatched_fields": true,
            "collection_selectors": {
                "select_by_key_value": {
                    "cs2": "MYSQL"
                }
            }
        },
        "Severity": {
            "type": 18
        },
        "msg": {
            "rename": "Login Message"
        }
    },
    {
        "output_connection": {
            "event_format": {
                "standard": "CEF"
            },
            "sonar_URI": "mongodb://CN=admin@localhost:27117/admin? ... ",
            "target_db": "sonargateway",
            "default_collection": "instance",
            "unique_label": "instance_cef",
            "group_label": "instance_cef",
            "redact_unmatched_fields": true,
            "collection_selectors": {
                "select_by_key_value": {
                    "cs2": "MYSQL"
                }
            }
        },
        "Severity": {
            "type": 18
        },
        "msg": {
            "rename": "Login Message"
        }
    }
]
```

In the mongo shell, drop the example database:

```
> use sonargateway
switched to db sonargateway
> db.dropDatabase()
{ "dropped" : "sonargateway", "ok" : 1 }
```

Open a terminal and run the following commands:

```shell
sudo bash
systemctl restart rsyslog    ## for the configuration settings to take effect
echo "Jun 2 12:32:03 test_machine sqlguard[19870]: CEF:0|Guardium|20011|5" | ncat localhost 10512
```

Running tail -f /var/log/sonar/gateway/sonargateway.log will show the following messages:

```
[date] INFO parsing global settings
[date] TRACE Global target DB name is sonargateway
[date] TRACE Global DB URI is set.
[date] INFO Successfully connected to default collection: session
[date] INFO Successfully connected to default collection: instance
[date] WARNING SonarGateway Release [version]: Logging started/reconfigured, [pid]
[date] INFO Creating a collection buffer in namespace sonargateway.session
[date] INFO Creating a collection buffer in namespace sonargateway.instance
[date] INFO Insert 1 bson to sonargateway.session
[date] INFO Insert 1 bson to sonargateway.instance
```

Note: `INFO parsing global settings` is where session is marked as do_not_ingest. Also note that the Collection Selectors in both output_connections have matched, which creates two collection buffers (the last two lines of the log file example above).

In the mongo shell, run the commands shown below. Note that count() for instance is 0 because it is being ingested. The BSON item is still in the queue, as displayed by ingeststats.
However, db.session.findOne() finds a document immediately since session is not marked as ingest:

```
> show collections
instance
session
system.ingest
> db.instance.count()
0
> db.session.count()
1
> db.system.ingest.find()
{ "_id" : "instance", "allow_duplicate_ids" : true, "buffer_size" : NumberLong(1073741824) }
> db.runCommand({ingeststats: "instance"})
{
    "details" : [
        {
            "bytes_in_queue" : NumberLong(92),
            "items_in_queue" : NumberLong(1),
            "bytes_ingested" : NumberLong(0),
            "ingesting" : false,
            "tid" : NumberLong(0),
            "deleted" : NumberLong(0),
            "mapped_size" : NumberLong(1048576),
            "mapped_size_string" : "1024.000000 KB",
            "last_flush_time" : ISODate("2017-10-19T20:27:43Z")
        },
        {
            "bytes_in_queue" : NumberLong(0),
            ..... etc .....
            "last_flush_time" : ISODate("2017-10-19T20:27:43Z")
        }
    ],
    "ok" : 1
}
> db.session.findOne()
{
    "_id" : ObjectId("59e90ac496cb257314181648"),
    "Severity" : NumberLong(5),
    "Login Message" : "select @@version_comment limit 1"
}
```

Note that the document fields and types for session are the same as for instance because the same Field Translations and "redact_unmatched_fields": true were used for both output_connections.

#### Collection Selectors

The collection_selectors section of an output_connection contains selection rules used to determine whether the incoming event should be inserted into the default_collection. Sometimes the selector will dynamically determine a target collection instead of using the default_collection.

In all of the Hello SonarGateway examples, select_by_key_value was used to route the BSON to a collection:

```
"default_collection": "instance",
"collection_selectors": {
    "select_by_key_value": {
        "cs2": "MSSQL"
    }
}
```

With this selector specified, if the raw syslog text has the Key cs2 with Value MSSQL, the BSON will be inserted into the instance collection.
Open /etc/rsyslog.d/sonar/gateway/rulesets/helloworld.conf and change the CONFIG_FILE to the following:

```
CONFIG_FILE=/etc/sonar/gateway/collection_selectors.json
```

Open /etc/sonar/gateway/collection_selectors.json and change all the sonar_URI fields to match the one contained in /etc/sonar/gateway/helloworld.json. Next, check your configuration:

```shell
sudo bash
/usr/lib/sonar/gateway/sonargateway --validate --config /etc/sonar/gateway/collection_selectors.json --delay_depot_dir /var/lib/sonar/gateway
```

The following messages should appear after the sonar_URI fields have been fixed:

```
[date] INFO parsing global settings
[date] INFO Successfully connected to default collection: all_sessions
[date] INFO Successfully connected to default collection: mssql_sessions
[date] INFO Successfully connected to default collection: oracle_sessions
[date] INFO Successfully connected to default collection: sessions_by_key
[date] INFO Output connections at array indexes 4, 5 are grouped by group_label: session_by_key_value
[date] INFO SonarGateway successfully validated config file at: /etc/sonar/gateway/collection_selectors.json
```

With the collection_selectors.json file successfully validated, note that the first two objects are global for all output_connection objects:

```json
[
    {
        "global_settings": {
            "sonar_URI": "mongodb://CN=admin@localhost:27117/admin? ... ",
            "target_db": "sonargateway",
            "do_not_ingest": [
                "all_sessions",
                "oracle_sessions",
                "mssql_sessions",
                "sessions_by_key"
            ]
        }
    },
    {
        "global_field_translations": {
            "msg": {
                "rename": "Greeting"
            },
            "app": {
                "rename": "Product"
            },
            "cs2": {
                "rename": "Database"
            }
        }
    },
    .....
```

global_settings was introduced in the Redacting and Ingesting section. The global_field_translations object is new. These translations are the same as the ones seen previously, but they will be used by all output_connections.
The rest of the file consists of JSON configuration already seen, except for the introduction of one new Collection Selector, select_by_key, which takes an array of Key strings. If the raw syslog text has a Key that matches any in the array, the BSON will be inserted into the default_collection. Below are excerpted parts of the configuration.

The all_sessions output_connection will match any CEF-formatted syslog event, since CEF events are always specified to have a Device Vendor field:

```json
{
    "output_connection": {
        "default_collection": "all_sessions",
        "unique_label": "all_sessions",
        "group_label": "all_sessions",
        "collection_selectors": {
            "select_by_key": [
                "Device Vendor"
            ]
        }
    }
}
```

The session_by_key_value_1 output_connection will match if the raw syslog text has the Key cs2 with Value MSSQL. It will insert the BSON into the mssql_sessions collection:

```json
{
    "output_connection": {
        "default_collection": "mssql_sessions",
        "unique_label": "session_by_key_value_1",
        "group_label": "session_by_key_value",
        "collection_selectors": {
            "select_by_key_value": {
                "cs2": "MSSQL"
            }
        }
    }
}
```

The session_by_key_value_2 output_connection will match if the raw syslog text has the Key cs2 with Value ORACLE. It will insert into the oracle_sessions collection:

```json
{
    "output_connection": {
        "default_collection": "oracle_sessions",
        "unique_label": "session_by_key_value_2",
        "group_label": "session_by_key_value",
        "collection_selectors": {
            "select_by_key_value": {
                "cs2": "ORACLE"
            }
        }
    }
}
```

The session_by_key output_connection will match if the raw syslog text has the Key cs2 or msg. It will insert into the sessions_by_key collection:

```json
{
    "output_connection": {
        "default_collection": "sessions_by_key",
        "unique_label": "session_by_key",
        "group_label": "session_by_key",
        "collection_selectors": {
            "select_by_key": [
                "cs2",
                "msg"
            ]
        }
    }
}
```

There are other collection selectors, as described in the reference section below.
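The behavior of these two selectors can be sketched as follows. This is an illustrative Python approximation; whether select_by_key_value requires all listed pairs to match is an assumption here, since the examples in this guide use a single pair.

```python
# Illustrative sketch of select_by_key and select_by_key_value
# (assumed matching semantics, not SonarGateway's actual code).
def selector_matches(selectors, event):
    # select_by_key: any listed Key present in the event matches.
    if any(k in event for k in selectors.get("select_by_key", [])):
        return True
    # select_by_key_value: assumed to require every listed pair to match.
    kv = selectors.get("select_by_key_value")
    return bool(kv) and all(event.get(k) == v for k, v in kv.items())

event = {"Device Vendor": "IBM", "cs2": "MSSQL", "msg": "select 1"}

assert selector_matches({"select_by_key": ["Device Vendor"]}, event)
assert selector_matches({"select_by_key_value": {"cs2": "MSSQL"}}, event)
assert not selector_matches({"select_by_key_value": {"cs2": "ORACLE"}}, event)
```

A selector that does not match simply means that output_connection skips the event; other output_connections still get a chance to process it, subject to the group_label rules demonstrated next.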
In the mongo shell, drop the example database:

```
> use sonargateway
switched to db sonargateway
> db.dropDatabase()
{ "dropped" : "sonargateway", "ok" : 1 }
```

Open a terminal and run the following commands:

```shell
sudo bash
systemctl restart rsyslog    ## for the configuration settings to take effect
echo "Jun 2 12:32:03 test_machine sqlguard[19870]: CEF:0|Guardium|20011|5" | ncat localhost 10512
```

Running tail -f /var/log/sonar/gateway/sonargateway.log will reveal that three of the four output_connections were triggered because their collection_selectors matched:

```
[date] INFO Successfully connected to default collection: all_sessions
[date] INFO Successfully connected to default collection: mssql_sessions
[date] INFO Successfully connected to default collection: oracle_sessions
[date] INFO Successfully connected to default collection: sessions_by_key
[date] INFO Output connections at array indexes 4, 5 are grouped by group_label: session_by_key_value
[date] WARNING SonarGateway Release v1.2.1-238-g919e1bc: Logging started/reconfigured, pid 27701
[date] INFO Creating a collection buffer in namespace sonargateway.all_sessions
[date] INFO Creating a collection buffer in namespace sonargateway.mssql_sessions
[date] INFO Creating a collection buffer in namespace sonargateway.sessions_by_key
[date] INFO Insert 1 bson to sonargateway.all_sessions
[date] INFO Insert 1 bson to sonargateway.mssql_sessions
[date] INFO Insert 1 bson to sonargateway.sessions_by_key
```

Three collections were created:

1. all_sessions because the Device Vendor Key is present

2. mssql_sessions because the cs2 Key has Value MSSQL

3. sessions_by_key because either the cs2 or the msg Key is present

In the mongo shell, check that everything is proceeding as expected:

```
> use sonargateway
switched to db sonargateway
> show collections
all_sessions
sessions_by_key
mssql_sessions
> db.all_sessions.findOne()
{
    "_id" : ObjectId("595c36a2769f2b7705301756"),
    "Product" : "Guardium",
    "Database" : "MSSQL",
    "Greeting" : "select @@version_comment limit 1"
}
```

##### Using the group_label

Note that the session_by_key Collection Selector matched the syslog event in the previous example because it does not share a group_label with any of the previous output_connections. SonarGateway stops processing collection_selectors in the same group_label once the first output_connection has matched. This can help prevent redundant data from being stored in SonarW.

For this example, give session_by_key the same group_label as the all_sessions output_connection, then restart the SonarGateway service, rerun the syslog input, and observe the results. Open /etc/sonar/gateway/collection_selectors.json, change "group_label": "session_by_key" to "group_label": "all_sessions", and save the file.
In the mongo shell, drop the example database:

```
> use sonargateway
switched to db sonargateway
> db.dropDatabase()
{ "dropped" : "sonargateway", "ok" : 1 }
```

Open a terminal and run the following commands:

```shell
sudo bash
systemctl restart rsyslog    ## for the configuration settings to take effect
echo "Jun 2 12:32:03 test_machine sqlguard[19870]: CEF:0|Guardium|20011|5" | ncat localhost 10512
```

Now, running tail -f /var/log/sonar/gateway/sonargateway.log reveals that only two of the four output_connections were triggered:

```
[date] INFO parsing global settings
[date] INFO Successfully connected to default collection: all_sessions
[date] INFO Successfully connected to default collection: mssql_sessions
[date] INFO Successfully connected to default collection: oracle_sessions
[date] INFO Successfully connected to default collection: sessions_by_key
[date] INFO Output connections at array indexes 3, 6 are grouped by group_label: all_sessions
[date] INFO Output connections at array indexes 4, 5 are grouped by group_label: session_by_key_value
[date] WARNING SonarGateway Release v1.2.1-238-g919e1bc: Logging started/reconfigured, pid 1356
[date] INFO Creating a collection buffer in namespace sonargateway.all_sessions
[date] INFO Creating a collection buffer in namespace sonargateway.mssql_sessions
[date] INFO Insert 1 bson to sonargateway.all_sessions
[date] INFO Insert 1 bson to sonargateway.mssql_sessions
```

Two collections were created:

1. all_sessions because the Device Vendor Key is present

2. mssql_sessions because the cs2 Key has Value MSSQL

The sessions_by_key collection was not created because all_sessions successfully matched and has the same group_label.
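The group_label gating demonstrated above can be sketched as follows. This is an illustrative Python approximation of the routing loop, using the four output_connections from this example; the selector logic is reduced to the two selector types shown earlier.

```python
# Illustrative sketch of group_label short-circuiting across
# output_connections (assumed behavior based on this guide's examples).
def selector_matches(conn, event):
    sel = conn["collection_selectors"]
    if any(k in event for k in sel.get("select_by_key", [])):
        return True
    kv = sel.get("select_by_key_value")
    return bool(kv) and all(event.get(k) == v for k, v in kv.items())

def route_event(event, connections):
    """Return the list of collections the event is inserted into."""
    inserted, matched_groups = [], set()
    for conn in connections:
        if conn["group_label"] in matched_groups:
            continue  # processing stops for the rest of this group
        if selector_matches(conn, event):
            matched_groups.add(conn["group_label"])
            inserted.append(conn["default_collection"])
    return inserted

connections = [
    {"default_collection": "all_sessions", "group_label": "all_sessions",
     "collection_selectors": {"select_by_key": ["Device Vendor"]}},
    {"default_collection": "mssql_sessions", "group_label": "session_by_key_value",
     "collection_selectors": {"select_by_key_value": {"cs2": "MSSQL"}}},
    {"default_collection": "oracle_sessions", "group_label": "session_by_key_value",
     "collection_selectors": {"select_by_key_value": {"cs2": "ORACLE"}}},
    # sessions_by_key now shares all_sessions' group_label, as edited above.
    {"default_collection": "sessions_by_key", "group_label": "all_sessions",
     "collection_selectors": {"select_by_key": ["cs2", "msg"]}},
]
event = {"Device Vendor": "IBM", "cs2": "MSSQL", "msg": "select 1"}
print(route_event(event, connections))
# → ['all_sessions', 'mssql_sessions']; sessions_by_key is skipped because
#   all_sessions already matched in the same group.
```

An event with cs2=ORACLE would instead land in all_sessions and oracle_sessions, matching the behavior shown in the next run below.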
In the mongo shell, run the following:

> show collections
all_sessions
mssql_sessions
> db.all_sessions.findOne()
{
    "_id" : ObjectId("59e923d296cb25054c07dd0c"),
    "Product" : "Guardium",
    "Database" : "MYSQL",
    "Greeting" : "select @@version_comment limit 1"
}
> db.mssql_sessions.findOne()
{
    "_id" : ObjectId("59e923d296cb25054c07dd0d"),
    "Product" : "Guardium",
    "Database" : "MYSQL",
    "Greeting" : "select @@version_comment limit 1"
}

The documents in the two collections have the same content because global_field_translations was used. The content for each collection can be customized by moving those global_field_translations back into the individual output_connections. Before proceeding to a more complex example that demonstrates such customization, run two more inputs into the current configuration:

sudo bash
echo "Jun 2 12:32:03 test_machine sqlguard[19870]: CEF:0Guardium200115 ncat localhost 10512

Running tail -f /var/log/sonar/gateway/sonargateway.log reveals one new collection created, and two flushed BSONs:

2017-10-19 22:20:25,392 INFO Creating a collection buffer in namespace sonargateway.oracle_sessions
2017-10-19 22:20:30,416 INFO Insert 1 bson to sonargateway.all_sessions
2017-10-19 22:20:30,449 INFO Insert 1 bson to sonargateway.oracle_sessions

The oracle_sessions collection is created via the select_by_key_value collection_selector in the output_connection with "unique_label": "session_by_key_value_2". Note that this shares a group_label with another output_connection, but because the previous one didn't match, the input is normalized and the new collection is created.

In the mongo shell, run the following:

> show collections
all_sessions
mssql_sessions
oracle_sessions
> db.all_sessions.count()
2
> db.mssql_sessions.count()
1
> db.oracle_sessions.count()
1

#### Automatic Type Detection

Field Translations showed a type Translation for Severity, which converted the string "5" to NumberLong(5).
There were, however, other fields in the example that were converted automatically. Note that Signature ID, destinationPort, deviceReceiptTime, startTime, and others were converted to sensible types:

{
    "_id" : ObjectId("59e7ff0896cb25067166bff5"),
    "CEF Version" : 0,
    "Device Product" : "Guardium",
    "Device Vendor" : "IBM",
    "Device Version" : 10,
    "Name" : "Log all SQLs - Full SQL Template",
    "Severity" : "5",
    "Signature ID" : NumberLong(20011),
    "deviceAction" : "SQL_LANG",
    "applicationProtocol" : "MYSQL",
    "Server Type" : "MYSQL",
    "DB Protocol Version" : "10.0.0",
    "destinationPort" : NumberLong(47741),
    "destinationAddress" : "192.168.0.22",
    "destinationUserName" : "ROOT",
    "externalId" : NumberLong(-1),
    "message" : "select @@version_comment limit 1",
    "deviceReceiptTime" : ISODate("2017-06-02T16:32:02.951Z"),
    "sourceProcessName" : "MYSQL CLIENT",
    "sourcePort" : NumberLong(12160),
    "sourceAddress" : "192.168.0.22",
    "startTime" : ISODate("2017-06-02T16:32:02.951Z")
}

Best efforts are made to infer and cache the type conversion for all input fields. The automatic type detection can be overridden by explicitly specifying the type using a Field Translation. This is sometimes necessary for integers being interpreted as epoch ISODates instead of plain integers. The conversion of DateTime strings into ISODates is possibly the most important reason for Automatic Type Detection, as many queries and analyses require correct ISODates.
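A minimal sketch of this style of type inference is shown below. This is illustrative only; the actual detection and caching logic is internal to SonarGateway, and the format list here is just a subset modeled on /etc/sonar/gateway/dt_fmt_strings:

```python
# Illustrative sketch (not SonarGateway's implementation) of automatic type
# detection: try integer first, then a list of datetime format strings,
# and fall back to a plain string.
from datetime import datetime

# A subset of format strings, as might appear in dt_fmt_strings
DT_FMT_STRINGS = [
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d",
]

def detect_type(value: str):
    try:
        return int(value)          # would become NumberLong in BSON
    except ValueError:
        pass
    for fmt in DT_FMT_STRINGS:
        try:
            return datetime.strptime(value, fmt)  # would become ISODate
        except ValueError:
            pass
    return value                   # left as a plain string

print(detect_type("47741"))              # 47741
print(detect_type("2017-06-02"))         # datetime(2017, 6, 2, 0, 0)
print(detect_type("select @@version"))   # select @@version
```

Note that in this model an integer always wins over a date interpretation, which mirrors the case described above where epoch-like integers sometimes need an explicit Field Translation to control the result.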
In the mongo shell, drop the example database via the following:

> use sonargateway
switched to db sonargateway
> db.dropDatabase()
{ "dropped" : "sonargateway", "ok" : 1 }

Open a terminal and run the following commands:

sudo bash
systemctl restart rsyslog  ## for the configuration settings to take effect
## going to break the "start" formatting
echo "Jun 2 12:32:03 test_machine sqlguard[19870]: CEF:0|IBM|Guardium|10.0|20011|Log all SQLs - Full SQL Template|5|rt=1496421122951 start=2017-06-02 cs1=INFO cs1Label=Severity cs2=MYSQL cs2Label=Server Type app=MYSQL cs4=10.0.0 cs4Label=DB Protocol Version sproc=MYSQL CLIENT act=SQL_LANG externalId=-1 duser=ROOT dst=192.168.0.22 dpt=47741 src=192.168.0.22 spt=12160 msg=select @@version_comment limit 1" | ncat localhost 10512

Note that because the format of the start field is broken, the input is no longer automatically converted to an ISODate and is instead interpreted as a string:

> db.all_sessions.findOne().startTime
2017-06-02

Next, edit the following file:

vi /etc/sonar/gateway/dt_fmt_strings

Add %Y-%m-%d to the end of the file so it appears as shown below:

.....
.....
%Y-%m-%d %T
%Y-%m-%dT%H:%M:%S.%fZ
%Y-%m-%dT%T
%Y-%m-%d %H:%M:%S
%Y-%m-%d

In the mongo shell, drop the example database via the following:

> use sonargateway
switched to db sonargateway
> db.dropDatabase()
{ "dropped" : "sonargateway", "ok" : 1 }

Open a terminal and run the following commands:

systemctl restart rsyslog  ## for the configuration settings to take effect
## now have a format string to match
echo "Jun 2 12:32:03 test_machine sqlguard[19870]: CEF:0Guardium200115 ncat localhost 10512

Note the ISODate:

> db.all_sessions.findOne().startTime
ISODate("2017-06-02T00:00:00Z")

Any number of datetime format strings can be added to /etc/sonar/gateway/dt_fmt_strings, but the rsyslog service must be restarted for the new format strings to be read by SonarGateway.

### Sending logs from MS SQL to SonarGateway

#### Configuring the DCAP Central host

1. Add the MS SQL data source to SonarGateway:

sudo vi /etc/rsyslog.d/sonar/gateway/sonargateway.conf

Uncomment this line:

#$IncludeConfig /etc/rsyslog.d/sonar/gateway/rulesets/mssql.conf

2. Save and exit.

3. Run the following command:

sudo systemctl restart rsyslog

4. Perform verification:

netstat -tpln | grep rsyslog


Look for port 10536 in the output.

#### Configuring the MS SQL host

1. Create MS SQL Audit to Application, as per MS SQL documentation.

2. Check the source-name used for the MS SQL logs in Windows Event Viewer (Windows Logs >> Application). Record this source-name.

3. Ensure a syslog forwarder application is installed on the system, e.g. NXLog.


4. Modify the syslog forwarder application's configuration file (e.g. nxlog.conf) using the syntax specified in the application's documentation. The example shown below refers to nxlog.conf.

##  nxlog.conf
## This is a sample configuration file. See the nxlog reference manual about the
## configuration options. It should be installed locally and is also available
## online at http://nxlog.org/docs/

## Please set the ROOT to the folder your nxlog was installed into,
## otherwise it will not start.

#define ROOT C:\Program Files\nxlog
define ROOT <replace with NXlog root>

Moduledir %ROOT%\modules
CacheDir %ROOT%\data
Pidfile %ROOT%\data\nxlog.pid
SpoolDir %ROOT%\data
LogFile %ROOT%\data\nxlog.log

<Extension _syslog>
Module      xm_syslog
</Extension>

<Extension _json>
Module    xm_json
</Extension>

<Input in>
Module      im_msvistalog
## add here your sourcename from event viewer. Use double quotation around the name.
Exec if $SourceName != "<use source-name from step #2 above>" drop();
Exec $Hostname = hostname_fqdn(); $hostname_ip = host_ip(); to_json();
Exec to_json();
</Input>

# Connect to syslog listener on sonarg host
<Output out>
Module om_tcp
## add your ip and port to send ms-sql logs to
Host <sonarg hostname or IP>
Port 10536
Exec to_json(); $Message = $raw_event; to_syslog_bsd();
# Exec $raw_event = replace($raw_event, '{', ':{', 1);
</Output>

<Route 1>
Path in => out
</Route>

5. Restart the syslog forwarder via services.msc.

### Sending logs from Oracle to SonarGateway

#### Configuration on the DCAP Central host

1. Add the Oracle data source to SonarGateway:

sudo vi /etc/rsyslog.d/sonar/gateway/sonargateway.conf

Uncomment the following line:

#$IncludeConfig /etc/rsyslog.d/sonar/gateway/rulesets/oracle.conf

2. Save and exit.

3. Run the following command:

sudo systemctl restart rsyslog

4. Perform a verification:

netstat -tpln | grep rsyslog


Look for port 10518 in the output.

#### Configuration on the Oracle host

1. Connect to the Oracle server (using a SQL client), and set audit_trail to "os".

2. Set the auditing level according to your requirements. Refer to the relevant Oracle documentation.

3. Configure rsyslog to send logs from a specific audit level to the DCAP Central machine.

4. Restart rsyslog.

5. Apply the updates to Oracle.

The below example applies specifically to Oracle 12.1.0:

SQL> startup;
SQL> show parameter audit;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
audit_sys_operations                 boolean     TRUE
audit_syslog_level                   string
audit_trail                          string      DB
unified_audit_sga_queue_size         integer     1048576


In this example, audit_syslog_level is not set, and audit_trail is set to DB. To pass audit records to syslog, set audit_trail to OS and persist the change in the server parameter file (spfile):

SQL> alter system set audit_trail=OS scope = SPFILE;
SQL> create pfile from spfile;


To get audit logs that integrate well with the DCAP Central system, audit by access. Example:

SQL> audit all by access;
SQL> audit select any table by access;
SQL> audit update any table by access;
SQL> audit insert any table by access;
SQL> audit alter any table by access;
SQL> audit delete any table by access;


Restart the Oracle server for the changes to take effect. The example below is for Oracle-xe:

sudo systemctl restart oracle-xe

2. Set the audit level, e.g. info:

echo "*.audit_syslog_level=local0.info" | sudo tee -a /oracle/product/12.1.0/dbhome_1/dbs/initorcl.ora

3. Configure rsyslog to send local0.info logs to a DCAP Central machine on IP 1.2.3.4 (replace with your DCAP Central host IP):

echo "local0.info  @@1.2.3.4:10518" | sudo tee -a /etc/rsyslog.conf

4. Restart rsyslog:

sudo systemctl restart rsyslog


SQL> shutdown immediate;
SQL> create pfile from spfile;
SQL> startup;


### SonarOracleGateway

This package is meant to be installed on a machine running Oracle. It will copy Oracle XML log files to DCAP Central for further processing.

#### Setup

Prerequisite: Oracle package installed.

• Install the SonarOracleGateway package

After installing, you will need to configure the DCAP Central remote destination. There are several steps to carry out on the Oracle machine and on the DCAP Central machine.

#### Oracle Machine Configuration

1. Edit the file /etc/sonar-oracle-gateway.conf. The commented values are the defaults - uncomment and modify as needed to match your installation.

2. Set up ssh connectivity for rsync.

3. Create (or append to) the file ~oracle/.ssh/config with the following contents:

host sonarg
hostname <sonarg hostname>
user sonargd


Note: If the DEST parameter in /etc/sonar-oracle-gateway.conf has been modified, ensure it matches ~oracle/.ssh/config as well. The template above is based on the default values.

4. Copy (e.g. to the clipboard) the contents of the file ~oracle/.ssh/id_rsa.pub

#### DCAP Central Machine Configuration

1. Paste the contents of the file you copied in the previous step into the following:

/var/lib/sonargd/.ssh/authorized_keys. If the file already exists, append to it; otherwise, create it. Ensure the file is owned by sonargd.sonar and that its permissions are 640; the containing .ssh directory should also be owned by sonargd.sonar, with permissions 700.

2. Change the shell for sonargd user to be bash:

sudo usermod -s /bin/bash sonargd


This is required to allow connecting to the DCAP Central machine using rsync from the Oracle machine.

3. Configure sonargd to process the xml files. This will depend on the format of the XML filenames, and on the collection where they will be routed.

4. Edit the sonargd.conf configuration file, typically found in the following path:

sudo vi /etc/sonar/sonargd.conf


There are two sections to edit: plugins instructs sonargd how to process the information in the files, and misc-files routes the processed information to the collection.

(The example below is for Oracle 12c and routing the input to a collection named oracle_logs.) Insert / merge the following lines into sonargd.conf:

plugins:
    default:
        sonargdm
    oracle_logs:
        oracle

misc-files:
    match: orcl_ora_[a-zA-Z0-9]
    collection: oracle_logs

5. Restart sonargd service to apply the change in configuration:

sudo systemctl restart sonargd
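The routing performed by the misc-files section can be sketched as follows. This is a hypothetical Python model of the match/collection rule above, not sonargd's implementation:

```python
# Hypothetical model of sonargd's misc-files routing: a filename matching
# the "match" pattern is routed to the named collection.
import re

MISC_FILES = [
    {"match": r"orcl_ora_[a-zA-Z0-9]", "collection": "oracle_logs"},
]

def route_file(filename):
    """Return the collection a file would be routed to, or None."""
    for rule in MISC_FILES:
        if re.search(rule["match"], filename):
            return rule["collection"]
    return None  # unmatched files are not routed by this section

print(route_file("orcl_ora_12345.xml"))  # oracle_logs
print(route_file("audit.log"))           # None
```

In this model the pattern is an unanchored regular expression, which is one reasonable reading of the match value shown above; check the sonargd documentation for the exact matching semantics.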


### Using SonarW to Send Syslog Messages

The underlying datastore, SonarW, has pipeline operators that allow you to send data directly over syslog as well as projection operators that allow you to format JSON documents as LEEF.

As an example, if a collection has documents of the form:

> db.webservice.findOne({number: 'INC0000002'})
{
"_id" : ObjectId("59cf96748a7597550015e40f"),
"number" : "INC0000002",
"SonarG Source" : "SonarDispatcher-Web-Service",
"__status" : "success",
"active" : true,
"activity_due" : "2017-08-10 14:51:11",
"approval" : "null",
"assigned_to" : "Howard Johnson",
"assignment_group" : "Network",
"caller_id" : "Fred Luddy",
"category" : "Network",
"close_code" : "null",
"cmdb_ci" : "FileServerFloor2",
"contact_type" : "null",
"description" : "User can't get to any of his files on the file server.",
"escalation" : "Overdue",
"hold_reason" : "Awaiting Vendor",
"impact" : "1 - High",
"incident_state" : "On Hold",
"knowledge" : false,
"location" : "1050 Sunnyview Road Northeast, Salem,OR",
"notify" : "Do Not Notify",
"opened_at" : ISODate("2017-05-04T16:07:12Z"),
"opened_by" : "Joe Employee",
"priority" : "1 - Critical",
"problem_id" : "PRB0000007",
"reassignment_count" : NumberLong(1),
"section" : "web_service",
"severity" : "1 - High",
"short_description" : "Network file shares access issue",
"sla_due" : "UNKNOWN"
}


and you want to send the contents over syslog in a LEEF format, you can run:

db.webservice.aggregate(
{$project:{'*':1}}, {$out:{
format:'leef',
vendor:'default_vendor',
product:'$SonarG Source', product_version:'$number',
ignore_fields:["_id","SonarG Source","number"],
value_replace:[{from:'\n',to:'\\n'}],
fstype:'syslog',
host:'localhost',
loglevel:'notice',
facility: 'user',
protocol:'udp',
port:10516
}})


The steps performed above will produce the following:

Oct 13 13:27:39 localhost sonarw: LEEF:2.0|default_vendor|SonarDispatcher-Web-Service|INC0000002|59cf96748a7597550015e40f|IP Address=localhost#011__status=success#011active=true#011activity_due=2017-08-10 14:51:11#011approval=null#011assigned_to=Howard Johnson#011assignment_group=Network#011caller_id=Fred Luddy#011category=Network#011close_code=null#011cmdb_ci=FileServerFloor2#011comments=2017-08-03 16:13:23 - System Administrator (Additional comments)nAdded an attachmentnn#011comments_and_work_notes=2017-08-03 16:13:23 - System Administrator (Additional comments)nAdded an attachmentnn#011contact_type=null#011description=User can’t get to any of his files on the file server.#011escalation=Overdue#011hold_reason=Awaiting Vendor#011impact=1 - High#011incident_state=On Hold#011knowledge=false#011location=1050 Sunnyview Road Northeast, Salem,OR#011made_sla=false#011notify=Do Not Notify#011opened_at=2017-05-04T16:07:12#011opened_by=Joe Employee#011priority=1 - Critical#011problem_id=PRB0000007#011reassignment_count=1#011section=web_service#011severity=1 - High#011short_description=Network file shares access issue#011sla_due=UNKNOWN

Tab is the default delimiter. To specify a different delimiter, e.g. &, add delimiter: '&'.

### SonarMaprGateway

Integrating MapR audit logs with SonarGateway involves the steps described below.

### Note

Some of these steps are manual, as per your specific needs; others are carried out automatically by the sonar packages.

2. Set up the relevant audit in the MapR cluster.

3. Mount the HDFS on one of the MapR nodes, or on a machine external to the cluster.

4. Run expandaudit periodically on one or more of the MapR nodes.

5. Periodically run rsync to copy the logs from the HDFS mount to a local filesystem. This is required for rsyslog monitoring.

6. Run rsyslog to monitor the local file system for new log files.

7. Add MapR data source to DCAP Central.

cat <<'EOF' | sudo tee /etc/yum.repos.d/rsyslog.repo
[rsyslog_v8]
name=Adiscon CentOS-$releasever - local packages for $basearch
baseurl=http://rpms.adiscon.com/v8-stable/epel-7/$basearch
enabled=1
gpgcheck=0
gpgkey=http://rpms.adiscon.com/RPM-GPG-KEY-Adiscon
protect=1
EOF

Upgrade rsyslog:

sudo yum update rsyslog

#### Setting up audit on MapR

For up-to-date instructions, please refer to the MapR documentation. In general, getting audit logs from a MapR cluster requires the setup of auditing on three levels: cluster, volume, and object. An object can be a table or a folder to be audited. Enable audit from the DCAP Central Application, or via a CLI. The following examples use the MapR CLI:

• Cluster Level

To enable logging for cluster management operations:

maprcli audit cluster -enabled True

To enable filesystem and table operations:

maprcli audit data -enabled True

This only allows data operations logging in volumes.

• Volume Level

To enable actual logging per volume:

maprcli volume audit -name <volume> -enabled true <options>

To get a list of volumes:

maprcli volume list -columns volumename

To enable full auditing for all volumes:

for volume in $(maprcli volume list \
    -columns volumename | tail -n +2); do
    maprcli volume audit -name $volume -enabled true -coalesce 1 -dataauditops +all
done

Get info on the volume (verify that it is being audited):

maprcli volume info -name <name> -json

Alternatively, use the hadoop mfs -ls command:

hadoop mfs -ls /

The output includes three capital letters (immediately after the permissions) that correspond to Compression, Encryption, and Auditing for each entry. A means enabled, U means disabled. If the third letter is A, then auditing is enabled. Example:

hadoop mfs -ls / | grep /tables
vrwxr-xr-x Z U A 1 root root 1 2017-07-24 15:53 268435456 /tables

In the above example, auditing is enabled.

• Object Level

To enable actual logging per directory/file/table:

hadoop mfs -setaudit on <directory|file|table>

To view all hadoop directories:

hadoop mfs -ls /

Note: Setting audit on a hadoop directory can be non-recursive; audit needs to be set for each directory to be audited. To set up auditing for all objects, run the following:

hadoop mfs -lsr / | awk '/rw/ {print $12}' | xargs -n 1 hadoop mfs -setaudit on


After this command is run, the auditing flag will be A for each directory/file/table in the system.
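Checking the audit flag programmatically can be sketched like this. This is an illustrative Python helper, assuming the column layout shown in the `hadoop mfs -ls` example above:

```python
# Sketch: parse the Compression/Encryption/Audit flag columns of a
# `hadoop mfs -ls` output line (A = enabled, U = disabled), assuming the
# flags immediately follow the permissions column as shown above.

def audit_enabled(ls_line: str) -> bool:
    fields = ls_line.split()
    # fields[0] is the permissions string; the next three single-letter
    # columns are Compression, Encryption, and Auditing.
    compression, encryption, audit = fields[1], fields[2], fields[3]
    return audit == "A"

line = "vrwxr-xr-x Z U A 1 root root 1 2017-07-24 15:53 268435456 /tables"
print(audit_enabled(line))  # True
```

A helper like this could be combined with `hadoop mfs -lsr /` output to report any paths where auditing is still disabled.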

#### Mounting HDFS

If SonarMaprGateway is to be installed and run on one of the MapR nodes, ensure that this node has the 'mapr-nfs' service running and that the MapR file system is mounted on '/mapr'. If you intend to run SonarMaprGateway on a machine external to the cluster, mount the MapR file system to the '/mapr' folder by using 'mapr-loopbacknfs'. For more information, see http://maprdocs.mapr.com/home/AdministratorGuide/c_POSIX_loopbacknfs_client.html.

#### SonarMaprAgent

The basic MapR logs contain only references to users' names, folder paths, etc., and not the actual names and paths. To get the necessary details in the logs, expand them using the MapR expandaudit utility. Install the sonar-mapr-agent package on one or more MapR nodes; this package sets up a periodic cron job on the node that expands the logs.

#### sonar-mapr-gateway Package

Install the sonar-mapr-gateway package on a machine with the MapR file system mounted (see above). Update the target DCAP Central machine IP address and restart rsyslog for the change to take effect.

Install package:

sudo yum install sonar-mapr-gateway


Update the target DCAP Central machine IP:

sudo vi /etc/rsyslog.d/sonar-mapr-gateway.conf


Replace "SonarG IP" with the IP address of your DCAP Central machine.

Restart rsyslog:

sudo systemctl restart rsyslog


Performing the steps described above will achieve the following:

1. Add periodic rsyncs of the audit files from the MapR file system to the local file system.

2. Set up the rsyslog service to monitor the local file system folder and send the data to SonarGateway on the DCAP Central machine.

Note: Ensure that you have an open connection from the host where sonar-mapr-gateway is installed to the DCAP Central host; configure your firewall accordingly.

#### Add MapR data source on the DCAP Central Machine

1. Add the MapR data source to SonarGateway:

sudo vi /etc/rsyslog.d/sonar/gateway/sonargateway.conf


Uncomment this line:

for x in {1..100}; do for y in {1..10}; do logger -t RandomStuff -p user.warn "$(shuf -n250 /usr/share/dict/words | paste -d, -s)"; done; sleep 1; done

In the first example, random characters are sent; a compression ratio of ~1.25X can be seen for the random case. In the second example, random words from a dictionary are sent. More redundancy is expected here, as can be seen by the compression ratio of ~1.6X.

#### Validating the Connection

On sonargateway:

systemctl restart rsyslog
yum install -y lsof
# should see something like the following on sonargateway
lsof -i :6514
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
rsyslogd 9688 root    6u  IPv4  36936      0t0  TCP *:syslog-tls (LISTEN)
rsyslogd 9688 root    7u  IPv6  36937      0t0  TCP *:syslog-tls (LISTEN)
touch /tmp/6514_tls_test.txt
tail -f /tmp/6514_tls_test.txt

On the agent, check the connection by writing to port 514 locally:

yum install -y nmap-ncat
echo "<83>Oct 21 15:57:38 agent agetty[79106]: Hello from Agent" | nc localhost 514
## if you installed more than one agent, run this on the other agent
echo "<83>Oct 21 15:57:38 agent agetty[79106]: Hello from Agent 2" | nc localhost 514

The following should appear on the sonargateway machine via tail -f /tmp/6514_tls_test.txt:

{ "Source Machine":"agent","Timestamp":"1508601458","Message":" Hello from Agent","Facility":"authpriv","Severity":"err","Program Name":"agetty" }
{ "Source Machine":"agent","Timestamp":"1508601458","Message":" Hello from Agent 2","Facility":"authpriv","Severity":"err","Program Name":"agetty" }

If multiple agents have connections, each connection can be seen on the sonargateway machine by using lsof:

lsof -i :6514
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
rsyslogd 9688 root    6u  IPv4  36936      0t0  TCP *:syslog-tls (LISTEN)
rsyslogd 9688 root    7u  IPv6  36937      0t0  TCP *:syslog-tls (LISTEN)
rsyslogd 9688 root   10u  IPv4  37005      0t0  TCP [ip]:syslog-tls->[dns]:42664
rsyslogd 9688 root   12u  IPv4  37011      0t0  TCP [ip]:syslog-tls->[dns]:58630

Assume that
rsyslog is installed on the master machine that created the certificates, and that this machine can access two agent machines at 54.153.107.32 and 52.53.165.0. Append the following lines to the end of /etc/rsyslog.conf (and remove them afterwards):

module(load="imuxsock") # provides support for local system logging
module(load="imklog")   # provides kernel logging support (previously done by rklogd)
$WorkDirectory /var/lib/rsyslog     # where to place spool files
$ActionQueueFileName fwdRule1       # unique name prefix for spool files
$ActionQueueMaxDiskSpace 1g         # 1gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on       # save messages to disk on shutdown
$ActionQueueType LinkedList         # run asynchronously
$ActionResumeRetryCount -1          # infinite retries if host is down
*.* @@54.153.107.32:514
*.* @@52.53.165.0:514

Run the following commands on master:

sudo systemctl restart rsyslog
sudo bash
logger -t SomeProgram -p user.notice "Called without required number of parameters"
logger -t UnAuthProgram -p auth.warning "Tried to access some restricted resource"

Return to the sonargateway machine and the tail -f /tmp/6514_tls_test.txt output. Lines should appear for each of the four commands performed above, with each message duplicated because rsyslog is forwarding to two agents:

From Command 1 –> { "Source Machine":"derek","Timestamp":"1508952269","Message":" imtcp: module loaded, but no listeners defined - no input will be gathered [v8.30.0 try http://www.rsyslog.com/e/2212 ]","Facility":"syslog","Severity":"err","Program Name":"rsyslogd" }
From Command 1 –> { "Source Machine":"derek","Timestamp":"1508952269","Message":" imfile: no files configured to be monitored - no input will be gathered [v8.30.0 try http://www.rsyslog.com/e/2212 ]","Facility":"syslog","Severity":"err","Program Name":"rsyslogd" }
From Command 1 –> { "Source Machine":"derek","Timestamp":"1508952269","Message":" [origin software="rsyslogd" swVersion="8.30.0" x-pid="9529" x-info="http://www.rsyslog.com"] start","Facility":"syslog","Severity":"info","Program Name":"rsyslogd" }
From Command 1 –> { "Source Machine":"derek","Timestamp":"1508952269","Message":" imudp: module loaded, but no listeners defined - no input will be gathered [v8.30.0 try http://www.rsyslog.com/e/2212 ]","Facility":"syslog","Severity":"err","Program Name":"rsyslogd" }
From Command 1 –> { "Source Machine":"derek","Timestamp":"1508952269","Message":" imtcp: module loaded, but no listeners defined - no input will be gathered [v8.30.0 try http://www.rsyslog.com/e/2212 ]","Facility":"syslog","Severity":"err","Program Name":"rsyslogd" }
From Command 1 –> { "Source Machine":"derek","Timestamp":"1508952269","Message":" imudp: module loaded, but no listeners defined - no input will be gathered [v8.30.0 try http://www.rsyslog.com/e/2212 ]","Facility":"syslog","Severity":"err","Program Name":"rsyslogd" }
From Command 1 –> { "Source Machine":"derek","Timestamp":"1508952269","Message":" imfile: no files configured to be monitored - no input will be gathered [v8.30.0 try http://www.rsyslog.com/e/2212 ]","Facility":"syslog","Severity":"err","Program Name":"rsyslogd" }
From Command 1 –> { "Source Machine":"derek","Timestamp":"1508952269","Message":" [origin software="rsyslogd" swVersion="8.30.0" x-pid="9529" x-info="http://www.rsyslog.com"] start","Facility":"syslog","Severity":"info","Program Name":"rsyslogd" }
From Command 1 –> { "Source Machine":"derek","Timestamp":"1508952270","Message":" SELinux is preventing /usr/sbin/rsyslogd from name_bind access on the tcp_socket port 10558. For complete SELinux messages. run sealert -l 78d7d7ec-2c88-4947-acdf-c6fbe1beed08","Facility":"user","Severity":"err","Program Name":"setroubleshoot" }
From Command 1 –> { "Source Machine":"derek","Timestamp":"1508952270","Message":" SELinux is preventing /usr/sbin/rsyslogd from name_bind access on the tcp_socket port 10558. For complete SELinux messages. run sealert -l 78d7d7ec-2c88-4947-acdf-c6fbe1beed08","Facility":"user","Severity":"err","Program Name":"setroubleshoot" }
From Command 2 –> { "Source Machine":"derek","Timestamp":"1508952313","Message":" derek : TTY=pts/34 ; PWD=/home/derek/src/ca ; USER=root ; ENV=LD_LIBRARY_PATH=/opt/rh/devtoolset-6/root/usr/lib64:/opt/rh/devtoolset-6/root/usr/lib::/usr/local/lib:/usr/local/lib PATH=/usr/lib/ccache:/opt/rh/devtoolset-6/root/usr/bin:/usr/lib/ccache:/usr/lib64/qt-3.3/bin:/usr/lib64/ccache:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/home/derek/.local/bin:/home/derek/bin:/usr/local/go/bin:/usr/local/go/bin ; COMMAND=/bin/scl enable devtoolset-6 'bash'","Facility":"authpriv","Severity":"notice","Program Name":"sudo" }
From Command 2 –> { "Source Machine":"derek","Timestamp":"1508952313","Message":" derek : TTY=pts/34 ; PWD=/home/derek/src/ca ; USER=root ; ENV=LD_LIBRARY_PATH=/opt/rh/devtoolset-6/root/usr/lib64:/opt/rh/devtoolset-6/root/usr/lib::/usr/local/lib:/usr/local/lib PATH=/usr/lib/ccache:/opt/rh/devtoolset-6/root/usr/bin:/usr/lib/ccache:/usr/lib64/qt-3.3/bin:/usr/lib64/ccache:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/home/derek/.local/bin:/home/derek/bin:/usr/local/go/bin:/usr/local/go/bin ; COMMAND=/bin/scl enable devtoolset-6 'bash'","Facility":"authpriv","Severity":"notice","Program Name":"sudo" }
From Command 3 –> { "Source Machine":"derek","Timestamp":"1508952361","Message":" Called without required number of parameters","Facility":"user","Severity":"notice","Program Name":"SomeProgram" }
From Command 3 –> { "Source Machine":"derek","Timestamp":"1508952361","Message":" Called without required number of parameters","Facility":"user","Severity":"notice","Program Name":"SomeProgram" }
From Command 4 –> { "Source Machine":"derek","Timestamp":"1508952408","Message":" Tried to access some restricted resource","Facility":"auth","Severity":"warning","Program Name":"UnAuthProgram" }
From Command 4 –> { "Source Machine":"derek","Timestamp":"1508952408","Message":" Tried to access some restricted resource","Facility":"auth","Severity":"warning","Program Name":"UnAuthProgram" }

#### Event Format Parameters

Some event formats have parameters. The JSON event format can be configured to not be strict about using the backslash character as an escape, allowing malformed JSON strings such as "newhostuser" to be used. The default of the "escaping" parameter is true, i.e. strict escaping. By changing this parameter to false, you can relax the requirements on input strings:

"event_format": {
    "standard": "JSON",
    "escaping" : false
}

### SonarLogstash

As a rule of thumb, SonarGateway is approximately one order of magnitude faster and more scalable than Logstash; for new DCAP Central implementations, it is recommended to use SonarGateway and rsyslog rather than Logstash. However, if you require Logstash infrastructure, this section provides instructions for adding DCAP Central as a target to receive logs through Logstash.

You can integrate any of your Logstashes directly with DCAP Central, or forward from your Logstash agents to a centralized Logstash process living on the DCAP Central machine or another machine. Follow the instructions below for any Logstash that will be writing directly to SonarW.

First, place the sonarw.gem module in your logstash modules directory. On a Red Hat 7 system with Ruby version 1.9, the path is /share/logstash/vendor/bundle/jruby/1.9.

The logstash config file will contain lines similar to these, where all log lines go to the elasticsearch server on port 9200:

output {
    elasticsearch { hosts => ["localhost:9200"] }
}

Occasionally, different log lines go to different output modules.
In the example below, Apache logs are directed to the Elasticsearch server on port 9200, while other logs are directed to Nagios:

output {
    if [type] != "apache" {
        nagios { }
    } else {
        elasticsearch { hosts => ["localhost:9200"] }
    }
}

In order to redirect log messages to DCAP Central instead of (or in addition to) Elasticsearch, replace (or add to) the lines defining the Elasticsearch output with lines defining the DCAP Central output. For example, the first definition above should read:

output {
    sonarw {
        collection => "apache_logs"
        database => "syslogs"
        uri => "mongodb://user:pass@localhost:27117/?authenticationSource=admin"
    }
}

And the second example should be:

output {
    if [type] != "apache" {
        nagios { }
    } else {
        sonarw {
            collection => "apache_logs"
            database => "syslogs"
            uri => "mongodb://user:pass@localhost:27117/?authenticationSource=admin"
        }
    }
}

Note: You need to allow the SonarW component within DCAP Central to receive connections from your Logstash sources; consult jSonar support for more information.

### Configuration file Reference

Any event can be sent (traditionally, using the syslog protocol) to a DCAP Central system to be stored and queried. However, events need to be transformed and directed to the desired database collection. You can configure SonarGateway to both direct incoming events into the DCAP Central storage system and reshape them in terms of data types and filtering. You can filter whole messages or redact fields in the message. You can also calculate alternative field values or add new fields, saving time when you query for the message.

Events may come in many formats. SonarGateway always converts events to a list of fields and their values, regardless of the format (JSON, LEEF, CSV, XML, and others). You can configure SonarGateway to examine specific fields and their values to decide how to store the data.
Field values can by themselves be a list of field/value pairs or arrays, allowing a faithful representation of structured formats such as JSON and XML.

The DCAP Central system always runs the rsyslog daemon. The configuration files in /etc/rsyslog.d/sonargateway instruct it to run a SonarGateway service instance once events come in on a given TCP connection, and feed any incoming data to that instance. The instance runs as long as the rsyslogd service runs. The SonarGateway instance, configured by JSON-formatted files in /etc/sonar/sonargateway, parses the events coming to it from rsyslogd, filters them, reshapes them, and sends them to SonarW databases and collections.

In the text below, "Message" denotes an rsyslogd-sourced line of text, "Event" denotes a logical event from an application, an OS or a network appliance, and "Field" denotes a component of the Event. Note that one event, and even one field, can be large enough to span multiple rsyslogd messages. The configuration files specify how to handle these "multiline" event scenarios.

#### Command Line Parameters

sonargateway is normally invoked directly by the rsyslogd that runs on the DCAP Central system. It is not run by a user. It accepts the following command line parameters:

--config arg – Configuration file (required). This is the only required parameter.

Other optional parameters are:

--help – Produce help message
--version – Display version number and exit
--verbose – Display verbose debug information when running
--validate – Validate the configuration file, and exit
--flush_interval arg – How often to flush data to SonarW
--stats_interval arg – How many seconds between sending operating statistics to SonarW, to database sonar_log, collection sonargateway
--delay_depot_dir arg – Directory to store delayed messages
--input_delimiter arg – A string marking the end of an rsyslogd-provided message

#### Configuration File

A SonarGateway instance is configured with a configuration file in /etc/sonar/sonargateway. Rsyslogd runs the instances and provides the configuration file upon the arrival of the first event over the network. Under heavy load, rsyslog may start multiple instances to handle the same event source, and the same configuration file will be used.

#### General Structure

The configuration file is a JSON Objects Array. The following types of array members are present:

Global Setting – Parameters regarding all of the events handled by the instance.

Global Fields – Instructions for transforming fields that are always generated in the database object, regardless of what the event was.

Common Fields – For convenience, if a field is present in many, but not all, of the events, the field transformation definition can be specified in this section and used later when handling a specific event type.

Rule Set – This section can appear repeatedly. It describes how to handle an event - what fields to expect, how to transform them, and which database and collection they should go into.

#### Field Translation

If a field appears in a message and you want it to be present as a string in the stored message in the database, you don't need to declare any translation. By default, all fields of an event are written to the database. However, you may want to remove, rename, redact, concatenate, check, and do other transformations on a field. You may also want to generate a new field that does not exist in the event. Field transformations allow you to do that.
The general format of a transformation is:

"Field-Name" : { "Attribute": "value", "Attribute": "value", ...}

For example:

"ETime": { "type": 9, "Rename": "Event Time" }

This means that the "ETime" field in the event should be stored in the database as a date, and named "Event Time" instead of "ETime". Here is a list of all possible field transformation attributes:

1. type

Try to convert the field to the specified BSON type, given as a number. The list of supported BSON types:

• Double Precision - 1
• String - 2
• Boolean - 8
• Date - 9
• Null - 10
• 32-bit Integer - 16
• 64-bit Integer - 18

1. date_format

Specify the date format to be used to convert the field text into a date. Using this attribute automatically implies specifying "type": 9. The format string follows the POSIX.1-2001 strptime() definition. For example:

"global_timestamp": { "date_format": "%Y-%m-%d %T" }

1. Rename

Save the field in the DB under a different name. For example:

"ETime": { "type": 9, "Rename": "Event Time" }

Means that although the event contains a field called "ETime", in the SonarW DB the field will be called "Event Time".

1. generate

Sometimes you want to generate a new field in the Event, if it doesn't appear. For example:

"Sender": { "generate": true, "default" : "FireWall Monitor" }

This means that if the field "Sender" does not appear in the event, a field will be generated with the value given. Note the value type is boolean, so you should not use quotes around the true or false value. Specifying false does not have any effect on the final stored event.

1. null

Set the string value that represents a null. For example, if you want the value of the field "Sender" to be stored as null if the event contains "N/A", use:

"Sender": { "null": "N/A" }

If an incoming event has "N/A" as the value of "Sender", the stored event will contain the null DB type.

1. default

Set the default value if the event contains an empty value.
This will not have an effect if the field is completely missing. If you want the field to appear in all cases, add the "generate" attribute as shown above. For example:

"Sender": { "generate": true, "default" : "FireWall Monitor" }

If an incoming event has "Sender" as an empty string, the stored event will contain "FireWall Monitor".

1. redact

Redact the field in the DB message. For example:

"ETime": { "redact": true }

Means that although the event contains a field called "ETime", it will not be stored in the DB.

1. extract

Sometimes a field can contain further fields. For example, in events generated by MS SQL Server, the field "Message" contains further event fields. For example:

"Message" : { "redact": true , "extract": "MSSQL" },

If the Event contains a field called "Message", the field itself will not be stored in the SonarW database (because of the redact attribute) but it will be parsed according to the MS SQL format, and the contained fields will be added to the message.

In addition, some fields can contain text that implies a set of one or more key-value pairs, and you may want each key to be a field in the message with the associated value. Specify a regular expression with an even number of capture groups to grab the field names and their values from the message. For example:

"Message" : { "redact": true , "extract": ".*\[(CLIENT): (.*)\]" },

If the Event contains a field called "Message" with the value

"Login failed for user 'WIN-K\Joe'. Connection made using authentication. [CLIENT: 73.5.4.12]"

a new field called "CLIENT" will be generated in the database with the value "73.5.4.12".

Note: Be careful specifying the field name capture group; a mistake can lead to a very large number of field names instead of the one you want. The best practice is to use a capture group containing the field name as in this example.

Note: Fields generated by this attribute are normal fields for all purposes and you can define a transformation for them.
1. update

Sometimes you want to update a document in SonarW instead of creating a new one. To update a document, a key field must be specified by using the "update" attribute. For example:

"SessionID" : { "update": true }

The SessionID field of the event is used as a key to select which document in the SonarW collection to update. The values of all the other fields in the event will be used to update the respective values in the SonarW collection. If an event with this ID does not exist, a new record will be created. Note that the collection that is the target of the event must not be an ingested one; see the "do_not_ingest" global parameter.

1. set_on_insert

Continuing the case of "update" above, sometimes you would like a field value to remain constant even if the SonarW record gets updated. Use set_on_insert to mark an event field to be stored once and never changed. For example:

"Original Arrival Time" : {
"set_on_insert": true,
"eval" : "$eventTime"
}
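The interplay of "update" and "set_on_insert" can be sketched in plain Python. This is a toy model of the assumed semantics, not the gateway's actual code; the `collection` dict and `upsert` helper are illustrative only:

```python
# Toy model: SessionID is the update key; fields marked set_on_insert
# keep their first-seen value across later updates.
collection = {}  # SessionID -> stored document

def upsert(event, key="SessionID", set_on_insert=("Original Arrival Time",)):
    doc = collection.setdefault(event[key], {})
    for field, value in event.items():
        if field in set_on_insert and field in doc:
            continue  # never overwrite a set_on_insert field
        doc[field] = value
    return doc

upsert({"SessionID": 7, "Original Arrival Time": "10:00", "State": "open"})
upsert({"SessionID": 7, "Original Arrival Time": "10:05", "State": "closed"})
print(collection[7]["Original Arrival Time"])  # 10:00 (kept from the insert)
print(collection[7]["State"])                  # closed (updated)
```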

1. eval

Sometimes it is necessary to calculate new field values, or completely new fields. There are several functions you can call, and their return values will become the value of the field. The list of evaluators and examples for them are given in the next section. However, here are some basic principles.

One way to use an evaluator is to generate a new field based on existing ones:

"Full Name" : {
"generate": true,
"eval" : ["concat", "$First Name", " ", "$Last Name"]
}


The field “Full Name” will be in the stored event, and it will be a combination of the values of “First Name” and “Last Name”.

Another way is to modify a field value, if it already exists; if not, the field will not be in the message:

"Client Address" : {
"eval" : ["IP_address", "$Client Address"]
}

This will try to convert the value of the field Client Address into an IP address, if it is not already one. The eval attribute value is either a string, which is the name of the function to call (e.g. "eval": "now"), or an array of strings if the function has parameters (e.g. "eval" : ["hostname", "$Client IP"]).
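These principles can be illustrated with a small Python sketch. The `run_eval` helper and `funcs` table are hypothetical stand-ins for the assumed dispatch semantics (a bare name, or an array of name plus arguments, where "$name" arguments are field lookups); this is not gateway source:

```python
# Sketch of assumed eval dispatch: spec is "func" or ["func", arg, ...];
# arguments starting with "$" are looked up in the event, others are literals.
def run_eval(spec, event, funcs):
    if isinstance(spec, str):
        name, args = spec, []
    else:
        name, args = spec[0], spec[1:]
    resolved = [event.get(a[1:], "") if a.startswith("$") else a for a in args]
    return funcs[name](*resolved)

funcs = {"concat": lambda *parts: "".join(parts)}
event = {"First Name": "Ada", "Last Name": "Lovelace"}
print(run_eval(["concat", "$First Name", " ", "$Last Name"], event, funcs))
# Ada Lovelace
```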

#### Evaluators

1. Copy

Copy the value of a field into another field:

"Client IP" : {
"generate" : true,
"eval" : "$Source IP"
}

The field "Client IP" will be generated, and it will have the same value and type as the "Source IP" field.

1. rand

Generate a 64-bit random number. This is useful when a key in a collection needs to be unique. You can use the type attribute to force a double precision or 32-bit number instead of a 64-bit number. For example:

"Session ID": { "generate": true, "eval": "rand" }

1. now

Returns the current time. For example:

"Event Arrival Time": { "generate": true, "eval": "now" }

1. IP_address

Return an IP address as a string, given a list of event fields. The value of the first field that is either an IP address or a resolvable host name is returned. For example:

"Client IP Address" : { "generate": true, "eval": [ "IP_address", "$IP_address" , "$ClientHost" , "127.0.0.1" ] }

Suppose the IP_address field is supposed to contain an IP address. If this is the case, the "Client IP Address" field in the SonarW document will contain that address. If the field is missing or empty, sonargateway will try to resolve the value in ClientHost into an IP address. If that fails, the value used will be "127.0.0.1".

1. concat

Concatenate several field values and constant values into one. For example:

"Full Name" : { "generate": true, "eval": [ "concat", "$First Name" , " " , "$Last Name" ] }

The value of the field "Full Name" in the SonarW document will be the value of the event field "First Name", a space, and the value of "Last Name".

1. ifeq

Choose a value to be stored based on comparing two values, constants or event field values (or a mix). For example:

"Operation Permitted": { "generate": true, "eval": [ "ifeq", "allowed", "true", "1", "0" ], "type": 18 }

The value of the field "Operation Permitted" in the SonarW document will be 1, as a 64-bit number, if the value of the message field "allowed" is "true", or 0 otherwise.

1. ifempty

Choose a value to be stored based on checking if a field is empty or not.
For example:

"Succeeded": { "generate": true, "eval": [ "ifempty", "sql-error", "1", "0" ] }

The value of the field "Succeeded" in the SonarW document will be 1 if the field "sql-error" is empty or there was no such field, or 0 otherwise.

1. hostname

This is the reverse of the IP_address evaluator. Return an FQDN, given a list of event fields. The value of the first field that is a resolvable FQDN, or an IP address that resolves to an FQDN, is returned. For example:

"Client Host Name" : { "generate": true, "eval": [ "hostname", "$ClientHost", "$IP_address" , "localhost.localdomain" ] }

1. errno

Converts a numeric UNIX error code into a string with the error message. For example:

"Exception Description": {
"generate": true,
"eval": [ "errno", "$status" ]
}


The field "Exception Description" will be added to the event, and will contain the error message for the error code in the "status" field of the message.
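The kind of mapping the errno evaluator performs can be reproduced with the C library's strerror(), exposed in Python as os.strerror() (shown here only to illustrate the code-to-message translation; exact message text varies by platform):

```python
import os

# UNIX error code -> human-readable message, as the errno evaluator does.
print(os.strerror(2))   # on Linux: "No such file or directory"
print(os.strerror(13))  # on Linux: "Permission denied"
```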

1. oracle_error

Converts an Oracle error code into a string with the error message. For example:

"Database Error Text": {
"generate": true,
"eval": [ "oracle_error", "$RETURNCODE" ]
},

1. uniqueId

This feature enables several events coming in over time to share a unique numerical ID, which depends on a set of fields that exists in all of them. This helps with issuing SonarW queries and reports that use this ID as a foreign key. The evaluator can work in one of three modes: $set, $get and $erase.

Suppose the events coming in are TCP packet metadata, and there is in SonarW a "connection" table for connections and a "packet" table to track packets. Make a ruleset for the connection table, using a selector to select new connections only, and use:

"_id": {
"generate": true,
"eval": [ "uniqueId", "$set", "source-ip", "source-port", "dest-ip", "dest-port" ],
},


An _id will be generated for the combination of "source-ip", "source-port", "dest-ip", "dest-port". All further calls to uniqueId in the same sonargateway instance will yield the same ID, until $erase or another $set with the same field values is called.

Make another ruleset for the connection table, using a selector to select connection closing only, and use:

"_id": {
"generate": true,
"eval": [ "uniqueId", "$erase", "source-ip", "source-port", "dest-ip", "dest-port" ], }  $erase will still return the same ID, but further calls to $set or $get with the same field value combination will yield a different ID. This ensures that if you miss a connect or disconnect events (or both), the IDs will be unique per TCP session.

Finally, make a ruleset for the packets collection, and use:

"_id": {
"generate": true,
"eval": [ "uniqueId", "$get", "source-ip", "source-port", "dest-ip", "dest-port" ],
}

1. regex

Extract a string from a given field value using a capture group in a regular expression. For example:

"Database": { "generate": true, "eval": [ "regex", "schema", "(.*)/.*" ] }

If the field "schema" contains a value of the form "<db-name>/<table name>", the DB name will be returned.

1. toString

Convert an array of values (including objects) into a string. SonarGateway can handle structured data; sometimes you need this information in a string (for example, to perform textual searches). Pass the delimiter to put between array members and the delimiter to put between object fields. You can also name the object field members you would like to see in the string. For example:

"Position": { "generate":true, "eval": ["toString", ";", " ", "country", "province"] }

If the field "Position" was originally an array:

[ {"city":"Vancouver", "province":"BC","country":"Canada"}, {"city":"Toronto", "province":"ON","country":"Canada"}]

its content will be replaced with the string:

Canada BC;Canada ON

#### Global Setting

The global settings object always looks like this:

{
"global_settings": {
<global_attribute>: <value>,
<global_attribute>: <value>,
...
}
}

The sections below explain the various global settings possible.

#### Multi line support settings

Sometimes an event is split between messages. There are two ways to assemble the entire event. One is to merge a few messages together, examining each in turn. The other is to concatenate all the messages together and then split them. This processing happens before SonarGateway tries to parse the event format, find the fields and their values, etc.

1. Using the merge-messages method

Set the value of "merge_lines" to true, and specify how SonarGateway recognizes how to merge messages together.
You have to specify at least one of the following:

• start_event_regex - A regular expression that, if it matches, marks the current message as the first one in an event. All previous messages, if not processed yet, are merged and processed.

• end_event_regex - A regular expression that, if it matches, marks the current message as the last one in the event. All messages not processed yet, including the current one, are merged and processed.

And you can also specify, if necessary:

• do_not_end_event - If a message matches the given regular expression, it will not be the event's last message, regardless of anything else.

• event_end_preventor - If a message matches this regular expression, the event will not end, even if a message matching end_event_regex is received.

• max_lines_per_event - Maximum number of messages per event. After this many unprocessed messages, they are merged and processed.

• max_message_size - Maximum number of bytes that can compose one event. If a message comes in that brings the total bytes in all unprocessed messages over this number, the messages will be merged and processed.

For example:

{
"global_settings": {
....,
"multiline_settings" : {
"merge_lines": true,
"start_event_regex": "{\"SonarG Source\"",
"end_event_regex": "\"End Column\": 1 }",
"max_lines_per_event" : 20,
"max_message_size" : 1000
}
}
}

Messages will be collected, starting events matching the given start regex, until a message matching end_event_regex arrives. If a message matching start_event_regex arrives before end_event_regex, a new event will start. If a message matching end_event_regex arrives, even without a preceding message matching start_event_regex, an event will be generated and processed.

1. Using the concatenate-and-split method

Set the value of "merge_lines" to false, and specify a splitting regular expression. Wherever the expression matches, the accumulated message data will be split into a new event.
Capture groups will be dropped, as they indicate a delimiting string and not a part of any event. For example, this setup will merge all incoming messages into one long stream of data and split it into distinct events whenever " X " appears:

{
"global_settings": {
....,
"multiline_settings" : {
"merge_lines": false,
"split_at" : "(\\sX\\s)"
}
}
},

#### Events Ignore and Pre-processing setting

You can drop a whole message, or replace part of a message, using the preprocessor. The preprocessor array in the global_settings object allows you to do the following:

• Drop a whole message. Create a document with the field "drop" and a regular expression value. Every message matching the regular expression will be completely ignored.

• Replace a part of a message with an alternative text. Create a document with the fields "replace" and "with". For "replace", specify a regular expression with capture groups capturing the values to replace. For "with", specify an array of strings with the replacement values. The length of the list of replacements should be equal to the number of capture groups.

• Drop a part of the message text. Create a document with the field "replace", with a value of a regular expression with capture groups capturing the values to remove.

For example:

{
"global_settings": {
...,
"preprocessor" : [
{ "drop" : "DB.*Name" },
{ "replace" : "(oldField)=(\d*)", "with" : ["newField","000"] },
{ "replace" : "(rep.*Field)Name" }
]
}
}

This will drop all messages matching the regex "DB.*Name", will remove from all messages any text matching the regular expression "(rep.*Field)Name", and will replace in all messages text of the form "oldField=<some number>" with "newField=000".

#### Message Header Cleansing

Before rsyslogd messages are collected into an Event, you can scrub the message header to remove any transport-level artifacts.
In general, the rsyslogd on the DCAP Central system is configured to convert those artifacts into real event fields and remove them from the message text. However, sometimes it is not possible for rsyslogd to recognize that the start of the message is transport-related text to be removed. Therefore, you can configure a regular expression with one capture group as the message header cleanser. Everything captured by the group will be removed. For example:

{
"global_settings": {
"multiline_settings": {
"syslog_header_cleansers" : "regex:(PRI\\s\\d+)\\s+"
}
}
}

will remove the header from the message:

"PRI 2 LEEF:2.0jsonar-leef-generator23||dstPackets= src=fw.company.com"

and leave this well-defined LEEF to be further processed:

"LEEF:2.0jsonar-leef-generator23||dstPackets= src=fw.company.com"

#### Sonar DB Settings

sonar_URI - The MongoDB-style URI of the SonarW DB server to connect to.

target_db - Unless specified by a rule, the name of the SonarW database to write events to.

do_not_ingest - An array of strings specifying collections that will be updated, and therefore batch-style inserts cannot be used.

For example, the following defines the SonarW DB to use, defines a collection to be non-batched, and declares that the default DB to write to is "SIEM_DB":

{
"global_settings": {
…
"do_not_ingest": ["last_activity"],
"sonar_URI": "mongodb://user:pass@127.0.0.1:27117/admin",
"target_db": "SIEM_DB",
…
}
}

#### Global Fields

As explained above, the global_field_translations object specifies how to deal with fields that are present in all events, regardless of which rule handles them. For example:

"global_field_translations": {
"msgtxt": { "rename": "Message Text" }
}

In all events, regardless of which ruleset handles them, the field "msgtxt", if present, will be renamed to "Message Text".

#### Common Fields

The common_field_translations object specifies how to translate fields, but does not cause them to be translated unless another rule asks for that field.
This is a useful shortcut if you have a field that appears in many event types, but not all, and has a complex translation. Use the name of the common translation to apply the same translation to a field in a specific case. For example:

[
{
"common_field_translations": {
"msgsrc": {
"set_on_insert": true,
"host_name" : ["hostname", "source host", "source ip", "local host"],
"type": 2
}
}
},
{
"output_connection": {
"default_collection": "instance",
"collection_selectors": { "select_by_key_value": { "cs2": "MYSQL" } }
},
"Severity": { "type": 18 },
"Source": "msgsrc"
},
{
"output_connection": {
"default_collection": "session",
"collection_selectors": { "select_by_key_value": { "cs2": "ODBC" } }
},
"Severity": { "type": 18 },
"Source": "msgsrc"
}
]

Without the use of common_field_translations, instead of the value "msgsrc", you would need to again enter the long definition given in the common section. For example, the last entry would have to be the longer:

{
"output_connection": {
"default_collection": "session",
"collection_selectors": { "select_by_key_value": { "cs2": "ODBC" } }
},
"Severity": { "type": 18 },
"Source": {
"set_on_insert": true,
"host_name" : ["hostname", "source host", "source ip", "local host"],
"type": 2
}
}

### Rule Sets

#### Group and Unique label

As explained above, an event will be used by one rule of each group.
For example, if you have a lot of database session events coming in, and you have rules to sort them out to their respective database type, you can put all the rules in one group, and provide a catch-all rule with no selector to store any event not previously handled, like this:

{
"output_connection": {
…,
"select_by_key_value": { "Type": "MSSQL" },
"unique_label": "mssql_session",
"group_label": "sessions",
…,
},
"output_connection": {
…,
"select_by_key_value": { "Type": "MariaDB" },
"unique_label": "mdb_session",
"group_label": "sessions",
…,
},
"output_connection": { // This will catch all other events
…,
"unique_label": "other_session",
"group_label": "sessions",
…,
}
}

Note that each rule has to have a unique label, regardless of the group.

#### Selection

Using the selection part of the rule, you decide if the rule applies to the currently handled event. Only if the rule applies will the event be written to SonarW. The general format of the selection is:

{
"output_connection": {
... ,
"collection_selectors": {
<selector_type>: { <selector_parameter> : <selector_value> , ... }
}
},
... ,
}

If you specify more than one selector, the rule will be selected once any one of them matches.

1. select_by_key_value

Select the rule depending on specific field values. $or and $and operators can be used. The value should match the string exactly, or, if the string is in the form "/REGEX/", it should match the regular expression specified by REGEX. For example:

"collection_selectors": {
"select_by_key_value": {
"$or": [
{
"prop.code": "/20./"
},
{
"source": "company.com"
}
]
}
}


The rule will be selected if the field "prop.code" matches the regular expression, or if the field "source" has the value "company.com" exactly. For example, the JSON event:

{
"prop": {
"code" : 201,
"message" : "HTTP Code 201 - Object Created"
},
"source": "supply.com"
}


will match, because the first condition is met.

Instead of "$or", you can use "$and" , and then both fields should match. So the example JSON above would not match.
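The matching rules described above (exact match, the "/REGEX/" form, and $or/$and) can be sketched in Python. The `matches` helper models assumed semantics for illustration; it is not the gateway's implementation:

```python
import re

# Sketch: a selector value matches a field exactly, or as a regex when
# written "/REGEX/"; $or needs any condition to hold, $and needs all.
def matches(selector, event):
    def one(field, want):
        have = str(event.get(field, ""))
        if want.startswith("/") and want.endswith("/"):
            return re.search(want[1:-1], have) is not None
        return have == want
    if "$or" in selector:
        return any(matches(cond, event) for cond in selector["$or"])
    if "$and" in selector:
        return all(matches(cond, event) for cond in selector["$and"])
    return all(one(f, w) for f, w in selector.items())

event = {"prop.code": "201", "source": "supply.com"}
sel = {"$or": [{"prop.code": "/20./"}, {"source": "company.com"}]}
print(matches(sel, event))  # True: the regex condition holds
```

With "$and" in place of "$or", the same event would not match, since "source" is "supply.com" rather than "company.com".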

1. select_by_key_not_value

This selector behaves exactly like select_by_key_value , but the final result is reversed.

1. select_by_key

The output collection will be selected if any of the strings in the array exists in the event. For example, if the event deals with SQL activity, the collection will be selected:

"select_by_key": [ "SQL Statement", "SQL Result" ]

1. select_by_regex

Select the rule if the regular expression matches anywhere in the event. For example:

"collection_selectors": {
"select_by_regex": ".*Data.*Requests.*"
}


If any field in the event matches the regular expression, the rule will be selected.
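The behavior of select_by_regex can be sketched in a few lines of Python (assumed semantics, matching the regular expression against every field value; the helper and sample event are illustrative, not gateway source):

```python
import re

# Sketch: the rule is selected if the pattern matches any field value.
def select_by_regex(pattern, event):
    return any(re.search(pattern, str(value)) for value in event.values())

print(select_by_regex(".*Data.*Requests.*", {"msg": "Data Read Requests: 5"}))
# True
```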

#### Event Formats

There are several event formats sonargateway can parse. You have to specify one event format, and its parameters, for each rule, like this:

{
"output_connection": {
... ,
"event_format": {
"standard": <event_format>,
<event_format_parameter> : <event_format_value> ,
},
... ,
}


For example:

{
"output_connection": {
... ,
"event_format": {
"standard": "JSON",
"unwind" : "events"
},
... ,
}

• CEF

Treat incoming messages as CEF events:

"event_format": {
"standard": "CEF"
},


Note that custom fields (e.g. cs1) are saved in SonarW under the label provided in the corresponding label field (e.g. cs1Label). Field types are automatically set for pre-defined fields.

• LEEF

Treat incoming messages as LEEF version 1 or 2 events:

"event_format": {
"standard": "LEEF"
}


Field types are automatically set for pre-defined fields.

• CSV

Treat incoming messages as CSV file lines. Each field must be surrounded by double-quotes and fields must be separated by a comma. Newlines are allowed in fields. The parameter specifies which fields are present and their names. For example:

"event_format": {
"standard": "CSV",
"fields": [
"Timestamp",
"Service",
"Action Type",
"DB User",
"Collection Name",
"Database Name",
"Client IP",
"Event",
"Error"
]
}


Incoming messages will be treated as CSV lines with 9 fields, whose names are listed.
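The pairing of a quoted, comma-separated line with the configured field names can be sketched with Python's csv module (the sample line and its values are invented for illustration; this is not gateway code):

```python
import csv
import io

# The nine configured field names, in order.
names = ["Timestamp", "Service", "Action Type", "DB User", "Collection Name",
         "Database Name", "Client IP", "Event", "Error"]

line = ('"2017-10-25 17:26:48","sonarw","find","joe","sessions",'
        '"siem","10.0.0.7","query ran","none"')

row = next(csv.reader(io.StringIO(line)))  # parse one quoted CSV line
event = dict(zip(names, row))              # pair values with field names
print(event["Client IP"])  # 10.0.0.7
```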

• JSON

Treat incoming messages as JSON. The data must strictly adhere to RFC 7159, except for the treatment of escape characters, which can be relaxed. There are two parameters: the "unwind" string parameter, to extract the actual events from an array with the given name within the JSON, and the "escaping" boolean parameter that, when set to false, relaxes the RFC 7159 requirements about escape characters so that strings like "ADMIN\SERVER" can appear in the input. For example:

"event_format": {
"standard": "JSON",
"unwind": "events"
}


This will create many JSON events out of an arriving message. If the message contains the field "events" as an object array, each of those objects will become one event.
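Both behaviors can be illustrated with Python's own strict JSON parser, which stands in here for an RFC 7159-conforming parser (the field names are invented for illustration; the gateway's relaxed mode itself is not reproduced):

```python
import json

# A lone backslash, as in "ADMIN\SERVER", is not a valid RFC 7159 escape,
# so a strict parser rejects it; "escaping": false is meant to tolerate it.
try:
    json.loads(r'{"host": "ADMIN\SERVER"}')
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False
print(strict_ok)  # False

# "unwind": "events" would turn each element of the array into one event.
doc = json.loads('{"source": "db1", "events": [{"op": "login"}, {"op": "logout"}]}')
print(len(doc["events"]))  # 2
```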

• ORACLE_SYSLOG

Treat incoming messages as Oracle's audit message as they appear in syslog:

"event_format": {
"standard": "ORACLE_SYSLOG"
}


Note that Oracle auditing to XML uses a different format. Use SonarOracleGateway if you configure XML-based auditing.

• POSITIONAL

Use field position to give event fields names. You can optionally define the delimiters that separate words from one another. By default, the separators are spaces and tabs. For example:

"event_format": {
"standard": "POSITIONAL",
"delimiter": " \t," ,
"fields": { "Type" : 1, "Timestamp" : [2,3], "Thread Name" : 4, "Message" : [5], "Sender" : [-2,-1] }
}

The events will contain five fields: the first word will be the event type, the second and third words together constitute a timestamp, the fourth word will be the value of the "Thread Name" field, and all words from the fifth onward will be saved together in the "Message" field. In addition, the last two words will be saved in the field "Sender".
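The positional mapping above can be sketched in Python. The range semantics (1-based indices, [a,b] spanning, [a] meaning "to the end", negative indices from the end) are inferred from the description, and the sample log line is invented; this is not gateway source:

```python
import re

# Sketch of assumed positional semantics: positive indices are 1-based
# words, [a,b] joins words a..b, [a] joins word a to the end, and
# negative indices count from the end of the message.
def positional(line, fields, delims=" \t,"):
    words = [w for w in re.split("[" + re.escape(delims) + "]+", line) if w]
    def take(spec):
        if isinstance(spec, int):
            spec = [spec, spec]
        elif len(spec) == 1:
            spec = [spec[0], len(words)]
        lo, hi = spec
        idx = lambda i: i - 1 if i > 0 else len(words) + i
        return " ".join(words[idx(lo):idx(hi) + 1])
    return {name: take(spec) for name, spec in fields.items()}

fields = {"Type": 1, "Timestamp": [2, 3], "Thread Name": 4,
          "Message": [5], "Sender": [-2, -1]}
line = "WARN 2017-10-25 17:26:48 worker-3 disk nearly full on node7"
print(positional(line, fields)["Timestamp"])  # 2017-10-25 17:26:48
```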