Using SonarK

SonarK is available under the “Discover” icon in the DCAP Central home page.

When you first load SonarK you need to define which collection you want to use. If this is the first time you use SonarK click Management on the left and then click Index Patterns. To create a new index pattern click the + button and enter the Index name. Enter the database name, a dash, and the collection name - e.g. sonargd-instance. The metadata will be read and you can select the time field name which serves for time queries. Then click Create. You can also set this to be the default index.

Click on Discover to the left. The default index will be selected or you can use the drop down to select any collection to explore.

Select the timeframe on the top right - either a time shortcut or using a from-to range. The system creates a histogram of how many records are in the collection for the timeframe and also displays a cursor of the matched documents. The default is 500 documents but you can adjust this in Management -> Advanced Settings. You can double click on any bar in the histogram to drill down to a specific timeframe.

On the left side you will have the list of fields of this collection along with their type. Click on a field to see the five top distinct values of the records shown on screen. You can also click on Analyze entire selected data to get the top ten distinct values; note that this operation can take a longer time since the entire selected data set is analyzed, not just the cursor records currently showing on the screen.

The normal display in the main pane is for each documents in a concise form. Click on the triangle to open the document and see the fields. You can also select only certain fields that you want shown as a table. Click the Add button next to the field on the left or on the table icon in the middle section when the document is expanded. When showing a tabular view you can click the x to remove the column from the display and when you remove all columns the display reverts back to the concise view.

You can add filters to look at specific data. If you want all matches for a specific field value you can add it by clicking on the “plus” magnifying glass next to the value on the left or the main area. Click on the “minus” magnifying glass to select all documents that have a value different than the selected value. You can also type in serach criteria in the top search bar and the syntax conforms for the most part to the Kibana/Lucene syntax (minus some elements such as boosting). You can use both Lucene syntax and Kibana DSL syntax.

Use boolean operators to combine criteria, e.g:

Objects_and_Verbs: drop OR Objects_and_Verbs: create

Use ranges to select numeric ranges, eg. (note - TO is case sensitive):

Failed_Sqls: [1 TO 100]

or for an open-ended range:

Failed_Sqls: >=1

Since field names in DCAP Central can have spaces, replace a space with an underscore when making the search. For example, instance has a field name called “Objects and Verbs” and you can enter:

Objects_and_Verbs: drop

to get all documents that match /drop/ as a regular expression. Note that for string field all searches are regular expression searches.

Note that when you do not specify a field name the search will be a brute-force search on all fields of the documents and will take longer than if you specify it for a particular field.

To get all documents with an empty or null Session End do either:

NOT Session_End:*
or
NOT _exists_:"Session End"

For more syntax examples consult Kibana documentation at https://www.elastic.co/guide/en/kibana/current/discover.html.

SonarK supports a customizable and enriched syntax. Enhancements include:

  1. Support for field names that include spaces: Many fields include spaces in them - for example - “Objects and Verbs”. Kibana does not support spaces but in Sonark you can type in Objects_and_Verbs: drop and it will run the correct query.
  2. SonarK supports a rich and extensible syntax that allows you to gain from the analytic operator richness of DCAP Central. For example, if you want to see all events that have a DDL command type in LIKEGR$Objects_and_Verbs: “DDL Commands” - LIKEGR is a syntax extension (LIKE GROUP) that will pull out all elements from the DDL Commands group which mihgt have % and perform a LIKE operation with all these values on the Objects and Verbs field.

Querying Documents

There are two possible ways to query your data through SonarK:

1. Lucene Query Syntax SonarK’s search implements a subset of the Lucene Query Syntax for quering documents. The lucene query syntax is not implemented in it’s entirety due to underlying architectural differences between Elasticsearch and SonarW which will be discussed in the limitations section below. In general, with an identical dataset loaded into Elasticsearch, SonarK will return the same result set with some exceptions. These exceptions are detailed below.

2. Elasticsearch DSL Query SonarK currently supports all autogenerated dsl queries produced when applying filters in the Discover section of Kibana. This includes:

  • BOOL
  • MATCH
  • MATCH_ALL
  • MATCH_PHRASE
  • RANGE
  • EXISTS
  • QUERY_STRING
  • SIMPLE_QUERY_STRING

For further information on forming valid DSL queries please reference, Elasticsearch Query DSL

Exceptions to Lucene Syntax support:

  • Wildcard Queries: When performing a wildcard query, Elasticsearch has the ability to search each term within a multi term string individually, matching any term within the string that matches the wildcard search. This functionality is not present within SonarW. In order to maintain an intuitive API, SonarK checks if the query string is present within the entire field. For example:

    Elasticsearch :
    \*mber will return any document with a string field ending with mber.
    
    SonarK :
    \*mber query will return any document with a string field containing, mber.
    
    The same behaviour is true of a query like mber\*
    
  • Term Boosting: This feature of Elasticsearch is not implemented in this version of SonarK

  • SonarK IDs: SonarK has internal IDs for each document detonated by _id, these are different from SonarW’s ids. SonarW’s IDs will be displayed in a field called lmrm__id. SonarW’s IDs can be queried using both the _id and lmrm__id. SonarK’s internal _ids cannot be queried, querying _id field will always query sonarW’s IDs.

  • Exists queries: If you want to query for a field that is null or does not exist use one of these two options:

    NOT field:*
    NOT _exists_:"field"
    
    for example to search for sessions that do not have a Session End use:
    
    NOT Session_End:*
    or
    NOT _exists_:"Session End"
    

Full Equality / Exact match support:

SonarK has the ability to query using a full equality. The syntax for finding an exact match is ‘=’.

Examples:

DB_User_Name: =QA
NOT DB_User_Name: "=USER A"

The first example: DB_User_Name: =QA will query The field “DB User Name” for any results that match QA exactly. A user called “QA A” will not match.

The second example: NOT DB_User_Name: "=USER A" will query The field “DB User Name” for any results that do not match “USER A” exactly. A user called “USER ADAM” will match.

Please make sure to use the ‘=’ symbol conjoint to the value you wish to query. Failing to do this may lead to unexpected results:

Wrong usage examples:

DB_User_Name: = QA
NOT DB_User_Name: "= USER A"

The first example: DB_User_Name: = QA will query The field “DB User Name” for any results that match an empty string exactly. In addition it will query the whole collection for QA. This is equivalent to DB_User_Name:"=" OR QA.

The second example: NOT DB_User_Name: "= USER A" will query The field “DB User Name” for any results that do not match ” USER A” exactly, notice the leading space. A user called “USER A” will match.

Out of the Box Syntax Extensions:

  • PCTOVR$field1$field2: 50 - Returns documents where 100*field1/(field1+field2) > 50
  • INGR$field1: group_description - Returns true when field1’s value is in the group with the description.
  • INGR$f1$f2$f3$f4$f5: group_description - Similarly but for a tuple group.
  • LIKEGR$field1: group_description - Returns true when field1’s value matches any group member with the description (i.e. the group member can gave % wildcards) or can be a partial match.
  • LIKEGR$f1$f2$f3$f4$f5: group_description - Similarly but for a tuple group.
  • OV: group description - Returns true when the Objects and Verbs field has one of the group members in the group with the group description. Applies both to a group of objects and a group of verbs. The word(s) in the group members need to be contained in the string but there may be other words.
  • CGR$field1: group_description - Returns true when field1’s value fully contains a member of the group with the description. The word(s) in the group members need to be contained in the string but there may be other words.
  • ReduceNoise: For sessions, exceptions and instance, performs noise reduction aggregating multiple occurrences of the same event on the same day into one.

Adding your own Syntax Extensions:

You can extend the syntax yourself by adding syntax expression documents to the lmrm__sonark_syntax collection in the lmrm__sonarg database. You specify the name, number of parameters and the expression that needs to be performed. For example:

use lmrm__sonarg

db.lmrm__sonark_syntax.insert({"name": "FOO", "#params": 2, "query": "{\"$expr\":
{\"$gt\": [{\"$divide\": [\"$$$1\",{\"$add\": [\"$$$1\",\"$$$2\"]}]}, {\"$divide\":
[$$value,100.0]} ]}}"})

Note that you can have a syntax FOO with different numbers of parameters (i.e. you can have a FOO with 2 parameters and a FOO with 3 parameters at the same time) but you cannot have multiple entries with the same name and number of parameters.

Once you add a new syntax extension to the collection you should wait a minute for the syntax to be added to SonarK

Fields with multiple types in their values

The fields in SonarK are typed, SonarK’s types are of the following: number, boolean, string and date.

In order to present the fields in the best way possible, all multitype fields are presented with their raw values as strings.

When you query a multi-type field, all the types are queried. The query is based on their types, so you cannot use string operators (like wildcards) and get the number results, only the string results.

In order to be able to use aggregation features that are based on type we created a type specific field per type in a multi-type field.

Every typed field is named using the following template: <field name>__lmrm_<type>

For example a field names port with strings and numbers will have 2 single type fields: 1. port__lmrm_number - this field can be used for queries and aggregations for numbers 2. port__lmrm_string - this field can be used for queries and aggregations for string

If you want to query only a specific type within a field you can use these special fields to run those queries.

Note

Any field that has a Long type in sonarW is treated like a multi-type field. This is in order to display long numbers accurately and not cause rounding issues.

Example:

In our example you have a field called “port” which can contain the port as a string: “27117” or as a number: 27117.

When you query the field port: port: 27117 it will find both the string “27117” and the number 27117.

When you query the field port using a subset of the string: port: 271* it will find only the string “27117”. This is because numbers cannot be queried using wildcards in kibana.

The following two fields get created alongside the port field: port__lmrm_number and port__lmrm_string.

If you wish to query only the numbers you can use the following query:

port__lmrm_number:27117

The results of this query will only be numbers and not strings.

Exceptions to Visualization support:

SonarK Supports most functionality in the visualization tab.

The following metrics are supported by SonarK:

  • Average
  • Count
  • Max
  • Median
  • Min
  • Percentile Ranks
  • Percentiles
  • Standard deviation
  • Sum
  • Unique count
  • Cumulative Sum
  • Derivative
  • Moving Avg
  • Serial Diff

The following bucket aggregations are supported by SonarK:

  • Date Histogram
  • Date range
  • Filters
  • Histogram
  • IPv4 Range
  • Range
  • Terms

SonarK differs from Kibana in the following bucket aggregations:

  • Date range, IPv4 Range and Range do not support overlapping ranges. Ranges 0-5 with 5-10 will work. Ranges 0-5 with 0-10 will not work and will display an error on the screen.
  • Terms can only be used as the first or first and second buckets when visualizing using multiple buckets.
  • Terms can only be used up to twice when visualizing using multiple buckets.

Note

Filters will run a query per filter and can be slower than the rest of the aggregations. This gets worse when compounded with another Filters bucket aggregation. Only use Filters when other buckets cannot fulfil your needs, or you are not worried about performance.

SonarK URLs as Menu Items

To add a SonarK visualization, search or dashboard as a custom menu item copy the URL when working in SonarK and add it to the menu list using the report builder’s Add to Menu button (Add Report -> Add to Menu).

Known Issues

  1. SonarK cannot be used on a secondary - you must use SonarK on the node that is currently the primary.
  2. The admin user’s setting for “SonarK Max Query Time” controls the max query time for all SonarK users.