When a node scan is scheduled, the files to be scanned are derived from the file scan options of every node group the node is a member of, together with any files referenced by file checks in attached policies. This article describes how the final set of file scan options is derived and sent to the agent or connection manager to perform a scan.

Overview

File scan options are sent to the connection manager upon every scan request and they instruct the connection manager which files and directories to traverse and collect. In simple cases, a node may only inherit basic scan options of common system paths from those UpGuard prescribes, in which case the resulting files that appear in a node scan should be recognizable. In more complex cases, it may be difficult to determine why a particular file was collected during a node scan, especially if multiple sources are linked to the node group.

Scan options are collected from a number of sources:

  • A node inherits all file scan options from all node groups it is a member of;
  • Sometimes scan options can cancel each other out if paths from one group use an exclusion rule on top of file scan options from another group;
  • For conflicting rules of inclusion and exclusion, a user-specified order of precedence may be applied;
  • Policies that are assigned to node groups that contain file-based checks also have these file paths inherited.

This article provides a number of worked examples to demonstrate how a final file scan instruction set is derived and sent to a connection manager. This page can be particularly useful if you are designing your scan options and either cannot explain why certain files are being scanned or are hitting a file scan limit.

Example 1: Scan options from single node group membership

In this example, we will use a single Linux node. All Linux-based nodes are automatically added to the Linux node group, which contains a number of default file scan options. The following is an adapted subset of the official file scan option defaults, which we will use for this example:

  • R1. /home/user/app/app.json
  • R2. /etc/*.conf
  • R3. /etc/**/*.conf

Here, our example node will inherit these paths from the Linux node group and scan:

  • the specific file at /home/user/app/app.json (rule 1)
  • any file ending in .conf in the /etc directory (rule 2)
  • any file ending in .conf in subdirectories of /etc (rule 3)

For more information about these scan option types, please see our guides on Absolute Syntax, Wildcard Syntax and Greedy-star Syntax, respectively.
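
The three syntaxes in this example can be illustrated with a short Python sketch. This is not UpGuard's implementation: the helper names (syntax_kind, rule_to_regex, matching_rules) are hypothetical, and the matching semantics are assumptions taken from the descriptions above, namely that * stays within a single directory and **/ spans one or more subdirectory levels.

```python
import re

def syntax_kind(rule: str) -> str:
    """Classify a scan option by the syntax it uses."""
    if "**" in rule:
        return "greedy-star"
    if "*" in rule:
        return "wildcard"
    return "absolute"

def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate a scan option into a regex (assumed semantics, for illustration)."""
    pieces = []
    for part in re.split(r"(\*\*/|\*)", rule):
        if part == "**/":
            pieces.append(r"(?:[^/]+/)+")  # one or more subdirectory levels
        elif part == "*":
            pieces.append(r"[^/]*")        # anything within one path segment
        else:
            pieces.append(re.escape(part)) # literal text
    return re.compile("".join(pieces))

def matching_rules(path: str, rules: list) -> list:
    """Return the labels (R1, R2, ...) of the rules that match a file path."""
    return [
        f"R{number}"
        for number, rule in enumerate(rules, start=1)
        if rule_to_regex(rule).fullmatch(path)
    ]

RULES = ["/home/user/app/app.json", "/etc/*.conf", "/etc/**/*.conf"]

for candidate in ("/home/user/app/app.json", "/etc/resolv.conf",
                  "/etc/ssh/sshd.conf", "/etc/hosts"):
    print(candidate, matching_rules(candidate, RULES))
```

Under these assumptions, /etc/resolv.conf is matched only by R2 and /etc/ssh/sshd.conf only by R3, which is why the default options list both patterns.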

Example 2: Scan options from multiple node groups

More often than not, a node will belong to two or more node groups. For example, an app server node could belong to:

  • the Linux node group as its base operating system is Linux
  • an App Server node group as it may have reporting, change detection or policies associated with it as part of a larger app server group of nodes
  • a Production node group as it may have additional checks that are run on production nodes only.

For this example, assume the following file scan options are defined for each of these node groups.

Linux node group

  • R1. /etc/*.conf
  • R2. /etc/**/*.conf

App Server node group

  • R3. /home/user/app/app.json
  • R4. /home/user/app/db.yml

Production node group

  • R5. /home/user/shared/server.key
  • R6. /home/user/shared/server.crt
  • R7. /home/user/app/app.json

Here, the following file scan options will be derived for this node’s node scan:

  • any file ending in .conf in the /etc directory (rule 1)
  • any file ending in .conf in subdirectories of /etc (rule 2)
  • the specific file /home/user/app/app.json (rule 3 and rule 7)
  • the specific file /home/user/app/db.yml (rule 4)
  • the specific file /home/user/shared/server.key (rule 5)
  • the specific file /home/user/shared/server.crt (rule 6)

Here you can see that a node inherits the union of all file scan options from all node groups that it belongs to. You will also notice that the specific file at /home/user/app/app.json is specified twice (once for the App Server group and once for the Production group). This entry gets deduplicated during the pre-scan derivation of file scan options.
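
The union and deduplication described here can be sketched in a few lines of Python. The function name derive_scan_options and the dictionary layout are hypothetical and chosen only for illustration; the group names and paths are the ones from this example.

```python
from typing import Dict, List

def derive_scan_options(groups: Dict[str, List[str]]) -> List[str]:
    """Union the scan options of every group, dropping duplicate paths.

    dict.fromkeys keeps the first occurrence of each path and preserves order.
    """
    merged = [path for paths in groups.values() for path in paths]
    return list(dict.fromkeys(merged))

GROUPS = {
    "Linux": ["/etc/*.conf", "/etc/**/*.conf"],
    "App Server": ["/home/user/app/app.json", "/home/user/app/db.yml"],
    "Production": [
        "/home/user/shared/server.key",
        "/home/user/shared/server.crt",
        "/home/user/app/app.json",  # duplicate of the App Server entry
    ],
}

derived = derive_scan_options(GROUPS)
for path in derived:
    print(path)
```

Seven rules go in and six derived options come out, because /home/user/app/app.json is kept only once.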

Example 3: Scan options from node groups with exclusion rules and order of precedence

In this example, we will use two nodes called AppDev and AppProd. Both nodes are members of the Windows node group as they are both Windows-based nodes. Both nodes are in an additional node group called App Servers as they are both configured to have similar functions. The AppProd node is also in a Production node group, whereas the AppDev node is not.

The file scan options for the node groups are defined as follows:

Windows node group

  • R1. C:\Windows\System32\**\*.ini
  • R2. C:\Windows\System32\*.ini

App Servers node group

  • R3. C:\App\pages\**\*.html
  • R4. C:\App\config\*
  • R5. C:\App\pages\src\.subversion\build-hash

Production node group

  • R6. !C:\Windows\System32\secret\keys.ini
  • R7. !C:\App\config\*.pem
  • R8. !C:\App\pages\**\.subversion

Here, the AppDev node will simply inherit the union of the file scan options from the Windows and App Servers node groups, following the same logic as in Example 2.

As for AppProd the following file scan options will be derived:

  • Any file in C:\Windows\System32 and its subdirectories that ends in .ini, except for the specific file C:\Windows\System32\secret\keys.ini (rule 1, rule 2 and rule 6)
  • Any file in subdirectories of C:\App\pages that ends in .html and whose path does not contain a directory called .subversion (rule 3 and rule 8)
  • The specific file at C:\App\pages\src\.subversion\build-hash (rule 5)
  • Any file in the directory C:\App\config that does not end in .pem (rule 4 and rule 7)

Each example above shows how a wider greedy-star or wildcard pattern can be trumped by a more specific inclusion or exclusion rule. Of particular interest is how the second and third derived scan options are constructed from rules 3, 5 and 8. The specific file at C:\App\pages\src\.subversion\build-hash is included despite the exclusion rule (rule 8), because absolute paths take precedence over wildcard and greedy-star rules. If you want the file at C:\App\pages\src\.subversion\build-hash to be excluded given the current rule set, you can define a custom precedence value on rule 8 so that it takes priority over rule 5.
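
The outcome for AppProd can be reproduced with a rough Python sketch of this resolution logic. It is an illustration under stated assumptions, not UpGuard's implementation: the names parse_rule, rule_matches and is_scanned and the priority parameter are hypothetical, the default ranks (absolute 3, wildcard 2, greedy-star 1) only encode the ordering stated above, an exclusion is assumed to cover everything beneath a directory it matches, and an exclusion is assumed to win when it ties with an inclusion of the same rank.

```python
import re
from typing import List, NamedTuple, Optional

class Rule(NamedTuple):
    regex: "re.Pattern[str]"
    exclude: bool
    rank: int  # higher rank wins; absolute=3, wildcard=2, greedy-star=1

def parse_rule(raw: str, priority: Optional[int] = None) -> Rule:
    """Parse a scan option; a leading '!' marks an exclusion rule."""
    exclude = raw.startswith("!")
    pattern = raw[1:] if exclude else raw
    pieces = []
    for part in re.split(r"(\*\*\\|\*)", pattern):
        if part == "**\\":
            pieces.append(r"(?:[^\\]+\\)+")  # one or more subdirectory levels
        elif part == "*":
            pieces.append(r"[^\\]*")         # stays within one path segment
        else:
            pieces.append(re.escape(part))
    default_rank = 1 if "**" in pattern else 2 if "*" in pattern else 3
    rank = default_rank if priority is None else priority
    return Rule(re.compile("".join(pieces)), exclude, rank)

def rule_matches(rule: Rule, path: str) -> bool:
    """True if the rule matches the file, or (exclusions only) a parent directory."""
    if rule.regex.fullmatch(path):
        return True
    if rule.exclude:
        segments = path.split("\\")
        return any(rule.regex.fullmatch("\\".join(segments[:i]))
                   for i in range(1, len(segments)))
    return False

def is_scanned(path: str, rules: List[Rule]) -> bool:
    """Apply the highest-ranked matching rules; assume exclusion wins a tie."""
    hits = [rule for rule in rules if rule_matches(rule, path)]
    if not hits:
        return False
    top = max(rule.rank for rule in hits)
    return not any(rule.exclude for rule in hits if rule.rank == top)

RAW = [
    r"C:\Windows\System32\**\*.ini",             # R1
    r"C:\Windows\System32\*.ini",                # R2
    r"C:\App\pages\**\*.html",                   # R3
    r"C:\App\config\*",                          # R4
    r"C:\App\pages\src\.subversion\build-hash",  # R5
    r"!C:\Windows\System32\secret\keys.ini",     # R6
    r"!C:\App\config\*.pem",                     # R7
    r"!C:\App\pages\**\.subversion",             # R8
]
rules = [parse_rule(raw) for raw in RAW]

# Same rule set, but R8 is given a hypothetical custom priority above R5.
rules_custom = rules[:-1] + [parse_rule(RAW[-1], priority=10)]

BUILD_HASH = r"C:\App\pages\src\.subversion\build-hash"
print(is_scanned(BUILD_HASH, rules))         # True: absolute R5 outranks R8
print(is_scanned(BUILD_HASH, rules_custom))  # False: R8 now outranks R5
```

With the default ranks the sketch keeps build-hash and drops keys.ini, the .pem files and any .html file under a .subversion directory, matching the derived options listed above. Raising the priority of rule 8 flips the result for build-hash only.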

For a more thorough description of the rules, their order of precedence and how to override the default order of operations with a custom priority order, please visit our guide on Order of Precedence of Conflicting Rules. For more information on adding exclusion rules, please visit our guide on File Exclusion via Path Negation.

Example 4: Scan options from node group membership and associated policy

File scan options can also be included from file-based checks defined in policies associated with node groups that a node is a member of. As policy file checks usually define an absolute path to a file, they will generally be included even if other exclusion rules exist, because absolute paths take precedence.

In this example we will define a single node that belongs to a Webserver node group and this node group has a policy associated with it. The node group has the following file scan options defined:

  • R1. /etc/**/*.conf
  • R2. /etc/*.conf

The associated policy has the following checks:

  • C1. The file at /home/app/shared/server.crt should exist and have the checksum 3ec061bf2752cf35f358b689212dceeb
  • C2. The file at /home/app/shared/server.key should exist and have the checksum c0bd1423a970449eb51d18a2c0b22ab7

The following file scan options will be derived for this node:

  • Any file in any subdirectories of /etc that ends in .conf (rule 1)
  • Any file in the directory /etc that ends in .conf (rule 2)
  • The specific file /home/app/shared/server.crt (check 1)
  • The specific file /home/app/shared/server.key (check 2)

Here, the derived file scan options are simply a union of any scan options inherited from node group membership combined with any specific files listed in associated policies in file-based checks.
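
As a minimal Python sketch of this step, the paths of the policy's file checks are simply appended to the node group's scan options. The FileCheck type and the derive_with_policy function are hypothetical names used only for illustration; the paths and checksums are those of C1 and C2 above.

```python
from typing import List, NamedTuple

class FileCheck(NamedTuple):
    """A file-based policy check (hypothetical structure for illustration)."""
    path: str
    checksum: str

def derive_with_policy(group_options: List[str],
                       checks: List[FileCheck]) -> List[str]:
    """Union the group scan options with the paths named in policy file checks."""
    combined = list(group_options) + [check.path for check in checks]
    return list(dict.fromkeys(combined))  # drop duplicates, keep order

WEBSERVER_OPTIONS = ["/etc/**/*.conf", "/etc/*.conf"]
POLICY_CHECKS = [
    FileCheck("/home/app/shared/server.crt", "3ec061bf2752cf35f358b689212dceeb"),
    FileCheck("/home/app/shared/server.key", "c0bd1423a970449eb51d18a2c0b22ab7"),
]

derived_options = derive_with_policy(WEBSERVER_OPTIONS, POLICY_CHECKS)
for option in derived_options:
    print(option)
```

Only the path of each check feeds into the scan options in this sketch; the checksum is evaluated by the policy after the file has been collected.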

What Next?

For more information on file scan options as well as other types of scan options, please visit our guide on Scan Options.
