Enterprise Recon 2.10.0
Distributed Scan
This section covers the following topics:
- How a Distributed Scan Works
- Distributed Scan Requirements
- Start a Distributed Scan
- Monitor a Distributed Scan Schedule
You can use ER2 to perform a distributed scan on a Target or Target location using a group of Proxy Agents. Distributed scans allow you to:
- Reduce scanning time by running multiple scanning processes in parallel.
- Optimize resources by distributing the scanning load across multiple Proxy Agent hosts that might otherwise sit idle.
Distributed scans are particularly useful for scanning Targets that have a vast number of locations, for example:
- An Exchange Server with thousands of mailboxes.
- A Microsoft SQL Server with hundreds of databases, with thousands of tables per database.
For more information, see Distributed Scan Requirements below.
How a Distributed Scan Works
When a distributed scan starts, the Master Server begins by collecting information about the Target(s) and the Proxy Agents in the Agent Group assigned to the scan. The Master Server uses this information to break down the Target(s) into smaller components or sub-scans, then proceeds to distribute the scan workload among the Proxy Agents that are online and available.
Each Proxy Agent then starts to execute the assigned sub-scans on the Target(s). Results for the Target(s) are progressively processed and displayed in the Web Console as each sub-scan completes. While the distributed scan is in progress, if any Proxy Agent becomes idle (after completing all assigned tasks) or is newly connected, outstanding tasks from other Proxy Agents will be dynamically reallocated to these available Agents to further improve the overall scan time.
A distributed scan schedule is marked as "Complete" only when all sub-scans distributed among all Proxy Agents have been completed.
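The reallocation behaviour described above can be illustrated with a small simulation. This is a minimal sketch and not ER2's actual scheduler: it assumes a single shared backlog from which any idle Proxy Agent takes the next sub-scan, and the function name `simulate_distributed_scan`, the agent names, and the durations are all hypothetical.

```python
import heapq
from collections import deque


def simulate_distributed_scan(subscans, agents):
    """Simulate idle agents pulling sub-scans from a shared backlog.

    subscans: list of (name, duration) tuples, in queue order.
    agents:   list of names of Proxy Agents that are online.
    Returns (assignments, finish_time), where assignments maps each
    agent to the sub-scans it executed.
    """
    if not agents:
        raise ValueError("at least one online Proxy Agent is required")

    backlog = deque(subscans)
    assignments = {agent: [] for agent in agents}
    # Min-heap of (time the agent becomes idle, agent name).
    idle_at = [(0.0, agent) for agent in agents]
    heapq.heapify(idle_at)
    finish_time = 0.0

    while backlog:
        # The agent that becomes idle first takes the next outstanding sub-scan.
        now, agent = heapq.heappop(idle_at)
        name, duration = backlog.popleft()
        assignments[agent].append(name)
        done = now + duration
        finish_time = max(finish_time, done)
        heapq.heappush(idle_at, (done, agent))

    # The schedule is complete only when the last sub-scan has finished.
    return assignments, finish_time


assignments, finish_time = simulate_distributed_scan(
    [("mailbox-1", 5), ("mailbox-2", 1), ("mailbox-3", 1), ("mailbox-4", 1)],
    ["agent-a", "agent-b"],
)
print(assignments, finish_time)
```

In this run, agent-b finishes its short sub-scans early and picks up the remaining ones while agent-a is still busy with the long one, so the overall finish time is bounded by the longest single sub-scan rather than by a fixed up-front split.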
Distributed Scan Requirements
Proxy Agent Requirements
To perform a distributed scan on a Target or group of Targets, you need to Create an Agent Group to be assigned to the Target or Target location. Ensure that all Proxy Agents in the Agent Group:
- Have been upgraded to version 2.1 or later.
- Support scanning of the Target platform.
If any Proxy Agent in the Agent Group does not support scanning of the Target, none of the sub-scans assigned to that Proxy Agent will be executed, which causes the scan schedule to fail. To check which Agents are supported for a Target, see the respective pages under Target Type.
For example, to run a distributed scan on a MySQL database, ensure that the Agent Group assigned to the scan contains only Windows Proxy Agents or Linux Proxy Agents. If the Agent Group includes a Solaris Proxy Agent, the scan schedule will be marked as "Failed" due to incomplete sub-scans.
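The two Agent Group requirements above can be expressed as a pre-flight check. The following is a minimal sketch under stated assumptions: ER2 does not expose this function, the `ProxyAgent` class and `find_unsuitable_agents` name are hypothetical, and the support matrix contains only the MySQL example from the text.

```python
from dataclasses import dataclass

# Assumption: illustrative support matrix. Only the MySQL entry is taken from
# the text above; consult the Target Type pages for real data.
SUPPORTED_AGENT_PLATFORMS = {"MySQL": {"Windows", "Linux"}}
MIN_AGENT_VERSION = (2, 1)  # minimum Proxy Agent version stated above


@dataclass
class ProxyAgent:
    name: str
    platform: str   # e.g. "Windows", "Linux", "Solaris"
    version: tuple  # e.g. (2, 10, 0)


def find_unsuitable_agents(agents, target_type):
    """Return (agent name, reason) pairs for agents that would break the scan."""
    if target_type not in SUPPORTED_AGENT_PLATFORMS:
        raise ValueError(f"no support data for target type {target_type!r}")
    supported = SUPPORTED_AGENT_PLATFORMS[target_type]

    problems = []
    for agent in agents:
        if agent.version < MIN_AGENT_VERSION:
            found = ".".join(str(part) for part in agent.version)
            problems.append((agent.name, f"version {found} is below 2.1"))
        elif agent.platform not in supported:
            problems.append(
                (agent.name, f"{agent.platform} agents cannot scan {target_type}")
            )
    return problems


group = [
    ProxyAgent("win-01", "Windows", (2, 10, 0)),
    ProxyAgent("sol-01", "Solaris", (2, 10, 0)),
    ProxyAgent("lin-01", "Linux", (2, 0, 9)),
]
for name, reason in find_unsuitable_agents(group, "MySQL"):
    print(f"{name}: {reason}")
```

Because a single unsuitable Proxy Agent is enough to fail the whole schedule, the check reports every problem agent rather than stopping at the first one.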
Supported Targets
You can run a distributed scan on the following supported Target types:
Target Type | Description |
---|---|
Windows Share | Scans are distributed across the folders and files under the Path of the network storage location as specified in the scan schedule.
If the network storage Path in the scan schedule is specified as MyFolder, the scan will be distributed across all files and folders within the MyFolder directory.
If the number of files under the Path exceeds a certain limit,
|
Remote Access via SSH | Scans are distributed across the folders and files under the Path of the network storage location as specified in the scan schedule.
If the network storage Path in the scan schedule is specified as MyFolder, the scan will be distributed across all files and folders within the MyFolder directory.
If the number of files under the Path exceeds a certain limit,
|
IBM DB2 | Scans are distributed across the tables in the database. |
InterSystems Caché | Scans are distributed across the tables in the database. |
MongoDB | Scans are distributed across the collections in the MongoDB Server. |
MariaDB | Scans are distributed across the tables in the database. |
Microsoft SQL Server | Scans are distributed across the tables in the database. |
MySQL | Scans are distributed across the tables in the database. |
Oracle Database | Scans are distributed across the tables in the database. |
PostgreSQL | Scans are distributed across the tables in the database. |
SAP HANA | Scans are distributed across the tables in the database. |
Sybase / SAP ASE | Scans are distributed across the tables in the database. |
SharePoint Server | Scans are distributed across the sites in the SharePoint Server. |
Confluence On-Premises |
Scans are distributed across the spaces, blog post folder, and/or top-level pages that are one level below the selected location(s).
Example 1: When the entire Confluence domain is selected, the scans will be distributed across each space (e.g. Space Engineering and Space Product) in the domain.
Confluence [host name: my-confluence-server]
  Confluence on target MY-CONFLUENCE-SERVER
    Space Engineering
      Blog Post Folder
        Blog Post January
    Space Product
      Page Feature
        Page Feature A
        Page Feature B
Example 2: The scans for Space Engineering will be distributed across the blog post folder (Blog Post Folder) and top-level page (Page Development).
Confluence [host name: my-confluence-server]
  Confluence on target MY-CONFLUENCE-SERVER
    Space Engineering
      Blog Post Folder
        Blog Post January
        Blog Post February
      Page Development
        Page Bug Fixes
        Page Enhancements
    Space Product
      Page Feature
        Page Feature A
        Page Feature B
      Page Release
        Page Release Q1
        Page Release Q2
|
Amazon S3 Buckets | Scans are distributed across the Amazon S3 Buckets in the Amazon account. |
Azure Storage | Scans are distributed across the Blobs, Tables or Queues in the Azure Storage account. |
Box Inc | Scans are distributed across the locations in the Box Inc domain that are selected for the scan schedule.
For example, in the scenario below, the scans will be distributed across four locations.
Box [domain: example.app.box.com]
Group Administration
Group Engineering
User user1@example.com
User user2@example.com
Group Finance
User user3@example.com
User user4@example.com
User user5@example.com
Group Human Resource
Group Sales
|
Exchange Domain | Scans are distributed across the mailboxes in the Exchange domain. |
Exchange Online | Scans are distributed across the mailboxes in the Microsoft 365 domain. |
Google Workspace | Scans are distributed across the users in the Google Workspace domain. |
Google Cloud Storage | Scans are distributed across the buckets in the Google Cloud Storage project. |
Microsoft OneNote | Scans are distributed across the user or group name notebooks in the Microsoft 365 domain. |
Microsoft Teams | Scans are distributed across the (i) channels in a team, or (ii) users in a group within the Microsoft 365 domain. |
Rackspace Cloud | Scans are distributed across the cloud server regions in the Rackspace account. |
SharePoint Online | Scans are distributed across the sites in the SharePoint Online domain. |
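The "one level below the selected location" rule from the Confluence On-Premises row above can be sketched as a lookup in a nested tree. This is an illustration only: the nesting below is an assumed reading of Example 2 in that row, and the `distribution_units` function is hypothetical, not part of ER2.

```python
# Assumed hierarchy for Example 2: spaces contain a blog post folder and
# top-level pages, which in turn contain blog posts and child pages.
CONFLUENCE_TREE = {
    "Space Engineering": {
        "Blog Post Folder": {"Blog Post January": {}, "Blog Post February": {}},
        "Page Development": {"Page Bug Fixes": {}, "Page Enhancements": {}},
    },
    "Space Product": {
        "Page Feature": {"Page Feature A": {}, "Page Feature B": {}},
        "Page Release": {"Page Release Q1": {}, "Page Release Q2": {}},
    },
}


def distribution_units(tree, selected_path=()):
    """Return the names one level below the selected location.

    selected_path is a sequence of names leading from the domain root to the
    selected location; an empty path selects the entire domain.
    """
    node = tree
    for name in selected_path:
        node = node[name]  # raises KeyError if the location does not exist
    return list(node)


# Entire domain selected: one unit per space.
print(distribution_units(CONFLUENCE_TREE))
# A single space selected: one unit per blog post folder or top-level page.
print(distribution_units(CONFLUENCE_TREE, ("Space Engineering",)))
```

Items deeper in the tree, such as individual blog posts or child pages, are not separate units in this sketch; they are scanned as part of the unit that contains them.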
Start a Distributed Scan
Starting a distributed scan follows the same steps as starting any other scan; the only difference is that you assign an Agent Group as the proxy host.
- Log in to the ER2 Web Console.
- Navigate to the Select Locations page by clicking on:
  - Scans > New Scan, or
  - the New Scan button in the Dashboard, Targets, or Scans > Schedule Manager page.
- On the Select Locations page, click + Add Unlisted Target. Follow the on-screen instructions to add a new Target.
- When prompted to select an Agent to act as proxy host, click on the Select proxy agent menu and select a suitable Agent Group.
If any Proxy Agent in the Agent Group does not support scanning of the Target, none of the sub-scans assigned to that Proxy Agent will be executed, which causes the scan schedule to fail. To check which Agents are supported for a Target, see the respective pages under Target Type.
- Click Test, and then Commit.
- On the Select Data Types page, select the Data Type Profiles to be included in your scan and click Next. See Data Type Profiles.
- Set a scan schedule in the Set Schedule section. Click Next.
- Review your scan configuration. Once done, click Start Scan.
Monitor a Distributed Scan Schedule
Distributed scans show up in the Targets page and Scans > Schedule Manager page in the Web Console just like any other scan. See View and Manage Scans for more information.