EGI security dashboard

Context

The EGI CSIRT operates several monitoring services that collect information from the sites and provide an overview of the infrastructure in terms of operational security. Currently two services are in production: the EGI security Nagios box, which launches security-related probes, and Pakiti, which evaluates the patching status of the compute resources. Other services may appear in the future.

Each of these monitoring services provides its own interface for accessing results, which is not suitable for routine operations. The goal of the EGI security dashboard is to aggregate the data produced by the EGI security monitoring and to provide ways of manipulating that data. The dashboard will be linked to the EGI information services (namely GOC DB) and other operations tools (the ticketing system) so that the EGI security people have a single interface for viewing and handling the data.

Using the security Dashboard extensions, the operations people could handle security issues as part of the standard procedures (paying attention to the sensitivity of the information communicated, of course).

See https://operations-portal.egi.eu/.

Use-cases

Incident resolution – when an incident appears, the EGI CSIRT tries to collect as much information as possible about the sites affected by the incident. The data includes the status of security patches (current state and history), results of the common security probes, and other incidents that occurred at the site in the past. The security dashboard should provide this information in a single place, together with an easy way of filing tickets against the sites when necessary.

Vulnerability chases – whenever the EGI CSIRT comes across a vulnerability that may seriously impact the infrastructure, it sends alerts about the issue to the sites. For extremely serious vulnerabilities, the EGI CSIRT asks the sites to act within a short time-frame (usually a week). The subsequent process of following up with sites and monitoring their actual state requires a lot of manual work and is time-consuming for the operators. The security dashboard should assist in obtaining the information needed (e.g., based on a template), in contacting the sites (directly using the GOC DB contacts and/or via the ticketing system), and in handling the tickets raised against the sites more easily.

Report compiling – since the first “Pakiti challenge” the EGI CSIRT has seen a continuous improvement in the responsiveness of the sites and in their quality. Such achievements, as well as cases where communication failed, are very important to follow and to present to both the sites and management/NGIs. The security dashboard should collect enough information to be able to produce such reports and long-term statistics and to follow the trends.

Basic functions of the dashboard

The areas mentioned above, itemized (to be extended):

  • Collecting results from Pakiti and Nagios
      • The basic “unit” of the report will be a site; the Dashboard will be able to “construct” an NGI view out of the relevant sites.
      • The results must be secured during transmission and protected against leakage.
  • Combining the results – see the CVE-2010-3847 scenario, when we wanted to identify vulnerable sites, i.e. the ones that have neither applied the patch (Pakiti) nor blocked the vulnerable module (Nagios). An alternative would be to make this decision on Nagios.
  • Providing a view of a site/NGI/EGI
      • Proper access control must be applied.
      • Filtering/sorting of the results based on defined criteria, etc.
  • Dispatching alerts when needed
      • Ideally based on configurable rules.
  • Operations functions – links to GOC DB, EGI RT
      • E.g., templates for filing RT tickets, bulk manipulation of tickets, etc.
  • Reporting functions – generation on demand and automatically on a regular basis
      • Compute security metrics based on the numbers gathered.
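
The "combining the results" item and the NGI view can be illustrated with a minimal sketch. It assumes that the Pakiti and Nagios results have already been reduced to sets of site names; the function names, the site names and the site-to-NGI mapping are hypothetical and are not part of either service.

```python
def vulnerable_sites(pakiti_unpatched, nagios_unblocked):
    """Sites that have neither applied the patch (Pakiti) nor
    blocked the vulnerable module (Nagios)."""
    return sorted(set(pakiti_unpatched) & set(nagios_unblocked))


def group_by_ngi(sites, site_to_ngi):
    """Construct an NGI view out of per-site results."""
    view = {}
    for site in sites:
        # Sites missing from the mapping are collected under "UNKNOWN".
        view.setdefault(site_to_ngi.get(site, "UNKNOWN"), []).append(site)
    return view


# Hypothetical input data, for illustration only.
pakiti_unpatched = {"SITE-A", "SITE-B", "SITE-C"}   # patch not applied
nagios_unblocked = {"SITE-B", "SITE-C", "SITE-D"}   # module not blocked
site_to_ngi = {"SITE-B": "NGI_X", "SITE-C": "NGI_Y"}

flagged = vulnerable_sites(pakiti_unpatched, nagios_unblocked)
ngi_view = group_by_ngi(flagged, site_to_ngi)
```

Here SITE-A is excluded because it blocked the module, and SITE-D because it applied the patch; only sites failing both checks are flagged.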

Version 1

The primary use-case to address is vulnerability chases, based on vulnerable packages being present on the nodes and/or security probes failing for the sites. The Dashboard should be able to easily display the failing sites and allow the CSIRT people and/or operations staff to act accordingly (at least by providing a list of the proper contacts from the GOC DB).

We will start with a simple integration of the results of Pakiti and Nagios into the existing dashboard. Both of these services will provide XML-based reports that will be retrieved by the dashboard on a regular basis. The reports will contain the site name (as per GOC DB) and the information gathered about the site.
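
Since the report format is still to be defined, the following is only a sketch of how the dashboard might parse such a report. The element and attribute names (`report`, `site`, `finding`, `status`, `hosts`) are assumptions made for illustration and do not describe an existing Pakiti or Nagios output format.

```python
import xml.etree.ElementTree as ET

# Hypothetical report layout; the real format is yet to be agreed.
SAMPLE_REPORT = """<report source="pakiti">
  <site name="SITE-A">
    <finding id="CVE-2010-3847" status="vulnerable" hosts="12"/>
  </site>
  <site name="SITE-B">
    <finding id="CVE-2010-3847" status="ok" hosts="0"/>
  </site>
</report>"""


def parse_report(xml_text):
    """Return a dict keyed by site name (as per GOC DB)."""
    root = ET.fromstring(xml_text)
    source = root.get("source")
    results = {}
    for site in root.findall("site"):
        findings = [
            {
                "id": f.get("id"),
                "status": f.get("status"),
                "hosts": int(f.get("hosts", "0")),
            }
            for f in site.findall("finding")
        ]
        results[site.get("name")] = {"source": source, "findings": findings}
    return results


parsed = parse_report(SAMPLE_REPORT)
```

Keying the parsed data by site name matches the idea that the site is the basic unit of the report.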

The Dashboard will make this information available in the “site view”, subject to proper authorization. Information about a site will only be available to the site administrators and security contacts, based on the relevant record in the GOC DB. In addition, the NGI security people and management will be able to view the results of the sites in their NGI, and EGI CSIRT members and operations staff will be allowed to view the results of all sites.
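
The access rules above could be expressed roughly as follows. This is a sketch under stated assumptions: the role names, the `User` structure and the site-to-NGI mapping are hypothetical, and in practice the roles would be derived from GOC DB records and EGI SSO rather than set by hand.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    global_roles: set = field(default_factory=set)  # e.g. {"egi-csirt"}
    ngi_roles: dict = field(default_factory=dict)   # NGI name -> set of roles
    site_roles: dict = field(default_factory=dict)  # site name -> set of roles


# Hypothetical role names, for illustration only.
GLOBAL_VIEWERS = {"egi-csirt", "operations"}
NGI_VIEWERS = {"ngi-security", "ngi-management"}
SITE_VIEWERS = {"site-admin", "site-security"}


def can_view_site(user, site, site_to_ngi):
    """Decide whether a user may see the results of a given site."""
    # EGI CSIRT members and operations staff see all sites.
    if user.global_roles & GLOBAL_VIEWERS:
        return True
    # NGI security people and management see the sites in their NGI.
    ngi = site_to_ngi.get(site)
    if ngi is not None and user.ngi_roles.get(ngi, set()) & NGI_VIEWERS:
        return True
    # Site administrators and security contacts see their own site.
    return bool(user.site_roles.get(site, set()) & SITE_VIEWERS)


site_to_ngi = {"SITE-A": "NGI_X", "SITE-B": "NGI_Y"}
site_admin = User("alice", site_roles={"SITE-A": {"site-admin"}})
ngi_officer = User("bob", ngi_roles={"NGI_X": {"ngi-security"}})
csirt_member = User("carol", global_roles={"egi-csirt"})
```

The check denies by default: a user with no matching role at any of the three levels gets no access.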

The Dashboard will provide a presentation layer only; it will not be used to act back on the “primary” services, e.g. to configure them or to change their ACLs.

Steps needed:

- Define/adapt/implement the XML (CSV, ...) format of the reports for Nagios and Pakiti and make the reports available to the Dashboard.
- Define and implement the mechanism for passing this information to the Dashboard.
- Extend the Dashboard with the capability of displaying the information in the site view.
- Make sure proper authorization is applied (based on GOC DB and EGI SSO); make sure that EGI CSIRT/operations people can access all the data collected.