A Security Practices Evaluation Framework
SP-EF Measurement Guidebook
Abstract
Software development teams are increasingly faced with security concerns regarding the software they develop. While many software development security practices have been proposed, published empirical evidence for their suitability and effectiveness is limited. The goal of this research is to support theory construction through empirical evidence collection for security practice use in software development by building a measurement framework for software development security practices and their correlations with security-related outcomes.
To orient the framework, we set two practice-oriented sub-goals:
- Identify security practices most likely to reduce post-release vulnerabilities from the pool of practices not currently applied on a project.
- Reduce post-release vulnerabilities for a project through security practice selection and use.
To meet our goals, we define and evaluate the “Security Practices Evaluation Framework” (SP-EF).
This document describes how to collect the data required for SP-EF. Examining patterns in the aggregated data supports making security-focused improvements to the development process.
Introduction
Vulnerability prevention and removal are becoming increasingly important in software development. A number of security practices have been developed, advocated and implemented, but there is not yet empirical evidence for which practices work best in a given development environment.
In this document, we describe the data elements of the Security Practices Evaluation Framework (SP-EF) and provide guidance on how to collect them. SP-EF contains three categories of data elements: practice adherence metrics, outcome measures, and context factors. The practice adherence metrics are a set of attributes and values that describe the security practices in use on a project, and the degree to which each practice is adhered to by the project team. Outcome measures are a set of attributes and values that describe the security-related outcomes of the project. Context factors are a set of attributes and values that provide a basis of comparison between projects measured using SP-EF.
The goal of SP-EF is to provide a repeatable set of measures and measurement instructions for structuring case studies of security practice use, so that the case studies can be combined, compared, and analyzed to form a family of evidence on security practice use.[^1]
We have adopted the design of the Extreme Programming Evaluation Framework (XP-EF) where possible. Goals for the framework include that the metrics be:
- Simple enough for a small team to measure without a metrics specialist and with minimal burden
- Concrete and unambiguous
- Comprehensive and complete enough to cover vital factors
The framework is designed for use throughout development, as well as for annotation of projects that have been completed.
Data Sources
The primary data sources required are the project’s documentation, particularly all process-related documentation, the version control system, and the bug tracker.
Project Measurement Demographics
To identify the collected data, record the following items:
- Organization name - Security practices, personnel policies, media and public attention, and many other factors will vary from organization to organization. We record the organization name to permit controlling for the organization.
- Project name - Development platforms, schedules, staffing, security practices, personnel policies, and many other factors will vary from project to project. We record the project name to permit controlling for the project.
- Date(s) measurements were taken
- Start date of measurement period
- End date of measurement period
- Links or notes on project documentation
- Version control system
- Bug tracking system
Domain
Description
Different risks are associated with different software domains. Web applications may be concerned with sustaining thousands, or possibly millions, of concurrent users and supporting a variety of different languages, whereas the primary concerns of a database project may be scalability and response time. The medical domain has unique security and privacy concerns.
Data Collection
Text description based on discussion with project staff, or read from project artifacts.
Context Factors
Drawing general conclusions from empirical studies in software engineering is difficult because the results of any process largely depend upon the specifics of the study and relevant context factors. We cannot assume a priori that a study’s results generalize beyond the specific environment in which it was conducted [3]. Therefore, recording an experiment’s context factors is essential for comparison purposes and for fully understanding the similarities and differences between the case study and one’s own environment.
Language
Description
Language in which the software is written
Data Collection
Text description from project staff, researcher observation, or inferred from project artifacts.
Confidentiality, Integrity, and Availability Requirements
Description
These values are taken directly from CVSS, and this section paraphrases the description in the CVSS Guide. These metrics measure the security requirements of the software under development. Each security requirement has four possible values: Low, Medium, High, and Not Defined.
Data Collection
To choose a value for each context factor, consider the most sensitive data that passes through, or is kept by, the software being evaluated. For example, a web browser may access highly confidential personal information such as bank account or medical record data, to which a High Confidentiality Requirement would apply.
Metric Values
- Low (L) - Loss of [confidentiality | integrity | availability] is likely to have only a limited adverse effect on the organization or individuals associated with the organization (e.g., employees, customers).
- Medium (M) - Loss of [confidentiality | integrity | availability] is likely to have a serious adverse effect on the organization or individuals associated with the organization (e.g., employees, customers).
- High (H) - Loss of [confidentiality | integrity | availability] is likely to have a catastrophic adverse effect on the organization or individuals associated with the organization (e.g., employees, customers).
- Not Defined (ND) - Assigning this value to the metric will not influence the score. It is a signal to the equation to skip this metric.
Methodology
Description
Project approach to the software development lifecycle.
Data Collection
Text description from project staff or researcher observation (e.g., XP, Scrum, Waterfall, Spiral).
Source Code Availability
Description
Ross Anderson has claimed that, for sufficiently large software systems, source code availability aids attackers and defenders equally, but the balance shifts based on a variety of project-specific factors. We track the source code availability for each project measured.
Data Collection
Discuss source code availability with the project staff, or infer it from the existence of a public repository or other legal, public distribution of the source code.
Values: Open Source, Closed Source
Team Size
Description
The complexity of team management grows as team size increases. Communication between team members and the integration of concurrently developed software becomes more difficult for large teams, as described by Brooks. Small teams, relative to the size of the project, may be resource-constrained. Therefore, we track the number of people engaged in software development for the project, categorized by project role. To enable normalizing effort and calculation of productivity, we record average hours per week for each person in their primary role.
The four roles defined for SP-EF are:
- Manager (e.g., Project Management, Requirements Engineer, Documentation, Build Administrator, Security)
- Developer (Designer, Developer)
- Tester (Quality Assurance, Penetration Tester, External Penetration Tester)
- Operator (User, Systems Administrator, Database Administrator)
Data Collection
Count managers, developers, and testers dedicated to the project under study.
Survey the project team to establish each member’s time commitment to the project.
When working with a project in progress, count the people currently engaged on the project, noting their roles and whether they are full-time or part-time on the project. When working with historical project data, sort participants by their number of commits (or bug reports) and count the participants contributing the first 80% of commits (bug reports) to estimate the development (testing) team size, as in the sketch below.
Per team member data:
- Project Role Values: Manager, Developer, Tester, Other.
- Average Hours Per Week: 0.0 - 99.9
Team size: Summary by Project Role, Count, Average Hours Per Week
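A minimal sketch of the 80% commit-contribution heuristic described above, assuming a local Git repository with `git` on the PATH and using author email as a proxy for a team member:

```python
import subprocess
from collections import Counter

def estimate_team_size(repo_path: str, cutoff: float = 0.80) -> int:
    """Estimate team size as the number of authors contributing the first
    `cutoff` fraction of commits, counting the largest contributors first."""
    authors = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%ae"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    commits_per_author = Counter(authors)
    total = sum(commits_per_author.values())
    covered = team = 0
    for _, n in commits_per_author.most_common():
        covered += n
        team += 1
        if covered / total >= cutoff:
            break
    return team
```

The same approach applies to bug-report counts when a defect tracker export is available instead of a commit log.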
Team Location
Description
Distributed teams that communicate via the Internet are becoming more commonplace, and it is possible that team location and accessibility may influence a project. A distributed team faces more challenges than a co-located team during development. Communication and feedback times are typically increased when the team is distributed over many sites.
Data Collection
Record whether the team is co-located or distributed. A co-located team is found in the same building and area, such that personal interaction is easily facilitated. If the team is distributed, record whether the distribution is across several buildings, cities, countries, or time zones.
Operating System
Description
Operating system/runtime environment and version.
Data Collection
Text description from project staff or researcher observation (e.g. Linux, Windows, Android, iOS).
Product Age
Description
Product age relates to both the availability of product knowledge as well as product refinement. An older product might be considered more stable with fewer defects, but there may be a lack of personnel or technology to support the system. Furthermore, making significant changes to a legacy system may be an extremely complex and laborious task. Working with a newer product may involve instituting complex elements of architectural design that may influence subsequent development, and may be prone to more defects since the product has not received extensive field use.
Data Collection
Determine the date of the first commit (or of the first lines of code written) and record the age of the product as the number of months elapsed since that date.
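A small sketch of this calculation, assuming the product’s history is kept in a Git repository and that the first commit approximates the start of development:

```python
import subprocess
from datetime import datetime, timezone

def product_age_months(repo_path: str) -> int:
    """Months elapsed since the repository's first commit."""
    dates = subprocess.run(
        ["git", "-C", repo_path, "log", "--reverse", "--pretty=format:%cI"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    first = datetime.fromisoformat(dates[0])  # oldest commit, ISO 8601 committer date
    now = datetime.now(timezone.utc)
    return (now.year - first.year) * 12 + (now.month - first.month)
```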
Number of Identities
Description
Number of personal identities the software stores or transmits.
A black market has developed for personal identities: names, addresses, credit card numbers, and bank account numbers. In 2011, a personal identity could be bought (in groups of 1000) for 16 US cents[^3]. One component of software security risk is the presence and use of personal information, represented by the number of identities accessible to the software.
Data Collection
Work with the team to count or estimate the number of personal identities managed by the software. A browser might manage one or two identities, while a database system might manage millions.
Number of Machines
Description
Number of machines on which the software runs.
The rise of botnets, networks of computers that can be centrally directed, has created a black market for their services. In 2013, an hour of machine time on a botnet ranged from 2.5 to 12 US cents [ref]. Infecting machines with malware that enables central control is what creates botnets, so the number of machines on which a piece of software runs is a risk factor.
Data Collection
Work with the team to count or estimate the machines (physical or virtual) on which the software runs.
Source Lines of Code (SLOC)
“Measuring programming progress by lines of code is like measuring aircraft building progress by weight.” - Bill Gates
Description
Lines of Code is one of the oldest, and most controversial, software metrics. We use it as a means for assessing software size, and as a proxy for more detailed measures such as complexity. Broadly speaking, larger code size may indicate the potential for software defects, including vulnerabilities.
Definition
Number of non-blank, non-comment lines present in the release of the software being worked on during the current project.
Data Collection
Count the total number of non-blank, non-comment lines present in the release of the software being worked on during the current project.
Use cloc or SLOCCount where possible.
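As a sketch of automated collection, the example below shells out to cloc and reads its JSON summary; it assumes cloc is installed and that cloc’s ‘code’ count matches the non-blank, non-comment definition used here:

```python
import json
import subprocess

def count_sloc(path: str) -> int:
    """Total non-blank, non-comment lines reported by cloc for `path`."""
    output = subprocess.run(
        ["cloc", "--json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    report = json.loads(output)
    return report["SUM"]["code"]  # cloc's 'code' count excludes blank and comment lines
```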
Churn
Description
Development teams change software to add features, to correct defects, and to refine software performance according to a variety of non-functional requirements. Changes to software can introduce defects, and so measuring change is important for assessing software quality. We measure Churn, the number of non-blank, non-comment lines changed, added, or deleted in the software being worked on over a time period. Churn is composed of three measurements: a Start Date, an End Date, and the total changed, added, and deleted SLOC between the Start Date and the End Date.
Data Collection
Select the Start Date and End Date to be measured. In our initial studies, we define Project Month, and compute Churn for each month since the first available month of data for the project, using the first and last days of each month as our Start and End Dates. In other studies, Start and End Dates may be for a single release, or series of releases.
Following the data collection procedures for SLOC, measure SLOC as of the Start Date. Then measure the changed, added, and deleted SLOC as of the End Date, relative to the Start Date.
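A rough sketch of Churn collection from a Git repository is shown below; it sums the added and deleted line counts reported by `git log --numstat` between the Start and End Dates. This approximation does not filter blank or comment-only lines and skips binary files, so it will differ somewhat from the strict SLOC-based definition.

```python
import subprocess

def churn(repo_path: str, start_date: str, end_date: str) -> int:
    """Approximate Churn: added + deleted lines between two dates (YYYY-MM-DD)."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log",
         f"--since={start_date}", f"--until={end_date}",
         "--numstat", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    ).stdout
    added = deleted = 0
    for line in log.splitlines():
        fields = line.split("\t")
        # numstat lines are "<added>\t<deleted>\t<path>"; binary files report "-" counts
        if len(fields) == 3 and fields[0].isdigit() and fields[1].isdigit():
            added += int(fields[0])
            deleted += int(fields[1])
    return added + deleted
```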
Practice Adherence Metrics
SP-EF’s practice adherence metrics are designed to measure how closely a project adheres to a set of security practices. Each project is likely to use its own set of security practices and to use a given security practice in its own way. Project teams may add and drop practices to suit the requirements of their customers, their business and operational environments, and their awareness of trends in software development. Adherence metrics are a means of characterizing the degree to which a practice is used on a project. We have included objective and subjective metrics for measuring practice adherence. Objective metrics are drawn from evaluation of the project data, given our expectation that the security practices of a team will be reflected in the documentation the team creates and the logs of activity the team generates.
Subjective metrics are drawn from interviews with, or surveys of, the project team members. People are the driving force behind process and practices, and their views should be considered, while weighing the bias introduced by self-reporting.
For each security practice adherence event, we record the following data elements:
- Event Date - Date on which the document was created.
- Frequency - Frequency with which the practice is performed. Values: Not Used, Daily, Weekly, Monthly, Quarterly, Annually, Less than Annually.
- Practice - Name of the security practice associated with the document.
- Source - Data source for the document. Possible values: Version Control, Defect Tracker, Email.
- Document Id - Id of the document in its source, e.g. commit hash, bug tracker id, email id.
- Creator - Role of the author of the source document. Values: Manager, Developer, Tester, User, Attacker.
- Assignee - For defect report documents, the person assigned the defect, where applicable.
- Phase - Project phase during which the practice is performed. Values: Initiation, Requirements, Design, Implementation, Unit Testing, Testing, Release, Operations.
- Evidence Source - Link to, or description of, the source of the data reported.
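One way to store these adherence events is sketched below; the field names simply mirror the data elements listed above, and the class name is illustrative:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PracticeAdherenceEvent:
    event_date: str          # date the source document was created
    frequency: str           # Not Used, Daily, Weekly, Monthly, Quarterly, Annually, Less than Annually
    practice: str            # name of the associated security practice
    source: str              # Version Control, Defect Tracker, or Email
    document_id: str         # commit hash, bug tracker id, or email id
    creator: str             # Manager, Developer, Tester, User, or Attacker
    assignee: Optional[str]  # defect assignee, where applicable
    phase: str               # Initiation, Requirements, Design, Implementation, Unit Testing, Testing, Release, Operations
    evidence_source: str     # link to, or description of, the data source
```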
While the practice adherence metrics are, conceivably, useful for any set of practices, we have defined a set of software development security practices, synthesized from our evaluation of the BSIMM, SDL, SAFECode, and OWASP practice lists.
Perform Security Training
Ensure project staff are trained in security concepts, and in role-specific security techniques.
Description
Security training raises staff awareness of potential security risks and approaches for mitigating those risks. While some security concepts, e.g. Confidentiality, Availability, and Integrity, apply in general, role-specific training, e.g. coding techniques, database management, design concerns, is beneficial.
Practice Implementation Questions
- Is security treated as part of the onboarding process?
- Are project staff trained in the general security principles?
- Are project staff trained in the security techniques they are expected to apply?
- Is refresher training scheduled periodically?
- Is further training scheduled periodically?
- Are security mentors available to the project?
Keywords
awareness program, class, conference, course, curriculum, education, hiring, refresher, mentor, new developer, new hire, on boarding, teacher, training
Links
Apply Data Classification Scheme
Maintain and apply a Data Classification Scheme. Identify and document security-sensitive data: personal information, financial information, and system credentials.
Description
A Data Classification Scheme (DCS) specifies the characteristics of security-sensitive data, for example, personal information, financial information, and/or system credentials. The DCS should be developed by considering the security implications of all data used by the software. The DCS should be considered by project personnel when writing, testing, and documenting the project’s software.
Practice Implementation Questions
- Does the software under development reference, store, or transmit any of the following data:
- personally-identifiable information (PII)
- financial information
- credit card information
- system credentials (e.g. passwords, ssh keys)
- Are rules for recognizing all of the data types used in question 1 documented?
- Are rules for handling all of the data types used in question 1 documented?
- Is the DCS revised periodically?
- Are all personnel trained in the use of the DCS?
- Are personnel periodically re-trained in the use of the DCS?
Keywords
(street) address, credit card number, data classification, data inventory, Personally Identifiable Information (PII), user data, privacy.
Links
Apply Security Requirements
Consider and document security concerns prior to implementation of software features.
Description
Security requirements are documented statements about what the software should allow and ban with respect to security concerns, including confidentiality, integrity, and availability. When a developer (tester) works on a piece of code (test), they should be able to reference the security requirements of that code (test).
Practice Implementation Questions
- Are there organizational and/or project standards for documenting security requirements?
- Is a plan for how security will be addressed during development created before development begins?
- Does the software development team know whether compliance (regulatory, and organizational standards) requirements apply to its software?
- Are compliance requirements translated into the work items/user stories/functional specs the developers use to guide their day to day progress?
- Are user roles, behavior, and permissions specified before coding?
- Are the environments and corresponding trust boundaries under which the software will run considered during design/before coding?
- Are authentication and authorization implemented for the services and data the software provides?
Keywords
authentication, authorization, requirement, use case, scenario, specification, confidentiality, availability, integrity, non-repudiation, user role, regulations, contractual agreements, obligations, risk assessment, FFIEC, GLBA, OCC, PCI DSS, SOX, HIPAA.
Links
OWASP Document security-relevant requirements
Apply Threat Modeling
Anticipate, analyze, and document how and why attackers may attempt to misuse the software.
Description
Threat modeling is the process of analyzing how and why attackers might subvert security mechanisms to gain access to the data and other assets accessible through the project’s software.
Practice Implementation Questions
- Does the project have a standard for threat modeling?
- Does the project have a list of expected attackers?
- Does the project have a list of expected attacks?
- Does the project budget for time to analyze its expected attackers and attacks, identify vulnerabilities, and plan for their resolution?
- Does the project budget time for keeping up to date on new attackers and attacks?
- for the project software
- for the project technologies
- for the environment in which the project operates?
- Does the project develop ‘abuse cases’ or ‘misuse cases’ based on its expected attackers?
- Are defect records created to track resolution of each vulnerability discovered during threat modeling?
- Are results from vulnerability tracking fed into the threat modeling process?
Keywords
threats, attackers, attacks, attack pattern, attack surface, vulnerability, exploit, misuse case, abuse case.
Links
Document Technical Stack
Document the components used to build, test, deploy, and operate the software. Keep components up to date on security patches.
Description
The technical stack consists of all software components required to operate the project’s software in production, as well as the software components used to build, test, and deploy the software. Documentation of the technical stack is necessary for threat modeling, for defining a repeatable development process, and for maintenance of the software’s environment when components receive security patches.
Practice Implementation Questions
- Does the project maintain a list of the technologies it uses?
- Are all languages, libraries, tools, and infrastructure components used during development, testing, and production on the list?
- Are security features developed by the project/organization included on the list?
- Is there a security vetting process required before a component is added to the list?
- Is there a security vetting process required before a component is used by the project?
- Does the list enumerate banned components?
- Does the project review the list, and vulnerabilities of components on the list? On a schedule?
Keywords
stack, operating system, database, application server, runtime environment, language, library, component, patch, framework, sandbox, environment, network, tool, compiler, service, version
Links
BSIMM R2.3 Create standards for technology stacks.
Apply Secure Coding Standards
Apply (and define, if necessary) security-focused coding standards for each language and component used in building the software.
Description
A secure coding standard consists of security-specific usage rules for the language(s) used to develop the project’s software.
Practice Implementation Questions
- Is there a coding standard used by the project?
- Are security-specific rules included in the project’s coding standard?
- Is logging required by the coding standard?
- Are rules for cryptography (encryption and decryption) specified in the coding standard?
- Are technology-specific security rules included in the project’s coding standard?
- Are good and bad examples of security coding given in the standard?
- Are checks of the project coding standards automated?
- Are project coding standards enforced?
- Are project coding standards revised as needed? On a schedule?
Keywords
avoid, banned, buffer overflow, checklist, code, code review, code review checklist, coding technique, commit checklist, dependency, design pattern, do not use, enforce function, firewall, grant, input validation, integer overflow, logging, memory allocation, methodology, policy, port, security features, security principle, session, software quality, source code, standard, string concatenation, string handling function, SQL Injection, unsafe functions, validate, XML parser
Links
BSIMM SR 1.4 Use secure coding standards
Apply Security Tooling
Use security-focused verification tool support (e.g. static analysis, dynamic analysis, coverage analysis) during development and testing.
Description
Use security-focused verification tool support (e.g. static analysis, dynamic analysis, coverage analysis) during development and testing. Static analysis tools apply verification rules to program source code. Dynamic analysis tools apply verification rules to running programs. Fuzz testing is a security-specific form of dynamic analysis, focused on generating program inputs that can cause program crashes. Coverage analyzers report on how much code is ‘covered’ by the execution of a set of tests. Combinations of static, dynamic, and coverage analysis tools support verification of software.
Practice Implementation Questions
- Are security tools used by the project?
- Are coverage analyzers used?
- Are static analysis tools used?
- Are dynamic analysis tools used?
- Are fuzzers used on components that accept data from untrusted sources (e.g. users, networks)?
- Are defects created for (true positive) warnings issued by the security tools?
- Are security tools incorporated into the release build process?
- Are security tools incorporated into the developer build process?
Keywords
automate, automated, automating, code analysis, coverage analysis, dynamic analysis, false positive, fuzz test, fuzzer, fuzzing, malicious code detection, scanner, static analysis, tool
Links
Use automated tools along with manual review.
Perform Security Review
Perform security-focused review of all deliverables, including, for example, design, source code, software release, and documentation. Include reviewers who did not produce the deliverable being reviewed.
Description
Manual review of software development deliverables augments software testing and tool verification. During review, the team applies its domain knowledge, expertise, and creativity explicitly to verification rather than implementation. Non-author reviewers, e.g. teammates, reviewers from outside the team, or security experts, may catch otherwise overlooked security issues.
Practice Implementation Questions
Each of the following questions applies to the decision to:
- change code, configuration, or documentation
- include a (revised) component in the project
- release the (revised) software built by the project
- Does the project use a scheme for identifying and ranking security-critical components?
- Is the scheme used to prioritize review of components?
- Are the project’s standards documents considered when making the decision?
- Are the project’s technical stack requirements considered when making the decision?
- Are the project’s security requirements considered when making the decision?
- Are the project’s threat models considered when making the decision?
- Are the project’s security test results considered when making the decision?
- Are the project’s security tool outputs considered when making the decision?
- Are changes to the project’s documentation considered when making the decision?
Keywords
architecture analysis, attack surface, bug bar, code review, denial of service, design review, elevation of privilege, information disclosure, quality gate, release gate, repudiation, review, security design review, security risk assessment, spoofing, tampering, STRIDE
Links
OWASP Perform source-level security review
BSIMM CR1.5 Make code review mandatory for all projects.
Perform Security Testing
Consider security requirements, threat models, and all other available security-related information and tooling when designing and executing the software’s test plan.
Description
Testing includes using the system from an attacker’s point of view. Consider security requirements, threat model(s), and all other available security-related information and tooling when developing tests. Where possible, automate test suites, and include security-focused tools.
Practice Implementation Questions
- Is the project threat model used when creating the test plan?
- Are the project’s security requirements used when creating the test plan?
- Are features of the technical stack used by the software considered when creating the test plan?
- Are appropriate fuzzing tools applied to components accepting untrusted data as part of the test plan?
- Are tests created for vulnerabilities identified in the software?
- Are the project’s technical stack rules checked by the test plan?
- Is the test plan automated where possible?
- Are the project’s technical stack rules enforced during testing?
Keywords
boundary value, boundary condition, edge case, entry point, input validation, interface, output validation, replay testing, security tests, test, tests, test plan, test suite, validate input, validation testing, regression test
Links
OWASP Identify, implement, and perform security tests
BSIMM ST3.2 Perform fuzz testing customized to application APIs
Publish Operations Guide
Document security concerns applicable to administrators and users, supporting how they configure and operate the software.
Description
The software’s users and administrators need to understand the security risks of the software and how those risks change depending on how the software is configured. Document security concerns applicable to users and administrators, supporting how they operate and configure the software. The software’s security requirements and threat model should be expressed in the vocabulary of the user (and administrator).
Practice Implementation Questions
- Are security-related aspects of installing and configuring the software documented where users can access them?
- Are security-related aspects of operating the software documented where users can access them?
- Are abuse cases and misuse cases used to support user documentation?
- Are expected security-related alerts, warnings and error messages documented for the user?
Keywords
administrator, alert, configuration, deployment, error message, guidance, installation guide, misuse case, operational security guide, operator, security documentation, user, warning
Links
OWASP Build operational security guide
Perform Penetration Testing
Arrange for security-focused stress testing of the project’s software in its production environment. Engage testers from outside the software’s project team.
Description
Testing typically is focused on software before it is released. Penetration testing focuses on testing software in its production environment. Arrange for security-focused stress testing of the project’s software in its production environment. To the degree possible, engage testers from outside the software’s project team, and from outside the software project’s organization.
Practice Implementation Questions
- Does the project do its own penetration testing, using the tools used by penetration testers and attackers?
- Does the project work with penetration testers external to the project?
- Does the project provide all project data to the external penetration testers?
- Is penetration testing performed before releases of the software?
- Are vulnerabilities found during penetration testing logged as defects?
Keywords
penetration
Links
OWASP Web Application Penetration Testing
Track Vulnerabilities
Track software vulnerabilities detected in the software, and prioritize their resolution.
Description
Vulnerabilities, whether they are found in development, testing, or production, are identified in a way that allows the project team to understand, resolve, and test quickly and efficiently. Track software vulnerabilities detected in the software, and prioritize their resolution.
Practice Implementation Questions
- Does the project have a plan for responding to security issues (vulnerabilities)?
- Does the project have an identified contact for handling vulnerability reports?
- Does the project have a defect tracking system?
- Are vulnerabilities flagged as such in the project’s defect tracking system?
- Are vulnerabilities assigned a severity/priority?
- Are vulnerabilities found during operations recorded in the defect tracking system?
- Are vulnerabilities tracked through their repair and the re-release of the affected software?
- Does the project have a list of the vulnerabilities most likely to occur, based on its security requirements, threat modeling, technical stack, and defect tracking history?
Keywords
bug, bug bounty, bug database, bug tracker, defect, defect tracking, incident, incident response, severity, top bug list, vulnerability, vulnerability tracking
Links
BSIMM CMVM 2.2 Track software bugs found during ops through the fix process.
Improve Development Process
Incorporate "lessons learned" from security vulnerabilities and their resolutions into the project’s software development process.
Description
Experience with identifying and resolving vulnerabilities, and testing their fixes, can be fed back into the development process to avoid similar issues in the future. Incorporate "lessons learned" from security vulnerabilities and their resolutions into the project’s software development process.
Practice Implementation Questions
- Does the project have a documented standard for its development process?
- When vulnerabilities occur, is considering changes to the development process part of the vulnerability resolution?
- Are guidelines for implementing the other SP-EF practices part of the documented development process?
- Is the process reviewed for opportunities to automate or streamline tasks?
- Is the documented development process enforced?
Keywords
architecture analysis, code review, design review, development phase, gate, root cause analysis, software development lifecycle, software process
Links
BSIMM CP 3.3 Drive feedback from SSDL data back to policy.
Subjective Practice Adherence Measurement
Text-based practice adherence data collection.
Description
SP-EF includes five subjective adherence measures that can be used in surveys and interviews:
- Usage - How often is this practice applied?
- Values: not used, daily, weekly, monthly, quarterly, annually, less than annually.
- Ease Of Use - How easy is this practice to use?
- Values: Very Low, Low, Nominal, High, Very High.
- Utility - How much does this practice assist in providing security in the software under development?
- Values: Very Low, Low, Nominal, High, Very High.
- Training - How well trained is the project staff in the practices being used?
- Values: Very Low, Low, Nominal, High, Very High.
- Effort - How much time, on average, does applying this practice take each time you apply it?
- Ordinal values: 5 minutes or less, 5-15 minutes, 15-30 minutes, 30 minutes-1 hour, 1-4 hours, 4-8 hours, 1-2 days, 3-5 days, over 5 days
- Ratio values: hours (fractional allowed)
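For survey tooling, the allowed answers can be captured directly from the list above; the sketch below is one hypothetical way to validate responses (the ratio-valued Effort variant would be recorded separately as a number of hours):

```python
# Ordinal answer sets for the five subjective adherence measures listed above.
SUBJECTIVE_MEASURES = {
    "Usage": ["Not Used", "Daily", "Weekly", "Monthly", "Quarterly", "Annually", "Less than Annually"],
    "Ease Of Use": ["Very Low", "Low", "Nominal", "High", "Very High"],
    "Utility": ["Very Low", "Low", "Nominal", "High", "Very High"],
    "Training": ["Very Low", "Low", "Nominal", "High", "Very High"],
    "Effort": ["5 minutes or less", "5-15 minutes", "15-30 minutes", "30 minutes-1 hour",
               "1-4 hours", "4-8 hours", "1-2 days", "3-5 days", "over 5 days"],
}

def validate_response(measure: str, answer: str) -> bool:
    """True if a survey answer is one of the listed values for its measure."""
    return answer in SUBJECTIVE_MEASURES.get(measure, [])
```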
Objective Practice Adherence Measurement
Practice adherence data based on concrete project data.
Description
Objective metrics are drawn from evaluation of the project data, given our expectation that the security practices of a team will be reflected in the documentation the team creates, and the logs of activity the team generates.
We collect the following objective practice adherence metrics for each practice:
- Presence - Whether we can find evidence of the practice.
- Values: True, False.
- Prevalence - Proportion of the team applying the practice: the ratio of all practice users to all team members.
- Values: 0 - 1.00.
- Alternate values: Low, Medium, High.
When recording practice adherence manually, it is sufficient to record the following data elements:
- Practice - Name of security practice associated with document.
- Practice Date - Date for which evidence of practice use is claimed by the researcher.
- Presence - as described above
- Prevalence - as described above
When recording practice adherence events automatically from emails, issues, or commits, we record the following data elements:
- Practice - Name of security practice associated with document.
- Event Date - Date on which document was created.
- Source - Data source for document. Possible Values: Version Control, Defect Tracker, Email.
- Document Id - Id of document in its source, e.g. commit hash, bug tracker id, email id.
- Creator - Role of the author of the source document.
- Assignee - For defect report documents, the person assigned the defect, where applicable.
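A sketch of one way to extract adherence events automatically is shown below: commit messages are matched against the practice Keywords lists given earlier. The keyword sets shown are abbreviated and illustrative, and keyword matching alone will produce false positives that need manual review.

```python
import subprocess

# Abbreviated, illustrative keyword sets drawn from the practice Keywords lists above.
PRACTICE_KEYWORDS = {
    "Apply Security Tooling": {"static analysis", "fuzzer", "fuzzing", "coverage analysis"},
    "Perform Security Testing": {"security test", "test plan", "input validation"},
    "Track Vulnerabilities": {"vulnerability", "incident response", "bug bounty"},
}

def adherence_events_from_commits(repo_path: str):
    """Yield (commit hash, practice) pairs when a commit subject mentions a practice keyword."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%H%x09%s"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    for line in log:
        commit_hash, _, subject = line.partition("\t")
        text = subject.lower()
        for practice, keywords in PRACTICE_KEYWORDS.items():
            if any(keyword in text for keyword in keywords):
                yield commit_hash, practice
```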
Outcome Measures
Per Vulnerability Attributes
Description
While hundreds of security metrics have been proposed, tracking a relatively small set of attributes for each vulnerability detected in the software is sufficient to replicate many of them.
Data Collection
In addition to data kept for defects (e.g. those attributes listed by Lamkanfi [32]), we collect:
- Source – The name of the bug tracker or bug-tracking database where the vulnerability is recorded.
- Identifier – The unique identifier of the vulnerability in its source database.
- Description – Text description of the vulnerability.
- Discovery Date – Date the vulnerability was discovered.
- Creation Date – Date the tracking record was created.
- Patch Date – The date the change resolving the vulnerability was made.
- Release Date – The date the software containing the vulnerability was released.
- Severity – The criticality of the vulnerability. Scale: Low, Medium, High.
- Phase – Indication of when during the development lifecycle the vulnerability was discovered.
- Reporter – Indication of who found the vulnerability.
- Role
- (Optional) Identifier (name, email)
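A record with these attributes might look like the sketch below; the field names mirror the list above and are otherwise illustrative:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VulnerabilityRecord:
    source: str                  # bug tracker or database where the vulnerability is recorded
    identifier: str              # unique identifier in the source database
    description: str
    discovery_date: str
    creation_date: str
    patch_date: Optional[str]    # date of the resolving change, if known
    release_date: Optional[str]  # release date of the software containing the vulnerability
    severity: str                # Low, Medium, or High
    phase: str                   # lifecycle phase of discovery, or 'Post-Release'
    reporter_role: str           # role of whoever found the vulnerability
    reporter_id: Optional[str]   # optional name or email
```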
Pre-Release Defects
Description
Defects discovered during the development process should be credited to the team and its development practices.
Definition
Defects found in new and changed code before software is released.
Data Collection
When a defect is found in new or changed code before the software is released, collect the per-defect attributes and mark the development phase during which the defect was found: Requirements, Design, Development, or Testing. Count the total number of defects found in new and changed code before the software is released.
Post-Release Defects
Description
Defects discovered after the software is released should be studied for how they could be identified and resolved sooner.
Definition
Defects found in released software.
Data Collection
When a defect is found in released software, record its per-defect attributes and mark the Phase as ‘Post-Release’. Count the total number of defects found in released software.
Vulnerability Density
Description
Vulnerability Density (Vdensity) is the cumulative vulnerability count per unit size of code. We adopt a size unit of thousand source lines of code (KSLOC).
Definition
Total Vulnerabilities divided by number of KSLOC in the software, at a point in time.
Data Collection
Derived from Pre- and Post-Release Vulnerabilities and SLOC metrics.
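As a minimal sketch of the derivation, assuming the vulnerability count and SLOC have already been collected as described above:

```python
def vulnerability_density(total_vulnerabilities: int, sloc: int) -> float:
    """Vdensity: cumulative vulnerabilities per thousand source lines of code (KSLOC)."""
    return total_vulnerabilities / (sloc / 1000.0)
```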
Pre-Release Vulnerabilities
Description
Vulnerabilities discovered during the development process should be credited to the team and its development practices.
Definition
Vulnerabilities found in new and changed code before software is released.
Data Collection
When a vulnerability is found in new or changed code before the software is released, collect the per-vulnerability attributes and mark the development phase during which the vulnerability was found: Requirements, Design, Development, or Testing. Count the total number of vulnerabilities found in new and changed code before the software is released.
Post-Release Vulnerabilities
Description
Vulnerabilities discovered after the software is released should be studied for how they could be identified and resolved sooner.
Definition
Vulnerabilities found in released software.
Data Collection
When a vulnerability is found in released software, record its per-vulnerability attributes and mark the Phase as ‘Post-Release’. Count the total number of vulnerabilities found in released software.
Vulnerability Removal Effectiveness
Description
Vulnerability Removal Effectiveness (VRE) is the ratio of pre-release vulnerabilities to total vulnerabilities found, pre- and post-release, analogous to defect removal effectiveness. Ideally, a development team will find all vulnerabilities before the software is shipped. VRE is a measure for how effective the team’s security practices are at finding vulnerabilities before release.
Definition
Pre-Release Vulnerabilities divided by total number of Pre- and Post-Release Vulnerabilities in the software, at a point in time.
Data Collection
Derived from Pre- and Post-Release Vulnerabilities metrics.
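A minimal sketch of the VRE calculation, assuming the pre- and post-release vulnerability counts have been collected as described above:

```python
def vulnerability_removal_effectiveness(pre_release: int, post_release: int) -> float:
    """VRE: fraction of all known vulnerabilities that were found before release."""
    total = pre_release + post_release
    # VRE is undefined when no vulnerabilities are known; report 0.0 in that case.
    return pre_release / total if total else 0.0
```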