A Security Practices Evaluation Framework

Developing a Security Practices Evaluation Framework

In this section, we briefly describe the development process behind the SP-EF security practices and present the SP-EF data elements. SP-EF consists of context factors, practice adherence measures, and outcome measures. Context factors are software engineering measurements used to characterize a project for comparison with other projects, e.g., (Programming) Language and number of Source Lines of Code (SLOC). Practice adherence measures describe security practice use by the project team. Outcome measures are metrics used to assess quality from the security perspective. A description of how to collect the data elements is available in the SP-EF measurement guidebook[^1].

We have adopted the design of XP-EF [1], where possible. Goals for the framework include that the metrics be:

  • Simple enough for a small team to measure without a metrics specialist and with minimal burden
  • Concrete and unambiguous
  • Comprehensive and complete enough to cover vital factors

Security Practices

A software development security practice is an action a software development team takes to assure that the team’s security goals are met for the software they are building. The four lists of software development security practices we identified in the introduction (BSIMM, SAFECode, SDL, and SSAP) overlap in some ways, differ in others, and make assumptions about the contexts in which they are applied. Choosing a single list leaves out practices unique to the other lists, and combining the lists is non-trivial: in some cases the practices match, in some cases they share elements, and in some cases practices are identified in their descriptions as optional. While the SP-EF practice adherence metrics can be applied to any of the practices in the four lists, comparing similar practices across lists requires identifying the similarities between practices. For these reasons, we used a systematic process to define a set of security practices that captures the central elements of software development security practice use in a broad range of situations.

In keeping with our goal of supporting software development teams, we set the following inclusion/exclusion criteria for software development security practices:

  • Include practices employed by software developers and testers.

  • Include practices producing inputs for practices used by software developers and testers.

  • Exclude language, environment, and network specifics.

To capture security practice use in a consistent way at the level of actions needed within roles on a development team, we developed a template to translate the practices identified in our sources into a consistent format (a representational sketch follows the element descriptions below). The security practice template is as follows: <Role> <Verb> <Artifact(s) Affected> guided by <Artifact(s) Referenced>, where:

  • The <Role> element indicates the role played by the person deciding upon, applying, auditing, and/or enforcing the practice.

  • The <Verb> element describes the action taken by the practice’s user, drawn from the following list: Apply, Document, Improve, Perform, Provide, Publish, Track.

  • Artifact is used as a generic term for the objects and information sources used or created during software development.

    • “Artifact(s) Affected” is a list of artifacts that are created or changed by the practice.

    • “Artifact(s) Referenced” is a list of artifacts that are used to inform decisions by the user as they apply the practice.
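
One way to picture the template is as a small record with one field per template element. The sketch below is illustrative only; the class and field names are assumptions, not part of the SP-EF definition:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record for one filled-in security practice template entry.
# Field names mirror the template elements described above; they are
# illustrative, not defined by SP-EF itself.
@dataclass
class PracticeTemplateEntry:
    role: str                      # e.g., "Management", "Development", "Testing"
    verb: str                      # e.g., "Apply", "Document", "Track"
    artifacts_affected: List[str]  # artifacts created or changed by the practice
    artifacts_referenced: List[str] = field(default_factory=list)  # artifacts consulted

# Example: the Development-role entry from the data classification example below.
entry = PracticeTemplateEntry(
    role="Development",
    verb="Apply",
    artifacts_affected=["Source Code", "Unit Tests"],
    artifacts_referenced=["Data Classification Scheme"],
)
```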

For each of the source practices, we aligned its user, verb, and artifacts with the template. As an example, BSIMM’s Create Data Classification Scheme calls for the software development organization to identify and classify the data it manages according to potential security risks.

Filling the template out for BSIMM’s “Create a data classification scheme and inventory” yields:

  • Role: Management; Verb: Create; Artifacts Affected: Data Classification Scheme; Artifacts Referenced: Assets, Threats, Requirements, Bug Reports

  • Role: Development; Verb: Apply; Artifacts Affected: Source Code, Unit Tests; Artifacts Referenced: Data Classification Scheme

  • Role: Testing; Verb: Apply; Artifacts Affected: Test Plan, Test Suite; Artifacts Referenced: Data Classification Scheme

We grouped our practices according to the security-related artifacts created or referenced. For example, we treat the three bullet points in the data classification scheme example above (as well as other data classification practices in the source lists) as a single practice, “Apply Data Classification Scheme”. The complete set of source statements and their classifications is available online[^2].

Through our classification, we identified 16 core practices that, in combination with roles, verbs, phases, and artifacts, were sufficient to classify the source practices we identified. Two of the practices, “Apply Security Principles” and “Publish Disclosure Policy”, were identified infrequently (fewer than 10 times across the sources) and were excluded from the framework, yielding 14 practices.

Once we classified each source practice as one of our core security practices, we extracted keywords that were unique to each set of source practices to use in identifying the presence of our core practices in the project data sources.
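
As a minimal sketch of how such keywords might be used to flag a core practice in project data (e.g., commit messages or issue text): the keyword sets and the simple substring matching below are illustrative assumptions, not prescribed by SP-EF, and the actual per-practice keywords appear in Table 1.

```python
from typing import Dict, List

# Illustrative keyword sets; the actual keywords per practice are listed in Table 1.
PRACTICE_KEYWORDS: Dict[str, List[str]] = {
    "Apply Data Classification Scheme": ["data classification", "sensitive data"],
    "Track Vulnerabilities": ["cve", "vulnerability report"],  # hypothetical example
}

def classify_event(text: str) -> List[str]:
    """Return the core practices whose keywords appear in one project event."""
    lowered = text.lower()
    return [practice
            for practice, keywords in PRACTICE_KEYWORDS.items()
            if any(kw in lowered for kw in keywords)]

# Example: a commit message referencing the data classification scheme.
print(classify_event("Tag PII fields per the data classification scheme"))
```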

Table 1 lists the roles, artifacts, and keywords for each practice.

Practice Adherence Metrics

SP-EF’s practice adherence metrics are designed to measure how closely a project adheres to a set of security practices from objective and subjective perspectives.

For each practice, a set of three objective metrics, Presence, Frequency, and Prevalence, measures the practice’s adoption and use.

For each practice, a set of four subjective metrics, Usage Frequency, Ease of Use, Assistance, and Training, measures the practice’s usage.

Project teams may adapt their methodology by selecting (and dropping) practices to suit the requirements of their customers, their business and operational environments, and their awareness of trends in software development. Identifying when individual security practices are first applied (adoption), and how they are applied over time (use), is necessary to our investigation.

We take three measures of practice adoption and use, derived from the collection of security practice events (a computational sketch follows the list):

  1. Presence - Is there evidence of the practice in the project data during a project month? Values: Counts of events classified with the practice’s keywords.

  2. Frequency - How often is this practice applied? Values: Not Used, Daily, Weekly, Monthly, Quarterly, Annually, Less than Annually. Values are derived from the ratio of practice counts to the number of months from the start to the end of the project period examined.

  3. Prevalence - What proportion of the team uses the practice? Values range from 0 to 1, computed as the ratio of team members (Creator, Assignee) using the practice at least once during a project month to the total number of unique team members (Creator, Assignee) active during that month.
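
A minimal computational sketch of the three metrics follows, assuming project events have already been classified by practice keywords and tagged with a project month and an actor. The event structure and the frequency cut-offs are illustrative assumptions; SP-EF defines only the categorical values.

```python
from dataclasses import dataclass
from typing import List, Set

@dataclass
class PracticeEvent:
    """One project event (e.g., commit, issue, email) already classified to a
    practice via its keywords. Illustrative structure, not defined by SP-EF."""
    practice: str
    month: str   # project month, e.g. "2016-03"
    actor: str   # Creator or Assignee of the underlying item

def presence(events: List[PracticeEvent], practice: str, month: str) -> int:
    """Presence: count of events classified with the practice's keywords in a month."""
    return sum(1 for e in events if e.practice == practice and e.month == month)

def frequency(events: List[PracticeEvent], practice: str, n_months: int) -> str:
    """Frequency: categorical rate derived from the ratio of practice events to
    project months. The cut-offs below are assumed for illustration only."""
    count = sum(1 for e in events if e.practice == practice)
    if count == 0 or n_months == 0:
        return "Not Used"
    per_month = count / n_months
    if per_month >= 20:        # roughly one event per working day
        return "Daily"
    if per_month >= 4:
        return "Weekly"
    if per_month >= 1:
        return "Monthly"
    if per_month >= 1 / 3:
        return "Quarterly"
    if per_month >= 1 / 12:
        return "Annually"
    return "Less than Annually"

def prevalence(events: List[PracticeEvent], practice: str, month: str,
               active_members: Set[str]) -> float:
    """Prevalence: fraction of team members active in the month who used the
    practice at least once during that month."""
    users = {e.actor for e in events if e.practice == practice and e.month == month}
    return len(users & active_members) / len(active_members) if active_members else 0.0
```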

Outcome Measures

Outcome measures are a set of attributes and values that are used to describe the security-related outcomes of the project.

  1. Pre-release Vulnerabilities - Vulnerabilities found in new and changed code before software is released

  2. Post-release Vulnerabilities - Vulnerabilities found in new and changed code after software is released

  3. Vulnerability Density - Vulnerability Density (Vdensity) is the cumulative vulnerability count per unit size of code [19]. We adopt a size unit of thousand source lines of code (KSLOC). Lower values for Vdensity indicate higher levels of security.

  4. Vulnerability Removal Effectiveness - Vulnerability Removal Effectiveness (VRE) is the ratio of pre-release vulnerabilities to total vulnerabilities found, defined in parallel to defect removal effectiveness [20]. Both outcome ratios are written out below.
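
Written out from the definitions above, with pre- and post-release vulnerability counts denoted V_pre and V_post, and assuming the cumulative count in Vdensity includes both:

```latex
\[
  \mathit{Vdensity} = \frac{V_{pre} + V_{post}}{\mathrm{KSLOC}},
  \qquad
  \mathit{VRE} = \frac{V_{pre}}{V_{pre} + V_{post}}
\]
```

Lower Vdensity and higher VRE both suggest better security outcomes: fewer vulnerabilities per KSLOC, and a larger share of vulnerabilities caught before release.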

These measures can be used to monitor the set of security practices in place on a software development project, and whether outcomes change as a result of introducing or removing practices.

Context Factors

Recording a project’s context factors is essential for comparison purposes, for understanding the similarities and differences between the project and one’s own environment, and for understanding the potential risk involved for the software under development.

  1. Source Lines of Code (SLOC) - Total number of source lines of code (no comments, no blanks), as counted by the cloc utility[^3] for the measurement time period.

  2. Churn - Total number of SLOC added, deleted, or changed during the measurement time period (a collection sketch for SLOC and churn follows this list).

  3. Developers - Total number of unique developers working on the project during the measurement time period.

  4. Number of Machines - Number of machines is a count of how many machines (physical or virtual) on which the software will ultimately be installed.

  5. Number of Identities Managed - Number of Identities Managed is the maximum count of how many personal identities are managed by the software being developed. A browser might be responsible for managing one or two identities, while a database system might manage millions.

  6. Confidentiality, Integrity, and Availability Requirement (from CVSS) - Subjective assessment of the most sensitive data that passes through, or is kept by, the software under development, for its confidentiality (CR), integrity (IR), and availability (AR) requirements. Values: Low, Medium, High, Not Defined.
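
As a minimal sketch of how the two size-related context factors above might be collected, assuming a Git repository and the cloc utility on the PATH: the repository path and dates are placeholders, and git’s added-plus-deleted line counts are used as an approximation of the churn definition above (a changed line counts as one addition and one deletion).

```python
import json
import subprocess

def sloc(repo_path: str) -> int:
    """Total source lines of code (no comments, no blanks) as reported by cloc."""
    out = subprocess.run(["cloc", "--json", repo_path],
                         capture_output=True, text=True, check=True)
    report = json.loads(out.stdout)
    return report["SUM"]["code"]

def churn(repo_path: str, since: str, until: str) -> int:
    """SLOC added plus deleted in the measurement period, from git's per-file numstat."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--numstat", "--pretty=format:",
         f"--since={since}", f"--until={until}"],
        capture_output=True, text=True, check=True)
    total = 0
    for line in out.stdout.splitlines():
        parts = line.split("\t")
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            total += int(parts[0]) + int(parts[1])
    return total

# Example (placeholder path and dates):
# print(sloc("./project"), churn("./project", "2016-01-01", "2016-02-01"))
```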