Software Dependencies - EEL6883 - Software Engr II

Presented by: Abirami Poonkundran
This paper is a case study on the impact of
◦ Syntactic Dependencies,
◦ Logical Dependencies and
◦ Work Dependencies
on a software development project, and
identifies which dependencies have the
highest impact on fault proneness
Software Dependencies
◦ Syntactic Dependencies
◦ Logical Dependencies
◦ Work Dependencies
Data Collection
Measuring Failure
Pros and Cons
Research has shown that software faults are
caused by violation of dependencies
Dependencies could be:
◦ Software Dependencies
 Technical
 Caused by developers
◦ Work Dependencies
 Organizational
 Caused by how work is organized
This paper examines the relative impact that
each of these dependencies has on the fault
proneness of the software system
Software Dependencies could be:
◦ Syntactic
◦ Logical
Focuses on Control and Dataflow
Dependencies are discovered by analysis of
source code or from an intermediate
representation like byte code or syntax trees
These dependencies could be:
◦ Data Related Dependency - e.g., a particular data
structure modified by a function and used in
another function
◦ Functional Dependency – e.g., Method A calls
Method B
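The two kinds of syntactic dependency above can be illustrated with a small static-analysis sketch. This is only an illustration in Python using the standard `ast` module; the study itself analyzed C and C++ code, and the toy module, `functional_dependencies`, and `data_dependencies` are hypothetical names introduced here.

```python
import ast

# Hypothetical toy module: both functions touch the shared structure `totals`
# (data-related dependency) and `report` calls `update_total` (functional dependency).
SOURCE = '''
totals = {}

def update_total(key, amount):
    totals[key] = totals.get(key, 0) + amount

def report(key):
    update_total(key, 0)
    return totals[key]
'''

def functional_dependencies(source):
    """Map each function name to the set of plain function names it calls."""
    tree = ast.parse(source)
    deps = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            calls = set()
            for inner in ast.walk(node):
                # Only direct calls by name are counted; method calls are ignored.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    calls.add(inner.func.id)
            deps[node.name] = calls
    return deps

def data_dependencies(source, shared_name):
    """Return the functions that reference the name `shared_name`."""
    tree = ast.parse(source)
    users = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            for inner in ast.walk(node):
                if isinstance(inner, ast.Name) and inner.id == shared_name:
                    users.add(node.name)
    return users

print(functional_dependencies(SOURCE))
print(data_dependencies(SOURCE, "totals"))
```

Both functions work purely on the parsed syntax tree, which is why dependencies of this kind are called syntactic: nothing is executed and no change history is consulted.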
Dependencies between the source code files
of a system that are changed together as part
of software development
Often Logical Dependencies provide more
valuable information than Syntactic
Dependencies (e.g., in Remote Procedure
They can identify important dependencies
that are not visible in Syntactic Code analysis
Only recent research has started shedding light
on the impact of human and organizational
factors on the failure proneness of software
Caused because of lack of proper
communication and coordination between
Research has shown that identification and
management of work dependencies is a major
Examined two large software development
◦ Project A
 Complex distributed system
 Data cover 3 years of development activity
 The company had 114 developers grouped into 8
development teams across 3 development locations
 ≃ 5 million lines of code distributed across 7,737 source
code files written in C
◦ Project B:
 Embedded software system
 40 developers in the project over a period of 5 years
 1.2 million lines of code written in C and C++
In both projects, every change to source code
was controlled by Modification Requests (MR)
Every change made to the source code had to be
committed to the version control system
Information Used for this Analysis:
◦ Collected a total of 8,257 and 3,372 MRs for Project
A and Project B, respectively
◦ Version control system data from both projects
◦ The source code itself from both projects
Goal is to investigate failure proneness at the
file level
File Buggyness – indicates whether a file has
been modified in the course of resolving a
Used C-REX tool to identify programming
language tokens and references in each entity
of each source-code file
A source code snapshot was taken every
Syntactic dependency analysis was done for
each source code snapshot
Syntactic dependencies between source code
files were identified from data, function and
method references
Relate source-code files that are modified
together as part of an MR
If only one file was changed for an MR, then
no dependency is recorded
Using the Commit information from the
Version control system, a logical dependency
matrix (LDM) was created
LDM is a symmetric matrix of source-code
files where Cij represents the sum, across all
releases, of the number of times files i and j
were changed together as part of an MR
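A minimal sketch of how such a matrix could be built from MR data follows. It assumes each MR is available as a list of the file names it changed. The function name `logical_dependency_matrix`, the sparse dictionary representation, and the sample file names are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations

def logical_dependency_matrix(mrs):
    """Count, for every pair of files, the MRs in which both files changed.

    `mrs` is an iterable of file-name lists, one list per MR. The result is a
    sparse symmetric matrix stored as {(file_i, file_j): count}.
    """
    counts = defaultdict(int)
    for files in mrs:
        unique = sorted(set(files))
        # An MR that touches a single file yields no pairs, so it adds nothing.
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # keep the matrix symmetric
    return dict(counts)

# Hypothetical MR data: three MRs over three files.
mrs = [["a.c", "b.c"], ["a.c", "b.c", "c.c"], ["c.c"]]
ldm = logical_dependency_matrix(mrs)
print(ldm[("a.c", "b.c")])
```

Storing only the non-zero pairs keeps the structure small when most file pairs never change together, which is the likely case for a code base with thousands of files.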
Used two measures:
◦ Workflow Dependencies
 Captures the temporal aspects of the development
 Two developers i and j are said to be interdependent if
the MR was transferred from developer i to
developer j at some point during the life of that MR
◦ Coordination Requirements
 Captures the intradeveloper coordination requirements
 Uses two matrices:
 Task Assignment Matrix – Developer to file matrix
 Task Dependency Matrix – File to file matrix
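The slides name the two matrices but not how they are combined. The sketch below assumes the commonly used product form CR = TA x TD x TA^T, where TA is the developer-to-file matrix and TD is the file-to-file matrix. That formula, the helper names, and the sample data are assumptions for illustration, not taken from the slides.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def coordination_requirements(task_assignment, task_dependency):
    """Developer-to-developer matrix, assuming CR = TA x TD x TA^T."""
    return matmul(matmul(task_assignment, task_dependency), transpose(task_assignment))

# Hypothetical data: 2 developers, 3 files.
# Developer 0 works on file 0, developer 1 works on file 1.
ta = [[1, 0, 0],
      [0, 1, 0]]
# Files 0 and 1 depend on each other; file 2 is independent.
td = [[0, 1, 0],
      [1, 0, 0],
      [0, 0, 0]]

cr = coordination_requirements(ta, td)
print(cr)
```

In this toy case the off-diagonal entries of `cr` are non-zero because the two developers work on files that depend on each other, so they would need to coordinate.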
Analysis consists of two stages:
◦ First Stage: Focus on examining the relative impact
of each dependency type on failure proneness of
source-code files
◦ Second Stage: Verified the consistency of the initial
results by conducting a number of confirmatory
Constructed several logistic regression
If the odds ratio is larger than 1, there is a positive
relationship between the independent and
dependent variables
If the odds ratio is less than 1, the relationship is negative
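As a small worked illustration of this reading: in logistic regression, the odds ratio of a predictor is the exponential of its fitted coefficient. The helper names and the coefficient values below are hypothetical and are not results from the study.

```python
import math

def odds_ratio(coefficient):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coefficient)

def direction(ratio):
    """Interpret an odds ratio as a positive, negative, or absent relationship."""
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "negative"
    return "none"

# Hypothetical coefficients, chosen only to show both cases.
for beta in (0.5, -0.5, 0.0):
    ratio = odds_ratio(beta)
    print(beta, round(ratio, 3), direction(ratio))
```

A positive coefficient always gives a ratio above 1 and a negative coefficient a ratio below 1, so the two ways of reporting a model carry the same information.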
Model I:
◦ Based on LOC and Average Lines Changed
◦ LOC is positively associated with failure proneness
◦ Average lines changed is also positively associated
with defects
Model II:
◦ Introduces Syntactic Dependency measures by:
 Inflow Data
 Has significant impact on error proneness
 Inflow Functional
 This type of syntactic dependency has less impact on
failure proneness
Model III:
◦ Higher number of logical dependencies related to
an increase in the likelihood of failure
Model IV:
◦ Workflow dependencies do increase the likelihood
of defects
Model V:
◦ Coordination requirements have a higher impact in
Project A and a lesser impact in Project B
All dependencies increase fault proneness
Logical Dependencies have the highest impact,
followed by Workflow dependencies and then
Syntactic Dependencies
Analysis is based on data collection from 2
Logical Dependencies have the highest impact
when compared to the other 2 dependencies
Data collection from only 2 projects
The authors do not discuss dependencies other
than software and work dependencies
Not provided a method to solve the errors for the
Need to provide a method to solve the
errors for the dependencies
Discussion about other dependencies
General concepts should be introduced