T4: Data Science Approaches in Software Engineering


Date: July 22 (Monday)
Period: 1:30 - 4:30 pm (Half-day)

Tutorial Outline


Abstract


There is an abundance of metrics in the field of software reliability engineering. However, it is challenging to incorporate the right metrics into predictive models that are both mathematically sound and reflective of how software behaves in a large production environment. Most importantly, the metrics and models need to help engineering teams improve their practices and processes, enabling them to produce quality products.

High-performance models are needed to enable software practitioners to identify deficient (and superior) development and test practices. Even when using standard practice metrics, and models derived from these metrics, software development teams can, and do, vary substantially in practice adoption and effectiveness. One challenge for researchers and analysts in these organizations is to develop and implement mathematical models that adequately characterize the health of individual practices (such as code review, unit testing, static analysis, and regression testing). These models can enable process and quality assurance groups to assist engineering teams in surgically repairing broken practices or replacing them with more effective and efficient ones.
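
As an illustration of what a practice-health model might look like, below is a minimal sketch in Python: a logistic regression over hypothetical per-release practice metrics, used to estimate the risk of a poor downstream outcome. The metric names, data, and model choice are illustrative assumptions, not the actual models used at Cisco.

    # Minimal sketch: characterizing the "health" of development/test practices
    # from in-process metrics. All metric names and data are hypothetical.
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    # Hypothetical per-release practice metrics (one row per release).
    releases = pd.DataFrame({
        "code_review_coverage":    [0.92, 0.61, 0.85, 0.40, 0.77, 0.95, 0.55, 0.88],
        "unit_test_pass_rate":     [0.99, 0.90, 0.97, 0.82, 0.93, 0.99, 0.88, 0.96],
        "static_analysis_density": [0.3,  1.8,  0.6,  2.4,  1.1,  0.2,  2.0,  0.5],  # findings/KLOC
    })
    # Outcome label: 1 = elevated customer-found defects after release.
    elevated_defects = np.array([0, 1, 0, 1, 0, 0, 1, 0])

    # Fit a simple logistic model relating practice metrics to the outcome.
    model = LogisticRegression().fit(releases, elevated_defects)

    # "Health" check for a new release = predicted risk of a poor outcome.
    new_release = pd.DataFrame([{"code_review_coverage": 0.70,
                                 "unit_test_pass_rate": 0.91,
                                 "static_analysis_density": 1.4}])
    risk = model.predict_proba(new_release)[0, 1]
    print(f"Estimated risk of elevated customer-found defects: {risk:.2f}")

In practice, such a model would be trained on many more releases and validated carefully before its outputs were used to judge practice health.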

In this tutorial, we will describe our experience with various types of in-process and downstream metrics. We will then describe how such metrics are used in model building and implementation, and the boundaries within which certain types of models perform well. We will also address how to balance model generalizability and specificity in order to integrate the correct computational strategies and methods into everyday engineering workflows.

An important part of the analysis and modeling effort is the correlative 'linking' of development and test metric values to customer experience outcomes, and then the 'linking' of those outcomes to customer sentiment (i.e., satisfaction). These 'linkages' are essential not only for convincing engineering leadership to use certain computational tools in practice, but also for enabling investigators, at an early stage, to design experiments and pilots that test model applicability for future products and releases. After convincing experiments and pilots have been demonstrated, much work remains: choosing a useful, and manageable, set of metrics; establishing goals and tracking/reporting mechanisms; and planning and implementing the tooling, training, and rollout steps. These practical considerations invariably put a strain on the models, so the models and ancillary analyses must be resilient, 'industrial strength.'
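
To make the 'linking' idea concrete, below is a minimal sketch in Python that correlates a hypothetical in-process metric with a customer experience outcome, and that outcome with a customer satisfaction score, using Spearman rank correlation. The variable names and data are synthetic assumptions for illustration; real analyses span many releases and products and require more careful statistical treatment.

    # Minimal sketch of the metric-to-experience-to-satisfaction "linkage".
    # All values are synthetic and for illustration only.
    import pandas as pd
    from scipy.stats import spearmanr

    data = pd.DataFrame({
        # In-process metric: e.g., defects found per KLOC during system test.
        "system_test_defect_density": [1.2, 0.8, 2.1, 0.5, 1.7, 0.9, 2.4, 0.6],
        # Customer experience outcome: e.g., customer-found defects in the first 6 months.
        "customer_found_defects":     [14, 9, 25, 5, 20, 10, 30, 6],
        # Customer sentiment: e.g., satisfaction survey score (1-5).
        "satisfaction_score":         [3.9, 4.3, 3.1, 4.6, 3.4, 4.2, 2.9, 4.5],
    })

    rho1, p1 = spearmanr(data["system_test_defect_density"], data["customer_found_defects"])
    rho2, p2 = spearmanr(data["customer_found_defects"], data["satisfaction_score"])
    print(f"in-process -> customer experience: rho={rho1:.2f} (p={p1:.3f})")
    print(f"customer experience -> satisfaction: rho={rho2:.2f} (p={p2:.3f})")

If such correlations hold up across releases and products, they provide the evidence needed to convince engineering leadership and to design the pilots described above.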

Understanding a model's practical limitations and strengths is an important aspect of its use – just as the mathematical and statistical limitations and strengths underscore a model's scientific validity. Both factors, mathematical and practical, need to mesh properly in a workable way in a data-driven engineering environment. The tutorial addresses the integration of these factors.

We will describe our experience in building and implementing models used by engineering teams that employ diverse development approaches, including waterfall, hybrid, and agile development. We will show how we link in-process measures/metrics (from development and test) to customer experience, and then to customer satisfaction, which in turn correlates strongly with company revenue and market share. We will discuss the steps involved in choosing the most valuable metrics, setting goals for these metrics, and using them to help improve development and test practices and processes.

Speakers


Pete Rotella
Cisco Systems, Inc., USA


Pete Rotella has over 30 years' experience in the software industry, as a leader of large-scale development projects and as a senior software engineering researcher. He has led major system development projects at IBM Corporation, the U.S. Environmental Protection Agency, the U.S. National Institutes of Health, GlaxoSmithKline plc, Unisys Corporation, and several statistical systems startups. For the past 16 years, he has focused on improving software reliability at Cisco Systems, Inc.


Sunita Chulani
Cisco Systems, Inc., USA


Sunita Chulani is an Advisory Engineer/Senior Manager of Analytical Models and Insights at Cisco Systems. She has deep subject matter expertise in software metrics, measurement, and modeling, and is responsible for developing insights derived from descriptive and prescriptive quality data analytics. She understands the interplay between engineering and management, with strong analytical, communication, and leadership skills. Her team's charter focuses on Analytic Models and Customer/Product Insights. Sunita is a go-to expert with a 9-year tenure at Cisco. She holds several patents and has co-authored a book, several book chapters, encyclopedia articles, and more than five dozen papers and presentations at prestigious conferences. She is also active in the IEEE, is influential in the field of software reliability, and has taught graduate-level courses at Carnegie Mellon University. Prior to Cisco, Sunita was a Research Staff Member at IBM Research in the Center for Software Engineering. She received her Ph.D. and Master's in Computer Science (with an emphasis on Statistics/Data Analysis and Software Economics) from the University of Southern California.