What Data Scientists (care to) Recall

Date

2022-07-06

Abstract

Program comprehension is a crucial activity for software developers, just as it is for data scientists. It involves gaining new knowledge and recovering lost knowledge, and the process may affect various aspects of software projects. For this reason, there is a substantial body of research on developers’ information needs and on tools and techniques that support program comprehension. “What Developers (Care to) Recall” [1] in particular investigates the link between what software developers think is important to remember, their information needs, and their memory. Krüger et al. studied the importance of knowledge, memory correctness, and self-assessment by interviewing 17 developers of small systems. However, we could not find similar studies that focus specifically on data scientists and their human factors. Data scientists deal with different concepts in their daily tasks, which means that their information needs may differ from those of software developers. To fill this gap, we replicated [1] and conducted the same interview-survey, with the questions adjusted to fit the data science context. We interviewed 12 data scientists and investigated the knowledge they consider important to remember, whether they can remember parts of their systems correctly, the relation between their actual knowledge and their self-assessment, and finally how similar or different the results are to those of the replicated paper. Our results suggest that, like software developers, data scientists consider architectural knowledge to be the most important to remember and perform best on the type of knowledge they consider most important; in contrast to software developers, their self-assessment increases when reflecting on their systems.
In this paper, we discuss these findings, as well as the validity of these results and what kind of research directions may need to be considered in the future to better grasp the kind of comprehension support that data scientists need.

Keywords

data scientists, human factors, program comprehension, knowledge importance, memory

Citation