Transparency in research... and football
Why software management plans could help us understand the world (including football) better.
Update June 2024: Since this blog was published in 2021, a working group with members from NWO, the Netherlands eScience Center, RadboudUMC, TU Delft, and Vrije Universiteit Amsterdam published the Practical Guide to Software Management Plans in 2022.
About two weeks ago, the Netherlands’ football team was eliminated from the 2020 European Championship after losing to Czechia. A major contributing factor to the Dutch defeat was a red card received by defender Matthijs de Ligt. De Ligt was sent off the field for using his hand to knock the ball away from Czech striker Patrik Schick’s feet while Schick had a free run at goal. The head referee consulted the Video Assistant Referee (VAR) and amended his initial judgement, upgrading De Ligt’s yellow card to a red one¹.
That same week the Netherlands eScience Center and NWO organised a workshop about Software Management Plans (SMPs). Software has become an essential part of research, but there are no clear guidelines that instruct researchers how to plan for that software to be developed, maintained, and reused. The SMP workshop, attended by more than 40 participants from 28 different Dutch research organisations, explored the need for a national SMP template for research.
There were parallels between these seemingly disparate events: Speaker Antica Culina, a researcher at the Netherlands Institute of Ecology (NIOO), spoke about reproducibility in research. She demonstrated the need for transparency in research software by referring to a study about red cards in football. The study investigated a simple-sounding question: are football referees more likely to give a red card to dark-skin-toned players than to light-skin-toned ones? Unlike a ‘regular’ research article, where one team of researchers presents a single result from a single analytic approach, the authors sent the same dataset containing football card and skin tone data to 29 different groups of researchers. These groups were free to use any analysis they deemed appropriate for answering the research question. They did their analyses, which were peer reviewed, and sent their results back to the authors.
The collection of results was astounding. Each group used a unique analytic approach. Twenty of the research groups concluded that dark-skin-toned players were more likely to receive red cards, whereas nine concluded that no effect of skin tone could be observed in the same data. The researchers’ expectations about what the result would be did not predict the conclusion they arrived at. No singular conclusion to this ‘simple’ question emerged. While this lack of a single result may seem worrying for anyone invested in the scientific endeavour, the authors write: “This does not mean that analysing data and drawing research conclusions is a subjective enterprise with no connection to reality. It does mean that many subjective decisions are part of the research process and can affect the outcomes.”
A head referee’s decision to award a red card to a football player is also subjective. While there are guidelines, and even clear rules (such as “you will receive a red card if you spit on another player”), ultimately, the decision to award a red card is at the discretion of the head referee. Similarly, there are countless statistics textbooks, blogs, and tutorials written about the choice of analytic approach for any statistical problem. But in the end, there is often more than one correct way, and the choice of analytic approach (such as statistical method, covariate selection, and model selection) is at the discretion of the researcher.
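To make this concrete, here is a toy sketch of how two defensible analytic choices can yield opposite conclusions from the same data. The counts below are invented for illustration (they are not the red card study’s actual data), and “league” stands in for any covariate a researcher might or might not adjust for:

```python
# Invented counts for illustration only: (red cards, matches played)
# per (league, skin tone) group. Not the actual study data.
data = {
    ("A", "dark"):  (30, 300),
    ("A", "light"): (10, 90),
    ("B", "dark"):  (5, 100),
    ("B", "light"): (30, 500),
}

def pooled_rate(tone):
    """Analysis 1: ignore league entirely (no covariate adjustment)."""
    cards = sum(c for (_, t), (c, m) in data.items() if t == tone)
    matches = sum(m for (_, t), (c, m) in data.items() if t == tone)
    return cards / matches

def per_league_rates(tone):
    """Analysis 2: stratify by league (treat league as a covariate)."""
    return {lg: c / m for (lg, t), (c, m) in data.items() if t == tone}

# Analysis 1: pooled rates suggest dark-skin-toned players receive
# red cards MORE often (0.0875 vs roughly 0.0678).
print("pooled:", pooled_rate("dark"), pooled_rate("light"))

# Analysis 2: within each league the direction reverses — dark-skin-toned
# players receive red cards LESS often (an instance of Simpson's paradox).
print("per league:", per_league_rates("dark"), per_league_rates("light"))
```

Both analyses are defensible, yet they point in opposite directions; which one is “correct” depends on subjective modelling judgements, which is exactly why documenting and sharing the analysis code matters.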
A strategy for dealing with subjectivity in research, and in football, is to expose it. The VAR system in football allows the referee to scrutinise video footage and decide how grave a foul was, whether the ball crossed a line, or whether a situation should be deemed offside. It is a system designed to minimise human errors that could have a big influence on match results. It does not remove subjectivity, but it makes the referee’s decision-making process more transparent.
Openly sharing research software serves a similar goal.² Or, as Silberzahn et al. argue: “Transparency in data, methods, and process gives the rest of the community opportunity to see the decisions, question them, offer alternatives, and test these alternatives in further research.” However, as Antica Culina pointed out during the workshop, sharing code is not yet a standard practice for all researchers who write it.
Without a guide or a plan, it may be unfair to expect every researcher to know how to do this. A standardised SMP template could help. As Steve Crouch from the Software Sustainability Institute (SSI) explained, SMPs require researchers to make explicit what their software does, who it is for, what the outputs are, who is responsible for releasing it, and how to ensure that the software stays available to the community.
The workshop participants provided valuable input as to what an SMP template should contain. Currently, researchers typically specify their plans for managing software in data management plans. During the SMP workshop, it was apparent that there is a need for clear guidelines specifically for software management, in addition to data management, among researchers. Participants also made clear that because researchers are usually not software engineers, an SMP template would need to use plain language, accessible to those who may be familiar enough with coding to write a few analysis scripts for research, but are not familiar with professional software development.
Participants also said that if writing plans for software management is to become standard practice in the future, the benefits of doing so should be emphasised to those who write code for research. Many in academia already feel overwhelmed with the amount of admin their work requires, and SMPs could be in danger of becoming ‘another form’. But beyond the benefit to the community, writing an SMP can be of great value to individuals. If software is an important part of the research output, an SMP helps researchers, engineers and research support staff to think about how that software will be structured, shared, and maintained. This can save time and effort later, when reproducing the results, or when using that software in a different project. Another benefit for the researcher lies in the increasing importance of software for research. An SMP can help researchers to write better quality software, which will likely be an important part of a researcher’s portfolio in the future.
Whether an SMP should be part of every research project, and whether SMPs should be integrated with data management plans, remains an open question. In addition, it is unclear if a universal SMP template is desirable, since different academic disciplines may have different software management needs. The eScience Center and NWO are setting up a working group to explore these questions and the possibility of creating a national template for Software Management Plans for Dutch research organisations.
We would like to thank Maria Cruz (NWO), Steve Crouch (SSI) and Antica Culina (NIOO) for their insightful talks about Data Management Plans and Software Management Plans. Thanks also to all participants for their valuable input.
[1] For those not familiar with the rules of football, a red card is the most severe sanction in football (soccer). It means that the player must leave the field immediately without being replaced, leaving their team at a clear disadvantage. In the case of the European Championship, the player who receives a red card also misses their team’s next match, if there is one.
[2] Arguably, as my colleague Pablo Rodríguez-Sánchez pointed out to me, the whole history of science and epistemology is about minimising human errors. Open scientific software has its roots not in the development of Linux, but in Galileo and Newton.