NanoCommons Demonstration Case - Electronic Laboratory Notebooks for data collection and annotation
This case demonstrates how electronic lab notebooks (ELNs) can be used to manage complex studies and collaborative research. It also shows current challenges, especially the high initial effort needed to set up customised workflows, integration with other data management tools, and the additional requirements that apply when ELNs are used in a regulatory setting. This case has been finalised.
Background and aims
The rapid technological development and increasing experimental complexity of nanosafety studies have increased the requirements regarding the proper quality management and control of the experimental workflows and the produced results. At the same time, the 2019 EU Open Data Directive, which extended the Open Research Data policy introduced during the Horizon 2020 (H2020) framework, requires that non-sensitive data be accessible, interoperable and reusable as per the FAIR (Findable, Accessible, Interoperable, Reusable) data principles (Open access, 2021; Ammar et al., 2020).
Based on these requirements, the presence of a detailed data management plan (DMP) is a must, describing how the respective data will be captured, described, handled, stored and shared. The DMPs need to have provisions for the required metadata, so that the data they complement are sufficiently described, understandable and reusable (Papadiamantis et al., 2020).
Traditional experimental practices are not sufficient to fulfil these requirements, especially when the study design requires that data are generated by multiple laboratories based, in many cases, in different countries. To address this, modern practices for data and metadata capture need to be implemented so that the data management process starts at the project onset and is not left until the end of the project, when a lot of valuable information will already have been lost (for example due to personnel movement, hard drive failures, etc.). Hence, the data and metadata capturing process needs to be modernised, digitised and, as much as possible, automated. This first demonstration case was designed to evaluate how Electronic Laboratory Notebooks (ELNs) can be employed to achieve this. ELNs are software tools dedicated to capturing data and metadata in a structured way, while offering more flexibility than the industry- or regulation-oriented Laboratory Information Management Systems (LIMS). While LIMS need to be managed centrally and are very rigid with respect to any required changes (due to compliance with standards such as ISO 17025, Good Manufacturing Practice, etc.), ELNs offer experimentalists the opportunity to build their own workflows, note any experimental deviations and capture their data using a structured digital environment, without the need for paper laboratory notebooks that can be misplaced or destroyed.
The aims of the demonstration case were:
- To develop structured experimental workflows using ELNs (in our case the chosen ELN was SciNote) to capture all necessary data and metadata in an experimental setting.
- To apply these workflows using a real-life example from an ongoing EU-funded project, involving multiple partners based in different countries.
- To establish a list of strengths and weaknesses regarding the implementation of such practices in the field of nanosafety.
Key questions of the demonstration case:
- Can ELNs be successfully applied in nanosafety research?
- Is it easy for researchers to accept ELNs for their experimental practice?
- Can ELNs be established in everyday data management practices from individual laboratories to large projects?
- What are the current bottlenecks in ELN implementation?
ACEnano round robin
The ELN demonstration case was performed in collaboration with the H2020 ACEnano project, which performed a series of interlaboratory comparisons (round-robins) for the standardisation of physicochemical characterisation techniques for nanomaterials. The ACEnano round-robins included laboratories not only from the EU, but from Korea as well (Figure 1). This meant that a system was needed to facilitate data capturing and sharing that would be independent of the time difference, that users could access in their own time and that would allow the round-robin leader to monitor the experiments' progress and results holistically. The experiment used to test the ELN implementation was Ultraviolet-Visible Spectroscopy (UV-Vis), led by the University of Birmingham (UoB). Besides UoB, the ACEnano partners that participated were the UK Centre for Ecology and Hydrology (UK-CEH), the University of Oxford (UoOxf), the German Federal Institute for Risk Assessment (BfR), the Helmholtz Centre for Environmental Research (UFZ) and the Korea Research Institute of Standards and Science (KRISS).
The ELN that was used to implement the experimental workflow was the cloud version of SciNote. Within SciNote, a specific project was created that corresponded to the ACEnano round-robin experiments, and sub-spaces were designated to each partner. All partners were then invited to create their free accounts and PIs were asked to invite their participating data generators to join the project (Figure 2). Tassos Papadiamantis was designated as the data shepherd for the case study (for a detailed description of the data shepherd role see Papadiamantis et al., 2020), helping the UoB ACEnano team, led by Emily Guggenheim, to work with the SciNote environment and to prepare, implement and explain to the rest of the partners the functionalities and actions needed.
Within the SciNote protocols inventory the analytical protocol was imported (Figure 3) and enriched with all necessary metadata to make it as foolproof as possible. This included information on the nanomaterials used, as received from the commercial provider, a copy of the material safety data sheet (MSDS) and highlighted information on potential hazards and risks. Where possible, each experimental step was complemented with images to ensure that partners could easily understand and follow the experimental process. A separate copy of the protocol was then loaded for each partner. In this way, each partner was able to make changes and add notes regarding potential deviations without modifying the original protocol, which served as the round-robin baseline and ensured complete metadata capture.
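The idea of one untouched baseline protocol plus an independent, annotatable copy per partner can be sketched in a few lines of Python. This is only an illustration of the principle and not how SciNote stores protocols internally; the `Protocol` class, the helper functions and the step texts are hypothetical placeholders.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Protocol:
    """A protocol as an ordered list of steps plus per-step deviation notes."""
    title: str
    steps: list[str]
    deviations: dict[int, str] = field(default_factory=dict)  # step index -> note


def copy_for_partners(baseline: Protocol, partners: list[str]) -> dict[str, Protocol]:
    """Give every partner an independent deep copy of the baseline protocol."""
    return {partner: deepcopy(baseline) for partner in partners}


def note_deviation(protocol: Protocol, step_index: int, note: str) -> None:
    """Record a deviation against one step of a partner's own copy."""
    if not 0 <= step_index < len(protocol.steps):
        raise IndexError(f"no step {step_index} in protocol {protocol.title!r}")
    protocol.deviations[step_index] = note


# Hypothetical baseline: placeholder steps, not the actual ACEnano protocol.
baseline = Protocol(
    title="UV-Vis round robin (illustrative)",
    steps=["Prepare dispersion", "Record blank", "Measure sample spectrum"],
)
copies = copy_for_partners(baseline, ["UoB", "KRISS"])
note_deviation(copies["KRISS"], 1, "Blank recorded twice (example note)")
```

Because each partner works on a deep copy, the deviation noted by one partner appears neither in the baseline nor in any other partner's copy, which is the property the round-robin setup relied on.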
In the results section (Figure 4), a harmonised environment was imported where partners were able, in their own time, to enter the acquired data along with any required metadata. The results tables were built so that any initial statistics, i.e., standard deviations, were automatically calculated. Furthermore, a file was implemented where the partners import their data, a grading is automatically generated and the size of an unknown nanomaterial is calculated (Figure 5). In this way, the round-robins partners and lead were able to test their results, evaluate the experimental procedure and reach the necessary conclusions. Based on this process, all experimental steps were successfully completed in a timely manner by all partners, while the UoB team was able to monitor the tasks’ progress and resolve any issues when needed. At the same time, this helped the KRISS partners to complete and submit their data and communicate with the rest of the group without the need for extended working hours due to the time difference (GMT + 9 hours).
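As a rough illustration of the kind of initial statistics such a results table can calculate automatically, the following Python sketch computes the mean and sample standard deviation of each partner's replicate values. The function name, the partner labels and the numbers are assumptions made up for this example; they are neither ACEnano results nor the formulas used in the actual SciNote tables, and the grading step is not reproduced here.

```python
from statistics import mean, stdev


def summarise(replicates: dict[str, list[float]]) -> dict[str, dict[str, float]]:
    """Return mean and sample standard deviation for each partner's replicates."""
    summary = {}
    for partner, values in replicates.items():
        if len(values) < 2:
            raise ValueError(f"{partner}: at least two replicates are needed")
        summary[partner] = {"mean": mean(values), "sd": stdev(values)}
    return summary


# Made-up numbers for illustration only; these are not measured data.
example = {
    "partner_A": [520.0, 522.0, 524.0],
    "partner_B": [519.0, 521.0],
}
stats = summarise(example)
for partner, values in stats.items():
    print(f"{partner}: mean={values['mean']:.1f}, sd={values['sd']:.2f}")
```

Computing these values at the point of data entry means every partner reports the same statistic in the same way, instead of each laboratory applying its own spreadsheet formula.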
Extension to complex workflows
Encouraged by the positive experience in the ACEnano setting, ELN implementation for more complex studies was attempted during the Transnational Access (TA) project with the Laboratório Nacional de Nanotecnologia (LNNano), which covered ELN implementation, data management and annotation in experiments on the toxicity of different forms of graphene to Daphnia magna. During the implementation, the UoB team worked with the group of Diego Martinez, providing lectures and dedicated hackathons on the use of SciNote, data management, curation and QC, and semantic annotation using established ontologies.
This work resulted in the successful implementation of SciNote and a resulting publication by Martinez et al. (Martinez et al., 2020). Figure 6 presents schematically the experimental workflow implemented in SciNote and performed in LNNano using the instance map from the NIKC templates (see also Demonstration Case Data management concept NanoFASE). Each experimental instance (denoted in blue in Figure 6) was imported in SciNote as a separate experimental step (Figure 7). Within each instance, the dedicated experimental workflows were defined, protocols imported and results sections designed (instance 5 is shown in Figure 8 as an example).
Discussion - strengths & weaknesses identified
The results demonstrated that ELNs have the potential to play an important role in nanosafety research, automating the data- and metadata-capturing process and moving it to the point of data production. In both studies this took place from the project onset, optimising metadata capture to maximise the value of the data produced in terms of interoperability and reusability. The original application in the round robins and the follow-up TA both demonstrated the usefulness and applicability of ELNs at different project levels and degrees of experimental complexity.
In general, ELNs offer the ability to capture very complex experimental workflows. These, combined with the adoption of the NIKC templates instance maps, can be broken down into distinct dedicated experimental steps. These maps can be used as visual guides on the experimental process, help identify any experimental gaps and subsequently be included as graphical abstracts describing the experimental process in full detail. They can also help prioritise the experimental steps so as to guide the data producers towards optimal use of experimental time and facilitate complete data and metadata capturing. At a data producer level, ELNs offer experimentalists the opportunity to have a protocols library available to them from the onset of the experimental process. They are able to implement these in their workflows, note any deviations (having the backup of the original protocol) and receive comments and advice regarding their work through the integrated comments sections at all levels. Furthermore, ELNs offer experimentalists the ability to have pre-designed and pre-annotated data templates available to them, which can integrate aspects of data sharing and FAIRness already into these first steps of the data management life cycle.
Some ELNs like SciNote used in this demonstration case offer functionalities to share studies with a team and, in this way, provide the ability, at a single group level, to allow the respective PI to monitor the progress of different projects and students, set deadlines and make sure they are met. PIs have the ability, using the team created within the ELN, to test experimental implementation and progress at their own time, without the need for dedicated meetings and long presentations from the group. In this way, valuable working time is saved and communication at the different data life cycle steps facilitated.
At a project level, ELNs with team functionality offer the opportunity to monitor the project’s progress and task completion from different partners, without having to wait for dedicated meetings that can be delayed or postponed for various reasons. ELNs can facilitate the experimental communication between partners based at different time zones, reducing the working hours and burden. This is especially significant in modern research, in which large projects usually include partners from different continents (e.g., Europe, Asia, the Americas). ELN users have the ability to prepare automated reports that can act as the basis of project deliverables, theses, articles and more. This is very helpful, as it provides a complete overview of the experimental process, while increasing the security of the data produced, as users no longer need to keep paper-based laboratory notebooks or transfer data around using removable storage devices.
Besides the strengths of ELNs, several weaknesses/bottlenecks have been identified as well, some of them general and some specific to the chosen tools. Firstly, a substantial change in mindset is needed within groups and consortia to accept new, more modern technological solutions for data capturing and sharing. This, in many cases, may be a significant bottleneck due to resistance to change at different levels and especially from more senior researchers. The resistance can have scientific reasons, as partners might still prefer to publish data before they are shared even inside the project, or ethical reasons with respect to personal data protection (see the next chapter for concerns about being able to monitor working times of employees).
Furthermore, the usefulness and benefits of ELNs become apparent only after substantial implementation in the everyday experimental setting. This may pose a risk to the acceptance of ELNs, since their implementation requires a lot of preparatory work, expert guidance and training (part of what we have termed “data shepherding”). These preparatory steps include importing a protocols library, customised data templates, annotation, etc., and require strong commitment, dedicated time and acceptance at the higher levels. In the ACEnano round robins, the setup was performed by the data shepherd, guaranteeing harmonisation of the reported data and metadata as well as of the analysis. Based on his previous experience regarding data quality and completeness as well as the FAIR guiding principles, he was able to create the workflow according to the highest data management and reporting standards. In a similar way, the graphene study was supported in the TA, and adoption of the concepts of the NIKC instance maps provided the structure for the implementation in the ELN. However, for groups starting with ELNs without the support of a data shepherd, the ELN itself does not offer any guidance or quality control on the implemented workflows. These users have to define everything from the beginning, relying on personal experience. In many cases, however, researchers still lack the experience of what is required to maximise data FAIRness. This can result in data without any harmonisation and interoperability, comparable to the situation of data files before standardised templates like NIKC or NANoREG were established. This is not a new issue introduced by ELNs but one of the most time-consuming aspects of data management in more or less every past and ongoing large project. Nevertheless, ELNs have the potential to change this situation when templates are shared within the community.
Therefore, ELNs, with all the advantages described above, would need to become a central part of data management to achieve the goals set out in the background section (improved quality management and control of the experimental workflows and the produced results). However, they have to be combined with functionality for data input validation with respect to quality, completeness and fitness for specific re-use purposes, according to community-agreed criteria and (meta)data standards, as well as with semantic annotation for improved machine readability and data interoperability. Such functionality is currently developed as stand-alone tools or as part of the data input functionality of data warehouses or LIMS.
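A completeness check of the kind meant here can be sketched as follows. This is a minimal example under stated assumptions: the list of required fields is hypothetical and would in practice come from a community-agreed (meta)data standard or template, and the function names do not correspond to any existing ELN, data warehouse or LIMS interface.

```python
# Hypothetical required metadata fields; a real list would be taken from a
# community-agreed template or (meta)data standard.
REQUIRED_FIELDS = ("material_id", "instrument", "operator", "measurement_date", "unit")


def missing_fields(record: dict, required=REQUIRED_FIELDS) -> list[str]:
    """List the required fields that are absent or empty in one record."""
    return [name for name in required if record.get(name) in (None, "", [])]


def validate(records: list[dict]) -> dict[int, list[str]]:
    """Map record index -> missing fields, only for records that fail the check."""
    report = {}
    for index, record in enumerate(records):
        gaps = missing_fields(record)
        if gaps:
            report[index] = gaps
    return report


# Placeholder records for illustration only.
records = [
    {"material_id": "NM-001", "instrument": "UV-Vis", "operator": "A. Example",
     "measurement_date": "2020-01-01", "unit": "nm"},
    {"material_id": "NM-002", "instrument": "UV-Vis", "operator": "",
     "measurement_date": "2020-01-02"},
]
print(validate(records))  # {1: ['operator', 'unit']}
```

Running such a check when data are entered, rather than at the end of a project, lets the data generator fill the gaps while the information is still at hand. Checks on quality and fitness for a specific re-use purpose would need additional, domain-specific rules that are not covered by this sketch.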
Many ELNs, including SciNote, are developed as commercial products. In the case of SciNote, there exists, besides different paid plans, a free subscription plan with limited functionality and no support, and even an open-source version of the code. The free plan was used in this demonstration case since it offered, at that point in time, all features required for achieving the planned tasks. Still, depending on a third-party product, and especially on an unsupported free version, imposes many limitations on what NanoCommons can offer, and as such NanoCommons has to customise the service to its users’ needs (similar experiences were already reported with the GUIDEnano tool). First, the ambitious goals described in the previous paragraph, with their large influence on the core structure of the program, are only achievable with a reasonable amount of effort if the software creator provides strong support and, therefore, need to fit into their business model. The long-term solution must therefore be to attach an ELN-developing group directly to NanoCommons as a service provider. Second, changes in the business philosophy of the software company can have strong consequences for the usefulness of the tool as part of the infrastructure, as we experienced directly after the end of the demonstration case. During our evaluation of the suitability of different ELNs, the SciNote platform stood out for its multi-member and multi-partner team-research capabilities, which were important for the round robin presented above. However, after the two studies (round robin and TA) were transferred into training material and shortly before the hackathon “Online Electronic Lab Notebook basics” (ELN hackathon, 2020) was held, the provider revoked this important feature.
For the team of the case study, who were also the trainers in the hackathon, this came at the most unfortunate time, since the carefully prepared workflows in SciNote meant to be showcased in the hackathon could no longer be executed. Some of the changes only became obvious during the hackathon, resulting in very negative experiences for the participants, a significant loss of trust that SciNote was a good solution for integration into NanoCommons and a consequent decrease in the commitment of the NanoCommons team to promote the tool. However, due to the large amount of time already invested in conceptualizing the use of ELNs in nanosafety research within SciNote, work-around solutions for allowing multi-user functionality, at least for testing and training, are currently being investigated.
Challenges and additional considerations for ELN implementation in a regulatory testing context
When designing the demonstration case, the approaches developed and implemented in the ACEnano round robins were planned to also be transferred to, and evaluated by, other NanoCommons partners. The starting point for this was the scoping phase for ELN implementation by partner BfR, which undertakes both research and regulatory testing, and has an existing Laboratory Information Management System (LIMS) in place to fulfil regulatory requirements. The exploration of how to implement ELNs to complement the existing LIMS identified a number of additional requirements and challenges for the use of ELNs in regulatory settings. Some of the challenges identified were quite unexpected, as described below, and relate to other areas of data management and to personal data protection.
Several activities were initiated internally at BfR with the aim of launching a pilot phase for ELNs. Most of these activities were organised in the context of “BfR2025”, which was launched in February 2020 with the overarching aim to render BfR “Fit for Future”. BfR2025 is organised in different working groups, one of which, entitled “Efficiency” and led by Andrea Haase, addresses “digitalisation”, including “digitalisation in research”. Several BfR-internal meetings took place in 2020. One major activity was a virtual BfR-internal workshop on digitalisation, which attracted more than 140 participants. One of the break-out sessions of this workshop was dedicated to “digitalisation in research”, and different elements such as research data management and ELNs were discussed. BfR, as a governmental agency, has to consider many different aspects before being able to launch a pilot phase for ELNs. For instance, all the laboratories at BfR are fully accredited (according to ISO 17025) and have a fully implemented LIMS. Thus, it was necessary to consider how ELNs can be tested within the established quality management system and how the interface to the established LIMS will be addressed. Moreover, we also held discussions with the BfR staff council to ensure that ELNs will not be used to monitor employee performance. In addition, IT security needs to be addressed properly, as many ELNs operate with cloud-based data storage, which in general is considered a critical issue for potentially sensitive data. Most of the open issues could be properly addressed during 2020. Therefore, we expect a decision from BfR top-level management with respect to launching an ELN pilot phase at BfR in 2021. Within this pilot phase, BfR would like to test different ELNs and compare their performance and user-friendliness, which will further extend this demonstration case.
A checklist of the additional considerations for non-academic users interested in implementing ELNs is being finalised to accompany the overall guidance on ELN selection and implementation.
Conclusions on the overall achievement
An article demonstrating the use of the instance maps for complex nanomaterial (NM) studies has been published (Martinez et al., 2020), and researchers have found the instance map concept extremely useful for visualising their experiments and the associated data and metadata. Therefore, the concept will be developed further towards study design and user-friendly tools for creating such maps in the new Demonstration Case Study Design, as a means to support evaluation of study data and metadata completeness relative to best practice. The advantages of ELNs were also discussed in the first metadata paper (Papadiamantis et al., 2020), and further publications building on the demonstration case are under development, including a workflow for complete integration of Round Robin (RR) or InterLaboratory Comparison (ILC) data into ELNs and one on the utilisation of ELNs for complex mesocosm datasets.
Guidance and a checklist of considerations for implementing ELNs in academic and non-academic nanosafety research settings, including a suite of protocols, workflows and ontology-annotated data capture templates, are being implemented. They include the existing experimental ELN workflows developed for NM characterization, Daphnia interactions with NMs, and NM interactions with proteins (protein corona studies). To address the challenges noted above regarding the requirement for team-work within the ELN, NanoCommons has agreed with SciNote to deploy a NanoCommons-specific instance based on the open-source version of SciNote offering all features of the paid version. This will from now on be used in TA projects and training events. A key aspect of this will be the requirement to clearly indicate to users what is available in the free version and what are paid-for features, so that they understand what they can implement locally under the different modalities.
Generalizing from the experiences in this demonstration case, we learned that training users to implement their experiments in the ELN and developing the supporting data capture techniques require dedicated effort from both sides - the NanoCommons data shepherd must be extremely hands-on with the data generators, probing all aspects of the study design and data to be captured, while the data generators need to be open to thinking about their experiments in a slightly different way and to investing time in the establishment of their workflows. While the initial effort is quite large, the payoff is that once the workflows are established, and the metadata related to the instruments, media, organisms, etc. are collected in a single location, the subsequent experiments and the writing-up of experiments, data sharing / data analysis and other steps along the data life cycle are all made significantly easier. As the EC and other funders move towards an increased focus on FAIR and Open data, and journals are increasingly requesting data availability statements, it is clear that processes to streamline this and integrate it directly into the data generation process are both essential and inevitable. NanoCommons is pleased to be leading the charge here for the nanosafety community.
- Ammar et al., 2020: Ammar, A.; Bonaretti, S.; Winckers, L.; Quik, J.; Bakker, M.; Maier, D.; Lynch, I.; van Rijn, J.; Willighagen, E. A Semi-Automated Workflow for FAIR Maturity Indicators in the Life Sciences. Nanomaterials 2020, 10 (10), 2068. https://doi.org/10.3390/nano10102068.
- ELN hackathon, 2020: Papadiamantis, A.; Himly, M. Online Electronic Lab Notebooks-Basics Hackathon. 2021. https://doi.org/10.5281/zenodo.4518805.
- Martinez et al., 2020: Martinez, D. S. T.; Da Silva, G. H.; de Medeiros, A. M. Z.; Khan, L. U.; Papadiamantis, A. G.; Lynch, I. Effect of the Albumin Corona on the Toxicity of Combined Graphene Oxide and Cadmium to Daphnia Magna and Integration of the Datasets into the NanoCommons Knowledge Base. Nanomaterials 2020, 10 (10), 1936. https://doi.org/10.3390/nano10101936.
- Open access, 2021: European Commission. Open Access & Data Management - H2020 Online Manual. 2021. https://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-dissemination_en.htm.
- Papadiamantis et al., 2020: Papadiamantis, A. G.; Klaessig, F. C.; Exner, T. E.; Hofer, S.; Hofstaetter, N.; Himly, M.; Williams, M. A.; Doganis, P.; Hoover, M. D.; Afantitis, A.; Melagraki, G.; Nolan, T. S.; Rumble, J.; Maier, D.; Lynch, I. Metadata Stewardship in Nanosafety Research: Community-Driven Organisation of Metadata Schemas to Support FAIR Nanoscience Data. Nanomaterials 2020, 10 (10), 2033. https://doi.org/10.3390/nano10102033.