NanoPharos provides curated data for modellers.
For the development of nanoinformatics models, data needs are often focussed on finding as many materials as possible for which the specific endpoint is known, so that sufficiently sized training and test sets can be built. Additionally, the data need to be organised in a strictly defined tabular format: one NM and treatment condition (time, concentration etc.) per row, with all available endpoints organised as consecutive columns. To support nanoinformatics and modelling activities, the NanoPharos computational database was co-developed by the NanoCommons and NanoSolveIT projects specifically to provide curated data that are ready to use for modelling purposes. In this way, it is complementary to, and integrated into, the NanoCommons KB user interface.
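The tabular layout described above (one NM and treatment condition per row, endpoints as consecutive columns) can be illustrated with a minimal sketch. It assumes the raw measurements arrive as one record per measured endpoint. The field names (`nm`, `time_h`, `conc_ug_ml`, `endpoint`, `value`), the function `to_modelling_table` and all values are hypothetical and chosen only for illustration; they are not taken from NanoPharos.

```python
# Illustrative sketch only: field names and values are hypothetical,
# not the NanoPharos schema or real measurements.
records = [
    {"nm": "TiO2_A", "time_h": 24, "conc_ug_ml": 10, "endpoint": "cell_viability_pct", "value": 92.0},
    {"nm": "TiO2_A", "time_h": 24, "conc_ug_ml": 10, "endpoint": "ros_fold_change", "value": 1.3},
    {"nm": "TiO2_A", "time_h": 24, "conc_ug_ml": 50, "endpoint": "cell_viability_pct", "value": 71.0},
    {"nm": "ZnO_B", "time_h": 24, "conc_ug_ml": 10, "endpoint": "cell_viability_pct", "value": 55.0},
]


def to_modelling_table(records):
    """Pivot long-format records into one row per (NM, time, concentration).

    Every endpoint becomes its own column; endpoints that were not
    measured for a given row are left as None.
    """
    endpoints = sorted({r["endpoint"] for r in records})
    rows = {}  # keyed by (nm, time, concentration); keeps first-seen order
    for r in records:
        key = (r["nm"], r["time_h"], r["conc_ug_ml"])
        if key not in rows:
            row = {"nm": r["nm"], "time_h": r["time_h"], "conc_ug_ml": r["conc_ug_ml"]}
            row.update({e: None for e in endpoints})
            rows[key] = row
        rows[key][r["endpoint"]] = r["value"]
    return endpoints, list(rows.values())


endpoints, table = to_modelling_table(records)
for row in table:
    print(row)
```

Keeping unmeasured endpoints as explicit empty cells, rather than dropping the row, makes it easy to count how many materials have a known value for a given endpoint before a training set is assembled.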
More precisely, the NanoPharos database enables the development of nanoQSAR, read-across and other types of predictive models by giving access to NM properties, their biointeractions and adverse effects, and will thus contribute to the in silico assessment of NM toxicity and the evaluation of NM functionalities. NanoPharos also supports the models already developed within the NanoCommons and NanoSolveIT projects (see below). These models are accompanied by documentation of the methods that were applied to develop them, to define their domain of applicability and to validate their performance. Within this framework, the data used to build the models are hosted in the NanoPharos database to ensure their long-term accessibility and to further increase reproducibility as well as stakeholders’ confidence when using the models. The database is thus a means to achieve FAIR analysis of NM data, with transparent and open approaches across all aspects of modelling. The NanoPharos database can be extended as new data and models are generated and incorporated into the sustained NanoCommons KB.
To obtain a dataset, users select the dataset of interest and click the “Download dataset” button; the dataset is then downloaded in XLSX format. The datasets included in NanoPharos thus far comprise metal and metal oxide NMs, fullerenes, MWCNTs, carbon black, silica NMs and graphite nanofibers, and include descriptors such as physicochemical properties, transcriptomics data, and computational, periodic and image properties. Toxicological data are also included and can be used as the target endpoint for the development of predictive models, which can contribute to the ongoing work in the nanosafety community on NM safe-by-design (SbD), risk assessment and risk governance.
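As a hedged illustration of how such a dataset might be prepared for modelling, the sketch below separates descriptor columns from a target endpoint, keeps only rows where that endpoint is known, and makes a reproducible training/test split. It assumes the downloaded XLSX file has already been loaded into a list of row dictionaries (for example with a spreadsheet library of the user's choice). The column names, the values and the function `split_for_modelling` are hypothetical and not part of NanoPharos.

```python
import random

# Illustrative rows only: column names and values are hypothetical.
rows = [
    {"nm": "NM_1", "core_size_nm": 21.0, "zeta_potential_mv": -30.5, "cell_viability_pct": 92.0},
    {"nm": "NM_2", "core_size_nm": 30.0, "zeta_potential_mv": 12.0, "cell_viability_pct": 55.0},
    {"nm": "NM_3", "core_size_nm": 15.0, "zeta_potential_mv": -8.2, "cell_viability_pct": None},
    {"nm": "NM_4", "core_size_nm": 50.0, "zeta_potential_mv": -41.0, "cell_viability_pct": 88.0},
    {"nm": "NM_5", "core_size_nm": 9.0, "zeta_potential_mv": 25.3, "cell_viability_pct": 40.0},
]


def split_for_modelling(rows, descriptor_cols, target_col, test_fraction=0.25, seed=0):
    """Return ((X_train, y_train), (X_test, y_test)) for rows with a known target."""
    # Keep only rows for which the target endpoint is known.
    usable = [r for r in rows if r.get(target_col) is not None]

    # Shuffle a copy with a fixed seed so the split is reproducible.
    shuffled = usable[:]
    random.Random(seed).shuffle(shuffled)

    # Hold out at least one row for testing when there is more than one row.
    n_test = max(1, round(len(shuffled) * test_fraction)) if len(shuffled) > 1 else 0
    test, train = shuffled[:n_test], shuffled[n_test:]

    def to_xy(subset):
        X = [[r[c] for c in descriptor_cols] for r in subset]
        y = [r[target_col] for r in subset]
        return X, y

    return to_xy(train), to_xy(test)


(X_train, y_train), (X_test, y_test) = split_for_modelling(
    rows,
    descriptor_cols=["core_size_nm", "zeta_potential_mv"],
    target_col="cell_viability_pct",
)
print(len(X_train), "training rows;", len(X_test), "test rows")
```

A plain random split is used here only for brevity. When a dataset contains several rows for the same NM under different treatment conditions, splitting by material rather than by row may be the more appropriate choice.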
Screenshot of dataset 7 in the NanoPharos database of ready-for-modelling datasets: computational data for the toxicity and bioactivity assessment of 83 functionalized MWCNTs. The predictive model is available via the NanoCommons Enalos Cloud instance.