This repository contains all the files used for the thesis "Development of Predictive Models for Nanomaterial Biodistribution Using Machine Learning" by Alexandros Angelis. The thesis develops machine learning models that predict the biodistribution of nanomaterials.
This folder contains the data files used in the thesis:
Detailed information on the experimental conditions used to generate the biodistribution data.
A comprehensive database of nanomaterials and their associated properties, used as input data for modeling.
Data on the organ-to-body weight ratios for different tissues, used to analyze the biodistribution of nanomaterials.
The code folder in this repository contains all Jupyter notebooks used in the thesis. Below is a brief overview of each file:
Calculates the Area Under the Curve (AUC) and Cmax values for the biodistribution data using a double exponential decay model with a constant factor.
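As a rough illustration of this step, the sketch below assumes the fitted curve has the form `A*exp(-alpha*t) + B*exp(-beta*t) + C0`, with `alpha` and `beta` positive. That form, the function names, and the parameter values are assumptions made for the example and are not taken from the notebook. Under this assumption the AUC over `[0, t_end]` has a closed form, and Cmax can be located with a simple grid search.

```python
import math


def concentration(t, A, alpha, B, beta, C0):
    """Assumed model: two decaying exponentials plus a constant offset."""
    return A * math.exp(-alpha * t) + B * math.exp(-beta * t) + C0


def auc(t_end, A, alpha, B, beta, C0):
    """Closed-form integral of the assumed model from 0 to t_end.

    Requires alpha > 0 and beta > 0.
    """
    return (
        A / alpha * (1.0 - math.exp(-alpha * t_end))
        + B / beta * (1.0 - math.exp(-beta * t_end))
        + C0 * t_end
    )


def cmax(t_end, A, alpha, B, beta, C0, n_points=10001):
    """Largest model value on an evenly spaced grid over [0, t_end].

    Returns (cmax, tmax). A grid search is used because, when A and B
    have opposite signs, the peak does not have to sit at t = 0.
    """
    best_t = 0.0
    best_c = concentration(0.0, A, alpha, B, beta, C0)
    for i in range(1, n_points):
        t = t_end * i / (n_points - 1)
        c = concentration(t, A, alpha, B, beta, C0)
        if c > best_c:
            best_t, best_c = t, c
    return best_c, best_t


# Hypothetical parameter values, chosen only to exercise the functions.
params = dict(A=8.0, alpha=0.5, B=-8.0, beta=2.0, C0=1.0)
t_end = 24.0

auc_value = auc(t_end, **params)
cmax_value, tmax_value = cmax(t_end, **params)
print(f"AUC(0-{t_end:g}) = {auc_value:.4f}")
print(f"Cmax = {cmax_value:.4f} at t = {tmax_value:.4f}")
```

If the notebook integrates over a different time window or uses another parameterization, the closed-form expression would need to be adjusted accordingly.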
This notebook contains a web crawler that searches for biodistribution data specific to nanomaterials. It automates the collection of relevant data to build and update the database.
Processes the raw data obtained from the database, including data cleaning, normalization, and transformation steps, preparing it for further analysis and modeling.
Optimizes the parameters of the biodistribution fit curve. The fitting is performed using a double exponential decay with a constant factor, leveraging the NLopt optimizer for precise parameter estimation.
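As a sketch of what such a fit minimizes, assume the model is the sum of two decaying exponentials plus a constant offset, and that the parameters are chosen by least squares against observed concentrations c_i at time points t_i. The symbols below are illustrative and not taken from the notebook; the actual loss function, bounds, and NLopt algorithm used there may differ.

```latex
C(t;\theta) = A\,e^{-\alpha t} + B\,e^{-\beta t} + C_0,
\qquad \theta = (A, \alpha, B, \beta, C_0)

\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{n} \bigl( c_i - C(t_i;\theta) \bigr)^2,
\qquad \alpha > 0,\ \beta > 0
```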
Builds machine learning models using Random Forest regression to predict AUC and Cmax. This file also evaluates model performance, identifies feature importance, and determines the applicability domain.
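The description above does not say which applicability-domain method the notebook uses. One of the simplest options is a range-based (bounding-box) check: a new sample counts as inside the domain only if every feature value falls within the range seen during training. The sketch below illustrates that idea with hypothetical descriptor names and values; it is not the thesis implementation.

```python
def fit_domain(X_train):
    """Record the (min, max) of each feature over the training rows."""
    n_features = len(X_train[0])
    return [
        (min(row[j] for row in X_train), max(row[j] for row in X_train))
        for j in range(n_features)
    ]


def in_domain(x, bounds):
    """True if every feature of x lies within its training range."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(x, bounds))


# Hypothetical descriptors per sample: [size_nm, zeta_potential_mV]
X_train = [
    [20.0, -30.0],
    [50.0, -10.0],
    [100.0, 5.0],
]
bounds = fit_domain(X_train)

inside = in_domain([60.0, -15.0], bounds)    # both features within range
outside = in_domain([250.0, -15.0], bounds)  # size exceeds the training max
print(inside, outside)
```

A bounding box is easy to interpret but ignores correlations between features, so predictions for samples flagged as inside the domain can still be extrapolations in sparsely populated regions.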
To run the notebooks, you need Jupyter Notebook installed.