Dissertation: Applying Machine Learning Methods for Modeling Cell Behavior During Regeneration (Variant 2)
- MedicalNet
- MedicalNet is a framework developed by Tencent's research team to advance 3D medical image analysis using deep learning and transfer learning; it is based on the paper "Med3D: Transfer Learning for 3D Medical Image Analysis".
- It provides a family of 3D ResNet-based encoder networks pre-trained on a large, diverse aggregated medical dataset called 3DSeg-8.
- These pre-trained 3D ResNet encoders can be used for downstream tasks such as classification and segmentation.

- I used the 3D ResNet-10 variant from MedicalNet's encoder family for binary classification of WT (wild type) vs. MUT (mutant), adding a fully connected layer followed by a sigmoid activation function to the pre-trained encoder (figure drawn using Draw.io and Overleaf); a sketch of this classification head is given below.
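A minimal sketch of this classification head, assuming the pre-trained MedicalNet 3D ResNet-10 backbone has already been instantiated as `encoder` and that its checkpoint stores weights under a standard PyTorch `state_dict` key; names such as `encoder`, `feature_dim`, and `pretrain_path` are hypothetical placeholders, not MedicalNet's exact API:

```python
import torch
import torch.nn as nn

class WTvsMUTClassifier(nn.Module):
    """Pre-trained 3D encoder + fully connected layer + sigmoid for binary classification."""

    def __init__(self, encoder, feature_dim, freeze_encoder=True):
        super().__init__()
        self.encoder = encoder                      # MedicalNet 3D ResNet-10 backbone (assumed given)
        if freeze_encoder:
            for p in self.encoder.parameters():     # frozen encoder weights
                p.requires_grad = False
        self.pool = nn.AdaptiveAvgPool3d(1)         # collapse D x H x W feature maps to a vector
        self.fc = nn.Linear(feature_dim, 1)         # fully connected layer
        self.sigmoid = nn.Sigmoid()                 # probability of the MUT class

    def forward(self, x):
        feats = self.encoder(x)                     # (N, C, D, H, W) feature maps
        feats = self.pool(feats).flatten(1)         # (N, C)
        return self.sigmoid(self.fc(feats))         # (N, 1) probabilities

# Loading pre-trained weights (sketch; 'pretrain_path' is hypothetical):
# state = torch.load(pretrain_path, map_location="cpu")
# encoder.load_state_dict(state.get("state_dict", state), strict=False)
```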

- Approach 1:
- Implemented 5-fold cross-validation (a training-loop sketch follows this list).
- Ran the network for 30 epochs per fold.
- Total training time: 23 hours, 10 minutes, and 25 seconds.
- Loss function: binary cross-entropy (BCE).
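A minimal sketch of the 5-fold cross-validation training loop under the stated setup (BCE loss, 30 epochs per fold, frozen encoder). The variables `volumes` (one array per sample, shaped channels x depth x height x width) and `labels`, and the `build_model()` helper, are hypothetical placeholders; batching, data augmentation, and early stopping are omitted for brevity:

```python
import numpy as np
import torch
from sklearn.model_selection import StratifiedKFold

def run_cross_validation(volumes, labels, build_model, epochs=30, lr=1e-3, device="cuda"):
    """5-fold stratified cross-validation with BCE loss; returns per-fold loss histories."""
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
    criterion = torch.nn.BCELoss()                          # model ends in a sigmoid, so plain BCE
    histories = []

    for fold, (train_idx, val_idx) in enumerate(skf.split(volumes, labels), start=1):
        model = build_model().to(device)                    # fresh model per fold
        optimizer = torch.optim.Adam(
            (p for p in model.parameters() if p.requires_grad), lr=lr)
        train_loss, val_loss = [], []

        for epoch in range(epochs):
            model.train()
            losses = []
            for i in train_idx:                             # batch size 1, for simplicity of the sketch
                x = torch.as_tensor(volumes[i]).unsqueeze(0).float().to(device)
                y = torch.as_tensor([[float(labels[i])]]).to(device)
                optimizer.zero_grad()
                loss = criterion(model(x), y)
                loss.backward()
                optimizer.step()
                losses.append(loss.item())
            train_loss.append(float(np.mean(losses)))

            model.eval()
            with torch.no_grad():
                v_losses = []
                for i in val_idx:
                    x = torch.as_tensor(volumes[i]).unsqueeze(0).float().to(device)
                    y = torch.as_tensor([[float(labels[i])]]).to(device)
                    v_losses.append(criterion(model(x), y).item())
            val_loss.append(float(np.mean(v_losses)))
            print(f"Fold {fold} epoch {epoch + 1}: "
                  f"train {train_loss[-1]:.4f}, val {val_loss[-1]:.4f}")

        histories.append({"fold": fold, "train": train_loss, "val": val_loss})
    return histories
```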
- K = 5 Fold Cross-Validation Summary (Training Phase):
- Evaluation Metrics on Unseen Validation Data from 5-Fold Cross-Validation (a metric-computation sketch follows the interpretation bullets below):
| Trained Model (Fold) | Accuracy | Precision | Recall | F1 Score | ROC AUC |
|---|---|---|---|---|---|
| Fold 1 | 0.7600 | 0.8889 | 0.6154 | 0.7273 | 0.9295 |
| Fold 2 | 0.8000 | 1.0000 | 0.6154 | 0.7619 | 0.9744 |
| Fold 3 | 0.7600 | 0.8889 | 0.6154 | 0.7273 | 0.9551 |
| Fold 4 | 0.8000 | 1.0000 | 0.6154 | 0.7619 | 0.9551 |
| Fold 5 | 0.8000 | 1.0000 | 0.6154 | 0.7619 | 0.9551 |
| Mean ± Std | 0.7840 ± 0.0196 | 0.9556 ± 0.0544 | 0.6154 ± 0.0000 | 0.7481 ± 0.0170 | 0.9538 ± 0.0143 |
- The model shows high precision but relatively low recall, meaning it misses many actual MUT (mutant) cases, i.e., a high number of false negatives. This may be due to the use of frozen encoder weights.
- Accuracy remains consistent across folds, with stable performance on unseen data.
- ROC AUC is consistently high (~0.95 ± 0.01), indicating strong ability to distinguish between WT and MUT.
- Low standard deviations across metrics imply reliable generalization across all folds.
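The per-fold metrics above, and their mean ± std, can be computed along these lines, assuming `y_true` holds the validation labels and `y_prob` the sigmoid outputs for one fold (hypothetical names); scikit-learn's standard metric functions are used:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

def fold_metrics(y_true, y_prob, threshold=0.5):
    """Compute the five reported metrics for one fold from predicted probabilities."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)   # threshold sigmoid outputs at 0.5
    return {
        "Accuracy":  accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred),
        "Recall":    recall_score(y_true, y_pred),
        "F1 Score":  f1_score(y_true, y_pred),
        "ROC AUC":   roc_auc_score(y_true, y_prob),           # AUC uses probabilities, not hard labels
    }

def summarise(per_fold):
    """Mean ± std of each metric across folds, as in the summary row of the table."""
    return {k: (np.mean([m[k] for m in per_fold]), np.std([m[k] for m in per_fold]))
            for k in per_fold[0]}
```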
- Plots:
- Receiver Operating Characteristic (ROC) Curves Across 5 Folds with Mean AUC:

- Training and Validation loss curves across all 5 folds during cross-validation:


- The model does not appear to be overfitting:
- Validation loss is consistently below training loss, which may suggest good generalization.
- This pattern may result from the use of transfer learning with frozen encoder weights, together with regularization techniques (such as dropout or data augmentation) that are applied during training but not during validation, which inflates the training loss relative to the validation loss despite good generalization.
Note: During 5-fold cross-validation, Fold 3 stopped training at epoch 26 due to early stopping. All other folds continued training up to epoch 30. To ensure consistency in visualizing and averaging training and validation losses across folds, all loss curves were truncated at epoch 26 (the final epoch completed by every fold).
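A small sketch of the truncation and averaging step described in the note, assuming `histories` is the per-fold list of loss curves returned by the training-loop sketch above (a hypothetical structure):

```python
import numpy as np

def truncate_and_average(histories, key="val"):
    """Truncate every fold's loss curve to the shortest fold (epoch 26 here) and average per epoch."""
    min_len = min(len(h[key]) for h in histories)        # shortest fold, e.g. 26 epochs for Fold 3
    curves = np.array([h[key][:min_len] for h in histories])
    return curves.mean(axis=0), curves.std(axis=0)       # mean and std loss per epoch across folds
```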
- Approach 2:
- Implemented 5-fold cross-validation.
- Ran the network for 30 epochs per fold.
- Total training time: 23 hours, 10 minutes, and 25 seconds.
- Loss function: binary cross-entropy (BCE).
- K = 5 Fold Cross-Validation Summary (Training Phase):

- Evaluation Metrics on Unseen Validation Data from 5-Fold Cross-Validation:
| Trained Model (Fold) | Accuracy | Precision | Recall | F1 Score | ROC AUC |
|---|---|---|---|---|---|
| Fold 1 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Fold 2 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Fold 3 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Fold 4 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Fold 5 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Mean ± Std | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 |
- The model achieved perfect classification metrics across all folds, indicating highly consistent performance.
- Recall has increased from 0.6154 to 1.0000; there are no false negatives.
- The zero standard deviation across folds shows the model is not only accurate but also consistent.
Note: The results appeared too good to be true at first glance. To ensure reliability, I thoroughly checked for class imbalance and data leakage; both checks confirmed no issues, validating the integrity of the evaluation pipeline (the checks are sketched below).
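A minimal sketch of the two sanity checks, assuming `splits` holds the per-fold (train, validation) index arrays from the cross-validation split and `sample_ids` identifies each volume (hypothetical names):

```python
import numpy as np
from collections import Counter

def sanity_check_folds(splits, labels, sample_ids):
    """Check class balance per fold and that no sample appears in both train and validation."""
    for fold, (train_idx, val_idx) in enumerate(splits, start=1):
        # 1) Class-imbalance check: class counts in the validation split of each fold.
        print(f"Fold {fold} validation class counts:", Counter(np.asarray(labels)[val_idx]))

        # 2) Data-leakage check: train and validation sample IDs must not overlap.
        overlap = set(np.asarray(sample_ids)[train_idx]) & set(np.asarray(sample_ids)[val_idx])
        assert not overlap, f"Fold {fold}: data leakage, shared samples: {overlap}"
    print("No train/validation overlap found in any fold.")
```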

- Plots:
- Receiver Operating Characteristic (ROC) Curves Across 5 Folds with Mean AUC:

- Training and Validation loss curves across all 5 folds during cross-validation:


The trends of the loss curves are similar to those of Approach 1, but both training and validation losses are significantly lower, indicating improved generalization.
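This comparison could be visualised by overlaying the mean loss curves of the two approaches, e.g. with matplotlib; `mean_val_loss_1` and `mean_val_loss_2` would come from the truncation helper sketched earlier (hypothetical names):

```python
import matplotlib.pyplot as plt

def plot_mean_loss_comparison(mean_val_loss_1, mean_val_loss_2):
    """Overlay the mean validation-loss curves of Approach 1 and Approach 2."""
    epochs = range(1, len(mean_val_loss_1) + 1)
    plt.plot(epochs, mean_val_loss_1, label="Approach 1 (mean validation loss)")
    plt.plot(epochs, mean_val_loss_2[:len(mean_val_loss_1)], label="Approach 2 (mean validation loss)")
    plt.xlabel("Epoch")
    plt.ylabel("BCE loss")
    plt.legend()
    plt.tight_layout()
    plt.show()
```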
- Finalise the network for training on my 3D medical dataset:
- Approach 1: low recall is a problem here.
- Approach 2
- Implement additional evaluation methods to validate the results (one possible check is sketched below).
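As an illustration of one such additional evaluation (an assumption on my part, not necessarily the method ultimately chosen), a per-fold confusion matrix makes false negatives and false positives explicit:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def per_fold_confusion(y_true, y_prob, threshold=0.5):
    """Confusion matrix for one fold: rows = true WT/MUT, columns = predicted WT/MUT."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    print(f"TN={tn}  FP={fp}  FN={fn}  TP={tp}")
    return tn, fp, fn, tp
```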
