Drug-Food Interaction

This project extracts and semantically models drug-food interactions using data sourced from DrugBank. It processes natural language interaction descriptions, links terms to biomedical ontologies via BioFalcon, and generates a structured RDF-based Knowledge Graph of interactions, drugs, foods, effects, impacts, and recommendations.

Project Workflow

Data Extraction
- The CSV file drugBank_drug_food_interactions.csv contains raw interaction descriptions from DrugBank.
- main.py processes the CSV file and extracts relevant terms (drugs, foods, effects, impacts, interactions).
- extracting the Inter has more than one DFI.py handles entries in which a single description embeds more than one drug-food interaction (DFI).
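
The multi-DFI step above can be pictured with a minimal sketch. It assumes that separate interactions within one entry are written as separate sentences; the function name `split_interactions` and the sample entry are hypothetical and are not taken from the repository's scripts.

```python
import re
from typing import List


def split_interactions(description: str) -> List[str]:
    """Split one free-text entry into candidate single-interaction sentences.

    Assumption: each interaction ends with a period or semicolon
    followed by whitespace.
    """
    parts = re.split(r"(?<=[.;])\s+", description.strip())
    return [part.strip() for part in parts if part.strip()]


# Hypothetical entry holding two interactions.
entry = "Avoid grapefruit products. Take with food."
for sentence in split_interactions(entry):
    print(sentence)
```

A split this simple also breaks on abbreviations such as "e.g. ", so real descriptions would likely need extra rules.
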
Term Normalization
- dictionary.py is used to normalize extracted terms (e.g., converting "increased", "increasing" → "increase").
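
As an illustration of dictionary-based normalization, the sketch below maps inflected forms to one canonical term. Only the "increased"/"increasing" → "increase" pair comes from this README; the other entries and the name `normalize_term` are assumptions made for the example and may not match dictionary.py.

```python
# Lookup table from surface form to canonical form.
# Only the first two entries are documented; the rest are illustrative.
NORMALIZATION = {
    "increased": "increase",
    "increasing": "increase",
    "increases": "increase",
    "decreased": "decrease",
    "decreasing": "decrease",
    "decreases": "decrease",
}


def normalize_term(term: str) -> str:
    """Return the canonical form of a term, or the cleaned term if unknown."""
    key = term.strip().lower()
    return NORMALIZATION.get(key, key)


print(normalize_term("Increasing"))
print(normalize_term("absorption"))
```

Unknown terms pass through unchanged (apart from trimming and lowercasing), so the table can grow incrementally without breaking earlier runs.
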
Entity Linking to UMLS
- BioFalcon linking.py uses BioFalcon to link each term to its UMLS Concept Unique Identifier (CUI).
- compare similarity.py applies fuzzy matching (fuzzywuzzy) to improve label alignment with UMLS terms.
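
The label-alignment idea can be sketched without external dependencies. The example below uses `difflib.SequenceMatcher` from the standard library as a stand-in for fuzzywuzzy, so its scores may differ slightly from what compare similarity.py computes. The candidate labels, the threshold of 80, and the function names are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional, Tuple


def similarity(a: str, b: str) -> int:
    """Case-insensitive similarity between two labels on a 0-100 scale."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def best_match(term: str, candidates: Iterable[str],
               threshold: int = 80) -> Optional[Tuple[str, int]]:
    """Return the closest candidate label and its score, or None if too weak."""
    scored = [(label, similarity(term, label)) for label in candidates]
    if not scored:
        return None
    label, score = max(scored, key=lambda pair: pair[1])
    return (label, score) if score >= threshold else None


# Hypothetical labels returned by an entity-linking step.
labels = ["Grapefruit Juice", "Orange juice", "Milk"]
print(best_match("grapefruit juice", labels))
print(best_match("alcohol", ["Milk"]))
```

Returning `None` below the threshold keeps weak matches out of the linked data rather than forcing every term onto some label.
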
Recommendation Extraction
- recommendations.py keeps only the interaction texts that are explicit recommendations and discards the rest.
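
One simple way to separate recommendations from descriptive statements is to look for an imperative cue at the start of the text. The sketch below does this with a regular expression; the cue list and the sample texts are assumptions for illustration, and recommendations.py may apply different rules.

```python
import re

# Hypothetical cue words that open an explicit recommendation.
RECOMMENDATION_CUES = re.compile(
    r"^\s*(avoid|take|limit|do not|drink|administer)\b",
    re.IGNORECASE,
)


def is_recommendation(text: str) -> bool:
    """True if the text starts with one of the imperative cue words."""
    return bool(RECOMMENDATION_CUES.search(text))


# Illustrative interaction texts.
texts = [
    "Avoid grapefruit products.",
    "Food increases absorption.",
    "Take with food.",
]
recommendations = [text for text in texts if is_recommendation(text)]
print(recommendations)
```
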
Semantic Mapping to RDF
- RDF/Turtle mapping files in the Mapping/ directory define rules to convert processed CSV files into RDF triples (.nt format).
- Output .nt files represent the semantic Knowledge Graph, suitable for querying and reasoning.
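
To make the target format concrete, the sketch below writes one interaction as N-Triples lines by hand. In the project itself the mapping files and an RDFizer produce these lines; the namespace `http://example.org/dfi/`, the property names, and the sample values here are hypothetical and do not reflect the project's actual vocabulary.

```python
from urllib.parse import quote

BASE = "http://example.org/dfi/"  # hypothetical namespace


def iri(local_name: str) -> str:
    """Build an N-Triples IRI term from a local name."""
    safe = quote(local_name.strip().replace(" ", "_"))
    return f"<{BASE}{safe}>"


def literal(text: str) -> str:
    """Build a plain N-Triples string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triple(subject: str, predicate: str, obj: str) -> str:
    """Join three terms into one N-Triples statement."""
    return f"{subject} {predicate} {obj} ."


# Hypothetical interaction record.
interaction = iri("interaction/1")
lines = [
    triple(interaction, iri("hasDrug"), iri("ExampleDrug")),
    triple(interaction, iri("hasFood"), iri("grapefruit juice")),
    triple(interaction, iri("description"), literal("Avoid grapefruit products.")),
]
print("\n".join(lines))
```

Each statement occupies exactly one line, which is what makes .nt files easy to stream, split, and concatenate.
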

Repository Structure

Drug-Food-Interaction-main/
│
├── main.py                             # Extracts data from DrugBank CSV
├── extracting the Inter has more than one DFI.py  # Handles multiple DFIs in one entry
├── dictionary.py                       # Normalizes terms to avoid duplicates
├── BioFalcon linking.py               # Links terms to UMLS using BioFalcon
├── compare similarity.py              # Matches terms using fuzzy similarity
├── recommendations.py                 # Extracts recommendation-based interactions
│
├── drugBank_drug_food_interactions.csv  # Raw interaction data from DrugBank (downloaded on Feb 28, 2024)
│
├── Mapping/                            # RDF mapping files and outputs
│   ├── *.ttl                           # Mapping templates (e.g., DrugMapping.ttl)
│   ├── *.nt                            # RDF output files
│   └── config.txt                      # Mapping configuration
│
├── error.log                           # Processing error logs
└── .idea/                              # PyCharm IDE metadata (can be ignored)

Requirements

- Python 3.7+
- fuzzywuzzy
- pandas
- BioFalcon API access (include a .env file or credentials if BioFalcon requires them)

Install required packages:

pip install -r requirements.txt

If requirements.txt is missing, install manually:

pip install pandas fuzzywuzzy python-Levenshtein

Usage

Start by extracting interactions

python main.py

Process multiple-interaction entries

python "extracting the Inter has more than one DFI.py"

Normalize and prepare terms

python dictionary.py

Link terms with UMLS using BioFalcon

python "BioFalcon linking.py"

Refine matches using fuzzy similarity

python "compare similarity.py"

Extract only recommendation-based interactions

python recommendations.py
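
The six script steps above can also be chained from one small driver. This is a convenience sketch, not part of the repository: it assumes the scripts are run from the repository root, take no command-line arguments, and should stop the chain on the first failure. The names `build_commands` and `run_pipeline` are hypothetical.

```python
import subprocess
import sys
from typing import List, Sequence

# Script order as listed in the Usage section.
SCRIPTS = (
    "main.py",
    "extracting the Inter has more than one DFI.py",
    "dictionary.py",
    "BioFalcon linking.py",
    "compare similarity.py",
    "recommendations.py",
)


def build_commands(scripts: Sequence[str] = SCRIPTS) -> List[List[str]]:
    """Build one argv list per script, using the current Python interpreter."""
    return [[sys.executable, script] for script in scripts]


def run_pipeline() -> None:
    """Run every step in order; raise CalledProcessError if a step fails."""
    for command in build_commands():
        print("Running:", command[-1])
        subprocess.run(command, check=True)
```

Calling `run_pipeline()` from the repository root runs the steps in sequence. Passing each command as a list avoids shell quoting problems with the file names that contain spaces.
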

Generate RDF triples with mappings

Use SDM-RDFizer or a similar tool to apply the .ttl mapping files in Mapping/ to the processed CSV files and produce the .nt RDF outputs.

Output

After processing, RDF triples representing drugs, foods, effects, impacts, and their interactions will be available in .nt format under the Mapping/ folder. These triples can be used for semantic reasoning, knowledge graph exploration, or querying with SPARQL.
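
A quick sanity check on a generated file is to count how often each predicate occurs. The sketch below does this with the standard library only, relying on the fact that subjects and predicates in N-Triples contain no whitespace. The sample lines and IRIs are hypothetical; for real SPARQL queries, an RDF library such as rdflib would be the more appropriate tool.

```python
from collections import Counter
from typing import Iterable


def predicate_counts(lines: Iterable[str]) -> Counter:
    """Count predicate occurrences in N-Triples lines, skipping blanks and comments."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Subject and predicate never contain whitespace in N-Triples,
        # so two splits isolate them from the object.
        _subject, predicate, _rest = line.split(None, 2)
        counts[predicate] += 1
    return counts


# Hypothetical sample; in practice, pass an open .nt file from Mapping/ instead.
sample = [
    '<http://example.org/dfi/i1> <http://example.org/dfi/hasDrug> <http://example.org/dfi/d1> .',
    '<http://example.org/dfi/i1> <http://example.org/dfi/description> "Take with food." .',
    '<http://example.org/dfi/i2> <http://example.org/dfi/hasDrug> <http://example.org/dfi/d2> .',
]
for predicate, count in predicate_counts(sample).most_common():
    print(count, predicate)
```
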

References

DrugBank: https://go.drugbank.com/
BioFalcon: https://labs.tib.eu/sdm/biofalcon
UMLS Metathesaurus: https://www.nlm.nih.gov/research/umls/index.html
SDM-RDFizer: https://github.com/SDM-TIB/SDM-RDFizer

Acknowledgements

This work was developed as part of the P4-LUCAT project, within a research workflow for semantic enrichment of biomedical data.
