Integrate volesti sampling routines into cobra MATLAB toolbox
Systems Biology is a fundamental field and paradigm that introduces a new era in Biology. Much of its usefulness relies on metabolic networks, which model the reactions occurring inside an organism and provide the means to understand the underlying mechanisms that govern biological systems. Moreover, metabolic networks have a broad impact that ranges from the analysis of ecosystems to personalized medicine.
The analysis of metabolic networks is closely tied to computational geometry, because one of the main operations it depends on is sampling points uniformly from polytopes; the polytope represents the steady states of the metabolic network. In particular, sampling from the feasible set of steady states is an important task because it avoids both the point estimation of fluxes (as in Flux Balance Analysis (FBA) and Flux Variability Analysis (FVA)) and the introduction of an objective function, the choice of which is itself a difficult task.
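For readers new to the area, the polytope of steady states is usually written as below. This is the standard constraint-based formulation, given here as background; the symbols $S$, $v$, $l$, $u$ and $c$ are notation assumed for this note and do not appear in the project description itself.

$$
P = \{\, v \in \mathbb{R}^n \;:\; S v = 0,\ \ l \le v \le u \,\}
$$

Here $S \in \mathbb{R}^{m \times n}$ is the stoichiometric matrix of a network with $m$ metabolites and $n$ reactions, $v$ is the vector of reaction fluxes, and $l, u$ are lower and upper flux bounds. FBA returns a single point of $P$ by maximizing a linear objective $c^{\top} v$, whereas uniform sampling describes the whole of $P$ without choosing $c$.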
The COnstraint-Based Reconstruction and Analysis (COBRA) project is the most important package for the analysis of metabolic networks. The MATLAB toolbox in particular provides a comprehensive collection of basic and advanced modelling methods. However, for uniform sampling, the run-time of the native MATLAB code grows sharply as we move towards more complete and thus more complex networks, i.e., from lower-dimensional networks to higher-dimensional ones. The C++ package volesti, under the GeomScale organization, provides the fastest implementations for uniform sampling from a convex polytope. Moreover, the implementation of the Multiphase Monte Carlo Sampling (MMCS) algorithm in [1] (in a development branch of volesti) is the first implementation that scales up to 5000 dimensions within a day.
The aim of the project is to integrate the C++ implementation of the MMCS algorithm of volesti into the cobra MATLAB toolbox. The student:
- has to read and follow the instructions on how to contribute C++ code to the cobra toolbox.
- will prepare and submit a contribution.
This project will offer new possibilities to the computational biology community. More specifically, it will allow the complete analysis of networks by extending cobra with new, more efficient methods and implementations. This is of high value given the rapid growth in the number of genome-scale metabolic reconstructions that are now available.
[1] Apostolos Chalkis, Vissarion Fisikopoulos, Elias Tsigaridas, Haris Zafeiropoulos, Geometric algorithms for sampling the flux space of metabolic networks, SoCG 2021.
Students, please contact both mentors below after completing at least one of the tests below.
- Elias Tsigaridas <elias.tsigaridas at inria.fr> is an expert in computational nonlinear algebra and geometry with experience in mathematical software. He has contributed to the implementation, in C and C++, of several solving algorithms for various open source computer algebra libraries and has previous GSoC mentoring experience with the R-project (2019).
- Vissarion Fisikopoulos <vissarion.fisikopoulos at gmail.com> is an expert in mathematical software, computational geometry and optimization, and has previous GSoC mentoring experience with Boost C++ libraries (2016-2019) and the R-project (2017-2019).
- Apostolos Chalkis <tolis.chal at gmail.com> is a PhD student in Computer Science. His research focuses on mathematical computing, optimization and computational finance. He has previous experience in GSoC 2018 and 2019 as a student under the R-project organization, implementing state-of-the-art algorithms for sampling from high dimensional multivariate distributions. He is one of the authors of volesti.
Students, please do one or more of the following tests before contacting the mentors above.
Easy: Compile and run the C++ interface of volesti.
Medium: Use the cobra MATLAB toolbox to sample from the flux space of the metabolic network of E. coli with the CHRR algorithm of cobra.
Hard: Implement a simple MATLAB wrapper for the C++ implementation of the Coordinate Hit and Run random walk (for uniform sampling) of volesti. Compare its efficiency with CHRR in the cobra MATLAB toolbox.
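To make the Hard test more concrete, the random walk it refers to can be sketched in a few lines. The Python below is only an illustration of the Coordinate Directions Hit-and-Run idea on a full-dimensional polytope {x : Ax <= b}; it is not volesti's C++ API and not the CHRR code in cobra. The function name `cdhr_sample` and all of its parameters are hypothetical, and the sketch assumes a bounded polytope and a strictly interior starting point.

```python
import random


def cdhr_sample(A, b, x0, n_samples, burn_in=100, seed=0):
    """Illustrative Coordinate Directions Hit-and-Run on {x : A x <= b}.

    A is a list of rows, b a list of right-hand sides, and x0 a point
    strictly inside the (bounded) polytope. Returns n_samples points.
    All names here are hypothetical, chosen for this sketch only.
    """
    rng = random.Random(seed)
    dim = len(x0)
    x = list(x0)
    # slack[i] = b[i] - A[i].x is positive while x is strictly inside.
    slack = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
    if min(slack) <= 0:
        raise ValueError("x0 must be strictly inside the polytope")

    samples = []
    for step in range(burn_in + n_samples):
        k = rng.randrange(dim)  # pick one coordinate axis uniformly
        # Find the segment {x + t * e_k} that stays inside the polytope:
        # each constraint gives A[i][k] * t <= slack[i].
        t_min, t_max = float("-inf"), float("inf")
        for row, s in zip(A, slack):
            a = row[k]
            if a > 0:
                t_max = min(t_max, s / a)
            elif a < 0:
                t_min = max(t_min, s / a)
        if t_min == float("-inf") or t_max == float("inf"):
            raise ValueError("polytope is unbounded along a coordinate axis")
        t = rng.uniform(t_min, t_max)  # uniform point on the chord
        x[k] += t
        # Only column k of A matters, so slacks are updated incrementally.
        slack = [s - row[k] * t for row, s in zip(A, slack)]
        if step >= burn_in:
            samples.append(list(x))
    return samples
```

For example, `cdhr_sample([[1, 0], [-1, 0], [0, 1], [0, -1]], [1, 0, 1, 0], [0.5, 0.5], 1000)` walks inside the unit square. A flux polytope also has equality constraints, which this sketch does not handle; they would first have to be eliminated so that the polytope becomes full-dimensional.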
💯 IMPORTANT: For tips, ask the mentors!
Students, please post a link to your test results here.