Models of the human metabolic network: aiming to reconcile metabolomics and genomics
Functional understanding of signaling pathways requires detailed information about the constituent molecules and their interactions. Simulations of signaling pathways therefore build upon a great deal of data from various sources. We first survey electronic data resources for cell signaling modeling and then classify them into five broad groups based on the type of data representation. None of the data sources surveyed provides all required data in a ready-to-be-modeled fashion. We then put forward a "wish list" of desired attributes for an ideal modeling-centric database.
Finally, we close with perspectives on how electronic data sources for cell signaling modeling have developed. We suggest that future directions in such data sources are largely model-driven and are hinged on interoperability of data sources.
Supplementary classification of databases for: Electronic data sources for kinetic models of cell signaling. J Biochem Tokyo. Authors: HarshaRani, G.
Database categories: Diagram Resources, Model Depositories, Specialized Databanks, Searchable Model Depositories, Online Modeling Databases.
The database has graphic depictions of molecular or cell-to-cell interactions. A number of curated pathways created from the interaction data are available as images. It contains pathway diagrams of metabolic pathways, cell-cell interactions, invasion of the erythrocyte by the parasite and transport functions. They have pathway diagrams among other information.
The BioModels.net project is an international effort to define common standards for model curation and annotation, coupled with a free, centralized, publicly accessible database of annotated models. This resource is currently under development. Their web site has a repository of sample models as BioNetGen input files. The models have been developed by hand, not converted automatically from other sources, and most were validated by running simulations on them using the parameter and boundary values specified in the model. BIND has three classifications for molecular associations: molecules that associate with each other to form interactions, molecular complexes that are formed from one or more interaction(s), and pathways that are defined by a specific sequence of two or more interactions.
It is maintained and developed at the Institute of Biochemistry at the University of Cologne. Data on enzyme function are extracted directly from the primary literature. Signaling pathways are compiled as binary relationships of biomolecules and represented by graphs drawn automatically. KDBI contains information about binding or reaction event, participating molecules, binding or reaction equation, kinetic data and related references.
For example, Albert et al. The steady-state behavior of this model was in excellent agreement with experimentally observed expression patterns under wild type and several gene mutation conditions. This study highlighted the importance of the network topology in determining biologically correct asymptotic states of the system.
Indeed, when the segment polarity gene control network was modeled with more detailed kinetic models, such as systems of nonlinear differential equations, exceptional robustness to changes in the kinetic parameters was observed [ 32 ]. Boolean networks have also been used to model the yeast and mammalian cell cycle [ 33 , 34 ]. Li et al. The Boolean network formalism was also recently used to model systems-level regulation of the host immune response, which resulted in experimentally validated predictions regarding cytokine regulation and the effects of perturbations [ 35 ].
Boolean rules can be learned from gene expression data using methods from computational learning theory [ 36 ] and statistical signal processing [ 37 ].
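To make the Boolean network formalism concrete, the following is a minimal sketch of synchronous updating and attractor search. The three-gene network and its rules are hypothetical and chosen only for illustration; they are not taken from any of the cited models.

```python
from itertools import product

# Hypothetical three-gene network; the rules are illustrative assumptions.
RULES = {
    "A": lambda s: not s["C"],          # C represses A
    "B": lambda s: s["A"],              # A activates B
    "C": lambda s: s["A"] and s["B"],   # A and B jointly activate C
}
GENES = sorted(RULES)

def step(state):
    """Synchronously update every gene from the current state tuple."""
    s = dict(zip(GENES, state))
    return tuple(bool(RULES[g](s)) for g in GENES)

def attractor_from(state):
    """Iterate until a state repeats; return the repeating cycle of states."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return tuple(seen[seen.index(state):])

def all_attractors():
    """Collect the distinct attractors reached from every initial state."""
    found = set()
    for init in product([False, True], repeat=len(GENES)):
        cycle = attractor_from(init)
        k = cycle.index(min(cycle))      # rotate to a canonical starting state
        found.add(cycle[k:] + cycle[:k])
    return found

if __name__ == "__main__":
    for cycle in all_attractors():
        print(len(cycle), cycle)
```

In this toy network every initial state ends up on a single cycle of five states. A fixed point, the Boolean analogue of a steady state, would appear as a cycle of length one.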
A limitation of the Boolean network approach is its inherent determinism. Because of the inherent stochasticity of gene expression and the uncertainty associated with the measurement process due to experimental noise and possible interacting latent variables e. The contribution of each function is proportional to its determinative potential as captured by statistical measures such as the coefficient of determination, which are estimated from the data [ 37 ]. The dynamical behavior of PBNs can be studied using the theory of Markov chains, which allows the determination of steady-state behavior as well as systematic intervention and control strategies designed to alter system behavior in a specified manner [ 39-41 ].
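The Markov chain view of a PBN can be sketched as follows. This is a simplified illustration that assumes each gene selects one of its candidate functions independently at every step; the two genes, their candidate functions, and the selection probabilities are hypothetical rather than estimated from data.

```python
from itertools import product

# Hypothetical two-gene PBN: (selection probability, Boolean function) pairs.
CANDIDATES = [
    [(0.7, lambda x: x[1]), (0.3, lambda x: x[0])],               # gene 0
    [(0.6, lambda x: x[0] and x[1]), (0.4, lambda x: not x[0])],  # gene 1
]
STATES = list(product([0, 1], repeat=len(CANDIDATES)))

def transition_matrix():
    """Build P[i][j], the probability of moving from STATES[i] to STATES[j]."""
    P = []
    for s in STATES:
        # Probability that each gene is ON at the next step.
        p_on = [sum(c for c, f in funcs if f(s)) for funcs in CANDIDATES]
        row = []
        for t in STATES:
            prob = 1.0
            for gene, value in enumerate(t):
                prob *= p_on[gene] if value else 1.0 - p_on[gene]
            row.append(prob)
        P.append(row)
    return P

def stationary(P, iters=2000):
    """Approximate the steady-state distribution by repeated multiplication."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

if __name__ == "__main__":
    P = transition_matrix()
    for state, prob in zip(STATES, stationary(P)):
        print(state, round(prob, 4))
```

Power iteration converges to a unique distribution here only because this toy chain is irreducible and aperiodic. In general that has to be checked, or guaranteed by adding a small random perturbation to the model.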
The PBN formalism has been used to construct networks in the context of several cancer studies, including glioma [ 42 ], melanoma [ 41 ], and leukemia [ 40 ]. PBNs, which are stochastic rule-based models, bear a close relationship to dynamic Bayesian networks [ 43 ], a popular model class for representing the dynamics of gene expression.
Bayesian networks are graphical models that have been used to represent conditional dependencies and independencies among the variables corresponding to gene expression measurements [ 44 ]. One limitation of Bayesian networks for modeling genetic networks is that these models must be in the form of directed acyclic graphs and, as such, are not able to represent feedback control mechanisms.
Dynamic Bayesian networks, on the other hand, are Bayesian networks that are capable of representing temporal processes [ 45 , 46 ] that may include such feedback loops. Since not all causal relationships can be inferred from correlation data, meaning that there can be different directed graphs that explain the data equally well, intervention experiments where genes are manipulated by overexpression or deletion have been proposed to learn networks [ 47 ].
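One way to see why dynamic Bayesian networks can represent feedback is that a loop between genes becomes acyclic once each edge points from one time slice to the next. The sketch below illustrates only this structural point; the two-gene loop and the helper names `has_cycle` and `unroll` are hypothetical.

```python
def has_cycle(edges):
    """Kahn's algorithm: the graph is acyclic iff every node can be removed
    in topological order."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    for _, v in edges:
        indegree[v] += 1
    queue = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while queue:
        u = queue.pop()
        removed += 1
        for a, b in edges:
            if a == u:
                indegree[b] -= 1
                if indegree[b] == 0:
                    queue.append(b)
    return removed < len(nodes)

def unroll(edges, n_slices):
    """Copy each edge u -> v as (u, t) -> (v, t + 1) across time slices."""
    return [((u, t), (v, t + 1))
            for t in range(n_slices - 1)
            for u, v in edges]

if __name__ == "__main__":
    feedback = [("A", "B"), ("B", "A")]    # hypothetical mutual regulation
    print(has_cycle(feedback))             # static graph contains a loop
    print(has_cycle(unroll(feedback, 3)))  # unrolled graph is acyclic
```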
The Bayesian network formalism has also been used to infer signaling networks from multicolor flow cytometry data [ 48 ]. There exist a number of other approaches for inferring large-scale molecular regulatory networks from high-throughput data sets. One example is a method, called the Inferelator, that selects the most likely regulators of a given gene using a nonlinear model that can incorporate combinatorial nonlinear influences of a regulator on target gene expression, coupled with a sparse regression approach to avoid overfitting [ 49 ].
In order to constrain the network inference, the Inferelator performs a preprocessing step of biclustering using the cMonkey algorithm [ 50 ], which results in a reduction of dimensionality and places the inferred interactions into experiment-specific contexts.
The authors used this approach to construct a model of transcriptional regulation in Halobacterium that relates 80 transcription factors to predicted gene targets. Another method that predicts functional associations among genes by extracting statistical dependencies between gene expression measurements is the ARACNe algorithm [ 51 ]. This information-theoretic method uses a pairwise mutual information criterion across gene expression profiles to determine significant interactions. A key step in the method is the use of the so-called data processing inequality, which is intended to eliminate indirect relationships in which two genes are co-regulated through one or more intermediaries.
Thus, the relationships in the final reconstructed network are more likely to represent the direct regulatory interactions.
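The pruning step based on the data processing inequality can be sketched as follows. This is a simplified illustration and not the published ARACNe implementation: it assumes that pairwise mutual information values have already been estimated, and the gene names, values, and the `threshold` and `tolerance` parameters are hypothetical.

```python
from itertools import combinations

def dpi_prune(mi, threshold=0.0, tolerance=0.0):
    """Return the edges kept after removing, in every fully connected triplet,
    the edge with the lowest mutual information (the presumed indirect link)."""
    genes = sorted({g for pair in mi for g in pair})

    def w(a, b):
        # Mutual information is symmetric, so look the pair up in either order.
        return mi.get((a, b), mi.get((b, a), 0.0))

    edges = {frozenset(p) for p in combinations(genes, 2) if w(*p) > threshold}
    to_remove = set()
    for i, j, k in combinations(genes, 3):
        triplet = [frozenset((i, j)), frozenset((j, k)), frozenset((i, k))]
        if not all(e in edges for e in triplet):
            continue
        weakest = min(triplet, key=lambda e: w(*e))
        others = [e for e in triplet if e != weakest]
        if all(w(*weakest) < w(*e) * (1.0 - tolerance) for e in others):
            to_remove.add(weakest)
    # Removals are decided against the original graph, then applied together.
    return edges - to_remove

if __name__ == "__main__":
    # Hypothetical chain X -> Y -> Z: the X-Z dependence is indirect.
    mi = {("X", "Y"): 0.9, ("Y", "Z"): 0.8, ("X", "Z"): 0.3}
    print(sorted(sorted(e) for e in dpi_prune(mi)))
```

For the chain in the usage example, the weak X-Z edge is dropped and the two direct edges are kept.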
A method related to the ARACNe algorithm, called the context likelihood of relatedness (CLR), also uses the mutual information measure but applies an adaptive background correction step to eliminate false correlations and indirect influences [ 53 ]. CLR was applied to a compendium of E. The CLR algorithm outperformed the other algorithms, which included Bayesian networks and ARACNe, when tested against experimentally determined interactions curated in the RegulonDB database.
It also identified many novel interactions, a number of which were verified with chromatin immunoprecipitation [ 53 ]. There are fundamental differences between the biochemical and statistical classes of network modeling described herein. One clear difference is the manner in which these underlying networks are reconstructed.
For biochemical networks, reconstruction is typically a work-intensive process requiring significant biochemical characterization with little network inference done other than inferences of single gene function for catalyzing a reaction based on sequence homology. Thus, the ability to rapidly generate networks for organisms that are relatively uncharacterized from high-throughput data is an inherent advantage of the inferred statistical networks.
One advantage of the biochemical network models is that, once reconstructed, the networks are not as subject to change other than addition since many of the links are based directly on biochemical evidence. Inferred networks, on the other hand, can undergo substantial changes in light of additional data.
Another common difference, although not fundamental, is that constraint-based biochemical network models have mostly been used to model flux, whereas inference networks have mostly been used to predict substance amounts e. One way this can be thought of is that the biochemical network models currently link more closely to functional phenotype i. The kinetic biochemical network models, of course, have the capacity to account for both flux and abundance, but suffer from the limitation that they are by far the most data intensive to reconstruct.
Another key advantage of biochemical reaction networks, stemming from their basis in chemistry, is that physico-chemical laws apply, such as mass-energy balance, while such laws are not generally applicable to the inferred networks. Of course, the advantage of the inferred networks is that, since they do not need to be mechanistic or require biochemical detail, they can be applied very broadly to systems that are not yet well characterized and can link very disparate data types as long as underlying correlations exist.
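As an illustration of how a physico-chemical constraint enters a biochemical network model, constraint-based approaches commonly express steady-state mass balance as S·v = 0, where S is the stoichiometric matrix and v the flux vector. The sketch below checks that condition for a hypothetical three-reaction pathway; the network and the flux values are assumptions for illustration only.

```python
# Hypothetical pathway: (uptake) -> A, A -> B, B -> (export).
# Rows are metabolites A and B; columns are the three reaction fluxes.
S = [
    [1, -1, 0],   # A: produced by uptake, consumed by A -> B
    [0, 1, -1],   # B: produced by A -> B, consumed by export
]

def net_production(S, v):
    """Return S . v, the net production rate of each metabolite."""
    return [sum(coef * flux for coef, flux in zip(row, v)) for row in S]

def is_mass_balanced(S, v, tol=1e-9):
    """A flux vector satisfies steady-state mass balance when S . v = 0."""
    return all(abs(rate) <= tol for rate in net_production(S, v))

if __name__ == "__main__":
    print(is_mass_balanced(S, [2.0, 2.0, 2.0]))  # balanced flux distribution
    print(net_production(S, [2.0, 1.0, 1.0]))    # metabolite A accumulates
```

No comparable conservation check is available for a purely statistical network, whose edges encode dependencies rather than stoichiometry.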
In summary, both modeling types are essential to contemporary computational systems biology, and each has advantages over the other in different settings. One interesting challenge going forward is whether hybrid models that take advantage of the strengths of the different modeling approaches can be constructed to move us further towards the goal of predictive whole-cell models and beyond. Early attempts have been made to link Boolean regulatory networks with constraint-based flux models [ 8 ], but the extent to which these approaches can be married to provide significant advances in our ability to model biological networks remains an open question.