Quasispecies diversity in the V1–V2 region in <i>env</i> gene of the recovered viruses
a<p>Mean genetic distance measured as substitutions per site in all pairwise comparisons. As heterogeneity differences between the four regions studied were minor, we used the mean genetic distance of all viruses. The quasispecies heterogeneity was estimated as the mean genetic distance of the nucleotide sequences by Maximum Likelihood, after using jModeltest to establish the parameters, which selected the GTR+G model. The estimation was carried out with the PAUP program <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0088579#pone.0088579-Swofford1" target="_blank">[40]</a>.</p>
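The legend above averages a genetic distance over all pairwise sequence comparisons. The sketch below illustrates only that averaging step, using the uncorrected p-distance (proportion of differing sites) as a simple stand-in. It is not the Maximum Likelihood GTR+G estimate computed with PAUP, and the function names and toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Positions with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_pairwise_distance(sequences):
    """Average p-distance over every unordered pair of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, not data from the study.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
print(mean_pairwise_distance(toy_alignment))
```

A model-based distance such as GTR+G additionally corrects for multiple substitutions at the same site and for rate variation among sites, so it is generally at least as large as the p-distance for the same pair.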
Fitness values of the viruses and their increases during the recovery passages
Representation of the fitness landscape from viral consensus sequences in the V1–V2 region in <i>env</i> gene and of the evolutionary trajectories of quasispecies variants
<p>The landscape was created by SOM (15×15 neurons) with the 55 consensus sequences in the V1–V2 region in <i>env</i> gene from the global sequences, labelled as in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0088579#pone-0088579-g003" target="_blank">Figure 3</a> (with an <i>L</i> = 1 factor) and drawn using the same grey scale as in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0088579#pone-0088579-g003" target="_blank">Figure 3</a>. A) Fitness landscape map showing the neuron that maps each viral consensus sequence. B) Representation of some of the 911 sequences from the viral quasispecies dataset, with unknown fitness values, projected on this fitness landscape map. The quasispecies variants from each virus are displayed as a circle over the neuron that maps them, and the diameter of the circle represents the proportion of variants identified in passage 1 (in blue), passage 11 (in green), passage 21 (in yellow) and passage 31 (in red). The quantification of the quasispecies variants in each neuron is summarized in Table S4 in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0088579#pone.0088579.s001" target="_blank">File S1</a>. Coloured arrows joining the circles show the estimated evolutionary trajectories of the viral clones during the recovery passages.</p>
Mean genetic divergence between lineages
a<p>Estimates of evolutionary divergence between lineages expressed as mean number of base substitutions per site in sequence pairs.</p>b<p>Standard error estimate(s). Analyses were conducted using the Maximum Composite Likelihood model <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0088579#pone.0088579-Tamura1" target="_blank">[32]</a>. The analysis involved 46 nucleotide sequences and a total of 8663 positions in the final dataset. Codon positions included were 1st+2nd+3rd+Noncoding. All positions containing gaps and missing data were eliminated. Evolutionary analyses were conducted in the MEGA5 program <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0088579#pone.0088579-Tamura1" target="_blank">[32]</a>.</p>
Populations of RNA viruses are composed of complex and dynamic mixtures of variant genomes that are termed mutant spectra or mutant clouds. This also applies to SARS-CoV-2, and mutations that are detected at low frequency in an infected individual can be dominant (represented in the consensus sequence) in subsequent variants of interest or variants of concern. Here we briefly review the main conclusions of our work on mutant spectrum characterization of hepatitis C virus (HCV) and SARS-CoV-2 at the nucleotide and amino acid levels and address the following two new questions derived from previous results: (i) what the composition of the SARS-CoV-2 mutant and deletion spectrum in diagnostic samples is when examined at progressively lower cut-off mutant frequency values in ultra-deep sequencing; (ii) how the frequency distribution of minority amino acid substitutions in SARS-CoV-2 compares with that of HCV, also sampled from infected patients. The main conclusions are the following: (i) the nu...
The entropy production per unit volume in the chaotic regime of a chiral hypercycle in an open-flow reactor.
Crowdsourced direct-to-consumer genomic analysis of a family quartet
Background: We describe the pioneering experience of a Spanish family pursuing the goal of understanding their own personal genetic data to the fullest possible extent using direct-to-consumer (DTC) tests. With full informed consent from the Corpas family, all genotype, exome and metagenome data from members of this family are publicly available under a public domain Creative Commons 0 (CC0) license waiver. All scientists or companies analysing these data (“the Corpasome”) were invited to return results to the family. Methods: We released 5 genotypes, 4 exomes and 1 metagenome from the Corpas family via a blog and figshare under a public domain license, inviting scientists to join the crowdsourcing efforts to analyse the genomes in return for coauthorship or acknowledgement in derived papers. Resulting analysis data were compiled via social media and direct email. Results: Here we present the results of our investigations, combining the crowdsourced contributions and our own efforts. F...
Time (sec) courses of external variations of concentrations (mM) of <i>F</i> (in blue) and <i>T</i> (in green)
<p>The first four courses <i>a–d</i> (A–D) are taken from <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0041122#pone-0041122-g004" target="_blank">figure 4</a> of <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0041122#pone.0041122-Gilman1" target="_blank">[17]</a>, where they are labeled I, II, III and IV. The other four courses <i>e–h</i> (E–H) are obtained through the following sinusoidal equations: and , where <i>a<sub>1</sub></i> and <i>a<sub>2</sub></i> are the amplitudes, <i>t</i> is the time, <i>T</i> the period, <i>φ</i> the phase and <i>min<sub>F</sub></i> and <i>min<sub>T</sub></i> are the minimum values of <i>F</i> and <i>T</i>. The first two (<i>e</i> and <i>f</i>) differ in their period but have the same phase (<i>φ</i> = 0); course <i>e</i> has a long period (<i>T</i> = 1000) while course <i>f</i> has a shorter one (<i>T</i> = 50). The last two sinusoidal courses (<i>g</i> and <i>h</i>) differ from each other in phase (for course <i>g</i>, <i>φ</i> = 0 and for course <i>h</i>, <i>φ</i> = 10) and from the other two in period (<i>T</i> = 300). The concentrations of the reservoir species <i>F</i> and <i>T</i> vary within two regimes: <i>F</i> is centred at 60 mM and 30 mM, and <i>T</i> at 30 mM and 20 mM.</p>
Additional file 4: of Crowdsourced direct-to-consumer genomic analysis of a family quartet
Consent form signed by the living participants of this study. (PDF 544 kb)
Additional file 2: of Crowdsourced direct-to-consumer genomic analysis of a family quartet
Genome Trax table with homozygous deleterious SNPs and their associated disease. (XLSX 41 kb)
Additional file 1: of Crowdsourced direct-to-consumer genomic analysis of a family quartet
Combined report from personal genomics providers. (DOCX 275 kb)
Additional file 3: of Crowdsourced direct-to-consumer genomic analysis of a family quartet
Table summary of ingenuity analysis results. (XLS 96 kb)
The European sea bass is one of the most important cultured fish in Europe and has a marked sexual growth dimorphism in favor of females. It is a gonochoristic species with polygenic sex determination, where a combination between still undifferentiated genetic factors and environmental temperature determines sex ratios. The molecular mechanisms responsible for gonadal sex differentiation are still unknown. Here, we sampled fish during the gonadal developmental period (110 to 350 days post fertilization, dpf), and performed a comprehensive transcriptomic study by using a species-specific microarray. This analysis uncovered sex-specific gonadal transcriptomic profiles at each stage of development, identifying a larger number of differentially expressed genes in ovaries when compared to testis. The expression patterns of 54 reproduction-related genes were analyzed. We found that hsd1710 is a reliable marker of early ovarian differentiation. Further, three genes, pdgfb, snx1 and nfy, not previously related to fish sex differentiation, were tightly associated with testis development in the sea bass. Regarding signaling pathways, lysine degradation, bladder cancer and NOD-like receptor signaling were enriched for ovarian development while eight pathways including basal transcription factors and steroid biosynthesis were enriched for testis development. Analysis of the transcription factor abundance showed an earlier increase in females than in males. Our results show that, although many players in the sex differentiation pathways are conserved among species, there are peculiarities in gene expression worth exploring. The genes identified in this study illustrate the diversity of players involved in fish sex differentiation and can become potential biomarkers for the management of sex ratios in the European sea bass and perhaps other cultured species.
A new approach for parameter estimation in chemical kinetics has been recently proposed [1]. It makes use of an optimization criterion based on a Generalized Fisher Equation (GFE). Its utility has been demonstrated with two reaction mechanisms, the chlorite-iodide and Oregonator, which are computationally stiff systems. In this paper the performance of the GFE-based algorithm is compared to that obtained from minimization of the squared distances between the observed and predicted concentrations obtained by solving the corresponding initial value problem (we call this latter approach "traditional" for simplicity). Comparing the two methods reveals a trade-off between speed (which favors GFE) and accuracy (which favors the traditional method). The chlorite-iodide and Oregonator systems are again chosen as case studies. An identifiability analysis is performed for both of them, followed by an optimal experimental design based on the Fisher Information Matrix (FIM). This makes it possible to identify and overcome most of the previously encountered identifiability issues, improving the estimation accuracy. With the new data, obtained from optimally designed experiments, it is now possible to effectively estimate more parameters than with the previous data. This result, which holds for both GFE-based and traditional methods, stresses the importance of an appropriate experimental design. Finally, a new hybrid method that combines advantages from the GFE and traditional approaches is presented.
Correction: Multi-Criteria Optimization of Regulation in Metabolic Networks
PLoS ONE, 2012
There were equation formatting errors in the Figure 2 legend. The correct equations can be viewed here:
Determining the regulation of metabolic networks at genome scale is a hard task. It has been hypothesized that biochemical pathways and metabolic networks might have undergone an evolutionary process of optimization with respect to several criteria over time. In this contribution, a multi-criteria approach has been used to optimize parameters for the allosteric regulation of enzymes in a model of a metabolic substrate-cycle. This has been carried out by calculating the Pareto set of optimal solutions according to two objectives: the proper direction of flux in a metabolic cycle and the energetic cost of applying the set of parameters. Different Pareto fronts have been calculated for eight different "environments" (specific time courses of end product concentrations). For each resulting front the so-called knee point is identified, which can be considered a preferred trade-off solution. Interestingly, the optimal control parameters corresponding to each of these points also lead to optimal behaviour in all the other environments. By averaging the parameter sets of the most frequently found knee solutions, a final and optimal consensus set of parameters can be obtained, which is an indication of the existence of a universal regulation mechanism for this system. The implications of such a universal regulatory switch are discussed in the framework of large metabolic networks.
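The two-objective procedure in the abstract above (compute a Pareto front, then pick its knee point) can be sketched in a few lines. This is only an illustration under stated assumptions: both objectives are treated as quantities to minimise, the knee is taken as the front point farthest from the straight line joining the two extreme solutions (one common definition, not necessarily the one used in the paper), and the function names and candidate values are hypothetical.

```python
import math


def pareto_front(points):
    """Return the non-dominated points, sorted, when both objectives are minimised."""
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))


def knee_point(front):
    """Front point farthest from the line through the two extreme solutions."""
    if not front:
        raise ValueError("empty front")
    if len(front) < 3:
        return front[0]
    (x1, y1), (x2, y2) = front[0], front[-1]
    length = math.hypot(x2 - x1, y2 - y1)

    def distance(p):
        # Perpendicular distance from p to the line through the extremes.
        return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / length

    return max(front, key=distance)


# Hypothetical (objective_1, objective_2) pairs for candidate parameter sets.
candidates = [(1.0, 9.0), (2.0, 4.0), (3.0, 3.5), (5.0, 3.0),
              (9.0, 1.0), (4.0, 6.0), (6.0, 5.0)]
front = pareto_front(candidates)
print(front)
print(knee_point(front))
```

Because the distance is measured in raw objective units, the two objectives would normally be rescaled to a comparable range before the knee is located; that step is omitted here for brevity.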
The influence that intrinsic local density fluctuations can have on solutions of mean-field reaction-diffusion models is investigated numerically by means of the spatial patterns arising from two species that react and diffuse in the presence of strong internal reaction noise. The dynamics of the Gray-Scott (GS) model with constant external source is first cast in terms of a continuum field theory representing the corresponding master equation. We then derive a Langevin description of the field theory and use these stochastic differential equations in our simulations. The nature of the multiplicative noise is specified exactly without recourse to assumptions and turns out to be of the same order as the reaction itself, and thus cannot be treated as a small perturbation. Many of the complex patterns obtained in the absence of noise for the GS model are completely obliterated by these strong internal fluctuations, but we find novel spatial patterns induced by this reaction noise in regions of parameter space that otherwise correspond to homogeneous solutions when fluctuations are not included.
Studying the communities of microbial species is highly important since many natural and artificial processes are mediated by groups of microbes rather than by single entities. One way of studying them is to search for common metabolic characteristics among microbial species, which is not only a potential measure for the differentiation and classification of closely related organisms but also a way of finding common functional properties that may describe the way of life of entire organisms or species. In this work, our main contribution is an expert system (ES) that clusters a complex data set of 365 prokaryotic species by 114 metabolic features, information that may be incomplete for some species. Inspired by human expert reasoning and based on hierarchical clustering strategies, the proposed ES estimates the optimal number of clusters into which to divide the dataset and then starts an iterative clustering process, based on the Self-Organizing Maps (SOM) approach, in which it finds relevant clusters at different steps by means of a new validity index inspired by the well-known Davies-Bouldin (DB) index. In order to monitor the process and assess the behavior of the ES, the partition obtained at each step is validated with the DB validity index. The resulting clusters show that metabolic features combined with the ES can handle a complex dataset and, with an advantage over other existing approaches, help extract underlying information that may relate metabolism to phenotypic, environmental or evolutionary characteristics in prokaryotic species.
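For readers unfamiliar with the Davies-Bouldin (DB) index mentioned in the abstract above, the sketch below computes the standard DB index for a given partition, using Euclidean distances and cluster centroids. It does not reproduce the new validity index proposed in the paper, and the function names and toy clusters are hypothetical. Lower values indicate more compact, better separated clusters.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def davies_bouldin(clusters):
    """Standard Davies-Bouldin index of a partition.

    clusters: list of clusters, each a non-empty list of numeric tuples.
    Assumes no two clusters share the same centroid.
    """
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    centroids = [centroid(c) for c in clusters]
    # Scatter: mean distance of the members of a cluster to its centroid.
    scatters = [
        sum(math.dist(p, m) for p in c) / len(c)
        for c, m in zip(clusters, centroids)
    ]
    total = 0.0
    for i in range(len(clusters)):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatters[i] + scatters[j]) / math.dist(centroids[i], centroids[j])
            for j in range(len(clusters)) if j != i
        )
    return total / len(clusters)


# Hypothetical toy partitions: well separated versus poorly separated.
tight = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
loose = [[(0.0, 0.0), (0.0, 4.0)], [(3.0, 0.0), (3.0, 4.0)]]
print(davies_bouldin(tight))
print(davies_bouldin(loose))
```

The well separated partition scores lower than the poorly separated one, which is the property that makes the index usable for monitoring successive clustering steps.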
Finding the genes that exist within a DNA sequence and assigning them biological features and functions is one of the biggest challenges of Genomics. This task, called annotation, has to be as accurate and reliable as possible, because this information will be applied in other research. Ideally, each sequence should be annotated and validated by a human expert, who has the knowledge to infer the most appropriate annotation. Nevertheless, the huge amount of genomic data produced by the new sequencing technologies prevents this practice. Developing expert systems that are able to annotate sequences automatically and emulate the expert involvement in certain key points of the process would enhance the annotation quality. In this work, the CommonKADS methodology is innovatively applied for this purpose. It is used to structure and model the knowledge required to build an expert system able to deal with the functional part of sequence annotation, i.e. establishing the biological purpose of the sequence. This approach provides the first general framework for the aforementioned problem, which can be easily extended to related issues.
Papers by Federico Moran