An Introduction to RNA Databases
We present an introduction to RNA databases. The history and technology behind RNA databases are briefly discussed. We examine differing methods of data collection and curation and discuss their impact on both the scope and accuracy of the resulting databases. Finally, we demonstrate these principles through detailed examination of four leading RNA databases: Noncode, miRBase, Rfam, and SILVA. (Source: Springer protocols feed by Bioinformatics)
Source: Springer protocols feed by Bioinformatics - December 4, 2013 Category: Bioinformatics Source Type: news

Introduction to Stochastic Context Free Grammars
Stochastic context free grammars are a formalism which plays a prominent role in RNA secondary structure analysis. This chapter provides the theoretical background on stochastic context free grammars. We recall the general definitions and study the basic properties, virtues, and shortcomings of stochastic context free grammars. We then introduce two ways in which they are used in RNA secondary structure analysis: secondary structure prediction and RNA family modeling. This prepares for the discussion of applications of stochastic context free grammars in the chapters on Rfam (6), Pfold (8), and Infernal (9). (Source: Sprin...
Source: Springer protocols feed by Bioinformatics - December 4, 2013 Category: Bioinformatics Source Type: news

Energy-Directed RNA Structure Prediction
In this chapter we present the classic dynamic programming algorithms for RNA structure prediction by energy minimization, as well as variations of this approach that allow the computation of suboptimal foldings, or even of the partition function over all possible secondary structures. The latter are essential in order to deal with the inaccuracy of minimum free energy (MFE) structure prediction, and can be used, for example, to derive reliability measures that assign a confidence value to all or part of a predicted structure. In addition, we discuss recently proposed alternatives to the MFE criterion such as the use of maximum expec...
Source: Springer protocols feed by Bioinformatics - December 4, 2013 Category: Bioinformatics Source Type: news
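
The dynamic programming idea behind this chapter can be illustrated with a deliberately simplified stand-in. The sketch below is not the energy-minimization algorithm the chapter describes: it uses the Nussinov-style recursion, which maximizes the number of base pairs instead of minimizing free energy. The function names, the allowed pair set (Watson-Crick plus G-U wobble), and the minimum hairpin loop length of 3 are assumptions chosen for illustration.

```python
# Simplified stand-in for MFE folding: maximize the number of base pairs
# (Nussinov-style recursion) rather than minimize free energy.

# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """Return True if nucleotides a and b may form a base pair."""
    return (a, b) in ALLOWED_PAIRS


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired bases enclosed by a pair
    (an assumed hairpin constraint, not taken from the chapter).
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if can_pair(seq[k], seq[j]):
                    left = dp[i][k - 1] if k > i else 0
                    inner = dp[k + 1][j - 1]
                    best = max(best, left + 1 + inner)
            dp[i][j] = best
    return dp[0][n - 1]
```

For example, `max_base_pairs("GGGAAAUCC")` evaluates to 3 (a three-pair stem closing an AAA hairpin loop). Energy-based folding follows the same interval decomposition but scores loops with thermodynamic parameters instead of counting pairs.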

The Determination of RNA Folding Nearest Neighbor Parameters
The stability of RNA secondary structure can be predicted using a set of nearest neighbor parameters. These parameters are widely used by algorithms that predict secondary structure. This contribution introduces the UV optical melting experiments that are used to determine the folding stability of short RNA strands. It explains how the nearest neighbor parameters are chosen and how the values are fit to the data. A sample nearest neighbor calculation is provided. The contribution concludes with new methods that use the database of sequences with known structures to determine parameter values. (Source: Springer protocols fe...
Source: Springer protocols feed by Bioinformatics - December 4, 2013 Category: Bioinformatics Source Type: news

The Principles of RNA Structure Architecture
As an informational molecule, an enzyme, and a nanoscale molecular machine, ribonucleic acid (RNA) permeates all areas of biology and has been exploited in biotechnology as both drug and sensor. Here we describe the composition and fundamental properties of RNA and how single-stranded RNA chains fold into motifs that are repeatedly observed in different structures. Small and large molecular mass RNA binders are touched upon, as is the technology for selecting RNA molecules in vitro that bind almost any kind of natural or artificial target. Recognizing the versatility of RNA is expected to foster the develop...
Source: Springer protocols feed by Bioinformatics - December 4, 2013 Category: Bioinformatics Source Type: news

RNA–Protein Interactions: An Overview
RNA binding proteins (RBPs) are key players in the regulation of gene expression. In this chapter we discuss the main protein–RNA recognition modes used by RBPs in order to regulate multiple steps of RNA processing. We discuss traditional and state-of-the-art technologies that can be used to study RNAs bound by individual RBPs, or vice versa, for both in vitro and in vivo methodologies. To help highlight the biological significance of RBP mediated regulation, online resources on experimentally verified protein–RNA interactions are briefly presented. Finally, we present the major tools to computationally infer R...
Source: Springer protocols feed by Bioinformatics - December 4, 2013 Category: Bioinformatics Source Type: news

Bioinformatics of siRNA Design
RNA interference mediated by small interfering RNAs is a powerful tool for investigation of gene functions and is increasingly used as a therapeutic agent. However, not all siRNAs are equally potent, and although simple rules for the selection of good siRNAs were proposed early on, siRNAs are still plagued with widely fluctuating efficiency. Recently, new design tools incorporating both the structural features of the targeted RNAs and the sequence features of the siRNAs have substantially improved the efficacy of siRNAs. In this chapter we review the sequence- and structure-based algorithms behind these tools. (Source: Sp...
Source: Springer protocols feed by Bioinformatics - December 4, 2013 Category: Bioinformatics Source Type: news

Genotype Imputation to Increase Sample Size in Pedigreed Populations
Genotype imputation is a cost-effective way to increase the power of genomic selection or genome-wide association studies. While several genotype imputation algorithms are available, this chapter focuses on a heuristic algorithm, as implemented in the AlphaImpute software. This algorithm combines long-range phasing, haplotype library imputation, and segregation analysis and it is specifically designed to work with pedigreed populations. (Source: Springer protocols feed by Bioinformatics)
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Validation of Genome-Wide Association Studies (GWAS) Results
Validation of the results of genome-wide association studies or genomic selection studies is an essential component of the experimental program. Validation allows users to quantify the benefit of applying gene tests or genomic prediction, relative to the costs of implementing the program. Further, if implemented, an appropriate weight in a selection index can only be derived if estimates of the accuracy of genomic predictions are available. In this chapter the reasons for validation are explored, and a range of commonly encountered scenarios is described. General principles are stated, and options for performing validation di...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Detection of Signatures of Selection Using F_ST
Natural selection has molded the evolution of species across all taxa. Much more recently, on an evolutionary scale, human-oriented selection started to play an important role in shaping organisms, markedly so after the domestication of animals and plants. These selection processes have left traceable marks in the genome. Following from the recent advances in molecular genetics technologies, a number of methods have been developed to detect such signals, termed genomic signatures of selection. In this chapter we discuss a straightforward protocol based on the F_ST statistic to identify genomic regions that ex...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news
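
As a rough illustration of the statistic this chapter builds on, the sketch below computes a per-SNP fixation index from allele frequencies in two subpopulations. It assumes the heterozygosity-based definition (H_T - H_S) / H_T, a biallelic marker, and equally sized subpopulations. The abstract does not say which estimator the chapter's protocol uses, so the function name and these choices are assumptions.

```python
def fst_two_populations(p1, p2):
    """Fixation index for one biallelic SNP from two allele frequencies.

    p1, p2: frequency of the same allele in subpopulations 1 and 2.
    Assumes equal subpopulation sizes (an illustrative simplification).
    """
    # Mean expected heterozygosity within the subpopulations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity in the pooled (total) population.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # The SNP is monomorphic overall; no differentiation is defined,
        # so this sketch reports 0.0 by convention.
        return 0.0
    return (h_t - h_s) / h_t
```

With frequencies 0.2 and 0.8 this gives about 0.36; identical frequencies give 0, and alternate fixation (0 and 1) gives 1. Scanning such per-SNP values along the genome, often averaged in windows, is one common way to look for regions of unusually high differentiation.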

R for Genome-Wide Association Studies
In recent years R has become the de facto programming language of choice for statisticians, and it is also arguably the most widely used generic environment for the analysis of high-throughput genomic data. In this chapter we discuss some approaches to improve the performance of R when working with large SNP datasets. (Source: Springer protocols feed by Bioinformatics)
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Genotype Phasing in Populations of Closely Related Individuals
Knowledge of phase has many potential applications for exploiting genomic information. For example, phase can facilitate the identification of identity-by-descent sharing between pairs of individuals as part of the process of genotype imputation, or facilitate parent-of-origin allele modeling in order to quantify the effect of parental imprinting. (Source: Springer protocols feed by Bioinformatics)
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Use of Ancestral Haplotypes in Genome-Wide Association Studies
We present a haplotype-based method to perform genome-wide association studies. The method relies on hidden Markov models to describe haplotypes from a population as a mosaic of a set of ancestral haplotypes. For a given position in the genome, haplotypes deriving from the same ancestral haplotype are also likely to carry the same risk alleles. Therefore, the model can be used in several applications such as haplotype reconstruction, imputation, association studies or genomic predictions. We then illustrate the model with two applications: the fine-mapping of a QTL affecting live weight in cattle and association stu...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news

Detecting Regions of Homozygosity to Map the Cause of Recessively Inherited Disease
Homozygosity is a component of genetic patterning that can be used to search for the cause of genetic disease. In this chapter, methods are presented to analyze SNP data for the presence of homozygosity. Two exercises demonstrate methods to define runs of homozygosity, to identify shared homozygosity between individuals, and to evaluate the results in light of the expectations of a recessively inherited genetic disorder. An example dataset is used to aid in data interpretation. (Source: Springer protocols feed by Bioinformatics)
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news
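
One simple way to define a run of homozygosity can be sketched as follows. This is a minimal illustration, not the chapter's exercises: it assumes genotypes coded as 0, 1, or 2 copies of an allele (so 1 is heterozygous), treats any other value as breaking a run, and uses a hypothetical `min_snps` threshold in place of the physical-length and density criteria real analyses typically apply.

```python
def runs_of_homozygosity(genotypes, min_snps=3):
    """Return (start, end) index pairs, inclusive, of homozygous runs.

    genotypes: sequence of SNP genotypes along one chromosome, coded
               0/1/2 (assumed coding; 0 and 2 are homozygous).
    min_snps:  minimum number of consecutive homozygous SNPs for a run
               to be reported (hypothetical threshold).
    """
    runs = []
    start = None  # index where the current homozygous stretch began
    for idx, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = idx
        else:
            # A heterozygous (or unrecognized) genotype ends the stretch.
            if start is not None and idx - start >= min_snps:
                runs.append((start, idx - 1))
            start = None
    # Close a stretch that extends to the last SNP.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs
```

For the genotype vector `[0, 2, 2, 1, 0, 0, 1, 2, 2, 2, 0]` the function returns `[(0, 2), (7, 10)]`. Intersecting such intervals across affected individuals is one way to look for the shared homozygosity the abstract mentions.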

Genomic Best Linear Unbiased Prediction (gBLUP) for the Estimation of Genomic Breeding Values
Genomic best linear unbiased prediction (gBLUP) is a method that utilizes genomic relationships to estimate the genetic merit of an individual. For this purpose, a genomic relationship matrix is used, estimated from DNA marker information. The matrix defines the covariance between individuals based on observed similarity at the genomic level, rather than on expected similarity based on pedigree, so that more accurate predictions of merit can be made. gBLUP has been used for the prediction of merit in livestock breeding, may also have some applications to the prediction of disease risk, and is also useful in the estimation ...
Source: Springer protocols feed by Bioinformatics - January 1, 2013 Category: Bioinformatics Source Type: news
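
To make the idea of a marker-based relationship matrix concrete, the sketch below builds one from a small genotype table. The abstract does not say how the chapter constructs the matrix. The code assumes one widely used formulation, G = ZZ' / (2 Σ p_j (1 - p_j)), where Z holds genotypes centered by twice the allele frequency, and it assumes the frequencies are estimated from the sample itself. The function name and the 0/1/2 coding are likewise illustrative.

```python
def genomic_relationship_matrix(genotypes):
    """Genomic relationship matrix from a genotype table.

    genotypes: list of rows, one per individual, each a list of marker
               genotypes coded 0/1/2 (assumed coding).
    Returns an n x n list of lists.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency per marker, estimated from the sample (assumption).
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    # Scaling term: expected variance summed over markers.
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic; G is undefined")
    # Center each genotype by its expected value, 2p.
    z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    # G[a][b] is the scaled cross-product of the centered genotype rows.
    return [[sum(z[a][j] * z[b][j] for j in range(m)) / scale
             for b in range(n)]
            for a in range(n)]
```

The resulting matrix plays the role that the pedigree-based relationship matrix plays in conventional BLUP: it supplies the covariance structure among individuals in the mixed model from which breeding values are predicted.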