A minor update to my “apply functions” post
One of my more popular posts is A brief introduction to “apply” in R. Come August, it will be four years old. Technology moves on, old blog posts do not. So: thanks to BioStar user zx8754 for pointing me to this Stack Overflow post, in which someone complains that the code in the post does not work as described. The by example is now fixed. Side note: I often find “contact the author” is the most direct approach to solving this kind of problem ;) always happy to be contacted. Filed under: R, statistics, this blog. Tagged: biostar, stack overflow, stackexchange (Source: What You're Doing Is Rather Desperate)
Source: What You're Doing Is Rather Desperate - February 27, 2014 Category: Bioinformaticians Authors: nsaunders Tags: R statistics this blog biostar stack overflow stackexchange Source Type: blogs

NGS Saves A Young Life
One of the most electrifying talks at AGBT this year was given by Joe DeRisi of UCSF, who gave a brief intro on the difficulty of diagnosing the root cause of encephalitis (as it can be autoimmune, viral, protozoal, bacterial and probably a few other causes) and then ran down a gripping case history which seemed straight out of House. (Source: Omics! Omics!)
Source: Omics! Omics! - February 26, 2014 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs

A Sunset for Draft Genomes?
The sun set during AGBT 2014 for a final time over a week ago. The posters have long been down, and perhaps the liver enzyme levels of the attendees are now down to normal as well. This year’s conference underscored a possibility that was suggested last year: that the era of the poorly connected, low quality draft genome is headed for the sunset as well. (Source: Omics! Omics!)
Source: Omics! Omics! - February 24, 2014 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs

New publication: A panel of genes methylated with high frequency in colorectal cancer
This study has characterised a panel of 23 genes that show elevated DNA methylation in >50% of CRC tissue relative to non-neoplastic tissue. Six of these genes (SOX21, SLC6A15, NPY, GRASP, ST8SIA1 and ZSCAN18) show very low methylation in non-neoplastic colorectal tissue and are candidate biomarkers for stool-based assays, while 11 genes (BCAT1, COL4A2, DLX5, FGF5, FOXF1, FOXI2, GRASP, IKZF1, IRF4, SDC2 and SOX21) have very low methylation in peripheral blood DNA and are suitable for further evaluation as blood-based diagnostic markers. [1] Despite the tagline for this blog, very few of my posts feature the details of...
Source: What You're Doing Is Rather Desperate - February 6, 2014 Category: Bioinformaticians Authors: nsaunders Tags: bioinformatics publications research diary biomarker colorectal cancer csiro methylation Source Type: blogs

A lesson in “reading before you tweet”
So, I read the title: Mining locus tags in PubMed Central to improve microbial gene annotation and skimmed the abstract: The scientific literature contains millions of microbial gene identifiers within the full text and tables, but these annotations rarely get incorporated into public sequence databases. and thought, well OK, but wouldn’t it be better to incorporate annotations in the first place – when submitting to the public databases – rather than by this indirect method? You could annotate genomes like this. Or databases could enforce standards in the first place. biomedcentral.com/1471-2105/15/...
Source: What You're Doing Is Rather Desperate - February 5, 2014 Category: Bioinformaticians Authors: nsaunders Tags: bioinformatics publications discussion mea culpa twitter Source Type: blogs

Creating real-time audio-visual effects using brain waves
Finally got some quality time to spend with a birthday gift from Katharina. It is one of those devices containing a dry EEG sensor, which can capture the EEG signal just by placing the sensor on the forehead: no gel, no hell ;) To top it all, NeuroSky mobile can connect to any Bluetooth device and, through a good collection of APIs, EEG signals can be collected and processed. This felt like the easiest way to enter the amazing world of brain-computer interfaces :) and so I did... A quick Google search showed that Windows+Python would be the fastest way to hack together a piece of code which can unleash some of the power of this ...
Source: Bioinformatics Latest News - February 3, 2014 Category: Bioinformaticians Authors: Animesh Sharma Source Type: blogs

Box plots. Like box plots, only…box plots.
On a rare, brief holiday (here and here, if you’re interested; both highly-recommended), I make the mistake of checking my Twitter feed: paging @neilfws . . . RT @psudmant: Ground breaking new methods from @naturemethods – boxplots – no rly nature.com/nmeth/journal/…— Chris Miller (@chrisamiller) January 30, 2014 This points me to BoxPlotR. It draws box plots. Using Shiny Server. That’s the “innovation”, presumably. With “quilt plots” and now this, I’m starting to think that I’ve been doing science wrong all these years. If I’d been told to submit t...
Source: What You're Doing Is Rather Desperate - February 2, 2014 Category: Bioinformaticians Authors: nsaunders Tags: publications R statistics boxplot methods nature Source Type: blogs

Parallelizing #RStats using #make
In the current post, I'll show how to use R as the main SHELL of GNU-Make instead of using a classical Linux shell like 'bash'. Why would you do this? Awesomeness; Make-based workflow management; Make-based execution with --jobs. GNU make knows how to execute several recipes at once. Normally, make will execute only one recipe at a time, waiting for it to finish before executing the next. However (Source: YOKOFAKUN)
Source: YOKOFAKUN - January 30, 2014 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs

Mapping the UCSC/Web-Sequences to a world map.
People at the UCSC have recently released a new track for the GenomeBrowser: We BLATted the Internet! The DNA sequences from 40 billion webpages mapped to hg19 and other species: http://t.co/5XAsFCguE2/ UCSC Genome Browser (@GenomeBrowser) January 23, 2014 "We're pleased to announce the release of the Web Sequences track on the UCSC Genome Browser. This track, produced in collaboration with (Source: YOKOFAKUN)
Source: YOKOFAKUN - January 30, 2014 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs

EuroSeattle Startup Weekend
Last weekend I attended my first Startup Weekend. The event was very well organized; I had a great time and got to know a lot of good people. Didn’t get enough votes for the idea I pitched (a dating site that matches people using data from services like Fitbit), so I ended up joining another team to build an app that generates running routes that pass near popular sights. Our team consisted of 6 “non-technicals” and 3 developers (including myself). There was little friction, and I was impressed by how well tasks like market research were performed. Somewhat less impressive was the amount of time spent dis...
Source: eric.jain.name - January 29, 2014 Category: Bioinformaticians Authors: Eric Jain Tags: Programming Source Type: blogs

BLATting the internet: the most frequent gene?
I enjoyed this story from the OpenHelix blog today, describing a Microsoft Research project to mine DNA sequences from web pages and map them to UCSC genome builds. Laura DeMare asks: what was the most-hit gene? Most hit gene? APOE? MT @GenomeBrowser We BLATed the Internet! DNA sequences from 40 billion webpages mapped to hg19 goo.gl/7T2d5w— Laura DeMare (@ldemare) January 23, 2014 First, visit the UCSC Table Browser. For the human genome hg19 build, the relevant group is “Phenotype and Literature”, the track is “Web Sequences” and the table is pubsBingBlat. Check the “genome” bu...
Source: What You're Doing Is Rather Desperate - January 24, 2014 Category: Bioinformaticians Authors: nsaunders Tags: bioinformatics genomics statistics blat microsoft research ucsc Source Type: blogs

Illumina's New Lineup
Illumina made a brace of big hardware announcements at this week's J.P. Morgan conference, and Mick Watson has done a nice job of covering them. I'll try to cover some different points that have occurred to me after letting the news ferment -- plus Illumina made yet another announcement tonight that scotched a portion of an earlier draft of this piece. (Source: Omics! Omics!)
Source: Omics! Omics! - January 16, 2014 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs

Quilt plots. Like heat maps, only…heat maps
Stephen tweets: Quilt Plots: A Simple Tool for the #Visualisation of Large Epidemiological Data buff.ly/1doSx4X— Stephen Rudd (@SAGRudd) January 15, 2014 A “quilt plot” Quilt plots. Sounds interesting. The link points to a short article in PLoS ONE, containing a table and a figure. Here is Figure 1. If you looked at that and thought “Hey, that’s a heat map!”, you are correct. That is a heat map. Let’s be quite clear about that. It’s a heat map. So, how do the authors justify publishing a method for drawing heat maps and then calling them “quilt plots”? Well, the...
Source: What You're Doing Is Rather Desperate - January 15, 2014 Category: Bioinformaticians Authors: nsaunders Tags: bioinformatics publications software statistics heat map plos one r-project Source Type: blogs

Relearning Chemistry
An evening ritual is to inquire what homework requires assistance, and at the beginning of the year it was a science worksheet as part of an introduction to chemistry. That, and a later project, have exposed how much rust my knowledge of chemistry has accumulated, but also have led me down the path of repairing forgotten bits and certainly learning some new stuff. (Source: Omics! Omics!)
Source: Omics! Omics! - January 13, 2014 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs

Credit for code: enough with the half-measures already
May as well begin 2014 where we left off: complaining about the attitude of scientific publishers regarding reproducible computational research. I had a “Twitter blurt”. That’s when you read, react and tweet. Happens to the best of us. With hindsight, it was perhaps a little harsh: Weak, half-hearted and unconvincing suggestions from Nature Genetics for publishing code: nature.com/ng/journal/v46…— Neil Saunders (@neilfws) December 27, 2013 The link is to an editorial in Nature Genetics, “Credit for code.” It points out, quite rightly, that “review, replication, reuse and recogn...
Source: What You're Doing Is Rather Desperate - January 5, 2014 Category: Bioinformaticians Authors: nsaunders Tags: computing editorial npg publishing reproducibility Source Type: blogs