An Analysis of Contributions to PubMed Commons

I recently saw a tweet floating by which included a link to some recent statistics from PubMed Commons, the NCBI service for commenting on scientific articles in PubMed. Perhaps it was this post at their blog. So I thought now would be a good time to write some code to analyse PubMed Commons data.

The tl;dr version: here’s the GitHub repository and the RPubs report. For further details and some charts, read on.

Currently, there is no access to PubMed Commons data via the NCBI Entrez API, aside from a PubMed search filter which returns articles that have comments. However, a Google search for “pubmed commons api” returns this useful Gist. It shows how to construct a URL which returns JSON-formatted PubMed Commons data for a given PMID. If Alf is reading this, I’d like to know how he discovered this information gem!

Armed with this, I was able to write Ruby code to return all PMIDs with comments, fetch the comment data, parse it and output a summary to a CSV file. I used to be an XPath guy. This experience changed me into a CSS selector guy.

Analysis and visualisation can then be performed using this RMarkdown file. Here are some of the highlights; the RPubs report contains the complete analysis.

At the time of writing 5 877 “real” comments have been written, for 4 703 articles, authored by 1 504 people. By “real comments”, I mean those with an author name and comment text. This excludes automatically-generated notes and moderated comm...
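
The “real comment” rule and the three headline counts (comments, articles, authors) can be sketched in a few lines of Ruby. This is only an illustration, not the code from the repository: the `Comment` struct, its field names (`pmid`, `author`, `text`), the helper names and the sample records are all assumptions made up for the example, and the real PubMed Commons data will look different.

```ruby
# Hypothetical record type: the field names are assumptions for this sketch,
# not the schema of the actual PubMed Commons data.
Comment = Struct.new(:pmid, :author, :text)

# A "real" comment has both a non-empty author name and non-empty text.
def real_comment?(comment)
  !comment.author.to_s.strip.empty? && !comment.text.to_s.strip.empty?
end

# Count real comments, distinct commented articles and distinct authors.
def summarise(comments)
  real = comments.select { |c| real_comment?(c) }
  {
    comments: real.size,
    articles: real.map(&:pmid).uniq.size,
    authors:  real.map(&:author).uniq.size
  }
end

# Made-up sample data: three real comments, one note without an author,
# and one record without any text.
SAMPLE = [
  Comment.new("100", "A. Author", "Useful follow-up data here."),
  Comment.new("100", "B. Writer", "See also the erratum."),
  Comment.new("200", "A. Author", "Methods question."),
  Comment.new("200", nil, "Automatically generated note."),
  Comment.new("300", "C. Reader", "")
].freeze

puts summarise(SAMPLE).inspect
```

For the sample above, the summary reports 3 real comments on 2 articles by 2 authors; the two incomplete records are dropped before counting, which is the same kind of exclusion described for automatically-generated notes.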
Source: What You're Doing Is Rather Desperate - Category: Bioinformatics Authors: Tags: publications R statistics comments ncbi pubmed commons Source Type: blogs