Quick update - our paper is out
The contents of this blog, plus more detailed analyses and text, are available in our BMC Microbiology paper here: http://www.biomedcentral.com/1471-2180/12/221/abstract

One reviewer had a very interesting suggestion: they asked us to add the average bootstrap scores to our heat map figure so that readers could get a sense of which sequences may be "novel", that is, sequences that the RDPII-NBC hadn't seen before (see the PLoS ONE article by Lan et al., http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0032491). As a result, you can clearly see that the Greengenes training set is the most diverse and the best at capturing the diversity found within the honey bee gut (Figure 2A). That said, a fair number of unique sequences (>1000 out of ~4000) are still unclassified using this training set. Both the classifications and the average bootstrap scores improve with the addition of honey bee gut-specific sequences (Figure 2B). Also interesting, and expec...