haotu : an open lab notebook


Install Osiris: Phylogenetic tools for Galaxy

Filed under: Genomics, phylogeny, Uncategorized — S @ 03:36

These steps assume that you have access to a Galaxy server. If you do not, it is easy to install: http://wiki.g2.bx.psu.edu/Admin/Get%20Galaxy

1. I first cloned the Osiris tools into the tools directory in the galaxy-dist folder on my computer (Ubuntu):

hg clone http://toolshed.g2.bx.psu.edu/repos/ucsb-phylogenetics/ucsb_phylogenetics

2. That was the easy part. Now you need to install all the dependencies and edit the tool_conf.xml file in the galaxy-dist folder so that the Osiris tools can be seen in the Galaxy browser window.

Here is an example:
1. Install BioPerl. I did this through the Ubuntu Software Center.

2. Open tool_conf.xml in a text editor. The structure of the file mirrors the tools menu on the left-hand side of the Galaxy browser window.

3. To this file, I then added an Osiris section and the Osiris tool addstring2fashead:

  <section name="Osiris" id="osiris">
    <tool file="ucsb_phylogenetics/ucsb_phylogenetics/addstring2fashead/addstring2fashead.xml" />
  </section>

I put this at the bottom of the tools, under VCF Tools, but I do not think it matters where you put it.

4. Start or restart Galaxy and check that the Osiris section appears in the tools menu along with addstring2fashead.

5. Give the tool a try and make sure that it works. Problems occur if you do not put a string in the text box or if BioPerl is not installed properly.
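
If you would rather script step 3 than edit the file by hand, here is a minimal sketch in Python using only the standard library. It assumes tool_conf.xml has a top-level toolbox element containing section elements, as in the snippet above. The function name add_tool_section and the tiny example file are made up for illustration. Note that ElementTree drops XML comments when it parses, so back up the real file before trying anything like this.

```python
import xml.etree.ElementTree as ET

TOOL_FILE = ("ucsb_phylogenetics/ucsb_phylogenetics/"
             "addstring2fashead/addstring2fashead.xml")


def add_tool_section(xml_text, section_name, section_id, tool_file):
    """Return xml_text with tool_file listed under the given section.

    Hypothetical helper: creates the section if it is missing and
    skips the tool if it is already listed there.
    """
    root = ET.fromstring(xml_text)  # comments in the input are discarded

    # Reuse an existing section with the same id, if there is one.
    section = None
    for candidate in root.findall("section"):
        if candidate.get("id") == section_id:
            section = candidate
            break
    if section is None:
        # Appended at the end of the tool list.
        section = ET.SubElement(
            root, "section", {"name": section_name, "id": section_id})

    # Only add the tool once.
    listed = any(t.get("file") == tool_file for t in section.findall("tool"))
    if not listed:
        ET.SubElement(section, "tool", {"file": tool_file})

    return ET.tostring(root, encoding="unicode")


# Made-up stand-in for a real tool_conf.xml.
example = ('<toolbox><section name="VCF Tools" id="vcf_tools">'
           '</section></toolbox>')
updated = add_tool_section(example, "Osiris", "osiris", TOOL_FILE)
print(updated)
```

Running the function a second time on its own output leaves the file unchanged, so it is safe to rerun as you add more Osiris tools one at a time.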

After that, I added each tool I wanted to the tool config file, checking for dependencies along the way. Todd Oakley listed most of the dependencies on the repository's Tool Shed page.

Good Luck!



Incomplete nucleic acid nomenclature contrast object for R to use with phangorn

Filed under: phylogeny, R — S @ 08:47





in R for a description of why this contrast is needed for phylogenetic analyses in phangorn

contrast = matrix(data = c(
# a  c  g  t  -
  1, 0, 0, 0, 0, #a
  0, 1, 0, 0, 0, #c
  0, 0, 1, 0, 0, #g
  0, 0, 0, 1, 0, #t
  1, 0, 1, 0, 0, #r
  0, 1, 0, 1, 0, #y
  1, 1, 0, 0, 0, #m
  0, 0, 1, 1, 0, #k
  0, 1, 1, 0, 0, #s
  1, 0, 0, 1, 0, #w
  1, 1, 0, 1, 0, #h
  0, 1, 1, 1, 0, #b
  1, 1, 1, 0, 0, #v
  1, 0, 1, 1, 0, #d
  0, 0, 0, 0, 1, #-
  1, 1, 1, 1, 0, #n
  1, 1, 1, 1, 1  #?
  ), ncol = 5, byrow = TRUE)

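# row and column names follow the comments in the matrix above
dimnames(contrast) = list(c("a", "c", "g", "t", "r", "y", "m", "k", "s", "w",
  "h", "b", "v", "d", "-", "n", "?"), c("a", "c", "g", "t", "-"))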
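
As a cross-check on the hand-typed rows, the same 0/1 matrix can be derived from the IUPAC ambiguity codes. This is a small sketch in Python, not part of phangorn. The names STATES, IUPAC and contrast_rows are made up for illustration, and the treatment of n (any base, no gap) and ? (any base or gap) follows the matrix above.

```python
# Column order used in the contrast matrix above.
STATES = ["a", "c", "g", "t", "-"]

# Each symbol maps to the states it is compatible with.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "m": "ac", "k": "gt", "s": "cg", "w": "at",
    "h": "act", "b": "cgt", "v": "acg", "d": "agt",
    "-": "-", "n": "acgt", "?": "acgt-",
}


def contrast_rows(codes=IUPAC, states=STATES):
    """Return {symbol: [0/1 per state]} in the insertion order of codes."""
    return {
        symbol: [1 if state in allowed else 0 for state in states]
        for symbol, allowed in codes.items()
    }


rows = contrast_rows()
# Print in the same layout as the R matrix so the two can be compared by eye.
for symbol, row in rows.items():
    print(", ".join(str(value) for value in row), "#" + symbol)
```

The printed rows can be compared line by line with the matrix typed into R.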


Add taxa to phylogeny with dated branch lengths and nodes

Filed under: bladj, phylocom, phylogeny, R — S @ 14:41

This is a common problem I have when trying to construct a phylogeny for a set of taxa when not all taxa have associated sequence data.

1. read your tree into R as a phylo object

2a. label all the nodes


2b. your tree may not have branch lengths and thus no branching times (which you need in the next step), so add arbitrary branch lengths to your tree.


3. output the node dates and put them in a spreadsheet program like Excel


4. remove the branch lengths (or, if you have the original tree without branch lengths, just use that)


copy the output into a text editor and remove all occurrences of :1

5. add your taxa to the Newick-formatted phylogeny that is in your text editor. Add them where you hypothesize they should be. Also name the new nodes as you go and add them to the list of nodes that you copied to Excel. This may take some time.

6. Then use Phylocom and its bladj function to date those added nodes. I typically start with the TimeTree of Life website to find dates for the nodes: http://www.timetree.org/
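
The find-and-replace in step 4 can also be scripted. Here is a minimal Python sketch that strips every branch length, not just :1, from a Newick string. The function name strip_branch_lengths and the example tree are made up for illustration. The regular expression assumes plain numeric branch lengths (such as 1, 0.25 or 1e-05) and no quoted labels that contain colons.

```python
import re

# A colon followed by a number such as 1, 0.25 or 1e-05.
BRANCH_LENGTH = re.compile(r":\d+(?:\.\d+)?(?:[eE][+-]?\d+)?")


def strip_branch_lengths(newick):
    """Return the Newick string with all numeric branch lengths removed."""
    return BRANCH_LENGTH.sub("", newick)


# Made-up tree with arbitrary branch lengths of 1 and labelled nodes.
tree = "((A:1,B:1)N1:1,C:1)root;"
print(strip_branch_lengths(tree))  # ((A,B)N1,C)root;
```

Node labels such as N1 and root are left in place, which is what you want for the dating step that follows.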
