Robust Newick tree reader (original) (raw)
phytools has a function to read simple Newick format trees (called read.newick). This function has been totally useless because it is completely redundant with read.tree in the 'ape' package. However a recent query on the R-sig-phylo email list got me wondering if there was any reason that the code from my recent major rewrite of read.simmap couldn't be harvested to re-write read.newick as a "robust" tree reader that can accept singleton nodes, etc.
First off - what is a singleton node. Most nodes in our tree have two or more descendants. A singleton node is a node with only one descendant. These are created by extra left & right parentheses in our Newick string. For example:
(((A,B),(C,D)),E);
has no singletons; whereas:
((((A),B),(C,D))),E);
has two singletons - one on the edge leading from the common ancestor of A & B to tip A, and another below the clade containing A, B, C, and D.
The code for my robust tree reader is here. Let's try to use read.tree and read.newick to read in the two trees above:
> # load source
> source("read.newick.R")
>
> # read tree
> aa<-"(((A,B),(C,D)),E);"
> t1<-read.tree(text=aa)
> t2<-read.newick(text=aa)
>
> # plot
> par(lend=2) # for plotting
> par(mar=c(0.1,0.1,3.1,0.1))
> layout(matrix(c(1,2),1,2))
> plot(t1,edge.width=3,lend=2)
> title("read.tree")
> plot(t2,edge.width=3,lend=2)
> title("read.newick")
> bb<-"((((A),B),(C,D))),E);"
> t3<-read.tree(text=bb)
Error in if (sum(obj[[i]]$edge[, 1] == ROOT) == 1 && dim(obj[[i]]$edge)[1] > :
missing value where TRUE/FALSE needed
> t4<-read.newick(text=bb)
> layout(matrix(c(2,1),1,2))
> plot(t4,edge.width=3,lend=2)
Error in plot.phylo(t4, edge.width = 3, lend = 2) :
there are single (non-splitting) nodes in your tree; you may need to use collapse.singles()
> t4<-collapse.singles(t4)
> plot(t4,edge.width=3,lend=2)
> title("read.newick")
That's it.