"Newick's 8:45" Tree Format Standard (original) (raw)

                                          Thursday, August 30, 1990

Gary Olsen's Interpretation of the "Newick's 8:45" Tree Format Standard

(Here is the reason for the Newick name)

Conventions: Items in { } may appear zero or more times. Items in [ ] are optional; they may appear once or not at all. All other punctuation marks (colon, semicolon, parentheses, comma and single quote) are required parts of the format.

          tree ==> descendant_list [ root_label ] [ : branch_length ] ;

descendant_list ==> ( subtree { , subtree } )

       subtree ==> descendant_list [internal_node_label] [: branch_length]
               ==> leaf_label [: branch_length]

        root_label ==> label

internal_node_label ==> label

         leaf_label ==> label

             label ==> unquoted_label
                   ==> quoted_label

    unquoted_label ==> string_of_printing_characters
      quoted_label ==> ' string_of_printing_characters '

     branch_length ==> signed_number
                   ==> unsigned_number
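
The productions above map fairly directly onto a recursive-descent parser, with one method per production. The following is a minimal sketch, not a reference implementation: the names `Node` and `NewickParser` are hypothetical, comments in square brackets are not handled, and the accepted number syntax (decimal with an optional exponent) is an assumption, since the grammar only says "signed_number" or "unsigned_number".

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: Optional[str] = None
    length: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


class NewickParser:
    # Characters that cannot appear in an unquoted label.
    STOP = set("()[]':;,")
    # Assumption: decimal notation with an optional exponent.
    NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")

    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0

    def peek(self) -> str:
        # Skip blanks, tabs and newlines, then return the next character.
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expect(self, ch: str) -> None:
        if self.peek() != ch:
            raise ValueError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def tree(self) -> Node:
        # tree ==> descendant_list [ root_label ] [ : branch_length ] ;
        node = Node(children=self.descendant_list())
        self.label_and_length(node)
        self.expect(";")
        return node

    def descendant_list(self) -> List[Node]:
        # descendant_list ==> ( subtree { , subtree } )
        self.expect("(")
        children = [self.subtree()]
        while self.peek() == ",":
            self.pos += 1
            children.append(self.subtree())
        self.expect(")")
        return children

    def subtree(self) -> Node:
        # subtree ==> descendant_list [internal_node_label] [: branch_length]
        #         ==> leaf_label [: branch_length]
        node = Node()
        if self.peek() == "(":
            node.children = self.descendant_list()
        self.label_and_length(node)
        if not node.children and node.label is None:
            raise ValueError(f"leaf label expected at position {self.pos}")
        return node

    def label_and_length(self, node: Node) -> None:
        ch = self.peek()
        if ch == "'":
            node.label = self.quoted_label()
        elif ch and ch not in self.STOP:
            node.label = self.unquoted_label()
        if self.peek() == ":":
            self.pos += 1
            node.length = self.branch_length()

    def unquoted_label(self) -> str:
        start = self.pos
        while (self.pos < len(self.text)
               and self.text[self.pos] not in self.STOP
               and not self.text[self.pos].isspace()):
            self.pos += 1
        # Underscores in unquoted labels stand for blanks.
        return self.text[start:self.pos].replace("_", " ")

    def quoted_label(self) -> str:
        self.pos += 1  # consume the opening quote
        out = []
        while self.pos < len(self.text):
            ch = self.text[self.pos]
            self.pos += 1
            if ch != "'":
                out.append(ch)
            elif self.text[self.pos:self.pos + 1] == "'":
                out.append("'")  # two single quotes encode one
                self.pos += 1
            else:
                return "".join(out)
        raise ValueError("unterminated quoted label")

    def branch_length(self) -> float:
        self.peek()  # skip whitespace before the number
        match = self.NUMBER.match(self.text, self.pos)
        if match is None:
            raise ValueError(f"number expected at position {self.pos}")
        self.pos = match.end()
        return float(match.group())
```

Calling `NewickParser("(A:0.1,B:0.2)root;").tree()` would return a `Node` labelled `root` with two leaf children. Input after the terminating semicolon is ignored in this sketch.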

Notes: Unquoted labels may not contain blanks, parentheses, square brackets, single quotes, colons, semicolons, or commas. Underscore characters in unquoted labels are converted to blanks. Single quote characters in a quoted label are represented by two single quotes. Blanks or tabs may appear anywhere except within unquoted labels or branch_lengths. Newlines may appear anywhere except within labels or branch_lengths. Comments are enclosed in square brackets and may appear anywhere newlines are permitted.
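
The label rules in these notes can be sketched as a pair of helper functions. The names `encode_label` and `decode_label` are hypothetical. One design choice here is an inference rather than something the notes state outright: a label containing a literal underscore is written in quotes, because an unquoted underscore would be read back as a blank.

```python
# Characters that force a label to be quoted (blank is handled separately).
SPECIAL = set("()[]':;,\t")


def encode_label(label: str) -> str:
    """Return a token for `label` that is legal in a tree description."""
    if "\n" in label:
        # Newlines may not appear within labels.
        raise ValueError("labels may not contain newlines")
    if "_" in label or any(ch in SPECIAL for ch in label):
        # Quoted form: each single quote is written as two single quotes.
        return "'" + label.replace("'", "''") + "'"
    # Unquoted form: blanks are written as underscores.
    return label.replace(" ", "_")


def decode_label(token: str) -> str:
    """Return the label that a quoted or unquoted token stands for."""
    if len(token) >= 2 and token.startswith("'") and token.endswith("'"):
        return token[1:-1].replace("''", "'")
    return token.replace("_", " ")
```

Under these assumptions, `encode_label("Homo sapiens")` gives `Homo_sapiens` and `encode_label("it's")` gives `'it''s'`; `decode_label` reverses both.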

Other notes: PAUP (David Swofford) allows nesting of comments. TreeAlign (Jotun Hein) writes a root node branch length (with a value of 0.0). PHYLIP (Joseph Felsenstein) requires that an unrooted tree begin with a trifurcation; it will not "uproot" a rooted tree.
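
Because programs differ on comment nesting, a reader may want to make that behaviour selectable. The sketch below removes bracketed comments before parsing; the function name `strip_comments` and its `nested` flag are hypothetical. It treats square brackets inside a quoted label as label text, which follows from comments being allowed only where newlines are.

```python
def strip_comments(text: str, nested: bool = False) -> str:
    """Remove [ ... ] comments; honour nesting only if `nested` is true."""
    out = []
    depth = 0          # current comment nesting depth
    in_quote = False   # inside a quoted label?
    for ch in text:
        if depth == 0:
            if ch == "'":
                # A doubled quote toggles twice, so the state stays correct.
                in_quote = not in_quote
                out.append(ch)
            elif ch == "[" and not in_quote:
                depth = 1
            else:
                out.append(ch)
        elif ch == "[" and nested:
            depth += 1
        elif ch == "]":
            depth -= 1
    if depth:
        raise ValueError("unterminated comment")
    return "".join(out)
```

With `nested=False`, the first closing bracket ends the comment, so the text following an inner comment is left in the output and would normally make the later parse fail.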

Example:

(((One:0.2,Two:0.3):0.3,(Three:0.5,Four:0.3):0.2):0.3,Five:0.7):0.0;

       +-+ One
    +--+
    |  +--+ Two
 +--+
 |  | +----+ Three
 |  +-+
 |    +--+ Four
 +
 +------+ Five
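
Going the other way, the example string can be rebuilt from a nested data structure. The sketch below is illustrative only: the tuple layout `(children, label, branch_length)` and the helpers `leaf`, `inner` and `to_newick` are hypothetical, and labels are assumed to need no quoting.

```python
def leaf(label, length):
    # A leaf has no children.
    return ([], label, length)


def inner(children, length, label=None):
    # An internal node; its label is optional.
    return (children, label, length)


def to_newick(node) -> str:
    """Serialise one subtree, without the terminating semicolon."""
    children, label, length = node
    text = ""
    if children:
        text = "(" + ",".join(to_newick(child) for child in children) + ")"
    if label is not None:
        text += label
    if length is not None:
        text += f":{length}"
    return text


example = inner(
    [
        inner(
            [
                inner([leaf("One", 0.2), leaf("Two", 0.3)], 0.3),
                inner([leaf("Three", 0.5), leaf("Four", 0.3)], 0.2),
            ],
            0.3,
        ),
        leaf("Five", 0.7),
    ],
    0.0,
)

# The semicolon belongs to the tree production, so it is added once here.
newick = to_newick(example) + ";"
print(newick)
```

This prints the same string as the example above, including the root branch length of 0.0.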