decorator

Overview

decorator is a simple Java command-line tool that decorates phylogenetic trees with various data fields (such as sequence accessions and taxonomic scientific names) and writes them in phyloXML format.
It can read trees in the following formats:

It is implemented in Java as part of the forester libraries.
A similar but more limited tool is the phyloXML converter (Newick to phyloXML): phyloxml_converter

Download

» forester.jar

Source code is available at [forester] at sourceforge.net

Usage

java -cp path\to\forester.jar org.forester.application.decorator -table | -f=<c> <phylogenies infile> <mapping table file> <phylogenies outfile>

Options:

-table : table instead of one to one map (-f=), see below for tags
-r=<n> : allow up to n characters to be removed from the end of the names in phylogenies infile if they are not otherwise found in the map
-p     : for picky, fails if node name not found in mapping table, default is off
-pn=<s>: name for the phylogeny
-pi=<s>: identifier for the phylogeny (in the form provider:value)
-pd=<s>: description for phylogenies
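
The -r=<n> fallback can be pictured with a short Java sketch. This is only an illustration under assumptions, not forester's implementation: the class TrimLookupSketch and the method lookup are hypothetical names, and the order in which shortened names are tried (shortest cut first) is a guess.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the -r=<n> idea; not forester code.
final class TrimLookupSketch {

    // Looks up nodeName in the map. If it is not found, retries with
    // 1, 2, ... maxTrim characters removed from the end of the name.
    // Returns the mapped value, or null if no candidate matches.
    static String lookup(Map<String, String> map, String nodeName, int maxTrim) {
        for (int cut = 0; cut <= maxTrim && cut < nodeName.length(); cut++) {
            String candidate = nodeName.substring(0, nodeName.length() - cut);
            String value = map.get(candidate);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("ABC", "example value");
        // "ABC_1" is not in the map, but "ABC" is reached after cutting 2 characters.
        System.out.println(lookup(map, "ABC_1", 2));
    }
}
```

With the picky option (-p) in mind, a null result here would correspond to a node name that could not be matched.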


Advanced options, only available if -table is not used:

-f=<c>: field to be replaced:
        n : node name
        a : sequence annotation description
        d : domain structure
        c : taxonomy code
        sn: taxonomy scientific name
        s : sequence name
-k=<n>: key column in mapping table (0 based), names of the node to be decorated - default is 0
-v=<n>: value column in mapping table (0 based), data with which to decorate - default is 1
-sn   : to extract bracketed scientific names
-s=<c>: column separator in mapping file, default is ":"
-x    : process name "intelligently" (only for -f=n)
-xs   : process name "intelligently" and process information after "similar to" (only for -f=n)
-c    : cut name after first space (only for -f=n)
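
To make the roles of -k, -v and -s concrete, the following Java sketch reads a one-to-one mapping from text lines. It is a minimal illustration under assumptions: the class OneToOneMapSketch, the method readMap, and the sample node names and values are all hypothetical, and forester's own parsing may differ in details such as whitespace handling.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration of a one-to-one mapping file; not forester code.
final class OneToOneMapSketch {

    // Builds a key -> value map from the given lines.
    // separator corresponds to -s, keyColumn to -k, valueColumn to -v.
    static Map<String, String> readMap(List<String> lines, String separator,
                                       int keyColumn, int valueColumn) {
        Map<String, String> map = new HashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // Quote the separator so that it is not treated as a regular expression.
            String[] columns = line.split(Pattern.quote(separator));
            if (keyColumn >= columns.length || valueColumn >= columns.length) {
                throw new IllegalArgumentException("Too few columns in line: " + line);
            }
            map.put(columns[keyColumn].trim(), columns[valueColumn].trim());
        }
        return map;
    }

    public static void main(String[] args) {
        // Made-up mapping lines using the default ":" separator.
        List<String> lines = Arrays.asList("node_1:Homo sapiens", "node_2:Mus musculus");
        Map<String, String> map = readMap(lines, ":", 0, 1);
        System.out.println(map.get("node_1"));
    }
}
```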


Example:
"java -cp \soft\forester.jar org.forester.application.decorator -table my_simple_tree.nh my_map.txt decorated_tree.xml"

tags for mapping table:

TAG:value pairs have to be separated by tabs.
The content of the first column has to be a node name matching a node name in the phylogeny to be decorated.
These tags correspond to phyloXML elements (see [phyloXML documentation]).

mapping table example row (tab separated):

"1 TAXONOMY_CODE:BACTN TAXONOMY_ID:226186 TAXONOMY_ID_PROVIDER:ncbi TAXONOMY_SN:Bacteroides thetaiotaomicron SEQ_ACCESSION:29341016 SEQ_ACCESSION_SOURCE:gi SEQ_SYMBOL:BT3701 SEQ_NAME:SusD"

"1" is the node name matching a node name in the phylogeny to be decorated.
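
The row layout described above can be sketched in Java as follows. This is an illustration only, not forester's parser: the class MappingRowSketch, the method parseRow, and the key "NODE_NAME" used to store the first column are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the mapping table row layout; not forester code.
final class MappingRowSketch {

    // Splits a tab-separated row into its TAG -> value pairs.
    // The first column (the node name) is stored under the made-up key "NODE_NAME".
    static Map<String, String> parseRow(String row) {
        String[] columns = row.split("\t");
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("NODE_NAME", columns[0]);
        for (int i = 1; i < columns.length; i++) {
            // Split at the first colon only, so that values may contain colons.
            int colon = columns[i].indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Not a TAG:value pair: " + columns[i]);
            }
            fields.put(columns[i].substring(0, colon), columns[i].substring(colon + 1));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Part of the example row from above, joined with real tab characters.
        String row = String.join("\t",
                "1",
                "TAXONOMY_CODE:BACTN",
                "TAXONOMY_ID:226186",
                "TAXONOMY_SN:Bacteroides thetaiotaomicron",
                "SEQ_NAME:SusD");
        parseRow(row).forEach((tag, value) -> System.out.println(tag + " = " + value));
    }
}
```

Building the row with String.join and "\t" also avoids a common pitfall when writing mapping tables by hand: editors that silently replace tabs with spaces.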

Contact

Christian M Zmasek
Burnham Institute for Medical Research | cmzmasek yahoo com

Last updated 2009.10.07