creating lists of taxa names

Hi Lucas,

I was wondering if you know of any programs that can parse JPLACE files and output a list of taxa names at each edge of the tree? The list should not include any taxa within subtrees of that edge. Are there plans to implement such a function in genesis? From reading the other post, I know this can be done manually with tree visualization software like Dendroscope, but this is not feasible if we are working with a large tree and want a separate list for each edge. Thank you!

cheers,
Charles

Comments

  • just adding to this (Charles and I work together).  

    All we want is to use phylogenetic placement to subdivide our input sequence file into lots of separate files, with each file holding the sequences that have been placed on an individual edge.  So if a tree has 100 edges, we get at most 100 separate output files. We then take these files and run each of them separately through a clustering program like CROP or PTP.  

    So there is no need to make a judgement on taxonomic assignments (we can do this with the final OTUs, separately).  

    (yes, this is implementing Zhang Jiajie's paper, but using Genesis).  
  • edited May 2016
    Hi Charles, hi Doug,

    this is not particularly difficult. I wrote a program which extracts names of pqueries for each of the edges of the reference tree: https://gist.github.com/lczech/f42ed5c6f229a078449b308ad5497919
    You need to update to genesis v0.7.0 (https://github.com/lczech/genesis/releases/tag/v0.7.0) to compile it.

    It only takes the position of the most likely placement per pquery. All others are filtered out. If you instead want this pquery to appear at every edge where it has a placement, let me know.
    The edges are referred to by a number. I don't know whether you need some other form of referring to them.

    You are referring to this paper, right?: https://bioinformatics.oxfordjournals.org/content/29/22/2869.full
    Our lab is currently working on an improved PTP version, which will (hopefully) be released within the next few weeks.

    Also, we are thinking about reimplementing the EPA+PTP pipeline, which seems to be exactly what you are planning to do. Depending on your needs and time frame, this might be relevant for you. However, we don't know when we will get to do this. Maybe we can keep each other updated on our progress?

    Let me know if all of this works for you. So long,
    Lucas
  • Awesome. This is exactly what we need. Thanks so much Lucas!
  • edited July 2016
    Hi Charles and Doug,

    we recently released a significantly faster and more accurate version of
    our species delimitation tool PTP:

    Preprint: http://biorxiv.org/content/early/2016/07/14/063875
    Web-Service: http://mptp.h-its.org/
    Documentation & Download: https://github.com/Pas-Kapli/mptp

    Cheers,
    Lucas
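
For readers without a genesis build at hand, the per-edge extraction discussed in this thread can be roughly sketched in plain Python. This is not the program from the gist linked above and does not use genesis. It assumes a standard JPLACE (JSON) file with a `fields` list containing `edge_num` and either `like_weight_ratio` or `likelihood`, and pqueries that carry their names under `n` or `nm`. The function names `names_per_edge` and `load_names_per_edge` are made up for this illustration. As in the thread, only the most likely placement of each pquery is kept.

```python
import json
from collections import defaultdict


def names_per_edge(jplace):
    """Map each edge_num to the names of pqueries whose best placement is on it.

    `jplace` is the parsed JSON content of a JPLACE file (a dict).
    Only the highest-scoring placement per pquery is used.
    """
    fields = jplace["fields"]
    edge_idx = fields.index("edge_num")
    # Assumption: rank placements by like_weight_ratio when the file has it,
    # otherwise fall back to the log-likelihood column.
    score_field = "like_weight_ratio" if "like_weight_ratio" in fields else "likelihood"
    score_idx = fields.index(score_field)

    edges = defaultdict(list)
    for pquery in jplace["placements"]:
        # Each row of "p" is one placement; pick the one with the best score.
        best = max(pquery["p"], key=lambda row: row[score_idx])
        # Names are stored as plain strings ("n") or [name, multiplicity] pairs ("nm").
        names = list(pquery.get("n", []))
        names += [name for name, _multiplicity in pquery.get("nm", [])]
        edges[best[edge_idx]].extend(names)
    return dict(edges)


def load_names_per_edge(path):
    """Read a JPLACE file from `path` and return the edge_num -> names mapping."""
    with open(path) as handle:
        return names_per_edge(json.load(handle))
```

Calling `load_names_per_edge("placements.jplace")` (the file name is a placeholder) returns a dictionary keyed by `edge_num`; each list of names can then be written to its own file or used to pull the matching sequences out of the input file. Note that these keys are the edge numbers stored in the JPLACE tree string, which may not match the edge numbering printed by the gist program.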