Monday, August 3, 2015

Networks vs augmented trees

The distinction between networks and augmented trees is interesting from a biological, computational and mathematical point of view. An augmented tree is the result of adding cross-connecting branches to a tree, turning it into a network. So each augmented tree is a network (called a tree-based network). But is each network an augmented tree? In a previous blog post we showed that this is not the case. There exist networks that are inherently network-like and cannot be obtained by adding branches to a tree. (If we are allowed to create new nodes by subdividing branches of the tree, but are not allowed to subdivide any of the previously-added branches.)

The biological question here is as follows: is evolution a tree-like process augmented with horizontal events, or is evolutionary inherently network-like?

This concept is also relevant to phylogenetic network reconstruction approaches, because several such methods work by adding edges to an estimated species tree. Therefore, there exist networks that will always be missed by such methods.

Interestingly, it has turned out that it is easy to find out if a given network is tree-based or not. A polynomial-time algorithm was presented recently by Francis and Steel:

Andrew Francis and Mike Steel, Which Phylogenetic Networks are Merely Trees with Additional Arcs? Systematic Biology (2015).

They solve the problem by reducing it to a model called 2-SAT, which is interesting because it automatically leads to a very simple and fast algorithm solving the problem.

An interesting question that remains open is the following. Given a network and a tree, can we decide in polynomial time if the network can be obtained by adding edges to the given tree? Another question is whether there exists a clean graph-theoretic characterisation of tree-based networks.

Below you see Mike Steel presenting their recent paper at the Phylogenetic Networks Workshop in Singapore. He also discussed other recent results, concerning folding and unfolding phylogenetic trees and networks, as well as distance-based methods for detecting reticulation.

Friday, July 31, 2015

Singapore, Day 5

Today was a mixed bag ot talks.

Louxin Zhang started with a couple of proofs about what he called "stable" networks; and Stefan Grünewald developed his thoughts on quartet algoritms for splits graphs. At the other extreme, Nadine Ziemert talked entirely biology, introducing the audience to the problem of trying to study the evolution of secondary metabolites. In between, Eric Tannier tried to use horizontal gene transfer to date the nodes of networks, assuming that HGT requires a temporally consistent network. Francois-Joseph Lapointe produced the only really statistical talk of the week, trying to produce p-values for patterns on sequence similarity networks.

Daniel Huson popped in for the last day, and presented us with some ideas for the future development of both SplitsTree (unrooted networks) and Dendroscope (rooted networks). Apparently, the need is for SplitsTree to handle larger sets of trees, while for Dendroscope it is to produce networks from pairs of input trees. He also noted that there are still more networks being produced using median joining rather than neighbor-net, due to the amount of work being done on human mitochondrial sequences.

An interest was expressed in continuing the series of meetings on phylogenetic networks (Leiden 2012, Leiden 2014) — I first met most of the people working on networks in phylogenetics in Uppsala in 2004 (Phylogenetic Combinatorics and Applications).

Today we also celebrated Dan Gusfield's 2^6 birthday, with a strawberry cream cake.

So, all in all, a very successful meeting.

After the sessions finished, I went down to the Gardens By The Bay to look at the Supertree Grove. As you can see, a "super" tree is by any definition actually a network.