Version 2.0 is now available!!

Don't miss the new iterative Neighbor Joining, which accounts for among-site rate variation (see below and the user manual for more information).


SEMPHY is a tool for data-intensive phylogenetic reconstruction. SEMPHY infers phylogenies by Maximum Likelihood, the most established criterion for finding the correct phylogenetic tree. It searches for both the most likely topology of the evolutionary tree and the optimal lengths of its branches. SEMPHY applies the algorithmic paradigm of Structural EM to phylogenetic inference, which makes the computation both effective and efficient: it can handle very large data sets with good accuracy and reasonable running time. We refer to the EM procedure as the "SEMPHY step". For a full description of the SEMPHY algorithm, please see Friedman et al. 2002 in the publications section below.

SEMPHY 2.0 uses Maximum Likelihood methods and EM to improve the accuracy of pairwise distance estimation for Neighbor Joining (NJ) by taking among-site rate variation into account. SEMPHY uses this novel variant of NJ to construct an "initial guess" for the tree, which serves as the starting point for the SEMPHY search. The new NJ can also be used on its own where Maximum Likelihood methods like SEMPHY are not suitable, for example when more than 500 sequences are involved. For a full description of the improved NJ, please see Ninio et al. 2006 in the publications section below.
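
To give a feel for why among-site rate variation matters for pairwise distances, here is a minimal Python sketch. It is not SEMPHY's method, which estimates distances by Maximum Likelihood and EM (see Ninio et al. 2006). As a stand-in it uses the much simpler closed-form Jukes-Cantor distance and its gamma-corrected variant. The function names, the toy sequences, and the shape parameter alpha = 0.5 are assumptions chosen for illustration only.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences, skipping gapped sites."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc_distance(p):
    """Jukes-Cantor distance, assuming every site evolves at the same rate."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates across sites (shape alpha)."""
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)


if __name__ == "__main__":
    # Toy aligned sequences (hypothetical data).
    p = p_distance("ACGTACGTAC", "ACGTTCGAAC")
    print("p-distance:", p)
    print("JC distance (uniform rates):", jc_distance(p))
    print("JC distance (gamma rates, alpha=0.5):", jc_gamma_distance(p, 0.5))
```

For the same observed proportion of differences, the gamma-corrected distance is larger than the uniform-rate one, and the gap widens as sequences diverge. Ignoring rate variation therefore tends to underestimate long distances, which is the kind of bias a rate-variation-aware NJ is meant to reduce.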

Main Features

Development Team


Please cite the appropriate reference if you use SEMPHY in your publications.


SEMPHY can be downloaded from the SEMPHY download page.

