CDS alignments are calculated dynamically from the pre-calculated protein alignment by mapping codons to their corresponding amino
acids, with coding changes highlighted in a different color. Note that only the regions selected in the query are displayed in the alignment, and that the number of displayed residues is limited to avoid delivering excessive amounts of data to client browsers. Currently the limit is 100,000 residues (for example, 200 sequences of length 500), but planned improvements to the alignment viewer will likely raise this limit.

Tree builder and viewer

Phylogenetic or clustering trees can be calculated and displayed for protein sequences or their corresponding CDS sequences. The tree builder is accessible from the results and the alignment views with the “Build a tree” button and allows sequences to be selected for inclusion based on a trade-off between the total length of the alignment and the exclusion of short sequences. Various
measures of distance for protein and nucleotide sequences are available and are identical to those described for the NCBI Influenza Virus Resource [1]. Trees can be constructed from the distance matrices using the neighbor-joining, average linkage, complete linkage, or single linkage algorithms. To facilitate the display of trees with many leaf nodes, an adaptive resolution technique is employed in which some branches are displayed in a sub-scale representation [2] (Figure 3D). Users can interactively manipulate the aggregation or refinement of any branch in the tree. In addition, certain metadata, such as year or country of isolation, can be displayed on the tree and are shown as aggregate measures for aggregated branches.

Case study

It was reported that strains of DENV-3 circulating in Thailand prior to 1992 are distinct from those circulating after 1992, and this finding has been interpreted as an extinction of existing DENV-3 strains
and the emergence of new, locally evolved strains. This event reportedly coincided with the replacement of DENV-2 with DENV-3 as the majority serotype in Thailand [15]. We demonstrate a preliminary analysis of dengue sequences, using the tools of the Virus Variation Resource, that supports this observation. There are 142 DENV-3 envelope protein sequences from Thailand in the database. Of those, 114 sequences have a collection year on record (these can be selected by setting the collection year range from 1900 to 2010). All selected sequences have complete coding sequences for the envelope protein. We selected the complete linkage clustering algorithm and Felsenstein’s F84 distance. The clustering tree is shown in Figure 4. Using “Viewing options, search and markup” in the tree viewer, sequences isolated before 1992 were highlighted in red. The majority of the pre-1992 sequences (92%) fall in one cluster.

Figure 4 Case study.
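
To make the clustering step used in the case study more concrete, the following is a minimal sketch of complete linkage clustering on a pairwise distance matrix. It is not the resource's implementation. The function name `complete_linkage`, the sequence labels, and the distance values are all hypothetical and chosen only for illustration. In the case study the distances would be F84 distances between envelope sequences, which this sketch does not compute.

```python
from itertools import combinations


def complete_linkage(names, dist):
    """Agglomerative clustering with complete linkage (illustrative sketch).

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) to the distance between leaves a and b.
    Returns a list of merges (cluster_a, cluster_b, height), where each
    cluster is a tuple of leaf labels.
    """
    clusters = [(n,) for n in names]
    merges = []

    def linkage(c1, c2):
        # Complete linkage: the distance between two clusters is the
        # largest distance between any pair of their leaves.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest linkage distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        height = linkage(c1, c2)
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(tuple(sorted(c1 + c2)))
        merges.append((c1, c2, height))
    return merges


# Hypothetical labels and made-up distances (not data from the resource).
names = ["TH_1987", "TH_1989", "TH_1995", "TH_1998"]
dist = {
    frozenset(("TH_1987", "TH_1989")): 0.02,
    frozenset(("TH_1995", "TH_1998")): 0.03,
    frozenset(("TH_1987", "TH_1995")): 0.10,
    frozenset(("TH_1987", "TH_1998")): 0.12,
    frozenset(("TH_1989", "TH_1995")): 0.11,
    frozenset(("TH_1989", "TH_1998")): 0.13,
}

merges = complete_linkage(names, dist)
for left, right, height in merges:
    print(left, right, height)
```

In this toy input the two earlier and the two later isolates each merge first, and the two groups join only at the largest pairwise distance. This is the kind of pattern that highlighting pre-1992 sequences in the tree viewer makes visible.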