Supplementary Materials for the manuscript "Gene markers for exon capture and phylogenomics in ray-finned fishes"

  1. Basic information of the 4,434 loci

  2. Target sequences of the 4,434 loci for all eight model fishes
    These sequences are needed if you want to design baits directly from the target sequences, or to design your own baits by combining them with a new genome or transcriptome.

  3. Pipeline and scripts for bait design in ray-finned fishes
    The above link has pipelines and scripts for retrieving target sequences of the 4,434 loci when new genome sequences or transcriptomes are provided.

  4. Pipeline and scripts for bait refinement
    The above link has pipelines and scripts for: I. merging data from different projects; II. selecting loci with less missing data and high phylogenetic decisiveness; III. finding and masking regions with extraordinary read depth for bait redesign.
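
    Steps II and III of the bait refinement pipeline can be illustrated with a minimal Python sketch. This is not the code distributed at the link: the function names, the 25% missing-data cutoff, and the "3 x median depth" masking rule are assumptions made for illustration, "extraordinary" depth is taken here to mean unusually high depth, and the phylogenetic decisiveness criterion is not covered.

```python
from statistics import median


def select_loci(matrix, max_missing=0.25):
    """Keep loci whose fraction of missing samples is at most max_missing.

    matrix maps locus name -> {sample name: sequence or None}.
    The 0.25 default is an illustrative assumption, not a published cutoff.
    """
    kept = []
    for locus, samples in matrix.items():
        if not samples:
            continue  # a locus with no samples carries no information
        missing = sum(1 for seq in samples.values() if not seq)
        if missing / len(samples) <= max_missing:
            kept.append(locus)
    return kept


def mask_high_depth(seq, depths, factor=3.0):
    """Replace bases whose read depth exceeds factor * median depth with 'N'.

    The median-based rule is an assumed stand-in for whatever depth
    criterion the actual pipeline applies.
    """
    if len(seq) != len(depths):
        raise ValueError("sequence and depth vector must have equal length")
    cutoff = factor * median(depths)
    return "".join("N" if d > cutoff else base for base, d in zip(seq, depths))


# Toy data, made up for this example.
matrix = {
    "locus1": {"a": "ACGT", "b": "ACGA", "c": "ACGG", "d": None},
    "locus2": {"a": None, "b": None, "c": "AC", "d": "AC"},
}
print(select_loci(matrix))  # ['locus1']
print(mask_high_depth("ACGTACGT", [10, 12, 11, 90, 95, 10, 9, 11]))  # ACGNNCGT
```

    Masked positions are written as N so that the locus keeps its coordinates while the over-covered stretch is excluded when baits are redesigned.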

Bioinformatic Tools

  1. EvolMarkers
    EvolMarkers is a database, built on genome comparison, for finding conserved single-copy exon (CDS) and intron (EPIC) markers for phylogenetic and population studies (Li et al., 2010; Li et al., 2007). Unfortunately, we currently lack the resources to maintain the web server. If you are interested in searching for useful markers, you can download the scripts and run them on your own computer.

  2. Pipeline & Scripts for reads assembly
    The above link has pipelines and scripts for assembling Illumina sequencing reads into contigs and outputting aligned sequences for subsequent data analyses. Further introductory and tutorial materials can be found in Learning.

  3. Finding target loci for gene capture
    Perl scripts for finding single-copy loci that are conserved among the species of interest and can be used in gene capture.

  4. Misc. Perl Scripts
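
    For item 3 above (finding target loci for gene capture), the core selection rule can be sketched in Python. The distributed scripts are written in Perl and are not reproduced here; the function name, the input layout (per-species hit counts for each candidate locus), and the toy data are all assumptions for illustration.

```python
def single_copy_loci(hit_counts, species):
    """Return loci found exactly once in every species of interest.

    hit_counts maps locus name -> {species name: number of genome hits}.
    A species missing from a locus entry is treated as having zero hits.
    """
    return sorted(
        locus
        for locus, counts in hit_counts.items()
        if all(counts.get(sp, 0) == 1 for sp in species)
    )


# Toy hit counts, made up for this example.
hits = {
    "locusA": {"sp1": 1, "sp2": 1, "sp3": 1},  # single copy everywhere
    "locusB": {"sp1": 2, "sp2": 1, "sp3": 1},  # duplicated in sp1
    "locusC": {"sp1": 1, "sp2": 1},            # absent from sp3
}
print(single_copy_loci(hits, ["sp1", "sp2", "sp3"]))  # ['locusA']
```

    Requiring exactly one hit per species excludes both duplicated loci and loci that are absent from any genome, which is what makes the remaining loci usable as capture targets across the whole group.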

Molecular Markers

  1. Gene-capture target markers for Tapeworms
    We identified 3,641 single-copy nuclear coding loci by comparing the genomes of Hymenolepis microstoma, Echinococcus granulosus, and Taenia solium. We designed RNA baits based on the sequence of H. microstoma, and applied target enrichment and Illumina sequencing to test the utility of those baits to recover loci useful for phylogenetic analyses. We captured DNA from five species of tapeworms representing two families of cyclophyllideans. We obtained an average of 3,284 (90%) of the targets from the test samples and then used captured sequences (2,181,361 bp in total; fragment size ranging from 301 bp to 6,969 bp) to reconstruct a phylogeny for the five test species plus the three species for which genomic data are available.