LASER: Locating Ancestry from SEquence Reads
LASER is a program to estimate individual ancestry by directly analyzing shotgun sequence reads without calling genotypes. LASER uses principal components analysis (PCA) and Procrustes analysis to analyze sequence reads of each sample and place the sample into a reference PCA space constructed using genotypes of a set of reference individuals. With an appropriate reference panel, the estimated coordinates of the sequence samples reflect their ancestral background and can be used to correct for population stratification in association studies. LASER can accurately estimate ancestry even with modest amounts of data, such as the off-target sequence data generated by targeted sequencing experiments.
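To illustrate the Procrustes step described above, here is a minimal two-dimensional sketch in Python. It is not LASER's implementation. It assumes the reference individuals have known coordinates both in the reference PCA space and in a sample-specific PCA space, fits the best similarity transformation (translation, scaling, rotation, and optionally a reflection) between the two sets of reference coordinates, and applies that transformation to the study sample. The function names and the restriction to two principal components are illustrative assumptions.

```python
def fit_similarity(src, dst):
    """Least-squares fit of a, b so that a*z + b approximates w.

    src and dst are equal-length lists of complex numbers (paired points).
    Multiplying by the complex number a rotates and scales; b translates.
    """
    n = len(src)
    src_mean = sum(src) / n
    dst_mean = sum(dst) / n
    num = sum((w - dst_mean) * (z - src_mean).conjugate()
              for z, w in zip(src, dst))
    den = sum(abs(z - src_mean) ** 2 for z in src)
    a = num / den
    b = dst_mean - a * src_mean
    residual = sum(abs(a * z + b - w) ** 2 for z, w in zip(src, dst))
    return a, b, residual


def procrustes_place(ref_in_target, ref_in_source, sample_in_source):
    """Map one sample from the source PCA space into the target PCA space.

    ref_in_target / ref_in_source: lists of (PC1, PC2) tuples for the same
    reference individuals, in the same order, in the two spaces.
    sample_in_source: (PC1, PC2) of the study sample in the source space.
    """
    dst = [complex(x, y) for x, y in ref_in_target]
    src = [complex(x, y) for x, y in ref_in_source]
    z = complex(*sample_in_source)

    best = None
    # PCA axes have arbitrary sign, so try the fit with and without
    # a reflection and keep whichever leaves the smaller residual.
    for reflect in (False, True):
        pts = [p.conjugate() for p in src] if reflect else src
        a, b, residual = fit_similarity(pts, dst)
        if best is None or residual < best[0]:
            zz = z.conjugate() if reflect else z
            best = (residual, a * zz + b)

    placed = best[1]
    return placed.real, placed.imag


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    reference_space = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
    sample_space = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (7.0, 5.0)]
    print(procrustes_place(reference_space, sample_space, (2.0, 2.0)))
```

Representing each point as a complex number keeps the two-dimensional fit in closed form with no external libraries. A general implementation for more than two components would typically use a singular value decomposition instead.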
Comments and suggestions are welcome; please email Chaolong Wang at email@example.com or Gonçalo Abecasis at firstname.lastname@example.org.
If you use LASER, please take a minute to fill out the registration form. We will keep you updated when a new version is released.
Reference for LASER:
- C Wang, X Zhan et al. (2014) Ancestry estimation and control of population stratification for sequence-based association studies. Nature Genetics, in press. (Prepublication manuscript is available upon request at email@example.com)
- The HGDP data in Downloads are based on the Illumina 650K SNP data published by Li et al. (2008, Science 319:1100-1104).
We processed the data as described in our paper (Wang et al. 2014, Nature Genetics). The main steps were updating genomic coordinates to Build 37,
removing tri-allelic SNPs, flipping alleles to the forward strand, and converting the data to the reference genotype format that the LASER program accepts.
We post the processed data to assist users of LASER. The original data can be downloaded from the Stanford HGDP website.
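Two of the processing steps listed above, removing tri-allelic SNPs and flipping alleles to the forward strand, can be sketched in Python as follows. This is an illustration under assumed inputs, not the pipeline used to produce the posted data. The record layout (a dict with "id", "alleles", and "strand" keys) and the function names are hypothetical, and the coordinate update to Build 37 is omitted.

```python
# Complement of each nucleotide, used when moving a SNP from the
# reverse strand to the forward strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip_to_forward(alleles, strand):
    """Return the alleles as read on the forward strand."""
    if strand == "+":
        return tuple(alleles)
    return tuple(COMPLEMENT[a] for a in alleles)


def clean_snps(snps):
    """Drop SNPs with more than two alleles and report the rest on '+'.

    snps: list of dicts with hypothetical keys
      "id"      - SNP identifier
      "alleles" - tuple of observed alleles, e.g. ("A", "G")
      "strand"  - "+" or "-"
    """
    kept = []
    for snp in snps:
        if len(set(snp["alleles"])) > 2:
            continue  # tri-allelic (or worse): remove
        kept.append({
            "id": snp["id"],
            "alleles": flip_to_forward(snp["alleles"], snp["strand"]),
            "strand": "+",
        })
    return kept


if __name__ == "__main__":
    # Made-up records, for illustration only.
    example = [
        {"id": "rs1", "alleles": ("A", "G"), "strand": "-"},
        {"id": "rs2", "alleles": ("A", "C", "T"), "strand": "+"},
    ]
    for record in clean_snps(example):
        print(record)
```

Note that A/T and C/G SNPs look the same on both strands, so a real pipeline needs reliable strand annotation for them. This sketch simply trusts the "strand" field.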
- August 8, 2013 - Uploaded version 1.03 manual and software
- June 19, 2013 - Uploaded version 1.02 manual and software
- March 11, 2013 - Uploaded version 1.01 manual and software
- February 1, 2013 - Uploaded version 1.0 manual, software, HGDP data, and the reference sequence hs37d5.fa