1000G 2010-08 Download
Original data (generated by the Broad Institute) are available at
BI autosomes and BI X chromosome.
There are 629 individuals in total, broken down by continental group (PUR and MXL are each counted in two groups):
174 AFR = 78 YRI + 67 LWK + 24 ASW + 5 PUR
283 EUR = 90 CEU + 92 TSI + 43 GBR + 36 FIN + 17 MXL + 5 PUR
194 ASN = 68 CHB + 25 CHS + 84 JPT + 17 MXL
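The group sizes above can be cross-checked against the stated total with a little shell arithmetic. This is only a sketch: it assumes that the 5 PUR and 17 MXL individuals listed in two groups are the same individuals, which is what the total of 629 suggests. The variable names are illustrative.

```shell
# Population sizes copied from the breakdown above.
afr=$((78 + 67 + 24 + 5))             # YRI + LWK + ASW + PUR
eur=$((90 + 92 + 43 + 36 + 17 + 5))   # CEU + TSI + GBR + FIN + MXL + PUR
asn=$((68 + 25 + 84 + 17))            # CHB + CHS + JPT + MXL

# Sum over the three groups; PUR and MXL are counted twice here.
group_sum=$((afr + eur + asn))

# Assumption: the doubly listed PUR (5) and MXL (17) samples are the same
# individuals, so subtract one copy of each to get unique individuals.
unique=$((group_sum - 5 - 17))

echo "AFR=$afr EUR=$eur ASN=$asn group_sum=$group_sum unique=$unique"
```

Under that assumption the unique count matches the 629 individuals quoted above, while the three group sizes add up to 651.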
For each continental group (AFR, EUR, ASN), SNPs with missing genotypes are removed.
For autosomes, we additionally removed SNPs not flagged as QC+ in the 4-way (Broad Institute, Michigan, Boston College and NCBI) merged set.
Original supporting data used to construct the 4-way consensus SNP set are available at
4-way merged autosomes. Note that this original set doesn't include the X chromosome.
Singletons (SNPs whose minor allele appears only once) are NOT removed.
The files can be fed directly to MaCH. We recommend a two-step imputation procedure: pre-phasing with MaCH, followed by imputation with minimac.
For details, please go to minimac.
Please report to Yun Li if a large number of genotyped SNPs are discarded because they are absent from this
reference. You can check with the following command:
> grep "will be ignored" mach.*.log
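To judge whether "a large number" of SNPs were discarded, it may help to count the matching lines per log file instead of printing them all. The sketch below wraps `grep -c` in a small helper; the function name `count_ignored` is hypothetical, and it assumes each discarded SNP produces one log line containing "will be ignored".

```shell
# Hypothetical helper: print "<log file><TAB><count>" for each log given,
# where count is the number of lines containing "will be ignored".
count_ignored() {
    for log in "$@"; do
        # grep -c prints the number of matching lines; it exits non-zero
        # when that number is 0, so "|| true" keeps the loop going.
        n=$(grep -c "will be ignored" "$log" || true)
        printf '%s\t%s\n' "$log" "$n"
    done
}
```

It could then be called as `count_ignored mach.*.log` in the directory that holds the MaCH log files.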
* $pop.chrX.hap.gz contains two duplicated chromosomes for each male. Please use $pop.chrX.noDup.hap.gz instead.
* Do not turn on --compact if memory is not an issue.