###################################################################
# Data release of 5 deeply sequenced Arabidopsis thaliana strains.
# (Col-0, Kro-0, C24, Ler-1, Bur-0)
# date: 2010_07_01
###################################################################

Data analysis description:

First, each of the genomes was analyzed using the short read
resequencing pipeline SHORE (Ossowski et al, Genome Res, 2008).
Later, we used a homology guided assembly pipeline to assemble the
genomes.

So far we provide parts of the results of the resequencing analysis.
This data has proven to be excellent for linkage analysis, e.g. in
genetic mapping.

Within the current folder are 2 subdirectories, Complete_Set and
Strains. Complete_Set contains data archives of all high quality and
standard assembly data in the corresponding subdirectories. The
Strains directory contains high quality and standard assemblies as
well as high quality and 215k SNP marker sets.

|
`-- Strains
    |--
    |   |-- Assemblies
    |   |   |-- High_Quality
    |   |   |   |-- .SHORE.scaffolds.2010-09-30.500bp.fa
    |   |   |   `-- .SHORE.scaffolds.2010-09-30.500bp.qual
    |   |   `-- Standard
    |   |       |-- .SHORE.scaffolds.2010-06-08.500bp.fa
    |   |       `-- .SHORE.scaffolds.2010-06-08.500bp.qual
    |   `-- SNP_Marker
    |       |-- 215k
    |       |   |-- .215k.TAIR8.csv
    |       |   `-- .215k.TAIR9.csv
    |       `-- High_Quality
    |           |-- .SNPs.TAIR8.txt
    |           `-- .SNPs.TAIR9.txt

TAIR8 and TAIR9 files describe the same information, but with
respect to the different versions of the reference assembly.

.SNPs.TAIRX.txt describes high quality SNP markers, all of which
have proven highly useful for mapping studies.

.215k.TAIRX.csv describes the base calls at those positions that
have been queried with tiling arrays in the study of Suzi Atwell,
published in Nature in 2010. This data can immediately be combined
with the data of Suzi Atwell.
Download data: http://walnut.usc.edu/2010/data/call_method_32.tar.gz/view
Find publication: Atwell et al, Genome-wide association study of 107
phenotypes in Arabidopsis thaliana inbred lines, Nature, 2010.

As only 269 SNPs were found in the resequencing of the reference,
Col-0 was excluded from the subsampling.

###################################################################
# File format description
###################################################################

.SNPs.TAIRX.txt
<# of nonrepetitive reads supporting substitution>

.215k.TAIRX.csv
,,

-------------------------------------------------------------------
Tuebingen, Germany, 2010
Questions? korbinian.schneeberger@tuebingen.mpg.de
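
To check that an unpacked copy of this release matches the directory
layout described above, a small script along the following lines may
help. It is only a sketch and not part of the release. It assumes that
every direct subdirectory of Strains is one strain, and it uses each
file's path relative to that strain directory (for example
Assemblies/High_Quality or SNP_Marker/215k) as its category. The
function name list_strain_files is made up for this illustration.

```python
from pathlib import Path


def list_strain_files(release_root):
    """Return {strain: {category: [file names]}} for files under Strains/.

    Assumption (not stated by the release itself): each direct
    subdirectory of Strains/ holds the data of exactly one strain.
    The category is the file's parent directory relative to the strain
    directory, e.g. 'Assemblies/Standard' or 'SNP_Marker/215k'.
    """
    strains_dir = Path(release_root) / "Strains"
    result = {}
    for strain_dir in sorted(p for p in strains_dir.iterdir() if p.is_dir()):
        by_category = {}
        for path in sorted(strain_dir.rglob("*")):
            if path.is_file():
                # Path relative to the strain directory, with forward slashes.
                category = path.parent.relative_to(strain_dir).as_posix()
                by_category.setdefault(category, []).append(path.name)
        result[strain_dir.name] = by_category
    return result
```

Calling list_strain_files(".") from the folder that contains
Complete_Set and Strains returns one entry per strain directory. A
strain that lacks one of the four categories shown in the tree points
to an incomplete download.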