1001 Genomes
A Catalog of Arabidopsis thaliana Genetic Variation.
Explore the variants. We maintain several tools for data download, visualization, and analysis.
Visit the Data Center and download whole sets of SNPs, indels, SVs, and genome sequences.
Seed sets of natural accessions are available for
Complete set

The 1001 Genomes Project was launched at the beginning of 2008 to discover detailed whole-genome sequence variation in at least 1001 strains (accessions) of the reference plant Arabidopsis thaliana. The first major phase of the project was completed in 2016, with publication of a detailed analysis of 1135 genomes. Unfortunately, the second-generation sequencing methods that have made it economically feasible to screen large numbers of individuals do not actually produce complete genome sequences. Instead, they produce massive numbers of very short sequence fragments that must be aligned to a reference genome in order to identify variants. Because of this, only simple variants are reported, and the results are invariably biased with respect to what is present or missing in the reference genome. Large or complex structural variants, as well as simple variants inside complex variants, are generally missed completely.

To remedy this problem, we have recently begun the second major phase, the 1001G+ project. We have begun to assemble genomes from a diverse collection of A. thaliana strains, with the goals of annotating them with transcriptome and epigenome information and of developing tools to make the results available to the community.