###################################################################
# Additional strains sequenced at MPI by collaborators
# date: 2011_06_28
###################################################################

The three strains (Ws-2, Tnz-1 and Strand-1) were sequenced by Seth 
Davis (Max Planck Institute for Plant Breeding Research, Cologne)
and were analyzed at the MPI Tuebingen.

###################################################################
# Folder organisation 
###################################################################

Within the folder 'strains' there are subfolders for each strain
containing analyzed data files. As the short read analysis was 
performed against TAIR8, the directory structure looks like:

    <release>
    `-- strains
        `-- <strain name>
            `-- TAIR8
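
The layout above can be turned into a path with a small helper. This
is a sketch only: the function name `strain_dir` and the argument
names are hypothetical, and <release> stands for wherever the release
was unpacked.

```python
from pathlib import Path


def strain_dir(release_root: str, strain: str, reference: str = "TAIR8") -> Path:
    """Build <release>/strains/<strain name>/TAIR8 from its components."""
    return Path(release_root) / "strains" / strain / reference
```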

The analyzed data files are:

filtered_variant.txt
	Positions and annotation of SNPs (single nucleotide
	polymorphisms) and 1-3bp deletions detected through
	short read alignments, including a quality value.

filtered_reference.txt
	Positions featuring the reference base, including a 
	quality value.

SV_deletion_high_quality.PE.txt
	Large deletion predictions. Note that deletions shorter
	than 10bp have poor selectivity and sensitivity and are
	included only for completeness.
	A unique ID was attached to each prediction so that the
	identical deletion can be found again in other accessions.
	This allows for frequency analysis of SV deletions.

unsequenced.txt
	Lists all regions in which no base call could be made
	(including SV deletions), regardless of the reason.

###################################################################
# File format description
###################################################################

filtered_variant.txt
	<Sample>
	<Chromosome>
	<Position>
	<Reference base>
	<Substitution base>
	<Quality>
	<# of nonrepetitive reads supporting substitution>
	<concordance>
	<Avg. # alignments of overlapping reads>
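
A minimal parsing sketch for one line of filtered_variant.txt. It
assumes one record per line with tab-separated fields in the order
listed above; the delimiter, the numeric types and the names
`VariantRecord` and `parse_variant_line` are assumptions made for
illustration and are not part of the release.

```python
from typing import NamedTuple


class VariantRecord(NamedTuple):
    sample: str
    chromosome: str
    position: int
    reference_base: str
    substitution_base: str
    quality: float
    support: float         # nonrepetitive reads supporting the substitution
    concordance: float
    avg_alignments: float  # avg. # alignments of overlapping reads


def parse_variant_line(line: str) -> VariantRecord:
    """Split one line into its nine fields (assumed tab-separated)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    sample, chrom, pos, ref, sub, qual, support, conc, avg = fields
    return VariantRecord(sample, chrom, int(pos), ref, sub,
                         float(qual), float(support), float(conc), float(avg))
```

Numeric fields are read as floats (except the position) so that the
sketch does not depend on how the counts are written in the files.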

filtered_reference.txt
	<Sample>
	<Chromosome>
	<Position>
	<Reference base>
	<Substitution base>
	<Quality>
	<# of nonrepetitive reads supporting substitution>
	<concordance>
	<Avg. # alignments of overlapping reads>

SV_deletion_high_quality.PE.txt
	<Sample>
	<Sequencing library ID>
	<# read pairs supporting SV call>
	<Chromosome>
	<Start>
	<End>
	<Deletion length based on mate pairs>
	<Gap length between stretched mate pair clusters>
	<# positions w/o nonrepetitive core alignment>
	<# positions w/o core alignment>
	<p-value>
	<id>
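
The frequency analysis that the deletion IDs allow could be sketched
as follows. The code assumes tab-separated fields in the order listed
above and that the lines of several strains' files are fed in
together; the column keys and the function name
`samples_per_deletion` are hypothetical.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, Set

# Keys chosen for this sketch; they follow the field order listed above.
SV_COLUMNS = [
    "sample", "library_id", "n_read_pairs", "chromosome", "start", "end",
    "length_mate_pairs", "gap_length", "n_pos_no_nonrep_core",
    "n_pos_no_core", "p_value", "id",
]


def samples_per_deletion(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each deletion ID to the set of samples in which it was predicted."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != len(SV_COLUMNS):
            continue  # skip blank or malformed lines
        record = dict(zip(SV_COLUMNS, row))
        seen[record["id"]].add(record["sample"])
    return dict(seen)
```

The number of samples recorded for an ID, divided by the number of
strains analyzed, gives the frequency of that deletion.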

unsequenced.txt
	<Chromosome>
	<Start>
	<End>
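
A sketch of how unsequenced.txt might be used to test whether a
position lies in a region without a base call. It assumes
whitespace-separated fields, inclusive <Start> and <End> coordinates,
and regions that do not overlap on the same chromosome; none of this
is stated by the release, and the names `load_regions` and
`is_unsequenced` are hypothetical.

```python
import bisect
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Regions = Dict[str, List[Tuple[int, int]]]


def load_regions(lines: Iterable[str]) -> Regions:
    """Group (start, end) intervals by chromosome, sorted by start."""
    regions: Regions = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        chrom, start, end = fields
        regions[chrom].append((int(start), int(end)))
    for intervals in regions.values():
        intervals.sort()
    return dict(regions)


def is_unsequenced(regions: Regions, chrom: str, pos: int) -> bool:
    """Return True if pos lies inside a listed region (ends inclusive)."""
    intervals = regions.get(chrom, [])
    # Index of the last interval whose start is <= pos.
    i = bisect.bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos <= intervals[i][1]
```

The binary search keeps lookups fast when many positions are checked
against the same set of regions.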


-------------------------------------------------------------------
Tuebingen, Germany, 2011
Questions?
Seth Davis <davis@mpipz.mpg.de>
Jun Cao <jun.cao@tuebingen.mpg.de>

