Notice: Due to a server crash, the database was briefly unavailable yesterday. Please let us know if anything does not work as expected. Apologies for any inconvenience.

Solanum tuberosum Genome Data

The potato genome was sequenced by the Potato Genome Sequencing Consortium (PGSC), an international group of scientists from 14 countries. The sequence was published in Nature in July 2011. Many potato lines are tetraploid; however, the sequenced accession was a homozygous diploid line (a doubled monoploid) of Solanum tuberosum Group Phureja. Below is a mirror of the files provided on http://potatogenome.net.
Sequence datasets  
S_tuberosum_Group_Phureja_chloroplast 
Description
Sequence files and other related information from the Potato Genome Sequencing Consortium (PGSC) sequencing of the chloroplast of the heterozygous diploid S. tuberosum Group Tuberosum cultivar, RH89-039-16 (RH).
Files
S_tuberosum_Group_Tuberosum_chloroplast_RH89-039-16.fasta.zip
S. tuberosum Group Tuberosum RH89-039-16 chloroplast sequences
S_tuberosum_Group_Phureja_mitochondrion 
Description
Sequence files and other related information from the Potato Genome Sequencing Consortium (PGSC) sequencing of the mitochondrion of the heterozygous diploid S. tuberosum Group Tuberosum cultivar, RH89-039-16 (RH).
Files
S_tuberosum_Group_Tuberosum_mitochondrion_RH89-039-16.fasta.zip
S. tuberosum Group Tuberosum RH89-039-16 mitochondrion sequences
PGSC_DM_v3 
Description
Sequence files and other related information from the Potato Genome Sequencing Consortium (PGSC) sequencing of the doubled monoploid S. tuberosum Group Phureja clone DM1-3 516R44 (DM).
Annotated by
PGSC_DM_v3.4
Files
PGSC_DM_v3_superscaffolds.fasta.zip
S. tuberosum Group Phureja DM1-3 516R44 (CIP801092) Version 3 DM superscaffold sequences
PGSC_DM_v3_scaffolds.fasta.zip
S. tuberosum Group Phureja DM1-3 516R44 (CIP801092) Version 3 DM scaffold sequences
S_tuberosum_Group_Phureja_chloroplast_DM1-3-516-R44.fasta.zip
S. tuberosum Group Phureja DM1-3 516R44 (CIP801092) Version 3 chloroplast sequences
S_tuberosum_Group_Phureja_mitochondrion_DM1-3-516-R44.fasta.zip
S. tuberosum Group Phureja DM1-3 516R44 (CIP801092) Version 3 mitochondrion sequences
PGSC_DM_v3_superscaffolds.fasta.zip
S. tuberosum Group Phureja DM1-3 516R44 (CIP801092) Version 3 DM, Version 2.1.9 AGP Pseudomolecule Sequences
PGSC_DM_v3_2.1.9_superscaffolds_unanchored_gtr_2.5k.fasta.zip
S. tuberosum Group Phureja DM1-3 516R44 (CIP801092) Version 3 DM, Version 2.1.9 AGP Unanchored Superscaffold Sequences (>2.5kbp)
PGSC_DM_v3_2.1.9_pseudomolecule_AGP.xlsx
S. tuberosum Group Phureja DM1-3 516R44 (CIP801092) Version 3 DM Pseudomolecule AGP data (v2.1.9) - Excel Format
Assembly issues

If in the course of your work you find errors or other issues with these genome assemblies, please report them using one of the following links:

Annotation datasets  
PGSC_DM_v3.4 
Description
S. tuberosum Group Phureja DM1-3 516R44 (CIP801092) Genome Annotation v3.4 (based on v3 superscaffolds)
Files
PGSC_DM_v3.4_gene.fasta.zip
Nucleotide sequences of all genes.
PGSC_DM_v3.4_cds.fasta.zip
Nucleotide sequences of all gene coding sequences (coding sequence only, i.e. no introns and no UTRs).
PGSC_DM_v3.4_transcript.fasta.zip
Nucleotide sequences of all transcripts (UTRs and exons).
PGSC_DM_v3.4_pep.fasta.zip
Amino acid sequences corresponding to all gene coding sequences.
PGSC_DM_v3.4_gene.gff.zip
Gene annotation in GFF3 format
PGSC_DM_v3.4_cds_nonredundant.fasta.zip
Nonredundant coding sequences. Alternative isoforms sometimes share the same coding sequence (CDS), which appears only once in this file.
PGSC_DM_v3.4_pep_nonredundant.fasta.zip
Amino acid sequences corresponding to the nonredundant CDS file above.
PGSC_DM_v3.4_gene_nonredundant.gff.zip
Same as PGSC_DM_v3.4_gene.gff, with additional flags for (a) identical peptides originating from multiple genes and (b) identical peptides originating from alternative isoforms of the same gene.
PGSC_DM_v3.4_transcript_representative.fasta.zip
The transcript that produces the longest peptide sequence among all the alternative isoforms of a gene is selected as the representative transcript.
PGSC_DM_v3.4_cds_representative.fasta.zip
Coding sequences of the representative transcripts.
PGSC_DM_v3.4_pep_representative.fasta.zip
Amino acid sequences corresponding to the representative CDS file above.
PGSC_DM_v3.4_gene_func.txt.zip
Putative function of all genes. The putative function of the representative peptide is used if alternative isoforms exist.
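
The nonredundant and representative files described above follow two simple rules: identical sequences are listed once, and the isoform producing the longest peptide represents its gene. The Python sketch below illustrates both rules on toy data. It is not the PGSC pipeline. The function names, the record IDs (pepA1, geneA, and so on), and the peptide-to-gene mapping are hypothetical, and the tie-breaking rule is an assumption, since the descriptions above do not say how ties between equally long peptides are resolved.

```python
from collections import defaultdict


def parse_fasta(text):
    """Parse FASTA text into a dict of {record_id: sequence}."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current_id = line[1:].split()[0]
            records[current_id] = ""
        elif current_id is not None:
            records[current_id] += line
    return records


def collapse_identical(records):
    """Group record IDs by identical sequence: {sequence: [ids, ...]}."""
    groups = defaultdict(list)
    for record_id, seq in records.items():
        groups[seq].append(record_id)
    return dict(groups)


def pick_representatives(peptides, peptide_to_gene):
    """Return {gene_id: peptide_id} choosing the longest peptide per gene.

    Assumption: on a tie, the first peptide encountered is kept.
    """
    best = {}
    for pep_id, seq in peptides.items():
        gene = peptide_to_gene[pep_id]
        if gene not in best or len(seq) > len(peptides[best[gene]]):
            best[gene] = pep_id
    return best


# Toy input with made-up IDs; real PGSC headers may be formatted differently.
example = """>pepA1
MKT
>pepA2
MKTLLV
>pepB1
MKT
"""
peptides = parse_fasta(example)
gene_of = {"pepA1": "geneA", "pepA2": "geneA", "pepB1": "geneB"}

nonredundant = collapse_identical(peptides)
representatives = pick_representatives(peptides, gene_of)
print(nonredundant)
print(representatives)
```

In practice the peptide-to-gene mapping would have to come from the GFF3 annotation rather than a hand-written dictionary, and the unzipped FASTA files would be read from disk instead of an inline string.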