TABLE 1

Species and sequence data sets used for computational screening of COSII genes

Common name | Species | Resource | Statistics | URL
Arabidopsis | Arabidopsis thaliana | TAIR | 28,581 sequences; from 60 to 15,468 bp; 1274 bp on average | ftp://tairpub:tairpub@ftp.Arabidopsis.org/home/tair/Sequences/blast_datasets/
Tomato | S. lycopersicum and S. pennellii | SGN | 30,576 sequences; from 89 to 4127 bp; 773 bp on average | ftp://ftp.sgn.cornell.edu/unigene_builds/new-tomato.seq
Potato | S. tuberosum | SGN | 24,931 sequences; from 151 to 4200 bp; 740 bp on average | ftp.sgn.cornell.edu/unigene_builds/Solanum_tuberosum.seq
Pepper | Capsicum annuum | SGN | 9554 sequences; from 150 to 3182 bp; 556 bp on average | ftp.sgn.cornell.edu/unigene_builds/Capsicum_combined.seq
Coffee | Coffea canephora var. robusta | CGN | 13,175 sequences; from 150 to 2714 bp; 677 bp on average | http://coffee.pgn.cornell.edu
Sunflower | Helianthus annuus | TIGR | 20,520 sequences; from 100 to 4587 bp; 478 bp on average | ftp://ftp.tigr.org/pub/data/tgi/Helianthus_annuus
Lettuce | Lactuca sativa | TIGR | 22,185 sequences; from 100 to 5544 bp; 632 bp on average | ftp://ftp.tigr.org/pub/data/tgi/Lactuca_sativa
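
The Statistics column reports, for each data set, the number of sequences, the shortest and longest sequence, and the mean length. As a minimal sketch of how such a summary could be computed, the Python below assumes the sequences are available as FASTA-formatted text; the table does not state the file format or how the original figures were produced, and the function names `fasta_lengths` and `summarize` are hypothetical. Numbers are printed without thousands separators.

```python
from typing import Iterable, Iterator


def fasta_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the residue count of each record in FASTA-formatted lines.

    Assumption: each record starts with a '>' header line, followed by
    one or more sequence lines. Text before the first header is ignored.
    """
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length  # close the previous record
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length  # close the final record


def summarize(lengths: Iterable[int]) -> str:
    """Format count, range, and rounded mean like the Statistics column."""
    values = list(lengths)
    if not values:
        return "0 sequences"
    mean = round(sum(values) / len(values))
    return (
        f"{len(values)} sequences; from {min(values)} to {max(values)} bp; "
        f"{mean} bp on average"
    )
```

For a downloaded file, the two functions could be chained as `summarize(fasta_lengths(open(path)))`, where `path` is a placeholder for a local copy of one of the data sets listed above.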