|
Strategies for Mining SNPs from EST data
1. Big problems
• a. EST sequences are generally of poor quality (single-pass sequencing).
• b. Public EST data come without trace files or quality files.
• c. No original source information is available to identify the accession.
• d. ESTs are short, which makes it difficult to distinguish orthologs from paralogs, and likewise homozygous from heterozygous sequences.
• e. It is difficult to find nsSNPs (non-synonymous SNPs) from ESTs.
• f. Low-frequency SNPs are difficult to detect.
2. Strategies
• a. Retrieve all EST records for potato or Brassica from the EMBL database, then extract the sequence, cultivar and tissue from each record.
• b. Align the sequences with CAP3, using Cross_match to remove vector sequences.
• c. Collect the alignment information for SNP analysis.
• d. Analyze this information and identify true SNPs.
• e. Identify nsSNPs according to the blastx and fasty results.
3. Workflow
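Steps c to e of the strategies in section 2 can be illustrated with a minimal Python sketch. It is not the pipeline used here, only one common heuristic: a column of the alignment is kept as a candidate SNP when the minor allele is seen in at least a minimum number of reads, so that isolated single-pass sequencing errors are discarded. The function names, the input format (equal-length gapped read strings from one contig) and the threshold `min_minor_count=2` are assumptions made for illustration. The second function assumes the reading frame is already known (in the strategy above that information would come from blastx and fasty) and uses the standard genetic code.

```python
from collections import Counter

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def candidate_snps(aligned_reads, min_minor_count=2):
    """Return (column, allele_counts) for each candidate SNP column.

    aligned_reads: equal-length strings, '-' marks a gap (hypothetical format).
    A column qualifies when its second most common base occurs in at least
    min_minor_count reads (assumed threshold, not taken from the source).
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    snps = []
    for col in range(length):
        # Count only unambiguous bases; gaps and N are ignored.
        counts = Counter(
            read[col].upper()
            for read in aligned_reads
            if read[col].upper() in "ACGT"
        )
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[1][1] >= min_minor_count:
            snps.append((col, dict(counts)))
    return snps


def is_nonsynonymous(ref_codon, alt_codon):
    """True when the two in-frame codons encode different amino acids."""
    return CODON_TABLE[ref_codon.upper()] != CODON_TABLE[alt_codon.upper()]


if __name__ == "__main__":
    reads = [
        "ACGTACGT",
        "ACGTACGT",
        "ACCTACGT",
        "ACCTACGA",
        "ACGTAC-T",
    ]
    print(candidate_snps(reads))
    print(is_nonsynonymous("GAA", "GTA"))
```

In the toy alignment, column 2 (0-based) is reported because both G and C are supported by at least two reads, while the single A in the last column is treated as a likely sequencing error. This also shows why low-frequency SNPs are hard to detect: raising the threshold removes errors but hides rare alleles.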
|