BIOINFORMATICS
NCBI
WHAT?
- National Center for Biotechnology Information (www.ncbi.nlm.nih.gov)
- Bioinformatics = application of computational tools to molecular data (DNA, cDNA or protein sequences) to acquire, analyse or visualise data
FUNCTION
- Store/retrieve biological info (database)
- Retrieve/compare gene sequences
- Predict function of unknown genes/proteins
- Search for previously known functions of a gene
- Compare data with other researchers
- Compile/distribute data for other researchers
APPLICATION
Pattern recognition
- A particular sequence or structure has been seen before, so particular characteristics can be associated with it
Prediction
- From what we know to what we don't know (sequence -> predicted structure or function)
THE PATHWAY of rDNA TECH to BIOINFO
1) DNA extraction
- verify the DNA extracted from any organism through agarose gel electrophoresis (AGE)
- agarose percentage ∝ 1/DNA size (smaller fragments need a higher-percentage gel)
2) PCR
- amplify targeted DNA sequence using a pair of primers
- verify amplified DNA fragments through AGE
3) DNA sequencing
- verify amplified DNA fragments through DNA sequencing
- use an automated DNA sequencing approach; the result is output as a chromatogram
4) Bioinformatics
- interpret the message in the DNA sequence
- use the BLAST program provided by NCBI
- results are shown as BLAST output
- NCBI: located in Bethesda, Maryland; founded in 1988
- NCBI hosts biological databases (db), also called databanks
- NCBI services: GenBank, PubMed, Entrez and BLAST
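
Step 2 of the pathway above (PCR with a pair of primers) can be mimicked on a computer. The sketch below is only an illustration under simplifying assumptions: primers must match exactly, the template and primer sequences are made up, and the function names `reverse_complement` and `in_silico_pcr` are hypothetical, not part of any NCBI tool.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the amplicon bounded by a primer pair, or None if a primer does not bind.

    Assumption: only exact matches count; real primers tolerate some mismatches.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we search for its reverse complement, downstream of the forward primer.
    rev_site = reverse_complement(reverse)
    end = template.find(rev_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


# Hypothetical template and primers, made up for illustration only.
template = "TTGACCATGGCTAGCTAGGATCCGATTACAGGCTTAA"
amplicon = in_silico_pcr(template, forward="ATGGCTAG", reverse="CCTGTAAT")
print(amplicon, len(amplicon))
```

The amplicon runs from the first base of the forward primer to the last base of the reverse primer's binding site, which is the fragment that AGE and sequencing would then verify.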
BLAST
- Basic Local Alignment Search Tool
- Developed by Altschul et al. in 1990
- A tool for searching gene or protein sequence db for related genes
- Principle = computer comparison using an algorithm to find the best-matching pairs (INPUT -> OUTPUT)
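
To make the "compare query against db, report the best matches" principle concrete, here is a minimal sketch. Note that it uses Smith-Waterman dynamic programming, an exact local alignment method, not BLAST's own faster heuristic. The scoring values (match +1, mismatch -1, gap -2), the toy database and the function name `local_alignment_score` are assumptions chosen for illustration.

```python
def local_alignment_score(query: str, subject: str,
                          match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between two sequences (Smith-Waterman)."""
    rows, cols = len(query) + 1, len(subject) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            same = query[i - 1] == subject[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            up = score[i - 1][j] + gap      # gap in the subject
            left = score[i][j - 1] + gap    # gap in the query
            # A local alignment may restart anywhere, so scores never drop below 0.
            score[i][j] = max(0, diagonal, up, left)
            best = max(best, score[i][j])
    return best


# Hypothetical toy database; a real search runs against GenBank-scale data.
database = {"seq_A": "TTTTACGTACGTTTTT", "seq_B": "GGGGGGGGGG", "seq_C": "ACGTTCGT"}
query = "ACGTACGT"

# INPUT: query sequence -> OUTPUT: database entries ranked by score.
hits = sorted(database,
              key=lambda name: local_alignment_score(query, database[name]),
              reverse=True)
print(hits)
```

Ranking every database entry by score is the same input-to-output idea as a BLAST search, only far slower on large databases.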
ALIGNMENTS
- between query sq. and a given db sq. (DNA-DNA or amino acid-amino acid)
- allow mismatches and gaps
- indicate degree of similarity
- Similarity (:) = resemblance between two residues that is greater than would be expected at random
- Identity (*) of nucleotides = exact match between two nucleotides
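
The identity marker can be reproduced for a pair of sequences that are already aligned. This is a small sketch, assuming gaps are written as `-` and that percent identity is taken over all alignment columns (other conventions exist); the sequences and the function names `match_line` and `percent_identity` are made up for illustration.

```python
def match_line(aligned_a: str, aligned_b: str) -> str:
    """Mark identical columns with '*' and every other column with a space."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return "".join("*" if a == b and a != "-" else " "
                   for a, b in zip(aligned_a, aligned_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns as a percentage of all alignment columns (gaps included)."""
    marks = match_line(aligned_a, aligned_b)
    return 100.0 * marks.count("*") / len(marks)


# Hypothetical pre-aligned pair with one mismatch and one gap ('-').
query = "ACGTAC-GTA"
subject = "ACCTACGGTA"
print(query)
print(match_line(query, subject))
print(subject)
print(percent_identity(query, subject))
```

Here 8 of the 10 columns are exact matches, so the mismatch and the gap both lower the identity.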
HOW TO USE BLAST
- 2 types (among others): blastx (translated nucleotide query vs. protein db), tblastn (protein query vs. translated nucleotide db)
- save DNA sq. in FASTA format (use Notepad)
- header line starts with > followed by a sq. ID without spaces
- nucleotides follow, one letter each
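
The FASTA layout described above can also be produced and checked with a few lines of code instead of typing it by hand. This is a minimal sketch; the ID `my_sample_01`, the sequence, the line width and the function names `to_fasta` and `read_fasta` are all illustrative assumptions.

```python
def to_fasta(seq_id: str, sequence: str, width: int = 60) -> str:
    """Format one record: '>' + ID on the header line, then wrapped sequence lines."""
    if not seq_id or any(ch.isspace() for ch in seq_id):
        raise ValueError("sequence ID must be non-empty and contain no spaces")
    # Remove any whitespace pasted in with the sequence and use upper-case letters.
    sequence = "".join(sequence.split()).upper()
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + seq_id + "\n" + "\n".join(lines) + "\n"


def read_fasta(text: str) -> dict:
    """Parse FASTA text into {ID: sequence}; the ID is the first word after '>'."""
    records, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


# Hypothetical ID and sequence, for illustration only.
record = to_fasta("my_sample_01", "atggctagct aggatccgat", width=10)
print(record)
print(read_fasta(record))
```

The text in `record` can be saved to a plain-text file or pasted into the BLAST query box.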
BLAST OUTPUT
- 1) Graphical representation
- top = linear view of the query sq., with bars below indicating where the matches are
- bars = coloured according to alignment score; a grey area is a region not similar to the query sq. that is flanked by areas of similarity
- under = a list of hits in decreasing order of significance
- mouse over a bar = displays the identifier of that sq.
- click on a bar = takes you to the alignment of that sq.