BIOINFORMATICS

NCBI

WHAT ?

- National Center for Biotechnology Information (www.ncbi.nlm.nih.gov)

  • Bioinformatics = application of computational tools to molecular data
    (DNA, cDNA or protein sequences)
  • to acquire, analyse or visualise data

FUNCTION

  • Store/retrieve biological info (database)
  • Retrieve/compare gene sequences
  • Predict function of unknown genes/proteins
  • Search for previously known functions of a gene
  • Compare data with other researchers
  • Compile/distribute data for other researchers

APPLICATION

Pattern recognition

  • Recognising that a particular sequence or structure has been seen before, and that particular characteristics can be associated with it

Prediction

  • From what we know to what we don't know (sequence → predicted structure or function)

THE PATHWAY of rDNA TECH to BIOINFO

1) DNA extraction

  • verify the DNA extracted from any organism through agarose gel electrophoresis (AGE)
  • agarose percentage ∝ 1/DNA size (smaller fragments need a higher-percentage gel)

2) PCR

  • amplify targeted DNA sequence using a pair of primers
  • verify amplified DNA fragments through AGE
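
The primer-pair idea in step 2 can be sketched in code. This is only a rough illustration, assuming exact primer matches on a made-up template; the names `reverse_complement` and `in_silico_pcr` are hypothetical, and a real PCR also depends on annealing temperature, mismatches and primer design.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the amplified fragment, or None if a primer does not bind.

    Assumption: exact matches only. The forward primer is searched on the
    given strand; the reverse primer binds the opposite strand, so its
    reverse complement is searched downstream on the given strand.
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    # The product runs from the first base of the forward primer
    # to the last base of the reverse primer's binding site.
    return template[start:end + len(reverse_site)]


# Hypothetical template and primers, for illustration only.
template = "TTGACCATGGCTAGCTAGGATCCGATTACAGGCATAA"
amplicon = in_silico_pcr(template, "ATGGCTAGC", "GCCTGTAATC")
print(amplicon, len(amplicon))
```

The length of the returned fragment is what would be checked against the band position on the agarose gel.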

3) DNA sequencing

  • verify amplified DNA fragments through DNA sequencing
  • use an automated DNA sequencing approach; the result is output
    as a chromatogram

4) Bioinformatics

  • interpret message in the DNA sequence
  • usage of BLAST program provided by NCBI
  • results shown as BLAST output
  • NCBI: located in Bethesda, Maryland; founded in 1988
  • maintains biological databases (db), also called databanks
  • NCBI services: GenBank, PubMed, Entrez and BLAST

BLAST

- Basic Local Alignment Search Tool

  • Developed by Altschul et al. in 1990
  • A tool for searching gene or protein sequence db for related genes
  • Principle = computer comparison using an algorithm to find the best-matching pairs (INPUT → OUTPUT)
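
To make "local alignment" concrete, here is a small sketch. It does not reproduce BLAST itself, which uses a faster heuristic; it uses the Smith-Waterman algorithm as a stand-in for the idea of finding the best-matching local region between two sequences. The scoring values (match +1, mismatch -1, gap -2) and the function name are assumptions chosen for illustration.

```python
def smith_waterman(query, subject, match=1, mismatch=-1, gap=-2):
    """Return (best_score, aligned_query, aligned_subject) for one best local alignment."""
    rows, cols = len(query) + 1, len(subject) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; a cell never drops below 0, which is what
    # makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if query[i - 1] == subject[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + pair,  # align the two residues
                score[i - 1][j] + gap,       # gap in the subject
                score[i][j - 1] + gap,       # gap in the query
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    aligned_q, aligned_s = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        pair = match if query[i - 1] == subject[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + pair:
            aligned_q.append(query[i - 1])
            aligned_s.append(subject[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_q.append(query[i - 1])
            aligned_s.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_s.append(subject[j - 1])
            j -= 1
    return best, "".join(reversed(aligned_q)), "".join(reversed(aligned_s))


# Hypothetical query and db sequence.
print(smith_waterman("ACGT", "TTACGTAA"))
```

The input is a query sq. and a db sq.; the output is a score plus the aligned region, which is the same INPUT/OUTPUT shape as a BLAST search, only on a toy scale.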

ALIGNMENTS

  • between query sq. and a given db sq. (DNA-DNA or amino acid-amino acid)
    - allows mismatches and gaps
    - indicates degree of similarity
  • Similarity (:) = resemblance between two residues that is greater than would be expected at random
  • Identity (*) = exact match between two nucleotides
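
As a rough sketch of how identity is read off an alignment, the snippet below marks identical positions with `*` and reports percent identity. The aligned sequences are made up, `-` is assumed to denote a gap, and the function names are hypothetical. Marking similarity (`:`) for proteins would additionally need a substitution matrix, which is not shown here.

```python
def match_line(aligned_a, aligned_b):
    """Build a marker line: '*' under identical residues, ' ' elsewhere."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return "".join(
        "*" if a == b and a != "-" else " "
        for a, b in zip(aligned_a, aligned_b)
    )


def percent_identity(aligned_a, aligned_b):
    """Identical positions as a percentage of the alignment length (gaps included)."""
    if not aligned_a:
        raise ValueError("alignment is empty")
    markers = match_line(aligned_a, aligned_b)
    return 100.0 * markers.count("*") / len(markers)


# Hypothetical aligned pair with one mismatch and one gap.
query = "ACGTTGCA-T"
subject = "ACGATGCAGT"
print(query)
print(match_line(query, subject))
print(subject)
print(f"identity: {percent_identity(query, subject):.1f}%")
```

Dividing by the full alignment length, gaps included, is one common convention; other tools may divide by the shorter sequence length instead.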

HOW TO USE BLAST

  • 2 types: blastx (translated nucleotide query vs. protein db), tblastn (protein query vs. translated nucleotide db)
  • Save the DNA sq. in FASTA format (plain text, e.g. using Notepad):
    - header line starts with >
    - followed by the sq. ID, with no spaces
    - sequence on the following line(s), one letter per nucleotide
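
The FASTA layout described above can be sketched as a pair of small helpers. This is a minimal illustration, not a full parser: the function names and the sequence ID `my_seq1` are hypothetical, and the 60-character line width is an assumed convention rather than a requirement.

```python
def format_fasta(seq_id, sequence, width=60):
    """Return a FASTA record: a '>' header with the ID, then wrapped sequence lines."""
    if not seq_id or any(ch.isspace() for ch in seq_id):
        raise ValueError("sequence ID must be non-empty and contain no spaces")
    # Remove any whitespace/newlines pasted in with the sequence.
    sequence = "".join(sequence.split()).upper()
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + seq_id + "\n" + "\n".join(lines) + "\n"


def parse_fasta(text):
    """Parse FASTA text into a dict mapping sequence ID -> sequence."""
    records, current_id = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            current_id = header[0] if header else ""
            records[current_id] = ""
        elif current_id is not None:
            records[current_id] += line
    return records


# Hypothetical sequence, for illustration only.
record = format_fasta("my_seq1", "atgc atgc", width=4)
print(record)
print(parse_fasta(record))
```

The text returned by `format_fasta` is what would be saved to a plain-text file and pasted into the BLAST query box.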

BLAST output

  • 1) Graphical representation
    • top = linear view of the query sq., with bars below indicating where the matches are
    • bars = coloured according to alignment score; a grey area is a region not similar to the query sq. that is surrounded by areas of similarity
    • under = a list of hits in decreasing order of significance
    • mouse over a bar = displays the identifier of that sq.
    • click on a bar = takes you to the alignment of that sq.