Protein Analysis - Coggle Diagram

- - - - Proteins w/ similar sequences should have similar functions
      - Most reliable method for determining protein function is to do a database search (as discussed in previous section)
      - Some functional features can be predicted directly from a protein seq, e.g. hydrophobicity profiles can be used to predict transmembrane helices
        
        Hydrophobicity profiles can be generated and displayed graphically using ProtScale at ExPASy, which allows the user to calculate over 50 diff properties of proteins from scales where a number (e.g. hydrophobicity) is assigned to each AA
        
        Input to the program can either be a sequence pasted into the sequence window or a SWISS-PROT accession code (only other parameter is the size of the sliding window used for averaging)
      - Predicting transmembrane domains
        
        Easiest way to predict transmembrane helices in a seq is to look for regions of the protein containing a run of about 20 hydrophobic residues
        
        Program named TMpred, whose algorithm is based on TMbase, a database of transmembrane proteins (80-95% accurate in predicting location of helices and their orientation)
      - Leader Sequences and Protein Localisation
        
        Proteins contain signals in their seq that help their processing within the cell, e.g. leader sequences or signals which target proteins to specific compartments in cells
        
        Program named SignalP (predicts leader sequences and cleavage sites in both Prok and Euk)
        
        Program named PSORT (analyses Prok or Euk seq and searches for protein sorting signals; reports back probability of protein being localised to diff compartments within cell). Accuracy 60%
    - - Often a protein seq is too distantly related to any in the databases to allow a reliable ID to be made by sequence alignment
        
        Alternatively, seq alignment might find a match but to a protein of no known function. In this case there is still a lot that can be done to predict function using bioinfo tools
      - Diff regions of proteins evolve at different rates; some parts must retain certain patterns of residues for the protein to function
        
        If these conserved regions are ID'd, it's possible to make predictions about the protein function
        
        e.g. there are many short seq's that are diagnostic of the active site or binding region of a protein, such as metal binding domains (MBD) in the Cu uptake system
        
        If the protein seq contains an MBD motif, it's possible to predict that one of its functions might be binding metals. Presence of an MBD motif doesn't mean that the protein binds metal ions, but it provides an experimentally testable hypothesis as to protein function
      - Several bioinformatic resources have been created to build DBs of conserved motifs and to search for instances of such motifs in a seq
        
        Pattern databases
        
        Best known = PROSITE, which contains ~2,000 diff families
        
        PROSITE uses highly conserved regions to create a signature of multiple motifs for each domain family, similar to fingerprints
        
        Typical pattern in PROSITE would be: [ST]-x(2)-[DE]
        
        i.e. a Ser or Thr (S or T), followed by any 2 residues, followed by an Asp or Glu (D or E); this is the consensus sequence of a casein kinase II phosphorylation site
        
        Profile databases
        
        More sensitive tool than simple patterns (PROSITE, PRINTS, BLOCKS, Pfam, InterPro)
        
        InterPro Protein Archive
        
        Central collection of family and domain descriptions linking different resources
        
        Provides access to a range of diagnostic opportunities for a given query through a single interface, i.e. provides a unified front end to the signature databases
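
The hydrophobicity-profile notes above (a number assigned to each AA, averaged over a window, with runs of about 20 hydrophobic residues suggesting a transmembrane helix) can be illustrated with a minimal sketch. This is not how ProtScale or TMpred are implemented. It assumes the Kyte-Doolittle hydropathy scale, a window of 19 and a cut-off of 1.6, none of which are stated in the notes. The function names `hydrophobicity_profile` and `candidate_tm_segments` are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the notes do not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(seq, window=19):
    """Return the mean hydropathy of every full window along seq."""
    # Raises KeyError for residues outside the 20 standard amino acids.
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    if window < 1 or window > len(scores):
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Merge consecutive windows above threshold into (start, end) spans.

    Spans are 0-based with an exclusive end. Each span is the union of the
    qualifying windows, so it overestimates the edges of the hydrophobic run.
    """
    segments = []
    start = None
    end = None
    for i, value in enumerate(hydrophobicity_profile(seq, window)):
        if value > threshold:
            if start is None:
                start = i
            end = i + window
        elif start is not None:
            segments.append((start, end))
            start = None
    if start is not None:
        segments.append((start, end))
    return segments


if __name__ == "__main__":
    # Toy sequence: 20 leucines flanked by charged residues.
    toy = "D" * 20 + "L" * 20 + "K" * 20
    print(candidate_tm_segments(toy))
```

A smaller window gives a noisier profile and a larger one smooths it, which is why window size is the parameter the user is asked to set.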
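
The PROSITE pattern [ST]-x(2)-[DE] from the notes above can be turned into a regular expression to scan a sequence. The sketch below is illustrative only and is not PROSITE's own scanning tool. It handles a subset of the pattern syntax (residues, `x`, `[..]`, `{..}`, repeat counts, and `<` / `>` anchors at the ends), and the names `prosite_to_regex` and `find_motif` are hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (subset of the syntax) into a regex."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (2) or (2,4).
        m = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError("unsupported element: " + repr(element))
        core, low, high = m.groups()
        if core == "x":                                   # any residue
            residue = "."
        elif core.startswith("[") and core.endswith("]"):  # allowed residues
            residue = core
        elif core.startswith("{") and core.endswith("}"):  # forbidden residues
            residue = "[^" + core[1:-1] + "]"
        elif len(core) == 1 and core.isalpha() and core.isupper():
            residue = core
        else:
            raise ValueError("unsupported element: " + repr(element))
        if low and high:
            repeat = "{" + low + "," + high + "}"
        elif low:
            repeat = "{" + low + "}"
        else:
            repeat = ""
        parts.append(prefix + residue + repeat + suffix)
    return "".join(parts)


def find_motif(pattern, seq):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    # The lookahead lets hits that overlap each other all be reported.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(seq.upper())]


if __name__ == "__main__":
    print(prosite_to_regex("[ST]-x(2)-[DE]"))
    print(find_motif("[ST]-x(2)-[DE]", "MKSAAEGTPLDK"))
```

A pattern this short matches by chance in many sequences, so, as with the MBD motif, a hit is only a testable hypothesis about function.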