ALIGNMENT

conservation of important functions

variability

patterns 🔁

most important with db query

phylogenetic analysis

evolutionary relationship 🏈⚽

correspondence

...

proteins

structures (20-30% similar)

...

degree of similarity 🏀⚽

searches

homology

genes/proteins

similarity

degree of similarity (%)

motif

set of characters (NAs or AAs)

contiguous or not

orthology

paralogy

if

25% aligned AAs

75% aligned nucleotides

yes or no answer
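
The rule of thumb above (25% of aligned AAs, 75% of aligned nucleotides) can be sketched as a small check. This is only an illustration: the function names are hypothetical, the two sequences are assumed to be already aligned with `-` as the gap symbol, and identity is counted over non-gap columns, which is one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned (non-gap) columns that hold the same symbol."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns where neither sequence has a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def likely_homologous(aligned_a: str, aligned_b: str, is_protein: bool) -> bool:
    """Yes-or-no answer using the thresholds from the notes (an assumption)."""
    threshold = 25.0 if is_protein else 75.0
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Similarity is a percentage, while the homology call is a yes-or-no answer derived from it; real tools also weigh alignment length and statistical significance, which this sketch ignores.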

homologous sequences used to interpret a new sequence

members of families

proteins that are related

identify unknown sequences

conserved through evolution (i.e. the most important regions)

overlaps during sequencing (long molecules, e.g. DNA)

grammars

previous knowledge on functions

homology preferred

dot matrix

Filtering background noise

noise is high in NA sequences

readability of the graph

window

stringency

e.g. s=7, w=11 \(\Rightarrow\) in every window of 11 characters, keep that part of the sequence if at least 7 are equal

about motif size
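
A minimal sketch of the window/stringency filter for a dot matrix, assuming the filter compares windows along the diagonal and marks the window start position. The function name `dot_matrix` and its return format (a set of index pairs) are hypothetical choices for illustration.

```python
def dot_matrix(seq_a: str, seq_b: str, window: int = 11, stringency: int = 7):
    """Return the (i, j) window start positions that pass the filter.

    A dot is kept at (i, j) when at least `stringency` of the `window`
    characters starting at seq_a[i] and seq_b[j] are equal.
    """
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(1 for k in range(window)
                          if seq_a[i + k] == seq_b[j + k])
            if matches >= stringency:
                dots.add((i, j))
    return dots
```

Raising the stringency (or the window) removes more background noise, which matters most for nucleic acid sequences because a four-letter alphabet produces many chance matches.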

pairwise alignment

substitution matrix

Optimal alignment = max scoring function = min distance function

distance

Levenshtein

Hamming

c1 != c2 \(\Rightarrow\) +1

insertion, deletion, substitution

edit distance

symmetric
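
The two distances can be sketched as follows. This is an illustrative implementation with unit costs for every operation, not a tuned one: Hamming counts mismatching positions in sequences of equal length, while Levenshtein also allows insertions and deletions.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions where the two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(1 for c1, c2 in zip(a, b) if c1 != c2)


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions."""
    # previous[j] holds the distance between the prefix of a processed
    # so far and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, c1 in enumerate(a, start=1):
        current = [i]
        for j, c2 in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (c1 != c2),  # substitution (0 if equal)
            ))
        previous = current
    return previous[-1]
```

Both functions are symmetric in their arguments, so swapping the two sequences gives the same distance.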

BLOSUM

PAM

Percent/Point Accepted Mutations

AAs (1978)

Nucleotides (1991)

AAs (1992)

evolutionary mutations

explicit evolutionary model

construction

1572 accepted mutations

71 groups of related proteins with at least 85% similarity

from probability of changes in related proteins/genes

global

relative / segments of AAs

PAMn = (PAM1)^n
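
The relation PAMn = (PAM1)^n is a matrix power applied to the mutation probability matrix. The sketch below uses a made-up two-symbol matrix purely for illustration (the real PAM1 is 20x20 and its values are not reproduced here); `mat_mul`, `mat_pow` and `TOY_PAM1` are hypothetical names.

```python
def mat_mul(x, y):
    """Multiply two square matrices given as lists of rows."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size))
             for j in range(size)]
            for i in range(size)]


def mat_pow(matrix, n):
    """Raise a square matrix to the n-th power by repeated multiplication."""
    size = len(matrix)
    # Start from the identity matrix.
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, matrix)
    return result


# Hypothetical 2-symbol "PAM1": each symbol has a 1% chance of being replaced.
TOY_PAM1 = [[0.99, 0.01],
            [0.01, 0.99]]
TOY_PAM250 = mat_pow(TOY_PAM1, 250)
```

Each row of the result still sums to 1, and the diagonal shrinks as n grows, which matches the idea that a larger n models a longer evolutionary distance. The scoring matrices used in practice are log-odds values derived from these probabilities.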

BLOSUMn \(\rightarrow\) n = % similarity threshold used to cluster the sequences of the blocks

PAM250 most used

Blocks Substitution Matrices

block = highly conserved region without gaps

BLOSUM62 standard

large n for evolutionary relationships that are close in time

implicit evolutionary model

based on the primary structure of the sequences

gaps

consecutive gaps likely to be related to the same mutation

recommended by ClustalW

DNA

AAs (BLOSUM62)

g0 = 10

ge = 0.1*l

ge = 1*l

g0 = 11
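
A sketch of an affine gap penalty built from an opening cost g0 and an extension cost ge per gap position, assuming the form g0 + ge*l for a gap of length l (some tools charge the extension only for l - 1 positions). The function names are hypothetical, and the parameter values are passed in rather than tied to DNA or AAs.

```python
from itertools import groupby


def affine_gap_penalty(length: int, g0: float, ge: float) -> float:
    """Penalty for one run of `length` consecutive gaps: g0 + ge * length."""
    if length <= 0:
        return 0.0
    return g0 + ge * length


def total_gap_penalty(aligned: str, g0: float, ge: float, gap: str = "-") -> float:
    """Sum the affine penalty over every run of gap symbols in one aligned row."""
    return sum(affine_gap_penalty(len(list(run)), g0, ge)
               for symbol, run in groupby(aligned)
               if symbol == gap)
```

Because the opening cost is paid once per run, one gap of length 4 is cheaper than two gaps of length 2, which reflects the idea that consecutive gaps are likely to come from the same mutation.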

minimize gaps(insertions/deletions)

maximize aligned symbols

minimize different symbols aligned
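
The three goals above can be combined into one score that a global pairwise alignment maximizes. The sketch below is a score-only dynamic programming pass in the style of Needleman-Wunsch, with a simple linear gap cost and made-up default scores (`match`, `mismatch`, `gap` are assumptions); a substitution matrix such as PAM or BLOSUM would replace the match/mismatch values in practice.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1, mismatch: int = -1,
                           gap: int = -2) -> int:
    """Best global alignment score of a and b under a linear gap cost."""
    # previous[j] = best score aligning the processed prefix of a
    # with the first j characters of b.
    previous = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        current = [i * gap]
        for j, y in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if x == y else mismatch)
            current.append(max(
                diagonal,             # align x with y
                previous[j] + gap,    # gap in b
                current[j - 1] + gap  # gap in a
            ))
        previous = current
    return previous[-1]
```

For example, aligning `ACGT` with `AGT` under these scores gives 3 matches and one gap, so the best score is 1.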