Protein Structure
Amino Acids
Types
Polar- tyrosine, tryptophan, asparagine, glutamine, cysteine, serine, threonine
nonpolar- glycine, alanine, valine, leucine, isoleucine, methionine, phenylalanine
charged- histidine, aspartate, glutamate, lysine, arginine
Tyrosine, tryptophan and phenylalanine absorb UV light and give unique UV spectra
All have at least 2 ionisable groups
pH alteration
pKa
it is the pH at which an ionisable group spends 50% of the time charged and 50% uncharged
the group is protonated when the pH is less than its pKa, since it is surrounded by protons
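The pKa definition above corresponds to the Henderson-Hasselbalch relationship. A minimal sketch, assuming a single ionisable group that titrates independently of its neighbours; the function name and the example pKa of 6.0 (roughly a histidine side chain) are illustrative assumptions, not values from these notes.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of a single ionisable group that still carries its proton.

    From Henderson-Hasselbalch:
        [HA] / ([HA] + [A-]) = 1 / (1 + 10 ** (pH - pKa))
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


if __name__ == "__main__":
    example_pKa = 6.0  # illustrative assumption, roughly a histidine side chain
    for pH in (4.0, 6.0, 8.0):
        frac = fraction_protonated(pH, example_pKa)
        print(f"pH {pH}: {frac:.3f} protonated")
```

At pH equal to the pKa the function returns exactly 0.5, matching the 50% charged / 50% uncharged definition; two pH units below the pKa the group is about 99% protonated.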
isoelectric point
the pH at which the protein carries no net charge; least soluble state since there is less interaction with water and no repulsion between protein molecules
Modifications
post-translational modification by specialised enzymes
modification by phosphorylation, which adds negative charge
Motifs are patterns in the sequence linked with a particular function
Primary Structure
interactions
weak noncovalent interactions in water drive folding
leads to hydrophobic effect
Bonds
Peptide
condensation reaction
C-N bond is planar (partial double-bond character) so no free rotation
Hydrogen
they are directional
due to electronegative atoms with lone pairs
Cis
cause side chains to collide
Trans
no side chain collisions
preferred over cis
Secondary Structure
Alpha helix
hydrogen bonds
between every fourth amino acid
all H-bonds in same orientation which creates a dipole so N-terminus is positive and C-terminus negative
maximises bonding
Pro and Gly are rarely found in helices; in Pro the amino group is covalently bonded to a carbon in the side chain, which prevents stabilisation through normal hydrogen bonding
side groups
usually stick out
polar side chains on one face and nonpolar on the other makes it amphipathic
3.6 residues per turn
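The 3.6 residues per turn figure means each residue is rotated 100 degrees around the helix axis relative to the previous one, which is why side chains spaced 3 or 4 residues apart end up on the same face. A minimal helical-wheel sketch; the function names are hypothetical and the helix is assumed to be ideal.

```python
RESIDUES_PER_TURN = 3.6
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN  # 100 degrees per residue


def wheel_angles(n_residues: int) -> list:
    """Angular position (0-360 degrees) of each residue viewed down the helix axis."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(n_residues)]


def angular_separation(i: int, j: int) -> float:
    """Smallest angle (degrees) between residues i and j on the helical wheel."""
    diff = (abs(i - j) * DEGREES_PER_RESIDUE) % 360.0
    return min(diff, 360.0 - diff)


if __name__ == "__main__":
    for offset in (1, 2, 3, 4, 7):
        sep = angular_separation(0, offset)
        print(f"i and i+{offset}: {sep:.0f} degrees apart")
```

Residues i, i+3, i+4 and i+7 all lie within 60 degrees of each other, so placing nonpolar residues at that spacing gives one nonpolar face and an amphipathic helix.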
Beta Sheet
they consist of laterally packed beta strands (5-8 residues)
stabilised by hydrogen bonds between the carbonyl O atom of each residue in one beta strand and the amide H atom of a residue in an adjacent beta strand; peptide bonds link the residues within each strand
can be parallel or anti-parallel
Beta turn
tight, since they contain only 3-4 amino acids, and reverse the direction of the polypeptide backbone
often contain Pro and Gly; the lack of a large side chain in Gly and the built-in bend in Pro allow tight bending
Tertiary structure
Categories
Globular
hydrophilic residues on the surface so are generally water soluble
hydrophobic inside
Fibrous
Comparison
Domains
Quaternary Structure
can be dimers, trimers, tetramers (icosahedral assemblies are found in viruses)
interface is usually nonpolar
it is a combination of either homomeric (identical) or heteromeric (different) protein subunits held together by noncovalent bonds
Folding
typically a folded protein is about 50 kJ/mol more stable than the unfolded form
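To get a feel for what this stability means, the free energy difference can be converted into an equilibrium ratio of folded to unfolded molecules. A rough sketch, assuming a simple two-state (folded/unfolded) model, a temperature of 298 K, and reading the 50 kJ figure above as 50 kJ/mol; none of these assumptions come from the notes themselves.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)


def folded_to_unfolded_ratio(stability_kj_per_mol: float,
                             temperature_k: float = 298.0) -> float:
    """Equilibrium ratio [folded]/[unfolded] in a two-state model.

    stability_kj_per_mol: how much lower in free energy the folded state is
    (a positive number), so ratio = exp(stability / (R * T)).
    """
    return math.exp(stability_kj_per_mol * 1000.0 / (GAS_CONSTANT * temperature_k))


if __name__ == "__main__":
    ratio = folded_to_unfolded_ratio(50.0)
    print(f"folded : unfolded is roughly {ratio:.1e} : 1")
```

Under these assumptions the ratio is on the order of 10^8 to 1, so even a modest free energy difference keeps almost every molecule folded.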
Entropy
protein entropy decreases
offset by large increase in entropy of solvent
unfolded protein has hydrophobic side chains facing water which is unfavourable
Bonds
Disulphide bonds
Refolding experiments revealed incorrect disulphide formation prevents correct structure
Levinthal paradox- assume each of 100 amino acids can take one of three bond-angle conformations; a random search for the correct structure would then take about 1.6x10^27 years, so the searching process is not random
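The arithmetic behind the Levinthal estimate can be checked directly. A small sketch; the sampling rate of 10^13 conformations per second is an assumption not stated in these notes, chosen because it reproduces the 1.6x10^27 years figure.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def random_search_years(n_residues: int = 100,
                        states_per_residue: int = 3,
                        samples_per_second: float = 1e13) -> float:
    """Years needed to visit every conformation once in a purely random search.

    samples_per_second is an assumed sampling rate, not a measured value.
    """
    conformations = states_per_residue ** n_residues  # 3**100 for the default case
    return conformations / samples_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{random_search_years():.1e} years")
```

Since real proteins fold in milliseconds to seconds, the comparison shows that folding must follow a guided pathway rather than sample conformations at random.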
Progressive folding
unfolded polypeptide undergoes folding via series of partially folded intermediates
once a few amino acids are correctly positioned, the rest of the structure forms around them
this is because each step moves the protein down the energy landscape towards more stable states
It is the linear covalent arrangement of amino acid residues that compose it, linked by peptide bonds
Oligopeptides are short chains of amino acids linked by peptide bonds; longer chains are referred to as polypeptides (200-500 residues)
form between carbonyl oxygen atom and the amide hydrogen atom
can curve around to form a cylinder called a beta barrel; when these proteins are embedded in membranes, the cylindrical beta sheet can form a hydrophilic central pore through which ions and small molecules may flow
stabilised by a hydrogen bond between end residues
Can be covalently modified by phosphorylation or glycosylation, which can alter the mass of those residues
Parts of a polypeptide that don't form secondary structures but have a well-defined, stable shape are said to have an irregular structure. Highly flexible parts with no stable, fixed 3D structure are said to be random coil
the carbonyl oxygen atom of each peptide bond is H bonded to the amide H atom of the amino acid 4 residues farther along in the C-terminal direction
all the backbone amino and carboxyl groups are H bonded to one another (conferring substantial stability) except at beginning and end of helix
Hydrophilic helices have polar side chains extending outward on the outer surfaces (so they can interact with the aqueous environment)
Hydrophobic helices with nonpolar side chains tend to be buried within the core of the folded protein
the first and fourth residues are usually less than 0.7nm apart and those residues are often linked by a H bond
Structural motifs
They are the particular combination of 2 or more secondary structures that form a 3D structure
Its role is often associated with a specific function like binding
some are stable after being isolated from the rest of the protein but others do not form thermodynamically stable structures in the absence of other portions of the protein
Coiled-coil
alpha helices from multiple separate polypeptide chains coil about one another
many fibrous proteins and transcription factors assemble into dimers or trimers by using this motif
The helices bind because each has a strip of aliphatic side chains that interacts with a similar strip in the adjacent helix, sequestering hydrophobic groups away from water and stabilising the assembly of multiple independent helices; the hydrophobic strip runs along only one side because the primary structure of each helix is composed of heptad repeats (in which the side chains of the first and fourth residues are aliphatic and the others are often hydrophilic)
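The heptad repeat pattern can be illustrated by labelling each residue with its position in the repeat (a to g) and pulling out the first and fourth positions (a and d). A minimal sketch; the test sequence is made up for illustration, and treating A, V, L and I as the aliphatic set is an assumption.

```python
HEPTAD_POSITIONS = "abcdefg"
ALIPHATIC = set("AVLI")  # assumed set of aliphatic residues (one-letter codes)


def heptad_register(sequence: str) -> list:
    """Pair each residue with its heptad position, assuming residue 1 is position 'a'."""
    return [(res, HEPTAD_POSITIONS[i % 7]) for i, res in enumerate(sequence.upper())]


def core_residues(sequence: str) -> str:
    """Residues at the first (a) and fourth (d) positions of every heptad."""
    return "".join(res for res, pos in heptad_register(sequence) if pos in "ad")


def core_aliphatic_fraction(sequence: str) -> float:
    """Fraction of a/d positions occupied by aliphatic residues."""
    core = core_residues(sequence)
    return sum(res in ALIPHATIC for res in core) / len(core) if core else 0.0


if __name__ == "__main__":
    hypothetical = "LEELKKK" * 3  # made-up idealised heptad, not a real protein
    print(core_residues(hypothetical), core_aliphatic_fraction(hypothetical))
```

A sequence whose a and d positions are mostly aliphatic while the other positions are hydrophilic is a candidate for forming the hydrophobic strip described above.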
EF hand
this is a common calcium binding motif that contains 2 short alpha helices connected by a loop
Ca2+ binds to oxygen atoms in conserved residues in the loop when the concentration of Ca2+ in the cell is high enough; the binding can induce a conformational change in the protein
Interactions
stabilised primarily by hydrophobic interactions between nonpolar side chains, together with van der Waals interactions and H bonds involving both polar side chains and backbone amino and carboxyl groups
because the interactions stabilising tertiary structure are often weaker than those stabilising secondary structure, the tertiary structure is not rigidly fixed but can continually fluctuate
Bonds
Disulphide bonds can covalently link regions of the proteins which restricts protein flexibility and increases stability
protein solubility can increase as amino acids with charged hydrophilic polar side chains tend to be on outer surfaces and interact with water
They are the distinct regions of protein structure
3 classes
functional domain
structural domain
region of protein that exhibits particular activity characteristic of that protein even when isolated from rest of protein
e.g. kinase domain that covalently adds a phosphate group to another molecule
often identified by using proteases that cleave one or more peptide bonds in a target polypeptide
is a region of about 40 or more amino acids arranged in a single, stable and distinct structure, often comprising one or more secondary structures
they can fold independently of the rest of the protein
distinct domains can be linked together by spacers to form a large multidomain protein (like hemagglutinin)
They are often functional domains since they can have independent activity
they can be recognised on proteins whose structures have been determined by x-ray crystallography, NMR analysis or electron microscopy
EGF domain
present in several proteins; EGF itself is a small, soluble peptide hormone that binds to cells in the embryo and in the skin of adults
generated by proteolytic cleavage between repeated EGF domains in EGF precursor protein anchored in plasma membrane
found in tissue plasminogen activator (dissolves blood clots), Neu protein (embryonic differentiation), Notch protein (receptors in membrane for development signalling)
Topological domain
they are defined by spatial relationship to rest of the protein
each domain can comprise multiple structural and functional domains
Proteins with similar sequences tend to have similar structures, so a known structure can be used to model a related protein
the greater the similarity in sequences of 2 polypeptide chains the more likely they are to have similar 3D structures and function
Proteins that have a common ancestor are referred to as homologs; evidence for this is similarity in their sequences
generally thought that proteins with about 30% sequence identity (exact amino acid matches) are likely to have similar 3D structures
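Percent sequence identity can be computed from two sequences that have already been aligned. A minimal sketch, assuming the alignment is supplied with '-' for gaps and that gap columns are excluded from the count (one of several conventions in use); the function name and example sequences are made up for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of exact amino acid matches over the ungapped aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue (no gap).
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned fragments, not real proteins.
    identity = percent_identity("MKTAYIAK", "MKSAY-AR")
    print(f"{identity:.1f}% identical")
    print("above ~30% threshold" if identity >= 30.0 else "below ~30% threshold")
```

The 30% cut-off is the rule of thumb from the notes, not a hard boundary; short alignments can reach high identity by chance.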
they are compactly folded structures (often spheroidal)
They are large, elongated, stiff molecules
some are composed of a long polypeptide chain comprising many tandem copies of a short amino acid sequence motif that forms a single repeating secondary structure
often made up of helical polypeptide chains like alpha helices, triple helices and multi-stranded coiled coils
they are composed of repeating globular protein subunits
Integral membrane
embedded in phospholipid bilayer
membrane spanning domain comprises one or more roughly 20 residue long alpha helices; some comprise beta barrels
Disordered
meaning that they do not form thermodynamically stable structures, and they are exceptionally flexible in conformation
phosphorylation of the disordered C-terminal domain of RNA pol II, which is composed of multiple repeats of a 7 amino acid sequence containing Pro, Thr and Ser, regulates mRNA synthesis
Sometimes the entire polypeptide chain is disordered, so the protein does not have a well-ordered structure in its native state; these proteins are called intrinsically disordered proteins. They usually serve as signalling molecules, activity regulators or scaffolds for multiple proteins, small molecules and ions
Intrinsically disordered proteins and disordered regions can be identified by tests of protease sensitivity (since they usually exhibit greater protease sensitivity) and by spectroscopy
Disordered segments tend to be richer in polar amino acids and proline and poorer in hydrophobic residues
In some cases an intrinsically disordered protein (or region) can transition into a highly ordered structure
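The compositional bias of disordered segments noted above can be measured with a simple residue count. A rough sketch, assuming the polar and nonpolar groupings from the Types list at the top of these notes; the example heptad YSPTSPS is the commonly cited consensus repeat of the RNA pol II C-terminal domain and is brought in from outside these notes. This is a composition tally only, not a disorder predictor.

```python
# Groupings follow the "Types" list in these notes (one-letter codes).
POLAR = set("YWNQCST")
NONPOLAR = set("GAVLIMF")


def composition(sequence: str) -> dict:
    """Fractions of polar, proline and nonpolar residues in a sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    return {
        "polar": sum(res in POLAR for res in seq) / n,
        "proline": seq.count("P") / n,
        "nonpolar": sum(res in NONPOLAR for res in seq) / n,
    }


if __name__ == "__main__":
    ctd_heptad = "YSPTSPS"  # consensus repeat, an outside assumption
    for group, fraction in composition(ctd_heptad).items():
        print(f"{group}: {fraction:.2f}")
```

For this repeat every residue is either polar or proline and none is nonpolar, in line with the trend described for disordered regions.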
sometimes the individual monomer subunits cannot function unless assembled; in other cases multimeric assembly allows proteins that act sequentially in a pathway to increase their efficiency
supramolecular complexes
these structures are very large, with tens to hundreds of polypeptide chains
mass can exceed 1 megadalton and size can approach 300nm
the capsid that encases the nucleic acids of a viral genome and cytoskeleton are structural examples
One example is the molecular machine responsible for synthesising mRNA; it involves RNA polymerase and at least 50 additional components
The nuclear pore complex allows macromolecules to pass the nuclear membrane and is composed of multiple copies of 30 distinct proteins, with a mass of 50 MDa
Biomolecular condensates
they are membraneless compartments in cells
unlike supramolecular complexes, their components do not have a fixed stoichiometry or a fixed quaternary structural arrangement, and they can vary in size
like vesicles they can break apart into smaller liquid droplets or fuse into larger ones
the capacity of a protein to form a condensate depends on structure, concentration and conditions
they contain multiple domains that have ability to bind to regions of other proteins or nucleic acids, and when a protein does bind it oligomerises at the oligomerisation sites
examples
Wnt signalling
several proteins assemble into condensates (including APC2 and Axin)
fluorescently labelled versions of D. melanogaster APC2 and Axin were expressed in mammalian cells and assembled into spherical condensates
nucleoli
they are sites of ribosomal subunit synthesis
P-bodies
sites of translational repression and mRNA degradation
stabilises structure but removal does not alter structure
Peptide Bonds
This bond behaves somewhat like a planar double bond; the portions on either side of the peptide bond can be orientated in either trans or cis configuration relative to the peptide bond
In the bond, the carbonyl carbon, the amide nitrogen and the atoms directly bonded to them all lie in a fixed plane; little rotation about the peptide bond itself is possible
the only flexibility in a polypeptide chain is the rotation of the fixed planes of adjacent peptide bonds with respect to one another, about the bond between the amino nitrogen and the alpha carbon and the bond between the alpha carbon and the carbonyl carbon
Cells use peptidyl-prolyl isomerases (PPIases) to catalyse cis/trans isomerisations so that each Pro in a folding protein adopts the proper isomer; these isomerisations can drastically alter protein structure
Comes from primary structure
The native state of any protein that is not intrinsically disordered will adopt one or a very few closely related, stable conformations
the features of a well-ordered protein that limit its conformations are side chain properties and the polypeptide backbone sequence (e.g. Trp has a large side chain, which prevents tight packing; Arg has a positively charged side chain, which attracts negative charges)
Denaturation
can be induced by thermal energy, pH extremes or denaturants (like urea or guanidine hydrochloride at 6-8M); treatment with reducing agents (like beta-mercaptoethanol) can destabilise the protein further
Spontaneous unfolding occurs because of the increase in entropy: a denatured protein can adopt many non-native conformations (increased disorder)
However when a pure sample is placed back in normal conditions some denatured proteins can spontaneously refold like in Anfinsen's experiments
so the information contained in the primary structure is enough to direct correct folding
similar 3D structure of proteins with similar amino acid sequences
Chaperones
total cytosolic protein concentration can be 300mg/ml in mammalian cells; these high concentrations favour the formation of aggregates by increasing the chance that a nascent protein will encounter other proteins before it has folded and so aggregate into a water-insoluble mass due to the hydrophobic effect
Intrinsically disordered proteins are less likely to form deleterious aggregates as they have fewer hydrophobic side chains, whereas newly synthesised proteins are at high risk as they are not yet properly folded
95% of proteins in cells are in their native conformation; without chaperones the cell would waste too much energy destroying aggregates
chaperones facilitate folding by preventing aggregation, by binding to the polypeptide or sequestering it from other partially unfolded proteins so that the nascent protein has time to fold correctly
They help fold newly made proteins or refold misfolded or unfolded proteins; sometimes the protein fails to fold so the chaperones re-engage for additional cycles
chaperones can disassemble potentially toxic protein aggregates that form due to misfolding
They bind to client proteins and use a cycle of ATP binding, hydrolysis and exchange to induce conformational changes. ATP used for enhanced binding, switching own configuration, optimise folding and returning to initial state
2 families
Molecular chaperones
They bind to a short segment of protein and stabilise unfolded/partially folded proteins and prevent aggregation and degradation
Bind to the nascent chain as it is being synthesised and leaving the ribosome
2 types
Hsp70
it is heat shock protein in cytosol
When an ATP is bound at nucleotide binding domain Hsp70 is in open conformation
co-chaperone accessory proteins (DnaJ/Hsp40) stimulate ATP hydrolysis (increasing the hydrolysis rate 100-1000 times); this induces a large conformational change in the substrate binding domain, producing the closed conformation, in which the substrate is tightly locked in
exchange of the bound ADP for cytosolic ATP, stimulated by other proteins (GrpE/BAG1), converts it back to the open conformation, releasing the substrate and freeing it to continue folding
if the released substrate does not fold properly it can rebind to another chaperone and repeat
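The Hsp70 cycle described above can be summarised as a two-state machine driven by ATP hydrolysis and nucleotide exchange. A minimal sketch for revision purposes; the state and event names are hypothetical labels, and the model ignores kinetics and the substrate itself.

```python
from enum import Enum


class Hsp70State(Enum):
    ATP_OPEN = "ATP bound, open conformation, substrate free to bind or leave"
    ADP_CLOSED = "ADP bound, closed conformation, substrate locked in"


# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    # ATP hydrolysis, stimulated by DnaJ/Hsp40, closes the substrate binding domain.
    (Hsp70State.ATP_OPEN, "atp_hydrolysis"): Hsp70State.ADP_CLOSED,
    # ADP-for-ATP exchange, stimulated by GrpE/BAG1, reopens it and releases substrate.
    (Hsp70State.ADP_CLOSED, "nucleotide_exchange"): Hsp70State.ATP_OPEN,
}


def step(state: Hsp70State, event: str) -> Hsp70State:
    """Apply one event; raise ValueError if it is not allowed in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}") from None


def run_cycle(events, start: Hsp70State = Hsp70State.ATP_OPEN) -> list:
    """Return the list of states visited, starting from `start`."""
    history = [start]
    for event in events:
        history.append(step(history[-1], event))
    return history


if __name__ == "__main__":
    for state in run_cycle(["atp_hydrolysis", "nucleotide_exchange"]):
        print(state.name, "-", state.value)
```

One hydrolysis step followed by one exchange step returns the chaperone to the open state, ready to rebind a substrate that has not yet folded.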
Hsp90
they are strongly conserved in evolution (sequence similarity is 55%)
They help cells fold partially folded client proteins and cope with denatured proteins from stress
they can form a relatively stable complex with a client until signal causes dissociation
unlike Hsp70, they function as a dimer in a cycle. Rapid ATP binding leads to conformational change in which nucleotide binding domains and substrate binding domains move together into a closed conformation
In the closed conformation the client protein may undergo folding; binding of clients to Hsp90 occurs at different points
ATP hydrolysis causes changes that may include a highly compact form, client folding, client protein release and additional folding of the unbound client. ADP is then released
Chaperonins
these are huge cylindrical supramolecular assemblies with central chambers into which the protein enters; the chamber allows folding and is formed from 2 rings of oligomers
2 groups
Group I
found in prokaryotes, chloroplasts and mitochondria
composed of 2 rings each with 7 subunits and each ring is a folding chamber into which an unfolded protein enters
GroEL/GroES
Group II
found in cytosol of eukaryotes and archaea
They have 8-9 homomeric or heteromeric subunits in each ring and the lid function is incorporated into those subunits
ATP hydrolysis triggers lid closing
is thought to participate in the folding of 10% of all proteins
a misfolded protein enters a chamber, which is then blocked by a GroES lid; each ring of 7 GroEL subunits binds 7 ATPs and hydrolyses them to coordinate GroES and protein binding, folding and release
GroEL rings control binding of the GroES lid to seal the chamber
the polypeptide remains in the chamber capped by the lid and undergoes folding until ATP hydrolysis induces binding of ATP and a different GroES to the other ring
this binding causes the GroES lid and ADP to be released, which opens the chamber and lets out the folded protein
TRiC
ATP binding and hydrolysis in the presence of a bound client protein leads to closing of the lid and folding of the client in the sequestered environment within the folding chamber
release of Pi opens the lid and releases the substrate and ADP
Misfolding diseases
aggregates can either be amorphous or be well organised, most commonly in the amyloid state
many proteins can each aggregate into amyloid fibrils that have a cross-beta sheet structure, in which each strand is nearly perpendicular to the long axis; 2 long beta sheets pack closely together and twist around each other to form protofilaments, which then assemble together into amyloid fibrils
amyloids are associated with amyloidosis diseases, each characterised by the presence of filamentous plaques in the deteriorating brain
in Alzheimer's a hyperphosphorylated form of protein tau forms twisted fibres called tangles which are relatively short, water-soluble protofilaments or long insoluble fibrils