Protein Structure

Amino Acids

Types

Polar- tyrosine, tryptophan, asparagine, glutamine, cysteine, serine, threonine

nonpolar- glycine, alanine, valine, leucine, isoleucine, methionine, phenylalanine

charged- histidine, aspartate, glutamate, lysine, arginine

Tyrosine, tryptophan and phenylalanine absorb UV light and give unique UV spectra

All have at least 2 ionisable groups

pH alteration

pKa

it is the pH at which an ionisable group spends 50% of the time charged and 50% uncharged

a group is protonated when the pH is less than its pKa, since it is surrounded by protons
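
The link between pH, pKa and protonation can be sketched with the Henderson-Hasselbalch relation. That relation is not named in these notes, so treat it as an added assumption; the function name and the example pKa are illustrative only.

```python
def fraction_protonated(pH, pKa):
    """Fraction of an ionisable group that is protonated at a given pH.

    Uses the Henderson-Hasselbalch relation: [A-]/[HA] = 10 ** (pH - pKa).
    """
    ratio_deprotonated_to_protonated = 10 ** (pH - pKa)
    return 1.0 / (1.0 + ratio_deprotonated_to_protonated)


# Illustrative pKa only (an assumption, not a value from the notes).
example_pKa = 4.0
for pH in (2.0, 4.0, 6.0):
    print(pH, round(fraction_protonated(pH, example_pKa), 3))
```

At pH equal to pKa the function returns one half, matching the 50% definition; below the pKa the protonated form dominates.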

isoelectric point

pH at which the protein has no net charge; least soluble state, since there is less interaction with water and no charge repulsion between protein molecules
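
A minimal sketch of how an isoelectric point could be estimated: sum the average charge of each ionisable group and search for the pH where the total is zero. The function names are hypothetical, and the pKa values in the usage line are illustrative assumptions (roughly those of a free amino acid with no ionisable side chain), not values from these notes.

```python
def net_charge(pH, acidic_pKas, basic_pKas):
    """Average net charge of a molecule at a given pH.

    acidic_pKas: groups that are negative when deprotonated (e.g. carboxyl).
    basic_pKas:  groups that are positive when protonated (e.g. amino).
    """
    negative = sum(-1.0 / (1.0 + 10 ** (pKa - pH)) for pKa in acidic_pKas)
    positive = sum(1.0 / (1.0 + 10 ** (pH - pKa)) for pKa in basic_pKas)
    return negative + positive


def isoelectric_point(acidic_pKas, basic_pKas, low=0.0, high=14.0):
    """Bisection search for the pH at which the net charge is zero."""
    for _ in range(60):
        mid = (low + high) / 2.0
        if net_charge(mid, acidic_pKas, basic_pKas) > 0:
            low = mid    # still net positive, so the pI lies at higher pH
        else:
            high = mid   # net negative (or zero), so the pI lies at lower pH
    return (low + high) / 2.0


# Illustrative pKa values only (assumed, not from the notes).
print(round(isoelectric_point([2.3], [9.6]), 2))
```

Bisection works here because net charge only ever decreases as pH rises, so there is a single crossing point.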

Modifications

post-translational modification by specialised enzymes

modification by phosphorylation, which adds negative charge

Motifs are patterns in the sequence linked with a particular function

Primary Structure

interactions

weak interactions in water so folding

leads to hydrophobic effect

Bonds

Peptide

condensation reaction

C-N bond is planar (partial double-bond character) so no rotation

Hydrogen

they are directional

due to electronegativity of lone pairs

Cis

cause side chains to collide

Trans

no side chain collisions

preferred over cis

Secondary Structure

Alpha helix

hydrogen bonds

between every fourth amino acid

all H-bonds in same orientation which creates a dipole so N-terminus is positive and C-terminus negative

maximises bonding

Pro (because the covalent bonding of its amino group to a carbon in the side chain prevents stabilisation through normal hydrogen bonding) and Gly are rarely found in helices

side groups

usually stick out

polar side chains on one face and nonpolar on the other make it amphipathic

3.6 residues per turn
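
With 3.6 residues per turn, each residue is rotated about 100 degrees around the helix axis from the previous one. The sketch below lays a sequence out on a "helical wheel" so that a nonpolar face can be spotted. The function name and the example sequence are hypothetical, and the nonpolar set is simply the list given earlier in these notes written as one-letter codes.

```python
RESIDUES_PER_TURN = 3.6                           # value given in the notes
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN   # about 100 degrees

# Nonpolar residues as listed earlier in these notes (one-letter codes).
NONPOLAR = set("GAVLIMF")


def helical_wheel(sequence):
    """Return (residue, angle in degrees, is_nonpolar) for each position."""
    return [
        (res, round((i * DEGREES_PER_RESIDUE) % 360.0, 1), res in NONPOLAR)
        for i, res in enumerate(sequence)
    ]


# Hypothetical sequence, used only to show the output format.
for residue, angle, nonpolar in helical_wheel("LKKLLKKL"):
    print(residue, angle, "nonpolar" if nonpolar else "polar/charged")
```

Residues whose angles cluster on one side of the wheel form one face of the helix; if those are mostly nonpolar and the opposite side is polar, the helix is amphipathic.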

Beta Sheet

they consist of laterally packed β strands (5-8 residues)

stabilised by hydrogen bonds (between the carbonyl O atom of each residue in one β strand and the amide H atom of a residue in an adjacent β strand) and peptide bonds

can be parallel or anti-parallel

Beta turn

tight since only 3-4 amino acids and reverse the direction of the polypeptide backbone

often have Pro and Gly, the lack of large side chain in Gly and presence of a built in bend in Pro allows tight bending

Tertiary structure

Categories

Globular

hydrophilic residues on the surface so are generally water soluble

hydrophobic inside

Fibrous

Comparison

Domains

Quaternary Structure

can be dimers, trimers, tetramers (icosahedral are found in viruses)

interface is usually nonpolar

it is a combination of either homomeric (identical) or heteromeric (different) protein subunits held together by noncovalent bonds

Folding

typically a folded protein is about 50 kJ/mol more stable than the unfolded form
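
To get a feel for what a stability difference of this size means, the folded/unfolded equilibrium ratio can be estimated from K = exp(-ΔG/RT). This is a rough sketch: it assumes the 50 kJ figure is per mole, a simple two-state equilibrium and a temperature of 298.15 K, none of which is stated in the notes, and the function name is hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def folded_to_unfolded_ratio(delta_g_fold_kj_per_mol, temperature_k=298.15):
    """Equilibrium ratio [folded]/[unfolded] for a two-state model.

    delta_g_fold_kj_per_mol is negative when the folded state is more stable.
    """
    delta_g_joules = delta_g_fold_kj_per_mol * 1000.0
    return math.exp(-delta_g_joules / (R * temperature_k))


# Folded state assumed 50 kJ/mol more stable than the unfolded state.
print(f"{folded_to_unfolded_ratio(-50.0):.2e}")
```

Even a modest free energy difference gives a very large ratio, which is why almost all molecules of a stable protein are folded at any moment.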

Entropy

protein entropy decreases

offset by large increase in entropy of solvent

unfolded protein has hydrophobic side chains facing water which is unfavourable
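
The entropy argument above can be written as a free energy balance. This is the standard Gibbs relation applied to protein plus solvent; the subscripts are labels introduced here for illustration and do not appear in the notes.

```latex
\Delta G_{\text{fold}} = \Delta H_{\text{fold}} - T\,\Delta S_{\text{total}},
\qquad
\Delta S_{\text{total}} =
  \underbrace{\Delta S_{\text{protein}}}_{<\,0}
  + \underbrace{\Delta S_{\text{solvent}}}_{>\,0}
```

Folding is favourable when ΔG_fold is negative, which is possible despite the loss of protein entropy because the solvent term is large and positive.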

Bonds

Disulphide bonds

Refolding experiments revealed incorrect disulphide formation prevents correct structure

Levinthal paradox- assume each residue can take one of three bond angles; for 100 amino acids the time taken to find the correct structure by random search is 1.6x10^27 years, so the searching process is not random

Progressive folding

unfolded polypeptide undergoes folding via series of partially folded intermediates

after a few amino acids have been correctly established the rest forms

this is because of the energy landscape becoming more stable

It is the linear covalent arrangement of amino acid residues that compose it, linked by peptide bonds

Oligopeptides are short chains of amino acids linked by peptide bonds and longer chains are referred to as polypeptides (200-500 residues)

form between carbonyl oxygen atom and the amide hydrogen atom

can curve around and form a cylinder, called a beta barrel; when these proteins are embedded in membranes the cylindrical beta sheet can form a hydrophilic central pore through which ions and small molecules may flow

stabilised by a hydrogen bond between end residues

Can be covalently modified by phosphorylation or glycosylation, which can alter the mass of those residues

Parts of a polypeptide that don't form secondary structures but have a well-defined, stable shape have an irregular structure. Highly flexible parts with no stable, fixed 3D structure are described as random coil

the carbonyl oxygen atom of each peptide bond is H bonded to the amide H atom of the amino acid 4 residues farther in C terminus direction

all the backbone amino and carboxyl groups are H bonded to one another (conferring substantial stability) except at beginning and end of helix
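
The i to i+4 pattern described above can be enumerated directly, which also shows why the ends of the helix are left without partners. A minimal sketch with a hypothetical function name; residues are numbered from 1 at the N terminus.

```python
def helix_hbond_pairs(n_residues, offset=4):
    """Pairs (i, i + 4): carbonyl O of residue i bonded to amide H of residue i + 4."""
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


pairs = helix_hbond_pairs(10)
print(pairs)

# Residues whose amide H has no partner (start of helix) and
# residues whose carbonyl O has no partner (end of helix).
donors = {j for _, j in pairs}
acceptors = {i for i, _ in pairs}
print("unpaired N-H:", [r for r in range(1, 11) if r not in donors])
print("unpaired C=O:", [r for r in range(1, 11) if r not in acceptors])
```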

Hydrophilic helices have polar side chains extending outward on the outer surfaces (so they can interact with the aqueous environment)

Hydrophobic helices with nonpolar side chains tend to be buried within the core of the folded protein

the first and fourth residues are usually less than 0.7 nm apart and those residues are often linked by an H bond

Structural motifs

They are the particular combination of 2 or more secondary structures that form a 3D structure

Its role is often associated with a specific function like binding

some are stable after being isolated from the rest of the protein but others do not form thermodynamically stable structures in the absence of other portions of the protein

Coiled-coil

α helices from multiple separate polypeptide chains coil about one another

many fibrous proteins and transcription factors assemble into dimers or trimers by using this motif

They bind because each helix has a strip of aliphatic side chains that interacts with a similar strip in the adjacent helix, sequestering hydrophobic groups away from water and stabilising the assembly of multiple independent helices; the hydrophobic strip runs along only one side because the primary structure of each helix is composed of heptad repeats (in which the side chains of the first and fourth residues are aliphatic and the others are often hydrophilic)
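
The heptad repeat can be sketched as a register a-g laid over the sequence, with the first and fourth positions (a and d) expected to be aliphatic. The function names, the choice of A, V, L and I as the aliphatic set, and the example sequences are all assumptions made for illustration.

```python
HEPTAD_POSITIONS = "abcdefg"

# Aliphatic side chains (Ala, Val, Leu, Ile); this set is an assumption.
ALIPHATIC = set("AVLI")


def heptad_register(sequence, start="a"):
    """Pair each residue with its heptad position, beginning at `start`."""
    offset = HEPTAD_POSITIONS.index(start)
    return [
        (res, HEPTAD_POSITIONS[(i + offset) % 7])
        for i, res in enumerate(sequence)
    ]


def core_positions_aliphatic(sequence, start="a"):
    """True if every residue at heptad positions a and d is aliphatic."""
    return all(
        res in ALIPHATIC
        for res, pos in heptad_register(sequence, start)
        if pos in "ad"
    )


# Hypothetical two-heptad sequence with Leu at every a and d position.
print(heptad_register("LKELEDKLEELESK"))
print(core_positions_aliphatic("LKELEDKLEELESK"))
```

Because a and d are spaced three and four residues apart, and a helix has 3.6 residues per turn, these positions line up as a strip along one side of the helix.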

EF hand

this is a common calcium binding motif that contains 2 short α helices connected by a loop

Ca2+ binds to oxygen atoms in conserved residues in the loop when the concentration of Ca2+ in the cell is high enough; the binding can induce a conformational change in the protein

Interactions

stabilised primarily by hydrophobic interactions between nonpolar side chains with van der Waals interactions and H bonds involving both polar side chains and backbone amino and carboxyl groups

because the interactions stabilising tertiary structure are often weaker than those stabilising secondary structure, the tertiary structure is not rigidly fixed but can continually fluctuate

Bonds

Disulphide bonds can covalently link regions of the proteins which restricts protein flexibility and increases stability

protein solubility can increase as amino acids with charged hydrophilic polar side chains tend to be on outer surfaces and interact with water

They are the distinct regions of protein structure

3 classes

functional domain

structural domain

region of protein that exhibits particular activity characteristic of that protein even when isolated from rest of protein

e.g. kinase domain that covalently adds a phosphate group to another molecule

often identified by using proteases that cleave one or more peptide bonds in a target polypeptide

is a region of about 40 or more amino acids arranged in a single, stable and distinct structure, often comprising one or more secondary structures

they can fold independently of rest of the protein

distinct domains can be linked together by spacers to form a large multidomain protein (like hemagglutinin)

They are often functional domains since they can have independent activity

they can be recognised on proteins whose structures have been determined by x-ray crystallography, NMR analysis or electron microscopy

EGF domain

present in several proteins; EGF itself is a small, soluble peptide hormone that binds to cells in the embryo and in the skin of adults

generated by proteolytic cleavage between repeated EGF domains in EGF precursor protein anchored in plasma membrane

found in tissue plasminogen activator (dissolves blood clots), Neu protein (embryonic differentiation), Notch protein (receptors in membrane for development signalling)

Topological domain

they are defined by spatial relationship to rest of the protein

each domain can comprise multiple structural and functional domains

Proteins with similar sequences tend to have similar structures, so known structures can be used in modelling

the greater the similarity in sequences of 2 polypeptide chains the more likely they are to have similar 3D structures and function

Proteins that have a common ancestor are referred to as homologs; evidence for this is similarity in their sequences

generally thought that proteins with about 30% sequence identity (exact amino acid matches) are likely to have similar 3D structures
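
Sequence identity as used above (exact amino acid matches) can be computed from two sequences that have already been aligned. This is a minimal sketch: the function name is hypothetical, the alignment step itself is not shown, and skipping gapped columns is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of aligned, ungapped positions with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments, for illustration only.
print(percent_identity("ACDEFGHIKL", "ACDQFGHLKL"))
```

A value at or above roughly 30% would, by the rule of thumb in these notes, suggest the two proteins are likely to share a similar 3D structure.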

they are compactly folded structures (often spheroidal)

They are large, elongated, stiff molecules

some are composed of a long polypeptide chain comprising many tandem copies of a short amino acid sequence motif that forms a single repeating secondary structure

often made up of helical polypeptide chains like α helices, triple helices and helical coiled coils with multiple strands

they are composed of repeating globular protein subunits

Integral membrane

embedded in the phospholipid bilayer

the membrane spanning domain comprises one or more α helices roughly 20 residues long, or in some cases a β barrel
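
A quick calculation suggests why membrane-spanning helices are about 20 residues long. The rise of 0.15 nm per residue is a standard textbook value for an α helix that is not stated in these notes, so it is an assumption here, as is the comparison with a hydrophobic bilayer core of roughly 3 nm. The function name is hypothetical.

```python
RISE_PER_RESIDUE_NM = 0.15  # assumed alpha-helix rise per residue, not from the notes


def helix_length_nm(n_residues):
    """Approximate length of an alpha helix along its axis, in nanometres."""
    return n_residues * RISE_PER_RESIDUE_NM


for n in (10, 20, 30):
    print(n, "residues ->", round(helix_length_nm(n), 2), "nm")
```

A 20-residue helix comes out at about 3 nm, which is roughly enough to cross the hydrophobic core of a bilayer once.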

Disordered

meaning that they do not form thermodynamically stable structures, and they are exceptionally flexible in conformation

e.g. phosphorylation of the disordered C-terminal domain of RNA pol II, which is composed of multiple repeats of a 7 amino acid sequence containing Pro, Thr and Ser, regulates mRNA synthesis

Sometimes the entire polypeptide chain is disordered, so these proteins do not have a well ordered structure in their native state; these proteins are called intrinsically disordered proteins. These usually serve as signalling molecules, activity regulators or as scaffolds for multiple proteins, small molecules and ions

Intrinsically disordered proteins and disordered regions can be identified by tests of protease sensitivity (since they usually exhibit greater protease sensitivity) and by spectroscopy

Disordered segments arise where the sequence is richer in polar amino acids and proline and poorer in hydrophobic residues

In some cases an intrinsically disordered protein (or region) can transition into a highly ordered structure

sometimes the individual monomer subunits cannot function unless assembled; in other cases multimeric assembly lets proteins that act sequentially in a pathway increase their efficiency

supramolecular complexes

these structures are very large, with tens to hundreds of polypeptide chains

mass can exceed 1 megadalton and size can approach 300 nm

the capsid that encases the nucleic acids of a viral genome and cytoskeleton are structural examples

A molecular machine is responsible for synthesising mRNA; it involves RNA polymerase and at least 50 additional components

The nuclear pore complex allows macromolecules to pass through the nuclear membrane and is composed of multiple copies of 30 distinct proteins, with a mass of 50 MDa

Biomolecular condensates

they are membraneless compartments in cells

unlike supramolecular complexes, their components do not have a fixed stoichiometry, they can vary in size, and they have no fixed quaternary structural arrangement

like vesicles they can break apart into smaller liquid droplets or fuse into larger ones

the capacity of a protein to form a condensate depends on structure, concentration and conditions

condensate-forming proteins contain multiple domains able to bind to regions of other proteins or nucleic acids, and when such a protein binds it oligomerises at its oligomerisation sites

examples

Wnt signalling

several proteins assemble into condensates (including APC2 and Axin)

fluorescently labelled versions of D. melanogaster APC2 and Axin were expressed in mammalian cells and assembled into spherical condensates

nucleoli

they are sites of ribosomal subunit synthesis

P-bodies

sites of translational repression and mRNA degradation

stabilises structure but removal does not alter structure

Peptide Bonds

This bond behaves somewhat like a planar double bond; the portions on either side of the peptide bond can be orientated in either trans or cis configuration relative to the peptide bond

In the bond, the carbonyl carbon and amide nitrogen and the atoms directly bonded to them all lie in a fixed plane, so little rotation about the peptide bond itself is possible

the only flexibility in a polypeptide chain is the rotation of the fixed planes of adjacent peptide bonds with respect to one another about the amino nitrogen-Cα bond and the carbonyl carbon-Cα bond

Cells use PPIases to catalyse cis/trans isomerisations so that Pro in a folding protein forms the proper isomer; these isomerisations can drastically alter protein structure

Comes from primary structure

The native state of any protein that is not intrinsically disordered will adopt one or a very few closely related, stably folded conformations

the features of a well ordered protein that limit its conformations are side chain properties and the polypeptide backbone sequence (e.g. Trp has a large side chain so prevents tight packing; Arg has a positively charged side chain so attracts negative charges)

Denaturation

can be induced by thermal energy, pH extremes and denaturants (like urea or guanidine hydrochloride at 6-8 M); treatment with reducing agents (like β-mercaptoethanol) can further destabilise

Spontaneous unfolding occurs because of the increase in entropy: a denatured protein can adopt many non-native conformations (increased disorder)

However when a pure sample is placed back in normal conditions some denatured proteins can spontaneously refold like in Anfinsen's experiments

so the information contained in the primary structure is enough to direct correct folding

similar 3D structure of proteins with similar amino acid sequences

Chaperones

total cytosolic protein concentration can be 300 mg/ml in mammalian cells; such high concentrations favour the formation of aggregates by increasing the chance that a nascent protein will encounter other proteins before it folds and so aggregate into a water-insoluble mass due to the hydrophobic effect

Intrinsically disordered proteins are less likely to form deleterious aggregates as they have fewer hydrophobic side chains, whereas newly synthesised proteins are at high risk as they are not yet properly folded

95% of proteins in cells are in their native conformation; without chaperones the cell would waste too much energy destroying aggregates

chaperones facilitate folding by preventing aggregation, by binding to the polypeptide or sequestering it from other partially unfolded proteins so the nascent protein has time to fold correctly

They help fold newly made proteins or refold misfolded or unfolded proteins; sometimes the protein fails to fold so the chaperones re-engage for additional cycles

chaperones can disassemble potentially toxic protein aggregates that form due to misfolding

They bind to client proteins and use a cycle of ATP binding, hydrolysis and exchange to induce conformational changes. ATP is used for enhancing binding, switching their own conformation, optimising folding and returning to the initial state

2 families

Molecular chaperones

They bind to a short segment of protein and stabilise unfolded/partially folded proteins and prevent aggregation and degradation

Bind to the nascent chain as it is being synthesised, as it leaves the ribosome

2 types

Hsp70

it is heat shock protein in cytosol

When an ATP is bound at nucleotide binding domain Hsp70 is in open conformation

co-chaperone accessory proteins (DnaJ/Hsp40) stimulate ATP hydrolysis (increasing the hydrolysis rate 100-1000 times); this induces a large conformational change in the substrate binding domain, causing the closed conformation, in which the substrate is tightly locked in

exchange of cytosolic ATP for bound ADP, stimulated by other proteins (GrpE/BAG1), converts it back to the open conformation, releasing the substrate and freeing it to continue folding

if the released substrate does not fold properly it can rebind to another chaperone and repeat
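
The Hsp70 cycle described above can be sketched as a two-state machine driven by ATP hydrolysis and nucleotide exchange. This is only a bookkeeping model of the steps in these notes; the class, event and function names are hypothetical, and no kinetics are represented.

```python
from enum import Enum


class Hsp70State(Enum):
    OPEN_ATP = "open conformation, ATP bound"
    CLOSED_ADP = "closed conformation, ADP bound, substrate locked in"


# Allowed transitions, following the order of events in the notes.
TRANSITIONS = {
    # Hydrolysis stimulated by the co-chaperone (DnaJ/Hsp40).
    (Hsp70State.OPEN_ATP, "atp_hydrolysis"): Hsp70State.CLOSED_ADP,
    # ADP swapped for ATP, stimulated by GrpE/BAG1; substrate is released.
    (Hsp70State.CLOSED_ADP, "nucleotide_exchange"): Hsp70State.OPEN_ATP,
}


def advance(state, event):
    """Return the next state, or raise if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed from {state.name}") from None


def run_cycles(n_cycles):
    """Run n complete bind/release cycles and return the state history."""
    state = Hsp70State.OPEN_ATP
    history = [state]
    for _ in range(n_cycles):
        for event in ("atp_hydrolysis", "nucleotide_exchange"):
            state = advance(state, event)
            history.append(state)
    return history


for s in run_cycles(2):
    print(s.value)
```

Running more than one cycle corresponds to a substrate that failed to fold on release and rebinds for another round.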

Hsp90

they have strong evolutionary conservation since similarity is 55%

They help cells fold partially folded client proteins and cope with denatured proteins from stress

they can form a relatively stable complex with a client until signal causes dissociation

unlike Hsp70, they function as a dimer in a cycle. Rapid ATP binding leads to conformational change in which nucleotide binding domains and substrate binding domains move together into a closed conformation

In the closed conformation the client protein may undergo folding; binding of clients to Hsp90 occurs at different points

ATP hydrolysis causes changes that may include a highly compact form, client folding, client protein release and additional folding of the unbound client. ADP is then released

Chaperonins

these are huge cylindrical supramolecular assemblies with central chambers into which a protein enters; the chamber allows folding and is formed from 2 rings of oligomers

2 groups

Group I

found in prokaryotes, chloroplasts and mitochondria

composed of 2 rings each with 7 subunits and each ring is a folding chamber into which an unfolded protein enters

GroEL/GroES

Group II

found in cytosol of eukaryotes and archaea

They have 8-9 homomeric or heteromeric subunits in each ring and the lid function is incorporated into those subunits

ATP hydrolysis triggers lid closing

is thought to participate in the folding of 10% of all proteins

a misfolded protein enters one chamber and the second chamber is then blocked by a GroES lid; each ring of 7 GroEL subunits binds 7 ATPs and hydrolyses them to coordinate GroES and protein binding, folding and release

The GroEL rings control binding of the GroES lid to seal the chamber

the polypeptide remains in the chamber capped by the lid and undergoes folding until ATP hydrolysis induces binding of ATP and a different GroES to the other ring

this binding causes the GroES lid and ADP to be released, which opens the chamber and lets out the folded protein

TRiC

ATP binding and hydrolysis in the presence of bound client protein leads to closing of the lid and folding of the client in the sequestered environment within the folding chamber

release of Pi opens the lid and releases the substrate and ADP

Misfolding diseases

aggregates can either be amorphous or be well organised, most commonly in the amyloid state

many proteins can each aggregate into amyloid fibrils that have a cross-β sheet structure, in which each strand is nearly perpendicular to the long axis; 2 long β sheets pack closely together and twist around each other to form protofilaments, which then assemble together into amyloid fibrils

amyloids are associated with amyloidosis diseases, each characterised by the presence of filamentous plaques in a deteriorating brain

in Alzheimer's a hyperphosphorylated form of protein tau forms twisted fibres called tangles which are relatively short, water-soluble protofilaments or long insoluble fibrils