Protein Folding and Misfolding

Protein folding

Anfinsen principle of protein folding

Proteins can refold spontaneously on their own

Conformation is determined by amino acid sequence alone

Anfinsen's Experiment (ribonuclease folding)

Secondary structure is broken apart by a denaturant (urea), and the disulfide bonds are reduced with mercaptoethanol

When urea and mercaptoethanol are removed, ribonuclease refolds by itself into the correct fold

Two-state folding equilibrium

highly cooperative

rapid interconversion between the completely folded and completely unfolded states (intermediates are very transient)

localised “nucleation” of folding of nearby sequence

equilibrium constant

at equilibrium, mixture of folded and unfolded state

determined from rate constants: Keq = kF / kU

related to free energy: ΔG = –RT ln Keq
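
The two relations above can be sketched numerically. This is a minimal illustration, not part of the original notes: the rate constants (kF = 100 s^-1, kU = 0.01 s^-1) and the temperature are assumed values chosen only to show the arithmetic.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def folding_equilibrium(k_f, k_u, temperature=298.15):
    """Return (Keq, dG in kJ/mol, fraction folded) for a two-state protein."""
    k_eq = k_f / k_u                                   # Keq = kF / kU
    d_g = -R * temperature * math.log(k_eq) / 1000.0   # dG = -RT ln Keq
    fraction_folded = k_eq / (1.0 + k_eq)              # [F] / ([F] + [U])
    return k_eq, d_g, fraction_folded

# Assumed example rate constants (s^-1), for illustration only
k_eq, d_g, f_folded = folding_equilibrium(k_f=100.0, k_u=0.01)
print(f"Keq = {k_eq:.0f}, dG = {d_g:.1f} kJ/mol, fraction folded = {f_folded:.4f}")
```

Even a modest negative ΔG corresponds to a population that is almost entirely folded, because the fraction folded depends exponentially on ΔG.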

Monitoring folding/unfolding

CD - secondary structure

Fluorescence - whether the fluorophore is buried

NMR

Free-energy of folding

monitor fraction unfolded at different [denaturant]

determine Keq at different [denaturant] (Keq=[Native]/[Unfolded])

calculate ΔG at different [denaturant]

extrapolate ΔG to [denaturant] = 0 to obtain ΔG(H2O)

although ΔG is only slightly negative, the protein is generally almost entirely folded
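
The extrapolation steps above can be sketched as a straight-line fit. This is a minimal sketch assuming ΔG varies linearly with [denaturant] in the transition region; the data are synthetic, generated from an assumed ΔG(H2O) of -20 kJ/mol and an assumed slope of 5 kJ/mol per M urea, so the fit should simply recover those values.

```python
import math

R = 8.314  # J/(mol*K)
T = 298.15  # assumed temperature, K

def fraction_unfolded(d_g_kj):
    """Fraction unfolded for a given dG of folding (kJ/mol), Keq = [N]/[U]."""
    k_eq = math.exp(-d_g_kj * 1000.0 / (R * T))
    return 1.0 / (1.0 + k_eq)

def delta_g_from_fraction_unfolded(f_u):
    """dG (kJ/mol) from a measured fraction unfolded."""
    k_eq = (1.0 - f_u) / f_u
    return -R * T * math.log(k_eq) / 1000.0

def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic "measurements" in the transition region (assumed parameters)
urea = [3.0, 3.5, 4.0, 4.5, 5.0]                      # M
observed = [fraction_unfolded(-20.0 + 5.0 * c) for c in urea]

d_g_values = [delta_g_from_fraction_unfolded(f) for f in observed]
m_value, d_g_water = linear_fit(urea, d_g_values)      # intercept = dG(H2O)
print(f"dG(H2O) = {d_g_water:.1f} kJ/mol, slope = {m_value:.1f} kJ/mol/M")
```

Only points inside the transition are used, because the fraction unfolded is too close to 0 or 1 elsewhere to give a reliable Keq.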

Folding kinetics

Measuring folding rate via stopped flow technology

Fast mixing of a solution of unfolded protein in denaturant with a refolding buffer

Use a detector (e.g. CD) to follow folding

A second experiment injects folded protein into unfolding buffer (measuring the unfolding rate)

Chevron Plot

log k vs. [denaturant]

extrapolate the folding arm to [denaturant] = 0 to get kF; extrapolate the unfolding arm to get kU

If the folding or unfolding arm is not linear, folding may involve more than 2 states
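
The Chevron extrapolation can be sketched as two straight-line fits. This is a minimal sketch using natural logs and synthetic data: the two-state model and all parameter values (kF = 100 s^-1, kU = 0.001 s^-1 in water, and the slopes) are assumptions made for illustration, not values from the notes.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def chevron_ln_k_obs(denaturant, k_f_water, m_f, k_u_water, m_u):
    """ln of the observed rate for a two-state protein: k_obs = kF + kU."""
    k_f = k_f_water * math.exp(-m_f * denaturant)  # folding slows with denaturant
    k_u = k_u_water * math.exp(m_u * denaturant)   # unfolding speeds up
    return math.log(k_f + k_u)

# Assumed "true" parameters used to generate synthetic data
TRUE = dict(k_f_water=100.0, m_f=2.0, k_u_water=0.001, m_u=1.0)
folding_arm = [0.0, 0.5, 1.0, 1.5]    # low [denaturant], M
unfolding_arm = [6.0, 7.0, 8.0]       # high [denaturant], M

_, ln_k_f = linear_fit(folding_arm, [chevron_ln_k_obs(d, **TRUE) for d in folding_arm])
_, ln_k_u = linear_fit(unfolding_arm, [chevron_ln_k_obs(d, **TRUE) for d in unfolding_arm])
k_f_est, k_u_est = math.exp(ln_k_f), math.exp(ln_k_u)
print(f"kF(H2O) ~ {k_f_est:.1f} s^-1, kU(H2O) ~ {k_u_est:.2e} s^-1")
```

Each arm is fitted only where one rate dominates; near the midpoint both rates contribute and the plot curves, so those points are left out of the linear fits.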

via fluorescent dye

mix the protein with a hydrophobic fluorescent dye

When heated, the protein starts to unfold; the dye binds to the exposed hydrophobic patches and fluoresces more brightly

Above a certain temperature, the protein begins to aggregate and the fluorescence decreases

Negative enthalpy from formation of bonds, positive entropy from hydrophobic collapse (release of structured water)

Application: VSVG protein mutant

Temperature-sensitive mutations: less negative ΔG(H2O), so folding is more sensitive to temperature change

unfolded at 40 °C and retained in the ER; refolds at 32 °C and is transported to the Golgi

Use VSVG to study the mechanism of vesicular trafficking from ER to Golgi

Folding Trajectory

Transition state

undetectable

Adopts a higher energy state than the unfolded protein (energy barrier)

Φ analysis

measures the importance of a particular chemical moiety in a protein structure for stabilizing the transition state

use site-directed mutagenesis

conservative mutation

should only remove an interaction (not introduce a new interaction)

should not remove charges (this affects other interactions)

choose conserved/buried residues

Φ = ΔΔG(‡–U) / ΔΔG(F–U)

use the Chevron plot to get the folding rate of the mutant (kF'): ΔΔG(‡–U) = –RT ln(kF/kF')

If the mutation does not change the folding rate, then Φ = 0

If both kF and kU are changed, 0 < Φ < 1

Φ =1

the moiety is important to the stability of transition state,

lies at a structured site in the transition state

0<Φ<1

additional processes are occurring (e.g. a mixture of transition states (different trajectories), or conformational reorganization at the mutation site)
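
The Φ calculation can be sketched directly from wild-type and mutant rate constants. This is a minimal sketch: it assumes both ΔΔG terms are taken as wild type minus mutant (matching ΔΔG(‡–U) = –RT ln(kF/kF')) and that Keq = kF/kU; the rate constants in the usage lines are invented for illustration.

```python
import math

R = 8.314  # J/(mol*K)

def phi_value(k_f_wt, k_u_wt, k_f_mut, k_u_mut, temperature=298.15):
    """Phi = ddG(TS-U) / ddG(F-U), both taken as wild type minus mutant."""
    rt = R * temperature
    ddg_ts_u = -rt * math.log(k_f_wt / k_f_mut)
    ddg_f_u = -rt * math.log((k_f_wt / k_u_wt) / (k_f_mut / k_u_mut))
    if ddg_f_u == 0.0:
        raise ValueError("mutation does not change stability; Phi is undefined")
    return ddg_ts_u / ddg_f_u

# Assumed rate constants (s^-1), for illustration only
phi_only_folding = phi_value(100.0, 0.01, 10.0, 0.01)    # only kF changes
phi_only_unfolding = phi_value(100.0, 0.01, 100.0, 0.1)  # only kU changes
phi_both = phi_value(100.0, 0.01, 100.0 / math.sqrt(10), 0.01 * math.sqrt(10))
print(phi_only_folding, phi_only_unfolding, phi_both)
```

A mutation that slows folding without affecting unfolding gives Φ = 1, one that only speeds unfolding gives Φ = 0, and one that changes both gives a fractional value.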

Models of folding

If two proteins share the same fold, but Φ analysis marks different residues as being structured in the transition state, the two proteins must fold through different transition states (trajectories)

Nucleation Condensation

Secondary and tertiary structures form together at the nucleus

Then the remaining structure forms around the nucleus

Example: Chymotrypsin inhibitor 2 (CI2), helix and sheet both have structure in the transition state

Hydrophobic collapse

Residues collapse together (hydrophobic effect)

Then secondary and tertiary structure form

Framework

Secondary structure forms first

Then tertiary contacts form

Protein Dynamics Simulation

for proteins with known structure

Define parameters (e.g. temperature, volume, density)

Initiate unfolding by increasing temperature

Atoms move according to their own kinetic energy and the forces exerted by other atoms

Advantage: folding trajectories can be viewed in detail by simulation

In the example of CI2

Good agreement between simulation (S value) and experimental Φ analysis

both S and Φ values indicate that the transition state comprises residues that are local and distal in sequence, hence nucleation condensation

Folding landscapes

Multiple-state folding

the plot of fraction unfolded vs. [denaturant] consists of multiple sigmoidal curves

stable intermediates accumulate in the folding pathway

Example: Apolipoprotein E

three isoforms: apoE2, apoE3 and apoE4. ApoE4 increases the risk of Alzheimer's.

only one amino acid difference in sequence, but a large difference in folding properties

ApoE4 fits the 2-state model poorly

Experiment: mix ApoE with the protease pepsin at different concentrations of urea

where the sequence unfolds, it is cut by the protease

some regions are more resistant to the protease (more structured in the intermediate)

use mass spectrometry to identify the cut sites

from the cut sites, a "molten globule" intermediate model can be built

protein retains near-native secondary structure

hydrophobic core not packed

conformation overall slightly expanded and "molten"

Levinthal Paradox

protein folding cannot occur by random sampling of all conformations

protein folding is sped up and guided by local interactions

Local amino acid sequences which form stable interactions can serve as nucleation points in the folding process
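
The scale of the paradox can be shown with a back-of-envelope calculation. All three numbers below (chain length, conformations per residue, sampling rate) are assumptions commonly used to illustrate the argument, not values from the notes.

```python
residues = 100               # assumed chain length
states_per_residue = 3       # assumed conformations available to each residue
samples_per_second = 1e13    # assumed rate of conformational sampling

n_conformations = states_per_residue ** residues
seconds = n_conformations / samples_per_second
years = seconds / (3600 * 24 * 365)

print(f"conformations: {n_conformations:.2e}")
print(f"time for an exhaustive random search: {years:.2e} years")
```

Under these assumptions an exhaustive search would take far longer than the age of the universe, which is why folding must be a guided rather than a random search.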

Lattice model

consider the protein as beads connected on a string

beads interact with each other through pairwise contact potentials

each additional contact lowers the energy (i.e. the unfolded state has the highest energy, the native state has the lowest energy)

Q0 = number of native contacts; C = total number of contacts (Q0 = C in the folded state)

small random changes are made and conformations with lower energy are favoured

plotting a 3D diagram for all conformations gives a funnel shape (Q0 vs. C vs. F (free energy))

unfolded state: Q0=0, C=low, F=high

intermediate: Q0=medium, C=high, F=lower

folded state: Q0=highest, C=high, F=very low
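
The quantities Q0 and C can be made concrete with a toy chain on a 2D square lattice. This is a minimal sketch under stated assumptions: a 6-bead chain, a contact energy of -1 per contact regardless of bead type, and three hand-picked conformations; it counts contacts only and does not perform the random-move search itself.

```python
from itertools import combinations

def contacts(coords):
    """Non-bonded bead pairs (i, j) that sit on adjacent lattice sites."""
    found = set()
    for i, j in combinations(range(len(coords)), 2):
        if j - i < 2:
            continue  # neighbours along the string are bonded, not contacts
        (xi, yi), (xj, yj) = coords[i], coords[j]
        if abs(xi - xj) + abs(yi - yj) == 1:
            found.add((i, j))
    return found

def describe(coords, native_contacts, energy_per_contact=-1.0):
    """Return total contacts C, native contacts Q0 and the contact energy."""
    c = contacts(coords)
    return {"C": len(c),
            "Q0": len(c & native_contacts),
            "energy": energy_per_contact * len(c)}

# Hypothetical 6-bead conformations
native = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]    # compact
partial = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (1, 2)]   # one native contact
extended = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]  # no contacts

native_contacts = contacts(native)
for name, conf in [("extended", extended), ("partial", partial), ("native", native)]:
    print(name, describe(conf, native_contacts))
```

Plotting energy against Q0 and C for many such conformations is what produces the funnel picture: the extended chain sits at the rim and the native fold at the bottom.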

folding funnel (energy relationships with structure)

different proteins have different folding landscapes

2-state model: smooth landscape; multiple-state model: rugged landscape (energetic barriers)

changes in environment can alter the landscape

intermediates accumulate when trapped in a local minimum

it takes more energy (hence more time) to escape the minimum

example: molten globule

Example: folding pathway of hen egg white lysozyme

folding is monitored by "stopped flow" kinetics, and the contacts are tracked using hydrogen/deuterium exchange

hydrogen/deuterium exchange

hydrogens in water and amide groups can exchange

hydrogens do not exchange while they are in a hydrogen bond

exchange rate is fast at pH7 (faster than protein folding), but slow at low pH (H/D exchange can be quenched by dropping pH)

stable secondary structure has many hydrogen bonds

unfolded sequences have few hydrogen-bonded amide hydrogens

Side-chain hydrogens exchange too fast to detect

stopped flow experiment

diluting protein into D2O + buffer to initiate folding

using stopped-flow kinetics

addition of deuterated water to unfolded protein to initiate folding and exchange (?)

quench the exchange by dropping pH to 2.5

protein is deuterated and kept unfolded in D2O + denaturant

at different time points, dilute into H2O + buffer for H/D back-exchange

quench the exchange by dropping pH to 2.5

the protein is allowed to fold completely and is analysed by NMR or mass spectrometry

change in mass indicates the extent of H/D exchange

at t=0, the entire protein is still unfolded and back-exchanges to H (low mass)

at t=0.1s, some positions are protected; the intermediate mass suggests intermediate formation

at t=2s, the protein is folded and protected from back-exchange (high mass) (surface residue?)
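
The mass readout can be turned into an approximate count of protected positions. This is a minimal sketch: each retained deuteron adds about 1.006 Da relative to hydrogen, and the protein masses in the usage lines are hypothetical numbers chosen for illustration, not measured values.

```python
DEUTERIUM_MASS_SHIFT = 1.00628  # Da, mass of D minus mass of H

def protected_amides(mass_observed, mass_protonated):
    """Approximate number of deuterons retained (sites protected from back-exchange)."""
    return (mass_observed - mass_protonated) / DEUTERIUM_MASS_SHIFT

# Hypothetical masses (Da) at three refolding times
mass_protonated = 14305.0
observed = {"t=0": 14305.0, "t=0.1s": 14330.2, "t=2s": 14355.3}

for label, mass in observed.items():
    print(label, round(protected_amides(mass, mass_protonated)))
```

A population of intermediate mass at an intermediate time point is what indicates a partially protected folding intermediate.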

Mass spectrometry

NMR analysis

Deuterium is invisible in HSQC

the intensity of each amide resonance is proportional to the extent of H back-exchange

enables each residue to be examined individually

experimental evidence suggests multiple folding pathways

Misfolding

Rival folding funnels

when a protein switches from its normal folding trajectory into an abnormal one via an intermediate conformation

in the abnormal funnel, aggregation offers a new route to lower energy states, which competes with the proper folding pathway

changes in environment can change the landscape and favour the intermediate

increases the risk of triggering non-native folding options

factors that can promote the aggregation pathway

accumulation of intermediate conformations allows more non-native contacts to form

mutation

environmental changes

post-translational modification

changes in ligand interaction

Amyloid

generic structural motif characterized by a fibrous morphology

β-sheet core structure

formation is nucleated (cooperatively) and only slowly reversible

very low energy state that is comparable to native state

end product in many neurodegenerative diseases

fibrillar structure can be observed using negative-stain TEM

additional electron-dense stain coating

only the shadow of the fibrils is observed, hence inherently low resolution

can be improved using cryo-TEM

combining many low-contrast individual images

able to define structural characteristics of the fibrils at high resolution

Three examples of protein misfolding

ApoE

Fraction unfolded at different [urea]

ApoE4 is more sensitive to unfolding (smaller stabilising ΔG)

Urea (denaturant) changes the energy landscape to favour unfolded forms

Gel Filtration

gel filtration sorts molecules based on their size (larger molecules elute earlier)

ApoE4 forms aggregates more rapidly (more protein elutes at earlier times)

Negatively stained TEM

aggregates of apoE appear as small twisted amyloid fibrils

equilibria affect aggregation

Native state: ApoE2>ApoE3>ApoE4

misfolding pathway: ApoE4>ApoE3>ApoE2 (accumulation of the partially unfolded state favours the amyloid pathway)

Huntingtin

Pelletability indicates aggregation

aggregates have different tertiary/quaternary structure

react differently to different conformation-specific antibodies

form different morphologies (TEM)

expansion of polyQ enables new misfolded states to be populated as well as diverse aggregation pathways

long polyglutamine repeat lengths induce huntingtin to aggregate

aggregates are pelleted by high-speed centrifugation

Tau

microtubule-binding protein that forms aggregates in Alzheimer's and frontotemporal dementia

immuno-staining

in normal brain cells, huntingtin is diffusely distributed

in diseased brain cells, huntingtin forms dense agglomerates within the cell (inclusions)

normal function: stabilize microtubules

the binding of Tau to microtubules is regulated by phosphorylation

mutations influence Tau aggregation

when dephosphorylated, Tau binds to microtubules; when phosphorylated, it detaches

phosphorylation

detached tau can undergo aggregation pathways

affect microtubule binding (affinity)

aggregation rate

Hyper-phosphorylation increases the concentration of detached tau

gene splicing

expressed as 6 isoforms, resulting from differential splicing

mutations can change the folding landscape and enable new aggregation pathways

folding in a crowded cell

excluded volume effects

molecules in a solution occupy space

packing of the molecules can change their interaction with each other and their conformation and folding

excluded volume affects different molecules differently

small molecules can fit in the gaps between large molecules

crowding is more acute for large molecules in confined spaces

large molecules require more energy to fit into a crowded space

energy-molecule relationships: μ(ex)

varies with size, shape, crowdedness

protein concentration + surrounding crowdedness

effect of excluded volume

enhanced self-association behaviour

retarded diffusion

altered kinetic rates and equilibrium condition

promotion of ordered packing

trimer takes up less volume than monomer + dimer

crowding increases μ(ex) less for trimer

change the energy level (free energy= initial free energy + μ(ex)) and favour polymer formation

Example: apoC-II

crowding increases the rate of amyloid formation

spontaneously forms amyloid in neutral buffers

use glucose polymer "dextran" to mimic the crowded cell environment (higher [dextran], more crowded)

Compartmentalization

Each compartment has a cloud of specialized machinery that works collectively to help keep proteins folded

Endoplasmic reticulum

governs the quality control of proteins and removes/repairs improperly folded proteins

Calnexin cycle

uses chaperones and glycosylation tags to fold complex proteins

ERAD

after quality control by the calnexin cycle, proteins that are deemed unfolded are expelled from the ER and degraded

N-linked glycosylation (N:asparagine)

the glycosylation pattern "tags" the foldedness of the protein and directs its fate (reprocessing/degradation)

Calnexin

a chaperone which binds to a protein and helps it fold

if successfully folded, a distinct glycosylation pattern signals the cycle to deliver the protein to the cell

if not properly folded, a different glycosylation pattern indicates that it needs to go through the cycle again (or be further processed)

certain glycosylation tags trigger export to the cytosol for degradation

Pulse Chase

detect the longitudinal fate of molecules using transient labelling

pulse the cells: add a radioactive amino acid ([35S] methionine) for a short time (all new proteins produced during this period are labelled)

the radioactive amino acid is washed out

chase the cells: harvest cells at different time intervals and capture the proteins of interest by immunoprecipitation

run the proteins on a gel and scan the gel for radioactive protein (protein with a glycosylation tag has a higher molecular weight)

Chaperones

function

assist correct protein folding by minimizing incorrect folding pathways

prevent nascent chains from aggregating

HSP chaperones

Trigger factor (TF) makes initial contact with the nascent peptide as it is produced, to prevent aggregation

HSP40 and HSP70 recognize and hold hydrophobic and unstructured amino acid sequences (allowing the native fold to form)

sometimes deliver proteins to chaperonins for folding

Chaperonins

a closed chamber that assists individual proteins to fold; changes the energy landscape to favour folded forms

limited to small proteins (<60 kDa)

TRiC in mammals

GroEL/ES in bacteria

once the unfolded peptide enters, the chamber closes (ATP-dependent process)

the protein is allowed to fold inside the chamber (disfavours non-native folding pathways)

Cryo-EM shows a closed conformation when ATP-bound and an open conformation when ATP-free

the unfolded protein binds near the lid via a hydrophobic patch

when the lid closes, the chamber becomes hydrophilic

this forces the protein to detach and hydrophobic contacts to form inside the protein

hypothesis 1: promotes reaction from the intermediate state and helps break non-native interactions

hypothesis 2: removes the possibility of an intermediate state

HSP40 binds to the protein and recruits HSP70, which clamps onto the peptide (ATP-dependent step)

HSP70 has high affinity for the peptide

a nucleotide exchange factor releases ADP, returning HSP70 to its low-affinity state, and the peptide is released

each cycle allows the protein to fold incrementally

different domains can form separately (folding of large, complicated proteins)

the hydrophobic-rich residues of the unfolded protein bind to the β-sheet of the HSP70 peptide-binding domain and the α-helix clamps on

vectorial folding

tethering a protein to the ribosome can restrict the available conformations in the folding funnel (geometric constraints), which biases it away from abnormal folding pathways

Example: Green fluorescent protein (GFP)

β-barrel with the fluorophore in the barrel core; the fluorophore only forms when fully folded

folding is not reversible when denatured

stalling sequences are inserted at the end of the GFP sequence, causing the ribosome to stall while holding the emerging GFP peptide

two stall positions are created

one allows the whole GFP to come out of the exit tunnel

one traps part of the last β-strand in the exit tunnel

the GFP fluorescence yield indicates the extent of folding

Experiment shows GFP cannot fully fold until the C-terminus is completely released (but still not at maximum efficiency)

the experiment also shows that folding is more efficient when tethered to the ribosome than when refolding from solution

Frontiers of folding

therapeutic strategies

Proteostasis

active mechanisms (integral machinery) that maintain folded proteins and remove misfolded/aggregated proteins

proteostasis can target any stage of the folding/misfolding pathway

disaggregation chaperones can suppress/reverse the misfolding pathway

folding chaperones can minimise the accumulation of intermediate states, which are vulnerable to misfolding

Therapeutics for misfolding disease

Chaperones respond to aggregating protein

chaperones are up-regulated

Example: Ataxin-3

long polyQ mutations cause neurodegenerative disease

different distribution patterns for different Q-lengths (long polyQ forms aggregates, which show as spots)

HSP70 accumulates at the site of aggregated ataxin-3 (HSP70 engages with aggregated protein)

over-expressing HSP70 can protect flies from eye tissue degeneration by ataxin-3 (Q78) (extra HSP70 can help restore proteostasis)

overstraining proteostasis

the intrinsic folding equilibrium cannot be changed, but the folding funnel can be smoothed (reducing the accumulation of intermediate states)

Example: Par(ts)

Par(ts) is a temperature-sensitive protein

at 15 °C, it is smoothly distributed (no aggregation); it aggregates at 25 °C

when a poly-Q protein is introduced, Par(ts) starts to aggregate at 15 °C (not at the same site as the poly-Q protein)

there is only a finite reserve of resources; introducing misfolding proteins diverts these resources, leaving too few to prevent Par(ts) from aggregating (other proteins can also become unstable)

Case 1: Cystic fibrosis

Caused by a mutation in the CFTR protein which leads to misfolded CFTR (folding of CFTR is kinetically driven: it undergoes a series of incremental steps to fold)

the mutant is efficiently removed by the quality control system

folding is disrupted at co-translational and post-translational steps

cotranslational folding on ER membrane (vectorial folding)

post-translational folding in Golgi where domains assemble to form final structure

ΔF mutation in the NBD1 domain destabilizes other domains

in wild-type: a highly packed, low free energy protein forms

in mutant: domains can't assemble efficiently and retain a non-compact structure (molten globule-like)

proteostasis degrades the CFTR

Freshly made CFTR in ER: when calnexin fails to fold CFTR, it is directed to ERAD for degradation

Mature CFTR on plasma membrane: targeted by quality control; after endocytosis, it is either rescued by QC or degraded in the lysosome

Strategies to rescue CFTR

drug 1: binds to the NBD and stabilises the native fold

drug 2: reduces detection of mutant CFTR by QC

assess efficiency of folding

express ΔF508 CFTR, which normally can't transport halide

express YFP (yellow fluorescent protein, quenched by iodide) in the cell

add iodide to the cell culture

faster quenching rate = more functional CFTR

drugs were screened for improved halide transport

Case 2: transthyretin

mutant transthyretin (TTR) causes familial amyloid polyneuropathy, characterised by massive accumulation of TTR amyloid fibrils

aggregation proceeds through a misfolded monomer (dissociation of the tetramer to native monomers, then partial denaturation to the amyloidogenic monomer)

suppressor of the amyloid disease (T119M mutation)

express V30M TTR and T119M TTR with acidic tag (FT2)

mix to form chimeric tetramers (5 possible forms), which are then separated by charge difference

Assay the different tetramers for their stability and amyloid fibril formation rate

T119M conferred tetramer stability (increases the energy barrier of the transition state)

T119M suppressed fibril formation

V30M reduces the energy barrier of transition state to monomer

the monomer is vulnerable to the rival misfolding pathway, but is not energetically favourable under normal conditions

use a drug to stabilise the tetramer

the TTR tetramer has two thyroxine binding sites; thyroxine binds weakly but can stabilise the tetramer and slow its dissociation

requires a high-affinity mimetic of thyroxine

a library of thyroxine analogues with no biological hormone function was made and tested for their influence on tetramer stability and amyloid formation

drugs have variable effects on dissociation and fibril formation, but there is a good correlation between drugs that inhibit dissociation and those that inhibit fibril formation

binding of a ligand locks TTR into a lower energy state far from the misfolding pathway, hence suppressing aggregation

basics of fluorescence

emitted light has lower energy (longer wavelength) than absorbed light

fluorescent protein

entirely genetically encoded; can be expressed in cells and fused to a protein as a marker

the fluorophore forms by an autocatalyzed cyclization reaction of three residues in the middle of the β-barrel

FRET: Fluorescence Resonance Energy Transfer

fluorescence-based strategy to monitor orientation and distances

two fluorophores within a few nm can transfer energy to each other (the acceptor absorbs some of the energy non-radiatively from the donor and emits its own fluorescence)

FRET depends on distance and orientation

basic parameters

FRET efficiency E(FRET) (decreases steeply beyond a certain distance)

Distance where E(FRET)=50% (R0)

R0^6 is proportional to: orientation factor, donor quantum yield (efficiency of fluorescence emission on excitation), spectral overlap (the overlap of the donor emission and acceptor absorption spectra)

R0^6 is inversely proportional to: the fourth power of the refractive index of the solution

orientation factor κ²

the fluorophore is planar and absorbs light in an orientation-dependent manner

usually assume the dye has the dynamic freedom of a flexible sidechain, so orientations average out (κ² = 2/3)
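
The distance dependence of FRET follows the standard Förster relation E = 1 / (1 + (r/R0)^6), which can be sketched as below. The R0 of 5 nm in the usage lines is an assumed value for a hypothetical dye pair, and the function names are illustrative.

```python
def fret_efficiency(r, r0):
    """FRET efficiency at donor-acceptor distance r (same units as r0)."""
    return 1.0 / (1.0 + (r / r0) ** 6)

def distance_from_efficiency(efficiency, r0):
    """Invert the Foerster relation to estimate distance from measured efficiency."""
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)

R0 = 5.0  # nm, assumed Foerster radius of a hypothetical dye pair

for r in [2.5, 5.0, 7.5, 10.0]:
    print(f"r = {r:4.1f} nm  E = {fret_efficiency(r, R0):.3f}")

print(f"E = 0.80 corresponds to r = {distance_from_efficiency(0.80, R0):.2f} nm")
```

The sixth-power dependence is why FRET is most informative for distances close to R0 and nearly blind well above or below it.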

Example 1: Tau protein conformational changes upon phosphorylation

Tau is mutated to contain one fluorescence donor and one acceptor

additional mutations mimic the phosphorylated form of tau

donor: tryptophan

acceptor: IAEDANS (attached to Cys)

different mutants with Trp and Cys at different locations

FRET efficiency differs between mutant pairs, which reports the distance between the two residues and enables mapping of conformational changes

FRET can also track unfolding

FRET changes at different [denaturant]

the FRET pairs report on localized changes as tau unfolds

higher [denaturant]: higher donor fluorescence, less acceptor fluorescence

Example 2: orientation of fluorophores on DNA

donor fluorophore at 5' end; acceptor fluorophore at the 5' end of complementary strand

Since DNA is relatively rigid and periodic, this property can be used to study κ²

as the DNA is shortened, the orientation of the fluorophore twists

simulation assuming complete rigidity shows periodic peaks, and the peaks get lower as the distance increases

experimental data show the DNA is not completely rigid; there is some mobility due to the flexible linker. Also, the dynamic average (κ² = 2/3) assumption is acceptable

Single molecule folding

bulk phase: average signal of an ensemble of molecules

loss of information encoded in single molecules

Trajectory of folding

variation of structure

single molecule FRET

use a highly sensitive detector to measure photons emitted from a single fluorophore

use dilute samples to isolate single molecules

Approach 1: confocal fluorescence (watch molecules diffuse in and out of the volume)

Approach 2: trapping single molecules on a surface (either by entrapment in a phospholipid bilayer vesicle or by tethering to antibodies)

Example 1: folding of Chymotrypsin Inhibitor 2 (CI2)

label CI2 (wild type and mutant) with a FRET pair

confocal imaging of dilute solution

solution has an equilibrium of folded and unfolded protein

unfolded protein will show a large donor fluorescence and a small acceptor fluorescence

each molecule has its own FRET efficiency

create a histogram of all FRET efficiencies

Measure FRET at different [denaturant]

E(FRET) values fall into two Gaussian populations, indicating 2-state unfolding

the K17 mutation destabilizes the conformation

the mutant completely unfolds at lower [denaturant]
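
The histogram step can be sketched from per-molecule photon counts. This is a minimal sketch using the simple ratio E = acceptor / (donor + acceptor), with no correction for detector efficiency or spectral crosstalk; the photon counts are hypothetical numbers chosen to give one low-FRET (unfolded) and one high-FRET (folded) population.

```python
def fret_from_photons(donor, acceptor):
    """Apparent FRET efficiency of one molecule from its photon counts."""
    return acceptor / (donor + acceptor)

def histogram(values, n_bins=10):
    """Count values in equal-width bins spanning 0 to 1."""
    counts = [0] * n_bins
    for value in values:
        index = min(int(value * n_bins), n_bins - 1)  # E = 1.0 goes in the last bin
        counts[index] += 1
    return counts

# Hypothetical (donor, acceptor) photon counts for six single-molecule bursts
bursts = [(78, 22), (73, 27), (82, 18), (17, 83), (12, 88), (9, 91)]

efficiencies = [fret_from_photons(d, a) for d, a in bursts]
counts = histogram(efficiencies)
for i, n in enumerate(counts):
    print(f"{i / 10:.1f}-{(i + 1) / 10:.1f}: {'#' * n}")
```

With real data the relative areas of the two peaks give the folded and unfolded fractions at each [denaturant], which a bulk measurement would report only as a single averaged value.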

Example 2: clathrin coat disassembly

clathrin forms coats to drive endocytosis.

auxilin helps to build up the clathrin coats

Hsc70 interacts with auxilin to break apart the cage unit

once the clathrin-coated vesicle buds off, the cage falls apart into units

label clathrin with one fluorophore and label Hsc70 with a differently coloured fluorophore

clathrin coats are fixed to a slide with antibodies and each coat is spaced far apart

through fluorescence imaging, we can observe clathrin recruiting Hsc70 and Hsc70 breaking down the clathrin coat (disappearance of fluorescence)