Expression of Recombinant Proteins in E. coli
Studying proteins
Why?
Important part of cell
Structure
Function
Transport, catalysis etc...
Drug design
Engineered enzymes
Biocatalysis
Antiretroviral drug therapies
HIV-1, HCV
Polyprotein cleavage by proteases essential step of virus maturation
Protease inhibitor
Inhibit virus maturation/replication
Development based on knowledge of proteins
Protein-based therapeutics
Hormones
Cytokines
Vaccines
Monoclonal Ab's
140 therapies approved
1/3 produced in E. coli
How?
Purification
NMR, X-ray, cryo-EM
In vivo studies
Labelling
Cloning + overexpression
High yield of purified protein
Possibility to mutate amino acid residues specifically
Engineering
Studying function
Cell expression systems
Bacterial
Most well developed
High protein expression
Reproducibility is robust
Low cost
Short production cycles
Ab's + enzymes produced
No complex proteins
Protein-tag needed for post production recovery
Mammalian
Medium protein expression
High cost
Long production cycles
Some complex proteins
Human glycoprotein - vaccines produced
Post-production recovery by concentration of proteins from culture medium
Reproducible
Dominates bio-therapeutic market
Plants
High protein expression
Low-high cost
Seasonal production cycles - slow
Complex proteins produced
Ab's, vaccines, enzymes produced
Post-production recovery by co-extraction with commercial by-product or seed endosperm
Reproducible
Protein purification
Why?
Purification of a single protein from a mixture
DNA technologies
Genes of interest amplified by PCR
Cloned into expression vector
Study structure/function of a particular protein individually
Comparison of mutant proteins
Structural studies
X-ray, NMR etc.
General method
- Identify target gene
- Create PCR product with RE sites at either end
- Digest product + plasmid vector
- Ligate digested product + vector
- Insert plasmid vector with gene into E. coli for expression
Purification tags allow detection
High throughput genome sequencing
Protein can be produced in liquid culture e.g. E. coli
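When choosing the RE sites for the PCR product in the general method above, it helps to check that the chosen enzymes do not also cut inside the target gene, since digestion would then fragment the insert. A minimal sketch of that check follows. Assumptions: the function name and the toy gene are hypothetical, and XhoI (CTCGAG) is added for illustration alongside the NcoI and NdeI sites quoted in these notes.

```python
# Recognition sequences (5'->3'). NcoI and NdeI appear in these notes;
# XhoI is an extra example added for illustration only.
# All three sites are palindromic, so searching one strand is enough.
RE_SITES = {
    "NcoI": "CCATGG",
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
}


def internal_sites(gene, enzymes):
    """Return {enzyme: [0-based positions]} for every enzyme that cuts inside the gene."""
    gene = gene.upper()
    hits = {}
    for name in enzymes:
        site = RE_SITES[name]
        positions = []
        start = gene.find(site)
        while start != -1:
            positions.append(start)
            start = gene.find(site, start + 1)  # allow overlapping matches
        if positions:
            hits[name] = positions
    return hits


# Hypothetical toy sequence, not a real gene.
toy_gene = "ATGGCTAGCAAACATATGGGTTAA"
print(internal_sites(toy_gene, ["NdeI", "XhoI"]))
```

An empty result means the enzymes are safe to use at the ends of the product; any hit suggests picking a different site from the MCS.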
Plasmid vectors for protein production in E. coli
Requirements
Transcription
RNApol binds promoter + mRNA produced
Translation
How?
Insert target gene in between upstream regions for starting transcription/translation and terminating transcription
MCS at correct location
Repressor binding site for control of transcription
araBAD promoter (P\(_{BAD}\))
Vector contains
Origin of replication (ori)
Selection marker
Repressor gene (araC)
ATG start codon
MCS containing ATG start codon e.g. NcoI (CCATGG) or NdeI (CATATG)
Other features
C-terminal myc
His-tags
Ab recognition
Purification
Promoter systems
LacI
Based on lac operon
Anhydrotetracycline
Based on Tet repressor
L-arabinose induction
araBAD promoter system
All 3 based on repressor proteins
Block RNApol binding
Inducible by small molecule
Small molecule binds repressor + it dissociates from DNA binding site
RNApol can bind + transcribe
Autoinduction
lac + araBAD
Actively repressed by glucose
Even in presence of inducer
Glucose metabolised as cells grow
Relieving repression
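The autoinduction logic above can be sketched as a toy model: the promoter stays off while glucose remains, even with inducer present, and switches on once the glucose has been metabolised. This is only an illustration. The function names, the one-unit-per-hour consumption rate and the assumption that inducer is present from the start are all invented for the example.

```python
def promoter_active(inducer_present, glucose_present):
    """Toy rule from the notes: glucose represses even when inducer is present."""
    return inducer_present and not glucose_present


def autoinduction_timeline(glucose_units, hours):
    """Return a list of (hour, glucose_left, expressing) tuples.

    Illustrative assumptions: one glucose unit is consumed per hour and
    inducer is in the medium from hour 0.
    """
    timeline = []
    glucose = glucose_units
    for hour in range(hours):
        timeline.append((hour, glucose, promoter_active(True, glucose > 0)))
        glucose = max(0, glucose - 1)
    return timeline


for hour, glucose, expressing in autoinduction_timeline(glucose_units=3, hours=6):
    print(hour, glucose, "ON" if expressing else "off")
```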
Lac operon
Genes
3 structural
lacZ
Encodes \(\beta\)-galactosidase
lacY
Encodes lactose permease
lacA
Encodes thiogalactoside transacetylase
3 regulatory
lacP
Promoter
lacI
Repressor
lacO
Operator
Structural genes only expressed when lactose is present
Metabolism of lactose
In recombinant gene expression
Replace lacA/Z/Y cassette with gene of interest
Induce expression by addition of lactose or synthetic analogue IPTG (isopropyl \(\beta\)-D-1-thiogalactopyranoside)
pET-series vectors
2-step expression system
T7 promoter binding site
Not standard RNApol binding site from E. coli
From phage artificially inserted into vector
More efficient
More mRNA
Expression
lacO sits after T7 promoter, lacI blocks
Gene cloned in plasmid MCS
T7 RNApol expression under control of lacI
Transcribed by T7 RNApol
Protein only expressed in E. coli strains with T7 RNApol
DE3
Genetics carried out in strains not expressing endonucleases - prevent DNA degradation in storage
lacI repressor blocks at 2 points
lacO site on host genome repressing transcription of T7 RNApol gene
lacO site on pET-vector repressing transcription of recombinant gene
Leaky, incomplete repression
Problem if target gene is toxic
Some T7 RNApol still produced
Expression of T7 lysozyme from pLysS plasmid inhibits T7 RNApol
IPTG induction produces excess T7 RNApol, overcoming inhibition by T7 lysozyme
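The pET behaviour described above can be summarised as a small decision rule. The sketch below is a qualitative restatement of these notes, not a quantitative model; the function name and the state labels ("none", "leaky", "suppressed", "induced") are hypothetical.

```python
def pet_expression(has_de3, iptg_added, has_plyss=False):
    """Qualitative expression state of a gene cloned behind the T7 promoter.

    has_de3    -- host carries the T7 RNApol gene (DE3 strain)
    iptg_added -- IPTG present, so lacI releases both lacO sites
    has_plyss  -- host carries pLysS, supplying T7 lysozyme
    """
    if not has_de3:
        return "none"        # no T7 RNApol, so the T7 promoter is never used
    if iptg_added:
        return "induced"     # excess T7 RNApol overcomes T7 lysozyme
    if has_plyss:
        return "suppressed"  # basal T7 RNApol is inhibited by T7 lysozyme
    return "leaky"           # incomplete lacI repression, some background expression


for de3, iptg, plyss in [(False, True, False), (True, False, False),
                         (True, False, True), (True, True, True)]:
    print(de3, iptg, plyss, "->", pet_expression(de3, iptg, plyss))
```

For a toxic target gene, the "leaky" row is the problem case and the pLysS row is the remedy given in the notes.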
Primer design
Typical MCS for cloning genes for overexpression
General rules
- Length \(\geq\)18nt
- Finish at 3' end in G or C
- T\(_m\) values of primer pair must be within 5\(^\circ\)C of each other
\(T_m = 69.3 + (0.41 \times \%GC) - \frac{650}{\text{primer length}}\)
- Annealing temp. of PCR is 5\(^\circ\)C lower than lowest T\(_m\) value
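The T\(_m\) formula and the two temperature rules above can be turned into a short calculation. This is a minimal sketch using only the formula given in these notes; the function names and the example primer are hypothetical, and other T\(_m\) formulas exist that give different values.

```python
def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def melting_temp(primer):
    """T_m = 69.3 + 0.41 * %GC - 650 / primer length (formula from the notes)."""
    return 69.3 + 0.41 * gc_percent(primer) - 650.0 / len(primer)


def tm_compatible(fwd, rev, max_diff=5.0):
    """True if the two T_m values are within max_diff degrees C of each other."""
    return abs(melting_temp(fwd) - melting_temp(rev)) <= max_diff


def annealing_temp(fwd, rev):
    """Annealing temperature: 5 degrees C below the lower T_m of the pair."""
    return min(melting_temp(fwd), melting_temp(rev)) - 5.0


# Hypothetical 20 nt primer with 50% GC.
example = "ATGCATGCATGCATGCATGC"
print(f"{melting_temp(example):.1f}")  # 57.3
```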
Forward primer
Exactly the same as start of gene from ATG
Once at 18nt, if it does not end in G or C keep going until it does
Reverse primer
Reverse complement of original DNA strand
Once at 18nt, if it does not end in G or C keep going until it does
e.g. original strand is AATGGCTA
Complement is TTACCGAT
Then reverse TAGCCATT
After ~5 cycles the majority of template molecules include the originally non-annealing bases (e.g. added RE sites)
T\(_m\) changes
If T\(_m\)'s are too far apart, add bases to shorter primer
T\(_m\) is the temperature at which half of the primer is bound to the template molecule
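The forward and reverse primer rules above can be sketched as follows: take 18 nt from the start of the gene (forward) or from the start of its reverse complement (reverse), then keep extending until the 3' end is G or C. The function names and the toy gene are hypothetical, and the sketch covers only the annealing part of each primer, not any RE-site overhang added at the 5' end.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def extend_to_gc_end(seq, min_len=18):
    """Shortest prefix of seq that is at least min_len long and ends in G or C."""
    seq = seq.upper()
    for end in range(min_len, len(seq) + 1):
        if seq[end - 1] in "GC":
            return seq[:end]
    raise ValueError("no prefix of at least min_len nt ends in G or C")


def design_primers(gene, min_len=18):
    """Return (forward, reverse) annealing sequences, both written 5'->3'."""
    forward = extend_to_gc_end(gene, min_len)
    reverse = extend_to_gc_end(reverse_complement(gene), min_len)
    return forward, reverse


# Worked example from the notes.
print(reverse_complement("AATGGCTA"))  # TAGCCATT

# Hypothetical toy gene, not a real sequence.
toy_gene = "ATGGCTAGCAAACATATTAGTCCGGATAAAGGTTCATAA"
fwd, rev = design_primers(toy_gene)
print(fwd, len(fwd))
print(rev, len(rev))
```

If the resulting T\(_m\) values differ by more than 5\(^\circ\)C, the shorter primer can be extended further with a larger `min_len`.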
Non-pET series vectors
pASKIBA63b+-NdeI
Resistance marker
Ampicillin
Based on Tet repressor
Inducer
Anhydrotetracycline
Expression strain
Any
Challenges of expressing recombinant proteins
Insolubility
Doesn't fold into proper conformation inside E. coli
mRNA stability
Too stable
Can't be translated
Too unstable
Degraded
Codon utilisation
Insoluble aggregates formed
Inclusion bodies
Possible causes
High % protein production
~50% of total cell protein rather than ~1%
Combined with slow and/or incorrect folding
Large hydrophobic patches produced
Protein molecules aggregate via these patches (inclusion bodies)
Possible remedies
Reduce temp. of E. coli propagation during overexpression
Test range of temps (20, 28 + 37\(^\circ\)C)
Slows all processes including transcription + translation
Allowing folding to 'catch up'
Link recombinant protein to soluble fusion protein e.g. maltose binding protein