The design of vectors for industrial production of recombinant proteins in E. coli is complex and challenging. Expression is the result of a series of cellular events, each of which can be affected by various parameters, including the target protein itself. Because of this, there is no single expression vector that works for every protein, and it is unlikely that such a vector will ever be developed. Every protein is unique and demands its own production process for optimal results.

In this first chapter in our continuing discussion on common expression bottlenecks, we will look at transcription, its importance, and how it can be improved for each target protein.


Improving transcription rates has been one of the more straightforward approaches in recombinant gene expression. Promoters as regulatory genetic elements have been studied since the early 1960s, and it is quite well understood how they interact with RNA polymerases and transcription factors to facilitate mRNA synthesis. An early guiding principle was that the more mRNA is produced, the more protein will be produced. Hence, stronger promoters were initially favored. As such, the very strong T7 promoter (together with the T7 RNA polymerase) was quickly adopted in the late 1980s and is still extensively used.

This principle unfortunately has a limited scope. It might work nicely for many of the simpler enzymes, especially those of microbial origin, but as the industrial trend moves towards proteins with increasingly complicated tertiary structures, it is more and more frequently observed that maximizing transcription does not lead to more biologically active protein. Instead, overexpressing challenging proteins in E. coli frequently results in inclusion body formation, soluble aggregates, disulfide bond scrambling, and other aberrant final conformations of the protein.

The likely explanation for this is that the E. coli cell has not evolved to produce such proteins at very high titers in a biologically active state. Fortunately, the solution is often simply to reduce the expression rate so that the folding machinery of the cell can keep up with production, resulting in more of the protein ending up in a soluble, biologically active state. This can be done at the level of transcription by carefully selecting a promoter that results in an optimal, not maximal, transcription rate.

Predicting in silico the structural fate of a protein (whether it will form inclusion bodies, soluble aggregates, etc.) when expressed from different promoters is unfortunately not possible. This can only be elucidated through experimental work. If it is determined that the protein would benefit from a lower or higher expression rate, there are three options: use promoters that are tunable over an extended range or a set of promoters exhibiting varying expression rates; adjust other steps of the expression process (e.g. fine-tune translational initiation or elongation); or optimize the cultivation conditions.
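As a rough illustration of the "optimal, not maximal" selection described above, the sketch below ranks a small promoter screen by the amount of soluble, biologically active protein rather than by total expression. Everything in it is an assumption made for illustration: the promoter labels, the `ScreenResult` and `pick_promoter` names, and especially the numbers, which are placeholders and not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenResult:
    """One promoter from a hypothetical expression screen."""
    promoter: str           # hypothetical promoter label
    total_titer: float      # total target protein, g/L (placeholder value)
    active_fraction: float  # share that is soluble and active, 0..1

    @property
    def active_titer(self) -> float:
        # Soluble, biologically active protein actually obtained.
        return self.total_titer * self.active_fraction


def pick_promoter(results):
    """Return the result with the highest active titer, not the highest total."""
    if not results:
        raise ValueError("no screening results supplied")
    return max(results, key=lambda r: r.active_titer)


# Placeholder numbers for illustration only; not experimental data.
screen = [
    ScreenResult("weak", 1.0, 0.90),
    ScreenResult("medium", 3.0, 0.60),
    ScreenResult("strong", 6.0, 0.10),
]

best = pick_promoter(screen)
strongest = max(screen, key=lambda r: r.total_titer)

print(best.promoter, round(best.active_titer, 2))
print(strongest.promoter, round(strongest.active_titer, 2))
```

With these made-up values, the promoter giving the most total protein is not the one giving the most active protein. In practice the inputs would come from the experimental work described above, and the selection criterion might also weigh factors such as growth rate or downstream processing.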