AAV offers many advantages over other viral vectors as a gene delivery system; however, its strict packaging capacity limits its use for delivering large transgenes or large cell-type-specific promoters. Techniques have been developed that divide a gene between two AAV vectors and reconstitute the full-length sequence after transduction by trans-splicing and recombination, but this approach is still in its infancy and shows much lower expression efficiency than a non-divided single vector. Other groups have also developed shorter expression cassettes containing a functional, minimal-sized polyadenylation signal [15, 22]; however, we found that they yield only very low levels of transgene expression. To fully exploit the potential of AAV vectors, a short expression cassette that supports the transgene expression levels required for basic and clinical applications is needed.
In this study, we systematically compared AAV expression cassettes comprising various short regulatory elements as substitutes for the WPRE and polyadenylation signal of a prototype cassette capable of expressing high levels of transgenes. We found that a shortened WPRE, named WPRE3, which contains two of the three regulatory elements of WPRE, supports efficient transgene expression (83.4% of the level obtained with the prototype cassette) even though it is reduced in size from 600 bp to 247 bp.
We then altered the polyadenylation signal of the modified expression cassette (CW3B) that contains WPRE3. Because a synthetic, minimal consensus polyadenylation signal sequence was used successfully in conjunction with a strong promoter in a previous study, we used this 49 bp segment in AAV in conjunction with the CaMKIIα promoter (CW3A); however, transgene expression driven by CW3A was only 40.4% of that driven by the original expression cassette CWB. The SV40 early/late polyadenylation signal is often used for expressing transgenes in mammalian cells. It is well known that the SV40 late polyadenylation signal is more efficient than the early signal owing to a short upstream element sequence [19, 20]. This sequence is also known to increase the efficiency of a weak polyadenylation signal, and even more so when used in tandem. In the present study, we found that the 135 bp SV40 late polyadenylation signal (CW3L) showed efficiency similar to that of the 223 bp bGHpA (CW3B). An additional upstream element sequence (CW3SL) further improved the efficiency and compensated for the reduction in expression caused by WPRE3. The upstream element also improved the efficiency of the synthetic polyadenylation signal (compare CW3A, CW3SA, and CW3SSA). However, relative to their size, these expression cassettes were not as efficient as those containing the SV40 late polyadenylation signal (CW3L and CW3SL).
The new CW3SL cassette provided 399 bp of additional cloning capacity compared with the original template vector CWB and produced expression comparable to that of CWB (103.4%). The maximal packaging capacities of CWB and CW3SL are approximately 3.6 kb and 4 kb, respectively. Among the transcripts expressed in the mouse hippocampus (defined by RNA sequencing), 1.82% (488 genes) have a coding sequence length within the range of 3.6 to 4 kb (Figure 5C). Furthermore, note that CWB is itself an optimized, compact expression cassette from which we eliminated sequences not required for expression. Conventionally used AAV backbones, which include a commonly used neuronal expression cassette composed of the 1.3 kb CaMKIIα promoter (which shows expression efficiency comparable to that of the 0.4 kb CaMKIIα promoter used in CWB), WPRE, and the human growth hormone polyadenylation signal [23–25], have a capacity of 2.4 kb. Therefore, when compared with these conventionally used AAV cassettes, the CW3SL construct gains more than 399 bp of cloning capacity. Of the transcripts expressed in the mouse hippocampus, 12.40% (3324 genes) have a coding sequence length within the range of 2.4 to 4 kb. CW3SL therefore makes it possible to express larger transgenes at high efficiency than can be accommodated by the prototype strong expression cassette CWB or by other conventional expression cassettes. In some cases, CW3SL may provide an opportunity to package transgenes that were previously too large for AAV vectors, while in other cases it may enable fusion of a reporter such as EGFP or another fluorescent protein without loss of expression efficiency, as we demonstrated. The family of new expression cassettes tested in this study could also provide a variety of cloning options. Cassettes shorter than CW3SL would allow packaging of even longer transgenes and would still be useful under certain experimental conditions, although their expression levels would be lower than that of CW3SL.
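The size figures quoted above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch in Python: the variable names are hypothetical, and the "unaccounted" base pairs are derived under the assumption that CWB and CW3SL differ only in the WPRE and polyadenylation regions, which the text does not state explicitly.

```python
# Element sizes in bp, taken from the figures quoted in the text.
WPRE_BP = 600
WPRE3_BP = 247
BGH_PA_BP = 223
SV40_LATE_PA_BP = 135

# Reported extra cloning capacity of CW3SL relative to CWB (bp).
CW3SL_GAIN_BP = 399

# Savings from each substitution.
wpre_saving_bp = WPRE_BP - WPRE3_BP          # 353
pa_saving_bp = BGH_PA_BP - SV40_LATE_PA_BP   # 88

# ASSUMPTION: CWB and CW3SL differ only in these two regions. Under that
# assumption, the part of the combined saving that does not appear in the
# reported 399 bp gain would be taken up by sequence added in CW3SL
# (for example the extra upstream element); the text gives no size for it.
unaccounted_bp = wpre_saving_bp + pa_saving_bp - CW3SL_GAIN_BP  # 42


def implied_total_transcripts(gene_count: int, percent: float) -> float:
    """Total transcript count implied by a (count, percentage) pair."""
    return gene_count / (percent / 100.0)


# The two reported fractions should imply roughly the same total.
total_from_narrow_window = implied_total_transcripts(488, 1.82)
total_from_wide_window = implied_total_transcripts(3324, 12.40)

if __name__ == "__main__":
    print(f"WPRE -> WPRE3 saves {wpre_saving_bp} bp")
    print(f"bGHpA -> SV40 late pA saves {pa_saving_bp} bp")
    print(f"Not reflected in the 399 bp gain: {unaccounted_bp} bp")
    print(f"Implied totals: {total_from_narrow_window:.0f} "
          f"vs {total_from_wide_window:.0f} transcripts")
```

The two implied totals agree to within rounding of the reported percentages, which suggests that both gene counts refer to the same set of hippocampal transcripts.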
The new expression cassettes described here would also be useful for designing other derivatives. For example, in the presence of WPRE3, the SV40 late polyadenylation signal supported transgene expression as efficiently as bGHpA (Figure 2). Thus, the SV40 late polyadenylation signal alone may support expression at a level similar to that of the CB expression cassette while saving 88 bp. In addition, an SV40 late polyadenylation signal upstream element may further enhance the efficiency of transgene expression. In this study, we used the CaMKIIα promoter to drive transgene expression in neurons, but it would also be possible to replace this promoter with ubiquitous promoters, other cell-type–specific promoters, or even conditional promoters for different applications.