Systematic analysis of expression signatures of neuronal subpopulations in the VTA

Gene expression profiling across various brain areas at the single-cell resolution enables the identification of molecular markers of neuronal subpopulations and comprehensive characterization of their functional roles. Despite the scientific importance and experimental versatility, systematic methods to analyze such data have not been established yet. To this end, we developed a statistical approach based on in situ hybridization data in the Allen Brain Atlas and thereby identified specific genes for each type of neuron in the ventral tegmental area (VTA). This approach also allowed us to demarcate subregions within the VTA comprising specific neuronal subpopulations. We further identified WW domain-containing oxidoreductase as a molecular marker of a population of VTA neurons that co-express tyrosine hydroxylase and vesicular glutamate transporter 2, and confirmed their region-specific distribution by immunohistochemistry. The results demonstrate the utility of our analytical approach for uncovering expression signatures representing specific cell types and neuronal subpopulations enriched in a given brain area.


Introduction
The brain is an extremely complicated organ containing myriad regions for the distinct processing and integration of neural information. These regions are composed of diverse subregions, only some of which have been characterized thus far. To understand the functional roles of individual neural circuits, the primary resident neuron types must first be identified. Conventionally, neuron types have been classified in accordance with their morphology, connectivity, and electrophysiological features [1][2][3]. There is a limited set of established markers for neuron types, and the expression patterns of many genes remain uncharacterized [4]. Currently, in situ hybridization (ISH) data are available in the Allen Brain Atlas (ABA), providing brain-wide gene expression profiles in adult mice, particularly at the singlecell resolution [5,6]. The ISH data provide opportunities to search and pinpoint genes that were selectively expressed in neuronal subpopulations [7,8]. The select genes can then serve as molecular signatures that represent these neurons.
ISH data in the ABA have been used to identify neuronal subpopulations whose functions were investigated with genetic animal models. For instance, Elfn1 is expressed by subpopulations of interneurons within the oriens-lacunosum moleculare area of the hippocampus and confers target-specific synaptic properties [9]. Hence, the identification of the neuronal subpopulation by a marker gene led to the functional characterization of the subregion in which they mainly reside. However, the ISH data are not in an easily accessible format, which would deter systematic searches for genes expressed specifically in subpopulations.
The ventral tegmental area (VTA) is a midbrain dopamine-producing center that is causally involved in emotional states such as motivation and reward [10,11]. The VTA largely comprises dopaminergic, glutamatergic, and GABAergic neurons that express the key enzymes for the synthesis and release of their respective neurotransmitters [12,13]. However, it is not clear whether cellular identity can be systematically analyzed by profiling gene expression in each subregion of the VTA or which genes are selectively expressed by each cell type. To address these questions, we developed and applied analytical approaches for identifying molecular markers of the neuronal subpopulations enriched in VTA subregions. This newly developed experimental algorithm provided a set of unanticipated genes as molecular markers of VTA cell types.

Identification of alternative marker genes
To identify potential marker genes for glutamatergic, dopaminergic, and GABAergic neurons in the VTA, for 1143 genes with the data available, Spearman's correlations of their expression intensities in the 42 voxels of the VTA were calculated with the expression intensities of the following three known maker genes: tyrosine hydroxylase (TH; the enzyme required for dopamine synthesis), vesicular glutamate transporter 2 (VGLUT2; encoded by Slc17a6), and glutamate decarboxylase 67 (GAD67; encoded by Gad1). P values of the correlations between the genes and those known marker genes for the null hypothesis (i.e., gene is not correlated with the markers) were estimated according to a t test [14] previously described for the correlation coefficient. The correlations with P < 0.05 were considered to be statistically significant, and thereby marker candidates were selected as the genes with significant positive correlations uniquely with a known marker gene. Those genes having significant positive correlations with each used marker gene could show significant (P < 0.05) negative correlations with the other marker genes. On the basis of the correlation patterns (positive, negative, or no significant correlation) with the known marker genes, the selected candidate genes were grouped into 11 clusters. The final marker candidates were genes that correlated positively with the neuron type of interest, but correlated negatively with the other two neuronal types.

Identification of marker genes for neurons co-releasing dopamine and glutamate
A virtual expression profile of a marker gene for neurons co-releasing dopamine and glutamate was constructed by taking the minimum expression levels of Th and Slc17a6 across the grid voxels of the VTA, assuming that these values would be the maximum expression levels achieved by neurons expressing both Th and Slc17a6. To identify marker candidates for the coreleasing neurons, Spearman's correlation values were calculated between the expression profiles of each candidate gene in the grid voxels of the VTA and the virtual expression profile. The P value of the correlation was computed according to the t test mentioned above. The marker candidate genes for neurons co-releasing dopamine and glutamate displayed positive correlation with a P value of < 0.05.

Animals and tissue preparation
Male C57BL/6 J mice were housed under a 12-h light/ dark cycle with ad libitum access to food and water. All procedures for animal experiments were approved by the ethical review committee of POSTECH (Pohang University of Science & Technology), Korea, and performed in accordance with the relevant guidelines. Mice were anesthetized by intraperitoneal injection of Avertin (250 mg/kg body weight, T48402; Sigma) and transcardially perfused with PBS followed by 4% formaldehyde. The brains were isolated, postfixed overnight at 4°C in a 4% formaldehyde solution, and embedded in 5% agarose gel for sectioning (50-μm-thick coronal sections) with a vibratome (VT1000S; Leica, Germany). Tissue sections containing the VTA region according to the mouse brain atlas [15] were collected.

Cellular imaging and quantification
Sections were imaged with a laser scanning confocal microscope (LSM 510; Zeiss, Germany) with 40× lens objective (C-Apochromat 40 × /1.  (Fig. 2b), and the average cell count was obtained. In the experiments using different brain slices, we tried to capture all IHC images from the VTA locations indicated in Fig. 2a, which effectively covers the VTA [15]. To further clarify the location information, we assigned IDs to the sampling locations, M1-6 and L1-6, in Fig. 2a and used these IDs to indicate the locations from which the representative images were obtained. Mander's overlap coefficient was calculated by the Coloc2 plugin function of Image J.  these experiments, we used the following numbers of animals and images: for NeuN + counting, N = 3, 10 brain slices, M location: 10 images, L location: 10 images; for TH-GAD67-VGLUT2 triple labeling, N = 4, 13 brain slices, M location: 11 images, L location: 9 images; for TH-CHRNA6 double labeling, N = 3, 11 brain slices, M location: 7 images, L location: 7 images; for VGLUT2-P2RY14 double labeling, N = 3, 11 brain slices, M location: 6 images, L location: 6 images; and for TH-VGLUT2-WWOX triple labeling, N = 6, 22 brain slices, M location: 22 images, L location: 14 images).

Analytical algorithms for gene expression profiles in the VTA
To parse gene expression profiles in the VTA, we first selected a grid of 42 voxels (200 × 200 × 200 μm 3 ) encompassing the VTA according to the annotated three-dimensional reference spaces reconstructed based on ISH and magnetic resonance imaging data in the ABA ( Fig. 1a and b). For each gene, expression intensity in each voxel was calculated as the sum of the pixel intensity divided by the sum of expressing pixels from four ISH images (intensity/pixel, Fig. 1c), using the three-dimensional expression grid data. Expression intensities for the 1143 genes available from the coronal section dataset in the 42 voxels were obtained, resulting in a 1143 × 42 gene expression intensity matrix (Fig. 1d). For further cellular quantification, we estimated neuronal cell numbers in brain tissue sections by IHC with a selective neuronal cell marker. Empirically, there were 916.82 ± 33.77 and 365.63 ± 9.28 neuronal cells included in a unit area (mm 2 ) and in a voxel (200 × 200 × 200 μm 3 ), respectively, in the VTA (Fig. 1e).

IHC analysis of the VTA
Next, we performed IHC analysis of the VTA using antibodies against TH, VGLUT2, and GAD67 to label dopaminergic, glutamatergic, and GABAergic neurons, respectively. The numbers of each neuron type were counted from 20 images taken at sampling regions along the anterior-posterior axis (indicated in Fig. 2a) to encompass the entire VTA region from multiple mice. GAD67 + cells were not largely co-localized with other cell types, but TH + and VGLUT2 + cells were partially co-localized (Fig. 2b). The proportions of TH + , VGLUT2 + , and GAD67 + neurons were estimated to be 70, 22, and 16%, respectively, of the population of NeuN + cells (set at 100%, see Fig. 1e) (Fig. 2c), which is consistent with previous findings [16,17]. The remaining 2% of neurons had no detectable expression of TH, VGLUT2, or GAD67. Interestingly, 10% of the neurons expressed both TH and VGLUT2 (see TH-VGLUT2 + neuron in Fig. 2b), suggesting that the VTA contains a substantial proportion of neurons that corelease dopamine and glutamate.
Alternative marker genes to Th, Slc17a6, and Gad1 To demonstrate the utility of the ISH data in the ABA, we first attempted to identify genes that showed similar expression profiles to the known marker genes, Th, Slc17a6, and Gad1, across the 42-voxel grid in the VTA. To this end, we calculated Spearman's correlations for the expression intensity of Th, Slc17a6, or Gad1 with those of the 1143 genes in the 42 voxels and then estimated the significance (P value) of the correlation for each marker gene pair. Using this algorithm, the expression profiles of 539, 422, and 336 genes positively or negatively correlated significantly (P < 0.05) with those of Slc17a6, Th, and Gad1, respectively (Fig. 3a). Among these, we selected 171, 231, and 179 genes whose expression intensity patterns were positively correlated uniquely to those of Slc17a6, Th, and Gad1, respectively (Fig. 3b-e). Interestingly, anticorrelations were found between proportions of these genes, which may better distinguish these cell types. For example, among the 231 Th-like genes, 47 and 9 showed significant (P < 0.05) anticorrelations with Slc17a6 and Gad1, respectively. Similar anticorrelated gene sets were identified from the Slc17a6-like genes (68 genes anticorrelated with Gad1, 12 genes with Th, and three genes with both) and the Gad1-like genes (18 genes anticorrelated with Th, 104 genes with Slc17a6, and 16 genes with both). These genes included previously known marker genes for dopaminergic and GABAergic neurons, namely Slc6a3 [18,19] and Drd2 [18] in Th-like genes and Gad2 [20] and Slc32a1 [21] in Gad1-like genes, respectively (Fig.  3b). These data support the utility of the ISH data in the search for potential marker genes associated with the primary neuron types in the VTA.

Distributions of distinct neuron types in the VTA
The search for alternative marker genes resulted in novel candidates for Th + , Slc17a6 + , and Gad1 + neurons. We determined whether their expression in the VTA correlated with Th, Slc17a6, and Gad1 expression using the ISH images in the ABA and selected the top five novel marker candidates for each neuronal type ( Fig. 4a and  b). From these results, we selected Chrna6 and P2ry14 from the Th-and Slc17a6-like genes, respectively ( Fig.  4b and c) for further analysis; none of the top five Gad1like candidates showed expression patterns similar to that for Gad1 based on the ISH data.
We further examined the anatomical distribution of Th, Slc17a6, and Gad1, as well as the alternative marker candidates, in the VTA via the ISH images. Th + and Chrna6 + neurons were distributed throughout the VTA as well as in the substantia nigra pars compacta area (Fig. 4c, top row). Slc17a6 + and P2ry14 + neurons were enriched in the medial part of the VTA, with P2ry14 also Expression intensity for each gene was autoscaled to yield a mean of 0 and a standard deviation of 1 (red, positive; blue, negative). e Correlation patterns of Slc17a6-, Th-, and Gad1-like genes. These three groups of genes were categorized into 11 clusters (C1-11) based on their correlations (positive, red; negative, blue) with Slc17a6, Th, and Gad1 distributed weakly in the substantia nigra pars reticulata (Fig. 4c, middle row). By contrast, Gad1 + cells were distributed peripherally around the VTA and in the substantia nigra pars reticulata (Fig. 4c, bottom). These data suggest that the anatomical distribution of neurons expressing the marker genes can potentially be used to identify subregions in the structures in which they reside. To assess the validity of P2ry14 and Chrna6 as marker genes, we performed IHC to examine the expression of P2RY14 and CHRNA6 in VGLUT2 + and TH + cells (Fig. 4d). Quantification of the numbers of single-and double-positive cells confirmed that the expression of these genes can be used as reliable markers of individual cell types (Fig. 4e). Collectively, the data described above support the utility of our analytical approach for identifying marker genes for neuronal subpopulations as well as their distribution in the VTA.

Marker genes for neurons co-releasing dopamine and glutamate
The IHC analysis confirmed that a subpopulation of neurons in the VTA co-express TH and VGLUT2 (Fig.  2b and c), which can be considered neurons that corelease dopamine and glutamate [13,16]. Since there are no faithful marker genes for these co-releasing neurons, we sought to examine their gene expression profiles in the VTA. We first computed the minimum expression intensities of Th and Slc17a6 in individual voxels (Fig. 5a, gray shading area), assuming these intensities are the maximum that can originate from neurons coexpressing TH and VGLUT2. Using this idea, we identified 191 genes with expression intensities that correlated significantly (P < 0.05) with the minimum intensities of Th and Slc17a6 (Fig. 5b). We then selected the top five candidates (Fig. 5c) and examined ISH images to determine whether they are co-expressed with Th and Slc17a6 in the VTA. We selected the gene encoding WW domain-containing oxidoreductase (Wwox), whose expression pattern was most similar to that of Slc17a6 (Fig. 5d), overlapped with Th (Fig. 4c, top left), and was consistent with the minimum expression profiles of Th and Slc17a6 (Fig. 5a). To confirm Wwox as a marker of TH-and VGLUT2-co-expressing neurons, we performed an IHC analysis (Fig. 6a) and a pixel-level analysis of fluorescence signals using Mander's overlap coefficient ( Fig. 6a and b). The IHC data showed that > 70% of the neurons that expressed WWOX also expressed both TH and VGLUT2 (Fig. 6c) and were enriched in the medial part of the VTA relative to the lateral part (Fig. 6d and e), which was consistent with the minimum expression profiles of Th and Slc17a6 (Fig.  5a). These data further support the utility of our analytical approach and algorithm in identifying novel marker genes for a subpopulation of neurons and their distribution in the VTA.

Discussion
In this study, we analyzed gene expression intensities within voxels encompassing the VTA. We estimated from IHC that each voxel contained > 300 neurons and thus may not allow for sufficient spatial resolution in order to pinpoint marker gene expression in individual cells. However, our results demonstrate that such data can provide a list of useful marker candidates, such as Th and Slc17a6 for dopaminergic and glutamatergic neurons, respectively. Our analytical approach suggests that the ISH data can identify marker candidates when the variation in expression intensities across each voxel serves as a representation of the variation in neuron subpopulations in a specific region, such as the VTA.
Our systematic analytical approach involved supervised clustering of the genes based on correlation patterns with the known markers (Th, Slc17a6, and Gad1) to identify alternative markers for neuron subpopulations in the VTA. However, this approach may be not necessary, as we can perform unsupervised clustering of the genes according to the similarity of their expression patterns across the voxels in the grid. Each of the resulting clusters may represent a neuron subpopulation. In this study, unsupervised clustering of the genes within the 42-voxel grids in the VTA using the non-negative matrix factorization method [22] provided four major clusters that included Th, Slc17a6, Gad1, or both Th and Slc17a6. These results were consistent with those from our supervised clustering approach.
Although dopamine-and glutamate-co-releasing neurons were previously identified [12,13,23], their cellular features and functional consequences remain to be fully clarified [13,24,25]. Their functional roles are just  [26][27][28]. However, these studies were unable to target the coreleasing neurons selectively and failed to delineate their impact on synaptic plasticity and animal behaviors. We identified Wwox as a potential marker gene for these coreleasing neurons, which may enable them to be modulated in cell-type-specific and temporally dependent fashions both in vitro and in vivo.
Previously, Wwox was shown to act as a tumor suppressor whose loss of heterozygosity and chromosomal rearrangement have been detected in various cancers including ovarian, breast, hepatocellular, and prostate cancers [29]. Upon its phosphorylation at Tyr33 in the WW domain, activated WWOX acquires enhanced interactions with various transcription factors including p53, c-Jun, TNF, p73, AP2 gamma, and E2f1. Recently, a number of studies have reported that Wwox plays important roles also in the brain, and its dysregulation leads to neurodegeneration [30]. For example, Wwox is downregulated in the hippocampi of patients with Alzheimer's disease [31], and knockdown of Wwox in neuroblastoma cells and mice resulted in aggregation of amyloid β and Tau [32]. However, the potential roles of Wwox in the VTA have been rarely investigated. WWOX binds and co-translates with many transcription factors to relocate to the nucleus to enhance or block neuronal survival under physiological or pathological conditions [33]. Our finding suggests that Wwox can be induced highly in dopamine-and glutamate-co-releasing neurons and selective targeting of these co-releasing neurons using Wwox may provide new insights into the roles of these neurons in neuronal survival in the VTA, as well as animal behaviors associated with the VTA.
The number of genes with the expression intensity available in the ABA continues to increase, which should lead to more comprehensive searches of marker genes. Moreover, the gene expression intensities from the sagittal section datasets can be combined with those from the coronal section datasets, and our analytical approach can be applied to the combined gene expression profiles. Genes that show specific expression in neuronal subpopulations consistently in both coronal and sagittal section datasets could be considered as more reliable candidates. Therefore, our analytical approach is widely applicable to the identification of various cellular marker genes in various cellular contexts and brain areas.