Age-related changes in behavior in C57BL/6J mice from young adulthood to middle age

Background Aging is considered to be associated with progressive changes in the brain and its associated sensory, motor, and cognitive functions. A large number of studies comparing young and aged animals have reported differences in various behaviors between age-cohorts, indicating behavioral dysfunctions related to aging. However, relatively little is known about behavioral changes from young adulthood to middle age, and the effect of age on behavior during the early stages of life remains to be understood. In order to investigate age-related changes in the behaviors of mice from young adulthood to middle age, we performed a large-scale analysis of the behavioral data obtained from our behavioral test battery involving 1739 C57BL/6J wild-type mice at 2–12 months of age.
Results Significant behavioral differences between age groups (2–3-, 4–5-, 6–7-, and 8–12-month-old groups) were found in all the behavioral tests, including the light/dark transition, open field, elevated plus maze, rotarod, social interaction, prepulse inhibition, Porsolt forced swim, tail suspension, Barnes maze, and fear conditioning tests, except for the hot plate test. Compared with the 2–3-month-old group, the 4–5- and 6–7-month-old groups exhibited decreased locomotor activity in novel environments, motor function, acoustic startle response, social behavior, and depression-related behavior, increased prepulse inhibition, and deficits in spatial and cued fear memory. The 8–12-month-old group showed similar but more pronounced changes in most of these behaviors compared with the younger age groups. Older groups exhibited increased anxiety-like behavior in the light/dark transition test, whereas those groups showed seemingly decreased anxiety-like behavior as measured by the elevated plus maze test.
Conclusions The large-scale analysis of behavioral data from our battery of behavioral tests indicated age-related changes in a wide range of behaviors from young adulthood to middle age in C57BL/6J mice, though these results might have been influenced by possible confounding factors such as the time of day at testing and prior test experience. Our results also indicate that relatively narrow age differences can produce significant behavioral differences during adulthood in mice. These findings provide an insight into our understanding of the neurobiological processes underlying brain function and behavior that are subject to age-related changes in early to middle life. The findings also indicate that age is one of the critical factors to be carefully considered when designing behavioral tests and interpreting behavioral differences that might be induced by experimental manipulations. Electronic supplementary material The online version of this article (doi:10.1186/s13041-016-0191-9) contains supplementary material, which is available to authorized users.


Background
Aging is a complex process associated with structural and physiological changes in the brain that can account for age-related behavioral changes and increased incidence of neuropsychiatric disorders. Rodents have been extensively used as models for human aging and disease because of their widespread availability and short life span. A number of rodent studies have examined the impact of age on brain and behavior, contributing to our understanding of the brain mechanisms underlying behavioral changes associated with normal and pathological aging [1][2][3]. Many such age-related behavioral differences have been reported from comparisons between young adult (2-6 months of age) and aged (18 months of age and over) animals through behavioral tests (e.g., [4][5][6][7][8][9][10][11][12][13]). In the past two decades, advances in gene targeting technology have enabled us to generate targeted gene mutations in mice, which has increased interest in the use of mutant mouse models to elucidate the relationship between the aging brain and behavior (e.g., [14,15]). Despite these extensive studies comparing young and aged animals and the increasing interest in mutant models, there is still relatively little information regarding age-related changes in behavior from young adulthood to middle age (approximately 2-12 months of age) in the background strains of mice used to create these mutant mice.
Behavioral test batteries have been used to assess a wide variety of behavioral traits, including locomotor activity, sensory and motor functions, anxiety-like behavior, and learning and memory, in a cohort of inbred and mutant mouse strains [16][17][18][19][20][21]. Most of the behavioral traits were sensitive to genetic, environmental, and experimental factors, e.g., genetic background, laboratory conditions, and prior test experience [22][23][24][25][26][27]. Thus, behavioral experiments are generally designed to minimize the potential effects of various confounding factors, and the use of a battery of standardized behavioral tests is needed to ensure more accurate interpretations of behavioral phenotypes. In our laboratory, we have examined over 170 strains of genetically engineered mice using a comprehensive behavioral test battery according to our standardized protocol [21]. The large amount of behavioral data from various cohorts of mice of different ages may be useful for understanding the effects of age on mouse behavior.
The present study was conducted in order to examine the effects of age on behavior from young adulthood to middle age, and to identify behaviors that are affected by age at the early stages of life. We performed a large-scale analysis of behavioral data obtained from mice of different ages using our behavioral test battery. Our test battery included general health and neurological screening, light/dark transition, open field, elevated plus maze, hot plate, social interaction, rotarod, startle response/prepulse inhibition, Porsolt forced swim, Barnes maze, fear conditioning, and tail suspension tests. These behavioral tests had been performed in a nearly uniform order following our standardized protocols. We used the behavioral data from up to 1739 wild-type C57BL/6J mice at 2-12 months of age; C57BL/6J is a widely used inbred strain that often serves as a background strain for mutant mice. The detailed characterization of age-related changes in behavior of C57BL/6J mice will provide researchers with useful information for designing behavioral experiments, interpreting mouse phenotypes, and understanding the neurobiological basis of age-related behavioral changes. Our large-scale analysis showed significant behavioral differences between age groups in almost all the tests, demonstrating the effects of age on various behavioral domains in C57BL/6J mice, from young adulthood to middle age.

Results
We divided the behavioral data from up to 1739 wild-type C57BL/6J mice into the following four age groups for each behavioral test: 2-3, 4-5, 6-7, and 8-12 months of age. The data from each behavioral test were statistically compared between the age groups using analysis of variance (Additional file 1: Table S1). We defined "study-wide significance" as statistical significance that survived Bonferroni correction for the 69 behavioral measures used in the analysis (p < 0.05/69 = 0.000724). "Nominal significance" was defined as statistical significance for a measure (p < 0.05) that did not survive this correction. Post-hoc multiple comparisons between the four age groups (six pairwise comparisons) were further performed using Bonferroni correction (for study-wide significance, p < 0.000724/6 = 0.00012; for nominal significance, p < 0.05/6 = 0.008333). Unless otherwise noted, the results for nominal significance are described in the following sections (see Figs. 1, 2, 3, 4, 5, 6 and 7 for the results of post-hoc analyses with Bonferroni correction after study-wide significance was obtained).
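The thresholds defined above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code; the constant and function names are hypothetical, and the three-way classification of an ANOVA p-value simply restates the definitions in the text.

```python
# Minimal sketch of the significance thresholds described in the text.
# All names here are illustrative assumptions, not the authors' code.

ALPHA = 0.05
N_MEASURES = 69   # behavioral measures included in the analysis
N_GROUPS = 4      # 2-3, 4-5, 6-7, and 8-12 months of age
N_PAIRS = N_GROUPS * (N_GROUPS - 1) // 2   # 6 pairwise post-hoc comparisons

# ANOVA-level thresholds
STUDY_WIDE = ALPHA / N_MEASURES            # about 0.000724
NOMINAL = ALPHA                            # 0.05

# Post-hoc thresholds (Bonferroni over the six pairwise comparisons)
POSTHOC_STUDY_WIDE = STUDY_WIDE / N_PAIRS  # about 0.00012
POSTHOC_NOMINAL = ALPHA / N_PAIRS          # about 0.008333


def classify_anova_p(p):
    """Label an ANOVA p-value using the definitions given in the text."""
    if p < STUDY_WIDE:
        return "study-wide"
    if p < NOMINAL:
        return "nominal"
    return "not significant"
```

For instance, under these definitions an ANOVA p-value of 0.0317 would be labeled nominal, and p = 0.8644 would be labeled not significant.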
The hot plate test is widely used to assess pain sensitivity to a thermal stimulus. In this test, there was no significant effect of Age on paw responses to a thermal stimulus (Fig. 1e; F 3,1232 = 0.25, p = 0.8644). This result indicates no age-related changes in pain sensitivity.
The rotarod test, in which mice are required to walk on an accelerating rotating rod across six trials, is used to evaluate motor function. Average latency to fall off the rod was calculated from the six trials. ANOVA revealed a significant effect of Age on the average latency (Fig. 1f; F 3,1206 = 62.54, p < 0.0001), which reached study-wide significance. Older mice showed a significantly shorter latency to fall off the rod compared with younger mice (8-12 mo < 6-7 mo < 4-5 mo < 2-3 mo, all comparisons p < 0.0001). Previous studies have indicated that rotarod performance is negatively correlated with body weight [28,29]. We performed an analysis of covariance (ANCOVA) with body weight measured at the beginning of the test battery as a covariate. This analysis revealed a significant Age effect and a significant Age × Body weight interaction in the average rotarod latency (Age effect, F 3,1205 = 15.059, p < 0.0001; Body weight effect, F 1,1202 = 31.48, p < 0.0001; Age × Body weight, F 3,1202 = 3.899, p = 0.0087), indicating that the linear association between rotarod performance and body weight varies with age.
Fig. 1 Increased body weight, decreased muscular strength, normal pain sensitivity, and motor dysfunction in older C57BL/6J mice. a Body weight (g), b body temperature (°C), c wire hang latency (s), d grip strength (Newton, N), e latency to paw lick or foot shake (s) in the hot plate test, and f latency to fall off a rotating rod (s) in the rotarod test. Values are means ± SEM. *p < 0.05, **p < 0.01, and ***p < 0.001 after Bonferroni correction performed when ANOVAs reached nominal significance. †p < 0.05 after Bonferroni correction performed when ANOVAs reached study-wide significance
In older age groups, there were negative correlations between body weight and average rotarod latency (Additional file 2: Figure S1A-E; all age groups, r = −0.352, p < 0.0001; 2-3 mo, r = −0.039, p = 0.2997; 4-5 mo, r = −0.289, p < 0.0001; 6-7 mo, r = −0.153, p = 0.0645; 8-12 mo, r = −0.68, p < 0.0001). To control for the effect of body weight, we used the data from mice with body weights ranging from 27.5 to 32.5 g and compared the rotarod latency among age groups. ANCOVA showed that there was a significant effect of Age on the rotarod latency (Additional file 2: Figure S1F, G; Age effect, F 3,522 = 12.37, p < 0.0001; Body weight effect, F 1,522 = 0.05, p = 0.8182). These data suggest that aging is associated with decreased motor function.
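The two steps described above, correlating body weight with rotarod latency and then restricting the comparison to mice within a body-weight window, can be sketched as follows. This is an illustrative sketch only: the record layout, field names, and function names are assumptions, the example values are invented, and treating the 27.5-32.5 g bounds as inclusive is also an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def in_weight_window(records, low=27.5, high=32.5):
    """Keep only mice whose body weight (g) lies within [low, high]."""
    return [r for r in records if low <= r["weight_g"] <= high]


# Hypothetical records; the values are made up for illustration only.
mice = [
    {"age_group": "2-3", "weight_g": 24.0, "latency_s": 180.0},
    {"age_group": "4-5", "weight_g": 28.0, "latency_s": 150.0},
    {"age_group": "8-12", "weight_g": 30.0, "latency_s": 120.0},
    {"age_group": "8-12", "weight_g": 36.0, "latency_s": 90.0},
]

r_all = pearson_r([m["weight_g"] for m in mice],
                  [m["latency_s"] for m in mice])
matched = in_weight_window(mice)  # subset used for the weight-matched comparison
```

In a weight-matched subset of this kind, a remaining group difference in latency is harder to attribute to body weight alone, which is the reasoning behind the restricted comparison.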

Decreased locomotor activity and altered anxiety-like behaviors in older C57BL/6J mice
The open field test is widely used to measure locomotor activity and anxiety-like behavior [30]. In the first 5-min period of the test, there were significant effects of Age on the distance traveled (Fig. 2a; F 3,1451 = 8.51, p < 0.0001) and stereotypic counts (Fig. 2d; F 3,1451 = 8.049, p < 0.0001), which achieved study-wide significance. Subjects in older age groups traveled significantly shorter distances than those in younger age groups (8-12 mo < 4-5 and 2-3 mo, p = 0.0009 and p < 0.0001, respectively; 6-7 mo < 2-3 mo, p = 0.0012). The 6-7- and 4-5-month-old subjects showed significantly more stereotypic counts than 2-3-month-old subjects (p = 0.0007 and p < 0.0001, respectively). During the first 5 min, there were trends toward [...] Vertical activity in the 4-5-month-old group was significantly greater than that in the 2-3-month-old group (Fig. 2f; p < 0.0001), and stereotypic counts in the 6-7- and 4-5-month-old groups were greater than those in the 2-3-month-old group (Fig. 2h; p = 0.003 and p = 0.0073, respectively). Overall, the age-dependent decline in locomotor activity during the early testing period in a novel open field environment suggests that aging is associated with increased anxiety-like behavior.
The light/dark transition test is also commonly used to assess anxiety-like behavior. There were significant effects of Age on the distance traveled in the light and dark chambers (Fig. 2i; the light chamber, F 3,1352 = 11.59, p < 0.0001; the dark chamber, F 3,1352 = 4.46, p = 0.004). In the light chamber, subjects in older age groups traveled significantly shorter distances than those in younger age groups (8-12 mo < 4-5 mo, p < 0.0001; 8-12 and 6-7 mo < 2-3 mo, p < 0.0001 and p = 0.0027, respectively). In the dark chamber, older age subjects also traveled shorter distances than younger age subjects, although post-hoc tests revealed that these differences did not survive the Bonferroni correction (8-12, 6-7, and 4-5 mo < 2-3 mo, p = 0.0271, p = 0.0519, and p = 0.0083, respectively). A significant effect of Age was found on the number of transitions between the chambers (Fig. 2j; F 3,1352 = 35.05, p < 0.0001), which achieved study-wide significance. The number of transitions of the older age subjects was significantly lower than that of the younger age subjects (8-12 and 6-7 mo < 4-5 mo, p < 0.0001 and p = 0.0034, respectively; 8-12, 6-7, and 4-5 mo < 2-3 mo, all comparisons p < 0.0001). Regarding latency to enter the light chamber, the oldest age group displayed a longer latency compared with the youngest age group (Fig. 2k; F 3,1352 = 2.95, p = 0.0317; 8-12 mo > 2-3 mo, p = 0.0056). There was also a significant effect of Age on the time spent in the light chamber (Fig. 2l; F 3,1352 = 9.93, p < 0.0001), and older age subjects spent less time in the light chamber than younger age subjects (8-12 mo < 4-5 mo, p < 0.0001; 8-12 and 6-7 mo < 2-3 mo, p < 0.0001 and p = 0.0081, respectively). These results of the light/dark transition test imply that locomotor activity decreases and anxiety-like behavior increases from young adulthood to middle age.
In the elevated plus maze test, which has been widely used for assessing anxiety-like behavior, there was a significant effect of Age on distance traveled (Fig. 2m; F 3,1231 = 15.11, p < 0.0001). This effect reached study-wide significance. Subjects in older age groups traveled significantly shorter distances than those in the youngest age group (8-12, 6-7, and 4-5 mo < 2-3 mo, p = 0.0044, p < 0.0001, and p < 0.0001, respectively). There was also a significant effect of Age on the number of arm entries (Fig. 2n; F 3,1231 = 19.34, p < 0.0001), which attained study-wide significance. Older age subjects showed a significantly lower number of arm entries compared with subjects in the youngest age group (8-12, 6-7, and 4-5 mo < 2-3 mo, p = 0.0001, p < 0.0001, and p < 0.0001, respectively). Significant effects of Age were found on the percentages of entries into open arms ( [...] groups, which suggest reduced anxiety-like behavior, are seemingly inconsistent with the findings from the other two tests (see Discussion for details).
Fig. 6 Impaired spatial learning and memory in older C57BL/6J mice. a Number of errors to reach the target hole, b latency to reach the target hole, and c distance traveled to reach the target hole during the training session of the Barnes maze test. d Time spent around the target hole in the probe test. e Number of errors to reach the target hole, f moving time (s), g total distance traveled (cm), and h moving speed (cm/s) in the probe test. Values are means ± SEM. *p < 0.05, **p < 0.01, ***p < 0.001 after Bonferroni correction performed when ANOVAs reached nominal significance. †p < 0.05 after Bonferroni correction performed when ANOVAs reached study-wide significance

Decreased social contacts in older C57BL/6J mice
The social interaction test, in which two unfamiliar mice are placed together in a novel chamber, is used to assess social behavior. There were significant effects of Age on the distance traveled (Fig. 3a; F 3,546 = 8.48, p < 0.0001), number of contacts (Fig. 3b; F 3,546 = 11.71, p < 0.0001), and total duration of active contacts (Fig. 3d; F 3,546 = 8.58, p < 0.0001), all of which reached study-wide significance. Older age subjects traveled significantly shorter distances than subjects in the youngest age group (8-12, 6-7, and 4-5 mo < 2-3 mo, p < 0.0001, p = 0.0043, and p = 0.0055, respectively). The number of contacts in the older age groups was significantly lower than that in the younger age groups (8-12 and 6-7 mo < 2-3 mo, p = 0.0002 and p < 0.0001, respectively; 8-12 and 6-7 mo < 4-5 mo, p = 0.0079 and p = 0.0017, respectively). Subjects in older age groups showed a significantly shorter duration of active contacts compared with those in the youngest age group (8-12, 6-7, and 4-5 mo < 2-3 mo, p = 0.0056, p < 0.0001, and p = 0.0022, respectively). There was also a significant effect of Age on the total duration of contacts (Fig. 3c; F 3,546 = 3.93, p = 0.0085). The total duration of contacts was shorter in the 6-7-month-old group than in the 2-3-month-old group (p = 0.0036). There was no significant effect of Age on the mean duration per contact (Fig. 3e; F 3,546 = 0.78, p = 0.5044). These results indicate that locomotor activity and social interaction decrease with age in a novel environment.
Decreased acoustic startle response and increased prepulse inhibition in older C57BL/6J mice
Prepulse inhibition of an acoustic startle response (PPI) is a phenomenon in which a weak prepulse stimulus suppresses the startle response to a loud auditory stimulus, and is measured to assess sensorimotor gating. A test session consists of six trial types, i.e., two types of startle-stimulus-only trials (110 or 120 dB auditory stimulus) and four types of prepulse inhibition trials (74-110, 78-110, 74-120, or 78-120 dB auditory stimulus). In the two types of startle-stimulus-only trials, there were significant effects of Age on the startle response to the loud stimuli (Fig. 4a; 110 dB, F 3,1359 = 3.56, p = 0.0139; 120 dB, F 3,1359 = 35.07, p < 0.0001), the latter of which reached study-wide significance. The startle responses to the 110 dB stimulus in the 8-12-month-old group were significantly lower than those in the other age groups (8-12 mo < 6-7, 4-5, and 2-3 mo, p = 0.005, p = 0.0011, and p = 0.0025, respectively). For the 120 dB stimulus intensity, subjects in older age groups showed significantly lower startle responses than those in younger age groups (8-12 and 6-7 mo < 4-5 mo < 2-3 mo, all comparisons p < 0.0001).
Fig. 7 Reduced contextual and cued fear memory in older C57BL/6J mice. a Freezing (%) and b distance traveled (cm) in the conditioning, context test, and cued test. c Activity suppression ratio in the context and cued tests. d Distance traveled (cm) was measured during each footshock. Values are means ± SEM. *p < 0.05, **p < 0.01, and ***p < 0.001 after Bonferroni correction performed when ANOVAs reached nominal significance. †p < 0.05 after Bonferroni correction performed when ANOVAs reached study-wide significance
The tail suspension test is another method to evaluate depression-related behavior. There was a significant effect of Age on immobility time in the tail suspension test (Fig. 5c; F 3,396 = 8.14, p < 0.0001), which achieved study-wide significance. Immobility times in the 8-12-month-old group were significantly lower than those in the 4-5- and 2-3-month-old groups (8-12 mo < 4-5 and 2-3 mo, p < 0.0001 and p < 0.0001, respectively). Overall, the results of these two different types of tests indicate that immobility decreases with age, suggesting an age-related decrease in depression-related behavior.
Impaired spatial learning and memory in older C57BL/6J mice
To examine age-related changes in spatial learning and memory, behavioral data of the Barnes maze test were analyzed for the first, fifth, and ninth block of two trials during the training session. There were no significant effects of Age on the number of errors (Fig. 6a) in the first or fifth blocks (F 3,349 = 0.82, p = 0.4838; F 3,350 = 1.44, p = 0.2307, respectively); however, a significant effect of Age on the number of errors was found in the ninth block (F 3,349 = 7.55, p < 0.0001), which reached study-wide significance. Subjects in the oldest age group showed a significantly greater number of errors in reaching around the target hole when compared to subjects in the other age groups in the ninth block (8-12 mo > 6-7, 4-5, and 2-3 mo, p = 0.0028, p = 0.0004, and p < 0.0001, respectively). With regard to the latency to reach around the target hole (Fig. 6b), one-way ANOVAs revealed significant effects of Age during the first, fifth, and ninth blocks (F 3,349 = 2.95, p = 0.0327; F 3,350 = 3.32, p = 0.02; F 3,349 = 3.86, p = 0.0097, respectively). In both the first and fifth blocks, 8-12-month-old subjects exhibited longer latencies to reach around the target hole than subjects in the 6-7- and 2-3-month-old groups, although the differences were not significant after a Bonferroni correction (for the first block, 8-12 mo > 6-7 and 2-3 mo, p = 0.0278 and p = 0.0088, respectively; for the fifth block, 8-12 mo > 6-7 and 2-3 mo, p = 0.0131 and p = 0.0219, respectively). Similarly, in the ninth block, subjects in the 8-12-month-old group showed longer latencies than 6-7-, 4-5-, and 2-3-month-old groups (p = 0.0021, p = 0.0195, and p = 0.0653, respectively). Regarding distance traveled to reach around the target hole (Fig. 6c), there was no significant Age effect in the first block (F 3,349 = 1.30, p = 0.2742), but significant effects of Age were found in the fifth and ninth blocks (F 3,350 = 3.50, p = 0.0157; F 3,349 = 10.13, p < 0.0001, respectively). Subjects in the 8-12-month-old group traveled longer distances than other age groups in the fifth block (vs. 6-7 mo, p = 0.0056; vs. 4-5 mo, p = 0.099; vs. 2-3 mo, p = 0.0176) and in the ninth block (vs. 6-7 mo, p < 0.0001; vs. 4-5 mo, p < 0.0001; vs. 2-3 mo, p < 0.0001).
One day after the last training session, in the probe test, there was a significant effect of Age on the time spent around the target hole (Fig. 6d; F 3,349 = 21.86, p < 0.0001), which reached study-wide significance. The 8-12-, 6-7-, and 4-5-month-old group subjects spent significantly less time than the 2-3-month-old group subjects (all comparisons p < 0.0001), and 8-12-month-old group subjects spent less time around the target hole than 6-7- and 4-5-month-old mice (p = 0.0078 and p < 0.0001, respectively). It is unlikely that older groups were unable to reach the target hole because of decreased locomotor activity or increased anxiety-like behavior. In fact, there was no significant effect of Age on the number of errors in the probe test (Fig. 6e; F 3,349 = 2.41, p = 0.0667). Additionally, older age mice moved significantly more than mice in the youngest age group over the test period (Fig. 6f: for moving time, F 3,349 = 38.18, p < 0.0001; 8-12 mo > 6-7, 4-5, and 2-3 mo, all comparisons p < 0.0001; 4-5 mo > 2-3 mo, p < 0.0001; Fig. 6g: for total distance, F 3,349 = 6.98, p = 0.0001; 8-12, 6-7, and 4-5 mo > 2-3 mo, all comparisons p < 0.0001), although subjects in the oldest group exhibited reduced moving speed compared with subjects in the younger groups (Fig. 6h; F 3,349 = 5.39, p = 0.0012; 8-12 < 6-7 and 4-5 mo, p = 0.0007 and p = 0.0008, respectively). These data indicate that spatial learning and memory performance decreases from young adulthood to middle age.
Reduced contextual and cued fear memory in older C57BL/6J mice
The contextual and cued fear conditioning test is used to assess fear memory. In the conditioning session, freezing behavior and distance traveled during the first 2 min of the session without presentation of the conditioned stimulus (CS, white noise) and unconditioned stimulus (US, footshock) were evaluated to assess baseline activity in the novel environment. In the first 2 min of the session, there was a marginally significant effect of Age on the percentage of freezing (Fig. 7a; F 3,510 = 2.39, p = 0.0678) and there was a significant Age effect on distance traveled (Fig. 7b; F 3,510 = 5.67, p = 0.0008). Older age subjects traveled significantly shorter distances than younger age subjects (8-12 mo < 4-5 and 2-3 mo, p = 0.0018 and p = 0.0002, respectively). Similarly, during the last 6 min of the conditioning session with CS-US pairings, significant effects of Age were found on the percentage of freezing (F 3,510 = 12.81, p < 0.0001) and distance traveled (F 3,510 = 4.29, p = 0.0053). Post-hoc analysis revealed that subjects in the 8-12-month-old group showed significantly greater freezing than those in other age groups (vs. 6-7, 4-5, and 2-3 mo, all comparisons p < 0.0001). In addition, the subjects in the 8-12-month-old group traveled shorter distances than subjects in other age groups (vs. 6-7 mo, p = 0.0022; vs. 4-5 mo, p = 0.0025; vs. 2-3 mo, p = 0.0147). During the first US presentation (US1), there was a significant effect of Age on the distance traveled (Fig. 7d; F 3,510 = 3.30, p = 0.0203). Subjects in the 8-12-month-old group traveled shorter distances during the US presentation than 4-5-month-old subjects (p = 0.0029). No significant age effects were found in the distance traveled during the second and third US presentations (US2, F 3,510 = 2.18, p = 0.09; US3, F 3,510 = 0.85, p = 0.4691).
In the context test, in which mice are exposed to the same conditioning chamber 24 h after the conditioning session, no significant effects of Age were found for freezing and distance traveled (F 3,510 = 0.20, p = 0.8971; F 3,510 = 1.07, p = 0.3605, respectively). However, there was a significant effect of Age on the activity suppression ratio (F 3,510 = 10.50, p < 0.0001), an index of fear that has been used to normalize individual differences in baseline activity [31,32]. The ratio was calculated as (distance traveled during the first 2 min in the context test)/(distance traveled during the first 2 min in the conditioning + distance traveled during the first 2 min in the context test), so that a higher ratio indicates less suppression of activity in the conditioned context. This effect achieved study-wide significance. The activity suppression ratio of the 8-12-month-old group was significantly higher than that of other age groups (Fig. 7c; 8-12 mo > 6-7, 4-5, and 2-3 mo, p = 0.0005, p < 0.0001, and p < 0.0001, respectively). These results suggest that aging is associated with impaired contextual fear memory.
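The activity suppression ratio defined above is a simple normalization, and a short sketch may make its interpretation concrete. The function and argument names below are hypothetical, the example distances are invented, and raising an error for a zero denominator is an assumption made here rather than something specified in the text.

```python
def activity_suppression_ratio(dist_conditioning_cm, dist_context_cm):
    """Ratio = context distance / (conditioning distance + context distance).

    Both distances are those traveled during the first 2 min of the
    respective session. A value of 0.5 means activity was unchanged between
    sessions; values below 0.5 mean activity was suppressed in the context
    test, which is read as an index of fear.
    """
    total = dist_conditioning_cm + dist_context_cm
    if total <= 0:
        # Assumption: a mouse that did not move in either session has no
        # defined ratio, so signal this rather than return a number.
        raise ValueError("ratio is undefined when total distance is zero")
    return dist_context_cm / total


# Hypothetical distances (cm), for illustration only.
strong_suppression = activity_suppression_ratio(300.0, 100.0)  # 0.25
no_suppression = activity_suppression_ratio(200.0, 200.0)      # 0.5
```

Because the ratio is bounded between 0 and 1 and centered on 0.5 for unchanged activity, it allows groups with different baseline locomotion to be compared on the same scale.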

Discussion
In this study, we performed a large-scale analysis of age effects on behavior in C57BL/6J mice by taking advantage of a large amount of behavioral data that we had collected through a behavioral test battery applied to many mutant strains of mice. This battery of behavioral tests was conducted using uniform protocols and apparatuses in our laboratory, although the experimenters and test dates were not necessarily the same. The genetic background of the subjects and their breeding environment were also not uniform because the C57BL/6J mice were obtained from different vendors and laboratories. Therefore, we cannot completely exclude the possibility of potential genetic and environmental factors that may yield behavioral differences between age groups. Nonetheless, the use of a large number of samples may have minimized the distortion of the data by potential influences of these confounding factors, allowing us to detect the subtle but significant effects of age on behavior. Overall, the present results are indicative of age-related changes in physical characteristics, motor function, locomotor activity, anxiety-like behavior, social behavior, prepulse inhibition, depression-related behavior, and learning and memory functions from young adulthood to middle age, although some behavioral differences between age groups did not reach study-wide significance when the highly conservative statistical approach was used.
Our data revealed age-related physiological changes, including gradual increases in body weight and decreases in body temperature, wire hang latency, and rotarod performance, which are generally consistent with previous reports [33][34][35][36][37][38]. These findings suggest that there are age-related changes from young adulthood to middle age in the peripheral and central nervous systems associated with declines in thermoregulation, neuromuscular strength, and motor function in C57BL/6J mice, though the gradual decline in motor function may be explained by an increase in body weight with age. With regard to pain sensitivity, some studies have reported that hot plate latency decreases with age when compared between 4-, 11-, and 24-month-old 129Sv/Ev mice and between 3-5- and 19-21-month-old BALB/c mice [39,40]. A similar finding was reported in 2-28-month-old C57BL/6J mice in the tail-withdrawal test [41], suggesting an age-related increase in pain sensitivity. Contrary to these findings, no age-related changes in sensitivity to thermal stimulus were found in our study or a previous study comparing 4- and 28-month-old C57BL/6J mice [6]. Consistent with these results, the present study also showed that there were no substantial differences in distance traveled in response to electric footshock. The discrepancy between the studies might be due to differences in strain, age of testing, and prior test experience, and further study will be required to understand the relationship between age and pain sensitivity in mice.
The open field test is widely used to measure locomotor activity and anxiety-like behavior in a novel environment (for review, see [30]). Previous studies have reported that older C57BL/6 mice showed reductions in the distance traveled and center time when compared with younger mice [42][43][44]. Similarly, the present results indicated an age-related decrease from young adulthood to middle age in the distance traveled during the first 5 min of the open field test, the period in which anxiety-like behavior is generally assessed [30]. It is unlikely that the decreased distance traveled is due to age-related declines in muscular strength and motor function as shown in the wire hang and rotarod tests, since the total distance traveled during the entire test period did not significantly differ between age groups, which suggests no reduction in general activity with age. In agreement with the findings of the open field test, older mice traveled shorter distances than younger mice in the light/dark transition, elevated plus maze, and social interaction tests, which have also been used to assess anxiety-like behavior [45][46][47][48], whereas no reduction in distance traveled in older mice was found in the dark chamber of the light/dark transition apparatus, which is considered to be a less anxiogenic environment. Together, these findings suggest that the decreased locomotor activity in older mice during the early period may be due to an increased emotional response to a novel environment, and not age-related muscle weakness and motor dysfunction. Interestingly, older mice tended to spend less time in the center area than younger mice during the first 5 min of the open field test. In addition, older mice spent less time in the light chamber and exhibited longer latency to first entry into the light chamber compared with younger mice in the light/dark transition test. Together, these results support the conclusion that anxiety-like behavior increases with age.
In contrast, older mice showed increased percentages of open arm entries and time spent in open arms in the elevated plus maze test, which could be interpreted as a decrease in anxiety-like behavior. Such conflicting results from the two different tests have been reported in several inbred and knockout mouse studies [49][50][51][52]. These reports suggest that the behavioral indices of each test may reflect different aspects of anxiety-like behavior, as has been confirmed by principal component analyses [53,54]. Some researchers have speculated that the increased exploration of open arms may reflect an increased panic-like escape response to stress and/or a higher level of anxiety [50][51][52][55], which may be partially supported by the finding that mice showing increased open arm exploration exhibited increased stress response or a higher plasma corticosterone level [52]. Aged animals showed higher corticosterone levels than younger animals after exposure to a novel environment or a sudden noise [56][57][58]. These findings suggest that anxiety- and panic-like behaviors in response to a novel/stressful environment increase with age from young adulthood to middle age. However, the developmental processes of anxiety- or panic-like behaviors need to be further investigated since previous studies have reported some inconsistent findings on age-related changes in the anxiety-like behaviors assessed in these tests [42,43][59][60][61][62].
Reduced locomotor activity and social contacts in older mice were observed in the social interaction test. Our results are consistent with previous reports showing that middle-aged and aged animals exhibited less social behavior than their young counterparts [39,59,63,64]. In the social interaction test, the reduction of locomotor activity and social contacts may reflect increased anxiety (for review, see [47]). The decreased social behavior in older mice seems to agree with the findings obtained from the open field and light/dark transition tests. These findings suggest that aging from young adulthood to middle age is associated with decreases in social motivation to approach and social investigation accompanied by increased anxiety in response to a novel social environment.
Our results revealed that the startle response to a 110 dB stimulus does not change from approximately 2-7 months of age, and decreases thereafter. Additionally, the startle response to a 120 dB stimulus is the highest at 2-3 months of age and then gradually decreases with age. These findings suggest that younger mice are more sensitive to the auditory stimuli than older mice, indicating that there is an age-related decrease in the startle response to a loud noise. C57BL/6J mice show cochlear degeneration and hearing loss with advancing age [65,66]. Therefore, the age-related decline in the acoustic responses may be partly attributed to age-related hearing impairments. Prepulse inhibition of the startle response in C57BL/6J mice was found to increase from approximately 2-7 months of age and tended to decrease thereafter. Similar inverted U-shaped changes with age have been reported in previous studies [66,67]. Given that there was little to no correlation between startle response and percentage of prepulse inhibition (correlation coefficients ranging from r = −0.192 to r = 0.154), the age-related changes in prepulse inhibition seem not to be simply related to the age-related decrease in startle response or hearing ability. Although it remains unclear why there is an inverted U-shaped relation between age and prepulse inhibition, these findings suggest that auditory information processing in the central nervous system changes from young adulthood to middle age.
The effect of age on depression-related behavior remains controversial because previous studies have shown inconsistent findings on depression-related behavior in rats and mice [60,62,68-72]. Some studies in mice have reported that young, middle-aged, and aged mice showed no differences in immobility in the forced swim and tail suspension tests [60,68,69,72]. However, our data indicated that immobility decreases from young adulthood to middle age in the two tests. One possible reason for the age-related differences found in our study is that the sample size was large enough to detect statistically significant differences. Another possible explanation for the discrepancy between studies is that there might be differences in animal species, strain, behavioral procedures, and prior experience with stress. Prior test experience and stress can alter subsequent behavioral responses [24,26,73-77]. In our study, mice were subjected to a number of behavioral tests before the Porsolt forced swim and tail suspension tests. This test history may have contributed to the age-related differences in depression-related behavior, although further study will be needed to clarify the precise effects of age coupled with prior test experience on depression-related behavior.
The effect of age on spatial memory has been intensively studied using the Barnes maze task [78] and the Morris water maze task (for review, see [79]) in rats and mice. A spatial memory deficit in the water maze task has been found in C57BL/6 mice older than approximately 12 months [80][81][82][83][84][85]. However, since this task requires animals to have the ability and motivation to swim, the group differences in memory performance might be due to differences in swimming ability [86]. Furthermore, the task paradigm is a stressful situation, which can affect memory performance. In contrast, the Barnes maze task is a dry-land task that is less stressful and may be a more appropriate task for mice [87,88], although memory performance in this task has not been systematically evaluated across the entire lifespan [79]. Our data indicated that 8-12-month-old mice exhibited an increase in the number of errors, latency to approach the target, and distance traveled to the target hole when compared with younger mice during the training sessions. These results are consistent with a previous report [89], indicating that spatial learning and memory functions decline from 12 months of age in C57BL/6J mice. The present results of the probe test further indicate that spatial memory deficits can be observed after 4-5 months of age. It would be of interest to examine the effect of age on other spatial learning and memory tasks, such as the Morris water maze and eight-arm radial maze tasks.
A number of studies have reported little or no age-related differences in the contextual and cued fear memory performance in rats and mice (e.g., [90][91][92][93][94][95]; for review, see [79,96]), whereas some studies have found age-related deficits in contextual memory when comparing young (3-6-month-old) and aged (16-18-month-old) C57BL/6 mice [97,98]. Our study showed that the activity suppression ratio in the context test increased at 8 months of age, suggesting that a contextual fear memory deficit can occur after middle age in C57BL/6 mice. In addition, we found decreased freezing and an increased activity suppression ratio in the altered context during CS presentation in 8-12-month-old mice, indicating a deficit in cued fear memory. Furthermore, the present study found an increase in freezing behavior in 4-12-month-old mice compared with 2-3-month-old mice when the CS was not presented in the altered context, suggesting an age-related increase in generalized fear/anxiety or age-related deficits in the ability to discriminate between contexts, or spatial pattern separation.
Overall, the present study reveals that there are age-related changes in various behaviors from young adulthood to middle age. Some of the behavioral changes were observed at the early stage of adulthood, and their occurrence depended on the type of behavioral test implemented. For example, decreased locomotor activity occurred between 2-3 months of age and 4-5 months of age, as shown in the number of transitions in the light/dark transition test and in the distance traveled in the elevated plus maze and social interaction tests in a novel environment, whereas such marked changes were not found in the open field test and in the dark chamber of the light/dark transition test. Similarly, there was a gradual decrease in rotarod performance and acoustic startle response after 2-3 months of age. For the Barnes maze test, spatial memory performance decreased after 2-3 months of age, though there were no significant differences in performance between 4-5- and 6-7-month-old subjects, suggesting that spatial memory function was stable from 4 to 7 months of age. In contrast, prepulse inhibition of the startle response increased until 6-7 months of age, and then appeared to decrease after middle age. Pain sensitivity to thermal stimulus and electric footshock was stable from young adulthood to middle age. These findings show that the age of the subjects is one of the possible confounding factors influencing behavioral outcomes and therefore needs to be taken into account in designing and conducting behavioral tests involving a cohort of mice.
Aging is associated with various behavioral changes that are mediated by brain structures and networks [2,99,100]. The age-related changes in behaviors have usually been found by comparing young and aged animals in behavioral tests. Our large-scale analysis using a large body of behavioral data from mice subjected to a behavioral test battery demonstrated that almost all the behaviors examined, including locomotor activity, anxiety-like behavior, social behavior, startle response, depression-related behavior, spatial learning and memory, and associative fear memory, gradually change from young adulthood to middle age. In this study, we used the data of C57BL/6J mice, which are the most widely used inbred strain for creating genetically engineered mice and studying mouse behavioral phenotypes. These findings will provide further opportunities to understand the neurobiological processes underlying brain function and behavior that are changeable with age from early to middle life.

Conclusions
In the present study, our large-scale analysis of the effects of age on behavior demonstrates that various behaviors change from young adulthood to middle age in C57BL/6J mice. Our findings provide insights into understanding the developmental process and the underlying mechanisms of brain and behavior in C57BL/6J mice. In addition, our study showed that relatively narrow age differences can produce great variability in behavior during adulthood. Age is therefore one of the critical factors influencing behaviors that should be carefully considered when designing behavioral experiments and interpreting behavioral differences that could be induced by experimental manipulations.

Animals and experimental design
Genetically engineered mice and their wild-type control mice were transported from the animal facilities of other laboratories or vendors to our laboratory, and subjected to a behavioral test battery. Behavioral data of up to 1739 wild-type control male mice that we have collected from the behavioral analysis of 61 strains of genetically engineered mice with a C57BL/6J genetic background were used for analysis in this study. We did not exclude any behavioral data except for some specific cases in which animals fell from a testing apparatus or data were not recorded due to some technical problems. More than 90 % of the mice were backcrossed at least six times (and more than 95 % of the mice used were backcrossed at least five times) with C57BL/6J mice. The wild-type control mice, which were derived from the JAX C57BL/6J strain or C57BL/6J substrains (6JJcl or 6JJmsSlc) maintained in Japan, were regarded as "C57BL/6J" mice. They were housed in plastic cages with sterilized PaperClean Bedding (Japan SLC) under a 12-h light/dark cycle (lights on at 7:00 a.m.) with access to food (CRF-1, Oriental Yeast Co., Ltd.) and water ad libitum in our animal facilities. Behavioral testing was performed between 9:00 a.m. and 6:00 p.m. Of the mice, 1.3, 2.3, 6.7, 80.1, 8.9, and 0.7 % were housed with 1, 2, 3, 4, 5, and 6 animals per cage, respectively, at the beginning of the test battery. The mice were generally tested in the following order: general health and neurological screening, light/dark transition, open field, elevated plus maze, hot plate, social interaction, rotarod, startle response/prepulse inhibition, Porsolt forced swim, Barnes maze, contextual and cued fear conditioning, and tail suspension tests. The interval between tests was at least 1 day. More than 75 % of the mice were subjected to the behavioral tests in accordance with the order of the test battery, although in some strains of mice, several tests were omitted from the test battery.
For the remaining mice, several tests were performed in a different order and/or were omitted from the test battery. After the tests, all apparatus were cleaned with super hypochlorous water and 70 % ethanol to prevent a bias due to olfactory cues. The information about each mouse and the behavioral data used in this study are publicly available in the "Mouse Phenotype Database" (URL: http://www.mouse-phenotype.org/). All behavioral testing procedures were approved by the Animal Care and Use Committee of Kyoto University Graduate School of Medicine and National Institute for Physiological Sciences in Japan.

General health and neurological screen
The general health check and neurological screen were conducted as previously described [101][102][103]. Body weight and rectal temperature were measured. Neuromuscular strength was assessed using the grip strength and wire hang tests. A grip strength meter (O'Hara & Co., Tokyo, Japan) was used to assess forelimb grip strength. Mice were lifted and held by their tail so that their forepaws could grasp a wire grid. The mice were then gently pulled backward by the tail until they released the grid. The peak force applied by the forelimbs of the mouse was recorded in Newtons (N). Each mouse was tested three times, and the largest value was used for statistical analysis. In the wire hang test, the mouse was placed on a wire mesh that was then inverted, and the latency to fall from the wire was recorded with a 60 s cut-off time.

Light/dark transition test
The light/dark transition test, originally developed by Crawley and colleagues [104], was performed as previously described [105]. The apparatus consisted of a cage (21 × 42 × 25 cm) divided into two sections of equal size by a partition with a door (O'Hara & Co., Tokyo, Japan). One chamber was brightly illuminated (390 lux), whereas the other chamber was dark (2 lux). Mice were placed into the dark chamber and allowed to move freely between the chambers with the door open for 10 min. The total number of transitions between chambers, time spent in each chamber (s), latency to first enter the light chamber (s), and distance traveled in each chamber (cm) were recorded automatically by ImageLD software (see Section, "Image analysis").

Open field test
The open field test was used to evaluate locomotor activity and emotional response [30]. The apparatus was a transparent square cage (42 × 42 × 30 cm; Accuscan Instruments, Columbus, OH, USA). The center of the floor was illuminated at 100 lux. Each mouse was placed in the open field apparatus and recorded for 120 min. Total distance traveled (cm), vertical activity (rearing measured by counting the number of photobeam interruptions), time spent in the center area (20 × 20 cm), and the beam-break counts for stereotyped behaviors were measured.

Elevated plus maze test
The elevated plus maze test was conducted as previously described [106]. The elevated plus maze consisted of two open arms (25 × 5 cm, with 3-mm-high ledges) and two closed arms (25 × 5 cm, with 15-cm-high transparent walls) of the same size (O'Hara & Co., Tokyo, Japan). The arms and central square were made of white plastic plates and were elevated to a height of 55 cm above the floor. The arms of the same type were arranged at opposite sides to each other. The center of the maze was illuminated at 100 lux. Each mouse was placed in the central square of the maze (5 × 5 cm), facing one of the closed arms, and was recorded for 10 min. The distance traveled (cm), number of total entries into arms, percentage of entries into open arms, and percentage of time spent in open arms were calculated automatically using ImageEP software (see Section, "Image analysis").

Hot plate test
The hot plate test was performed to evaluate sensitivity to a painful stimulus. Mice were placed on a 55.0 ± 0.3°C hot plate (Columbus Instruments, Columbus, OH, USA), and latency to the first paw response (s) was recorded with a 15 s cut-off time. The paw response was defined as either a foot shake or a paw lick.

Social interaction test in a novel environment
The social interaction test [107,108] was conducted as previously described [51]. Two mice of the same genotype that had previously been housed in different cages were placed into a box together (40 × 40 × 30 cm; O'Hara & Co., Tokyo, Japan) and were allowed to explore freely for 10 min. Mouse behavior was analyzed automatically using ImageSI software (see Section, "Image analysis"). The total duration of contacts (s), number of contacts, total duration of active contacts (s), mean duration per contact, and total distance traveled (cm) were measured. The active contact was defined as follows: images were captured at three frames per second, and distance traveled between two successive frames was calculated for each mouse. If the two mice contacted each other and the distance traveled by either mouse was 5 cm or more, the behavior was considered an "active contact."
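The active-contact rule above can be illustrated with a minimal Python sketch. It is not the ImageSI implementation. It assumes that per-frame (x, y) positions in cm and a per-frame contact flag are already available for each mouse; how contact itself is detected is not specified here, so `in_contact` is a hypothetical input, as are all function names.

```python
import math

FPS = 3                # images captured at three frames per second
ACTIVE_STEP_CM = 5.0   # minimum distance between successive frames

def step_distances(track):
    """Distance (cm) traveled between each pair of successive frames."""
    return [math.dist(a, b) for a, b in zip(track, track[1:])]

def active_contact_flags(track_a, track_b, in_contact):
    """One flag per successive frame pair.

    Assumption: the contact flag of the later frame of each pair is used.
    A pair is an "active contact" if the mice are in contact and either
    mouse traveled 5 cm or more between the two frames.
    """
    steps_a = step_distances(track_a)
    steps_b = step_distances(track_b)
    return [
        bool(contact) and max(da, db) >= ACTIVE_STEP_CM
        for contact, da, db in zip(in_contact[1:], steps_a, steps_b)
    ]

def active_contact_duration_s(flags):
    """Total duration of active contacts in seconds (one frame = 1/FPS s)."""
    return sum(flags) / FPS

# Toy data: mouse A moves 6 cm, then 0 cm, then 1 cm; mouse B stays still.
track_a = [(0.0, 0.0), (6.0, 0.0), (6.0, 0.0), (7.0, 0.0)]
track_b = [(10.0, 0.0)] * 4
in_contact = [False, True, True, False]
flags = active_contact_flags(track_a, track_b, in_contact)
```

In this toy example only the first frame pair qualifies, because it is the only one in which the mice are in contact and one of them moved at least 5 cm.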

Startle response/prepulse inhibition test
A startle reflex measurement system (O'Hara & Co., Tokyo, Japan) was used to measure startle response to a loud noise and prepulse inhibition of the startle response. A test session began by placing a mouse in a plastic cylinder where it was left undisturbed for 10 min. White noise (40 ms) was used as the startle stimulus for all trial types. The startle response was recorded for 400 ms starting with the onset of the startle stimulus. The background noise level was 70 dB. The peak startle amplitude was used as a dependent variable. A test session consisted of six trial types (i.e., two types of startle-stimulus-only trials and four types of prepulse inhibition trials). The intensity of the startle stimulus was either 110 or 120 dB. The prepulse sound was presented 100 ms before the onset of the startle stimulus, and its intensity was 74 or 78 dB (20 ms). Four combinations of prepulse and startle stimuli were used (74-110, 78-110, 74-120, and 78-120 dB). Six blocks of the six trial types were presented in a pseudorandom order such that each trial type was presented once within a block. The average inter-trial interval was 15 s (range: 10-20 s).
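The block structure of the session can be sketched in Python as follows. This is an illustration only, not the control software of the startle system. The uniform distribution of inter-trial intervals is an assumption chosen to match the stated range (10-20 s) and mean (15 s), and all names are hypothetical.

```python
import random

STARTLE_DB = (110, 120)
PREPULSE_DB = (74, 78)

# Six trial types: two startle-only trials (prepulse is None) and four
# prepulse + startle combinations (74-110, 78-110, 74-120, 78-120 dB).
TRIAL_TYPES = [(None, s) for s in STARTLE_DB] + [
    (p, s) for s in STARTLE_DB for p in PREPULSE_DB
]

def make_session(n_blocks=6, seed=None):
    """Build a session of n_blocks blocks, each a shuffled copy of the
    six trial types, so every type occurs exactly once per block."""
    rng = random.Random(seed)
    session = []
    for block_index in range(n_blocks):
        block = list(TRIAL_TYPES)
        rng.shuffle(block)
        for prepulse_db, startle_db in block:
            session.append({
                "block": block_index,
                "prepulse_db": prepulse_db,
                "startle_db": startle_db,
                # Assumed: inter-trial interval drawn uniformly from 10-20 s.
                "iti_s": rng.uniform(10.0, 20.0),
            })
    return session

session = make_session(seed=0)
```

Shuffling within each block, rather than across the whole session, is what guarantees that each trial type is presented once per block.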

Porsolt forced swim test
The Porsolt forced swim test, developed by Porsolt et al. [109], was performed as previously described [102,110,111]. The apparatus consisted of four plastic cylinders (20 × 10 cm; O'Hara & Co., Tokyo, Japan). The cylinders were filled with water (approximately 23°C) to a height of 7.5 cm. Mice were placed into the cylinders, and the immobility and distance traveled were recorded over a 10-min test period. Images were captured, and for each pair of successive frames, the amount of area (pixels) within which the mouse moved was measured. When the amount of area was below a certain threshold, mouse behavior was judged as "immobile". When the amount of area equaled or exceeded the threshold, the mouse was considered as "moving". The optimal threshold by which to judge was determined by adjusting it to the amount of immobility measured by human observation. Immobility lasting for less than 2 s was not included in the analysis. Data acquisition and analysis were performed automatically using ImagePS software (see Section, "Image analysis").
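The immobility scoring described above can be sketched in Python. This is a simplified illustration rather than the ImagePS code. The frame rate and the pixel threshold are not given in the text, so `fps` and `threshold` are hypothetical parameters, and the function name is illustrative.

```python
def immobility_seconds(moved_area, threshold, fps, min_bout_s=2.0):
    """Total immobility time in seconds.

    moved_area: pixels of movement for each pair of successive frames.
    A frame pair is "immobile" when its moved area is below threshold.
    Immobile bouts lasting less than min_bout_s are not counted.
    """
    total_frames = 0
    run = 0  # length of the current immobile bout, in frame pairs

    def close_bout(run_length):
        # Count the bout only if it lasted at least min_bout_s seconds.
        return run_length if run_length / fps >= min_bout_s else 0

    for area in moved_area:
        if area < threshold:
            run += 1
        else:
            total_frames += close_bout(run)
            run = 0
    total_frames += close_bout(run)  # bout still open at the end
    return total_frames / fps

# Toy data at 1 frame pair per second: bouts of 3 s, 1 s, and 2 s.
areas = [0, 0, 0, 10, 0, 10, 0, 0]
total = immobility_seconds(areas, threshold=5, fps=1)
```

In the toy data the 1 s bout is discarded and the 3 s and 2 s bouts are kept, so 5 s of immobility are counted.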

Barnes maze test
The Barnes circular maze task, developed by Barnes [78], was conducted on a white circular surface (1.0 m in diameter, with 12 holes equally spaced around the perimeter; O'Hara & Co., Tokyo, Japan). The circular open field was elevated 75 cm from the floor. A black Plexiglas escape box (17 × 13 × 7 cm), which had paper cage bedding on its bottom, was located under one of the holes. The hole above the escape box represented the target, analogous to the hidden platform in the Morris water maze task. The location of the target was consistent for a given mouse but randomized across mice. The maze was rotated daily, with the spatial location of the target unchanged with respect to the distal visual room cues, to prevent a bias based on olfactory or proximal cues within the maze. One to four trials per day were conducted as training sessions. The number of errors, latency to reach the target (s), and distance traveled to reach the target (cm) were automatically calculated by ImageBM software (see Section, "Image analysis"). After the last training session, a probe test was conducted without the escape box for 3 min, to confirm that this spatial task was acquired based on navigation by distal environment cues. The time spent around each hole, number of errors, moving time (s), distance traveled (cm), and moving speed (cm/s) were recorded using ImageBM software.

Contextual and cued fear conditioning test
The contextual and cued fear conditioning test was performed as previously described [112]. In brief, each mouse was placed in a transparent acrylic chamber (33 × 25 × 28 cm) with a stainless-steel grid floor (0.2 cm diameter, spaced 0.5 cm apart; O'Hara & Co., Tokyo, Japan) and was allowed to explore freely for 2 min. Subsequently, a 55 dB white noise, which served as the conditioned stimulus (CS), was presented for 30 s. During the last 2 s of CS presentation, a mild footshock (0.3 mA, 2 s), which served as the unconditioned stimulus (US), was presented. Two more CS-US pairings were presented with a 2-min inter-stimulus interval. Twenty-four hours after the conditioning session, a context test was conducted in the same chamber. A cued test with altered context was then performed in a triangular chamber (33 × 29 × 32 cm) made of white opaque plastic, which was located in a different room. In each test, freezing percentage and distance traveled were calculated automatically using ImageFZ software (see Section, "Image analysis").

Tail suspension test
The tail suspension test, developed by Steru et al. [113], was performed as described previously [114]. Each mouse was suspended 30 cm above the floor by the tail in a white plastic chamber (31 × 41 × 41 cm) (O'Hara & Co., Tokyo, Japan). The behavior was recorded for 10 min. Images were captured through a video camera, and immobility was measured. Similar to the Porsolt forced swim test, immobility was judged by ImageTS software (see Section, "Image analysis") according to a certain threshold. Immobility lasting for less than 2 s was not included in the analysis.

Image analysis
The application programs used for the behavioral experiments (ImageLD, EP, SI, PS, BM, FZ, and TS) were based on the public domain NIH Image program (developed at the U.S. National Institutes of Health and available at http://rsb.info.nih.gov/nih-image/) and the ImageJ program (http://rsb.info.nih.gov/ij/), which were modified for each test by Tsuyoshi Miyakawa.

Statistical analysis
Statistical analysis was conducted using SAS University Edition (SAS Institute, Cary, NC). Data were analyzed using a one-way ANOVA. We set the "study-wide significance" level to p < 0.05/69 = 0.000724 by Bonferroni correction based on 69 behavioral measures used in the test battery. "Nominal significance" was defined as the one that achieved a statistical significance in an index (p < 0.05) but did not survive the correction. The post-hoc multiple comparisons were further performed using Fisher's PLSD with Bonferroni correction (for the study wide significance, p < 0.000724/6 = 0.00012; for the nominal significance, p < 0.05/6 = 0.008333). Values in graphs are expressed as mean ± SEM.
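The significance thresholds above follow from simple arithmetic, which the Python sketch below reproduces. It is only an illustration and not the SAS analysis itself. The divisor 6 is interpreted here as the number of pairwise comparisons among the four age groups, which is an assumption; the function and variable names are hypothetical.

```python
import math

ALPHA = 0.05
N_MEASURES = 69                # behavioral measures in the test battery
N_PAIRWISE = math.comb(4, 2)   # assumed: 6 pairwise comparisons of 4 age groups

# One-way ANOVA thresholds
study_wide = ALPHA / N_MEASURES            # about 0.000724
nominal = ALPHA                            # 0.05

# Post-hoc thresholds (Fisher's PLSD with Bonferroni correction)
posthoc_study_wide = study_wide / N_PAIRWISE   # about 0.00012
posthoc_nominal = ALPHA / N_PAIRWISE           # about 0.008333

def classify(p, strict=study_wide, lenient=nominal):
    """Label a p-value as study-wide significant, nominally significant,
    or not significant, using the two thresholds."""
    if p < strict:
        return "study-wide"
    if p < lenient:
        return "nominal"
    return "not significant"

labels = [classify(p) for p in (0.0001, 0.01, 0.2)]
posthoc_label = classify(0.001, posthoc_study_wide, posthoc_nominal)
```

Passing the post-hoc thresholds to `classify` applies the same two-level scheme to pairwise comparisons: a p-value of 0.001 is nominally but not study-wide significant at the post-hoc level.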