A Synaptic Basis for Auditory–Vocal Integration in the Songbird

Articles, Behavioral/Systems/Cognitive

, Melissa J. Coleman, Todd F. Roberts, Arani Roy, Jonathan F. Prather and Richard Mooney

Journal of Neuroscience 6 February 2008, 28 (6) 1509-1522; https://doi.org/10.1523/JNEUROSCI.3838-07.2008

Abstract

Songbirds learn to sing by memorizing a tutor song that they then vocally mimic using auditory feedback. This developmental sequence suggests that brain areas that encode auditory memories communicate with brain areas for learned vocal control. In the songbird, the secondary auditory telencephalic region caudal mesopallium (CM) contains neurons that encode aspects of auditory experience. We investigated whether CM is an important source of auditory input to two sensorimotor structures implicated in singing, the telencephalic song nucleus interface (NIf) and HVC. We used reversible inactivation methods to show that activity in CM is necessary for much of the auditory-evoked activity that can be detected in NIf and HVC of anesthetized adult male zebra finches. Furthermore, extracellular and intracellular recordings along with spike-triggered averaging methods indicate that auditory selectivity for the bird's own song is enhanced between CM and NIf. We used lentiviral-mediated tracing methods to confirm that CM neurons directly innervate NIf. To our surprise, these tracing studies also revealed a direct projection from CM to HVC. We combined irreversible lesions of NIf with reversible inactivation of CM to establish that CM supplies a direct source of auditory drive to HVC. Finally, using chronic recording methods, we found that CM neurons are active in response to song playback and during singing, indicating their potential importance to song perception and processing of auditory feedback. These results establish the functional synaptic linkage between sites of auditory and vocal learning and may identify an important substrate for learned vocal communication.

Introduction

Songbirds use tutor song memories and singing-related feedback to adaptively modify their vocalizations (Marler and Tamura, 1964; Konishi, 1965; Immelmann, 1969; Price, 1979). This form of learning requires auditory–vocal integration. Anatomical and physiological studies in songbirds have illuminated distinct pathways for auditory memory and learned vocal control. However, the anatomical and functional interactions between these pathways remain unclear.

Recent evidence indicates that secondary regions of the avian auditory telencephalon are important sites for the formation and storage of auditory memories (Bolhuis and Gahr, 2006). Of these regions, the caudal mesopallium (CM) (see Fig. 1A) displays several features consistent with a role in encoding memories of conspecific songs. First, CM neurons in zebra finches show enhanced selectivity for conspecific vocalizations, suggesting that CM resides at the apex of an auditory processing hierarchy for conspecific songs (Theunissen et al., 2004). Second, auditory selectivity of some CM neurons can be modified in operant conditioning tasks, indicating that CM encodes information about an individual's auditory experience (Gentner and Margoliash, 2003). Third, CM receives afferents from the caudomedial nidopallium (NCM) (see Fig. 1A), an area implicated in the long-term storage of auditory memories, including those of the tutor song (Mello and Clayton, 1994; Jarvis et al., 1995; Vates et al., 1996; Bolhuis et al., 2000). Although these studies suggest that CM is important for auditory learning, the role of CM in song motor learning is unknown.

The songbird's brain is distinguished by the song system (see Fig. 1B), a constellation of interconnected sensorimotor nuclei essential to singing and song learning (Nottebohm et al., 1976, 1982; Bottjer et al., 1984; Scharff and Nottebohm, 1991). Many song system neurons respond to auditory presentation of the bird's own song (BOS), indicating that the song system receives auditory input (McCasland and Konishi, 1981; Margoliash, 1983; Doupe and Konishi, 1991). In contrast to neurons in primary and secondary auditory telencephalon, many song system neurons exhibit exquisite selectivity for spectral and temporal features of the BOS (Lewicki and Arthur, 1996; Theunissen et al., 2004). How auditory information enters the song system and the origins of BOS selectivity remain obscure. The telencephalic nucleus interface (NIf) is the earliest site to display both auditory and song premotor activity (McCasland, 1987; Janata and Margoliash, 1999), and many NIf neurons exhibit a high degree of selectivity for the BOS (Cardin and Schmidt, 2004; Coleman and Mooney, 2004). Additionally, NIf is a major source of auditory input to the telencephalic song nucleus HVC (used as a proper name) (Cardin and Schmidt, 2004; Coleman and Mooney, 2004), which in turn transmits auditory information to brain pathways important to singing or song learning (Mooney, 2000). Thus, identifying the auditory inputs of NIf can establish how auditory information flows into song control networks and better localize where BOS selectivity originates.

Retrograde tracers placed in NIf sparsely label neurons in lateral CM (Vates et al., 1996), but these results are difficult to fully interpret because of a fibers-of-passage confound. Moreover, the functional importance of CM to auditory activity in NIf and the extent to which BOS selectivity is enhanced between CM and NIf remain unknown. Using viral tracing methods and in vivo intracellular and extracellular recordings in anesthetized zebra finches, we find that CM provides direct auditory input to NIf and HVC and that BOS selectivity increases between CM and NIf. Furthermore, chronic recordings made in freely behaving birds suggest that CM has the potential to transmit information about auditory memories and singing-related auditory feedback to the song system. These results establish a synaptic linkage between brain areas important for auditory memory and those important to learned vocal control.

Materials and Methods

General methods for many of these procedures have been described previously (Mooney, 2000; Rosen and Mooney, 2000; Coleman and Mooney, 2004). Experiments were performed using 58 adult [age, >90 d posthatch (dph); 159 ± 55 dph, mean ± SD] male zebra finches (Taeniopygia guttata) in accordance with a protocol approved by the Duke University Institutional Animal Care and Use Committee.

Localization of target brain structures.

Animals were anesthetized with 20% urethane (∼90 μl total; Sigma, St. Louis, MO) delivered intramuscularly in 30 μl aliquots at 30 min intervals. Before placing the bird in the stereotaxic apparatus, the ear bars were centered. A marking pipette was then lowered so its tip was positioned at interaural zero. The pipette was then raised, and the bird was placed in the stereotaxic device by withdrawing the ear bars and then reinserting them in the bird's ear canals. Lidocaine (2%; Abbott Laboratories, Chicago, IL) was applied to the scalp, after which the scalp was dissected along the midline with a scalpel blade. The bird's head was rotated so that the bifurcation of the sinus was directly under the pipette tip. This produces a head angle slightly steeper than 45°. The positions of Field L, CM, NIf, and/or HVC were marked on the skull based on stereotaxic coordinates. The approximate coordinates of each nucleus relative to interaural zero and the brain surface were as follows: Field L2, 1.0 mm rostral, 1.0 mm lateral, 2.0 mm deep; CM, 1.3 mm rostral, 1.3 mm lateral, 0.9 mm deep; NIf, 2.2 mm rostral, 1.7 mm lateral, 2.0 mm deep; HVC, 0 mm rostral, 2.4 mm lateral, 0.4 mm deep. The stereotaxic coordinates we used for localizing CM correspond to the medial part of the caudolateral hyperstriatum ventrale (clHV), as described by Vates et al. (1996). In CM recording experiments, the median depth of CM recording locations was ∼900 μm, with quartiles from 700 to 1100 μm. As a result, most of the recordings we made were from the ventral half of CM. With these locations marked, a metal post was attached to the rostral part of the skull with dental cement and cyanoacrylate. After the cement hardened, the bird was transferred to a sound-attenuating chamber (Industrial Acoustics Company, Bronx, NY) on a vibration isolation table (TMC, Peabody, MA) and placed on a heating pad maintained at 36°C (Harvard Apparatus, Holliston, MA). The bird's head was immobilized via the mounted post and positioned at an angle equivalent to that used in the surgical stereotaxic apparatus. Small craniotomies were made over the nuclei of interest, and a small tear was made in the dura with a minuten pin. Recording electrodes or injection micropipettes were lowered into the brain using a one-dimensional hydraulic micromanipulator (Soma Scientific, Irvine, CA).

Preparation and presentation of acoustic stimuli.

Before each experiment, songs were recorded by placing the subject male zebra finch in a sound-isolation chamber (Industrial Acoustics Company) with a female zebra finch. Songs were amplified and low-pass filtered at 10 kHz, digitized at 22.05 kHz, and stored on a hard drive. Songs were recorded and edited using custom software (Labview; National Instruments, Austin, TX) or Sound Analysis Pro (David Swigger and Ofer Tchernichovski, City College of New York, New York, NY). Songs used in playback experiments were edited to include introductory notes and two or three motifs, the largest repeated unit in the bird's song, and were typically 1.5–3 s in total duration. Several auditory stimuli, including the BOS, reversed BOS (REV), one to three conspecific songs (CON), and white noise (10 kHz bandwidth), were played during the course of the experiment. CON songs were chosen from birds in our colony that were unrelated to the subject animal and were similar in length to the BOS. A sample comparison of a subset of songs (n = 10 song pairs) used in these experiments revealed a BOS duration of 1.63 ± 0.41 s versus a CON duration of 1.78 ± 0.43 s (p = 0.45). Different CON songs were used in different experiments. For inactivation experiments and intracellular NIf recordings, the stimulus set consisted of BOS, REV, and CON, whereas a larger set including more than one CON and noise bursts was used for most investigations of CM neuron response properties. The amplitude envelope of each song was scaled linearly so that the peak intensity at the position of the bird's head was ∼70 dB, measured (A-weighted) with a sound level meter (Radio Shack). Songs and noise stimuli were presented with an interstimulus interval of 6 ± 1 s (mean ± SD) and in a fixed order (BOS, REV, CON, and white noise), but no visual evidence of habituation or facilitation of the auditory responses was detected. To confirm this visual impression, the song-evoked action potential response of a subset of NIf intracellular recordings (n = 10) was tested statistically, revealing no difference between the response evoked by the first and either the second or tenth iteration of BOS playback (p = 0.48 and p = 0.9, respectively). Similarly, a sample of single and multiunit extracellular recordings in CM (n = 15) revealed no difference between BOS-evoked z-scores for the first and last five iterations of BOS playback [the mean of first five iterations, 2.48; mean of the last five, 2.27; p = 0.73; z-scores were used for this comparison because we combined multiunit and single-unit experiments and because there was a high degree of variation in response strength (RS) values measured in CM].

Extracellular and intracellular recordings.

Multiunit extracellular recordings were obtained through carbon-fiber electrodes (0.4–0.8 MΩ, Carbostar electrodes; Kation Scientific, Minneapolis, MN). Single-unit extracellular recordings were obtained through glass micropipettes pulled on a vertical puller (David Kopf Instruments, Tujunga, CA) with tips broken to a diameter of ∼1.0 μm. The electrodes were tip filled with 5% Cascade Blue dextrans [3000 molecular weight (MW) anionic; Invitrogen, Carlsbad, CA] in 3.0 M NaCl and backfilled with 3.0 M NaCl solution, yielding resistances of 10–30 MΩ. Extracellular signals were amplified via a differential amplifier (A-M Systems, Everett, WA) and bandpass filtered (0.3–5 kHz) and digitized (11,025 Hz) for storage on a personal computer. Intracellular recordings were made with sharp electrodes of borosilicate glass (100–200 MΩ; Sutter Instruments, Novato, CA) tip filled with 5% Neurobiotin in 2.0 M KAc and backfilled with 2.0 M KAc. Intracellular signals were amplified (Molecular Devices, Palo Alto, CA), low-pass filtered (3 kHz), and digitized (11,025 Hz) for storage on a personal computer.

Pharmacological inactivation of CM.

To inactivate CM, GABA (250 mM in 0.9% NaCl with 0.5% Texas Red dextran, 3000 MW) was pressure injected using a Picospritzer (30–60 ms pulses at 10 psi; General Valve, Fairfield, NJ) through a glass micropipette with tip broken to ∼10 μm. A pulse of GABA was injected 500 ms before each stimulus presentation for 90 stimulus presentations (30 repetitions of the stimulus set). The location of the injection site was determined post hoc by visualizing the dextran label under epifluorescence. The dextrans, because of their higher MW, likely underestimated the extent of GABA spread. In separate control experiments (data not shown), the spread of GABA from injection sites in CM was measured electrophysiologically by assessing neural activity at regular intervals from the injection site. The spread of GABA was not spherical, most likely because of the relative impermeability of the mesopallial lamina immediately ventral to CM. Neuronal activity within an average of 700 μm dorsal and 350 μm ventral to the injection site was quickly and completely abolished during GABA injection and completely recovered within a few tens of minutes after cessation of GABA injections, indicating that a substantial extent of CM was silenced using this protocol.

NIf lesions and pharmacological inactivation of CM.

To unilaterally lesion NIf, the nucleus was first identified with multiunit recording through carbon-fiber electrodes (0.4–0.8 MΩ, Carbostar electrodes; Kation Scientific), and subsequently ibotenic acid was pressure injected in the area. A total of 300–400 nl of ibotenic acid (7 mg/ml in 0.1N NaOH; Sigma), comprising injections of smaller volumes (∼30 nl) carefully spread over 10–12 spots, was injected into NIf in the right hemisphere of each bird, after which the bird recovered for 4–9 d before additional in vivo electrophysiological recordings were performed. Song recordings made after placing unilateral lesions in NIf did not show any detectable changes in the song pattern (data not shown). After the 4–9 d recovery period, we placed the bird under urethane anesthesia and made multiunit extracellular recordings in HVC while reversibly inactivating CM ipsilateral to the NIf lesion.

To reversibly inactivate CM, we pressure injected GABA (250 mM in 0.9% NaCl with 0.5% dextran-conjugated Alexa-fluor 488, 3000 MW; Invitrogen) using a Picospritzer (30–50 ms pulses at 10 psi; General Valve) through a glass micropipette. During each experiment, the volume of GABA injected was estimated by pressure injecting a single droplet of GABA solution in mineral oil and measuring the diameter of the droplet under a microscope. According to this estimation, 30–90 nl of GABA was injected in CM in each experiment. The confinement of the injected GABA in CM was confirmed by the post hoc visualization of the dextran-conjugated Alexa-fluor 488 in sagittal tissue sections (data not shown). In unilaterally NIf-lesioned birds, the extent of the lesion was confirmed by counterstaining alternate sections of 50 μm thickness for Nissl and parvalbumin as described below (see Tissue collection and histology). In all three birds used in these experiments, the lesion of NIf was complete and included small portions of Field L in two birds (supplemental Fig. 2 provides a comparison of the region containing NIf on the lesioned and intact side of an individual bird; available at www.jneurosci.org as supplemental material).
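The droplet-based volume estimate described above can be illustrated with a short calculation. This is a minimal sketch that assumes the droplet in mineral oil is a perfect sphere; the function name, the 100 μm diameter, and the pulse count are hypothetical values chosen for illustration and are not taken from the study.

```python
import math


def droplet_volume_nl(diameter_um: float) -> float:
    """Volume (nl) of a spherical droplet given its diameter in micrometers."""
    volume_um3 = math.pi * diameter_um ** 3 / 6.0  # sphere: V = pi * d^3 / 6
    return volume_um3 / 1e6  # 1 nl equals 1e6 cubic micrometers


# Hypothetical example values (assumptions, not measurements from the paper).
per_pulse_nl = droplet_volume_nl(100.0)  # one pulse yields a 100 um droplet
total_nl = per_pulse_nl * 90             # assumed 90 pulses in a session
print(f"{per_pulse_nl:.3f} nl per pulse, {total_nl:.1f} nl in total")
```

Because volume scales with the cube of the diameter, a small error in the measured diameter produces a roughly threefold larger relative error in the estimated volume.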

Chronic recordings from awake birds.

Neurons were sampled using a miniaturized micromanipulation device (Fee and Leonardo, 2001) in awake and freely behaving birds. Several days before implantation, birds were transferred from their housing cage to the recording chamber, a sound-attenuating box (Acoustic Systems) where they would reside throughout experimentation. To implant the device, birds were first anesthetized using isoflurane (inhalation, 3% in 100% O2) and then placed in a stereotaxic device with the head positioned at an angle of 45°. A small incision was made in the skin overlying the skull, and the outer leaflet of bone was removed over CM (1.3 mm rostral, 1.3 mm lateral, relative to the bifurcation of the midsagittal sinus). A small craniotomy (∼300 by 300 μm) was made in the inner leaflet over CM, and the microdrive recording device was implanted so that the recording electrodes were slightly dorsal of CM (∼0.5 mm depth). The implant site was covered with a sterile film, the microdrive was secured to the skull using dental cement, and the incision site was closed using surgical skin adhesive (Vetbond; 3M, St. Paul, MN). The bird was monitored closely until it was fully recovered (typically <15 min). Most of the CM neurons (37 of 39) we sampled were from the left hemisphere. Songs used for playback were prepared as described in the previous section. After the recording session was complete (1–3 weeks), the bird was deeply anesthetized with Equithesin and perfused transcardially with saline and then 4% paraformaldehyde (PFA), and the brain was processed histologically. All electrode positions were verified at the end of each experiment using Nissl-stained sagittal sections (thickness, 75 μm).

Data analysis.

All electrophysiological recordings were analyzed off-line using custom software [Labview programs written by Merri Rosen and Stefan Nenkov, Duke University, Durham, NC, and Matlab (MathWorks, Natick, MA) routines written by Jonathan Prather]. For multiunit recordings, the threshold for detecting action potentials was set visually by the user. Stimulus-evoked activity was evaluated using the RS, which is the difference between the mean firing rates observed during the stimulus and during a prestimulus baseline period of similar duration. Significance of these responses was determined using a paired t test. For comparison across cells, response activity was expressed as z-scores. The z-score is the difference between the firing rate during the stimulus versus the baseline divided by the SD of that difference:

$$ z = \frac{\bar{S} - \bar{B}}{\sqrt{\mathrm{Var}(S) + \mathrm{Var}(B) - 2\,\mathrm{Cov}(S, B)}} $$

where $\bar{S}$ is the mean activity during the stimulus, $\bar{B}$ is the mean activity during a baseline period, and the denominator is the SD of (S − B).
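The RS and z-score measures defined above can be sketched in a few lines. This is an illustrative sketch rather than the authors' Labview or Matlab code: the function names and the firing rates are hypothetical, and the use of the sample (n − 1) SD of the per-trial difference is an assumption, because the text does not say which estimator was used.

```python
import math
from statistics import mean


def response_strength(stim_rates, base_rates):
    """RS: mean firing rate during the stimulus minus mean baseline rate."""
    return mean(stim_rates) - mean(base_rates)


def z_score(stim_rates, base_rates):
    """Mean of the per-trial difference (S - B) divided by the SD of that difference."""
    diffs = [s - b for s, b in zip(stim_rates, base_rates)]
    m = mean(diffs)
    # Sample variance of the paired differences; algebraically this equals
    # Var(S) + Var(B) - 2 Cov(S, B). Using n - 1 is an assumption.
    var = sum((d - m) ** 2 for d in diffs) / (len(diffs) - 1)
    return m / math.sqrt(var)


# Hypothetical per-trial firing rates (Hz) for one recording site.
stim = [12.0, 15.0, 11.0, 14.0]
base = [4.0, 5.0, 3.0, 6.0]
print(response_strength(stim, base), z_score(stim, base))
```

Dividing by the SD of the paired difference, rather than by the baseline SD alone, makes the score comparable across single-unit and multiunit sites with very different absolute rates.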

Neuronal selectivity for one stimulus versus another was quantified using the d′ value, which provides a statistical measure for the discriminability between two stimuli:

$$ d' = \frac{2\left(\bar{R}_{\mathrm{STIM1}} - \bar{R}_{\mathrm{STIM2}}\right)}{\sqrt{\sigma^2_{\mathrm{STIM1}} + \sigma^2_{\mathrm{STIM2}}}} $$

where R is the response strength to the stimulus (STIM), $\bar{R}$ is the mean value of R, and $\sigma^2$ is its variance. The selectivity for BOS (STIM1) was compared with each of several stimuli (STIM2): REV, RS, and at least one CON. A d′ score >0.5 was used as the criterion for deeming a cell “BOS selective” (Solis and Doupe, 1997). In cases in which several CON songs were presented, d′ values were calculated for the CON that most closely matched the duration of the BOS.
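As a rough illustration of the d′ selectivity criterion, the sketch below assumes the commonly used form d′ = 2(mean R1 − mean R2) / sqrt(var R1 + var R2) and sample variances. The function name and the per-trial response strengths are hypothetical.

```python
import math
from statistics import mean, variance


def d_prime(rs_stim1, rs_stim2):
    """Discriminability of two stimuli from their per-trial response strengths."""
    numerator = 2.0 * (mean(rs_stim1) - mean(rs_stim2))
    # Sample variances (n - 1); the choice of estimator is an assumption.
    denominator = math.sqrt(variance(rs_stim1) + variance(rs_stim2))
    return numerator / denominator


# Hypothetical per-trial response strengths (Hz) to BOS and to reversed BOS.
bos = [8.0, 10.0, 8.0, 8.0]
rev = [1.0, 3.0, 2.0, 2.0]
selective = d_prime(bos, rev) > 0.5  # criterion stated in the text
print(d_prime(bos, rev), selective)
```

The measure is antisymmetric, so swapping the two stimuli only flips the sign; a cell that prefers REV over BOS would yield a negative value.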

Intracellular subthreshold data were analyzed according to Mooney (2000). Briefly, membrane potential was median filtered using a sliding window (5 ms) to remove action potentials while not significantly distorting slower membrane potential fluctuations. The integrals of the positive-going deviations in membrane potential relative to the mode membrane potential of the baseline prestimulus period were calculated for the periods before and during the stimulus presentation. The prestimulus integral was subtracted from the stimulus integral to assess the depolarizing subthreshold response strength. Similar calculations were conducted to quantify the hyperpolarizing response strength. Average positive and negative response strengths were computed over multiple stimulus iterations and then used to calculate z-scores and d′ values of the positive and negative deflections of membrane potential.
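The subthreshold analysis above (median filtering, then integrating positive-going deviations from the baseline mode) can be sketched as follows. This is a simplified illustration, not the procedure of Mooney (2000) as implemented by the authors: the function names are hypothetical, the window is given in samples rather than milliseconds, and the mode is estimated after rounding to 1 mV, which is an assumption.

```python
from statistics import median, mode


def median_filter(vm, width):
    """Sliding-window median; windows shrink at the edges of the trace."""
    half = width // 2
    return [median(vm[max(0, i - half): i + half + 1]) for i in range(len(vm))]


def positive_integral(vm, reference, dt_ms):
    """Integral (mV * ms) of the deviations above the reference potential."""
    return sum(max(v - reference, 0.0) for v in vm) * dt_ms


def depolarizing_response(pre_vm, stim_vm, dt_ms=1.0, width=5):
    """Stimulus integral minus prestimulus integral, relative to the baseline mode."""
    pre_f = median_filter(pre_vm, width)
    stim_f = median_filter(stim_vm, width)
    baseline_mode = mode([round(v) for v in pre_f])  # 1 mV bins (assumption)
    return (positive_integral(stim_f, baseline_mode, dt_ms)
            - positive_integral(pre_f, baseline_mode, dt_ms))


# Hypothetical traces (mV): a resting baseline with one spike, then a 5 mV depolarization.
pre = [-70.0] * 10
pre[5] = 0.0  # single-sample action potential, removed by the median filter
stim = [-65.0] * 10
print(depolarizing_response(pre, stim))
```

The hyperpolarizing response strength would follow the same pattern with the sign of the deviation reversed.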

Spike-triggered averages (STAs) were calculated to measure the relative timing of action potentials (spikes) of single units in CM with changes in NIf neuron membrane potential. STAs were calculated off-line from paired CM and NIf recordings during stimulus playback as well as silence by averaging the median-filtered NIf neuron membrane potential within a time window of ±200 ms relative to the action potential in the CM neuron (i.e., the trigger event). Separate calculations were made for the stimulus and prestimulus periods. To correct for stimulus coordination artifacts, we generated a shuffled STA by pairing CM and NIf records from different trials and then subtracted this shuffled STA from the raw STA (Perkel et al., 1967; Coleman and Mooney, 2004). A pair was determined to have coherent activity if the corrected STA contained a peak (or trough) within ±50 ms of the CM spike with absolute amplitude that exceeded 4 SDs of the STA in a time window −200 to −100 ms before the CM trigger spike. The criterion of 4 SDs was chosen because it most closely matched the assessments of coherent/noncoherent activity made by experienced observers blinded to the stimulus condition. Whereas noise bursts theoretically afford a more accurate assessment of timing of auditory activity in CM and NIf, most CM units showed little to no onset responses to noise.
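The shuffle correction and the 4 SD criterion described above can be sketched on toy data. This is a minimal sketch under several assumptions: the shuffle pairs each CM spike train with the NIf trace from the next trial, deviations are measured from the mean of the −200 to −100 ms reference window, and the population SD is used. All function names and data are hypothetical.

```python
from statistics import mean, pstdev


def sta(spikes_by_trial, vm_by_trial, half_win):
    """Average Vm in a window of +/- half_win samples around each trigger spike."""
    segments = []
    for spikes, vm in zip(spikes_by_trial, vm_by_trial):
        for s in spikes:
            if s - half_win >= 0 and s + half_win < len(vm):
                segments.append(vm[s - half_win: s + half_win + 1])
    n = 2 * half_win + 1
    if not segments:
        return [0.0] * n
    return [mean(seg[i] for seg in segments) for i in range(n)]


def shuffle_corrected_sta(spikes_by_trial, vm_by_trial, half_win):
    """Raw STA minus an STA built from spikes paired with Vm of the next trial."""
    raw = sta(spikes_by_trial, vm_by_trial, half_win)
    shifted_vm = vm_by_trial[1:] + vm_by_trial[:1]  # trial k paired with k + 1
    shuffled = sta(spikes_by_trial, shifted_vm, half_win)
    return [r - s for r, s in zip(raw, shuffled)]


def is_coherent(corrected, samples_per_ms=1, n_sd=4.0):
    """True if a peak or trough within +/-50 ms exceeds n_sd SDs of the reference window."""
    c = len(corrected) // 2  # index of the trigger spike
    ref = corrected[c - 200 * samples_per_ms: c - 100 * samples_per_ms + 1]
    near = corrected[c - 50 * samples_per_ms: c + 50 * samples_per_ms + 1]
    ref_mean = mean(ref)
    return max(abs(x - ref_mean) for x in near) > n_sd * pstdev(ref)


# Toy data: a stimulus-locked ramp shared by all trials plus a 3 mV bump
# one sample after each spike.
spikes = [[3], [4], [5]]
vm = [[0, 1, 2, 3, 7, 3, 2, 1, 0],
      [0, 1, 2, 3, 4, 6, 2, 1, 0],
      [0, 1, 2, 3, 4, 3, 5, 1, 0]]
corrected = shuffle_corrected_sta(spikes, vm, half_win=2)
print(corrected)
```

In the toy example the stimulus-locked ramp largely cancels in the subtraction, leaving the spike-locked bump at a lag of one sample as the largest value in the corrected STA.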

A cumulative sum (CUSUM) analysis was performed to facilitate visualization of song-related activity of individual CM neurons during auditory playback and singing. Because the CUSUM compared activity levels during the song versus immediately preceding baseline activity, only the first motif of each song was used in this analysis. Briefly, action potentials from each cell during the first song motif and the immediately preceding 1 s period were extracted for each playback of the BOS or song bout. Those data were divided into 5 ms bins, and the baseline rate of action potential activity (computed during the first 500 ms of the 1 s pre-motif period to avoid the possible confound of introductory notes) was subtracted from each bin. Thus, each bin contained a positive or negative value that described the rate of action potential activity at that time relative to the baseline firing rate. Those bins were then summed serially to construct a CUSUM, which was plotted for each cell. An epoch in which the CUSUM value exceeded and remained consistently >3 SDs above the baseline mean rate was taken as a significant increase in activity. Conversely, an epoch in which the CUSUM value fell below and remained consistently <3 SDs below the baseline mean rate was taken as a significant decrease in activity. Individual neurons typically expressed significant changes in activity in association with both auditory playback and singing.
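The construction of the CUSUM described above can be sketched as follows. This sketch covers only the binning, baseline subtraction, and serial summation; the 3 SD significance test is left out because the text does not fully specify how the SD was computed. The function name and the spike times are hypothetical, and times are in milliseconds relative to motif onset.

```python
from statistics import mean


def cusum(spikes_ms_by_trial, motif_ms, bin_ms=5, pre_ms=1000, base_ms=500):
    """Cumulative sum of baseline-subtracted firing rates (Hz) in bin_ms bins.

    Spike times are in ms relative to motif onset, so pre-motif spikes are negative.
    The baseline rate is taken from the first base_ms of the pre-motif period.
    """
    n_bins = (pre_ms + motif_ms) // bin_ms
    counts = [0] * n_bins
    for trial in spikes_ms_by_trial:
        for t in trial:
            idx = int((t + pre_ms) // bin_ms)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    bin_s = bin_ms / 1000.0
    rates = [c / (len(spikes_ms_by_trial) * bin_s) for c in counts]
    base_rate = mean(rates[: base_ms // bin_ms])
    out, running = [], 0.0
    for r in rates:
        running += r - base_rate  # positive when firing exceeds baseline
        out.append(running)
    return out


# Hypothetical data: two trials, one baseline spike and three motif spikes each.
trials = [[-900, 10, 20, 30], [-900, 10, 20, 30]]
trace = cusum(trials, motif_ms=100)
print(len(trace), trace[-1])
```

By construction the trace returns to zero at the end of the baseline window, so sustained upward or downward drift after that point reflects firing above or below the baseline rate.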

Lentiviral anterograde pathway tracing.

One of us (T. F. Roberts) has adapted and refined the use of lentivirus as an exclusive anterograde pathway tracer in zebra finches (unpublished observations). Presently, we used two lentiviral constructs driving the expression of either a monomeric Cherry fluorescent protein (mCherry) (FRChW) or enhanced green fluorescent protein (eGFP) (FRGW) under the control of a Rous sarcoma virus promoter. Lentivirus vectors were made by transfecting 6 × 10^6 293FT cells with 3 μg each of vesicular stomatitis virus glycoprotein, Δ8.9, and one promoter-reporter plasmid using Lipofectamine. After 72 h, supernatant was harvested, filtered at 0.45 μm, and pelleted by ultracentrifugation at 25,000 rpm for 90 min at 4°C. After resuspension, serially diluted lentivirus solution was used to infect 293FT cells; 72 h later, labeled 293FT cells were counted to calculate the viral titer. Lentivirus with titers ranging from 1 to 9.6 × 10^9 IU/μl were used in this study.

Lentivirus labeling provides a method for exclusive and sensitive anterograde pathway tracing (T. F. Roberts, unpublished observation). Large control injections of virus into the dense fiber tract between HVC and robust nucleus of the arcopallium (RA) returned no labeled cells within either nucleus, although neurons local to the injection site were labeled (data not shown), eliminating problems related to tracer uptake by fibers of passage seen with conventional tracers. Brain areas to be targeted for viral injection were first identified through a combination of stereotaxic coordinates and electrophysiological recordings, as described above. Subsequently, the recording electrode was withdrawn and an injection pipette was lowered to the same stereotaxic coordinates. Small volumes (200–400 nl) were injected in 20 nl increments every 1.5 min. Ten minutes after the final injection, the pipette was slowly removed, and the wound was closed (Vetbond) and treated with topical lidocaine and Neosporin ointment. Survival times ranged from 12 to 21 d, after which the animal was deeply anesthetized with Equithesin and perfused transcardially with saline and then 4% PFA. Frozen sections were collected and processed as described below.

Tissue collection and histology.

After each recording session or at the end of the viral tracing experiments, birds were deeply anesthetized with Equithesin and perfused transcardially with 0.9% saline, followed by 4% PFA in 25 mM phosphate buffer. Brains were removed from the skull, postfixed in PFA with 30% sucrose overnight at 4°C, blocked sagittally, and sectioned on a freezing microtome at 50–100 μm section thickness. Neurobiotin-filled neurons were further processed by incubating the sections in fluorescently tagged streptavidin (Invitrogen) and visualized using epifluorescence on a confocal microscope (Axioskop; Zeiss, Oberkochen, Germany). Simultaneously, the positions of the CM recordings and injections were also confirmed by either the fluorescent label from the GABA injections or lesion sites that were readily visualized in Nissl-stained sections.

The fluorescent signal of virally labeled neurons, although typically bright enough for microscopic analysis without further processing, was amplified by antibodies against eGFP (Chemicon, Temecula, CA) or Cherry Red (AB3216; Chemicon) at 1:1000 dilutions and fluorescently tagged secondary antibodies (A11012 and A21202; Invitrogen) at 1:400 dilutions using standard immunohistochemical methods. This amplification allowed for greater photostability and hence more robust visualization of very fine neural processes.

The boundaries of the various nuclei in this study were determined by counterstaining for either Nissl or parvalbumin. Nissl substance was stained using blue (NeuroTrace 435, N21479; Invitrogen) or green (NeuroTrace 500, N21480; Invitrogen) fluorescent label to allow visualization of double- or triple-labeled sections using standard histochemical protocols. In some cases, immunostaining for the calcium-binding protein parvalbumin was used to more sharply define the borders of NIf and HVC, which both exhibit enhanced immunoreactivity for parvalbumin relative to contiguous brain areas. Antibodies against parvalbumin (Swant, Bellinzona, Switzerland) at 1:1000 dilution were recognized by fluorescently tagged secondary antibodies (A21202; Molecular Probes) at 1:400 dilution using standard immunohistochemical methods.

Results

Overview

To establish the functional importance of CM in driving BOS-evoked auditory activity in NIf and HVC, we first used reversible inactivation methods in anesthetized adult male zebra finches. We then used extracellular and intracellular recording techniques in anesthetized birds to characterize the suprathreshold auditory response properties of CM neurons and combined extracellular recordings in CM with intracellular recordings in NIf to assess the functional connectivity between neurons in these two areas (Fig. 1). The structural basis of these functional interactions was then examined using a lentiviral labeling method to visualize CM axons, which revealed CM axon terminals in NIf and also, to our surprise, in HVC. Based on these anatomical observations, we then reversibly inactivated CM in NIf-lesioned birds to test whether CM provides direct auditory drive to HVC. Finally, because these functional and anatomical studies indicate that CM could provide auditory information to NIf and HVC important to vocal learning and communication and because previous studies have shown that auditory activity in NIf and HVC of the adult male zebra finch is strongly attenuated or even absent during periods of wakefulness (Schmidt and Konishi, 1998; Cardin and Schmidt, 2003, 2004; Rauske et al., 2003), we made chronic recordings from CM neurons in freely behaving zebra finches during song playback and during singing.

Figure 1.

Auditory and song control pathways in the songbird. A, Sagittal view of the songbird brain showing major features of the central auditory system. Auditory information passes via the eighth cranial nerve to the cochlear nucleus (CN) in the medulla, from which it is relayed through the auditory hindbrain (OS and LL) and midbrain (MLd) to the thalamic nucleus ovoidalis (Ov). Axons from Ov terminate in the massively interconnected telencephalic area Field L, which is reciprocally connected with the NCM and the CM. Previous anatomical studies (Vates et al., 1996) suggest that CM innervates the NIf, which is a major source of auditory input to HVC (Cardin and Schmidt, 2004; Coleman and Mooney, 2004). B, The song system comprises song motor (black; SMP) and anterior forebrain (white; AFP) pathways. The SMP arises from neurons in HVC (HVCRA) that project directly to the RA. RA in turn provides song motor output from the telencephalon through its projections onto syringeal motoneurons in the tracheosyringeal portion of the hypoglossal motor nucleus (XIIts) and onto respiratory premotor neurons in a column of cells in the ventrolateral medulla known as the ventral respiratory group (VRG). RA also innervates the dorsomedial intercollicular nucleus of the midbrain (DM). The anterior forebrain pathway (white arrows) arises from a distinct population of HVC neurons that innervate Area X (part of the songbird basal ganglia). Area X output neurons innervate the medial nucleus of the dorsolateral thalamus (DLM), which in turn innervates the lateral portion of the magnocellular nucleus of the anterior nidopallium (LMAN). Axons from LMAN innervate Area X and also innervate the same song premotor neurons in RA that receive input from HVCRA neurons.

Reversible inactivation of CM

To directly assess whether CM provides a functional auditory input to the song system, we reversibly inactivated CM with GABA (250 mM) while recording multiunit responses to playback of the BOS in either NIf (n = 17) or HVC (n = 5) of urethane-anesthetized adult male zebra finches (for stereotaxic coordinates used to localize CM, NIf, and HVC, see Materials and Methods). Inactivating CM strongly suppressed BOS-evoked activity in both NIf and HVC, suggesting that CM is a major source of auditory drive to the song system (Fig. 2) (NIf: mean BOS RS predrug, 12.8 ± 1.8, mean ± SEM; CM inactivation, 3.7 ± 1.1, p < 0.01; HVC: mean BOS RS predrug, 11.2 ± 3.4; CM inactivation, 1.7 ± 0.9, p = 0.05). The effect of CM inactivation on auditory-evoked NIf and HVC suprathreshold activity was rapid in onset (<10 s) and was reversible over the course of 10–20 min. Coinjection of a fluorescent tracer confirmed that the inactivation site was confined to CM (data not shown). In contrast to the strong suppression of spontaneous activity in HVC seen after NIf inactivation (Coleman and Mooney, 2004), CM inactivation did not significantly alter the spontaneous activity levels in either NIf or HVC, although a trend toward decreased activity in HVC was noted [NIf: mean spontaneous firing rate (FR) predrug, 10.41 ± 1.64; CM inactivation, 8.40 ± 1.00, p = 0.21; HVC: mean spontaneous FR predrug, 6.00 ± 1.11, mean ± SEM; CM inactivation, 3.75 ± 0.48, p = 0.15]. We confirmed the specificity of the CM inactivation effect by recording in Field L, another site of CM axonal termination. In contrast to the strong suppressive effects of CM inactivation on BOS-evoked activity in NIf and HVC, multiunit auditory activity in Field L was unaffected by applying GABA in CM (Fig. 2C) (BOS RS predrug, 13.41 ± 3.40; CM inactivation, 12.01 ± 2.56; recovery, 14.16 ± 3.69; BOS vs inactivity, p = 0.75; BOS vs recovery, p = 0.88).

Figure 2.

Reversible pharmacological inactivation of CM strongly suppresses auditory activity in NIf and HVC. A, Representative multiunit recording (middle row) from NIf before, during, and after GABA injection into and inactivation of CM during playback of the BOS (bottom row). A threshold voltage (dotted line) was set for the multiunit activity, and histograms of criterion multiunit spikes were generated to 30 repetitions of BOS in each condition (top row). Vigorous auditory responses to BOS playback in the predrug condition were completely abolished by CM inactivation. The auditory responses of NIf returned to control levels after termination of GABA application to CM. B, The mean auditory response recorded in NIf (n = 17) and HVC (n = 5) to BOS playback during and after (i.e., Recovery) inactivation of CM, normalized to response strengths measured before GABA application in CM (Predrug). C, Inactivating CM did not affect the BOS-evoked auditory response of Field L (n = 9).

Detailed characterization of CM and NIf auditory interactions

Inactivation experiments in anesthetized birds show that CM is necessary for much of the auditory-evoked action potential activity in NIf and HVC but do not reveal the selectivity of the auditory responses CM transmits to the song system. To examine this issue, we first used in vivo extracellular and intracellular recordings in urethane-anesthetized birds to record the auditory-evoked action potential responses of CM neurons. We then used in vivo intracellular recordings to characterize the auditory-evoked synaptic activity of individual NIf neurons and compared subthreshold selectivity in NIf with suprathreshold selectivity in CM. Finally, we combined extracellular recordings from single CM neurons with intracellular recordings in NIf to assess the functional connectivity and compare BOS selectivity between neurons in these two brain areas.

Suprathreshold auditory response properties of CM neurons in urethane-anesthetized birds

As a first step toward understanding the nature of auditory information CM might transmit to the song system, we made multiunit and single-unit recordings in the CM of urethane-anesthetized adult male zebra finches (Fig. 3). Post hoc anatomical reconstruction of the recording sites revealed that the median depth of CM recording locations was ∼900 μm, with quartiles from 700 to 1100 μm. As a result, most of the recordings we made were from the ventral half of CM. Multiunit recordings from CM (n = 53 sites) were generally responsive to auditory stimulation by BOS, with 75% of the sites significantly excited by BOS (z-score, 2.19 ± 0.34), 11% of sites showing firing rate suppression in response to BOS (z-score, −1.71 ± 1.00), and 14% of sites exhibiting no discernible BOS-evoked activity. The BOS-evoked multiunit responses recorded in CM tended to be sustained throughout the stimulus duration with little onset activity and weak temporally locked bursts within songs (Fig. 3A). Sites that exhibited significant excitatory auditory responses to BOS playback were also typically excited by playback of REV and CON and less so by noise bursts (z-score, 1.79 ± 0.39, 1.85 ± 0.39, and 1.08 ± 0.24, respectively). Multiunit sites that displayed firing rate suppression evoked by BOS playback were generally suppressed by other stimuli, including REV (z-score, −1.37 ± 0.57), CON (z-score, −1.75 ± 0.96), and noise (z-score, −0.68 ± 0.12).
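The z-scores reported above can be illustrated with a minimal sketch. The Materials and Methods are not reproduced in this section, so the formula below is an assumption: a paired z-score of the form commonly used in the songbird literature, in which the mean difference between stimulus-evoked and baseline firing rates is divided by the SD of that difference across trials. The function name `response_z_score` and the example firing rates are hypothetical.

```python
from statistics import mean, stdev


def response_z_score(stim_rates, base_rates):
    """Paired z-score of stimulus-evoked vs. baseline firing rate.

    Assumed definition (not taken from the paper's Methods):
        z = mean(S - B) / SD(S - B)
    which equals (mean(S) - mean(B)) / sqrt(var(S) + var(B) - 2*cov(S, B)).
    stim_rates[i] and base_rates[i] must come from the same trial.
    """
    if len(stim_rates) != len(base_rates) or len(stim_rates) < 2:
        raise ValueError("need at least two paired trials")
    diffs = [s - b for s, b in zip(stim_rates, base_rates)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("zero variance across trials; z-score undefined")
    return mean(diffs) / sd


# Hypothetical firing rates (spikes/s) for four playback trials.
excited = response_z_score([12, 15, 11, 14], [5, 6, 5, 4])
suppressed = response_z_score([5, 6, 5, 4], [12, 15, 11, 14])
print(round(excited, 2), round(suppressed, 2))
```

Under this definition, positive values indicate excitation by the stimulus and negative values indicate firing rate suppression, matching the sign convention used in the text.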

Figure 3.

Auditory responses of single and multiunits in CM to song playback. A, A representative CM multiunit recording, displayed as a cumulative peristimulus time histogram (PSTH) to 30 repetitions of each song stimulus, shows a strong bias for BOS over REV or another zebra finch song (CON). B, The auditory responses of four different CM single units illustrate the variety of responses observed in the CM population. Most CM neurons were excited by BOS playback (top 3 cells), although a small minority was suppressed by BOS (bottom cell). C, At both the single (gray) and multiunit (black) level, the population of CM neurons is skewed toward BOS selectivity, relative to REV (left) and CON (right). Some neurons exhibit quite strong selectivity for the BOS (d′ > 2). The average d′ for NIf subthreshold responses is indicated by the arrow beneath each graph. A substantial fraction of CM neurons exhibits BOS selectivity equal to or greater than the average NIf subthreshold response.

Previous studies in the urethane-anesthetized zebra finch have shown that many NIf and HVC neurons are highly selective for the BOS, whereas at the population level, many CM neurons are not BOS selective (Amin et al., 2004; Shaevitz and Theunissen, 2007). To further assess BOS selectivity in CM, we calculated the d′ value, or the difference between the z-scores for the BOS and a second stimulus (in this case, either REV or CON; d′ values >0.5 are characterized as BOS selective). The average d′ value for all (n = 41) excitatory multiunit sites was 1.55 ± 0.19 for BOS versus REV and 1.48 ± 0.30 for BOS versus CON. Moreover, many recording locations in urethane-anesthetized birds showed a very strong preference for BOS, with 35% of excitatory multiunit sites having d′ values >2. These calculations reveal that CM multiunit sites excited by song stimuli are strongly selective for the BOS. In contrast, CM recording sites (n = 6) that exhibited auditory-evoked firing rate suppression were nonselective (d′ BOS vs REV, 0.10 ± 0.44; d′ BOS vs CON, 0.42 ± 0.69). For the entire population of CM multiunit sites we sampled, the mean d′ BOS–REV was 1.19 ± 0.18 (n = 53), and the mean d′ BOS–CON was 1.12 ± 0.25.
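As a rough illustration of how a selectivity index of this kind can be computed and thresholded, the sketch below uses a d′ formula that is widely used for song selectivity, d′ = 2(mean A − mean B)/sqrt(var A + var B), computed on per-trial response strengths. This formula is an assumption, because the exact definition is given in the Materials and Methods, which are not reproduced here. The thresholds of 0.5 (selective) and 2 (strongly selective) are the ones stated in the text. The function names and trial values are hypothetical.

```python
import math
from statistics import mean, variance


def d_prime(rs_a, rs_b):
    """Selectivity of stimulus A over stimulus B from per-trial response strengths.

    Assumed form: d' = 2 * (mean(A) - mean(B)) / sqrt(var(A) + var(B)).
    """
    denom = math.sqrt(variance(rs_a) + variance(rs_b))
    if denom == 0:
        raise ValueError("zero variance in both stimuli; d' undefined")
    return 2 * (mean(rs_a) - mean(rs_b)) / denom


def classify(d, selective=0.5, strong=2.0):
    """Label a d' value using the thresholds quoted in the text."""
    if d > strong:
        return "strongly selective"
    if d > selective:
        return "selective"
    return "nonselective"


# Hypothetical response strengths (spikes/s) to BOS and REV playback.
bos = [10, 12, 14]
rev = [4, 6, 8]
d = d_prime(bos, rev)
print(round(d, 2), classify(d))
```

Because the index is signed, swapping the two stimuli flips the sign, so a site that prefers REV over BOS yields a negative d′ and is labeled nonselective for the BOS.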

To characterize the auditory responses of individual CM neurons, we recorded both extracellularly and intracellularly from single CM neurons. The suprathreshold auditory responses of these single units (n = 61 extracellular recordings, n = 9 intracellular recordings) paralleled those of our multiunit population (Fig. 3B,C). Most (61%) single units were significantly excited by BOS. The temporal response patterns of single units tended to contain more prominent time-locked bursts than were observed for multiunits, although most single units still responded at several times throughout the duration of the stimulus (Fig. 3B, top three units). As a population, single CM units that were excited by song stimuli displayed a weaker average bias for BOS than did multiunit sites (Fig. 3C). The average d′ value of excited single units for BOS versus REV was 0.70 ± 0.27, whereas that for BOS versus CON was 0.42 ± 0.40. The population of CM single units we recorded displayed a broad range of BOS biases, with 17% of the excitatory single units exhibiting strong selectivity (d′ > 2) for BOS versus REV and 18% exhibiting strong selectivity for BOS versus CON (Fig. 3C). Another fraction of CM single units (17%) were suppressed by BOS playback, as evidenced by a reduction of ongoing spontaneous activity during the playback period (Fig. 3B, bottom example). These cells tended to be suppressed equally by all song stimuli (BOS z-score, −1.17 ± 0.20; REV, −0.90 ± 0.08; CON, −1.09 ± 0.16). The d′ values generally showed no inhibitory bias for one song over another (d′ BOS vs REV, −0.35 ± 0.13; d′ BOS vs CON, 0.40 ± 0.13). For the entire population of CM single units we recorded, the mean d′ BOS–REV was 0.40 ± 0.18 (n = 70), and the mean d′ BOS–CON was 0.20 ± 0.17 (n = 67). These recordings indicate that a significant fraction of CM neurons are strongly BOS selective and also reveal a higher mean level of BOS selectivity, specifically for CM neurons that are excited by song stimuli, than described previously (Amin et al., 2004; Shaevitz and Theunissen, 2007).

The use of intracellular recordings in a subset of CM neurons (n = 28) also allowed comparison of their subthreshold and suprathreshold responses and a determination of whether CM neurons receive BOS-selective inputs (Fig. 4). The subthreshold responses of these CM neurons to song playback were depolarizing (Fig. 4A). On average, these CM neurons displayed subthreshold selectivity for BOS over both REV and CON (Fig. 4B) (d′ BOS vs REV, 0.99 ± 0.24; d′ BOS vs CON, 0.64 ± 0.25). For those CM neurons that showed suprathreshold responses to one or more song stimuli (n = 9), within-cell comparisons did not detect a significant difference between subthreshold and suprathreshold selectivity (Fig. 4C) (d′ BOS vs REV subthreshold, 1.00 ± 0.25; suprathreshold, 0.33 ± 0.33; p = 0.13). These findings suggest that some CM neurons receive BOS-selective inputs.

Figure 4.

A comparison of subthreshold and suprathreshold responses of CM neurons. A, Intracellular recording of the responses of a CM neuron to BOS playback. The bottom record shows the membrane potential response of a cell to one stimulus iteration, and the middle and top rows show median filtered average membrane potential responses and the cumulative action potential PSTH to 10 stimulus iterations. All songs generate depolarizing responses in this cell, although the subthreshold response is biased toward BOS over REV and CON. The suprathreshold response of this cell was nonselective. B, Summary of the subthreshold depolarizing selectivity of intracellularly recorded CM neurons. On average, these intracellularly recorded CM neurons displayed subthreshold selectivity for BOS over both REV and CON (d′ BOS vs REV, 0.99 ± 0.24; d′ BOS vs CON, 0.64 ± 0.25). C, For those CM neurons that showed suprathreshold responses to one or more song stimuli (n = 9), within-cell comparisons did not detect a significant difference between subthreshold and suprathreshold selectivity (d′ BOS vs REV subthreshold, 1.00 ± 0.25; suprathreshold, 0.33 ± 0.33; p = 0.13). The line represents identity.

Subthreshold response properties of NIf neurons in urethane-anesthetized birds

Our analysis of the suprathreshold response properties of CM neurons provides an indication of the potential range of auditory information CM might transmit to the song system. As a complement to this characterization of CM auditory “output,” we sought to better characterize the type of auditory input received by neurons in NIf and HVC. Previous studies have shown that HVC neurons receive highly BOS-selective input (Mooney, 2000; Rosen and Mooney, 2006). As a first step toward characterizing the nature of auditory inputs to NIf, we made intracellular recordings from NIf neurons (n = 42 cells) in urethane-anesthetized zebra finches and measured their suprathreshold and subthreshold auditory responses to playback of BOS, REV, and CON. As reported previously (Janata and Margoliash, 1999; Cardin and Schmidt, 2004; Coleman and Mooney, 2004), we observed that song playback evoked sustained excitatory responses from NIf neurons, with the largest responses evoked by the BOS and the weakest responses evoked by REV (Fig. 5A,B). The NIf neurons we recorded from showed a suprathreshold bias to BOS playback, reflected in both z-scores and d′ measurements (Fig. 5B) (Coleman and Mooney, 2004, their Fig. 12B) (NIf z-score FR: BOS, 1.5 ± 0.16; REV, 0.7 ± 0.13; CON, 1.2 ± 0.21; d′ BOS vs REV, 1.4 ± 0.15; d′ BOS vs CON, 1.1 ± 0.16). Here we observed that NIf neurons also displayed subthreshold selectivity for the BOS (Fig. 5C) (d′ BOS vs REV area, 1.5 ± 0.18; d′ BOS vs CON area, 0.98 ± 0.22). A pairwise comparison revealed that NIf neurons exhibited similar selectivity at the subthreshold and suprathreshold levels for BOS versus REV (p = 0.79) and BOS versus CON (p = 0.56) (Fig. 5D).

Figure 5.

In vivo intracellular recordings reveal that the subthreshold responses of NIf neurons are selective for BOS over REV and CON. A, Response of a single HVC-projecting NIf (NIfHVC) neuron to playback of BOS, REV, and CON (shown as oscillograms at bottom). Membrane potential records in response to a single playback of each song stimulus are shown immediately above each oscillogram, and the median-filtered average membrane potential record and cumulative action potential PSTH (bin size, 25 ms) in response to 20 iterations of each stimulus are shown above this individual record. B, Mean z-score values for the FR and subthreshold response area of all (n = 42) NIf neurons to playback of BOS, REV, and CON. C, Scatter plot of individual subthreshold d′ values recorded from NIf neurons (gray circles) and suprathreshold d′ values recorded from CM single units (open circles). Filled black squares indicate mean ± SEM; lighter gray band indicates nonselective region. The mean subthreshold responses of NIf neurons are selective for BOS versus either REV or CON (d′ > 0.5), and the mean subthreshold selectivity in NIf for BOS versus either REV or CON is higher than the mean suprathreshold selectivity in CM for these comparisons (see Results). D, Pairwise comparison of subthreshold and suprathreshold selectivity measurements from NIf neurons. Suprathreshold and subthreshold responses of NIf neurons exhibited similar selectivity for BOS versus REV (black circles; p = 0.79) and BOS versus CON (gray triangles; p = 0.56).

To begin to address whether auditory selectivity was enhanced between CM and NIf, we compared mean subthreshold d′ values recorded intracellularly in NIf (n = 42 cells) with the mean suprathreshold d′ values recorded extracellularly from CM single units (n = 70 cells described previously; these NIf and CM cells were sampled from different birds). A qualitative comparison revealed that the range of suprathreshold selectivity exhibited by CM single units overlapped with the selectivity exhibited at the subthreshold level by NIf neurons (Fig. 5C). A statistical comparison revealed that the mean subthreshold selectivity of NIf neurons for BOS versus either REV or CON was significantly higher than the mean suprathreshold selectivity in CM [NIf area d′ BOS vs REV, 1.5 ± 0.18 (n = 42); CM suprathreshold d′ BOS vs REV, 0.40 ± 0.18 (n = 70); p < 0.001; NIf area d′ BOS vs CON, 0.98 ± 0.22; CM suprathreshold d′ BOS–CON, 0.20 ± 0.17 (n = 67); p < 0.0001].

CM–NIf interactions

At least three different functional architectures could underlie interactions between CM and NIf: (1) only BOS-selective CM neurons innervate NIf; (2) only nonselective CM neurons innervate NIf, but interactions within NIf enhance subthreshold selectivity; or (3) both nonselective and selective CM neurons innervate NIf. To better distinguish between these possibilities, we combined extracellular recordings from single CM neurons with intracellular recordings in NIf and used spike-triggered averaging (see Materials and Methods) to assess the functional connections and auditory selectivity of pairs of CM and NIf neurons (Fig. 6A,B).

Figure 6.

Spike-triggered averaging (STA) revealed a mixture of coherent (>4 SD excursion from baseline within ±50 ms of the CM spike; see Materials and Methods) and noncoherent interactions between CM–NIf cell pairs. A, Correlated spontaneous activity in a CM–NIf cell pair. A representative 1-s segment (top) of simultaneously recorded spontaneous single unit (CM) and subthreshold (NIf) activity traces shows a relationship between bursts of spikes in the CM cell and depolarizing events in the NIf neuron that becomes more evident at a finer timescale (middle). STAs generated from spontaneous or song-evoked action potentials in the CM neuron reveal a lagged depolarization in the NIf neuron. These examples are corrected for stimulus coordination artifacts (see Materials and Methods). B, An example of a CM–NIf cell pair with noncoherent activity. Action potentials in the CM neuron do not coincide with synaptic events in the NIf neuron (middle). STAs for either spontaneous spikes or song-evoked spikes contain no correlated events within the NIf neuron, although both cells exhibited auditory responses individually (data not shown). C, Time of STA peak versus peak amplitude for coherent (filled) and noncoherent (open) STAs. Peak times for coherent STAs tended to lag the CM spike time, whether for spontaneous or song-evoked spikes. D, There was no correlation between the BOS versus REV d′ values for CM single units and their NIf neuron partners, regardless of whether they had coherent or noncoherent STAs. The slope of the regression line for coherent pairs was −0.33 with an R² of 0.017, and the slope for the noncoherent pairs was 0.042 with an R² of 0.039.

In 10 of 22 cell pairs, spike-triggered averaging revealed significant correlations between spontaneous CM action potentials and membrane depolarizations in the companion NIf neuron (for determination of significant STAs, see Materials and Methods). All 10 of the STAs were depolarizing, with a mean peak magnitude of 2.1 ± 0.58 mV, and nearly all (9 of 10) of the STA peaks occurred after the CM spike (Fig. 6C, left), with an average time lag to peak of 8.6 ± 2.3 ms. Analysis of BOS-evoked activity, corrected for stimulus-dependent coordination artifacts (i.e., a shift predictor correction; see Materials and Methods) also revealed a qualitatively similar correlation between CM action potentials and NIf membrane fluctuations, albeit in a smaller number of recorded pairs (Fig. 6A, bottom). During auditory stimulation with BOS, REV, and CON, significant STAs were observed for 5 of 22 pairs, 4 of which were depolarizing. As with STAs generated from spontaneous activity, the peaks of four of the STAs generated from auditory activity followed the CM spike (Fig. 6C, right), with a mean peak magnitude of 0.78 ± 0.45 mV and average time lag of 10.2 ± 4.8 ms. We also compared the selectivity of CM and NIf neurons in those pairs that yielded significant correlations in response to auditory stimulation, as well as those pairs in which no significant correlations were observed. In neither group did we observe any correlation between the suprathreshold selectivity in CM and the subthreshold selectivity in NIf (Fig. 6D). Thus, it appears that, in this small sample of paired recordings, the BOS selectivity of a given CM neuron does not predict whether its activity is correlated with activity in NIf. These findings support the idea that NIf receives both BOS-selective and nonselective input from CM and suggest that auditory selectivity for the BOS is enhanced between CM and NIf.
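The spike-triggered averaging and shift-predictor correction described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the 4 SD criterion and the test window around the spike come from the text, whereas the choice of baseline (the flanks of the STA outside the test window), the pairing of each trial's spikes with the next trial's membrane potential for the shift predictor, and all function names are assumptions made for the example.

```python
from statistics import mean, stdev


def _segments(vm, spike_idx, half_window):
    """Membrane potential snippets of +/- half_window samples around each spike."""
    return [vm[i - half_window:i + half_window + 1]
            for i in spike_idx
            if i - half_window >= 0 and i + half_window < len(vm)]


def sta(vm_trials, spike_trials, half_window, shift=0):
    """Spike-triggered average pooled over trials.

    With shift=1, spikes from trial k are paired with the membrane potential
    from trial k+1, which yields the shift predictor (covariation that is
    locked to the stimulus rather than to individual spikes).
    """
    n = len(vm_trials)
    segs = []
    for k, spikes in enumerate(spike_trials):
        segs.extend(_segments(vm_trials[(k + shift) % n], spikes, half_window))
    if not segs:
        raise ValueError("no spikes with a complete window")
    return [mean(col) for col in zip(*segs)]


def corrected_sta(vm_trials, spike_trials, half_window):
    """Raw STA minus shift predictor; needs at least two stimulus trials."""
    raw = sta(vm_trials, spike_trials, half_window)
    predictor = sta(vm_trials, spike_trials, half_window, shift=1)
    return [r - p for r, p in zip(raw, predictor)]


def peak_excursion(sta_trace, test_half):
    """Largest deviation within +/- test_half samples of the spike.

    Returns (lag in samples, amplitude, amplitude in baseline SDs), with
    baseline assumed to be the STA samples outside the test window.
    """
    center = len(sta_trace) // 2
    lo, hi = center - test_half, center + test_half + 1
    baseline = sta_trace[:lo] + sta_trace[hi:]
    mu, sd = mean(baseline), stdev(baseline)
    idx = max(range(lo, hi), key=lambda i: abs(sta_trace[i] - mu))
    amp = sta_trace[idx] - mu
    return idx - center, amp, (abs(amp) / sd if sd > 0 else float("inf"))


# Synthetic example: a 2 mV depolarization 3 samples after each spike.
vm = [0.01 * (-1) ** i for i in range(200)]
spikes = [50, 100, 150]
for s in spikes:
    vm[s + 3] += 2.0
lag, amp, n_sd = peak_excursion(sta([vm], [spikes], 20), test_half=5)
print(lag, round(amp, 2), n_sd > 4)
```

A pair would count as coherent under the stated criterion when the returned excursion exceeds 4 SDs, and a positive lag means the depolarization follows the CM spike. Converting lags from samples to milliseconds depends on the sampling rate, which is not specified in this section.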

CM neurons project to both NIf and HVC

Given the functional importance of CM to auditory-evoked activity in NIf and HVC, we sought to examine the anatomy of the projections of the CM into the song system. A previous study found that tracer injections into NIf resulted in sparse retrograde labeling of CM neurons (Vates et al., 1996). Despite the careful efforts of Vates and colleagues, these results are difficult to fully interpret because NIf is a thin (∼150 μm) structure embedded in a region (Field L) that is reciprocally interconnected with CM. Furthermore, patterns of anterograde labeling after injections into CM are difficult to interpret because several fiber tracts course through and around CM and traditional pathway tracers label fibers of passage. To overcome these potential technical problems, we used lentiviral pathway tracing techniques that provide unidirectional anterograde labeling without uptake by fibers of passage (Roberts et al., unpublished observations).

Small injections (200–400 nl) in CM of lentivirus driving the expression of either eGFP or mCherry routinely resulted in labeling of neuronal cell bodies primarily confined within the borders of CM (Fig. 7A) (for sites of labeling, see supplemental Fig. 1, available at www.jneurosci.org as supplemental material). Fluorescently labeled axons could be traced exiting from CM and terminating in Field L (Fig. 7B) and NCM, as reported previously (Vates et al., 1996). We also traced axons of CM-labeled neurons into the borders of NIf (Fig. 7B–D) (5 of 10 CM injections yielded label within NIf). Under high magnification, periodic swellings were visible along the length of these axons, suggestive of presynaptic terminals (Fig. 7C,D). We observed a coarse topographical relationship between the subregion of CM containing transfected neurons and the presence of fiber labeling within NIf: most (three of four) injections that transfected cells in the ventral half of CM resulted in fiber labeling in NIf, whereas most (four of six) injections that transfected cells in the dorsal half of CM did not label fibers within NIf. This coarse dorsoventral topography agrees with the previous observation that small tracer injections made in NIf result in sparse patterns of retrograde label in the ventral half of CM (i.e., clHV) (Vates et al., 1996). We did not observe any relationship between the rostral or caudal extent of infected CM neurons and the presence of fibers in NIf.

Figure 7.

Projection from CM to NIf after lentivirus transfection of CM neurons (the areas from which these images were generated are shown in supplemental Fig. 1, available at www.jneurosci.org as supplemental material). A, Confocal image of CM made 2 weeks after injection of lentivirus–mCherry construct. The borders of CM are indicated by the dotted lines. Scale bar, 250 μm. B, Low-magnification confocal image of NIf (dotted line), with boxes representing the location of the higher-magnification images of C and D. Fibers from CM can be seen within NIf, although with denser fiber labeling in neighboring Field L. Scale bar, 100 μm. C, D, Higher-magnification confocal images show mCherry-labeled fibers are present within the boundaries of NIf. The CM fibers within NIf are varicose, with numerous bright bulges (arrows) connected by thinner axon segments (arrowheads), suggestive of en passant synaptic boutons. Scale bars, 10 μm.

To our surprise, we also observed that lentiviral injections made in CM yielded fluorescently labeled fibers in HVC (Fig. 8) (labeling in HVC was seen in seven of eight cases in which sections containing HVC were collected) (for sites of labeling, see supplemental Fig. 1, available at www.jneurosci.org as supplemental material). Careful visual inspection revealed that injections were confined to CM, and lentiviral labeled neurons were not found in previously identified HVC afferents [i.e., medial portion of the magnocellular nucleus of the anterior nidopallium (mMAN) and NIf]. In contrast to the restricted axonal projection from ventral CM to NIf, we observed that injections in either dorsal or ventral regions of CM yielded terminal labeling in HVC. Furthermore, close inspection of labeled fibers within HVC after lentiviral injections into CM revealed small varicosities suggestive of presynaptic terminals (Fig. 8D,E). These anatomical results indicate that CM axons terminate in both NIf and HVC.

Figure 8.

Projection from CM to HVC after lentivirus transfection of CM neurons (the areas from which these images were generated are shown in supplemental Fig. 1, available at www.jneurosci.org as supplemental material). A, Confocal image of the injection site in CM after transfection of CM neurons with mCherry. Scale bar, 250 μm. B, C, Low-magnification confocal images of HVC (dotted lines) from a medial (1.5 mm lateral from the midline) and a lateral (2.4 mm lateral) brain section, with boxes representing the location of the higher-magnification images shown in D and E. Labeled fibers can be seen within and outside the borders of HVC. The CM fibers terminate throughout the whole of HVC, although with an apparent bias toward medial HVC. Scale bar, 100 μm. D, E, Higher-magnification confocal images show mCherry-labeled CM terminals within HVC. As with the projections of CM to NIf, the projections within HVC are varicose. Bright bulges (arrows) are connected by thinner axon segments (arrowheads), suggestive of en passant synaptic boutons. Scale bar, 10 μm.

CM drives auditory activity in HVC in NIf-lesioned birds

The discovery of direct projections from CM to HVC raises the possibility that CM drives auditory responses in HVC directly, as well as indirectly through NIf. Such direct interactions between CM and HVC are challenging to assess using inactivation methods in normal birds, because inactivating CM would simultaneously remove direct and indirect (i.e., via NIf) auditory inputs to HVC. Instead, such an assessment requires measuring the direct influence of CM on HVC auditory activity in the absence of any indirect influence of CM on HVC mediated through NIf. Therefore, in three adult male zebra finches, we unilaterally lesioned NIf with ibotenic acid injections (see Materials and Methods) and, 4–9 d later, recorded BOS-evoked activity in HVC while reversibly inactivating CM with small (∼30–90 nl) pressure injections of concentrated GABA. Although auditory activity in HVC was weaker on the side of the NIf lesion (data not shown), BOS-evoked auditory responses still were readily detected with extracellular electrodes before GABA application in CM (Fig. 9A, left). Immediately after GABA application in the ipsilateral CM, auditory responses in HVC on the side of the NIf lesion were completely and reversibly abolished (Fig. 9A,B). Subsequent histological analysis revealed that NIf was completely absent on the side of the lesion (supplemental Fig. 2, available at www.jneurosci.org as supplemental material) (for details of histological methods, see Materials and Methods). These results indicate that CM directly supplies auditory input to HVC.

Figure 9.

Reversible inactivation of CM in unilaterally NIf-lesioned birds abolishes BOS-evoked auditory responses in the ipsilateral HVC. A, Example of multiunit BOS responses from the ipsilateral HVC in an NIf-lesioned bird recorded before, during, and after inactivation of CM. Each column shows PSTH of spiking responses to 20 iterations of BOS playback. B, Quantification of the effect of CM inactivation on HVC BOS responses. Each data point represents the average (spikes per second) of multiunit auditory responses to 30 iterations of BOS playback in five different HVC sites recorded in three birds, before and after CM inactivation. The response strengths for each site at all times are normalized to the mean pre-GABA injection response strength.

Auditory activity in CM during quiet listening and singing

These physiological and anatomical studies show that CM is a major source of auditory input to the song system but cannot address whether CM has the potential to transmit auditory information to the song system in behavioral states in which the animal is engaged in vocal communication. This is an important issue because previous chronic recording studies indicate that auditory responses in NIf and HVC are primarily suppressed in the zebra finch during periods of wakefulness (Schmidt and Konishi, 1998; Cardin and Schmidt, 2003, 2004; Rauske et al., 2003). Therefore, we used a miniature motorized microdrive (see Materials and Methods) to record from CM single units in freely behaving adult male zebra finches (n = 3 birds). In almost all CM neurons we recorded in awake, freely behaving birds, robust auditory responses were evoked by playback of the BOS and other song and non-song stimuli (33 of 39 CM neurons were responsive to BOS playback) (Fig. 10A,B). As a population, these CM neurons were strongly selective for BOS versus noise (d′ = 2.40 ± 0.34, mean ± SEM), marginally nonselective for BOS versus CON (0.40 ± 0.30), and not selective for BOS versus REV (−0.09 ± 0.24) (Fig. 10B). These recordings reveal that CM neurons display robust auditory activity in the awake, nonvocalizing zebra finch and thus have the potential to convey auditory information to the song system during periods of quiet wakefulness.

Figure 10.

Chronic microelectrode recordings from CM neurons in awake and freely behaving zebra finches revealed elevated activity during song playback and during singing. A, Representative single-unit responses of two neurons (individual cells in rows) to playback of BOS, REV, CON, and noise revealed no bias for forward BOS over REV but a strong bias for BOS over noise in both cells. In one cell (top), CON song was a less effective stimulus than BOS, whereas CON was a more effective stimulus than BOS in the remaining cell. B, Single-unit responses collected from three zebra finches showed no response bias for BOS versus REV (n = 26 cells) or BOS versus CON (n = 21 cells). A weak bias for BOS was evident in comparisons of BOS versus CON (n = 17 cells) and a strong bias was evident for BOS versus noise (n = 17 cells). C, CUSUM analysis of action potential activity during playback and singing (see Materials and Methods) revealed that individual CM neurons were typically active in both states. In each panel, baseline activity was computed during the first 500 ms, and CUSUM values exceeding 3 SDs from that baseline (dashed lines) were taken as cases of significant activity. Each column illustrates the activity of a CM neuron during BOS playback (top) and singing (bottom). Each panel contains a CUSUM plot and the corresponding spectrogram of sound played through the speakers during playback or recorded through a microphone during singing. Below the spectrogram, white boxes indicate the occurrence of introductory notes, and gray boxes indicate the occurrence of a song motif. Only the first motif of each song is considered in these cases. D, Although individual CM neurons were active during both playback and singing, the temporal pattern of activity with respect to features of the song was different in the two conditions. Each column illustrates the activity of a CM neuron during auditory playback (black) and singing (gray) averaged across song motifs; activity was time warped as necessary to permit alignment of activity against the song motif (bottom). Solid lines indicate the mean baseline firing rate of the cell with no stimulus or singing, and the dashed lines illustrate 3 SDs above that mean rate. Data in the left column are from the same cell as the left column of C.

To determine whether CM neurons might detect singing-related auditory feedback, we recorded CM neuronal activity during singing and aligned the action potential activity to the onset of the first motif (n = 7 cells, in 3 birds; see Materials and Methods). In five of these cells, CUSUM (see Materials and Methods) revealed that action potential activity exceeded 3 SDs of the prevocalization mean activity level (Fig. 10C). In the remaining two cells, action potential activity was sparse throughout the recording, and an activity decrease was evident during singing (action potential rate decreased by >3 SDs from prevocalization baseline; data not shown). We also recorded BOS-evoked activity from these cells during periods when the bird was awake and not singing. Almost all of these cells (six of seven) showed increased activity during BOS playback, whereas activity in the remaining cell was suppressed. Notably, auditory suppression was not observed in either of the cells that showed singing-related decreases in activity. Although most CM neurons (five of seven) showed positive changes in action potential activity during both listening and singing, the BOS-related activity pattern exhibited qualitative differences in these two states (Fig. 10D). Thus, CM neurons in the awake zebra finch respond to auditory stimuli, and many CM neurons with BOS-evoked auditory activity also are active during singing, consistent with the idea that they could convey singing-related auditory feedback to NIf and HVC.
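The CUSUM criterion used above can be illustrated with a short sketch. The 500 ms baseline and the 3 SD threshold are taken from the Figure 10 legend; everything else is an assumption made for the example, including the 50 ms bin width, the use of the SD of baseline bin counts to set the threshold, and the function names. The exact procedure is described in the Materials and Methods, which are not reproduced here.

```python
from statistics import mean, stdev


def cusum(counts, n_baseline_bins):
    """Cumulative sum of deviations from the mean baseline spike count."""
    mu = mean(counts[:n_baseline_bins])
    total, trace = 0.0, []
    for c in counts:
        total += c - mu
        trace.append(total)
    return trace


def first_significant_bin(counts, n_baseline_bins, n_sd=3.0):
    """Index of the first post-baseline bin where |CUSUM| exceeds n_sd SDs.

    The SD is taken over the baseline bin counts (an assumption).
    Returns None if the threshold is never crossed.
    """
    threshold = n_sd * stdev(counts[:n_baseline_bins])
    for i, value in enumerate(cusum(counts, n_baseline_bins)):
        if i >= n_baseline_bins and abs(value) > threshold:
            return i
    return None


# Hypothetical spike counts in 50 ms bins: 10 baseline bins (500 ms),
# followed by elevated firing at vocal onset.
counts = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 5, 6, 5, 6]
print(first_significant_bin(counts, n_baseline_bins=10))
```

Because the threshold is applied to the absolute value of the CUSUM trace, the same function flags both increases and decreases in firing relative to the prevocalization baseline; the sign of the trace at the crossing distinguishes the two cases.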

Discussion

Juvenile songbirds must have auditory experience of a tutor song and singing-related feedback to learn their species-typical songs (Konishi, 1965; Marler and Waser, 1977; Marler and Sherman, 1983). Furthermore, adults of some species, including the zebra finch, rely on auditory feedback for song maintenance (Nordeen and Nordeen, 1992; Leonardo and Konishi, 1999; Lombardino and Nottebohm, 2000). Although these observations indicate that brain areas that encode aspects of auditory experience synaptically interact with brain areas important to singing, identifying the specific linkage has remained elusive. The present study identifies an important functional link between telencephalic areas implicated in auditory memory and the song system.

Electrophysiological recordings in both singing and anesthetized songbirds indicate that NIf is the earliest site within the telencephalon in which auditory information is integrated with song motor activity (McCasland, 1987; Janata and Margoliash, 1999; Cardin and Schmidt, 2004; Coleman and Mooney, 2004). One practical challenge to resolving the auditory inputs of the NIf is that it comprises a thin sheet of cells intimately embedded in Field L, the major auditory thalamorecipient zone in the bird's telencephalon (Durand et al., 1992; Vates et al., 1996; Zeng et al., 2004). The viral tracing methods used here show that CM axons terminate in NIf and include distinct swellings characteristic of synaptic boutons, confirming and extending previous studies by Vates et al. (1996). The lentiviral tracing method also revealed a direct axonal projection from CM to HVC. A previous study attributed fiber labeling in HVC after tracer (biotinylated dextran amine) injections into CM to a fibers-of-passage artifact involving mMAN axons that course through CM en route to HVC (Vates et al., 1996). The lentiviral method rules out this potential confound because axonal labeling requires somatic expression of the reporter construct (Roberts et al., unpublished observations) and because cell bodies in the two major telencephalic afferents of the HVC (i.e., mMAN and NIf) were unlabeled after viral tracer injections into CM (data not shown). These anatomical results show that CM innervates both NIf and HVC but cannot address whether this pathway is functionally important to auditory activity in the song system.

We found that pharmacological inactivation of CM strongly suppressed BOS-evoked activity in both NIf and HVC, confirming the functional importance of CM to auditory activity in the song system. Furthermore, we found that reversibly inactivating CM abolished the auditory responses that could be detected in HVC after irreversible lesions to NIf, indicating that CM supplies direct auditory input to both NIf and HVC. That some auditory activity persisted in HVC after lesions to NIf was unexpected, given previous studies showing that reversibly inactivating NIf strongly suppresses HVC auditory activity (Cardin and Schmidt, 2004; Coleman and Mooney, 2004). One possibility is that NIf normally functions as the major source of auditory input to HVC, but the strength of CM terminals in HVC undergoes functional enhancement after NIf lesions, perhaps through homeostatic processes. Another possibility is that the acute effects of deafferentation of HVC (i.e., immediately after NIf lesion) include a transient suppression of HVC neuronal responses to remaining auditory inputs, including those from CM. In either case, the present results indicate that activity in CM is necessary for much or all of the BOS-evoked auditory activity that can be detected in the NIf and HVC of the urethane-anesthetized zebra finch. In contrast, silencing CM activity had no discernible effect on auditory-evoked activity in Field L, a region that makes reciprocal connections with CM (Vates et al., 1996), suggesting that CM exerts greater influence on its feedforward (i.e., NIf and HVC) than feedback (Field L) targets. Furthermore, the strong suppressive effects of CM inactivation on song system auditory activity contrast with the finding that reversibly inactivating the thalamic nucleus Uvaeformis (Uva), which displays auditory activity and innervates both NIf and HVC, exerts little or no effect on auditory responses in HVC (Coleman et al., 2007). Together, these results underscore that CM functions as the dominant source of auditory information to the telencephalic components of the song system. However, additional experiments are needed to determine to what extent auditory flow to the song system from either Uva or CM varies with behavioral state and to establish whether auditory flow from either of these areas to the song system is necessary to song learning, maintenance, and perception.

A fascinating aspect of auditory responses in NIf and HVC is their BOS selectivity, a property that must develop at least in part via experience-dependent processes. The anatomical and physiological origins of BOS selectivity remain obscure, although the proportion of selective neurons greatly increases between Field L and HVC (Lewicki and Arthur, 1996; Janata and Margoliash, 1999). Furthermore, the selectivity of the action potential output of HVC-projecting NIf neurons rivals the selectivity of synaptic responses recorded in HVC, indicating that the relative bias to the BOS is primarily established either in or before NIf (Coleman and Mooney, 2004). Other studies have reported that the CM neuronal population is not BOS selective (Amin et al., 2004; Shaevitz and Theunissen, 2007), although we found that both single-unit and multiunit CM neuronal populations were moderately BOS selective. Factors that may have contributed to these contrasting observations include the different recording methods used and a tendency in the current study to target recordings to ventromedial CM. We also found that the suprathreshold selectivity of CM neurons for BOS–REV was on average less than the subthreshold selectivity of NIf neurons, and STA analysis showed that BOS-evoked action potentials in both selective and nonselective CM neurons could be associated with depolarizing membrane potential fluctuations in NIf. These results support a model in which BOS selectivity is refined between CM and NIf, as suggested by Theunissen and colleagues (Amin et al., 2004; Shaevitz and Theunissen, 2007). Furthermore, the exclusively depolarizing STAs resulting from these dual recordings, the CM-leading timing signatures, and the suppression of auditory responses in NIf and HVC after CM inactivation indicate that CM provides excitatory auditory drive to the song system. Intriguingly, the amplitude of STAs between coupled CM–NIf cell pairs decreased during BOS playback, reminiscent of stimulus-dependent changes in functional connectivity detected in the mammalian auditory system (Frostig et al., 1983; Eggermont, 1994). Notably, a recent study in zebra finches also detected decreased efficacy of CM–HVC interactions during BOS playback (Shaevitz and Theunissen, 2007), suggesting that CM inputs onto NIf and HVC neurons may exhibit rate-dependent depression or that the activity of CM neurons may become desynchronized during auditory stimulation.
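For readers unfamiliar with spike-triggered averaging, the following Python sketch shows the general idea behind the STA analysis discussed above. It is an illustration under stated assumptions rather than the analysis code used in this study: it assumes a membrane potential trace sampled at a fixed rate from one cell and a list of spike sample indices from a second, simultaneously recorded cell. The function name `spike_triggered_average` and its parameters are hypothetical.

```python
def spike_triggered_average(vm, spike_idx, pre, post):
    """Average membrane potential around each presynaptic spike.

    vm        -- membrane potential samples from the intracellular recording
    spike_idx -- sample indices of spikes in the extracellular recording
    pre, post -- number of samples to keep before and after each spike
    Returns a list of length pre + post + 1; element `pre` is the spike time.
    """
    # Keep only spikes whose full window lies inside the recording.
    segments = [
        vm[i - pre : i + post + 1]
        for i in spike_idx
        if i - pre >= 0 and i + post < len(vm)
    ]
    if not segments:
        raise ValueError("no spike has a complete window in the trace")

    n = len(segments)
    # Average the aligned windows sample by sample.
    return [sum(column) / n for column in zip(*segments)]
```

In such an average, a depolarizing deflection that peaks after the trigger time (index `pre`) is what would be expected if spikes in the triggering cell tend to precede excitation in the recorded cell, which is the kind of CM-leading signature described above.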

Recent evidence implicates CM as a site in which auditory memories important to song perception are stored. When adult European starlings are trained on auditory behavioral tasks, their CM neurons develop an electrophysiological response bias to the auditory stimuli used in the behavioral assay, regardless of whether the stimuli were associated with a reward (Gentner and Margoliash, 2003). CM also forms reciprocal connections with NCM (Vates et al., 1996), another important site of auditory plasticity. Auditory responses and immediate early gene (IEG) expression levels in NCM neurons habituate quickly and persistently in response to repetitive auditory stimulation in adult birds (Chew et al., 1995; Jarvis et al., 1995; Mello et al., 1995). This form of auditory habituation, which is not observed in the primary input of NCM, Field L, is a putative correlate of auditory memory. Moreover, when adult birds are presented with songs they last heard as juveniles (such as their tutor's song), IEG expression in their NCM is enhanced relative to IEG expression elicited by novel songs, implying the presence of a long-term auditory memory within NCM (Terpstra et al., 2004). These studies suggest that secondary regions of the songbird's auditory telencephalon may be important sites for storing information about the bird's auditory experience.

The present study shows that CM plays an essential role in driving auditory activity in NIf and HVC. The direct connection between CM, an area implicated in auditory memory, and both NIf and HVC, highlights a pathway that could be used for processing self-generated vocalizations as well as those of other birds. The chronic recordings we made in awake birds show that CM neurons are active during singing and in response to auditory presentation of other birds' songs. Because CM is located in the auditory telencephalon and is not known to receive input from song motor or premotor areas, singing-related activity in CM neurons likely reflects auditory feedback rather than corollary discharge. Thus, CM is well suited to convey auditory feedback information important to song learning and maintenance. In this context, an important goal of future experiments will be to determine whether CM neurons detect auditory feedback perturbations that disrupt song learning and maintenance. Moreover, the present results suggest that the synapses formed between CM axons and NIf and HVC neurons form the apex of an auditory–vocal pathway important to learned vocal communication in songbirds, possibly identifying a general architecture for producing and perceiving learned vocalizations. In the human brain, a synaptic interface between secondary or tertiary regions of the auditory cortex and regions of the lateral frontal cortex is likely to facilitate speech learning and perception.

Footnotes

References