Chromatic Light Adaptation Measured using Functional Magnetic Resonance Imaging

ARTICLE, Behavioral/Systems

Journal of Neuroscience 15 September 2002, 22 (18) 8148-8157; https://doi.org/10.1523/JNEUROSCI.22-18-08148.2002

Abstract

Sensitivity changes, beginning at the first stages of visual transduction, permit neurons with modest dynamic range to respond to contrast variations across an enormous range of mean illumination. We have used functional magnetic resonance imaging (fMRI) to investigate how these sensitivity changes are controlled within the visual pathways. We measured responses in human visual area V1 to a constant-amplitude, contrast-reversing probe presented on a range of mean backgrounds. We found that signals from probes initiated in the L and M cones were affected by backgrounds that changed the mean absorption rates in the L and M cones, but not by background changes seen only by the S cones. Similarly, signals from S cone-initiated probes were altered by background changes in the S cones, but not by background changes in the L and M cones. Performance in psychophysical tests under similar conditions closely mirrored the changes in V1 fMRI signals. We compare our data with simulations of the visual pathway from photon catch rates to cortical blood–oxygen level-dependent signals and show that the quantitative fMRI signals are consistent with a simple model of mean-field adaptation based on Naka–Rushton (Naka and Rushton, 1966) adaptation mechanisms within cone photoreceptor classes.

The sensitivity of the human photopic visual system declines with increasing ambient intensity. This sensitivity regulation is part of a process called light adaptation. Light adaptation is an important computational step in stabilizing object appearance across illumination conditions. It involves neural mechanisms that operate as early as the cone photoreceptors. Two different models for a light adaptation mechanism at this stage have been considered: it either “resides within individual photoreceptors or operates on signals from individual receptors” (He and Macleod, 1998). These mechanisms are not mutually exclusive: there may be sensitivity control mechanisms internal to cones as well as others that act on the signals from individual cones. The different mechanisms can be discriminated experimentally by an analysis of the spatial and spectral sensitivity of light adaptation.

The behavioral color literature contains many measurements demonstrating that color appearance is regulated by changes in the gain (multiplicative scaling) of cone signals (Chichilnisky and Wandell, 1996; Nascimento and Foster, 1997; Bauml, 1999; Chichilnisky and Wandell, 1999; Speigle and Brainard, 1999; Foster et al., 2001). Moreover, analyses of the information in the signal suggest that changing the cone signal gain will provide a reasonably satisfactory solution across natural surfaces and illuminants (Foster and Nascimento, 1994; Wandell, 1995). The behavioral data also indicate that in certain conditions, cone gain can be regulated by post-receptoral neurons. A classic example of this is the phenomenon of transient tritanopia (Mollon and Polden, 1977). More recently, Delahunt and Brainard (2000) have used an elegant asymmetric color-matching procedure to show that the gain of S cone signals is influenced by absorptions in the L and M cone pathways.

There have also been electrophysiological measurements of cone gain control. Valeton and van Norren (1983) made extracellular measurements of cone potentials and developed a model of adaptation based on Naka–Rushton response characteristics at the level of individual cone types (Naka and Rushton, 1966). Boynton and Whitten (1970) measured the electroretinogram (ERG) in the human eye and concluded that significant light adaptation can be measured using this method. These measurements have been extended using new biophysical techniques to isolate cone responses in the ERG a-wave (Hood and Birch, 1995; Paupoo et al., 2000), showing significant changes in cone sensitivity as the mean field intensity increases. There also have been direct measurements of cone adaptation in the dissociated outer segment (Schnapf et al., 1990) and in the inner segments in which the pigment epithelium is removed (Schneeweis and Schnapf, 1999). These measurements are qualitatively similar to those made by Valeton and van Norren (1983) but predict slightly different cone adaptation dynamics (specifically, slightly less adaptation at low luminance levels). Hood (1998) summarizes the situation by saying that there is considerable quantitative uncertainty about the presence of gain regulation of the cone signal and the role of post-receptoral processes.

In this paper we assess light adaptation mechanisms under conditions in which a large, steady, uniform background influences the response to a spatially and temporally localized probe. We compare behavioral and physiological responses obtained using functional magnetic resonance imaging (fMRI) and psychophysical thresholds and describe a simulation that begins with the cone absorptions, includes a model of neuronal processing, and predicts both the fMRI signal and psychophysical performance.

MATERIALS AND METHODS

Color calibration procedures and display system

Stimuli were designed and presented using the psychophysics toolbox software (Brainard, 1997) running under Matlab 5.3 on a Macintosh PowerPC 720. A graphics card with 10-bit precision per red, green, and blue (RGB) channel controlled the display devices (Radius).

Color stimuli were calibrated using methods described elsewhere (Brainard, 1989; Wandell, 1995). The cathode ray tube (CRT) and liquid crystal display (LCD) spectra and gamma curves were measured in situ at 4 nm intervals in the range 380–770 nm using a photospectrometer (Photoresearch PR-650, Chatsworth, CA). Cone photoreceptor spectral sensitivities were taken from the Stockman cone fundamentals (Stockman et al., 1993b). These fundamentals incorporate the macular pigment absorption spectrum into their shape, but the peak values are normalized to 1. To account for the effects of macular pigment absorptions, we scaled the S-cone curve by a factor of 0.7 when calculating cone photoisomerization rates. From the cone fundamentals and the display spectra, we could compute the display parameters needed to generate cone-isolating signals. Absolute cone isomerization rates were determined using calculations similar to those described by Rodieck (1998).

The range of the mean field background illumination levels was limited by the gamut of the CRT or LCD monitors. The range of levels that we could obtain represents only a small fraction of the naturally occurring range. Our background illumination changes are therefore limited compared with those used in some other studies (Stiles, 1949; Mollon and Polden, 1977; Valeton and van Norren, 1983; Schneeweis and Schnapf, 1999). Extending the background illumination range would require an additional controlled illumination source that can be used within the MR environment. Such a source was not available to us at the time these experiments were performed.

All experimental subjects were color normal and had a corrected acuity of 20/20 or better.

Spatial and temporal stimulus properties

The probe stimuli were circular, phase-reversing “dartboard” patterns composed of superimposed concentric circles and radial sectors of alternating positive and negative contrast (Fig. 1a). The stimulus diameter spanned 10° of visual angle and was presented in the center of a monitor that subtended 32°. The dartboard pattern had an angular spatial frequency of 12 cycles per 2π radians and a radial spatial frequency of 0.25 cycles per degree. The contrast of the fMRI stimulus alternated at 2 Hz (square-wave modulation). Probe amplitudes were either 2% of the long (L) plus middle (M) or 9% of the short (S) cone class amplitude at the midpoint of the background range. For example, if the cone photoisomerization rates (p*) for the L, M, and S cones at the midpoint background were 2700 p*, 2200 p*, and 1100 p*, respectively, and the probe was defined along the S cone isolating direction, then the constant probe amplitude was set to ±99 S-cone p*. These midpoint contrasts were chosen so that they were roughly twice the psychophysically determined target detection threshold contrast on a mean gray background.
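The arithmetic behind the constant-amplitude probe can be sketched as follows. This is only an illustration using the example rates quoted above; the function names and the list of background rates are hypothetical and are not taken from the original stimulus software.

```python
# Example midpoint photoisomerization rates from the text (p*).
MIDPOINT_RATES = {"L": 2700.0, "M": 2200.0, "S": 1100.0}


def probe_amplitude(midpoint_rate, fraction):
    """Fixed probe amplitude, defined as a fraction of the midpoint background rate."""
    return fraction * midpoint_rate


def effective_contrast(amplitude, background_rate):
    """Contrast of a fixed-amplitude probe seen against a given background rate."""
    return amplitude / background_rate


# 9% S-cone probe defined at the midpoint background (about +/-99 p*).
s_amplitude = probe_amplitude(MIDPOINT_RATES["S"], 0.09)

# Illustrative S-cone background rates spanning the midpoint (assumed values).
for background in (600.0, 1100.0, 2000.0):
    contrast = effective_contrast(s_amplitude, background)
    print(f"background {background:6.0f} p*: probe contrast {100 * contrast:.2f}%")
```

Because the amplitude is fixed, the physical contrast of the probe falls as the background rate rises, which is the manipulation the experiments rely on.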

Fig. 1.

Stimulus description. a, In the fMRI experiment, the probe stimuli were presented in a block design of 18 sec probe stimulation followed by 18 sec of mean field. The mean field slowly changed value, but the amplitude of the probe stimulus was fixed. Ten blocks were presented in a single scan. The probe contrast was 1% as measured on a neutral gray background. Throughout, the subject attended to a small fixation mark that occasionally flickered. The subject's task was to indicate when the fixation flickered. b, The psychophysical experiment consisted of a sequence of trials. Each trial lasted 1900 msec, and a single session comprised 240 trials. The subjects indicated the quadrant of a missing sector of the probe. The mean field slowly changed value, but the amplitude of the probe stimulus was fixed. See Materials and Methods for details.

The constant amplitude probe stimuli were superimposed on a full-field, spatially homogeneous background whose mean level varied with time. Slight, additional modifications to the probe/background stimulus were made depending on the type of experiment (fMRI or psychophysical target discrimination). These are described later in this section.

In practice, some deviation from the ideal case of cone independence must exist because of inaccuracies in the measurement of the monitor spectra and the limited precision of display RGB settings (10 bit). From repeated calibrations and calculations, and because of the good separation between S cone and L and M cone absorption curves, the cross-talk between different cone class excitations is minimal. Assuming standard macular pigment densities and ignoring L and M cone polymorphisms, the worst case is the cross-talk between the S-cone modulating probe and the M cones. In this case, the S-cone probe causes a modulation of the M-cone isomerization rate that is ∼4% of that experienced by the S cones. For example, when the S-cone modulation is 9%, the worst case M-cone modulation is 0.36%. This cross-talk is unlikely to alter our results significantly.

FMRI methods

MR instruments. The fMRI experimental apparatus has been described elsewhere (Wandell et al., 1999; Press et al., 2001). Briefly, subjects were supine in the scanner bore (1.5T General Electric Signa). Data were acquired from 16 coronal slices prescribed from an initial localizer: a set of fast, sagittal T1-weighted images. The T2*-weighted functional data had a 4 mm inter-slice spacing, and the effective in-plane resolution was ∼2 mm × 2 mm. A self-navigated, interleaved spiral protocol (Noll et al., 1995; Glover and Lai, 1998) was used with a data acquisition repetition time (TR) of 1.5 sec and two interleaves. Hence, the effective inter-frame sampling interval was 3 sec. All data were acquired using a receive-only surface coil placed behind the subject's head in an arrangement that preferentially measures signals from the visual areas near the occipital pole.

MR stimulus display. Subjects viewed stimuli presented on a calibrated LCD color monitor inside the scanner room at the foot of the patient table (NEC 2000; 1024 × 768; refresh rate 60 Hz). The monitor was inside a shielded box with transparent conductive glass on one face. The supine subjects viewed the screen through binoculars and mirrors, adjusting these elements to ensure that the stimulus was centered in the visual field and no vignetting occurred. Head movement was minimized by padding and/or a bite bar. Monitor calibration was performed on site once every 12 weeks; no significant change was reported during the period of these experiments.

MR experimental design. Stimuli were presented in a block design comprising 18 sec of probe followed by 18 sec of uniform background. The 36 sec cycle time was chosen to minimize the effect of interference between the undershoot at the end of one hemodynamic response with the start of the following one (Glover, 1999). A total of 10 blocks were presented, and an additional 36 sec blank adaptation period was added at the start of the scan. The background changed constantly and smoothly throughout the experiment, ramping up and then back down to ensure that probe presentations were symmetrical around the center of the range (Fig. 1a). Only data from background conditions two through nine were analyzed to ensure that the subject was well adapted to the initial background condition and that epochs with symmetrical conditions could be averaged. The total scan time was 396 sec.
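The timing of one scan can be laid out with a short sketch. The constants come from the description above (18 sec on, 18 sec off, 10 blocks, a 36 sec blank lead-in, and a 3 sec effective frame interval); the function names and the assumption that frames are aligned to the start of the scan are illustrative only.

```python
FRAME_INTERVAL = 3.0   # sec per frame (1.5 sec TR x two interleaves)
PROBE_ON = 18.0        # sec of probe per block
PROBE_OFF = 18.0       # sec of uniform background per block
N_BLOCKS = 10
PRE_ADAPT = 36.0       # blank adaptation period at the start of the scan


def build_design():
    """Return a list of (frame_time_sec, probe_on) tuples for one scan."""
    cycle = PROBE_ON + PROBE_OFF
    total = PRE_ADAPT + N_BLOCKS * cycle
    n_frames = int(round(total / FRAME_INTERVAL))
    design = []
    for i in range(n_frames):
        t = i * FRAME_INTERVAL
        # The probe is on during the first 18 sec of each 36 sec cycle.
        on = t >= PRE_ADAPT and ((t - PRE_ADAPT) % cycle) < PROBE_ON
        design.append((t, on))
    return design


def analyzed_blocks():
    """Blocks two through nine (1-indexed) are the ones kept for analysis."""
    return list(range(2, 10))


design = build_design()
scan_duration = len(design) * FRAME_INTERVAL
print(len(design), "frames,", scan_duration, "sec")
```

With these values the scan contains 132 frames, 12 per block, of which 6 per block coincide with the probe.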

To reduce the dependence of the V1 signal on variations in the subject's attention, a small (0.1°), high-contrast fixation box was placed at the center of the screen throughout the scan. The fixation box flickered for a single video frame at random points in time with an average inter-flicker interval of 5 sec. This flickering was just perceptible. The subject's task was to tap a finger very lightly when the flicker was detected. The fixation task is unlikely to have contributed to the blood–oxygen level-dependent (BOLD) signal at the stimulus alternation frequency: the fixation target was (1) very small, (2) presented throughout both periods of the scan, and (3) flickered at irregular times at a rate much higher than the probe/blank alternation frequency.

MR signal processing. The fMRI BOLD amplitude data were analyzed using our in-house suite of fMRI data analysis tools (the software used to analyze the data in this paper will be provided on request). Linear trends were removed from the BOLD time signals before further analysis. No low-pass temporal filtering (other than that imposed by the natural cutoff limit of the scanner and the finite temporal sampling frequency of the data acquisition) was performed. No spatial filtering beyond that of the scanner acquisition parameters was performed.

FMRI amplitudes were calculated for each probe presentation epoch in each voxel in each scan using the following method. Time series for each voxel were computed as a percentage modulation of the mean. The time points corresponding to blocks two through nine were identified, and the time series were subdivided into eight blocks corresponding to the eight probe presentations of interest (four different background levels with two directions of color change). The time periods of these eight blocks were selected to account for the individual subject's measured hemodynamic delay (ranging between 4 and 6 sec) determined from an initial reference scan. The reference scans used stimuli of high, constant amplitude that modulated the same cone classes as the later scans in the session. The background in the reference scans was a constant mean gray.

The Fourier transform of each block was calculated, yielding eight Fourier transforms for each voxel. The amplitude and phase of the Fourier component at the stimulus alternation frequency were calculated. The response amplitude is the amplitude of the first harmonic multiplied by the cosine of its phase lag. The response amplitude measures only the amplitude that is in the proper phase relation to the probe stimulus, effectively removing contributions from voxels that had large, spurious, out-of-phase responses. This procedure implicitly models the fMRI response as a harmonic function; previous work here and in other labs suggests that this is a reasonable approximation for this type of block-design experiment (Bandettini et al., 1993; Press et al., 2001).
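The phase-projected amplitude described above can be sketched for a single block as follows. This is a minimal illustration, not the in-house analysis code: the function name is hypothetical, a 36 sec block sampled every 3 sec is assumed to give 12 samples, and the block is assumed to have already been shifted by the hemodynamic delay so that an ideal response has zero phase lag relative to a cosine.

```python
import cmath
import math


def projected_amplitude(block, cycles_per_block=1, expected_phase=0.0):
    """Amplitude of one Fourier component multiplied by the cosine of its phase lag.

    block: samples (percent modulation) covering one probe/background block.
    cycles_per_block: stimulus cycles contained in the block.
    expected_phase: phase (radians) of a response perfectly in step with the probe.
    """
    n = len(block)
    mean = sum(block) / n
    # Single DFT coefficient at the stimulus alternation frequency.
    coeff = sum(
        (x - mean) * cmath.exp(-2j * math.pi * cycles_per_block * k / n)
        for k, x in enumerate(block)
    )
    amplitude = 2.0 * abs(coeff) / n
    phase_lag = cmath.phase(coeff) - expected_phase
    return amplitude * math.cos(phase_lag)


# Synthetic blocks of 12 samples: one in phase, one a quarter cycle out of phase.
in_phase = [1.5 * math.cos(2 * math.pi * k / 12) for k in range(12)]
quadrature = [1.5 * math.sin(2 * math.pi * k / 12) for k in range(12)]
print(projected_amplitude(in_phase), projected_amplitude(quadrature))
```

An in-phase response keeps its full amplitude, a response a quarter cycle out of phase contributes approximately nothing, and an anti-phase response contributes a negative value, which is how spurious out-of-phase voxels are discounted.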

The mean amplitude response for each probe/background combination is the amplitude from all voxels within the region of interest averaged across all blocks and multiple sessions. The SEM amplitude across sessions was calculated and is shown as error bars in the figures. There was no significant effect of the direction of background change (i.e., increasing vs decreasing cone isomerization rate), and therefore amplitudes from both background-symmetrical conditions in a single scan were averaged together.
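As a small illustration of the summary statistics, the mean and SEM across sessions could be computed as below. The conventional definition of SEM (sample standard deviation divided by the square root of the number of sessions) is assumed, and the amplitude values are made up for the example.

```python
import math
import statistics


def mean_and_sem(session_amplitudes):
    """Mean and standard error of the mean across sessions."""
    mean = statistics.mean(session_amplitudes)
    sem = statistics.stdev(session_amplitudes) / math.sqrt(len(session_amplitudes))
    return mean, sem


# Hypothetical per-session amplitudes (% BOLD) for one probe/background condition.
example_sessions = [0.82, 0.95, 0.88, 0.91]
mean_amp, sem_amp = mean_and_sem(example_sessions)
print(f"mean {mean_amp:.3f}, SEM {sem_amp:.3f}")
```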

Three subjects took part in the fMRI experiments. Two of the subjects (A.R.W., B.W.) participated in both the L+M probe and S − (L+M) probe experiments. An additional subject (W.A.P.) participated in the L+M cone probe experiments only. At least 12 repetitions of each probe/background combination were obtained from each subject.

Visual area identification. Retinotopic visual areas were identified for all subjects using phase-encoded retinotopic mapping procedures described previously by this and other labs (Engel et al., 1997). Regions of interest (ROIs) delimiting areas V1, V2D, V2V, V3V, V3D, and V4V were saved, and functional analysis was confined to just those voxels lying within these predefined ROIs (Fig. 2).

Fig. 2.

The locations of visual areas V1, V2, and V3 are shown in an expanded view of subject B.W.'s left hemisphere. The locations of these visual areas were identified for each subject in separate phase-encoded retinotopic mapping experiments. The data from those experiments were represented on a flattened representation of the cortical gray matter (not shown here), and the area boundaries were identified from reversals of the angular field representation. The retinotopic mapping procedures identified several visual areas. In this paper we present data from area V1 only.

The ROIs were further restricted within each session to those voxels activated by a reference scan presented at the start of the scan session. This reference scan consisted of a high-contrast checkerboard pattern with the same spatial and temporal characteristics as the probe described above. The reference probe was presented against a constant gray-level background in an 18 sec on/18 sec off block design. The color contrast of the reference scan dartboard was the same as that of the probes in the subsequent adaptation scans except that the contrast was constant and high enough to produce a robust response in V1 on each presentation. The purpose of the reference scan was to restrict the analysis to those voxels that could potentially be activated by the probe in the adaptation scans that followed. The reference scans also provided an estimate of the hemodynamic delay so that this could be accounted for in the calculation of the response amplitudes (see MR signal processing).

Results are presented for V1 only. Signals from other areas were found to have lower signal-to-noise ratios and were not analyzed fully in this study.

Psychophysical methods

Psychophysical stimuli. Psychophysical test stimuli were conceptually similar to those used in the fMRI experiment. The probe was a constant-amplitude contrast pattern superimposed on a background that changed slowly along either the same or a different color direction. In the behavioral experiments, one sector of the dartboard probe (angle π/6) was set to zero contrast, creating a target with a clearly defined orientation (Fig. 1b). The probe could be presented in one of four different orientations: 0, π/2, π, or 3π/2 radians. The subject's task was to indicate the position of the missing sector (and hence the orientation of the probe) by pressing one of four buttons.

As in the fMRI experiments, the constant-amplitude probe contrast was in either the S or (L+M) color directions. The probe was superimposed on a slowly varying background that changed in either the same or a contrasting color direction to the probe contrast. The probe contrast was adjusted in a preliminary experiment to yield a correct response on 75% of the trials. The amplitudes of these threshold probes were close to 1% (in the case of L+M cone modulating probes) or 4% (in the case of S cone modulating probes) of the background and were therefore approximately half the probe amplitudes used in the fMRI experiment. The color of the fixation target was varied briefly during the test cycle as described below. As in the fMRI experiments, the background changed very slowly. The background ramped up from the minimum to the maximum and then down again over a 456 sec period. Although the time courses of the fMRI and psychophysics experiments were similar, the rate of change of background illumination in the psychophysical experiments was slightly slower. In both cases, however, the instantaneous variations in illumination were imperceptible to the observers and occurred at a rate that was presumably well within the rate of the mean-field adaptation mechanism.

Trials occurred once every 1.9 sec, and there were a total of 240 trials in each experiment. Each trial consisted of a 1 sec blank period followed by a 400 msec test period. During the test period, subjects fixated a small (0.1°) square target present at the center of the screen. The fixation target switched from black to white to indicate to the subject that a test presentation was occurring. After the test, the subject had 400 msec to respond; the fixation target turned red for this duration before reverting to black for the blank period. The first and last 24 trials were excluded from the data analysis providing a preadaptation period of 45.6 sec, thus paralleling the fMRI experiments in which the first and last trials were discarded to provide a preadaptation period and a symmetrical measurement for data analysis. Results presented are the average data from six experimental repeats on each subject.

There was no significant correlation between response accuracy and the direction of background color variation. Consequently, responses from the beginning and end of the trial (with identical instantaneous background levels) were averaged together. Responses were averaged in bins of 24, resulting in eight measurements of the subject's target discrimination accuracy at eight different background levels. These data were plotted as percentage correct versus background level in units of cone isomerization rates.
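One way to read the binning procedure is sketched below. It assumes, as an interpretation rather than a documented detail, that each of the eight background levels pools 12 trials from the rising half of the ramp with the 12 trials at the matching level on the falling half, giving bins of 24 trials. The function name and the synthetic response list are hypothetical.

```python
def percent_correct_by_level(correct, n_discard=24, n_levels=8):
    """Percentage correct at each background level after folding the ramp.

    correct: list of 0/1 flags, one per trial, in presentation order.
    """
    kept = correct[n_discard:len(correct) - n_discard]
    half = len(kept) // 2
    rising = kept[:half]
    # Reverse the falling half so index i refers to the same background level.
    falling = kept[half:][::-1]
    per_half = half // n_levels
    result = []
    for level in range(n_levels):
        start = level * per_half
        pooled = rising[start:start + per_half] + falling[start:start + per_half]
        result.append(100.0 * sum(pooled) / len(pooled))
    return result


# Synthetic session: every trial correct on the way up, none on the way down.
synthetic = [1 if i < 120 else 0 for i in range(240)]
print(percent_correct_by_level(synthetic))
```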

Four subjects (J.R., B.J.W., B.W., A.R.W.) participated in the psychophysics experiments. Two of the subjects (A.R.W., B.W.) also served as subjects in the fMRI experiments.

Simulation. Figure 3 shows an overview of a simulation designed to predict the BOLD signal and the psychophysical performance (for details see ). Briefly, the simulator begins with a physical description of the stimulus radiance. This radiance is converted to retinal illuminance based on conventional optics calculations. The illuminance is converted to time-varying cone isomerizations (Fig. 3a) based on the Stockman cone fundamentals (Stockman et al., 1993a, 2002), macular pigment density, and estimates of the cone photoreceptor apertures and optical efficiency (Rodieck, 1998). The isomerizations are converted to time-varying voltages using the cone adaptation measurements from Valeton and van Norren (1983). The mean output of the cone photoreceptors increases slightly with increasing background level. The next stage removes this local mean, effectively calculating the temporal contrast of the cone signal (Fig. 3b). This action parallels the processing that is performed by retinal ganglion cells (Rodieck, 1998). The zero-mean voltages are then transformed to an opponent-color signal. Finally, the signal is rectified to produce a summary of the cortical activity (Fig. 3c).
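The front end of such a simulation can be illustrated with a toy version of the cone stage. The sketch below uses the generic Naka–Rushton form V = Vmax·Iⁿ/(Iⁿ + σⁿ) with a made-up semi-saturation constant; it does not reproduce the parameters measured by Valeton and van Norren (1983), it stands in for mean removal by subtracting the response to the steady background, and it omits the opponent-color transform. All names and numbers are assumptions.

```python
def naka_rushton(intensity, sigma, n=1.0, v_max=1.0):
    """Generic Naka-Rushton response to a steady intensity."""
    return v_max * intensity ** n / (intensity ** n + sigma ** n)


def rectified_probe_response(background, probe_amplitude, sigma):
    """Mean rectified response to a +/- probe riding on a steady background.

    The response to the background alone is subtracted (a stand-in for removing
    the local mean), and the remaining modulation is full-wave rectified.
    """
    baseline = naka_rushton(background, sigma)
    up = naka_rushton(background + probe_amplitude, sigma) - baseline
    down = naka_rushton(background - probe_amplitude, sigma) - baseline
    return 0.5 * (abs(up) + abs(down))


SIGMA = 1000.0   # hypothetical semi-saturation constant (p*)
PROBE = 99.0     # fixed S-cone probe amplitude from the stimulus description (p*)

for bg in (600.0, 1100.0, 2000.0):  # illustrative background rates
    print(bg, rectified_probe_response(bg, PROBE, SIGMA))
```

Because the nonlinearity flattens at higher intensities, the same physical probe produces a smaller rectified modulation on a brighter background, which is the qualitative behavior the simulation relies on.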

Fig. 3.

An overview of the simulation method. The graphs illustrate an S cone probe presented on a background change seen by the S cones. a, The photoisomerization rates (p*) of the three cone classes (dashed line, L cone; dotted line, M cone; solid line, S cone). b, The output of the opponent-colors response after removal of the mean. c, The rectified time series. d, The simulated BOLD signal after smoothing the time series in c. The gray-shaded region denotes one cycle of the stimulus alternation. See Materials and Methods for details.

Using the simulated cortical activity, we predict both the BOLD signal and the psychophysical performance. To predict the amplitude and time course of the BOLD signal, the cortical activity is convolved with a hemodynamic response function that blurs the time-varying activity (Fig. 3d). Because we are now operating in arbitrary scale units, the simulated cortical signal is multiplicatively scaled to bring the simulated BOLD signal into register with the measured signal. Scaling by this common factor does not change the predicted ratios of the responses. The change in psychophysical performance is estimated from the cortical signal using signal detection theory as described in detail in the . Note that the scaling factors that we estimate from the measured BOLD data are also used to scale the simulated cortical signals when we calculate the ideal observer psychophysical curves using different probe amplitudes.
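The final two steps, temporal blurring by a hemodynamic response function and a single multiplicative scaling, can be sketched as follows. The gamma-shaped response function, its parameters, and the use of a least-squares scale factor are assumptions made for illustration; the text above does not specify them.

```python
import math


def gamma_hrf(dt, duration=24.0, shape=3.0, scale=1.5):
    """Hypothetical gamma-shaped hemodynamic response, sampled every dt sec, unit sum."""
    times = [i * dt for i in range(int(duration / dt))]
    values = [t ** (shape - 1.0) * math.exp(-t / scale) for t in times]
    total = sum(values)
    return [v / total for v in values]


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        for j, k in enumerate(kernel):
            if i - j >= 0:
                out[i] += signal[i - j] * k
    return out


def least_squares_scale(simulated, measured):
    """Single multiplicative factor that best maps simulated onto measured."""
    numerator = sum(s * m for s, m in zip(simulated, measured))
    denominator = sum(s * s for s in simulated)
    return numerator / denominator


# Rectified activity for four 36 sec cycles sampled every 3 sec (6 on, 6 off).
activity = ([1.0] * 6 + [0.0] * 6) * 4
simulated_bold = convolve(activity, gamma_hrf(3.0))

# Stand-in for measured data: the simulation scaled by an arbitrary factor.
measured_bold = [2.5 * b for b in simulated_bold]
print(least_squares_scale(simulated_bold, measured_bold))
```

A common scale factor changes the absolute size of every simulated response but leaves the ratios between conditions untouched.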

RESULTS

FMRI

Figure 4 shows the time series of the fMRI BOLD signal measured in two different conditions. In one condition (Fig. 4a), the probe and background variations were both visible to the L and M cones but not the S cones. In the other condition (Fig. 4b), the probe was visible to the L and M cones, but the background change was visible only to the S cones. These data illustrate the main experimental effect. When the probe and background changes are seen by the same cone classes, the changing background significantly influences the signal amplitude. When the probe and background changes are seen by different cone classes, the time series amplitude remains constant while the background changes.

Fig. 4.

The time series of the fMRI signal. The response to an S cone-initiated probe is shown. a, The background change was visible to the L and M cones, but not the S cones. b, The background change was visible to the S cones, but not the L and M cones.

Figure 5 shows the change in probe amplitude at different background levels. The probe amplitude signal change is expressed as a percentage change of the mean BOLD signal level (see Materials and Methods). Background levels are expressed in units of background cone isomerization rates for the cone class that was varied. For example, in the condition where the background changed along the S cone color axis, the test backgrounds generated S-cone isomerization rates of between ∼600 and 2000 S-cone isomerization events per second.

Fig. 5.

The amplitude of the BOLD signal is shown as a function of the mean background isomerization rate. a shows the amplitude measurements for background changes seen only by the L and M cones. The test probe was either an S cone-initiated stimulus (dashed line) or an L+M cone-initiated stimulus (solid line). b shows the amplitude measurement for background changes seen only by the S cones. The test probe was either an L+M cone-initiated (solid line) or an S cone-initiated (dashed line) stimulus. The data are average amplitudes from two observers, each with at least 12 samples per observer. See Materials and Methods for details.

Figure 5a shows the effect of changing the background (L+M) cone absorptions on signals generated by S and L+M cone probes. The response amplitude to the S cone probe (Fig. 5a, dashed line) is invariant across changes in the mean L+M cone excitation, whereas the response of the L+M cone probe (solid line) changes from ∼0.9 to 0.5% BOLD contrast. Similarly, Figure 5b shows the effect of changing the background along the S cone color direction for an L+M or S cone isolating probe. There is no significant change in the probe response amplitude for the L+M cone probe (Fig. 5b, solid line), but the signal caused by the S cone probe (dashed line) drops from ∼1.7 to 1.1% BOLD contrast.

The theoretical significance of these values will be considered later when we develop the simulation predictions. It is interesting to note that at the highest background levels, the probe was close to the target discrimination threshold as measured in the psychophysical task described below. Even so, an easily measurable BOLD response was present.

Psychophysics

Figure 6 shows the results of the psychophysical experiment. These graphs plot percentage of correct sector identification as a function of mean background cone isomerization rates. As in the fMRI experiments, the probes were constant amplitude. The general principles observed in the fMRI measurements are repeated in the psychophysical graphs for all four subjects. In the cases where the background and probe varied along the same color axis, increasing the background level reduced target discrimination performance. When the change was along a different color direction, the background variation had no significant effect on performance.

Fig. 6.

The psychophysical performance in the sector-discrimination task. a, Performance is shown for the condition when the background change is seen by the L and M cones. b, Performance is shown when the background change was seen by the S cones. In both figures, the dashed line shows the constant performance to an S cone-initiated probe. The solid line shows the performance when the probe was defined by excursions along the L+M axis.

For these experiments the probe amplitudes were set to be near detection threshold on a neutral gray background. The subjects felt that performance deteriorated because the background changes reduced the probe visibility. Naturally, when the probes are below threshold, sector discrimination must fall to chance (25% correct). Although we did not explicitly measure contrast detection under these conditions, it is safe to conclude that when probe and background colors differed there was very little change in target detection accuracy.

Simulation

Figure 7 shows the amplitude of the simulated BOLD response in the four main experimental conditions. The panel on the left shows the amplitude changes for a background change seen by the L+M cones; the test probe is either an S cone probe (dashed line) or an L+M probe (solid line). The panel on the right shows the simulated amplitude change for a background change seen by the S cones, again, for an S cone and L+M probe. These BOLD amplitude simulations are quite similar to the measurements shown in Figure 5. The simulation quantitatively predicts the drop in sensitivity caused by increasing the background isomerization rate. This suggests that the change in sensitivity of the cones themselves, along with the removal of the instantaneous background, as measured by Valeton and van Norren (1983), is adequate to predict the drop in amplitude of the BOLD signal in human V1. In fact, the measured sensitivity changes are almost identical to the predictions in the case of the L+M cone background changes.

Fig. 7.

The simulated BOLD amplitude responses. a, Simulations for a changing L+M cone background. b, Simulations for a changing S cone background. The dashed and solid lines show amplitudes for S cone- and L+M cone-initiated probes, respectively. These simulations should be compared with the data in Figure 5. See Materials and Methods for details.

To predict the psychophysical performance, we must relate the simulated cortical signal to the observer's discrimination performance. We have applied an ideal observer analysis to the simulated cortical input signals. Using assumptions about the cortical noise distribution and signal thresholding described in the , we use the simulated cortical signal to predict the probability of correct sector discrimination. Figure 8 shows four such curves derived from simulated cortical signals. These signals were calculated on the basis of the mean probe amplitudes used in the psychophysical experiments (L+M cone probe modulation of 0.8% mean gray background level; S cone probe modulation of 2.2% mean gray background level).

Fig. 8.

Simulations of psychophysical detection performance based on an ideal observer model. Two free parameters (the width of the noise distribution, s, and a signal threshold, t) were estimated using a least-squares fit between the simulation and the data shown in Figure 6. The same parameters are used in all plots. a shows the simulated performance when the background change is defined by a change in L+M cone catch. b shows the simulated performance when the background change is seen by the S cones only. Dashed lines show data for simulated S cone probes. Solid lines are plots for L+M cone probes.

In the simulated cortical signal (Fig. 7), the amplitude of the response to the test in the orthogonal color direction decreases slightly. This slight decrease also causes a prediction of slightly reduced psychophysical performance (Fig. 8). The reason for these changes can be traced to imperfections in the simulated cone isolation and in the method used to remove the instantaneous mean from the retinal signal (subtraction of a temporally low-pass-filtered version of the signal; see Appendix).

DISCUSSION

Models of light adaptation

At a general level, the question we asked in these experiments was this: how do spatially homogeneous mean fields affect the visual response to superimposed probes?

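It is widely accepted that cone adaptation in the general case can be modeled using the linear equation (Delahunt and Brainard, 2000):

$$
\begin{aligned}
L_{\mathrm{out}} &= G_1(L_{\mathrm{bg}}, M_{\mathrm{bg}}, S_{\mathrm{bg}})\, L_{\mathrm{in}} \\
M_{\mathrm{out}} &= G_2(L_{\mathrm{bg}}, M_{\mathrm{bg}}, S_{\mathrm{bg}})\, M_{\mathrm{in}} \\
S_{\mathrm{out}} &= G_3(L_{\mathrm{bg}}, M_{\mathrm{bg}}, S_{\mathrm{bg}})\, S_{\mathrm{in}}
\end{aligned}
\tag{1}
$$

where $L_{\mathrm{out}}$, $M_{\mathrm{out}}$, and $S_{\mathrm{out}}$ are the zero offset outputs from the L, M, and S cone systems, $L_{\mathrm{in}}$, $M_{\mathrm{in}}$, and $S_{\mathrm{in}}$ are the inputs to those systems, and $G_1 \ldots G_3$ are functions of the LMS background levels $L_{\mathrm{bg}}$, $M_{\mathrm{bg}}$, and $S_{\mathrm{bg}}$.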

In our experiments, with spatially and temporally simple adapting fields, we tested whether the adaptation that we saw was consistent with the simpler model:

$$
\begin{aligned}
L_{\mathrm{out}} &= G_1(L_{\mathrm{bg}}, M_{\mathrm{bg}})\, L_{\mathrm{in}} \\
M_{\mathrm{out}} &= G_2(L_{\mathrm{bg}}, M_{\mathrm{bg}})\, M_{\mathrm{in}} \\
S_{\mathrm{out}} &= G_3(S_{\mathrm{bg}})\, S_{\mathrm{in}}
\end{aligned}
\tag{2}
$$

where the functions $G_1$ and $G_2$ are now each dependent on only two cone classes and $G_3$ is dependent on only a single, third cone class. If this model holds, then we expect to see little effect of changing the photon catch in one cone class (for example, the S cones) on fMRI BOLD signals that originate in another cone class (for example, the L cones). We would conclude that, under these conditions, S cone adaptation is regulated at the level of the individual cones.

This is what we observed in both the fMRI and behavioral measurements (the logic of the behavioral experiments is identical to that described above except that we also include an implicit assumption that target discrimination performance is monotonically related to cortical signal levels).

For these probe patterns and background changes, light adaptation gain mechanisms in the S cone-initiated pathway are regulated by signals that share the spectral properties of the cone itself. We did not observe significant effects of the L and M cones on S cone signals, nor are there strong effects of the S cones on L and M cone signals. This segregation of adaptation is generally consistent with the segregation between S cone and L and M cone signals in the outer retina (e.g., H1, H2 cells). In anatomical studies, H1 cells have been shown to contact primarily L and M cones, whereas H2 cell connections are strongly biased in favor of S cones (Ahnelt and Kolb, 1994). In addition, recordings made from H1 and H2 cells indicate that there is significant cone-independent sensitivity regulation (Lee et al., 1999).

On the basis of results from our simulation of the visual pathway from photoisomerization rates to BOLD signal, we find that the light levels that induce these gain changes are slightly lower than the levels of light adaptation measured for dissociated outer segments (Schneeweis and Schnapf, 1999), but as we show in the Appendix, they are consistent with the light levels measured using extracellular recordings in the retina (Valeton and van Norren, 1983). Note also that our results are consistent with the even simpler model of fully independent cone adaptation:

$$
\begin{aligned}
L_{\mathrm{out}} &= G_1(L_{\mathrm{bg}})\, L_{\mathrm{in}} \\
M_{\mathrm{out}} &= G_2(M_{\mathrm{bg}})\, M_{\mathrm{in}} \\
S_{\mathrm{out}} &= G_3(S_{\mathrm{bg}})\, S_{\mathrm{in}}
\end{aligned}
$$

The validity of this model could be tested by varying the L and M cone catches independently, just as we did with the S cone catches.
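The nested adaptation models discussed in this section can be illustrated with a small sketch. It is not the simulator code: the gain function `gain`, its constant `k`, and the pooling of L and M backgrounds by simple addition are hypothetical choices made only to show which backgrounds each model allows to influence which cone signal.

```python
def gain(background, k=1000.0):
    """Hypothetical gain that falls as the adapting background rises (k is illustrative)."""
    return 1.0 / (1.0 + background / k)


def adapt_general(lms_in, lms_bg):
    """General model: every gain may depend on all three background levels."""
    pooled = sum(lms_bg)  # assumption: backgrounds pooled by addition
    return tuple(gain(pooled) * x for x in lms_in)


def adapt_lm_pooled(lms_in, lms_bg):
    """Simpler model: L and M gains depend on (L_bg, M_bg); the S gain on S_bg only."""
    l_bg, m_bg, s_bg = lms_bg
    g_lm = gain(l_bg + m_bg)  # assumption: the same pooled gain for L and M
    return (g_lm * lms_in[0], g_lm * lms_in[1], gain(s_bg) * lms_in[2])


def adapt_independent(lms_in, lms_bg):
    """Fully independent model: each gain depends on its own cone class only."""
    return tuple(gain(bg) * x for x, bg in zip(lms_in, lms_bg))
```

Under the simpler model, raising only the S background leaves the L and M outputs unchanged, which is the pattern the fMRI and behavioral measurements are tested against.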

The existence of a cone-level gain control mechanism demonstrated by in vitro recordings (Schneeweis and Schnapf, 1999) does not imply that this is the dominant gain control mechanism in the intact retina. Some support for the importance of this mechanism, however, can be found in the behavioral literature. The experiments described in this paper are similar to behavioral studies of color discrimination and target detection under conditions of slow adaptation (Krauskopf and Gegenfurtner, 1992; Zaidi et al., 1992). They too found good isolation of L+M and S cone contrast mechanisms, although Zaidi et al. (1992) suggested that there might be a very weak interaction between the L+M and S cone systems. Evidence of cone-independent gain control mechanisms in vivo has also come from measurements of the spatial extent of gain control by He and Macleod (1998). Using very high frequency interference gratings, He and Macleod (1998) showed that the spatial spread of response gain extends only across the spatial aperture of the cone itself.

There have also been a substantial number of behavioral measurements of light adaptation that suggest interactions between cone types in the regulation of cone gain. Some of these experiments require viewing conditions that are outside the range that we can generate using a CRT (Mollon and Polden, 1976; Stromeyer et al., 1978; Pugh and Mollon, 1979; Wandell and Pugh, 1980a,b). However, Brainard and his colleagues have produced a substantial body of evidence that under nearly natural viewing conditions appearance matches under different illuminants are accurately modeled by gain changes at the level of the cones (Brainard et al., 1997; Brainard, 1998; Speigle and Brainard, 1999). Under these viewing conditions, however, the gain of a single cone type is influenced by signals from other cone types (Chichilnisky and Wandell, 1996; Delahunt and Brainard, 2000). Furthermore, the nature of these interactions may differ between stimuli that are increments and decrements relative to the mean background (Chichilnisky and Wandell, 1996, 1999).

The differences between our results and those of Delahunt and Brainard (2000) must be caused by the spatiotemporal characteristics of the stimuli that we used. In particular, we note that our results are obtained for spatially simple, near-threshold targets presented on a homogeneous background that varied in time to modulate the cone excitation rate. Delahunt and Brainard (2000) also varied the cone excitation rate, but this variation occurred spatially (rather than temporally). Their stimuli also contained significant spatial structure, and their task (color matching as opposed to target discrimination) was performed on probes that were well above the contrast detection threshold.

Spatial structure in the adapting background may increase the role of post-receptoral mechanisms. Our probe/background combinations are designed to minimize the role of spatial processing and would not change the gain in post-receptoral mechanisms that respond mainly to contrast. It is also possible that even if there are psychophysically detectable interactions between cone classes at high cone contrasts, our use of near-threshold probe stimuli made the detection of such interaction unlikely.

One of the more puzzling aspects of the fMRI measurements is the persistence of a significant signal at contrast levels that are very close to detection threshold. Several authors have reported BOLD signal modulations at or below detection threshold in experiments on attention (Kastner et al., 1999; Sengpiel and Hubener, 1999; Ress et al., 2000). In these experiments, however, we controlled the subjects' attention and still observed a modulation at very low signal levels. Why can we detect the cortical signal reliably when the observer does not? One possible reason is this: the fMRI signal that we measure is linearly pooled across a considerable extent of visual cortex. Perhaps the observer cannot integrate the signal across space as efficiently as the fMRI scanner. Another suggestion is that the fMRI BOLD signal may reflect the input signal to a visual area (Logothetis et al., 2001). Current models of V1 neuronal response indicate that the firing rate of simple cells in response to an input modulation is best approximated as a thresholded, rectified function of the input signal (Carandini and Ferster, 2000). The measured (and simulated) BOLD response at high background levels may represent a regime in which the BOLD signal measures the input, but this input signal may not increase the firing rates in the V1 simple cells or contribute to the observer's performance.

Stimulus range limitations

Although the background illumination in our experiments was restricted by the gamut of the display devices, the range was adequate to measure substantial adaptation and also to differentiate two models of light adaptation dynamics that are current in the literature (Valeton and van Norren, 1983; Schneeweis and Schnapf, 1999) (see Appendix). We are modifying our experimental apparatus to extend the range of adapting illuminations. This will permit us to examine additional important phenomena, such as transient tritanopia (Mollon and Polden, 1976), where cone class interactions are known to occur.

Conclusions

Light adaptation mechanisms of retinal origin can be detected and quantitatively accounted for in fMRI signals obtained from visual area V1. The size of the gain regulation is consistent with estimates from psychophysical measurements under quite similar conditions. The patterns of psychophysical and fMRI experiments are both consistent with a process controlled mainly by receptor gain control, and for these experimental conditions we cannot reject the hypothesis that the gain is regulated within the receptor itself.

We find that the output of our current simulation of the visual pathway is in excellent agreement with the signals measured using fMRI and psychophysics. We expect to extend this simulation to more complicated spatiotemporal stimuli and to model more sophisticated aspects of post-receptoral retinal and cortical processing and fMRI physics. We will distribute the Matlab source code for this simulator on the Internet and encourage comments and modifications from other members of the scientific community. Using this quantitative model, we should be able to develop sensitive tests of the computations performed by post-receptoral neurons and perhaps even cortical mechanisms that act on stimulus properties at a much larger spatial scale.

SIMULATION OF BOLD SIGNAL CHANGE AND PSYCHOPHYSICS

In this Appendix we describe the simulator calculations used to predict the BOLD signal and psychophysical performance. The simulation consists of four computational stages that are roughly analogous to the well known processing steps in the visual pathways. After these stages, there is a fifth step that translates the underlying neural signal into either a BOLD response or a psychophysical performance level: (1) cone isomerization rates; (2) cone voltage outputs; (3) contrast and color-opponency: retinal ganglion cell signal; (4) rectified amplitude: cortical signal; (5) translation to specific dependent measure: (5a) BOLD signal: temporal blurring (hemodynamic response function); (5b) psychophysics: noise, thresholding, and ideal observer calculation based on the signal in stage 4.

The computations are described in more detail below. The Matlab code for the simulation will be provided on request.

Cone isomerization

The display spectral radiance functions were measured at 4 nm intervals (watts per nanometers per steradian per meters squared per seconds) using a Photoresearch P650 photospectrometer. Rodieck (1998) describes the computation of cone isomerization rates from stimulus radiance. The following parameters were used to calculate the cone isomerization rates: pupil diameter, 3 mm; effective focal length, 17 mm; optical efficiency (isomerizations per incident photons at cornea), L 0.27, M 0.26, S 0.15; cone diameter (all cone types), 2.2 μm.

Note that the only deviation from Rodieck's values is our estimate of S cone efficiency. Rodieck gives a value of 0.07 for a foveal S cone where the macular pigment density is highest. Because our stimuli had a spatial extent of 10°, we calculated a higher value for the mean S cone optical efficiency based on a Gaussian macular pigment distribution with a full width at half maximum of 1° (Hammond et al., 1997).

For the simulation, the mean pupil size at this illumination level was chosen on the basis of data in Wyszecki and Stiles (1982). The Stiles-Crawford effect was assumed to be negligible at this pupil size.
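The geometry behind this stage can be sketched for the simplified case of monochromatic light. The sketch below is an assumption-laden illustration, not the simulator: it treats the stimulus as a single wavelength, applies the optical efficiency directly at that wavelength (ignoring each cone's spectral sensitivity), and uses the pupil, focal length, and cone diameter values listed above as defaults. The function name `isomerization_rate` is hypothetical.

```python
import math

PLANCK = 6.62607015e-34     # Planck constant, J s
LIGHT_SPEED = 2.99792458e8  # speed of light, m / s


def isomerization_rate(radiance_w_sr_m2, wavelength_nm, efficiency,
                       pupil_diameter_m=3e-3, focal_length_m=17e-3,
                       cone_diameter_m=2.2e-6):
    """Isomerizations per cone per second for monochromatic light (simplified)."""
    # Solid angle of the pupil seen from the retina scales radiance to retinal irradiance.
    pupil_area = math.pi * (pupil_diameter_m / 2.0) ** 2
    retinal_irradiance = radiance_w_sr_m2 * pupil_area / focal_length_m ** 2
    # Convert power to a photon count using the energy of one photon.
    photon_energy = PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)
    photons_per_m2_s = retinal_irradiance / photon_energy
    # Photons collected by one cone aperture, scaled by the optical efficiency.
    cone_area = math.pi * (cone_diameter_m / 2.0) ** 2
    return photons_per_m2_s * cone_area * efficiency
```

A full calculation would integrate the measured spectral radiance against each cone's spectral sensitivity at every sampled wavelength rather than use a single wavelength.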

Cone output voltages

Cone output voltages were computed using a general model that has been confirmed by several groups using various techniques (Boynton and Whitten, 1970; Valeton and van Norren, 1983; Hood, 1998; Paupoo et al., 2000). Their results were mostly consistent with a simple Naka–Rushton model (Naka and Rushton, 1966) of cone adaptation and response:

$$
\frac{V_B}{V_m} = \frac{I_B^{\,n}}{I_B^{\,n} + \sigma^{n}}
\tag{3}
$$

The specific quantitative values that we have used are from Valeton and van Norren (1983). These authors measured extracellular potentials from luminance probes flashed on a constant background at the cone photoreceptor layer. The value $V_B$ is the output voltage, $V_m$ is a predefined maximum output for that cell, $I_B$ is the input, and σ is the semisaturation constant. The exponent n is found experimentally to be near 0.75. The parameters $V_m$ and σ vary with background intensity. In the original work, the semisaturation constant is defined in units of retinal illuminance (Td) rather than cone isomerization rates. Knowing the spectrum of their light source (a Xenon arc lamp), we could convert the retinal illuminance measured in Trolands to cone isomerization rates. Knowledge of the spectral power distribution of the light source used in their work was essential so that we might apply their measurements to our colored backgrounds.

This conversion step is important because the semisaturation constant of the cones must depend on the photoisomerization rate of the cone and not the photopic luminance of the light source (Trolands). For example, short-wavelength light can cause a high isomerization rate in the S cones, but it has a negligible luminance or Troland value. We suggest that stimuli for future experiments of this type include a specific radiometric unit to permit the unambiguous conversion to isomerization rates as well as a conversion to photometric units if desired.
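The Naka–Rushton response function used at this stage can be written in a few lines. This is a minimal sketch: the exponent default of 0.75 comes from the text, while the function name and the idea of passing the maximum voltage and semisaturation constant as fixed arguments are illustrative assumptions (in the source, both parameters vary with background intensity).

```python
def naka_rushton(intensity, v_max, sigma, n=0.75):
    """Cone output voltage: v_max * I^n / (I^n + sigma^n)."""
    if intensity < 0:
        raise ValueError("intensity must be non-negative")
    i_n = intensity ** n
    return v_max * i_n / (i_n + sigma ** n)
```

When the input equals the semisaturation constant, the response is half of the maximum, which is what gives the constant its name.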

We also compared these results with a simulation based on cone output photovoltage data from Schneeweis and Schnapf (1999). These data (measured in isolated macaque retina in vitro) give no estimate of the small increase in the base-level cone output voltage with increasing light levels. However, the measured sensitivity variations are similar to those predicted by Valeton and van Norren (1983).

Figure 9 shows a comparison of simulated response sensitivity curves generated using the Valeton and van Norren (1983) data (dashed lines) and the Schneeweis and Schnapf (1999) data (solid lines). The Valeton and van Norren model predicts a change in sensitivity for the S cone background change of 0.68, whereas the Schneeweis and Schnapf model predicts a change of only 0.82. For the L+M cone background changes, the corresponding sensitivity reductions are both 0.55. The actual reduction in signal that we observed (Fig. 4) was 0.66 for the S cones and 0.63 for the L and M cones. Hence the fMRI data show a slightly better fit to the Valeton and van Norren model.

Fig. 9.

Predicted response sensitivity curves based on models from Valeton and van Norren (1983) (dashed lines) and Schneeweis and Schnapf (1999) (solid lines). Response sensitivity is measured relative to a zero background. The horizontal axis measures the background cone photopigment isomerization rate (in isomerizations per second). The shaded areas indicate the range of background isomerization rates achieved in our experiments for the different cone types. Over the range of S cone background changes (light brown shading), the models predict slightly different sensitivity changes. In comparison, the predicted sensitivity changes are similar over the L+M cone background range (dark brown shading).

Contrast estimation and opponent colors (retinal ganglion cell)

At this point in the calculation, we have a prediction of the time-varying cone voltage as the stimulus changes throughout the experiment. As the background isomerizations for a particular cone class increase, the cone output voltage increases, the gain is reduced, and the temporal contrast of the signal decreases. At this stage of the calculation, we remove the local mean signal and compute the local temporal contrast. Convolving the time-varying cone voltage with an exponential function ($T_{1/2}$ of 200 msec) produces a local temporal average cone voltage; we subtract this average from the complete time series to generate the local temporal contrast. Finally, signals from the three cone types are combined into three channels, one luminance and two opponent, creating L+M, L−M, and S − (L+M) signals that are analogous to color-opponent retinal ganglion cell responses. The simulation is relatively insensitive to the exact weights of the cone inputs used to create the opponent channels. In our work we used unity weights in all cases, but changing the relative weights of the L and M cones to 2:1 makes almost no difference in the final results.
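The mean-removal and opponent-combination steps can be sketched as follows. The sketch assumes a causal, discretized exponential average (a recursive filter whose weight halves every 200 msec) as a stand-in for the convolution described above; the function names and the sample-by-sample formulation are illustrative, not taken from the simulator.

```python
def temporal_contrast(voltage, dt_s, half_life_s=0.2):
    """Subtract a running exponential average (given half-life) from a voltage series."""
    decay = 0.5 ** (dt_s / half_life_s)  # weight kept by the old average per sample
    average = voltage[0]                 # assumption: start adapted to the first sample
    contrast = []
    for v in voltage:
        average = decay * average + (1.0 - decay) * v
        contrast.append(v - average)
    return contrast


def opponent_channels(l, m, s, w_l=1.0, w_m=1.0):
    """Combine cone signals into L+M, L-M, and S-(L+M) channels (unity weights by default)."""
    luminance = w_l * l + w_m * m
    return luminance, w_l * l - w_m * m, s - luminance
```

A constant input yields zero contrast, and the response to a step decays to half its initial value after one half-life, so only changes relative to the recent mean are passed on.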

V1 Cortical signal

The signal in V1 caused by the combined input from the opponent stage is modeled as rectification and summation of the opponent input channels. Recent research suggests that the BOLD signal may be dependent on the inputs to a visual area rather than the outputs (Logothetis et al., 2001). We therefore suppose that the BOLD signal (a rectified response to signals in both on and off pathways) is linearly dependent on the input levels at the V1 simple cells. The presence or absence of a psychophysical response threshold (Carandini and Ferster, 2000) is considered later.

BOLD prediction

To predict the BOLD signal, the local signal amplitude is convolved with a hemodynamic response function described by Worsley (http://www.bic.mni.mcgill.ca/users/keith/). This resulting time course matches the general properties of the fMRI signal. The absolute size of the MR signal is rather arbitrary at this stage in the simulation, because it depends on the relationship between absolute signal levels at the input to the cortex, their influence on blood oxygenation, and the physics of the detection of the BOLD signal by a hypothetical scanner. All that we hope of these stages is that they each perform approximately linear mappings from input to output so that the ratio of the final simulated signal can be compared with the ratio of the corresponding measured BOLD response (Boynton et al., 1996).
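The rectification and hemodynamic blurring steps can be sketched as below. The source cites Worsley's hemodynamic response function without giving its form, so the difference-of-gammas shape and its default parameters here are assumptions chosen for illustration, as are the function names and the full-wave rectification.

```python
import math


def hrf(t, peak1=5.4, fwhm1=5.2, peak2=10.8, fwhm2=7.35, dip=0.35):
    """Illustrative difference-of-gammas response at time t (seconds)."""
    if t <= 0:
        return 0.0

    def gamma_lobe(peak, fwhm):
        # Shape and scale chosen so the lobe equals 1 at t == peak.
        alpha = 8.0 * math.log(2.0) * peak ** 2 / fwhm ** 2
        beta = fwhm ** 2 / (8.0 * math.log(2.0) * peak)
        return (t / peak) ** alpha * math.exp(-(t - peak) / beta)

    return gamma_lobe(peak1, fwhm1) - dip * gamma_lobe(peak2, fwhm2)


def rectify_and_sum(channels):
    """Full-wave rectify each opponent channel and sum across channels per time step."""
    return [sum(abs(v) for v in samples) for samples in zip(*channels)]


def predict_bold(cortical, dt_s, duration_s=30.0):
    """Causal discrete convolution of the cortical signal with the sampled response."""
    kernel = [hrf(i * dt_s) * dt_s for i in range(int(duration_s / dt_s))]
    out = []
    for i in range(len(cortical)):
        acc = 0.0
        for j in range(min(i + 1, len(kernel))):
            acc += cortical[i - j] * kernel[j]
        out.append(acc)
    return out
```

Because the convolution is linear, doubling the cortical signal doubles the predicted BOLD time course, which is why only ratios of simulated amplitudes are compared with the measurements.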

The final step of the calculation, then, is to scale the fMRI signals originating from the L+M and S probes. The scaling factors are calculated independently for S and L+M probes and operate on the cortical signal before the convolution by the hemodynamic response function. Scaling factors that minimized the mean sum of squared error between the predicted and observed fMRI signals were used. The reason for applying the scaling factors to the cortical signal is that the same scaling factors were used when computing the ideal observer psychophysical response functions. These functions were calculated on the basis of the amplitude of the cortical signal (a measure of neuron firing rate) and not the derived BOLD signal. Note that multiplicative scaling of this type makes no difference to the predicted sensitivity changes. Hence, the predicted and observed amplitudes should be compared using changes in amplitudes (slope) of the simulated and actual signal amplitude values.
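For a single multiplicative factor, the least-squares scaling has a closed form. The sketch below is illustrative (the function name `best_scale` is hypothetical); it finds the scalar that minimizes the summed squared error between a scaled prediction and the observations.

```python
def best_scale(predicted, observed):
    """Scalar k minimizing sum((k * predicted - observed) ** 2)."""
    denom = sum(p * p for p in predicted)
    if denom == 0:
        raise ValueError("predicted signal is all zeros")
    return sum(p * o for p, o in zip(predicted, observed)) / denom
```

Because the hemodynamic convolution is linear, applying such a factor to the cortical signal before the convolution gives the same BOLD prediction as applying it afterward, and in neither case does it alter the ratio between amplitudes at different background levels.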

Psychophysical prediction

To predict psychophysical performance, we must calculate how the observer uses the simulated cortical signal to judge which quadrant contains the missing sector. Following the literature, we assume that the cortical signal is thresholded (Carandini and Ferster, 2000). We also assume that the final decision is determined by an ideal discrimination based on this thresholded signal and normally distributed noise.

The simulation provides us with a measure of the cortical signal. For a cortical signal with amplitude S and a cortical threshold value τ, we calculate the effective signal available to the observer as:

$$
A = \max(S - \tau,\ 0)
$$

Given a noise standard deviation of σ, we can calculate the 4 alternative forced choice percentage correct (assuming that the ideal strategy is to identify the spatial interval with the lowest signal) as:

$$
P_{\mathrm{correct}} = \int_{-\infty}^{\infty} p(x \mid A)\, \left[\Phi(x)\right]^{3}\, dx
$$

The term

$$
\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-u^{2}/(2\sigma^{2})}\, du
$$

is the cumulative distribution of a normal random variable with mean 0 and SD σ, and $p(x \mid A)$ is the probability density of a signal with mean value A. The probability correct represents the chance that the value x arises from the signal distribution, A, and that all of the three noise quadrants have a value less than x. We assume that the SD of the noise and signal + noise distributions are identical. The free parameters τ (0.2263) and σ (0.0856) were estimated by fitting simulated response curves to real psychophysical data taken from all four conditions (L+M or S background, L+M or S probe).
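The ideal observer calculation can be approximated numerically. This is a sketch under stated assumptions: `effective_signal` reads "thresholded" as a subtractive, rectifying threshold, which is one possible interpretation, and the integral is evaluated with a simple midpoint rule over a range of eight noise SDs on either side of the distributions. All function names are hypothetical.

```python
import math


def effective_signal(s, tau):
    """Assumed threshold rule: signal above threshold passes, the rest is zero."""
    return max(s - tau, 0.0)


def normal_cdf(x, sd):
    """Cumulative distribution of a zero-mean normal variable with the given SD."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))


def normal_pdf(x, mean, sd):
    """Probability density of a normal variable with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def prob_correct_4afc(a, sd, steps=4000):
    """Chance that the signal sample exceeds all three noise samples (midpoint rule)."""
    lo = min(0.0, a) - 8.0 * sd
    hi = max(0.0, a) + 8.0 * sd
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += normal_pdf(x, a, sd) * normal_cdf(x, sd) ** 3 * dx
    return total
```

With no effective signal the prediction falls to the four-alternative chance level of 0.25, and it approaches 1 as the effective signal grows large relative to the noise SD.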

Footnotes

This work was supported by National Institutes of Health Grant RO1 EY03614. We thank Robert Dougherty.
Correspondence should be addressed to A. R. Wade, 420 Jordan Hall, Stanford University, Stanford, CA 94305 MC2130. E-mail: wade{at}white.stanford.edu.