Robust model-based analysis of single-particle tracking experiments with Spot-On
Abstract
Single-particle tracking (SPT) has become an important method to bridge biochemistry and cell biology since it allows direct observation of protein binding and diffusion dynamics in live cells. However, accurately inferring information from SPT studies is challenging due to biases in both data analysis and experimental design. To address analysis bias, we introduce ‘Spot-On’, an intuitive web-interface. Spot-On implements a kinetic modeling framework that accounts for known biases, including molecules moving out-of-focus, and robustly infers diffusion constants and subpopulations from pooled single-molecule trajectories. To minimize inherent experimental biases, we implement and validate stroboscopic photo-activation SPT (spaSPT), which minimizes motion-blur bias and tracking errors. We validate Spot-On using experimentally realistic simulations and show that Spot-On outperforms other methods. We then apply Spot-On to spaSPT data from live mammalian cells spanning a wide range of nuclear dynamics and demonstrate that Spot-On consistently and robustly infers subpopulation fractions and diffusion constants.
Research organism: Human, Mouse
eLife digest
Proteins, the molecules that make up the cells’ internal machinery, are responsible for almost every process that keeps cells alive. Watching how proteins move and interact within a living cell can help scientists to better understand these biological mechanisms. Single-particle tracking is a recent technique that makes these observations possible by taking ‘live’ recordings of individual proteins in a cell. Typically, the goal of a single-particle tracking experiment is to assign proteins into groups, or subpopulations, based on the way they move in the cell. For example, one subpopulation may be bound to other cellular structures, a second moving freely at a high speed, and a third diffusing slowly. This informs on the biological roles of the proteins.
The method involves an experimental stage and an analysis stage. During the experiment, proteins of interest are labeled with a small dye molecule that produces light when excited by a laser. The laser then illuminates the cell, stimulating all the labels in a thin layer. The position of each molecule is then determined with a microscope and a ‘snapshot’ taken. By repeating this process over multiple images, the movement of each molecule over time can be tracked. However, experimental problems can make the interpretation difficult. Motion blurring takes place when the proteins move so fast they appear as blurs in the images; tracking errors happen when so many proteins are present in the same space their trajectories overlap.
Here, Hansen, Woringer et al. combine two pre-existing methods to improve the experimental set-up. Using lasers that flash like a strobe light reduces motion blurring by essentially taking snapshots of the proteins at short time intervals. Tracking errors are addressed by a technique whereby only one protein at a time produces light.
Once the images are obtained and analyzed to yield trajectories, the trajectories themselves need to be analyzed to determine the number and properties of the protein subpopulations. Several factors can skew this analysis stage. For example, there is often a bias against fast-moving particles because the laser only lights up a thin layer of the cell. The proteins travelling slowly stay in focus long enough to be detected across many images; the fast ones quickly move out of the layer and are therefore counted less often. Hansen, Woringer et al. designed a free and user-friendly algorithm package called Spot-On to correct for this issue. Spot-On was thoroughly benchmarked against other solutions, demonstrating both its accuracy and robustness.
Single-particle tracking can lead to misleading results if used incorrectly. It is essential to publicly share solutions that help make this technique more rigorous, especially since a growing number of scientists have already started to use the method.
Introduction
Advances in imaging technologies, genetically encoded tags and fluorophore development have made single-particle tracking (SPT) an increasingly popular method for analyzing protein dynamics (Liu et al., 2015). Recent biological applications of SPT have revealed that transcription factors (TFs) bind mitotic chromosomes (Teves et al., 2016), how Polycomb interacts with chromatin (Zhen et al., 2016), that ‘pioneer factor’ TFs bind chromatin dynamically (Swinstead et al., 2016), that TF binding time correlates with transcriptional activity (Loffreda et al., 2017) and that different nuclear proteins adopt distinct target search mechanisms (Izeddin et al., 2014; Rhodes et al., 2017). Compared with indirect and bulk techniques such as Fluorescence Recovery After Photobleaching (FRAP) or Fluorescence Correlation Spectroscopy (FCS), SPT is often seen as less biased and less model-dependent (Goulian and Simon, 2000; Mueller et al., 2013; Shen et al., 2017). In particular, SPT makes it possible to directly follow single molecules over time in live cells and has provided clear evidence that proteins often exist in several subpopulations that can be characterized by their distinct diffusion coefficients (Mueller et al., 2013; Shen et al., 2017). For example, nuclear proteins such as TFs and chromatin binding proteins typically show a quasi-immobile chromatin-bound fraction and a freely diffusing fraction inside the nucleus. However, while SPT of slow-diffusing membrane proteins is an established technology (Weimann et al., 2013), 2D-SPT of proteins freely diffusing inside a 3D nucleus introduces several biases that must be corrected for in order to obtain accurate estimates of subpopulations. 
First, while a frame is acquired, fast-diffusing molecules move and spread out their emitted photons over multiple pixels causing a ‘_motion-blur_’ artifact (Berglund, 2010; Deschout et al., 2012; Frost et al., 2012; Goulian and Simon, 2000; Izeddin et al., 2014), whereas immobile or slow-diffusing molecules resemble point spread functions (PSFs; Figure 1A). This results in under-counting of the fast-diffusing subpopulation. Second, high particle densities tend to cause tracking errors when localized molecules are connected into trajectories. This can result in incorrect displacement estimates (Figure 1B). Third, since SPT generally employs 2D imaging of 3D motion, immobile or slow-diffusing molecules will generally remain in-focus until they photobleach and therefore exhibit long trajectories, whereas fast-diffusing molecules in 3D rapidly move out-of-focus, thus resulting in short trajectories (we refer to this as ‘_defocalization_’; Figure 1C). This results in a time-dependent under-counting of fast-diffusing molecules (Goulian and Simon, 2000; Kues and Kubitscheck, 2002). Fourth, SPT analysis methods themselves may introduce biases; to avoid this, an accurate and validated method is needed (Figure 1D).
Figure 1. Bias in single-particle tracking (SPT) experiments and analysis methods.
(A) ‘Motion-blur’ bias. Constant excitation during acquisition of a frame will cause a fast-moving particle to spread out its emission photons over many pixels and thus appear as a motion-blur, which makes detection much less likely with common PSF-fitting algorithms. In contrast, a slow-moving or immobile particle will appear as a well-shaped PSF and thus readily be detected. (B) Tracking ambiguities. Tracking at high particle densities prevents unambiguous connection of particles between frames and tracking errors will cause displacements to be misidentified. (C) Defocalization bias. During 2D-SPT, fast-moving particles will rapidly move out-of-focus resulting in short trajectories, whereas immobile particles will remain in-focus until they photobleach and thus exhibit very long trajectories. This results in a bias toward slow-moving particles, which must be corrected for. (D) Analysis method. Any analysis method should ideally avoid introducing biases and accurately correct for known biases in the estimation of subpopulation parameters such as _D_FREE, _F_BOUND, _D_BOUND.
Here, we introduce an integrated approach to overcome all four biases. The first two biases must be minimized at the data acquisition stage and we describe an experimental SPT method to do so (spaSPT), whereas the latter two can be overcome using a previously developed kinetic modeling framework (Hansen et al., 2017; Mazza et al., 2012) now extended and implemented in Spot-On. Spot-On is available as a web-interface (https://SpotOn.berkeley.edu) as well as Python and Matlab packages.
Results
Overview of Spot-On
Spot-On is a user-friendly web-interface that pedagogically guides the user through a series of quality-checks of uploaded datasets consisting of pooled single-molecule trajectories. It then performs kinetic model-based analysis that leverages the histogram of molecular displacements over time to infer the fraction and diffusion constant of each subpopulation (Figure 2). Spot-On does not directly analyze raw microscopy images, since a large number of localization and tracking algorithms exist that convert microscopy images into single-molecule trajectories (for a comparison of particle tracking methods, see (Chenouard et al., 2014); moreover, Spot-On can be one-click interfaced with TrackMate (Tinevez et al., 2017), which allows inspection of trajectories before uploading to Spot-On).
Figure 2. Overview of Spot-On interface.
To use Spot-On, a user uploads raw SPT data in the form of pooled SPT trajectories to the Spot-On web-interface. Spot-On then calculates displacement histograms. The user inputs relevant experimental descriptors and chooses a model to fit. After model-fitting, the user can download model-inferred parameters, meta-data and publication-quality figures.
To use Spot-On, a user uploads their SPT trajectory data in one of several formats (Figure 2). Spot-On then generates useful meta-data for assessing the quality of the experiment (e.g. localization density, number of trajectories etc.). Spot-On also allows a user to upload multiple datasets (e.g. different replicates) and merge them. Spot-On then calculates and displays histograms of displacements over multiple time delays. The next step is model fitting. Spot-On models the distribution of displacements for each subpopulation using Brownian motion under steady-state conditions without state transitions (full model description in Materials and Methods). Spot-On also accounts for localization errors (either user-defined or inferred from the SPT data). Crucially, Spot-On corrects for defocalization bias (Figure 1C) by explicitly calculating the probability that molecules move out-of-focus as a function of time and their diffusion constant (Video 1). In fact, Spot-On uses the gradual loss of freely diffusing molecules over time as additional information to infer the diffusion constant and size of each subpopulation.
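The displacement calculation described above can be illustrated with a minimal sketch. It is not Spot-On's own code: the function name, the `(frame, x, y)` input layout and the choice to use every pair of localizations separated by k frames are assumptions made for illustration.

```python
import math

def displacements_by_delay(trajectory, max_delay):
    """Return {k: [displacement lengths over k frames]} for k = 1..max_delay.

    `trajectory` is a list of (frame, x, y) tuples (hypothetical layout).
    Pairs are matched on frame number, so gaps in the trajectory are allowed.
    """
    by_frame = {frame: (x, y) for frame, x, y in trajectory}
    result = {k: [] for k in range(1, max_delay + 1)}
    for frame, (x0, y0) in by_frame.items():
        for k in range(1, max_delay + 1):
            if frame + k in by_frame:
                x1, y1 = by_frame[frame + k]
                result[k].append(math.hypot(x1 - x0, y1 - y0))
    return result

# Made-up 4-frame trajectory (positions in µm), for illustration only.
track = [(0, 0.0, 0.0), (1, 0.3, 0.4), (2, 0.3, 0.4), (3, 0.9, 1.2)]
jumps = displacements_by_delay(track, max_delay=3)
```

Pooling these per-delay lists over all trajectories and binning them gives one displacement histogram per time delay.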
Video 1. Related to Figure 1.
Illustration of defocalization bias. Illustration of a single-particle tracking experiment with two subpopulations (one ‘immobile’, D = 0.001 µm²/s, the other ‘free’, D = 4 µm²/s with a 1:1 ratio, observed using 20 ms time interval). The red region corresponds to the axial detection range (1 µm) and molecules randomly appear when they photo-activate. For each trajectory, the detected localizations inside the detection range are shown as red spheres and undetected localizations outside the detection range are shown as white spheres. Each particle has a mean lifetime of 15 frames, 25 nm localization error and trajectories consisting of at least two frames are plotted. Epi illumination is assumed. The SPT data was simulated and plotted using simSPT (available at https://gitlab.com/tjian-darzacq-lab/simSPT).
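The effect shown in Video 1 can be reproduced qualitatively with a small Monte Carlo sketch. This is not simSPT: it only follows axial (z) Brownian motion, ignores photobleaching and localization error, and checks the position once per frame. The function name and default values are assumptions chosen to mirror the video (1 µm detection range, 20 ms time interval).

```python
import math
import random

def fraction_in_focus(D, dt, n_frames, dz=1.0, n_particles=5000, seed=0):
    """Fraction of particles that stay inside an axial slab of thickness dz (µm)
    for n_frames consecutive frames, under 1D Brownian motion along z.

    D is in µm²/s and dt in seconds. Particles start uniformly inside the slab.
    """
    rng = random.Random(seed)
    step_sd = math.sqrt(2.0 * D * dt)  # per-frame axial step standard deviation
    survivors = 0
    for _particle in range(n_particles):
        z = rng.uniform(-dz / 2, dz / 2)
        in_focus = True
        for _frame in range(n_frames):
            z += rng.gauss(0.0, step_sd)
            if abs(z) > dz / 2:
                in_focus = False
                break
        if in_focus:
            survivors += 1
    return survivors / n_particles

bound = fraction_in_focus(D=0.001, dt=0.020, n_frames=5)
free = fraction_in_focus(D=4.0, dt=0.020, n_frames=5)
```

With these settings nearly all 'immobile' particles remain detectable for five frames, whereas most 'free' particles have left the detection range. This is the imbalance that the defocalization correction has to account for.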
Spot-On considers either 2 or 3 subpopulations. For instance, TFs in nuclei can generally exist in both a chromatin-bound state characterized by slow diffusion and a freely diffusing state associated with rapid diffusion. In this case, a 2-state model is generally appropriate (‘bound’ vs. ‘free’). Spot-On allows a user to choose their desired model and parameter ranges and then fits the model to the data. Using the previous example of TF dynamics, this allows the user to infer the bound fraction and the diffusion constants. Finally, once a user has finished fitting an appropriate model to their data, Spot-On allows easy download of publication-quality figures and relevant data (Figure 2; Full tutorial in Supplementary file 1).
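As a rough illustration of a 2-state ('bound' vs. 'free') displacement model, the sketch below evaluates the cumulative distribution of 2D displacement lengths for a mixture of two Brownian populations with localization error. It uses the standard Rayleigh form for 2D Brownian displacements and deliberately omits the defocalization correction, so it is a simplified stand-in rather than the full Spot-On model (which is described in Materials and Methods). The function and parameter names are assumptions.

```python
import math

def two_state_cdf(r, dt, d_bound, d_free, f_bound, sigma):
    """CDF of the 2D displacement length r (µm) after time dt (s) for a mixture
    of two Brownian populations, each localized with error sigma (µm).

    No defocalization correction is applied in this simplified sketch.
    """
    def rayleigh_cdf(d):
        # Each coordinate of the displacement has variance 2*d*dt + 2*sigma**2.
        return 1.0 - math.exp(-r * r / (4.0 * (d * dt + sigma ** 2)))

    return f_bound * rayleigh_cdf(d_bound) + (1.0 - f_bound) * rayleigh_cdf(d_free)

# Illustrative parameter values only (D in µm²/s, dt in s, sigma in µm).
p = two_state_cdf(r=0.1, dt=0.007, d_bound=0.001, d_free=2.0,
                  f_bound=0.7, sigma=0.025)
```

Fitting would then amount to adjusting `d_bound`, `d_free` and `f_bound` (and optionally `sigma`) so that curves like this match the empirical displacement distributions at each time delay.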
Validation of Spot-On using simulated SPT data and comparison to other methods
We first evaluated whether Spot-On could accurately infer subpopulations (Figure 1D) and successfully account for known biases (Figure 1C) using simulated data. We compared Spot-On to a popular alternative approach of first fitting the mean square displacement (MSD) of individual trajectories of a minimum length and then fitting the distribution of estimated diffusion constants (we refer to this as ‘MSDi’) as well as a sophisticated Hidden-Markov Model-based Bayesian inference method (vbSPT) (Persson et al., 2013). Since most SPT data is collected using highly inclined illumination (Tokunaga et al., 2008) (HiLo), we simulated TF binding and diffusion dynamics (2-state model: ‘bound vs. free’) confined inside a 4 µm radius mammalian nucleus under realistic HiLo SPT experimental settings subject to a 25 nm localization error (Figure 3—figure supplement 1). We considered the effect of the exposure time (1 ms, 4 ms, 7 ms, 10 ms, 13 ms, 20 ms), the free diffusion constant (from 0.5 µm²/s to 14.5 µm²/s in 0.5 µm²/s increments) and the bound fraction (from 0% to 95% in 5% increments) yielding a total of 3480 different conditions that span the full range of biologically plausible dynamics (Figure 3—figure supplements 2–3; Appendix 1).
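The size of this simulation grid can be checked by enumerating it. The sketch below uses the six time intervals listed in the Figure 3 legend (1, 4, 7, 10, 13 and 20 ms); the variable names are illustrative.

```python
from itertools import product

lag_times_ms = [1, 4, 7, 10, 13, 20]               # six time intervals
d_free_values = [0.5 * i for i in range(1, 30)]    # 0.5 to 14.5 µm²/s
f_bound_values = [5 * i for i in range(0, 20)]     # 0 to 95 %

conditions = list(product(lag_times_ms, d_free_values, f_bound_values))
print(len(conditions))  # 3480 = 6 * 29 * 20
```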
Spot-On accurately inferred subpopulation sizes with minimal error (Figure 3A–B, Table 1), but slightly underestimated the diffusion constant (−4.8%; Figure 3B; Table 1). However, this underestimate was due to particle confinement inside the nucleus: Spot-On correctly inferred the diffusion constant when the confinement was relaxed (Figure 3—figure supplement 4; 20 µm nuclear radius instead of 4 µm). This emphasizes that diffusion constants measured by SPT inside cells should be viewed as apparent diffusion constants. In contrast, the MSDi method failed under most conditions regardless of whether all trajectories were used (MSDi (all)) or a fitting filter applied (MSDi (_R_2 >0.8); Figure 3A–B; Table 1). vbSPT performed almost as well as Spot-On for slow-diffusing proteins, but showed larger deviations for fast-diffusing proteins (Figure 3—figure supplements 2–3).
Figure 3. Validation of Spot-On using simulations and comparisons to other methods.
(A–B) Simulation results. Experimentally realistic SPT data was simulated inside a spherical mammalian nucleus with a radius of 4 µm subject to highly inclined and laminated optical sheet illumination (Tokunaga et al., 2008) (HiLo) of thickness 4 µm illuminating the center of the nucleus. The axial detection window was 700 nm with Gaussian edges and particles were subject to a 25 nm localization error in all three dimensions. Photobleaching corresponded to a mean trajectory length of 4 frames inside the HiLo sheet and 40 outside. 3480 experiments were simulated with parameters of _D_FREE = [0.5; 14.5] in steps of 0.5 µm²/s and _F_BOUND = [0; 95%] in steps of 5%, and the frame rate corresponds to _Δ_τ = [1, 4, 7, 10, 13, 20] ms. Each experiment was then fitted using Spot-On, using vbSPT (maximum of 2 states allowed) (Persson et al., 2013), MSDi using all trajectories of at least five frames (MSDi (all)) or MSDi using all trajectories of at least five frames where the MSD-curvefit showed at least _R_2 >0.8 (MSDi (_R_2 >0.8)). (A) shows the distribution of absolute errors in the _F_BOUND–estimate and (B) shows the distribution of relative errors in the _D_FREE–estimate. (C) Single simulation example with _D_FREE = 2.0 µm²/s; _F_BOUND = 70%; 7 ms per frame. The table on the right uses numbers from CDF-fitting, but for simplicity the fits to the histograms (PDF) are shown in the three plots. (D) Single simulation example with _D_FREE = 14.0 µm²/s; _F_BOUND = 50%; 20 ms per frame. Full details on how SPT data was simulated and analyzed with the different methods are given in Appendix 1.
Figure 3—figure supplement 1. Overview of SPT simulations.
(A) Trajectories were simulated in a confined volume: a ‘nucleus’ of 8 µm diameter, in which molecules are photoactivated at random and photobleach when located within the HiLo volume (a ~4 µm thick slice). Molecules are detected when they are within the axial detection range of the objective (~700 nm). (B) confinement within the nucleus was achieved by specular reflections against the nuclear envelope: a particle bumping on the nuclear envelope is ballistically reflected inside. (C) axial detection profile used for the simulation (blue): flat-top Gaussian with 600 nm plateau and 100 nm FWHM for the Gaussian edges. (red): approximated axial detection profile assumed by Spot-On (step function with 700 nm width).
Figure 3—figure supplement 2. Comparison of Spot-On, vbSPT and MSDi estimates of _D_FREE and _F_BOUND to ground-truth simulation results inside a 4 µm radius nucleus.
Heatmaps showing errors in Spot-On, vbSPT and MSDi estimates of _D_FREE (A) and _F_BOUND (B). To comprehensively test Spot-On and alternative analysis methods such as vbSPT and MSDi (Appendix 1), we analyzed 3480 simulations using these methods. The simulations are available for download (see ‘Data availability’) and the code used for the simulations is available at GitLab (https://gitlab.com/tjian-darzacq-lab/simSPT). Briefly, experimentally realistic SPT experiments were simulated assuming 2-state (free or bound) Brownian motion inside a nucleus of 4 µm radius illuminated using HiLo illumination (assuming a HiLo beam width of 4 µm), with an axial detection range of ~700 nm, centered at the middle of the HiLo beam. In (A), the heatmaps show the relative error and in (B) the heatmaps show the absolute error. In a few rare cases (marked by black squares), the MSDi-method failed such that no estimate was possible. Cumulative distribution function (CDF) plots of the relative error in the _D_FREE–estimate (C) and the absolute error in the _F_BOUND–estimate (D).
Figure 3—figure supplement 3. Representative fits for Spot-On, vbSPT and MSDi to ground-truth simulations.
(A) fits to simulation where _D_FREE = 2.0 µm2/s; _F_BOUND = 75%; 1 ms per frame. First column: Spot-On CDF-fit with JumpsToConsider = 4; only CDF-fit at 3Δt is shown. Second column: Spot-On CDF-fit using all jumps; only CDF-fit at 3Δt is shown. Third column: MSDi-CDF-fit to log10(_D_FREE) considering only trajectories of at least five frames, where the MSD-fit to a single trajectory was good (_R_2 >0.8). Fourth column: MSDi-CDF-fit to log10(_D_FREE) considering only trajectories of at least five frames, but without a MSD-fit threshold. Fifth column: table comparing all the four methods as well as vbSPT (2-state) estimates of _D_FREE and _F_BOUND to the ground truth used for the simulations. (B) fits to simulation where _D_FREE = 10.0 µm2/s; _F_BOUND = 10%; 4 ms per frame. Each column is described in (A). (C) fits to simulation where _D_FREE = 6.0 µm2/s; _F_BOUND = 5%; 7 ms per frame. Each column is described in (A). (D) fits to simulation where _D_FREE = 2.5 µm2/s; _F_BOUND = 40%; 10 ms per frame. Each column is described in (A). (E) fits to simulation where _D_FREE = 3.5 µm2/s; _F_BOUND = 70%; 13 ms per frame. Each column is described in (A). (F) fits to simulation where _D_FREE = 13.0 µm2/s; _F_BOUND = 55%; 20 ms per frame. Each column is described in (A).
Figure 3—figure supplement 4. Comparison of Spot-On, vbSPT and MSDi estimates of _D_FREE and _F_BOUND to ground-truth simulations inside a 20 µm radius nucleus.
Heatmaps showing errors in Spot-On, vbSPT and MSDi estimates of _D_FREE (A) and _F_BOUND (B). To comprehensively test Spot-On and alternative analysis methods such as vbSPT (Persson et al., 2013) and MSDi (Appendix 1), we analyzed 3480 simulations (described in Figure 3—figure supplement 2) using these methods. In (A), the heatmaps show the relative error and in (B) the heatmaps show the absolute error. In a few rare cases (marked by black squares), the MSDi-method failed such that no estimate was possible. Note that whereas Spot-On would always underestimate _D_FREE inside a small 4 µm radius nucleus, when the confinement is largely relaxed by considering a 20 µm radius nucleus, Spot-On now shows essentially no bias in its _D_FREE–estimate. Cumulative distribution function (CDF) plots and summary tables of the relative error in the _D_FREE–estimate (C) and the absolute error in the _F_BOUND–estimate (D). Bias refers to the mean error, ‘std’ is the standard deviation and ‘iqr’ is the inter-quartile range (difference between the 75th and 25th percentile).
Figure 3—figure supplement 5. Effect of defocalization bias correction.
In order to determine how important correcting for defocalization bias is, we analyzed the 3480 simulations using Spot-On (all) and exactly the same parameters as in Figure 3A–B except without the defocalization bias correction, _Z_CORR. Heatmaps show errors in Spot-On (all; with _Z_CORR) and Spot-On (all; without _Z_CORR) estimates of _D_FREE (A) and _F_BOUND (B). Histogram plots and summary statistics tables of the relative error in the _D_FREE–estimate (C) and the absolute error in the _F_BOUND–estimate (D). As can be seen, correcting for defocalization bias slightly improves the _D_FREE–estimate, but is essential for an accurate _F_BOUND–estimate. As expected, the longer the lag time, the more important it is to correct for defocalization bias.
Figure 3—figure supplement 6. Evaluation of the 3-states model.
Trajectories were simulated using simSPT for a 3-state model. Three representative fractions were picked and for each of them, one state was always bound (_D_BOUND = 0.001 µm²/s) and the two other states were varied (0.5–11 µm²/s), together with the frame rate (1–20 ms), yielding 720 conditions. The simulations were then fitted with either Spot-On or vbSPT constrained to infer up to three states. (A) Distribution of the error of five of the inferred parameters (_D_SLOW, _D_FAST, _F_BOUND, _F_SLOW, _F_FAST) with respect to ground truth for Spot-On (red) and vbSPT (blue). The top row shows the distribution and the bottom row the cumulative distribution. (B-G) For each of the three fraction configurations (25/25/50, 25/50/25, 50/25/25%, for B-C, D-E, F-G, respectively), detailed error on five inferred parameters (columns) for different frame rates (rows) and various _D_SLOW and _D_FAST (rows and columns of the matrix, respectively). (H) Summary table showing the mean error (bias) and standard deviation over all the simulations.
Figure 3—figure supplement 7. Sensitivity of Spot-On to the axial detection range estimate.
Heatmaps showing errors in Spot-On estimates of _D_FREE (A) and _F_BOUND (B) as a function of the axial detection range, **Δ**z. The simulations were as described in Figure 3—figure supplement 2. Rather than an unrealistic step function, the simulated axial detection range has Gaussian edges with FWHM of 700 nm. Spot-On analysis parameters were exactly as for Spot-On (all) in Figure 3—figure supplement 2, except the axial detection range, **Δ**z, was set to either 500 nm, 600 nm, 700 nm, 800 nm or 900 nm. As can be seen, Spot-On is only mildly sensitive to small errors (~100 nm) in the axial detection range estimate. In (A), the heatmaps show the relative error and in (B) the heatmaps show the absolute error. Cumulative distribution function (CDF) plots of the relative error in the _D_FREE–estimate (C) and the absolute error in the _F_BOUND–estimate (D) as well as summary tables.
Figure 3—figure supplement 8. Sensitivity of Spot-On to the number of time points considered.
Heatmaps showing errors in Spot-On estimates of _D_FREE (A) and _F_BOUND (B) as a function of the number of time points considered. Note that if TimePoints = n, the number of displacements that will be considered goes from one to (n-1). The simulations were as described in Figure 3—figure supplement 2. Spot-On analysis parameters were exactly as for Spot-On (all) in Figure 3—figure supplement 2, except the TimePoints parameter was set to either 3, 5, 7, 9 or dependent on the frame rate (TimePoints(_Δ_τ)) as in Figure 3—figure supplement 2 and described in Appendix 1. As can be seen, Spot-On is not very sensitive to how many TimePoints were considered. However, when working with small datasets of experimental data, if there is not enough data at higher TimePoints values, noise can make the estimates unreliable. In (A), the heatmaps show the relative error and in (B) the heatmaps show the absolute error. Cumulative distribution function (CDF) plots of the relative error in the _D_FREE–estimate (C) and the absolute error in the _F_BOUND–estimate (D) as well as summary tables.
Figure 3—figure supplement 9. Comparison of Spot-On and MSDi estimates of _D_FREE and _F_BOUND to ground-truth simulation results inside a 4 µm radius nucleus using PDF-fitting.
Heatmaps showing errors in Spot-On and MSDi estimates of _D_FREE (A) and _F_BOUND (B). The analysis was performed identically to Figure 3—figure supplement 2 except PDF-fitting was performed instead of CDF-fitting. In (A), the heatmaps show the relative error and in (B) the heatmaps show the absolute error. Cumulative distribution function (CDF) plots of the relative error in the _D_FREE–estimate (C) and the absolute error in the _F_BOUND–estimate (D).
Figure 3—figure supplement 10. Sensitivity of Spot-On to state changes and comparison with vbSPT.
For six different representative conditions (combinations of _D_FREE and _F_BOUND; _D_BOUND = 0.001 µm²/s; σ = 25 nm), we simulated 100,000 trajectories using simSPT and included state transitions (e.g. transition from bound to free) considering six different lag times (1, 4, 7, 10, 13 and 20 ms) and _k_ON values from 0.1 s⁻¹ to 200 s⁻¹ yielding a total of 396 simulations. The data were analyzed using Spot-On (all) as in Figure 3A–B. (A) For one example parameter set, how the histogram of displacements and the goodness of the Spot-On model-fit change as state transitions go from more frequent than the frame rate (left) to very infrequent (right). (B-G), First row: shows sensitivity of the Spot-On estimate of _D_FREE to the timescale of state transitions. The values of _D_FREE and _F_BOUND are shown above the plot. (B-G), Second row: shows sensitivity of the Spot-On estimate of _F_BOUND to the timescale of state transitions. (B-G), Third row: shows sensitivity of the vbSPT estimate of _D_FREE to the timescale of state transitions. The values of _D_FREE and _F_BOUND are shown above the top plot. (B-G), Fourth row: shows sensitivity of the vbSPT estimate of _F_BOUND to the timescale of state transitions. As expected, since Spot-On ignores state transitions, the inference breaks down when the timescale of state transitions becomes comparable to the frame rate. Perhaps surprisingly, vbSPT also breaks down when state transitions are frequent despite explicitly modeling this in the Hidden Markov Model. Also as expected, a faster frame rate (e.g. 1 ms in dark blue) can support a faster state transition rate. Nevertheless, as long as the timescale of transitions is at least a few hundred milliseconds, Spot-On is not strongly affected. For comparison, the residence time of most mammalian transcription factors is tens of seconds.
Figure 3—figure supplement 11. Robustness of localization error estimates from Spot-On.
For six different representative conditions (combinations of _D_FREE and _F_BOUND; _D_BOUND = 0.001 µm²/s), we simulated 100,000 trajectories using simSPT keeping everything as in Figure 3A–B except varying the localization error (σ) from 10 nm to 75 nm in 5 nm steps and considering six different lag times (1, 4, 7, 10, 13 and 20 ms) yielding a total of 504 simulations. The data were analyzed using Spot-On (all) as in Figure 3A–B except here the localization error was inferred from the fitting. (A-F), top row: shows how well Spot-On inferred the localization error vs. the simulated localization error; the lag times are color coded. The values of _D_FREE and _F_BOUND are shown above the plot. (A-F), bottom row: histograms showing the distribution of errors in the localization error estimate across all lag times and σ-values for a given combination of _D_FREE and _F_BOUND. (G) Table showing summary statistics from the fitting in (A-F). We note that in all cases where the bound fraction is significant (>10%), Spot-On robustly infers the localization error (mean error below 1.5 nm), whereas in cases where the bound fraction is small (10% or below), the localization error estimate becomes less robust (mean error ~3–6 nm). This is because Spot-On can most reliably use how the displacement distribution of the bound fraction changes over time to infer the localization error (see also Materials and Methods).
Figure 3—figure supplement 12. Sensitivity of Spot-On, vbSPT and MSDi (_R_2 >0.8) to sample size.
(A) Jack-knife data sampling for simulation with _D_FREE = 2.0 µm²/s; _F_BOUND = 75%; 1 ms per frame. Simulated data (inside a 4 µm radius nucleus) was used. 100,000 trajectories with a mean photo-bleaching life-time of 4 frames were simulated and then subsampled 50 times without replacement. Sample sizes of either 30, 100, 300, 1,000, 3,000, 10,000, 30,000 or 100,000 trajectories were then fit using Spot-On (all), vbSPT (2-state model) or MSDi (_R_2 >0.8) as described in the analysis of simulations section. Error bars show standard deviation among the 50 sub-samplings. We note that occasionally, no more than ~5% of the time in the case of 30 trajectories, not a single trajectory of sufficient length for Spot-On or MSDi (_R_2 >0.8) was found. In these cases, we re-sampled to obtain at least one trajectory of sufficient length. Left plot shows effect of sample size on the _D_FREE–estimate. Right plot shows effect of sample size on the _F_BOUND–estimate. The dashed line shows the ground truth used to simulate the SPT data. (B) Jack-knife data sampling for simulation with _D_FREE = 10.0 µm²/s; _F_BOUND = 10%; 4 ms per frame. Everything else is as described in (A). (C) Jack-knife data sampling for simulation with _D_FREE = 3.5 µm²/s; _F_BOUND = 50%; 7 ms per frame. Everything else is as described in (A). (D) Jack-knife data sampling for simulation with _D_FREE = 3.5 µm²/s; _F_BOUND = 70%; 13 ms per frame. Everything else is as described in (A). (E) Jack-knife data sampling for simulation with _D_FREE = 13.0 µm²/s; _F_BOUND = 55%; 20 ms per frame. Everything else is as described in (A).
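The sub-sampling scheme described in this legend (repeated draws without replacement at a fixed sample size) can be sketched as follows. The function name, the seed handling and the stand-in trajectory list are assumptions for illustration; the fitting step applied to each subsample is not shown.

```python
import random

def subsample_without_replacement(trajectories, sample_size, n_repeats=50, seed=0):
    """Draw n_repeats independent subsamples, each of sample_size trajectories.

    Within one subsample no trajectory is picked twice; different subsamples
    may overlap.
    """
    rng = random.Random(seed)
    return [rng.sample(trajectories, sample_size) for _ in range(n_repeats)]

# Stand-in for a pooled dataset: integer IDs instead of real trajectories.
pool = list(range(100_000))
subsamples = subsample_without_replacement(pool, sample_size=300)
```

Each subsample would then be fitted separately, and the spread of the resulting estimates across the 50 repeats gives the error bars.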
Table 1. Summary of simulation results and comparison of methods.
The table shows the bias (mean error), ‘std’ (standard deviation) and ‘iqr’ (inter-quartile range: difference between the 75th and 25th percentile) for each method for all 3480 simulations. The left column shows the relative bias/std/iqr for the _D_FREE-estimate and the right column shows the absolute bias/std/iqr for the _F_BOUND-estimate.
Analysis method | _D_FREE bias | _D_FREE std | _D_FREE iqr | _F_BOUND bias | _F_BOUND std | _F_BOUND iqr |
---|---|---|---|---|---|---|
Spot-On (all) | −4.8% | 3.3% | 3.5% | −1.7% | 1.2% | 1.8% |
vbSPT (2-state) | 0.8% | 12.5% | 6.8% | 5.0% | 4.6% | 6.1% |
MSDi (_R_2 > 0.8) | 8.0% | 28.5% | 4.9% | −20.6% | 26.4% | 32.1% |
MSDi (all) | −39.6% | 41.8% | 19.0% | 22.0% | 15.8% | 17.8% |
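The three summary statistics in Table 1 (bias, std and iqr) can be computed from a list of per-simulation errors as in the sketch below. Two choices are assumptions, because the text does not specify them: the population form of the standard deviation and the 'inclusive' percentile interpolation. The input values are made up.

```python
import statistics

def summarize_errors(errors):
    """Return bias (mean error), std (standard deviation) and iqr
    (75th minus 25th percentile) for a list of errors."""
    bias = statistics.fmean(errors)
    std = statistics.pstdev(errors)  # population form: an assumption
    q1, _median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {"bias": bias, "std": std, "iqr": q3 - q1}

# Made-up errors (in percent), for illustration only.
summary = summarize_errors([-5.0, -4.0, -3.0, -2.0, -1.0])
```

For _D_FREE the errors would be relative (estimate minus truth, divided by truth), and for _F_BOUND they would be absolute differences in percentage points, as stated in the table legend.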
To illustrate how the methods could give such divergent results when run on the same SPT data, we considered two example simulations (Figure 3C–D; more examples in Figure 3—figure supplement 3). First, we considered a mostly bound and relatively slow diffusion case (_D_FREE: 2.0 µm²/s; _F_BOUND: 70%; _Δ_τ: 7 ms; Figure 3C). Spot-On and vbSPT accurately inferred both _D_FREE and _F_BOUND. In contrast, MSDi (_R_2 > 0.8) greatly underestimated _F_BOUND (13.6% vs. 70%), whereas MSDi (all) slightly overestimated _F_BOUND. Since MSDi-based methods apply two thresholds (first, minimum trajectory length: here five frames; second, filtering based on _R_2), in many cases fewer than 5% of all trajectories passed these thresholds, and this example illustrates how sensitive MSDi-based methods are to these thresholds. Note that although we show the fits to the probability density function since this is more intuitive (PDF; histogram), we performed the fitting to the cumulative distribution function (CDF). Second, we considered an example with a slow frame rate and fast diffusion, such that the free population rapidly moves out-of-focus (_D_FREE: 14.0 µm²/s; _F_BOUND: 50%; _Δ_τ: 20 ms; Figure 3D). Spot-On again accurately inferred _F_BOUND, and slightly underestimated _D_FREE due to high nuclear confinement (Figure 3—figure supplement 4). Although vbSPT generally performed well, because it does not correct for defocalization bias (vbSPT was developed for bacteria, where defocalization bias is minimal), vbSPT strongly overestimated _F_BOUND in this case (Figure 3D). Consistent with this, Spot-On without defocalization-bias correction also strongly overestimates the bound fraction (Figure 3—figure supplement 5). We conclude that correcting for defocalization bias is critical. The MSDi-based methods again gave divergent results despite seemingly fitting the data well.
Thus, a good fit to a histogram of log(D) does not necessarily imply that the inferred _D_FREE and _F_BOUND are accurate. A full discussion and comparison of the methods is given in Appendix 1. Finally, we extended this analysis of simulated SPT data to three states (one ‘bound’, two ‘free’ states) and compared Spot-On and vbSPT. Spot-On again accurately inferred both the diffusion constants and subpopulation fractions of each population and slightly outperformed vbSPT (Figure 3—figure supplement 6).
Having established that Spot-On is accurate, we next tested whether it was also robust. Spot-On’s ability to infer _D_FREE and _F_BOUND was robust to misestimates of the axial detection range of ~100–200 nm (Figure 3—figure supplement 7), was minimally affected by the number of timepoints considered and fitting parameters (Figure 3—figure supplements 8–9; see also Appendix 2 for parameter considerations) and was not strongly affected by state changes (e.g. binding or unbinding) provided the time-scale of state changes is significantly longer than the frame rate (Figure 3—figure supplement 10). Moreover, Spot-On inferred the localization error with nanometer precision provided that a significant bound fraction is present (Figure 3—figure supplement 11). Finally, we sub-sampled the data sets and found that just ~3000 short trajectories (mean length ~3–4 frames) were sufficient for Spot-On to reliably infer the underlying dynamics (Figure 3—figure supplement 12). We conclude that Spot-On is robust.
Taken together, this analysis of simulated SPT data suggests that Spot-On successfully overcomes defocalization and analysis method biases (Figure 1C–D), accurately and robustly estimates subpopulations and diffusion constants across a wide range of dynamics and, finally, outperforms other methods.
spaSPT minimizes biases in experimental SPT acquisitions
Having validated Spot-On on simulated data, which is not subject to experimental biases (Figure 1A–B), we next sought to evaluate Spot-On on experimental data. To generate SPT data with minimal acquisition bias we performed stroboscopic photo-activation SPT (spaSPT; Figure 4A), which integrates previously and separately published ideas to minimize experimental biases. First, spaSPT minimizes motion-blurring, which is caused by particle movement during the camera exposure time (Figure 1A), by using stroboscopic excitation (Elf et al., 2007; Frost et al., 2012). We found that the bright and photo-stable dyes PA-JF549 and PA-JF646 (Grimm et al., 2016a) in combination with the HaloTag (‘Halo’) labeling strategy made it possible to achieve a signal-to-background ratio greater than 5 with just 1 ms excitation pulses, thus providing a good compromise between minimal motion-blurring and high signal (Figure 4B). Second, spaSPT minimizes tracking errors (Figure 1B) by using photo-activation (Figure 4A) (Grimm et al., 2016a; Manley et al., 2008). Tracking errors are generally caused by high particle densities. Photo-activation allows tracking at extremely low densities (≤1 molecule per nucleus per frame) and thereby minimizes tracking errors (Izeddin et al., 2014), whilst at the same time generating thousands of trajectories. To consider the full spectrum of nuclear protein dynamics, we studied histone H2B-Halo (overwhelmingly bound; fast diffusion; Figure 4C), Halo-CTCF (Hansen et al., 2017) (largely bound; slow diffusion; Figure 4D) and Halo-NLS (overwhelmingly free; very fast diffusion; Figure 4F) in human U2OS cells and Halo-Sox2 (Teves et al., 2016) (largely free; intermediate diffusion; Figure 4E) in mouse embryonic stem cells (mESCs). We labeled Halo-tagged proteins in live cells with the HaloTag ligands PA-JF549 or PA-JF646 (Grimm et al., 2016a) and performed spaSPT using HiLo illumination (Video 2). 
To generate a large dataset to comprehensively test Spot-On, we performed 1064 spaSPT experiments across 60 different conditions.
Figure 4. Overview of spaSPT and experimental results.
(A) spaSPT. HaloTag-labeling with UV (405 nm) photo-activatable dyes enables spaSPT. spaSPT minimizes tracking errors through photo-activation, which maintains low densities. (B) Example data. Raw spaSPT images for Halo-CTCF tracked in human U2OS cells at 134 Hz (1 ms stroboscopic 633 nm excitation of JF646). (C–F) Histograms of displacements for multiple _Δ_τ of histone H2B-Halo in U2OS cells (C), Halo-CTCF in U2OS cells (D), Halo-Sox2 in mES cells (E) and Halo-3xNLS in U2OS cells (F). (G–H) Effect of frame-rate on _D_FREE and _F_BOUND. spaSPT was performed at 200 Hz, 167 Hz, 134 Hz, 100 Hz, 74 Hz and 50 Hz using the 4 cell lines and the data fit using Spot-On and a 2-state model. Each experiment on each cell line was performed in four replicates on different days and ~5 cells imaged each day. (I) Motion-blur experiment. To investigate the effect of ‘motion-blurring’, the total number of excitation photons was kept constant, but delivered during pulses of duration 1, 2, 4, 7 ms or continuous (cont) illumination. (J–K) Effect of motion-blurring on _D_FREE and _F_BOUND. spaSPT data was recorded at 100 Hz and 2-state model-fitting performed with Spot-On. The inferred _D_FREE (J) and _F_BOUND (K) were plotted as a function of excitation pulse duration. Each experiment on each cell line was performed in four replicates on different days and ~5 cells imaged each day. Error bars show standard deviation between replicates.
Figure 4—figure supplement 1. Experimental measurement of axial detection range.
To determine the axial detection range, mESC C59 Halo-mCTCF (Hansen et al., 2017) and U2OS C32 Halo-hCTCF (Hansen et al., 2017) cells were grown overnight on plasma-cleaned coverslips, labelled with 250 pM JF646, fixed in 4% PFA in PBS for 20 min, washed with PBS and then imaged in PBS with 0.01% (w/v) NaN3. We imaged the fixed cells using longer exposure times and very low 633 nm laser intensity to minimize photo-bleaching and collected a 6 µm z-stack spanning most of a nucleus using 20 nm steps. We optimized the imaging conditions to give near identical signal-to-noise to our spaSPT conditions and the data shown is merged data from at least 15 different cells for each cell line. We then localized and ‘tracked’ molecules to determine the experimental axial detection range. (A-B) Empirical survival probability distribution for PFA-fixed mESC C59 JF646Halo-mCTCF (A) and PFA-fixed U2OS C32 JF646Halo-hCTCF (B). A minimal threshold, τMIN, was set to filter out noise and the left plot shows τMIN=10 frames and the right plot τMIN=15 frames. The raw data is shown in red and a model-fit is overlaid. The estimated axial detection range is also shown. We note that the numbers depend somewhat on the threshold set and differ a bit between U2OS and mES cells. As an approximate average, we used 700 nm here. Importantly, we note that Spot-On is relatively robust to the axial detection range estimate and that changing it by 100 nm only marginally changes the results (Figure 3—figure supplement 7). (C) Summary of the model that was fit to the data and the key model parameters. The model assumes that photo-bleaching is a Poisson process and that the axial detection range can be modeled as a Gaussian CDF. Raw data used to plot (A-B) and code for reproducing this figure and the model-fitting are available on GitLab: https://gitlab.com/tjian-darzacq-lab/estimateaxialdetectionrange.
Figure 4—figure supplement 2. Sensitivity of Spot-On to anomalous diffusion.
(A) MSD-fit and Spot-On for U2OS H2B-Halo PA-JF646 spaSPT at 134 Hz. First column: A power law was fit to the time- and ensemble-averaged mean squared displacement (MSD) and the anomalous diffusion exponent, α, inferred. To calculate the MSD from only the free population, vbSPT with an enforced 2-state model was used to classify trajectories into either bound or free and the MSD calculated from the vbSPT-classified free population. Error bars were obtained from jackknife sampling: random 50% subsamples of the data were taken, the MSD calculated and this repeated 20 times. Error bars show standard deviation among subsamplings. The best-fit α-value as well as 95% confidence intervals (CI) are shown. Second column: Spot-On inferred _D_FREE and _F_BOUND from spaSPT experiments at a range of frame rates from 50 Hz to 200 Hz. Despite anomalous diffusion, Spot-On is nevertheless able to estimate reasonably consistent _D_FREE and _F_BOUND across a wide range of frame-rates. Third to fifth column: Spot-On (JumpsToConsider = 4; 2-state model) CDF-fits at three selected time-points, showing that even with significant anomalous diffusion and only three fitted parameters, Spot-On can nonetheless fit the data reasonably well. (B) MSD-fit and Spot-On for U2OS C32 Halo-CTCF PA-JF646 spaSPT at 134 Hz. Each column is described in (A). (C) MSD-fit and Spot-On for mESC C3 Halo-Sox2 PA-JF646 spaSPT at 134 Hz. Each column is described in (A). (D) MSD-fit and Spot-On for U2OS Halo-3xNLS PA-JF646 spaSPT at 134 Hz. Each column is described in (A). (E) MSD-fit and Spot-On for simulated data with _D_FREE = 3.5 µm²/s; _F_BOUND = 50%; 7 ms per frame.
Figure 4—figure supplement 3. Re-analysis of experimental data using vbSPT.
We re-analyzed the experimental data shown in Figure 4G–K using vbSPT (allowing up to two states) instead of Spot-On. (A-B) Effect of frame-rate on _D_FREE and _F_BOUND. spaSPT was performed at 200 Hz, 167 Hz, 134 Hz, 100 Hz, 74 Hz and 50 Hz using the 4 cell lines and the data analyzed using vbSPT (max two states). Each experiment on each cell line was performed in four replicates on different days and ~5 cells imaged each day. Error bars show standard deviation between replicates. (C) Motion-blur experiment. To investigate the effect of ‘motion-blurring’, the total number of excitation photons was kept constant, but delivered during pulses of duration 1, 2, 4, 7 ms or continuous (cont) illumination. (D-E) Effect of motion-blurring on _D_FREE and _F_BOUND. spaSPT data was recorded at 100 Hz and the data analyzed using vbSPT (max two states). The inferred _D_FREE (D) and _F_BOUND (E) were plotted as a function of excitation pulse duration. Each experiment on each cell line was performed in four replicates on different days and ~5 cells imaged each day. Error bars show standard deviation between replicates. Compared to Spot-On (Figure 4G–K; repeated below), vbSPT generally reports a higher bound fraction (e.g. vbSPT reports a total bound fraction for Halo-Sox2 of ~60–65%, which is much higher than previously reported [Teves et al., 2016]). vbSPT most likely overestimates the bound fraction because it does not account for defocalization bias.
Video 2. Related to Figure 4.
Representative raw spaSPT movie (Halo-hCTCF at 134 Hz). spaSPT movie (1 ms of 633 nm laser delivered at the beginning of each frame; 405 nm laser photo-activation pulses delivered in between frames) of endogenously tagged CTCF (C32 Halo-hCTCF) in human U2OS cells imaged at ~134 Hz (7.477 ms per frame). Dye: PA-JF646. One pixel: 160 nm.
Validation of Spot-On using spaSPT data at different frame rates
First, we studied whether Spot-On could consistently infer subpopulations over a wide range of frame rates. We experimentally determined the axial detection range to be ~700 nm (Figure 4—figure supplement 1) and performed spaSPT at 200 Hz, 167 Hz, 134 Hz, 100 Hz, 74 Hz and 50 Hz using the four cell lines. Spot-On consistently inferred the diffusion constant (Figure 4G) and total bound fraction across the wide range of frame rates (Figure 4H). This is notable since all four proteins exhibit apparent anomalous diffusion (Figure 4—figure supplement 2) and this demonstrates that Spot-On is also robust to anomalous diffusion despite modeling Brownian motion. While the ground-truth is unknown when considering experiments, Spot-On gave biologically reasonable results: histone H2B was overwhelmingly bound and free Halo-3xNLS was overwhelmingly unbound (comparison with vbSPT: Figure 4—figure supplement 3). These results provide additional validation for the bias corrections implemented in Spot-On. We also note that although Spot-On was validated on spaSPT data, SPT data with non-photoactivatable dyes is also suitable for Spot-On analysis provided that the density is sufficiently low to minimize tracking errors (see also Appendix 3: “Which datasets are appropriate for Spot-On?”). Finally, we demonstrated above that just ~3000 short trajectories (mean length ~3–4 frames) were sufficient for Spot-On to accurately infer _D_FREE and _F_BOUND (Figure 3—figure supplement 12). Here we obtain well above 3000 trajectories per cell even at ~1 localization/frame. More generally, with spaSPT this should be achievable for all but the most lowly expressed nuclear proteins. Thus, this now makes it possible to study biological cell-to-cell variability in TF dynamics.
Effect of motion-blur bias on parameter estimates
Having validated Spot-On on experimental SPT data, we next applied Spot-On to estimate the effect of motion-blurring on the estimation of subpopulations. As mentioned, since most localization algorithms (Chenouard et al., 2014; Sergé et al., 2008) achieve super-resolution through PSF-fitting, this may cause motion-blurred molecules to be undersampled, resulting in a bias towards slow-moving molecules (Figure 1A). We estimated the extent of the bias by imaging the four cell lines at 100 Hz and keeping the total number of excitation photons constant, but varying the excitation pulse duration (1 ms, 2 ms, 4 ms, 7 ms, constant; Figure 4I). For generality, we performed these experiments using both PA-JF549 and PA-JF646 dyes (Grimm et al., 2016a). We used Spot-On to fit the data and plotted the apparent free diffusion constant (Figure 4J) and apparent total bound fraction (Figure 4K) as a function of the excitation pulse duration. For fast-diffusing proteins like Halo-3xNLS and H2B-Halo, motion-blurring resulted in a large underestimate of the free diffusion constant, whereas the effect on slower proteins like CTCF and Sox2 was minor (Figure 4J). Regarding the total bound fraction, motion-blurring caused a ~2 fold overestimate for rapidly diffusing Halo-3xNLS (Figure 4K), but had a minor effect on slower proteins like H2B, CTCF and Sox2. Similar results were obtained for both dyes for proteins with a significant bound fraction, but we note that JF549 appears to better capture the dynamics of proteins with a minimal bound fraction such as Halo-3xNLS (Figure 4J–K). Finally, we note that the extent of the bias due to motion-blurring will likely be very sensitive to the localization algorithm. Here, using the MTT-algorithm (Sergé et al., 2008), motion-blurring caused up to a 2-fold error in both the _D_FREE and _F_BOUND estimates.
Taken together, these results suggest that Spot-On can reliably be used even for SPT data collected under constant illumination provided that protein diffusion is sufficiently slow and, moreover, provides a helpful guide for optimizing SPT imaging acquisitions (we include a full discussion of considerations for SPT acquisitions and a proposal for minimum reporting standards in SPT in Appendix 3 and 4).
Discussion
In summary, SPT is an increasingly popular technique and has been revealing important new biological insight. However, a clear consensus on how to perform and analyze SPT experiments is currently lacking. In particular, 2D SPT of fast-diffusing molecules inside 3D cells is subject to a number of inherent experimental (Figure 1A–B) and analysis (Figure 1C–D) biases, which can lead to inaccurate conclusions if not carefully corrected for.
Here, we introduce approaches for accounting for both experimental and analysis biases. Several methods are available for localization/tracking (Chenouard et al., 2014; Sergé et al., 2008) and for classification of individual trajectories (Monnier et al., 2015; Persson et al., 2013). Spot-On now complements these tools by providing a bias-corrected, comprehensive open-source framework for inferring subpopulations and diffusion constants from pooled SPT data and makes this platform available through a convenient web-interface. This platform can easily be extended to other diffusion regimes (Metzler et al., 2014) and models (Lee et al., 2017) and, as 3D SPT methods mature, to 3D SPT data. Moreover, spaSPT provides an acquisition protocol for tracking fast-diffusing molecules with minimal bias. We hope that these validated tools will help make SPT more accessible to the community and contribute positively to the emergence of ‘gold-standard’ acquisition and analysis procedures for SPT.
Materials and methods
Key resources table.
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
---|---|---|---|---|
cell line (Homo sapiens) | Halo-CTCF | Hansen et al. eLife 2017;6:e25776; PMID 28467304; doi: 10.7554/eLife.25776 | U2OS C32 FLAG-Halo-CTCF | Previously reported homozygous endogenous knock-in cell line where all endogenous copies of CTCF have been N-terminally tagged with FLAG-HaloTag |
cell line (Homo sapiens) | Halo-3xNLS | Hansen et al. eLife 2017;6:e25776; PMID 28467304; doi: 10.7554/eLife.25776 | U2OS Halo-3xNLS | U2OS cell line stably expressing Halo-3xNLS (3 copies of the SV40 Nuclear Localization Signal) generated by G418 selection. Generously provided by David T McSwiggen. |
cell line (Homo sapiens) | H2B-Halo | Hansen et al. eLife 2017;6:e25776; PMID 28467304; doi: 10.7554/eLife.25776 | U2OS H2B-Halo-SNAP | U2OS cell line stably expressing histone H2B-Halo-SNAP generated by G418 selection. Generously provided by David T McSwiggen. |
cell line (Mus musculus) | Halo-Sox2 | Teves et al. eLife 2016;5:e22280; PMID 27855781; doi: 10.7554/eLife.22280 | mESC JM8.N4 C3 Halo-FLAG-Sox2 | Previously reported homozygous endogenous knock-in cell line where both endogenous copies of Sox2 have been N-terminally tagged with HaloTag-FLAG. Generously provided by Sheila S Teves. |
software, algorithm | Spot-On Matlab | this paper | Spot-On Matlab | Please see Materials and Methods for a full description. Open-source code is freely available at GitLab: https://gitlab.com/tjian-darzacqlab/spot-on-matlab (copy archived at https://github.com/elifesciences-publications/spot-on-matlab) |
software, algorithm | Spot-On Python | this paper | Spot-On Python | Please see Materials and Methods for a full description. Open-source code is freely available at GitLab: https://gitlab.com/tjian-darzacqlab/Spot-On-cli (copy archived at https://github.com/elifesciences-publications/spot-on-cli) |
software, algorithm | Spot-On | this paper | Spot-On | Please see Materials and Methods for a full description. The web-interface can be found at https://spoton.berkeley.edu/ and the underlying source-code is freely available at GitLab: https://gitlab.com/tjian-darzacqlab/Spot-On (copy archived at https://github.com/elifesciences-publications/spot-on) |
software, algorithm | simSPT | this paper | simSPT | Code for efficiently simulating experimentally realistic SPT data. Please see Materials and Methods for a full description. Open-source code is freely available at GitLab: https://gitlab.com/tjian-darzacq-lab/simSPT |
software, algorithm | MSDi; vbSPT; | this paper and Persson et al. Nature Methods 2013; PMID: 23396281; DOI: 10.1038/nmeth.2367 | MSDi; vbSPT; | Supplementary software used for MSDi and vbSPT analysis as well as for generating the simulated data can be found at: https://zenodo.org/record/835171 |
chemical compound, drug | PA-JF549 | Grimm et al. Nature Methods 2016; PMID 27776112; DOI: 10.1038/nmeth.4034 | PA-JF549 | Please contact Luke D Lavis for distribution. |
chemical compound, drug | PA-JF646 | Grimm et al. Nature Methods 2016; PMID 27776112; DOI: 10.1038/nmeth.4034 | PA-JF646 | Please contact Luke D Lavis for distribution. |
Spot-On model
Spot-On implements and extends a kinetic modeling framework first described in Mazza et al. (2012) and later extended in Hansen et al. (2017). Briefly, the model infers the diffusion constant and relative fractions of two or three subpopulations from the distribution of displacements (or histogram of displacements) computed at increasing lag time (1Δτ, 2Δτ, ...). This is performed by fitting a semi-analytical model to the empirical histogram of displacements using non-linear least squares fitting. Defocalization is explicitly accounted for by modeling the fraction of particles that remain in focus over time as a function of their diffusion constant.
Mathematically, the evolution over time of a concentration of particles located at the origin as a Dirac delta function and which follows free diffusion in two dimensions with a diffusion constant D can be described by a propagator (also known as Green’s function). Properly normalized, the probability of a particle starting at the origin ending up at a location r = (x,y) after a time delay, Δτ, is given by:
$$P(r,\Delta\tau)=N\,\frac{r}{2D\Delta\tau}\,e^{-\frac{r^{2}}{4D\Delta\tau}}$$
Here N is a normalization constant with units of length. Spot-On integrates this distribution over a small histogram bin window, Δ_r_, to obtain a normalized distribution, the distribution of displacement lengths to compare to binned experimental data. For simplicity, we will therefore leave out N from subsequent expressions. Since experimental SPT data is subject to a significant mean localization error, σ, Spot-On also accounts for this (Matsuoka et al., 2009):
$$P(r,\Delta\tau)=\frac{r}{2\left(D\Delta\tau+\sigma^{2}\right)}\,e^{-\frac{r^{2}}{4\left(D\Delta\tau+\sigma^{2}\right)}}$$
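As a minimal sketch of the expression above (not Spot-On's own code), the following Python snippet evaluates the single-state displacement density with localization error and integrates it over histogram bins using its closed-form cumulative distribution, 1 − exp(−r²/(4(DΔτ+σ²))). The function names and the example parameter values are illustrative assumptions.

```python
import math

def displacement_pdf(r, D, dt, sigma):
    """Density of displacement length r (µm) after lag dt (s) for diffusion
    constant D (µm²/s) and mean localization error sigma (µm)."""
    s = D * dt + sigma ** 2
    return r / (2.0 * s) * math.exp(-r * r / (4.0 * s))

def displacement_cdf(r, D, dt, sigma):
    """Closed-form integral of displacement_pdf from 0 to r."""
    s = D * dt + sigma ** 2
    return 1.0 - math.exp(-r * r / (4.0 * s))

def bin_probabilities(bin_width, n_bins, D, dt, sigma):
    """Probability mass in each histogram bin [k*w, (k+1)*w)."""
    edges = [k * bin_width for k in range(n_bins + 1)]
    return [displacement_cdf(hi, D, dt, sigma) - displacement_cdf(lo, D, dt, sigma)
            for lo, hi in zip(edges[:-1], edges[1:])]

# Illustrative values only: D = 2 µm²/s, 7 ms lag, 35 nm localization error.
probs = bin_probabilities(bin_width=0.01, n_bins=500, D=2.0, dt=0.007, sigma=0.035)
print(round(sum(probs), 6))  # total probability mass over 0 to 5 µm
```

Integrating over bins rather than sampling the density at bin centers keeps the binned model normalized regardless of the chosen bin width.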
Many proteins studied by SPT can generally exist in a quasi-immobile state (e.g. a chromatin-bound state in the case of transcription factors) and one or more mobile states. We will first consider the 2-state model. Under most conditions, state transitions can be ignored ((Hansen et al., 2017) and Figure 3—figure supplement 10). Thus, the steady-state 2-state model considered by Spot-On becomes:
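$$P(r,\Delta\tau)=F_{\mathrm{BOUND}}\frac{r}{2\left(D_{\mathrm{BOUND}}\Delta\tau+\sigma^{2}\right)}\,e^{-\frac{r^{2}}{4\left(D_{\mathrm{BOUND}}\Delta\tau+\sigma^{2}\right)}}+\left(1-F_{\mathrm{BOUND}}\right)\frac{r}{2\left(D_{\mathrm{FREE}}\Delta\tau+\sigma^{2}\right)}\,e^{-\frac{r^{2}}{4\left(D_{\mathrm{FREE}}\Delta\tau+\sigma^{2}\right)}}$$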
Here, the quasi-immobile subpopulation has diffusion constant, DBOUND, and makes up a fraction, FBOUND, whereas the freely diffusing subpopulation has diffusion constant, DFREE, and makes up a fraction, FFREE=1-FBOUND. To account for defocalization bias (Figure 1C), Spot-On explicitly considers the probability of the freely diffusing subpopulation moving out of the axial detection range, Δz, during each time delay, Δτ. This is important. For example, only ~25% of freely-diffusing molecules will remain in focus for at least five frames (assuming Δτ = 10 ms; Δz=700 nm; one gap allowed; D = 5 µm²/s), resulting in a 4-fold undercounting if uncorrected for. If we assume absorbing boundaries such that any molecule that contacts the edges of the axial detection range located at zMAX=Δz/2 and zMIN=−Δz/2 is permanently lost, the fraction of freely diffusing molecules with diffusion constant, DFREE, that remain at time delay, Δτ, is given by (Carslow and Jaeger, 1959; Kues and Kubitscheck, 2002):
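$$P_{\mathrm{remaining}}(\Delta\tau,\Delta z,D_{\mathrm{FREE}})=\frac{1}{\Delta z}\int_{-\Delta z/2}^{\Delta z/2}\left\{1-\sum_{n=0}^{\infty}(-1)^{n}\left[\operatorname{erfc}\left(\frac{\frac{(2n+1)\Delta z}{2}-z}{\sqrt{4D_{\mathrm{FREE}}\Delta\tau}}\right)+\operatorname{erfc}\left(\frac{\frac{(2n+1)\Delta z}{2}+z}{\sqrt{4D_{\mathrm{FREE}}\Delta\tau}}\right)\right]\right\}dz$$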
However, this analytical expression overestimates the fraction lost since there is a significant probability that a molecule that briefly contacted or exceeded the boundary re-enters the axial detection range. The re-entry probability depends on the number of gaps allowed in the tracking (g), Δτ, and Δz and can be approximately accounted for by considering a corrected axial detection range, Δzcorr, that is larger than Δz (Δzcorr>Δz):
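$$\Delta z_{\mathrm{corr}}(\Delta z,\Delta\tau,D_{\mathrm{FREE}},g)=\Delta z+a(\Delta z,\Delta\tau,g)\sqrt{D_{\mathrm{FREE}}}+b(\Delta z,\Delta\tau,g)$$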
Although Δzcorr depends on the number of gaps (g) allowed in the tracking, we will leave it out for simplicity in the following. We determined the coefficients a and b from Monte Carlo simulations. For a given diffusion constant, D, 50,000 molecules were randomly placed one-dimensionally along the _z_-axis drawn from a uniform distribution from zMIN=−Δz/2 to zMAX=Δz/2. Next, using a time-step Δτ, one-dimensional Brownian diffusion was simulated along the _z_-axis using the Euler-Maruyama scheme. For time delays from 1Δτ to 15Δτ, the fraction of molecules that were lost was calculated in the range of _D_=[1;12] µm²/s. a(Δz,Δτ,g) and b(Δz,Δτ,g) were then estimated through least-squares fitting of Premaining(Δτ,Δzcorr,D) to the simulated fraction remaining. The process was repeated over a grid of plausible values of (Δz,Δτ,g) to derive a grid of 134,865 (a,b) parameter pairs. This pre-calculated library of (a,b) parameters enables Spot-On to perform model fitting on nearly any SPT dataset with minimal overhead.
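The Monte Carlo step described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' simulation code: in particular, the rule used here for gaps (a trajectory ends once the molecule has been outside the detection range for more than `gaps_allowed` consecutive frames, and a molecule counts as remaining at a lag only if it is in focus at that frame) is an assumption, as are the function name and the example values.

```python
import math
import random

def simulate_fraction_remaining(D, dt, dz, gaps_allowed, n_lags=15,
                                n_molecules=50000, seed=0):
    """Fraction of molecules still tracked and in focus at lags 1*dt .. n_lags*dt.

    D in µm²/s, dt in s, dz (axial detection range) in µm.
    """
    rng = random.Random(seed)
    half = dz / 2.0
    step_sd = math.sqrt(2.0 * D * dt)   # Euler-Maruyama step for pure Brownian motion
    counts = [0] * n_lags
    for _ in range(n_molecules):
        z = rng.uniform(-half, half)    # uniform start inside the detection range
        missed = 0                      # consecutive out-of-focus frames
        for lag in range(n_lags):
            z += rng.gauss(0.0, step_sd)
            if abs(z) <= half:
                missed = 0
                counts[lag] += 1
            else:
                missed += 1
                if missed > gaps_allowed:
                    break               # trajectory is terminated
    return [c / n_molecules for c in counts]

# Illustrative run with fewer molecules than the 50,000 quoted in the text.
remaining = simulate_fraction_remaining(D=5.0, dt=0.01, dz=0.7,
                                        gaps_allowed=1, n_molecules=5000)
print([round(x, 3) for x in remaining[:5]])
```

Because molecules are only checked at frame times and may re-enter after a gap, the simulated fraction remaining is higher than the absorbing-boundary expression predicts, which is the discrepancy the coefficients a and b are fitted to absorb.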
Thus, the 2-state model Spot-On uses for kinetic modeling of SPT data is given by:
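$$P_{2}(r,\Delta\tau)=F_{\mathrm{BOUND}}\frac{r}{2\left(D_{\mathrm{BOUND}}\Delta\tau+\sigma^{2}\right)}\,e^{-\frac{r^{2}}{4\left(D_{\mathrm{BOUND}}\Delta\tau+\sigma^{2}\right)}}+Z_{\mathrm{CORR}}(\Delta\tau,\Delta z_{\mathrm{corr}},D_{\mathrm{FREE}})\left(1-F_{\mathrm{BOUND}}\right)\frac{r}{2\left(D_{\mathrm{FREE}}\Delta\tau+\sigma^{2}\right)}\,e^{-\frac{r^{2}}{4\left(D_{\mathrm{FREE}}\Delta\tau+\sigma^{2}\right)}}$$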
where:
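$$Z_{\mathrm{CORR}}(\Delta\tau,\Delta z_{\mathrm{corr}},D_{\mathrm{FREE}})=\frac{1}{\Delta z_{\mathrm{corr}}}\int_{-\Delta z_{\mathrm{corr}}/2}^{\Delta z_{\mathrm{corr}}/2}\left\{1-\sum_{n=0}^{\infty}(-1)^{n}\left[\operatorname{erfc}\left(\frac{\frac{(2n+1)\Delta z_{\mathrm{corr}}}{2}-z}{\sqrt{4D_{\mathrm{FREE}}\Delta\tau}}\right)+\operatorname{erfc}\left(\frac{\frac{(2n+1)\Delta z_{\mathrm{corr}}}{2}+z}{\sqrt{4D_{\mathrm{FREE}}\Delta\tau}}\right)\right]\right\}dz$$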
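For readers who want to evaluate the defocalization factor numerically, the sketch below integrates the absorbing-boundary series with the midpoint rule. It assumes the erfc arguments are divided by √(4DΔτ) and that the corrected range takes the form Δz + a√D + b; the values of `a` and `b` used in the example are placeholders, not entries from the Spot-On (a,b) library, and the function names are hypothetical.

```python
import math

def z_corr(dt, dz_corr, D, n_points=200, tol=1e-10, max_terms=10000):
    """Fraction of molecules (uniform start in [-dz_corr/2, dz_corr/2]) that have
    not touched an absorbing boundary after time dt. Units: s, µm, µm²/s."""
    half = dz_corr / 2.0
    denom = math.sqrt(4.0 * D * dt)
    h = dz_corr / n_points
    integral = 0.0
    for i in range(n_points):
        z = -half + (i + 0.5) * h            # midpoint of the i-th sub-interval
        series = 0.0
        for n in range(max_terms):
            edge = (2 * n + 1) * half
            term = math.erfc((edge - z) / denom) + math.erfc((edge + z) / denom)
            series += term if n % 2 == 0 else -term
            if term < tol:                   # stop once terms are negligible
                break
        integral += (1.0 - series) * h
    return integral / dz_corr

def corrected_range(dz, D, a, b):
    """Assumed form of the corrected axial detection range: dz + a*sqrt(D) + b."""
    return dz + a * math.sqrt(D) + b

# Placeholder coefficients, chosen only to show the direction of the correction.
a_demo, b_demo = 0.15, 0.2
uncorrected = z_corr(dt=0.01, dz_corr=0.7, D=5.0)
corrected = z_corr(dt=0.01, dz_corr=corrected_range(0.7, 5.0, a_demo, b_demo), D=5.0)
print(round(uncorrected, 4), round(corrected, 4))
```

Widening the detection range raises the surviving fraction, which is how the correction compensates for molecules that re-enter the focal volume.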
Having derived the 2-state model, generalization to a 3-state model with 1 bound and 2 diffusive states is straightforward. If the three subpopulations have diffusion constants DBOUND, DSLOW, DFAST, and fractions FBOUND, FSLOW, FFAST, such that FBOUND+FSLOW+FFAST=1, then the 3-state model considered by Spot-On becomes:
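$$P_{3}(r,\Delta\tau)=F_{\mathrm{BOUND}}\frac{r}{2\left(D_{\mathrm{BOUND}}\Delta\tau+\sigma^{2}\right)}\,e^{-\frac{r^{2}}{4\left(D_{\mathrm{BOUND}}\Delta\tau+\sigma^{2}\right)}}+Z_{\mathrm{CORR}}(\Delta\tau,\Delta z_{\mathrm{corr}},D_{\mathrm{SLOW}})\,F_{\mathrm{SLOW}}\frac{r}{2\left(D_{\mathrm{SLOW}}\Delta\tau+\sigma^{2}\right)}\,e^{-\frac{r^{2}}{4\left(D_{\mathrm{SLOW}}\Delta\tau+\sigma^{2}\right)}}+Z_{\mathrm{CORR}}(\Delta\tau,\Delta z_{\mathrm{corr}},D_{\mathrm{FAST}})\left(1-F_{\mathrm{BOUND}}-F_{\mathrm{SLOW}}\right)\frac{r}{2\left(D_{\mathrm{FAST}}\Delta\tau+\sigma^{2}\right)}\,e^{-\frac{r^{2}}{4\left(D_{\mathrm{FAST}}\Delta\tau+\sigma^{2}\right)}}$$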
where ZCORR(Δτ,Δzcorr,D) is as described above.
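The 2-state and 3-state expressions share one structure: an uncorrected bound term plus one defocalization-weighted term per free state. A minimal sketch of that structure is given below. To keep it short, the ZCORR weights are passed in as numbers; the weights used in the example are made-up values, and the helper `apparent_bound_fraction` is a hypothetical illustration of why leaving out the correction inflates the bound fraction.

```python
import math

def state_pdf(r, D, dt, sigma):
    """Single-state displacement density with localization error."""
    s = D * dt + sigma ** 2
    return r / (2.0 * s) * math.exp(-r * r / (4.0 * s))

def mixture_pdf(r, dt, sigma, d_bound, f_bound, free_states):
    """Bound term plus defocalization-weighted free terms.

    free_states is a list of (D, F, z) tuples, where z is the ZCORR weight
    for that state at this lag, evaluated separately.
    """
    total = f_bound * state_pdf(r, d_bound, dt, sigma)
    for D, F, z in free_states:
        total += z * F * state_pdf(r, D, dt, sigma)
    return total

def apparent_bound_fraction(f_bound, free_states):
    """Share of observed displacements contributed by the bound state."""
    observed_free = sum(z * F for _, F, z in free_states)
    return f_bound / (f_bound + observed_free)

# Made-up ZCORR weights (0.8 and 0.3), for illustration only.
two_state = [(5.0, 0.5, 0.3)]                       # (D_FREE, 1 - F_BOUND, z)
three_state = [(1.0, 0.2, 0.8), (5.0, 0.3, 0.3)]    # slow and fast free states
print(mixture_pdf(0.2, 0.01, 0.035, 0.01, 0.5, two_state))
print(mixture_pdf(0.2, 0.01, 0.035, 0.01, 0.5, three_state))
print(apparent_bound_fraction(0.5, two_state))
```

Note that once the free terms are down-weighted the mixture no longer integrates to one; any renormalization before comparison with an empirical histogram is left out of this sketch.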
Numerical implementation of models in Spot-On
Spot-On calculates the empirical histogram of displacements based on a user-defined bin width. Spot-On allows the user to choose between PDF- and CDF-fitting of the kinetic model to the empirical displacement distributions; CDF-fitting is generally most accurate for smaller datasets and the two are similar for large datasets (Figure 3—figure supplement 9). The integral in ZCORR(Δτ,Δzcorr) was numerically evaluated using the midpoint method over 200 points and the terms of the series were computed until a term fell below a threshold of 10⁻¹⁰. Model fitting and parameter optimization were performed using a non-linear least squares algorithm (Levenberg-Marquardt). Random initial parameter guesses are drawn uniformly from the user-specified parameter range. The optimization is then repeated several times with different initialization parameters to avoid local minima. Spot-On constrains each fraction to be between 0 and 1 and for the sum of the fractions to equal 1.
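The fitting strategy described above (least-squares CDF-fitting, random initial guesses drawn uniformly from the parameter range, several restarts) can be illustrated with a dependency-free toy example. Two simplifications are assumed and should be read as such: a simple derivative-free compass search stands in for Levenberg-Marquardt, and the model is a 2-state CDF without defocalization correction, fitted to noise-free synthetic data. All names, bounds and parameter values are hypothetical.

```python
import math
import random

def cdf_two_state(r, dt, d_free, f_bound, d_bound=0.01, sigma=0.035):
    """2-state displacement CDF (no defocalization correction in this toy)."""
    s_b = d_bound * dt + sigma ** 2
    s_f = d_free * dt + sigma ** 2
    return (f_bound * (1.0 - math.exp(-r * r / (4.0 * s_b)))
            + (1.0 - f_bound) * (1.0 - math.exp(-r * r / (4.0 * s_f))))

def sum_sq(params, data):
    """Sum of squared residuals over all (r, dt, cdf_value) points."""
    d_free, f_bound = params
    return sum((cdf_two_state(r, dt, d_free, f_bound) - y) ** 2 for r, dt, y in data)

def compass_search(loss, x0, bounds, n_iter=300):
    """Bounded coordinate search: try +/- steps, halve the steps when stuck."""
    x, fx = list(x0), loss(x0)
    steps = [0.25 * (hi - lo) for lo, hi in bounds]
    for _ in range(n_iter):
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] = min(max(x[i] + sign * steps[i], lo), hi)
                f_trial = loss(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            steps = [0.5 * s for s in steps]
    return x, fx

def fit_with_restarts(data, bounds, n_restarts=3, seed=1):
    """Repeat the search from uniformly drawn starting points; keep the best fit."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        x, fx = compass_search(lambda p: sum_sq(p, data), x0, bounds)
        if best is None or fx < best[1]:
            best = (x, fx)
    return best

# Synthetic CDF at three lags, generated with D_FREE = 3.0 µm²/s and F_BOUND = 0.6.
data = [(0.01 * (i + 1), k * 0.0075, cdf_two_state(0.01 * (i + 1), k * 0.0075, 3.0, 0.6))
        for k in (1, 2, 3) for i in range(80)]
best_params, best_loss = fit_with_restarts(data, bounds=[(0.5, 25.0), (0.0, 1.0)])
print(best_params, best_loss)
```

Keeping the best of several independently initialized fits is what guards against a single unlucky start; with real, noisy data and more free parameters the choice of optimizer matters more than it does in this toy.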
Theoretical characteristics and limitations of the model
Although Spot-On performs well on both experimental and simulated SPT data, the model implemented by Spot-On has several limitations. First, the kinetic model assumes diffusion to be ideal Brownian motion, even though it is widely acknowledged that the motion of most proteins inside a cell shows some degree of anomalous diffusion. Nevertheless, Figure 4G–H and Figure 4—figure supplement 2 show that the parameter inference for experimental data of proteins presenting various degrees of anomalous diffusion is quite robust.
Second, Spot-On models the localization error as the static mean localization error and this feature can be used to infer the actual localization error from the data. However, the localization error is affected both by the position of the particle with respect to the focal plane (Lindén et al., 2017) and by motion blur (Deschout et al., 2012). Even though a high signal-to-background ratio and fast framerate/stroboscopic illumination help to mitigate these disparities, it is likely that the localization error of fast moving particles will be higher than that of the bound/slow-moving particles. In that case, one would expect Spot-On to infer a localization error that is the weighted mean of the ‘bound/static’ localization error and the ‘free’ localization error. However, in many situations DFREEΔτ >> σ² (even assuming a 2 µm²/s particle imaged at a 5 ms frame interval with a ~30 nm localization error, there is still an order of magnitude difference between the two terms). As a consequence, the estimate of σ reflects the static localization error (that is, the localization error of the bound fraction), and the localization error estimate becomes less reliable if the bound fraction is very small (Figure 3—figure supplement 11).
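The order-of-magnitude statement in the preceding paragraph can be checked with a few lines of arithmetic, using the values quoted there (2 µm²/s, 5 ms, ~30 nm); the variable names are illustrative.

```python
d_free = 2.0     # µm²/s
dt = 0.005       # s (5 ms between frames)
sigma = 0.030    # µm (~30 nm localization error)

diffusive_term = d_free * dt      # D*dt, in µm²
localization_term = sigma ** 2    # sigma², in µm²
ratio = diffusive_term / localization_term

print(diffusive_term, localization_term, round(ratio, 1))
```

The diffusive term (0.01 µm²) exceeds σ² (0.0009 µm²) roughly elevenfold, so the free population contributes little information about σ.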
Third, following (Kues and Kubitscheck, 2002) the axial detection profile is assumed to be a step function, which is an approximation. However, all simulations here were performed using a detection profile with Gaussian edges (Figure 3—figure supplement 1) and as shown in Figure 3A–B Spot-On still works quite well and moreover is relatively robust to slight mismatches in the axial detection range (Figure 3—figure supplement 7).
Fourth, unlike the original implementation by Mazza et al. (2012), Spot-On ignores state transitions. This reduces the number of fitted parameters and simplifies the generalization to more than two states, but as shown in Figure 3—figure supplement 10 it also causes the parameter inference to fail unless the timescale of state changes is at least 10–50 times longer than the frame rate. Thus, in cases where a molecule is known to exhibit state changes on a time-scale of tens to a few hundreds of milliseconds, Spot-On may not be appropriate.
Fifth and finally, Spot-On ignores correlations between adjacent displacements, although taking such information into account can potentially improve the parameter inference (Vestergaard et al., 2014).
Cell culture
Halo-Sox2 (Teves et al., 2016) knock-in JM8.N4 mouse embryonic stem cells ((Pettitt et al., 2009) Research Resource Identifier: RRID:CVCL_J962; obtained from the KOMP Repository at UC Davis) were grown on plates pre-coated with a 0.1% autoclaved gelatin solution (Sigma-Aldrich, St. Louis, MO, G9391) under feeder free conditions in knock-out DMEM with 15% FBS and LIF (full recipe: 500 mL knockout DMEM (ThermoFisher, Waltham, MA, #10829018), 6 mL MEM NEAA (ThermoFisher #11140050), 6 mL GlutaMax (ThermoFisher #35050061), 5 mL Penicillin-streptomycin (ThermoFisher #15140122), 4.6 μL 2-mercaptoethanol (Sigma-Aldrich M3148), 90 mL fetal bovine serum (HyClone Logan, UT, FBS SH30910.03 lot #AXJ47554)) and LIF. mES cells were fed by replacing half the medium with fresh medium daily and passaged every two days by trypsinization. Halo-3xNLS, H2B-Halo-SNAP and knock-in C32 Halo-CTCF (Hansen et al., 2017) Human U2OS osteosarcoma cells (Research Resource Identifier: RRID:CVCL_0042) were grown in low glucose DMEM with 10% FBS (full recipe: 500 mL DMEM (ThermoFisher #10567014), 50 mL fetal bovine serum (HyClone FBS SH30910.03 lot #AXJ47554) and 5 mL Penicillin-streptomycin (ThermoFisher #15140122)) and were passaged every 2–4 days before reaching confluency. For live-cell imaging, the medium was identical except DMEM without phenol red was used (ThermoFisher #31053028). Both mouse ES and human U2OS cells were grown in a Sanyo copper alloy IncuSafe humidified incubator (MCO-18AIC(UV)) at 37°C/5.5% CO2. Cell lines were pathogen tested and authenticated through STR profiling (U2OS) as described previously (Hansen et al., 2017; Teves et al., 2016). All cell lines will be provided upon request.
Single-molecule imaging
The indicated cell line was grown overnight on plasma-cleaned 25 mm circular no 1.5H cover glasses (Marienfeld, Germany, High-Precision 0117650) either directly (U2OS) or MatriGel coated (mESCs; Fisher Scientific, Hampton, NH, #08-774-552 according to manufacturer’s instructions just prior to cell plating). After overnight growth, cells were labeled with 5–50 nM PA-JF549 or PA-JF646 (Grimm et al., 2016a) for ~15–30 min and washed twice (one wash: medium removed; PBS wash; replenished with fresh medium). At the end of the final wash, the medium was changed to phenol red-free medium keeping all other aspects of the medium the same. Single-molecule imaging was performed on a custom-built Nikon TI microscope (Nikon Instruments Inc., Melville, NY) equipped with a 100x/NA 1.49 oil-immersion TIRF objective (Nikon apochromat CFI Apo TIRF 100x Oil), EM-CCD camera (Andor, Concord, MA, iXon Ultra 897; frame-transfer mode; vertical shift speed: 0.9 μs; −70°C), a perfect focusing system to correct for axial drift and motorized laser illumination (Ti-TIRF, Nikon), which allows an incident angle adjustment to achieve highly inclined and laminated optical sheet illumination (Tokunaga et al., 2008). The incubation chamber maintained a humidified 37°C atmosphere with 5% CO2 and the objective was also heated to 37°C. Excitation was achieved using the following laser lines: 561 nm (1 W, Genesis Coherent, Santa Clara, CA) for PA-JF549; 633 nm (1 W, Genesis Coherent, Palo Alto, CA) for PA-JF646; 405 nm (140 mW, OBIS, Coherent) for all photo-activation experiments. The excitation lasers were modulated by an acousto-optic tunable filter (AA Opto-Electronic, France, AOTFnC-VIS-TN) and triggered with the camera TTL exposure output signal. The laser light was coupled into the microscope by an optical fiber, reflected using a multi-band dichroic (405 nm/488 nm/561 nm/633 nm quad-band, Semrock, Rochester, NY) and then focused in the back focal plane of the objective. 
Fluorescence emission light was filtered using a single band-pass filter placed in front of the camera using the following filters: PA-JF549: Semrock 593/40 nm bandpass filter; PA-JF646: Semrock 676/37 nm bandpass filter. The microscope, cameras, and hardware were controlled through NIS-Elements software (Nikon).
spaSPT experiments and analysis
The spaSPT experimental settings for Figure 4G–H were as follows: 1 ms 633 nm excitation (100% AOTF) of PA-JF646 was delivered at the beginning of the frame; 405 nm photo-activation pulses were delivered during the camera integration time (~447 μs) to minimize background and their intensity optimized to achieve a mean density of ≤1 molecule per frame per nucleus. 30,000 frames were recorded per cell per experiment. The camera exposure times were: 4.5 ms, 5.5 ms, 7 ms, 9.5 ms, 13 ms and 19.5 ms.
For the motion-blur spaSPT experiments (Figure 4I–K), the camera exposure was fixed to 9.5 ms and photo-activation performed as above. To keep the total number of delivered photons constant, we generated an AOTF-laser intensity calibration curve using a power meter and adjusted the AOTF transmission accordingly for each excitation pulse duration. The excitation settings were as follows: 1 ms, 561 nm 100% AOTF, 633 nm 100% AOTF; 2 ms, 561 nm 43% AOTF, 633 nm 40% AOTF; 4 ms, 561 nm 28% AOTF, 633 nm 27% AOTF; 7 ms, 561 nm 20% AOTF, 633 nm 19% AOTF; constant illumination, 561 nm 17% AOTF, 633 nm 16% AOTF.
spaSPT data was analyzed (localization and tracking) and converted into trajectories using a custom-written Matlab implementation of the MTT-algorithm (Sergé et al., 2008) and the following settings: Localization error: 10^−6.25; deflation loops: 0; Blinking (frames): 1; max competitors: 3; max D (μm²/s): 20. The spaSPT trajectory data was then analyzed using the Matlab version of Spot-On (v1.0; GitLab tag 1f9f782b) and the following parameters: dZ = 0.7 µm; GapsAllowed = 1; TimePoints: 4 (50 Hz), 6 (74 Hz), 7 (100 Hz), 8 (134 Hz), 9 (167 and 200 Hz); JumpsToConsider = 4; ModelFit = 2; NumberOfStates = 2; FitLocError = 0; LocError = 0.035 µm; D_Free_2State = [0.4;25]; D_Bound_2State = [0.00001;0.08].
SPT simulations
We developed a utility to simulate diffusing proteins in a confined geometry (simSPT). Briefly, simSPT simulates the diffusion of an arbitrary number of populations of molecules characterized by their diffusion coefficient, under a steady state assumption. Particles are drawn at random between the populations and their location in the 3D nucleus is initialized following a uniform law within the confinement volume. The lifetime of the particle (in frames) is also drawn following an exponential law of mean lifetime β. Then, the particle diffuses in 3D until it bleaches. Diffusion is simulated by drawing jumps following a normal law of parameters N(0,2DΔτ), where D is the diffusion coefficient and Δτ the exposure time. Finally, a localization error drawn from N(0,σ) is added to each (x,y,z) localization in the simulated trajectories.
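The simulation steps described above can be illustrated with a minimal Python sketch. This is not the simSPT code: the function name `simulate_trajectory` and its arguments are hypothetical, confinement, HiLo illumination and the axial detection range are all ignored, and two readings of the text are assumed (2DΔτ is taken as the per-axis variance of each jump, and σ as the standard deviation of the localization error).

```python
import math
import random


def simulate_trajectory(D, dt, mean_lifetime, sigma, rng):
    """Simulate one unconfined 3D Brownian trajectory (illustrative only).

    D: diffusion coefficient (um^2/s); dt: time between frames (s);
    mean_lifetime: mean lifetime in frames (exponentially distributed);
    sigma: localization error, assumed to be a standard deviation (um).
    Returns a list of observed (x, y, z) positions, one per frame.
    """
    # Lifetime in frames drawn from an exponential law, at least one frame.
    n_frames = max(1, math.ceil(rng.expovariate(1.0 / mean_lifetime)))
    # Per-axis jump standard deviation, assuming variance 2*D*dt.
    step_sd = math.sqrt(2.0 * D * dt)
    pos = [0.0, 0.0, 0.0]  # true position; the start point is fixed here
    observed = []
    for _ in range(n_frames):
        # Observed position = true position + Gaussian localization error.
        observed.append(tuple(p + rng.gauss(0.0, sigma) for p in pos))
        # The particle diffuses until the next frame.
        pos = [p + rng.gauss(0.0, step_sd) for p in pos]
    return observed


rng = random.Random(0)
trajectory = simulate_trajectory(D=2.0, dt=0.01, mean_lifetime=4.0,
                                 sigma=0.025, rng=rng)
print(len(trajectory), "localizations")
```

With the localization error set to zero, the mean squared displacement along one axis between consecutive frames should approach 2DΔτ, which is a quick sanity check on the jump distribution.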
For comparisons of Spot-On, MSDi and vbSPT using a 2-state scenario, we parameterized simSPT to consider two subpopulations of particles diffusing in a sphere (the nucleus) of 8 µm diameter illuminated using HiLo illumination (assuming a HiLo beam width of 4 µm), with an axial detection range of ~700 nm, centered at the middle of the HiLo beam with Gaussian edges. Molecules are assumed to have a mean lifetime of 4 frames (when inside the HiLo beam) and of 40 frames when outside the HiLo beam. The localization error was set to 25 nm and the simulation was run until 100,000 in-focus trajectories were recorded. More specifically, the effect of the exposure time (1 ms, 4 ms, 7 ms, 10 ms, 13 ms, 20 ms), the free diffusion constant (from 0.5 µm²/s to 14.5 µm²/s in 0.5 µm²/s increments) and the fraction bound (from 0% to 95% in 5% increments) were investigated, yielding a dataset consisting of 3480 simulations. More details on the simulations, including scripts to reproduce the dataset, are available on GitLab as detailed in the ‘Computer code’ section. Full details on how the simulations were analyzed by Spot-On, vbSPT and MSDi are given in Appendix 1.
We also considered a 3-state scenario featuring a bound subpopulation (‘bound’), a relatively slow diffusing free subpopulation (‘slow’) and a relatively faster diffusing free subpopulation (‘fast’). In this case, we only compared Spot-On and vbSPT (Figure 3—figure supplement 6), since the MSDi methods did not perform well. As in the 2-state simulations, we parameterized simSPT to consider three subpopulations of particles diffusing in a sphere (the nucleus) of 8 µm diameter illuminated using HiLo illumination (assuming a HiLo beam width of 4 µm), with an axial detection range of ~700 nm, centered at the middle of the HiLo beam with Gaussian edges. Molecules are assumed to have a mean lifetime of 4 frames (when inside the HiLo beam) and of 40 frames when outside the HiLo beam. The localization error was set to 40 nm and the simulation was run until 100,000 in-focus trajectories were recorded. We considered three different subpopulation conditions: (1) FBOUND = 25%; FSLOW = 25%; FFAST = 50%; (2) FBOUND = 25%; FSLOW = 50%; FFAST = 25%; (3) FBOUND = 50%; FSLOW = 25%; FFAST = 25%. Specifically, for each of these conditions, the effect of the exposure time (1 ms, 4 ms, 7 ms, 10 ms, 13 ms, 20 ms), the slower free diffusion constant (from 0.5 µm²/s to 2.5 µm²/s in 0.5 µm²/s increments) and the faster free diffusion constant (from 4 µm²/s to 11 µm²/s in 1 µm²/s increments) were investigated, yielding a dataset of 720 simulations. Both vbSPT and Spot-On (all) were constrained to three subpopulations. Full details on how the simulations were analyzed by Spot-On and vbSPT are given in Appendix 1.
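The sizes of the two simulation grids can be checked by enumerating the parameter combinations. The short Python sketch below assumes that both scenarios use the six exposure times 1, 4, 7, 10, 13 and 20 ms; the variable names are illustrative and are not taken from the simSPT scripts.

```python
from itertools import product

# Exposure times (ms), assumed identical for both scenarios.
exposures_ms = [1, 4, 7, 10, 13, 20]

# 2-state grid: free diffusion constant and bound fraction.
d_free = [0.5 * k for k in range(1, 30)]   # 0.5 to 14.5 um^2/s, step 0.5
f_bound = [5 * k for k in range(0, 20)]    # 0% to 95%, step 5%
two_state = list(product(exposures_ms, d_free, f_bound))

# 3-state grid: three fraction conditions, slow and fast diffusion constants.
fractions = [(25, 25, 50), (25, 50, 25), (50, 25, 25)]  # (bound, slow, fast) %
d_slow = [0.5 * k for k in range(1, 6)]    # 0.5 to 2.5 um^2/s, step 0.5
d_fast = list(range(4, 12))                # 4 to 11 um^2/s, step 1
three_state = list(product(fractions, exposures_ms, d_slow, d_fast))

print(len(two_state), len(three_state))  # prints "3480 720"
```

The counts follow from 6 × 29 × 20 = 3480 and 3 × 6 × 5 × 8 = 720.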
Data availability
All raw 1064 spaSPT experiments (Figure 4) as well as the 3480 simulations (Figure 3) are freely available in Spot-On readable Matlab and CSV file formats in the form of SPT trajectories at Zenodo. The experimental data is available at: https://zenodo.org/record/834781. The simulations are available in Matlab format at: https://zenodo.org/record/835541. The simulations are available in CSV format at: https://zenodo.org/record/834787. The supplementary software used for MSDi and vbSPT analysis as well as for generating the simulated data is available at: https://zenodo.org/record/835171
Computer code
Spot-On is fully open-source. The web-interface can be found at: https://SpotOn.berkeley.edu. All raw code is available at GitLab: https://gitlab.com/tjian-darzacq-lab. The web-interface code can be found at https://gitlab.com/tjian-darzacq-lab/Spot-On; the Matlab command-line version of Spot-On can be found at: https://gitlab.com/tjian-darzacq-lab/spot-on-matlab; the Python command-line version of Spot-On can be found at https://gitlab.com/tjian-darzacq-lab/Spot-On-cli; the SPT simulation code (simSPT) can be found at: https://gitlab.com/tjian-darzacq-lab/simSPT; finally, the ‘TrackMate to Spot-On connector’ plugin, which adds an extra menu to TrackMate which allows one-click upload of datasets to Spot-On can be found at: https://gitlab.com/tjian-darzacq-lab/Spot-On-TrackMate
Acknowledgements
ASH and MW contributed equally to this work and are alphabetically listed. We are very grateful to Davide Mazza who inspired this work and provided invaluable comments on Spot-On, to Florian Mueller for suggestions on the web-application, Christophe Zimmer for insightful discussions, David McSwiggen and Sheila Teves for kindly providing cell lines, Carolyn Elya and Chiahao Tsui for the name ‘Spot-On’, and to members of the Tjian/Darzacq labs and Maxime Dahan for discussions. We also thank Astou Tangara and Anatalia Robles for microscope maintenance. ASH is a postdoctoral fellow of the Siebel Stem Cell Institute. This work was supported by NIH grants UO1-EB021236 and U54-DK107980 (XD), the California Institute of Regenerative Medicine grant LA1-08013 (XD), by the Howard Hughes Medical Institute (003061, RT) and used the computational and storage services (TARS cluster) provided by the IT department at Institut Pasteur, Paris.
Appendix 1
Fitting of simulations using Spot-On, vbSPT and MSDi
To systematically evaluate the performance of Spot-On as well as other common analysis tools such as MSDi and vbSPT (Persson et al., 2013), we developed simSPT, a simulation tool to generate a comprehensive set of realistic SPT simulations spanning the range of plausible dynamics (almost a billion trajectories were simulated in total). simSPT is freely available at GitLab: https://gitlab.com/tjian-darzacq-lab/simSPT. simSPT simulates 3D SPT trajectories arising from an arbitrary number of subpopulations confined inside a sphere under HiLo illumination and takes into account a limited axial detection range, realistic photobleaching rates and optionally state interconversion. The simulation methods are described in detail at GitLab.
Briefly, we parameterized simSPT to consider that particles diffuse inside a sphere (the nucleus) of 8 µm diameter illuminated using HiLo illumination (assuming a HiLo beam width of 4 µm), with an axial detection range of ~700 nm with Gaussian edges, centered at the middle of the HiLo beam. Molecules are assumed to have a mean lifetime of 4 frames (when inside the HiLo beam) and of 40 frames when outside the HiLo beam.
For the 2-state comparisons, the localization error was set to 25 nm and the simulation was run until 100,000 in-focus trajectories were recorded. More specifically, the effect of the time between frames (1 ms, 4 ms, 7 ms, 10 ms, 13 ms, 20 ms), the free diffusion constant (from 0.5 µm²/s to 14.5 µm²/s in 0.5 µm²/s increments) and the fraction bound (from 0% to 95% in 5% increments) were investigated, yielding a dataset consisting of 3480 simulations. All 3480 simulated datasets are also available (see Data Availability section). The advantage of simulations is that the ground truth is known.
For the 3-state comparisons (Figure 3—figure supplement 6), the localization error was set to 40 nm and the simulation was run until 100,000 in-focus trajectories were recorded. We simulated one bound state (DBOUND = 0.001 µm²/s) and two free states (DSLOW = 0.5 to 2.5 µm²/s in 0.5 µm²/s increments; DFAST = 4.0 to 11.0 µm²/s in 1.0 µm²/s increments). We also varied the fractions (FBOUND = 25%, FSLOW = 25%, FFAST = 50%; or FBOUND = 25%, FSLOW = 50%, FFAST = 25%; or FBOUND = 50%, FSLOW = 25%, FFAST = 25%) and the time between frames (1 ms, 4 ms, 7 ms, 10 ms, 13 ms, 20 ms).
For more specific simulations, extra parameters were varied, such as the width of the axial detection range (Figure 3—figure supplement 7), localization error (Figure 3—figure supplement 11), or the presence/absence of interconversion between states (Figure 3—figure supplement 10).
Comparison of methods for 2-state simulations
In the case of the main 3480 simulated SPT datasets for the 2-state comparison, we analyzed the data using the Matlab version of Spot-On (either using JumpsToConsider = 4 or all), MSDi (either R² > 0.8 or all) or vbSPT. We describe the analysis in detail below.
Spot-On (4 jumps)
Rationale and parameters
Spot-On allows a user to use the entirety of each trajectory or to use only the first n jumps by adjusting the parameter, JumpsToConsider. For example, consider a trajectory consisting of 6 localizations and without gaps. If JumpsToConsider = 4 and TimePoints = 6, then this trajectory will contribute four displacements to the 1Δτ histogram, four displacements to the 2Δτ histogram, three displacements to the 3Δτ histogram, two displacements to the 4Δτ histogram and one displacement to the 5Δτ histogram. Thus, even though the trajectory contains 5 1Δτ displacements, only the first four will be used for analysis if JumpsToConsider = 4. While on simulated data, using a subset of the trajectories is always slightly less accurate than using the entire trajectory since it slightly underestimates the bound fraction, we previously (Hansen et al., 2017) used this as an empirical way of compensating for all the other experimental biases that cause undercounting of freely diffusing molecules that cannot fully be taken into account in simulations. We therefore also tested this approach in the simulations. To fit the simulations using Spot-On we fed the following parameters to the function SpotOn_core.m (v1.0; GitLab tag 1f9f782b):
- dZ = 0.700;
- GapsAllowed = 1;
- BinWidth = 0.010;
- UseAllTraj = 0;
- JumpsToConsider = 4;
- MaxJump = 6.05;
- ModelFit = 2;
- DoSingleCellFit = 0;
- NumberOfStates = 2;
- FitIterations = 2;
- FitLocError = 0;
- LocError = 0.0247;
- D_Free_2State = [0.4 25];
- D_Bound_2State = [0.00001 0.08];
- TimePoints: 10 if 1 ms; 9 if 4 ms; 8 if 7 ms; 7 if 10 ms; 6 if 13 ms; 5 if 20 ms;
- The empirical a,b parameters used to correct for defocalization bias were as follows:
- Δτ = 1 ms; Δz = 0.7 µm; 1 gap: a = 0.0387 s1/2; b = 0.3189 µm;
- Δτ = 4 ms; Δz = 0.7 µm; 1 gap: a = 0.1472 s1/2; b = 0.2111 µm;
- Δτ = 7 ms; Δz = 0.7 µm; 1 gap: a = 0.1999 s1/2; b = 0.2058 µm;
- Δτ = 10 ms; Δz = 0.7 µm; 1 gap: a = 0.2379 s1/2; b = 0.2017 µm;
- Δτ = 13 ms; Δz = 0.7 µm; 1 gap: a = 0.2656 s1/2; b = 0.2118 µm;
- Δτ = 20 ms; Δz = 0.7 µm; 1 gap: a = 0.3133 s1/2; b = 0.2391 µm;
CDF-fitting was then performed in MATLAB R2014b using the Matlab version of Spot-On (v1.0; GitLab tag 1f9f782b) and the estimated free diffusion constant, DFREE, and bound fraction, FBOUND, recorded for each of the 3480 simulations. The estimated DFREE and FBOUND were then compared to the ground truth known from the simulations. Three parameters were estimated in the fit.
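To make the role of the empirical a,b parameters listed above concrete, the sketch below computes a corrected axial detection range from them. It assumes the correction has the form Δz_corr = Δz + a·√D + b, which is consistent with the listed units (a in s^(1/2), b in µm) but is not spelled out in this section; the function name `corrected_dz` and the lookup table are illustrative and are not part of SpotOn_core.m.

```python
import math

# (a in s^(1/2), b in um) for each time between frames (ms), copied from the
# parameter list above (dZ = 0.7 um, 1 gap allowed).
AB_BY_DT_MS = {
    1: (0.0387, 0.3189),
    4: (0.1472, 0.2111),
    7: (0.1999, 0.2058),
    10: (0.2379, 0.2017),
    13: (0.2656, 0.2118),
    20: (0.3133, 0.2391),
}


def corrected_dz(dz_um, dt_ms, d_free):
    """Corrected axial detection range (um).

    Assumed form: dz_corr = dz + a * sqrt(D) + b, with D in um^2/s.
    """
    a, b = AB_BY_DT_MS[dt_ms]
    return dz_um + a * math.sqrt(d_free) + b


for dt_ms in sorted(AB_BY_DT_MS):
    print(dt_ms, "ms:", round(corrected_dz(0.7, dt_ms, 4.0), 4), "um")
```

Under this assumed form, the corrected range exceeds the nominal 0.7 µm for every tabulated time between frames, because all listed a and b values are positive.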
Performance evaluation
Spot-On (4 jumps) performs slightly worse than Spot-On (all) when it comes to estimating FBOUND as expected and essentially identically to Spot-On (all) for estimating DFREE. The mean error (bias) for estimating FBOUND was −6.4%, the inter-quartile range (IQR) was 5.9% and the standard deviation 3.6%. The origin of the error is the undercounting of the bound population due to considering only the first 4 jumps. Since bound molecules remain in focus until they bleach, they always yield only a single trajectory, whereas a single freely diffusing molecule has a probability of yielding multiple trajectories by diffusing in-focus for a while, then moving out-of-focus for a while and then moving back in-focus. For estimating DFREE the bias for Spot-On (4 jumps) was −5.4%, the IQR 3.6% and the standard deviation 3.2%. However, as shown in Figure 3—figure supplements 2 and 4, the slight underestimate of the free diffusion constant is not due to a limitation of Spot-On, but instead due to confinement inside the nucleus (Figure 3—figure supplement 4). For example, a diffusing molecule close to the nuclear boundary moving towards the nuclear boundary will ‘bounce back’ resulting in a large distance travelled, but only a smaller recorded displacement. We validated that this indeed is the origin of the underestimate of DFREE by considering a nucleus with virtually no confinement (20 μm radius) and found that the DFREE-underestimate was now minimal (Figure 3—figure supplement 4). Finally, Spot-On always estimated the bound diffusion constant, DBOUND, with minimal error unlike MSDi or vbSPT, which were not able to accurately estimate DBOUND. However, since there is generally less interest in DBOUND, we did not use this further for evaluating the performance of the different methods.
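The three summary statistics quoted here (bias, IQR and standard deviation) can be computed from the per-simulation errors as sketched below. Whether the error is absolute (percentage points, as seems natural for FBOUND) or relative (as seems natural for DFREE) is an assumption left to the caller through the `relative` flag; the function name `summarize_errors` and the toy numbers are illustrative.

```python
import statistics


def summarize_errors(estimates, truths, relative=False):
    """Return bias (mean error), inter-quartile range and standard deviation.

    estimates, truths: equal-length sequences of fitted and ground-truth values.
    relative: if True, each error is divided by the ground-truth value.
    """
    errors = []
    for est, true in zip(estimates, truths):
        err = est - true
        errors.append(err / true if relative else err)
    q1, _median, q3 = statistics.quantiles(errors, n=4)
    return {
        "bias": statistics.mean(errors),
        "iqr": q3 - q1,
        "std": statistics.stdev(errors),
    }


# Toy example: five fitted bound fractions against a ground truth of 50%.
stats = summarize_errors([48.0, 49.0, 50.0, 51.0, 52.0], [50.0] * 5)
print(stats)
```

Reporting the IQR alongside the standard deviation is useful because the IQR is insensitive to the occasional badly failed fit, whereas the standard deviation is not.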
Spot-On (all)
Rationale and parameters
Spot-On (all) was run on the simulations identically to Spot-On (4 jumps) except the entirety of each trajectory was used for calculating the histograms. To fit the simulations using Spot-On we fed the following parameters to the function SpotOn_core.m (v1.0; GitLab tag 1f9f782b):
- dZ = 0.700;
- GapsAllowed = 1;
- BinWidth = 0.010;
- UseAllTraj = 1;
- MaxJump = 6.05;
- ModelFit = 2;
- DoSingleCellFit = 0;
- NumberOfStates = 2;
- FitIterations = 2;
- FitLocError = 0;
- LocError = 0.0247;
- D_Free_2State = [0.4 25];
- D_Bound_2State = [0.00001 0.08];
- TimePoints: 10 if 1 ms; 9 if 4 ms; 8 if 7 ms; 7 if 10 ms; 6 if 13 ms; 5 if 20 ms;
- The empirical a,b parameters used to correct for defocalization bias were as follows:
- Δτ = 1 ms; Δz = 0.7 µm; 1 gap: a = 0.0387 s1/2; b = 0.3189 µm;
- Δτ = 4 ms; Δz = 0.7 µm; 1 gap: a = 0.1472 s1/2; b = 0.2111 µm;
- Δτ = 7 ms; Δz = 0.7 µm; 1 gap: a = 0.1999 s1/2; b = 0.2058 µm;
- Δτ = 10 ms; Δz = 0.7 µm; 1 gap: a = 0.2379 s1/2; b = 0.2017 µm;
- Δτ = 13 ms; Δz = 0.7 µm; 1 gap: a = 0.2656 s1/2; b = 0.2118 µm;
- Δτ = 20 ms; Δz = 0.7 µm; 1 gap: a = 0.3133 s1/2; b = 0.2391 µm;
As above, CDF-fitting was performed and the DFREE-estimate and FBOUND-estimate compared to the ground truth for each of the 3480 simulations for which the ground truth is known. Three parameters were estimated in the fit.
Performance evaluation
Spot-On (all) out-performed all other approaches. The mean error (bias) for estimating FBOUND was −1.7%, the inter-quartile range (IQR) was 1.8% and the standard deviation 1.2%. For estimating DFREE the bias for Spot-On (all) was −4.8%, the IQR 3.5% and the standard deviation 3.3%. But as mentioned above, the slight underestimate of DFREE is simply due to diffusion being confined inside a 4 μm radius nucleus (Figure 3—figure supplement 4). This also helps to emphasize the point that diffusion constants measured inside a nucleus should be interpreted as apparent diffusion constants.
MSDi (R² > 0.8)
Rationale and parameters
A large number of papers have used different variations of the MSDi approach (Knight et al., 2015; Li et al., 2016; Liu et al., 2014; Schmidt et al., 2016; Zhen et al., 2016). This approach is of course very sensitive to how the MSD is estimated. For example, it is well-known that accurately estimating diffusion constants from short trajectories (<100 frames) subject to significant localization error is all but impossible, as shown by Michalet and Berglund (Michalet and Berglund, 2012). Nevertheless, several papers assign diffusion constants to individual trajectories based on an MSD-fit. While the exact method differs somewhat from paper to paper, the most popular approach is to set a threshold of a certain number of localizations per trajectory (most commonly 5; though we note that some reports explicitly attempt to compensate for the bias introduced by setting such a threshold (Zhen et al., 2016)). Each trajectory with at least five localizations is then fit, often using the Matlab library MSDAnalyzer (Tarantino et al., 2014), and thus assigned an apparent diffusion constant. An additional threshold is then applied: the diffusion constant is used only if the fit to the MSD curve is judged sufficiently good. Otherwise the trajectory is ignored. This fitting threshold is frequently set based on the coefficient of determination as R² > 0.8 in some recent papers (Knight et al., 2015; Li et al., 2016; Schmidt et al., 2016). After analyzing all trajectories in this way, a distribution of diffusion constants is obtained. The analysis is then performed on the logarithm of these diffusion constants (‘LogD histogram’) (Knight et al., 2015; Li et al., 2016; Schmidt et al., 2016). Both the CDF (Knight et al., 2015) and PDF (Knight et al., 2015; Li et al., 2016; Schmidt et al., 2016; Zhen et al., 2016) can be considered.
These are then fitted with a sum of Gaussian distributions: either two (Knight et al., 2015; Schmidt et al., 2016; Zhen et al., 2016) or three (Schmidt et al., 2016; Zhen et al., 2016). We note that it is not immediately clear which distribution fitted diffusion constants should actually follow (e.g. Log-normal, Gamma, Normal, etc.). No justification is given for sums of Gaussians (Knight et al., 2015; Li et al., 2016; Schmidt et al., 2016), though we note that the fit is often quite good both in the previous reports (Knight et al., 2015; Li et al., 2016; Schmidt et al., 2016) and also here as shown in Figure 3—figure supplement 3. Please note that fitting a sum of normal distributions to the LogD histogram is equivalent to fitting a sum of log-normal distributions to the D histogram. We also note here that, in a theoretical study, Michalet previously showed that the distribution of diffusion constants is approximately Gaussian, but only under a set of stringent criteria (Michalet, 2010). Since CDF-fitting is generally less susceptible to noise from binning and since in this comparison Spot-On also uses CDF-fitting, we fit the LogD histogram with a sum of 2 Gaussians using CDF-fitting. We refer to this whole procedure as MSDi (R² > 0.8). Examples of fits are shown in Figure 3 and Figure 3—figure supplement 3 and the Matlab code to perform the fitting is available together with the data (see ‘Data availability’). Five parameters were estimated in the fit.
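The per-trajectory part of the procedure described above (length threshold, MSD fit, R² threshold) can be sketched in a few lines of Python. This is not the MSDAnalyzer API and not the Matlab code distributed with the data: the function names, the use of the first four time lags, the unweighted linear fit MSD(t) = 4Dt + offset for 2D trajectories, and the toy trajectory are all assumptions made for illustration.

```python
def msd_fit(traj, dt, n_lags=4):
    """Fit MSD(t) = 4*D*t + offset to one 2D trajectory.

    traj: list of (x, y) positions in um, with more than n_lags points.
    dt: time between frames (s). Returns (D, r_squared).
    """
    times, msd = [], []
    for lag in range(1, n_lags + 1):
        sq = [(x2 - x1) ** 2 + (y2 - y1) ** 2
              for (x1, y1), (x2, y2) in zip(traj, traj[lag:])]
        times.append(lag * dt)
        msd.append(sum(sq) / len(sq))
    # Ordinary least squares for msd = slope * t + intercept.
    n = len(times)
    t_mean = sum(times) / n
    m_mean = sum(msd) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    slope = sxy / sxx
    intercept = m_mean - slope * t_mean
    ss_res = sum((m - (slope * t + intercept)) ** 2
                 for t, m in zip(times, msd))
    ss_tot = sum((m - m_mean) ** 2 for m in msd)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope / 4.0, r_squared


def msdi_filter(trajectories, dt, min_locs=5, r2_min=0.8):
    """Keep apparent D only for trajectories passing both thresholds."""
    kept = []
    for traj in trajectories:
        if len(traj) < min_locs:
            continue  # too short: ignored
        d_app, r2 = msd_fit(traj, dt)
        if r2 > r2_min:
            kept.append(d_app)
    return kept


staircase = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2)]
print(msd_fit(staircase, dt=1.0))
print(len(msdi_filter([staircase, staircase[:3]], dt=1.0)))
```

The fitted slope can be zero or negative for trajectories dominated by localization error, in which case the logarithm of D is undefined; a full implementation would need to decide how to treat such trajectories, which this sketch does not attempt.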
Performance evaluation
Overall, MSDi (R² > 0.8) performs reasonably well when it comes to estimating DFREE, but extremely poorly when it comes to FBOUND and DBOUND. The mean error (bias) for estimating DFREE was 8.0%, the inter-quartile range (IQR) was 4.9% and the standard deviation 28.5%. For estimating FBOUND the bias for MSDi (R² > 0.8) was −20.6%, the IQR 32.1% and the standard deviation 26.4%. We note that since FBOUND necessarily takes a value between 0% and 95% in the simulations and since half the simulations have FBOUND<50%, a mean error of −20.6% is actually quite large. Although the bias for DFREE is much smaller, in ~5% of all cases, the error in estimating DFREE is bigger than 2-fold. Moreover, in a few very rare cases, not a single trajectory out of the 100,000 simulated trajectories passes both thresholds (R² > 0.8; at least five frames). Why is MSDi (R² > 0.8) fitting so unreliable? It is instructive to consider an example. In the example dataset provided with the MSDi code (simulation with DFREE=2; FBOUND=0.75; 1 ms between frames), the estimated DFREE=2.06 is very good, but the estimated FBOUND=0.16 is extremely poor. Even though the simulation dataset contains 100,000 simulated trajectories, only 3726 of them actually pass the threshold (R² > 0.8; at least five frames). Thus, MSDi (R² > 0.8) only uses around 4% of the data. Since the tiny fraction of the dataset that is used for analysis is chosen based on how well it fits an MSD-curve and since displacements of bound molecules are dominated by localization errors and therefore generally poorly fit by MSD-analysis, the procedure enriches for the free population, which is why the estimated bound fraction (16%) is so much lower than the true bound fraction (75%). Additionally, we note that MSDi-based analysis is extremely sensitive to the fitting threshold: if, instead of R² > 0.8, all trajectories had been used, the estimated bound fraction would be 87% instead of 16%.
In conclusion, MSDi (R² > 0.8) is unreliable for estimating FBOUND when trajectories are short, which is the usual case when performing intracellular SPT of fast-diffusing molecules. MSDi (R² > 0.8) most likely fails due to a combination of the following reasons, among others. First, it poorly handles localization errors, which dominate the displacements of bound molecules. Second, by only considering trajectories of a certain length (normally at least five frames), it only analyzes a small subsample of the dataset. Third, there is no correction for defocalization bias. Since fast-diffusing molecules move out-of-focus and thus have shorter trajectories, the 5-frame threshold introduces a large bias against freely-diffusing molecules. Fourth, the fitting threshold (R² > 0.8) is relatively arbitrary and the results of the analysis are extremely sensitive to this threshold. Accordingly, in these simulations MSDi (R² > 0.8) only analyzes a small fraction (~5%) of all the trajectories; note that the resulting bias against the bound population partially compensates for the bias against the free population due to defocalization. Fifth, it is difficult to justify the use of Gaussian distributions. Even in cases where the CDF-fit to the data is excellent, the fitted FBOUND-value is often very far off the ground truth. Thus, the goodness of the fit cannot be used to judge how well the parameter-estimation went. Finally, we note that several variants of the MSDi-based method exist (e.g. the approach used by Zhen et al. (2016) is a bit different from the one used here). However, a full validation test of all MSDi-based methods is beyond the scope of this work.
MSDi (all)
Rationale and parameters
The MSDi (all) analysis was identical to MSDi (R² > 0.8) except for a single difference: instead of only using trajectories of at least five frames where the MSD-fit to individual trajectories was judged good (R² > 0.8), all trajectories of at least five frames were used, regardless of how good the MSD-fit was. Five parameters were estimated in the fit.
Performance evaluation
MSDi (all) analysis performed very poorly both when it comes to estimating DFREE and FBOUND. The mean relative error (bias) for estimating DFREE was −39.6%, the inter-quartile range (IQR) was 19.0% and the standard deviation 41.8%. For estimating FBOUND the bias for MSDi (all) was 22.0%, the IQR 17.8% and the standard deviation 15.8%. Thus, in all but a few edge cases, MSDi (all) cannot reliably estimate DFREE or FBOUND. As for MSDi (R² > 0.8), examples of fits are shown in Figure 3—figure supplement 3 and the Matlab code to perform the fitting is available together with the data (see ‘Data availability’). In the case of MSDi (all), the main reason for the unreliable estimates is defocalization bias. Since fast-diffusing molecules move out-of-focus and thus have shorter trajectories, the 5-frame threshold introduces a large bias against freely-diffusing molecules. Overall, consistent with previous benchmarking efforts on membrane proteins (Weimann et al., 2013), MSDi (all) performed least well among the tested methods.
vbSPT
Rationale and parameters
vbSPT performs single-trajectory classification using Hidden-Markov Modeling (HMM) and Bayesian inference (Persson et al., 2013) and can assign different segments of a single trajectory to different diffusive states, each associated with a particular diffusion constant. vbSPT uses the information from all the estimates on single trajectories to consolidate an estimate of diffusion coefficients and associated fractions in each state.
vbSPT additionally uses a statistical model to infer the most likely number of diffusive states, assuming all states exhibit Brownian motion. Since the simulations used to evaluate vbSPT performance contain only two states, it was not clear how to assign DFREE or FBOUND in cases where e.g. three diffusive states were inferred. Therefore, to optimize the performance of vbSPT and perform the fairest comparison, we restricted vbSPT to two states such that vbSPT would infer the diffusion coefficient of up to two states and provide the associated fractions. This method conceptually differs from the MSDi approach in several ways:
- The inferred parameters are not based on the MSD
- A specific and rigorous Bayesian statistical model is used to aggregate the parameters estimated on single trajectories to global diffusion states.
vbSPT was initially designed for SPT of diffusing proteins in bacteria (Persson et al., 2013), where defocalization biases are virtually nonexistent since the axial dimension of most bacteria is generally comparable to or smaller than the microscope axial detection range. Furthermore, vbSPT does not explicitly model the localization error. The software is therefore expected to perform poorly when the localization error is high, as can be expected when imaging intranuclear factors.
In practice, the following parameters were used to assess vbSPT performance. The software was run on the full set of 3480 simulations. The priors and optimization parameters were left as default and the scripts to perform the analysis are provided together with the experimental data (please see Data Availability section):
dim = 2;
trjLmin = 2;
runs = 3;
maxHidden = 2;
bootstrapNum = 10;
fullBootstrap = 0;
init_D = [0.001, 16];
init_tD = [2, 20]*timestep;
Performance evaluation
Over the 3480 simulations, vbSPT accurately estimated both DFREE and FBOUND. The mean relative error (bias) for estimating DFREE was 0.8%, the inter-quartile range (IQR) was 6.8% and the standard deviation 12.5%. For estimating FBOUND the bias for vbSPT was 5.0%, the IQR 6.1% and the standard deviation 4.6%. Thus, vbSPT estimated values were quite consistent (IQR <7% for both DFREE and FBOUND). These values were very close to Spot-On in performance.
When looking at the heatmaps (Figure 3—figure supplement 2) more closely, it appeared that vbSPT performs poorly on the estimation of the free diffusion constant when the mean displacements are small. This case occurs either with small free diffusion constants (0.5–2 µm²/s), or with short times between frames (1 ms) and could be explained by the fact that in such conditions, the displacements of the free population and localization error have comparable magnitudes, and that vbSPT does not account for localization error.
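A rough worked example illustrates why the two magnitudes become comparable. It uses the standard results for 2D Brownian motion (mean squared displacement 4DΔτ) and for independent Gaussian localization errors of standard deviation σ per axis on both end points of a displacement (added mean squared displacement 4σ²), with the slowest free diffusion constant, the shortest time between frames and the 25 nm localization error of the 2-state simulations; treating the 25 nm value as a per-axis standard deviation is an assumption.

```latex
\begin{align*}
\sqrt{\langle r^2 \rangle_{\text{diffusion}}}
  &= \sqrt{4 D \Delta\tau}
   = \sqrt{4 \times 0.5\ \mu\text{m}^2/\text{s} \times 0.001\ \text{s}}
   \approx 0.045\ \mu\text{m}, \\
\sqrt{\langle r^2 \rangle_{\text{error}}}
  &= \sqrt{4 \sigma^2}
   = 2 \times 0.025\ \mu\text{m}
   = 0.050\ \mu\text{m}.
\end{align*}
```

Under these assumptions the localization error contributes slightly more to the observed displacements than the diffusion itself, whereas at D = 14.5 µm²/s and Δτ = 20 ms the diffusive term is about 1.08 µm and dominates.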
Regarding the estimate of the fraction bound, vbSPT tends to overestimate it more and more as the mean displacement of the free population increases (that is, either the exposure time or DFREE). This is most likely because vbSPT does not correct for defocalization bias. Thus, the more free molecules diffuse out-of-focus, the more vbSPT will overestimate FBOUND. Finally, we note that these two biases somewhat compensate for each other: not considering localization errors causes a small overestimate of the free population, whereas not correcting for defocalization bias causes an underestimate of the free population.
In summary, for conditions where the mean jump length of the free population can be distinguished from the localization error, vbSPT performs reasonably well, while being slightly outperformed by Spot-On.
Comparison of methods for 3-state simulations
In the case of the 720 simulated SPT datasets for the 3-state comparison, we analyzed the data using the Matlab version of Spot-On (all) and vbSPT. We describe the analysis in detail below.
Spot-On (all)
Rationale and parameters
Spot-On (all) was run on the simulations identically to the 2-state situation above except with one added freely diffusive state. To fit the simulations using Spot-On we fed the following parameters to the function SpotOn_core.m (v1.0; GitLab tag 1f9f782b):
- dZ = 0.700;
- GapsAllowed = 1;
- BinWidth = 0.010;
- UseAllTraj = 1;
- MaxJump = 6.05;
- ModelFit = 2;
- DoSingleCellFit = 0;
- NumberOfStates = 3;
- FitIterations = 8;
- FitLocError = 0;
- LocError = 0.04;
- D_Free1_3State = [0.4 10];
- D_Free2_3State = [0.4 25];
- D_Bound_3State = [0.00001 0.04];
- TimePoints: 10 if 1 ms; 9 if 4 ms; 8 if 7 ms; 7 if 10 ms; 6 if 13 ms; 5 if 20 ms;
- The empirical a,b parameters used to correct for defocalization bias were as follows:
- Δτ = 1 ms; Δz = 0.7 µm; 1 gap: a = 0.0387 s1/2; b = 0.3189 µm;
- Δτ = 4 ms; Δz = 0.7 µm; 1 gap: a = 0.1472 s1/2; b = 0.2111 µm;
- Δτ = 7 ms; Δz = 0.7 µm; 1 gap: a = 0.1999 s1/2; b = 0.2058 µm;
- Δτ = 10 ms; Δz = 0.7 µm; 1 gap: a = 0.2379 s1/2; b = 0.2017 µm;
- Δτ = 13 ms; Δz = 0.7 µm; 1 gap: a = 0.2656 s1/2; b = 0.2118 µm;
- Δτ = 20 ms; Δz = 0.7 µm; 1 gap: a = 0.3133 s1/2; b = 0.2391 µm;
As above, CDF-fitting was performed and the diffusion constant- and subpopulation fraction estimates compared to the ground truth for each of the 720 simulations for which the ground truth is known. Five parameters were estimated in the fit.
Performance evaluation
As in the 2-state comparison, Spot-On (all) slightly, but significantly, outperformed vbSPT also in the case of 3 states. The biggest error (bias) in estimating any of the subpopulation fractions was 3% and the biggest standard deviation (3.6% std) was also small (see Figure 3—figure supplement 6 for a full table of statistics). In the case of the diffusion constants, Spot-On also accurately inferred all of these with minimal error. The main limitation of Spot-On 3-state fitting is that it sometimes gets stuck in local minima (we estimate this happens in <1% of cases). Therefore, it was necessary to increase the number of fitting iterations to 8. Nevertheless, Spot-On was very robust and accurately estimated all five parameters with minimal error and outperformed vbSPT.
vbSPT
Rationale and parameters
vbSPT analysis was performed exactly as in the 2-state case, except with three hidden states instead of 2:
dim = 2;
trjLmin = 2;
runs = 3;
maxHidden = 3;
bootstrapNum = 10;
fullBootstrap = 0;
init_D = [0.001, 16];
init_tD = [2, 20]*timestep;
Although vbSPT was constrained to three states, it occasionally inferred that only 1 or 2 states exist. When vbSPT inferred fewer than three states (1 or 2), the inferred diffusion coefficients were matched to the closest diffusion coefficients of the ground truth, and the proportion of the one or two unmatched diffusion coefficients was set to zero.
Performance evaluation
vbSPT generally performed quite well. The maximal error (bias) in estimating any of the subpopulation fractions was 6% and the maximal standard deviation was 6.3% (see Figure 3—figure supplement 6 for the full table of statistics). The main limitation of vbSPT was its inability to infer DSLOW: the mean error (bias) for estimating DSLOW was 36.6% and the standard deviation was 64.7%. Therefore, vbSPT performed almost as well as Spot-On for estimating the subpopulation fractions and DFAST, but it was unable to accurately estimate either DBOUND or DSLOW and thus failed for two of the five parameters. In conclusion, vbSPT performs almost as well as Spot-On when estimating subpopulation fractions, but quite poorly when estimating diffusion constants unless they are very high.
Appendix 2
Considerations for choosing Spot-On parameters
In order to run Spot-On, the user has to set a number of parameters. While some are determined by the acquisition protocol (e.g. time between frames), others will have to be carefully chosen. We provide a discussion of how to choose these here.
JumpsToConsider
Users can either choose to use all displacements from all trajectories (set ‘Use all trajectories’ to ‘Yes’ in the web-version of Spot-On or ‘UseAllTraj = 1’ in the Matlab version of Spot-On) or to use only a subset by controlling the JumpsToConsider variable. For example, consider a trajectory consisting of 6 localizations and without gaps. If JumpsToConsider = 4 and TimePoints = 6, then this trajectory will contribute four displacements to the 1Δτ histogram, four displacements to the 2Δτ histogram, three displacements to the 3Δτ histogram, two displacements to the 4Δτ histogram and one displacement to the 5Δτ histogram. Thus, even though the trajectory contains five 1Δτ displacements, only the first four will be used for analysis if JumpsToConsider = 4. Why would we want to limit the number of jumps that are used? Since freely-diffusing molecules move out-of-focus, almost all very long trajectories will be bound molecules. For example, a single trajectory of 21 localizations will provide 20 displacements to the 1Δτ histogram, whereas freely diffusing molecules with short trajectories will provide fewer (e.g. 10 trajectories with three localizations would be necessary to also provide 20 displacements to the 1Δτ histogram). Thus, by limiting JumpsToConsider, one is biasing the displacement histogram against bound molecules. However, as demonstrated in the simulations shown in Figure 3—figure supplement 2, whether all jumps or JumpsToConsider = 4 is used has almost no effect on the DFREE-estimate, but using JumpsToConsider = 4 causes FBOUND to be underestimated by on average 5 percentage points relative to Spot-On (all). We see a similar ~5–10% difference between Spot-On (four jumps) and Spot-On (all) on the experimental spaSPT data shown in Figure 4.
As we have discussed previously (Hansen et al., 2017), restricting JumpsToConsider to four is one way to compensate for the many acquisition biases (such as motion-blur) that generally cause undercounting of fast-diffusing molecules and which cannot readily be taken into account in simulations. While the optimal value will depend on the trajectory length distribution (JumpsToConsider should not take a value much smaller than the mean trajectory length), we found that JumpsToConsider = 4 provides a good compromise for our experimental data. We strongly recommend including experimental controls (such as histone H2B-Halo and Halo-3xNLS) to ensure that experimental and analysis parameters have been reasonably set.
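The counting rule in the six-localization example can be written out explicitly. This is a minimal sketch for a gap-free trajectory, assuming that the JumpsToConsider cap is applied separately to each Δτ by keeping only the first displacements, which reproduces the numbers in the example; the function name `displacements_per_lag` is hypothetical and this is not the Spot-On source code.

```python
def displacements_per_lag(n_localizations, time_points, jumps_to_consider=None):
    """Count the displacements a gap-free trajectory adds to each lag histogram.

    n_localizations   -- number of localizations in the trajectory
    time_points       -- TimePoints setting; lags 1 .. time_points - 1 are used
    jumps_to_consider -- cap per lag, or None to use all displacements
    Returns a dict {lag: number of displacements}.
    """
    counts = {}
    for lag in range(1, time_points):
        # A trajectory with N localizations has N - lag displacements at this lag.
        available = max(n_localizations - lag, 0)
        if jumps_to_consider is not None:
            available = min(available, jumps_to_consider)
        counts[lag] = available
    return counts


# The worked example from the text: 6 localizations, TimePoints = 6
capped = displacements_per_lag(6, 6, jumps_to_consider=4)
uncapped = displacements_per_lag(6, 6)
```

Comparing `capped` and `uncapped` shows that the cap only removes displacements from long trajectories, which is how limiting JumpsToConsider shifts weight away from bound molecules.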
Number of timepoints
Spot-On considers how the histogram of displacements changes over time for multiple Δτ. The number of Δτ that will be considered is equal to the number of timepoints − 1. So, if timepoints = 8, the displacements from 1Δτ to 7Δτ will be considered. How many timepoints to consider will depend on how much data you have and on the frame-rate. For example, if the mean trajectory length is two frames, setting timepoints to 20 will cause problems since only a tiny fraction of trajectories will be at least 20 frames long and thus contribute to the 19Δτ histogram. Moreover, the correction for defocalization is approximate, so considering timepoints where more than 95% of free molecules have moved out-of-focus is also not recommended; when this happens will further depend on the free diffusion constant. Nevertheless, as long as there is sufficient data to reasonably populate the displacement histograms at all timepoints, Spot-On is highly robust to how this parameter is set (Figure 3—figure supplement 8). As a rule of thumb, we generally do not recommend setting timepoints above 10 or considering Δτ beyond 80 ms.
Iterations for fitting
Spot-On almost always converges optimally in the first iteration, so two or three iterations are generally more than sufficient when using the 2-state model. For the 3-state model, the parameter estimation is more complicated and here we recommend eight iterations as a starting point.
PDF or CDF fitting
Although for large datasets PDF- and CDF-fitting perform similarly as shown in Figure 3—figure supplement 9, CDF-fitting tends to provide more reliable estimates of DFREE and FBOUND when the number of trajectories decreases, likely because PDF-fitting is more susceptible to binning noise. Thus, for quantitative analysis we always recommend CDF-fitting, though PDF-fitting can be convenient for making figures since most people find histograms more intuitive.
Fitting localization error
Spot-On can either use a user-supplied localization error or fit it from the data. As long as there is a significant bound fraction, Spot-On will infer this with nanometer precision (Figure 3—figure supplement 11), though we note that this is an average localization error that mostly reflects the localization error of the bound fraction, and the actual localization error for each individual localization will vary (Deschout et al., 2012; Lindén et al., 2017). In cases where the bound population is very small, fitting the localization error can be less accurate. Thus, in situations where comparisons are being made between the same protein under different conditions, or e.g. between different mutants of the same protein, we recommend fitting to obtain a mean localization error and then keeping it fixed in the comparisons.
Choosing allowed ranges for diffusion constants
Spot-On comes with default allowed ranges. For example, for the 2-state model, DFREE = [0.5, 25] µm²/s and DBOUND = [0.0001, 0.08] µm²/s. These ranges are generally reasonable, but may not be appropriate for all datasets. Whenever Spot-On infers a diffusion constant that is equal to the minimum or maximum, caution is needed and it may be necessary to change these limits. In particular, unless a molecule is bound to an unusually dynamic scaffold, DBOUND = 0.08 µm²/s is almost certainly too high. Thus, we recommend imaging a protein that is overwhelmingly bound, such as histone H2B or H3, fitting the histone data with Spot-On and then using the inferred DBOUND for histone proteins, or a slightly larger value, as the maximally allowed DBOUND value.
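The check for an estimate sitting at its allowed minimum or maximum is easy to automate after a fit. The sketch below is a generic helper, not part of Spot-On: the function name, the dictionary layout and the relative tolerance are all assumptions chosen for illustration, and the example bounds are the 2-state defaults quoted above.

```python
def estimates_at_bounds(estimates, bounds, rel_tol=1e-3):
    """Return the names of fitted parameters that sit on a bound.

    estimates -- dict {name: fitted value}
    bounds    -- dict {name: (lower, upper)}
    rel_tol   -- fraction of the allowed range treated as "on the bound"
                 (an arbitrary choice for this sketch)
    """
    flagged = []
    for name, value in estimates.items():
        lower, upper = bounds[name]
        margin = rel_tol * (upper - lower)
        if value - lower <= margin or upper - value <= margin:
            flagged.append(name)
    return flagged


# 2-state default ranges quoted in the text (um^2/s)
default_bounds = {"D_free": (0.5, 25.0), "D_bound": (0.0001, 0.08)}

# Hypothetical fit result in which D_bound has hit its upper limit
suspicious = estimates_at_bounds({"D_free": 3.2, "D_bound": 0.08}, default_bounds)
```

Any parameter returned by such a check is a prompt to revisit the allowed range, for example by using a histone control to set the upper limit for DBOUND.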
2-state or 3-state model
Spot-On considers either a 2-state or 3-state model. Since the 3-state model contains two additional fitted parameters, the 3-state fit is almost always better. While there are many cases where a 2-state model would be inappropriate (e.g. a transcription factor that can exist as either a monomer or tetramer, thus exhibiting two very different diffusive states), generally speaking, we prefer fitting a 2-state model for most transcription factors or similar nuclear chromatin-interacting proteins. In part, deviations from the 2-state model will be due to anomalous diffusion and confinement inside cells, which cause deviation from the ideal Brownian motion model implemented by Spot-On. For this reason, traditional model-selection techniques such as Akaike’s Information Criterion (AIC) or the Bayesian Information Criterion (BIC) can also be misleading.
Appendix 3
SPT acquisition considerations in spaSPT experiments
Considerations for minimizing bias in SPT acquisitions
To obtain a good single-molecule tracking dataset, a series of requirements have to be met. First of all, it must be possible to image single molecules at a high signal-to-noise ratio. This is now relatively straightforward thanks to developments in fluorescence labeling strategies and imaging modalities (Lavis, 2017; Liu et al., 2015). The development of the HaloTag protein-labeling system and bright, photo-stable organic Halo-dyes such as TMR and the JF dyes (Grimm et al., 2015) now makes it possible to easily visualize single protein molecules inside live cells. Moreover, imaging modalities such as highly inclined and laminated optical sheet illumination (‘HiLo’) (Tokunaga et al., 2008) are relatively straightforward to implement and, combined with a high-quality EM-CCD camera, make it possible to image single molecules at a signal-to-noise ratio suitable for generating high-quality 2D SPT data. For details of our imaging setup, which combines HaloTag-labeling with HiLo-illumination and which is relatively common and easy to operate, please see the methods section. We note, however, that many other imaging modalities, e.g. light-sheet or even epi-fluorescence imaging, can generate high-quality single-molecule tracking data.
Thus, in the following we will assume that the above condition is met: namely, that single protein molecules can be tracked inside live cells at high signal-to-noise ratio. Nevertheless, even if this condition is met, there are at least four other major sources of bias:
1. Detection: minimize ‘motion-blurring’
2. Tracking: minimize tracking errors
3. 3D loss: correct for molecules moving out-of-focus (defocalization bias)
4. Analysis methods: infer subpopulations with minimal bias
Spot-On addresses points 3 and 4, as described elsewhere, but points 1 and 2 must be addressed in the experimental design. We discuss strategies to minimize these biases below (spaSPT).
1. Detection – minimizing ‘motion-blurring’
Almost all localization algorithms achieve sub-diffraction localization accuracy (‘super-resolution’) by treating individual fluorophores as point-source emitters, which generate blurred images that can be described by the Point-Spread-Function (PSF) of the microscope. Modeling of the PSF (typically as a 2-dimensional Gaussian) then allows extraction of the particle centroid with a precision of tens of nanometers. But as illustrated in Figure 1A, while this works extremely well for bound molecules, fast-diffusing molecules will spread out their photons over many pixels during the camera exposure and thus appear as ‘motion-blurs’. Thus, localization algorithms will reliably detect bound molecules, but may fail to detect fast-moving molecules as has also been observed previously (Berglund, 2010; Deschout et al., 2012; Elf et al., 2007; Izeddin et al., 2014; Lindén et al., 2017). Clearly, the extent of the bias will depend on the exposure time and the diffusion constant: the longer the exposure and higher D, the worse the problem. Assuming Brownian motion, we can calculate the fraction of molecules that will move more than some distance, rmax, during an exposure time, texp, given a free diffusion constant of DFREE using the following equation:
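$$P(r > r_{\max}) = e^{-\frac{r_{\max}^{2}}{4 D_{\mathrm{FREE}}\, t_{\mathrm{exp}}}}$$
For example, if we define motion-blurring as moving more than two pixels (>320 nm assuming a 160 nm pixel size) during the excitation, an exposure time of 10 ms and a typical free diffusion constant of 3.5 µm²/s (e.g. ~Sox2), we get:
$$P(r > 0.32\ \mu\mathrm{m}) = e^{-\frac{(0.32\ \mu\mathrm{m})^{2}}{4 \cdot 3.5\ \mu\mathrm{m}^{2}/\mathrm{s} \cdot 0.010\ \mathrm{s}}} = 0.48$$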
Thus, even for a relatively slowly diffusing protein, with a 10 ms exposure we should expect almost half (48%) of all free molecules to show significant motion-blurring, if we assume that molecules move with a constant speed during the exposure. The most straightforward solution, therefore, is to limit the exposure time: in the limit of an infinitely short exposure time, there is no motion-blur. In practice, most EM-CCD cameras can only image at ~100–200 Hz for reasonably sized ROIs. Moreover, it is generally desirable for the mean jump length to be significantly larger than the localization error; thus, for most nuclear factors in mammalian cells it is not desirable to image above 250 Hz. A reasonable solution is therefore to use stroboscopic illumination, that is, brief excitation laser pulses that are shorter than the camera exposure time (e.g. a 1 ms excitation pulse with a 10 ms camera exposure time for a 100 Hz experiment): this achieves minimal motion-blurring while maintaining a useful frame-rate. However, this highlights a key experimental trade-off: shorter excitation pulses reduce motion-blurring, but also reduce the signal-to-noise. Therefore, a reasonable compromise has to be determined. Here we use 1 ms excitation pulses: this achieves minimal motion-blurring (0.067% of molecules move >320 nm using D = 3.5 µm²/s) and still yields very good signal (signal-to-background >5). But users will need to decide this based on their expected D and their experimental setup (signal-to-noise). Moreover, different localization algorithms (Chenouard et al., 2014; Deschout et al., 2012) have different sensitivities to motion-blurring; thus, the extent of the bias will also depend on the user’s localization algorithm. As we show here, in the case of the MTT-algorithm (Sergé et al., 2008), the estimation of D is quite sensitive to motion-blurring, but the estimation of the bound fraction is less sensitive as long as the diffusion constant is <5 µm²/s. But other localization algorithms may be more or less sensitive. Generally speaking, we do not recommend imaging at a signal-to-background <3 and do not recommend using excitation pulses >5 ms, but the optimal conditions will need to be determined on a case-by-case basis.
In conclusion, experimentally implementing stroboscopic excitation makes it possible to minimize the bias coming from motion-blurring, while still achieving a sufficient signal for reliable localization.
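The motion-blur estimate in this section, P(r > rmax) = exp(−rmax² / (4·DFREE·texp)), can be evaluated for any combination of diffusion constant and excitation time. The following is a minimal sketch of that formula only; the function name `fraction_motion_blurred` is hypothetical, and the default threshold of 0.32 µm is the two-pixel definition used in the example above.

```python
import math


def fraction_motion_blurred(d_free_um2_s, t_exp_s, r_max_um=0.32):
    """Fraction of freely diffusing molecules moving farther than r_max
    during the excitation time, assuming 2D Brownian motion.

    d_free_um2_s -- free diffusion constant in um^2/s
    t_exp_s      -- excitation (exposure) time in seconds
    r_max_um     -- blur threshold in um (0.32 um = two 160 nm pixels)
    """
    return math.exp(-r_max_um ** 2 / (4.0 * d_free_um2_s * t_exp_s))


# Values discussed in the text for D = 3.5 um^2/s
continuous_10ms = fraction_motion_blurred(3.5, 0.010)   # about 0.48
stroboscopic_1ms = fraction_motion_blurred(3.5, 0.001)  # about 0.00067 (0.067%)
```

Scanning this function over the expected range of D is one way to choose an excitation pulse length before weighing it against the achievable signal-to-background.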
2. Tracking – minimizing tracking errors
It is necessary to minimize tracking errors in order to obtain high-quality SPT data. Tracking errors bias the estimation of essentially all parameters we could want to estimate from SPT experiments, including diffusion constants, subpopulations, anomalous diffusion etc. While many different tracking algorithms exist, it is fundamentally impossible to perform tracking, that is, connecting localized molecules between subsequent frames, at high densities without introducing many tracking errors. Thus, the simplest solution is to image at low densities: in principle, if there is only one labeled molecule per cell, there can be no tracking errors. Yet, because dyes generally bleach quite quickly under most SPT imaging conditions, this has traditionally led to a serious trade-off between data quality and the number of trajectories that can be obtained. However, with the recent development of bright photo-activatable JF dyes (PA-dyes; Grimm et al., 2016a; 2016b), it is now possible to combine the superior brightness of the Halo-JF dyes with photo-activation SPT (also called sptPALM; Manley et al., 2008). That is, a large fraction of Halo-tagged proteins in a cell can be labeled with Halo-PA-JF dyes and then photo-activated one at a time: this allows imaging at extremely low densities (<1 fluorescent molecule per cell per frame) while still yielding tens of thousands of trajectories from a single cell. Thus, PA-dyes now make it possible to nearly eliminate tracking errors without compromising on signal-to-noise or amount of data. In fact, imaging at extremely low densities generally also improves signal-to-noise since out-of-focus background is reduced and overlapping point emitters are avoided (Izeddin et al., 2014).
Nevertheless, even with paSPT it is still necessary to decide on an optimal density. The key parameters are the size of the ROI (ideally the whole nucleus for studies in cells) and D: a large nucleus and a slow D can support a higher density than fast-diffusing molecules in a small nucleus. As a general rule of thumb, we recommend a density of ~1 fluorescent molecule per ROI per frame. This will keep tracking errors at a minimum and still support rapid acquisition of large datasets. All data for this study were acquired at approximately this density.
In practice, keeping an optimal density will require some trial-and-error optimization of the 405 nm photo-activation laser intensity. 405 nm excitation does contribute background fluorescence, so we prefer to pulse the 405 nm laser during the camera ‘dead-time’ (~0.5 ms in our case) to avoid this. Moreover, this also makes it easier to keep the photo-activation level constant when changing the frame rate. However, the optimal photo-activation power will depend on the expression level of the protein, the protein half-life and the dye concentration, and will therefore have to be optimized in each case. We recommend recording initial datasets, analyzing them using Spot-On, which reports the mean number of localizations per frame, and then using this information to determine the optimal photo-activation level. However, even then some cell-to-cell variation may be unavoidable, especially in transient transfection experiments where there is large cell-to-cell variation in expression level or when studying proteins expressed from stably integrated transgenes (e.g. Halo-3xNLS and H2B-Halo in our case). In these cases, some cells will likely exhibit too high a density. To deal with this, Spot-On includes the option to analyze datasets from individual cells first and then exclude any cell with too high a density before analyzing the merged dataset.
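The per-cell density screen described above can be sketched as follows. This is not how Spot-On computes or exposes the metric: the data layout (one list of frame indices per cell, one entry per localization), the function names and the cut-off of 1 localization per frame (taken from the rule of thumb of ~1 fluorescent molecule per ROI per frame) are assumptions made for illustration.

```python
def mean_localizations_per_frame(frame_indices, n_frames):
    """Mean number of localizations per frame for one cell.

    frame_indices -- one frame index per localization, including
                     localizations from single-localization trajectories
    n_frames      -- total number of frames recorded for this cell
    """
    return len(frame_indices) / n_frames


def cells_within_density(cells, max_per_frame=1.0):
    """Return the names of cells whose density does not exceed the cut-off.

    cells -- dict {cell name: (frame_indices, n_frames)}
    """
    kept = []
    for name, (frame_indices, n_frames) in cells.items():
        if mean_localizations_per_frame(frame_indices, n_frames) <= max_per_frame:
            kept.append(name)
    return kept


# Toy example: cell_B has twice the recommended density and is excluded
toy_cells = {
    "cell_A": ([0, 0, 1, 3], 4),
    "cell_B": ([0, 0, 0, 1, 1, 2, 2, 3], 4),
}
kept_cells = cells_within_density(toy_cells)
```

Only the cells that pass such a screen would then be merged for the pooled analysis; the appropriate cut-off depends on the ROI size and on D, as discussed above.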
Which datasets are appropriate for Spot-On?
In the sections above, we have discussed how to minimize common experimental biases in SPT experiments and proposed spaSPT as a general solution. However, many 2D SPT datasets recorded under different conditions are also appropriate for Spot-On, including SPT experiments without photo-activation or with continuous illumination. For example, there may be situations where photo-activation SPT is not possible: in such cases, it will be essential to keep the labeling density sufficiently low that tracking errors are minimized, and it might thus be necessary to image substantially more cells to get enough statistics. Likewise, as we show in Figure 4J–K, motion-blurring is a major concern for fast-diffusing molecules, but for a slowly diffusing molecule like Halo-CTCF it makes only a small difference. Thus, SPT datasets recorded with continuous illumination may also be appropriate provided that the protein of interest is known to diffuse sufficiently slowly.
We also note that since Spot-On uses the loss of fast-diffusing molecules over time to correct for bias and to estimate the free population, it is essential that all trajectories are included in Spot-On for analysis. For example, some tracking and localization algorithms ignore all trajectories below a certain length (e.g. five frames), but this will cause Spot-On to misestimate the loss of molecules moving out-of-focus and thus it is imperative that trajectories of all lengths be included when analyzing data using Spot-On. Furthermore, trajectories of only a single localization are required to accurately compute the average number of localizations per frame, which is a key quality-control metric for SPT data.
Moreover, Spot-On does not currently support 3D SPT data. Furthermore, Spot-On assumes diffusion to be Brownian. This is a reasonable approximation even for molecules exhibiting some level of anomalous diffusion, as shown in Figure 4—figure supplement 2, but Spot-On is not appropriate for molecules undergoing directed motion (e.g. a protein moving on microtubules). Additionally, in cases where there are frequent state transitions at a time-scale similar to the frame rate (e.g. a transcription factor with a 10 ms residence time imaged at 100 Hz), Spot-On may give inaccurate results since it ignores state transitions (Figure 3—figure supplement 10). Finally, the correction for molecules moving out-of-focus assumes that molecules are not fully confined within small compartments that prevent them from moving out-of-focus.
Appendix 4
Proposed minimal reporting guidelines for SPT data and kinetic modeling analysis
To ensure reproducibility of results and subsequent analyses, datasets, statistics and analysis metrics should be provided. This should allow the reader to quickly assess the quality and statistical significance of the presented results and datasets. So far, to our knowledge, no consensus exists on minimal reporting guidelines for single-particle tracking datasets and kinetic modeling analyses. We note, however, that a recent preprint suggests a similar conceptual framework, although one less applicable to single-molecule experiments (Rigano and Strambio De Castillia, 2017).
We propose that single-particle datasets be published together with the following metadata. We suggest that these metrics constitute a minimal reporting guideline for single-particle datasets and subsequent kinetic modeling (though additional information may be appropriate and necessary in some cases).
Dataset description
Criterion | How to obtain it | Example value |
---|---|---|
Exposure time | Determined at the acquisition step | 5 ms |
Signal-to-background ratio | Mean peak value of detected particle divided by mean background value | 5 |
Detection algorithm used | | MTT (version xxx) |
Tracking algorithm used | | MTT (version xxx) |
Number of particles per frame | Provided by Spot-On | Mean: 0.76 |
Number of detections | Provided by Spot-On | 360000 |
Number of trajectories of length >3 | Provided by Spot-On | 10000 |
Mean trajectory length | Provided by Spot-On | 4.5 frames |
Localization error | Provided by Spot-On | 30 nm |
Spot-On parameters
In addition to these metrics, it is important to report the parameters specified in the detection and tracking algorithms, since this can greatly affect the results. For Spot-On, we recommend reporting the following parameters:
- Jump length distribution parameters: BinWidth (µm), Number of timepoints, Jumps to consider or Use all trajectories, MaxJump (µm).
- Fitting parameters: Number of states (2 or 3), localization error fitted from data (Yes or No; if No, specify the value in nm), dZ (µm), a (s^(1/2)), b (µm), PDF or CDF fit, number of iterations. Finally, the bounds used for the fitting algorithm should be reported, e.g.:
  - Dbound: [0.0005, 0.08] µm²/s
  - Dfree: [0.15, 25] µm²/s
  - Fbound: [0, 1]
  - If a 3-state model is used, the bounds for the additional subpopulation should also be reported.
In case a custom-modified version of Spot-On is used, we recommend that the code be made available and that a summary of the modifications be included in the methods section.
Funding Statement
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Contributor Information
Anders S Hansen, Email: anders.sejr.hansen@berkeley.edu.
Robert Tjian, Email: jmlim@berkeley.edu.
Xavier Darzacq, Email: darzacq@berkeley.edu.
David Sherratt, University of Oxford, United Kingdom.
Funding Information
This paper was supported by the following grants:
- National Institutes of Health UO1-EB021236 to Xavier Darzacq.
- National Institutes of Health U54-DK107980 to Xavier Darzacq.
- California Institute for Regenerative Medicine LA1-08013 to Xavier Darzacq.
- Howard Hughes Medical Institute 003061 to Robert Tjian.
- Howard Hughes Medical Institute to Luke D Lavis.
- Siebel Stem Cell Institute to Anders S Hansen.
Additional information
Competing interests
No competing interests declared.
has filed patent applications (e.g. PCT/US2015/023953) whose value may be affected by this publication.
Luke D Lavis: has filed patent applications (e.g. PCT/US2015/023953) whose value may be affected by this publication.
One of the three founding funders of eLife and a member of eLife's Board of Directors.
Author contributions
Conceived of the project, Developed Spot-On, Performed simulations, Performed experiments, Analyzed experiments, Drafted and edited the manuscript.
Conceived of the project, Developed Spot-On, Performed simulations, Analyzed experiments, Drafted and edited the manuscript.
Developed and contributed JF dyes, Edited the manuscript.
Developed and contributed JF dyes, Edited the manuscript.
Supervised the research, Reviewed and edited the manuscript.
Conceived of the project, Supervised the research, Reviewed and edited the manuscript.
Additional files
Supplementary file 1. PDF of step-by-step manual for using Spot-On.
Transparent reporting form
Major datasets
The following datasets were generated:
- Hansen AS, Woringer M, Grimm JB, Lavis LD, Tjian R, Darzacq X. Experimental data for "Spot-On: robust model-based analysis of single-particle tracking experiments". 2017. https://doi.org/10.5281/zenodo.834781. Publicly available at Zenodo (https://zenodo.org/).
- Hansen AS, Woringer M, Grimm JB, Lavis LD, Tjian R, Darzacq X. Simulated data for "Spot-On: robust model-based analysis of single-particle tracking experiments" (MATLAB format). 2017. https://doi.org/10.5281/zenodo.835541. Publicly available at Zenodo (https://zenodo.org/).
- Hansen AS, Woringer M, Grimm JB, Lavis LD, Tjian R, Darzacq X. Simulated data for "Spot-On: robust model-based analysis of single-particle tracking experiments". 2017. https://doi.org/10.5281/zenodo.834787. Publicly available at Zenodo (https://zenodo.org/).
- Hansen AS, Woringer M, Grimm JB, Lavis LD, Tjian R, Darzacq X. Software used for "Spot-On: robust model-based analysis of single-particle tracking experiments". 2017. https://doi.org/10.5281/zenodo.835171. Publicly available at Zenodo (https://zenodo.org/).
References
- Berglund AJ. Statistics of camera-based single-particle tracking. Physical Review E. 2010;82 doi: 10.1103/PhysRevE.82.011917. [DOI] [PubMed] [Google Scholar]
- Carslow HS, Jaeger JC. Conduction of Heat in Solids. 1959. [Google Scholar]
- Chenouard N, Smal I, de Chaumont F, Maška M, Sbalzarini IF, Gong Y, Cardinale J, Carthel C, Coraluppi S, Winter M, Cohen AR, Godinez WJ, Rohr K, Kalaidzidis Y, Liang L, Duncan J, Shen H, Xu Y, Magnusson KE, Jaldén J, Blau HM, Paul-Gilloteaux P, Roudot P, Kervrann C, Waharte F, Tinevez JY, Shorte SL, Willemse J, Celler K, van Wezel GP, Dan HW, Tsai YS, Ortiz de Solórzano C, Olivo-Marin JC, Meijering E. Objective comparison of particle tracking methods. Nature Methods. 2014;11:281–289. doi: 10.1038/nmeth.2808. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Deschout H, Neyts K, Braeckmans K. The influence of movement on the localization precision of sub-resolution particles in fluorescence microscopy. Journal of Biophotonics. 2012;5:97–109. doi: 10.1002/jbio.201100078. [DOI] [PubMed] [Google Scholar]
- Elf J, Li GW, Xie XS. Probing transcription factor dynamics at the single-molecule level in a living cell. Science. 2007;316:1191–1194. doi: 10.1126/science.1141967. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Frost NA, Lu HE, Blanpied TA. Optimization of cell morphology measurement via single-molecule tracking PALM. PLoS One. 2012;7:e36751. doi: 10.1371/journal.pone.0036751. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goulian M, Simon SM. Tracking single proteins within cells. Biophysical Journal. 2000;79:2188–2198. doi: 10.1016/S0006-3495(00)76467-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Grimm JB, English BP, Chen J, Slaughter JP, Zhang Z, Revyakin A, Patel R, Macklin JJ, Normanno D, Singer RH, Lionnet T, Lavis LD. A general method to improve fluorophores for live-cell and single-molecule microscopy. Nature Methods. 2015;12:244–250. doi: 10.1038/nmeth.3256. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Grimm JB, English BP, Choi H, Muthusamy AK, Mehl BP, Dong P, Brown TA, Lippincott-Schwartz J, Liu Z, Lionnet T, Lavis LD. Bright photoactivatable fluorophores for single-molecule imaging. Nature Methods. 2016a;13:985–988. doi: 10.1038/nmeth.4034. [DOI] [PubMed] [Google Scholar]
- Grimm JB, Klein T, Kopek BG, Shtengel G, Hess HF, Sauer M, Lavis LD. Synthesis of a Far-Red Photoactivatable Silicon-Containing Rhodamine for Super-Resolution Microscopy. Angewandte Chemie International Edition. 2016b;55:1723–1727. doi: 10.1002/anie.201509649. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hansen AS, Pustova I, Cattoglio C, Tjian R, Darzacq X. CTCF and cohesin regulate chromatin loop stability with distinct dynamics. eLife. 2017;6:e25776. doi: 10.7554/eLife.25776. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Izeddin I, Récamier V, Bosanac L, Cissé II, Boudarene L, Dugast-Darzacq C, Proux F, Bénichou O, Voituriez R, Bensaude O, Dahan M, Darzacq X. Single-molecule tracking in live cells reveals distinct target-search strategies of transcription factors in the nucleus. eLife. 2014;3:e2230. doi: 10.7554/eLife.02230. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Knight SC, Xie L, Deng W, Guglielmi B, Witkowsky LB, Bosanac L, Zhang ET, El Beheiry M, Masson JB, Dahan M, Liu Z, Doudna JA, Tjian R. Dynamics of CRISPR-Cas9 genome interrogation in living cells. Science. 2015;350:823–826. doi: 10.1126/science.aac6572.
- Kues T, Kubitscheck U. Single molecule motion perpendicular to the focal plane of a microscope: application to splicing factor dynamics within the cell nucleus. Single Molecules. 2002;3:218–224. doi: 10.1002/1438-5171(200208)3:4<218::AID-SIMO218>3.0.CO;2-C.
- Lavis LD. Chemistry is dead. Long live chemistry! Biochemistry. 2017;56:5165–5170. doi: 10.1021/acs.biochem.7b00529.
- Lee A, Tsekouras K, Calderon C, Bustamante C, Pressé S. Unraveling the thousand word picture: an introduction to super-resolution data analysis. Chemical Reviews. 2017;117:7276–7330. doi: 10.1021/acs.chemrev.6b00729.
- Li L, Liu H, Dong P, Li D, Legant WR, Grimm JB, Lavis LD, Betzig E, Tjian R, Liu Z. Real-time imaging of Huntingtin aggregates diverting target search and gene transcription. eLife. 2016;5:e17056. doi: 10.7554/eLife.17056.
- Lindén M, Ćurić V, Amselem E, Elf J. Pointwise error estimates in localization microscopy. Nature Communications. 2017;8:15115. doi: 10.1038/ncomms15115.
- Liu Z, Legant WR, Chen BC, Li L, Grimm JB, Lavis LD, Betzig E, Tjian R. 3D imaging of Sox2 enhancer clusters in embryonic stem cells. eLife. 2014;3:e04236. doi: 10.7554/eLife.04236.
- Liu Z, Lavis LD, Betzig E. Imaging live-cell dynamics and structure at the single-molecule level. Molecular Cell. 2015;58:644. doi: 10.1016/j.molcel.2015.02.033.
- Loffreda A, Jacchetti E, Antunes S, Rainone P, Daniele T, Morisaki T, Bianchi ME, Tacchetti C, Mazza D. Live-cell p53 single-molecule binding is modulated by C-terminal acetylation and correlates with transcriptional activity. Nature Communications. 2017;8:313. doi: 10.1038/s41467-017-00398-7.
- Manley S, Gillette JM, Patterson GH, Shroff H, Hess HF, Betzig E, Lippincott-Schwartz J. High-density mapping of single-molecule trajectories with photoactivated localization microscopy. Nature Methods. 2008;5:155–157. doi: 10.1038/nmeth.1176.
- Matsuoka S, Shibata T, Ueda M. Statistical analysis of lateral diffusion and multistate kinetics in single-molecule imaging. Biophysical Journal. 2009;97:1115–1124. doi: 10.1016/j.bpj.2009.06.007.
- Mazza D, Abernathy A, Golob N, Morisaki T, McNally JG. A benchmark for chromatin binding measurements in live cells. Nucleic Acids Research. 2012;40:e119. doi: 10.1093/nar/gks701.
- Metzler R, Jeon JH, Cherstvy AG, Barkai E. Anomalous diffusion models and their properties: non-stationarity, non-ergodicity, and ageing at the centenary of single particle tracking. Physical Chemistry Chemical Physics. 2014;16:24128–24164. doi: 10.1039/C4CP03465A.
- Michalet X. Mean square displacement analysis of single-particle trajectories with localization error: Brownian motion in an isotropic medium. Physical Review E. 2010;82:041914. doi: 10.1103/PhysRevE.82.041914.
- Michalet X, Berglund AJ. Optimal diffusion coefficient estimation in single-particle tracking. Physical Review E. 2012;85:061916. doi: 10.1103/PhysRevE.85.061916.
- Monnier N, Barry Z, Park HY, Su KC, Katz Z, English BP, Dey A, Pan K, Cheeseman IM, Singer RH, Bathe M. Inferring transient particle transport dynamics in live cells. Nature Methods. 2015;12:838–840. doi: 10.1038/nmeth.3483.
- Mueller F, Stasevich TJ, Mazza D, McNally JG. Quantifying transcription factor kinetics: at work or at play? Critical Reviews in Biochemistry and Molecular Biology. 2013;48:492–514. doi: 10.3109/10409238.2013.833891.
- Persson F, Lindén M, Unoson C, Elf J. Extracting intracellular diffusive states and transition rates from single-molecule tracking data. Nature Methods. 2013;10:265–269. doi: 10.1038/nmeth.2367.
- Pettitt SJ, Liang Q, Rairdan XY, Moran JL, Prosser HM, Beier DR, Lloyd KC, Bradley A, Skarnes WC. Agouti C57BL/6N embryonic stem cells for mouse genetic resources. Nature Methods. 2009;6:493–495. doi: 10.1038/nmeth.1342.
- Rhodes J, Mazza D, Nasmyth K, Uphoff S. Scc2/Nipbl hops between chromosomal cohesin rings after loading. eLife. 2017;6:e30000. doi: 10.7554/eLife.30000.
- Rigano A, Strambio De Castillia C. Proposal for minimum information guidelines to report and reproduce results of particle tracking and motion analysis. bioRxiv. 2017. doi: 10.1101/155036.
- Schmidt JC, Zaug AJ, Cech TR. Live cell imaging reveals the dynamics of telomerase recruitment to telomeres. Cell. 2016;166:1188–1197. doi: 10.1016/j.cell.2016.07.033.
- Sergé A, Bertaux N, Rigneault H, Marguet D. Dynamic multiple-target tracing to probe spatiotemporal cartography of cell membranes. Nature Methods. 2008;5:687–694. doi: 10.1038/nmeth.1233.
- Shen H, Tauzin LJ, Baiyasi R, Wang W, Moringo N, Shuang B, Landes CF. Single particle tracking: from theory to biophysical applications. Chemical Reviews. 2017;117:7331–7376. doi: 10.1021/acs.chemrev.6b00815.
- Swinstead EE, Miranda TB, Paakinaho V, Baek S, Goldstein I, Hawkins M, Karpova TS, Ball D, Mazza D, Lavis LD, Grimm JB, Morisaki T, Grøntved L, Presman DM, Hager GL. Steroid receptors reprogram FoxA1 occupancy through dynamic chromatin transitions. Cell. 2016;165:593–605. doi: 10.1016/j.cell.2016.02.067.
- Tarantino N, Tinevez JY, Crowell EF, Boisson B, Henriques R, Mhlanga M, Agou F, Israël A, Laplantine E. TNF and IL-1 exhibit distinct ubiquitin requirements for inducing NEMO-IKK supramolecular structures. The Journal of Cell Biology. 2014;204:231–245. doi: 10.1083/jcb.201307172.
- Teves SS, An L, Hansen AS, Xie L, Darzacq X, Tjian R. A dynamic mode of mitotic bookmarking by transcription factors. eLife. 2016;5:e22280. doi: 10.7554/eLife.22280.
- Tinevez JY, Perry N, Schindelin J, Hoopes GM, Reynolds GD, Laplantine E, Bednarek SY, Shorte SL, Eliceiri KW. TrackMate: an open and extensible platform for single-particle tracking. Methods. 2017;115:80–90. doi: 10.1016/j.ymeth.2016.09.016.
- Tokunaga M, Imamoto N, Sakata-Sogawa K. Highly inclined thin illumination enables clear single-molecule imaging in cells. Nature Methods. 2008;5:159–161. doi: 10.1038/nmeth1171.
- Vestergaard CL, Blainey PC, Flyvbjerg H. Optimal estimation of diffusion coefficients from single-particle trajectories. Physical Review E. 2014;89:022726. doi: 10.1103/PhysRevE.89.022726.
- Weimann L, Ganzinger KA, McColl J, Irvine KL, Davis SJ, Gay NJ, Bryant CE, Klenerman D. A quantitative comparison of single-dye tracking analysis tools using Monte Carlo simulations. PLoS One. 2013;8:e64287. doi: 10.1371/journal.pone.0064287.
- Zhen CY, Tatavosian R, Huynh TN, Duc HN, Das R, Kokotovic M, Grimm JB, Lavis LD, Lee J, Mejia FJ, Li Y, Yao T, Ren X. Live-cell single-molecule tracking reveals co-recognition of H3K27me3 and DNA targets polycomb Cbx7-PRC1 to chromatin. eLife. 2016;5:e17667. doi: 10.7554/eLife.17667.
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Spot-On: robust model-based analysis of single-particle tracking experiments" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Kevin Struhl as the Senior Editor. The following individual involved in review of your submission has agreed to reveal his identity: Lothar Schermelleh (Reviewer #2).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
This 'Resource' manuscript describes an integrated approach for single-particle tracking (SPT) microscopy using stroboscopic photoactivation (spaSPT) and presents "Spot-On", an open-access software analysis tool for SPT data. The latter can be applied to any kind of SPT data and uses model fitting to calculate diffusion rates and bound and mobile population sizes. The authors convincingly validate and compare their approach with other available tools using ground-truth simulations and by analysing a number of Halo-tagged nuclear proteins with differential dynamic properties in U2OS cells, labelled with photoactivatable JF dyes.
Both reviewers are enthusiastically supportive of this being published, although a small number of minor comments and corrections are appended below.
A practical issue – could some of the information on the website also be loaded as a supplemental tutorial file (some helpful screenshots?)?
The software in its current form is restricted to 2D-SPT analyses. Why is 3D-tracking not implemented, and is there a planned route for a future upgrade? With many imaging systems offering astigmatism-based 3D localisation options, would this not offer potential benefits? We understand that a package for 3D analysis may be out of the scope of the present article, but whether there is a prospect for such a package in the future might be indicated/commented on.
Along with the software, the authors describe spaSPT as a beneficial approach to minimize motion-blur and tracking errors. What would be the difference between spaSPT and SPT with very low concentration of non-photoactivatable JF dyes? The authors may want to discuss application of Spot-On with alternative imaging approaches in cases where photoactivation is either not possible (e.g. due to lack of laser lines or suitable dyes) or not desired (potential damage, blue channel is already used).
It would be desirable to have a function for visualising tracks with the option to differentially select and display subsets of tracks that fall into different categories (immobile/mobile or other criteria). This would enable the user to directly see spatial patterns of differential dynamics.
i) Figure 4K and subsection “Effect of motion-blur bias on parameter estimates”, first paragraph: Why is there a small but significant difference between PA-JF549 and PA-JF646?
ii) Subsection “Validation of Spot-On using spaSPT data at different frame rates”: The axial detection range should be specified (around 700 nm?).
iii) Abbreviations CDF and PDF may need to be introduced in the main text as well, not only the figure legend.
iv) Figure 4—figure supplement 3 legend – last paragraph: Does the "total bound fraction of ~60-65%" refer to Halo-Sox2? Please specify.
[…] Both reviewers are enthusiastically supportive of this being published, although a small number of minor comments and corrections are appended below.
- A practical issue – could some of the information on the website also be loaded as a supplemental tutorial file (some helpful screenshots?)?
This is a good suggestion. We adapted the online tutorial and attach it as Supplementary file 1. To accommodate future revisions of the interface, it includes a reference to the online version, which will be updated as the platform is upgraded.
- The software in its current form is restricted to 2D-SPT analyses. Why is 3D-tracking not implemented, and is there a planned route for a future upgrade? With many imaging systems offering astigmatism-based 3D localisation options, would this not offer potential benefits? We understand that a package for 3D analysis may be out of the scope of the present article, but whether there is a prospect for such a package in the future might be indicated/commented on.
This is a very interesting suggestion and one we have thought a lot about. Currently, there are two main options for 3D tracking: 1) PSF shaping (such as astigmatism)-based 3D SPT (e.g. cylindrical lens or adaptive optics) and 2) 3D SPT using a Multi-Focal Microscope (MFM) (Abrahamsson et al., Nature Methods, 2013). We will discuss both here.
- Astigmatism-based 3D SPT data will give a truncated 3D displacement distribution. In x,y there are minor restrictions on displacement lengths, but in z, the maximum displacement will be equal to the axial detection range (~700 nm total; 0 ± 350 nm). For example, a molecule 200 nm above the focal plane will defocalize if it moves more than 150 nm up or more than 550 nm down. For this reason, the expected distribution of 3D displacements is not the simple 3D Brownian distribution, but a complicated convolution of a truncated axial distribution (max 700 nm; position-dependent) and the 2D x,y-distribution considered by Spot-On. To our knowledge, there is no straightforward way of extending the z-correction currently implemented in Spot-On to 3D. The simulations in Author response image 1 compare the jump length distributions derived from data simulated (D = 6.0 µm²/s, dt = 13 ms, sigma = 35 nm) inside a nucleus of diameter 8 µm with increasing axial detection range (400 nm, 700 nm and full nucleus). Although the 2D projection of the simulated data (right panel) is only marginally affected by changes in the axial detection depth, the 3D jump length distribution (left panel) is sensitive to it: the smaller the axial detection depth, the more truncated the distribution. More work is necessary to determine how mathematically tractable this problem is (and we agree that this will be an interesting future direction), but currently we are not sure whether the additional information justifies the approach. Of course, 3D astigmatism data can always be 2D-projected and then analyzed by Spot-On.
Author response image 1.
- The truncated 3D displacement distribution issue can in theory be solved with MFM, which can in principle yield whole-nucleus 3D SPT data. However, this comes with serious limitations. First, in the 9-focal-plane implementation, the signal is split into 9 and, due to other losses, the signal-to-noise is reduced by a factor of ~15-20. Although the JF dyes are very bright, a loss of signal this large is a serious limitation. Second, since the full field-of-view is required, the frame rate is necessarily slow (33 Hz in Chen et al., Cell, 2014, with an EM-CCD). While sCMOS cameras might solve the speed issue, this comes at the cost of further loss of signal. Thus, in MFM-based 3D SPT, continuous illumination was necessary to collect enough signal, resulting in a 30 ms continuous exposure time. This leads to serious motion-blurring and likely undercounting of the fast-moving population. Third, while incredibly elegant, the MFM microscope is quite complicated to set up (we have set it up in our lab recently) and is not in regular use in many labs. But as the reviewers also suggest, this is likely to improve in the future, and we note that no Z-correction for defocalization would be necessary in whole-nucleus MFM.
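The truncation effect described for astigmatism-based 3D SPT can be illustrated with a minimal simulation sketch. This is not the code behind Author response image 1. It assumes free Brownian motion with the parameters quoted in the response (D = 6.0 µm²/s, dt = 13 ms, sigma = 35 nm), ignores nuclear confinement, uses an 8 µm slab as a stand-in for "full nucleus", and keeps only jumps that start and end inside the axial detection slab. The function names are hypothetical.

```python
import math
import random

D = 6.0        # diffusion constant in um^2/s (value quoted in the response)
DT = 0.013     # time between frames in s
SIGMA = 0.035  # localization error in um


def simulate_observed_jumps(axial_range_um, n_start=20000, seed=1):
    """Return (2D, 3D) jump lengths for jumps that stay in the detection slab.

    Assumption: molecules start uniformly within the slab and a jump is
    observed only if its end point is also inside the slab.
    """
    rng = random.Random(seed)
    sd = math.sqrt(2.0 * (D * DT + SIGMA ** 2))  # per-axis displacement s.d.
    half = axial_range_um / 2.0
    jumps_2d, jumps_3d = [], []
    for _ in range(n_start):
        z0 = rng.uniform(-half, half)
        dx, dy, dz = rng.gauss(0.0, sd), rng.gauss(0.0, sd), rng.gauss(0.0, sd)
        if abs(z0 + dz) <= half:  # molecule is still in focus after the jump
            jumps_2d.append(math.hypot(dx, dy))
            jumps_3d.append(math.sqrt(dx * dx + dy * dy + dz * dz))
    return jumps_2d, jumps_3d


def mean(values):
    return sum(values) / len(values)


for axial_range in (0.4, 0.7, 8.0):
    j2, j3 = simulate_observed_jumps(axial_range)
    print(f"range {axial_range:.1f} um: kept {len(j2)} jumps, "
          f"mean 2D jump {mean(j2):.3f} um, mean 3D jump {mean(j3):.3f} um")
```

Under these assumptions, the mean 2D jump length is essentially independent of the axial range, whereas the mean 3D jump length shrinks as the detection slab gets thinner, because large axial displacements are preferentially lost.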
In summary, we believe that 3D tracking methods are currently not sufficiently mature to provide enough added value. But we nevertheless fully agree with the reviewers that this is very likely to change in the future and we have now added a sentence to the Discussion that extending Spot-On to 3D SPT data is an exciting future direction and one that we are potentially interested in pursuing. Specifically, we now write that: “This platform can easily be extended to other diffusion regimes (Metzler et al., 2014) and models (Lee et al., 2017) and, as 3D SPT methods mature, to 3D SPT data.”
- Along with the software, the authors describe spaSPT as a beneficial approach to minimize motion-blur and tracking errors. What would be the difference between spaSPT and SPT with very low concentration of non-photoactivatable JF dyes? The authors may want to discuss application of Spot-On with alternative imaging approaches in cases where photoactivation is either not possible (e.g. due to lack of laser lines or suitable dyes) or not desired (potential damage, blue channel is already used).
This is an important point and, as the reviewers suggest, there is in principle no difference between doing SPT with very low concentrations of non-PA dye and spaSPT. The practical difference is that with non-PA dyes, once the dyes have bleached, no more data can be obtained from the cell in question, and thus many more cells have to be imaged to obtain enough data. So spaSPT is more convenient. But to clarify this important point, we now write: “We also note that although Spot-On was validated on spaSPT data, SPT data with non-photoactivatable dyes is also suitable for Spot-On analysis provided that the density is sufficiently low to minimize tracking errors (see also Appendix 3: “Which datasets are appropriate for Spot-On?”)”.
- It would be desirable to have a function for visualising tracks with the option to differentially select and display subsets of tracks that fall into different categories (immobile/mobile or other criteria). This would enable the user to directly see spatial patterns of differential dynamics.
We thank the reviewers for this important consideration. Indeed, track visualization is crucial, as it can be used both to (1) perform quality controls and spot several kinds of biases and (2) refine data analysis, for instance using track segmentation. We discuss both below.
Quality controls (1): visual inspection of the output of the tracking algorithm is indeed crucial, and can be used to detect biases such as tracking errors or issues with a detection threshold. In that case, such visual inspection is much more useful when trajectories and raw images can be overlaid. To realize the full potential of such an approach, one would need to upload the raw data to Spot-On, in addition to the tracked data.
Although tracked datasets (trajectories) are easily amenable to online processing (a typical SPT tracking file has a size of the order of a few MB), raw images often weigh several GB, making their interactive processing more challenging. For this reason, we do not believe such an option will work for a web-based platform like Spot-On.
Conversely, several ImageJ/Fiji plugins are capable of interactive display of raw images and trajectories. In our opinion, one of the most mature Fiji plugins is TrackMate (Tinevez et al., Methods, 2017). TrackMate can save/reopen tracking files that contain a full description of the tracking parameters, allowing for a careful inspection of the raw movies and overlaid trajectories.
In order to facilitate the integration between a track-visualization software (TrackMate) and an SPT analysis tool (Spot-On), we developed a “TrackMate to Spot-On connector” (available at: https://gitlab.com/tjian-darzacq-lab/Spot-On-TrackMate). This plugin adds an extra menu to TrackMate that allows a one-click upload of a dataset to Spot-On. Thus, a manually inspected file can automatically be uploaded to Spot-On.
We now mention this tool: “Spot-On does not directly analyze raw microscopy images, since a large number of localization and tracking algorithms exist that convert microscopy images into single-molecule trajectories (for a comparison of particle tracking methods, see (Chenouard et al., 2014); moreover, Spot-On can be interfaced with TrackMate (Tinevez et al., 2017), which allows inspection of trajectories before uploading to Spot-On).” Finally, we acknowledge that this approach restricts the inspection of trajectories to the ones tracked using the TrackMate software, and that a tool accepting a wider range of file formats would be desirable. Our group is currently pursuing efforts in that direction, but this is a longer-term project that is not within the scope of this paper.
Track segmentation/classification (2): even though Spot-On can infer the diffusion constants and relative fractions of several subpopulations, it does not perform single-trajectory classification. Indeed, this latter problem is significantly harder, or even impossible, depending on the length of the track to classify (see for instance Michalet and Berglund, 2012), and several recently developed approaches already deal with this problem (Persson et al., 2013; Monnier et al., 2015), mostly relying on hidden Markov models (HMMs). Furthermore, since trajectories derived from 3D-diffusing factors entering and exiting the focal plane are inherently short, only a very small fraction of them are actually amenable to single-trajectory inference. Therefore, while we agree that assigning trajectories into particular subcategories (e.g. immobile vs. mobile) is interesting, since HMM-based approaches already exist, we feel that adding this functionality is outside the scope of this paper.
In summary, we thank the reviewers for the suggestion and have implemented a one-click open-source connector between TrackMate and Spot-On that makes visualizing the trajectories before uploading them to Spot-On easy and intuitive.
- i) Figure 4K and subsection “Effect of motion-blur bias on parameter estimates”, first paragraph: Why is there a small but significant difference between PA-JF549 and PA-JF646?
This is a good point and we have noticed this as well. For proteins with a significant bound fraction (i.e. >25%), we find that both PA-JF549 and PA-JF646 give identical results. However, we find that for proteins with a negligible bound fraction (e.g. Halo-3xNLS or Halo-CTCF without the DNA binding domain), PA-JF646 has a “bound fraction floor” of about 10-15% which it cannot go below, whereas PA-JF549 appears to be able to capture the dynamics of all proteins and we have observed bound fractions <5% for PA-JF549. We believe this is the reason for the difference and we are confident in this effect: it has been highly reproducible in both U2OS and mES cells and we have observed it for several different proteins. We do not have a clear explanation for this, however; one possibility is that JF646 and its conjugates are a little more hydrophobic due to the extra methyl groups in the Si-rhodamine structure. This effect is unlikely to be a case of tracking “free dye ligand”, since JF646 shows negligible photoactivation unless bound to the HaloTag (see Grimm et al., 2016) and since small dyes like JF549/646 are expected to have diffusion constants of ~250 µm²/s and thus are unlikely to be trackable. Since we do not have a clear explanation of this effect, we prefer not to speculate in the manuscript, but we now state that JF549 appears to be more reliable for SPT (which could be helpful to know for other labs) and specifically write that “Similar results were obtained for both dyes for proteins with a significant bound fraction, but we note that JF549 appears to better capture the dynamics of proteins with a minimal bound fraction such as Halo-3xNLS (Figure 4J-K)”.
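To give a sense of why a free dye diffusing at ~250 µm²/s is unlikely to be trackable, the short sketch below computes the root-mean-square 2D displacement per frame, sqrt(4·D·Δt), for free Brownian motion. The frame interval of 13 ms is an assumption borrowed from the simulation settings quoted earlier in this response, and the function name is hypothetical.

```python
import math


def rms_jump_2d(d_um2_per_s, dt_s):
    """Root-mean-square 2D displacement (um) of free Brownian motion per frame."""
    return math.sqrt(4.0 * d_um2_per_s * dt_s)


DT = 0.013  # assumed frame interval in s (13 ms)

for label, d_coef in (("free dye, ~250 um^2/s", 250.0),
                      ("protein, 6.0 um^2/s", 6.0)):
    print(f"{label}: RMS jump per frame = {rms_jump_2d(d_coef, DT):.2f} um")
```

Under this assumption, the free dye would move about 3.6 µm between consecutive frames, compared with roughly 0.56 µm for a protein diffusing at 6.0 µm²/s, so linking its localizations into trajectories would be very difficult.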
ii) Subsection “Validation of Spot-On using spaSPT data at different frame rates”: The axial detection range should be specified (around 700 nm?).
Yes, we agree and have updated the sentence. Thanks for pointing this out.
iii) Abbreviations CDF and PDF may need to be introduced in the main text as well, not only the figure legend.
Yes, this is another good point. We have now added this to the Discussion around Figure 3C and state that: “Note that although we show the fits to the probability density function since this is more intuitive (PDF; histogram), we performed the fitting to the cumulative distribution function (CDF).”
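The difference between displaying a PDF and fitting to a CDF can be sketched with a deliberately simplified example. This is not Spot-On's implementation: it assumes a single freely diffusing state, omits the correction for molecules moving out of focus, and uses a plain grid search where a real fit would use an optimizer. The model CDF used, 1 − exp(−r²/(4(D·dt + sigma²))), is the standard result for 2D Brownian jump lengths with localization error. All function names are hypothetical.

```python
import math
import random


def model_cdf(r, d_coef, dt, sigma):
    """CDF of the 2D jump length r for one freely diffusing state."""
    return 1.0 - math.exp(-r * r / (4.0 * (d_coef * dt + sigma ** 2)))


def simulate_jumps(n, d_coef, dt, sigma, seed=0):
    """Draw n 2D jump lengths (um) for free diffusion plus localization error."""
    rng = random.Random(seed)
    sd = math.sqrt(2.0 * (d_coef * dt + sigma ** 2))  # per-axis s.d.
    return [math.hypot(rng.gauss(0.0, sd), rng.gauss(0.0, sd)) for _ in range(n)]


def fit_d_to_cdf(jumps, dt, sigma, d_min=0.05, d_max=25.0, steps=500):
    """Least-squares fit of D to the empirical CDF using a simple grid search."""
    xs = sorted(jumps)
    n = len(xs)
    ecdf = [(i + 1) / n for i in range(n)]  # empirical CDF, no binning needed
    best_d, best_err = d_min, float("inf")
    for k in range(steps + 1):
        d = d_min + (d_max - d_min) * k / steps
        err = sum((model_cdf(x, d, dt, sigma) - y) ** 2 for x, y in zip(xs, ecdf))
        if err < best_err:
            best_d, best_err = d, err
    return best_d


jumps = simulate_jumps(1000, d_coef=6.0, dt=0.013, sigma=0.035)
d_fit = fit_d_to_cdf(jumps, dt=0.013, sigma=0.035)
print(f"fitted D = {d_fit:.2f} um^2/s (simulated with D = 6.0 um^2/s)")
```

One practical appeal of fitting to the CDF is that it uses every jump directly and does not depend on a choice of histogram bin width, whereas the histogram remains the more intuitive way to display the result.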
iv) Figure 4—figure supplement 3 legend – last paragraph: Does the "total bound fraction of ~60-65%" refer to Halo-Sox2? Please specify.
Thanks for catching this error. The reviewer is correct; it referred to Halo-Sox2 and we have now corrected this.
Supplementary Materials
Supplementary file 1. PDF of step-by-step manual for using Spot-On.
Transparent reporting form
Data Availability Statement
All 1064 raw spaSPT experiments (Figure 4) as well as the 3480 simulations (Figure 3) are freely available at Zenodo as SPT trajectories in Spot-On readable Matlab and CSV file formats. The experimental data are available at: https://zenodo.org/record/834781. The simulations are available in Matlab format at: https://zenodo.org/record/835541 and in CSV format at: https://zenodo.org/record/834787. The supplementary software used for MSDi and vbSPT analysis, as well as for generating the simulated data, is available at: https://zenodo.org/record/835171