METHOD FOR SINGLE MOLECULE FLUORESCENCE ANALYSIS

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

The invention described and claimed herein was made in part utilizing funds supplied by the United States Department of Energy under Contract No. DE-AC03-76SF00098 between the United States Department of Energy and The Regents of the University of California. The government has certain rights to the invention.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application No. 60/577,960, filed June 7, 2004, which is hereby incorporated by reference in its entirety for all purposes.

BACKGROUND OF THE INVENTION

The greatest advantage of optical single-molecule spectroscopy, elimination of the ensemble average, is also its greatest fault. Elimination of the ensemble average allows unprecedented opportunities for observation of rare events and the distributions that underlie the ensemble average. These observations are near the limits of optical detection, however, such that raw experimental data are inundated with Poissonian photon counting noise. For instance, as pointed out by Köllner and Wolfrum (1992), at least 185 photons are required to measure with 10% accuracy a static, mono-exponential fluorescence lifetime from a single molecule. Eliciting quantitative dynamical information from these noisy trajectories is still one of the major challenges in single-molecule spectroscopy. The problem is illustrated by the simulated single-molecule traces displayed in FIG. 1. As the bin time is increased from 5 to 30 ms (lowering the time resolution), the "noise" subsides, but the dynamics are obscured as well. On the other hand, very small bin widths (high time resolution) lead to very large statistical errors. Single-molecule spectroscopy will not reach its full potential until this trade-off between time resolution and statistical error is thoroughly explored. The present invention provides a data analysis algorithm that takes advantage of information-theoretical expressions for this trade-off. A brief overview of recent developments in dynamic single-molecule measurements will help explain the present invention.
Since the pioneering experiments by Moerner (1989) and Orrit (1990), optical single-molecule spectroscopy has gained great momentum both in technology development (Moerner & Fromm, 2003) and in applications (Nie & Zare, 1997; Xie & Trautman, 1998; Weiss, 1999). It is particularly suited for the investigation of biological systems because it probes dynamics on the enzymatically relevant sub-ms to s time scales. Using this technique, for example, enzymatic reaction rates of cholesterol oxidase (Lu et al, 1998) and horseradish peroxidase (Edman et al, 1999) were found to fluctuate with time; previously unreported folding intermediates were directly observed in RNA molecules (Zhuang et al, 2000; Tan et al, 2003) and their transition states characterized (Bokinsky et al, 2003); the detailed dynamics of F1-ATPase rotation was revealed (Yasuda et al, 1998; Adachi et al, 2000); and the time scales of protein conformational fluctuations were quantitatively characterized and modeled (Yang et al, 2003). Not only have single-molecule experiments contributed to our fundamental understanding of biomolecular function, they have also stimulated much theoretical work that provides physical insights into such processes as dynamic disorder, conformational fluctuations, and photon statistics (Jung et al, 2002).

Monitoring biochemical events in real time utilizing optical single-molecule spectroscopy can in principle establish a quantitative relationship between the static structure and the dynamic function of a biomolecule. Structural changes in a single molecule can be probed using Förster-type resonance energy transfer (FRET) (Ha et al, 1996). While FRET allows studies of structural changes on the length scale of an entire biomolecule (20-80 Å) (Stryer, 1978), the minutiae of conformational fluctuations that accompany or facilitate the functioning of a biomolecule can be examined by utilizing excited-state electron transfer (ET) quenching of fluorescence. By virtue of the exponential distance dependence of the quenching rate, ET is sensitive to distance variations on the Angstrom length scale and is a probe of conformational fluctuations at the catalytically active site (Yang et al, 2003). In addition, fluorescence polarization experiments can yield information about the orientational dynamics of a molecule (Adachi et al, 2000, Bartko et al, 2002, Yasuda et al, 1998).

A disadvantage of single molecule spectroscopy is that the organic dyes commonly used as fluorescent probes eventually undergo irreversible photo-degradation, limiting the length of recordable single-molecule trajectories (Deschenes & Vanden Bout, 2002, Eggeling et al, 1998b). Consequently, it is not always guaranteed that the molecular system under investigation explores all possible configurations during the measurement period, as the ergodic principle would have dictated.
Such non-ergodic conditions are expected to be encountered experimentally in a reactive system such as a single enzyme molecule. Despite the conviction that all dynamical information is contained in single molecule trajectories, these practical matters inevitably hamper the experimentalist's ability to quantitatively characterize the fast conformational motions that critically influence the function of a biomolecule. The challenge thus lies in the efficient extraction of the maximal amount of dynamic information from short, noisy single-molecule traces.

Many theoretical tools for the analysis of single molecule systems already exist. In the context of room-temperature time-resolved studies, for example, the correlation method (Onuchic et al, 1999, Wang & Wolynes, 1995), which is a very sensitive probe of the "memory" of a system, has been used to analyze the dynamics of a single enzyme molecule (Agmon, 2000a, Schenter et al, 1999) and conformational fluctuations (Chen et al, 2003, Edman et al, 1999, Yang et al, 2003, Zhuang et al, 2002). In principle, more features can be revealed using higher order correlations (Yang & Xie, 2002b) or event echo analysis (Cao, 2000, Yang & Cao, 2001). These methods are applicable to systems that exhibit stationarity and ergodicity. In cases where the measurement period is commensurate with the inter-conversion time scale between states (Edman et al, 1996, Jia et al, 1997), kinetic parameters can be deduced by applying the motional narrowing concept originally developed for line-shape analysis (Berezhkovskii et al, 1999; 2000; 2001, Geva & Skinner, 1997). However, due to non-ideal experimental conditions (namely, short trajectories and the non-ergodic conditions typically seen in a reactive setting) it may prove difficult to use these powerful theoretical tools on experimental data.

Recent advances in experimental data registering, originally developed for time-correlated single photon counting (TCSPC) (Becker et al, 1999, Böhmer et al, 2001), allow the chronological arrival time of each detected photon to be recorded. This has stimulated new experimental schemes such as multi-parameter fluorescence spectroscopy (Kühnemuth & Seidel, 2001) and photon-by-photon correlation (Yang et al, 2003). For ergodic systems, the latter method allows detailed examination of conformational dynamics that covers a wide range of time scales from sub- s to tens of s. Advanced statistical methods that rely on stationarity and ergodicity have also been developed to elicit physical parameters from such time-stamped data streams (Novikov et al, 2001, Yang & Xie, 2002a;b). Most recently, photon by photon approaches assuming certain Bayesian prior models have been proposed for time-dependent ET (Kou et al., 2003) and FRET distance measurements (Schröder & Grubmüller, 2003).

Despite these exciting new developments, a general, non-parametric method that allows in-depth studies of a reactive, non-ergodic, single molecule system to relate its dynamics to its biochemical function is still lacking. In particular, such methods should allow one to accurately determine the conformational state of a single enzyme molecule with a temporal resolution that is better than its catalytic time scale (ms to s), while simultaneously addressing the problems of background photons, cross-talk between multiple data acquisition channels, and error analysis of the results obtained. The outcome of such model-free analyses will allow an experimentalist to construct a quantitative model that extracts the
dynamics underlying a complex biological macromolecule.

With the ultimate goal of developing general methods to build a quantitative dynamic structure-function relationship in biological macromolecules, in this report we first address the theoretical limits of time and distance resolution in time-resolved single-molecule measurements. We utilize principles from information theory (Cover & Thomas, 1991), specifically the Fisher information (Fisher, 1925), to quantify the knowledge that can be drawn from experimental data. As an example, consider the task of estimating an experimental parameter q from a measurable quantity λ. Note that the only restriction on the parameter q is that the value of λ must in some way be dependent on it. Otherwise q may represent any property of the experimental system, including distance, orientation and oxidation state. The distribution of experimentally measurable λ is given by the likelihood function f(λ; q): the probability, given that the value of the parameter is q, that the observable will be λ. The Fisher information about q is given by

$$ J(q) = \left\langle \left( \frac{\partial}{\partial q} \ln f(\lambda; q) \right)^{2} \right\rangle_{\lambda} , $$

where $\langle \cdots \rangle_{\lambda}$ denotes the expectation value weighted by the likelihood f(λ; q) over all possible λ. One may expect that uncertainties in measuring q are related to the Fisher information because, intuitively, q can be determined more accurately if more information about q can be obtained. This qualitative understanding can be quantitatively expressed by the Cramer-Rao-Frechet inequality (Cramer, 1946, Frechet, 1930, Rao, 1949):

$$ \mathrm{Var}\left( F(q) \right) \ge \frac{ \left( \partial \langle F(q) \rangle / \partial q \right)^{2} }{ J(q) } , $$

where $\langle F(q) \rangle$ is the expectation value of q from the estimator F. For an unbiased estimator, $\langle F(q) \rangle = q$, the Cramer-Rao-Frechet inequality then states that the variance of the best possible estimator of q is given by the inverse of its Fisher information matrix. In general, the Maximum Likelihood Estimator (MLE), which is determined by maximizing f(λ; q) as a function of q given the experimental observation λ, is a good starting point because it is asymptotically normal (Gaussian) under most conditions.
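As a minimal numerical sketch of the relationship between Fisher information and estimator variance, consider the simplest possible case: estimating a constant photon emission rate from the number of photons counted in a fixed window. The rate (3 kcps), the window (10 ms), and the function names below are illustrative assumptions and are not taken from the disclosure.

```python
import numpy as np


def poisson_fisher_information(rate, duration):
    """Fisher information about `rate` carried by a Poisson count n in `duration`.

    ln f = n ln(rate * T) - rate * T - ln n!, so d(ln f)/d(rate) = n / rate - T
    and <(n / rate - T)^2> = Var(n) / rate^2 = T / rate.
    """
    return duration / rate


def simulate_mle_variance(rate, duration, trials=20000, seed=0):
    """Empirical variance of the maximum likelihood estimate n / T."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(rate * duration, size=trials)
    estimates = counts / duration  # the MLE of a Poisson rate is n / T
    return estimates.var()


rate, duration = 3000.0, 0.01  # hypothetical: 3 kcps observed for 10 ms
bound = 1.0 / poisson_fisher_information(rate, duration)
observed = simulate_mle_variance(rate, duration)
print(f"Cramer-Rao-Frechet bound: {bound:.0f}, simulated MLE variance: {observed:.0f}")
```

For this simple case the MLE is unbiased and its simulated variance should sit close to the inverse of the Fisher information, which is the behavior the inequality describes.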
SUMMARY OF THE INVENTION
The method determines time dependent parameters of an experimental system and their uncertainty using maximum likelihood estimators and Fisher information. This allows the construction of an algorithm to calculate these parameters as a function of time. Since the algorithm is information-based, time and distance resolutions are inversely related, allowing the selection of the proper time or distance resolution for our system. The present invention describes a method to determine these parameters from a single trajectory, making it possible to account for variations in background and focus during an experiment.
Description of Figures

FIG. 1: Simulated single-molecule trajectories where the number of detected photons within a certain "bin" time (5, 10, and 30 ms) is recorded as a function of chronological time. The simulation assumes a FRET configuration in which the donor-acceptor distance follows Langevin dynamics evolving on a parabolic potential (see main text for details). Only the donor intensity is shown. The simulation also assumes a confocal optical detection scheme with which the number of detected fluorescence photons from a single molecule is recorded as a function of time. The signal level is set to 3 thousand counts per second (kcps) and the background level 0.4 kcps. The molecule undergoes an irreversible photo-chemical reaction (photo-bleaching) at around 1 s such that it no longer fluoresces.

FIG. 2: Energy transfer efficiency as a function of normalized donor (D) acceptor (A) distance x, defined by x ≡ R/R0.
FIG. 3: Observation time, in units of 1/I, required to achieve a relative measurement uncertainty δR/R0 < α = 0.1 as a function of normalized donor-quencher distance. Shown in the figure are the expected time resolutions under various signal-to-background ratios in the donor channel: $\beta_d = 2$ (–); $\beta_d = 5$ (o); $\beta_d = 20$ (Δ), which are compared to that under background-free conditions, $\beta_d \to \infty$ (*). Overlaid is a FRET efficiency curve for comparison (• • •), referenced to the right-hand-side ordinate.
FIG. 4: Observation time T, in units of $I^{-1}$, as a function of normalized donor-acceptor distance x = R/R0 to achieve a relative measurement error of σ(x) < α = 0.1 in two-channel detection. (A) Background-free scenario when ρ = 1.5 (V); ρ = 0 (–); and ρ = 0.5 (D). Background-free, single-channel detection (■) is also included for comparison. (B) Background emission is present, but no cross-talk between the donor and acceptor channels. The background levels are β = 2 (– •); β = 5 (o); and β = 20 (Δ). The background-free case is also plotted (–) for comparison. In all plots on this panel, it is assumed that the background levels are the same for both the donor and acceptor channels. (C) Both background, $\beta_d = \beta_a = 5$, and cross-talk, $\chi_d = \chi_a = 0$ (*); $\chi_d = \chi_a = 0.25$ (+); and $\chi_d = \chi_a = 0.5$ (–), are present for two-channel detection. The background- and cross-talk-free, two-channel detection curve (–) is also plotted for comparison. (D) Comparison of (1) background-free single-channel detection (■), (2) single-channel with a signal-to-background ratio of 5 (->), (3) background- and cross-talk-free two-channel detection (–), and (4) two-channel detection with signal-to-background ratio of $\beta_d = \beta_a = 5$ and cross-talk coefficient of $\chi_d = \chi_a = 0.25$ (+). On all panels A-D, FRET efficiency E as a function of x is overlaid (• • •) and referenced to the ordinate to the right.
FIG. 5: Observation time, in units of $(I^0)^{-1}$ under background-free conditions, or in units of $(I^\beta)^{-1}$ when there is background, required to achieve a relative measurement uncertainty $\delta R\,\beta_e < \alpha = 0.7$ as a function of normalized donor-quencher distance. This confidence interval corresponds to an absolute error of ~0.5 Å if the distance dependence of electron transfer $\beta_e$ is 1.4 Å$^{-1}$. f in these plots is set to 10000, corresponding to $k_e = 10^{13}$ and $k_0 = 10^{9}$. Shown in the figure are the expected time resolutions under various signal-to-background ratios: β = 2 (– •); β = 5 (o); β = 20 (Δ), which are compared to that under the background-free condition, β → ∞ (-). Overlaid is the excited state lifetime relative to the quencher-free case for comparison (• • •), referenced to the right-hand-side ordinate.
FIG. 6: Flowchart of the maximum information algorithm.
FIG. 7: Sample FRET trajectories analyzed according to the maximum information algorithm. The top half of each panel shows the simulated intensities on the donor (–) and acceptor (–) channels as a function of time. The bottom half of each panel compares the analysis of the given FRET trajectory with the simulated trace corresponding to the "true" trajectory (–). The dashed black line is the trajectory recovered by the maximum information method, and gray shaded areas outline the standard deviations calculated from the information analysis. All trajectories were generated according to Eq. 43 on the potential $V(x) = 20 (x - 0.9)^2$ at a temperature of $\Theta = 1/k_B$. Trajectories (A) and (B) were simulated with γ = 10 and γ = 1, respectively, $I_d^0 = I_a^0 = 3000$ cps and $B_d = B_a = 400$ cps, and analyzed with α = 0.07. Their intensity trajectories were calculated with 15-ms bins. (C) was simulated with γ = 0.3, $I_d^0 = I_a^0 = 10000$ cps and $B_d = B_a = 1200$ cps and analyzed with α = 0.1. Its intensity trajectory was calculated with 5-ms bins.
FIG. 8: The theoretical limit of the time resolution as a function of x, calculated using Eq. 37 with parameters determined by the simulation parameters for the trajectory in FIG. 7(B) (–), is compared with the δt values from the maximum information algorithm (D).
FIG. 9: (A,B,C) Mean error and (D,E,F) root mean square error of the acceptor-donor distance as a function of triplet lifetime (Eq. 44). Only donor triplet states are allowed in panels A and D, only acceptor triplet states in panels B and E, and both acceptor and donor triplet states are allowed in panels C and F. Trajectories were simulated at constant x values of 0.8 (D), 1.0 (<), and 1.2 (©) with effective $\phi_{isc} = 1 \times 10^{-2}$. The trajectories were analyzed by the maximum information algorithm with α = 0.05. Under typical experimental conditions the triplet lifetime τ will not exceed 1-2 ms. At these lifetimes, there is no significant effect on the accuracy of the algorithm.
FIG. 10: Bias in estimators of x based on simulations of (A) one channel FRET and (B) two channel FRET at distances of 0.8 (–), 1.0 (–), and 1.2 (- • -); and Electron Transfer at distances of (C) x = 9 (–), and 12 (–) and (D) x = 6 (–). Wedges are placed to indicate the values of the information (J) which satisfy the cutoff values of α = 0.1 and α = 0.07, as discussed in the main text.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
By "fluorophore" it is meant any molecule or molecule moiety or atomic species (such as Eu ions, Tb ions, and the other lanthanide series ions) capable of electron transfer, energy transfer, fluorescence lifetime or fluorescence spectrum. By "light" it is meant light of any wavelength. Preferred for this invention is light having a high quantum yield and high absorbance for the fluorophore in use.
By "providing a composition capable of emitting photons" it is meant creating reaction conditions such that a composition or compound capable of emitting photons is created, or simply supplying a ready made composition. The invention contemplates that any method or act whereby a composition capable of emitting photons is placed in proximity to the light source capable of being illuminated, constitutes the "providing".
By "composition" it is meant the term normally associated with the term. It is noted that the invention contemplates that the "composition" may be a molecule linked to any fluorophore, or a multitude of fluorophores.
By "data set" it is meant counted photons as a function of time.
By "providing a data set" it is meant the step of either collecting the data or accessing data that is already present. The present invention contemplates that a "data set" having the required information to run the algorithm of the present invention has been obtained, and in one embodiment the next step would be the accessing of the data set by the algorithm of the present invention.
By "estimator" it is meant an equation providing the value for the molecular information of interest that is statistically the most likely to be correct.
By "providing an estimator" it is meant the act of creating and/or using an equation found herein and/or looking up the equation for the estimator in the literature and/or any other method that one having ordinary skill in the art would recognize to result in an equation for the estimator.
By "providing an equation for the standard deviation for the estimator" it is meant the act of creating and/or using an equation found herein and/or looking up the equation for the standard deviation in the literature and/or any other method that one having ordinary skill in the art would recognize to result in an equation for the standard deviation.
By "light source" it is meant any wavelength of light from any source and any speed depending on the fluorophore used. Lasers as well as incandescent lamps are contemplated, as non-limiting examples. The light source must be capable of electronically exciting the fluorophore.
By "composition" it is meant any media containing a molecule of interest.
By "coupled" it is meant covalently bonded or attached by other forces; i.e. any bonding or attraction force is included in this definition.
By "desired standard deviation" it is meant the standard deviation that is of interest to the user or the standard deviation that gives a time resolution that is of interest to the user.
By "desired number of times" it is meant that the method should be applied to different time intervals within the provided dataset until no more time intervals are of interest to the user.

Reference will now be made in detail to some specific embodiments of the invention. Examples of these specific embodiments are illustrated in the accompanying drawings. While the invention is described in conjunction with these specific embodiments, it will be understood that it is not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may be practiced without some or all of these specific details. In other instances, well known process operations have not been described in detail in order not to unnecessarily obscure the present invention.

In this specification and the appended claims, the singular forms "a," "an," and "the" include plural reference unless the context clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood to one of ordinary skill in the art to which this invention belongs. All references cited herein are expressly incorporated by reference in their entirety for all purposes.

It is understood that in a preferred embodiment of the present invention the following equation is used for operating at one channel or more:
It is understood that in a more preferred embodiment of the present invention the following equation is used if operating two channels or more:
Based on these information-theoretical considerations, we derive the basic equations that determine the best achievable time resolution in a single molecule fluorescence experiment. In this context, the measurable quantity λ will be the arrival times of individual photons. The general expression is given in Eq. 3, and special cases are listed in Eqs. 22-25 for FRET, and Eq. 40 for ET. These equations allow us to propose a data reduction algorithm to extract, photon by photon, the maximum amount of information in distance measurements as a function of time. We then use computer simulations to show that the algorithm extracts distance and time information at resolutions that achieve the theoretical limit. Potential complications in applying this new method, such as fluorophore blinking and bias, are also considered. We note that the concept and approach presented here are general and their applications are not limited to the examples to be discussed.
THEORY
The experimentalist conducting a time-resolved single-molecule fluorescence measurement wishes to measure some parameter q as a function of time. In general, q is a dynamic variable that changes with time as the molecular conformation undergoes thermal fluctuations. If fluctuations in q cause corresponding fluctuations in the emitted fluorescence intensity of the molecule, then the dynamics of q can in principle be followed in real time by recording the arrival times of the emitted photons. If the photons can be meaningfully separated based on wavelength, polarization, or some other property, they may be detected and analyzed on separate channels.
For example, an observer may measure the donor-acceptor distance of a single molecule to be 4 nm with a time resolution of 100 μs but with a 68% confidence interval σ(x(t)) = 10 nm. This datum, although measured at a very high time resolution, is not very meaningful; the 10 nm uncertainty is most likely greater than the size of the molecule. Some averaging will thus be required before a meaningful value can be obtained:

$$ \bar{q}(t) = \frac{1}{T} \int_{t}^{t+T} q(\tau)\, d\tau . \tag{1} $$

The time interval T is chosen so that the uncertainty associated with this measurement, $\sigma(\bar{q})$, is less than some pre-defined value. Further time averaging, thus reducing $\sigma(\bar{q}(t))$, will improve the accuracy in q, but at the expense of time resolution. The following discussions are based on a coarse-grained picture in which the parameter q is assumed constant during the time required to reduce the standard deviation below a certain threshold. The rationale behind this assumption is that an observer has no a priori knowledge of the true value q(t) until an accurate measurement can be made. To determine the proper averaging time T, the Fisher information matrix is calculated and then inverted to find the covariance matrix for the parameters of interest. This gives us the Cramer-Rao-Frechet bound for the variance of an estimator. The MLE, which approaches the Cramer-Rao-Frechet bound, is then constructed.

Fisher Information
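Suppose the variables of interest, $\mathbf{q} = \{q_i\}$, are being measured on independent channels. Typically, $\mathbf{q}$ are chosen such that they are relevant as an indicator of the molecular state on the single-molecule level, for example, the FRET efficiency or the distance between a fluorescent donor and acceptor. Since the exact arrival times of the photons on these channels will be uncorrelated from one another, the Fisher information of these independent channels will be additive. The Fisher information can be computed for each of these channels individually.

The observed intensity at a detector can be written as $I(\mathbf{q})$. The intensity on the channel is generally measured relative to some constant reference intensity $I^0$. We can write $I(\mathbf{q})$ as $I^0 \zeta(\mathbf{q})$, with the dimensionless scaling factor $\zeta(\mathbf{q})$ containing all of the q-dependence of the detected intensity. The probability density function for observing n photons at the detector for this channel in time T is Poisson:

$$ f(n; \mathbf{q}, T) = \frac{\left[ I(\mathbf{q}) T \right]^{n} e^{-I(\mathbf{q}) T}}{n!} = \frac{\left[ I^0 \zeta(\mathbf{q}) T \right]^{n} e^{-I^0 \zeta(\mathbf{q}) T}}{n!} . \tag{2} $$

Based on this probability density, the Fisher information matrix elements for a single channel are

$$ J_{ij} = \left\langle \frac{\partial \ln f}{\partial q_i} \frac{\partial \ln f}{\partial q_j} \right\rangle_{n} = T I^0 \frac{1}{\zeta(\mathbf{q})} \frac{\partial \zeta(\mathbf{q})}{\partial q_i} \frac{\partial \zeta(\mathbf{q})}{\partial q_j} . \tag{3} $$

This form makes it clear that information is acquired at a rate proportional to $I^0 / \zeta(\mathbf{q})$ through the course of the measurement. The information from independent channels may be combined. The total information matrix is just the sum of the information matrices for each of the m channels:

$$ J_{ij} = \sum_{k=1}^{m} T I_k^0 \frac{1}{\zeta_k(\mathbf{q})} \frac{\partial \zeta_k(\mathbf{q})}{\partial q_i} \frac{\partial \zeta_k(\mathbf{q})}{\partial q_j} . \tag{4} $$

Adding the effects of a detected background intensity B, the total detected intensity is

$$ I(\mathbf{q}) = I^0 \zeta(\mathbf{q}) + B \tag{5} $$

$$ \phantom{I(\mathbf{q})} = I^{\beta} \left[ \left( 1 - \beta^{-1} \right) \zeta(\mathbf{q}) + \beta^{-1} \right] , \tag{6} $$

where the signal to background ratio $(I^0 + B)/B$ has been written as β and the maximum observed intensity in the presence of background is $I^{\beta} = I^0 / (1 - \beta^{-1})$. Finally, then, the total m-channel information is

$$ J_{ij} = \sum_{k=1}^{m} T I_k^{\beta} \frac{\left( 1 - \beta_k^{-1} \right)^{2}}{\left( 1 - \beta_k^{-1} \right) \zeta_k(\mathbf{q}) + \beta_k^{-1}} \frac{\partial \zeta_k(\mathbf{q})}{\partial q_i} \frac{\partial \zeta_k(\mathbf{q})}{\partial q_j} . \tag{7} $$

The inclusion of background photons on a detection channel thus degrades the information that can be collected.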
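The degradation caused by background can be illustrated with a short sketch for a single parameter q. It assumes only that each channel is Poisson with detected intensity I(q) = I°ζ(q) + B, in which case the information per channel is T(dI/dq)²/I(q). The function names and the numerical values (a 3 kcps signal, 0.4 kcps background, ζ = 0.5, dζ/dq = 1.5, 10 ms window) are hypothetical and chosen only for illustration.

```python
def channel_information(signal, background, zeta, dzeta_dq, duration):
    """Fisher information about one parameter q from one Poisson channel.

    Assumes a detected intensity I(q) = signal * zeta(q) + background,
    for which J = T * (dI/dq)^2 / I(q).
    """
    intensity = signal * zeta + background
    slope = signal * dzeta_dq
    return duration * slope ** 2 / intensity


def total_information(channels, duration):
    """Sum the information of independent channels (a list of keyword dicts)."""
    return sum(channel_information(duration=duration, **ch) for ch in channels)


# Same hypothetical channel, without and with background photons.
clean = channel_information(3000.0, 0.0, 0.5, 1.5, 0.01)
noisy = channel_information(3000.0, 400.0, 0.5, 1.5, 0.01)

# Two independent channels with identical (hypothetical) settings.
pair = [dict(signal=3000.0, background=400.0, zeta=0.5, dzeta_dq=1.5)] * 2
combined = total_information(pair, 0.01)
print(clean, noisy, combined)
```

The background leaves the slope dI/dq unchanged but raises the Poisson variance of the count, so the information collected in the same window is smaller.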
Covariance Matrix
The Cramer-Rao-Frechet bound states that the covariance ($\sigma_{ij}$) between the estimated parameters $q_i$ and $q_j$ is bounded by the inverse of the information matrix,

$$ \sigma_{ij} \ge \sum_{k,l} \frac{\partial \langle F_i \rangle}{\partial q_k} \left[ \mathbf{J}^{-1} \right]_{kl} \frac{\partial \langle F_j \rangle}{\partial q_l} \approx \left[ \mathbf{J}^{-1} \right]_{ij} , \tag{8} $$

where the approximation is true when the bias of the estimator F(q) approaches 0. The bias of an estimator depends on the probability density function of the parameter to be measured. The estimators proposed in this work can be verified to meet the consistency condition $\lim_{T \to \infty} F(q) \to q$ (Schervish, 1997). Therefore, the subsequent derivations will assume the use of unbiased estimators and bias in the short-time limit will be discussed case by case in the Appendix. Given p parameters to estimate, the information matrix will be of order p. The form of the matrix given in Eq. 7 makes it clear that p channels are required to form an invertible matrix.
If multiple parameters are to be estimated simultaneously, the entire information matrix must be inverted to find the variances of the individual parameters and their covariances, so at least p independent sources of information are required to estimate p different parameters.
Concentrating on the estimation of one variable, $q_i$, the variance of that measurement is simply $(J_{ii})^{-1}$. If the variable q is to be measured to a relative accuracy of α = δq/q, the requirement is σ(q) < α. The best possible time resolution will be

$$ T = \frac{1}{\alpha^{2}} \left[ \sum_{k=1}^{m} I_k^{\beta} \frac{\left( 1 - \beta_k^{-1} \right)^{2}}{\left( 1 - \beta_k^{-1} \right) \zeta_k(q) + \beta_k^{-1}} \left( \frac{\partial \zeta_k(q)}{\partial q} \right)^{2} \right]^{-1} . \tag{9} $$
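The requirement of at least p independent channels for p parameters can be checked numerically. The sketch below builds the information matrix of Poisson channels as the sum over channels of T ∇I ∇Iᵀ / I, inspects its rank, and inverts it when possible. It also includes the single-parameter relation between the information rate and the observation time needed for σ(q) < α. All intensities, gradients, and function names are hypothetical values chosen for illustration.

```python
import numpy as np


def information_matrix(intensities, gradients, duration):
    """Fisher information matrix for independent Poisson channels.

    intensities[k] is I_k(q); gradients[k] is the vector dI_k/dq_i.
    Each channel contributes T * outer(grad, grad) / I, a rank-one matrix.
    """
    gradients = np.asarray(gradients, dtype=float)
    size = gradients.shape[1]
    matrix = np.zeros((size, size))
    for intensity, gradient in zip(intensities, gradients):
        matrix += duration * np.outer(gradient, gradient) / intensity
    return matrix


def observation_time(information_rate, alpha):
    """Shortest T with sigma(q) < alpha, given information per unit time."""
    return 1.0 / (alpha ** 2 * information_rate)


# Two parameters observed on one channel: the matrix is singular.
one_channel = information_matrix([1500.0], [[1000.0, 200.0]], 0.01)

# Two parameters observed on two channels: the matrix can be inverted.
two_channels = information_matrix(
    [1500.0, 2000.0], [[1000.0, 200.0], [-500.0, 800.0]], 0.01
)
covariance_bound = np.linalg.inv(two_channels)

print(np.linalg.matrix_rank(one_channel), np.linalg.matrix_rank(two_channels))
print(observation_time(2.0e4, 0.1), observation_time(2.0e4, 0.05))
```

Because the variance bound scales as 1/T, halving the tolerated uncertainty α quadruples the required observation time, which is the inverse relation between time and distance resolution exploited by the method.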
Estimators
Again concentrating on the estimation of one variable, given expressions for $\zeta_1(q)$ through $\zeta_m(q)$ the total probability density for observing $n_1 \ldots n_m$ photons on channels $1 \ldots m$ is

$$ f(n_1, \ldots, n_m; q, T) = \prod_{k=1}^{m} \frac{\left[ I_k(q) T \right]^{n_k} e^{-I_k(q) T}}{n_k!} . \tag{10} $$

The maximum likelihood estimator is the value of q for the observed T and $n_1, \ldots, n_m$ that maximizes $f(n_1, \ldots, n_m; q, T)$. This value of q is given by

$$ \frac{\partial}{\partial q} \ln f(n_1, \ldots, n_m; q, T) = \sum_{k=1}^{m} \left[ \frac{n_k}{I_k(q)} - T \right] \frac{\partial I_k(q)}{\partial q} = 0 . \tag{11} $$

The solution to this equation gives the maximum likelihood estimator in terms of T and $n_1, \ldots, n_m$.

From another point of view, each photon can be regarded as an instantaneous measurement of the state of the system under observation. Then a photon will be detected on channel k with probability

$$ p_k = \frac{I_k(q)}{\sum_{j=1}^{m} I_j(q)} . \tag{12} $$

The probability distribution for observing $n_k$ photons on channel k, with N being the total number of photons, is

$$ f(n_1, \ldots, n_m; q) = N! \prod_{k=1}^{m} \frac{p_k^{n_k}}{n_k!} . \tag{13} $$

The maximum likelihood estimator for q will thus be given by the solution to this equation:

$$ \sum_{k=1}^{m} \frac{n_k}{p_k} \frac{\partial p_k}{\partial q} = 0 . \tag{14} $$

The Poisson (Eq. 11) and the multinomial (Eq. 14) approaches are equivalent since the former can be derived as a limiting case of the latter. The only differences are practical. First, the multinomial approach cannot be used with a single channel measurement. Second, the multinomial approach generally yields simpler maximum likelihood estimators for multiple channel measurements. Having derived the basic information theoretical expressions for estimating parameters from single photon counting single-molecule measurements, it is of interest to apply these general formulae to some special cases.
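The agreement between the Poisson and multinomial estimators can be checked numerically on a small example. The sketch below assumes a hypothetical two-channel model in which one channel falls and the other rises linearly with q, so that the total intensity does not depend on q and the two log-likelihoods differ only by a constant. The model, the photon counts, and the brute-force grid search are illustrative assumptions rather than the estimators of the invention.

```python
import numpy as np


def intensities(q, signal=3000.0, background=400.0):
    """Hypothetical two-channel model: channel 1 falls and channel 2 rises with q."""
    return np.array([signal * (1.0 - q) + background,
                     signal * q + background])


def poisson_log_likelihood(q, counts, duration):
    """Log of the product of Poisson densities, dropping the q-independent ln n!."""
    rates = intensities(q) * duration
    return float(np.sum(counts * np.log(rates) - rates))


def multinomial_log_likelihood(q, counts):
    """Log of the multinomial density, dropping the q-independent prefactor."""
    rates = intensities(q)
    probabilities = rates / rates.sum()
    return float(np.sum(counts * np.log(probabilities)))


def grid_mle(log_likelihood, grid):
    """Return the grid point with the largest log-likelihood."""
    values = [log_likelihood(q) for q in grid]
    return float(grid[int(np.argmax(values))])


counts = np.array([18, 27])   # hypothetical photons on channels 1 and 2
duration = 0.01               # hypothetical 10 ms window
grid = np.linspace(0.0, 1.0, 10001)

q_poisson = grid_mle(lambda q: poisson_log_likelihood(q, counts, duration), grid)
q_multinomial = grid_mle(lambda q: multinomial_log_likelihood(q, counts), grid)
print(q_poisson, q_multinomial)
```

In this model the multinomial condition reduces to matching the observed fraction of photons on each channel, which is why its closed-form solutions tend to be simpler than the Poisson ones.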
FÖRSTER RESONANCE ENERGY TRANSFER
A variety of measures have been proposed and implemented to quantify the extent to which resonance energy transfer occurs from an energy donor to an acceptor (Berney & Danuser, 2003). Here, the energy transfer efficiency, E, is used because it has been widely adopted in single-molecule experiments (Ha, 2001). It is defined as the fraction of photon energy absorbed by the donor that is transferred to the acceptor. In cases where the acceptor is a non-fluorescent quencher, the transfer efficiency is measured by the donor fluorescence intensity alone (single-channel detection) and is expressed as

$$ E = 1 - \frac{I_d(x)}{I_d^0} , \tag{15} $$

where $I_d(x)$ is the detected background-free donor intensity and $I_d^0$ is the detected background-free donor intensity in the absence of the quencher. $I_d^0$ can be measured in a separate control experiment. For simultaneous detection of donor and acceptor fluorescence, the transfer efficiency is given by:

$$ E = \frac{1}{1 + \rho\, I_d(x) / I_a(x)} . \tag{16} $$

$I_a$ and $I_d$ are the detected background-free fluorescence intensities (number of photons per second) of the acceptor and donor channels, respectively. $\rho \equiv \phi_a \eta_a / \phi_d \eta_d$, to be determined experimentally, is a scaling factor that corrects for differences in fluorescence quantum yields of the donor ($\phi_d$) and acceptor ($\phi_a$) probes, as well as those in detection efficiency ($\eta_d$ for the donor channel and $\eta_a$ for the acceptor channel). Within the framework of orientation-randomized dipole-dipole coupling between the donor and acceptor probes (Förster, 1949), the energy transfer efficiency can be related to the distance between the donor and acceptor probes:

$$ E = \frac{1}{1 + (R/R_0)^{6}} = \frac{1}{1 + x^{6}} , \tag{17} $$

where x is the normalized donor-acceptor distance $x \equiv R/R_0$, R is the center-to-center distance of the donor and acceptor probes, and $R_0$ is the Förster radius, the distance at which energy transfer efficiency is 0.5. For a given donor-acceptor pair, the corresponding Förster radius can be calculated from the donor fluorescence and acceptor absorption spectra, and the orientation factor can be calculated from fluorescence anisotropy measurements (Yasuda et al, 2003). Alternatively, one may construct a series of poly-peptides of different length to calibrate the effective $R_0$ for tethered, gyrating fluorescent probes (Schuler et al., 2002). Here, it is assumed that both fluorescent probes gyrate around the tethered point on a time scale much shorter than the achievable experimental time resolution T, which can be verified experimentally. Information on slow orientation-dependent dynamics can be acquired by considering additional polarization-dependent channels.
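The relations between efficiency, normalized distance, and background-free channel intensities can be written compactly in code. The following is a minimal sketch; the function names and the example intensities (3000 and 6000 cps) are illustrative assumptions.

```python
def fret_efficiency(x):
    """Energy transfer efficiency at normalized distance x = R / R0."""
    return 1.0 / (1.0 + x ** 6)


def normalized_distance(efficiency):
    """Invert E = 1 / (1 + x^6) for x."""
    return (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


def efficiency_from_intensities(donor, acceptor, rho=1.0):
    """Two-channel efficiency from background-free intensities.

    rho is the correction factor for quantum yield and detection efficiency.
    """
    return 1.0 / (1.0 + rho * donor / acceptor)


# Efficiency falls from ~0.79 to ~0.25 between x = 0.8 and x = 1.2.
for x in (0.8, 1.0, 1.2):
    print(x, round(fret_efficiency(x), 3))

# Hypothetical reference intensities: donor 3000 cps, acceptor 6000 cps (rho = 2).
true_efficiency = fret_efficiency(0.8)
donor = 3000.0 * (1.0 - true_efficiency)
acceptor = 6000.0 * true_efficiency
recovered = efficiency_from_intensities(donor, acceptor, rho=2.0)
```

The steep sixth-power dependence means the efficiency is most sensitive to distance near x = 1 and nearly flat far from it, which is why the achievable resolution depends strongly on x.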
Fisher Information and Maximum Likelihood Estimators
A FRET measurement consists of observation of fluorescence from a donor fluorophore and/or from an acceptor chromophore. The E- and x-dependence of the intensities on these channels is
$$\zeta_d(E) = 1 - E, \qquad (18)$$
$$\zeta_a(E) = E, \qquad (19)$$
$$\zeta_d(x) = \frac{x^6}{1 + x^6}, \qquad (20)$$
$$\zeta_a(x) = \frac{1}{1 + x^6}. \qquad (21)$$
Note that with these definitions, $I_d(x) = I_d^0\,\zeta_d(x)$, $I_a(x) = I_a^0\,\zeta_a(x)$, and $\rho = I_a^0/I_d^0$. Using Eq. 7 to calculate the information, one has
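$$J_i(E) = I_i^\beta T\,\frac{(1-\beta_i^{-1})^2}{(1-\beta_i^{-1})\,\zeta_i(E) + \beta_i^{-1}}, \qquad J_i(x) = I_i^\beta T\,\frac{(1-\beta_i^{-1})^2}{(1-\beta_i^{-1})\,\zeta_i(x) + \beta_i^{-1}}\,\frac{36\,x^{10}}{(1+x^6)^4}, \qquad i \in \{d, a\}. \qquad (22)\text{-}(25)$$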
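For one-channel measurements, the only possible MLE is that given by the Poisson distribution, Eq. 11:
$$\hat{E} = \frac{1 - n_d/(I_d^\beta T)}{1 - \beta_d^{-1}}, \qquad (26)$$
$$\hat{x} = \left[\frac{n_d - I_d^\beta T\,\beta_d^{-1}}{I_d^\beta T - n_d}\right]^{1/6}. \qquad (27)$$
For two-channel measurements, the multinomial estimator given by Eq. 14 is used. The equations for the maximum likelihood estimators are
$$\hat{E} = \frac{I_d^\beta n_a - I_a^\beta n_d\,\beta_a^{-1}}{I_d^\beta n_a\,(1-\beta_d^{-1}) + I_a^\beta n_d\,(1-\beta_a^{-1})}, \qquad (28)$$
$$\hat{x} = \left[\frac{I_a^\beta n_d - I_d^\beta n_a\,\beta_d^{-1}}{I_d^\beta n_a - I_a^\beta n_d\,\beta_a^{-1}}\right]^{1/6}. \qquad (29)$$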
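As a hedged illustration of how such estimators can be evaluated, the sketch below solves for the efficiency and distance from photon counts. It assumes (this is an assumption of the sketch, not a statement of the exact published formulas) that each channel's detected intensity has the form $I^\beta[(1-\beta^{-1})\zeta + \beta^{-1}]$, with $I^\beta$ the maximum detected intensity and β the signal-to-background ratio, that the one-channel estimator matches the expected Poisson count, and that the two-channel estimator matches the ratio of counts. All function and parameter names are hypothetical.

```python
def mle_one_channel(n_d: float, t: float, i_d: float, beta_d: float):
    """Donor-only estimate from n_d photons collected in time t.

    Assumes the expected count is i_d * t * ((1 - 1/beta_d) * (1 - E) + 1/beta_d).
    Returns (E_hat, x_hat); x_hat is None when E_hat falls outside (0, 1).
    """
    b = 1.0 / beta_d
    e_hat = (1.0 - n_d / (i_d * t)) / (1.0 - b)
    x_hat = (1.0 / e_hat - 1.0) ** (1.0 / 6.0) if 0.0 < e_hat < 1.0 else None
    return e_hat, x_hat


def mle_two_channel(n_d: float, n_a: float, i_d: float, i_a: float,
                    beta_d: float, beta_a: float):
    """Two-channel estimate from donor and acceptor counts.

    Assumes n_d / n_a equals the ratio of the model intensities on the two
    channels, so the observation time cancels out of the estimate.
    """
    bd, ba = 1.0 / beta_d, 1.0 / beta_a
    numerator = n_a * i_d - n_d * i_a * ba
    denominator = n_a * i_d * (1.0 - bd) + n_d * i_a * (1.0 - ba)
    e_hat = numerator / denominator
    x_hat = (1.0 / e_hat - 1.0) ** (1.0 / 6.0) if 0.0 < e_hat < 1.0 else None
    return e_hat, x_hat


if __name__ == "__main__":
    # Background-free limit with equal channel intensities: E_hat = n_a / (n_a + n_d).
    print(mle_two_channel(n_d=30, n_a=70, i_d=1.0, i_a=1.0,
                          beta_d=float("inf"), beta_a=float("inf")))
```

Estimates that fall outside the physical range (negative, infinite, or imaginary distances) are reported here as `None` so that the caller can decide whether to discard the bin or lengthen it.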
Cross-Talk and Cross-Excitation between Donor and Acceptor Channels
Due to spectral overlap and other experimental considerations, there is often cross-talk between the donor and acceptor channels. Also, if the absorbance spectrum of the acceptor overlaps with that of the donor, the acceptor may be excited directly. As will be shown below, cross-talk and cross-excitation simply change the effective signal-to-background ratio. As such, Eqs. 22-29 still apply. Cross-talk coefficients from the donor and acceptor channels are denoted $\chi_d$ and $\chi_a$, respectively. The cross-excitation coefficient is denoted $\chi_x$. All three of these coefficients may be measured experimentally. For instance, $\chi_d$ can be measured by recording the acceptor channel intensity at different excitation power levels for donor probes, whereas $\chi_a$ can be measured by recording the donor channel intensity at different excitation power levels for a control system where donor and acceptor probes are in juxtaposition so that E → 1. $\chi_x$ may be measured by recording the acceptor channel intensity in the absence of the donor. With these notations, the distance-dependent photon intensities become
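$$I_d^{\beta\chi}(x) = I_d^0\,\zeta_d(x) + \chi_a I_a^0\,\zeta_a(x) + B_d, \qquad (30)$$
$$I_a^{\beta\chi}(x) = I_a^0\,\zeta_a(x) + \chi_d I_d^0\,\zeta_d(x) + \chi_x I_d^0 + B_a. \qquad (31)$$
Expanding the intensity terms, one has
$$I_d^{\beta\chi}(x) = I_d^\beta\left[\left(1 - \frac{1}{\beta_d^\chi}\right)\zeta_d(x) + \frac{1}{\beta_d^\chi}\right], \qquad (32)$$
$$I_a^{\beta\chi}(x) = I_a^\beta\left[\left(1 - \frac{1}{\beta_a^\chi}\right)\zeta_a(x) + \frac{1}{\beta_a^\chi}\right], \qquad (33)$$
where $\beta_d^\chi$ and $\beta_a^\chi$ are the effective signal-to-background ratios corresponding to the effective backgrounds
$$B_d^\chi = B_d + \chi_a I_a^0, \qquad (34)$$
$$B_a^\chi = B_a + (\chi_x + \chi_d)\,I_d^0. \qquad (35)$$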
The crests and troughs in one channel correspond with the valleys and peaks in the other, so the only effect of the cross-talk is to decrease the apparent signal-to-background ratio. The expressions given for $J(x)$ and $\hat{x}$ in the previous section still hold, using the new $\beta_d^\chi$ and $\beta_a^\chi$ in place of $\beta_d$ and $\beta_a$, respectively.
Distance and Time Resolution
One Channel
When measuring FRET efficiency on only the donor channel, the total information is
$$J(x) = I_d^\beta T\,\frac{36\,x^{10}\,(1-\beta_d^{-1})^2}{(1+x^6)^3\,(x^6 + \beta_d^{-1})}. \qquad (36)$$
This suggests that time resolution T is a function of donor-quencher distance. Fig. 3 shows the theoretical minimum observation time period (T, in units of $1/I_d$) required to achieve a relative measurement error $\delta R/R_0$ less than a preset value α = 0.1. Given α, the theoretically achievable time resolution at various donor-quencher distances can be found under the ideal condition that there are no background photon counts (the background-free curve in Fig. 3). The time resolution worsens sharply at both large and small x. This is not surprising; the energy transfer efficiency E does not vary much with x for donor-quencher distances that are significantly larger or smaller than $R_0$ (cf. the overlaid FRET efficiency curve in Fig. 3). Consequently, it will take a large number of photons to measure x at these distances to within this error tolerance. At distances closer to $R_0$ the efficiency is very sensitive to changes in x, so fewer photons are required to obtain the desired tolerance. In practice, one cannot avoid recording background photons. In these cases, the curve remains U-shaped, but is shifted to longer observation times. This is because the information of x is degraded by a factor of $x^6(1-\beta^{-1})/(x^6+\beta^{-1}) < 1$ in the presence of background photons. These equations can be used to understand the time and distance resolution limits of a single-molecule experiment. For example, for a single molecule labeled with a donor-quencher pair that exhibits a Förster radius of 50 Å and whose fluorescence can be measured with a signal-to-background ratio of 10, the highest time resolution achievable for measuring this donor-quencher distance is achieved at R ≈ 43 Å, with $I_d^\beta T = 0.22/\alpha^2$ photons required to achieve the desired accuracy of α. To measure R within a standard deviation of 5 Å (α = 0.1), then, one must collect 22 photons.
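The worked example above (Förster radius 50 Å, signal-to-background ratio 10) can be checked numerically. The sketch below assumes the single-channel information per unit of $I_d T$ takes the form $36x^{10}(1-\beta^{-1})^2/[(1+x^6)^3(x^6+\beta^{-1})]$, which follows from a donor intensity of the form $I_d[(1-\beta^{-1})\zeta_d(x)+\beta^{-1}]$ with Poisson counting noise; the function names and the grid search are illustrative choices only.

```python
def one_channel_info_per_count(x: float, beta: float) -> float:
    """J(x) / (I_d * T) for donor-only detection with signal-to-background ratio beta."""
    b = 1.0 / beta
    return 36.0 * x ** 10 * (1.0 - b) ** 2 / ((1.0 + x ** 6) ** 3 * (x ** 6 + b))


def counts_required(x: float, beta: float, alpha: float) -> float:
    """Smallest I_d * T for which the Cramer-Rao standard deviation J**-0.5 is <= alpha."""
    return 1.0 / (alpha ** 2 * one_channel_info_per_count(x, beta))


def best_distance(beta: float, lo: float = 0.3, hi: float = 2.0, steps: int = 1700) -> float:
    """Grid search for the normalized distance that maximizes the information per count."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda x: one_channel_info_per_count(x, beta))


if __name__ == "__main__":
    x_opt = best_distance(beta=10.0)
    print(f"optimal x = {x_opt:.2f}  (R = {50.0 * x_opt:.0f} A for R0 = 50 A)")
    print(f"I_d*T needed for alpha = 0.1: {counts_required(x_opt, 10.0, 0.1):.1f}")
```

Under these assumptions the optimum falls near x ≈ 0.86 and the required $I_d T$ is close to $0.22/\alpha^2$, in line with the figures quoted in the text.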
Two Channels
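Time resolution in the two-channel detection scheme is also a function of normalized donor-acceptor distance. The total information for this scheme is
$$J(x) = \frac{36\,x^{10}\,T}{(1+x^6)^3}\left[\frac{I_d^\beta\,(1-\beta_d^{-1})^2}{x^6 + \beta_d^{-1}} + \frac{I_a^\beta\,(1-\beta_a^{-1})^2}{1 + \beta_a^{-1}x^6}\right]. \qquad (37)$$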
A comparison of background-free single-channel and two-channel detection schemes is displayed in panel (A) of Fig. 4. Although both detection schemes behave similarly at short distances, the two-channel detection scheme clearly delivers better performance. This is expected since more information about x is gathered with two channels. Furthermore, better time resolution can be achieved for larger x in cases where ρ > 1, compared to the ρ = 1 case where the emission/detection efficiencies are the same for both donor and acceptor channels. This is because in the ρ > 1 cases, the acceptor probe emits more photons than it would have if ρ = 1, giving more information about x. Information degradation due to various degrees of background and cross-talk in the two-channel detection scenario is depicted in panels (B) and (C) of Fig. 4. Panel (D) of Fig. 4 illustrates a more realistic situation in which the signal-to-background ratio is 5. The performance of the two-channel scheme is generally better than that of the single-channel scheme for large x. Using the same example as in the last section, for a single molecule labeled with a donor-quencher pair that exhibits a Förster radius of 50 Å and whose fluorescence can be measured with a signal-to-background ratio of 10, the highest time resolution achievable for measuring this donor-quencher distance is achieved at R ≈ 43 Å, with $I_d^\beta T = 0.15/\alpha^2$ photons required to achieve the desired accuracy of α. To measure R within a standard deviation of 5 Å, then, one must collect a total of 15 photons.
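The two-channel figure quoted above can be reproduced with a short calculation. The sketch assumes that each channel contributes $I^\beta T(1-\beta^{-1})^2(\partial\zeta/\partial x)^2/[(1-\beta^{-1})\zeta+\beta^{-1}]$ to the information and that the two contributions add; `ratio` (the acceptor-to-donor ratio of maximum detected intensities) and the function name are hypothetical.

```python
def two_channel_info_per_count(x: float, beta_d: float, beta_a: float,
                               ratio: float = 1.0) -> float:
    """J(x) / (I_d * T) for simultaneous donor and acceptor detection.

    ratio is the acceptor-to-donor ratio of maximum detected intensities
    (an assumed parameterization for this sketch).
    """
    bd, ba = 1.0 / beta_d, 1.0 / beta_a
    common = 36.0 * x ** 10 / (1.0 + x ** 6) ** 3
    donor = (1.0 - bd) ** 2 / (x ** 6 + bd)
    acceptor = ratio * (1.0 - ba) ** 2 / (1.0 + ba * x ** 6)
    return common * (donor + acceptor)


if __name__ == "__main__":
    alpha = 0.1
    info = two_channel_info_per_count(0.86, beta_d=10.0, beta_a=10.0)
    print(f"I_d*T needed at x = 0.86: {1.0 / (alpha ** 2 * info):.1f}")
    # A brighter acceptor channel (ratio > 1) adds information at large x.
    for ratio in (1.0, 2.0):
        print(ratio, two_channel_info_per_count(1.5, 10.0, 10.0, ratio))
```

With equal intensities and a signal-to-background ratio of 10 on both channels, the required $I_d T$ at x = 0.86 comes out near $0.15/\alpha^2$, consistent with the example in the text.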
ELECTRON TRANSFER
Recently, excited-state electron transfer (ET) has been used as a probe for investigation of conformational changes in individual molecules (Eggeling et al., 1998a; Jia et al., 1997; Sauer et al., 1998). In most cases, the emission intensity or fluorescence lifetime of the probe is quenched via electron transfer to or from a nearby quencher. Due to the exponential distance dependence, ET can also be used as a spectroscopic ruler to measure distances on the Å scale under such conditions that chromophore-quencher distance variation is the sole source of changes in ET rate. These conditions include, for instance, barrierless excited-state ET so that thermal fluctuation in the relative free-energy levels ΔΔG is negligible, rapidly randomized or fixed relative orientation of chromophore and quencher so that the time scale of protein conformational motions is separated from that of probe rotation, and facile electron back transfer so that repetitive excitation of a single molecule is achievable. Therefore, ET allows investigation of minute changes of biomolecular conformation (Yang et al., 2003) and serves as a complementary method to FRET which, as discussed in earlier sections, is sensitive to distance changes on the 20-80 Å scale. In the following discussion, we assume that ET is primarily dominated by chromophore-quencher distance. That is, the quenching rate $k_q$ is
$$k_q = k_e e^{-\beta_e R_e} = k_e e^{-x_e}, \qquad (38)$$
where $k_e$ is the ET rate when chromophore and quencher are in van der Waals contact, $\beta_e$ is the distance parameter in ET and varies from 1.0 to 1.4 Å$^{-1}$ for proteins (Gray & Winkler, 1996; Moser et al., 1992), $R_e$ is the edge-to-edge distance between chromophore and quencher, and $x_e = \beta_e R_e$ is the normalized chromophore-quencher distance.
Fisher Information and Maximum Likelihood Estimators
Let the radiative and non-radiative decay rates of a fluorescent probe in its excited state be $k_r$ and $k_{nr}$, respectively. The total decay rate and emission intensity of the probe in the absence of quenchers are $k_0 = k_r + k_{nr} < k_e$ and $I_0$, respectively. In the presence of a quencher, the excited-state decay rate becomes $k_x = k_0 + k_q$. If ET is the sole mechanism that increases the excited-state decay rate of a chromophore, the emission intensity of the chromophore is inversely proportional to its excited-state decay rate: $I_x \propto (k_x)^{-1}$. The total detected emission intensity becomes $I_x = I_0\,(1 + \xi e^{-x_e})^{-1}$, where $\xi = k_e/k_0 > 1$. The $x_e$-dependent part of the intensity is thus
$$\zeta(x_e) = \frac{1}{1 + \xi e^{-x_e}}. \qquad (39)$$
Using Eq. 7, the Fisher information is calculated to be
$$J(x_e) = I_0^\beta T\,\frac{(1-\beta^{-1})^2\,\xi^2 e^{-2x_e}}{(1+\xi e^{-x_e})^3\,(1+\beta^{-1}\xi e^{-x_e})}. \qquad (40)$$
The MLE can be determined using the Poissonian formula, Eq. 11:
$$\hat{x}_e = \ln\xi + \ln\left[\frac{n - I_0^\beta T\,\beta^{-1}}{I_0^\beta T - n}\right]. \qquad (41)$$
Distance and Time Resolution
The expression for information in an electron transfer experiment, Eq. 40, makes it clear that time resolution T at a given detected photon flux is a function of chromophore-quencher distance $x_e$ and background level β. The information of $x_e$ degrades by a factor of $(1-\beta^{-1})^2/(1+\xi e^{-x_e}\beta^{-1}) < 1$ in the presence of background photons. The condition for measuring $x_e$ to a relative error α is
$$I_0^\beta T \ge \frac{(1+\xi e^{-x_e})^3\,(1+\beta^{-1}\xi e^{-x_e})}{\alpha^2\,(1-\beta^{-1})^2\,\xi^2 e^{-2x_e}}. \qquad (42)$$
Note that while time resolution in general is related to ξ, the ratio of the maximum ET rate ($k_e$) to the excited-state decay rate of the chromophore ($k_0$), the best possible time resolution is independent of ξ. In fact, the best possible time resolution becomes $I_0 T^{opt} = 27/(4\alpha^2)$ at $x_e^{opt} = \log[\xi/2]$ under the ideal background-free condition when β → ∞. In other words, under such ideal conditions, approximately $6.8/\alpha^2$ photons are needed on average to measure $x_e$ to a relative error α. For example, if one is interested in measuring $R_e$ to an absolute error of 0.5 Å within a protein having $\beta_e$ = 1.4 Å$^{-1}$, which corresponds to a relative error α = 0.5$\beta_e$ = 0.7, at least approximately 14 photons will be needed in the ideal background-free condition. In addition, the optimal chromophore-quencher distance becomes greater in the presence of background photons, but decreases asymptotically to $\log[\xi/2] = \log[k_e/2k_0]$ as β → ∞. This suggests that one may choose probes of different fluorescence lifetime $k_0^{-1}$ for different systems so that the best time resolution will be achieved at an experimentally relevant distance. This is analogous to choosing donor and acceptor pairs by their spectral overlaps for optimal measurements in FRET applications.
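The electron-transfer numbers above can be verified with a small sketch. It assumes the information per unit of $I_0 T$ is $(1-\beta^{-1})^2u^2/[(1+u)^3(1+\beta^{-1}u)]$ with $u = \xi e^{-x_e}$, which follows from an intensity of the form $I_0[(1-\beta^{-1})(1+u)^{-1}+\beta^{-1}]$ with Poisson noise; the function names and the value ξ = 100 are illustrative only.

```python
import math


def et_info_per_count(x_e: float, xi: float, beta: float = math.inf) -> float:
    """J(x_e) / (I_0 * T) for quenching by electron transfer (assumed model)."""
    b = 0.0 if math.isinf(beta) else 1.0 / beta
    u = xi * math.exp(-x_e)
    return (1.0 - b) ** 2 * u ** 2 / ((1.0 + u) ** 3 * (1.0 + b * u))


def et_optimal_distance(xi: float) -> float:
    """Background-free optimum x_e = ln(xi / 2), where u = 2 maximizes u**2 / (1 + u)**3."""
    return math.log(xi / 2.0)


def et_min_counts(alpha: float) -> float:
    """Background-free minimum of I_0 * T: 27 / (4 * alpha**2)."""
    return 27.0 / (4.0 * alpha ** 2)


if __name__ == "__main__":
    xi = 100.0  # illustrative ratio k_e / k_0
    x_opt = et_optimal_distance(xi)
    print(f"x_opt = {x_opt:.3f}, information per count = {et_info_per_count(x_opt, xi):.4f}")
    alpha = 0.5 * 1.4  # 0.5 A absolute error with beta_e = 1.4 per A
    print(f"photons needed: {math.ceil(et_min_counts(alpha))}")
```

The maximum information per count is 4/27 regardless of ξ, which is why the best achievable time resolution does not depend on the probe's intrinsic decay rate; only the distance at which it is reached does.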
APPLICATION
The above analysis focuses on information-theoretic results about photon counting in general and discusses a number of special cases. Now the practical application of these results is considered. In this section, an algorithm is given to convert a list of measured photon arrival times to a distance trajectory. This algorithm is general for the experimental methods described and reaches the information-theoretic limit. Since it applies equally to all approaches, it is discussed only in the context of two-channel FRET. Similar considerations apply for the measurement of any other parameter that can be understood in the context of the previous theoretical discussion, including single-channel FRET and ET measurements.
Maximum Information Data Analysis Algorithm
The data analysis algorithm that follows is predicated on accurate detection of the arrival times of individual photons. Experimentally, this is typically accomplished by a single-photon avalanche photodiode, SPAD (Li & Davis, 1993). It is also assumed that the fluorophores used as probes are excited by a light source that provides constant illumination on the time scale of photon detection. The molecules under observation must be well enough separated that electronic interactions between fluorophores on different molecules may be neglected. Any experimental configuration that satisfies these criteria may be used to generate the single-molecule trajectories whose analysis is described. An algorithm for obtaining distance measurements of predefined precision α is prescribed as follows (see Fig. 6). Each measurement consists of a chronological time t, a time uncertainty δt, a distance x, and a distance uncertainty σ. With the first measurement starting at time T = 0, find the minimum block length that will give σ(x) < α. Set the chronological time t for that data point to the middle of the time block, set the time resolution T(t) to its length, and calculate $\hat{x}$ and $\sigma(\hat{x})$ according to the formulae given above. The next measurement starts where the previous block ends, and the procedure is repeated until the photon record is exhausted. This algorithm achieves the limit of maximum information. Other termination conditions are possible as well, but only the constant σ(x) method is treated here, as it is the most practically applicable.
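The block-growing procedure described above might be sketched as follows for two-channel data. This is a minimal illustration under stated assumptions, not a definitive implementation: the photon record is taken to be a time-ordered list of `(arrival_time, channel)` pairs, blocks are closed at photon arrival times, the estimator matches the donor-to-acceptor count ratio to a model with intensities $I^\beta[(1-\beta^{-1})\zeta+\beta^{-1}]$, and the uncertainty is taken as $J(\hat{x})^{-1/2}$ evaluated at the estimate. All names are hypothetical.

```python
import math


def estimate_x(n_d, n_a, i_d, i_a, beta_d, beta_a):
    """Two-channel distance estimate from counts; None if it is not a finite positive distance."""
    bd, ba = 1.0 / beta_d, 1.0 / beta_a
    denominator = n_a * i_d * (1.0 - bd) + n_d * i_a * (1.0 - ba)
    if denominator <= 0.0:
        return None
    efficiency = (n_a * i_d - n_d * i_a * ba) / denominator
    if not 0.0 < efficiency < 1.0:
        return None
    return (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


def fisher_information(x, t, i_d, i_a, beta_d, beta_a):
    """Assumed two-channel information for block length t at normalized distance x."""
    bd, ba = 1.0 / beta_d, 1.0 / beta_a
    common = 36.0 * x ** 10 / (1.0 + x ** 6) ** 3
    return t * common * (i_d * (1.0 - bd) ** 2 / (x ** 6 + bd)
                         + i_a * (1.0 - ba) ** 2 / (1.0 + ba * x ** 6))


def max_info_trajectory(photons, alpha, i_d, i_a, beta_d, beta_a):
    """Convert time-ordered (arrival_time, 'd' or 'a') photons into
    (t_mid, block_length, x_hat, sigma) tuples with sigma <= alpha."""
    results = []
    t_start, n_d, n_a = 0.0, 0, 0
    for t, channel in photons:
        if channel == "d":
            n_d += 1
        else:
            n_a += 1
        length = t - t_start
        if length <= 0.0:
            continue
        x_hat = estimate_x(n_d, n_a, i_d, i_a, beta_d, beta_a)
        if x_hat is None:
            continue  # keep growing the block until a valid estimate exists
        sigma = fisher_information(x_hat, length, i_d, i_a, beta_d, beta_a) ** -0.5
        if sigma <= alpha:
            results.append((t_start + 0.5 * length, length, x_hat, sigma))
            t_start, n_d, n_a = t, 0, 0  # next block starts where this one ended
    return results


if __name__ == "__main__":
    # Toy record: photons every 1 ms, alternating donor and acceptor (x = 1).
    record = [(k * 0.001, "d" if k % 2 else "a") for k in range(1, 121)]
    for point in max_info_trajectory(record, 0.1, 1000.0, 1000.0, math.inf, math.inf):
        print(point)
```

Because the block length adapts to the data, regions of the trajectory where the distance is well determined per photon (near the optimal x) receive shorter blocks, and therefore better time resolution, than regions where each photon carries little information.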
Simulation Details
This algorithm was validated using simulated single-molecule trajectories for which the donor-acceptor coordinate is exactly known as a function of time. Motion on the x coordinate is modeled according to a discretized Langevin equation in the limit of large, fast friction with potential of mean force V(x):
where Δt is the propagation time of the simulation, the frictional coefficient represents velocity-dependent dissipation, and δf(t) is a Gaussian-distributed random force with $\langle \delta f(t)\,\delta f(t') \rangle = 2\delta(t - t')\,\Delta t\,k_B\Theta$, in which $\delta(t - t')$ is the Dirac δ-function, $k_B$ is Boltzmann's constant, and Θ is the absolute temperature. At each time step the emission and intersystem crossing rates are calculated and the system is tested to see if either fluorophore has emitted a photon or entered the triplet state. Both photon emission and intersystem crossing are distance dependent according to the FRET efficiency relation, Eq. 17. Relaxation from the triplet state is treated as exponential in time. If a photon is emitted on either channel, the x coordinate of the simulation is recorded. Photon data are recorded as inter-photon timings on the donor and acceptor channels and subsequently analyzed according to the prescribed algorithm.
Example Trajectories
Evolution of the x coordinate was simulated according to Eq. 43 on a parabolic potential. The performance of this algorithm for a sample x(t) trajectory is analyzed in Fig. 8. Eq. 37 gives a lower bound for the time resolution T(t). The maximum information algorithm achieves this lower bound and is thus the optimal data analysis algorithm for extraction of x trajectories from this kind of data. Analysis with the maximum information algorithm yields the trajectories shown in Fig. 7. Panels (A) and (B) were simulated with $I_d = I_a$ = 3.4 kcps. The maximum time resolution is 8.8 ms (see Fig. 7). The maximum information algorithm can detect any conformational changes happening on this time scale or longer. Since the analysis is primarily based on the number of photons detected, this resolution scales exactly as the inverse of the average count rate. If one were to measure a single-molecule FRET trajectory at an (experimentally realizable) average count rate of 10 kcps, the maximum time resolution would improve to 2.6 ms. Also, the value of the accuracy cutoff, α, makes a significant difference. Analysis of the trajectory in Fig. 7(C) with α = 0.1 improves the maximum time resolution to 1.2 ms. If donor-acceptor distance fluctuations in the experimental system are large and fast, measured distance trajectories may not represent the full conformational flexibility of the experimental system. In this case, spatial resolution may be sacrificed for time resolution, allowing the full conformational distribution to be observed. In this way the analysis can be tailored to the experimental system and conditions.
Bias in Distance Measurements and Effects of Fluorophore Intermittency
The maximum likelihood estimator is asymptotically unbiased when measuring a single distance in the limit of long measurement time. It is, however, slightly biased at small time intervals. In the cases we have studied, this bias is always much lower than the standard deviation given by the Fisher information (see Fig. 9 and numerical studies of bias in the Appendix). If the inherent bias becomes significant, many methods exist to generate estimators that correct for the bias while simultaneously approaching the Cramer-Rao-Frechet bound (Voinov & Nikulin, 1993), or the bias can be corrected empirically by numeric simulations such as those presented in the Appendix. Intensity blinking due to triplet state trapping or other mechanisms has the potential to cause inaccuracies in distance measurements. The algorithm given above does not take these intensity intermittency effects into account. Here we use simulations to show that our algorithm is robust against such "blinking" behaviour for non-fluorescent state lifetimes of up to a few ms. Without loss of generality, we use triplet-state blinking as an example and consider typical dye molecules that exhibit $S_1 \to T_1$ intersystem crossing quantum yields on the order of $\phi_{isc} = 5 \times 10^{-4}$ and triplet lifetimes of 500 μs (Hübner et al., 2001). Using these typical values, and assuming a collection efficiency of 5% (giving an effective $\phi_{isc}$ of $10^{-2}$), simulations were carried out at constant x values. In order to make a quantitative evaluation of the accuracy of the analysis under these conditions, we consider the error parameter $\langle(\delta x)^s\rangle$ as a measure of the closeness of a particular analysis $\{\hat{x}_i\}$ to the true data x(t):
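$$\langle(\delta x)^s\rangle = \frac{1}{N}\sum_{i=1}^{N}\left[\hat{x}_i - \frac{1}{T_i}\int_{t_i - T_i/2}^{t_i + T_i/2} dt'\,x(t')\right]^s, \qquad (44)$$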
where $\hat{x}_i$ is the maximum-information estimate of x(t) at time $t_i$ and N is the total number of estimates at a given α. Therefore, bias (mean error) is represented by s = 1 and mean square error by s = 2. Fig. 9 shows the results of this analysis. For experimentally relevant triplet lifetimes, there is no significant error. This is because, although the analysis is done on a photon-by-photon basis, generally more than ten photons are included in each box. As long as the triplet lifetime and intersystem crossing quantum yield are such that the total time spent in the triplet state is not a large fraction of the width of the box, the analysis will not be adversely affected. In the case that only donor triplets are allowed, there is no significant increase in bias due to the triplet state even at very long triplet lifetimes. This is due to the multinomial nature of the analysis: the time taken to acquire photons is not important, only the channel they arrive on. Since donor triplet states prevent photon emission from both the donor and the acceptor, the only effect is that the time resolution will be decreased. This is not the case with acceptor triplet states. Bias due to acceptor triplet states is very distance-dependent. At large x, even with very long acceptor triplet lifetimes, there will be no effect on the accuracy of distance measurements: the acceptor simply will not enter the triplet state. As x decreases, the probability that the acceptor will enter the triplet state increases, and the bias becomes more significant.
Experimental Considerations
The formulas presented here are ready for immediate use in many experimental setups that have already been reported in the literature. In addition, these results are applicable to other, time-independent measurements. For example, the Maximum Likelihood Estimators for determining distance from the numbers of photons measured on the donor and acceptor channels can be used in any situation where all of the calibration numbers ($B_d$, $B_a$, $I_d^0$, $I_a^0$, $\chi_d$, $\chi_x$, $\chi_a$) are known. It is important to note that, although formulas regarding FRET efficiency are also given in this report, the use of FRET efficiency as an indicator of molecular state may be misleading as it is not a linear function of donor-acceptor distance; small distance changes may be amplified as a result. The number of photons that may be collected from a single molecule is heavily dependent on experimental conditions and the particular fluorophores used. On the order of $10^8$ photons may be collected from a single molecule of Rhodamine 6G under vacuum in a PMA film at low excitation intensities (Deschenes & Vanden Bout, 2002). This means that on the order of $10^6$ independent distance measurements may be made if the molecule is part of a FRET pair between OΛRQ and I.QRQ. To study fast dynamics, one may wish to excite the single molecules at higher intensities, but the detected time trace will also be shorter. Currently, for example, using the Alexa-555 / Alexa-647 dyes (Molecular Probes), photon arrival rates on the order of $5 \times 10^4$ photons per second are experimentally feasible in water solution, giving time resolutions better than 1 ms (Watkins and Yang, unpublished data). Several detection methods exist that are compatible with the Maximum Information Method. Avalanche Photodiodes (APDs), Photomultiplier Tubes (PMTs), and Multichannel Plates (MCPs) may all be used to detect and count single photons. The dark counts on a high quantum efficiency (Q.E. > 60%) Peltier-cooled APD range from 25-500 cps, while the dark counts on uncooled single photon counting PMTs and MCPs (Q.E. < 20%) are similar, ranging from 10-1500 cps. At high excitation intensities, the detection device is not the primary source of background; dark counts from the detector are minuscule compared to other sources, including Raman scattering and autofluorescence in cells, both of which are difficult to suppress using spectral filters. These background contributions, however, do play a role in determining the lowest possible excitation intensities and thus the longest possible trajectories. The Maximum Information Method will work at almost any signal-to-background ratio; but as this ratio approaches one, the number of photons required to make a particular distance measurement increases without bound. It should be stressed that, although the proposed method has in mind the use of detectors capable of single photon counting, the concepts and ideas that underlie the development of the Maximum Information Method are general and should be applicable to any measurements that are information-limited. The preceding discussions suggest that, when choosing a FRET dye pair to measure a distance of $\langle R\rangle$, the most effective dyes will be those that exhibit $R_0 \approx 1.1\langle R\rangle$-$1.3\langle R\rangle$ (or $\langle R\rangle \approx 0.8$-$0.9\,R_0$) instead of the commonly used $\langle R\rangle \approx R_0$ condition. The options available in dye and filter selection are much broader. Since cross-talk and cross-excitation are now recognized as merely contributions to the background, the filter set may be chosen with this in mind. Bandpass filters may be made as wide as possible in order to collect as many photons as possible. Also, the excitation wavelength may be chosen to be the maximum absorbance of the donor, even if that would generate some direct acceptor excitation.
CONCLUSION
A detailed and quantitative study of reactive dynamics in biomolecular systems will require both an accurate way to measure the biomolecular conformation and a reliable estimate of the errors involved in the experiment. A method has been presented, using the Maximum Likelihood Estimator, to analyze experiments that use detectors capable of single photon detection. The Fisher information was used to demonstrate that our analysis achieves the best possible resolution given the constraints of the experimental system. The accuracy of a single photon counting experiment is determined by Poisson statistics. For example, if one is trying to measure the distance between two fluorophores in an experimental system, the distance information that one seeks is carried by each detected photon and is acquired at a constant rate in time, as shown in Eqs. 7, 24, 37, and 40, and the actual rate will vary depending on the experimental configuration. Any measurement of these parameters will be limited in precision by the amount of information obtained, as specified by the Cramer-Rao-Frechet bound. By choosing the proper analysis, one that achieves these limits of precision, one can assure that the maximum information is extracted from the data. We have presented a method that achieves this, yet allows great flexibility in determining the relative values of the temporal and spatial resolutions. This concept is (1) generally applicable to a variety of systems, (2) independent of kinetic models, (3) easy to implement in practical experiments, (4) efficient since it extracts information photon by photon, (5) quantitative, and (6) most importantly, applicable to reactive systems. Experimentally, it is helpful to remember that this algorithm is based on the detection of individual photons. The Maximum Information Method of analysis relies upon the Poisson noise inherent in photon counting applications. Arbitrary subtraction of background from the measured signal obscures these statistics. Also, to increase experimental time resolution, all that must be done is to increase the excitation power, and thus the average detected intensity. Conversely, one can choose the intensity based on the desired time resolution. This allows one to take into account other experimental limitations, such as fluorophore photobleaching and triplet blinking. For two-channel FRET measurements, if cross-talk between the donor and acceptor channels is ignored, one would naturally excite the donor at the maximum in its absorption spectrum to give the highest signal-to-background ratio in a single-molecule experiment. But the acceptor in a FRET pair will frequently absorb at that wavelength, producing cross-talk between the two channels. Since we now recognize cross-talk as just another contribution to the background (its only effect is to decrease the ratio of signal to background), it is no longer necessary to ensure that the acceptor is perfectly transparent at the excitation wavelength. With our information analysis, the excitation wavelength can be adjusted intelligently and the optimal signal-to-background ratio can be achieved.
All of these issues arise from the central idea of an information-based analysis. In any experiment the fundamentally limited parameter is information. Since the amount of information does not increase, it is to the advantage of the experimentalist to be as flexible as possible in choosing where to allocate that information. With our maximum information analysis the experimentalist is given optimal control over the information.
Maximum Likelihood Estimators are not guaranteed to be unbiased. In this section we calculate the bias in the estimators we have given and consider the effects it might have on the results obtained by the maximum information algorithm discussed in the body of this paper. In general, the bias $b_n$ in an estimator $F_n$ of some parameter x is
$$b_n = \langle F_n \rangle - x. \qquad (45)$$
Here n is the number of observations in the data set and $\langle\cdots\rangle$ indicates an average over all possible n-point datasets, weighted by the probability density of the observation of that dataset. In the cases considered in this paper, the probability density to observe a particular data set is Poisson. In the context of dynamic measurement, however, we are concerned more with the time in photon acquisition rather than the number of photons in a certain observation interval as indicated by the Fisher information. The bias in our estimators will therefore be given by

where, as before, denotes the number of channels. The estimators of FRET efficiency given in Eqs. 26 and 28 can be analytically shown to be unbiased. The sums for the estimators of distance in FRET and ET (given in Eqs. 27, 29, and 41) were evaluated numerically. A photon trajectory was generated at constant x, the photons were binned into time intervals T, and the appropriate estimator was applied. When negative, infinite, or imaginary distances were generated, the data point was discarded, just as in the maximum information algorithm in which T is increased until the set uncertainty level is reached.
These calculations were performed at a variety of constant x values. The results, plotted as a function of the information per bin, can be seen in Fig. 10. Information per bin is the most natural coordinate for the bias plot in the context of the maximum information algorithm. As an example, if a trajectory is analyzed at α = 0.07 (as in panels (A) and (B) of Fig. 7), the information per bin is 204. To convert to units of bin time, simply use the appropriate information equation: Eq. 36 for one-channel FRET, Eq. 37 for two-channel FRET, or Eq. 40 for ET. There are strong fluctuations in the bias at extreme values of x in single-channel FRET and electron transfer. These arise as a consequence of the discrete nature of photons. Because photons are quantized, the possible values of the estimator at a given bin time are quantized. Since the estimator is highly nonlinear, the possible values of the estimator are also highly oscillatory as a function of the bin time. This oscillatory effect is most pronounced for the extreme values of the estimator. At extreme values of x, the average that is calculated to determine the bias is heavily influenced by the extreme value of the estimator. This produces the oscillations in bias. As the bin time increases, the number of possible values of the estimator also increases, and the oscillations damp out. For two-channel FRET, the estimator is not a function of bin size, so there are no oscillations. For both one- and two-channel FRET, bias is smallest when x = 1.0. In all the curves the bias approaches zero as the information increases, confirming the asymptotic unbiasedness of these estimators. For ET at x = 6, the estimator is quite biased. At larger distances for ET, and at all distances for FRET measurements, the bias is at least an order of magnitude smaller than the standard deviation ($J^{-1/2}$).
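A bias calculation of this kind can be illustrated for the simplest case. The sketch below is not the simulation procedure described above: instead of generating photon trajectories, it enumerates the Poisson count probabilities directly, and it assumes background-free two-channel detection with equal channel intensities, for which the distance estimate reduces to $(n_d/n_a)^{1/6}$. Bins with no acceptor photons (infinite distance) are discarded and the average is renormalized. The function names and the `counts_per_bin` parameter (intensity times bin time) are hypothetical.

```python
import math


def poisson_pmf(n: int, mean: float) -> float:
    """Poisson probability of n counts, computed in log space for numerical stability."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def two_channel_bias(x: float, counts_per_bin: float) -> float:
    """Bias <x_hat> - x of the estimate x_hat = (n_d / n_a) ** (1/6).

    Assumes independent Poisson counts with means counts_per_bin * (1 - E)
    on the donor channel and counts_per_bin * E on the acceptor channel.
    """
    efficiency = 1.0 / (1.0 + x ** 6)
    mean_a = counts_per_bin * efficiency
    mean_d = counts_per_bin * (1.0 - efficiency)
    n_max = int(counts_per_bin + 10.0 * math.sqrt(counts_per_bin) + 20)
    p_d = [poisson_pmf(n, mean_d) for n in range(n_max)]
    p_a = [poisson_pmf(n, mean_a) for n in range(n_max)]
    total, weight = 0.0, 0.0
    for n_a in range(1, n_max):  # n_a = 0 would give an infinite distance: discarded
        for n_d in range(n_max):
            p = p_d[n_d] * p_a[n_a]
            total += p * (n_d / n_a) ** (1.0 / 6.0)
            weight += p
    return total / weight - x


if __name__ == "__main__":
    for counts in (25.0, 50.0, 100.0, 200.0):
        print(f"counts per bin = {counts:5.0f}   bias at x = 1: {two_channel_bias(1.0, counts):+.5f}")
```

In this simplified setting the bias shrinks as the number of counts per bin grows, which is the behavior expected of an asymptotically unbiased estimator.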
Examples
Example 1. Polyproline maintains a stiff helical structure in solution, keeping a constant distance between its two ends. A series of six different model peptides with the sequence PnCGGGK(biotin) (n=6,8,12,15,18,24) was synthesized using the FMOC solid-phase synthesis technique. The reagent used for the C-terminal lysine residue was purchased pre-functionalized with biotin on the side-chain amine group (Nova Biochem). The peptides were then labeled with Alexa Fluor 555 maleimide (Molecular Probes) on the thiol group of the cysteine and Alexa Fluor 647 carboxylic acid succinimidyl ester (Molecular Probes) on the N-terminal proline. These two fluorophores form a FRET pair with an $R_0$ of 51 Å. In order to immobilize the labeled peptides, quartz slides (Technical Glass Products) were first cleaned by 15 minutes each of sonication in 1 M KOH in water, sonication in absolute ethanol, sonication in 1 M KOH, and sonication in absolute ethanol. They were then dried and silanized with aminopropyltrimethoxysilane (APS) by soaking for two minutes in a 2% solution of APS in acetone followed by 30 minutes at 110°C. The silanized slides were functionalized on one side with PEG-SPA and PEG-biotin by incubation for 3 hours in a solution of 10% PEG-SPA and 0.1% PEG-biotin in 0.01 M NaHCO3 at pH 7.2. Streptavidin (Jackson Pharmaceuticals) was then coated on the slides by covering the biotinylated side with a 0.2 mg/mL solution for 15 min. Finally, the fluorescently labeled peptide was incubated on the active side of the slide for five minutes at a concentration of ~10 pM, sufficiently low that molecules binding to the surface of the slide would be separated by at least the diffraction limit.
Single peptide molecules, visualized by their covalently attached fluorescent probes, were then observed on the slide, first by collecting a diffraction-limited image of the surface of the slide. Single, correctly labeled peptides appear as well-separated circular, Gaussian, diffraction-limited spots against a dim background. Once a suitable peptide was located, excitation light was focused on that peptide and a photon-by-photon fluorescence trajectory was collected on the donor and acceptor channels. The single-molecule nature of the trajectories thus acquired is confirmed by the stepwise bleaching of fluorescence on both channels. This trajectory was subsequently analyzed using the algorithm of the present invention to extract the distance between the two fluorophores attached to the peptide as a function of time. The distance was observed to be relatively constant, as predicted by molecular dynamics simulations, and the measured distances were consistent with these simulations, as seen in Table 1.
Table 1. Distance between fluorophores.
The n = 6 peptide was also measured; its FRET efficiency was >99%, which confirms our predictions but lies outside the range of efficiencies that can be accurately converted to distance.
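The remark about the n = 6 peptide can be illustrated with a minimal Python sketch. It assumes the standard Förster relation E = 1/(1 + (r/R0)^6) and the R0 of 51 Å quoted above for this dye pair. The function names are hypothetical, chosen only for illustration, and the sketch is not the analysis algorithm of the invention.

```python
R0_ANGSTROM = 51.0  # Forster radius quoted in Example 1 for this dye pair


def efficiency_to_distance(efficiency, r0=R0_ANGSTROM):
    """Invert E = 1 / (1 + (r / r0)**6); returns r in the units of r0."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


def distance_to_efficiency(distance, r0=R0_ANGSTROM):
    """Forward Forster relation: efficiency at a given distance."""
    return 1.0 / (1.0 + (distance / r0) ** 6)


def distance_sensitivity(efficiency, r0=R0_ANGSTROM):
    """|dr/dE| = r / (6 E (1 - E)): distance error per unit efficiency error."""
    r = efficiency_to_distance(efficiency, r0)
    return r / (6.0 * efficiency * (1.0 - efficiency))


if __name__ == "__main__":
    # Illustrative efficiencies only; these are not measured values.
    for e in (0.5, 0.9, 0.99):
        print(e, efficiency_to_distance(e), distance_sensitivity(e))
```

Under these assumptions the distance equals R0 at E = 0.5, where a small error in efficiency has the least effect on the distance. At E = 0.99 the same error in efficiency is amplified more than tenfold relative to E = 0.5, which is one way to see why efficiencies above 99% cannot be converted to distance reliably.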
Example 2.
Adenylate kinase is a ubiquitous protein that catalyzes the interconversion of AMP, ADP, and ATP within the cell. A DNA construct for the expression of wild-type adenylate kinase was obtained from the Michael Glaser group at the University of Illinois Urbana-Champaign. The adenylate kinase gene was inserted into the pQE-32 vector (Qiagen), thus introducing a His6 tag on the N-terminus. Wild-type adenylate kinase contains one cysteine, in the core region. An additional cysteine was introduced at alanine 192 via site-directed mutagenesis. These two cysteines were labeled with Alexa 555 maleimide (Molecular Probes) and Alexa 647 maleimide (Molecular Probes).
In order to immobilize the labeled protein, quartz slides (Technical Glass Products) were first cleaned by 15 minutes each of sonication in 1 M KOH in water, sonication in absolute ethanol, sonication in 1 M KOH, and sonication in absolute ethanol. They were then dried and silanized with glycidoxypropyltrimethoxysilane by soaking for three hours in a 2% solution of glycidoxypropyltrimethoxysilane (Sigma-Aldrich) and 0.01% acetic acid in water at 90°C. The silanized slides were functionalized on one side with AB-NTA (Dojindo) by incubation for 16 hours in a 100 mg/mL solution of AB-NTA in 0.01 M NaHCO3 at pH 7.2 and 60°C. Finally, the fluorescently labeled protein was incubated on the AB-NTA functionalized side of the slide for five minutes at a concentration of ~10 pM, sufficiently low that molecules binding to the surface of the slide would be separated by at least the diffraction limit.
Single protein molecules, visualized by their covalently attached fluorescent probes, were then observed on the slide, first by collecting a diffraction-limited image of the surface of the slide. Single labeled proteins appear as well-separated circular, Gaussian, diffraction-limited spots against a dim background. Once a suitable molecule was located, excitation light was focused on that molecule and a photon-by-photon fluorescence trajectory was collected on the donor and acceptor channels. The single-molecule nature of the trajectories thus acquired is confirmed by the stepwise bleaching of fluorescence on both channels. This trajectory was subsequently analyzed using our algorithm to extract the distance between the two fluorophores attached to the protein as a function of time. The distance was observed to fluctuate significantly, with distances characteristic of those measured via X-ray crystallography, and the observed fluctuations were measurably different upon addition of substrates and cofactors.
Example 3.
The adenylate kinase protein discussed in Example 2 may be labeled by attaching each fluorophore to the protein with two covalent bonds, thus restricting their rotational freedom with respect to one another. The labeled protein may then be attached to the surface of a glass slide. Photon-by-photon trajectories thus acquired will provide information as to the distance and relative orientation of the two fluorophores, and of the residues to which they are bound on the protein.
Example 4.
Adenylate kinase may be labeled with Alexa 555, and ATP, a substrate of adenylate kinase, may be labeled with Alexa 647. Then the distance between these two moieties may be measured as a function of time, resulting in a quantitative measurement of the interactions between the protein and its substrate.
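As a rough illustration of what a distance measured as a function of time involves, the following Python sketch bins donor and acceptor photon arrival times into fixed windows and converts the per-window efficiency to a distance. This is a simple constant-bin estimate, not the algorithm of the present invention. It assumes the 51 Å R0 quoted in Example 1 for the Alexa 555/647 pair and ignores background, spectral crosstalk, and detection-efficiency corrections. The function name and arguments are hypothetical.

```python
import math

R0_ANGSTROM = 51.0  # assumed Forster radius; the value quoted in Example 1


def binned_distance_trajectory(donor_times, acceptor_times, bin_width,
                               r0=R0_ANGSTROM):
    """Return (bin_start, efficiency, efficiency_std, distance) per bin.

    donor_times and acceptor_times are non-negative photon arrival times in
    seconds. Empty bins are skipped. The distance is None when the
    efficiency is exactly 0 or 1, where the Forster relation cannot be
    inverted.
    """
    all_times = list(donor_times) + list(acceptor_times)
    if not all_times:
        return []
    n_bins = int(max(all_times) // bin_width) + 1
    donor_counts = [0] * n_bins
    acceptor_counts = [0] * n_bins
    for t in donor_times:
        donor_counts[int(t // bin_width)] += 1
    for t in acceptor_times:
        acceptor_counts[int(t // bin_width)] += 1

    trajectory = []
    for i in range(n_bins):
        n_d, n_a = donor_counts[i], acceptor_counts[i]
        total = n_d + n_a
        if total == 0:
            continue
        eff = n_a / total  # uncorrected proximity ratio
        eff_std = math.sqrt(eff * (1.0 - eff) / total)  # binomial error
        if 0.0 < eff < 1.0:
            dist = r0 * (1.0 / eff - 1.0) ** (1.0 / 6.0)
        else:
            dist = None
        trajectory.append((i * bin_width, eff, eff_std, dist))
    return trajectory
```

Shrinking `bin_width` improves time resolution but leaves fewer photons per bin, so the binomial error `eff_std` grows. This is the trade-off between time resolution and statistical error described in the Background.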
Adachi, K., Yasuda, R., Noji, H., Itoh, H., Harada, Y., Yoshida, M. & Kinosita, K. (2000). Stepping rotation of F-1-ATPase visualized through angle-resolved single-fluorophore imaging. Proc. Natl. Acad. Sci. USA, 97, 7243-7247.
Agmon, N. (2000a). Conformational cycle of a single working enzyme. J. Phys. Chem. B, 104, 7830-7834.
Agmon, N. (2000b). Conformational cycle of a single working enzyme. J. Phys. Chem. B, 104, 7830-7834.
Bartko, A.P., Xu, K. & Dickson, R.M. (2002). Three dimensional single molecule rotational diffusion in glassy state polymer films. Phys. Rev. Lett., 88, 026101.
Becker, W., Hickl, H., Zander, C., Drexhage, K.H., Sauer, M., Siebert, S. & Wolfrum, J. (1999). Time-resolved detection and identification of single analyte molecules in microcapillaries by time-correlated single-photon counting (TCSPC). Rev. Sci. Instrum., 70, 1835-1841.
Berezhkovskii, A.M., Szabo, A. & Weiss, G.H. (1999). Theory of single-molecule fluorescence spectroscopy of two-state systems. J. Chem. Phys., 110, 9145-9150.
Berezhkovskii, A.M., Szabo, A. & Weiss, G.H. (2000). Theory of the fluorescence of single molecules undergoing multistate conformational dynamics. J. Phys. Chem. B, 104, 3776-3780.
Berezhkovskii, A.M., Boguna, M. & Weiss, G.H. (2001). Evaluation of rate constants for conformational transitions using single-molecule fluorescence spectroscopy. Chem. Phys. Lett., 336, 321-324.
Berney, C. & Danuser, G. (2003). FRET or no FRET: A quantitative comparison. Biophys. J., 84, 3992-4010.
Böhmer, M., Pampaloni, F., Wahl, M., Rahn, H.J., Erdmann, R. & Enderlein, J. (2001). Time-resolved confocal scanning device for ultrasensitive fluorescence detection. Rev. Sci. Instrum., 72, 4145-4152.
Bokinsky, G., Rueda, D., Misra, V.K., Rhodes, M.M., Gordus, A., Babcock, H., Walter, N.G. & Zhuang, X. (2003). Single-molecule transition-state analysis of RNA folding. Proc. Natl. Acad. Sci. USA, 100, 9302-9307.
Cao, J.S. (2000). Event-averaged measurements of single-molecule kinetics. Chem. Phys. Lett., 327, 38-44.
Chen, Y., Hu, D., Vorpagel, E.R. & Lu, H.P. (2003). Probing single-molecule T4 lysozyme conformational dynamics by intramolecular fluorescence energy transfer. J. Phys. Chem. B, 107, 7947-7956.
Cover, T.M. & Thomas, J.A. (1991). Elements of Information Theory. John Wiley and Sons, Inc., New York.
Cramer, H. (1946). Mathematical Methods of Statistics. Princeton University Press, Princeton.
Deschenes, L.A. & Vanden Bout, D.A. (2002). Single molecule photobleaching: increasing photon yield and survival time through suppression of two-step photolysis. Chem. Phys. Lett., 365, 387-395.
Edman, L., Mets, U. & Rigler, R. (1996). Conformation transitions monitored for single molecules in solution. Proc. Natl. Acad. Sci. USA, 93, 6710-6715.
Edman, L., Foldes-Papp, Z., Wennmalm, S. & Rigler, R. (1999). The fluctuating enzyme: a single molecule approach. Chem. Phys., 247, 11-22.
Eggeling, C., Fries, J.R., Brand, L., Günter, R. & Seidel, C.A.M. (1998a). Monitoring conformational dynamics of a single molecule by selective fluorescence spectroscopy. Proc. Natl. Acad. Sci. USA, 95, 1556-1561.
Eggeling, C., Widengren, J., Rigler, R. & Seidel, C. (1998b). Photobleaching of fluorescent dyes under conditions used for single-molecule detection: Evidence of two-step photolysis. Anal. Chem., 70, 2651-2659.
Fisher, R.A. (1925). Theory of statistical estimation. Proc. Cam. Phil. Soc., 22, 700-725.
Förster, T. (1949). Experimentelle und theoretische Untersuchung des zwischenmolekularen Übergangs von Elektronenanregungsenergie. Z. Naturforsch., 4a, 321-327.
Fréchet, M. (1930). Sur la convergence "en probabilité". Metron, 8, 3.
Geva, E. & Skinner, J.L. (1997). Theory of single-molecule optical line-shape distributions in low-temperature glasses. J. Phys. Chem. B, 101, 8920-8932.
Gray, H. & Winkler, J.R. (1996). Electron transfer in proteins. Ann. Rev. Biochem., 65, 537-561.
Ha, T. (2001). Single-molecule fluorescence resonance energy transfer. Methods, 25, 78-86.
Ha, T., Enderle, T., Ogletree, D.F., Chemla, D.S., Selvin, P.R. & Weiss, S. (1996). Probing the interaction between two single molecules: Fluorescence resonance energy transfer between a single donor and a single acceptor. Proc. Natl. Acad. Sci. USA, 93, 6264-6268.
Hübner, G.G., Alois, R., Indrek, R. & P., W.U. (2001). Direct observation of the triplet lifetime quenching of single dye molecules by molecular oxygen. J. Chem. Phys., 115, 9619-9622.
Jia, Y.W., Sytnik, A., Li, L., Vladimirov, S., Cooperman, B.S. & Hochstrasser, R.M. (1997). Nonexponential kinetics of a single tRNAPhe molecule under physiological conditions. Proc. Natl. Acad. Sci. USA, 94, 7932-7936.
Jung, Y., Barkai, E. & Silbey, R. (2002). Current status of single-molecule spectroscopy: Theoretical aspects. J. Chem. Phys., 117, 10980-10995.
Köllner, M. & Wolfrum, J. (1992). How many photons are necessary for fluorescence-lifetime measurement? Chem. Phys. Lett., 200, 199-204.
Kou, S.C., Xie, X.S. & Liu, J.S. (2003). Bayesian analysis of single molecule experimental data. J. Roy. Stat. Soc. B, submitted.
Kühnemuth, R. & Seidel, C.A.M. (2001). Principles of single molecule multiparameter fluorescence spectroscopy. Single Mol., 2, 251-254.
Li, L.Q. & Davis, L.M. (1993). Single photon avalanche diode for single molecule detection. Rev. Sci. Instrum., 64, 1524-1529.
Lu, H.P., Xun, L.Y. & Xie, X.S. (1998). Single-molecule enzymatic dynamics. Science, 282, 1877-1882.
Moerner, W. & Fromm, D. (2003). Methods of single-molecule fluorescence spectroscopy and microscopy. Rev. Sci. Instrum., 74, 3597-3619.
Moerner, W.E. & Kador, L. (1989). Optical-detection and spectroscopy of single molecules in a solid. Phys. Rev. Lett., 62, 2535-2538.
Moser, C., Keske, J.M., Warncke, K., Farid, R. & Dutton, P. (1992). Nature of biological electron transfer. Nature, 355, 796-802.
Nie, S.M. & Zare, R.N. (1997). Optical detection of single molecules. Ann. Rev. Biophys. Biomol. Struct., 26, 567-596.
Novikov, E., Hofkens, J., Cotlet, M., Maus, M., De Schryver, F.G. & Boens, N. (2001). A new analysis method of single molecule fluorescence using series of photon arrival times: theory and experiment. Spectrochim. Acta, 57, 2109-2133.
Ober, R.J., Sripad, R. & Ward, S.E. (2004). Localization accuracy in single molecule microscopy. Biophys. J., 86, 1185-1200.
Onuchic, J.N., Wang, J. & Wolynes, P.G. (1999). Analyzing single molecule trajectories on complex energy landscapes using replica correlation functions. Chem. Phys., 247, 175-184.
Orrit, M. & Bernard, J. (1990). Single pentacene molecules detected by fluorescence excitation in a para-terphenyl crystal. Phys. Rev. Lett., 65, 2716-2719.
Rao, C.R. (1949). Sufficient statistics and minimum variance estimates. Proc. Cambridge Phil. Soc., 45, 213-218.
Sauer, M., Drexhage, K., Lieberwirth, U., R., M., Nord, S. & Zander, C. (1998). Dynamics of the electron transfer reaction between an oxazine dye and DNA oligonucleotides monitored on the single-molecule level. Chem. Phys. Lett., 284, 153-163.
Schenter, G.K., Lu, H.P. & Xie, X.S. (1999). Statistical analyses and theoretical models of single-molecule enzymatic dynamics. J. Phys. Chem. A, 103, 10477-10488.
Schervish, M.J. (1997). Theory of Statistics. Springer, 2nd edn.
Schröder, G.F. & Grubmüller, H. (2003). Maximum likelihood trajectories from single molecule fluorescence resonance energy transfer experiments. J. Chem. Phys., 119, 7830-7834.
Schuler, B., Lipman, E.A. & Eaton, W.A. (2002). Probing the free-energy surface for protein folding with single-molecule fluorescence spectroscopy. Nature, 419, 743-747.
Stryer, L. (1978). Fluorescence energy transfer as a spectroscopic ruler. Ann. Rev. Biochem., 47, 819-846.
Tan, E., Wilson, T.J., Nahas, M.K., Clegg, R.M., Lilley, D.M.J. & Ha, T. (2003). A four-way junction accelerates hairpin ribozyme folding via a discrete intermediate. Proc. Natl. Acad. Sci. USA, 100, 9308-9313.
Voinov, V.G. & Nikulin, M.S. (1993). Unbiased Estimators and Their Applications, vol. I. Kluwer Academic Publishers, Boston.
Wang, J. & Wolynes, P. (1995). Intermittency of single-molecule reaction dynamics in fluctuating environments. Phys. Rev. Lett., 74, 4317-4320.
Weiss, S. (1999). Fluorescence spectroscopy of single biomolecules. Science, 283, 1676-1683.
Xie, X.S. & Trautman, J.K. (1998). Optical studies of single molecules at room temperature. Ann. Rev. Phys. Chem., 49, 441-480.
Yang, H. & Xie, X.S. (2002a). Probing single molecule dynamics photon by photon. J. Chem. Phys., 117, 10965-10979.
Yang, H. & Xie, X.S. (2002b). Statistical approaches for probing single molecule dynamics photon by photon. Chem. Phys., 284, 423-437.
Yang, H., Luo, G., Karnchanaphanurach, P., Louie, T.M., Xun, L. & Xie, X.S. (2003). Single-molecule protein dynamics on multiple time scales probed by electron transfer. Science, 302, 262-266.
Yang, S.L. & Cao, J.S. (2001). Two-event echos in single-molecule kinetics: A signature of conformational fluctuations. J. Phys. Chem. B, 105, 6536-6549.
Yasuda, R., Noji, H., Kinosita, K. & Yoshida, M. (1998). F-1-ATPase is a highly efficient molecular motor that rotates with discrete 120 degrees steps. Cell, 93, 1117-1124.
Yasuda, R., Masaike, T., Adachi, K., Noji, H., Itoh, H. & Kinosita Jr., K. (2003). The ATP-waiting conformation of rotating F1-ATPase revealed by single-pair fluorescence resonance energy transfer. Proc. Natl. Acad. Sci. USA, 100, 9314-9318.
Zhuang, X.W., Bartley, L.B., Babcock, H.P., Russell, R., Ha, T.J., Herschlag, D. & Chu, S. (2000). A single-molecule study of RNA catalysis and folding. Science, 288, 2048-2051.
Zhuang, X.W., Kim, H., Pereira, M.J.B., Babcock, H.P., Walter, N.G. & Chu, S. (2002). Correlating structural dynamics and function in single ribozyme molecules. Science, 296, 1473-1476.