1.. _stats-bls: 2 3*********************************** 4Box Least Squares (BLS) Periodogram 5*********************************** 6 7The "box least squares" (BLS) periodogram [1]_ is a statistical tool used for 8detecting transiting exoplanets and eclipsing binaries in time series 9photometric data. The main interface to this implementation is the 10`~astropy.timeseries.BoxLeastSquares` class. 11 12Mathematical Background 13======================= 14 15The BLS method finds transit candidates by modeling a transit as a periodic 16upside down top hat with four parameters: period, duration, depth, and a 17reference time. In this implementation, the reference time is chosen to be the 18mid-transit time of the first transit in the observational baseline. These 19parameters are shown in the following sketch: 20 21.. plot:: 22 23 import numpy as np 24 import matplotlib.pyplot as plt 25 26 period = 6 27 t0 = -3 28 duration = 2.5 29 depth = 0.1 30 x = np.linspace(-5, 5, 50000) 31 y = np.ones_like(x) 32 y[np.abs((x-t0+0.5*period)%period-0.5*period)<0.5*duration] = 1.0 - depth 33 plt.figure(figsize=(7, 4)) 34 plt.axvline(t0, color="k", ls="dashed", lw=0.75) 35 plt.axvline(t0+period, color="k", ls="dashed", lw=0.75) 36 plt.axhline(1.0-depth, color="k", ls="dashed", lw=0.75) 37 plt.plot(x, y) 38 39 kwargs = dict( 40 va="center", arrowprops=dict(arrowstyle="->", lw=0.5), 41 bbox={"fc": "w", "ec": "none"}, 42 ) 43 plt.annotate("period", xy=(t0+period, 1.01), xytext=(t0+0.5*period, 1.01), ha="center", **kwargs) 44 plt.annotate("period", xy=(t0, 1.01), xytext=(t0+0.5*period, 1.01), ha="center", **kwargs) 45 plt.annotate("duration", xy=(t0-0.5*duration, 1.0-0.5*depth), xytext=(t0, 1.0-0.5*depth), ha="center", **kwargs) 46 plt.annotate("duration", xy=(t0+0.5*duration, 1.0-0.5*depth), xytext=(t0, 1.0-0.5*depth), ha="center", **kwargs) 47 plt.annotate("reference time", xy=(t0, 1.0-depth-0.01), xytext=(t0+0.25*duration, 1.0-depth-0.01), ha="left", **kwargs) 48 plt.annotate("depth", xy=(0.0, 1.0), xytext=(0.0, 1.0-0.5*depth), ha="center", rotation=90, **kwargs) 49 plt.annotate("depth", xy=(0.0, 1.0-depth), xytext=(0.0, 1.0-0.5*depth), ha="center", rotation=90, **kwargs) 50 51 52 plt.ylim(1.0 - depth - 0.02, 1.02) 53 plt.xlim(-5, 5) 54 plt.gca().set_yticks([]) 55 plt.gca().set_xticks([]) 56 plt.ylabel("brightness") 57 plt.xlabel("time") 58 59 # **** 60 61Assuming that the uncertainties on the measured flux are known, independent, 62and Gaussian, the maximum likelihood in-transit flux can be computed as 63 64.. math:: 65 66 y_\mathrm{in} = \frac{\sum_\mathrm{in} y_n/{\sigma_n}^2}{\sum_\mathrm{in} 1/{\sigma_n}^2} 67 68where :math:`y_n` are the brightness measurements, :math:`\sigma_n` are the 69associated uncertainties, and both sums are computed over the in-transit data 70points. 71 72Similarly, the maximum likelihood out-of-transit flux is 73 74.. math:: 75 76 y_\mathrm{out} = \frac{\sum_\mathrm{out} y_n/{\sigma_n}^2}{\sum_\mathrm{out} 1/{\sigma_n}^2} 77 78where these sums are over the out-of-transit observations. Using these results, 79the log likelihood of a transit model (maximized over depth) at a given period 80:math:`P`, duration :math:`\tau`, and reference time :math:`t_0` is 81 82.. math:: 83 84 \log \mathcal{L}(P,\,\tau,\,t_0) = 85 -\frac{1}{2}\,\sum_\mathrm{in}\frac{(y_n-y_\mathrm{in})^2}{{\sigma_n}^2} 86 -\frac{1}{2}\,\sum_\mathrm{out}\frac{(y_n-y_\mathrm{out})^2}{{\sigma_n}^2} 87 + \mathrm{constant} 88 89This equation might be familiar because it is proportional to the "chi 90squared" :math:`\chi^2` for this model and this is a direct consequence of our 91assumption of Gaussian uncertainties. 92 93This :math:`\chi^2` is called the "signal residue" by [1]_, so maximizing the 94log likelihood over duration and reference time is equivalent to computing the 95box least squares spectrum from [1]_. 96 97In practice, this is achieved by finding the maximum likelihood model over a 98grid in duration and reference time as specified by the ``durations`` and 99``oversample`` parameters for the 100`~astropy.timeseries.BoxLeastSquares.power` method. 101 102Behind the scenes, this implementation minimizes the number of required 103calculations by pre-binning the observations onto a fine grid following [1]_ 104and [2]_. 105 106Basic Usage 107=========== 108 109The transit periodogram takes as input time series observations where the 110timestamps ``t`` and the observations ``y`` (usually brightness) are stored as 111``numpy`` arrays or :class:`~astropy.units.Quantity` objects. If known, error 112bars ``dy`` can also optionally be provided. 113 114Example 115------- 116 117.. EXAMPLE START: Evaluating BLS Periodograms 118 119To evaluate the periodogram for a simulated data set: 120 121>>> import numpy as np 122>>> import astropy.units as u 123>>> from astropy.timeseries import BoxLeastSquares 124>>> np.random.seed(42) 125>>> t = np.random.uniform(0, 20, 2000) 126>>> y = np.ones_like(t) - 0.1*((t%3)<0.2) + 0.01*np.random.randn(len(t)) 127>>> model = BoxLeastSquares(t * u.day, y, dy=0.01) 128>>> periodogram = model.autopower(0.2) 129 130The output of the `.astropy.timeseries.BoxLeastSquares.autopower` method 131is a `~astropy.timeseries.BoxLeastSquaresResults` object with several 132useful attributes, the most useful of which are generally the ``period`` and 133``power`` attributes. 134 135This result can be plotted using matplotlib: 136 137>>> import matplotlib.pyplot as plt # doctest: +SKIP 138>>> plt.plot(periodogram.period, periodogram.power) # doctest: +SKIP 139 140.. plot:: 141 142 import numpy as np 143 import astropy.units as u 144 import matplotlib.pyplot as plt 145 from astropy.timeseries import BoxLeastSquares 146 147 np.random.seed(42) 148 t = np.random.uniform(0, 20, 2000) 149 y = np.ones_like(t) - 0.1*((t%3)<0.2) + 0.01*np.random.randn(len(t)) 150 model = BoxLeastSquares(t * u.day, y, dy=0.01) 151 periodogram = model.autopower(0.2) 152 153 plt.figure(figsize=(8, 4)) 154 plt.plot(periodogram.period, periodogram.power, "k") 155 plt.xlabel("period [day]") 156 plt.ylabel("power") 157 158In this figure, you can see the peak at the correct period of three days. 159 160.. EXAMPLE END 161 162Objectives 163========== 164 165By default, the `~astropy.timeseries.BoxLeastSquares.power` method computes the 166log likelihood of the model fit and maximizes over reference time and duration. 167It is also possible to use the signal-to-noise ratio with which the transit 168depth is measured as an objective function. 169 170Example 171------- 172 173.. EXAMPLE START: Transit Search with BoxLeastSquares.power and Signal-to-Noise 174 175To compute the log likelihood of the model fit, call 176`~astropy.timeseries.BoxLeastSquares.power` or 177`~astropy.timeseries.BoxLeastSquares.autopower` with ``objective='snr'`` as 178follows: 179 180>>> model = BoxLeastSquares(t * u.day, y, dy=0.01) 181>>> periodogram = model.autopower(0.2, objective="snr") 182 183.. plot:: 184 185 import numpy as np 186 import astropy.units as u 187 import matplotlib.pyplot as plt 188 from astropy.timeseries import BoxLeastSquares 189 190 np.random.seed(42) 191 t = np.random.uniform(0, 20, 2000) 192 y = np.ones_like(t) - 0.1*((t%3)<0.2) + 0.01*np.random.randn(len(t)) 193 model = BoxLeastSquares(t * u.day, y, dy=0.01) 194 periodogram = model.autopower(0.2, objective="snr") 195 196 plt.figure(figsize=(8, 4)) 197 plt.plot(periodogram.period, periodogram.power, "k") 198 plt.xlabel("period [day]") 199 plt.ylabel("depth S/N") 200 201This objective will generally produce a periodogram that is qualitatively 202similar to the log likelihood spectrum, but it has been used to improve the 203reliability of transit search in the presence of correlated noise. 204 205.. EXAMPLE END 206 207Period Grid 208=========== 209 210The transit periodogram is always computed on a grid of periods and the results 211can be sensitive to the sampling. As discussed in [1]_, the performance of the 212transit periodogram method is more sensitive to the period grid than the 213`~astropy.timeseries.LombScargle` periodogram. 214 215This implementation of the transit periodogram includes a conservative heuristic 216for estimating the required period grid that is used by the 217`~astropy.timeseries.BoxLeastSquares.autoperiod` and 218`~astropy.timeseries.BoxLeastSquares.autopower` methods and the details of this 219method are given in the API documentation for 220`~astropy.timeseries.BoxLeastSquares.autoperiod`. 221 222Example 223------- 224 225.. EXAMPLE START: Computing Transit Periodograms on a Grid of Periods 226 227It is possible to provide a specific period grid as follows: 228 229>>> model = BoxLeastSquares(t * u.day, y, dy=0.01) 230>>> periods = np.linspace(2.5, 3.5, 1000) * u.day 231>>> periodogram = model.power(periods, 0.2) 232 233.. plot:: 234 235 import numpy as np 236 import astropy.units as u 237 import matplotlib.pyplot as plt 238 from astropy.timeseries import BoxLeastSquares 239 240 np.random.seed(42) 241 t = np.random.uniform(0, 20, 2000) 242 y = np.ones_like(t) - 0.1*((t%3)<0.2) + 0.01*np.random.randn(len(t)) 243 model = BoxLeastSquares(t * u.day, y, dy=0.01) 244 periods = np.linspace(2.5, 3.5, 1000) * u.day 245 periodogram = model.power(periods, 0.2) 246 247 plt.figure(figsize=(8, 4)) 248 plt.plot(periodogram.period, periodogram.power, "k") 249 plt.xlabel("period [day]") 250 plt.ylabel("power") 251 252However, if the period grid is too coarse, the correct period might be missed. 253 254>>> model = BoxLeastSquares(t * u.day, y, dy=0.01) 255>>> periods = np.linspace(0.5, 10.5, 15) * u.day 256>>> periodogram = model.power(periods, 0.2) 257 258.. plot:: 259 260 import numpy as np 261 import astropy.units as u 262 import matplotlib.pyplot as plt 263 from astropy.timeseries import BoxLeastSquares 264 265 np.random.seed(42) 266 t = np.random.uniform(0, 20, 2000) 267 y = np.ones_like(t) - 0.1*((t%3)<0.2) + 0.01*np.random.randn(len(t)) 268 model = BoxLeastSquares(t * u.day, y, dy=0.01) 269 periods = np.linspace(0.5, 10.5, 15) * u.day 270 periodogram = model.power(periods, 0.2) 271 272 plt.figure(figsize=(8, 4)) 273 plt.plot(periodogram.period, periodogram.power, "k") 274 plt.xlabel("period [day]") 275 plt.ylabel("power") 276 277.. EXAMPLE END 278 279Peak Statistics 280=============== 281 282To help in the transit vetting process and to debug problems with candidate 283peaks, the `~astropy.timeseries.BoxLeastSquares.compute_stats` method can be 284used to calculate several statistics of a candidate transit. 285 286Many of these statistics are based on the VARTOOLS package described in [2]_. 287This will often be used as follows to compute stats for the maximum point in 288the periodogram: 289 290>>> model = BoxLeastSquares(t * u.day, y, dy=0.01) 291>>> periodogram = model.autopower(0.2) 292>>> max_power = np.argmax(periodogram.power) 293>>> stats = model.compute_stats(periodogram.period[max_power], 294... periodogram.duration[max_power], 295... periodogram.transit_time[max_power]) 296 297This calculates a dictionary with statistics about this candidate. 298Each entry in this dictionary is described in the documentation for 299`~astropy.timeseries.BoxLeastSquares.compute_stats`. 300 301 302Literature References 303===================== 304 305.. [1] Kovacs, Zucker, & Mazeh (2002), A&A, 391, 369 (arXiv:astro-ph/0206099) 306.. [2] Hartman & Bakos (2016), Astronomy & Computing, 17, 1 (arXiv:1605.06811) 307