Background Modeling

When fitting a spectrum with a background, it is invalid to simply subtract off the background if the background is part of the data’s generative model (van Dyk et al. 2001). Therefore, we are often left with the task of modeling the statistical process of the background along with our source.

In typical spectral modeling, we find a few common cases when background is involved. If we have total counts (\(S_i\)) in the \(i^{\rm th}\) of \(N\) bins observed for an exposure of \(t_{\rm s}\) seconds, and also a measurement of \(B_i\) background counts from looking off source for \(t_{\rm b}\) seconds, we can then suppose a model for the source rate (\(m_i\)) and background rate (\(b_i\)).

Poisson source with Poisson background

This is described by a likelihood of the following form:

\[L = \prod^N_{i=1} \frac{(t_{\rm s}(m_i+b_i))^{S_i} e^{-t_{\rm s}(m_i+b_i)}}{S_i!} \times \frac{(t_{\rm b} b_i)^{B_i} e^{-t_{\rm b}b_i}}{B_i!}\]

which is a Poisson likelihood for the total model (\(m_i +b_i\)) conditional on the Poisson distributed background observation. This is the typical case for, e.g., aperture X-ray instruments that observe a source region and then a background region. Both observations are Poisson distributed.
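As a minimal sketch of the likelihood above, the logarithm can be evaluated channel by channel with the standard library alone. The function names and the toy counts and rates are hypothetical and are not part of 3ML; the model rates are assumed to be strictly positive.

```python
import math


def poisson_log_pmf(counts, expected):
    """Log Poisson probability of `counts` given a mean `expected` (> 0)."""
    return counts * math.log(expected) - expected - math.lgamma(counts + 1)


def log_like_poisson_poisson(S, B, m, b, t_s, t_b):
    """Sum the on-source and off-source Poisson terms over all channels."""
    total = 0.0
    for S_i, B_i, m_i, b_i in zip(S, B, m, b):
        total += poisson_log_pmf(S_i, t_s * (m_i + b_i))  # on-source counts
        total += poisson_log_pmf(B_i, t_b * b_i)  # off-source counts
    return total


# toy numbers (hypothetical): three channels
S = [12, 7, 3]
B = [8, 5, 0]
m = [0.8, 0.4, 0.2]
b = [0.4, 0.25, 0.05]
log_like = log_like_poisson_poisson(S, B, m, b, t_s=10.0, t_b=20.0)
```

Note that the third channel has zero background counts and still contributes a finite term, because the background rate there is a model value rather than an estimate from that channel alone.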

Poisson source with Gaussian background

This likelihood is similar, but the conditional background distribution is described by a Gaussian:

\[L = \prod^N_{i=1} \frac{(t_{\rm s}(m_i+b_i))^{S_i} e^{-t_{\rm s}(m_i+b_i)}}{S_i!} \times \frac{1}{\sigma_{b,i}\sqrt{2 \pi}} \exp \left[ -\frac{({B_i} - t_{\rm b} b_i)^2} {2 \sigma_{b,i}^2} \right]\]

where the \(\sigma_{b,i}\) are the measured errors on \(B_i\). This situation occurs, e.g., when the background counts are estimated from a fitted model, as with time-domain instruments that estimate the background counts from temporal fits to the light curve.
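A minimal sketch of this case, assuming the Gaussian term is a normal density centered on \(t_{\rm b} b_i\) with width \(\sigma_{b,i}\) (so its exponent is negative). The function name and the toy numbers are hypothetical and not taken from 3ML.

```python
import math


def log_like_poisson_gaussian(S, B, sigma_b, m, b, t_s, t_b):
    """Poisson term for the total counts plus a Gaussian term for the background."""
    total = 0.0
    for S_i, B_i, sig_i, m_i, b_i in zip(S, B, sigma_b, m, b):
        mu_s = t_s * (m_i + b_i)
        # Poisson log-probability of the on-source counts
        total += S_i * math.log(mu_s) - mu_s - math.lgamma(S_i + 1)
        # Gaussian log-density of the background estimate
        resid = (B_i - t_b * b_i) / sig_i
        total += -0.5 * resid**2 - math.log(sig_i * math.sqrt(2.0 * math.pi))
    return total


# toy numbers (hypothetical): the model background matches B exactly here
S, m, b = [12, 7], [0.8, 0.4], [0.4, 0.25]
at_peak = log_like_poisson_gaussian(S, [8.0, 5.0], [1.0, 1.0], m, b, 10.0, 20.0)
# shifting one background estimate by two sigma lowers the log-likelihood
off_peak = log_like_poisson_gaussian(S, [10.0, 5.0], [1.0, 1.0], m, b, 10.0, 20.0)
```

The difference `at_peak - off_peak` is half the squared residual in units of \(\sigma_{b,i}\), which is the behavior expected of a Gaussian penalty.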

In 3ML, we can fit a background model along with the source model, which allows for arbitrarily low background counts (in fact zero) in channels. The alternative is to use profile likelihoods, where we first differentiate the likelihood with respect to the background model

\[\frac{ \partial L}{{\partial b_i}} = 0\]

and solve for the \(b_i\) that maximize the likelihood. Both the Poisson and Gaussian background profile likelihoods are described in the XSPEC statistics guide. This implicitly adds \(N\) parameters to the model, one per channel, and therefore requires at least one background count per channel. These profile likelihoods are the default Poisson likelihoods in 3ML when a background model is not used with a SpectrumLike (and its children, DispersionSpectrumLike and OGIPLike) plugin.
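For the Poisson background case, setting the derivative of the log-likelihood to zero gives a quadratic in \(b_i\) whose positive root can be written in closed form. The sketch below works this out for a single channel and checks that the gradient vanishes at the root. It is derived directly from the likelihood in this section, the function names are hypothetical, and it is not presented as 3ML's internal code.

```python
import math


def profile_background_rate(S_i, B_i, m_i, t_s, t_b):
    """Background rate that maximizes the Poisson-Poisson likelihood for fixed m_i.

    Positive root of: T b^2 + (T m - S - B) b - B m = 0, with T = t_s + t_b.
    """
    T = t_s + t_b
    lin = T * m_i - S_i - B_i
    disc = math.sqrt(lin * lin + 4.0 * T * B_i * m_i)
    return (-lin + disc) / (2.0 * T)


def dlogL_db(S_i, B_i, m_i, b_i, t_s, t_b):
    """Derivative of the single-channel log-likelihood with respect to b_i."""
    return S_i / (m_i + b_i) - t_s + B_i / b_i - t_b


# toy numbers (hypothetical)
b_hat = profile_background_rate(S_i=30, B_i=12, m_i=2.0, t_s=10.0, t_b=20.0)
gradient_at_b_hat = dlogL_db(30, 12, 2.0, b_hat, 10.0, 20.0)  # ~0 up to rounding

# with no background counts and a model that already over-predicts the total,
# the root collapses to zero
b_hat_empty = profile_background_rate(S_i=5, B_i=0, m_i=2.0, t_s=10.0, t_b=20.0)
```

The second call illustrates one way to see why empty background channels are awkward for the profile approach: the per-channel background estimate can be driven to exactly zero.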

Let’s examine how to handle both cases.

[1]:

import warnings

warnings.simplefilter("ignore")
import numpy as np

np.seterr(all="ignore")

[1]:

{'divide': 'warn', 'over': 'warn', 'under': 'ignore', 'invalid': 'warn'}

[2]:

%%capture
from threeML import *

[3]:

from jupyterthemes import jtplot

%matplotlib inline
jtplot.style(context="talk", fscale=1, ticks=True, grid=False)
set_threeML_style()
silence_warnings()
import astropy.units as u

First we will create an observation where we have a simulated broken power law source spectrum along with an observed background spectrum. The background is a power law continuum with a Gaussian line.

[4]:

# create the simulated observation

energies = np.logspace(1, 4, 151)

low_edge = energies[:-1]
high_edge = energies[1:]

# get a BPL source function
source_function = Broken_powerlaw(K=2, xb=300, piv=300, alpha=0.0, beta=-3.0)

# background function: power law continuum plus a Gaussian line at 511 keV
background_function = Powerlaw(K=0.5, index=-1.5, piv=100.0) + Gaussian(
    F=50, mu=511, sigma=20
)

spectrum_generator = SpectrumLike.from_function(
    "fake",
    source_function=source_function,
    background_function=background_function,
    energy_min=low_edge,
    energy_max=high_edge,
)


spectrum_generator.view_count_spectrum()

23:26:10 INFO      Auto-probed noise models:                                                    SpectrumLike.py:486

         INFO      - observation: poisson                                                       SpectrumLike.py:487

         INFO      - background: None                                                           SpectrumLike.py:488

23:26:11 INFO      Auto-probed noise models:                                                    SpectrumLike.py:486

         INFO      - observation: poisson                                                       SpectrumLike.py:487

         INFO      - background: None                                                           SpectrumLike.py:488

         INFO      Auto-probed noise models:                                                    SpectrumLike.py:486

         INFO      - observation: poisson                                                       SpectrumLike.py:487

         INFO      - background: poisson                                                        SpectrumLike.py:488

23:26:12 INFO      Auto-probed noise models:                                                    SpectrumLike.py:486

         INFO      - observation: poisson                                                       SpectrumLike.py:487

         INFO      - background: poisson                                                        SpectrumLike.py:488

[4]:

../_images/notebooks_Background_modeling_5_12.png

../_images/notebooks_Background_modeling_5_13.png

Using a profile likelihood

We have very few counts in some channels (in fact sometimes zero), but let’s assume we do not know the model for the background. In this case, we will use the profile Poisson likelihood.

[5]:

# instance our source spectrum
bpl = Broken_powerlaw(piv=300, xb=500)

# instance a point source
ra, dec = 0, 0
ps_src = PointSource("source", ra, dec, spectral_shape=bpl)

# instance the likelihood model
src_model = Model(ps_src)

# pass everything to a joint likelihood object
jl_profile = JointLikelihood(src_model, DataList(spectrum_generator))


# fit the model
_ = jl_profile.fit()

# plot the fit in count space
_ = spectrum_generator.display_model(step=False)

         INFO      set the minimizer to minuit                                             joint_likelihood.py:1017

Best fit values:

	result	unit
parameter
source.spectrum.main.Broken_powerlaw.K	1.97 -0.13 +0.14	1 / (keV s cm2)
source.spectrum.main.Broken_powerlaw.xb	(3.21 -0.11 +0.12) x 10^2	keV
source.spectrum.main.Broken_powerlaw.alpha	(1 +/- 8) x 10^-2
source.spectrum.main.Broken_powerlaw.beta	-3.40 +/- 0.21

Correlation matrix:

1.00	-0.61	0.73	0.02
-0.61	1.00	-0.47	-0.58
0.73	-0.47	1.00	0.02
0.02	-0.58	0.02	1.00

Values of -log(likelihood) at the minimum:

	-log(likelihood)
fake	415.860253
total	415.860253

Values of statistical measures:

	statistical measures
AIC	839.996368
BIC	851.763047
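As a cross-check on the statistical measures printed above, the values are consistent with a small-sample-corrected AIC and the usual BIC, evaluated with 4 free parameters and 150 data points (the number of channels in the simulated spectrum). This is inferred from the printed numbers only; the helper names are hypothetical and the exact formulas used inside 3ML are an assumption here.

```python
import math


def aic_corrected(neg_log_like, n_free, n_data):
    """AIC with the small-sample correction term."""
    aic = 2.0 * n_free + 2.0 * neg_log_like
    return aic + 2.0 * n_free * (n_free + 1) / (n_data - n_free - 1)


def bic(neg_log_like, n_free, n_data):
    """Bayesian information criterion."""
    return n_free * math.log(n_data) + 2.0 * neg_log_like


# -log(likelihood) at the minimum, copied from the fit output above
neg_log_like = 415.860253
aic_value = aic_corrected(neg_log_like, n_free=4, n_data=150)
bic_value = bic(neg_log_like, n_free=4, n_data=150)
print(round(aic_value, 4), round(bic_value, 4))
```

Both values agree with the printed AIC and BIC to the precision shown in the table.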

Our fit recovers the simulated parameters. However, we should have rebinned the spectrum so that there is at least one background count per spectral bin for the profile likelihood to be valid.

[6]:

spectrum_generator.rebin_on_background(1)

spectrum_generator.view_count_spectrum()

_ = jl_profile.fit()

_ = spectrum_generator.display_model(step=False)

23:26:17 INFO      Now using 76 bins                                                           SpectrumLike.py:1706

Best fit values:

	result	unit
parameter
source.spectrum.main.Broken_powerlaw.K	2.03 -0.14 +0.15	1 / (keV s cm2)
source.spectrum.main.Broken_powerlaw.xb	(3.13 +/- 0.13) x 10^2	keV
source.spectrum.main.Broken_powerlaw.alpha	(6 +/- 8) x 10^-2
source.spectrum.main.Broken_powerlaw.beta	-3.22 +/- 0.21

Correlation matrix:

1.00	-0.64	0.75	0.08
-0.64	1.00	-0.49	-0.64
0.75	-0.49	1.00	0.06
0.08	-0.64	0.06	1.00

Values of -log(likelihood) at the minimum:

	-log(likelihood)
fake	293.747608
total	293.747608

Values of statistical measures:

	statistical measures
AIC	595.771079
BIC	607.537758
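The idea behind rebinning on the background can be pictured with a simple greedy grouping: walk through the channels and close a group as soon as it holds the requested number of background counts. This is only an illustrative sketch with a hypothetical function name; it is not claimed to be the algorithm that rebin_on_background uses internally.

```python
def group_channels_by_min_counts(bkg_counts, min_counts):
    """Greedy left-to-right grouping of adjacent channels.

    A group is closed once its summed background counts reach `min_counts`.
    """
    groups, current, running = [], [], 0
    for channel, counts in enumerate(bkg_counts):
        current.append(channel)
        running += counts
        if running >= min_counts:
            groups.append(current)
            current, running = [], 0
    if current:  # leftover channels that never reached the threshold
        if groups:
            groups[-1].extend(current)  # merge them into the last full group
        else:
            groups.append(current)
    return groups


# toy background counts (hypothetical): several empty channels
bkg_counts = [3, 0, 0, 1, 0, 2, 0, 0]
groups = group_channels_by_min_counts(bkg_counts, min_counts=1)
print(groups)  # [[0], [1, 2, 3], [4, 5, 6, 7]]
```

Every resulting group contains at least one background count, at the cost of coarser energy resolution where the background is sparse.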

Modeling the background

Now let’s try to model the background assuming we know that the background is a power law with a Gaussian line. We can extract a background plugin from the data by passing the original plugin to a classmethod of SpectrumLike.

[7]:

# extract the background from the spectrum plugin.
# This works for OGIPLike plugins as well, though we could easily also just read
# in a background PHA
background_plugin = SpectrumLike.from_background("bkg", spectrum_generator)

23:26:18 INFO      Auto-probed noise models:                                                    SpectrumLike.py:486

         INFO      - observation: poisson                                                       SpectrumLike.py:487

         INFO      - background: None                                                           SpectrumLike.py:488

This constructs a new plugin with only the observed background so that we can first model it.

[8]:

background_plugin.view_count_spectrum()

[8]:

../_images/notebooks_Background_modeling_13_0.png

../_images/notebooks_Background_modeling_13_1.png

We now construct our background model and fit it to the data. Let’s assume we know that the line occurs at 511 keV, but we are unsure of its strength and width. We do not need to bin the data up because we are using a simple Poisson likelihood, which is valid even when we have zero counts (Cash 1979).
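To see why empty channels pose no problem for a plain Poisson likelihood, consider the Cash statistic written as \(C = 2 \sum_i (\mu_i - n_i \ln \mu_i)\), which equals \(-2 \ln L\) up to a constant that does not depend on the model. The sketch below uses a hypothetical function name and toy numbers, and assumes all expected counts are strictly positive.

```python
import math


def cash_statistic(counts, expected):
    """C = 2 * sum(mu_i - n_i * ln(mu_i)), i.e. -2 ln L up to a constant."""
    total = 0.0
    for n_i, mu_i in zip(counts, expected):
        total += mu_i
        if n_i > 0:  # empty channels contribute only mu_i
            total -= n_i * math.log(mu_i)
    return 2.0 * total


# toy data (hypothetical): most channels are empty
counts = [0, 0, 3, 0, 1]
expected = [0.2, 0.3, 2.5, 0.4, 0.9]
c_value = cash_statistic(counts, expected)
```

A channel with zero counts simply adds twice its expected counts to the statistic, so the fit remains well defined without any rebinning.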

[9]:

# instance the spectrum setting the line's location to 511
bkg_spectrum = Powerlaw(piv=100) + Gaussian(F=50, mu=511)

# setup model parameters
# fix the line's location
bkg_spectrum.mu_2.fix = True

# nice parameter bounds
bkg_spectrum.K_1.bounds = (1e-4, 10)
bkg_spectrum.F_2.bounds = (0.0, 1000)
bkg_spectrum.sigma_2.bounds = (2, 30)

ps_bkg = PointSource("bkg", 0, 0, spectral_shape=bkg_spectrum)

bkg_model = Model(ps_bkg)


jl_bkg = JointLikelihood(bkg_model, DataList(background_plugin))


_ = jl_bkg.fit()

_ = background_plugin.display_model(
    step=False, data_color="#1A68F0", model_color="#FF9700"
)

23:26:19 INFO      set the minimizer to minuit                                             joint_likelihood.py:1017

Best fit values:

	result	unit
parameter
bkg.spectrum.main.composite.K_1	(2.96 -0.22 +0.24) x 10^-1	1 / (keV s cm2)
bkg.spectrum.main.composite.index_1	-1.40 +/- 0.05
bkg.spectrum.main.composite.F_2	(2.5 +/- 0.5) x 10	1 / (s cm2)
bkg.spectrum.main.composite.sigma_2	(2.2 +/- 0.4) x 10	keV

Correlation matrix:

1.00	0.18	-0.06	-0.03
0.18	1.00	-0.06	-0.03
-0.06	-0.06	1.00	0.07
-0.03	-0.03	0.07	1.00

Values of -log(likelihood) at the minimum:

	-log(likelihood)
bkg	215.521667
total	215.521667

Values of statistical measures:

	statistical measures
AIC	439.319196
BIC	451.085875

../_images/notebooks_Background_modeling_15_9.png

We now have a model and estimate for the background which we can use when fitting with the source spectrum. We now create a new plugin with just the total observation and pass our background plugin as the background argument.

[10]:

modeled_background_plugin = SpectrumLike(
    "full",
    # here we use the original observation
    observation=spectrum_generator.observed_spectrum,
    # we pass the background plugin as the background!
    background=background_plugin,
)

         INFO      Background modeled from plugin: bkg                                          SpectrumLike.py:476

         INFO      Auto-probed noise models:                                                    SpectrumLike.py:486

         INFO      - observation: poisson                                                       SpectrumLike.py:487

         INFO      - background: poisson                                                        SpectrumLike.py:488

When we look at our count spectrum now, we will see the predicted background, rather than the measured one:

[11]:

modeled_background_plugin.view_count_spectrum()

[11]:

../_images/notebooks_Background_modeling_19_0.png

../_images/notebooks_Background_modeling_19_1.png

Now we simply fit the spectrum as we did in the profiled case. The background plugin’s parameters are stored in our new plugin as nuisance parameters:

[12]:

modeled_background_plugin.nuisance_parameters

[12]:

OrderedDict([('cons_full',
              Parameter cons_full = 1.0 []
              (min_value = 0.8, max_value = 1.2, delta = 0.05, free = False)),
             ('bkg_bkg_position_ra_full',
              Parameter ra = 0.0 [deg]
              (min_value = 0.0, max_value = 360.0, delta = 0.1, free = False)),
             ('bkg_bkg_position_dec_full',
              Parameter dec = 0.0 [deg]
              (min_value = -90.0, max_value = 90.0, delta = 0.1, free = False)),
             ('bkg_bkg_spectrum_main_composite_K_1_full',
              Parameter K_1 = 0.2963819973263162 [1 / (keV s cm2)]
              (min_value = 0.0001, max_value = 10.0, delta = 0.1, free = True)),
             ('bkg_bkg_spectrum_main_composite_piv_1_full',
              Parameter piv_1 = 100.0 [keV]
              (min_value = None, max_value = None, delta = 0.1, free = False)),
             ('bkg_bkg_spectrum_main_composite_index_1_full',
              Parameter index_1 = -1.4029177026108395 []
              (min_value = -10.0, max_value = 10.0, delta = 0.20099999999999998, free = True)),
             ('bkg_bkg_spectrum_main_composite_F_2_full',
              Parameter F_2 = 25.477375772919018 [1 / (s cm2)]
              (min_value = 0.0, max_value = 1000.0, delta = 0.1, free = True)),
             ('bkg_bkg_spectrum_main_composite_mu_2_full',
              Parameter mu_2 = 511.0 [keV]
              (min_value = None, max_value = None, delta = 0.1, free = False)),
             ('bkg_bkg_spectrum_main_composite_sigma_2_full',
              Parameter sigma_2 = 22.150930227963197 [keV]
              (min_value = 2.0, max_value = 30.0, delta = 0.1, free = True)),
             ('bkg_cons_bkg_full',
              Parameter cons_bkg = 1.0 []
              (min_value = 0.8, max_value = 1.2, delta = 0.05, free = False))])

and the fitting engine will use them in the fit. The parameters will still be connected to the background plugin and its model and thus we can free/fix them there as well as set priors on them.
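Conceptually, fitting with a modeled background means the objective depends on both the source parameters and the background parameters at once, with the background rate in every channel supplied by the parametric model. The toy objective below illustrates this with a flat source rate and a power law background. All names, shapes, and numbers are hypothetical, and this is a conceptual sketch rather than a description of how 3ML assembles its likelihood.

```python
import math


def poisson_log_pmf(counts, expected):
    """Log Poisson probability of `counts` given a mean `expected` (> 0)."""
    return counts * math.log(expected) - expected - math.lgamma(counts + 1)


def joint_neg_log_like(src_norm, bkg_norm, bkg_index, energies, S, B, t_s, t_b):
    """-log L with the background rate given by a power law in energy."""
    nll = 0.0
    for e, S_i, B_i in zip(energies, S, B):
        b_i = bkg_norm * (e / 100.0) ** bkg_index  # parametric background rate
        m_i = src_norm  # flat toy source rate
        nll -= poisson_log_pmf(S_i, t_s * (m_i + b_i))  # total observation
        nll -= poisson_log_pmf(B_i, t_b * b_i)  # background observation
    return nll


# toy data (hypothetical), roughly what bkg_index = -1.5 would produce
energies = [50.0, 100.0, 200.0]
S, B = [24, 15, 12], [28, 10, 4]
nll_good = joint_neg_log_like(1.0, 0.5, -1.5, energies, S, B, 10.0, 20.0)
nll_bad = joint_neg_log_like(1.0, 0.5, 0.0, energies, S, B, 10.0, 20.0)
```

Because the background is described by a handful of shape parameters instead of one free rate per channel, channels with zero background counts still constrain the fit, and a minimizer only has to search over the combined source and background parameters.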

[13]:

# instance the source model... the background plugin has its model already specified
bpl = Broken_powerlaw(piv=300, xb=500)

bpl.K.bounds = (1e-5, 1e1)
bpl.xb.bounds = (1e1, 1e4)

ps_src = PointSource("source", 0, 0, bpl)

src_model = Model(ps_src)

jl_src = JointLikelihood(src_model, DataList(modeled_background_plugin))

_ = jl_src.fit()

23:26:20 INFO      set the minimizer to minuit                                             joint_likelihood.py:1017

Best fit values:

	result	unit
parameter
source.spectrum.main.Broken_powerlaw.K	1.95 -0.13 +0.14	1 / (keV s cm2)
source.spectrum.main.Broken_powerlaw.xb	(3.11 -0.12 +0.13) x 10^2	keV
source.spectrum.main.Broken_powerlaw.alpha	(3 +/- 8) x 10^-2
source.spectrum.main.Broken_powerlaw.beta	-3.11 +/- 0.20
K_1	(3.25 +/- 0.31) x 10^-1	1 / (keV s cm2)
index_1	-1.33 +/- 0.04
F_2	(2.3 +/- 0.4) x 10	1 / (s cm2)
sigma_2	(2.14 +/- 0.34) x 10	keV

Correlation matrix:

1.00	-0.59	0.73	0.05	0.10	-0.17	0.00	0.00
-0.59	1.00	-0.45	-0.64	0.02	0.26	-0.07	-0.03
0.73	-0.45	1.00	0.03	0.34	-0.25	-0.01	-0.00
0.05	-0.64	0.03	1.00	-0.23	-0.37	-0.01	0.01
0.10	0.02	0.34	-0.23	1.00	0.09	-0.04	-0.03
-0.17	0.26	-0.25	-0.37	0.09	1.00	-0.03	-0.03
0.00	-0.07	-0.01	-0.01	-0.04	-0.03	1.00	0.14
0.00	-0.03	-0.00	0.01	-0.03	-0.03	0.14	1.00

Values of -log(likelihood) at the minimum:

	-log(likelihood)
full	545.640339
total	545.640339

Values of statistical measures:

	statistical measures
AIC	1108.301954
BIC	1131.365759

[14]:

# overplot the joint background and source fits
fig = modeled_background_plugin.display_model(step=False)

_ = background_plugin.display_model(
    data_color="#1A68F0", model_color="#FF9700", model_subplot=fig.axes, step=False
)

../_images/notebooks_Background_modeling_24_0.png
