In [1]:
import ROOT
Welcome to JupyROOT 6.12/06

Open MonteCarlo files

These files were generated using Madgraph5_aMC@NLO showered with Pythia8 and simulated with Delphes.

This sounds very complicated but with a nominal setup of mg5 from here, taking a moment to run install pythia8 and install Delphes, and running the following commands will produce similar events.

import model heft

generate p p > h, h > a a
output Hgg_RUN
launch Hgg_RUN
generate p p > a a / h
output HggBackground_RUN
launch HggBackground_RUN

To further simplify matters however the resulting root files are slimmed of most of their branches (Taus, FatJets etc) such that the few branches that remain don’t need any additional classes to read. Additional information can be found here

In [2]:
sig_file = ROOT.TFile.Open("HggSignal.root")
signal_tree = sig_file.Get("Tree")
bkg_file = ROOT.TFile.Open("HggBackground.root")
background_tree = bkg_file.Get("Tree")
In [3]:
lumi = 25000

Define Histograms

In [4]:
mgg = ROOT.TH1F("mgg","di photon mass",60,100,160)
mgg_background = ROOT.TH1F("mgg_background","background di photon mass",60,100,160)

Define an Event Selection

for each tree we assert that there must be at least two photons, no leptons and the scalar sum of objects in the region must be larger than 100 GeV

further more we assert that the highest \(p_\mathrm{T}\) photon has at least 40 GeV of transverse energy and the next highest 30 GeV.

The isolation variable shows how far the energy clusters of the photon candidate are distributed about the mean.

In [5]:
def apply_cuts(tree, histogram):
    for event in tree:
        if event.n_photons >1 and event.n_leptons < 1 and event.ht > 100.:
            lead = ROOT.TLorentzVector(event.gamma_lead_pt,event.gamma_lead_eta,event.gamma_lead_phi,0.)
            sublead = ROOT.TLorentzVector(event.gamma_sublead_pt,event.gamma_sublead_eta,event.gamma_sublead_phi,0.)
            di_photon = lead + sublead
            if lead.Pt()> 40 and sublead.Pt()>30 and event.lead_isolation < 0.04 and event.sublead_isolation < 0.04:
                histogram.Fill(abs(di_photon.M()),event.weight*lumi)
    return histogram
In [6]:
mgg = apply_cuts(signal_tree, mgg)
c1 = ROOT.TCanvas()
mgg.Draw()
c1.Draw()
_images/examples-hgg_10_0.png

As can be seen, the application of cuts convolutes the shape of the distribution and the number of events produced by the MonteCarlo can have a large effect on the overall uncertainty

In [7]:
mgg_background = apply_cuts(background_tree, mgg_background)
c1 = ROOT.TCanvas()
mgg_background.Draw()
fitResultPtr = mgg_background.Fit("expo","S")
c1.Draw()
 FCN=80.7299 FROM MIGRAD    STATUS=CONVERGED      54 CALLS          55 TOTAL
                     EDM=6.1278e-07    STRATEGY= 1      ERROR MATRIX ACCURATE
  EXT PARAMETER                                   STEP         FIRST
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE
   1  Constant     9.13245e+00   5.70895e-02   3.35087e-05   1.35559e-01
   2  Slope       -2.40656e-02   4.59901e-04   2.70027e-07   1.57861e+01
_images/examples-hgg_12_1.png

Validate the background shape

In the above stage we assert that the background distribution form a exponential distribution.

We now validate (by eye) that the binning and distribution are sufficient for our needs.

In this case we scale the signal histogram by 5 to make it more visible.

In [8]:
ROOT.gStyle.SetOptStat(0)
mgg.SetFillColor(2)
mgg.SetStats(0)
mgg.Scale(5)
mgg_background.SetFillColor(4)
mgg_background.SetStats(0)
leg = ROOT.TLegend(0.5,0.6,0.9,0.9)
leg.AddEntry(mgg,"{:.2f} Higgs Signal Events".format(mgg.Integral()))
leg.AddEntry(mgg_background,"{:.2f} Background Events".format(mgg_background.Integral()))
stack = ROOT.THStack()
stack.Add(mgg_background)
stack.Add(mgg)
stack.Draw("HIST")
mgg_background.Draw("same")
leg.Draw()
stack.GetXaxis().SetTitle("Mass (GeV)")
stack.GetYaxis().SetTitle("Events per GeV")
c1.Draw()
_images/examples-hgg_14_0.png

Prepare Workspace

First we convert the histograms into histogram pdf objects.

Our range is the Mass range plotted above.

In [9]:
x = ROOT.RooRealVar("x","x",100,160)
l = ROOT.RooArgList(x)

RooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby
                Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University
                All rights reserved, please read http://roofit.sourceforge.net/license.txt

In [10]:
signalhist = ROOT.RooDataHist("sighist", "sighist", l, mgg)
sigpdf = ROOT.RooHistPdf("sigpdf","sigpdf",ROOT.RooArgSet(x),signalhist,0)
bkghist = ROOT.RooDataHist("bkghist", "bkghist", l, mgg_background)
bkgpdf = ROOT.RooHistPdf("bkgpdf","bkgpdf",ROOT.RooArgSet(x),bkghist,0)
[#1] INFO:DataHandling -- RooDataHist::adjustBinning(sighist): fit range of variable x expanded to nearest bin boundaries: [100,160] --> [100,160]
[#1] INFO:DataHandling -- RooDataHist::adjustBinning(bkghist): fit range of variable x expanded to nearest bin boundaries: [100,160] --> [100,160]

Create a composite model signal+background (gauss+bkg)

In [11]:
mean = ROOT.RooRealVar("mean","Mean of Gaussian",120,100,160)
sigma = ROOT.RooRealVar("sigma","Width of Gaussian",5,0,10)
hgammagamma = ROOT.RooGaussian("Hgammagamma","Gaussian with mean",x,mean,sigma)
In [12]:
hgammagamma.fitTo(signalhist,ROOT.RooFit.Range(100,160),ROOT.RooFit.PrintLevel(-1))
Out[12]:
<ROOT.RooFitResult object at 0x(nil)>
[#0] WARNING:InputArguments -- RooAbsPdf::fitTo(Hgammagamma) WARNING: a likelihood fit is request of what appears to be weighted data.
       While the estimated values of the parameters will always be calculated taking the weights into account,
       there are multiple ways to estimate the errors on these parameter values. You are advised to make an
       explicit choice on the error calculation:
           - Either provide SumW2Error(kTRUE), to calculate a sum-of-weights corrected HESSE error matrix
             (error will be proportional to the number of events)
           - Or provide SumW2Error(kFALSE), to return errors from original HESSE error matrix
             (which will be proportional to the sum of the weights)
       If you want the errors to reflect the information contained in the provided dataset, choose kTRUE.
       If you want the errors to reflect the precision you would be able to obtain with an unweighted dataset
       with 'sum-of-weights' events, choose kFALSE.
[#1] INFO:Eval -- RooRealVar::setRange(x) new range named 'fit' created with bounds [100,160]
[#1] INFO:Fitting -- RooAbsOptTestStatistic::ctor(nll_Hgammagamma_sighist) constructing test statistic for sub-range named fit
[#1] INFO:Eval -- RooRealVar::setRange(x) new range named 'NormalizationRangeForfit' created with bounds [100,160]
[#1] INFO:Eval -- RooRealVar::setRange(x) new range named 'fit_nll_Hgammagamma_sighist' created with bounds [100,160]
[#1] INFO:Fitting -- RooAbsOptTestStatistic::ctor(nll_Hgammagamma_sighist) fixing interpretation of coefficients of any RooAddPdf to full domain of observables
[#1] INFO:Minization -- RooMinimizer::optimizeConst: activating const optimization
[#1] INFO:Minization -- RooMinimizer::optimizeConst: deactivating const optimization
In [13]:
c = ROOT.TCanvas()
plot = x.frame(ROOT.RooFit.Title("Signal Mass"))
plot.SetTitle("")
plot.GetYaxis().SetTitleOffset(1.)
plot.GetYaxis().SetTitleSize(0.05)
plot.GetXaxis().SetTitleSize(0.05)
plot.GetXaxis().SetTitleOffset(.6)
plot.SetXTitle(r"$m_{\gamma\gamma}$ (GeV)")
signalhist.plotOn(plot,ROOT.RooFit.Name("data"))
hgammagamma.plotOn(plot,ROOT.RooFit.Name("hgg"))

l = ROOT.TLegend( 0.5, 0.6, 0.9, 0.9)
dataobj = plot.findObject("data")
hobj = plot.findObject("hgg")

l.AddEntry( dataobj , "MC Data", "pl" )
l.AddEntry( hobj , "{0:0.0f}GeV Higgs mass ".format(mean.getVal()), "l"  )
l.SetTextSizePixels(400)
plot.Draw()
l.Draw()
c.Draw()
[#1] INFO:InputArguments -- RooAbsData::plotOn(sighist) INFO: dataset has non-integer weights, auto-selecting SumW2 errors instead of Poisson errors
[#1] INFO:Plotting -- RooAbsPdf::plotOn(Hgammagamma) p.d.f was fitted in range and no explicit plot,norm range was specified, using fit range as default
[#1] INFO:Plotting -- RooAbsPdf::plotOn(Hgammagamma) only plotting range 'fit_nll_Hgammagamma_sighist'
[#1] INFO:Plotting -- RooAbsPdf::plotOn(Hgammagamma) p.d.f. curve is normalized using explicit choice of ranges 'fit_nll_Hgammagamma_sighist'
_images/examples-hgg_21_1.png

and repeat for Background`

In [14]:
lamb = ROOT.RooRealVar("lambda","Decay rate of exponential",-1,-2,0)
nonhiggs = ROOT.RooExponential("ztautau","Exponential",x,lamb)
In [15]:
nonhiggs.fitTo(bkghist,ROOT.RooFit.Range(100,160),ROOT.RooFit.PrintLevel(-1))
Out[15]:
<ROOT.RooFitResult object at 0x(nil)>
[#0] WARNING:InputArguments -- RooAbsPdf::fitTo(ztautau) WARNING: a likelihood fit is request of what appears to be weighted data.
       While the estimated values of the parameters will always be calculated taking the weights into account,
       there are multiple ways to estimate the errors on these parameter values. You are advised to make an
       explicit choice on the error calculation:
           - Either provide SumW2Error(kTRUE), to calculate a sum-of-weights corrected HESSE error matrix
             (error will be proportional to the number of events)
           - Or provide SumW2Error(kFALSE), to return errors from original HESSE error matrix
             (which will be proportional to the sum of the weights)
       If you want the errors to reflect the information contained in the provided dataset, choose kTRUE.
       If you want the errors to reflect the precision you would be able to obtain with an unweighted dataset
       with 'sum-of-weights' events, choose kFALSE.
[#1] INFO:Fitting -- RooAbsOptTestStatistic::ctor(nll_ztautau_bkghist) constructing test statistic for sub-range named fit
[#1] INFO:Eval -- RooRealVar::setRange(x) new range named 'fit_nll_ztautau_bkghist' created with bounds [100,160]
[#1] INFO:Fitting -- RooAbsOptTestStatistic::ctor(nll_ztautau_bkghist) fixing interpretation of coefficients of any RooAddPdf to full domain of observables
[#1] INFO:Minization -- RooMinimizer::optimizeConst: activating const optimization
[#1] INFO:Minization -- RooMinimizer::optimizeConst: deactivating const optimization
In [16]:
c = ROOT.TCanvas()
plot = x.frame(ROOT.RooFit.Title("Background Mass"))
plot.SetTitle("")
plot.GetYaxis().SetTitleOffset(1.)
plot.GetYaxis().SetTitleSize(0.05)
plot.GetXaxis().SetTitleSize(0.05)
plot.GetXaxis().SetTitleOffset(.6)
plot.SetXTitle(r"$m_{\gamma\gamma}$ (GeV)")
bkghist.plotOn(plot,ROOT.RooFit.Name("data"))
nonhiggs.plotOn(plot,ROOT.RooFit.Name("QCD"))

l = ROOT.TLegend( 0.5, 0.6, 0.9, 0.9)
dataobj = plot.findObject("data")
bobj = plot.findObject("QCD")

l.AddEntry( dataobj , "MC Data", "pl" )
l.AddEntry( bobj , "Background mass distribution", "l"  )
l.SetTextSizePixels(400)
plot.Draw()
l.Draw()
c.Draw()
[#1] INFO:InputArguments -- RooAbsData::plotOn(bkghist) INFO: dataset has non-integer weights, auto-selecting SumW2 errors instead of Poisson errors
[#1] INFO:Plotting -- RooAbsPdf::plotOn(ztautau) p.d.f was fitted in range and no explicit plot,norm range was specified, using fit range as default
[#1] INFO:Plotting -- RooAbsPdf::plotOn(ztautau) only plotting range 'fit_nll_ztautau_bkghist'
[#1] INFO:Plotting -- RooAbsPdf::plotOn(ztautau) p.d.f. curve is normalized using explicit choice of ranges 'fit_nll_ztautau_bkghist'
_images/examples-hgg_25_1.png

and combine into a single model

In [17]:
fsig = ROOT.RooRealVar("fsig","signal fraction",0.02,0.,1.)

model = ROOT.RooAddPdf("model","model",ROOT.RooArgList(hgammagamma,nonhiggs),ROOT.RooArgList(fsig))
In [18]:
c1 = ROOT.TCanvas()
xframe = x.frame(ROOT.RooFit.Title("Composite Model"))
model.plotOn(xframe)
bkg_component = ROOT.RooArgSet(nonhiggs)
model.plotOn(xframe,ROOT.RooFit.Components(bkg_component),ROOT.RooFit.LineStyle(2),ROOT.RooFit.LineColor(2))
xframe.Draw()
c1.Draw()
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) directly selected PDF components: (ztautau)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) indirectly selected PDF components: ()
_images/examples-hgg_28_1.png

and now import all the models into the workspace

In [19]:
w = ROOT.RooWorkspace("w")
getattr(w,'import')(sigpdf)
getattr(w,'import')(bkgpdf)
getattr(w,'import')(model)
Out[19]:
False
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing dataset sighist
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing RooHistPdf::sigpdf
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing RooRealVar::x
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing dataset bkghist
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing RooHistPdf::bkgpdf
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing RooAddPdf::model
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing RooGaussian::Hgammagamma
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing RooRealVar::mean
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing RooRealVar::sigma
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing RooRealVar::fsig
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing RooExponential::ztautau
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing RooRealVar::lambda

generating data from model

In [20]:
datamodel = ROOT.RooAddPdf("datamodel","datamodel",ROOT.RooArgList(sigpdf,bkgpdf),ROOT.RooArgList(fsig))
data = datamodel.generate(ROOT.RooArgSet(x),10000)
In [21]:
c1 = ROOT.TCanvas()
xframe = x.frame(ROOT.RooFit.Title("Composite Model"))
data.plotOn(xframe)
model.plotOn(xframe)
bkg_component = ROOT.RooArgSet(nonhiggs)
model.plotOn(xframe,ROOT.RooFit.Components(bkg_component),ROOT.RooFit.LineStyle(2),ROOT.RooFit.LineColor(2))
xframe.Draw()
c1.Draw()
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) directly selected PDF components: (ztautau)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) indirectly selected PDF components: ()
_images/examples-hgg_33_1.png
In [22]:
getattr(w,'import')(data)
Out[22]:
False
[#1] INFO:ObjectHandling -- RooWorkspace::import(w) importing dataset wu

Set the Model Configuration and Save

In [23]:
mc = ROOT.RooStats.ModelConfig("ModelConfig",w)
mc.SetPdf(model)
mc.SetParametersOfInterest(ROOT.RooArgSet(w.var("fsig")))
mc.SetObservables(ROOT.RooArgSet(w.var("x")))
getattr(w,'import')(mc)
Out[23]:
False
In [24]:
w.writeToFile("HgammagammaWS.root",True)
Out[24]:
False