Development of machine learning based tau trigger algorithms and search for Higgs boson pair production in the bbtautau decay channel with the CMS detector at the LHC

Jona Motta

Abstract

This thesis presents a study of Higgs boson pair (HH) production in the final state with a pair of b quarks and a pair of τ leptons (bbττ), exploiting proton-proton collision data collected at a centre-of-mass energy of 13 TeV with the CMS detector at the CERN Large Hadron Collider (LHC), corresponding to 138 fb^-1 accumulated during the Run 2 data-taking period (2015-2018). The bbττ decay channel offers a good trade-off between a sizable branching fraction (7.3%) and the purity of the τ selection, which ensures good rejection of the background contributions. The study of HH production gives access to the measurement of the Higgs boson self-coupling (λ3h). In the context of the standard model (SM), this coupling is the only parameter governing the shape of the Higgs potential and it is precisely predicted by the theory; therefore, a measurement of λ3h is a test of the validity of the SM and sheds light on the process of electroweak symmetry breaking. In the context of beyond-the-SM (BSM) theories, with a particular interest in effective field theories, λ3h can assume values larger than that predicted by the SM, greatly enhancing the HH production cross section; the measurement of deviations from the SM prediction would open the road to a new era of physics. Upper limits at 95% confidence level (CL) are set on the SM signal at around 3 and 124 times the SM prediction for σ(gg->HH) and σ(qq->HH), respectively. The results are also interpreted in the context of 20 independent BSM scenarios, for which 95% CL limits are set. The experimental context of this thesis is the restart of LHC operations in 2022 for Run 3, a new phase with collisions at an energy of 13.6 TeV and an instantaneous luminosity of 2-2.2 x 10^34 cm^-2 s^-1. In Run 3, the hardware capabilities of the CMS Level-1 trigger (L1T) are unchanged with respect to Run 2.
This requires the development of bolder and more sophisticated approaches to optimise the available algorithms and guarantee the success of the CMS physics program. Especially interesting is the optimisation of the part of the L1T that exploits calorimetric information. As part of this thesis, a new machine learning method, based on a neural network, has been developed for the calibration applied in the L1T to calorimeter energy deposits; it exploits data for the calibration of single detector objects, and its promising performance is evaluated against the offline reconstruction of electrons and hadronic jets. The calorimetric information is then used by the algorithm for the reconstruction and identification of hadronically decaying τ leptons (τh), whose optimisation for Run 3 is performed in this thesis employing a new, simple, and more informative approach; the performance of this approach is evaluated using Z->ττ events collected during 2022. At the same time, the CMS collaboration is working towards its Phase 2 upgrade program, which is intended to match the ambitious High-Luminosity LHC (HL-LHC) physics program, starting in 2029. The considerably increased volume of data collected at the HL-LHC will ensure the statistical power for the detailed study of λ3h and possibly its measurement; on the other hand, the larger instantaneous luminosity will require the full replacement of the L1T with hardware of increased capabilities, based on state-of-the-art field-programmable gate arrays (FPGAs), to collect data efficiently. To exploit the FPGA capabilities to the maximum, a new machine learning algorithm for the reconstruction, identification, and calibration of τh candidates in the L1T has been developed as part of this thesis. This algorithm exploits convolutional neural networks implemented in FPGA firmware and ensures largely enhanced performance compared to standard approaches.
All the technical advances developed within this thesis share one goal: improving the sensitivity of CMS analyses to the measurement of the Higgs boson self-coupling during the ongoing and future runs of the LHC.
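
As a rough cross-check of two numbers behind the abstract, the short Python sketch below reproduces the 7.3% branching fraction of HH->bbττ and the size of the SM self-coupling. The input values (single-Higgs branching fractions, Higgs boson mass, vacuum expectation value) and the function names are assumptions introduced here for illustration, not quantities quoted in the abstract; the self-coupling uses the common convention λ = m_H^2 / (2 v^2), which may differ from the normalisation adopted in the thesis.

```python
# Assumed inputs (not taken from the abstract): approximate SM values
# for a Higgs boson mass of about 125 GeV.
BR_H_BB = 0.582        # assumed BR(H -> bb)
BR_H_TAUTAU = 0.0627   # assumed BR(H -> tautau)
M_H_GEV = 125.0        # assumed Higgs boson mass in GeV
V_GEV = 246.22         # assumed vacuum expectation value in GeV


def br_hh_bbtautau(br_bb=BR_H_BB, br_tautau=BR_H_TAUTAU):
    """Branching fraction of HH -> bb tautau.

    The factor 2 counts the two possible assignments: either Higgs
    boson can decay to bb while the other one decays to tautau.
    """
    return 2.0 * br_bb * br_tautau


def sm_self_coupling(m_h=M_H_GEV, v=V_GEV):
    """SM self-coupling in the convention lambda = m_H^2 / (2 v^2)."""
    return m_h ** 2 / (2.0 * v ** 2)


if __name__ == "__main__":
    print(f"BR(HH -> bb tautau) = {br_hh_bbtautau():.3f}")
    print(f"lambda (SM)         = {sm_self_coupling():.3f}")
```

With these assumed inputs the branching fraction comes out close to the 7.3% quoted above, and the self-coupling is fully fixed once the Higgs boson mass and the vacuum expectation value are known, which is why its measurement is a test of the SM.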

