Report

Teaser, summary, work performed and final results

Periodic Reporting for period 2 - AMVA4NewPhysics (Advanced Multi-Variate Analysis for New Physics Searches at the LHC)

Summary

Over the last 40 years the Standard Model (SM) has received increasingly strong experimental confirmation. There are, however, compelling reasons to believe the SM is not a complete theory but only an “effective” low-energy one, breaking down at energies higher than those probed so far. The Higgs boson may constitute the door through which a whole class of new phenomena and a deeper understanding of Nature can be accessed. Experiments are also looking for new particles predicted in many SM extensions, but it is equally important to pursue a model-independent approach and search for any rare new processes that may be hiding in the high-energy collisions. It is therefore necessary to broaden the ways of conducting searches for new physics.

The advent of machine learning (ML) techniques has dramatically changed the potential of data analysis. This ITN aimed to tackle the two big challenges mentioned above with cutting-edge ML tools, optimizing existing ones as well as developing new ones. The program also gave us a chance to train a new generation of data scientists; physics has always been a breeding ground for skilled individuals. Moreover, high-energy physics (HEP) has a history of developing tools that later become exceptionally important for society as a whole (e.g. the World Wide Web, proton therapy). Hence we claim that research in HEP, employing the ML techniques now available, may produce important new advancements for tomorrow's society.

The overall objectives of the project have been:
O1: Develop and improve advanced ML tools for data analysis in particle physics.
O2: Bring together academic and non-academic partners to create innovative training opportunities for talented students in statistical learning, computational tools, and data science.
O3: Deepen our knowledge of Nature by providing answers to fundamental physics questions with the LHC.

Now that the ITN has concluded its operation, we observe that we fully succeeded in achieving objective O1. Indeed, we substantially improved the performance of existing ML tools, and we delivered entirely new tools that promise ground-breaking advances in the quality of data analysis and in the overall extraction of physics knowledge from the available data. These claims are supported by the deliverables of work packages 1, 2, 3, and 4.
For O2, we produced new high-level training opportunities for our students and for others who attended our open events. In particular, the ESRs involved were able to include in their training plans a number of excellent workshops, schools, and lectures offered by both academic and non-academic instructors. Another notable outcome is the very successful interaction with industrial partners, which in some cases allowed a perfect synergy with the work plan of the ESRs (the YANDEX secondments were appreciated for the insight offered by personnel involved in applying the new software technologies to fundamental research) and in others provided training opportunities in real-life applications of ML.

Concerning objective O3, the investigation of fundamental physics proceeds by both incremental and disruptive advancements. The latter are unpredictable and exceedingly rare. While the LHC has been producing a large number of exquisite scientific results, in many cases benefitting from the very work of our ESRs, we cannot in earnest claim to have answered fundamental physics questions in radically new ways. What we certainly have achieved is a strengthening and an improvement of the potential of the ATLAS and CMS experiments to pursue that long-term goal.

In conclusion, we feel that the ITN has contributed significantly to physics advancements, has trained in an optimal way a cohort of bright young researchers who are now starting careers in research and outside academia, and has created new ML tools that promise to strengthen our capability of extracting more information from data, both in science and in industry.

Work performed

Below we summarize some of the most relevant results of our action in the reporting period, organized according to the objectives described above.
For O1:
• Developed MoMEMta, a tool for matrix element calculations that greatly eases the comparison of experimental data with theoretical predictions. The tool is public and is now used by LHC analysts in many searches and studies.
• Developed INFERNO, an innovative optimization framework, aware of systematic uncertainties, for neural network (NN)-powered measurement of the signal contamination of datasets. The algorithm is being tested to improve measurements at the LHC and is being considered for external applications. It has been described at conferences and in a gold open access journal, and it is public on GitHub.
• Adapted deep NNs to identify the originating parton in hadronic jets, strongly improving the reach of the CMS experiment in searches as well as measurements. The DeepJet and DeepFlavour algorithms are the default instrument for these tasks and are used by a large number of analyses in CMS.
• Exploited and adapted advanced multivariate tools (matrix-element-based and NN-based) to measure in more detail the properties of Higgs boson decays to bottom quarks and tau leptons in the ATLAS and CMS experiments. The work of ESRs in this area contributed significantly to the first observation of the Hbb decay mode of the Higgs boson and resulted in a number of highly cited publications.

For O2:
• Worked in synergy with four industrial partners, creating the conditions for ESR-tailored secondments, where ESRs could acquire new skills in project reporting, teamwork, handling of privately owned data, software development, and problem solving, as well as increase their knowledge of specific applications. In all cases the ESRs reported a high level of satisfaction with the experience.
• Cooperated with YANDEX to support a yearly school of Machine Learning for particle physics. The school is of extremely high quality and will outlast the ITN.

For O3:
• Contributed to the ATLAS observation of the decay of the Higgs boson to bottom-quark pairs in final states with top quark pairs and with vector bosons.
• Contributed to the ATLAS observation of the associated production of Higgs bosons and top quark pairs, and measured the coupling of the Higgs boson to top quarks.
• Contributed to the CMS determination of a constraint on the self-coupling of the Higgs boson.

Final results

The products delivered by the network constitute progress beyond the state of the art in the considered field of research. The new algorithms have already been published in refereed journals or as proceedings articles and presented at international conferences. Their use within the experimental collaborations has already started and has produced significant advances in LHC analyses.

The potential impact on society of the ITN action is very high, as the discovery of new physics (eased by the added sensitivity provided by our statistical learning procedures) has the potential to completely revolutionize our understanding of fundamental physics as well as of the universe. A discovery of corpuscular dark matter producible at accelerators, for example, or of a fifth force of nature, could impact society in a way that is beyond anybody's guess. As already stated, the realization of this potential is unfortunately not in our hands, although we have certainly facilitated it with our action.

Website & more info

More info: https://amva4newphysics.wordpress.com.