What is the Top Quark Mass?

In this review I give an overview of the conceptual issues involved in the question of how to interpret so-called `direct top quark mass measurements', which are based on the kinematic reconstruction of top quark decay products at the Large Hadron Collider (LHC). These measurements quote the top mass parameter $m_t^{\rm MC}$ of Monte-Carlo event generators with current uncertainties of around $0.5$ GeV. At present the problem of finding a rigorous relation between $m_t^{\rm MC}$ and top mass renormalization schemes defined in field theory is unresolved and touches perturbative as well as nonperturbative aspects and the limitations of state-of-the-art Monte-Carlo event generators. I review the status of LHC top mass measurements, illustrate how conceptual limitations enter, and explain a controversy that has permeated the community in the context of the interpretation problem. Recent advances in acquiring first-principle insights are summarized, and it is outlined what else has to be understood to fully resolve the issue. For the time being, I give a recommendation on how to deal with the interpretation problem when making top mass dependent theoretical predictions.

INTRODUCTION
The top quark is the heaviest particle of the Standard Model of elementary particle physics (SM).
The currently most precise determinations of its mass come from so-called 'direct measurements'. These are based on the experimental kinematic reconstruction of the final-state top quark decay products (which are bottom quark jets, light quark jets from W boson decays and leptons) and the comparison of kinematic distributions one can construct from the 4-momenta of the decay products with descriptions of the same quantities obtained from multipurpose Monte Carlo event generators (MMCs). These measurements determine the top mass parameter of the MMC and yield a world average of $m_t^{\rm MC} = 172.9 \pm 0.4$ GeV (1). This amounts to an impressive relative precision of 0.2% and makes the top quark mass the most precisely known parameter in the strong interaction sector of the SM, called quantum chromodynamics (QCD). For the high luminosity phase of the LHC (HL-LHC) it is projected that uncertainties as small as 200 MeV can be reached from direct top mass measurements (2).
The major portion of the top quark's mass is generated through the electroweak Higgs mechanism (3,4,5), which also gives all other elementary particles of the SM their masses. The precise knowledge of the elementary particle masses and their couplings is an important element in consistency tests of the SM and in indirect searches for physics beyond the SM. Because the hopes for discoveries of non-SM elementary particles at the LHC have so far not been fulfilled, indirect new physics searches, which focus on finding deviations between experimental data and SM predictions, have become increasingly important. This requires a high level of precision and a thorough and systematic understanding of subtle experimental as well as theoretical issues. The top quark plays a special role here because its large mass makes it a highly sensitive probe of the structure of the SM Higgs sector and an important ingredient in models of physics beyond the SM. For these applications it is the electroweak part of the top quark mass one seeks to know with the highest possible precision. It is frequently stated that, due to its short lifetime ($1/\tau_t = \Gamma_t = 1.42^{+0.19}_{-0.15}$ GeV (1)), the top quark, even though it carries strong color charge, is protected from low-scale hadronization effects and approximately behaves as a free particle.
The results for $m_t^{\rm MC}$ obtained from the direct measurements were frequently identified with the so-called top quark pole mass $m_t^{\rm pole}$, which is a popular renormalization scheme used for perturbative QCD computations at next-to-leading order (NLO) and beyond. The pole mass encodes, strictly within perturbation theory, the notion of the kinematic rest mass of the top quark as a real on-shell particle. The identification appeared natural because the top mass sensitivity of the kinematic distributions entering the direct measurement analyses comes from resonance and endpoint structures that can be seen to be related to the kinematic properties of a top quark particle with a definite mass. I refer to distributions of this kind as 'observables with kinematic (top) mass sensitivity'. With the identification of $m_t^{\rm MC}$ and $m_t^{\rm pole}$, precise higher-order predictions for the SM electroweak potential (6,7,8,9,10) have been made. These, together with precise measurements of the Higgs boson mass (1), indicated that the SM is in a metastable state.$^1$ However, in recent years a discussion emerged whether, considering a precision of 0.5 GeV or better, the available NLO (and higher order) perturbative calculations and NLO-matched MMCs indeed control the QCD dynamics affecting the top quark mass, and the way the direct measurement observables depend on it, sufficiently well to justify the identification of $m_t^{\rm MC}$ with the pole mass (12,13,14,7,15,16,17,18,19,20,21,22,23).
Footnote 1: The coupling of the top quark to the Higgs boson generates the large electroweak portion of the mass of the top quark. The top quark mass conversely causes large quantum corrections to the Higgs self-coupling, which determines the Higgs mass and also the stability properties of the potential for the Higgs boson field and the SM vacuum. These quantum effects decrease the Higgs self-coupling for a larger top mass, with the possibility to destabilize the vacuum (11).
Here, the direct measurements, being the most precise and known to rely essentially entirely on the parton-shower and hadronization dynamics of the MMCs, were discussed most intensely. I call the associated set of physical issues the 'top mass interpretation problem'. The top mass interpretation problem is the question of the precise relation between $m_t^{\rm MC}$ and more fundamental and field theoretic mass definitions such as the pole mass, the $\overline{\rm MS}$ mass or other mass schemes. The origin of the problem is that the simple picture of a free top quark, that directly governs the visible structures in distributions with kinematic top mass sensitivity, is too naive and that the effects of QCD and electroweak quantum fluctuations must be accounted for to high precision. These quantum effects are governed by low energy scales at the level of 1 GeV even though the top mass is extremely large, and they can directly affect the extracted top mass if not described theoretically in an adequate way. What makes matters subtle is that the low-energy QCD dynamics is difficult to control theoretically because of large higher-order perturbative corrections and nonperturbative effects. The top mass interpretation problem emerges because the top mass sensitive kinematic distributions used for the direct measurements are so complicated that with the current technology their theoretical description can only be provided by MMCs. In the current generation of MMCs, however, the theoretical precision and quality of the low-energy parton-shower and hadronization dynamics cannot yet be systematically controlled at a level such that the identification of the top mass parameter $m_t^{\rm MC}$ with a field theoretic mass scheme such as the pole mass can be proven from first principles.
Probably the most confusing aspect of the emerging discussions has been that no consensus has been reached on how to estimate and even formulate the uncertainty associated with the top mass interpretation problem and how to deal with it in practical applications, see e.g. Ref. (2). Furthermore, the issue has not been discussed in a coherent fashion in the community, and the advocated points of view have shifted slightly over time. I call this aspect of the interpretation problem 'the controversy' because it is related to different views on the relevance of the physical aspects of the interpretation problem, but does not contribute in any way to a resolution of the physical questions. Meanwhile a number of alternative top mass measurement methods were devised for the LHC, partly with the motivation of applying methods that are either not affected by the interpretation problem or affected in a different way. These methods are still less precise than the direct measurements and have their own subtleties once their precision increases. The situation is reflected in an interesting way in the Review of Particle Physics (1), where the top is the only quark for which three different masses are quoted in the particle listings. I want to emphasize clearly, however, that in hadron-hadron collisions, where the underlying hard interactions that are the basis of the observables unavoidably involve partons in non-singlet color configurations, the conceptual issues that affect the direct measurements eventually emerge for all top mass measurement methods once a precision of 0.5 GeV or better is reached.
The interpretation problem consists of a complex set of issues and requires significant theoretical progress on multiple fronts, predominantly beyond the realm of fixed-order perturbative calculations. The issue can be resolved in a straightforward and fully transparent way only once at least next-to-leading-logarithmic (NLL) precise parton showers and MMC event generators have become available (for the observables entering the top mass measurements). The latter should be capable of describing the top quark decay and the non-perturbative aspects of color neutralization in a systematic manner such that the field theory aspects of the MMC top mass parameter (and even the strong coupling parameter) are well-defined and can be determined from a simple computation. Such developments clearly require a dedicated and long-term effort that will also benefit all other aspects of collider physics. Even if this may be a bit too much to hope for, I believe that, by addressing these issues through dedicated studies, already much can be learned, so that the controversial situation can be lifted at least partially for some of the direct top mass measurement methods.
In this review, I explain the questions involved in the top mass interpretation problem from a physical and conceptual perspective. I hope that the review allows the reader to gain a better understanding of the physical and systematic aspects of the interpretation problem (and also the controversy) and an appreciation of the problems to be resolved. I am trying to be as non-technical as possible, but the problem is of a subtle theoretical nature. For simplicity all formulas shown are either to be understood as generic or truncated at $\mathcal{O}(\alpha_s)$ or NLO, even though higher order corrections are mostly known. All numerical results quoted have been computed including all available higher order corrections, and I use $\alpha_s^{(n=5)}(M_Z) = 0.118$ as the reference input for the strong coupling. I apologize for any missing references.
The review is organized as follows. In Sec. 2 the physical aspects of different top quark renormalization schemes are reviewed. This section serves as a basis for the following discussions, but I emphasize that a mere discussion of top mass schemes does not resolve the top mass interpretation problem. In Sec. 3 an overview is given of the current status of top quark mass measurements and the theoretical tools employed. Here I focus on the limitations of the theoretical tools, which are the origin of the top mass interpretation problem, and the role they play for the other top mass measurement methods. In Sec. 4 I then phrase the controversy in a set of formulae which can be discussed in a concrete way. In Sec. 5 recent work is reviewed which quantifies the interpretation problem numerically. In particular, I discuss the recent work of Ref. (24), where some first conceptual and concrete analytical insights into the perturbative (parton level) aspects of the interpretation problem were gained. Finally, in Sec. 6 I wrap up and conclude with a recommendation on how to practically deal with the top mass interpretation problem at this time. Appendix A contains a basic glossary.

The Principle of Top Mass Determinations
Since the top quark, just like all other quarks which carry strong interaction color charge, is not a physically observable particle, its mass is a quantity that needs to be defined from a theoretical prescription in quantum field theory, called a renormalization scheme or just a mass scheme. The choice of a scheme is in principle arbitrary. The usefulness and systematics of this concept arise from the facts that we can make precise predictions for a physical observable in a given scheme and that the predictions in different schemes can be related to each other by a theoretical calculation. Since contemporary high-energy physics for the most part considers observables where perturbative computations can be used, usually only quark mass schemes defined strictly within perturbation theory are considered. This is the view commonly accepted in high-energy collider physics, and I also adopt it in this review. So given two top mass schemes (called $m_t^A$ and $m_t^B$), the relation between the two can be described by a perturbative series of the form given in Eq. (1). For simplicity I only indicate powers of the strong coupling $\alpha_s$ and suppress contributions from the electromagnetic and weak couplings. The precision of the relation is limited by the ability to calculate and then to evaluate the truncated series in a meaningful way. I emphasize this because perturbative series in non-Abelian gauge theories such as QCD are in general asymptotic and not convergent. I will come back to this point below. Given a top mass sensitive observable $\sigma$, where I mostly refer to different kinds of cross sections, the corresponding perturbative (and also asymptotic) series, called the 'parton level' cross section, can be written as $\hat\sigma(Q, m_t^X, \alpha_s(\mu), \mu; \delta m^X)$. The energy $Q$ of the process, the top mass $m_t^X$ in scheme $X$, the strong coupling $\alpha_s(\mu)$ and its renormalization scale $\mu$ appear as explicit arguments. Furthermore, the separated argument $\delta m^X$ stands for the dependence of the series on the scheme choice $X$.
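The perturbative relation between two mass schemes described above can be sketched schematically as follows. This is only a generic illustration: the coefficient symbols $c_n$ and the truncation order $N$ are notation assumed here, not taken from the text, and electroweak contributions are suppressed as stated above.

```latex
% Schematic sketch only; c_n and N are illustrative notation (assumptions).
% The coefficients c_n carry the dimension of a mass and depend on the two
% schemes A and B as well as on the renormalization scale \mu.
\begin{equation*}
  m_t^{A} - m_t^{B}
  \;=\; \sum_{n=1}^{N} c_n \,\alpha_s^{n}(\mu)
  \;+\; \mathcal{O}\!\left(\alpha_s^{N+1}\right)
\end{equation*}
```

Because the series is asymptotic, increasing $N$ does not improve the relation indefinitely; in practice the usable precision is set by the known orders and by how the truncated sum is evaluated.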
By construction, the perturbative series in the two mass schemes are formally equivalent:$^2$ $\hat\sigma(Q, m_t^A, \alpha_s(\mu), \mu; \delta m^A) = \hat\sigma(Q, m_t^B, \alpha_s(\mu), \mu; \delta m^B)$.
However, in practice they differ due to the truncation of the series and our limited ability to calculate and sum the series. For the strong coupling $\alpha_s$ the freedom of scheme exists as well. Within the commonly accepted paradigm of using dimensional regularization and the so-called $\overline{\rm MS}$ prescription for $\alpha_s$ (which are explained in the next subsection), this is signified by the dependence on the renormalization scale $\mu$. A useful aspect of the coupling $\alpha_s(\mu)$ is that one can interpret $\mu$ as the momentum scale above which all quantum corrections to the fundamental QCD gluon interactions are absorbed into $\alpha_s(\mu)$. So, frequently, for the choice $\mu \sim Q$ (particularly when $Q \gg m_t$) the resulting perturbation series behave quite well. This way an important set of logarithmic corrections is also summed up to all orders in perturbation theory. The numerical differences between the different reasonable scheme choices considered are then typically used to estimate the theoretical uncertainties of the parton level cross section.
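The scale dependence of $\alpha_s(\mu)$ mentioned above can be made concrete with a minimal sketch of the one-loop running, starting from the reference input $\alpha_s(M_Z) = 0.118$ used in this review. The value of $M_Z$, the restriction to one-loop accuracy and the fixed number of five flavors (no threshold matching) are simplifying assumptions made here; the numbers quoted in the review rely on higher-order running.

```python
import math

ALPHA_S_MZ = 0.118  # reference input quoted in the text (five active flavors)
M_Z = 91.1876       # GeV; assumed Z boson mass, not quoted in the text
N_F = 5             # active flavors, kept fixed here (assumption: no thresholds)
BETA0 = 11.0 - 2.0 * N_F / 3.0  # one-loop QCD beta-function coefficient


def alpha_s_one_loop(mu: float) -> float:
    """One-loop running coupling at the scale mu (in GeV), evolved from M_Z.

    Solves d(alpha_s)/d(ln mu) = -BETA0 * alpha_s**2 / (2*pi).
    """
    inverse = 1.0 / ALPHA_S_MZ + BETA0 / (2.0 * math.pi) * math.log(mu / M_Z)
    return 1.0 / inverse


if __name__ == "__main__":
    # The coupling grows towards low scales and shrinks towards high scales.
    for mu in (2.0, 10.0, M_Z, 173.0, 1000.0):
        print(f"mu = {mu:8.2f} GeV   alpha_s(mu) = {alpha_s_one_loop(mu):.4f}")
```

Varying `mu` around the physical scale of a process and re-evaluating a truncated series is the kind of scale variation that is commonly used to estimate perturbative uncertainties.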
For the extraction of the top mass (and any other QCD parameter) from an experimentally measured cross section $\sigma^{\rm exp}$, however, the parton level cross section does not provide the full answer, and one also has to account for nonperturbative corrections, as expressed in Eq. (3). Here, $\Lambda_{\rm QCD}$ stands for a nonperturbative scale with typical size of a few hundred MeV that governs the magnitude of the nonperturbative correction $\sigma^{\rm NP}$. The form of Eq. (3) is schematic and also accounts for the nonperturbative effects in the parton distribution functions needed to calculate cross sections at the LHC. For most cross sections one has $Q \gg \Lambda_{\rm QCD}$ and typically $\hat\sigma > \sigma^{\rm NP}$. Since we consider mass schemes strictly defined within perturbation theory, the relations between mass schemes shown in Eq. (1) are free of nonperturbative corrections, so that switching between mass schemes never modifies the structure and the content of $\sigma^{\rm NP}$. The precision of the top mass extraction depends on the ability to calculate the perturbative cross section $\hat\sigma$ and to determine the nonperturbative correction $\sigma^{\rm NP}$. Likewise, the uncertainty in the extracted top mass arises from the combined uncertainties in $\hat\sigma$ and $\sigma^{\rm NP}$, where one has to keep in mind that $\sigma^{\rm NP}$ is per se not responsible for the top mass dependence of the observable $\sigma^{\rm exp}$. The most preferred observables for top mass measurements (and in general) are those where $\sigma^{\rm NP}$ vanishes as $(\Lambda_{\rm QCD}/Q)^n$ with $n \geq 2$ in the limit $\Lambda_{\rm QCD}/Q \to 0$, because then the contributions of $\sigma^{\rm NP}$ can be very small (since $Q, m_t \gg \Lambda_{\rm QCD}$). An example of such a "clean" observable is the total inclusive $t\bar t$ cross section in $e^+e^-$ annihilation, for which the observable-initiating hard reaction is the production of a color-singlet $t\bar t$ pair via the process $e^+e^- \to \gamma, Z \to t\bar t$.$^3$ Here, $n = 4$ and the nonperturbative corrections are negligible for most applications. Unfortunately, at the LHC such clean cross sections do not exist.
Footnote 2: I suppress the dependence on the masses of other quarks or leptons.
This is because non-singlet color configurations are unavoidable for the observable-initiating hard reactions when partons (i.e. quarks and gluons that emerge from the colliding protons) appear in the scattering initial state and when jet formation is crucial for the construction of an observable. Therefore, color neutralization processes that are linearly sensitive to soft and nonperturbative momenta are unavoidable, and $\sigma^{\rm NP}$ always depends linearly on $\Lambda_{\rm QCD}$. Thus we cannot get around dealing with $\sigma^{\rm NP}$ at the LHC.
Figure 1: Self-energy at NLO for the top quark with four-momentum $p^\mu$.
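The difference between a "clean" observable with $n = 4$ and the linear sensitivity ($n = 1$) that is unavoidable at the LHC can be illustrated with simple arithmetic. The sketch below only evaluates the scaling factor $(\Lambda_{\rm QCD}/Q)^n$; the choices $\Lambda_{\rm QCD} = 0.3$ GeV (within the "few hundred MeV" quoted above) and $Q = 350$ GeV (roughly twice the top mass) are illustrative assumptions, and the function name is hypothetical.

```python
def np_suppression(q_gev: float, n: int, lambda_qcd_gev: float = 0.3) -> float:
    """Return the scaling factor (Lambda_QCD / Q)**n of a nonperturbative term.

    lambda_qcd_gev = 0.3 is an assumed illustrative value, not a fitted one.
    """
    return (lambda_qcd_gev / q_gev) ** n


if __name__ == "__main__":
    q = 350.0  # GeV; assumed hard scale, roughly 2 * m_t
    for n in (1, 2, 4):
        print(f"n = {n}:  (Lambda_QCD/Q)^n = {np_suppression(q, n):.3e}")
```

The factor drops by many orders of magnitude between $n = 1$ and $n = 4$, which is why quartically suppressed corrections can usually be neglected while linear ones have to be controlled explicitly.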

Top Mass Renormalization Schemes
In analogy to adopting an adequate choice for the renormalization scale $\mu$ of the strong coupling $\alpha_s(\mu)$, one also adopts an adequate top quark mass scheme. The central formal aspect of top quark mass renormalization is to absorb the UV divergence that arises in the NLO self-energy diagram, see the generic illustration in Fig. 1. Here, Eq. (4) displays the dominant contribution in the resonance limit $p^2 \to m_t^2$ within a calculation in $d = 4 - 2\epsilon$ space-time dimensions. Using $d$ dimensions is the standard way to regularize ultraviolet (UV) divergences in perturbative QCD computations and is called 'dimensional regularization'. The ellipsis in Eq. (4) stands for higher order corrections proportional to higher powers of the strong coupling $\alpha_s$, which are known to $\mathcal{O}(\alpha_s^4)$ from Refs. (56,59,60,61,62,63). The term that is divergent in the limit $\epsilon \to 0$, which quantifies the UV divergence to be renormalized, and the finite term $A^{\rm fin}(m_t^0/\mu)$ are shown separately, and $m_t^0$ stands for the bare unrenormalized mass. In this context the term $A^{\rm fin}$, even though it is finite, contains a contribution from self-energy quantum fluctuations arising from soft (i.e. small) momenta in a top resonance frame.$^4$ These soft quantum corrections, which I will refer to as 'ultracollinear' corrections in Sec. 5.2, affect $m_t^0 A^{\rm fin}(m_t^0/\mu)$ linearly (36,37). For example, giving the gluon a small test mass $\lambda$, which cuts off these soft momenta, one obtains the term linear in $\lambda$ shown in Eq. (5) (64). This is important to remember for the following, because perturbation theory does not work well in this regime. The mass scheme that is closest to the concept of the strong coupling $\alpha_s(\mu)$ is the $\overline{\rm MS}$ scheme $\overline{m}_t(\mu)$. Here only the pure UV $1/\epsilon$ term (including the conventional term $\ln(4\pi e^{-\gamma_E})$) is absorbed into the renormalized mass, see Eq. (6). The $\overline{\rm MS}$ mass is $\mu$-dependent like $\alpha_s(\mu)$ and satisfies the renormalization group equation given in Eq. (7), implying that $\overline{m}_t(\mu)$ depends logarithmically on $\mu$. In analogy to $\alpha_s(\mu)$ we can interpret $\mu$ as the momentum scale (in a top resonance frame) above which all self-energy quantum corrections are absorbed into $\overline{m}_t(\mu)$, see Fig. 2 for a graphical illustration. The $\overline{\rm MS}$ mass is therefore not affected by low-energy or nonperturbative quantum fluctuations and is called a 'short-distance mass'. The term $A^{\rm fin}$ in Eq. (4), which is not absorbed into the renormalized mass, still appears in the perturbative calculations of the process, and its 'bad' linearly sensitive small momentum contributions are known to cancel with other virtual (non-self-energy) corrections that are soft in a top resonance frame. This can be explicitly checked for any parton level cross section for a physical process involving top quarks by considering the sum of all linear gluon mass terms, such as that displayed for $A^{\rm fin}$ in Eq. (5), coming from radiation that is soft in a top resonance frame.
Footnote 3: The inclusive $t\bar t$ cross section in $e^+e^-$ annihilation at c.m. energies $Q$ close to production threshold, $Q \approx 2m_t$, constitutes the most precise top mass measurement method at a future $e^+e^-$ collider. Theoretical cross section predictions (25,26,27,28,29,30) have reached uncertainties at the level of several percent and allow for top mass determinations with uncertainties at the level of 50 MeV or better (31,32,33,34). Because the production of $t\bar t$ pairs in color-octet configurations is strongly suppressed, the effects of soft QCD radiation are strongly suppressed as well. This also applies to the $t\bar t\gamma$ final state analyzed in Ref. (35).
Footnote 4: This is a reference frame where a top quark state within its finite-lifetime Breit-Wigner resonance region has a non-relativistic average velocity. Such frames are frequently collectively called 'the top quark rest frame', but I will not adopt this jargon here, because it is not appropriate when discussing uncertainties in mass determinations much smaller than the top quark width $\Gamma_t(t \to bW) = 1.4$ GeV.
Setting $\mu$ to the physical scale of the process governing the mass dependence of an observable, together with a proper scale setting for the strong coupling, frequently yields good behavior for the perturbation series of $\hat\sigma$. The interpretation of $\overline{m}_t(\mu)$ mentioned above, however, only applies for observables where $\mu \gtrsim m_t$, which includes e.g. total inclusive cross sections at very high energies $Q \gg m_t$ or cases where the top effects are virtual, such as for the SM Higgs potential (6,7,8,9,10), the electroweak precision observables (1,38) or the properties of B mesons (39). Such observables can have a strong indirect top mass sensitivity, but not a kinematic mass sensitivity.
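The logarithmic $\mu$-dependence of the $\overline{\rm MS}$ mass can be sketched at leading order, where the renormalization group equations of the mass and of the coupling combine into a simple power of a ratio of couplings. The starting value $\overline{m}_t(\overline{m}_t) = 163$ GeV, the value of $M_Z$, the one-loop truncation and the fixed five-flavor coupling are assumptions made for this illustration; they are not the inputs behind the numbers quoted in the review.

```python
import math

ALPHA_S_MZ = 0.118  # reference input quoted in the text
M_Z = 91.1876       # GeV; assumed Z boson mass
N_F = 5             # fixed number of flavors (assumption: no threshold matching)
BETA0 = 11.0 - 2.0 * N_F / 3.0  # one-loop beta-function coefficient


def alpha_s_one_loop(mu: float) -> float:
    """One-loop running coupling at scale mu (GeV), evolved from M_Z."""
    return 1.0 / (1.0 / ALPHA_S_MZ + BETA0 / (2.0 * math.pi) * math.log(mu / M_Z))


def msbar_mass_one_loop(mu: float, m_ref: float = 163.0, mu_ref: float = 163.0) -> float:
    """Leading-order running MS-bar mass at scale mu (GeV).

    Uses d(m)/d(ln mu) = -(2 * alpha_s / pi) * m together with the one-loop
    running of alpha_s, which gives m(mu) = m(mu_ref) * ratio**(4 / BETA0).
    m_ref = 163 GeV is an assumed illustrative starting value.
    """
    ratio = alpha_s_one_loop(mu) / alpha_s_one_loop(mu_ref)
    return m_ref * ratio ** (4.0 / BETA0)


if __name__ == "__main__":
    for mu in (163.0, 500.0, 1000.0, 5000.0):
        print(f"mu = {mu:7.1f} GeV   m(mu) = {msbar_mass_one_loop(mu):7.2f} GeV")
```

The scales are chosen at or above the top mass, in line with the statement that this interpretation of the $\overline{\rm MS}$ mass applies for $\mu \gtrsim m_t$.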
As already pointed out in Sec. 1, observables with kinematic mass sensitivity are related to distributions of variables that show sharp threshold patterns such as resonances, shoulders or endpoints. Even though these patterns are initiated by hard reactions involving the large scales $m_t$ or $Q$, the observable mass dependence is in addition modified by dynamical QCD and electroweak quantum effects. The momentum scales governing these quantum effects are, however, soft, i.e. much smaller than $m_t$. I refer to these momentum scales generically as the scale '$R$'. So for observables with kinematic mass sensitivity we typically have $R \ll m_t$. The prototypical example is the invariant mass of jets coming from the hadronic decay of a top quark. Here, the scale $R$ governing the soft quantum effects is set by the width of the resonance visible in the invariant mass distribution. It is bounded from below by the top width $\Gamma_t$ or the experimental resolution. The left panel of Fig. 7 shows the top mass dependence of the reconstructed top invariant mass $m_t^{\rm reco}$ from LHC simulations carried out in Ref. (124). From the distribution we can see that here we have $R \sim 30$ GeV. The high mass sensitivity arises from the location of the resonance structure. While the basic location and existence of the resonance is tied to the top quark mass, which is large, the details of the resonance shape, its width and its exact location are governed in addition by low-energy QCD and electroweak effects. Observables of this kind are the basis of the direct top mass measurements.
To define an adequate scale-dependent short-distance mass for observables where the mass sensitivity is affected by QCD dynamics with momentum scales $R < m_t$, one switches to an effective description in analogy to the well-known Foldy-Wouthuysen transformation (41). Here the virtual off-shell and hard top quark quantum effects are also absorbed into the mass (or 'integrated out'), but without absorbing the soft top quark dynamics. A mass scheme that realizes this concept is the MSR scheme $m_t^{\rm MSR}(R)$ (42,43,40), defined for $R < m_t$ by the relation given in Eq. (8). The MSR mass absorbs self-energy corrections coming from scales above $R$ (see again Fig. 2 and compare to the $\overline{\rm MS}$ mass). The definition makes it possible to choose $R$ much smaller than $m_t$, consistent with the renormalization group. In practical applications $R$ should be set to the momentum scale that governs the soft quantum fluctuations affecting the mass sensitivity of the observable (e.g. the width of the resonance in the $m_t^{\rm reco}$ distribution). In Eq. (8) the second, peculiar-looking term linear in $R$ is essential, since in the difference $m_t^0 A^{\rm fin}(m_t^0/R) - R A^{\rm fin}(1)$ all linear sensitivity to soft momenta (in a top resonance frame) cancels, so that the 'bad' soft momentum contributions in the top self-energy of Eq. (4) are still left to cancel in calculations for processes, as is the case for the $\overline{\rm MS}$ mass $\overline{m}_t(\mu)$. The MSR mass is therefore also a short-distance mass. The price to pay is that the MSR mass has a renormalization group equation linear in $R$, which is a generic requirement for a short-distance mass scheme with a renormalization scale $R < m_t$ (44,45). The MSR mass is prototypical for the class of 'low-scale short-distance masses' devised in the last two decades for quantum field theory calculations for B mesons, heavy quarkonia and top resonance physics, such as the kinetic (46), the 1S (47,48,49), PS (50), RS (51) and jet mass (52).
The MSR mass is, however, the only low-scale short-distance mass that is, like the $\overline{\rm MS}$ mass, defined directly from the quark self-energy diagrams. For $R = \overline{m}_t(\overline{m}_t)$ it differs from the $\overline{\rm MS}$ mass $\overline{m}_t(\overline{m}_t)$ only by corrections related to two-loop self-energy corrections from virtual top quark loops. Therefore it can be considered as the natural extension of the $\overline{\rm MS}$ mass concept for renormalization scales below $m_t$, as was advocated in Refs. (43,40). Interestingly, the MSR mass $m_t^{\rm MSR}(R)$ is also numerically close to the other low-scale short-distance masses at their respective intrinsic scales. See [...]
The pole mass $m_t^{\rm pole}$, see Eq. (10), is the mass of the top quark states that appear in parton level scattering amplitudes in the approximation where top quarks are treated as real external (or 'asymptotic') particles (54,55). Because the mass of the formally defined top quark asymptotic states is renormalization scale invariant, infrared finite and gauge-invariant at the level of perturbation theory (56,57), it appears to be unique and physical, at least at parton level. However, as already mentioned above, due to the top quark's color charge, this concept is actually unphysical when considering a precision of 0.5 GeV or below. This can be seen from the fact that, due to the term $A^{\rm fin}$, the expression on the RHS of Eq. (10) depends linearly on the way infrared momenta are regularized. (Recall the example of a gluon mass regulator shown in Eq. (5).) This means that the pole of the top quark propagator (and the meaning of the mass of a top quark asymptotic state) depends linearly on the way infrared momenta (in a top resonance frame) are treated. However, what is commonly called 'the pole mass $m_t^{\rm pole}$' in the context of QCD is defined strictly within dimensional regularization, where e.g. the IR regulator gluon mass term $\lambda$ shown in Eq. (5) does not arise.
The pole mass $m_t^{\rm pole}$ is obtained from the MSR mass $m_t^{\rm MSR}(R)$ by taking the formal limit $R \to 0$, so that the MSR mass can be seen as a scheme that interpolates between the pole and the $\overline{\rm MS}$ masses. For finite $R$ the relation between the pole and MSR masses is given in Eq. (11). In comparison, the relation between the pole and $\overline{\rm MS}$ masses has the form given in Eq. (12). From the conceptual point of view, the MSR mass $m_t^{\rm MSR}(R)$ can also be seen as a scheme designed for observables where (virtual and real) QCD corrections below the scale $R$ are unresolved, so that the self-energy corrections below $R$, which are not absorbed into $m_t^{\rm MSR}(R)$, are left to cancel with other (real or virtual) quantum fluctuations from scales below $R$ in a top resonance frame. In this context the pole mass is a scheme that is based on the unphysical view that virtual and real perturbative QCD corrections can be resolved down to arbitrarily small scales.
Footnote 5: Current uncertainties in the strong coupling do not allow one to reach this precision in the relation of all short-distance masses. From Eqs. (11) and (12) [...]
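To give a feeling for the size of the scheme differences discussed above, the following sketch evaluates the well-known leading $\mathcal{O}(\alpha_s)$ terms of the pole-MSR and pole-$\overline{\rm MS}$ relations, $\frac{4}{3}\frac{\alpha_s(R)}{\pi} R$ and $\frac{4}{3}\frac{\alpha_s(\overline{m}_t)}{\pi}\,\overline{m}_t$. It is a truncated illustration only: the one-loop five-flavor coupling, the value of $M_Z$, the illustrative input $\overline{m}_t(\overline{m}_t) = 163$ GeV and the function names are assumptions made here, and higher-order terms, which are sizable, are left out.

```python
import math

ALPHA_S_MZ = 0.118  # reference input quoted in the text
M_Z = 91.1876       # GeV; assumed Z boson mass
BETA0 = 11.0 - 2.0 * 5 / 3.0  # one-loop beta coefficient for five flavors


def alpha_s_one_loop(mu: float) -> float:
    """One-loop running coupling at scale mu (GeV); crude at low scales."""
    return 1.0 / (1.0 / ALPHA_S_MZ + BETA0 / (2.0 * math.pi) * math.log(mu / M_Z))


def pole_minus_msr_nlo(r: float) -> float:
    """O(alpha_s) term of m_pole - m_MSR(R), in GeV: (4/3) * alpha_s(R)/pi * R."""
    return 4.0 / 3.0 * alpha_s_one_loop(r) / math.pi * r


def pole_minus_msbar_nlo(m_msbar: float) -> float:
    """O(alpha_s) term of m_pole - m_MSbar(m_MSbar), in GeV."""
    return 4.0 / 3.0 * alpha_s_one_loop(m_msbar) / math.pi * m_msbar


if __name__ == "__main__":
    for r in (1.0, 3.0, 10.0, 30.0, 163.0):
        print(f"R = {r:6.1f} GeV   O(alpha_s) pole - MSR = {pole_minus_msr_nlo(r):6.3f} GeV")
    print(f"O(alpha_s) pole - MSbar at 163 GeV = {pole_minus_msbar_nlo(163.0):6.3f} GeV")
```

At this order the MSR mass at $R = \overline{m}_t(\overline{m}_t)$ coincides with the $\overline{\rm MS}$ mass, in line with the statement that the two differ only by two-loop effects, and the first-order correction shrinks as $R$ is lowered, which is the sense in which the MSR mass approaches the pole mass for $R \to 0$.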
The unphysical character of the pole mass concept is reflected in the fact that the perturbative series for physical observables in the pole scheme carry the so-called 'pole mass renormalon ambiguity' (66,67,68). At this point let me briefly detour to explain some general aspects of renormalons, as they do not only arise for the pole mass. Renormalons arise in all parton level cross sections $\hat\sigma$ computed in dimensional regularization, even for observables involving only massless quarks or gluons. This is because the perturbation series for all QCD observables are asymptotic series, as already mentioned below Eq. (1). This means that the terms in the perturbation series may decrease at low orders, but they eventually adopt divergent patterns, which I call 'turnover' in the following. These divergent patterns in QCD perturbation theory were already discovered very early in the history of QCD (69,70,71) and can be caused by sensitivities to physical infrared QCD dynamical effects that are directly associated with specific types of nonperturbative corrections contained in $\sigma^{\rm NP}$. So, nonperturbative contributions to $\sigma^{\rm NP}$ having characteristic scaling behaviors in powers $(\Lambda_{\rm QCD}/Q)^n$ are in one-to-one correspondence with specific types of asymptotic divergent patterns (74,75). This correspondence is understood very well mathematically, and the associated divergent patterns of the perturbative coefficients can be quantified analytically to all orders. The generic rule is that the lower the power $n$ of a contribution to $\sigma^{\rm NP}$, the stronger the associated asymptotic divergent pattern and the lower the order of turnover. The formal mechanism that brings it all together is as follows: when making predictions for the observable $\sigma^{\rm exp}$, the correction term $\sigma^{\rm NP}$ compensates, order-by-order in perturbation theory, the divergent patterns in $\hat\sigma$ and at the same time adds the corresponding physical nonperturbative corrections.
This connection is a fundamental aspect of QCD predictions of the form in Eq. (3), where perturbative and nonperturbative contributions are separated and dimensional regularization is used to regulate infrared momenta (74,75). In practice it may not be easy to realize this mechanism at high perturbative order, but for most applications the available perturbative corrections appear to be below the order of turnover.
When making parton level predictions in the pole mass scheme, there is a renormalon that arises from a divergent pattern coming from the virtual non-self-energy corrections that are soft (in a top resonance frame) and left uncancelled. This is why I have emphasized this cancellation issue several times above. The divergent pattern of this pole mass renormalon has the same mathematical structure as the patterns related to contributions to $\sigma^{\rm NP}$ linear in $\Lambda_{\rm QCD}$, and this is in one-to-one correspondence to the linear infrared sensitivity of the term $A^{\rm fin}$ illustrated in Eq. (5). As such, the pole mass renormalon is rather strong, and its numerical impact and even the turnover point can be visible and relevant already at the low orders accessible to perturbative calculations. So the pole mass renormalon looks very much like an uncertainty due to some missing physical nonperturbative information to be remedied by a contribution in $\sigma^{\rm NP}$ linear in $\Lambda_{\rm QCD}$ (i.e. proportional to $\Lambda_{\rm QCD}\,({\rm d}/{\rm d}m_t)\,\hat\sigma$). However, this view is incorrect, since the pole mass renormalon pattern is an artificial problem tied to the unphysical concept of a top quark particle pole and not related to a physical effect (that would be encoded in $\sigma^{\rm NP}$ in Eq. (3)) (66,67,68). If a short-distance mass at an appropriate renormalization scale is used instead, this renormalon is simply absent. If one insists on using the pole mass, however, one must truncate at the order where the correction is minimal, i.e. around the order of turnover. The inability to do that in a unique way in practice (and even in principle) results in an ambiguity in the determination of $m_t^{\rm pole}$, the 'pole mass renormalon ambiguity'. Interestingly, the ambiguity is unchanged even if the top quark's finite lifetime is accounted for, which shifts the top quark propagator pole to a complex $p^2$ value (73). This underlines the unphysical character of the pole mass renormalon.
Interestingly, the divergent pattern of the pole mass renormalon happens to grow so rapidly with order that for many quantities the explicitly calculated coefficients are already completely saturated by it (40,43,65,72) (footnote 6). The ambiguity is also reflected in the form of Eq. (11), where the limit $R \to 0$ can apparently only be taken by crossing the Landau pole in $\alpha_s(R)$. Since this would generate a nonperturbative contribution to the pole mass (which is unphysical), the limit $R \to 0$ is taken keeping the scale $\mu$ in $\alpha_s(\mu)$ finite. This is another way to understand how the divergence pattern in the coefficients of perturbative series in the pole mass scheme arises. The pattern is illustrated in Fig. 4, where the pole mass is determined from the MSR mass $m_t^{\rm MSR}(R)$ at different values of $R$ from Eq. (11) as a function of truncation order $n$. For the orders $n > 4$ the asymptotic estimates are used, see footnote 5. The observed pattern is representative of the behavior of pole mass determinations from physical observables (not affected linearly by other kinds of soft QCD effects) where the mass dependence is governed by QCD dynamics at the scale $R$. For $R \gtrsim m_t$ (relevant for total inclusive cross sections, see blue points and error bars) the order of turnover is 7 or 8, and one needs to go through many orders to get to the final range of $m_t^{\rm pole}$ values (gray bands). For $R \lesssim 10$ GeV (relevant for some differential cross sections with kinematic top mass sensitivity, see red and green points and error bars) the order of turnover is 2 or 3. Here one can reach the final range of $m_t^{\rm pole}$ values already at orders accessible with available perturbative computations (see Sec. 3.1), and even the tree-level results are close to it (footnote 7). The size of the ambiguity and the final range for $m_t^{\rm pole}$ (around the order of turnover) can be formally proven to be independent of $R$, and the size of the ambiguity can be formally shown to be of order $\Lambda_{\rm QCD}$ (66,67,68).
Recent analyses quantified it as 110 MeV (65) and 250 MeV (40) (width of the gray bands) using all available theoretical information (footnote 8). Figure 4 also underlines that when numerically converting between pole mass and short-distance mass values, it is essential to truncate at the order of turnover (related to the vertical location of the gray bands), which may differ from the perturbative order used in the calculation. I adopt the approach from Ref. (40) when quoting numerical values for the difference between $m_t^{\rm pole}$ and short-distance masses.
Footnote 6: The series for $m_t^{\rm pole} - m_t^{\rm MSR}(R)$ and $m_t^{\rm pole} - \overline{m}_t(\mu)$ in Eqs. (11) and (12), respectively, have been computed explicitly up to $O(\alpha_s^4)$ (56,59,60,61,62,63). The large order asymptotics of the pole mass renormalon has been shown to saturate the $O(\alpha_s^3)$ and $O(\alpha_s^4)$ coefficients, and the coefficients at $O(\alpha_s^n)$ for $n > 4$ are therefore known with a precision of a few percent from their asymptotic behavior (65,43).
Footnote 7: This also illustrates that for perturbative computations for observables where $R$ is large, it is more natural to use $m_t^{\rm MSR}(R)$ (for $R < m_t$) or $\overline{m}_t(R)$ (for $R > m_t$) as mass schemes, while $m_t^{\rm pole}$ or $m_t^{\rm MSR}(R)$ are more natural for observables where $R$ is small.
Footnote 8: These estimates were obtained for finite charm and bottom quark masses. For $m_c = m_b = 0$ the ambiguity was estimated as 70 MeV in Ref. (65) and 180 MeV in Ref. (40). The dependence on the charm and bottom masses reflects the strong sensitivity of the pole mass to small momenta. It arises because the ambiguity depends on the leading β-function coefficient of the strong coupling $\alpha_s$, which increases when the number of massless quarks is decreased.
It should be stressed, however, that even when a short-distance mass scheme is used for a parton level top mass dependent cross section $\hat\sigma$, one still has to deal with possible contributions in $\sigma^{\rm NP}$, and in particular with those depending linearly on $\Lambda_{\rm QCD}$, which also affect mass determinations in a linear way. Such linear nonperturbative effects are unavoidable for LHC observables because the hard processes generating the top mass sensitivity always involve top states (single top, $t\bar t$, . . . ) in a non-singlet color configuration or depend on jets. For these observables the description of the physically observable top mass dependence always involves nonperturbative color neutralization processes which enter $\sigma^{\rm NP}$ linearly, as I already emphasized at the end of Sec. 2.1. For top mass determinations these nonperturbative color neutralization processes must be understood separately and disentangled from the top mass dependence to ever reach the theoretical uncertainty limit of 30 MeV that is possible in principle for a short-distance mass determination, as mentioned above (footnote 9). The limit can be approached at $e^+e^-$ colliders (see footnote 3), but it is very difficult to do so for LHC observables. However, the size of the physical color neutralization corrections at the LHC is observable-dependent, and in some cases these corrections can be controlled field theoretically using QCD factorization or are small, see Refs. (79) and (77), respectively, for related studies.

STATUS OF TOP MASS DETERMINATIONS AT THE LHC
In this section I discuss the status of state-of-the-art top mass determinations, focusing on the experimental methods and theoretical tools employed in the analyses. Since there are already a number of excellent reviews on the experimental aspects of the measurements (Refs. (1,14,80,81)), on projections for the HL-LHC (see Refs. (2,82) and references therein) and on the status of the employed MC tools (see Refs. (1,83)), I refrain from a detailed technical presentation and rather concentrate on the conceptual aspects critical for LHC top mass measurements.

Fixed-Order Calculations
State-of-the-art fixed-order parton level computations, i.e. perturbation series for $\hat\sigma$ as expansions in powers of $\alpha_s$, have reached a high level of sophistication and are primarily based on numerical methods. For the production of on-shell top quark pairs, QCD corrections at next-to-next-to-leading order (NNLO) (84) are available, including the resummation of QCD next-to-next-to-leading logarithms (NNLL) involving the ratio of the top quark transverse momentum $p_T$ and $m_t$ (85) and also accounting for NLO electroweak corrections (86). In the narrow-width approximation (NWA) for the top quark, NNLO QCD calculations for on-shell top quark production and top quark decay (87) have been combined to allow for fully differential predictions (88). These results neglect finite-lifetime effects and do not account for the summation of large logarithms of the ratio $\Gamma_t/m_t$ related to the top quark's low-energy off-shell dynamics. Finite-lifetime effects have been included in fixed-order QCD NLO computations for the $W^+W^-b\bar b$ final state, including $W$ decays into leptons (89) or jets (90). These calculations describe top production, top decay to $W^+W^-b\bar b$ final states and $W^+W^-b\bar b$ non-resonant production. The latter results are less precise concerning QCD corrections and lack the resummation of logarithmic terms. NLO and higher fixed-order calculations provide reliable approximations to $\hat\sigma$ with controlled top mass scheme dependence for observables where the typical momenta of the QCD dynamics governing the mass dependence are of the size $m_t$ or larger. Relevant for top mass measurements, this includes the total inclusive $t\bar t$ cross section, the $t\bar t$+jet invariant mass $M_{t\bar t j}$ for values much larger than $2m_t$ and leptonic distributions away from kinematic threshold structures (kinks, shoulders, endpoints).
Including the summation of logarithms of the ratio $p_T/m_t$ further provides reliable parton level descriptions of the $p_T$ and the $t\bar t$ invariant mass $M_{t\bar t}$ distributions in the boosted top region where $p_T \gg m_t$ or $M_{t\bar t} \gg 2m_t$. Interestingly, almost all fixed-order calculations are available only in the pole mass scheme. Making parton level predictions in short-distance top mass schemes requires a reexpansion of the perturbative series using Eqs. (11) or (12).
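The reexpansion needed to change the mass scheme can be sketched schematically as follows. This is an illustration under stated assumptions and not a formula taken from a specific calculation: $\hat\sigma_0$ and $\hat\sigma_1$ stand for generic leading and next-to-leading order coefficient functions, $m$ for a generic short-distance mass, and $\delta_1$ for the generic first-order coefficient of the mass relation, all of which are placeholder symbols.

```latex
\begin{align}
\hat\sigma &= \hat\sigma_0(m_t^{\rm pole}) + \alpha_s\,\hat\sigma_1(m_t^{\rm pole}) + \mathcal{O}(\alpha_s^2)\,, \\
m_t^{\rm pole} &= m + \alpha_s\,\delta_1 + \mathcal{O}(\alpha_s^2)\,, \\
\hat\sigma &= \hat\sigma_0(m) + \alpha_s \left[ \hat\sigma_1(m) + \delta_1\,\frac{{\rm d}\hat\sigma_0}{{\rm d}m}(m) \right] + \mathcal{O}(\alpha_s^2)\,.
\end{align}
```

The last line follows from a Taylor expansion of the first line around $m$, consistently truncated at first order in $\alpha_s$. The derivative term is the piece that changes the perturbative coefficients when the scheme is changed; at higher orders, further derivative terms and products of mass relation coefficients appear.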

Multipurpose MC Event Generators
Multipurpose Monte-Carlo event generators (98,93,94) (MMCs) form the backbone of essentially all experimental analyses at the LHC. They are used to simulate all processes spanning from the colliding protons to the emergence of the observable hadrons. MMCs are used to design novel observables and measurements, for detector simulations, and to determine efficiencies and acceptances. As illustrated in Fig. 5, they combine the quark and gluon (parton) structure of the colliding protons (big gray blobs), tree-level leading order (LO) matrix elements for the hard parton interactions (red), a parton shower (PS) that describes the branching of the hard partons into lower energy partons (dark blue) and a hadronization model. The latter turns the high-multiplicity partonic states that emerge after the PS terminates into the observable hadronic particles, accounting for the color flow in the large-$N_c$ limit (small gray blobs for hadrons and green zigzag lines for color correlations). The hard matrix elements and the PS provide descriptions for $\hat\sigma$ in the collinear and soft limits, where fixed-order calculations are insufficient due to large logarithmic terms. These descriptions can be NLL precise for certain simple classes of observables such as event shapes, but are in general less precise, even though they can still provide a description adequate for experimental simulations (95,96). State-of-the-art PSs are either based on angular ordered coherent branching (CB) (97) (used as the default in the Herwig (98) MMC family) or on transverse momentum ordered dipole showering (99) (used in the Pythia (93) and Sherpa (94) MMCs and optionally also in Herwig (100)). Differences between the two PS types arise for example in the treatment of non-global observables, where CB has intrinsic restrictions, or in momentum recoil effects, where dipole showering is based on a local treatment for each parton branching, leading to precision issues for global observables.
The description of the proton structure in terms of parton distribution functions and the hadronization models provide approximate descriptions for $\sigma^{\rm NP}$. The hadronization models are based on the concepts of decaying clusters (101) or the breaking of QCD strings (102). Their parameters are not fixed from first-principles QCD but through the tuning procedure, i.e. by demanding that the MMCs reproduce a certain set of experimental differential reference cross sections. This allows the MC generators to provide adequate descriptions even when the PS description is less precise.
The precision of PSs in MMCs can be elevated by matching them with NLO matrix elements (referred to as NLO+PS). Matched generators such as MadGraph5_aMC@NLO (103,104) or the POWHEG (105) procedure and related methods available in Herwig (106) and Sherpa (107) improve the description of the first hard PS emission to NLO precision (typically with transverse momenta larger than 10 GeV) but leave the soft and collinear emissions and hadronization provided by the underlying MC generators unchanged. Figure 6 illustrates generically how the matching affects a top mass sensitive kinematic distribution, see the caption for a more detailed explanation. MMCs combine, in an observable-dependent way, aspects of first-principles calculations and of model descriptions, where the primary goal is a good description of experimentally observable quantities. Obtaining reliable estimates of the theoretical uncertainties of the MMC descriptions is therefore a nontrivial task. There is an ongoing effort to improve the theoretical basis of MMCs and the methods to estimate their uncertainties for observable quantities, see e.g. Refs. (108,109,95,110).
For top quark physics, mostly the Pythia (93) and Herwig (98,111) event generators are employed. It is an essential aspect of all experimental top quark measurements to properly estimate the theoretical or model uncertainties of the MMC descriptions. The common approach of the experimental collaborations is to analyse the variations obtained from a few different MMC implementations that are considered reasonable. Limitations in state-of-the-art MMCs, particularly relevant for LHC top mass measurements, concern subtle issues such as color reconnection (112,113,114,115,116), multiparticle interactions (yellow in Fig. 5) (117,118), the precise determination of parameters of the hadronization models (119) or finite lifetime effects (120,121). In addition, MMCs used for state-of-the-art LHC analyses only contain LO matrix elements for the top decay. A serious limitation of principle is that all massive quark PSs are theoretically based on the quasi-collinear (i.e. boosted top) approximation, while the bulk of the top mass measurements rely on top events with relatively low top quark transverse momenta $p_T \sim 100$ GeV and velocities $v_t \sim 0.5$. To the best of my knowledge, how this restriction affects current top mass measurements is unknown and has not been quantified. Furthermore, the parton level descriptions of top mass sensitive kinematic threshold structures provided by NLO-matched MMC generators are not elevated to subleading QCD precision. This is because sharp threshold structures are governed by soft and collinear radiation and hadronization effects. Examples of observables subject to this issue are all kinematic observables reconstructed from a single top quark, such as its reconstructed invariant mass $m_t^{\rm reco}$ or kinematic endpoint regions for variables such as the lepton energy $E_\ell$ or the lepton and b-jet invariant mass $M_{b\ell}$.
Also the reconstructed $t\bar t$ invariant mass $M_{t\bar t}$ in the threshold region where $M_{t\bar t} \approx 2m_t$ is subject to this issue, due to soft radiation effects related to the $t\bar t$ pair produced with small relative velocity in a color-octet state, Coulomb binding corrections and coherence effects in the simultaneous weak decay of the $t\bar t$ pair. I reiterate that observables with such threshold structures are responsible for the high top mass sensitivity of the direct mass measurements. But it should in turn also be mentioned that NLO-matched MMCs can provide NLO reliable parton level approximations $\hat\sigma$ for observables where the top mass sensitivity is generated exclusively by hard interactions (referred to as observables with indirect top mass sensitivity below Eq. (7)). Examples are the total cross section or the mass variables $M_{t\bar t}$ and $M_{t\bar t j}$ far above threshold. So if the NLO-matching procedure uses the identification $m_t^{\rm MC} = m_t^{\rm pole}$, a measurement of $m_t^{\rm MC}$ from such hard interaction dominated observables can indeed be considered as a pole mass measurement.
In this context, measurements of the top quark mass are more subtle than measurements of physical observables (hadron and lepton momenta, lifetimes, hadronic and jet cross sections, . . . ). This is because the top mass and its couplings are not physical observables, but theoretically defined Lagrangian parameters. For their measurement the MMC employed has to provide perturbative ($\hat\sigma$) and nonperturbative ($\sigma^{\rm NP}$) descriptions that are separately consistent with QCD, such that the mass and coupling parameters of the generator retain a definite and observable-independent relation to the QCD Lagrangian parameters. This relation is diluted or even lost to the extent that the tuning compensates for conceptual deficiencies of the PS, regardless of whether the MMC describes the data well. This is particularly subtle for the top quark mass parameter $m_t^{\rm MC}$, since the MMC has to reliably simulate all color neutralization (linear in $\Lambda_{\rm QCD}$) and finite-lifetime (linear in $\Gamma_t$) effects consistent with the SM. This issue is the core of the MMC top mass interpretation problem related to the direct top mass measurements. Only limited quantitative knowledge on this complex set of issues exists today.

Experimental analyses
The direct measurements are the most precise top quark mass extractions carried out at the LHC. They are based on kinematic observables constructed from reconstructed top decay products (light quark and b quark jets, leptons) for the different accessible top decay (semi-leptonic or hadronic) and top production modes ($t\bar t$ and single top events). For the template method the b- [...]
Figure 6: Generic structure of a kinematic distribution with a top mass sensitive kinematic threshold structure in the soft-collinear region (on the left side) obtained by NLO-matched MMCs (NLO+PS, red) and unmatched MMCs (LO+PS, solid black). The distribution at NLO in QCD fixed-order perturbation theory (NLO FO, solid green) is singular and diverges in the soft-collinear limit. The parton shower evolution of the unmatched MMCs sums the leading logarithmic singular terms to all orders in fixed-order perturbation theory, leading to a physically meaningful approximation with Sudakov suppression in the soft-collinear limit (LO+PS, solid black). The matching procedure adds the difference between the unmatched MMC description expanded out to NLO (LO+PS|NLO FO, dashed black) and the NLO QCD fixed-order (NLO FO, solid green) results, both of which are singular, to the tail of the unmatched MMC distribution in the region dominated by hard radiation events (gray area, on the right hand side). Since at the NLO fixed-order level the first hard emission arises from the first emission generated by the PS, this elevates the first hard PS emission of the NLO-matched MMCs (NLO+PS, red) to full NLO precision in QCD. Some distributions have a tail on both sides of the soft-collinear region.
So-called pole mass measurements are based on the inclusive and differential $t\bar t$ cross sections, for which theoretical parton level predictions expressed in the pole mass scheme are available from NNLO+NNLL calculations for the total cross section $\sigma(t\bar t + X)$ (128) or from NLO-matched MC generators for the reconstructed $t\bar t$+jet invariant mass $M_{t\bar t j}$ (129), (di)leptonic variables (130) and the $t\bar t$ invariant mass $M_{t\bar t}$. A summary of these measurements is shown in the right panel of Fig. 8, and the current world average quotes $m_t^{\rm pole} = 173.1 \pm 0.9$ GeV (1). The inclusive $t\bar t$ cross section and the invariant masses $M_{t\bar t}$ and $M_{t\bar t j}$ (away from the lower threshold at $2m_t$) are examples of observables where the top mass sensitivity is indirect, i.e. exclusively tied to hard interactions. For them, parton level predictions at NLO (or higher) and NLO-matched MC generators carry NLO information on the mass scheme. Furthermore, for these observables the resolution scale $R$ for the QCD dynamics governing the mass sensitivity (see Fig. 4) is of order $m_t$ or larger. One can therefore expect that the theoretical errors of the parton level prediction may be further reduced when even higher order fixed-order or resummed calculations become available or when the $\overline{\rm MS}$ top mass scheme is employed. Inclusive cross section measurements yielded $m_t^{\rm pole}$ [...] (133) (footnote 11). The relatively larger errors in comparison to the direct measurements result from the uncertainty in the normalization of the inclusive cross section (dominated by gluon luminosity uncertainties and renormalization scale [...] (138), which is more precise since the distribution exhibits a mass sensitive broad hump. A measurement using leptonic distributions by the ATLAS collaboration obtained $m_t^{\rm pole} = 173.2 \pm 1.6$ GeV (139). It should be pointed out that for leptonic distributions the color neutralization effects I mentioned earlier indirectly affect the momentum of the decaying $W$ boson and cannot be avoided. A CMS analysis including the total inclusive cross section, the $M_{t\bar t}$ and the top pair rapidity $y_{t\bar t}$ distributions and a simultaneous $\alpha_s$ and gluon distribution fit obtained $m_t^{\rm pole} = 170.5 \pm 0.8$ GeV (140), which is in some tension with the pole mass world average mentioned above. For the latter analysis I would like to remark that the smaller error compared to the inclusive cross section measurements above partly results from the inclusion of the $M_{t\bar t}$ distribution, which is kinematically sensitive to the top mass in the threshold region $M_{t\bar t} \approx 2m_t$. This is an issue to be examined thoroughly for achieving reliable theoretical descriptions, because the theoretical fixed-order calculations employed in the analysis to determine the top mass are based on the approximation where $M_{t\bar t}$ is defined from the 4-momenta of on-shell top quarks. On the other hand, NLO-matched MMC descriptions, used to relate the reconstructed observable $M_{t\bar t}$ distribution to the theory calculation, do not have subleading QCD precision for $M_{t\bar t}$ in the threshold region. Furthermore, a large fraction of the $t\bar t$ pairs is produced in color-octet configurations, for which the effects of soft QCD radiation are significant.
Footnote 10: I believe that much could be learned from knowing the reasons for the discrepancy between the Tevatron and the LHC measurements. The impact of a recalibration of the jet energy scale for the Tevatron D0 lepton+jet direct mass measurement (126) was analyzed in Ref. (127).
Footnote 11: The analysis of Ref. (133) also studied the strong correlation between the extracted top mass, the value of the strong coupling $\alpha_s(M_Z)$ and the employed set of parton distribution functions (134,135,136,137). The quoted lower value for $m_t^{\rm pole}$ is based on a set of parton distribution functions (134) [...]
[Figure 8, right panel: summary of pole mass measurements by ATLAS and CMS, with entries labeled $\sigma(t\bar t)$ inclusive (NNLO+NNLL), $\sigma(t\bar t+1j)$ differential (NLO) and $\sigma(t\bar t)$ n-differential (NLO).]
A number of alternative methods to measure $m_t$ have been proposed, which are based on differential cross sections with respect to alternative mass sensitive variables constructed from top decay products. The observables include the $M_{T2}$ variable and variants of it (141,142), the shape of b-jet and B meson energy distributions (143), the $J/\psi$ and lepton invariant mass (144,145), and secondary vertices from b quark fragmentation (145). They are also motivated by the kinematics of a decaying top particle. These observables are affected by issues similar to those of the direct measurements, albeit with differing systematics and larger uncertainties. They can also be seen as $m_t^{\rm MC}$ measurements and are consistent with the direct measurements. Using the fact that the sensitivity to soft and nonperturbative dynamics can be reduced by jet grooming techniques (146,147,148,149), it was suggested to use the mass of a fat and groomed boosted top quark jet, for which factorized QCD predictions with field theoretical control of the top mass scheme and nonperturbative effects can be determined (79). In Ref. (150) the $\gamma\gamma$ invariant mass spectrum $M_{\gamma\gamma}$ was suggested as a top mass sensitive variable, since it shows a glitch due to large QCD phases and Coulomb bound state effects when $M_{\gamma\gamma} \approx 2m_t$. Predictions of the $\gamma\gamma$ mass observable in principle allow one to control the top mass scheme systematically, but I remark again that the LHC produces significant amounts of $t\bar t$ pairs in a color-octet state. Due to the effects of radiation that is soft in the $t\bar t$ c.m. frame as well as in the lab frame, precise and reliable predictions of $M_{\gamma\gamma}$ are therefore significantly more involved than for the analogous $t\bar t$ threshold cross section in $e^+e^-$ annihilation (29,30), and are still to be achieved. Furthermore, the $\gamma\gamma$ mass method requires the HL-LHC to be competitive with the uncertainties of the current pole mass measurements.
Overall, current direct and pole mass measurements show good mutual agreement, but the discriminating power of the pole mass measurements is somewhat lower. One can expect that the theoretical uncertainties of pole mass measurements may be further reduced when the corresponding next higher order perturbative calculations or improved theoretical approaches become available. An additional reduction of theoretical uncertainties may be achieved when, instead of the pole mass scheme, appropriate scale-dependent short-distance mass schemes such as $\overline{\rm MS}$ or MSR are employed. This should, however, also be accompanied by a substantially increased understanding of a number of systematic effects influencing the size and shape of the related differential cross sections, which currently affect these top mass measurements at the level of 1 GeV or larger. It requires dedicated work for the pole mass measurements to approach the numerical precision of the direct measurements quoted by the experimental collaborations. However, one has to keep in mind that the direct measurements suffer from an additional uncertainty related to the $m_t^{\rm MC}$ interpretation problem.

THE CONTROVERSY
The controversy described in Sec. 1 is about the question of whether or not the interpretation problem of $m_t^{\rm MC}$ is large compared to the experimental uncertainties quoted for the direct top mass measurements. The arguments for the two viewpoints can be paraphrased in a concrete form as follows.
There has been the view, advocated in Refs. (151,152), to write
$$ m_t^{\rm MC} = m_t^{\rm pole} + \Delta^{\rm MC}_{m_t} \,, \qquad (13) $$
where it is assumed that the identification of the MMC top mass parameter with the pole mass is appropriate to very good approximation and the term $\Delta^{\rm MC}_{m_t}$ is related to the approximate MMC theory description and modeling. The term $\Delta^{\rm MC}_{m_t}$ is an uncertainty in addition to the uncertainties quoted by the experimental collaborations, but it is argued to be much smaller than these, such that it is appropriate to identify $m_t^{\rm MC} = m_t^{\rm pole}$. This view is based on the following argumentation: First, the parton level components of MMCs (hard matrix elements, parton shower (PS)) are good approximations to perturbative computations made in the pole scheme. Second, the PS can be assumed to provide a good approximation to soft and collinear perturbative radiation at, in principle, all soft scales for the observables entering the direct measurements. The other view, advocated in Refs. (12,18), is to write
$$ m_t^{\rm MC} = m_t^{\rm MSR}(R_0) + \Delta^{\rm MC}_{m_t}(R_0, Q_0) \,, \qquad (14) $$
where $R_0$ is an appropriate scale. The argumentation is as follows: In state-of-the-art MMCs the PS evolution terminates at a scale $Q_0$ around 1 GeV (called the 'shower cut'), which keeps the strong coupling governing the PS in the perturbative regime and prevents the number of partons from becoming too large and computationally unmanageable. Since the PS in MMCs is an approximation to perturbative soft and collinear radiation for scales above $Q_0$, all (real and virtual) radiation at scales below $Q_0$ is treated as unresolved and thus left to combine and cancel. Therefore the self-energy corrections from scales below $Q_0$ are not absorbed into $m_t^{\rm MC}$. This implies that the generator mass depends on the shower cut $Q_0$ (and in principle also on the type of the PS), is close to the MSR mass $m_t^{\rm MSR}(R_0)$ for $R_0 \propto Q_0$, and is thus a short-distance mass like the MSR mass.
The relation can be computed if $Q_0$ is treated as a factorization scale such that the PS is only used in the perturbative regime and not to model nonperturbative effects (as it should be for a first principles perturbative calculation). So $\Delta^{\rm MC}_{m_t}(R_0 \propto Q_0, Q_0)$ is a finite perturbatively computable term scaling like $\alpha_s(Q_0) \times Q_0 \sim 0.5$ GeV. It is not captured by the uncertainties quoted by the experimental collaborations and may not be smaller than these. To determine it reliably, detailed additional insights into the perturbative precision and structure of PSs and the physical meaning of their shower cut $Q_0$ are mandatory. This also implies a level of scrutiny of the theoretical precision of PSs and hadronization models in state-of-the-art MMCs beyond what is presently imposed, to find out whether $\Delta^{\rm MC}_{m_t}(R_0, Q_0)$ is observable-independent or has a nonperturbative contribution. The size of the pole mass renormalon ambiguity plays no role in Eq. (14), which is a relation between two short-distance masses.
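The quoted scaling $\alpha_s(Q_0) \times Q_0$ can be checked at the order-of-magnitude level with a few lines of code. The following is a rough sketch, assuming one-loop running of the strong coupling with a fixed number of five flavors from the reference value $\alpha_s(M_Z) = 0.118$; these inputs and the function names are illustrative assumptions. Since flavor thresholds and higher-loop terms are ignored, the coupling near 1 GeV is underestimated, so the numbers indicate only the rough size of a term linear in the shower cut.

```python
import math


def alpha_s_one_loop(mu, alpha_ref=0.118, mu_ref=91.19, n_f=5):
    """One-loop running coupling at scale mu (GeV), evolved from alpha_ref at mu_ref (GeV).

    Assumption: fixed n_f and one-loop accuracy, adequate only for rough estimates.
    """
    beta0 = 11.0 - 2.0 * n_f / 3.0
    return alpha_ref / (1.0 + alpha_ref * beta0 / (2.0 * math.pi) * math.log(mu / mu_ref))


def shower_cut_scale(q0):
    """Rough size alpha_s(Q0) * Q0 in GeV of a term linear in the shower cut Q0 (GeV)."""
    return alpha_s_one_loop(q0) * q0


if __name__ == "__main__":
    for q0 in (1.0, 1.5):
        print(f"Q0 = {q0:.1f} GeV: alpha_s(Q0) * Q0 = {shower_cut_scale(q0):.2f} GeV")
```

For shower cuts between 1 and 1.5 GeV this crude estimate gives a few hundred MeV, the same order as the scaling quoted in the text.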
The second view is conceptually more involved than the first. The controversy thus boils down to different judgments on (a) whether the first view on the smallness of $\Delta^{\rm MC}_{m_t}$ in Eq. (13) indeed applies or whether the formulation of Eq. (14) is required (footnote 12) and (b) whether the impact of the shower cut on the perturbative components of the MMCs is negligible, so that $Q_0$ is merely a parameter of the hadronization model, or whether $Q_0$ is an infrared factorization scale at the interface between the perturbative and nonperturbative components of the MMCs that can (and must) be quantified analytically. It should also be stressed that even though there is a controversy, given that $Q_0$ is a relatively small scale of around 1 GeV, we are talking about differences and effects at the level of 0.5 to 1 GeV, but not more than that. In the context of QCD, worries that $m_t^{\rm MC}$ may be close to the $\overline{\rm MS}$ mass $\overline{m}_t(\overline{m}_t)$ (which would constitute differences at the level of 10 GeV) are unfounded. Furthermore, there is overall agreement that MMCs need to be improved to gain a higher level of precision concerning the quality of their PSs and hadronization models in order to reduce systematic uncertainties. For the second view this is a necessary condition to determine $\Delta^{\rm MC}_{m_t}(R_0, Q_0)$ from first principles analyses, but clearly all methods to determine the top mass would benefit from such improvements.

Numerical Size of the Interpretation Problem
While initially only qualitative arguments for the interpretation problem for $m_t^{\rm MC}$ were available (12,18,16,151,152), recently some quantitative studies appeared, which have shed some light on the numerical aspect of the issue. In Ref. (153) a combined analysis using the direct method and a pole mass measurement using the inclusive cross section was carried out, yielding that $m_t^{\rm pole}$ [...] $(1\,{\rm GeV}) + (0.18 \pm 0.23)$ GeV were obtained from fitting a NNLL+NLO calculation for the 2-jettiness distribution in the resonance region for boosted top production in $e^+e^-$ annihilation (76) to Pythia 8.2 (93) pseudo-data samples. The result of this calibration study included a rigorous estimate of nonperturbative uncertainties of the analytic NNLL+NLO calculation, since hadronization corrections can be rigorously described by a factorized shape function (78). An analogous analysis for the LHC was performed in Ref. (79) using soft-drop groomed (149) jet mass distributions at NLL+LO, obtaining compatible but less precise results.
The results of the analyses in Refs. (153,154) are consistent with the view that the interpretation problem for $m_t^{\rm MC}$ is limited to the level of 0.5 or 1 GeV. In Ref. (154) valuable quantitative results concerning the two views of Eqs. (13) and (14) were also provided. In particular, the term $\Delta^{\rm MC}_{m_t}$ in Eq. (13) has been shown to be about 0.5 GeV (i.e. of the same size as the uncertainties quoted for the LHC direct top mass measurements) for an $e^+e^-$ process where Pythia (and all major MMCs) can be trusted to perform with a much higher precision. It is therefore conservative to conclude that the error in identifying m MC In Ref. (24) such a first principles study was initiated, still for boosted top quark production in $e^+e^-$ annihilation. I review the results of this study in the following section.

First Conceptual Insights
To start the discussion, let us write down the relation between the MMC top mass parameter and the pole mass as

$$m_t^{{\rm MC},Q_0} = m_t^{\rm pole} + \Delta^{\rm pert}_{\rm MC}(Q_0) + \Delta^{\rm non\text{-}pert}_{\rm MC}(Q_0) + \Delta_{\rm MC}. \qquad (15)$$

It presents a generalized unbiased version of Eqs. (13) and (14) that makes the potential shower cut dependence of the MMC top mass parameter explicit and serves to visualize the issues that need to be understood. As written down, none of the three $\Delta$ terms on the RHS is accessible via the error estimates carried out by the experimental collaborations. The three $\Delta$ terms may even have different signs. Furthermore, all quantities except the pole mass $m_t^{\rm pole}$ would be observable-independent. We could then simply calculate $\Delta^{\rm pert}_{\rm MC}(Q_0)$ from an analytic solution of the PS algorithm (with finite $Q_0$) for a simple mass sensitive observable and a comparison with the corresponding partonic QCD calculation. In the analysis of Ref. (154) already mentioned above, the sum of the three $\Delta$ terms was quantified as $(0.57 \pm 0.28)$ GeV for the Pythia 8.2 MMC and the $e^+e^-$ 2-jettiness distribution, but no information on the size and interplay of $\Delta^{\rm pert}_{\rm MC}(Q_0)$, $\Delta^{\rm non\text{-}pert}_{\rm MC}(Q_0)$ and $\Delta_{\rm MC}$ was acquired. Such a differentiated knowledge is, however, mandatory to allow for first principles conclusions on the field theoretic interpretation of the MC top mass $m_t^{\rm MC}$. This is because sizeable contributions from $\Delta^{\rm non\text{-}pert}_{\rm MC}(Q_0)$ and $\Delta_{\rm MC}$ can make the meaning of $m_t^{{\rm MC},Q_0}$ observable-dependent and non-universal. In Ref. (24) a first principles study of Eq. (15) was initiated by a dedicated analysis of the perturbative contribution $\Delta^{\rm pert}_{\rm MC}(Q_0)$. It was based on a combined analytical and numerical examination of the CB formalism for massive quarks (155) that is the theoretical basis of the angular-ordered PS used in the Herwig 7 MMC.
The analysis was restricted in several ways: (i) The observable considered was the 2-jettiness event-shape distribution for boosted top pair production in $e^+e^-$ annihilation in the resonance region, which is a global observable and equivalent to the distribution of the sum of the squared hemisphere masses with respect to the thrust axis. For this observable the available NNLL+NLO QCD computation (76,154) is based on a factorization of large-angle soft radiation (i.e. radiation that is soft in the $t\bar{t}$ c.m. frame) and ultracollinear radiation (i.e. radiation that is soft in the respective resonance frames of the boosted top quarks) (78). The results can therefore be immediately generalized to all $e^+e^-$ massive quark event-shape-type observables for which the ultracollinear dynamics is universal, but not to those employed for LHC top mass measurements. (ii) The use of an $e^+e^-$ event-shape variable such as 2-jettiness represents another physical restriction because the distribution is only sensitive to QCD radiation in the production stage of the top quarks while the effect of final-state radiation (off the top decay products) is power-suppressed. (iii) The NWA was employed, which does not account for finite-lifetime effects. (iv) The angular-ordered CB shower formalism was considered, which differs from the transverse-momentum-ordered dipole shower formalism.
In this context the following statements were proven by first principles computations and analytic as well as numerical studies:
1. The consistent resummation of logarithms at NLL order in the singular resonance region, which carries the kinematic mass-sensitivity, is mandatory and sufficient to control the top mass scheme with NLO ($\mathcal{O}(\alpha_s)$) precision.
2. The CB formalism for massive quarks (155), and thus also the angular-ordered PS in Herwig 7, is NLL precise in the top mass sensitive singular resonance region and is fully equivalent to the NLL QCD factorization predictions of Refs. (76,154).
3. For vanishing infrared regularization (i.e. $Q_0 = 0$) the quark mass parameter appearing in the CB formalism at NLL (defined in an expansion in powers of $\alpha_s$ and logarithms) agrees with the pole mass $m_t^{\rm pole}$ to NLO, i.e. $\mathcal{O}(\alpha_s)$. This does not, however, apply to angular-ordered PSs because their evolution requires a finite shower cut $Q_0 > \Lambda_{\rm QCD}$ to avoid infinite parton multiplicities and the strong coupling Landau pole.
4. In angular-ordered PSs the shower cut $Q_0$ represents the minimal transverse momentum $p_\perp$ of radiated gluons or other partons that emerge from the showering and splittings. It can also be seen as a resolution scale or an infrared cutoff. An analysis of large-angle soft radiation as well as ultracollinear radiation with respect to effects linear in $Q_0$ was carried out for the PS and the QCD calculation. This amounts to using a finite $Q_0$ for the PS and imposing the $Q_0$ cut in the QCD calculation in the pole scheme, accounting also for the mass counterterm (which is absent in the CB algorithm$^{13}$). For the large-angle soft radiation the linear $Q_0$ cutoff dependence is physical and represents a factorization scale at the interface to nonperturbative effects (which is known as the linear power correction $\alpha_0$ (156) or $\Omega_1$ (157) in the tail of $e^+e^-$ event shape distributions). A change in $Q_0$ must therefore be compensated by a corresponding modification of nonperturbative contributions and effectively does not lead to a change at hadron level. For the ultracollinear radiation, terms linear in $Q_0$ are generated as well, but in the full QCD calculation their cumulative effect in a smeared distribution or a moment cancels so that there is no physical net effect. However, the finite $Q_0$ value entails that all virtual (self-energy and non-self-energy) ultracollinear radiation effects become unresolved and cancel$^{14}$ so that the pole of the top propagator is shifted away from $m_t^{\rm pole}$ (defined in the usual way without any infrared cut; recall the discussion after Eq. (10)) by a term linear in $Q_0$. This shifted mass of the top propagator pole has been called coherent branching (CB) mass and reads

$$m_t^{\rm CB}(Q_0) = m_t^{\rm pole} - \frac{2}{3}\,\alpha_s(Q_0)\,Q_0 + \mathcal{O}(\alpha_s^2 Q_0). \qquad (16)$$

The existence of a term linear in $Q_0$ on the RHS of this equality has precisely the same origin as the linear gluon mass term already shown in Eq. (5). Since the CB algorithm does not generate any self-energy corrections, the generator mass for finite $Q_0$ and at parton level is equal to $m_t^{\rm CB}(Q_0)$ rather than $m_t^{\rm pole}$, which implies $\Delta^{\rm pert}_{\rm CB,Herwig}(Q_0) = -\frac{2}{3}\,\alpha_s(Q_0)\,Q_0$. For boosted top quarks the effects linear in $Q_0$ on the large-angle soft and the ultracollinear radiation have an opposite sign, but also a different dependence with respect to the c.m. energy $Q$. Thus they can be analytically and numerically disentangled unambiguously at parton level. These linear contributions correctly exponentiate so that the mass change is consistently implemented in the resummed tower of logarithms.
5. The CB mass $m_t^{\rm CB}(Q_0)$ is a short-distance mass, so its relation to other short-distance masses is not affected by the pole mass renormalon ambiguity. For example, the numerical relation between the CB and the MSR mass reads $m_t^{\rm MSR}(Q_0) - m_t^{\rm CB}(Q_0) = 120 \pm 70$ MeV for the Herwig 7 shower cut $Q_0 = 1.25$ GeV, where 70 MeV is an estimate of the missing two-loop correction. This allows one to relate the CB mass to all other known short-distance masses with the same precision. It can also be seen that perturbation theory still works well at a scale of 1.25 GeV (which is close to the charm quark mass). Reducing this perturbative uncertainty would require the determination of the $\mathcal{O}(\alpha_s^2)$ term in Eq. (16) in the context of a NNLL order precise CB algorithm. The difference of the CB and the pole mass can be determined using the relation between the MSR and the pole mass shown in footnote 11. This gives $m_t^{\rm pole} - m_t^{\rm CB}(Q_0) = 480 \pm 260$ MeV, which can be considered an all-order relation that can never be made more precise.
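The numbers quoted in items 4 and 5 can be cross-checked with a few lines of arithmetic. The following Python sketch evaluates the leading term $-\frac{2}{3}\alpha_s(Q_0)Q_0$ and combines the two quoted mass differences. Two inputs are assumptions that do not come from the text: the value `ALPHA_S_Q0 = 0.4` is only an illustrative choice for the strong coupling at 1.25 GeV, and the quadrature decomposition of the 260 MeV uncertainty is one possible reading, not a statement of how the quoted number was actually obtained.

```python
import math


def cb_minus_pole_leading(alpha_s_q0: float, q0_gev: float) -> float:
    """Leading-order m_CB(Q0) - m_pole in GeV: -(2/3) * alpha_s(Q0) * Q0."""
    return -(2.0 / 3.0) * alpha_s_q0 * q0_gev


Q0_GEV = 1.25      # Herwig 7 shower cut quoted in the text
ALPHA_S_Q0 = 0.4   # ASSUMPTION: illustrative coupling value, not from the text

# Leading linear-in-Q0 shift, converted to MeV.
leading_shift_mev = 1000.0 * cb_minus_pole_leading(ALPHA_S_Q0, Q0_GEV)

# Mass differences quoted in the text, as (central value, uncertainty) in MeV.
MSR_MINUS_CB = (120.0, 70.0)
POLE_MINUS_CB = (480.0, 260.0)

# Implied central value of m_pole - m_MSR(Q0).
pole_minus_msr_mev = POLE_MINUS_CB[0] - MSR_MINUS_CB[0]

# ASSUMPTION: if the 260 MeV were a quadrature sum of the 70 MeV two-loop
# estimate and one further component, that component would be:
implied_extra_unc_mev = math.sqrt(POLE_MINUS_CB[1] ** 2 - MSR_MINUS_CB[1] ** 2)

print(f"leading CB - pole shift : {leading_shift_mev:.0f} MeV")
print(f"implied pole - MSR      : {pole_minus_msr_mev:.0f} MeV")
print(f"implied extra uncertainty: {implied_extra_unc_mev:.0f} MeV")
```

Under these assumptions the leading term alone accounts for roughly two thirds of the quoted all-order difference of 480 MeV, and the implied extra uncertainty comes out close to 250 MeV.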
What can we learn from the results of the analysis? Let me start with some comments concerning its restrictions. The restriction to boosted top quarks goes hand in hand with the fact that both the CB formalism for massive quarks (155) (as well as dipole-type shower algorithms) and the QCD factorization approach of Ref. (76) only apply in the quasi-collinear regime. For slow top quarks a QCD factorization approach disentangling the individual top quarks from each other does not exist and the use of branching algorithms is an extrapolation (even though no serious problems seemingly appear in the description of top events provided by MMCs). Conceptual and analytic first principles studies of the top quark generator mass for the bulk top quarks are therefore strictly speaking impossible with the current set of theoretical tools, and one has to rely on extrapolation studies starting from the boosted regime. On the other hand, this makes precision studies and top mass measurements with boosted top quarks, which become available with high statistics at the HL-LHC, interesting; see Refs. (158,159,160) for recent CMS and ATLAS measurements. The restriction to a global $e^+e^-$ dijet event-shape in the resonance region (where the top decay is treated fully inclusively) entails, as already mentioned, that the analysis is only sensitive to QCD radiation in the production stage of the top quarks. Corresponding global event-shape observables at the LHC are considerably more involved due to the effects of initial state radiation, underlying event contamination and long-distance color correlations. For boosted top quarks, however, the basic simplicity of QCD factorized predictions for $e^+e^-$ collisions can be largely maintained also in hadron-hadron collisions if soft-drop groomed jet mass observables are considered (79,161), so that an LHC study in analogy to Ref. (24) is not unfeasible.
Furthermore, $e^+e^-$ dijet event-shapes differ conceptually from the observables employed in the direct measurements, which use observables differential in the top decay. This restriction can be lifted by considering more differential observables, and technology to do so is available from the vast knowledge obtained in the theory of B meson decays (36,37) and contemporary progress in factorized calculations (162). The soft-drop groomed top jet mass analysis in Ref. (79) goes in that direction as well, but considers an observable not yet analyzed by the experimental collaborations. The restriction to the NWA has been applied since state-of-the-art PSs do not provide a systematic treatment of the top quark width. The Herwig 7 generator uses the NWA, and Pythia is based on a NWA supplemented by throwing a random top mass value around the nominal top generator mass following a Breit-Wigner-type distribution. In the analysis of Ref. (121) it was shown that using different approaches to treat the top decay and finite-lifetime effects can affect a top mass determination at the 0.5 or even 1 GeV level, so this restriction is a very serious one too. For the 2-jettiness QCD calculation the treatment of the leading finite width effects is well understood (78). Finally, the restriction to the CB formalism was motivated by the fact that it is designed to work well for global observables and allows for a straightforward analytic solution and comparison to the predictions of QCD factorization. This restriction can in principle be lifted by a dedicated study of the dipole shower formalism, which is more elaborate analytically.$^{15}$ It is clear that the restrictions just mentioned need to be lifted to resolve the interpretation problem for contemporary direct top quark mass measurements, but they reflect at the same time the principal limitations of the state-of-the-art MMCs which should be remedied.
Furthermore, definite knowledge on the nonperturbative terms $\Delta^{\rm non\text{-}pert}_{\rm MC}(Q_0)$ and $\Delta_{\rm MC}$ needs to be gained. One way to do so is an analysis of the physical aspects of the hadronization models used in MMCs from the perspective of observables for which definite statements on the first principles QCD structure of hadronization corrections are available. Each of the restrictions, as well as each of the nonperturbative terms, may have a numerical impact at the level of a few hundred MeV to 0.5 GeV. The results obtained in Ref. (24) are therefore only a first step. The numerical analysis of Eq. (16) demonstrates that the partonic contribution in Eq. (15) is already of the size of the uncertainties quoted for current LHC direct mass measurements and that detailed analyses of all the terms on the RHS of Eq. (15) are mandatory. To the extent that the infrared behavior of PS algorithms for ultracollinear radiation is universal and NLL precise, results of the kind of Eq. (16), which applies to Herwig 7, should be rather observable-independent and even apply to other MMCs, even though more studies are needed to substantiate this view.
The minimal lesson to be learned from the analysis of Ref. (24) is that the identification of the direct mass measurements with the pole mass is field theoretically incorrect. There is clear evidence that the additional error associated with making the identification is at least of the same size as the quoted experimental direct measurement uncertainties. Furthermore, due to the different structure of the evolution variable of different PS algorithms, it appears natural that the physical meaning of the top mass parameters in different PSs should not be assumed to be universal. Overall, the analysis affirms that more highly developed and more precise MMCs (with respect to NLL accurate PSs and finite-lifetime effects) are a necessary requirement to resolve the top mass interpretation problem.

SUMMARY AND RECOMMENDATION
In this review, I have presented an overview of the problems involved in the question of how to interpret the direct top mass measurements, which quote the Monte-Carlo top mass parameter $m_t^{\rm MC}$, from a physical and conceptual perspective. They touch perturbative (parton showers (PSs) and finite width effects) as well as nonperturbative aspects and limitations of multipurpose Monte-Carlo event generators (MMCs), and each may amount to effects at the level of several hundred MeV to half a GeV. Many of them go beyond the reach of the standard approaches used in high-energy collider physics today and require some novel avenues beyond the current paradigm of achieving higher theoretical precision by using MMCs matched to fixed-order perturbative computations. The top mass interpretation problem expresses the demand that, in order to measure theoretically defined QCD parameters at hadron colliders, MMCs themselves provide first principles QCD predictions which are accurate to subleading order in QCD in order to control the renormalization scheme of their QCD parameters. This is not the case for state-of-the-art MMCs.
For the observables used in the direct measurement, for which the top mass sensitivity is tied to kinematic threshold structures, this means that the PS algorithms should have NLL precision and that hadronization models are employed that implement nonperturbative effects consistent with QCD and the electroweak theory. For the top quark mass the radiation that is soft in a top quark resonance frame plays the most important role. It is probably unrealistic to ask for this level of precision for all observables. But for a number of key observables leading to high-precision top mass measurements with control over the mass scheme at NLO, I believe such an achievement is realistic in the near future. The relation of the MMC top mass parameter to any mass scheme and the question of universality and observable independence could then be obtained from computations rather than speculations. Developments in the direction of NLL precise PSs are already under way, e.g. concerning a more precise description of the parton splitting (165,166,167), the restriction of dipole-type showers for global observables (168,169), finite-lifetime effects (170) or full color coherence (171,172), but there is still a long way to go.
How should one deal with the top mass interpretation problem today? It is well understood that the m MC Clearly a deeper understanding is crucial to obtain a reliable and systematic high-precision top mass measurement at the HL-LHC. Much work is still needed to analyze how much the dynamical effects of the hadronization models affect the meaning of $m_t^{\rm MC}$ and to carry out similar analyses for observables closer to those used in the LHC measurements.
At this time there is no general consensus on how to quantify the interpretation problem of the direct top mass measurements for making mass dependent theoretical predictions. It is left to the decision of the individual how to deal with the issue. I hope that this review provides the reader with deeper insight for her or his choice. Most often the identification $m_t^{\rm MC} = m_t^{\rm pole}$ is made, sometimes supplemented by adding another uncertainty of the order of the quoted experimental uncertainty. If this approach is adopted, I recommend, as a practical (neither very conservative nor very optimistic) attitude for the time being, that the uncertainty to be added is 0.5 GeV accounting for the interpretation problem plus 250 MeV for the pole mass renormalon ambiguity. As an additional option, which accounts for the existing evidence that the MMC generator top masses are short-distance masses and reflects a somewhat less conservative attitude, I recommend using the identification $m_t^{\rm MC} = m_t^{\rm MSR}(1.3\,{\rm GeV})$, adding an uncertainty of 0.5 GeV quantifying the interpretation problem. In this approach the pole mass renormalon ambiguity comes back in the conversion to $m_t^{\rm pole}$ for predictions made in the pole mass scheme (the outcome just differs by a 350 MeV shift in the central value with respect to the first approach). But the pole mass renormalon ambiguity is avoided completely when considering only predictions made in short-distance mass schemes. In this context one should employ the method explained in Sec. 2.2 to convert between pole and short-distance masses.
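The two options just described can be compared in a minimal Python sketch. Several ingredients are assumptions and not part of the recommendation itself: the uncertainties are combined in quadrature (the text does not prescribe how to combine them), the 350 MeV shift is applied as a fixed offset with $m_t^{\rm pole}$ above $m_t^{\rm MSR}(1.3\,{\rm GeV})$ instead of the full conversion of Sec. 2.2, and the input values `m_mc = 172.5` and `exp_unc = 0.5` are hypothetical placeholders rather than a measurement.

```python
import math

INTERPRETATION_UNC = 0.5  # GeV, interpretation problem (from the text)
RENORMALON_UNC = 0.25     # GeV, pole mass renormalon ambiguity (from the text)
POLE_MINUS_MSR = 0.35     # GeV, quoted central-value shift; sign is an ASSUMPTION


def combine(*uncertainties: float) -> float:
    """ASSUMPTION: combine independent uncertainties in quadrature."""
    return math.sqrt(sum(u * u for u in uncertainties))


def option_pole(m_mc: float, exp_unc: float) -> tuple[float, float]:
    """Option 1: identify m_MC with the pole mass."""
    return m_mc, combine(exp_unc, INTERPRETATION_UNC, RENORMALON_UNC)


def option_msr(m_mc: float, exp_unc: float) -> tuple[float, float]:
    """Option 2: identify m_MC with the MSR mass at 1.3 GeV."""
    return m_mc, combine(exp_unc, INTERPRETATION_UNC)


def msr_to_pole(m_msr: float, unc: float) -> tuple[float, float]:
    """Fixed-offset stand-in for the conversion of Sec. 2.2 (ASSUMPTION)."""
    return m_msr + POLE_MINUS_MSR, combine(unc, RENORMALON_UNC)


# Hypothetical inputs in GeV, for illustration only.
m_mc, exp_unc = 172.5, 0.5

pole_1, unc_1 = option_pole(m_mc, exp_unc)
msr_2, unc_msr_2 = option_msr(m_mc, exp_unc)
pole_2, unc_2 = msr_to_pole(msr_2, unc_msr_2)

print(f"option 1: m_pole = {pole_1:.2f} +- {unc_1:.2f} GeV")
print(f"option 2: m_MSR  = {msr_2:.2f} +- {unc_msr_2:.2f} GeV")
print(f"option 2: m_pole = {pole_2:.2f} +- {unc_2:.2f} GeV")
```

With quadrature combination both routes give the same total uncertainty on the pole mass and differ only by the shift in the central value, while the MSR result of option 2 carries no renormalon contribution at all.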