The top quark discovery
The top quark, the heaviest elementary particle known to date, was discovered by the CDF and D0 collaborations in 1995, after almost two decades of searches.
The Standard Model of particle physics describes the interactions between the elementary building blocks of matter, quarks and leptons. Quarks and leptons of left chirality are grouped in pairs (doublets), with the weak interaction transforming one element of a pair into the other. The two lightest quarks, u and d, which are the building blocks of protons and neutrons, form one such doublet. Quarks and leptons are further organised in generations (i.e. replicas of the doublets) with increasing masses. All the matter currently in the universe is made of elements of the first generation: the u and d quarks, the electron and the electron neutrino. A second, heavier generation is made of the c and s quarks, the muon and the muon neutrino. The last element of this second generation to be found, the c quark, was discovered in 1974 through the observation of the J/psi state (made of a c quark and an anti-c quark). An obvious question was then whether a third, heavier generation also existed. Within the Standard Model, the existence of a third generation would also provide a mechanism for an asymmetry between matter and anti-matter, which had been observed experimentally for the first time in the 1960s. Thanks to higher-energy accelerators, a heavier quark, called the b quark, was discovered soon after, in 1977 at Fermilab. The mass of the b quark is about 5 GeV/c², five times the mass of the proton. The charged lepton of the third generation, the tau lepton, was also discovered in the mid-1970s. The search for the second quark of this third generation, known as the top quark, then became an important goal of experiments at the high-energy frontier.
Constraints on the mass of the top quark
The first expectation was that the mass of the top quark would be of the same order of magnitude as that of the b quark, perhaps a few tens of GeV/c². Several colliders able to probe this mass range became available in the late 1970s and in the 1980s. Searches for the top quark were performed, but with negative results. The lower bound on the top quark mass was pushed higher and higher, and was set to 69 GeV/c² in 1989 by the UA2 experiment at the CERN proton-antiproton collider. At the same time, several indirect constraints on the top quark mass started to emerge.
First, precision measurements of the forward-backward asymmetry of b quark production in e+e- collisions clearly indicated that the b quark had to belong to a weak-interaction doublet, like the quarks of the other generations, so theories with only a b quark as a third-generation quark were ruled out.
The top quark can also play an important role in several observables through virtual effects related to quantum corrections, which affect observables at energy scales significantly smaller than the top quark mass. One of the first indirect indications of a heavy top quark came from the measurement of the fast oscillation frequency between neutral mesons made of b and d (anti)quarks. In the Standard Model, these oscillations arise from diagrams involving virtual top quark exchange, and the frequency of the oscillations increases with the mass of the top quark. Experimental measurements favoured a heavy top quark.
However, the strongest indirect constraint on the mass of the top quark emerged in the late 1980s and early 1990s, when the first precise measurements of the Z boson properties at the LEP e+e- collider at CERN became available. The Z boson is, with the charged W boson, one of the carriers of the electroweak interaction. The Standard Model predicts a relation between the masses of the Z boson and of the W boson. This relation can be computed very accurately in perturbation theory and is sensitive to the exchange of virtual particles, which affects the propagators and thus the masses of the W and Z bosons. The exchange of virtual top quarks plays a significant role, so the relation between the W and Z masses depends on the assumed value of the top quark mass. By 1991, precision measurements of the Z boson mass at LEP, together with W boson mass measurements performed at the CERN and Fermilab proton-antiproton colliders, strongly favoured a top quark mass in the range 110-200 GeV/c², much higher than originally thought. These constraints were further refined in the following years, thanks to precision measurements of several observables related to the electroweak interaction at LEP.
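The sensitivity of the W/Z mass relation to the top quark mass can be illustrated numerically. The text does not give a formula; the sketch below assumes the standard leading one-loop expression for the top quark contribution to the rho parameter, delta_rho = 3 G_F m_t² / (8 √2 π²), with the b quark mass neglected. The function name and the list of trial masses are illustrative choices, not taken from the original analyses.

```python
import math

# Fermi constant in GeV^-2 (rounded; used here only for illustration)
G_F = 1.1664e-5


def delta_rho_top(m_top_gev: float) -> float:
    """Leading one-loop top quark contribution to the rho parameter.

    Assumes the standard expression 3 G_F m_t^2 / (8 sqrt(2) pi^2),
    neglecting the b quark mass and all other corrections.
    """
    return 3.0 * G_F * m_top_gev**2 / (8.0 * math.sqrt(2.0) * math.pi**2)


if __name__ == "__main__":
    # The correction grows quadratically with the assumed top quark mass
    for mass in (50.0, 110.0, 175.0, 200.0):
        print(f"m_top = {mass:5.0f} GeV/c^2 -> delta_rho = {delta_rho_top(mass):.4f}")
```

Under this assumption the correction is close to one percent for a mass of 175 GeV/c², which is why percent-level or better measurements of the W and Z masses could constrain the top quark mass before its direct observation.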
With a mass in this range, a direct observation and discovery of the top quark was only possible at the Fermilab Tevatron proton-antiproton collider.
The Tevatron Collider
The Tevatron was a proton-antiproton collider at the Fermi National Accelerator Laboratory (Fermilab), near Chicago. It was a ring of 6.28 km circumference. Superconducting magnets, generating a magnetic field of 4 tesla, were used to bend the trajectories of the protons and anti-protons, which circulated in the same beam pipe in opposite directions. Protons and anti-protons were brought into collision at two points, around which the CDF and D0 detectors were located. The main advantage of accelerating protons rather than electrons is that, thanks to the high proton mass, the synchrotron radiation emitted when charged particles are bent is much smaller, so a much higher energy can be reached for a given collider radius. Colliding protons with anti-protons is possible with a single beam pipe and the same magnetic field, since protons and anti-protons travelling in opposite directions follow the same trajectory thanks to their opposite electric charges. The main challenge of a proton-antiproton collider is to accumulate enough anti-protons to produce a sufficiently high collision rate. The Tevatron started operations in the late 1980s and ran until 2011, with many improvements made along the way. The description below corresponds to the "run 1" period from 1992 to 1995, during which the top quark was discovered.
Anti-protons were produced by sending protons of 120 GeV energy (accelerated in the Main Ring accelerator, which shared the same tunnel as the Tevatron) onto a nickel target. Anti-protons produced in the collisions between the protons and the nuclei of the target were focused using a lithium lens and collected in the debuncher ring. For every million protons hitting the target, only 20 anti-protons were collected. To produce enough anti-protons, about 3×10¹² protons per pulse were sent to the target, and pulses were repeated every 2.4 s. The collected anti-protons had a large energy spread. To use them in a collider, their phase-space dispersion (in position and momentum) had to be strongly reduced. This was achieved with the stochastic cooling technique, pioneered in the early 1980s at the CERN proton-antiproton collider. Anti-protons were then accumulated in a dedicated ring at an energy of 8.2 GeV. Once enough anti-protons had been accumulated (about 10¹², which typically took one day), they were extracted, sent to the Main Ring and then to the Tevatron, where they circulated in the opposite direction to the protons. The beams were then accelerated to an energy of 900 GeV per beam, leading to a centre-of-mass energy for the collisions of 1800 GeV = 1.8 TeV. Six bunches of anti-protons and six bunches of protons circulated at the same time. Given the number of protons per bunch (about (1-2)×10¹¹), the number of anti-protons per bunch ((3-5)×10¹⁰), the phase-space dispersion of the bunches and the focusing applied when bringing the bunches into collision, the luminosity was in the range (5-25)×10³⁰ cm⁻² s⁻¹. The luminosity is a key parameter of an accelerator, as the rate of events observed for a given process is the product of the cross-section of this process and the luminosity.
The integrated luminosity recorded by the CDF and D0 experiments during run 1 was 110 to 120 pb⁻¹ (1 pb⁻¹ corresponds to 10³⁶ cm⁻²).
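The relation between cross-section, luminosity and event rate, and the unit conversion quoted above, can be made concrete with a short calculation. This is a minimal sketch: the function names are illustrative, and the 1 pb cross-section and 10⁵ s running time used in the example are hypothetical round numbers, not values from the experiments.

```python
# 1 picobarn expressed in cm^2 (1 barn = 1e-24 cm^2)
PB_IN_CM2 = 1e-36


def event_rate_hz(cross_section_pb: float, luminosity_cm2_s: float) -> float:
    """Event rate = cross-section x instantaneous luminosity."""
    return cross_section_pb * PB_IN_CM2 * luminosity_cm2_s


def integrated_luminosity_pb_inv(luminosity_cm2_s: float, seconds: float) -> float:
    """Integrated luminosity in pb^-1 for a constant instantaneous luminosity."""
    return luminosity_cm2_s * seconds * PB_IN_CM2


if __name__ == "__main__":
    lumi = 1e31  # cm^-2 s^-1, inside the run 1 range quoted in the text
    # Hypothetical process with a 1 pb cross-section
    print(f"rate for a 1 pb process: {event_rate_hz(1.0, lumi):.1e} Hz")
    # Hypothetical 1e5 s of running at constant luminosity
    print(f"integrated luminosity: {integrated_luminosity_pb_inv(lumi, 1e5):.2f} pb^-1")
```

At a luminosity of 10³¹ cm⁻² s⁻¹, roughly 10⁵ s of uninterrupted collisions are needed to accumulate 1 pb⁻¹, which gives a feeling for the time needed to collect the run 1 dataset.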
Figure 1 shows a schematic view of the accelerator complex during run 1.
The CDF and D0 Experiments
The CDF and D0 experiments were general-purpose experiments built to study proton-antiproton collisions. Starting from the interaction point, there were first lightweight detectors to measure the trajectories of charged particles produced in the collisions. The energy of particles was then measured using calorimeter systems, and the outer layers of the detectors were instrumented to identify muons, which are the only charged particles expected to traverse the full calorimeter. A schematic view of the CDF and D0 detectors can be found in Figure 2.
During run 1, the main difference between CDF and D0 was that CDF had a magnetic field in the inner region before the calorimeter, whereas D0 only detected charged particles without measuring their momenta, except in the case of muons. In CDF, this axial magnetic field (with a strength of 1.4 T) was generated by a superconducting magnet located between the inner detector and the calorimeter. It enabled the measurement of the momentum of charged particles produced in the collision, as well as of their electric charge. This measurement was performed mostly by a large cylindrical wire chamber. The innermost part of the CDF detector was made of silicon strip sensors. It consisted of four layers of sensors at distances ranging from 3 to 8 cm from the beam pipe. The strip pitch was 50 to 60 μm, allowing a very precise measurement of the trajectory of charged particles in the plane orthogonal to the beam axis. This was the first silicon vertex detector to operate in a hadron collider experiment. As explained later, it provided important information to identify the decay products of the top quark and played a major role in the CDF result. The D0 calorimeter was a sampling calorimeter made of liquid argon and depleted uranium, providing better energy measurements for hadrons. The outer layer of D0 was instrumented with a toroidal iron magnet and muon detection chambers, while CDF had muon detection chambers behind the return yoke of the solenoid.
Production of top quarks in proton-antiproton collisions
Protons and anti-protons are not elementary particles but are made of quarks and gluons. Like them, the top quark is sensitive to the strong interaction, so the main production mode of top quarks is pair production via strong interaction processes, either from a quark-antiquark initial state or from a gluon-gluon initial state. Examples of lowest-order Feynman diagrams are shown in Figure 3. At the Tevatron, the quark-antiquark annihilation process dominates, thanks to the valence quarks of the proton and the valence antiquarks of the anti-proton. In the early 1990s, next-to-leading order corrections to the production cross-section had already been computed, allowing the cross-section to be predicted to an accuracy of about 15%. Other production modes involving the weak interaction are possible but have a smaller cross-section and did not contribute to the top quark discovery.
The production cross-section decreases significantly as a function of the assumed top quark mass. For a mass of 175 GeV/c², the predicted cross-section is about 4.8 pb. This means that in a 20 pb⁻¹ sample, only about 100 top-antitop pairs are produced. This can be compared to the total number of inelastic proton-antiproton collisions, which is more than 10¹² (a thousand billion) in the same dataset. A huge background rejection was therefore required, while maintaining a good efficiency to select signal events.
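The numbers in this paragraph follow directly from multiplying a cross-section by the integrated luminosity. The short sketch below reproduces them; the top pair cross-section and the 20 pb⁻¹ sample size are taken from the text, while the inelastic proton-antiproton cross-section of about 60 mb is an assumed round value introduced here only to illustrate the order of magnitude.

```python
def expected_events(cross_section_pb: float, integrated_luminosity_pb_inv: float) -> float:
    """Expected number of produced events: N = sigma x integrated luminosity."""
    return cross_section_pb * integrated_luminosity_pb_inv


# Values quoted in the text
TTBAR_XSEC_PB = 4.8          # for a top quark mass of 175 GeV/c^2
SAMPLE_PB_INV = 20.0

# Assumption: inelastic cross-section of about 60 mb (1 mb = 1e9 pb)
INELASTIC_XSEC_PB = 60e9

n_ttbar = expected_events(TTBAR_XSEC_PB, SAMPLE_PB_INV)
n_inelastic = expected_events(INELASTIC_XSEC_PB, SAMPLE_PB_INV)

if __name__ == "__main__":
    print(f"top-antitop pairs produced: {n_ttbar:.0f}")
    print(f"inelastic collisions:       {n_inelastic:.1e}")
    print(f"signal fraction:            {n_ttbar / n_inelastic:.1e}")
```

With these inputs, fewer than one collision in ten billion produces a top-antitop pair, before any selection efficiency is taken into account.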
Strategies for top quark searches
The top quark is unstable and decays promptly via the weak interaction to a b quark (its partner in the same generation) and a W boson. This is possible because the top quark mass is larger than the sum of the b quark (5 GeV/c²) and W boson (80 GeV/c²) masses. Given the large mass of the top quark, there is a large phase space for this decay, and the decay width is predicted to be about 1.8 GeV for a top quark mass of 175 GeV/c². This means that the lifetime of the top quark is extremely short, below 10⁻²⁴ s. Therefore only the decay products of the top quark can be observed experimentally, and the search strategy for the top quark is driven by the decay modes of the W boson. In an event with a top-antitop pair, two W bosons of opposite charge are produced. The W boson is also a very short-lived particle. Because of the universality of the weak interaction, it can decay either to a lepton-neutrino pair or to a quark-antiquark pair, with relative decay rates of about 10% to electron + neutrino, 10% to muon + neutrino, 10% to tau + neutrino and 70% to quark-antiquark pairs. Quarks and antiquarks are abundantly produced by the strong interaction in proton-antiproton collisions, so it is advantageous to categorise the events by the number of leptons produced, considering together electrons and muons (light leptons), which are "easy" to detect, identify and measure precisely, and treating separately tau leptons, which are short-lived and decay before interacting with the detector. Events with two light leptons give in principle the cleanest final state, but only 5% of the produced top-antitop pairs lead to this final state, so searches in this channel are limited by the small signal yield. Events with one light lepton from a W boson decay, with the other W boson decaying to quarks, are more abundant, corresponding to about 30% of the produced events.
Events with only quarks from the W boson decays and/or with tau leptons can be somewhat more abundant but suffer from a much worse signal-to-background ratio and did not contribute to the discovery of the top quark. One drawback of selecting events with W decays to leptons is the presence of neutrinos in the final state. Neutrinos interact only weakly, so they escape the detector without leaving any signal, and their presence can only be inferred indirectly from an apparent imbalance of the momentum of all detected particles. In practice, a large number of particles from the fragments of the interacting protons and anti-protons, which are not elementary particles, are produced at small angles to the beam direction and escape detection. Therefore, only the momentum imbalance in the plane orthogonal to the beam direction can be exploited, which further reduces the knowledge of the kinematics of the full event, especially in the two-lepton final state, where two neutrinos are produced but only the sum of their transverse momenta is constrained.
Events with one or two leptons in the final state are relatively easy to trigger on, i.e. to identify in real time. The trigger system reduced the 300 kHz proton-antiproton bunch crossing rate to about 20-30 Hz of data for which all the detector information was recorded and made available for further analysis. This reduction was done in several steps, using a combination of dedicated hardware systems and fast software-based processing of the full detector information. Selected events with one or two leptons in the final state mostly contain decay products of W and Z bosons produced in proton-antiproton collisions. The cross-section for these processes is much smaller than the total proton-antiproton inelastic cross-section but is still much larger than the top signal production cross-section, by a factor close to 1000. Further background rejection is thus needed after the initial selection of the recorded events, especially for final states with only one lepton, where the dominant background is the production of a W boson (decaying to a lepton) in association with several jets (collimated sets of hadrons initiated by high-energy quarks or gluons). Several techniques can be used, based for instance on detailed studies of the kinematics of the events, exploiting the differences predicted between top quark pair production and W boson production. In the following paragraph, a different technique exploiting the silicon vertex detector of the CDF experiment is described in more detail.
The presence of b and anti-b quarks among the top decay products can also be exploited. These quarks produce high-energy jets of hadrons, each containing one weakly decaying hadron made of a b (or anti-b) quark and a light anti-quark (or light quarks). Jets can be detected and their energy measured with the calorimeter system. Hadrons containing a b (or anti-b) quark (called "B-hadrons") have a lifetime of about 1.5 ps, which is long enough to produce a visible displacement between the proton-antiproton interaction point where they are produced and the point where they decay. For B-hadrons produced in top quark decays, this displacement is on average 0.3 mm. It can be measured if the tracks of the charged particles produced in the B-hadron decay are precisely reconstructed, either by measuring a significant impact parameter of the track trajectory with respect to the primary vertex or by reconstructing the decay vertex from the intersection of the trajectories of several tracks. A sketch of a secondary vertex in a jet initiated by a b quark is shown in Figure 4. The silicon vertex detector installed in CDF had a high enough resolution to enable this discrimination. The main algorithm used by the CDF experiment for b-jet identification during run 1 was based on the reconstruction of a secondary vertex. The efficiency to select a jet initiated by a b quark was about 25%, with a rejection factor of several hundred for jets not containing long-lived hadrons made of heavy quarks. In rare cases, b quarks can be produced in association with W bosons in the W+jets background process. Taking this into account, the overall background rejection was around 50, with an efficiency per top-antitop event of about 40%.
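The channel fractions and the lifetime quoted above can be cross-checked with simple arithmetic. The sketch below uses the rounded W decay rates given in the text (10% per lepton flavour, 70% to quarks) and the quoted width of 1.8 GeV; the function and key names are illustrative. Because the inputs are rounded, the results (4% and 28%) only approximately match the "5%" and "about 30%" of the text.

```python
# Reduced Planck constant in GeV s (rounded)
HBAR_GEV_S = 6.582e-25

# Rounded W boson decay rates quoted in the text
BR_W = {"e": 0.10, "mu": 0.10, "tau": 0.10, "hadrons": 0.70}


def ttbar_channel_fractions(br: dict) -> dict:
    """Fractions of top-antitop events per final state, from the two W decays."""
    light = br["e"] + br["mu"]      # light leptons: electrons or muons
    had = br["hadrons"]
    dilepton = light * light        # both W bosons decay to a light lepton
    lepton_jets = 2.0 * light * had  # one light lepton, one hadronic W decay
    all_hadronic = had * had
    # Everything else contains at least one tau lepton
    with_tau = 1.0 - dilepton - lepton_jets - all_hadronic
    return {
        "dilepton": dilepton,
        "lepton+jets": lepton_jets,
        "all-hadronic": all_hadronic,
        "with tau": with_tau,
    }


def lifetime_s(width_gev: float) -> float:
    """Mean lifetime from the decay width: tau = hbar / Gamma."""
    return HBAR_GEV_S / width_gev


if __name__ == "__main__":
    for channel, fraction in ttbar_channel_fractions(BR_W).items():
        print(f"{channel:12s}: {fraction:.2f}")
    print(f"top quark lifetime: {lifetime_s(1.8):.1e} s")
```

The all-hadronic and tau categories together account for about two thirds of the events, consistent with the statement that they are more abundant than the channels used for the discovery.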
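The bunch crossing rate and the reduction factor required from the trigger can be estimated from numbers given earlier in the article. This is a rough sketch assuming that the particles travel at essentially the speed of light and that the six bunches of each beam lead to six crossings per turn at a given interaction point; the function names are illustrative.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def bunch_crossing_rate_hz(circumference_m: float, n_bunches: int) -> float:
    """Crossing rate at one interaction point for ultra-relativistic beams."""
    revolution_frequency = SPEED_OF_LIGHT_M_S / circumference_m
    return n_bunches * revolution_frequency


def trigger_rejection(input_rate_hz: float, output_rate_hz: float) -> float:
    """Factor by which the trigger must reduce the event rate."""
    return input_rate_hz / output_rate_hz


if __name__ == "__main__":
    # 6.28 km ring and six bunches per beam, as quoted for run 1
    rate = bunch_crossing_rate_hz(6280.0, 6)
    print(f"bunch crossing rate: {rate / 1e3:.0f} kHz")
    # Recording about 25 Hz (middle of the quoted 20-30 Hz range)
    print(f"required rejection:  {trigger_rejection(rate, 25.0):.0f}")
```

The estimate gives a crossing rate just below 300 kHz, in line with the value quoted above, and shows that only about one crossing in ten thousand could be recorded.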
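The per-event efficiency can be related to the per-jet efficiency with a simple combinatorial estimate. The sketch below assumes that each top-antitop event contains two b jets that are both within the acceptance of the vertex detector and are tagged independently; this is an idealisation introduced here, not the method used by CDF.

```python
def event_tag_efficiency(per_jet_efficiency: float, n_b_jets: int = 2) -> float:
    """Probability that at least one of n independent b jets is tagged."""
    return 1.0 - (1.0 - per_jet_efficiency) ** n_b_jets


if __name__ == "__main__":
    # Per-jet efficiency of about 25%, as quoted in the text
    print(f"per-event efficiency: {event_tag_efficiency(0.25):.3f}")
```

Under these assumptions the estimate is slightly above the quoted value of about 40%, as expected if not every b jet falls within the detector acceptance.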
First results from a partial dataset
With the run 1a dataset (19 pb⁻¹), CDF performed analyses based on the two-lepton final state and on the one lepton + jets final state, using either secondary vertex tagging as described above or low-energy charged leptons produced in B-hadron decays as another way to identify jets originating from b quarks. These analyses took some time to develop, as the techniques to perform the b-quark identification had to be optimised and the various methods to compute the background had to be scrutinised. An excess of events was observed, and the kinematics of the reconstructed events pointed towards a mass close to 175 GeV/c². However, the number of events observed was somewhat higher than expected for this mass. Several features of the events were consistent with top quark pair production, but some were less so, although these studies were severely limited by the small amount of data available. CDF therefore decided to release in 1994 a long paper describing the studies, quoting the significance of the excess over the background-only expectation based only on a simple count of the total number of candidates across all analyses, without taking into account additional information from the kinematics of the events. In total, 12 candidates were observed, with an expected number of background events of 5.7 ± 0.5. The p-value of the background-only hypothesis (i.e. the probability to get a yield at least as high as that observed in the data if only background is present), based on the number of "tags", i.e. giving more weight to events with two candidate b quarks, was 0.26%, corresponding to an excess of slightly less than 3 standard deviations.
The D0 analysis with the run 1a dataset used events with two lepton candidates (electron-muon or electron-electron pairs) and lepton+jets candidates with four jets. In the lepton+jets events, a topological selection based on the aplanarity of the observed objects (how much the event topology deviates from a plane) was applied to reduce the background from W+jets events. The observed data yields (1, 1, 1 and 0 candidates in the electron-muon, electron-electron, electron+jets and muon+jets categories) were consistent with the background expectations (1.1 ± 0.3, 0.5 ± 0.2, 2.7 ± 1.3 and 1.6 ± 0.9, respectively). This analysis was optimised for a top mass in the range 100-160 GeV/c². As no excess was found, a lower limit on the top quark mass was set at 131 GeV/c².
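The correspondence between a p-value and a number of standard deviations can be checked with the inverse cumulative distribution of a standard normal variable. The sketch below assumes the usual one-sided convention, in which the p-value is the tail probability above Z standard deviations; the function name is illustrative.

```python
from statistics import NormalDist


def significance_from_p_value(p_value: float) -> float:
    """One-sided Gaussian significance Z such that P(x > Z) = p_value."""
    # inv_cdf(p) is the (negative) quantile of the lower tail; flip the sign
    return -NormalDist().inv_cdf(p_value)


if __name__ == "__main__":
    # p-value of 0.26% quoted for the 1994 CDF result
    print(f"Z = {significance_from_p_value(0.0026):.2f} standard deviations")
```

For the quoted p-value of 0.26% this gives a significance of about 2.8, i.e. slightly less than 3 standard deviations.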
The discovery in 1995
After the first results from run 1a, the experiments prepared to obtain new results quickly during run 1b, which was expected to deliver higher luminosity. Analysis strategies were optimised for a mass close to 175 GeV/c² prior to the data taking, and background estimation techniques were refined. In addition, the silicon vertex detector of CDF was replaced by a new detector more resistant to the radiation created by the proton-antiproton collisions. The secondary vertex reconstruction algorithm was also improved. It therefore did not take long after the start of run 1b for CDF and D0 to reach enough sensitivity to claim observation of the top quark. Using an extra 50 pb⁻¹ run 1b sample, both experiments published the discovery of top quark production in March 1995.
The CDF observation was based on a count of candidates, similarly to the 1994 results. The different categories were two-lepton events, lepton + jets events with secondary vertex b-tagging and lepton + jets events with a low-energy lepton consistent with a semileptonic B-hadron decay. The counts in these categories were 6 events, 27 tags and 23 tags, with background expectations of 1.3 ± 0.3, 6.7 ± 2.1 and 15.4 ± 2.0, respectively. The most sensitive category was the second one, with the secondary vertex reconstruction. Figure 5 (left) illustrates the distribution of the number of events as a function of the number of jets in candidate W+jets events, before and after requiring a secondary vertex tag. The top quark signal was expected in the 3-jet and 4-or-more-jet bins, which formed the signal region. Backgrounds which do not involve genuine b (or c) quark production were estimated with data-driven methods, while backgrounds from W + b bbar, W + c cbar and W + c production were estimated from simulation and validated in control regions. In the 1-jet and 2-jet regions, where little top signal is expected, the background computations matched the data very well, while a significant excess was observed in the signal-enriched regions. Figure 5 (left) also shows the reconstructed proper time distribution of the secondary vertices in the signal region, which is nicely consistent with the known B-hadron lifetime. Overall, the significance of the excess of the data yield above the background expectation, when summing all categories, was almost five standard deviations, with a p-value of about 1×10⁻⁶.
To estimate the mass of the top quark, a kinematic reconstruction based on the decay chains (t → b W → b lepton neutrino) and (tbar → bbar W → bbar q q'bar) (or charge conjugate) was performed, using events with one identified lepton, missing transverse momentum and at least four jets, of which at least one is consistent with originating from a b quark. Figure 5 (right) illustrates the event-by-event estimated top mass value. From that distribution, a top quark mass of 176 ± 8 (stat) ± 10 (syst) GeV/c² was derived, quite close to the now very well known top quark mass of 172.8 ± 0.3 GeV/c².
From the event yield, the top quark pair production cross-section was measured to be 6.8 +3.6/-2.4 pb, within one standard deviation of the theory prediction. Figure 6 shows one of the observed candidate events, which is very unlikely to originate from background. An independent analysis searching for top quark production in the lepton+jets final state, using kinematic variables as discriminants, was performed at the same time and also observed a significant excess; it was published shortly afterwards.
The D0 observation was based on an analysis re-optimised for a heavy top quark hypothesis, to which new event categories were added. Seven categories were considered in total: electron-muon events, di-electron events, di-muon events, one electron plus four jets, one muon plus four jets, one electron plus at least three jets and a soft muon tag, and one muon plus at least three jets and a soft muon tag. A soft muon tag (a reconstructed muon candidate of low momentum) was used to enrich the sample in candidates from semi-muonic B-hadron decays. Selections in the lepton+jets categories were based on the aplanarity of the event and also on the sum of the transverse energies of the lepton and jets (Ht), where the transverse energy is defined as the energy of the object multiplied by the sine of its polar angle with respect to the beam axis. Ht is a powerful kinematic variable to separate the top quark pair signal from the W+jets background. Backgrounds were estimated mostly using simulation for processes like W+jets and Z+jets, and using data for backgrounds involving wrongly identified electrons or muons.
Figure 7 shows a comparison of data and background prediction in the electron + ≥2 or ≥3 jet regions, validating the background modelling in these background-dominated regions. In total, 17 events were observed across all categories, with a background expectation of 3.8 ± 0.6 events. The p-value of the background-only hypothesis using only the total event yield was 2×10⁻⁶, corresponding to a significance of 4.6 standard deviations. To measure the top quark mass, a fit exploiting kinematic constraints under the top-antitop production hypothesis was performed, using the measurements of the reconstructed objects for the candidates in the electron or muon + ≥4 jet categories. Figure 8 (left) shows the reconstructed top mass distribution together with the background contribution and the best fitted top mass. Figure 8 (right) illustrates the reconstructed top mass distribution when a looser selection is applied. The distribution with this looser selection is well described by the sum of the background and signal contributions. The top quark mass reported by D0 was 199 ± 20 ± 22 GeV/c², somewhat higher than the CDF value but consistent with it within uncertainties. The top quark pair production cross-section was measured to be 6.2 ± 2.2 pb at this mass value.
The discovery of the top quark by the CDF and D0 experiments was the starting point of a long programme of measurements. With the Tevatron run 2 data (collected between 2001 and 2011) and upgraded detectors, the CDF and D0 collaborations were able to significantly improve the measurement of the top quark mass. Several other properties of the top quark were also studied, and the production of a single top quark via the weak interaction was observed. Since 2010, the CERN Large Hadron Collider (a proton-proton collider operating at a much higher energy, starting at 7 TeV in 2010 and reaching 13 TeV for 2015-2018) has become a top quark factory, allowing more precise probes of many aspects of top quark physics. So far, all measured properties are consistent with the Standard Model expectations. With its mass much larger than that of all other quarks and leptons, the top quark remains somewhat mysterious and could still be a window to physics beyond the Standard Model.
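The statement that the first CDF mass measurement is close to the current value can be quantified by combining the uncertainties in quadrature and computing the difference in units of the total uncertainty. This is a minimal sketch assuming that the statistical and systematic uncertainties are independent and Gaussian; the helper names are illustrative.

```python
import math


def quadrature_sum(*uncertainties: float) -> float:
    """Combine independent uncertainties in quadrature."""
    return math.sqrt(sum(u * u for u in uncertainties))


def pull(value_a: float, sigma_a: float, value_b: float, sigma_b: float) -> float:
    """Difference between two values in units of their combined uncertainty."""
    return (value_a - value_b) / quadrature_sum(sigma_a, sigma_b)


if __name__ == "__main__":
    # 1995 CDF measurement: 176 +- 8 (stat) +- 10 (syst) GeV/c^2
    cdf_total = quadrature_sum(8.0, 10.0)
    print(f"CDF total uncertainty: {cdf_total:.1f} GeV/c^2")
    # Comparison with the current value quoted in the text: 172.8 +- 0.3 GeV/c^2
    print(f"pull: {pull(176.0, cdf_total, 172.8, 0.3):.2f}")
```

With these inputs the total uncertainty of the 1995 measurement is about 13 GeV/c², and the measured value lies well within one standard deviation of the current one.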
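The definition of Ht given above translates directly into code. The sketch below represents each reconstructed object by its energy and polar angle; the event content is entirely hypothetical and serves only to illustrate the calculation, and no selection threshold from the D0 analysis is implied.

```python
import math


def transverse_energy(energy_gev: float, polar_angle_rad: float) -> float:
    """Transverse energy: E_T = E * sin(theta), theta measured from the beam axis."""
    return energy_gev * math.sin(polar_angle_rad)


def scalar_ht(objects: list) -> float:
    """Ht: scalar sum of the transverse energies of the selected objects."""
    return sum(transverse_energy(energy, theta) for energy, theta in objects)


if __name__ == "__main__":
    # Hypothetical lepton + 4 jets event: (energy in GeV, polar angle in rad)
    event = [
        (45.0, math.pi / 2),   # lepton, central
        (90.0, math.pi / 3),   # jet
        (70.0, math.pi / 2),   # jet
        (80.0, math.pi / 6),   # jet, forward
        (40.0, 2.0),           # jet
    ]
    print(f"Ht = {scalar_ht(event):.1f} GeV")
```

Objects produced at small angles to the beam contribute little to Ht even when they are energetic, which is why the variable favours the central, high-energy decay products of a heavy particle pair.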
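A counting-experiment p-value of this kind can be approximated by the Poisson probability of observing at least the measured number of events when only background is present. The sketch below is a simplified estimate that ignores the ±0.6 uncertainty on the background expectation, so it is not expected to reproduce the published value exactly; the function name is illustrative.

```python
import math


def poisson_tail(n_observed: int, mean: float, extra_terms: int = 200) -> float:
    """P(N >= n_observed) for a Poisson distribution with the given mean.

    The tail is summed term by term to avoid the loss of precision of 1 - CDF.
    """
    term = math.exp(-mean)              # P(N = 0)
    for k in range(1, n_observed + 1):
        term *= mean / k                # after the loop: P(N = n_observed)
    total = 0.0
    for k in range(n_observed, n_observed + extra_terms):
        total += term
        term *= mean / (k + 1)          # P(N = k + 1)
    return total


if __name__ == "__main__":
    # D0: 17 events observed for a background expectation of 3.8
    p_value = poisson_tail(17, 3.8)
    print(f"p-value with fixed background: {p_value:.1e}")
```

With the background fixed at its central value, the estimate comes out somewhat smaller than the quoted p-value; allowing the background expectation to vary within its uncertainty would increase it.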
1. F. Abe et al., Nucl. Instrum. Methods Phys. Res., Sect. A 271, 387 (1988); D. Amidei et al., Nucl. Instrum. Methods Phys. Res., Sect. A 350, 73 (1994)
2. S. Abachi et al., Nucl. Instrum. Methods Phys. Res., Sect. A 338, 185 (1994)
3. S. Catani, M. L. Mangano, P. Nason and L. Trentadue, Phys. Lett. B 378, 239 (1996)
4. F. Abe et al., Phys. Rev. D 50, 2966 (1994)
5. S. Abachi et al., Phys. Rev. Lett. 72, 2138 (1994)
6. F. Abe et al., Phys. Rev. Lett. 74, 2626 (1995)
7. F. Abe et al., Phys. Rev. D 52, R2605 (1995)
8. S. Abachi et al., Phys. Rev. Lett. 74, 2632 (1995)
Bill Carithers and Paul Grannis, “Discovery of the Top Quark”, SLAC Beamline, Vol. 25, No. 3, p. 4-16, Fall 1995, https://www.slac.stanford.edu/pubs/beamline/25/3/25-3-carithers.pdf