Experimental verification of an indefinite causal order


Science Advances  24 Mar 2017:
Vol. 3, no. 3, e1602589
DOI: 10.1126/sciadv.1602589


Investigating the role of causal order in quantum mechanics has recently revealed that the causal relations of events may not be a priori well defined in quantum theory. Although this has triggered a growing interest on the theoretical side, creating processes without a causal order is an experimental task. We report the first decisive demonstration of a process with an indefinite causal order. To do this, we quantify how incompatible our setup is with a definite causal order by measuring a “causal witness.” This mathematical object incorporates a series of measurements that are designed to yield a certain outcome only if the process under examination is not consistent with any well-defined causal order. In our experiment, we perform a measurement in a superposition of causal orders—without destroying the coherence—to acquire information both inside and outside of a “causally nonordered process.” Using this information, we experimentally determine a causal witness, demonstrating by almost 7 SDs that the experimentally implemented process does not have a definite causal order.

  • Quantum Information
  • Quantum Optics
  • Quantum Foundations


The notion of causality is an innate concept, which defines the link between physical phenomena that temporally follow one another, with one phenomenon manifestly being the cause of the other. Nevertheless, in quantum mechanics, the concept of causality is not as straightforward. For example, when the superposition principle is applied to causal relations, situations without a definite causal order can arise (1, 2). Although this can lead to disconcerting consequences, forcing one to question concepts that are commonly viewed as the main ingredients of our physical description of the world (3), these effects can be exploited to achieve improvements in computational complexity (4–6) and quantum communications (7–9). Recently, this computational advantage was experimentally demonstrated in the study of Procopio et al. (10). However, the absence of a causal order was inferred from the success of an algorithm rather than being directly measured. Here, we explicitly demonstrate the realization of a causally nonordered process by measuring a so-called “causal witness” (11).

To make our results stronger (that is, make the causal witness more robust to noise), we performed a superposition of the orders of a unitary gate and a measurement operation. In other words, we made a measurement inside a quantum process with an indefinite order of operations [the quantum SWITCH (1)]. Performing a standard measurement inside the quantum SWITCH would destroy its coherence, because it would reveal the time at which the measurement is performed and would thus also reveal whether it is performed before or after other operations. In other words, such a measurement would reveal the causal order between the operations. However, in our scheme, the measurement outcomes are read out only “at the end” of the process, thus preserving its coherence. Because applications of indefinite causal orders will most likely require the superposition of orders of complex quantum operations, we believe that, in addition to the first direct demonstration of an indefinite causal order, our measurement in a quantum SWITCH can also be considered a technological step toward these applications (4–9).

In our usual understanding of causal relations, if we consider two events A and B, which are connected by a time-like curve, then we will have one of two cases: Either “A is in the past of B,” or “B is in the past of A.” However, the application of the superposition principle to these causal relations calls into question the interpretation of causal orders as a preexisting property. The causal order can become genuinely indefinite. To see this, consider a two-qubit quantum state |ϕ〉 lying in the composite Hilbert space $\mathcal{H} = \mathcal{H}^C \otimes \mathcal{H}^T$, with $\mathcal{H}^C$ and $\mathcal{H}^T$ each being two-dimensional Hilbert spaces. It is possible to condition the order in which operations are applied to a target state $|\psi\rangle^T \in \mathcal{H}^T$ on the value of a control state $|\chi\rangle^C \in \mathcal{H}^C$. For example, if the state of the control qubit is |0〉C, then the two operators will be applied in the order A and then B on the state of the target qubit |ψ〉T, and vice versa if the state of the control qubit is |1〉C. Therefore, if the control qubit is in a superposition state $(|0\rangle^C + |1\rangle^C)/\sqrt{2}$, then a controlled quantum superposition of the situations “A is in the past of B” and “B is in the past of A” is established (Fig. 1). In the above situation, the causal order is not merely in a superposition. It is entangled with the state of the control qubit.

Fig. 1 The quantum SWITCH.

Consider a situation wherein the order in which two parties Alice and Bob act on a target qubit |ψ〉T depends on the state of a control qubit in a basis {|0〉, |1〉}C. If the control qubit is in the state |0〉C, then the target qubit is sent first to Alice and then to Bob (A), whereas if the control qubit is in the state |1〉C, then it is sent first to Bob and then to Alice (B). Both of these situations have a definite causal order and are described by the process matrices WAB and WBA (Eq. 6). If the control qubit is prepared in a superposition state $(|0\rangle^C + |1\rangle^C)/\sqrt{2}$, then the entire network is placed into a controlled superposition of being used in the order Alice→Bob and in the order Bob→Alice (C). This situation has an indefinite causal order.
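The controlled ordering described above can be simulated for the simple case in which both parties apply unitaries. The following is a minimal NumPy sketch, not the analysis code of the experiment; the function names, the Pauli gates, and the input state are illustrative assumptions.

```python
import numpy as np

KET0 = np.array([1, 0], dtype=complex)
KET1 = np.array([0, 1], dtype=complex)

def quantum_switch(A, B, psi):
    """Joint control-target state after the SWITCH (control starts in |+>).

    Control |0>: A acts first, then B (operator B @ A).
    Control |1>: B acts first, then A (operator A @ B).
    """
    branch_ab = np.kron(KET0, B @ A @ psi)
    branch_ba = np.kron(KET1, A @ B @ psi)
    return (branch_ab + branch_ba) / np.sqrt(2)

def control_probabilities(state):
    """Probabilities of finding the control in |+> or |->, target traced out."""
    plus = (KET0 + KET1) / np.sqrt(2)
    minus = (KET0 - KET1) / np.sqrt(2)
    amplitudes = state.reshape(2, 2)  # index order: [control, target]
    p_plus = np.linalg.norm(plus.conj() @ amplitudes) ** 2
    p_minus = np.linalg.norm(minus.conj() @ amplitudes) ** 2
    return p_plus, p_minus

# Illustrative gates (assumptions): X and Z anticommute, Z commutes with itself.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
psi = np.array([0.6, 0.8], dtype=complex)

p_plus_anti, p_minus_anti = control_probabilities(quantum_switch(X, Z, psi))
p_plus_comm, p_minus_comm = control_probabilities(quantum_switch(Z, Z, psi))
```

In this toy model, anticommuting gates always leave the control in |−〉 and commuting gates always leave it in |+〉, so a measurement of the control alone carries information about how the two orders interfere.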

From this simple example, we can see that the causal order between events is not always definite in quantum mechanics. One could, in the spirit of hidden-variable theories, insist that there might nonetheless be a well-defined causal order. However, such a claim requires, in general, a theory to be nonlocal and contextual because of the Bell and Kochen-Specker theorems (12, 13).

The case described above, called the quantum SWITCH, is the first explicit example wherein it was shown that quantum mechanics does not allow for a well-defined causal order (1). The SWITCH was recently experimentally implemented (10) by superposing the order in which two unitary operations acted. That experiment confirmed that a causally nonordered quantum circuit can solve a specific computational problem more efficiently than an ordered quantum circuit. However, only indirect evidence of an indefinite causal order was observed through the demonstration of this computational advantage. Therefore, the primary goal of our current experiment is to provide direct experimental proof of the causal nonseparability of the quantum SWITCH. For this purpose, we used a recently developed theoretical tool: the causal witness (11).



A causal witness is a carefully designed set of measurements, whose outcome will tell us if a given process is causally ordered or not. An intuitive way to introduce causal witnesses is through the well-known concept of an entanglement witness (14). First, recall that a composite quantum system ρ lying in a Hilbert space $\mathcal{H}^A \otimes \mathcal{H}^B$ is separable or entangled depending on whether it can be written in the form $\rho = \sum_i p_i\, \rho_i^A \otimes \rho_i^B$ (with $\rho_i^A$ and $\rho_i^B$ states of the subsystems A and B and $0 \le p_i \le 1$, $\sum_i p_i = 1$) or not. Then, it can be shown that for all entangled states ρent, there exists a Hermitian operator S, called an “entanglement witness,” such that Tr(Sρent) < 0, but Tr(Sρsep) ≥ 0 for all separable states ρsep. Hence, it follows that if one measures an entanglement witness on a state and finds a negative value, then the state must be entangled.
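As a concrete illustration of the entanglement witness analogy, the sketch below evaluates a textbook two-qubit witness, S = 1/2 − |Φ+〉〈Φ+|, on a Bell state and on randomly drawn product states. This particular witness and the helper names are illustrative choices and do not appear in the article.

```python
import numpy as np

def witness_value(S, rho):
    """Expectation value Tr(S rho) of a Hermitian witness S."""
    return float(np.real(np.trace(S @ rho)))

# Bell state |Phi+> = (|00> + |11>)/sqrt(2) and the witness built from it.
phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_ent = np.outer(phi_plus, phi_plus.conj())
S = np.eye(4) / 2 - rho_ent

def random_product_state(rng):
    """Density matrix of a random pure product state of two qubits."""
    a = rng.normal(size=2) + 1j * rng.normal(size=2)
    b = rng.normal(size=2) + 1j * rng.normal(size=2)
    v = np.kron(a / np.linalg.norm(a), b / np.linalg.norm(b))
    return np.outer(v, v.conj())

rng = np.random.default_rng(0)
value_entangled = witness_value(S, rho_ent)  # negative: entanglement detected
min_separable = min(
    witness_value(S, random_product_state(rng)) for _ in range(1000)
)  # never negative, up to rounding
```

Because mixtures of product states can only average these non-negative values, a negative result certifies entanglement; a causal witness plays the same role for process matrices.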

A similar quantity was recently introduced to determine whether a process matrix W is causally separable or not (2). A process matrix (the counterpart of the density matrix in the entanglement witness example) describes causal relations between local laboratories (15). Consider two observers Alice and Bob who perform local operations MA and MB (MA and MB can be arbitrary quantum operations, from simple unitary operations to more complex measurement channels). By local operations, we mean that the only connection that Alice and Bob have with the external world is given by the quantum state that they receive from it and the state that they return to it. The process matrix W then details how this quantum state moves between the two local laboratories (Fig. 2). Hence, it is independent of the individual operations that Alice and Bob perform. In the case of the quantum SWITCH, the process matrix first routes the input state to Alice and Bob in superposition, then connects Alice’s output to Bob’s input and vice versa, and finally coherently recombines their outputs.

Fig. 2 A process matrix representation of Fig. 1.

The process matrix W describes the “links” between Alice and Bob. For example, it could simply route the input state ρ(in) to Alice MA and then to Bob MB (solid line) or vice versa (dashed line). In the case of the quantum SWITCH, it creates a superposition of these two paths, conditioned on the state of a control qubit. The input state ρ(in), the two local operations MA and MB, and the final measurement D(out) must all be controllable and known a priori. The unknown process is represented by the process matrix (shaded gray area labeled W). A causal witness quantifies the causal nonseparability of W.

Because a causal witness characterizes a process rather than a state (unlike an entanglement witness), it requires a procedure akin to “process tomography” (that is, “causal tomography,” see Materials and Methods). Namely, we must probe the process with several different input states ρ(in). Then, for each input state, Alice and Bob implement several different known operations, and then, we perform a final measurement D(out) (Fig. 2). In general, MA and MB can include measurement operations; thus, each could have additional measurement outcomes associated with it. We denote the outcomes of Alice and Bob’s local operations by a and b, and their choice of operation by x and y, respectively. We label the specific choice of an input state with z and the output of a detection operation with d. With this in mind, the probability of obtaining the outcomes $M^A_{a,x}$, $M^B_{b,y}$, and $D^{(\mathrm{out})}_{d}$, with the input state $\rho^{(\mathrm{in})}_{z}$ can be written, using the Choi-Jamiołkowski isomorphism (16) (see the Supplementary Materials), as

$$p(a, b, d \mid x, y, z) = \mathrm{Tr}\left[\left(\rho^{(\mathrm{in})}_{z} \otimes M^A_{a,x} \otimes M^B_{b,y} \otimes D^{(\mathrm{out})}_{d}\right) W\right] \tag{1}$$

with $\sum_{a,b,d} p(a, b, d \mid x, y, z) = 1$ for all the possible settings x, y, z and where W is the process matrix (11).

To calculate these probabilities for the quantum SWITCH, we must construct its process matrix, which we will call WSWITCH. To do this, we will again use the Choi-Jamiołkowski isomorphism, which is a way of representing a linear operator that maps $\mathcal{H}^{I}$ to $\mathcal{H}^{O}$ as a state in the composite Hilbert space $\mathcal{H}^{I} \otimes \mathcal{H}^{O}$. As a first step, consider the identity channel from the output space $\mathcal{H}^{P_1^O}$ of a party P1 to the input space $\mathcal{H}^{P_2^I}$ of a second party P2. To describe this as a process matrix, we can write it as a projector onto a process vector in the “double-ket notation” (17, 18)

$$|1\rangle\!\rangle^{P_1^O P_2^I} = \sum_j |j\rangle^{P_1^O} \otimes |j\rangle^{P_2^I} \tag{2}$$

where j labels a basis over the spaces. We can now use this process matrix to describe an input state passing first to Alice ($|1\rangle\!\rangle^{(\mathrm{in})_T A_I}$), then to Bob ($|1\rangle\!\rangle^{A_O B_I}$), and finally to the output space ($|1\rangle\!\rangle^{B_O (\mathrm{out})_T}$). This process is described by

$$|w^{AB}\rangle = |1\rangle\!\rangle^{(\mathrm{in})_T A_I}\, |1\rangle\!\rangle^{A_O B_I}\, |1\rangle\!\rangle^{B_O (\mathrm{out})_T} \tag{3}$$

Alice and Bob are free to perform measurements MA and MB, respectively, but they are not part of the above process vector. Note that swapping the order of Alice and Bob is as simple as swapping the labels A and B. The vectors |wAB〉 (describing “Alice acts before Bob”) and |wBA〉 (describing “Bob acts before Alice”) both have a well-defined causal order (Fig. 1, A and B).
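The double-ket representation of the identity channel can be checked numerically. The sketch below assumes the standard convention in which the double ket is the unnormalized vector Σj |j〉|j〉; the helper names and the test matrix are illustrative.

```python
import numpy as np

def double_ket(dim):
    """Unnormalized vector sum_j |j>|j> representing the identity channel."""
    return np.eye(dim, dtype=complex).reshape(dim * dim)

def send_through_identity(psi):
    """Contract the first leg of the double ket with <psi*|.

    Since <psi*|j> = psi_j, the second leg is left in the state |psi>,
    which is exactly what an identity channel should deliver.
    """
    dim = psi.shape[0]
    legs = double_ket(dim).reshape(dim, dim)  # index order: [first, second]
    return psi @ legs

psi = np.array([0.6, 0.8j], dtype=complex)
received = send_through_identity(psi)

# An operator acting on the first leg equals its transpose acting on the second.
A = np.array([[1, 2j], [3, 4]], dtype=complex)
identity = np.eye(2, dtype=complex)
left = np.kron(A, identity) @ double_ket(2)
right = np.kron(identity, A.T) @ double_ket(2)
```

Chaining such vectors, one per link, is how a fixed-order process vector of the kind in Eq. 3 is assembled.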

We are now in the position to construct the process matrix of the quantum SWITCH. Recall that for the quantum SWITCH, the control qubit’s state sets the relative amplitudes of Alice → Bob and Bob → Alice. Thus, the process vector of the quantum SWITCH [with the control qubit initially in the state $(|0\rangle^C + |1\rangle^C)/\sqrt{2}$] is simply

$$|w_{\mathrm{SWITCH}}\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle^C\, |w^{AB}\rangle + |1\rangle^C\, |w^{BA}\rangle\right) \tag{4}$$

For the causal witness we will consider here, we will only measure the state of the control qubit after the SWITCH. Thus, we need to construct the process matrix taking an input state and returning the state of the control qubit. This is done by tracing over the SWITCH output (that is, the target qubit) and fixing the state of the control qubit to be $|+\rangle^C = (|0\rangle^C + |1\rangle^C)/\sqrt{2}$. Thus, the final state of the control qubit is computed from the process matrix

$$W_{\mathrm{SWITCH}} = \mathrm{Tr}_{(\mathrm{out})_T}\left[\,|w_{\mathrm{SWITCH}}\rangle\langle w_{\mathrm{SWITCH}}|\,\right] \tag{5}$$

where $\mathrm{Tr}_{(\mathrm{out})_T}$ is the partial trace over the output system qubit.

Using the same formalism, one can also concisely describe all causally separable processes. Consider two general process matrices linking the two local laboratories A and B, WAB and WBA. Here, contrary to Eq. 3, the link between the laboratories is in general no longer the identity channel. Then, by simply taking an incoherent mixture of the two, one can describe all possible causally separable processes (11)

$$W_{\mathrm{sep}} = p\, W^{AB} + (1 - p)\, W^{BA} \tag{6}$$

where 0 ≤ p ≤ 1. Physically, this can be understood as each run of the process having a well-defined order, with Alice acting first with probability p and Bob acting first with probability 1 − p. From this definition, it is apparent that every convex combination of causally separable process matrices is still a causally separable process matrix; thus, the set of causally separable process matrices is convex.

Causal witnesses are designed to distinguish between causally separable (Eq. 6) and causally nonseparable process matrices (such as Eq. 5). For all causally nonseparable process matrices Wn−sep, there exists a Hermitian operator S, called a causal witness, such that

$$\mathrm{Tr}\left(S\, W_{\mathrm{n\text{-}sep}}\right) < 0 \tag{7}$$

but Tr(SWsep) ≥ 0 for all causally separable process matrices Wsep (11), just as in the entanglement witness example. As we show in Materials and Methods, such an operator is always guaranteed to exist. This is because the convexity of the set of causally separable process matrices ensures that there is always a hyperplane that separates the set from a given causally nonseparable process Wn−sep (19).

To implement a causal witness experimentally, we need to decompose it in terms of operations that we can realize in the laboratory: preparation of states, applying quantum channels, and doing measurements. This can always be done, because the tensor product of these operations spans the whole Hilbert space of Hermitian operations, which includes the Hilbert space of process matrices. Using the notation defined in Eq. 1, a causal witness can be expanded as

$$S = \sum_{a,b,d,x,y,z} \alpha_{a,b,d,x,y,z}\; \rho^{(\mathrm{in})}_{z} \otimes M^A_{a,x} \otimes M^B_{b,y} \otimes D^{(\mathrm{out})}_{d} \tag{8}$$

where the coefficients αa,b,d,x,y,z are real numbers that define (together with the input states, operations, and measurements) a particular witness. From the definition in Eq. 1, it follows that

$$\mathrm{Tr}(S W) = \sum_{a,b,d,x,y,z} \alpha_{a,b,d,x,y,z}\; p(a, b, d \mid x, y, z) \tag{9}$$

and, therefore, the evaluation of the quantity Tr(SW) for a given process W translates into a determination of probabilities p(a, b, d|x, y, z) for several input states and measurement choices.
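The step from the operator expansion of the witness to a weighted sum of measured probabilities relies only on the linearity of the trace. The toy check below uses random positive matrices as stand-ins for the product operators and for W; none of these objects is a valid process matrix, and all names are hypothetical.

```python
import numpy as np

def random_positive(rng, dim):
    """Random positive semidefinite matrix (a stand-in, not a physical object)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return g @ g.conj().T

def witness_from_terms(alphas, operators):
    """Assemble S = sum_k alpha_k E_k from real coefficients and operators."""
    return sum(a * op for a, op in zip(alphas, operators))

def witness_expectation(alphas, term_values):
    """Evaluate Tr(S W) from the individually estimated values Tr(E_k W)."""
    return float(np.dot(alphas, term_values))

rng = np.random.default_rng(1)
dim, n_terms = 4, 6
operators = [random_positive(rng, dim) for _ in range(n_terms)]
alphas = rng.normal(size=n_terms)
W = random_positive(rng, dim)
W = W / np.trace(W).real

# Stand-ins for the measured p(a, b, d | x, y, z); not normalized in this toy.
term_values = [np.trace(op @ W).real for op in operators]

direct = np.trace(witness_from_terms(alphas, operators) @ W).real
from_terms = witness_expectation(alphas, term_values)
```

The two numbers agree to rounding error, which is why the witness can be evaluated one experimental setting at a time.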

In the case where there are no restrictions on which operations we can implement, we choose the coefficients αa,b,d,x,y,z by maximizing the quantity −Tr(SW) over the set of all possible causal witnesses, as described in Materials and Methods. This quantity, for such an optimal witness, corresponds to the maximum “amount of worst-case noise” that the process under examination can tolerate while remaining causally nonseparable (11). More precisely, it is the minimal λ ≥ 0 for which the process matrix

$$W(\lambda) = \frac{1}{1 + \lambda}\left(W_{\mathrm{n\text{-}sep}} + \lambda\, \Omega\right) \tag{10}$$

becomes causally separable, where Ω is any other process that could have been prepared instead of the desired Wn−sep. We will refer to this quantity as the “causal nonseparability” (CNS) of a process W

$$\mathrm{CNS}(W) := -\mathrm{Tr}\left(S_{\mathrm{opt}}\, W\right) \tag{11}$$

where $S_{\mathrm{opt}}$ denotes the optimal witness.

When −Tr(SW) < 0, we define CNS(W) to be zero.

However, in practice, we may not be able to maximize −Tr(SW) over the whole set of causal witnesses, because there can be restrictions on which operations Alice and Bob have access to. To fully assess the CNS, Alice and Bob must be able to implement a complete basis of operators, which gives them access to the maximal amount of information about the process. Therefore, we define the experimentally certifiable CNS [hereafter referred to as CNSexp(W) = −Tr(SexpW)] as the maximum of −Tr(SW) over the restricted set of operators. In this case, CNSexp(W) is no longer the amount of noise that the process can tolerate before becoming causally separable but the maximal amount of noise for which this restricted class of witnesses can still detect its causal nonseparability.

If Alice and Bob could only implement unitaries, for example, then this would drastically diminish the attainable CNSexp(W); this path was chosen by Procopio et al. (10). Because a unitary operation cannot extract any explicit information from the manipulated state (and, consequently, from the process), neither Alice nor Bob can gain any knowledge about their received state when applying only these gates, and consequently, the attainable CNSexp(W) is smaller. However, if the unitary operations are replaced with projective measurements, then, roughly speaking, information about the process at different points throughout the SWITCH can be extracted. If both Alice and Bob have access to measure-and-reprepare operations, then one can achieve CNSexp(W) = CNS(W).

Because of the experimental challenges of coherently adding measure-and-reprepare operations, Alice performs a measure-and-reprepare operation and Bob implements a unitary channel in our experiment. It turns out that giving one party a measure-and-reprepare operation and the other a unitary operation still increases CNSexp(W) substantially. Thus, the causal witness we will measure depends both on Alice’s outcome (performed inside the SWITCH) and on our final measurement outcome.


To experimentally implement the quantum SWITCH, we need a control and a target qubit. In our experiment, we encode a control qubit in a path degree of freedom of a photon and a target qubit in the same photon’s polarization. The technique of using multiple degrees of freedom has enabled many previous quantum technologies (20–22). For our present experiment, this is convenient because Bob’s unitary gate can be implemented easily with three wave plates, whereas Alice can perform a projective measurement with wave plates and a polarizing beam splitter. Note that there are other proposals to coherently control the causal orders of events (11, 23, 24). In these proposals (as in ours), the target and control system are encoded in the same particle. In principle, it is also possible to use different particles. With photons, this could be done using a so-called controlled path gate (25) or potentially by using a spin qubit to control the causal order acting on a photon (26).

In our experiment, the realization of the unitary channel is straightforward, but a short remark is necessary concerning Alice’s measurement. It is clear that a polarizing beam splitter enables one to distinguish the polarization of an incoming photon. However, a polarizing beam splitter gives rise to additional spatial modes (that is, there are two output paths after the polarizing beam splitter). These two spatial modes can be considered as a new spatial qubit. Then, the action of the polarizing beam splitter is to couple the polarization qubit to this additional qubit. This is formally equivalent to a von Neumann system-probe coupling, which can model the interaction necessary for any projective measurement (27) and has been used between path and polarization degrees of freedom in the experiment reported by Rozema et al. (28). In our experiment, the polarization qubit is the system, and it is coupled (via the polarizing beam splitter) to an additional spatial qubit, which is the probe. We can read out information about the system by measuring the probe (with a photon detector) at a later time. This solves the nontrivial problem of realizing a measurement operation inside a quantum SWITCH. Most approaches to acquire information inside the SWITCH would lead to distinguishing information about the order in which the operations were applied, destroying the quantum superposition. However, in our solution, because the probe qubit is not measured until the information about the order of application of the operations is erased, the entire process can remain coherent. This solution also works deterministically; that is, both of Alice’s outcomes are retained. It also allows Alice to implement a measurement-dependent repreparation by placing different wave plates in each of the two outcome modes.
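The system-probe picture of Alice’s measurement can be illustrated with an idealized two-qubit model. In the sketch below, the polarizing beam splitter is modeled as a CNOT from polarization (system) to path (probe); this modeling choice, the input state, and the later polarization gate are illustrative assumptions rather than a description of the actual optics.

```python
import numpy as np

# Joint index order: polarization (system) tensor path (probe).
# H -> probe stays in path 0, V -> probe is flipped to path 1.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def couple_to_probe(psi):
    """Coherently copy the H/V information onto a probe that starts in path 0."""
    probe0 = np.array([1, 0], dtype=complex)
    return CNOT @ np.kron(psi, probe0)

def probe_probabilities(joint):
    """Probabilities of finding the probe in path 0 or path 1."""
    amplitudes = joint.reshape(2, 2)  # index order: [polarization, path]
    return np.sum(np.abs(amplitudes) ** 2, axis=0)

psi = np.array([np.sqrt(0.3), 1j * np.sqrt(0.7)], dtype=complex)
joint = couple_to_probe(psi)

# A later unitary on the polarization alone (for example a repreparation)
# does not change what the probe will reveal when it is finally detected.
hadamard = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
later = np.kron(hadamard, np.eye(2)) @ joint

p_direct = np.abs(psi) ** 2            # ideal projective H/V measurement
p_deferred = probe_probabilities(later)  # probe read out at the end
```

The deferred readout reproduces the statistics of an immediate projective measurement, which is the property that lets the outcome be collected only after the ordering information has been erased.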

Our implementation of the quantum SWITCH draws inspiration from the study of Procopio et al. (10), in which only orders of unitary operations were superimposed. Therefore, as in the study of Procopio et al. (10), our experimental skeleton consists of a Mach-Zehnder interferometer (MZI) with a loop in each arm. However, because Alice’s measure-and-reprepare channel adds an additional path degree of freedom, we need an extra interferometric loop.

A scheme of our experimental apparatus is presented in Fig. 3. The first step is to set the state of the system qubit (encoded in the polarization) with a polarizer and a half–wave plate. Then, the photon impinges on a 50/50 beam splitter; this sets the state of the control qubit (encoded in a path degree of freedom) in |+〉. Depending on the state of the path qubit, the photon is sent to either Alice (who performs MA) and then Bob (who performs UB) or vice versa. As described above, MA is a projective measurement (a sequence of two wave plates and a polarizing beam splitter) and a corresponding repreparation (a sequence of two wave plates in only one of the polarizing beam splitter outputs), and UB is a unitary gate (a sequence of three wave plates). Because the polarizing beam splitter adds a second path qubit, this results in four path modes, encoding both the state of the control qubit and the outcome of the measure-and-reprepare channel. Referring to Fig. 3, the external (yellow) interferometer arises from the outcome H (also referred to as a logical 0), and the internal (purple) one arises from the outcome V (a logical 1). We finalize the SWITCH by erasing the information about the order of the gates. This can be done by applying a Hadamard gate to the control qubit. Because the control qubit is a path qubit, a Hadamard gate can be implemented with a 50/50 beam splitter. However, in our experiment there are two path qubits (the control qubit and Alice’s ancilla measurement qubit). Thus, we must use two 50/50 beam splitters: one beam splitter to interfere the control qubit when Alice’s ancilla qubit is in the state |0〉, and one beam splitter when it is in the state |1〉. Finally, each of the four outputs is coupled into single-mode fibers, which are each connected to single-photon detectors (SPDs). Then, detecting a photon in one of the four modes yields the result of both the measurement of the control qubit in the superposition basis and Alice’s measurement (see the detector labels in Fig. 3).

Fig. 3 Experimental setup.

A sketch of our experiment to verify the causal nonseparability of the quantum SWITCH. We produce pairs of single photons using a type II SPDC source (not shown here). One of the photons is used as a trigger, and one is sent to the experiment. The experiment body consists of two MZIs, with loops in their arms. The control qubit, encoded in a path degree of freedom, dictates the order in which the operations MA and MB are applied to the target qubit (encoded in the same photon’s polarization). Alice implements a measurement and repreparation (MA), and Bob implements a unitary operation (MB). The state of the control qubit is measured after the photon exits the interferometers; that is, we check if the photon exits port 0 or port 1. Note that there are two interferometers, each corresponding to a different outcome for Alice: The yellow path means Alice measured the photon to be horizontally polarized (a logical 0), and the purple path means Alice found the photon to be vertically polarized (a logical 1). The first digit written on the detector labels this outcome. The second digit refers to the final measurement outcome, which, physically, corresponds to the photon exiting from either port 0 or port 1. In this diagram, port 0 (1) means the photon exits through a horizontally (vertically) drawn port. A half–wave plate at 0° was used in the reflected arm of the first beam splitter to compensate for the acquired additional phase. QWP, quarter–wave plate; HWP, half–wave plate; BS, beam splitter; PBS, polarizing beam splitter.

We wish to evaluate the CNS of our quantum SWITCH by experimentally estimating the expectation value of a causal witness S (Eq. 8). In other words, we want to assess Tr(SexpWSWITCH), where WSWITCH here refers to the process matrix describing our experiment. Because the trace is linear, this can be done by implementing one term in the sum of S (Eq. 8) at a time. To estimate a single term, we inject an input state into the SWITCH, Alice and Bob each perform an operation inside, and then we measure the outputs of the overall process. Because the control qubit measurement and Alice’s measurement are both single-qubit projective measurements, there are a total of four possible outcomes. For each measurement setting, different input states are sent into the SWITCH, and the probabilities of each outcome are experimentally estimated by sending multiple copies of the same input state. To compute the final value of the CNSexp(WSWITCH), the results of these measurements are weighted by the corresponding αa,b,d,x,y,z and summed.

The number of terms in the sum of Eq. 8 is determined by the specific witness we wish to evaluate. In general, Alice and Bob must each implement a set of operators forming a basis over their channels. For Bob’s unitary channel, this requires 10 elements, and for Alice’s measure-and-reprepare channel, this requires 16 (11). In our case, we formed Alice’s basis with four (noncommutative) projection operators and three unitary repreparation operators when the outcome was H and one operator (the identity operator) when the outcome was V. This corresponds to 12 measure-and-reprepare channels when the outcome of Alice’s measurement is H and 4 when it is V, for a total of 16 measure-and-reprepare operators. For Bob, we implement all 10 unitaries.

Varying the input state can make CNSexp(WSWITCH) more robust to noise. Hence, for our experiment, we used three different input states: |H〉, |V〉, and |+〉. Finally, we implemented two different measurement operators D(out) on the control qubit (corresponding to the two outcomes of the projection onto basis {|+〉, |−〉}). Thus, for our experiment, the calculation of CNSexp(WSWITCH) translates into

$$\mathrm{CNS}_{\mathrm{exp}}(W_{\mathrm{SWITCH}}) = -\sum_{a,d,x,y,z} \alpha_{a,d,x,y,z}\; p(a, d \mid x, y, z) \tag{12}$$

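Here, we do not need the sum over b, because Bob’s unitaries do not have an outcome. The probability in Eq. 12 is defined as

$$p(a, d \mid x, y, z) = \mathrm{Tr}\left[\left(\rho^{(\mathrm{in})}_{z} \otimes M^A_{a,x} \otimes U^B_{y} \otimes D^{(\mathrm{out})}_{d}\right) W_{\mathrm{SWITCH}}\right] \tag{13}$$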

We must experimentally estimate all of these probabilities to evaluate CNSexp(WSWITCH). There are 1440 terms in this sum. However, four outcomes (two from Alice’s measurement and two from the final detection) are collected simultaneously (experimentally, this means the counts of four SPDs are collected in one setting). Therefore, we need 360 different experimental settings. However, for our witness of the 360 prefactors αa,d,x,y,z, 101 are equal to zero; thus, there are actually only 259 relevant experimental settings.
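The bookkeeping behind these counts can be reproduced in a few lines. The sketch assumes that one experimental setting is a choice of input state, one of Alice’s four projections, one of her three repreparations for the H outcome, and one of Bob’s ten unitaries; this factorization is an interpretation of the numbers quoted in the text, and the constant names are hypothetical.

```python
from itertools import product

N_INPUT_STATES = 3            # |H>, |V>, |+>
N_ALICE_PROJECTIONS = 4       # Alice's projective measurements
N_ALICE_REPREPARATIONS = 3    # unitary repreparations applied on outcome H
N_BOB_UNITARIES = 10
N_ALICE_OUTCOMES = 2          # a = 0, 1
N_CONTROL_OUTCOMES = 2        # d = 0, 1
N_ZERO_PREFACTOR_SETTINGS = 101  # settings whose prefactors vanish (from the text)

# Enumerate every (input, projection, repreparation, unitary) combination.
settings = list(product(range(N_INPUT_STATES),
                        range(N_ALICE_PROJECTIONS),
                        range(N_ALICE_REPREPARATIONS),
                        range(N_BOB_UNITARIES)))

n_settings = len(settings)
# Four detector outcomes are recorded simultaneously in each setting.
n_terms = n_settings * N_ALICE_OUTCOMES * N_CONTROL_OUTCOMES
n_relevant_settings = n_settings - N_ZERO_PREFACTOR_SETTINGS
```

Under this assumption the enumeration gives 360 settings, 1440 probability terms, and 259 settings that actually need to be measured, matching the numbers stated above.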

With this in place, we can experimentally measure the CNSexp(WSWITCH) (for information relating to experimental visibility, stability, and data taking procedure, see Materials and Methods). Figure 4 shows some of the probabilities p(a, d|x, y, z) (Eq. 13) for the four outcomes; that is, for Alice, a = 0, 1, and our final measurement, d = 0, 1 (the remainder are shown in the Supplementary Materials). In Fig. 4, the experimentally obtained values are denoted by blue dots, and the theoretical predictions are represented by bars.

Fig. 4 Experimentally estimated probabilities.

Each data point represents a probability p(a, d|x, y, z) in Eq. 12 for a = 0, 1 and d = 0, 1. The blue dots represent the experimental result, and the bars represent the theoretical prediction. The yellow (blue) bars refer to the external (internal) interferometer. The x axis is the measurement number, which labels a specific choice of its input state, measurement channel for Alice and Bob, and final measurement outcome. For our witness, it runs from 0 to 259, but we only show the first 44 here for brevity. Alice and Bob’s specific choice of operator is given in Table 1 and discussed in Materials and Methods. Additional information is in figs. S1 to S3.

Our main source of error is phase fluctuations in the two interferometers. Therefore, we performed a separate measurement (presented in Materials and Methods) to characterize this error. The error bars in Fig. 4 represent both these phase errors and Poissonian errors due to finite counts. These errors do not take into account systematic errors, such as wave plate miscalibration, because these systematic errors represent a deviation of our experimental SWITCH from the ideal SWITCH.

We can now obtain a value for the CNS of our process by weighting the data presented in Fig. 4 (and figs. S1 to S3) by αa,d,x,y,z and then summing them. The result isEmbedded Image(14)

The error bar on CNSexp(WSWITCH) was calculated by Gaussian error propagation from the errors of the individual probabilities. The theoretical maximum value for CNSexp(WSWITCH) is 0.2842. The disagreement between this and our measured result is caused primarily by two effects. First, given the reduced visibility of the interferometers (which we will discuss in detail shortly), the maximal value for CNSexp(WSWITCH) is 0.2523, when the visibility is 0.9539. The remaining discrepancy comes from systematic errors, such as wave plate miscalibration, which effectively mean that the unitaries Alice and Bob implement differ slightly from their targets. For example, we estimate, using a simple Monte Carlo simulation, that a wave plate calibration error of 3° would explain this discrepancy, leading to a drop in the CNS of approximately 0.043. Still, given our measured result, we can conclude that our process is causally nonseparable by a margin of approximately 7 SDs. This large margin demonstrates the effectiveness of performing a measurement operation inside the quantum SWITCH.

As mentioned above, the causal nonseparability (as measured using a causal witness) can be considered as a measurement of how much noise can be added to the process before it becomes causally separable. The CNSexp we have discussed so far refers to a worst-case noise model (11), wherein the desired process is replaced with the process that can do the most damage to its causal nonseparability with a probability

$$p_{\mathrm{worst\text{-}case}} = \frac{\mathrm{CNS}_{\mathrm{exp}}}{1 + \mathrm{CNS}_{\mathrm{exp}}} \tag{15}$$

Because the replacement is done with the worst-case process, this is a lower bound on the “probability of noise” that can be tolerated (see Materials and Methods). For our process pworst−case = 0.168 ± 0.001.
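The conversion between a CNS value and a worst-case noise probability can be sketched numerically. The code assumes that Eq. 15 has the form p = CNS/(1 + CNS), which is the weight of Ω in the mixture of Eq. 10 when λ equals the CNS; this form is an assumption here, and the function names are hypothetical. The inputs are the theoretical values and the noise probability quoted in the text.

```python
def worst_case_noise_probability(cns):
    """Weight lambda / (1 + lambda) of the noise term, with lambda = CNS (assumed form)."""
    return cns / (1.0 + cns)

def cns_from_noise_probability(p):
    """Inverse of the relation above: CNS = p / (1 - p)."""
    return p / (1.0 - p)

# Theoretical maximum and visibility-limited values of CNS_exp quoted in the text.
p_ideal = worst_case_noise_probability(0.2842)
p_visibility_limited = worst_case_noise_probability(0.2523)

# CNS_exp implied by the quoted worst-case noise probability of 0.168.
cns_implied = cns_from_noise_probability(0.168)
```

Under this assumption, the ideal SWITCH would tolerate a worst-case noise probability of about 0.22, and the quoted value of 0.168 corresponds to a CNS of about 0.20.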

We studied the effect of the noise most relevant to our experiment, namely, dephasing the control qubit but not the system qubit. This noise is the strongest in our setup because the control qubit is encoded in a path degree of freedom, which must remain interferometrically stable [see the study of Branciard (29) for the formal definition of this noise model]. We realized this noise by unbalancing the path length of the interferometers by more than the photons’ coherence length. The experimental signature of this imbalance is a reduced visibility of the interferometer. We measured the CNS for several visibilities between 0.95 and 0.06. Figure 5 shows a decrease in the expectation value of CNSexp(WSWITCH) as the noise increases. There is an offset between the experimental data and the theoretical prediction due to systematic errors. However, both theory and experiment follow the same trend. By extrapolating our fit of the experimental data to CNSexp(WSWITCH) = 0 (where the process becomes causally separable), we observe a “noise tolerance” of 0.342 for this type of noise. As expected, this is larger than our experimentally measured pworst−case, indicating that it is a lower bound.

Fig. 5 Expectation value of the causal witness [Embedded Image] in the presence of noise.

Because the control qubit (initially in |+〉) is decohered, the superposition of causal orders becomes an incoherent mixture of causal orders. Hence, the causal nonseparability of the SWITCH is gradually lost. The plot shows the causal nonseparability of our experimentally implemented SWITCH as the visibility of the two interferometers is decreased (from right to left). The experimental data decrease linearly with visibility, just as theory (dashed line) predicts. The gap between theory and experiment is attributed to systematic errors. The visibility (x axis) is a measure of the dephasing strength on the control qubit.


Our experiment demonstrates how to perform a measurement inside a quantum SWITCH without destroying the superposition of causal orders. This task was assumed to be possible in the study of Araújo et al. (11), but no method to accomplish it was proposed. The difficulty is that performing a standard measurement reveals the time at which it is performed and, thus, whether it is performed before or after the partner’s operation. Consequently, the superposition of causal orders becomes incoherent. Our way around this is to break the measurement into two steps: First, the system coherently interacts, through a unitary operation, with an ancilla (namely, the additional path modes introduced by the local operation in our experiment). Second, after finalizing the quantum SWITCH (interfering these modes), the ancilla is measured. This allows us to make a “coherent measurement at different times” and then erase the ordering information.

We demonstrated the causal nonseparability of our experimental apparatus by measuring a causal witness. With the ability to perform a measurement inside the SWITCH, we could increase the robustness of the causal witness to noise. Previous experimental work only indirectly accessed the causal nonseparability of the SWITCH and, moreover, only used unitary gates in the SWITCH (10). Although some other experiments (30–33) have also studied the topic of causal relations in quantum mechanics, they focused on a different aspect. For example, in previous studies (30, 31, 33), instead of creating a genuinely indefinite causal order, as in our work, the authors distinguished between different causal structures. The incoherent mixture (30) and a quantum superposition (31) of different causal relations reported previously are both compatible with one party in the past and the other in the future. Thus, in our language, they correspond to causally ordered processes.

Our work represents the first experimental realization of a quantum superposition of orders of nonunitary channels and the first measurement of a causal witness. We believe that this will be an important step toward the realization of quantum superpositions of the order of more elaborate processes. Because it has been theoretically demonstrated that causally nonordered processes can give rise to a reduction in the query complexity of certain tasks (4–6) and lead to more efficient communication channels (7, 8), it is important to study new techniques to create more complex, causally nonordered processes. We already see an advantage in our current work. Making a measurement inside the quantum SWITCH made our experiment more robust to noise and allowed us to demonstrate, by approximately 7 SDs, that our setup cannot be described by a causally ordered process.


Single-photon source

We generated heralded single photons using a type II spontaneous parametric down-conversion (SPDC) process in a Sagnac loop (34). The Sagnac loop was realized using a dual-wavelength polarizing beam splitter and two mirrors. The SPDC crystal was a 20-mm-long periodically poled potassium titanyl phosphate crystal. The crystal was pumped by a 23.7-mW diode laser centered at 395 nm. The polarization of the laser was set to be horizontal. With this, we generated degenerate pairs of single photons centered at 790 nm, in a separable polarization state |H〉|V〉. Polarizers in the signal and idler modes were used to ensure that the polarization was in a well-defined state. The down-converted photons were coupled into single-mode fibers. One photon was sent directly to an SPD and used to herald the other photon’s presence, whereas the other photon was sent to our experiment. At the output of the experiment, we observed a coincidence rate between the herald detector and the four final-measurement detectors of 3750 pairs per second.

Implementing Alice and Bob’s channels

As discussed in the main text, to experimentally measure a causal witness, Alice and Bob need to implement a series of quantum channels on a polarization qubit inside the quantum SWITCH. Alice must perform a measure-and-reprepare channel, whereas Bob must implement a unitary channel. Alice measures in four different bases. We define her different bases by a unitary operator preceding a projective measurement in the basis {|0〉, |1〉}. Alice’s premeasurement operators are listed in the first column of Table 1. When her outcome is |0〉 (in a given basis), Alice implements one of three different repreparation operators (second column of Table 1). On the other hand, when her outcome is |1〉, she performs the identity channel. Thus, she has 4 × 3 = 12 different measure-and-reprepare maps. Bob simply implements 10 different unitary operators (third column of Table 1).

Table 1 List of operators performed by the two parties.

The table shows Alice’s four measurement operators and her three repreparation operators, which Alice applies when her outcome is |0〉; when Alice’s outcome is |1〉, Alice performs the identity. Bob’s 10 unitary operators are shown in the third column.


We experimentally implemented both Alice’s measurement operators and repreparation operators through a sequence of two wave plates (quarter–wave plate and then half–wave plate), and Alice’s projective measurement with a polarizing beam splitter measuring in {|H〉, |V〉}. Bob’s operators were implemented via three wave plates (quarter–wave plate, half–wave plate, and then quarter–wave plate). In Table 2, we show the specific wave plate angles we used for each operator.
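To illustrate how such wave plate sequences act on a polarization qubit, the sketch below composes standard Jones matrices. It assumes one common convention (angles in radians, fast axis measured from horizontal, global phases dropped), and the angles used are hypothetical rather than taken from Table 2.

```python
import numpy as np


def hwp(theta):
    """Jones matrix of a half-wave plate with fast axis at angle theta (global phase dropped)."""
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    return np.array([[c, s], [s, -c]], dtype=complex)


def qwp(theta):
    """Jones matrix of a quarter-wave plate with fast axis at angle theta (global phase dropped)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c * c + 1j * s * s, (1 - 1j) * s * c],
                     [(1 - 1j) * s * c, s * s + 1j * c * c]])


def two_plate_operator(q_angle, h_angle):
    """Quarter-wave plate first, then half-wave plate (the rightmost matrix acts first)."""
    return hwp(h_angle) @ qwp(q_angle)


def three_plate_operator(q1_angle, h_angle, q2_angle):
    """Quarter-wave plate, half-wave plate, then quarter-wave plate."""
    return qwp(q2_angle) @ hwp(h_angle) @ qwp(q1_angle)


# Hypothetical angles, chosen only to show that the result is a unitary 2x2 matrix
u_bob = three_plate_operator(0.3, 1.1, -0.7)
print(np.allclose(u_bob.conj().T @ u_bob, np.eye(2)))
```

In this convention, a half-wave plate at 22.5° alone reproduces the Hadamard gate, which is a convenient sanity check on the sign conventions.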

Table 2 Set of wave plate angles.

A list of all of the wave plate angles used to perform the operators listed in Table 1. In our experiment, all combinations of these settings were used, which, together with our three input states, results in 360 measurement settings.
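The count of measurement settings follows from enumerating all combinations. The short sketch below uses placeholder integer labels instead of the actual operators and angles from Tables 1 and 2.

```python
from itertools import product

# Placeholder labels; the actual operators are those listed in Table 1
input_states = range(3)
alice_measurements = range(4)
alice_repreparations = range(3)
bob_unitaries = range(10)

settings = list(product(input_states, alice_measurements,
                        alice_repreparations, bob_unitaries))
print(len(settings))  # 3 * 4 * 3 * 10 = 360
```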


Experimentally estimating probabilities

Because Alice makes a two-outcome measurement, and our final measurement has two outcomes, for each setting of Alice and Bob, there are four different outcomes. Experimentally, each outcome corresponds to a different SPD. For each setting, we collected approximately 7500 counts in total after 2 s of data acquisition. From these counts, we estimated the four corresponding output probabilities through the formula

$$p_{mn} = \frac{C_{mn}}{\eta_m \, \eta_{mn} \, C_{\text{tot}}} \qquad (16)$$

where Cmn is the number of counts collected at one of the detectors, and the η factors are different relative detector efficiencies, described below. Here, m labels Alice’s outcome [experimentally, this labels the internal (purple) or external (yellow) interferometer] and n labels the outcome of the final measurement (experimentally, port 0 or port 1 of either interferometer). The total number of (efficiency corrected) counts, appearing in Eq. 16, is

$$C_{\text{tot}} = \sum_{m,n} \frac{C_{mn}}{\eta_m \, \eta_{mn}} \qquad (17)$$

The efficiency factors in the above equations are defined as follows. The single-subscript factor ηm refers to relative efficiencies between the internal (m = 1) and external (m = 0) interferometer (Fig. 3). The other factors Embedded Image refer to the relative efficiencies between the two ports n = 0 and n = 1, of interferometer m. Then the absolute efficiency of a given detector is Embedded Image. Roughly speaking, to estimate the relative efficiencies, we must send the same number of photons between the detectors and compare the measured count rates.
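The estimation of the four probabilities from raw counts can be sketched as follows. This assumes that each count is divided by the product of its interferometer efficiency and its port efficiency and that the results are then normalized by their sum; the counts and efficiencies below are made up for illustration.

```python
import numpy as np


def estimate_probabilities(counts, eta_interferometer, eta_port):
    """Efficiency-corrected, normalized probabilities.

    counts[m][n]          : counts for Alice's outcome m and final outcome n
    eta_interferometer[m] : relative efficiency of interferometer m
    eta_port[m][n]        : relative efficiency of port n in interferometer m
    """
    counts = np.asarray(counts, dtype=float)
    eta_m = np.asarray(eta_interferometer, dtype=float)[:, None]
    efficiency = eta_m * np.asarray(eta_port, dtype=float)
    corrected = counts / efficiency
    return corrected / corrected.sum()


# Hypothetical numbers (about 7500 counts in total)
counts = [[3000, 1000], [2000, 1500]]
eta_interferometer = [1.0, 0.9]
eta_port = [[1.0, 0.8], [1.0, 0.95]]

probabilities = estimate_probabilities(counts, eta_interferometer, eta_port)
print(probabilities, probabilities.sum())
```

Because of the normalization, multiplying all efficiencies by a common factor leaves the probabilities unchanged, so only relative efficiencies are needed.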

To estimate $\eta_{mn}$ within each interferometer, we sent the photons between the two ports by scanning the phase (when all of the internal wave plates are set to 0) by means of a piezo-electrically driven translation stage. Plots of representative interference fringes (already efficiency corrected) for each interferometer are shown in Fig. 6. By requiring the total counts out of the two ports to be constant, we can obtain a relative efficiency between the two ports in each interferometer. In practice, we obtain the efficiency by plotting the counts out of one port versus the counts out of the other port. If the two efficiencies are equal, the slope of this line will be −1. However, because the coupling and detector efficiencies differ, the constant total must be enforced on the efficiency-corrected counts by requiring

$$\frac{C_{m0}}{\eta_{m0}} + \frac{C_{m1}}{\eta_{m1}} = K_m \qquad (18)$$

where K0 and K1 are constants. We set one efficiency of each pair to 1, because we were interested in the relative efficiency. Setting (arbitrarily) $\eta_{m0} = 1$ means that the slope of Cm1 versus Cm0 will be $-\eta_{m1}$. These plots, for both interferometers, are shown in Fig. 7.
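The slope-based estimate of the relative port efficiency can be sketched with a linear fit. The model assumed here is that the efficiency-corrected counts of the two ports add up to a constant, with the efficiency of port 0 set to 1; the fringe data are synthetic.

```python
import numpy as np


def port_efficiency_from_slope(counts_port0, counts_port1):
    """Fit counts_port1 versus counts_port0 with a straight line.

    Assumed model: C0 + C1 / eta1 = K, so that C1 = eta1 * K - eta1 * C0.
    Returns the relative efficiency eta1 and the constant K.
    """
    slope, intercept = np.polyfit(counts_port0, counts_port1, 1)
    eta1 = -slope
    return eta1, intercept / eta1


# Synthetic fringe: 4000 photons per point, port 1 detected with relative efficiency 0.8
phase = np.linspace(0.0, 2.0 * np.pi, 50)
n_photons, true_eta1 = 4000.0, 0.8
counts_port0 = n_photons * (1.0 + np.cos(phase)) / 2.0
counts_port1 = true_eta1 * n_photons * (1.0 - np.cos(phase)) / 2.0

eta1, k_const = port_efficiency_from_slope(counts_port0, counts_port1)
print(f"eta1 = {eta1:.3f}, K = {k_const:.1f}")
```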

Fig. 6 Efficiency-corrected interferometer fringes out of the two interferometers.

A plot of the coincidences between the herald and the two detectors at the output of each interferometer as the interferometer phase is varied.

Fig. 7 Determination of detection efficiency.

Triggered coincidences detected in port 1 plotted against those detected in port 0 for both interferometers. Because the total number of photons exiting the interferometer should be constant, the relative collection/detection efficiency can be determined from the slope of this line.

If we next estimate ηm, the relative efficiency between the two interferometers, then we can estimate the required probabilities (Eq. 16). To do this, we used the state preparation wave plate (Fig. 3) to send the photons all to one interferometer or the other. In each case, we scanned the phase. Then, using the previously discussed efficiencies $\eta_{mn}$, we have K0 and K1 (Eq. 18). As before, we can set one of the relative efficiencies to 1; we chose η0 = 1. Then, we can calculate the final efficiency as

$$\eta_1 = \frac{K_1}{K_0} \qquad (19)$$

This works because by using the wave plates and the polarizing beam splitter, we can send nearly all of the incident photons one way or the other.

Using this procedure, we now have relative efficiencies between all of the detectors. Note that Embedded Image; however, this does not matter because even if we had the absolute efficiency of each detector, it would cancel out in the calculation of the probability (Eq. 16), because we must normalize by Ctot. After evaluating p00, p01, p10, and p11 for each of Alice and Bob’s settings, we weighted each by the corresponding αa,d,x,y,z (Eq. 12) and summed them all up. This gave us our experimental value of the causal nonseparability.
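The weighting and summation step, together with Gaussian error propagation, can be sketched as below. The sketch assumes that the errors on the individual probabilities are independent and that any sign convention is already contained in the coefficients; the numbers are illustrative, not values from table S1.

```python
import numpy as np


def weighted_sum_with_error(alpha, probabilities, probability_errors):
    """Return sum_i alpha_i * p_i and its propagated error.

    For independent errors, the variance of the sum is sum_i (alpha_i * sigma_i)^2.
    """
    alpha = np.asarray(alpha, dtype=float)
    probabilities = np.asarray(probabilities, dtype=float)
    probability_errors = np.asarray(probability_errors, dtype=float)
    value = np.sum(alpha * probabilities)
    error = np.sqrt(np.sum((alpha * probability_errors) ** 2))
    return value, error


# Illustrative coefficients, probabilities, and errors
value, error = weighted_sum_with_error([1.0, -2.0], [0.5, 0.25], [0.03, 0.02])
print(f"{value:.3f} +/- {error:.3f}")
```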

Stability and visibility of the interferometers

Central to our experiment were two interferometers whose overall size was approximately 80 cm × 120 cm. The visibility of the two interferometers was 95%; this is apparent in the interferograms shown in Fig. 6. This reduced visibility can be interpreted as dephasing noise on the control qubit (see the discussion in the main text). In addition to the reduced visibility, the phase of the interferometer fluctuated. If the phase fluctuated on the time scale of the acquisition time, this would further decrease the visibility. However, we found that the phase drifts rather slowly, by approximately 0.01 rad/min. To measure the causal witness, we needed to set 259 different wave plate settings. Moving the wave plates from one setting to the next took approximately 30 s. Combined with the measurement time of 2 s, this means that it took approximately 30 s per measurement setting. Therefore, after 30 measurements, the phase drifted enough to cause a noticeable error. To combat this, we automatically reset the phase to 0 rad every 20 measurement settings by setting the wave plates to 0°, scanning the piezo-electrically driven translation stage, and moving to the maximum of the fringe. Despite this action, there was still residual phase drift. We performed a separate measurement, mimicking our experimental procedure, to characterize this remaining phase drift. We set the wave plates to 0° so that we could directly observe the phase drift. As above, we counted for 2 s and reset the phase to 0 rad every 20 measurements. However, the wave plates remained set to 0° the entire time. Therefore, in the absence of phase drift, the fringe would have remained at a maximum. By measuring the deviation from the ideal values, we estimated that, over the course of our entire data run, we had a residual phase fluctuation of approximately 0.04 rad. Then, we propagated this error to estimate an error on each probability that we measured. These are the error bars drawn in Fig. 4 and figs. S1 to S3.

Causal witness derivation for our setup

Here, we define what a causal witness is and sketch the algorithm that was used to compute the witness suitable for our experimental setup. See the study of Araújo et al. (11) for an exhaustive introduction to the subject. Throughout this section, we will use the Choi-Jamiołkowski isomorphism, which we introduce briefly in the Supplementary Materials.

A process matrix Embedded Image (where the Hilbert spaces refer both to the input and the output of the laboratories) is “causally separable” if it can be written as a convex combination of processes compatible with the causal order AB and BA, that is, as Wsep = pWAB + (1 − p)WBA, with 0 ≤ p ≤ 1. A causal witness Embedded Image is a Hermitian operator such that Tr(SWsep) ≥ 0 for every causally separable process matrix Wsep. A “causally nonseparable” process matrix Wn−sep is detected by S if Tr(SWn−sep) < 0. The existence of such a Hermitian operator S is justified by the separating hyperplane theorem (19). As a consequence of this theorem, and because the set of causally separable processes is convex, for every causally nonseparable process Wn−sep, there exists a causal witness S such that Tr(SWn−sep) < 0. This is illustrated graphically in Fig. 8.

Fig. 8 Schematic representation of a causal witness.

In this two-dimensional representation, the causal witness is represented by the line (actually, a hyperplane) S. It separates the convex set of process matrices Embedded Image from a given causally nonseparable process matrix Wn−sep. Because the set of causally separable processes (Eq. 6) is convex, the separating hyperplane theorem (19) implies that one can always draw a hyperplane to separate it from any point outside the set (which corresponds to a causally nonseparable process). This hyperplane is the causal witness.

The optimal causal witness Sopt for a given process W can be computed efficiently using a “semidefinite program” (SDP) (11)Embedded Image(20)where Embedded Image and Embedded Image are, respectively, the set of causal witnesses and the set of Hermitian operators that have nonnegative trace with process matrices, as defined in the study of Araújo et al. (11), and Embedded Image is the identity operator on Embedded Image divided by the dimension of the output spaces Embedded Image out for normalization.

The causal nonseparability CNS(Wn−sep) = −Tr(SoptWn−sep) is the minimal λ ≥ 0 such that the process matrix

$$\frac{1}{1+\lambda}\left(W_{\text{n-sep}} + \lambda\,\Omega\right) \qquad (21)$$

is causally separable, after being optimized over all valid process matrices Ω. This means that it is the minimum amount of worst-case noise necessary to make Wn−sep causally separable or, equivalently, the maximum (or rather the supremum) amount of worst-case noise that Wn−sep can tolerate before becoming causally separable. Noting that $\frac{1}{1+\lambda}\left(W_{\text{n-sep}} + \lambda\,\Omega\right) = \frac{1}{1+\lambda}W_{\text{n-sep}} + \frac{\lambda}{1+\lambda}\Omega$, we see that $\frac{\lambda}{1+\lambda}$ can be interpreted as the probability that the worst-case process is prepared instead of the desired process Wn−sep and therefore that $\frac{\mathrm{CNS}(W_{\text{n-sep}})}{1+\mathrm{CNS}(W_{\text{n-sep}})}$ is the maximal probability that still allows us to see causal nonseparability.

Any witness S (particularly Sopt) can be decomposed with respect to a basis for the space Embedded Image. Such a basis consists of the Choi-Jamiołkowski representations of general state preparations on Embedded Image, general measurement and repreparation operations on Embedded Image and Embedded Image, and general measurements on Embedded Image. Having access to such a basis of operations means being able to perform full causal tomography.

However, in our experimental setup, Alice could implement general measure-and-reprepare operations Embedded Image, but Bob could implement only unitary operations Embedded Image, and measurements were carried out only in the superposition basis. Thus, Sopt will not necessarily be experimentally achievable, and in our case, it was not. To compute the best witness that we could experimentally implement, we added a restriction on the decomposition of the witness as an additional constraint in the SDP, which then outputs the optimal experimentally accessible witness Sexp Embedded Image (22) where Embedded Image are the 24 Choi-Jamiołkowski representations of measurement-repreparation maps, among which 16 were linearly independent; Embedded Image are the 10 linearly independent Choi-Jamiołkowski representations of unitaries, which are listed under the heading “Implementing Alice and Bob’s channels”; and Embedded Image are the two projectors onto the superposition basis.

The SDP of Eq. 22 returns the coefficients αa,d,x,y,z, which were used to weight the experimental probabilities p(a, d|x, y, z) corresponding to Embedded Image to compute the experimental value for Tr(SexpWSWITCH).

Analogously to the ideal case, the “experimentally accessible causal nonseparability” [that is, CNSexp(WSWITCH) = −Tr(SexpWSWITCH)] is the maximal amount of worst-case noise that can be admixed to WSWITCH before our experimental setup becomes incapable of certifying that WSWITCH is causally nonseparable, and Embedded Image is the maximal probability of preparing the worst-case noise process instead of the ideal WSWITCH.


Supplementary material for this article is available at

section A. Choi-Jamiołkowski isomorphism

fig. S1. Experimentally estimated probabilities.

fig. S2. Experimentally estimated probabilities.

fig. S3. Experimentally estimated probabilities.

table S1. List of all the experimental measurement settings and the corresponding coefficients.

Reference (35)

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.


Acknowledgments: We thank I. Alonso Calafell for assisting with the electronics and C. Branciard, F. Costa, F. Massa, and M. Zych for useful discussions. Funding: G.R. acknowledges support from the uni:docs fellowship programme. L.A.R. acknowledges support from the Templeton World Charity Foundation (fellowship no. TWCF0194). Č.B. acknowledges support from the John Templeton Foundation, and Individual Project (no. 24621). Č.B. and P.W. acknowledge support from the Doctoral Programme CoQuS (no. W1210-3). P.W. also acknowledges support from the European Commission, Emulators of Quantum Frustrated Magnetism (EQuaM) (no. 323714), Photonic Integrated Compound Quantum Encoding (PICQUE) (no. 608062), Graphene-Based Single-Photon Nonlinear Optical Devices (GRASP) (no.613024), Quantum Simulation on a Photonic Chip (QUCHIP) (no.641039), the Austrian Science Fund (FWF) through the START Program (Y585-N20), and the U.S. Air Force Office of Scientific Research (FA9550-16-1-0004). L.M.P. acknowledges partial support from Consejo Nacional de Ciencia y Tecnología–Mexico, from 1 November 2015 to 31 October 2016; the corresponding project is 10010-2015-02. Author contributions: G.R., L.A.R., M.A., A.F., and L.M.P. designed the experiment. G.R. and L.A.R. built the setup and carried out data collection. G.R., L.A.R., A.F., and M.A. performed data analysis. J.M.Z. designed and built the automated components. G.R. and M.A. created the figures. P.W. and C.B. supervised the project. All authors contributed to writing the paper. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.
