Understanding Quantum Computing from the Math Up
We start with the simplest ingredients — complex numbers and vectors — and end with variational quantum algorithms running on today’s noisy hardware. Every section depends on the one before it, so I recommend reading in order.
1. Complex Numbers: The Language of Quantum Amplitudes
A complex number has the form $z = a + bi$, where $a$ and $b$ are real and $i^2 = -1$. Quantum mechanics is built on complex numbers because they naturally encode two pieces of information at once: a magnitude and a phase.
Modulus and modulus squared
The modulus (absolute value) of $z = a + bi$ is:
$$|z| = \sqrt{a^2 + b^2}$$
The modulus squared is:
$$|z|^2 = z^* z = a^2 + b^2$$
where $z^* = a - bi$ is the complex conjugate. The modulus squared is the single most important operation in quantum mechanics: it turns amplitudes into probabilities.
Polar form and phase
Any complex number can be written in polar form:
$$z = r e^{i\theta}$$
Here $r = |z|$ is the modulus and $\theta$ is the phase angle. The bridge is Euler’s formula:
$$e^{i\theta} = \cos\theta + i\sin\theta$$
The factor $e^{i\theta}$ lives on the unit circle in the complex plane; its modulus is always 1. It is called a phase factor: it rotates a complex number without changing its size.
This duality — magnitude and angle — is exactly what quantum amplitudes need. A probability (a real non-negative number) can only tell you “how likely.” A complex amplitude tells you “how likely” and “at what angle” — and that angle is what enables interference.
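The modulus, conjugate, polar form, and phase factor can be checked with Python's built-in complex type. This is a minimal sketch; the value of `z` and the rotation angle are arbitrary choices for illustration.

```python
import cmath
import math

z = 3 + 4j  # arbitrary example value

modulus = abs(z)                        # |z| = sqrt(3^2 + 4^2)
modulus_sq = (z.conjugate() * z).real   # |z|^2 = z* z
r, theta = cmath.polar(z)               # polar form: z = r * e^{i theta}

# Rebuild z from its polar form via Euler's formula.
z_rebuilt = r * cmath.exp(1j * theta)

# A phase factor e^{i pi/2} rotates z by 90 degrees without changing |z|.
phase_factor = cmath.exp(1j * math.pi / 2)
z_rotated = phase_factor * z

print(modulus, modulus_sq)
print(z_rotated, abs(z_rotated))
```

Multiplying by the phase factor changes where `z` points in the complex plane but not its length, which is the property that later makes global phases unobservable.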
2. Vectors, Inner Products, and Dirac Notation
Kets and bras
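In Dirac notation a quantum state is written as a ket, a column vector of complex amplitudes:
$$|\psi\rangle = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$$
The corresponding bra is the conjugate transpose (row vector):
$$\langle\psi| = \begin{pmatrix} \alpha^* & \beta^* \end{pmatrix}$$
Inner product
The inner product (bra times ket) yields a scalar measuring the overlap between two states:
$$\langle\phi|\psi\rangle = \sum_i \phi_i^* \psi_i$$
For a state with itself, $\langle\psi|\psi\rangle = |\alpha|^2 + |\beta|^2$. When this equals 1, the state is normalized, which is a physical requirement, as we will see next.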
3. The Qubit
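A classical bit is either 0 or 1. A qubit is a two-dimensional complex vector that can be in a superposition of both:
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$$
where the computational basis states are:
$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$
The coefficients $\alpha$ and $\beta$ are called amplitudes.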
The Born rule (normalization)
The modulus squared of each amplitude gives the probability of measuring that outcome:
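- Probability of measuring 0: $P(0) = |\alpha|^2$
- Probability of measuring 1: $P(1) = |\beta|^2$
Since probabilities must sum to 1:
$$|\alpha|^2 + |\beta|^2 = 1$$
This is why a quantum state must be a unit-length vector. The familiar factor of $1/\sqrt{2}$ in the state $|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ is simply the normalization constant: $|1/\sqrt{2}|^2 + |1/\sqrt{2}|^2 = 1$.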
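Normalization and the Born rule take only a few lines to try out. The sketch below assumes NumPy is available; the raw amplitudes are arbitrary illustrative values.

```python
import numpy as np

# Unnormalized amplitudes (alpha, beta); the values are arbitrary.
raw = np.array([1 + 1j, 2 - 1j])

# Divide by sqrt(<psi|psi>) to obtain a unit-length state vector.
# np.vdot conjugates its first argument, so it computes the bra-ket product.
psi = raw / np.sqrt(np.vdot(raw, raw).real)

# Born rule: each probability is the modulus squared of an amplitude.
probs = np.abs(psi) ** 2

print(probs)        # P(0), P(1)
print(probs.sum())  # total probability
```

Here the squared moduli of the raw amplitudes are 2 and 5, so after normalization the probabilities are 2/7 and 5/7.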
Two kinds of phase
Write the most general single-qubit state as $|\psi\rangle = e^{i\gamma}\left(\cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle\right)$.
- Global phase ($\gamma$): multiplies the entire state. It cancels out in every measurement probability ($|e^{i\gamma}|^2 = 1$), so it has no physical meaning. We are free to drop it.
- Relative phase ($\phi$): the phase difference between $|0\rangle$ and $|1\rangle$. This is physically observable: it determines the outcomes of measurements in bases other than $Z$, and it is exactly what decoherence destroys.
After removing the global phase, any single-qubit state can be parametrized by just two real numbers:
$$|\psi\rangle = \cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle$$
Two angles: exactly what you need to specify a point on a sphere.
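The difference between the two kinds of phase shows up numerically. In this sketch (assuming NumPy; the helper names `probs_z` and `probs_x` are hypothetical), an $X$-basis measurement is modeled as a Hadamard followed by a $Z$-basis readout.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate

def probs_z(psi):
    """Z-basis outcome probabilities."""
    return np.abs(psi) ** 2

def probs_x(psi):
    """X-basis outcome probabilities: apply H, then read out in Z."""
    return np.abs(H @ psi) ** 2

plus = np.array([1, 1]) / np.sqrt(2)

# Global phase: the whole state is multiplied by e^{i * 0.7}.
global_shifted = np.exp(1j * 0.7) * plus

# Relative phase: only the |1> amplitude picks up e^{i pi}.
relative_shifted = np.array([1, np.exp(1j * np.pi)]) / np.sqrt(2)

for name, psi in [("plus", plus),
                  ("global", global_shifted),
                  ("relative", relative_shifted)]:
    print(name, np.round(probs_z(psi), 6), np.round(probs_x(psi), 6))
```

All three states give 50/50 in the $Z$ basis. The global phase also leaves the $X$-basis statistics untouched, while the relative phase of $\pi$ flips the $X$-basis outcome.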
4. The Bloch Sphere
Since a single-qubit pure state is described by two angles $(\theta, \phi)$, it maps to a point on the surface of a unit sphere, the Bloch sphere.
- $\theta$ (polar angle, 0 to $\pi$): controls the balance between $|0\rangle$ and $|1\rangle$
- $\phi$ (azimuthal angle, 0 to $2\pi$): the relative phase
Notable points
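| Location | Angles | State |
|---|---|---|
| North pole | $\theta = 0$ | $\lvert 0\rangle$ |
| South pole | $\theta = \pi$ | $\lvert 1\rangle$ |
| Equator | $\theta = \pi/2$, $\phi = 0$ | $\lvert +\rangle$ |
| Equator | $\theta = \pi/2$, $\phi = \pi$ | $\lvert -\rangle$ |
| Equator | $\theta = \pi/2$, any $\phi$ | Superpositions with phase $\phi$ |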
The states $|+\rangle$ and $|-\rangle$ sit at opposite points on the equator. They share the same $\theta = \pi/2$ (so $Z$-basis measurements give 50/50 for both), but they differ by $\Delta\phi = \pi$, which means an $X$-basis measurement can tell them apart perfectly. This is the geometric picture of relative phase.
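The map from angles to a point on the sphere is ordinary spherical coordinates. A small sketch, assuming NumPy and using hypothetical helper names, builds both the state vector and its Bloch-sphere point from the polar angle `theta` and azimuthal angle `phi`.

```python
import numpy as np

def state_from_angles(theta, phi):
    """State cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return np.array([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)])

def bloch_from_angles(theta, phi):
    """Cartesian point on the unit sphere for polar angle theta, azimuth phi."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

plus = state_from_angles(np.pi / 2, 0.0)     # equator, phi = 0
minus = state_from_angles(np.pi / 2, np.pi)  # equator, phi = pi

point_plus = bloch_from_angles(np.pi / 2, 0.0)
point_minus = bloch_from_angles(np.pi / 2, np.pi)
print(point_plus, point_minus)
```

The two equatorial states land at opposite ends of the $x$-axis, matching the claim that they differ only in azimuthal angle.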
Two key correspondences
Pure states live on the surface; mixed states live inside. A perfectly coherent qubit is a point on the sphere (Bloch vector length = 1). As decoherence degrades the state, the point shrinks toward the center. At the center sits the maximally mixed state: 50/50 $|0\rangle$ and $|1\rangle$, with no phase information left. The distance from the center encodes “how much coherence remains.”
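Quantum gates are rotations. The $X$ gate rotates 180° around the $x$-axis ($|0\rangle \leftrightarrow |1\rangle$). The $Z$ gate rotates 180° around the $z$-axis ($|+\rangle \leftrightarrow |-\rangle$, i.e., it flips the relative phase). $R_x(\theta)$ and $R_z(\theta)$ are partial rotations around their respective axes.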
Try it yourself
Use the interactive Bloch sphere tool to build intuition. Start at $|0\rangle$ (north pole), apply an $H$ gate to move to the equator, then try $S$, $T$, $Z$ to see how phase rotations work.
5. Measurement and Expectation Values
Measurement
Quantum measurement forces a qubit to collapse into one of the basis states, probabilistically. Measuring $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ in the $Z$-basis:
- With probability $|\alpha|^2$: result is “0,” state collapses to $|0\rangle$
- With probability $|\beta|^2$: result is “1,” state collapses to $|1\rangle$
A single measurement gives a single random bit; it reveals nothing about $\alpha$ or $\beta$ individually. To learn about the state, you must prepare it many times and collect statistics. This leads to the concept of the expectation value.
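To see why repetition is needed, here is a toy shot simulator using only the standard library. The function `sample_z` and the value `p0 = 0.25` are hypothetical; a real device would return the bits itself.

```python
import random

def sample_z(p0, shots, seed=0):
    """Simulate `shots` Z-basis measurements of a state with P(0) = p0.

    Returns (number of 0 outcomes, number of 1 outcomes).
    """
    rng = random.Random(seed)
    ones = sum(1 for _ in range(shots) if rng.random() >= p0)
    return shots - ones, ones

p0 = 0.25  # |alpha|^2 of a hypothetical state
for shots in (10, 1000, 100000):
    zeros, ones = sample_z(p0, shots)
    print(shots, zeros / shots)  # estimated P(0)
```

The estimate fluctuates noticeably at 10 shots and tightens as the shot count grows, with statistical error shrinking roughly as one over the square root of the number of shots.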
Expectation value
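The expectation value is the weighted average of measurement outcomes. Assign eigenvalue $+1$ to outcome $|0\rangle$ and $-1$ to $|1\rangle$ (these are the eigenvalues of $Z$):
$$\langle Z\rangle = (+1)\,|\alpha|^2 + (-1)\,|\beta|^2 = |\alpha|^2 - |\beta|^2$$
For any observable (Hermitian operator) $O$, the expectation value is given by the unified formula:
$$\langle O\rangle = \langle\psi|O|\psi\rangle$$
Read right to left: apply $O$ to $|\psi\rangle$ to get a new vector, then take the inner product with $\langle\psi|$. The result is always a real number.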
Worked example
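Take $|+\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
Computing $\langle Z\rangle$:
$$\langle +|Z|+\rangle = \frac{1}{2}\begin{pmatrix} 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{2}(1 - 1) = 0$$
This makes sense: $|+\rangle$ measured in the $Z$-basis gives 50/50 outcomes, so the average of $\pm 1$ is 0.
Computing $\langle X\rangle$:
$$\langle +|X|+\rangle = \frac{1}{2}\begin{pmatrix} 1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{2}(1 + 1) = 1$$
This also makes sense: $|+\rangle$ is the $+1$ eigenstate of $X$, so $X$-basis measurement yields $+1$ with certainty.
Together with $\langle Y\rangle = 0$, these three values give the Bloch sphere coordinates $(\langle X\rangle, \langle Y\rangle, \langle Z\rangle) = (1, 0, 0)$, pointing along the $+x$ axis, exactly where $|+\rangle$ lives.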
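The same worked example can be reproduced numerically. This sketch assumes NumPy; `expectation` is a hypothetical helper that evaluates the bra-operator-ket product directly.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def expectation(op, psi):
    """<psi|op|psi>; real for a Hermitian op, so keep only the real part."""
    return np.vdot(psi, op @ psi).real

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

# Bloch coordinates are the three Pauli expectation values.
bloch = np.array([expectation(P, plus) for P in (X, Y, Z)])
print(bloch)
```

Swapping `plus` for any other normalized two-component vector gives that state's Bloch coordinates.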
Why expectation values matter
- They are the experimentally accessible quantity. Single shots are random; repeated measurements converge to $\langle O\rangle$. This is also why VQE is measurement-hungry: suppressing statistical noise requires many repetitions.
- They translate quantum states into physical numbers. If $H$ is a Hamiltonian, $\langle H\rangle$ is the average energy. VQE’s cost function is an expectation value.
- They connect directly to probabilities. For an observable with eigenvalues $\pm 1$: $\langle O\rangle = P(+1) - P(-1)$.
- They are the Bloch sphere coordinates. $(\langle X\rangle, \langle Y\rangle, \langle Z\rangle)$ locates the state on the sphere, and its length $\sqrt{\langle X\rangle^2 + \langle Y\rangle^2 + \langle Z\rangle^2}$ measures purity (1 for pure, smaller for mixed).
6. Phase Information: What It Means to Have It (or Lose It)
Now that we have the mathematical tools, we can understand precisely what “phase” means and why losing it is catastrophic.
Where is the phase?
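A general superposition:
$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + e^{i\phi}|1\rangle\right)$$
The relative phase $\phi$ distinguishes $|+\rangle$ ($\phi = 0$) from $|-\rangle$ ($\phi = \pi$). In the $Z$-basis both measure as 50/50, but in the $X$-basis they give opposite deterministic outcomes. The difference is entirely in $\phi$.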
Seeing decoherence through the density matrix
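A pure superposition state $|\psi\rangle = \frac{1}{\sqrt{2}}(|0\rangle + e^{i\phi}|1\rangle)$ has the density matrix:
$$\rho = |\psi\rangle\langle\psi| = \frac{1}{2}\begin{pmatrix} 1 & e^{-i\phi} \\ e^{i\phi} & 1 \end{pmatrix}$$
- The diagonal elements (0.5, 0.5) are the populations: the probabilities of finding $|0\rangle$ or $|1\rangle$.
- The off-diagonal elements $\frac{1}{2}e^{\mp i\phi}$ are the coherences: they carry the phase information.
When the environment randomizes the phase (dephasing), we average over $\phi$, and the off-diagonals vanish:
$$\rho \;\to\; \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \frac{I}{2}$$
This is a classical probability mixture: 50% $|0\rangle$, 50% $|1\rangle$, with zero interference capability.
Precisely: “Losing phase information” = the off-diagonal terms of the density matrix decay to zero while the diagonal (populations) stay unchanged. This is pure dephasing (timescale $T_\phi$, which contributes to $T_2$). If the diagonals also change ($|1\rangle$ population decaying to $|0\rangle$), that is the energy-relaxation ($T_1$) process.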
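The phase-averaging argument can be checked directly. This sketch, assuming NumPy, averages the density matrix of an equal superposition over 360 evenly spaced relative phases; the sample count and the helper name `rho_for_phase` are arbitrary choices.

```python
import numpy as np

def rho_for_phase(phi):
    """Density matrix of (|0> + e^{i phi}|1>)/sqrt(2)."""
    psi = np.array([1, np.exp(1j * phi)]) / np.sqrt(2)
    return np.outer(psi, psi.conj())

# A fully randomized phase is modeled as a uniform average over phi.
phis = np.linspace(0, 2 * np.pi, 360, endpoint=False)
rho_avg = np.mean([rho_for_phase(p) for p in phis], axis=0)

print(np.round(rho_for_phase(0.0), 6))  # coherent: off-diagonals are 0.5
print(np.round(rho_avg, 6))             # averaged: off-diagonals vanish
```

The populations on the diagonal survive the average unchanged, while the coherences cancel, which is the signature of dephasing without energy loss.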
Coherent vs. incoherent: the experimental distinction
| | System A: pure superposition $\lvert +\rangle$ | System B: classical mixture 50% $\lvert 0\rangle$ + 50% $\lvert 1\rangle$ |
|---|---|---|
| $Z$-basis measurement | 50/50 | 50/50 (indistinguishable!) |
| Hadamard then measure | 100% gives 0 | 50/50 |
| Math object | Amplitudes (complex) | Probabilities (real) |
| How they combine | Amplitudes add, then square | Probabilities add directly |
| Interference? | Yes (constructive / destructive) | No |
| Density matrix | Non-zero off-diagonals | Off-diagonals are zero |
| Bloch sphere | On the surface | Inside (center at worst) |
System A, passed through a Hadamard gate, undergoes destructive interference: the $|1\rangle$ amplitude cancels perfectly. System B has no phase relationship, so probabilities just average. The fundamental distinction is amplitude addition vs. probability addition.
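The Hadamard test that separates the two systems is easy to simulate with density matrices. A minimal sketch, assuming NumPy; `measure_after_h` is a hypothetical helper name.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

plus = np.array([1, 1]) / np.sqrt(2)
rho_pure = np.outer(plus, plus.conj())  # System A: coherent superposition
rho_mixed = np.eye(2) / 2               # System B: 50/50 classical mixture

def measure_after_h(rho):
    """Z-basis probabilities after a Hadamard: diagonal of H rho H^dagger."""
    return np.real(np.diag(H @ rho @ H.conj().T))

# Direct Z-basis readout cannot tell the two systems apart ...
print(np.real(np.diag(rho_pure)), np.real(np.diag(rho_mixed)))
# ... but a Hadamard before the readout can.
print(measure_after_h(rho_pure), measure_after_h(rho_mixed))
```

For the pure state the off-diagonal terms feed into the diagonal after the Hadamard and cancel the probability of outcome 1; the mixture has no off-diagonals to contribute, so it stays at 50/50.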
Why this matters for quantum computing: Every quantum speedup (Shor, Grover, phase estimation) relies on controlled interference — amplifying the correct answer and canceling the wrong ones. Once phase information is lost, a quantum computer degrades into an expensive random number generator.
7. Tensor Products: Building Multi-Qubit Systems
The tensor product ($\otimes$, or Kronecker product for matrices) combines subsystems into a joint system. Dimensions multiply; they do not add.
Vectors
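Each component of the first vector multiplies the entire second vector:
$$\begin{pmatrix} a \\ b \end{pmatrix} \otimes \begin{pmatrix} c \\ d \end{pmatrix} = \begin{pmatrix} ac \\ ad \\ bc \\ bd \end{pmatrix}$$
For example, $|0\rangle \otimes |1\rangle$ (written $|01\rangle$):
$$\begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}$$
The four components correspond to the amplitudes of $|00\rangle, |01\rangle, |10\rangle, |11\rangle$. For $n$ qubits the state vector has $2^n$ dimensions; this exponential growth is the source of quantum computing’s potential power.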
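NumPy's `np.kron` implements exactly this rule. A short sketch, assuming NumPy, builds a two-qubit and a three-qubit basis state.

```python
import numpy as np

ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

# |0> tensor |1> = |01>: a single 1 in the slot labeled 01.
ket01 = np.kron(ket0, ket1)
print(ket01)

# Three qubits: dimensions multiply, 2 * 2 * 2 = 8.
ket011 = np.kron(np.kron(ket0, ket1), ket1)
print(ket011.shape, int(np.argmax(ket011)))  # position of the 1 is binary 011
```

The index of the nonzero entry is the bitstring read as a binary number, with the first tensor factor as the most significant bit.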
Matrices
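Each element of the first matrix is replaced by that element times the entire second matrix:
$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B \\ a_{21}B & a_{22}B \end{pmatrix}$$
For example, the Pauli string $Z \otimes X$:
$$Z \otimes X = \begin{pmatrix} X & 0 \\ 0 & -X \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix}$$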
The mixed-product property
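This is the most useful identity in multi-qubit calculations:
$$(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$$
Each factor acts only on its own subsystem. For instance, $(Z \otimes X)(|0\rangle \otimes |1\rangle) = Z|0\rangle \otimes X|1\rangle = |0\rangle \otimes |0\rangle$, with no need to construct the $4 \times 4$ matrix.
Another frequently used rule:
$$(A \otimes B)^\dagger = A^\dagger \otimes B^\dagger$$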
Practical notes
- Not commutative: $A \otimes B \neq B \otimes A$ in general. The ordering corresponds to qubit labeling and cannot be swapped.
- Associative: $(A \otimes B) \otimes C = A \otimes (B \otimes C)$, so grouping does not matter.
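These algebraic rules can be spot-checked with `np.kron`. The sketch below assumes NumPy and uses $Z$ and $X$ as example operators; any matrices of matching size would do.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

# Mixed-product property on states: act factor by factor.
lhs = np.kron(Z, X) @ np.kron(ket0, ket1)  # build the 4x4 matrix first
rhs = np.kron(Z @ ket0, X @ ket1)          # act on each qubit separately
print(lhs, rhs)

# Mixed-product property on operators.
mixed_ok = np.allclose(np.kron(Z, X) @ np.kron(X, Z), np.kron(Z @ X, X @ Z))

# Ordering matters, grouping does not.
commutes = np.allclose(np.kron(Z, X), np.kron(X, Z))
associative = np.allclose(np.kron(np.kron(Z, X), I2), np.kron(Z, np.kron(X, I2)))
print(mixed_ok, commutes, associative)
```

The factor-by-factor route never forms a matrix larger than 2 by 2, which is why it is preferred in hand calculations.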
8. Pauli Strings: The Building Blocks of Quantum Observables
The four Pauli matrices
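- $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$: identity, does nothing.
- $X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$: bit flip ($|0\rangle \leftrightarrow |1\rangle$), 180° rotation around the Bloch sphere $x$-axis.
- $Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$: phase flip ($|1\rangle \to -|1\rangle$), 180° rotation around the $z$-axis.
- $Y = iXZ = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$: 180° rotation around the $y$-axis.
These four matrices form a basis for all $2 \times 2$ Hermitian matrices. Any single-qubit observable can be written as a real linear combination of $I, X, Y, Z$.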
Pauli strings
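An $n$-qubit Pauli string assigns one Pauli matrix to each qubit and connects them with tensor products:
$$P = \sigma_0 \otimes \sigma_1 \otimes \cdots \otimes \sigma_{n-1}, \qquad \sigma_k \in \{I, X, Y, Z\}$$
Examples on three qubits (subscripts denote qubit index, omitted positions default to $I$):
- $Z_0 Z_1$ = $Z \otimes Z \otimes I$
- $X_0 Y_2$ = $X \otimes I \otimes Y$
- $Z_1$ = $I \otimes Z \otimes I$
A Pauli string is formally a $2^n \times 2^n$ matrix, but we almost always work with the compact notation, which is the whole point of using them.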
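Going from the compact label to the explicit matrix is a fold over `np.kron`. In this sketch (assuming NumPy) the function `pauli_string` and its label convention, leftmost character is the first tensor factor, are illustrative choices rather than a standard API.

```python
from functools import reduce

import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(label):
    """Dense matrix for a label such as 'ZZI' (leftmost char = first factor)."""
    return reduce(np.kron, (PAULIS[c] for c in label))

P = pauli_string("ZZI")
print(P.shape)                         # 2^3 x 2^3
print(np.allclose(P @ P, np.eye(8)))   # every Pauli string squares to identity
```

The dense matrix doubles in each dimension with every added qubit, so the label form is what one actually stores and manipulates.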
Why Pauli strings are central
They are the building blocks of Hamiltonians. Any $n$-qubit Hermitian operator can be uniquely decomposed as a weighted sum of Pauli strings: $H = \sum_k c_k P_k$ with real coefficients $c_k$. The $4^n$ Pauli strings form a complete orthogonal basis for the space of $2^n \times 2^n$ Hermitian matrices.
Expectation values are easy to measure. Measuring a string of $Z$ operators just requires checking the parity of the measured bits. Measuring $X$ or $Y$ requires a basis-change gate before measuring as $Z$. So $\langle H\rangle = \sum_k c_k \langle P_k\rangle$ is computed by measuring each Pauli term separately and taking the weighted sum.
Exponentials decompose into standard gate circuits. $e^{-i\theta P}$ for any Pauli string $P$ has a fixed circuit template (detailed in Section 9).
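Because Pauli strings are orthogonal under the trace inner product, each coefficient of a Hermitian matrix $H$ can be read off as $\mathrm{Tr}(P H)/2^n$. The sketch below assumes NumPy; `pauli_decompose` and the example Hamiltonian are hypothetical, and the brute-force loop over all $4^n$ labels is only practical for a few qubits.

```python
import itertools
from functools import reduce

import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(label):
    return reduce(np.kron, (PAULIS[c] for c in label))

def pauli_decompose(H, tol=1e-12):
    """Return {label: coefficient} with H = sum of coefficient * Pauli string."""
    n = H.shape[0].bit_length() - 1  # H is 2^n x 2^n
    coeffs = {}
    for letters in itertools.product("IXYZ", repeat=n):
        label = "".join(letters)
        c = np.trace(pauli_string(label) @ H).real / 2**n
        if abs(c) > tol:
            coeffs[label] = c
    return coeffs

# Hypothetical two-qubit Hamiltonian: 0.5 * Z(x)Z + 0.3 * X(x)I.
H = 0.5 * pauli_string("ZZ") + 0.3 * pauli_string("XI")
coeffs = pauli_decompose(H)
print(coeffs)
```

Summing `coeffs[label] * pauli_string(label)` over the returned dictionary rebuilds the original matrix.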
Useful properties
- Every Pauli string is both Hermitian ($P^\dagger = P$) and unitary ($P^\dagger P = I$), so $P^2 = I$ and its eigenvalues are $+1$ and $-1$ only.
- Any two Pauli strings either commute or anti-commute; this is the foundation of the stabilizer formalism and quantum error correction.
- Weight: the number of non-$I$ factors. Higher weight means more expensive measurement and deeper circuits. (The notorious $Z$-tails of Jordan-Wigner are high-weight strings.)
9. Mapping Problems to Quantum Circuits
The goal is to recast a problem as finding the ground state of some Hamiltonian $H$. The mapping has three layers:
- Problem layer: define what to optimize or solve.
- Hamiltonian layer: write $H$ (a weighted sum of Pauli strings) such that the ground state of $H$ encodes the solution.
- Circuit layer: translate state preparation and time evolution into concrete quantum gates.
Example 1: QAOA for Max-Cut
Encoding: Each graph vertex gets one qubit. $|0\rangle$ = group A, $|1\rangle$ = group B.
Hamiltonian: For each edge $(i, j)$, the “reward for being cut” is:
$$\frac{1}{2}\left(I - Z_i Z_j\right)$$
This equals 1 if qubits $i$ and $j$ are in different groups (one is $|0\rangle$, the other $|1\rangle$) and 0 if they are in the same group. The full cost Hamiltonian is:
$$C = \sum_{(i,j) \in E} \frac{1}{2}\left(I - Z_i Z_j\right)$$
This is diagonal: every computational basis state is an eigenstate, and its eigenvalue equals the number of edges cut by that assignment. The optimal cut is the maximum-eigenvalue state of $C$, or equivalently the ground state of $H_C = -C$ after the sign flip.
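Because the cost Hamiltonian is diagonal, its eigenvalue on a bitstring can be evaluated classically, edge by edge. This standard-library sketch uses a hypothetical 4-vertex graph (a square plus one diagonal) and brute-forces all assignments, which is only feasible for tiny graphs.

```python
import itertools

def cut_value(bits, edges):
    """Eigenvalue of the cut-counting Hamiltonian on basis state |bits>."""
    total = 0
    for i, j in edges:
        zi = 1 - 2 * bits[i]  # Z eigenvalue: +1 for bit 0, -1 for bit 1
        zj = 1 - 2 * bits[j]
        total += (1 - zi * zj) // 2  # 1 if the edge is cut, else 0
    return total

# Hypothetical graph: square 0-1-2-3-0 plus the diagonal 0-2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

best = max(itertools.product((0, 1), repeat=4),
           key=lambda bits: cut_value(bits, edges))
print(best, cut_value(best, edges))
```

The brute-force search visits $2^n$ bitstrings; QAOA is an attempt to find good assignments without enumerating them all.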
Circuit translation:
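- Initial state: apply Hadamard to all qubits to create $|+\rangle^{\otimes n}$.
- Problem layer $e^{-i\gamma H_C}$ (with $H_C$ the cost Hamiltonian): implement each $Z_i Z_j$ term as CNOT → $R_z$ on the target qubit → CNOT.
- Mixing layer $e^{-i\beta H_M}$ with $H_M = \sum_i X_i$: apply $R_x(2\beta)$ to each qubit.
- Repeat for $p$ layers.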
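The steps above can be sketched as a one-layer statevector simulation for the smallest possible graph, a single edge. This assumes NumPy and applies the phase layer with the cut count itself as the generator; if the sign-flipped Hamiltonian is used instead, the sign of `gamma` flips. The function names are hypothetical.

```python
import numpy as np

n = 2
edges = [(0, 1)]

def cost_diagonal(n, edges):
    """Number of cut edges for each basis state (qubit 0 = leftmost bit)."""
    diag = np.zeros(2**n)
    for idx in range(2**n):
        bits = [(idx >> (n - 1 - q)) & 1 for q in range(n)]
        diag[idx] = sum(bits[i] != bits[j] for i, j in edges)
    return diag

def mixer(n, beta):
    """exp(-i beta X) applied to every qubit."""
    rx = np.array([[np.cos(beta), -1j * np.sin(beta)],
                   [-1j * np.sin(beta), np.cos(beta)]])
    U = np.array([[1.0 + 0j]])
    for _ in range(n):
        U = np.kron(U, rx)
    return U

def qaoa_expectation(gamma, beta):
    diag = cost_diagonal(n, edges)
    psi = np.full(2**n, 1 / np.sqrt(2**n), dtype=complex)  # |+>^n
    psi = np.exp(-1j * gamma * diag) * psi                 # problem layer
    psi = mixer(n, beta) @ psi                             # mixing layer
    return float(np.sum(diag * np.abs(psi) ** 2))          # expected cut size

print(qaoa_expectation(0.0, 0.0))              # no layers applied: random guess
print(qaoa_expectation(np.pi / 2, np.pi / 8))  # tuned angles for one edge
```

For this single-edge case the expected cut works out analytically to $\frac{1}{2}(1 + \sin 4\beta \sin\gamma)$, so the second call reaches the maximum cut of 1 while the first stays at the random-guess value of 0.5.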
Example 2: VQE for molecular simulation
This requires an extra step — a fermion-to-qubit mapping — because electrons are fermions with anti-commutation relations that bare qubit operators do not satisfy.
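Second-quantized Hamiltonian: $H = \sum_{pq} h_{pq}\, a_p^\dagger a_q + \frac{1}{2}\sum_{pqrs} h_{pqrs}\, a_p^\dagger a_q^\dagger a_r a_s$, with coefficients $h_{pq}$ and $h_{pqrs}$ computed classically.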
Fermion-to-qubit transform (the key step):
- Jordan-Wigner: most straightforward, but introduces long $Z$-tails; each creation/annihilation operator maps to a Pauli string whose weight scales as $O(n)$.
- Bravyi-Kitaev: each operator involves only $O(\log n)$ qubits, producing shorter circuits.
After this step, $H$ becomes a sum of Pauli strings $H = \sum_k c_k P_k$.
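The role of the $Z$-tail is easiest to see by building Jordan-Wigner operators explicitly and checking the fermionic anti-commutation relations. This is a sketch assuming NumPy and one common convention (mode $j$ is the $j$-th tensor factor, lowering operator $|0\rangle\langle 1| = (X + iY)/2$); other references order the modes differently.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2)
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
LOWER = np.array([[0.0, 1.0], [0.0, 0.0]])  # |0><1| = (X + iY) / 2

def jw_annihilation(j, n):
    """a_j = Z x ... x Z x |0><1| x I x ... x I, with Z on the j earlier modes."""
    factors = [Z] * j + [LOWER] + [I2] * (n - j - 1)
    return reduce(np.kron, factors)

def anticommutator(A, B):
    return A @ B + B @ A

n = 3
a = [jw_annihilation(j, n) for j in range(n)]
dim = 2**n

for i in range(n):
    for j in range(n):
        expected = np.eye(dim) if i == j else np.zeros((dim, dim))
        ok_dagger = np.allclose(anticommutator(a[i], a[j].T), expected)
        ok_plain = np.allclose(anticommutator(a[i], a[j]), 0)
        print(i, j, ok_dagger, ok_plain)
```

Without the `Z` factors, operators on different modes would commute rather than anti-commute. The price is that `a[j]` touches `j + 1` qubits, which is the linear weight growth mentioned above.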
Ansatz construction: UCCSD (chemistry-inspired, accurate but deep circuits) or hardware-efficient ansatz (shallow but prone to barren plateaus).
Measuring $\langle H\rangle$: each Pauli term is measured independently (with appropriate basis rotations for $X$ and $Y$ terms), then results are combined with their coefficients.
General technique: decomposing $e^{-iHt}$ into gates
Regardless of the problem, the evolution operator $e^{-iHt}$ must be broken into elementary gates. The core recipe is Trotter decomposition + standard Pauli-string exponentiation:
- When terms do not commute, use the Trotter approximation: $e^{-i(A+B)t} \approx \left(e^{-iAt/r}\, e^{-iBt/r}\right)^r$, which becomes exact as the number of steps $r \to \infty$.
- For a single Pauli string exponential $e^{-i\theta P}$, there is a fixed template:
  - Single-qubit gates to rotate any non-$Z$ Paulis to $Z$.
  - A cascade of CNOTs to collect parity onto one qubit.
  - $R_z(2\theta)$ on that qubit, with the convention $R_z(\varphi) = e^{-i\varphi Z/2}$.
  - Reverse the CNOTs and basis rotations.
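Both ingredients of this recipe can be verified numerically on small matrices. The sketch assumes NumPy, takes $R_z(\varphi) = e^{-i\varphi Z/2}$, and uses a CNOT with qubit 0 as control; the helper names and the test values are arbitrary.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def pauli_exp(P, theta):
    """exp(-i theta P) for P^2 = I: cos(theta) I - i sin(theta) P."""
    return np.cos(theta) * np.eye(P.shape[0]) - 1j * np.sin(theta) * P

def rz(angle):
    """R_z(angle) = exp(-i angle Z / 2)."""
    return np.diag([np.exp(-1j * angle / 2), np.exp(1j * angle / 2)])

# Template check: CNOT, R_z(2 theta) on the target, CNOT equals exp(-i theta ZZ).
theta = 0.37
template = CNOT @ np.kron(I2, rz(2 * theta)) @ CNOT
exact_zz = pauli_exp(np.kron(Z, Z), theta)
print(np.allclose(template, exact_zz))

def expm_hermitian(H, t):
    """exp(-i H t) via eigendecomposition of a Hermitian H."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

def trotter(A, B, t, steps):
    step = expm_hermitian(A, t / steps) @ expm_hermitian(B, t / steps)
    return np.linalg.matrix_power(step, steps)

# Trotter check with non-commuting terms A = X, B = Z.
exact = expm_hermitian(X + Z, 1.0)
errors = [np.linalg.norm(trotter(X, Z, 1.0, r) - exact, 2) for r in (1, 10, 100)]
print(errors)
```

The Trotter error shrinks roughly in proportion to one over the number of steps, which is the first-order behavior expected from this product formula.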
10. The NISQ Era and Variational Algorithms
What is “near-term quantum”?
The term NISQ (Noisy Intermediate-Scale Quantum) was coined by John Preskill in 2018 to describe the current generation of quantum hardware:
- Noisy: qubits have no error correction. Decoherence and gate errors (0.1%–1% per gate) accumulate directly in the output.
- Intermediate-Scale: roughly 50 to a few thousand physical qubits. Beyond ~50 qubits, classical computers struggle to simulate the system exactly — but the qubit count is still far too small to support full error correction.
| | NISQ (Near-term) | FTQC (Fault-tolerant) |
|---|---|---|
| Error correction | None | Full quantum error-correcting codes |
| Qubit count | 50 – few thousand (physical) | Millions physical → thousands logical |
| Circuit depth | Strictly limited by decoherence | Arbitrarily long in principle |
| Algorithms | Variational, sampling tasks | Shor’s factoring, exact chemistry simulation |
The dividing line is error correction. NISQ occupies the awkward middle ground — too many qubits for classical simulation, too few for the overhead of error correction.
The variational hybrid algorithm framework
All variational algorithms share a common skeleton — a feedback loop between a quantum processor and a classical optimizer:
- Prepare a parameterized quantum circuit $U(\theta)$ acting on an initial state, producing a trial state $|\psi(\theta)\rangle = U(\theta)|0\rangle$ (called the ansatz).
- Measure the expectation value of the target Hamiltonian: $E(\theta) = \langle\psi(\theta)|H|\psi(\theta)\rangle$. This is the “cost.”
- A classical optimizer adjusts $\theta$ to reduce the cost.
- Repeat until convergence.
The design motivation: keep the quantum circuit shallow (avoiding decoherence) and offload the heavy iteration work to the noise-resilient classical computer. The theoretical justification is the variational principle: for any trial state, $\langle\psi|H|\psi\rangle \ge E_0$ (the ground state energy). So “minimize $E(\theta)$” is equivalent to “approximate the ground state.”
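The whole loop fits in a few lines for a single qubit. This toy sketch assumes NumPy, a hypothetical Hamiltonian $Z + 0.5X$, a one-parameter $R_y$ ansatz, and exact noise-free expectation values; on hardware each call to `energy` would instead be estimated from many shots.

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
H = Z + 0.5 * X  # hypothetical single-qubit Hamiltonian

def ansatz(theta):
    """R_y(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def energy(theta):
    """Cost function <psi(theta)|H|psi(theta)> (the state is real here)."""
    psi = ansatz(theta)
    return float(psi @ H @ psi)

# Classical optimizer: gradient descent with parameter-shift gradients.
theta = 0.1
for _ in range(200):
    grad = 0.5 * (energy(theta + np.pi / 2) - energy(theta - np.pi / 2))
    theta -= 0.4 * grad

e0 = np.linalg.eigvalsh(H)[0]  # exact ground energy for comparison
print(energy(theta), e0)
```

The optimized energy approaches the exact ground energy from above, as the variational principle requires. It can reach it here only because this ansatz happens to contain the true ground state.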
VQE and QAOA
| | VQE (Variational Quantum Eigensolver) | QAOA (Quantum Approximate Optimization) |
|---|---|---|
| Problem | Ground-state energy of a Hamiltonian | Approximate solution to combinatorial optimization |
| Applications | Quantum chemistry, materials science | Max-Cut, scheduling, routing |
| Ansatz structure | Flexible (UCCSD, hardware-efficient, etc.) | Fixed: alternating $H_C$ and $H_M$ layers |
| Parameter count | Many (grows with ansatz complexity) | Few (only $2p$ parameters for $p$ layers) |
QAOA has a physically motivated circuit structure, alternating applications of two unitaries:
$$|\psi(\gamma, \beta)\rangle = \prod_{l=1}^{p} e^{-i\beta_l H_M}\, e^{-i\gamma_l H_C}\, |+\rangle^{\otimes n}$$
Here $H_C$ encodes the optimization objective (its ground state = the optimal solution) and $H_M$ provides “mixing,” exploring the solution space. The integer $p$ is the number of alternating layers. QAOA can be viewed as VQE with a structured, problem-specific ansatz.
Shared practical challenges:
- Noise accumulation limits circuit depth.
- Barren plateaus: gradients can vanish exponentially in the number of qubits, making optimization intractable.
- Measurement overhead is significant — each expectation value requires thousands of shots.
- Whether NISQ algorithms offer genuine quantum advantage at practical scale remains an open question. The field is gradually shifting toward “early fault-tolerant” approaches.
11. Superconducting Qubits and Decoherence
With the mathematical framework in place, we can now understand the physical platform where much of today’s quantum computing happens — and why maintaining coherence is so hard.
Why superconducting qubits stay coherent (for a while)
Superconducting qubits are not especially long-lived; their coherence times (typically 100 µs to 1 ms) are short compared to trapped ions or neutral atoms. But they maintain coherence long enough to be useful, for specific physical reasons:
The superconducting energy gap. At cryogenic temperatures, electrons pair into Cooper pairs and condense into a superconducting state. Breaking a Cooper pair requires crossing a finite energy gap ($2\Delta$). Below this gap, there are simply no available electronic excitation states for energy to dissipate into. In normal metals, the continuous excitation spectrum means oscillating currents decay almost immediately. In a superconductor at low temperature, the qubit’s operating frequency range is effectively “silent.”
Supporting factors:
- Macroscopic quantum coherence: a superconducting qubit is a collective state of billions of Cooper pairs sharing a single wave function. The Josephson junction provides the nonlinearity needed to isolate two energy levels, while the rest of the circuit behaves like a clean LC oscillator.
- Millikelvin operating temperatures: at 10–20 mK, thermal photons at the qubit frequency (~5 GHz) are essentially absent ($k_B T \ll hf$).
- Circuit design against known noise: the transmon design deliberately operates in a regime exponentially insensitive to charge noise.
Coherence times: $T_1$, $T_2$, and dephasing
Coherence time measures how long a qubit preserves its “quantumness.” There are two distinct timescales:
$T_1$ (energy relaxation time): the characteristic time for the qubit to decay from $|1\rangle$ to $|0\rangle$. Energy is genuinely lost to the environment (dielectric loss, quasiparticles, stray modes).
$T_2$ (phase coherence time): the characteristic time for the relative phase of a superposition to remain well-defined. Even without energy loss, environmental fluctuations can randomly shift the phase, causing the superposition to “blur out.” This pure phase randomization is called dephasing, with characteristic time $T_\phi$.
The three are related by:
$$\frac{1}{T_2} = \frac{1}{2T_1} + \frac{1}{T_\phi}$$
Energy relaxation necessarily causes some dephasing (the $1/(2T_1)$ term), but additional pure dephasing mechanisms, typically low-frequency noise ($1/f$ noise, flux fluctuations, charge fluctuations) causing the qubit frequency to drift, contribute $1/T_\phi$.
Key detail: $T_2 \le 2T_1$ always. When $T_2 \approx 2T_1$, dephasing has been suppressed to its limit, and the remaining decoherence comes almost entirely from energy decay, a sign of excellent fabrication.
Practical implications: The number of gate operations per coherence time $\approx T_2 /$ single-gate time. With single gates at 20–50 ns and $T_2 \approx 100$ µs, a circuit can in principle run a few thousand gates. But each gate also has finite error, so fault-tolerant quantum computing requires error correction to break through this barrier.
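The relation between the three timescales and the resulting gate budget is simple arithmetic. This standard-library sketch uses hypothetical values for the relaxation and pure-dephasing times; only the formulas come from the text.

```python
def t2_from(t1, t_phi):
    """Combine 1/T2 = 1/(2 T1) + 1/T_phi; all times in the same unit."""
    return 1.0 / (1.0 / (2.0 * t1) + 1.0 / t_phi)

def gate_budget(t2, gate_time):
    """Rough number of gates that fit in one coherence time."""
    return t2 / gate_time

t1_us = 100.0     # hypothetical energy relaxation time, microseconds
t_phi_us = 200.0  # hypothetical pure dephasing time, microseconds

t2_us = t2_from(t1_us, t_phi_us)
print(t2_us)                                  # combined phase coherence time
print(t2_from(t1_us, float("inf")))           # no pure dephasing: T2 = 2 * T1
print(gate_budget(t2_us, 0.050), gate_budget(t2_us, 0.020))  # 50 ns and 20 ns gates
```

With these numbers the coherence time is 100 µs and the budget is 2000 to 5000 gates, consistent with the "few thousand" estimate. The budget ignores per-gate error, which in practice limits useful depth much earlier.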
What still causes decoherence
- Two-level systems (TLS) in amorphous oxide layers at interfaces
- Quasiparticles generated by stray infrared photons or cosmic rays
- Dielectric loss in the substrate
- Stray mode coupling in packaging
Over the past two decades, these have been identified and systematically suppressed one by one. Coherence times have improved by several orders of magnitude since the first charge qubit in 1999.
Energy is the gift; phase is the struggle
The superconducting gap specifically suppresses high-frequency dissipation: the environment has no states to absorb energy at the qubit frequency, making $T_1$ relatively long. But dephasing does not require energy exchange. Slow environmental fluctuations can scramble the relative phase without crossing the gap, so $T_2$ is harder to protect.
Platform comparison ($T_2$ typical values):
| Platform | Typical $T_2$ |
|---|---|
| Superconducting (transmon) | 50–500 µs |
| Semiconductor spin qubits | Tens of µs to milliseconds |
| Trapped ions | Seconds to tens of seconds |
| Neutral atoms | Seconds |
| Nuclear spins | Seconds to minutes |
Superconducting qubits couple strongly and operate fast — but that same strong coupling means they “hear” more noise.
$T_1$ is the gift from physics; $T_2$ is what engineers fight for.
12. Tying It All Together
Here is how every piece connects into one coherent framework:
- Complex numbers carry both a modulus and a phase → quantum amplitudes are complex, so quantum states inherently carry phase information.
- Modulus squared = probability, and the total must be 1 → this is the normalization condition and the origin of the $1/\sqrt{2}$ factor.
- Relative phase (not global phase) is physically observable → it is precisely what dephasing destroys.
- Two real angles $(\theta, \phi)$ fully describe a single-qubit state → the Bloch sphere, where phase is the azimuthal angle $\phi$.
- Measurement collapses the state randomly; a single shot carries minimal information → statistics over many runs are essential.
- Expectation values are the stable, experimentally accessible quantities — they are the Bloch sphere coordinates, the VQE energy to optimize, and the bridge between “quantum state” and “experimental number.”
- Pauli strings decompose any observable or Hamiltonian; tensor products combine single-qubit spaces into multi-qubit spaces → these are the complete language for writing “a problem” as “a measurable, optimizable $\langle H\rangle$.”
Pauli strings, tensor products, phase, decoherence — each is a link in the same chain, a different facet of the same framework.
Directions to explore next
- Bell states hands-on: Start from the tensor product, work through a specific two-qubit state all the way to its expectation values, to see the multi-qubit formalism in action.
- Quantum gates as Bloch sphere rotations: Connect each common gate’s matrix representation to its geometric meaning on the sphere.
- Stabilizer formalism and error correction: How Pauli string commutation relations lead to quantum error-correcting codes.
These notes were compiled from a series of AI-assisted learning conversations, tracing a complete thread from superconducting decoherence to the mathematical foundations of quantum algorithms.