A Conversation on Teaching Optics and the History of Linear Algebra

This document captures a discussion about pedagogical approaches to linear optics and the historical development of key concepts in linear algebra that underpin the field.

1. Rethinking the Teaching Order: Eigenfunctions Before Harmonics

User: I am teaching students about linear optics. I am considering the order of presentation. Normally, I would start with the idea of a linear system. Then I would go to harmonics as eigenfunctions of shift invariant linear systems.

But I am reconsidering. I am thinking about starting with linear systems, emphasizing that for discrete calculations a linear system is represented by a matrix. Then explaining that the eigenfunctions of the matrix are an orthonormal basis for the inputs to the system. And then saying that when the system is shift-invariant, and optics is generally shift-invariant (isoplanatic) over small regions, the harmonics are almost eigenfunctions. If we represent them using complex exponentials, they are eigenfunctions.

What do you think about the 2nd approach, where eigenfunctions come before harmonics?

I think your second approach is excellent and likely more insightful for students.

It frames the Fourier basis not as a convenient mathematical trick, but as the fundamental, natural basis for describing the behavior of shift-invariant systems. This “why” is often missing from the traditional presentation.

Evaluation of Your Proposed Approach

Strengths 👍

  1. Strong Motivation: It directly answers the question, “Why do we use sine waves and complex exponentials (harmonics) to analyze optical systems?” The answer is because they are the eigenfunctions—the special functions that don’t change their form when passing through the system. This provides a much deeper motivation than simply stating that they are a convenient orthogonal basis.
  2. Concrete to Abstract: Starting with matrices is a powerful pedagogical tool. Students in physics and engineering typically have a more concrete understanding of linear algebra (vectors and matrices) than functional analysis. Showing that eigenvectors form a special basis for a matrix operator provides a solid foundation before generalizing from discrete vectors (v) to continuous functions (f(x)).
  3. Unifies Concepts: This approach beautifully connects two major topics: linear algebra and Fourier analysis. Students can see that the Fourier transform is not an isolated tool but a specific application of eigen-decomposition for the special case of shift-invariant operators.

A Suggested Refinement for Maximum Impact

To make the connection between matrices and shift-invariance even more explicit, you can introduce the concept of a circulant matrix. This provides the perfect bridge from the discrete to the continuous.

Proposed Teaching Flow:

  1. Linear Systems and Matrices: Start as you planned. A linear system is an operator L such that L[x]=y. In the discrete, finite case, this is a matrix multiplication: Ax=y.
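  2. Eigenvectors as a Natural Basis: Explain that for a matrix A, the eigenvectors e_i are the “special” inputs. When an eigenvector is the input, the output is just a scaled version of the input: A e_i = λ_i e_i. Note that the eigenvectors are guaranteed to form an orthonormal basis only when A is normal (for example Hermitian, or circulant as in the next step).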
  3. Introduce Shift-Invariance (Discrete): Now, consider a specific type of linear system: a discrete, shift-invariant system with periodic boundary conditions. The operation is a circular convolution. Show students the matrix C that represents this operation. They will see that each row is a cyclic shift of the row above it. This is a circulant matrix.
  4. The “Aha!” Moment 💡: State the powerful theorem: The basis vectors of the Discrete Fourier Transform (DFT) are eigenvectors of every circulant matrix, and the corresponding eigenvalues are the DFT of the matrix’s first column. These basis vectors are the discrete complex exponentials. This is not a coincidence; it’s a fundamental property.
  5. Generalize to Continuous Optics: Now, make the leap from discrete to continuous. A discrete input vector becomes a continuous function, a circulant matrix becomes a convolution integral, and the DFT basis vectors become the continuous complex exponentials, which are the eigenfunctions of any continuous, shift-invariant system.
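
A classroom check of steps 3 and 4 might look like the following minimal sketch. It uses only the Python standard library, and the kernel values and the helper names (circulant, dft_vector, matvec, eigen_residual) are illustrative assumptions rather than anything from the discussion above. It builds a small circulant matrix and confirms numerically that each DFT basis vector comes out as a scaled copy of itself.

```python
import cmath

def circulant(c):
    """Return the N x N circulant matrix whose first column is c.

    Entry (j, k) is c[(j - k) % N], so each row is a cyclic shift
    of the row above it.
    """
    n = len(c)
    return [[c[(j - k) % n] for k in range(n)] for j in range(n)]

def dft_vector(m, n):
    """m-th DFT basis vector of length n: v[k] = exp(2*pi*i*m*k/n)."""
    return [cmath.exp(2j * cmath.pi * m * k / n) for k in range(n)]

def matvec(a, x):
    """Plain matrix-vector product."""
    return [sum(a_jk * x_k for a_jk, x_k in zip(row, x)) for row in a]

def eigen_residual(c, m):
    """Return (lam, err): the eigenvalue estimate and max |C v - lam v|."""
    n = len(c)
    v = dft_vector(m, n)
    cv = matvec(circulant(c), v)
    lam = cv[0] / v[0]  # v[0] == 1, so this reads off the scale factor
    err = max(abs(y - lam * x) for x, y in zip(v, cv))
    return lam, err

kernel = [4.0, 1.0, 0.0, 0.0, 0.0, 2.0]  # arbitrary illustrative blur kernel
for m in range(len(kernel)):
    lam, err = eigen_residual(kernel, m)
    # err should be at round-off level for every m
    print(m, round(abs(lam), 6), err < 1e-9)
```

Students can change the kernel freely: the eigenvalues change, but the eigenvectors do not, which is the point of the theorem.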

2. Minor Question: Circulant vs. Toeplitz Matrices

User: Minor question: Is circulant matrix widely used, or is Toeplitz more common?

Excellent question. While they are related, they serve different roles.

Toeplitz matrices are the more general and fundamental concept, but circulant matrices are more widely used in computational algorithms.

Toeplitz Matrix: The General Form

A Toeplitz matrix has constant values along each of its diagonals. It is the exact matrix representation of a linear convolution operation, which is fundamental to linear, time-invariant (LTI) systems.
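
As a small illustration, the sketch below builds the (rectangular) Toeplitz matrix for a full linear convolution and checks it against a direct convolution sum. The function names and the sample values of h and x are hypothetical, chosen only for the example.

```python
def convolution_matrix(h, n):
    """Toeplitz matrix T of shape (len(h) + n - 1, n) with T[j][k] = h[j - k].

    Entries with j - k outside range(len(h)) are zero, so every diagonal
    (fixed j - k) holds a single constant value.
    """
    m = len(h)
    return [[h[j - k] if 0 <= j - k < m else 0 for k in range(n)]
            for j in range(m + n - 1)]

def linear_convolve(h, x):
    """Direct linear convolution: y[j] = sum_k h[j - k] * x[k]."""
    y = [0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for k, xk in enumerate(x):
            y[i + k] += hi * xk
    return y

h = [1, 2, 3]       # illustrative impulse response
x = [4, 5, 6, 7]    # illustrative input signal
T = convolution_matrix(h, len(x))
y_matrix = [sum(t * xk for t, xk in zip(row, x)) for row in T]
print(y_matrix == linear_convolve(h, x))
```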

Circulant Matrix: The Computational Powerhouse

A circulant matrix is a special type of Toeplitz matrix where each row is a cyclic shift of the row above it. It represents circular convolution. Its “killer feature” is that it is diagonalized by the Discrete Fourier Transform (DFT) matrix, so a matrix-vector product can be computed in O(N log N) operations using the Fast Fourier Transform (FFT) instead of the O(N^2) of a direct product.

The Bottom Line:

  1. Generality: Toeplitz is the more general structure.
  2. Computation: Circulant matrices are the key to fast computation. In fact, the standard way to perform a fast linear convolution (a Toeplitz operation) is to zero-pad both signals to at least the length of the full output (N + M - 1 for inputs of lengths N and M) and then use circular convolution (a circulant operation) via the FFT.
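
The padding trick can be sketched as follows. This is a teaching sketch under stated assumptions, not a production routine: the recursive radix-2 fft helper and the name fast_linear_convolve are written for this example, the padded length is rounded up to a power of two purely to keep the FFT simple, and the sample inputs are arbitrary.

```python
import cmath

def fft(a, inverse=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    The inverse transform is returned unnormalized (caller divides by n).
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def fast_linear_convolve(h, x):
    """Linear convolution of real sequences via zero-padded circular convolution."""
    out_len = len(h) + len(x) - 1
    n = 1
    while n < out_len:          # pad to a power of two >= full output length
        n *= 2
    H = fft(list(h) + [0] * (n - len(h)))
    X = fft(list(x) + [0] * (n - len(x)))
    # Pointwise product in the Fourier domain = circular convolution of length n
    y = fft([a * b for a, b in zip(H, X)], inverse=True)
    return [(v / n).real for v in y[:out_len]]

print(fast_linear_convolve([1, 2, 3], [4, 5, 6, 7]))
```

Because the padded length is at least the full output length, the wrap-around terms of the circular convolution land on zeros, so the first N + M - 1 samples agree with the linear convolution up to round-off.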

3. Historical Context: Who Linked Matrices to Linear Systems?

User: Separately, historically who would you associate with the notion that a discrete linear system can be represented by a matrix? I have Grassmann for the initial framing in terms of linear algebra of multidimensional spaces. But I don’t know who really got the relationship between matrices and linear systems firmly grounded.

That’s an excellent historical question. You’ve correctly identified Hermann Grassmann as a pivotal, yet often overlooked, figure. However, the person most credited with firmly grounding the relationship between matrices as algebraic objects and linear transformations is the English mathematician Arthur Cayley.

The Key 19th-Century Figures

  1. Hermann Grassmann (1844): The Visionary Abstract Thinker
    In his work Die lineale Ausdehnungslehre, he developed the core ideas of what we now call a vector space (such as linear independence, dimension, and subspaces) and conceived of linear transformations in this general, abstract sense. However, his work was highly philosophical and largely ignored for decades.
  2. James Joseph Sylvester (1850): The Namer
    A close collaborator of Cayley, J.J. Sylvester was the first to use the term “matrix” (from the Latin for “womb”), seeing it as a generator of determinants.
  3. Arthur Cayley (1855, 1858): The Father of Matrix Algebra
    Cayley is the figure you’re looking for. In his Memoir on the Theory of Matrices (1858), he:
    • Defined Matrix Operations: Systematically defined matrix addition, scalar multiplication, and matrix multiplication.
    • Connected Matrices to Transformations: He explicitly showed that the composition of two linear transformations corresponds to the product of their representative matrices.
    • Developed an Algebra of Matrices: He introduced the identity matrix, the inverse matrix, and explored their properties, including the famous Cayley-Hamilton theorem.

In Summary: While Grassmann laid the abstract foundation, it was Arthur Cayley who firmly grounded the idea that a discrete linear system (a linear transformation) is represented by a matrix, and that the algebra of these matrices governs the system’s behavior.

User: Was Cayley well aware of Grassmann’s work and building on it?

The historical consensus is no: Cayley was almost certainly not aware of Grassmann’s work and did not build upon it. They arrived at similar conceptual territory via two very different, parallel paths.

  • Grassmann’s Path (Axiomatic and Philosophical): Grassmann was attempting to build a completely new, axiomatic foundation for geometry in n-dimensions. His work was abstract, dense, and published far from the main mathematical centers, so it remained obscure for decades.
  • Cayley’s Path (Algebraic and Concrete): Cayley was motivated by the theory of invariants and the geometry of linear transformations. He needed an efficient notation and algebra for describing how geometric objects change under transformations. His approach was to take the concrete array of numbers—the matrix—and build an operational algebra around it.

The two streams of thought were eventually unified by a later generation of mathematicians like William Kingdon Clifford and Giuseppe Peano, who recognized the power of combining Grassmann’s abstract structures with Cayley’s computational engine.