### Higher Dimensions

I want to extend the treatment we just went through to higher dimensions. In my view, this provides a very nice way of thinking about why the generalisation of the derivative to higher dimensions is what it is. It has certainly taken me a while to appreciate the definition of the derivative of a function of many variables, and I am attempting to share some of my thoughts.

In the last post we thought about graphs of functions and tangent-lines. Now, the graph of a function is initially something one thinks of as a picture of that function. In my opinion, there is a sense in which one ought to continue to do so. It’s just that now, we’ll be thinking about $\mathbb{R}^m$-valued functions f defined on an open subset U of $\mathbb{R}^n$. Recall that the graph of f is the set of points $(x, y)$ in $U \times \mathbb{R}^m$ such that f(x) = y.

What might we mean by the derivative of f?

Allow me to reiterate the ideas introduced in the previous post: For a real-valued function of a single variable, finding the derivative of f at the point x is the same problem as finding the tangent-line to the graph of f at the point (x,f(x)). For an $\mathbb{R}^m$-valued function defined on an open subset of $\mathbb{R}^n$, this tangential object is no longer a line but an n-dimensional affine subspace in $\mathbb{R}^{n+m}$ (this is a purely geometric observation; an affine subspace is a linear subspace, only translated, so it needn’t contain the origin). Either way, the tangent-object is itself the graph of some function and so finding the tangent is the same problem as determining this function. Crucially, the functions of which these tangents are the graphs are affine functions (i.e. of the form $t \mapsto At + c$, a linear transformation followed by a translation). What this means, in particular, is that they can be completely parameterized by a relatively small amount of information: the $m \times n$ entries of the matrix of $A$ and the $m$ entries of $c$. So, the main point here is that determining the derivative of f at the point x boils down to describing a particular affine function.
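To make the "small amount of information" concrete, here is a minimal sketch in plain Python. It is only an illustration: the helper name `affine` and the particular numbers in `A` and `c` are made up for this example, with n = 2 and m = 3.

```python
def affine(A, c):
    """Return the affine function t -> A t + c, with A given as a list of rows."""
    def L(t):
        return [sum(a_ij * t_j for a_ij, t_j in zip(row, t)) + c_i
                for row, c_i in zip(A, c)]
    return L

# Hypothetical data: a 3 x 2 matrix A and a vector c in R^3.
A = [[1.0, 2.0],
     [0.0, -1.0],
     [3.0, 0.5]]
c = [1.0, 0.0, -2.0]
L = affine(A, c)

# A point of the graph of L is (t, L(t)), which lives in R^(n+m) = R^5.
t = [2.0, 1.0]
graph_point = t + L(t)

# Number of real parameters that pin L down: m*n for A plus m for c.
num_parameters = len(A) * len(A[0]) + len(c)
```

The graph of `L` is a 2-dimensional affine subspace of $\mathbb{R}^5$, and it misses the origin exactly when `c` is non-zero, since `L` sends the zero vector to `c`.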

### Affinity

Let’s write $L_x(t) = At + c$ for the particular affine function we are trying to describe. Our job is made even easier because we are given one piece of information for free. Remember that the tangent passes through the point (x,f(x)), so we know that $L_x(x) = f(x)$. So, using the fact that $Ax + c = f(x)$, i.e. $c = f(x) - Ax$, we can re-write $L_x$ as $L_x(t) = A(t-x) + f(x)$. To labour this a bit more: What’s happened here is that knowing one point, namely (x,f(x)), on the tangential affine subspace is the same as knowing the value of c in the definition of the function which has that affine subspace as its graph.

All that is left to do is to determine the linear transformation A. The main point to note here is that we can now see that finding the derivative of f at x boils down to determining a particular linear transformation. This is the point of view which I think it is worth thinking hard to arrive at, because it is exactly right: The derivative of a function at a point is indeed a linear transformation.
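Here is a small numerical sketch of this in plain Python. The function `f` and the point `p` are made up for illustration. The sketch assumes the standard fact, which I have not derived in this post, that for a differentiable function the matrix of A is the matrix of partial derivatives; here that matrix is written out by hand in `jacobian`. The code builds $L_x(t) = A(t-x) + f(x)$ and measures how well it approximates f near x.

```python
import math

def f(p):
    """A hypothetical function from R^2 to R^2."""
    x, y = p
    return [x * x * y, x + math.sin(y)]

def jacobian(p):
    """Matrix of partial derivatives of f at p, computed by hand (assumed to represent A)."""
    x, y = p
    return [[2 * x * y, x * x],
            [1.0, math.cos(y)]]

def tangent_map(p):
    """Return the affine function t -> A(t - p) + f(p)."""
    A = jacobian(p)
    fp = f(p)
    def L(t):
        d = [t_i - p_i for t_i, p_i in zip(t, p)]
        return [sum(a * d_j for a, d_j in zip(row, d)) + f_i
                for row, f_i in zip(A, fp)]
    return L

def norm(v):
    return math.sqrt(sum(v_i * v_i for v_i in v))

p = [1.0, 2.0]
L = tangent_map(p)

# Compare f and L at points p + h for smaller and smaller h.
ratios = []
for k in range(1, 5):
    h = [10.0 ** -k, -(10.0 ** -k)]
    t = [p_i + h_i for p_i, h_i in zip(p, h)]
    error = norm([a - b for a, b in zip(f(t), L(t))])
    ratios.append(error / norm(h))
```

The entries of `ratios` shrink as `h` shrinks, so the gap between f and its tangent function goes to zero faster than the distance from the point itself. That is the sense in which the graph of `L` is tangent to the graph of `f`, and `L(p)` agrees with `f(p)` by construction.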

### Remarks

My question now is: Can this point of view be taught, in an effort to make the higher dimensional derivative seem less mysterious when compared with the derivative of a function of a single variable? When one learns to differentiate a function on $\mathbb{R}$, one can’t tell that the number one gets for the derivative of f at x is actually a linear transformation. In my opinion, this can lead to confusion when one is then told that the derivative of a function of many variables at a point is a linear transformation. It can seem strange, as though things get a bit weird in higher dimensions when they were OK in the original case of one dimension. It is then only years after first learning about differentiation of a single-variable function that one can go back and form a new point of view which marries the two ‘cases’.

Perhaps the idea would have to be to somehow instill the idea of finding tangent-lines to curves as a purely geometric game. This needs to be motivated somehow though, named as ‘differentiation’ and labelled as the goal. Then one could follow the route I have tried to outline, i.e. the line is the graph of an affine function; we can work out the translation part of the function, but what is the linear transformation? Is this reasonable? Presumably there is someone somewhere who learnt differentiation roughly like this and then, when moving on to higher dimensions, simply followed the same steps and understood more rapidly why one is now dealing with a matrix rather than a number.