From Average Rate of Change to Derivatives

Today the course moved past linear algebra and into average rate of change, instantaneous rate of change, and derivatives.

The first thing I revisited was the meaning of a function. A function describes a relationship where an input determines an output. If we have $y = 3x + 6$, then $y$ is determined by $x$, so we can write it as $f(x) = 3x + 6$.

The input does not have to be a single value. A function like $f(x, y) = x + 2y$ takes multiple inputs, and the same basic idea still holds. Inputs go in, and an output is determined.
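To make the "inputs go in, an output is determined" idea concrete, here is a minimal Python sketch of the two functions above. The name `g` is my own choice for the two-input function so it does not clash with `f`; it is not from the course.

```python
def f(x):
    """Single-input function from the notes: f(x) = 3x + 6."""
    return 3 * x + 6


def g(x, y):
    """Two-input function from the notes: x + 2y (named g here only to avoid a clash with f)."""
    return x + 2 * y


print(f(2))     # 12
print(g(1, 4))  # 9
```

The same inputs always give the same output, which is the property that makes these functions.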

Slope is a rate of change. In the line $y = 2x + 1$, a slope of 2 means that when $x$ increases by 1, $y$ increases by 2. If I choose two points, the slope can be calculated as the change in $y$ divided by the change in $x$.

\text{slope} = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}

One small question I had was about order. Do we have to put $x_2 - x_1$ in the denominator? The answer is no, as long as the direction is consistent. If the denominator is $x_1 - x_2$, the numerator should also be $y_1 - y_2$.

\frac{y_1 - y_2}{x_1 - x_2} = \frac{-(y_2 - y_1)}{-(x_2 - x_1)} = \frac{y_2 - y_1}{x_2 - x_1}

The important part is consistency. If only one side changes direction, the sign changes. If both sides use the same direction, the slope stays the same.
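The consistency point can be checked with a small sketch. The two points below are my own picks on the line $y = 2x + 1$, and the helper name `slope` is hypothetical.

```python
def slope(p, q):
    """Slope between points p and q, subtracting in the same direction on top and bottom."""
    (x1, y1), (x2, y2) = p, q
    return (y2 - y1) / (x2 - x1)


# Two points on the line y = 2x + 1 (chosen for illustration).
a = (1, 3)
b = (4, 9)

print(slope(a, b))  # 2.0
print(slope(b, a))  # 2.0, both differences flipped, so the signs cancel

# Mixing directions: numerator goes a -> b, denominator goes b -> a.
mixed = (b[1] - a[1]) / (a[0] - b[0])
print(mixed)  # -2.0, only one side flipped, so the sign is wrong
```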

A straight line has the same slope everywhere, but a curve does not. For a curve like $y = x^2 - 2x + 1$, the steepness changes depending on where I look. So instead of one slope for the whole curve, I need the slope at a specific point, which is the slope of the tangent line.

The problem is that slope usually requires two points. To get the slope at one point, we first take two points on the curve, calculate the average rate of change, and then move the two points closer together.

\text{average rate of change} = \frac{f(b) - f(a)}{b - a}

This looked more complicated at first, but it is the same idea as $\Delta y / \Delta x$. Since $y$ is written as $f(x)$, $f(b)$ replaces $y_2$, and $f(a)$ replaces $y_1$.
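As a quick check that this really is the two-point slope formula, here is a sketch that applies it to the curve from earlier. The function name `average_rate_of_change` and the sample intervals are my own choices.

```python
def f(x):
    """The example curve from the notes: f(x) = x^2 - 2x + 1."""
    return x**2 - 2 * x + 1


def average_rate_of_change(func, a, b):
    """(f(b) - f(a)) / (b - a): the slope of the line through (a, f(a)) and (b, f(b))."""
    return (func(b) - func(a)) / (b - a)


print(average_rate_of_change(f, 2, 5))  # 5.0
print(average_rate_of_change(f, 2, 3))  # 3.0
```

The two results differ even though both intervals start at $x = 2$, which is the sense in which a curve has no single slope.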

The course then rewrote $b$ as $a + h$.

\frac{f(a + h) - f(a)}{h}

That substitution exists so the distance between the two points can be handled directly. If $h$ represents the horizontal distance between the points, bringing the two points together becomes the same as making $h$ approach zero. As $h$ gets closer to zero, the average rate of change gets closer to the instantaneous rate of change at that point.

\lim_{h \to 0} \frac{f(a + h) - f(a)}{h}

That is the basic idea behind differentiation. The slope at one point is not pulled out of nowhere. It is the value that the average rate of change approaches as the distance between the two points shrinks toward zero.
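The shrinking process can be watched numerically. This is only a sketch: the list of $h$ values is arbitrary, the helper name `difference_quotient` is mine, and floating-point rounding means the printed values are approximate.

```python
def f(x):
    """The example curve from the notes: f(x) = x^2 - 2x + 1."""
    return x**2 - 2 * x + 1


def difference_quotient(func, a, h):
    """Average rate of change between a and a + h."""
    return (func(a + h) - func(a)) / h


# Shrink h and watch the average rate of change at a = 2 settle toward one value.
for h in [1.0, 0.1, 0.01, 0.001]:
    print(h, difference_quotient(f, 2.0, h))
```

The printed quotients get closer and closer to 2 as $h$ shrinks. Note that $h$ is never set to exactly zero, since that would divide by zero; the limit describes what the values approach.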

For $f(x) = x^2 - 2x + 1$, the difference quotient at $x = 2$ simplifies to $2 + h$, so as $h$ approaches zero the instantaneous rate of change is 2. If I keep $x$ as a variable instead of plugging in one value, the quotient simplifies to $2x - 2 + h$, and the limit is $2x - 2$. That is the derivative function, a function derived from the original function.

f(x) = x^2 - 2x + 1

f'(x) = 2x - 2

Once the derivative exists, I do not need to repeat the limit calculation every time. If I want the slope at $x = 2$, I plug 2 into $2x - 2$. If I want the slope at $x = 5$, I plug in 5. The original function tells me values by position, and the derivative tells me rates of change by position.
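Here is a small sketch of that "plug in instead of recomputing the limit" idea. The name `f_prime` and the step size `h = 1e-6` used for the numerical comparison are my own assumptions.

```python
def f(x):
    """Original function: value at each position."""
    return x**2 - 2 * x + 1


def f_prime(x):
    """Derivative from the notes: rate of change at each position."""
    return 2 * x - 2


for x in [2, 5]:
    print(x, f(x), f_prime(x))

# Rough numerical cross-check with a small, arbitrarily chosen step.
h = 1e-6
numeric_slope_at_5 = (f(5 + h) - f(5)) / h
print(numeric_slope_at_5)  # close to f_prime(5), not exactly equal
```

At $x = 2$ the value is 1 and the slope is 2; at $x = 5$ the value is 16 and the slope is 8.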

The notation also became clearer. At first, $\frac{dy}{dx}$ or $\frac{d f(x)}{dx}$ looked like a symbol that appeared out of nowhere, but it is connected to the rate-of-change notation from earlier. $\Delta x$ and $\Delta y$ describe finite changes between two separated points. When the distance between those points keeps shrinking toward zero, the notation changes from $\Delta$ to $d$, so $\frac{dy}{dx}$ represents an extremely small change in $y$ compared with an extremely small change in $x$.

So $\frac{dy}{dx}$ is not just an ordinary fraction. It is notation for asking how much $y$ changes when $x$ changes by a tiny amount. $\frac{d f(x)}{dx}$ follows the same idea. If the value of $f(x)$ is treated as $y$, then differentiating $f(x)$ with respect to $x$ means looking at the tiny change in $f(x)$ over the tiny change in $x$.

The shortcut rule was familiar: bring the power down and reduce the power by one, so $x^n$ becomes $n x^{n-1}$.

x^2 \to 2x

-2x \to -2

1 \to 0

The constant term disappears because its rate of change is zero. The power rule explains it by writing the constant as $1 \cdot x^0$, so the power 0 comes down and multiplies the term to zero, but the intuition is simpler. Differentiation measures change, and a constant does not change.
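The shortcut can be sketched term by term. Representing a polynomial as a list of `(coefficient, power)` pairs is my own choice for illustration, and `differentiate_term` is a hypothetical helper, not something from the course.

```python
def differentiate_term(coefficient, power):
    """Power rule for one term c * x^n: bring n down, reduce the power by one."""
    if power == 0:
        # A constant does not change, so its derivative is zero.
        return (0, 0)
    return (coefficient * power, power - 1)


# x^2 - 2x + 1 as (coefficient, power) pairs.
terms = [(1, 2), (-2, 1), (1, 0)]

derivative = [differentiate_term(c, n) for c, n in terms]
print(derivative)  # [(2, 1), (-2, 0), (0, 0)]
```

Read back as a polynomial, the result is $2x - 2 + 0$, which matches $f'(x) = 2x - 2$.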

It was more interesting than I expected to slow down and clean up something I had passed over vaguely before.