Vector mathematics

In vector mathematics, we take the {(x,y)} coordinate pair and enhance it to provide these additional features:

Extension into higher dimensions, for example, {(x,y,z)} in 3-dimensional space.
Displacement, which is the idea of using {(x,y)} to describe movement, instead of a fixed location.
Operators that apply to {(x,y)} as a whole, such as addition, subtraction, and various types of multiplication and “products”.

This introduction is for students who can:

use the Pythagorean theorem to find the length of a line segment,
use the equation {y = mx + b} to describe a line on the XY plane, and
understand basic trigonometry using {\sin} and {\cos} on a right triangle.

Vectors and displacement

This diagram shows movement on the XY plane from {(1,2)} to {(4,6)} along a straight line:

To measure this movement, we can split it into two directions: {3} units to the right and {4} units up. We can then combine the movements into a single pair of numbers, {(3,4)}, which describes the overall diagonal movement.

We call {(3,4)} a vector. Vectors are used to describe displacement.

“Displacement” is movement in a specific direction, for a specific distance. The movement can be either real or abstract. In physics, “displacement” often implies that a physical object is moving. In mathematics, it often just describes a change in position.

When performing a displacement, the horizontal and vertical movements occur simultaneously, and we assume that the resulting movement occurs in a straight line.

You can view a vector in two ways:

The geometric view:

Since a vector describes movement, it’s natural to draw it as an arrow. The movement starts at the tail of the arrow, and ends at the tip of the arrow. (The tip end is also called the head end.)

The arrow is drawn as a straight line segment that contains an arrowhead at the tip end. The arrow points in a specific direction, and has a specific length.

The algebra view:

In algebra, we write a vector as a list of numbers surrounded by parentheses, for example: {(3,4).} The individual numbers {3} and {4} are called the components of the vector.

The number of components in a vector is called its dimension. For example, the vector {(8,-5,2)} is a {3}-dimensional vector.

Mathematical operators can be applied to a vector as a whole, so for example, we can write: {(1,2) + (3,4),} which is the algebra description of the movement shown in the above diagram.

Vector algebra

Here are some examples of vectors at work in algebra:

{(9,\, -2,\ 0,\ 6.25,\ {1 \over 3})}	a vector is an ordered list of real numbers
{(2x,\ y+3)}	it can contain expressions that evaluate to real numbers
{( \, )}	this is not a vector; a vector must contain at least one number
{(3,4) = (3,4)}	vectors can be compared for equality
{(3,4) \ne (4,3)}	order matters
{(3,4) \ne (3,4,0)}	equal vectors must have the same dimension
{(x,y) = (3,4)}	this defines {x=3} and {y=4} using vector notation
{\mathbf{a} = (3,4)}	a variable can represent a vector (usually in boldface)
{\mathbf{b} = (-1,7)}	a variable can represent a vector
{\mathbf{a} + \mathbf{b} = (2,11)}	vector addition is component-wise addition
{\mathbf{a} - \mathbf{b} = (4,-3)}	vector subtraction is component-wise subtraction
{\mathbf{a} _1 = 3, \ \ \mathbf{a} _2 = 4}	the individual components of a vector can be extracted
{2 \mathbf{a} = (6,8)}	scalar multiplication: distributes over all the components
{\mathbf{a}/2 = (1.5,2)}	scalar division: distributes over all the components
{- \mathbf{a} = (-3,-4)}	negation: distributes over all the components
{1/\mathbf{a}}	is undefined
{\mathbf{a} \mathbf{b}}	is undefined
{\mathbf{a} / \mathbf{b}}	is undefined
{\mathbf{a} ^ {\large 2}}	is undefined
{\mathbf{a} < \mathbf{b}}	is undefined, vectors themselves are not ordered
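
The rules in this table can be tried out in code. The following is a minimal Python sketch, assuming a vector is stored as a plain tuple of numbers. The helper names `add`, `sub`, and `scale` are illustrative choices for this example, not standard functions.

```python
def add(a, b):
    """Component-wise addition of two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    """Component-wise subtraction of two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return tuple(x - y for x, y in zip(a, b))


def scale(k, a):
    """Scalar multiplication: k is applied to every component of a."""
    return tuple(k * x for x in a)


a = (3, 4)
b = (-1, 7)

print(add(a, b))        # (2, 11)
print(sub(a, b))        # (4, -3)
print(scale(2, a))      # (6, 8)
print(scale(-1, a))     # (-3, -4)
print(a == (3, 4))      # True
print(a == (4, 3))      # False, because order matters
print(a == (3, 4, 0))   # False, because the dimensions differ
```

Helpers are needed here because Python's built-in `+` on tuples joins them end to end instead of adding their components.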

Length

The length of a vector is the length of its arrow line.

The length of the vector {(3,4)} is {5,} which we can calculate using the Pythagorean theorem:

{\sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5}

We use vertical bars around a vector to indicate its length. For example, if {\mathbf{a} = (3,4),} we can say:

{ |\mathbf{a}| = \sqrt{3^2 + 4^2} = 5}

In general, the length of a {2}-dimensional vector is:

{|\mathbf{a}| = \sqrt{\mathbf{a} _1^2 + \mathbf{a} _2^2}}

The Pythagorean theorem can be extended to calculate vector length in all dimensions:

{|\mathbf{a}| = \sqrt{\mathbf{a} _1^2 + \mathbf{a} _2^2 + \mathbf{a} _3^2 + \mathbf{a} _4^2 + \text{···} + \mathbf{a} _n^2} \quad} (for an {n}-dimensional vector)

Length is like absolute value — it’s never negative:

{|\mathbf{a}| \ge 0 \quad} for all {\mathbf{a}}

The “magnitude” of a vector is the same thing as its length. The two words can be used interchangeably.
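
As a rough illustration of the general length formula, here is a short Python sketch. It assumes vectors are stored as tuples, and the function name `length` and the 3-dimensional sample vector {(1,2,2)} are chosen only for this example.

```python
import math


def length(a):
    """Length (magnitude) of a vector of any dimension."""
    return math.sqrt(sum(x * x for x in a))


print(length((3, 4)))     # 5.0
print(length((6, 8)))     # 10.0
print(length((1, 2, 2)))  # 3.0
```

Because every component is squared before the sum is taken, the result can never be negative.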

Scaling

In vector terminology, a “scalar” is a real number that’s multiplied by a vector.

If you take a vector and perform scalar multiplication with the scalar {k}, then the length of the resulting vector will be multiplied by {|k|.} (The absolute value is needed because length is never negative, even when {k} is.) This is called “scaling” a vector.

For example, if we’re given the vector {\mathbf{a} = (3,4)} where {|\mathbf{a}| = 5,} we can scale it by {2,} which doubles the length to {10}:

{\mathbf{a} = (3, 4)}

{|\mathbf{a}| = |(3, 4)| = \sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5}

{2 \mathbf{a} = 2(3, 4) = (6, 8)}

{|2 \mathbf{a}| = |(6, 8)| = \sqrt{6^2 + 8^2} = \sqrt{36 + 64} = \sqrt{100} = 10}

The vector {2 \mathbf{a}} points in the same direction as {\mathbf{a}.}

When you scale a vector by a positive scalar, the tip of the resulting vector is moved forward or backward along the line of the arrow, leaving its direction unchanged.
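
The effect of scaling on length can be checked numerically. The sketch below assumes tuple vectors and uses the illustrative helper names `length` and `scale`. It compares the length of {k \mathbf{a}} with the absolute value of {k} times the length of {\mathbf{a}}, so that negative scalars are covered as well.

```python
import math


def length(a):
    """Length of a vector of any dimension."""
    return math.sqrt(sum(x * x for x in a))


def scale(k, a):
    """Scalar multiplication applied to every component."""
    return tuple(k * x for x in a)


a = (3, 4)
for k in [2, 0.5, -1, -3]:
    scaled = scale(k, a)
    # The two printed lengths should agree for every k.
    print(k, scaled, length(scaled), abs(k) * length(a))
```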

Direction

To measure the direction of a vector’s arrow, we use a special type of vector called a unit vector that points in the same direction, but has a length of {1.}

All vectors of length {1} are unit vectors, and vice versa.

Unit vectors are used to standardize the way we measure direction. Every direction is described by exactly one unit vector.

For example, the vectors {(3,4),} {(9,12),} and {(15,20)} all point in the same direction, which is expressed as the unit vector {(0.6, 0.8).}

You can verify that all four of those vectors are parallel by calculating their slopes, which are all {m = {\large 4 \over \large 3}.} However, slope is an imperfect way to measure direction, because it’s incapable of distinguishing between directions that are {180°} apart, and it can’t represent vertical directions. Also, the way we calculate slope on the {2}-dimensional plane cannot be directly extended to {3}-dimensional space. To overcome these deficiencies, we measure direction with unit vectors instead.

As a special case, the direction of the vector {(0,0)} is undefined. That’s because you can’t determine which direction you’re moving if you’re not moving. The vector {(0,0)} is drawn as a point with no arrowhead.

Creating a unit vector

We can put the symbol {{\LARGE{\hat{\ }}}} over a vector’s name to describe the unit vector that points in the same direction. For example, if {\mathbf{a} = (3,4),} we can say:

{{\LARGE{\hat{\normalsize \mathbf{a}}}} = (0.6, 0.8)}

The symbol {{\LARGE{\hat{\ }}}} is called the hat symbol, and the name {{\LARGE{\hat{\normalsize \mathbf{a}}}}} is pronounced “{\mathbf{a}} hat”.

In general, we define the unit vector {{\LARGE{\hat{\normalsize \mathbf{a}}}}} as:

[[{\LARGE{\hat{\normalsize \mathbf{a}}}} = {\mathbf{a} \over |\mathbf{a}|} \quad (|\mathbf{a}| \ne 0)]]

So if {\mathbf{a} = (3,4),} we can calculate {{\LARGE{\hat{\normalsize \mathbf{a}}}}} as:

[[{\LARGE{\hat{\normalsize \mathbf{a}}}} \ = \ {\mathbf{a} \over |\mathbf{a}|} \ = \ {{(3,4)} \over {\sqrt{3^2 + 4^2}}} \ = \ {{(3,4)} \over {5} } \ = \ \left( {3 \over 5}, {4 \over 5} \right) \ = \ (0.6, 0.8) ]]

To verify that {(0.6, 0.8)} is a unit vector, we can show that its length is {1}:

[[ |(0.6, 0.8)| \ = \ \sqrt{0.6^2 + 0.8^2} \ = \ \sqrt{0.36 + 0.64} \ = \ \sqrt{1} \ = \ 1 ]]

The components of a unit vector are always between {-1} and {1} (inclusive).
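
The unit vector calculation can be sketched in Python as follows. This assumes tuple vectors, and the names `length` and `unit` are illustrative. Raising an error for the zero vector is one possible way to reflect the condition {|\mathbf{a}| \ne 0}.

```python
import math


def length(a):
    """Length of a vector of any dimension."""
    return math.sqrt(sum(x * x for x in a))


def unit(a):
    """Unit vector pointing in the same direction as a."""
    n = length(a)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(x / n for x in a)


# All three vectors point in the same direction,
# so they share the same unit vector.
for v in [(3, 4), (9, 12), (15, 20)]:
    print(unit(v))  # (0.6, 0.8)
```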

Opposite direction

If you take a vector and perform scalar multiplication with a negative number, the resulting vector will point in the opposite direction.

For example, if we’re given the vector {\mathbf{a} = (3,4),} we can multiply it by {-1} to define another vector of the same length that points in the opposite direction:

{- \mathbf{a} = -(3, 4) = (-3, -4)}

The vectors {-\mathbf{a}} and {\mathbf{a}} are parallel, but their arrowheads are on opposite ends.

Position vectors

It’s interesting to observe that the notation {\text{“}(3,4)\text{”}} is used for two different purposes:

as a displacement, describing the movement from one point to another
as coordinates, describing a fixed point that does not move

At first glance, these two purposes seem contradictory. Isn’t it inconsistent to use the same notation for both movement and something that never moves?

We can resolve this by defining a special type of vector for representing fixed points. It’s called a position vector:

A position vector is a vector with its tail located at the origin, and a tip that describes a fixed point.

For example, let’s say that we’re given a fixed point with coordinates {(3,4).} If we also draw a position vector that starts at the origin and moves a displacement of {(3,4),} it will arrive at that same fixed point. Here’s a diagram showing both the point and the vector:

This shows that a fixed point is exactly the same thing as the tip of a position vector.

When we use a position vector, we’re free to ignore its displacement arrow, and to focus just on its tip, whenever it’s convenient to do so.

Sometimes, however, the displacement arrow is useful. We’ll see an example of this in the next section.

Notice that the arrowhead of a position vector always points away from the origin (except for {(0,0),} which doesn’t have an arrowhead).

Viewing all points as position vectors

In vector mathematics, some things are easier to visualize if you imagine that every point on the XY plane is associated with a straight line that connects it to the origin.

For example, it turns out that {(3,4)} and {(8,-6)} are perpendicular to each other. That’s hard to see if you think of {(3,4)} and {(8,-6)} as just two individual points on the plane. But if you view them as position vectors, then it’s easy to see that their arrows intersect at a 90° angle:

There’s a natural one-to-one correspondence between every point on the plane and every vector arrow that could possibly emanate from the origin. It’s helpful to be able to seamlessly switch your perspective back and forth between the “point view” and the “arrow view”, depending on the situation.

The tail of a vector

The tail of a vector is always assumed to be at the origin, unless otherwise specified. Therefore, every vector is assumed to be a position vector unless you’re also given the position of its tail.

For example, if someone describes a vector as {\text{“}(3,4)} starting from {(1,2)\text{”},} then it’s clear that {(3,4)} is not being used as a position vector, but instead is being used to show displacement from the tip of {(1,2).}

However, if they describe a vector only as {\text{“}(3,4)\text{”}} then you must assume that its tail is at the origin, which makes it a position vector.

If a vector is being used to show displacement from any point other than the origin, then we can call it a displacement vector to clarify that it’s not a position vector.

Vector addition geometry

Given these two vectors:

{\mathbf{a} = (3,4)}
{\mathbf{b} = (5,-1)}

you can add them to obtain:

{\mathbf{a} + \mathbf{b} = (8,3)}

Vector addition describes the act of “chaining” displacements together, end to end, effectively combining them together into a single displacement.

Here’s the geometric view of {\mathbf{a} + \mathbf{b}}:

To understand this diagram:

Start at the origin.
Perform the displacement {\mathbf{a},} arriving at the tip of {\mathbf{a}.}
From there, perform another displacement {\mathbf{b}} (shown in gray).
You have now arrived at the tip of {\mathbf{a} + \mathbf{b}.}
You could also have arrived at that same point by starting at the origin and simply performing a single displacement of {\mathbf{a} + \mathbf{b}} (shown as the longest vector).

This diagram shows two different kinds of vectors:

The black arrows are position vectors. Their tails are located at the origin.
The gray arrow is a translated vector. Its tail is not located at the origin.

To “translate” a vector means to move its arrow elsewhere without changing either the length or the direction of the arrow. This is accomplished by applying the same displacement to both its tail and tip.

The diagram shows {\mathbf{b}} being translated up to the tip of {\mathbf{a}.} During the translation, the color of {\mathbf{b}} turns from black to gray. After the translation, the tip of the gray vector defines the tip of {\mathbf{a} + \mathbf{b}.}

We can call the gray vector “{\mathbf{b}} translated to the tip of {\mathbf{a}}”, or “{\mathbf{b}} starting from {\mathbf{a}}”.

Notice that the tail of the vector {\mathbf{a} + \mathbf{b}} is at the origin. That’s because a vector’s tail is always at the origin unless otherwise specified.

It would also be possible to arrive at the tip of {\mathbf{a} + \mathbf{b}} by translating the other vector — that is, by translating {\mathbf{a}} onto the tip of {\mathbf{b}.} The final location is the same regardless of which vector is translated.
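
The idea of chaining displacements can be sketched in Python. The helper names `add` and `chain` are hypothetical, and vectors are assumed to be tuples. The sketch starts at the origin, applies each displacement in turn, and shows that the order of the two displacements does not change the final position.

```python
def add(a, b):
    """Component-wise addition of two vectors of the same dimension."""
    return tuple(x + y for x, y in zip(a, b))


def chain(start, displacements):
    """Apply each displacement in turn, beginning at the start point."""
    position = start
    for d in displacements:
        position = add(position, d)
    return position


a = (3, 4)
b = (5, -1)

print(chain((0, 0), [a, b]))  # (8, 3)
print(chain((0, 0), [b, a]))  # (8, 3)
```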

Vector subtraction geometry

Given these two vectors:

{\mathbf{a} = (3,4)}
{\mathbf{b} = (8,3)}

you can subtract the first from the second to obtain:

{\mathbf{b} - \mathbf{a} = (5,-1)}

Vector subtraction is used to find the displacement from one vector to another.

Specifically, if you subtract {\mathbf{a}} from {\mathbf{b}}, the result is a vector that shows the displacement from the tip of {\mathbf{a}} to the tip of {\mathbf{b}.}

Notice that the subtraction is specified as {\mathbf{b} - \mathbf{a}}, so the resulting displacement is in the reverse order: from {\mathbf{a}} to {\mathbf{b}.}

Here’s the geometric view of {\mathbf{b} - \mathbf{a}}:

To understand this diagram:

Start at the tip of {\mathbf{b}.}
Move toward the tip of {\mathbf{a},} drawing a straight line as you go (shown in gray).
As you move toward {\mathbf{a}}, view it as “backward” movement, because it’s subtraction.
Once you arrive at {\mathbf{a},} the gray line shows the “forward” displacement back to {\mathbf{b}.}
That “forward” displacement is {\mathbf{b} - \mathbf{a},} shown in gray as “{\mathbf{b} - \mathbf{a}} starting from {\mathbf{a}}”.
Starting at {\mathbf{a}} and following the gray arrow forward, we find that {\mathbf{a} + (\mathbf{b} - \mathbf{a}) = \mathbf{b}.}

If you follow these steps, you will draw {\mathbf{b} - \mathbf{a}} as a translated vector. Drawing it as a translated vector usually makes the diagram more understandable. Notice that the diagram also shows {\mathbf{b} - \mathbf{a}} as a position vector (in black), but the geometric meaning of its position is not as obvious.

To determine the displacement from a start point to an end point, we always do the subtraction in this order:

end {-} start {=} displacement

This makes sense, because if you add start to both sides of the equation, you get:

end {=} start {+} displacement

which describes exactly how displacement works.

Notice that either {|\mathbf{b} - \mathbf{a}|} or {|\mathbf{a} - \mathbf{b}|} gives you the distance between the tips of {\mathbf{a}} and {\mathbf{b}.}

Vector subtraction only works if the start and end vectors have the same tail position.
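
Here is a small Python sketch of the "end minus start" rule, assuming tuple vectors and the illustrative helper names `add`, `sub`, and `length`. It also checks that the distance between the two tips is the same whichever way the subtraction is done.

```python
import math


def add(a, b):
    """Component-wise addition."""
    return tuple(x + y for x, y in zip(a, b))


def sub(end, start):
    """Displacement from the tip of start to the tip of end."""
    return tuple(e - s for e, s in zip(end, start))


def length(a):
    """Length of a vector of any dimension."""
    return math.sqrt(sum(x * x for x in a))


a = (3, 4)
b = (8, 3)

d = sub(b, a)
print(d)                                       # (5, -1)
print(add(a, d) == b)                          # True: start + displacement = end
print(length(sub(b, a)) == length(sub(a, b)))  # True: same distance either way
```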

Vector exercise

Let’s solve the following problem:

An object is currently located at {(7,1).} It will be moved to a new location that’s twice as far away as it currently is from the location {(5,0).}
The movement will occur entirely along the straight line that connects {(5,0)} and {(7,1).}
Where will the new location of the object be?

Let’s call {(5,0)} the “base”.

First, we can find the displacement from the base to the object’s original position:

{(7,1) - (5,0) \ = \ (2,1)}

Then multiply the displacement by {2} to obtain the new displacement from the base:

{2(2,1) \ = \ (4,2)}

Then apply the new displacement to the base to get the answer:

{(5,0) + (4,2) \ = \ (9,2)}

Optionally, we can combine all of these operators together into a single expression:

{(5,0) + 2((7,1) - (5,0)) \ = \ (9,2) }

Let’s take this solution and generalize it with variables:

{\mathbf{a} + k(\mathbf{b} - \mathbf{a})}

where:

{\mathbf{a}} is the “base”
{\mathbf{b}} is the current location of the object
the object is moved {k} times as far from the base as it currently is

Notice the following:

if {k > 1,} the object moves farther away from the base
if {0 < k < 1,} the object moves closer to the base
if {k < 0,} the object moves to the “other side” of the base
if {k = 1} or {-1,} the distance between the object and the base remains the same
if {k = 0,} the object moves to the base

Now that the problem is solved using variables, it will work for any two points in a space of any dimension.
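
The generalized expression {\mathbf{a} + k(\mathbf{b} - \mathbf{a})} translates directly into code. The function name `move_from_base` below is made up for this sketch, and vectors are assumed to be tuples of equal dimension.

```python
def move_from_base(a, b, k):
    """Return a + k(b - a): the object at b moved to k times its
    current displacement from the base a."""
    return tuple(ai + k * (bi - ai) for ai, bi in zip(a, b))


base = (5, 0)
obj = (7, 1)

print(move_from_base(base, obj, 2))    # (9, 2)
print(move_from_base(base, obj, 1))    # (7, 1), unchanged
print(move_from_base(base, obj, 0))    # (5, 0), moved onto the base
print(move_from_base(base, obj, -1))   # (3, -1), the other side of the base
```

Because the function works one component at a time, the same call also accepts vectors with three or more components.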

Vector summary and review

Here’s a brief summary and review of the basic concepts in vector mathematics.

In vector mathematics, we take the {(x, y)} coordinate pair and enhance it with operators that act on {(x, y)} as a whole, for example:

vector addition: { (3, 4) + (-1, 7) = (2, 11) }
vector subtraction: { (3, 4) - (-1, 7) = (4, -3) }
scalar multiplication: { 2(3, 4) = (6, 8) }
scalar division: { (3, 4)/2 = (1.5, 2) }
negation: { -(3, 4) = (-3, -4) }

To graph a vector, draw a straight-line arrow that connects its tail point to its tip point. The arrow shows the displacement (movement) from the tail to the tip.
The displacement is broken into separate horizontal and vertical movements, {x} and {y}, which are then combined to create the vector {(x, y).} Both movements occur simultaneously, in a straight line.
If {\mathbf{a} = (x, y),} we can extract its individual components using {\mathbf{a} _1 = x} and {\mathbf{a} _2 = y.}
The number of components in a vector is called its dimension.
If {\mathbf{a} = (x, y),} then {|\mathbf{a}|} is the length of the vector {\mathbf{a}}, and is calculated as {\sqrt{x^2 + y^2}.}
If {\mathbf{a} = (x, y),} then {{\LARGE{\hat{\normalsize \mathbf{a}}}}} {(\text{“} \mathbf{a}} hat{\text{”})} is the direction of the vector {\mathbf{a}}. The vector {{\LARGE{\hat{\normalsize \mathbf{a}}}}} is a unit vector of length {1.} It points in the same direction as {\mathbf{a},} and it’s defined as:

[[ {\LARGE{\hat{\normalsize \mathbf{a}}}} = {{\mathbf{a}} \over {|\mathbf{a}|}} ]]
A position vector is a vector whose tail is at the origin, and whose tip describes a fixed point. Every point on the plane has a corresponding position vector that points to it from the origin. All vectors are assumed to be position vectors unless otherwise specified.
If a vector’s tail is not at the origin, then you need to describe it using a phrase like {\text{“}\mathbf{a}} starting from {\mathbf{b}\text{”}}, or {\text{“}\mathbf{a}} translated to the tip of {\mathbf{b}\text{”}}, to indicate that vector {\mathbf{a}}’s tail is at the tip of vector {\mathbf{b}.}
Vector addition is used to “chain” displacements together, end to end, combining them together into a single displacement. For example, the vector {\text{“}\mathbf{a}} starting from {\mathbf{b}\text{”}} has its tip at {\mathbf{b} + \mathbf{a}.}
Vector subtraction is used to find the displacement from the tip of one vector to the tip of another vector. For example, the displacement from {\mathbf{a}} to {\mathbf{b}} is {\mathbf{b} - \mathbf{a}.}
The scalar multiplication {k \mathbf{a}} results in a vector that’s {|k|} times as long as {\mathbf{a}.} It points in the same direction as {\mathbf{a}} if {k > 0,} or in the opposite direction if {k < 0.}

Straight line: introduction

It’s common to describe a straight line on the XY plane this way:

{y = mx + b}

However, there’s an alternative way to describe a straight line, using the vector form:

{\mathbf{a} + t \mathbf{d}}

where:

{\mathbf{a}} is the position vector of any point on the line
{\mathbf{d}} is a vector that points in the direction of the line (either direction, any length {\ne 0})
{t} varies across all the real numbers (just like {x} does when drawing {y = mx + b})

For example, {y = 2x + 3} can be expressed in vector form as:

{(0,3) + t(1,2)}

More generally, {y = mx + b} can be expressed in vector form as:

{(0, b) + t(1, m)}

Notice that in both of these cases:

the left side of the {\text{“}\mathord{+}\text{”}} is the position vector of the y-intercept, and
the right side of the {\text{“}\mathord{+}\text{”}} is a vector whose slope defines the line’s slope.
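
To see the two forms agree, the following Python sketch generates a few points from the vector form {(0,3) + t(1,2)} and checks each one against {y = 2x + 3.} The function name `point_on_line` and the sample values of {t} are choices made for this example.

```python
def point_on_line(a, d, t):
    """Return the point a + t*d on the line through a with direction d."""
    return tuple(ai + t * di for ai, di in zip(a, d))


a = (0, 3)   # position vector of the y-intercept
d = (1, 2)   # direction vector, whose slope is 2

for t in [-1, 0, 1, 2]:
    x, y = point_on_line(a, d, t)
    # The last value printed is True when the point satisfies y = 2x + 3.
    print(t, (x, y), y == 2 * x + 3)
```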

Straight line: vertical

The equation {y = mx + b} is often used to describe lines on the XY plane. It has a limitation, though, because it can’t be used to describe a vertical line.

Fortunately, it’s easy to describe a vertical line in vector form. For example, here’s how to describe the vertical line at {x = 4}:

{(4, 0) + t(0, 1)}

This can be algebraically simplified to:

{(4, t)}

(Remember that {t} varies across all the real numbers as the line is being drawn.)

If you like, you can be more explicit about the fact that the vertical line is being drawn onto the XY plane by writing it like this:

{(x, y) = (4, t)}

Now, it’s clearer that the {x} coordinate of the line is fixed at {x = 4,} and its {y} coordinate varies across all the real numbers, since {y = t} while {t} is varying.

Generalizing to all lines on the plane, you can write this:

{(x, y) = \mathbf{a} + t \mathbf{d}}

The inclusion of that extra {\text{“}(x, y)\ \mathord{=}\text{”}} notation is optional, and it does not change the vector mathematics at all. It simply shows that we’re interpreting the right side of the equation as a position on the XY plane.

Straight line: a detailed analysis

Let’s do a detailed analysis of a straight line in vector form:

{(x, y) = \mathbf{a} + t \mathbf{d}}

where:

{\mathbf{a} = (3,4)}
{\mathbf{d} = (5,2)}

Here’s the geometric view of this line (shown in gray):

First, let’s analyze this from the vector perspective:

{\mathbf{a}} is the position vector of any point on the line; here we chose {(3,4)}
{\mathbf{d}} is a vector that points in the direction of the line (either direction, any length {\ne 0})
{t} varies across all the real numbers (just like {x} does when drawing {y = mx + b})
each {t \mathbf{d}} is a new vector with a different length than {\mathbf{d}}, but the same slope
as {t} changes, the tip of {t \mathbf{d}} moves, generating a straight line as it goes
adding {\mathbf{a}} to {t \mathbf{d}} effectively translates every {t \mathbf{d}} vector by a displacement of {\mathbf{a}}
this causes all the {t \mathbf{d}} vectors to start from {\mathbf{a}} (that is, their tails are all moved to {\mathbf{a}})
which causes the whole {t \mathbf{d}} line to translate so that it ends up passing through {\mathbf{a}}
the whole translated line is shown in gray, and can be called “{t \mathbf{d}} starting from {\mathbf{a}}”
the gray line is parallel to position vector {\mathbf{d},} verifying that {\mathbf{d}} defines the direction
{|\mathbf{d}|} is the distance moved when {t} advances to {t+1} during drawing
so {|\mathbf{d}|} essentially serves as the “scale” of the line by “magnifying the change in {t}”
{\mathbf{a}} is essentially the “zero” of the line, because {\mathbf{a}} is the point plotted when {t = 0}

And now let’s analyze this from the function and graphing perspectives:

the 2D linear family of functions can be defined in vector form as: {(x, y) = \mathbf{a} + t \mathbf{d}}
to define one specific function, you must first assign specific values to {\mathbf{a}} and {\mathbf{d}}
so you set {\mathbf{a}} and {\mathbf{d}} to constant values (just like you do for {m} and {b} in {y = mx + b})
{t} is a real number that you plug into the function {\mathbf{a} + t \mathbf{d}}
so, {t} plays the role of the independent variable (or “input variable”) of the function
the function takes {t} as input and produces {\mathbf{a} + t \mathbf{d}} as its output, which is a vector
{x} and {y} are defined to represent the two components of that output vector
so, {x} and {y} are the dependent variables (or “output variables”) of the function
{x} and {y} are combined to make the vector {(x,y),} to show a position on the XY plane
you can draw this by plotting a dot at the tip of the position vector {(x,y)}
you allow {t} to vary across all the real numbers, plotting dots as you go
together, all the plotted dots create the graph of the straight line {(x, y) = \mathbf{a} + t \mathbf{d}}

Recall that vectors can have more than {2} dimensions. So if you define {\mathbf{a}} and {\mathbf{d}} to be 3D vectors, it only requires a trivial change to describe a line in 3D space:

{(x, y, z) = \mathbf{a} + t \mathbf{d} \quad} (for 3D vectors {\mathbf{a}} and {\mathbf{d}})

We can extend this to even higher dimensions, which allows us to describe any straight line in a space of any dimension.

As you can see, using {\mathbf{a} + t \mathbf{d}} to describe a line can be very concise, and it’s more general than {y = mx + b,} because it also handles vertical lines and lines in spaces of any dimension.

Straight line: converting back to slope-intercept form

If we’re given a line on the XY plane that’s described in vector form:

{(x, y) = \mathbf{a} + t \mathbf{d}}

it’s instructive to see how to convert it back to the slope-intercept form:

{y = mx + b}

Doing this conversion is a good algebra exercise, because it requires you to juggle seven different variables at the same time, along with some vector handling.

Here are the steps to do this conversion:

1. Split the vector into its two components:

{x = {\mathbf{a} _1} + t \mathbf{d} _1 }
{y = {\mathbf{a} _2} + t \mathbf{d} _2 }

2. We already know that {\mathbf{d}} determines the slope of the line {m}:

[[m = {{\mathbf{d} _2} \over {\mathbf{d} _1}} \quad ({\mathbf{d} _1} \ne 0) ]]

3. The y-intercept is {b,} which you can find by setting {x = 0} and solving for {y.} But first, you’ll need to know which value of {t} produces {x = 0}:

{x = {\mathbf{a} _1} + t \mathbf{d} _1 } (repeated from step 1 above)
{0 = {\mathbf{a} _1} + t \mathbf{d} _1 } (set {x = 0} for the y-intercept)
{-{\mathbf{a} _1} = t \mathbf{d} _1 } (subtract {\mathbf{a} _1} from both sides)
[[ {{-{\mathbf{a} _1}} \over {\mathbf{d} _1}} = t ]] (divide both sides by {\mathbf{d} _1} to find {t})

4. Now that you know which value of {t} produces {x = 0,} you can find the corresponding value for the y-intercept {b}:

{y = mx + b} (given)
{y = b } (set {x = 0} for the y-intercept and simplify)
{y = {\mathbf{a} _2} + t \mathbf{d} _2 } (repeated from step 1 above)
[[y = {\mathbf{a} _2} + {{-{\mathbf{a} _1}} \over {\mathbf{d} _1}} \mathbf{d} _2 ]] (substitute [[t = {{-{\mathbf{a} _1}} \over {\mathbf{d} _1}} ]] from step 3 above)
{y = {\mathbf{a} _2} - m {\mathbf{a} _1} } (simplify [[{\mathbf{d} _2} \over {\mathbf{d} _1}]] to {m,} from step 2 above)
{b = {\mathbf{a} _2} - m {\mathbf{a} _1} } (substitute {y = b} from above)

So the result of the conversion is:

{y = mx + b}

where:

[[m = {{\mathbf{d} _2} \over {\mathbf{d} _1}} \quad ({\mathbf{d} _1} \ne 0) ]]
[[b = {\mathbf{a} _2} - m {\mathbf{a} _1} ]]

For more practice, try converting back and forth between the vector form and the standard form of the linear equation, which is:

{Ax + By = C \quad} (for constants {A,} {B,} and {C})
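
The conversion to slope-intercept form can be sketched as a small Python function. The name `to_slope_intercept` is hypothetical, the vectors are assumed to be 2-dimensional tuples, and raising an error for a vertical line is just one way to handle the case {\mathbf{d} _1 = 0.}

```python
def to_slope_intercept(a, d):
    """Convert the line (x, y) = a + t*d into the pair (m, b)
    for y = m*x + b. Only works when d[0] is not zero."""
    a1, a2 = a
    d1, d2 = d
    if d1 == 0:
        raise ValueError("vertical line: the slope is undefined")
    m = d2 / d1        # the slope comes from the direction vector
    b = a2 - m * a1    # the y-intercept, as derived above
    return m, b


print(to_slope_intercept((0, 3), (1, 2)))  # (2.0, 3.0)

# Check the line a = (3, 4), d = (5, 2) at a few values of t.
m, b = to_slope_intercept((3, 4), (5, 2))
for t in [-2, 0, 1.5]:
    x, y = 3 + t * 5, 4 + t * 2
    # The difference should be zero, up to floating-point rounding.
    print(t, abs(y - (m * x + b)) < 1e-9)
```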

Parametric functions

We now have two different ways to describe a line:

{y = mx + b}

{(x, y) = \mathbf{a} + t \mathbf{d}}

In the first function, {y} is dependent on {x.} This means that we must first determine a value for {x} before we can determine the corresponding value of {y.}

In the second function, both {x} and {y} are dependent on {t.} This means that we must first determine a value for {t,} then, once we do, we can determine the corresponding values of both {x} and {y.}

The second approach is the way we define functions in vector mathematics. When using vectors, our goal is to always keep {x} and {y} packaged together into the vector {(x,y),} and to define both at the same time, as a pair. We don’t consider either {x} or {y} to be dependent on the other — instead, we make both {x} and {y} dependent on a new variable named {t.}

The name {\text{“}t\text{”}} was chosen because it’s often helpful to think of it as time. So, for example, an increase in {t} could be seen as moving forward in time.

When a function is defined using this approach, it’s called a parametric function, and {t} is called the parameter of the function. If it’s expressed as an equation (like above), it can also be called a parametric equation.

A function like {y = mx + b} is called a real function. This type of function takes one real number as “input” (called {x} here) and produces one real number as “output” (called {y} here).

In contrast, a parametric function lifts those restrictions. It can produce different types of outputs, such as vectors, matrices, or complex numbers, and it can even allow multiple “inputs” as well.

Circle

Let’s see how we can describe a circle on the XY plane using vectors.

To start, observe that if {r} represents the distance between the origin and any point {(x,y)} on the plane, then those three variables satisfy the Pythagorean theorem:

{x^2 + y^2 = r^2}

To describe a circle, we can set {r} to a constant value, and let {x} and {y} vary. The resulting collection of all the {(x,y)} points that satisfy the equation will form a circle of radius {r} that’s centered at the origin.

It’s common to solve this equation for {y} to obtain two real functions that produce a circle when graphed together on the XY plane:

{y = \phantom{-} \sqrt{r^2 - x^2}} (the top half of the circle only)

{y = - \sqrt{r^2 - x^2}} (the bottom half of the circle only)

However, in vector mathematics, we prefer to use parametric functions, so that we can keep {x} and {y} together as a vector. This can be done by introducing a new variable {t} and rearranging things so that the expressions appear inside of vectors, like this:

{(x,y) = (t, \phantom{-} \sqrt{r^2 - t^2})} (the top half of the circle only)

{(x,y) = (t, - \sqrt{r^2 - t^2})} (the bottom half of the circle only)

(Here {t} varies only from {-r} to {r,} because the square root has no real value outside that range.)

This works, but it’s not the most concise way to describe a circle with vectors. Fortunately, there’s a single parametric function that elegantly describes the entire circle:

{(x,y) = r(\cos t, \sin t)}

In the next section, we’ll see why this function describes a circle. But first, let’s translate this circle to another location on the plane by adding it to a position vector {\mathbf{c},} which defines the center of the circle:

{(x,y) = \mathbf{c} + r(\cos t, \sin t)}

Now, all the vectors produced by {r(\cos t, \sin t)} will be translated to the tip of {\mathbf{c},} allowing us to describe circles located anywhere on the XY plane, simply by changing the center position {\mathbf{c}.}

Parametric function for the circle

Here are the steps to derive the parametric function for the circle using trigonometry:

First, let’s define a circle of radius {r} that’s centered on the origin. If {(x,y)} is any point on that circle, then those three variables satisfy the Pythagorean theorem:

{x^2 + y^2 = r^2}
Second, notice that the following equation is true for all values of {t}:

{\cos^2 t + \sin^2 t = 1}

You can see why this equation is always true by applying the Pythagorean theorem to the following right triangle:

Notice that the above two equations are very similar in structure — they’re both the sum of squares, and they both have a constant on the right side. We can use this similarity to define a parametric function that describes a circle, as follows:

{\cos^2 t + \sin^2 t = 1} (repeated from step 2 above)
{r^2 \cos^2 t + r^2 \sin^2 t = r^2} (multiply by {r^2})
{(r \cos t)^2 + (r \sin t)^2 = r^2} (combine the squares)
{x^2 + y^2 = r^2} (repeated from step 1 above)
{x = r \cos t} (equate like terms in the previous two equations)
{y = r \sin t}
{(x,y) = (r \cos t, r \sin t)} (express the previous two equations as vectors)
{(x,y) = r(\cos t, \sin t)} (factor out {r})
{(x,y) = \mathbf{c} + r(\cos t, \sin t)} (translate the circle to the desired center {\mathbf{c}})

This produces the parametric function for the circle:

{(x,y) = \mathbf{c} + r(\cos t, \sin t)}

As the above diagram shows, the parameter {t} represents an angle. This allows {t} to easily identify a position on the circle. Here are some examples:

if {\ t = 0°\ } then {\ r(\cos 0°, \sin 0°)} {\; = (r, 0)}
if {\ t = 90°\ } then {\ r(\cos 90°, \sin 90°)} {\; = (0, r)}
if {\ t = 180°\ } then {\ r(\cos 180°, \sin 180°)} {\; = (-r, 0)}
if {\ t = 270°\ } then {\ r(\cos 270°, \sin 270°)} {\; = (0, -r)}
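Here’s a minimal Python sketch of the parametric function, offered as an illustration rather than a definitive implementation. The function name `circle_point` and the choice to pass {t} in degrees are assumptions made for this example.

```python
import math

def circle_point(c, r, t_degrees):
    """Return the point c + r(cos t, sin t), with t given in degrees."""
    t = math.radians(t_degrees)
    return (c[0] + r * math.cos(t), c[1] + r * math.sin(t))

# Sample a circle of radius 2 centered at the origin, every 90 degrees.
for t_degrees in (0, 90, 180, 270):
    x, y = circle_point((0, 0), 2, t_degrees)
    print(t_degrees, round(x, 6), round(y, 6))
```

Because of floating-point rounding, coordinates that should be exactly {0} come out as extremely small numbers instead, which is why the sketch rounds them before printing.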

Dot product

Given these two vectors:

{\mathbf{a} = (3,4)}
{\mathbf{b} = (5,2)}

you can calculate their dot product as follows:

Multiply their first components together: {3 \mathord{\times} 5 = 15.}
Multiply their second components together: {4 \mathord{\times} 2 = 8.}
Add those two results together: {15 \mathord{+} 8 = 23.}

The resulting sum, {23,} is the dot product of {\mathbf{a}} and {\mathbf{b},} and you can write it like this:

{\mathbf{a} \cdot \mathbf{b} = 23}

or, in more detail:

{\mathbf{a} \cdot \mathbf{b} \ = \ (\mathbf{a} _1, \mathbf{a} _2) \cdot (\mathbf{b} _1, \mathbf{b} _2) \ = \ (3,4) \cdot (5,2) \ = \ 3 \mathord{\times} 5 + 4 \mathord{\times} 2 \ = \ 23}

The left side of the equation is pronounced “{\mathbf{a}} dot {\mathbf{b}}”.

The dot product can be described as “the sum of the component-wise products”.

In general, the dot product is defined for vectors of all dimensions {n \ge 1}:

{\mathbf{a} \cdot \mathbf{b} \ = \ (\mathbf{a} _1, \mathbf{a} _2, \mathbf{a} _3, \text{...}, \mathbf{a} _n) \cdot (\mathbf{b} _1, \mathbf{b} _2, \mathbf{b} _3, \text{...}, \mathbf{b} _n) }

{\phantom{\mathbf{a} \cdot \mathbf{b}} \ = \ \mathbf{a} _1 \mathbf{b} _1 + \mathbf{a} _2 \mathbf{b} _2 + \mathbf{a} _3 \mathbf{b} _3 + \text{...} + \mathbf{a} _n \mathbf{b} _n }

The two vectors {\mathbf{a}} and {\mathbf{b}} must have the same dimension, otherwise their dot product is undefined.
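The general definition can be sketched in Python as follows. This is only one possible way to write it; the function name `dot` and the use of tuples to hold vectors are assumptions made for this example.

```python
def dot(a, b):
    """Return the sum of the component-wise products of a and b."""
    if len(a) != len(b):
        # The dot product is undefined when the dimensions differ.
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))

print(dot((3, 4), (5, 2)))  # 23
```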

Dot product: the sum of proportionately-weighted values

The dot product can be used to calculate the sum of proportionately-weighted values.

Here are two examples of this:

Example #1:

Let’s say that you have:

{4} ten-dollar bills,

{3} five-dollar bills, and

{6} one-dollar bills.

How much money do you have?

To determine the answer, you first need to convert the number of bills to their dollar value, and then you can find the sum of those dollar values.

To do this, first define two vectors to represent the data:

{\mathbf{a} = (4, 3, 6)}

{\mathbf{b} = (10, 5, 1)}

and then take the dot product of the two vectors:

{\mathbf{a} \cdot \mathbf{b} \ = \ (4, 3, 6) \cdot (10, 5, 1) \ = \ 4 \mathord{\times} 10 + 3 \mathord{\times} 5 + 6 \mathord{\times} 1 \ = \ 61}

to obtain the final answer: {61} dollars.

Example #2:

Let’s construct the number {7402} digit-by-digit.

To do this, first define a vector that contains the individual digits:

{\mathbf{a} = (7, 4, 0, 2)}

and then define another vector that contains the “weights” of each digit:

{\mathbf{b} = (10^{\large 3}, 10^{\large 2}, 10^{\large 1}, 10^{\large 0})}

and then use the dot product to find the “weighted sum” of the digits:

{\mathbf{a} \cdot \mathbf{b} \ = \ (7, 4, 0, 2) \cdot (10^{\large 3}, 10^{\large 2}, 10^{\large 1}, 10^{\large 0}) \ = \ 7000 + 400 + 0 + 2 \ = \ 7402}

This example shows how common the dot product is in everyday mathematics: reading the value of a multi-digit number amounts to computing this weighted sum of its digits.
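Both examples can be checked with a short Python sketch. The variable names (`bills`, `values`, `digits`, `weights`) are assumptions chosen for this illustration.

```python
def dot(a, b):
    """Return the sum of the component-wise products of a and b."""
    return sum(x * y for x, y in zip(a, b))

# Example #1: number of bills, weighted by the dollar value of each bill.
bills = (4, 3, 6)
values = (10, 5, 1)
print(dot(bills, values))  # 61

# Example #2: digits, weighted by descending powers of ten.
digits = (7, 4, 0, 2)
weights = tuple(10 ** k for k in range(len(digits) - 1, -1, -1))
print(dot(digits, weights))  # 7402
```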

Dot product: the self dot product

What happens if you take the dot product of a vector with itself? Here, we do it in two dimensions:

{\mathbf{a} \cdot \mathbf{a} \ = \ (\mathbf{a} _1, \mathbf{a} _2) \cdot (\mathbf{a} _1, \mathbf{a} _2) \ = \ \mathbf{a} _1^2 + \mathbf{a} _2^2 \ = \ |\mathbf{a}|^2 }

This collapses to:

{\mathbf{a} \cdot \mathbf{a} \ = \ |\mathbf{a}|^2 }

If you take the square root of both sides of this equation, you get a concise way of defining vector length:

{\sqrt{\mathbf{a} \cdot \mathbf{a}} \ = \ |\mathbf{a}| }

This result holds for vectors of all dimensions {n \ge 1}.
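As a rough illustration, here’s a Python sketch that computes vector length from the self dot product. The function names `dot` and `length` are assumptions made for this example.

```python
import math

def dot(a, b):
    """Return the sum of the component-wise products of a and b."""
    return sum(x * y for x, y in zip(a, b))

def length(a):
    """Return |a|, computed as the square root of a . a."""
    return math.sqrt(dot(a, a))

print(length((3, 4)))     # 5.0
print(length((1, 2, 2)))  # 3.0
```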

It’s interesting to observe that the Pythagorean theorem is a special case of the dot product.

Repeating the first equation from above, you can see that the Pythagorean theorem appears on the right side:

{\mathbf{a} \cdot \mathbf{a} \ = \ (\mathbf{a} _1, \mathbf{a} _2) \cdot (\mathbf{a} _1, \mathbf{a} _2) \ = \ \mathbf{a} _1^2 + \mathbf{a} _2^2 \ = \ |\mathbf{a}|^2 }

Here’s a vector diagram showing the right triangle that corresponds to the Pythagorean theorem {\mathbf{a} _1^2 + \mathbf{a} _2^2 = |\mathbf{a}|^2}:

Dot product: geometric form

So far, we have used this definition of the dot product:

{\mathbf{a} \cdot \mathbf{b} \ = \ (\mathbf{a} _1, \mathbf{a} _2, \mathbf{a} _3, \text{...}, \mathbf{a} _n) \cdot (\mathbf{b} _1, \mathbf{b} _2, \mathbf{b} _3, \text{...}, \mathbf{b} _n) }

{\phantom{\mathbf{a} \cdot \mathbf{b}} \ = \ \mathbf{a} _1 \mathbf{b} _1 + \mathbf{a} _2 \mathbf{b} _2 + \mathbf{a} _3 \mathbf{b} _3 + \text{...} + \mathbf{a} _n \mathbf{b} _n }

It turns out that there’s another definition of the dot product that defines it exclusively in terms of lengths and angles — it’s called the geometric version of the dot product:

{\mathbf{a} \cdot \mathbf{b} \ = \ |\mathbf{a}| |\mathbf{b}| \cos \theta }

where {\theta} is the smaller of the two angles between the vectors {\mathbf{a}} and {\mathbf{b}.}

This result holds for vectors of all dimensions {n \ge 2;} however, {\mathbf{a}} and {\mathbf{b}} must both lie on the same plane.

Dot product: perpendicular vectors

The dot product has the following property:

If {\mathbf{a}} and {\mathbf{b}} are perpendicular, then {\mathbf{a} \cdot \mathbf{b} = 0.}
If {\mathbf{a}} and {\mathbf{b}} are both nonzero and are not perpendicular, then {\mathbf{a} \cdot \mathbf{b} \ne 0.}

For example, the vectors {\mathbf{a} = (3,4)} and {\mathbf{b} = (8,-6)} are perpendicular to each other, so let’s take their dot product to confirm that it’s {0}:

{\mathbf{a} \cdot \mathbf{b} \ = \ (3,4) \cdot (8,-6) \ = \ 3 \mathord{\times} 8 + 4 \mathord{\times} (-6) \ = \ 24 - 24 \ = \ 0}

Perpendicular vectors intersect at a {90°} angle, so let’s plug {\theta = 90°} into the geometric version of the dot product, and see what happens:

{\mathbf{a} \cdot \mathbf{b} \ = \ |\mathbf{a}| |\mathbf{b}| \cos \theta \ = \ |\mathbf{a}| |\mathbf{b}| \cos 90° \ = \ 0}

This proves that the dot product of all perpendicular vectors is {0}.

This result holds for vectors of all dimensions {n \ge 2;} however, {\mathbf{a}} and {\mathbf{b}} must both lie on the same plane.

In general, for 2D vectors, we can say:

The vector {(x,y)} is perpendicular to both {(-y, x)} and {(y, -x).}

Rotating {(x,y)} counterclockwise to the perpendicular produces {(-y,x).}

Rotating {(x,y)} clockwise to the perpendicular produces {(y,-x).}

We can prove they’re perpendicular by showing that their dot products are {0}:

{(x,y) \cdot (-y,x) \ = \ -xy + xy \ = \ 0}

{(x,y) \cdot (y,-x) \ = \ xy - xy \ = \ 0}

And, consequently, perpendicular lines have slopes that are negative reciprocals of each other (assuming that neither {x} nor {y} is {0}):

The slope of {(x,y)} is [[ \ m = {{y \over x}} ]]

The slope of {(-y,x)} is [[ \ m^\prime = {x \over -y} \, = \, -{1 \over m}]]
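The two perpendicular rotations can be sketched in Python like this. The function names `perp_ccw` and `perp_cw` are assumptions made for this example.

```python
def dot(a, b):
    """Return the sum of the component-wise products of a and b."""
    return sum(x * y for x, y in zip(a, b))

def perp_ccw(v):
    """Rotate the 2D vector (x, y) counterclockwise to the perpendicular."""
    x, y = v
    return (-y, x)

def perp_cw(v):
    """Rotate the 2D vector (x, y) clockwise to the perpendicular."""
    x, y = v
    return (y, -x)

v = (3, 4)
print(perp_ccw(v), dot(v, perp_ccw(v)))  # (-4, 3) 0
print(perp_cw(v), dot(v, perp_cw(v)))    # (4, -3) 0
```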

Dot product: exercises (part 1)

Recall that the dot product in geometric form is:

{\mathbf{a} \cdot \mathbf{b} \ = \ |\mathbf{a}| |\mathbf{b}| \cos \theta }

Here are some vector exercises:

Calculate {(2, 0, -5) \cdot (-4, 1, -3)}
Are the vectors {(3, 0, 2)} and {(5, -1, -8)} perpendicular?
What vector is produced when you rotate {(-2, -4)} counterclockwise to the perpendicular, keeping its tail at the origin, and its length the same?
Two vectors have a dot product of {10,} and their tails meet at a {60°} angle. If one of the vectors has length {4,} what is the length of the other vector?

Dot product: finding an angle

It’s useful to solve the geometric version of the dot product for the angle {\theta}:

[[ \mathbf{a} \cdot \mathbf{b} \ = \ |\mathbf{a}| |\mathbf{b}| \cos \theta ]]

[[ {{\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{a}| |\mathbf{b}|}} = \cos \theta ]]

[[ \arccos \left( {{\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{a}| |\mathbf{b}|}} \right) = \theta ]]

We now have an easy formula for finding the angle {\theta} between any two vectors. For example, to find the angle between {(3,4)} and {(5,12),} you can do the following:

Find their dot product:

{\mathbf{a} \cdot \mathbf{b} \ = \ (3,4) \cdot (5,12) \ = \ 3 \mathord{\times} 5 + 4 \mathord{\times} 12 \ = \ 15 + 48 \ = \ 63 }
Find both of their lengths:

{|\mathbf{a}| \ = \ \sqrt{\mathbf{a} \cdot \mathbf{a}} \ = \ \sqrt{(3,4) \cdot (3,4)} \ = \ \sqrt{3 \mathord{\times} 3 + 4 \mathord{\times} 4} \ = \ 5}

{|\mathbf{b}| \ = \ \sqrt{\mathbf{b} \cdot \mathbf{b}} \ = \ \sqrt{(5,12) \cdot (5,12)} \ = \ \sqrt{5 \mathord{\times} 5 + 12 \mathord{\times} 12} \ = \ 13}
Plug them into the formula:

[[ \theta \ = \, \arccos \left( {{\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{a}| |\mathbf{b}|}} \right) = \, \arccos \left( {63 \over {5 \mathord{\times} 13} } \right) = \, 14.25° ]]

The result {\theta} is always the smaller of the two angles between {\mathbf{a}} and {\mathbf{b},} and is never negative. The other angle between {\mathbf{a}} and {\mathbf{b}} is {360° - \theta.}

This formula holds for vectors of all dimensions {n \ge 2;} however, {\mathbf{a}} and {\mathbf{b}} must both lie on the same plane.
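Here’s a minimal Python sketch of the angle formula, assuming both vectors are nonzero. The function name `angle_between` is an assumption made for this example, and the clamping step is an added safeguard against floating-point rounding rather than part of the formula itself.

```python
import math

def dot(a, b):
    """Return the sum of the component-wise products of a and b."""
    return sum(x * y for x, y in zip(a, b))

def angle_between(a, b):
    """Return the smaller angle between nonzero vectors a and b, in degrees."""
    cos_theta = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    # Rounding can push the ratio slightly outside [-1, 1], where acos fails.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

print(round(angle_between((3, 4), (5, 12)), 2))  # 14.25
```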

Dot product: 2D direction angle

Slope is an imperfect way to measure direction on the XY plane. It’s incapable of distinguishing between directions that are {180°} apart, and it can’t represent vertical directions.

To solve these deficiencies, we can measure direction on the XY plane using angles instead.

The standard way of measuring a vector’s angle is to calculate the angle that it forms with the positive X axis.

Specifically, the angle of vector {\mathbf{a}} is the angle from {(1,0)} to {\mathbf{a},} where:

the angle is positive if it’s measured counterclockwise from {(1,0)} to {\mathbf{a},} and
the angle is negative if it’s measured clockwise from {(1,0)} to {\mathbf{a}.}

Here are the steps to derive the angle of {\mathbf{a}}:

Start with the general angle formula, which finds the smaller of the two angles between {\mathbf{a}} and {\mathbf{b}}:

[[ \theta = \arccos \left( {{\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{a}| |\mathbf{b}|}} \right) ]]
Set {\mathbf{b} = (1,0)} and simplify the result:

[[ \theta = \arccos \left( {{\mathbf{a} \cdot (1,0)} \over {|\mathbf{a}| |(1,0)|}} \right) ]]

[[ \theta = \arccos \left( {{\mathbf{a} _1} \over {|\mathbf{a}|}} \right) ]]
The angles {\theta} and {-\theta} both have the same cosine, so the above equation must be enhanced to distinguish between those two angles. This gives us a formula for vector angle that works in all cases:

[[ \theta = \begin{cases} \phantom{-} \arccos \left( \displaystyle {{\mathbf{a} _1} \over {|\mathbf{a}|}} \right) \space\space \text{ if } \mathbf{a} _2 \ge 0 \cr - \arccos \left( \displaystyle {{\mathbf{a} _1} \over {|\mathbf{a}|}} \right) \space\space \text{ if } \mathbf{a} _2 < 0 \cr \end{cases} ]]

This formula produces an angle {\theta} in the range {-180° < \theta \le 180°.} Whenever the vector’s {y} component is nonzero, {\theta} has the same sign as that component, so every vector with a downward vertical displacement has a negative angle. (When the {y} component is {0,} the angle is either {0°} or {180°.}) This convention was chosen because it’s natural to associate downward with negative.

Notice that if you start at the vector {(1,0)} and begin rotating clockwise, the resulting vector will begin pointing downward, and will therefore have a negative angle. This explains why clockwise rotation was chosen to be the negative direction of rotation.

If you like, you can eliminate negative angles by adding {360°} to any negative angle. You will then obtain an angle {\theta} in the range {0° \le \theta < 360°.}

As a special case, the angle of the vector {(0,0)} is undefined.

This formula does not require {\mathbf{a}} to be a position vector. The tail may be located anywhere on the XY plane.

This formula applies to {2}-dimensional vectors only. In higher dimensions, angular direction requires the use of multiple angles.
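The two-case formula above can be sketched in Python as follows. The function name `direction_angle` is an assumption made for this example, and raising an error for {(0,0)} is just one possible way to handle the undefined case.

```python
import math

def direction_angle(a):
    """Return the angle of a 2D vector in degrees, in the range (-180, 180]."""
    a1, a2 = a
    norm = math.hypot(a1, a2)
    if norm == 0:
        raise ValueError("the angle of (0, 0) is undefined")
    # Clamp to guard against rounding before calling acos.
    ratio = max(-1.0, min(1.0, a1 / norm))
    theta = math.degrees(math.acos(ratio))
    # Choose the sign from the y component, as in the two-case formula.
    return theta if a2 >= 0 else -theta

for v in [(1, 0), (0, 1), (-1, 0), (0, -1), (1, -1)]:
    print(v, round(direction_angle(v), 6))
```

For ordinary inputs like these, Python’s built-in `math.atan2(a2, a1)` gives the same angle (in radians), so in practice it’s a common alternative to writing the two cases by hand.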

Dot product: rotation

The dot product can be used to rotate a vector by any angle {\theta.}

Here’s how it works in {2} dimensions:

If you start at vector {\mathbf{a}} and rotate by angle {\theta,} you will land on a new vector that we will call {\mathbf{b}.} Both {\mathbf{a}} and {\mathbf{b}} are position vectors. During the rotation, the vector’s tail remains stationary at the origin, its tip moves in a circular arc, and its length remains constant.

You can find vector {\mathbf{b}} with the following formula:

{\mathbf{b} = (\mathbf{a} \cdot \mathbf{R1}, \ \mathbf{a} \cdot \mathbf{R2})}

where the vectors {\mathbf{R1}} and {\mathbf{R2}} are defined as:

{\mathbf{R1} = (\cos \theta, \; -\sin \theta)}

{\mathbf{R2} = (\sin \theta, \; \cos \theta)}

We can expand this formula out and write it as:

{\mathbf{b} = (\mathbf{a} \cdot (\cos \theta, \; -\sin \theta), \ \mathbf{a} \cdot (\sin \theta, \; \cos \theta))}

The direction of rotation is:

counterclockwise if {\theta > 0}

clockwise if {\theta < 0}

Dot product: rotation example #1

Let’s perform a {30°} rotation starting from the vector {(3,4),} going counterclockwise.

Set {\mathbf{a} = (3,4)} and {\theta = 30°,} and define the vectors that will accomplish the rotation:

{\mathbf{R1} = (\cos 30°, \; -\sin 30°)} {\; = (0.866, -0.5)}

{\mathbf{R2} = (\sin 30°, \; \cos 30°)} {\; = (0.5, 0.866)}

We can now solve the problem by plugging these values into the formula:

{\mathbf{b} = (\mathbf{a} \cdot \mathbf{R1}, \ \mathbf{a} \cdot \mathbf{R2})}

{\mathbf{b} = ((3,4) \cdot (0.866, -0.5), \ (3,4) \cdot (0.5, 0.866))}

{\mathbf{b} = (0.598, 4.964)}
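This calculation can be reproduced with a short Python sketch of the rotation formula. The function name `rotate` and the choice to pass the angle in degrees are assumptions made for this example.

```python
import math

def dot(a, b):
    """Return the sum of the component-wise products of a and b."""
    return sum(x * y for x, y in zip(a, b))

def rotate(a, theta_degrees):
    """Rotate the 2D position vector a about the origin (counterclockwise if positive)."""
    theta = math.radians(theta_degrees)
    r1 = (math.cos(theta), -math.sin(theta))
    r2 = (math.sin(theta), math.cos(theta))
    return (dot(a, r1), dot(a, r2))

bx, by = rotate((3, 4), 30)
print(round(bx, 3), round(by, 3))  # 0.598 4.964
```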

Dot product: rotation example #2

Let’s solve this problem:

A planet is revolving around its sun in a circular orbit.
If the planet is at location {\mathbf{p}} and the sun is at location {\mathbf{s},} then where will the planet be after it advances an angle of {\theta} in its orbit?
(For example, if the planet advances an angle of {\theta = 90°,} it would advance one-fourth of a revolution around the sun.)
The planet’s orbit is contained entirely within one plane, so all positions can be described with {2}-dimensional vectors.

Here’s the solution:

We know that if you start at vector {\mathbf{a}} and rotate by angle {\theta} around the origin, the new vector is:

{(\mathbf{a} \cdot \mathbf{R1}, \ \mathbf{a} \cdot \mathbf{R2})}

where:

{\mathbf{R1} = (\cos \theta, \; -\sin \theta)}

{\mathbf{R2} = (\sin \theta, \; \cos \theta)}
Observe that {\mathbf{p} - \mathbf{s}} is the displacement vector from the sun to the planet.
Rotate {\mathbf{p} - \mathbf{s}} by {\theta} to find the planet’s new displacement after advancing in its orbit. We can do this rotation by setting {\mathbf{a} = \mathbf{p} - \mathbf{s}} and substituting:

{(\mathbf{a} \cdot \mathbf{R1}, \ \mathbf{a} \cdot \mathbf{R2}) \ = \ ((\mathbf{p} - \mathbf{s}) \cdot \mathbf{R1}, \ (\mathbf{p} - \mathbf{s}) \cdot \mathbf{R2})}
Translate this new displacement to the sun to find the new position of the planet:

{\mathbf{s} + ((\mathbf{p} - \mathbf{s}) \cdot \mathbf{R1}, \ (\mathbf{p} - \mathbf{s}) \cdot \mathbf{R2})}
Optionally, you can expand it out to eliminate {\mathbf{R1}} and {\mathbf{R2}}:

{\mathbf{s} + ((\mathbf{p} - \mathbf{s}) \cdot (\cos \theta, \; -\sin \theta), \ (\mathbf{p} - \mathbf{s}) \cdot (\sin \theta, \; \cos \theta))}
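Here’s a rough Python sketch of this solution. The function name `rotate_about` and the sample positions for the planet and the sun are hypothetical values chosen only to illustrate the formula.

```python
import math

def dot(a, b):
    """Return the sum of the component-wise products of a and b."""
    return sum(x * y for x, y in zip(a, b))

def rotate_about(p, s, theta_degrees):
    """Rotate the 2D point p about the center s by theta degrees."""
    theta = math.radians(theta_degrees)
    r1 = (math.cos(theta), -math.sin(theta))
    r2 = (math.sin(theta), math.cos(theta))
    d = (p[0] - s[0], p[1] - s[1])  # displacement from the sun to the planet
    # Rotate the displacement, then translate it back to the sun.
    return (s[0] + dot(d, r1), s[1] + dot(d, r2))

# Hypothetical positions: sun at (2, 2), planet 3 units to its right.
x, y = rotate_about((5, 2), (2, 2), 90)
print(round(x, 6), round(y, 6))  # 2.0 5.0
```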

Dot product: projection introduction

Here’s a diagram that shows the projection of vector {\mathbf{a}} onto vector {\mathbf{b},} creating a new vector {\mathbf{c}}:

The vector {\mathbf{c}} is called the “projection of {\mathbf{a}} onto {\mathbf{b}}”.

Here are some ways to interpret this diagram:

When you move from the tail of {\mathbf{a}} to the tip of {\mathbf{a},} it’s clear that you’re moving, at least to some extent, in the direction of {\mathbf{b}.} The vector {\mathbf{c}} shows how much of that movement occurs in the direction of {\mathbf{b}.}
Imagine that {\mathbf{a}}’s arrow is casting a shadow onto {\mathbf{b}}’s line, and that {\mathbf{c}}’s arrow is the shadow itself. The source of the light is located far above, and it’s shining down rays of light that run perpendicular to {\mathbf{b}.} The gray line is the leftmost ray of light that’s not blocked by {\mathbf{a}}’s arrow.
If you look at all the points that lie along {\mathbf{b}}’s arrow line, you’ll see that exactly one of those points is the closest to the tip of {\mathbf{a}.} The tip of {\mathbf{c}} shows where that closest point is located. This means that the length of the gray line segment {|\mathbf{a} - \mathbf{c}|} is the shortest possible distance between {\mathbf{b}}’s arrow line and the tip of {\mathbf{a}.}

Observe that the gray line connecting {\mathbf{a}} and {\mathbf{c}} is perpendicular to {\mathbf{b}}’s arrow. In general, the displacement from a point to its projection is always perpendicular to the line onto which the point is projected.

Dot product: projection definition

Let’s continue with the projection example that we used in the previous section:

This shows the projection of the vector {\mathbf{a}} onto vector {\mathbf{b},} creating a new vector {\mathbf{c}.}

Projection is defined in two ways:

Scalar projection

In the above diagram, {\mathbf{c}} shows how much of {\mathbf{a}}’s movement occurs in the direction of {\mathbf{b}.} Let’s use the variable {s} to represent the amount of that movement. The variable {s} is called the {\text{“}}scalar projection of {\mathbf{a}} onto {\mathbf{b}\text{”},} and is a real number.

The definition of {s} is:

{ s = \mathbf{a} \cdot {\LARGE{\hat{\normalsize \mathbf{b}}}} }

The value of {s} is either {|\mathbf{c}|} or {-|\mathbf{c}|,} determined as follows:

If {\theta} is an acute angle then {s = |\mathbf{c}| \quad} {(|\theta| < 90°)}

If {\theta} is an obtuse angle then {s = -|\mathbf{c}| \quad} {(90° < |\theta| \le 180°)}

If {\theta} is a right angle then {s = |\mathbf{c}| = 0 \quad} {(|\theta| = 90°)}

In the above diagram, notice that if {\theta} becomes large enough to be an obtuse angle, then {\mathbf{a}} will be pointing, at least to some extent, in the opposite direction of {\mathbf{b}.} When that happens, the scalar projection {s} becomes negative.

If you happen to know the value of {\theta} instead of {\mathbf{b},} then you can use a different (but equivalent) definition of the scalar projection:

{ s = |\mathbf{a}| \cos \theta }

Scalar projection is defined for vectors of all dimensions {n \ge 2;} however, {\mathbf{a}} and {\mathbf{b}} must both lie on the same plane.

Vector projection

In the above diagram, the vector {\mathbf{c}} is called the {\text{“}}vector projection of {\mathbf{a}} onto {\mathbf{b}\text{”},} or simply the {\text{“}}projection of {\mathbf{a}} onto {\mathbf{b}\text{”}.}

The definition of {\mathbf{c}} is:

{ \mathbf{c} = { s {\LARGE{\hat{\normalsize \mathbf{b}}}} } }

The vector {\mathbf{c}} has length {|s|,} and it points in the direction of {\mathbf{b},} except that it points in the opposite direction of {\mathbf{b}} if {s < 0.}

We can use the notation {\text{“proj}_ {\large{\mathbf{b}}} \ \mathbf{a}\text{”}} to represent vector projection in general:

{ {\text{proj}_ {\large{\mathbf{b}}} \ \mathbf{a}} = { s {\LARGE{\hat{\normalsize \mathbf{b}}}} } \quad } {(}where { s = \mathbf{a} \cdot {\LARGE{\hat{\normalsize \mathbf{b}}}}) }

The left side of the equation is pronounced {\text{“}}the projection of {\mathbf{a}} onto {\mathbf{b}\text{”.}}

Vector projection is defined for vectors of all dimensions {n \ge 2;} however, {\mathbf{a}} and {\mathbf{b}} must both be position vectors.

Dot product: projection example

Let’s solve this vector projection problem:

Given the point {\mathbf{a} = (3,4)} and the line {y = {\Large {x \over 2}},} how far away is {\mathbf{a}} from the closest point on the line?

Here’s a diagram of the problem:

The vector {\mathbf{b}} lies on the line {y = {\Large {x \over 2}}.} The point {\mathbf{c}} is the one point on the line that’s closest to {\mathbf{a}.} We want to find the distance between {\mathbf{a}} and {\mathbf{c}.}

Notice that the {\mathbf{b}} line is perpendicular to the line that connects {\mathbf{a}} and {\mathbf{c}.} Earlier, we observed that vector projection also has this same perpendicular relationship, so we can model this problem using vector projection.

Specifically, we want to find the projection of {\mathbf{a}} onto {\mathbf{b}}’s line, which will give us {\mathbf{c}.}

To better recognize this as a vector projection problem, we can explicitly draw the vector arrows for {\mathbf{a}} and {\mathbf{c}}:

Here are the steps to solve the problem:

First, let’s determine exactly where {\mathbf{b}}’s tip is. Any point on the line will do, so let’s set {x = 2} to get {y = 1,} and choose that point for the tip of {\mathbf{b}}:

{\mathbf{b} = (2,1)}
The projection formulas use the unit vector in {\mathbf{b}}’s direction, so let’s find {{\LARGE{\hat{\normalsize \mathbf{b}}}}}:

[[ {\LARGE{\hat{\normalsize \mathbf{b}}}} \ = \ {{\mathbf{b}} \over {|\mathbf{b}|}} \ = \ {{(2,1)} \over {\sqrt{2^2+1^2}}} \ = \ {{(2,1)} \over {\sqrt{5}}} \ = \ \left( {2 \over {\sqrt{5}}}, {1 \over {\sqrt{5}}} \right) \ = \ (0.894, 0.447)]]
Now we can find the scalar projection {s}, which is the amount of {\mathbf{a}}’s movement in the direction of {\mathbf{b}}:

[[ s \ = \ \mathbf{a} \cdot {\LARGE{\hat{\normalsize \mathbf{b}}}} \ = \ (3,4) \cdot \left( {2 \over {\sqrt{5}}}, {1 \over {\sqrt{5}}} \right) \ = \ { 6 \over {\sqrt{5}}} + { 4 \over {\sqrt{5}}} \ = \ { 10 \over {\sqrt{5}}} \ = \ 4.472 ]]

(Here {s} is positive, so it equals the length of vector {\mathbf{c}.})
Now we can find the vector projection {\mathbf{c}}:

[[ \mathbf{c} \ = \ s {\LARGE{\hat{\normalsize \mathbf{b}}}} \ = \ { 10 \over {\sqrt{5}}} \left( {2 \over {\sqrt{5}}}, {1 \over {\sqrt{5}}} \right) \ = \ \left( {20 \over 5}, {10 \over 5} \right) \ = \ (4,2) ]]
And finally, we can find the distance between {\mathbf{a}} and {\mathbf{c}}:

[[ |\mathbf{a} - \mathbf{c}| \ = \ |(3,4) - (4,2)| \ = \ |(-1, 2)| \ = \ \sqrt{1 + 4} \ = \ \sqrt{5} \ = \ 2.236 ]]

And we obtain the answer: {\sqrt{5}.}

As a shortcut, we can combine steps 2, 3, and 4 together, and calculate {\mathbf{c}} directly from {\mathbf{a}} and {\mathbf{b}.} This is done by combining the three separate formulas together into a single formula for the vector projection:

[[ \mathbf{c} \ = \; \left( {\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{b}|^2} \right) \mathbf{b} \ = \; \left( {(3,4) \cdot (2,1)} \over {|(2,1)|^2} \right) (2,1) \ = \ \left( {{10} \over {5}} \right) (2,1) \ = \ (4,2) ]]

We can also create a shortcut formula for scalar projection, and calculate {s} directly from {\mathbf{a}} and {\mathbf{b}}:

[[ s \ = \; {{\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{b}|}} \ = \; {{(3,4) \cdot (2,1)} \over {\sqrt{2^2 + 1^2}}} \ = \ {{10} \over {\sqrt{5}}} \ = \ 4.472 ]]
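The two shortcut formulas can be sketched in Python as follows, assuming {\mathbf{b}} is nonzero. The function names `scalar_projection` and `vector_projection` are assumptions made for this example, and `math.dist` requires Python 3.8 or later.

```python
import math

def dot(a, b):
    """Return the sum of the component-wise products of a and b."""
    return sum(x * y for x, y in zip(a, b))

def scalar_projection(a, b):
    """Return s = (a . b) / |b|."""
    return dot(a, b) / math.sqrt(dot(b, b))

def vector_projection(a, b):
    """Return c = ((a . b) / |b|^2) b."""
    k = dot(a, b) / dot(b, b)
    return tuple(k * x for x in b)

a, b = (3, 4), (2, 1)
c = vector_projection(a, b)
print(round(scalar_projection(a, b), 3))  # 4.472
print(c)                                  # (4.0, 2.0)
print(round(math.dist(a, c), 3))          # 2.236
```

The last line computes {|\mathbf{a} - \mathbf{c}|,} the distance from the point to the line.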
