In vector mathematics, we take the {(x,y)} coordinate pair and enhance it to provide these additional features:
Extension into higher dimensions, for example, {(x,y,z)} in 3-dimensional space.
Displacement, which is the idea of using {(x,y)} to describe movement, instead of a fixed location.
Operators that apply to {(x,y)} as a whole, such as addition, subtraction, and various types of multiplication and “products”.
This introduction is for students who can:
This diagram shows movement on the XY plane from {(1,2)} to {(4,6)} along a straight line:
To measure this movement, we can split it into two directions: {3} units to the right and {4} units up. We can then combine the movements into a single pair of numbers, {(3,4)}, which describes the overall diagonal movement.
We call {(3,4)} a vector. Vectors are used to describe displacement.
“Displacement” is movement in a specific direction, for a specific distance. The movement can be either real or abstract. In physics, “displacement” often implies that a physical object is moving. In mathematics, it often just describes a change in position.
When performing a displacement, the horizontal and vertical movements occur simultaneously, and we assume that the resulting movement occurs in a straight line.
You can view a vector in two ways:
Since a vector describes movement, it’s natural to draw it as an arrow. The movement starts at the tail of the arrow, and ends at the tip of the arrow. (The tip end is also called the head end.)
The arrow is drawn as a straight line segment that contains an arrowhead at the tip end. The arrow points in a specific direction, and has a specific length.
In algebra, we write a vector as a list of numbers surrounded by parentheses, for example: {(3,4).} The individual numbers {3} and {4} are called the components of the vector.
The number of components in a vector is called its dimension. For example, the vector {(8,-5,2)} is a {3}-dimensional vector.
Mathematical operators can be applied to a vector as a whole, so for example, we can write: {(1,2) + (3,4),} which is the algebra description of the movement shown in the above diagram.
Here are some examples of vectors at work in algebra:
{(9,\, -2,\ 0,\ 6.25,\ {1 \over 3})} | a vector is an ordered list of real numbers |
{(2x,\ y+3)} | it can contain expressions that evaluate to real numbers |
{( \, )} | this is not a vector; a vector must contain at least one number |
{(3,4) = (3,4)} | vectors can be compared for equality |
{(3,4) \ne (4,3)} | order matters |
{(3,4) \ne (3,4,0)} | equal vectors must have the same dimension |
{(x,y) = (3,4)} | this defines {x=3} and {y=4} using vector notation |
{\mathbf{a} = (3,4)} | a variable can represent a vector (usually in boldface) |
{\mathbf{b} = (-1,7)} | a variable can represent a vector |
{\mathbf{a} + \mathbf{b} = (2,11)} | vector addition is component-wise addition |
{\mathbf{a} - \mathbf{b} = (4,-3)} | vector subtraction is component-wise subtraction |
{\mathbf{a} _1 = 3, \ \ \mathbf{a} _2 = 4} | the individual components of a vector can be extracted |
{2 \mathbf{a} = (6,8)} | scalar multiplication: distributes over all the components |
{\mathbf{a}/2 = (1.5,2)} | scalar division: distributes over all the components |
{- \mathbf{a} = (-3,-4)} | negation: distributes over all the components |
{1/\mathbf{a}} | is undefined |
{\mathbf{a} \mathbf{b}} | is undefined |
{\mathbf{a} / \mathbf{b}} | is undefined |
{\mathbf{a} ^ {\large 2}} | is undefined |
{\mathbf{a} < \mathbf{b}} | is undefined, vectors themselves are not ordered |
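The component-wise rules in the table above can be sketched in a few lines of Python. This is only an illustration: vectors are modeled as plain tuples, and the function names `add`, `sub`, and `scale` are made up for this example rather than taken from any library.

```python
def add(a, b):
    """Component-wise sum of two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    """Component-wise difference of two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return tuple(x - y for x, y in zip(a, b))


def scale(k, a):
    """Scalar multiplication: k is applied to every component."""
    return tuple(k * x for x in a)


a = (3, 4)
b = (-1, 7)
print(add(a, b))     # (2, 11)
print(sub(a, b))     # (4, -3)
print(scale(2, a))   # (6, 8)
print(scale(-1, a))  # (-3, -4)
print(a == (4, 3))   # False, because order matters
```

Tuples compare element by element, so the equality rules from the table (order matters, dimensions must match) come for free in this sketch.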
The length of a vector is the length of its arrow line.
The length of the vector {(3,4)} is {5,} which we can calculate using the Pythagorean theorem:
{\sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5}
We use vertical bars around a vector to indicate its length. For example, if {\mathbf{a} = (3,4),} we can say:
{ |\mathbf{a}| = \sqrt{3^2 + 4^2} = 5}
In general, the length of a {2}-dimensional vector is:
{|\mathbf{a}| = \sqrt{\mathbf{a} _1^2 + \mathbf{a} _2^2}}
The Pythagorean theorem can be extended to calculate vector length in all dimensions:
{|\mathbf{a}| = \sqrt{\mathbf{a} _1^2 + \mathbf{a} _2^2 + \mathbf{a} _3^2 + \mathbf{a} _4^2 + \text{···} + \mathbf{a} _n^2} \quad} (for an {n}-dimensional vector)
Length is like absolute value — it’s never negative:
{|\mathbf{a}| \ge 0 \quad} for all {\mathbf{a}}
The “magnitude” of a vector is the same thing as its length. The two words can be used interchangeably.
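As a small sketch of the length formula, here is one way it could be written in Python for a vector of any dimension. The function name `length` is chosen for this example only.

```python
import math


def length(a):
    """Length (magnitude) of a vector: the square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in a))


print(length((3, 4)))     # 5.0
print(length((6, 8)))     # 10.0
print(length((2, 3, 6)))  # 7.0, since 4 + 9 + 36 = 49
```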
In vector terminology, a “scalar” is a real number that’s multiplied by a vector.
If you take a vector and perform scalar multiplication with the scalar {k}, then the length of the resulting vector will be multiplied by the absolute value of {k.} This is called “scaling” a vector.
For example, if we’re given the vector {\mathbf{a} = (3,4)} where {|\mathbf{a}| = 5,} we can scale it by {2,} which doubles the length to {10}:
{\mathbf{a} = (3, 4)}
{|\mathbf{a}| = |(3, 4)| = \sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5}
{2 \mathbf{a} = 2(3, 4) = (6, 8)}
{|2 \mathbf{a}| = |(6, 8)| = \sqrt{6^2 + 8^2} = \sqrt{36 + 64} = \sqrt{100} = 10}
The vector {2 \mathbf{a}} points in the same direction as {\mathbf{a}.}
When you scale a vector, the tip of the resulting vector is moved forward or backward along the line of the arrow, leaving its direction unchanged.
To measure the direction of a vector’s arrow, we use a special type of vector called a unit vector.
All vectors of length {1} are unit vectors, and vice versa.
Unit vectors are used to standardize the way we measure direction. Every direction has exactly one unique unit vector that describes it.
For example, the vectors {(3,4),} {(9,12),} and {(15,20)} all point in the same direction, which is expressed as the unit vector {(0.6, 0.8).}
You can verify that all four of those vectors are parallel by calculating
their slopes, which are all
{m = {\large 4 \over \large 3}.} However, slope is an imperfect way to
measure direction, because it’s incapable of distinguishing between
directions that are {180°} apart, and it can’t represent vertical
directions. Also, the way we calculate slope on the
As a special case, the direction of the vector {(0,0)} is undefined. That’s because you can’t determine which direction you’re moving if you’re not moving. The vector {(0,0)} is drawn as a point with no arrowhead.
We can put the symbol {{\LARGE{\hat{\ }}}} over a vector’s name to describe the unit vector that points in the same direction. For example, if {\mathbf{a} = (3,4),} we can say:
{{\LARGE{\hat{\normalsize \mathbf{a}}}} = (0.6, 0.8)}
The symbol {{\LARGE{\hat{\ }}}} is called the hat symbol, and the name {{\LARGE{\hat{\normalsize \mathbf{a}}}}} is pronounced “{\mathbf{a}} hat”.
In general, we define the unit vector {{\LARGE{\hat{\normalsize \mathbf{a}}}}} as:
[[{\LARGE{\hat{\normalsize \mathbf{a}}}} = {\mathbf{a} \over |\mathbf{a}|} \quad (|\mathbf{a}| \ne 0)]]
So if {\mathbf{a} = (3,4),} we can calculate {{\LARGE{\hat{\normalsize \mathbf{a}}}}} as:
[[{\LARGE{\hat{\normalsize \mathbf{a}}}} \ = \ {\mathbf{a} \over |\mathbf{a}|} \ = \ {{(3,4)} \over {\sqrt{3^2 + 4^2}}} \ = \ {{(3,4)} \over {5} } \ = \ \left( {3 \over 5}, {4 \over 5} \right) \ = \ (0.6, 0.8) ]]
To verify that {(0.6, 0.8)} is a unit vector, we can show that its length is {1}:
[[ |(0.6, 0.8)| \ = \ \sqrt{0.6^2 + 0.8^2} \ = \ \sqrt{0.36 + 0.64} \ = \ \sqrt{1} \ = \ 1 ]]
The components of a unit vector are always between {-1} and {1} (inclusive).
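The unit vector calculation can be sketched in Python as follows. The function name `unit` is hypothetical, and raising an error for the zero vector is one possible way to reflect that its direction is undefined.

```python
import math


def unit(a):
    """Return the unit vector pointing in the same direction as a."""
    n = math.sqrt(sum(x * x for x in a))
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(x / n for x in a)


# All three vectors share one direction, so they share one unit vector.
for v in [(3, 4), (9, 12), (15, 20)]:
    print(unit(v))  # (0.6, 0.8) each time
```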
If you take a vector and perform scalar multiplication with a negative number, the resulting vector will point in the opposite direction.
For example, if we’re given the vector {\mathbf{a} = (3,4),} we can multiply it by {-1} to define another vector of the same length that points in the opposite direction:
{- \mathbf{a} = -(3, 4) = (-3, -4)}
The vectors {-\mathbf{a}} and {\mathbf{a}} are parallel, but their arrowheads are on opposite ends.
It’s interesting to observe that the notation {\text{“}(3,4)\text{”}} is used for two different purposes:
as a displacement, describing the movement from one point to another
as coordinates, describing a fixed point that never moves
At first glance, these two purposes seem contradictory. Isn’t it inconsistent to use the same notation for both movement and something that never moves?
We can resolve this by defining a special type of vector for representing fixed points: the position vector.
A position vector is a vector with its tail located at the origin, and a tip that describes a fixed point.
For example, let’s say that we’re given a fixed point with coordinates {(3,4).} If we also draw a position vector that starts at the origin and moves a displacement of {(3,4),} it will arrive at that same fixed point. Here’s a diagram showing both the point and the vector:
This shows that a fixed point is exactly the same thing as the tip of a position vector.
When we use a position vector, we’re free to ignore its displacement arrow, and to focus just on its tip, whenever it’s convenient to do so.
Sometimes, however, the displacement arrow is useful. We’ll see an example of this in the next section.
Notice that the arrowhead of a position vector always points away from the origin (except for {(0,0),} which doesn’t have an arrowhead).
In vector mathematics, some things are easier to visualize if you imagine that every point on the XY plane is associated with a straight line that connects it to the origin.
For example, it turns out that {(3,4)} and {(8,-6)} are perpendicular to each other. That’s hard to see if you think of {(3,4)} and {(8,-6)} as just two individual points on the plane. But if you view them as position vectors, then it’s easy to see that their arrows intersect at a 90° angle:
There’s a natural one-to-one correspondence between every point on the plane and every vector arrow that could possibly emanate from the origin. It’s helpful to be able to seamlessly switch your perspective back and forth between the “point view” and the “arrow view”, depending on the situation.
The tail of a vector is always assumed to be at the origin, unless otherwise specified. Therefore, all vectors are assumed to be position vectors unless you’re also given the positions of their tails.
For example, if someone describes a vector as
However, if they describe a vector only as
If a vector is being used to show displacement from any point other than the origin, then we can call it a displacement vector to clarify that it’s not a position vector.
Given these two vectors:
{\mathbf{a} = (3,4)}
{\mathbf{b} = (5,-1)}
you can add them to obtain:
{\mathbf{a} + \mathbf{b} = (8,3)}
Vector addition describes the act of “chaining” displacements together, end to end, effectively combining them together into a single displacement.
Here’s the geometric view of {\mathbf{a} + \mathbf{b}}:
To understand this diagram:
This diagram shows two different kinds of vectors:
To “translate” a vector means to move its arrow elsewhere without changing either the length or the direction of the arrow. This is accomplished by applying the same displacement to both its tail and tip.
The diagram shows {\mathbf{b}} being translated up to the tip of {\mathbf{a}.} During the translation, the color of {\mathbf{b}} turns from black to gray. After the translation, the tip of the gray vector defines the tip of {\mathbf{a} + \mathbf{b}.}
We can call the gray vector
Notice that the tail of the vector {\mathbf{a} + \mathbf{b}} is at the origin. That’s because a vector’s tail is always at the origin unless otherwise specified.
It would also be possible to arrive at the tip of {\mathbf{a} + \mathbf{b}} by translating
the other
Given these two vectors:
{\mathbf{a} = (3,4)}
{\mathbf{b} = (8,3)}
you can subtract the first from the second to obtain:
{\mathbf{b} - \mathbf{a} = (5,-1)}
Vector subtraction is used to find the displacement from one vector to another.
Specifically, if you subtract {\mathbf{a}} from {\mathbf{b}}, the result is a vector that shows the displacement from the tip of {\mathbf{a}} to the tip of {\mathbf{b}.}
Notice that the subtraction is specified as {\mathbf{b} - \mathbf{a}}, so the resulting displacement is in the reverse order: from {\mathbf{a}} to {\mathbf{b}.}
Here’s the geometric view of {\mathbf{b} - \mathbf{a}}:
To understand this diagram:
If you follow these steps, you will draw {\mathbf{b} - \mathbf{a}} as a translated vector. Drawing it as a translated vector usually makes the diagram more understandable. Notice that the diagram also shows {\mathbf{b} - \mathbf{a}} as a position vector (in black), but the geometric meaning of its position is not as obvious.
To determine the displacement from a start point to an end point, we always do the subtraction in this order:
end {-} start {=} displacement
This makes sense, because if you add start to both sides of the equation, you get:
end {=} start {+} displacement
which describes exactly how displacement works.
Notice that either {|\mathbf{b} - \mathbf{a}|} or {|\mathbf{a} - \mathbf{b}|} gives you the distance between the tips of {\mathbf{a}} and {\mathbf{b}.}
Vector subtraction only works if the start and end vectors have the same tail position.
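Here is a short sketch of the "end minus start" rule and of the distance between two tips. The function names `displacement` and `distance` are made up for this example, and both vectors are assumed to have their tails at the origin.

```python
import math


def displacement(start, end):
    """Displacement from the tip of start to the tip of end: end - start."""
    return tuple(e - s for s, e in zip(start, end))


def distance(a, b):
    """Distance between the tips of a and b, which is |b - a|."""
    return math.sqrt(sum(x * x for x in displacement(a, b)))


a = (3, 4)
b = (8, 3)
print(displacement(a, b))  # (5, -1)
print(displacement(b, a))  # (-5, 1), the reverse displacement

# The distance is the same in either order.
print(math.isclose(distance(a, b), distance(b, a)))  # True
```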
Let’s solve the following problem:
An object is currently located at {(7,1).} It will be moved to a new location that’s twice as far away as it currently is from the location {(5,0).} The movement will occur entirely along the straight line that connects {(5,0)} and {(7,1).} Where will the new location of the object be?
Let’s call {(5,0)} the “base”.
First, we can find the displacement from the base to the object’s original position:
{(7,1) - (5,0) \ = \ (2,1)}
Then multiply the displacement by {2} to obtain the new displacement from the base:
{2(2,1) \ = \ (4,2)}
Then apply the new displacement to the base to get the answer:
{(5,0) + (4,2) \ = \ (9,2)}
Optionally, we can combine all of these operators together into a single expression:
{(5,0) + 2((7,1) - (5,0)) \ = \ (9,2) }
Let’s take this solution and generalize it with variables:
{\mathbf{a} + k(\mathbf{b} - \mathbf{a})}
where:
Notice the following:
Now that the problem is solved using variables, it will work for any two points in a space of any dimension.
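The generalized expression can be tried out with a small Python sketch. Based on the worked example, it assumes that {\mathbf{a}} plays the role of the base, {\mathbf{b}} is the object's current position, and {k} is the multiplier; the function name `move_along` is hypothetical.

```python
def move_along(a, b, k):
    """Compute a + k(b - a) for vectors a and b of the same dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return tuple(ai + k * (bi - ai) for ai, bi in zip(a, b))


print(move_along((5, 0), (7, 1), 2))  # (9, 2), the answer to the problem above
print(move_along((5, 0), (7, 1), 1))  # (7, 1), the object stays where it is
print(move_along((5, 0), (7, 1), 0))  # (5, 0), the object lands on the base

# The same function works unchanged in 3D.
print(move_along((0, 0, 0), (1, 2, 3), 3))  # (3, 6, 9)
```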
Here’s a brief summary and review of the basic concepts in vector mathematics.
In vector mathematics, we take the {(x, y)} coordinate pair and enhance it with operators that act on {(x, y)} as a whole, for example:
vector addition: { (3, 4) + (-1, 7) = (2, 11) }
vector subtraction: { (3, 4) - (-1, 7) = (4, -3) }
scalar multiplication: { 2(3, 4) = (6, 8) }
scalar division: { (3, 4)/2 = (1.5, 2) }
negation: { -(3, 4) = (-3, -4) }
To graph a vector, draw a straight-line arrow that connects its tail point to its tip point. The arrow shows the displacement (movement) from the tail to the tip.
The displacement is broken into separate horizontal and vertical
movements,
If {\mathbf{a} = (x, y),} we can extract its individual components using {\mathbf{a} _1 = x} and {\mathbf{a} _2 = y.}
The number of components in a vector is called its dimension.
If {\mathbf{a} = (x, y),} then {|\mathbf{a}|} is the length of the vector {\mathbf{a}}, and is calculated as {\sqrt{x^2 + y^2}.}
If {\mathbf{a} = (x, y),} then {{\LARGE{\hat{\normalsize \mathbf{a}}}}}
A position vector is a vector whose tail is at the origin, and whose tip describes a fixed point. Every point on the plane has a corresponding position vector that points to it from the origin. All vectors are assumed to be position vectors unless otherwise specified.
If a vector’s tail is not at the origin, then you need to describe
it using a phrase like
Vector addition is used to “chain” displacements together, end to end,
combining them together into a single displacement.
For example, the vector
Vector subtraction is used to find the displacement from the tip of one vector to the tip of another vector. For example, the displacement from {\mathbf{a}} to {\mathbf{b}} is {\mathbf{b} - \mathbf{a}.}
The scalar multiplication {k \mathbf{a}} results in a vector that’s {k} times longer than {\mathbf{a},} and points in the same direction as {\mathbf{a},} or points in the opposite direction if {k < 0.}
It’s common to describe a straight line on the XY plane this way:
{y = mx + b}
However, there’s an alternative way to describe a straight line, using the vector form:
{\mathbf{a} + t \mathbf{d}}
where:
For example, {y = 2x + 3} can be expressed in vector form as:
{(0,3) + t(1,2)}
More generally, {y = mx + b} can be expressed in vector form as:
{(0, b) + t(1, m)}
Notice that in both of these cases:
The equation {y = mx + b} is often used to describe lines on the XY plane. It has a limitation, though, because it can’t be used to describe a vertical line.
Fortunately, it’s easy to describe a vertical line in vector form.
For example, here’s how to describe the vertical line at {x = 4}:
{(4, 0) + t(0, 1)}
This can be algebraically simplified to:
{(4, t)}
(Remember that {t} varies across all the real numbers as the line is being drawn.)
If you like, you can be more explicit about the fact that the vertical line is being drawn onto the XY plane by writing it like this:
{(x, y) = (4, t)}
Now, it’s clearer that the {x} coordinate of the line is fixed at {x = 4,} and its {y} coordinate varies across all the real numbers, since {y = t} while {t} is varying.
Generalizing to all lines on the plane, you can write this:
{(x, y) = \mathbf{a} + t \mathbf{d}}
The inclusion of that extra {\text{“}(x, y)\ \mathord{=}\text{”}} notation is optional, and it does not change the vector mathematics at all. It simply shows that we’re interpreting the right side of the equation as a position on the XY plane.
Let’s do a detailed analysis of a straight line in vector form:
{(x, y) = \mathbf{a} + t \mathbf{d}}
where:
{\mathbf{a} = (3,4)}
{\mathbf{d} = (5,2)}
Here’s the geometric view of this line (shown in gray):
First, let’s analyze this from the vector perspective:
And now let’s analyze this from the function and graphing perspectives:
Recall that vectors can have more than {2} dimensions. So if you define {\mathbf{a}} and {\mathbf{d}} to be 3D vectors, it only requires a trivial change to describe a line in 3D space:
{(x, y, z) = \mathbf{a} + t \mathbf{d} \quad} (for 3D vectors {\mathbf{a}} and {\mathbf{d}})
We can extend this to even higher dimensions, which allows us to describe any straight line in a space of any dimension.
As you can see, using {\mathbf{a} + t \mathbf{d}} to describe a line can be very concise, and it’s far more powerful than using {y = mx + b} to describe lines.
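To make the vector form concrete, here is a minimal Python sketch that evaluates {\mathbf{a} + t \mathbf{d}} for a given {t}. The function name `point_on_line` is an assumption made for this example.

```python
def point_on_line(a, d, t):
    """Return the point a + t*d on the line through a with direction d."""
    if len(a) != len(d):
        raise ValueError("a and d must have the same dimension")
    return tuple(ai + t * di for ai, di in zip(a, d))


a = (3, 4)
d = (5, 2)
for t in (-1, 0, 1, 2):
    print(t, point_on_line(a, d, t))
# -1 (-2, 2)
# 0 (3, 4)
# 1 (8, 6)
# 2 (13, 8)

# A vertical line at x = 4 needs no special handling.
print(point_on_line((4, 0), (0, 1), 7))  # (4, 7)

# The same function describes a line in 3D.
print(point_on_line((1, 2, 3), (0, 1, -1), 2))  # (1, 4, 1)
```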
If we’re given a line on the XY plane that’s described in vector form:
{(x, y) = \mathbf{a} + t \mathbf{d}}
it’s instructive to see how to convert it back to the slope-intercept form:
{y = mx + b}
Doing this conversion is a good algebra exercise, because it requires you to juggle seven different variables at the same time, along with some vector handling.
Here are the steps to do this conversion:
Split the vector into its two components:
{x = {\mathbf{a} _1} + t \mathbf{d} _1 }
{y = {\mathbf{a} _2} + t \mathbf{d} _2 }
We already know that {\mathbf{d}} determines the slope of the line
[[m = {{\mathbf{d} _2} \over {\mathbf{d} _1}} \quad ({\mathbf{d} _1} \ne 0) ]]
The y-intercept is {b,} which you can find by setting {x = 0} and solving for {y.} But first, you’ll need to know which value of {t} produces {x = 0}:
{x = {\mathbf{a} _1} + t \mathbf{d} _1 } (repeated from step 1 above)
{0 = {\mathbf{a} _1} + t \mathbf{d} _1 } (set {x = 0} for the y-intercept)
{-{\mathbf{a} _1} = t \mathbf{d} _1 } (subtract {\mathbf{a} _1} from both sides)
[[ {{-{\mathbf{a} _1}} \over {\mathbf{d} _1}} = t ]] (divide both sides by {\mathbf{d} _1} to find {t})
Now that you know which value of {t} produces {x = 0,} you can find the corresponding value for the y-intercept:
{y = mx + b} (given)
{y = b } (set {x = 0} for the y-intercept and simplify)
{y = {\mathbf{a} _2} + t \mathbf{d} _2 } (repeated from step 1 above)
[[y = {\mathbf{a} _2} + {{-{\mathbf{a} _1}} \over {\mathbf{d} _1}} \mathbf{d} _2 ]] (substitute [[t = {{-{\mathbf{a} _1}} \over {\mathbf{d} _1}} ]] from step 3 above)
{y = {\mathbf{a} _2} - m {\mathbf{a} _1} } (simplify [[{\mathbf{d} _2} \over {\mathbf{d} _1}]] to {m,} from step 2 above)
{b = {\mathbf{a} _2} - m {\mathbf{a} _1} } (substitute {y = b} from above)
So the result of the conversion is:
{y = mx + b}
where:
[[m = {{\mathbf{d} _2} \over {\mathbf{d} _1}} \quad ({\mathbf{d} _1} \ne 0) ]]
[[b = {\mathbf{a} _2} - m {\mathbf{a} _1} ]]
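The conversion result can be checked with a short Python sketch. It uses exact fractions from the standard library so that the slope and intercept are not blurred by floating-point rounding; the function name `to_slope_intercept` is invented for this example.

```python
from fractions import Fraction


def to_slope_intercept(a, d):
    """Convert the line a + t*d on the XY plane to (m, b) in y = mx + b."""
    if d[0] == 0:
        raise ValueError("vertical line: the slope is undefined")
    m = Fraction(d[1]) / Fraction(d[0])      # m = d_2 / d_1
    b = Fraction(a[1]) - m * Fraction(a[0])  # b = a_2 - m * a_1
    return m, b


m, b = to_slope_intercept((3, 4), (5, 2))
print(m, b)  # 2/5 14/5

# Round trip for y = 2x + 3, written as (0, 3) + t(1, 2).
print(to_slope_intercept((0, 3), (1, 2)))  # (Fraction(2, 1), Fraction(3, 1))
```

Calling the function with a vertical direction such as {(0, 1)} raises an error, which mirrors the condition that the first component of {\mathbf{d}} must be nonzero.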
For more practice, try converting back and forth between the vector form
and the
{Ax + By = C \quad} (for constants {A,} {B,} and {C})
We now have two different ways to describe a line:
{y = mx + b}
{(x, y) = \mathbf{a} + t \mathbf{d}}
In the first function, {y} is dependent on {x.} This means that we must first determine a value for {x} before we can determine the corresponding value of {y.}
In the second function, both {x} and {y} are dependent on {t.} This means that we must first determine a value for {t,} then, once we do, we can determine the corresponding values of both {x} and {y.}
The second approach is the way we define functions in vector mathematics.
When using vectors, our goal is to always keep {x} and {y} packaged together into the vector {(x,y),} and to define both at the same time, as a pair. We don’t consider either {x} or {y} to be dependent on the other.
The name {\text{“}t\text{”}} was chosen because it’s often helpful to think of it as time. So, for example, an increase in {t} could be seen as moving forward in time.
When a function is defined using this approach, it’s called a parametric function, and {t} is called the parameter of the function. If it’s expressed as an equation (like above), it can also be called a parametric equation.
A function like {y = mx + b} is called a real function.
This type of function takes one
In contrast, a parametric function lifts those restrictions. It can produce different types of outputs, such as vectors, matrices, or complex numbers, and it can even allow multiple “inputs” as well.
Let’s see how we can describe a circle on the XY plane using vectors.
To start, observe that if {r} represents the distance between the origin and any point {(x,y)} on the plane, then those three variables satisfy the Pythagorean theorem:
{x^2 + y^2 = r^2}
To describe a circle, we can set {r} to a constant value, and let {x} and {y} vary. The resulting collection of all the {(x,y)} points that satisfy the equation will form a circle of radius {r} that’s centered at the origin.
It’s common to solve this equation for {y} to obtain two real functions that produce a circle when graphed together on the XY plane:
{y = \phantom{-} \sqrt{r^2 - x^2}} (the top half of the circle only)
{y = - \sqrt{r^2 - x^2}} (the bottom half of the circle only)
However, in vector mathematics, we prefer to use parametric functions, so that we can keep {x} and {y} together as a vector. This can be done by introducing a new variable {t} and rearranging things so that the expressions appear inside of vectors, like this:
{(x,y) = (t, \phantom{-} \sqrt{r^2 - t^2})} (the top half of the circle only)
{(x,y) = (t, - \sqrt{r^2 - t^2})} (the bottom half of the circle only)
(Here, {t} varies only from {-r} to {r} as each half of the circle is being drawn, because the square root is not a real number outside that range.)
This works, but it’s not the most concise way to describe a circle with vectors. Fortunately, there’s a single parametric function that elegantly describes the entire circle:
{(x,y) = r(\cos t, \sin t)}
In the next section, we’ll see why this function describes a circle. But first, let’s translate this circle to another location on the plane by adding it to a position vector {\mathbf{c},} which defines the center of the circle:
{(x,y) = \mathbf{c} + r(\cos t, \sin t)}
Now, all the vectors produced by {r(\cos t, \sin t)} will be translated to the tip of {\mathbf{c},} allowing us to describe circles located anywhere on the XY plane, simply by changing the center position {\mathbf{c}.}
Here are the steps to derive the parametric function for the circle using trigonometry:
First, let’s define a circle of radius {r} that’s centered on the origin. If {(x,y)} is any point on that circle, then those three variables satisfy the Pythagorean theorem:
{x^2 + y^2 = r^2}
Second, notice that the following equation is true for all values of {t}:
{\cos^2 t + \sin^2 t = 1}
You can see why this equation is always true by applying the Pythagorean theorem to the following right triangle:
Notice that the above two equations are very similar in
{\cos^2 t + \sin^2 t = 1} (repeated from step 2 above)
{r^2 \cos^2 t + r^2 \sin^2 t = r^2} (multiply by {r^2})
{(r \cos t)^2 + (r \sin t)^2 = r^2} (combine the squares)
{x^2 + y^2 = r^2} (repeated from step 1 above)
{x = r \cos t} (equate like terms in the previous two equations)
{y = r \sin t}
{(x,y) = (r \cos t, r \sin t)} (express the previous two equations as vectors)
{(x,y) = r(\cos t, \sin t)} (factor out {r})
{(x,y) = \mathbf{c} + r(\cos t, \sin t)} (translate the circle to the desired center {\mathbf{c}})
This produces the parametric function for the circle:
{(x,y) = \mathbf{c} + r(\cos t, \sin t)}
As the above diagram shows, the parameter {t} represents an angle. This allows {t} to easily identify a position on the circle. Here are some examples:
if {\ t = 0°\ } then {\ r(\cos 0°, \sin 0°)} {\; = (r, 0)}
if {\ t = 90°\ } then {\ r(\cos 90°, \sin 90°)} {\; = (0, r)}
if {\ t = 180°\ } then {\ r(\cos 180°, \sin 180°)} {\; = (-r, 0)}
if {\ t = 270°\ } then {\ r(\cos 270°, \sin 270°)} {\; = (0, -r)}
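The parametric function for the circle can be sketched in Python like this. The function name `circle_point` is hypothetical, and the angle is taken in degrees only to match the examples above; Python's trigonometric functions themselves work in radians.

```python
import math


def circle_point(c, r, t_degrees):
    """Return c + r(cos t, sin t) for an angle t given in degrees."""
    t = math.radians(t_degrees)
    return (c[0] + r * math.cos(t), c[1] + r * math.sin(t))


center = (1, -1)
radius = 2
for angle in (0, 90, 180, 270):
    x, y = circle_point(center, radius, angle)
    # Every point produced should sit exactly one radius away from the center.
    dist = math.hypot(x - center[0], y - center[1])
    print(angle, round(x, 6), round(y, 6), round(dist, 6))
```

Because of floating-point rounding, values that should be exactly zero (such as the cosine of 90°) come out as tiny nonzero numbers, which is why the sketch rounds before printing.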
Given these two vectors:
{\mathbf{a} = (3,4)}
{\mathbf{b} = (5,2)}
you can calculate their dot product as follows:
The resulting sum, {23,} is the dot product of {\mathbf{a}} and {\mathbf{b},} and you can write it like this:
{\mathbf{a} \cdot \mathbf{b} = 23}
or, in more detail:
{\mathbf{a} \cdot \mathbf{b} \ = \ (\mathbf{a} _1, \mathbf{a} _2) \cdot (\mathbf{b} _1, \mathbf{b} _2) \ = \ (3,4) \cdot (5,2) \ = \ 3 \mathord{\times} 5 + 4 \mathord{\times} 2 \ = \ 23}
The left side of the equation is pronounced “{\mathbf{a}} dot {\mathbf{b}}”.
The dot product can be described as “the sum of the component-wise products”.
In general, the dot product is defined for vectors of all dimensions
{\mathbf{a} \cdot \mathbf{b} \ = \ (\mathbf{a} _1, \mathbf{a} _2, \mathbf{a} _3, \text{...}, \mathbf{a} _n) \cdot (\mathbf{b} _1, \mathbf{b} _2, \mathbf{b} _3, \text{...}, \mathbf{b} _n) }
{\phantom{\mathbf{a} \cdot \mathbf{b}} \ = \ \mathbf{a} _1 \mathbf{b} _1 + \mathbf{a} _2 \mathbf{b} _2 + \mathbf{a} _3 \mathbf{b} _3 + \text{...} + \mathbf{a} _n \mathbf{b} _n }
The two vectors {\mathbf{a}} and {\mathbf{b}} must have the same dimension, otherwise their dot product is undefined.
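The general definition translates almost directly into Python. In this sketch, the function name `dot` is chosen for the example, and a mismatch in dimension is reported as an error because the dot product is undefined in that case.

```python
def dot(a, b):
    """Dot product: the sum of the component-wise products."""
    if len(a) != len(b):
        raise ValueError("dot product is undefined for different dimensions")
    return sum(x * y for x, y in zip(a, b))


print(dot((3, 4), (5, 2)))  # 23, from 3*5 + 4*2
```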
The dot product can be used to calculate the sum of proportionately-weighted values.
Here are two examples of this:
Example #1:
Let’s say that you have:
{4} ten-dollar bills,
{3} five-dollar bills, and
{6} one-dollar bills.
How much money do you have?
To determine the answer, you first need to convert the number of bills to their dollar value, and then you can find the sum of those dollar values.
To do this, first define two vectors to represent the data:
{\mathbf{a} = (4, 3, 6)}
{\mathbf{b} = (10, 5, 1)}
and then take the dot product of the two vectors:
{\mathbf{a} \cdot \mathbf{b} \ = \ (4, 3, 6) \cdot (10, 5, 1) \ = \ 4 \mathord{\times} 10 + 3 \mathord{\times} 5 + 6 \mathord{\times} 1 \ = \ 61}
to obtain the final answer: {61} dollars.
Example #2:
Let’s construct the number {7402} digit-by-digit.
To do this, first define a vector that contains the individual digits:
{\mathbf{a} = (7, 4, 0, 2)}
and then define another vector that contains the “weights” of each digit:
{\mathbf{b} = (10^{\large 3}, 10^{\large 2}, 10^{\large 1}, 10^{\large 0})}
and then use the dot product to find the “weighted sum” of the digits:
{\mathbf{a} \cdot \mathbf{b} \ = \ (7, 4, 0, 2) \cdot (10^{\large 3}, 10^{\large 2}, 10^{\large 1}, 10^{\large 0}) \ = \ 7000 + 400 + 0 + 2 \ = \ 7402}
This example shows that the dot product is both ubiquitous and indispensable in everyday mathematics. Every person must perform this dot product mentally in order to understand the numerical value of a multi-digit number.
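Both weighted-sum examples can be reproduced with a brief Python sketch. The helper names `dot` and `from_digits` are invented here, and the weights are generated as powers of ten to match the second example.

```python
def dot(a, b):
    """Dot product: the sum of the component-wise products."""
    if len(a) != len(b):
        raise ValueError("dot product is undefined for different dimensions")
    return sum(x * y for x, y in zip(a, b))


def from_digits(digits):
    """Rebuild a number from its decimal digits as a weighted sum."""
    n = len(digits)
    weights = [10 ** (n - 1 - i) for i in range(n)]  # e.g. 1000, 100, 10, 1
    return dot(digits, weights)


print(dot((4, 3, 6), (10, 5, 1)))  # 61 dollars
print(from_digits((7, 4, 0, 2)))   # 7402
```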
What happens if you take the dot product of a vector with itself? Here, we do it in two dimensions:
{\mathbf{a} \cdot \mathbf{a} \ = \ (\mathbf{a} _1, \mathbf{a} _2) \cdot (\mathbf{a} _1, \mathbf{a} _2) \ = \ \mathbf{a} _1^2 + \mathbf{a} _2^2 \ = \ |\mathbf{a}|^2 }
This collapses to:
{\mathbf{a} \cdot \mathbf{a} \ = \ |\mathbf{a}|^2 }
If you take the square root of both sides of this equation, you get a concise way of defining vector length:
{\sqrt{\mathbf{a} \cdot \mathbf{a}} \ = \ |\mathbf{a}| }
This result holds for vectors of all dimensions {n \ge 1}.
It’s interesting to observe that the Pythagorean theorem is a special case
of the
Repeating the first equation from above, you can see that the Pythagorean theorem appears on the right side:
{\mathbf{a} \cdot \mathbf{a} \ = \ (\mathbf{a} _1, \mathbf{a} _2) \cdot (\mathbf{a} _1, \mathbf{a} _2) \ = \ \mathbf{a} _1^2 + \mathbf{a} _2^2 \ = \ |\mathbf{a}|^2 }
Here’s a vector diagram showing the right triangle that corresponds to
the Pythagorean theorem
So far, we have used this definition of the dot product:
{\mathbf{a} \cdot \mathbf{b} \ = \ (\mathbf{a} _1, \mathbf{a} _2, \mathbf{a} _3, \text{...}, \mathbf{a} _n) \cdot (\mathbf{b} _1, \mathbf{b} _2, \mathbf{b} _3, \text{...}, \mathbf{b} _n) }
{\phantom{\mathbf{a} \cdot \mathbf{b}} \ = \ \mathbf{a} _1 \mathbf{b} _1 + \mathbf{a} _2 \mathbf{b} _2 + \mathbf{a} _3 \mathbf{b} _3 + \text{...} + \mathbf{a} _n \mathbf{b} _n }
It turns out that there’s another definition of the dot product that defines it exclusively in terms of lengths and angles:
{\mathbf{a} \cdot \mathbf{b} \ = \ |\mathbf{a}| |\mathbf{b}| \cos \theta }
where {\theta} is the smaller of the two angles between the vectors {\mathbf{a}} and {\mathbf{b}.}
This result holds for vectors of all dimensions {n \ge 2.} However, {\mathbf{a}} and {\mathbf{b}} must both lie on the same plane.
The dot product has the following property:
For example, the vectors {\mathbf{a} = (3,4)} and {\mathbf{b} = (8,-6)} are perpendicular to each other, so let’s take their dot product to confirm that it’s {0}:
{\mathbf{a} \cdot \mathbf{b} \ = \ (3,4) \cdot (8,-6) \ = \ 3 \mathord{\times} 8 + 4 \mathord{\times} (-6) \ = \ 24 - 24 \ = \ 0}
Perpendicular vectors intersect at a {90°} angle, so let’s plug {\theta = 90°} into the geometric version of the dot product, and see what happens:
{\mathbf{a} \cdot \mathbf{b} \ = \ |\mathbf{a}| |\mathbf{b}| \cos \theta \ = \ |\mathbf{a}| |\mathbf{b}| \cos 90° \ = \ 0}
This proves that the dot product of all perpendicular vectors is {0}.
This result holds for vectors of all dimensions {n \ge 2.} However, {\mathbf{a}} and {\mathbf{b}} must both lie on the same plane.
In general, for 2D vectors, we can say:
The vector {(x,y)} is perpendicular to both {(-y, x)} and {(y, -x).} Rotating {(x,y)} counterclockwise to the perpendicular produces {(-y,x).} Rotating {(x,y)} clockwise to the perpendicular produces {(y,-x).}
We can prove they’re perpendicular by showing that their dot products are {0}:
{(x,y) \cdot (-y,x) \ = \ -xy + xy \ = \ 0}
{(x,y) \cdot (y,-x) \ = \ xy - xy \ = \ 0}
And, consequently, perpendicular lines have slopes that are negative reciprocals of each other:
The slope of {(x,y)} is [[ \ m = {{y \over x}} ]]
The slope of {(-y,x)} is [[ \ m^\prime = {x \over -y} \, = \, -{1 \over m}]]
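The two perpendicular rotations are simple enough to check with a small Python sketch. The function names `rotate_ccw` and `rotate_cw` are made up for this example, which applies only to 2D vectors.

```python
def rotate_ccw(v):
    """Rotate a 2D vector counterclockwise to the perpendicular: (x, y) -> (-y, x)."""
    x, y = v
    return (-y, x)


def rotate_cw(v):
    """Rotate a 2D vector clockwise to the perpendicular: (x, y) -> (y, -x)."""
    x, y = v
    return (y, -x)


def dot(a, b):
    """Dot product: the sum of the component-wise products."""
    return sum(p * q for p, q in zip(a, b))


a = (3, 4)
print(rotate_ccw(a), dot(a, rotate_ccw(a)))  # (-4, 3) 0
print(rotate_cw(a), dot(a, rotate_cw(a)))    # (4, -3) 0
```

Neither rotation changes the length of the vector, since the same two numbers are squared and summed either way.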
Recall that the dot product in geometric form is:
{\mathbf{a} \cdot \mathbf{b} \ = \ |\mathbf{a}| |\mathbf{b}| \cos \theta }
Here are some vector exercises:
Calculate {(2, 0, -5) \cdot (-4, 1, -3)}
Are the vectors {(3, 0, 2)} and {(5, -1, -8)} perpendicular?
What vector is produced when you rotate {(-2, -4)} counterclockwise to the perpendicular, keeping its tail at the origin, and its length the same?
Two vectors have a dot product of {10,} and their tails meet at a {60°} angle. If one of the vectors has length {4,} what is the length of the other vector?
It’s useful to solve the geometric version of the dot product for the angle {\theta}:
[[ \mathbf{a} \cdot \mathbf{b} \ = \ |\mathbf{a}| |\mathbf{b}| \cos \theta ]]
[[ {{\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{a}| |\mathbf{b}|}} = \cos \theta ]]
[[ \arccos \left( {{\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{a}| |\mathbf{b}|}} \right) = \theta ]]
We now have an easy formula for finding the angle {\theta} between any two vectors. For example, to find the angle between {(3,4)} and {(5,12),} you can do the following:
Find their dot product:
{\mathbf{a} \cdot \mathbf{b} \ = \ (3,4) \cdot (5,12) \ = \ 3 \mathord{\times} 5 + 4 \mathord{\times} 12 \ = \ 15 + 48 \ = \ 63 }
Find both of their lengths:
{|\mathbf{a}| \ = \ \sqrt{\mathbf{a} \cdot \mathbf{a}} \ = \ \sqrt{(3,4) \cdot (3,4)} \ = \ \sqrt{3 \mathord{\times} 3 + 4 \mathord{\times} 4} \ = \ 5}
{|\mathbf{b}| \ = \ \sqrt{\mathbf{b} \cdot \mathbf{b}} \ = \ \sqrt{(5,12) \cdot (5,12)} \ = \ \sqrt{5 \mathord{\times} 5 + 12 \mathord{\times} 12} \ = \ 13}
Plug them into the formula:
[[ \theta \ = \, \arccos \left( {{\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{a}| |\mathbf{b}|}} \right) = \, \arccos \left( {63 \over {5 \mathord{\times} 13} } \right) = \, 14.25° ]]
The result {\theta} is always the smaller of the two angles between {\mathbf{a}} and {\mathbf{b},} and is never negative. The other angle between {\mathbf{a}} and {\mathbf{b}} is {360° - \theta.}
This formula holds for vectors of all dimensions {n \ge 2,} but {\mathbf{a}} and {\mathbf{b}} must both lie on the same plane.
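The angle formula translates directly into code. The following Python sketch is one possible way to write it; the names `dot`, `length`, and `angle_between` are assumptions made for this example, and the clamping step is an added safeguard against floating-point rounding, not part of the formula itself.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def length(a):
    """Length of a vector, computed as the square root of a . a."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Smaller angle between a and b, in degrees (from 0 to 180)."""
    la, lb = length(a), length(b)
    if la == 0 or lb == 0:
        raise ValueError("the angle is undefined for a zero-length vector")
    cos_theta = dot(a, b) / (la * lb)
    # Rounding can push the ratio slightly outside [-1, 1], which acos rejects.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

print(round(angle_between((3, 4), (5, 12)), 2))  # 14.25
```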
Slope is an imperfect way to measure direction on the XY plane. It’s incapable of distinguishing between directions that are {180°} apart, and it can’t represent vertical directions.
To solve these deficiencies, we can measure direction on the XY plane using angles instead.
The standard way of measuring a vector’s angle is to calculate the angle that it forms with the positive X axis.
Specifically, the angle of vector {\mathbf{a}} is the angle from {(1,0)} to {\mathbf{a},} where:
Here are the steps to derive the angle of {\mathbf{a}}:
Start with the general angle formula, which finds the smaller of the two angles between {\mathbf{a}} and {\mathbf{b}}:
[[ \theta = \arccos \left( {{\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{a}| |\mathbf{b}|}} \right) ]]
Set {\mathbf{b} = (1,0)} and simplify the result:
[[ \theta = \arccos \left( {{\mathbf{a} \cdot (1,0)} \over {|\mathbf{a}| |(1,0)|}} \right) ]]
[[ \theta = \arccos \left( {{\mathbf{a} _1} \over {|\mathbf{a}|}} \right) ]]
The angles {\theta} and {-\theta} both have the same cosine, so the above equation must be enhanced to distinguish between those two angles. This gives us a formula for vector angle that works in all cases:
[[ \theta = \begin{cases} \phantom{-} \arccos \left( \displaystyle {{\mathbf{a} _1} \over {|\mathbf{a}|}} \right) \space\space \text{ if } \mathbf{a} _2 \ge 0 \cr - \arccos \left( \displaystyle {{\mathbf{a} _1} \over {|\mathbf{a}|}} \right) \space\space \text{ if } \mathbf{a} _2 < 0 \cr \end{cases} ]]
This formula produces an angle {\theta} that’s in the range {-180° < \theta \le 180°, } and {\theta} will always have the same sign as the vector’s {y} component. This means that all vectors having a downward vertical displacement also have a negative angle. This convention was chosen because it’s natural to associate downward with negative.
Notice that if you start at the vector {(1,0)} and begin rotating clockwise, the resulting vector will begin pointing downward, and will therefore have a negative angle. This explains why clockwise rotation was chosen to be the negative direction of rotation.
If you like, you can eliminate negative angles by adding {360°} to any negative angle. You will then obtain an angle {\theta} in the range {0° \le \theta < 360°.}
As a special case, the angle of the vector {(0,0)} is undefined.
This formula does not require {\mathbf{a}} to be a position vector. The tail may be located anywhere on the XY plane.
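Here is a minimal Python sketch of the two-case angle formula, together with the optional conversion to the range {0° \le \theta < 360°.} The function names `vector_angle` and `to_positive` are hypothetical, and the zero vector is rejected with an exception as one possible way of treating its undefined angle.

```python
import math

def vector_angle(a):
    """Angle of the 2D vector a from the positive X axis, in degrees.

    The result lies in the range -180 < theta <= 180.
    """
    a1, a2 = a
    length = math.hypot(a1, a2)
    if length == 0:
        raise ValueError("the angle of (0, 0) is undefined")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, a1 / length))
    theta = math.degrees(math.acos(cos_theta))
    # arccos cannot tell theta from -theta, so the sign comes from a2.
    return theta if a2 >= 0 else -theta

def to_positive(theta):
    """Map a negative angle into the range 0 <= theta < 360."""
    return theta + 360.0 if theta < 0 else theta

print(round(vector_angle((1, 1)), 2))                # 45.0
print(round(vector_angle((0, -1)), 2))               # -90.0
print(round(to_positive(vector_angle((0, -1))), 2))  # 270.0
```

For ordinary nonzero inputs, Python’s standard `math.atan2(a2, a1)` should give the same angle in radians, apart from floating-point corner cases such as a negative zero in the second component.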
This formula applies to
The dot product can be used to rotate a vector by any angle {\theta.}
Here’s how it works in {2} dimensions:
If you start at vector {\mathbf{a}} and rotate by angle {\theta,} you will land on a new vector that we will call {\mathbf{b}.} Both {\mathbf{a}} and {\mathbf{b}} are position vectors. During the rotation, the vector’s tail remains stationary at the origin, its tip moves in a circular arc, and its length remains constant.
You can find vector {\mathbf{b}} with the following formula:
{\mathbf{b} = (\mathbf{a} \cdot \mathbf{R1}, \ \mathbf{a} \cdot \mathbf{R2})}
where the vectors {\mathbf{R1}} and {\mathbf{R2}} are defined as:
{\mathbf{R1} = (\cos \theta, \; -\sin \theta)} {\mathbf{R2} = (\sin \theta, \; \cos \theta)}
We can expand this formula out and write it as:
{\mathbf{b} = (\mathbf{a} \cdot (\cos \theta, \; -\sin \theta), \ \mathbf{a} \cdot (\sin \theta, \; \cos \theta))}
The direction of rotation is:
counterclockwise if {\theta > 0}
clockwise if {\theta < 0}
Let’s perform a {30°} rotation starting from the vector {(3,4),} going counterclockwise.
Set {\mathbf{a} = (3,4)} and {\theta = 30°,} and define the vectors that will accomplish the rotation:
{\mathbf{R1} = (\cos 30°, \; -\sin 30°)} {\; = (0.866, -0.5)}
{\mathbf{R2} = (\sin 30°, \; \cos 30°)} {\; = (0.5, 0.866)}
We can now solve the problem by plugging these values into the formula:
{\mathbf{b} = (\mathbf{a} \cdot \mathbf{R1}, \ \mathbf{a} \cdot \mathbf{R2})}
{\mathbf{b} = ((3,4) \cdot (0.866, -0.5), \ (3,4) \cdot (0.5, 0.866))}
{\mathbf{b} = (0.598, 4.964)}
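The same rotation can be reproduced numerically. This is a small Python sketch, assuming angles are given in degrees; the function name `rotate` and the lowercase names `r1` and `r2` (standing in for {\mathbf{R1}} and {\mathbf{R2}}) are choices made for this example.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rotate(a, theta_degrees):
    """Rotate the 2D position vector a about the origin.

    Positive angles rotate counterclockwise, negative angles clockwise.
    """
    t = math.radians(theta_degrees)
    r1 = (math.cos(t), -math.sin(t))
    r2 = (math.sin(t), math.cos(t))
    return (dot(a, r1), dot(a, r2))

b = rotate((3, 4), 30)
print(round(b[0], 3), round(b[1], 3))  # 0.598 4.964
```

Since rotation preserves length, comparing `math.hypot(*b)` with `math.hypot(3, 4)` is a quick sanity check on the result.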
Let’s solve this problem:
A planet is revolving around its sun in a circular orbit. If the planet is at location {\mathbf{p}} and the sun is at location {\mathbf{s},} then where will the planet be after it advances an angle of {\theta} in its orbit? (For example, if the planet advances an angle of {\theta = 90°,} it would advance one-fourth of a revolution around the sun.) The planet’s orbit is contained entirely within one plane, so all positions can be described with {2}-dimensional vectors.
Here’s the solution:
We know that if you start at vector {\mathbf{a}} and rotate by angle {\theta} around the origin, the new vector is:
{(\mathbf{a} \cdot \mathbf{R1}, \ \mathbf{a} \cdot \mathbf{R2})}
where:
{\mathbf{R1} = (\cos \theta, \; -\sin \theta)} {\mathbf{R2} = (\sin \theta, \; \cos \theta)}
Observe that {\mathbf{p} - \mathbf{s}} is the displacement vector from the sun to the planet.
Rotate {\mathbf{p} - \mathbf{s}} by {\theta} to find the planet’s new displacement after advancing in its orbit. We can do this rotation by setting {\mathbf{a} = \mathbf{p} - \mathbf{s}} and substituting:
{(\mathbf{a} \cdot \mathbf{R1}, \ \mathbf{a} \cdot \mathbf{R2}) \ = \ ((\mathbf{p} - \mathbf{s}) \cdot \mathbf{R1}, \ (\mathbf{p} - \mathbf{s}) \cdot \mathbf{R2})}
Translate this new displacement to the sun to find the new position of the planet:
{\mathbf{s} + ((\mathbf{p} - \mathbf{s}) \cdot \mathbf{R1}, \ (\mathbf{p} - \mathbf{s}) \cdot \mathbf{R2})}
Optionally, you can expand it out to eliminate {\mathbf{R1}} and {\mathbf{R2}}:
{\mathbf{s} + ((\mathbf{p} - \mathbf{s}) \cdot (\cos \theta, \; -\sin \theta), \ (\mathbf{p} - \mathbf{s}) \cdot (\sin \theta, \; \cos \theta))}
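The orbit solution can be sketched in Python as follows. The function name `advance_orbit` and the sample coordinates are invented for illustration; the sketch assumes the angle is given in degrees and that a positive angle means counterclockwise motion.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def advance_orbit(p, s, theta_degrees):
    """New position of a planet at p after advancing theta around a sun at s."""
    t = math.radians(theta_degrees)
    r1 = (math.cos(t), -math.sin(t))
    r2 = (math.sin(t), math.cos(t))
    # Displacement from the sun to the planet: p - s.
    d = (p[0] - s[0], p[1] - s[1])
    # Rotate the displacement, then translate it back to the sun.
    rotated = (dot(d, r1), dot(d, r2))
    return (s[0] + rotated[0], s[1] + rotated[1])

# Hypothetical example: planet at (5, 2), sun at (2, 2), a quarter revolution.
new_p = advance_orbit((5, 2), (2, 2), 90)
print(new_p)  # approximately (2, 5), up to floating-point rounding
```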
Here’s a diagram that shows the projection of vector {\mathbf{a}} onto vector {\mathbf{b},} creating a new vector {\mathbf{c}}:
The vector {\mathbf{c}} is called the “projection of {\mathbf{a}} onto {\mathbf{b}}”.
Here are some ways to interpret this diagram:
When you move from the tail of {\mathbf{a}} to the tip of {\mathbf{a},} it’s clear that you’re moving, at least to some extent, in the direction of {\mathbf{b}.} The vector {\mathbf{c}} shows how much of that movement occurs in the direction of {\mathbf{b}.}
Imagine that
If you look at all the points that lie along
Observe that the gray line connecting {\mathbf{a}} and {\mathbf{c}} is perpendicular to {\mathbf{b}.}
Let’s continue with the projection example that we used in the previous section:
This shows the projection of the vector {\mathbf{a}} onto vector {\mathbf{b},} creating a new vector {\mathbf{c}.}
Projection is defined in two ways:
Scalar projection
In the above diagram, {\mathbf{c}} shows how much of
The definition of {s} is:
{ s = \mathbf{a} \cdot {\LARGE{\hat{\normalsize \mathbf{b}}}} }
The value of {s} is either {|\mathbf{c}|} or {-|\mathbf{c}|,} determined as follows:
If {\theta} is an acute angle then {s = |\mathbf{c}| \quad} {(|\theta| < 90°)}
If {\theta} is an obtuse angle then {s = -|\mathbf{c}| \quad} {(90° < |\theta| \le 180°)}
If {\theta} is a right angle then {s = |\mathbf{c}| = 0 \quad} {(|\theta| = 90°)}
In the above diagram, notice that if {\theta} becomes large enough to be an obtuse angle, then {\mathbf{a}} will be pointing, at least to some extent, in the opposite direction of {\mathbf{b}.} When that happens, the scalar projection {s} becomes negative.
If you happen to know the value of {\theta} instead of {\mathbf{b},} then you can use a different (but equivalent) definition of the scalar projection:
{ s = |\mathbf{a}| \cos \theta }
Scalar projection is defined for vectors of all dimensions {n \ge 2,} but {\mathbf{a}} and {\mathbf{b}} must both lie on the same plane.
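Both definitions of the scalar projection can be sketched in Python. The function names below are assumptions for this example, and the sample vectors were picked so that one projection is positive (acute angle) and one is negative (obtuse angle).

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def unit(b):
    """Unit vector in the direction of b."""
    lb = math.sqrt(dot(b, b))
    if lb == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(x / lb for x in b)

def scalar_projection(a, b):
    """Scalar projection of a onto b: s = a . b_hat."""
    return dot(a, unit(b))

def scalar_projection_from_angle(length_a, theta_degrees):
    """Equivalent definition when the angle is known: s = |a| cos(theta)."""
    return length_a * math.cos(math.radians(theta_degrees))

print(scalar_projection((3, 4), (1, 0)))   # 3.0  (acute angle, s is positive)
print(scalar_projection((-3, 4), (1, 0)))  # -3.0 (obtuse angle, s is negative)
```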
Vector projection
In the above diagram, the vector {\mathbf{c}} is called the {\text{“}}vector projection of {\mathbf{a}} onto {\mathbf{b}\text{”},} or simply the {\text{“}}projection of {\mathbf{a}} onto {\mathbf{b}\text{”}.}
The definition of {\mathbf{c}} is:
{ \mathbf{c} = { s {\LARGE{\hat{\normalsize \mathbf{b}}}} } }
The vector {\mathbf{c}} has length {|s|,} and it points in the direction of {\mathbf{b},} except that it points in the opposite direction of {\mathbf{b}} if {s < 0.}
We can use the notation {\text{“proj}_ {\large{\mathbf{b}}} \ \mathbf{a}\text{”}} to represent vector projection in general:
{ {\text{proj}_ {\large{\mathbf{b}}} \ \mathbf{a}} = { s {\LARGE{\hat{\normalsize \mathbf{b}}}} } \quad }
{(}where { s = \mathbf{a} \cdot {\LARGE{\hat{\normalsize \mathbf{b}}}}) }
The left side of the equation is pronounced {\text{“}}the projection of {\mathbf{a}} onto {\mathbf{b}\text{”.}}
Vector projection is defined for vectors of all dimensions {n \ge 2,} but {\mathbf{a}} and {\mathbf{b}} must both be position vectors.
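A possible Python rendering of the vector projection definition is shown below. The name `vector_projection` is hypothetical, and the inputs are simple made-up vectors so that the results are easy to verify by eye.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def vector_projection(a, b):
    """Vector projection of a onto b: c = s * b_hat, where s = a . b_hat."""
    lb = math.sqrt(dot(b, b))
    if lb == 0:
        raise ValueError("cannot project onto the zero vector")
    b_hat = tuple(x / lb for x in b)
    s = dot(a, b_hat)
    return tuple(s * x for x in b_hat)

print(vector_projection((3, 4), (5, 0)))        # (3.0, 0.0)
print(vector_projection((1, 2, 3), (0, 0, 2)))  # (0.0, 0.0, 3.0)

# The leftover part a - c is perpendicular to b, so its dot product with b is 0.
a, b = (3, 4), (5, 0)
c = vector_projection(a, b)
print(dot((a[0] - c[0], a[1] - c[1]), b))       # 0.0
```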
Let’s solve this vector projection problem:
Given the point {\mathbf{a} = (3,4)} and the line {y = {\Large {x \over 2}},} how far away is {\mathbf{a}} from the closest point on the line?
Here’s a diagram of the problem:
The vector {\mathbf{b}} lies on the line {y = {\Large {x \over 2}}.} The point {\mathbf{c}} is the one point on the line that’s closest to {\mathbf{a}.} We want to find the distance between {\mathbf{a}} and {\mathbf{c}.}
Notice that the {\mathbf{b}} line is perpendicular to the line that connects {\mathbf{a}} and {\mathbf{c}.} Earlier, we observed that vector projection also has this same perpendicular relationship, so we can model this problem using vector projection.
Specifically, we want to find the projection of {\mathbf{a}} onto {\mathbf{b}.}
To better recognize this as a vector projection problem, we can explicitly draw the vector arrows for {\mathbf{a}} and {\mathbf{b}}:
Here are the steps to solve the problem:
First, let’s determine exactly where
{\mathbf{b} = (2,1)}
The projection formulas use the unit vector in the direction of {\mathbf{b}}:
[[ {\LARGE{\hat{\normalsize \mathbf{b}}}} \ = \ {{\mathbf{b}} \over {|\mathbf{b}|}} \ = \ {{(2,1)} \over {\sqrt{2^2+1^2}}} \ = \ {{(2,1)} \over {\sqrt{5}}} \ = \ \left( {2 \over {\sqrt{5}}}, {1 \over {\sqrt{5}}} \right) \ = \ (0.894, 0.447)]]
Now we can find the scalar projection {s}, which is the amount of {\mathbf{a}} that lies in the direction of {\mathbf{b}}:
[[ s \ = \ \mathbf{a} \cdot {\LARGE{\hat{\normalsize \mathbf{b}}}} \ = \ (3,4) \cdot \left( {2 \over {\sqrt{5}}}, {1 \over {\sqrt{5}}} \right) \ = \ { 6 \over {\sqrt{5}}} + { 4 \over {\sqrt{5}}} \ = \ { 10 \over {\sqrt{5}}} \ = \ 4.472 ]]
(Recall that {s} is the length of vector {\mathbf{c}.})
Now we can find the vector projection {\mathbf{c}}:
[[ \mathbf{c} \ = \ s {\LARGE{\hat{\normalsize \mathbf{b}}}} \ = \ { 10 \over {\sqrt{5}}} \left( {2 \over {\sqrt{5}}}, {1 \over {\sqrt{5}}} \right) \ = \ \left( {20 \over 5}, {10 \over 5} \right) \ = \ (4,2) ]]
And finally, we can find the distance between {\mathbf{a}} and {\mathbf{c}}:
[[ |\mathbf{a} - \mathbf{c}| \ = \ |(3,4) - (4,2)| \ = \ |(-1, 2)| \ = \ \sqrt{1 + 4} \ = \ \sqrt{5} \ = \ 2.236 ]]
And we obtain the answer: {\sqrt{5}.}
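The steps of this worked example can be retraced in Python. This is only a sketch: the function name `closest_point_and_distance` is invented here, and it assumes the line passes through the origin, as {y = x/2} does, with {\mathbf{b}} being any nonzero vector along the line.

```python
import math

def closest_point_and_distance(a, b):
    """Closest point to a on the line through the origin along b, and its distance."""
    # Unit vector in the direction of b.
    lb = math.hypot(b[0], b[1])
    if lb == 0:
        raise ValueError("b must be a nonzero vector")
    b_hat = (b[0] / lb, b[1] / lb)
    # Scalar projection of a onto b.
    s = a[0] * b_hat[0] + a[1] * b_hat[1]
    # Vector projection: the closest point on the line.
    c = (s * b_hat[0], s * b_hat[1])
    # Distance between a and c.
    distance = math.hypot(a[0] - c[0], a[1] - c[1])
    return c, distance

c, distance = closest_point_and_distance((3, 4), (2, 1))
print(round(c[0], 3), round(c[1], 3))  # 4.0 2.0
print(round(distance, 3))              # 2.236
```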
As a shortcut, we can combine steps 2, 3, and 4 together, and calculate {\mathbf{c}} directly from {\mathbf{a}} and {\mathbf{b}.} This is done by combining the three separate formulas together into a single formula for the vector projection:
[[ \mathbf{c} \ = \; \left( {\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{b}|^2} \right) \mathbf{b} \ = \; \left( {(3,4) \cdot (2,1)} \over {|(2,1)|^2} \right) (2,1) \ = \ \left( {{10} \over {5}} \right) (2,1) \ = \ (4,2) ]]
We can also create a shortcut formula for scalar projection, and calculate {s} directly from {\mathbf{a}} and {\mathbf{b}}:
[[ s \ = \; {{\mathbf{a} \cdot \mathbf{b}} \over {|\mathbf{b}|}} \ = \; {{(3,4) \cdot (2,1)} \over {\sqrt{2^2 + 1^2}}} \ = \ {{10} \over {\sqrt{5}}} \ = \ 4.472 ]]
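The two shortcut formulas avoid computing the unit vector explicitly. A brief Python sketch, with function names chosen only for this example:

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def vector_projection_shortcut(a, b):
    """c = ((a . b) / |b|^2) b, using the fact that |b|^2 equals b . b."""
    bb = dot(b, b)
    if bb == 0:
        raise ValueError("cannot project onto the zero vector")
    k = dot(a, b) / bb
    return tuple(k * x for x in b)

def scalar_projection_shortcut(a, b):
    """s = (a . b) / |b|."""
    bb = dot(b, b)
    if bb == 0:
        raise ValueError("cannot project onto the zero vector")
    return dot(a, b) / math.sqrt(bb)

print(vector_projection_shortcut((3, 4), (2, 1)))            # (4.0, 2.0)
print(round(scalar_projection_shortcut((3, 4), (2, 1)), 3))  # 4.472
```

Writing {|\mathbf{b}|^2} as {\mathbf{b} \cdot \mathbf{b}} means the vector projection needs no square root at all.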