There exist several points of view on what a system of linear equations, or in short a linear system, is. The first, naïve one is that it is simply a collection of $m$ linear equations with $n$ unknowns $x_1, x_2, \ldots, x_n$,
\[
\begin{aligned}
a_{1,1}x_1 + a_{1,2}x_2 + \cdots + a_{1,n}x_n &= b_1,\\
a_{2,1}x_1 + a_{2,2}x_2 + \cdots + a_{2,n}x_n &= b_2,\\
&\;\;\vdots\\
a_{m,1}x_1 + a_{m,2}x_2 + \cdots + a_{m,n}x_n &= b_m.
\end{aligned}
\]
To solve the system is to find all $n$-tuples of numbers $x_1, x_2, \ldots, x_n$ which satisfy all $m$ equations simultaneously.
If we denote by $A$ the $m\times n$ matrix with entries $a_{j,k}$, by $x = (x_1, x_2, \ldots, x_n)^T$ the column of unknowns, and by $b = (b_1, b_2, \ldots, b_m)^T$ the column of right sides,
then the above linear system can be written in the matrix form (as a matrix-vector equation)
\[
Ax = b.
\]
To solve the above equation is to find all vectors $x$ satisfying $Ax = b$.
And finally, recalling the “column by coordinate” rule of the matrix-vector multiplication, we can write the system as a vector equation
\[
x_1 a_1 + x_2 a_2 + \cdots + x_n a_n = b,
\]
where $a_k$ is the $k$th column of the matrix $A$, $a_k = (a_{1,k}, a_{2,k}, \ldots, a_{m,k})^T$, $k = 1, 2, \ldots, n$.
Note that these are essentially just three different representations of the same mathematical object.
Before explaining how to solve a linear system, let us notice that it does not matter what we call the unknowns: $x_k$, or something else. So, all the information necessary to solve the system is contained in the matrix $A$, which is called the coefficient matrix of the system, and in the vector (right side) $b$. Hence, all the information we need is contained in the following matrix
\[
(A \mid b),
\]
which is obtained by attaching the column $b$ to the matrix $A$. This matrix is called the augmented matrix of the system. We will usually put the vertical line separating $A$ and $b$ to distinguish between the augmented matrix and the coefficient matrix.
Linear systems are solved by the Gauss–Jordan elimination (which is sometimes called row reduction). By performing operations on rows of the augmented matrix of the system (i.e. on the equations), we reduce it to a simple form, the so-called echelon form. When the system is in the echelon form, one can easily write the solution.
There are three types of row operations we use:
Row exchange: interchange two rows of the matrix;
Scaling: multiply a row by a non-zero scalar $r$;
Row replacement: replace a row #$k$ by its sum with a constant multiple of a row #$j$, $j \ne k$; all other rows remain intact;
It is clear that the operations 1 and 2 do not change the set of solutions of the system; they essentially do not change the system.
As for the operation 3, one can easily see that it does not lose solutions. Namely, let a “new” system be obtained from an “old” one by a row operation of type 3. Then any solution of the “old” system is a solution of the “new” one.
To see that we do not gain anything extra, i.e. that any solution of the “new” system is also a solution of the “old” one, we just notice that row operations of type 3 are reversible, i.e. the “old” system can also be obtained from the “new” one by applying a row operation of type 3 (can you say which one?).
There is another, more “advanced” explanation why the above row operations are legal. Namely, every row operation is equivalent to the multiplication of the matrix from the left by one of the special elementary matrices.
Namely, the multiplication by the matrix that coincides with the identity matrix $I$ everywhere except in rows $j$ and $k$, where the diagonal entries are $0$ and the entries in positions $(j,k)$ and $(k,j)$ are $1$
(all other off-diagonal entries are $0$), just interchanges the rows number $j$ and number $k$. Multiplication by the matrix that coincides with $I$ except that the diagonal entry in row $k$ is replaced by a non-zero scalar $r$
(again all other off-diagonal entries are $0$) multiplies the row number $k$ by $r$. Finally, multiplication by the matrix that coincides with $I$ except for an extra entry $r$ in row $j$, column $k$ ($j \ne k$),
adds to the row #$j$ the row #$k$ multiplied by $r$, and leaves all other rows intact.
To see that the multiplication by these matrices works as advertised, one can just see how the multiplications act on vectors (columns).
A way to describe (or to remember) these elementary matrices: each of them is obtained by applying the corresponding row operation to the identity matrix $I$.
Note that all these matrices are invertible (compare with the reversibility of row operations). The inverse of the first matrix is the matrix itself. To get the inverse of the second one, one just replaces $r$ by $1/r$. And finally, the inverse of the third matrix is obtained by replacing $r$ by $-r$. To see that the inverses are indeed obtained this way, one again can simply check how they act on columns.
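For readers who like to experiment, here is a small numpy check (my own illustration, not part of the text) that left multiplication by such matrices really performs the corresponding row operations; the names `E_swap`, `E_scale`, `E_replace` are ad hoc.

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
I = np.eye(3)

# Type 1: interchange rows 0 and 2 (apply the operation to I first).
E_swap = I[[2, 1, 0], :]
assert np.allclose(E_swap @ A, A[[2, 1, 0], :])

# Type 2: multiply row 1 by the non-zero scalar r = 5.
E_scale = I.copy(); E_scale[1, 1] = 5.0
assert np.allclose((E_scale @ A)[1], 5.0 * A[1])

# Type 3: add r = -4 times row 0 to row 1.
E_replace = I.copy(); E_replace[1, 0] = -4.0
assert np.allclose((E_replace @ A)[1], A[1] - 4.0 * A[0])

# The inverses behave as described: the swap is its own inverse,
# and the inverse of the replacement has -r where the original had r.
assert np.allclose(np.linalg.inv(E_swap), E_swap)
assert np.allclose(np.linalg.inv(E_replace)[1, 0], 4.0)
```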
So, performing a row operation on the augmented matrix of the system $Ax = b$ is equivalent to the multiplication of the system (from the left) by a special invertible matrix $E$. Left multiplying the equality $Ax = b$ by $E$ we get that any solution of the equation
\[
Ax = b
\]
is also a solution of
\[
EAx = Eb.
\]
Multiplying this equation (from the left) by $E^{-1}$ we get that any of its solutions is a solution of the equation
\[
E^{-1}EAx = E^{-1}Eb,
\]
which is the original equation $Ax = b$. So, a row operation does not change the solution set of a system.
The main step of row reduction consists of three sub-steps:
Find the leftmost non-zero column of the matrix;
Make sure, by applying row operations of type 1 (row exchange), if necessary, that the first (the upper) entry of this column is non-zero. This entry will be called the pivot entry or simply the pivot;
“Kill” (i.e. make them 0) all non-zero entries below the pivot by adding (subtracting) an appropriate multiple of the first row to (from) each of the rows number $2, 3, \ldots, m$.
We apply the main step to a matrix, then we leave the first row alone and apply the main step to rows $2, \ldots, m$, then to rows $3, \ldots, m$, etc.
The point to remember is that after we subtract a multiple of a row from all rows below it (step 3), we leave it alone and do not change it in any way, not even interchange it with another row.
After applying the main step finitely many times (at most $m$ times), we get what is called the echelon form of the matrix.
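The main step above is easy to turn into code. The following Python function is a minimal sketch of the forward phase under the assumption of exact arithmetic (no pivoting strategy or tolerances); it is my own illustration, not an algorithm from the text.

```python
import numpy as np

def echelon_form(M):
    A = np.array(M, dtype=float)
    m, n = A.shape
    row = 0
    for col in range(n):
        if row == m:
            break
        # steps 1-2: find a row at or below `row` with a non-zero entry in `col`
        nonzero = [r for r in range(row, m) if A[r, col] != 0]
        if not nonzero:
            continue                                 # no pivot in this column
        A[[row, nonzero[0]]] = A[[nonzero[0], row]]  # row exchange
        # step 3: subtract multiples of the pivot row from the rows below it
        for r in range(row + 1, m):
            A[r] -= (A[r, col] / A[row, col]) * A[row]
        row += 1                                     # leave the pivot row alone
    return A

print(echelon_form([[0, 2, 4], [1, 1, 1], [2, 4, 8]]))
```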
Let us consider the following linear system:
The augmented matrix of the system is
Subtracting Row#1 from the second row, and subtracting Row#1 from the third one we get:
Multiplying the second row by we get
Adding Row#2 to the third row we obtain
Now we can use the so called back substitution to solve the system. Namely, from the last row (equation) we get . Then from the second equation we get
and finally, from the first row (equation)
So, the solution is
or in vector form
We can check the solution by computing the product $Ax$, where $A$ is the coefficient matrix, and verifying that it equals $b$.
Instead of using back substitution, we can do row reduction from bottom to top, killing all the entries above the main diagonal of the coefficient matrix: we start by multiplying the last row by , and the rest is pretty self-explanatory:
and we just read the solution off the augmented matrix.
We leave it as an exercise to the reader to formulate the algorithm for the backward phase of the row reduction.
A matrix is in echelon form if it satisfies the following two conditions:
All zero rows (i.e. the rows with all entries equal to $0$), if any, are below all non-zero rows.
For a non-zero row, let us call the leftmost non-zero entry the leading entry. Then the second property of the echelon form can be formulated as follows:
For any non-zero row its leading entry is strictly to the right of the leading entry in the previous row.
The leading entry in each row in echelon form is also called pivot entry, or simply pivot, because these entries are exactly the pivots we used in the row reduction.
A particular case of the echelon form is the so-called triangular form. We got this form in our example above. In this form the coefficient matrix is square ($n\times n$), all its entries on the main diagonal are non-zero, and all the entries below the main diagonal are zero. The right side, i.e. the rightmost column of the augmented matrix, can be arbitrary.
After the backward phase of the row reduction, we get what is called the reduced echelon form of the matrix: a coefficient matrix equal to the identity $I$, as in the above example, is a particular case of the reduced echelon form.
The general definition is as follows: we say that a matrix is in the reduced echelon form, if it is in the echelon form and
All pivot entries are equal to $1$;
All entries above the pivots are $0$. Note that all entries below the pivots are also $0$, because of the echelon form.
To get reduced echelon form from echelon form, we work from the bottom to the top and from the right to the left, using row replacement to kill all entries above the pivots.
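A sketch of this backward phase in Python, assuming the input is already in echelon form (for instance the output of the `echelon_form` sketch above); again this is an illustration, not the text's algorithm.

```python
import numpy as np

def reduced_echelon_form(A_echelon):
    A = np.array(A_echelon, dtype=float)
    m, n = A.shape
    # locate the pivots: the leading non-zero entry of every non-zero row
    pivots = []
    for r in range(m):
        cols = np.nonzero(A[r])[0]
        if cols.size:
            pivots.append((r, cols[0]))
    for r, c in reversed(pivots):          # bottom to top, right to left
        A[r] /= A[r, c]                    # make the pivot equal to 1
        for above in range(r):             # kill the entries above the pivot
            A[above] -= A[above, c] * A[r]
    return A
```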
An example of the reduced echelon form is the system with the coefficient matrix equal to the identity $I$. In this case, one just reads the solution off the reduced echelon form. In the general case, one can also easily read the solution from the reduced echelon form. For example, let the reduced echelon form of the system (augmented matrix) be
here we boxed the pivots. The idea is to move the variables, corresponding to the columns without pivot (the so-called free variables) to the right side. Then we can just write the solution.
or in the vector form
One can also find the solution from the echelon form by using back substitution: the idea is to work from bottom to top, moving all free variables to the right side.
Write the systems of equations below in matrix form and as vector equations:
Solve the systems. Write the answers in the vector form.
Find all solutions of the vector equation
where , and . What conclusion can you make about linear independence (dependence) of the system of vectors ?
All questions about existence of a solution and its uniqueness can be answered by analyzing pivots in the echelon (reduced echelon) form of the augmented matrix of the system. First of all, let us investigate the question of when the equation $Ax = b$ is inconsistent, i.e. when it does not have a solution. The answer follows immediately, if one just thinks about it:
a system is inconsistent (does not have a solution) if and only if there is a pivot in the last column of an echelon form of the augmented matrix, i.e. iff an echelon form of the augmented matrix has a row $(0, 0, \ldots, 0 \mid b)$, $b \ne 0$, in it.
Indeed, such a row corresponds to the equation $0\cdot x_1 + 0\cdot x_2 + \cdots + 0\cdot x_n = b$, $b \ne 0$, which does not have a solution. If we don’t have such a row, we just make the reduced echelon form and then read the solution off it.
Now, three more statements. Note, they all deal with the coefficient matrix, and not with the augmented matrix of the system.
A solution (if it exists) is unique iff there are no free variables, that is, if and only if the echelon form of the coefficient matrix has a pivot in every column;
Equation $Ax = b$ is consistent for all right sides $b$ if and only if the echelon form of the coefficient matrix has a pivot in every row.
Equation $Ax = b$ has a unique solution for any right side $b$ if and only if the echelon form of the coefficient matrix has a pivot in every column and every row.
The first statement is trivial, because free variables are responsible for all non-uniqueness. I should only emphasize that this statement does not say anything about the existence.
The second statement is a tiny bit more complicated. If we have a pivot in every row of the coefficient matrix, we cannot have the pivot in the last column of the augmented matrix, so the system is always consistent, no matter what the right side is.
Let us show that if we have a zero row in the echelon form of the coefficient matrix $A$, then we can pick a right side $b$ such that the system is not consistent. Let $A_e$ denote the echelon form of the coefficient matrix $A$. Then
\[
A_e = EA,
\]
where $E$ is the product of the elementary matrices corresponding to the row operations performed. If $A_e$ has a zero row, then its last row is also zero. Therefore, if we put $b_e = (0, \ldots, 0, 1)^T$ (all entries are $0$, except the last one), then the equation
\[
A_e x = b_e
\]
does not have a solution. Multiplying this equation by $E^{-1}$ from the left, and recalling that $E^{-1}A_e = A$, we get that the equation
\[
Ax = E^{-1} b_e
\]
does not have a solution.
Finally, statement 3 immediately follows from statements 1 and 2. ∎
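These criteria are easy to test numerically. In the hedged sketch below, counting pivots is replaced by computing ranks with numpy's `matrix_rank` (rank is formally introduced only later in the text, but it equals the number of pivots in an echelon form).

```python
import numpy as np

def analyze(A, b):
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    m, n = A.shape
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.hstack([A, b]))
    consistent = (rank_A == rank_Ab)          # no pivot in the last column
    unique = consistent and (rank_A == n)     # pivot in every column: no free variables
    return consistent, unique

print(analyze([[1, 1], [1, 1]], [1, 2]))      # (False, False): inconsistent
print(analyze([[1, 1], [0, 1]], [1, 2]))      # (True, True): unique solution
```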
From the above analysis of pivots we get several very important corollaries. The main observation we use is
In echelon form, any row and any column have no more than 1 pivot in it (it can have 0 pivots)
Questions as to when a system of vectors in $\mathbb{F}^m$ is a basis, a linearly independent or a spanning (generating) system, can be easily answered by the row reduction.
Let us have a system of vectors $v_1, v_2, \ldots, v_n \in \mathbb{F}^m$, and let $A = [v_1, v_2, \ldots, v_n]$ be the $m\times n$ matrix with columns $v_1, v_2, \ldots, v_n$. Then
The system $v_1, v_2, \ldots, v_n$ is linearly independent iff the echelon form of $A$ has a pivot in every column;
The system $v_1, v_2, \ldots, v_n$ is complete in $\mathbb{F}^m$ (spanning, generating) iff the echelon form of $A$ has a pivot in every row;
The system $v_1, v_2, \ldots, v_n$ is a basis in $\mathbb{F}^m$ iff the echelon form of $A$ has a pivot in every column and in every row.
The system $v_1, v_2, \ldots, v_n$ is linearly independent if and only if the equation
\[
x_1 v_1 + x_2 v_2 + \cdots + x_n v_n = 0
\]
has the unique (trivial) solution $x_1 = x_2 = \cdots = x_n = 0$, or equivalently, the equation $Ax = 0$ has the unique solution $x = 0$. By statement 1 above, this happens if and only if there is a pivot in every column of the echelon form of the matrix.
Similarly, the system $v_1, v_2, \ldots, v_n$ is complete in $\mathbb{F}^m$ if and only if the equation
\[
x_1 v_1 + x_2 v_2 + \cdots + x_n v_n = b
\]
has a solution for any right side $b \in \mathbb{F}^m$. By statement 2 above, this happens if and only if there is a pivot in every row of the echelon form of the matrix.
And finally, the system $v_1, v_2, \ldots, v_n$ is a basis in $\mathbb{F}^m$ if and only if the equation
\[
x_1 v_1 + x_2 v_2 + \cdots + x_n v_n = b
\]
has a unique solution for any right side $b \in \mathbb{F}^m$. By statement 3 this happens if and only if there is a pivot in every column and in every row of the echelon form of $A$. ∎
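As a quick illustration of Proposition 2.3.1 (my own sketch, not from the text), one can stack the vectors as columns and compare the number of pivots, i.e. the rank, with the numbers of rows and columns.

```python
import numpy as np

def classify(vectors):
    A = np.column_stack(vectors).astype(float)   # m x n, columns v_1, ..., v_n
    m, n = A.shape
    r = np.linalg.matrix_rank(A)                  # = number of pivots
    return {"independent": r == n,                # pivot in every column
            "spanning":    r == m,                # pivot in every row
            "basis":       r == n == m}

print(classify([[1, 0, 1], [0, 1, 1], [1, 1, 2]]))   # dependent, not a basis
print(classify([[1, 0, 0], [1, 1, 0], [1, 1, 1]]))   # a basis
```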
Any linearly independent system of vectors in $\mathbb{F}^m$ cannot have more than $m$ vectors in it.
Let a system $v_1, v_2, \ldots, v_n \in \mathbb{F}^m$ be linearly independent, and let $A$ be the $m\times n$ matrix with columns $v_1, v_2, \ldots, v_n$. By Proposition 2.3.1 the echelon form of $A$ must have a pivot in every column, which is impossible if $n > m$ (the number of pivots cannot be more than the number of rows). ∎
Any two bases in a vector space $V$ have the same number of vectors in them.
Let $v_1, v_2, \ldots, v_n$ and $w_1, w_2, \ldots, w_m$ be two different bases in $V$. Without loss of generality we can assume that $n \le m$. Consider an isomorphism $T: \mathbb{F}^n \to V$ defined by
\[
T e_k = v_k, \qquad k = 1, 2, \ldots, n,
\]
where $e_1, e_2, \ldots, e_n$ is the standard basis in $\mathbb{F}^n$. Then the vectors $T^{-1}w_1, T^{-1}w_2, \ldots, T^{-1}w_m$ form a linearly independent system in $\mathbb{F}^n$ (an isomorphism preserves linear independence), so by Proposition 2.3.2 we have $m \le n$, and therefore $m = n$. ∎
The statement below is a particular case of the above proposition.
Any basis in $\mathbb{F}^n$ must have exactly $n$ vectors in it.
This fact follows immediately from the previous proposition, but there is also a direct proof. Let $v_1, v_2, \ldots, v_m$ be a basis in $\mathbb{F}^n$ and let $A$ be the $n\times m$ matrix with columns $v_1, v_2, \ldots, v_m$. The fact that the system is a basis means that the equation
\[
Ax = b
\]
has a unique solution for any (all possible) right side $b$. The existence means that there is a pivot in every row (of a reduced echelon form of the matrix), hence the number of pivots is exactly $n$. The uniqueness means that there is a pivot in every column of the coefficient matrix (of its echelon form), so the number of pivots equals the number $m$ of columns, and therefore $m = n$.
∎
Any spanning (generating) set in $\mathbb{F}^n$ must have at least $n$ vectors.
Let $v_1, v_2, \ldots, v_m$ be a complete (spanning) system in $\mathbb{F}^n$, and let $A$ be the $n\times m$ matrix with columns $v_1, v_2, \ldots, v_m$. Statement 2 of Proposition 2.3.1 implies that the echelon form of $A$ has a pivot in every row. Since the number of pivots cannot exceed the number of columns, $m \ge n$. ∎
A matrix $A$ is invertible if and only if its echelon form has a pivot in every column and every row.
As it was discussed in the beginning of the section, the equation $Ax = b$ has a unique solution for any right side $b$ if and only if the echelon form of $A$ has a pivot in every row and every column. But we know, see Theorem 1.6.8 in Chapter 1, that the matrix (linear transformation) $A$ is invertible if and only if the equation $Ax = b$ has a unique solution for any possible right side $b$.
The above proposition immediately implies the following
An invertible matrix must be square ($n\times n$).
If a square ($n\times n$) matrix $A$ is left invertible, or if it is right invertible, then it is invertible. In other words, to check the invertibility of a square matrix $A$ it is sufficient to check only one of the conditions $A^{-1}A = I$, $AA^{-1} = I$.
Note, that this proposition applies only to square matrices!
We know that a matrix $A$ is invertible if and only if the equation $Ax = b$ has a unique solution for any right side $b$. This happens if and only if the echelon form of the matrix $A$ has pivots in every row and in every column.
If a matrix $A$ is left invertible, the equation $Ax = 0$ has the unique solution $x = 0$. Indeed, if $B$ is a left inverse of $A$ (i.e. $BA = I$), and $x$ satisfies
\[
Ax = 0,
\]
then multiplying this identity by $B$ from the left we get $x = 0$, so the solution is unique. Therefore, the echelon form of $A$ has pivots in every column (no free variables). If the matrix $A$ is square ($n\times n$), the echelon form also has pivots in every row ($n$ pivots, and a row can have no more than 1 pivot), so the matrix $A$ is invertible.
If a matrix $A$ is right invertible, and $C$ is its right inverse ($AC = I$), then for $x = Cb$,
\[
Ax = ACb = Ib = b.
\]
Therefore, for any right side $b$ the equation $Ax = b$ has a solution $x = Cb$. Thus, the echelon form of $A$ has a pivot in every row. If $A$ is square, it also has a pivot in every column, so $A$ is invertible. ∎
For what value of the parameter does the system
have a solution? Find the general solution of the system for this value of the parameter.
Determine, if the vectors
are linearly independent or not.
Do these four vectors span ? (In other words, is it a generating system?) What about ?
Determine, which of the following systems of vectors are bases in :
, , ;
, , ;
, , .
Which of the systems are bases in ?
Do the polynomials , , generate (span) ? Justify your answer.
Can 5 vectors in be linearly independent? Justify your answer.
Prove or disprove: If the columns of a square () matrix are linearly independent, so are the columns of .
Prove or disprove: If the columns of a square () matrix are linearly independent, so are the rows of .
Show that if the equation $Ax = 0$ has a unique solution (i.e. if the echelon form of $A$ has a pivot in every column), then $A$ is left invertible. Hint: elementary matrices may help.
Note: It was shown in the text that if $A$ is left invertible, then the equation $Ax = 0$ has a unique solution. But here you are asked to prove the converse of this statement, which was not proved.
Remark: This can be a very hard problem, for it requires deep understanding of the subject. However, when you understand what to do, the problem becomes almost trivial.
Is the reduced echelon form of a matrix unique? Justify your conclusion.
Namely, suppose that by performing some row operations (not necessarily following any algorithm) we end up with a reduced echelon matrix. Do we always end up with the same matrix, or can we get different ones? Note that we are only allowed to perform row operations; “column operations” are forbidden.
Hint: What happens if you start with an invertible matrix? Also, are the pivots always in the same columns, or can it be different depending on what row operations you perform? If you can tell what the pivot columns are without reverting to row operations, then the positions of pivot columns do not depend on them.
As it was discussed above, an invertible matrix must be square, and its echelon form must have pivots in every row and every column. Therefore the reduced echelon form of an invertible matrix is the identity matrix $I$. Therefore,
Any invertible matrix is row equivalent (i.e. can be reduced by row operations) to the identity matrix.
Now let us state a simple algorithm of finding the inverse of an $n\times n$ matrix $A$:
Form an augmented matrix $(A \mid I)$ by writing the $n\times n$ identity matrix $I$ to the right of $A$;
Performing row operations on the augmented matrix, transform $A$ to the identity matrix $I$;
The matrix $I$ that we added will be automatically transformed to $A^{-1}$;
If it is impossible to transform $A$ to the identity by row operations, $A$ is not invertible.
There are several possible explanations of the above algorithm. The first, a naïve one, is as follows: we know that (for an invertible $A$) the vector $A^{-1}b$ is the solution of the equation $Ax = b$. So to find the column number $k$ of $A^{-1}$ we need to find the solution of $Ax = e_k$, where $e_1, e_2, \ldots, e_n$ is the standard basis in $\mathbb{F}^n$. The above algorithm just solves the equations
\[
Ax = e_k, \qquad k = 1, 2, \ldots, n,
\]
simultaneously!
Let us also present another, more “advanced” explanation. As we discussed above, every row operation can be realized as a left multiplication by an elementary matrix. Let $E_1, E_2, \ldots, E_N$ be the elementary matrices corresponding to the row operations we performed, and let $E = E_N \cdots E_2 E_1$ be their product. (Although it does not matter here, please notice that if the row operation corresponding to $E_1$ was performed first, $E_1$ must be the rightmost term in the product.) We know that the row operations transform $A$ to the identity, i.e. $EA = I$, so $E = A^{-1}$. But the same row operations transform the augmented matrix $(A \mid I)$ to $(EA \mid E) = (I \mid A^{-1})$. ∎
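A tiny numerical check of the “naïve” explanation (the matrix below is my own example): solving $Ax = e_k$ for every standard basis vector recovers the columns of $A^{-1}$. numpy's solver stands in for the row reduction.

```python
import numpy as np

A = np.array([[2., 1., 1.],
              [1., 3., 2.],
              [1., 0., 0.]])
n = A.shape[0]
columns = [np.linalg.solve(A, e_k) for e_k in np.eye(n)]   # solve A x = e_k
A_inv = np.column_stack(columns)
print(np.allclose(A_inv, np.linalg.inv(A)))     # True
print(np.allclose(A @ A_inv, np.eye(n)))        # True
```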
This “advanced” explanation using elementary matrices implies an important proposition that will be often used later.
Any invertible matrix can be represented as a product of elementary matrices.
As we discussed in the previous paragraph, $A^{-1} = E = E_N \cdots E_2 E_1$, so
\[
A = E^{-1} = E_1^{-1} E_2^{-1} \cdots E_N^{-1}.
\]
∎
Suppose we want to find the inverse of the matrix
Augmenting the identity matrix to it and performing row reduction we get
Here in the last row operation we multiplied the first row by to avoid fractions in the backward phase of row reduction. Continuing with the row reduction we get
Dividing the first and the last row by 3 we get the inverse matrix
Find the inverses of the matrices
Show all steps.
The dimension $\dim V$ of a vector space $V$ is the number of vectors in a basis.
For a vector space consisting only of the zero vector we put $\dim V = 0$. If $V$ does not have a (finite) basis, we put $\dim V = \infty$.
If $\dim V$ is finite, we call the space $V$ finite-dimensional; otherwise we call it infinite-dimensional.
Proposition 2.3.3 asserts that the dimension is well defined, i.e. that it does not depend on the choice of a basis.
Proposition 1.2.8 from Chapter 1 states that any finite spanning system in a vector space contains a basis. This immediately implies the following
A vector space is finite-dimensional if and only if it has a finite spanning system.
Suppose that we have a system of vectors in a finite-dimensional vector space $V$, and we want to check whether it is a basis (or whether it is linearly independent, or whether it is complete). Probably the simplest way is to use an isomorphism $A: V \to \mathbb{F}^n$, $n = \dim V$, to move the problem to $\mathbb{F}^n$, where all such questions can be answered by row reduction (studying pivots).
Note that if $\dim V = n$, then there always exists an isomorphism $A: V \to \mathbb{F}^n$. Indeed, if $\dim V = n$ then there exists a basis $v_1, v_2, \ldots, v_n$ in $V$, and one can define an isomorphism $A: V \to \mathbb{F}^n$ by
\[
A v_k = e_k, \qquad k = 1, 2, \ldots, n.
\]
Any linearly independent system in a finite-dimensional vector space $V$ cannot have more than $\dim V$ vectors in it.
Let $v_1, v_2, \ldots, v_r$ be a linearly independent system in $V$, and let $A: V \to \mathbb{F}^n$, $n = \dim V$, be an isomorphism. Then $Av_1, Av_2, \ldots, Av_r$ is a linearly independent system in $\mathbb{F}^n$, and by Proposition 2.3.2 $r \le n$. ∎
Any generating system in a finite-dimensional vector space $V$ must have at least $\dim V$ vectors in it.
Let $v_1, v_2, \ldots, v_r$ be a complete (generating) system in $V$, and let $A: V \to \mathbb{F}^n$, $n = \dim V$, be an isomorphism. Then $Av_1, Av_2, \ldots, Av_r$ is a complete system in $\mathbb{F}^n$, and by Proposition 2.3.5 $r \ge n$. ∎
The following statement will play an important role later.
A linearly independent system of vectors in a finite-dimensional space can be completed to a basis, i.e. if $v_1, v_2, \ldots, v_r$ are linearly independent vectors in a finite-dimensional vector space $V$, then one can find vectors $v_{r+1}, v_{r+2}, \ldots, v_n$, $n = \dim V$, such that the system of vectors $v_1, v_2, \ldots, v_n$ is a basis in $V$.
Let $n = \dim V$. Assume that the system $v_1, v_2, \ldots, v_r$ is not generating, since otherwise it is already a basis. Take any vector (let us call it $v_{r+1}$) not belonging to $\operatorname{span}\{v_1, v_2, \ldots, v_r\}$ (one can always do that because the system is not generating). By Exercise 1.2.5 from Chapter 1 the system $v_1, v_2, \ldots, v_r, v_{r+1}$ is linearly independent (notice that then $r + 1 \le n$ by Proposition 2.5.2). Repeat the procedure with the new system to get a vector $v_{r+2}$, and so on.
We will stop the process when we get a generating system. Note that the process cannot continue infinitely, because a linearly independent system of vectors in $V$ cannot have more than $n = \dim V$ vectors. ∎
Let $V$ be a subspace of a vector space $W$, and suppose $\dim W < \infty$. Then $V$ is finite-dimensional and $\dim V \le \dim W$.
Moreover, if $\dim V = \dim W$, then $V = W$ (we are still assuming that $V$ is a subspace of $W$ here).
This theorem looks like a complete banality, like an easy corollary of Proposition 2.5.2. But we can apply Proposition 2.5.2 only if we already have a basis in $V$. And we only have a basis in $W$, and we cannot say how many vectors in this basis belong to $V$; in fact, it is easy to construct an example where none of the vectors in the basis of $W$ belongs to $V$.
If $V = \{0\}$, the theorem is trivial, so let us assume $V \ne \{0\}$.
We want to find a basis in $V$. Take a non-zero vector $v_1 \in V$. If $V = \operatorname{span}\{v_1\}$, we got our basis (consisting of the one vector $v_1$).
If not, we continue by induction. Suppose we have constructed linearly independent vectors $v_1, v_2, \ldots, v_r \in V$. If $V = \operatorname{span}\{v_1, v_2, \ldots, v_r\}$, then we have found a basis in $V$. If not, there exists a vector $v_{r+1} \in V$, $v_{r+1} \notin \operatorname{span}\{v_1, v_2, \ldots, v_r\}$. By Exercise 1.2.5 from Chapter 1 the system $v_1, v_2, \ldots, v_r, v_{r+1}$ is linearly independent.
The vectors constructed this way form linearly independent systems in $W$, so by Proposition 2.5.2 the process must stop after at most $\dim W$ steps; when it stops we have a basis in $V$ with at most $\dim W$ vectors, so $\dim V \le \dim W$. Moreover, if $\dim V = \dim W = n$, then a basis of $V$ is a linearly independent system of $n$ vectors in $W$; since any basis in $W$ has exactly $n$ vectors, this system cannot be enlarged and must already be a basis in $W$, so $V = W$. ∎
True or false:
Every vector space that is generated by a finite set has a basis;
Every vector space has a (finite) basis;
A vector space cannot have more than one basis;
If a vector space has a finite basis, then the number of vectors in every basis is the same.
The dimension of is ;
The dimension on is ;
If vectors $v_1, v_2, \ldots, v_n$ generate (span) the vector space $V$, then every vector in $V$ can be written as a linear combination of these vectors in only one way;
Every subspace of a finite-dimensional space is finite-dimensional;
If $V$ is a vector space having dimension $n$, then $V$ has exactly one subspace of dimension $0$ and exactly one subspace of dimension $n$.
Prove that if $V$ is a vector space having dimension $n$, then a system of $n$ vectors in $V$ is linearly independent if and only if it spans $V$.
Prove that a linearly independent system of $n$ vectors in a vector space $V$ is a basis if and only if $n = \dim V$.
(An old exercise revisited: now this problem should be easy) Is it possible that vectors $v_1, v_2, v_3$ are linearly dependent, but the vectors $v_1 + v_2$, $v_2 + v_3$ and $v_3 + v_1$ are linearly independent? Hint: What dimension can the subspace $\operatorname{span}\{v_1, v_2, v_3\}$ have?
Let vectors be a basis in . Show that is also a basis in .
Consider in the space vectors , , .
Prove that these vectors are linearly independent.
Complete this system of vectors to a basis.
If you do part b) first you can do everything without any computations.
In this short section we discuss the structure of the general solution (i.e. of the solution set) of a linear system.
We call a system homogeneous if the right side is zero, $b = 0$, i.e. a homogeneous system is a system of the form $Ax = 0$.
With each system
\[
Ax = b
\]
we can associate a homogeneous system just by putting $b = 0$.
Let a vector $x_1$ satisfy the equation $Ax_1 = b$, and let $H$ be the set of all solutions of the associated homogeneous system
\[
Ax = 0.
\]
Then the set
\[
x_1 + H := \{x_1 + x_h : x_h \in H\}
\]
is the set of all solutions of the equation $Ax = b$.
In other words, this theorem can be stated as: the general solution of $Ax = b$ equals a particular solution plus the general solution of the associated homogeneous equation $Ax = 0$.
Fix a vector $x_1$ satisfying $Ax_1 = b$. Let a vector $x_h$ satisfy $Ax_h = 0$. Then for $x = x_1 + x_h$ we have
\[
Ax = A(x_1 + x_h) = Ax_1 + Ax_h = b + 0 = b,
\]
so any $x$ of the form
\[
x = x_1 + x_h, \qquad x_h \in H,
\]
is a solution of $Ax = b$.
Now let $x$ satisfy $Ax = b$. Then for $x_h := x - x_1$ we get
\[
Ax_h = A(x - x_1) = Ax - Ax_1 = b - b = 0,
\]
so $x_h \in H$. Therefore any solution $x$ of $Ax = b$ can be represented as $x = x_1 + x_h$ with some $x_h \in H$. ∎
The power of this theorem is in its generality. It applies to all linear equations; we do not have to assume here that the vector spaces are finite-dimensional. You will meet this theorem in differential equations, integral equations, partial differential equations, etc. Besides showing the structure of the solution set, this theorem allows one to separate the investigation of uniqueness from the study of existence. Namely, to study uniqueness, we only need to analyze uniqueness for the homogeneous equation $Ax = 0$, which always has a solution (namely $x = 0$).
There is an immediate application in this course: this theorem allows us to check a solution of a system $Ax = b$. For example, consider a system
Performing row reduction one can find the solution of this system
(2.6.1)
The parameters here can be denoted by any other letters, $s$ and $t$, for example; we keep the names of the corresponding free variables only to remind us where the parameters came from.
Now, let us suppose, that we are just given this solution, and we want to check whether or not it is correct. Of course, we can repeat the row operations, but this is too time consuming. Moreover, if the solution was obtained by some non-standard method, it can look differently from what we get from the row reduction. For example, the formula
(2.6.2)
gives the same set as (2.6.1) (can you say why?); here we just replaced the last vector in (2.6.1) by its sum with the second one. So, this formula is different from the solution we got from the row reduction, but it is nevertheless correct.
The simplest way to check that (2.6.1) and (2.6.2) give us correct solutions is to check that the first vector satisfies the equation $Ax = b$, and that the other two (the ones multiplied by the parameters) satisfy the associated homogeneous equation $Ax = 0$.
If this checks out, we will be assured that any vector defined by (2.6.1) or (2.6.2) is indeed a solution.
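Here is what such a check can look like in code. The matrix, right side and solution vectors below are made up for illustration; they are not the system from the text.

```python
import numpy as np

A = np.array([[1., 2., 0., 1.],
              [0., 0., 1., 2.]])
b = np.array([3., 4.])

x_particular = np.array([3., 0., 4., 0.])
homogeneous_vectors = [np.array([-2., 1., 0., 0.]),
                       np.array([-1., 0., -2., 1.])]

# the particular solution must satisfy A x = b ...
print(np.allclose(A @ x_particular, b))                          # True
# ... and every vector carrying a parameter must satisfy A x = 0
print(all(np.allclose(A @ v, 0) for v in homogeneous_vectors))   # True
```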
Note that this method of checking the solution does not guarantee that (2.6.1) (or (2.6.2)) gives us all the solutions. For example, if we somehow missed one of the terms with a parameter, the above method of checking would still work fine.
So, how can we guarantee that we did not miss any free variable, and that there should not be an extra term in (2.6.1)?
What comes to mind, is to count the pivots again. In this example, if one does row operations, the number of pivots is 3. So indeed, there should be 2 free variables, and it looks like we did not miss anything in (2.6.1).
To be able to prove this, we will need new notions of fundamental subspaces and of rank of a matrix. I should also mention that in this particular example, one does not have to perform all row operations to check that there are only 2 free variables, and that formulas (2.6.1) and (2.6.2) both give correct general solutions.
True or false
Any system of linear equations has at least one solution;
Any system of linear equations has at most one solution;
Any homogeneous system of linear equations has at least one solution;
Any system of linear equations in unknowns has at least one solution;
Any system of linear equations in unknowns has at most one solution;
If the homogeneous system corresponding to a given system of linear equations has a solution, then the given system has a solution;
If the coefficient matrix of a homogeneous system of $n$ linear equations in $n$ unknowns is invertible, then the system has no non-zero solutions;
The solution set of any system of $m$ linear equations in $n$ unknowns is a subspace in $\mathbb{F}^n$;
The solution set of any homogeneous system of $m$ linear equations in $n$ unknowns is a subspace in $\mathbb{F}^n$.
Find a system (2 equations with 3 unknowns) such that its general solution has the form
As we discussed above in Section 1.7 of Chapter 1, with any linear transformation $A: V \to W$ we can associate two subspaces, namely, its kernel, or null space
\[
\operatorname{Ker} A = \operatorname{Null} A := \{v \in V : Av = 0\},
\]
and its range
\[
\operatorname{Ran} A := \{w \in W : w = Av \ \text{for some}\ v \in V\}.
\]
In other words, the kernel $\operatorname{Ker} A$ is the solution set of the homogeneous equation $Ax = 0$, and the range $\operatorname{Ran} A$ is exactly the set of all right sides $b \in W$ for which the equation $Ax = b$ has a solution.
If $A$ is an $m\times n$ matrix, i.e. a mapping from $\mathbb{F}^n$ to $\mathbb{F}^m$, then it follows from the “column by coordinate” rule of the matrix multiplication that any vector $w \in \operatorname{Ran} A$ can be represented as a linear combination of columns of $A$. This explains the name column space (notation $\operatorname{Col} A$), which is often used instead of $\operatorname{Ran} A$.
If $A$ is a matrix, then in addition to $\operatorname{Ran} A$ and $\operatorname{Ker} A$ one can also consider the range and the kernel of the transposed matrix $A^T$. Often the term row space is used for $\operatorname{Ran} A^T$ and the term left null space is used for $\operatorname{Ker} A^T$ (but usually no special notation is introduced).
The four subspaces
\[
\operatorname{Ker} A, \quad \operatorname{Ran} A, \quad \operatorname{Ker} A^T, \quad \operatorname{Ran} A^T
\]
are called the fundamental subspaces of the matrix $A$. In this section we will study important relations between the dimensions of the four fundamental subspaces.
We will need the following definition, which introduces one of the fundamental notions of Linear Algebra.
Given a linear transformation (matrix) $A$, its rank, $\operatorname{rank} A$, is the dimension of the range of $A$:
\[
\operatorname{rank} A := \dim \operatorname{Ran} A.
\]
To compute the fundamental subspaces and the rank of a matrix, one needs to do echelon reduction. Namely, let $A$ be the matrix, and let $A_e$ be its echelon form. Then:
The pivot columns of the original matrix $A$ (i.e. the columns where after row operations we will have pivots in the echelon form) give us a basis (one of many possible) in $\operatorname{Ran} A$.
The pivot rows of the echelon form $A_e$ give us a basis in the row space $\operatorname{Ran} A^T$. Of course, it is possible just to transpose the matrix, and then do row operations. But if we already have the echelon form of $A$, say from computing a basis in $\operatorname{Ran} A$, then we get a basis in $\operatorname{Ran} A^T$ for free.
To find a basis in the null space $\operatorname{Ker} A$ one needs to solve the homogeneous equation $Ax = 0$; the details will be seen from the example below.
Consider a matrix
Performing row operations we get the echelon form
(the pivots are boxed here). So, the columns 1 and 3 of the original matrix, i.e. the columns
give us a basis in $\operatorname{Ran} A$. We also get a basis of the row space $\operatorname{Ran} A^T$ for free: the first and second rows of the echelon form of $A$, i.e. the vectors
(we put the vectors vertically here; the question of whether to put vectors here vertically as columns, or horizontally as rows, is really a matter of convention. Our reason for putting them vertically is that although we call $\operatorname{Ran} A^T$ the row space, we define it as the column space of $A^T$).
To compute a basis in the null space $\operatorname{Ker} A$ we need to solve the equation $Ax = 0$. Compute the reduced echelon form of $A$, which in this example is
Note that when solving the homogeneous equation $Ax = 0$, it is not necessary to write the whole augmented matrix; it is sufficient to work with the coefficient matrix. Indeed, in this case the last column of the augmented matrix is the column of zeroes, which does not change under row operations. So, we can just keep this column in mind, without actually writing it. Keeping this last zero column in mind, we can read the solution off the reduced echelon form above:
or, in the vector form
(2.7.1)
The vectors at each free variable, i.e. in our case the vectors
form a basis in $\operatorname{Ker} A$.
Unfortunately, there is no shortcut for finding a basis in the left null space $\operatorname{Ker} A^T$: one must solve the equation $A^T x = 0$. The knowledge of the echelon form of $A$ does not help here.
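In practice one can let a computer algebra system carry out these recipes. The sketch below uses sympy on a made-up matrix (not the example above); `rref` gives the pivot columns, `nullspace` solves $Ax = 0$, and the row space comes from $A^T$.

```python
from sympy import Matrix

A = Matrix([[1, 2, 0, 1],
            [2, 4, 1, 4],
            [3, 6, 1, 5]])

rref, pivot_cols = A.rref()
col_space_basis  = [A.col(j) for j in pivot_cols]   # pivot columns of the ORIGINAL A
null_space_basis = A.nullspace()                    # basis of Ker A
row_space_basis  = A.T.columnspace()                # basis of Ran A^T
left_null_basis  = A.T.nullspace()                  # basis of Ker A^T

print(pivot_cols, len(null_space_basis), len(left_null_basis))
```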
So, why do the above methods indeed give us bases in the fundamental subspaces?
The case of the null space $\operatorname{Ker} A$ is probably the simplest one: since we solved the equation $Ax = 0$, i.e. found all the solutions, any vector in $\operatorname{Ker} A$ is a linear combination of the vectors we obtained. Thus, the vectors we obtained form a spanning system in $\operatorname{Ker} A$. To see that the system is linearly independent, let us multiply each vector by the corresponding free variable and add everything, see (2.7.1). Then for each free variable $x_k$, the entry number $k$ of the resulting vector is exactly $x_k$, see (2.7.1) again, so the only way this vector (the linear combination) can be $0$ is when all free variables are $0$.
Let us now explain why the method for finding a basis in the column space works. First of all, notice that the pivot columns of the reduced echelon form of $A$ form a basis in the column space of the reduced echelon form (not in the column space of the original matrix $A$, but of its reduced echelon form!). Since row operations are just left multiplications by invertible matrices, they do not change linear independence. Therefore, the pivot columns of the original matrix $A$ are linearly independent.
Let us now show that the pivot columns of $A$ span the column space of $A$. Let $v_1, v_2, \ldots, v_r$ be the pivot columns of $A$, and let $v$ be an arbitrary column of $A$. We want to show that $v$ can be represented as a linear combination of the pivot columns $v_1, v_2, \ldots, v_r$,
\[
v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r.
\]
The reduced echelon form is obtained from $A$ by the left multiplication by a matrix $E$,
where $E$ is a product of elementary matrices, so $E$ is an invertible matrix. The vectors $Ev_1, Ev_2, \ldots, Ev_r$ are the pivot columns of the reduced echelon form, and the column $v$ of $A$ is transformed to the column $Ev$ of the reduced echelon form. Since the pivot columns of the reduced echelon form form a basis in its column space, the vector $Ev$ can be represented as a linear combination
\[
Ev = \alpha_1 Ev_1 + \alpha_2 Ev_2 + \cdots + \alpha_r Ev_r.
\]
Multiplying this equality by $E^{-1}$ from the left we get the representation
\[
v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r,
\]
so indeed the pivot columns of $A$ span $\operatorname{Ran} A$.
It is easy to see that the pivot rows of the echelon form $A_e$ of $A$ are linearly independent. Indeed, let $r_1, r_2, \ldots, r_k$ be the transposed (since we agreed always to put vectors vertically) pivot rows of $A_e$, and suppose
\[
c_1 r_1 + c_2 r_2 + \cdots + c_k r_k = 0.
\]
Consider the first non-zero entry of $r_1$. Since for all other vectors $r_2, r_3, \ldots, r_k$ the corresponding entries equal $0$ (by the definition of echelon form), we can conclude that $c_1 = 0$. So we can just ignore the first term in the sum.
Consider now the first non-zero entry of $r_2$. The corresponding entries of the vectors $r_3, r_4, \ldots, r_k$ are $0$, so $c_2 = 0$. Repeating this procedure, we get that $c_1 = c_2 = \cdots = c_k = 0$.
To see that these vectors span the row space, one can notice that row operations do not change the row space. This can be obtained directly by analyzing row operations, but we present here a more formal way to demonstrate this fact.
For a transformation $A: V \to W$ and a set $X \subset V$ let us denote by $A(X)$ the set of all elements $w \in W$ which can be represented as $w = Ax$, $x \in X$,
\[
A(X) := \{Ax : x \in X\}.
\]
If $A$ is an $m\times n$ matrix, and $A_e$ is its echelon form, $A_e$ is obtained from $A$ by left multiplication
\[
A_e = EA,
\]
where $E$ is an $m\times m$ invertible matrix (the product of the corresponding elementary matrices). Then
\[
\operatorname{Ran} A_e^T = \operatorname{Ran}\bigl(A^T E^T\bigr) = A^T\bigl(\operatorname{Ran} E^T\bigr) = A^T(\mathbb{F}^m) = \operatorname{Ran} A^T,
\]
so indeed the row spaces of $A_e$ and $A$ coincide.
There are many applications in which one needs to find a basis in column space or in the null space of a matrix. For example, as it was shown above, solving a homogeneous equation amounts to finding a basis in the null space . Finding a basis in the column space means simply extracting a basis from a spanning set, by removing unnecessary vectors (columns).
However, the most important application of the above methods of computing bases of fundamental subspaces is the relations between their dimensions.
For a matrix $A$,
\[
\operatorname{rank} A = \operatorname{rank} A^T.
\]
This theorem is often stated as follows:
The column rank of a matrix coincides with its row rank.
The proof of this theorem is trivial, since the dimensions of both $\operatorname{Ran} A$ and $\operatorname{Ran} A^T$ are equal to the number of pivots in the echelon form of $A$.
The following theorem gives us important relations between the dimensions of the fundamental subspaces. It is often also called the Rank Theorem.
Let $A$ be an $m\times n$ matrix, i.e. a linear transformation from $\mathbb{F}^n$ to $\mathbb{F}^m$. Then
$\operatorname{rank} A + \dim \operatorname{Ker} A = n$ (dimension of the domain of $A$);
$\operatorname{rank} A + \dim \operatorname{Ker} A^T = m$ (dimension of the target space of $A$).
The proof, modulo the above algorithms of finding bases in the fundamental subspaces, is almost trivial. The first statement is simply the fact that the number of free variables (i.e. $\dim \operatorname{Ker} A$) plus the number of basic variables (i.e. the number of pivots, i.e. $\operatorname{rank} A$) adds up to the number of columns (i.e. to $n$).
The second statement follows if one takes into account that $\operatorname{rank} A = \operatorname{rank} A^T$: it is simply the first statement applied to $A^T$. ∎
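A quick numerical sanity check of both statements, on a made-up matrix (my own illustration, using sympy for exact arithmetic):

```python
from sympy import Matrix

A = Matrix([[1, 2, 0, 1],
            [2, 4, 1, 4],
            [3, 6, 1, 5]])
m, n = A.shape
rank = A.rank()
dim_ker   = len(A.nullspace())     # dim Ker A
dim_ker_T = len(A.T.nullspace())   # dim Ker A^T

assert rank + dim_ker == n         # statement 1: rank A + dim Ker A = n
assert rank + dim_ker_T == m       # statement 2: rank A + dim Ker A^T = m
assert rank == A.T.rank()          # rank A = rank A^T
```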
As an application of the above theorem, let us recall the example from Section 2.6. There we considered a system
and we claimed that its general solution is given by
or by
We checked in Section 2.6 that a vector given by either formula is indeed a solution of the equation. But how can we guarantee that either formula describes all solutions?
First of all, we know that in either formula the last 2 vectors (the ones multiplied by the parameters) belong to $\operatorname{Ker} A$. It is easy to see that in either case both vectors are linearly independent (two vectors are linearly dependent if and only if one is a multiple of the other).
Now, let us count dimensions: interchanging the first and the second rows and performing the first round of row operations
we see that there are three pivots already, so $\operatorname{rank} A \ge 3$. (Actually, we already can see that the rank is 3, but it is enough just to have the estimate here.) By Theorem 2.7.2, $\operatorname{rank} A + \dim \operatorname{Ker} A$ equals the number of unknowns, which is $5$, hence $\dim \operatorname{Ker} A \le 2$, and therefore there cannot be more than 2 linearly independent vectors in $\operatorname{Ker} A$. Therefore, the last 2 vectors in either formula form a basis in $\operatorname{Ker} A$, so either formula gives all solutions of the equation.
An important corollary of the rank theorem, is the following theorem connecting existence and uniqueness for linear equations.
Let $A$ be an $m\times n$ matrix. Then the equation
\[
Ax = b
\]
has a solution for every $b \in \mathbb{F}^m$ if and only if the dual equation
\[
A^T x = 0
\]
has a unique (only the trivial) solution. (Note that in the second equation we have $A^T$, not $A$.)
The proof follows immediately from Theorem 2.7.2 by counting the dimensions. We leave the details as an exercise to the reader. ∎
There is a very nice geometric interpretation of the second rank theorem (Theorem 2.7.2). Namely, statement 1 of the theorem says that if a transformation $A$ has trivial kernel ($\operatorname{Ker} A = \{0\}$), then the dimensions of the domain and of the range coincide. If the kernel is non-trivial, then the transformation “kills” $\dim \operatorname{Ker} A$ dimensions, so $\dim \operatorname{Ran} A = n - \dim \operatorname{Ker} A$.
As Proposition 2.5.4 from Section 2.5 above asserts, any linearly independent system can be completed to a basis, i.e. given linearly independent vectors $v_1, v_2, \ldots, v_r$ in a finite-dimensional vector space $V$, one can find vectors $v_{r+1}, \ldots, v_n$ such that the system of vectors $v_1, v_2, \ldots, v_n$ is a basis in $V$.
Theoretically, the proof of this proposition gives us an algorithm of finding the vectors $v_{r+1}, \ldots, v_n$, but this algorithm does not look too practical.
Ideas of this section give us a more practical way to perform the completion to a basis.
First of all, notice that if an $r\times n$ matrix $A$ is in an echelon form, then its non-zero rows (which are clearly linearly independent) can be easily completed to a basis in the whole space $\mathbb{F}^n$; one just needs to add some rows in appropriate places, so that the resulting matrix is still in the echelon form and has pivots in every column.
Then, the non-zero rows of the new matrix form a basis, and we can order it any way we want, because the property of being a basis does not depend on the ordering.
Suppose now that we have linearly independent vectors $v_1, v_2, \ldots, v_r \in \mathbb{F}^n$. Consider the $r\times n$ matrix $A$ with rows $v_1^T, v_2^T, \ldots, v_r^T$ and perform row operations to get the echelon form $A_e$. As we discussed above, the rows of $A_e$ can be easily completed to a basis in $\mathbb{F}^n$. And it turns out that the same vectors that complete the rows of $A_e$ to a basis also complete the original vectors $v_1, v_2, \ldots, v_r$ to a basis.
To see that, let the vectors $w_1, w_2, \ldots, w_{n-r}$ complete the rows of $A_e$ to a basis in $\mathbb{F}^n$. Then, if we add to the matrix $A_e$ the rows $w_1^T, \ldots, w_{n-r}^T$, we get an invertible matrix. Let us call this matrix $B_e$, and let $B$ be the matrix obtained from $A$ by adding the rows $w_1^T, \ldots, w_{n-r}^T$. The matrix $B_e$ can be obtained from $B$ by row operations, so
\[
B_e = EB,
\]
where $E$ is the product of the corresponding elementary matrices. Then $B = E^{-1}B_e$, and $B$ is invertible as a product of invertible matrices.
But that means that the rows of $B$ form a basis in $\mathbb{F}^n$, which is exactly what we need.
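A hedged sketch of this completion procedure, assuming the given (linearly independent) vectors are supplied as the rows of a sympy matrix over the rationals; a standard basis vector $e_j$ is appended for every column that ends up without a pivot.

```python
from sympy import Matrix, eye

def complete_to_basis(rows):
    A = Matrix(rows)                 # rows: the given linearly independent vectors
    n = A.shape[1]
    _, pivot_cols = A.rref()
    # append e_j for every column without a pivot
    added = [eye(n).row(j) for j in range(n) if j not in pivot_cols]
    B = Matrix.vstack(A, *added)     # original rows + added rows
    assert B.rank() == n             # together they form a basis of F^n
    return B

print(complete_to_basis([[1, 2, 3, 4],
                         [0, 1, 1, 1]]))
```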
The method of completion to a basis described above may be not the simplest one, but one of its principal advantages is that it works for vector spaces over an arbitrary field.
True or false:
The rank of a matrix is equal to the number of its non-zero columns;
The zero matrix is the only matrix having rank 0;
Elementary row operations preserve rank;
Elementary column operations do not necessarily preserve rank;
The rank of a matrix is equal to the maximum number of linearly independent columns in the matrix;
The rank of a matrix is equal to the maximum number of linearly independent rows in the matrix;
The rank of an matrix is at most ;
An $n\times n$ matrix having rank $n$ is invertible.
A matrix has rank . What are dimensions of all 4 fundamental subspaces?
Compute rank and find bases of all four fundamental subspaces for the matrices
Prove that if $A: V \to W$ is a linear transformation and $V_1$ is a subspace of $V$, then $\dim A(V_1) \le \dim V_1$. ($A(V_1)$ here means the subspace $V_1$ transformed by the transformation $A$, i.e. any vector in $A(V_1)$ can be represented as $Av$, $v \in V_1$). Deduce from here that $\operatorname{rank}(AB) \le \operatorname{rank} B$.
Remark: Here one can use the fact that if $V_1 \subset V_2$ then $\dim V_1 \le \dim V_2$. Do you understand why it is true?
Prove that if $A: V \to W$ is a linear transformation and $V_1$ is a subspace of $V$, then $\dim A(V_1) \le \operatorname{rank} A$. Deduce from here that $\operatorname{rank}(AB) \le \operatorname{rank} A$.
Prove that if the product $AB$ of two $n\times n$ matrices is invertible, then both $A$ and $B$ are invertible. Even if you know about determinants, do not use them; we did not cover them yet. Hint: use the previous 2 problems.
Prove that if $Ax = 0$ has a unique solution, then the equation
\[
A^T x = b
\]
has a solution for every right side $b$.
Hint: count pivots.
Write a matrix with the required property, or explain why no such matrix exists
Column space contains , , row space contains , ;
Column space is spanned by , nullspace is spanned by ;
Column space is , row space is .
Hint: Check first if the dimensions add up.
If a matrix $A$ has the same four fundamental subspaces as a matrix $B$, does $A = B$?
Complete the rows of a matrix
to a basis in .
For a matrix
find bases in its column and row spaces.
For the matrix in the previous problem, complete the basis in the row space to a basis in
For the matrix
compute and . What can you say about relation between these subspaces?
Is it possible that for a real matrix that ? Is it possible for a complex ?
Complete the vectors , , to a basis in .
The material we have learned about linear transformations and their matrices can be easily extended to transformations in abstract vector spaces with finite bases. In this section we will distinguish between a linear transformation and its matrix, the reason being that we consider different bases, so a linear transformation can have different matrix representations.
Let $V$ be a vector space with a basis $\mathcal{B} := \{b_1, b_2, \ldots, b_n\}$. Any vector $v \in V$ admits a unique representation as a linear combination
\[
v = x_1 b_1 + x_2 b_2 + \cdots + x_n b_n = \sum_{k=1}^{n} x_k b_k.
\]
The numbers $x_1, x_2, \ldots, x_n$ are called the coordinates of the vector $v$ in the basis $\mathcal{B}$. It is convenient to join these coordinates into the so-called coordinate vector of $v$ relative to the basis $\mathcal{B}$, which is the column vector
\[
[v]_{\mathcal{B}} := \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \in \mathbb{F}^n.
\]
Note that the mapping
\[
v \mapsto [v]_{\mathcal{B}}
\]
is an isomorphism between $V$ and $\mathbb{F}^n$. It transforms the basis $b_1, b_2, \ldots, b_n$ to the standard basis $e_1, e_2, \ldots, e_n$ in $\mathbb{F}^n$.
Let $T: V \to W$ be a linear transformation, and let $\mathcal{A} = \{a_1, a_2, \ldots, a_n\}$, $\mathcal{B} = \{b_1, b_2, \ldots, b_m\}$ be bases in $V$ and $W$ respectively.
A matrix of the transformation $T$ in (or with respect to) the bases $\mathcal{A}$ and $\mathcal{B}$ is an $m\times n$ matrix, denoted by $[T]_{\mathcal{B}\mathcal{A}}$, which relates the coordinate vectors $[Tv]_{\mathcal{B}}$ and $[v]_{\mathcal{A}}$,
\[
[Tv]_{\mathcal{B}} = [T]_{\mathcal{B}\mathcal{A}} [v]_{\mathcal{A}};
\]
notice the balance of the symbols $\mathcal{A}$ and $\mathcal{B}$ here: this is the reason we put the first basis $\mathcal{A}$ into the second position.
The matrix $[T]_{\mathcal{B}\mathcal{A}}$ is easy to find: its $k$th column is just the coordinate vector $[T a_k]_{\mathcal{B}}$ (compare this with finding the matrix of a linear transformation from $\mathbb{F}^n$ to $\mathbb{F}^m$).
As in the case of standard bases, composition of linear transformations is equivalent to multiplication of their matrices: one only has to be a bit more careful about bases. Namely, let $T_1: X \to Y$ and $T_2: Y \to Z$ be linear transformations, and let $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{C}$ be bases in $X$, $Y$ and $Z$ respectively. Then for the composition $T_2 T_1: X \to Z$,
we have
\[
[T_2 T_1]_{\mathcal{C}\mathcal{A}} = [T_2]_{\mathcal{C}\mathcal{B}}\, [T_1]_{\mathcal{B}\mathcal{A}} \tag{2.8.1}
\]
(notice again the balance of indices here).
The proof here goes exactly as in the case of spaces with standard bases, so we do not repeat it here. Another possibility is to transfer everything to the spaces $\mathbb{F}^n$ via the coordinate isomorphisms $v \mapsto [v]_{\mathcal{B}}$. Then one does not need any proof; everything follows from the results about matrix multiplication.
Let us have two bases $\mathcal{A} = \{a_1, a_2, \ldots, a_n\}$ and $\mathcal{B} = \{b_1, b_2, \ldots, b_n\}$ in a vector space $V$. Consider the identity transformation $I = I_V$ and its matrix $[I]_{\mathcal{B}\mathcal{A}}$ in these bases. By the definition
\[
[v]_{\mathcal{B}} = [I]_{\mathcal{B}\mathcal{A}} [v]_{\mathcal{A}} \qquad \text{for all } v \in V,
\]
i.e. for any vector $v \in V$ the matrix $[I]_{\mathcal{B}\mathcal{A}}$ transforms its coordinates in the basis $\mathcal{A}$ into coordinates in the basis $\mathcal{B}$. The matrix $[I]_{\mathcal{B}\mathcal{A}}$ is often called the change of coordinates (from the basis $\mathcal{A}$ to the basis $\mathcal{B}$) matrix.
The matrix $[I]_{\mathcal{B}\mathcal{A}}$ is easy to compute: according to the general rule of finding the matrix of a linear transformation, its $k$th column is the coordinate representation $[a_k]_{\mathcal{B}}$ of the $k$th element $a_k$ of the basis $\mathcal{A}$.
Note that
\[
[I]_{\mathcal{A}\mathcal{B}}\, [I]_{\mathcal{B}\mathcal{A}} = [I]_{\mathcal{A}\mathcal{A}} = I
\]
(follows immediately from the multiplication of matrices rule (2.8.1)), so any change of coordinates matrix is always invertible, and $[I]_{\mathcal{A}\mathcal{B}} = \bigl([I]_{\mathcal{B}\mathcal{A}}\bigr)^{-1}$.
Let our space $V$ be $\mathbb{F}^n$, and let us have a basis $\mathcal{B} = \{b_1, b_2, \ldots, b_n\}$ there. We also have the standard basis $\mathcal{S} = \{e_1, e_2, \ldots, e_n\}$ there. The change of coordinates matrix $[I]_{\mathcal{S}\mathcal{B}}$ is easy to compute:
\[
[I]_{\mathcal{S}\mathcal{B}} = [b_1, b_2, \ldots, b_n],
\]
i.e. it is just the matrix whose $k$th column is the vector (column) $b_k$. And in the other direction,
\[
[I]_{\mathcal{B}\mathcal{S}} = \bigl([I]_{\mathcal{S}\mathcal{B}}\bigr)^{-1}.
\]
For example, consider a basis
in , and let denote the standard basis there. Then
and
(we know how to compute inverses, and it is also easy to check that the above matrix is indeed the inverse of )
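Numerically the last two formulas look as follows; the basis in this numpy sketch is my own two-dimensional example, not the one from the text.

```python
import numpy as np

b1, b2 = np.array([1., 1.]), np.array([1., -1.])
I_SB = np.column_stack([b1, b2])        # coordinates in B -> standard coordinates
I_BS = np.linalg.inv(I_SB)              # standard coordinates -> coordinates in B

v = np.array([3., 1.])                  # a vector in standard coordinates
v_B = I_BS @ v                          # its coordinate vector in the basis B
print(v_B)                              # [2. 1.], since v = 2*b1 + 1*b2
print(np.allclose(I_SB @ v_B, v))       # True
```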
In the space of polynomials of degree at most we have bases
and we want to find the change of coordinate matrix .
Of course, we can always take vectors from the basis and try to decompose them in the basis ; it involves solving linear systems, and we know how to do that.
However, I think the following way is simpler. In we also have the standard basis , and for this basis
and taking the inverses
Then
and
the reader should notice the balance of indices in the above formulas.
Let $T: V \to W$ be a linear transformation, and let $\mathcal{A}$, $\widetilde{\mathcal{A}}$ be two bases in $V$ and let $\mathcal{B}$, $\widetilde{\mathcal{B}}$ be two bases in $W$. Suppose we know the matrix $[T]_{\mathcal{B}\mathcal{A}}$, and we would like to find the matrix representation with respect to the new bases $\widetilde{\mathcal{A}}$, $\widetilde{\mathcal{B}}$, i.e. the matrix $[T]_{\widetilde{\mathcal{B}}\widetilde{\mathcal{A}}}$. The rule is very simple:
to get the matrix in the “new” bases one has to left and right multiply the matrix in the “old” bases by change of coordinates matrices.
I did not mention here which change of coordinates matrix should go where, because we don’t have any choice if we follow the balance of indices rule. Namely, the matrix representation of a linear transformation changes according to the formula
\[
[T]_{\widetilde{\mathcal{B}}\widetilde{\mathcal{A}}} = [I]_{\widetilde{\mathcal{B}}\mathcal{B}}\, [T]_{\mathcal{B}\mathcal{A}}\, [I]_{\mathcal{A}\widetilde{\mathcal{A}}};
\]
notice the balance of indices here. The proof can be done just by analyzing what each of the matrices does.
Let $V$ be a vector space and let $\mathcal{B}$ be a basis in $V$. Consider a linear transformation $T: V \to V$ and let $[T]_{\mathcal{B}\mathcal{B}}$ be its matrix in this basis (we use the same basis for “inputs” and “outputs”).
The case when we use the same basis for “inputs” and “outputs” is very important (because in this case we can multiply a matrix by itself), so let us study this case a bit more carefully. Notice that very often in this case the shorter notation $[T]_{\mathcal{B}}$ is used instead of $[T]_{\mathcal{B}\mathcal{B}}$. However, the two index notation is better adapted to the balance of indices rule, so I recommend using it (or at least always keeping it in mind) when doing change of coordinates.
Let $\mathcal{A}$ be another basis in $V$. By the change of coordinates rule above
\[
[T]_{\mathcal{A}\mathcal{A}} = [I]_{\mathcal{A}\mathcal{B}}\, [T]_{\mathcal{B}\mathcal{B}}\, [I]_{\mathcal{B}\mathcal{A}}.
\]
Recalling that
\[
[I]_{\mathcal{A}\mathcal{B}} = \bigl([I]_{\mathcal{B}\mathcal{A}}\bigr)^{-1}
\]
and denoting $Q := [I]_{\mathcal{B}\mathcal{A}}$, we can rewrite the above formula as
\[
[T]_{\mathcal{A}\mathcal{A}} = Q^{-1}\, [T]_{\mathcal{B}\mathcal{B}}\, Q.
\]
This gives a motivation for the following definition.
We say that a matrix $A$ is similar to a matrix $B$ if there exists an invertible matrix $Q$ such that $A = Q^{-1} B Q$.
Since an invertible matrix must be square, it follows from counting dimensions that similar matrices $A$ and $B$ have to be square and of the same size. If $A$ is similar to $B$, i.e. if $A = Q^{-1} B Q$, then
\[
B = Q A Q^{-1} = (Q^{-1})^{-1} A\, Q^{-1}
\]
(since $Q^{-1}$ is invertible), therefore $B$ is similar to $A$. So, we can just say that $A$ and $B$ are similar.
The above reasoning shows that it does not matter where to put $Q$ and where $Q^{-1}$: one can use the formula $A = Q B Q^{-1}$ in the definition of similarity.
The above discussion shows that one can treat similar matrices as different matrix representations of the same linear operator (transformation).
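A small numpy check of this point of view (the operator and the basis are my own example): computing the matrix of an operator in a new basis produces $Q^{-1}[T]_{\mathcal{B}\mathcal{B}}Q$, and similar matrices share properties of the operator, e.g. the rank.

```python
import numpy as np

T_standard = np.array([[2., 1.],
                       [0., 3.]])             # matrix of T in the standard basis
Q = np.array([[1., 1.],
              [1., 2.]])                      # columns: the new basis vectors
T_new = np.linalg.inv(Q) @ T_standard @ Q     # matrix of T in the new basis

# Similar matrices represent the same operator, so e.g. their ranks agree.
print(np.linalg.matrix_rank(T_standard) == np.linalg.matrix_rank(T_new))  # True
```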
True or false
Every change of coordinate matrix is square;
Every change of coordinate matrix is invertible;
The matrices and are called similar if for some matrix ;
The matrices and are called similar if for some matrix ;
Similar matrices do not need to be square.
Consider the system of vectors
Prove that it is a basis in . Try to do minimal amount of computations.
Find the change of coordinate matrix that changes the coordinates in this basis to the standard coordinates in (i.e. to the coordinates in the standard basis ).
Find the change of coordinates matrix that changes the coordinates in the basis in to the coordinates in the basis .
Let be the linear operator in defined (in the standard coordinates) by
Find the matrix of in the standard basis and in the basis
Prove, that if and are similar matrices then . Hint: recall how and are related.
Are the matrices
similar? Justify.