CS 2120: Class #9¶
Array – first steps¶
The list was our first data structure.
Now we’re going to meet a similar, but slightly different, one: the array
Let’s get started:
>>> a=numpy.array([5,4,2]) >>> print a [5 4 2]
Looks a lot like a list, doesn’t it?
Can we manipulate it like a list?
>>> print a[0] 5 >>> print a[1] 4
We can definitely index it, the same as a list.
I wonder if arrays are mutable?
>>> a[1]=7 >>> print a [5 7 2]
Yes, arrays are mutable.
With lists, I could mix types in a single list. Like this:
>>> l = [5,4,3] >>> l[2] = 'walrus' >>> print l [5, 4, 'walrus']
Can I do that with arrays?
>>> a=numpy.array([5,4,2]) >>> a[2]='walrus' Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: invalid literal for long() with base 10: 'walrus'
Ah ha! We found a way in which arrays are different.
Lists are just collections of stuff. Any old stuff. Each element can be of a different type.
In an array, every element must have the same type!
Activity
Create two arrays of integers, each having the same number of elements.
What mathematical operations can you do on the arrays? (+,-,*,/
).
What happens if you try to perform the operations on arrays of different sizes?
How does +
work differently on arrays than lists?
NumPy object attributes¶
We can ask NumPy what type the items in an array have like this:
>>> a.dtype dtype('int64')
This is a new notation for us. We’re used to passing something to a function, which will tell us the type. Like this:
>>> type(something)
Here, we instead asked NumPy to tell us about an attribute of our array (In this case the attribute
dtype
– standing for “data type”):>>> a.dtype
Objects in NumPy have many attributes. These will mostly get set automatically for you, but we’ll need to set a few of them manually later.
If you want to see all the attributes your array has you can either type
a.
and then press the [Tab] key (if you’re using ipython) or you can do:>>> dir(a)
That’s a lot of attributes!
Some of those attributes are things like
dtype
that store information about the state of the object.Some are special functions that can only be applied to that object. For example, every NumPy object comes with it’s own
view
function. Check it out:>>> a.view() array([5, 4, 2])
- When a function appears after a
.
, that function is automatically applied to the object appearing before the.
- These special functions built in to objects can also take parameters.
- When a function appears after a
For example, we can change the types of the elements of our array:
>>> b=a.astype(numpy.float32) >>> b.view() array([ 5., 4., 2.], dtype=float32)
Activity
Create an array a = numpy.array([1,2,6,7,5,4,3])
. Figure out to find/do
the following things with attributes of a
:
- “view”
a
- sort
a
- find the maximum value in
a
- find the minimum value in
a
- add up (sum) all the values in
a
- find the average (mean) of the values in
a
Making arrays bigger¶
With lists, we could always append items to make them bigger (
+
)>>> [1,2,3] + [4] [1,2,3,4]
Arrays are meant to have fixed size.
Why do you think this is?
If you really, really, want to make an array bigger... you can’t.
You can however, make a new array that is bigger using
numpy.append()
:>>> a = numpy.array([1,2,3,4]) >>> a.view() array([1, 2, 3, 4]) >>> b = numpy.append(a,5) >>> a.view() array([1, 2, 3, 4]) >>> b.view() array([1, 2, 3, 4, 5])
Note carefully that
numpy.append()
did not change a. It created a new array, b.
Activity
Create an array of 4 integers.
Create a new, bigger, array by appending the integer 7
on to your array.
Create another new array by appending the string 'walrus'
.
Did that last one work? What happened?
Flexibility vs Power¶
Arrays are less flexible than lists:
- We can’t change their size
- They can only store data of a single type
But... it is this very lack of flexibility that lets us do all sorts of cool stuff like have a
.sum()
attribute.
Activity
How would you implement .sum()
for a list?
Higher dimensions¶
Numpy arrays generalize to higher dimensions.
Let’s create a 2D array:
>>> a=numpy.array([[1,2,3],[4,5,6],[7,8,9]]) >>> a.view() array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
Note the format in our call to
numpy.array
. A list of lists.Each row of the array gets its own list.
As long as two 2D arrays have the same shape, you can do arithmetic on them, just like 1D arrays.
How do we check the shape of an array?
>>> a.shape (3, 3)
Activity
Create a 4x4 array. Verify that it has shape
(4,4)
.
You’ve changed your mind. The array should actually be 2x8. reshape
your 4x4 array in to a 2x8 array without recreating it from scratch.
Verify that the reshaped array is (2,8)
.
Finally flatten
your 2D array into a 1D array.
Starting points¶
Sometimes you want an array of shape
(n,m)
that contains all zeros:>>> a=numpy.zeros([n,m])
Guess what
numpy.ones()
does?How about
numpy.eye()
?
Slicing¶
- We’ve already seen that you can index arrays like lists (and strings)
- Likewise, you can use Python’s powerful slicing on arrays.
Activity
- Create an array
arr = numpy.array([0,1,2,3,4,5,6,7])
. Using a single command - Print the first 3 elements
- Print the last 3 elements
- Print the even elements of
arr
Slicing works for higher dimensional arrays, too. For example:
>>> a=numpy.arange(25).reshape(5,5) >>> a.view() array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]]) >>> print a[0:2,1:4] [[1 2 3] [6 7 8]]
Note the use of
numpy.arange
which works likerange
but returns an array.If you want a whole column/row/etc, you can use a plain
:
as the index. For example, if I wanted to pull out every row of the first two columns:>>> print a[:,0:2] [[ 0 1] [ 5 6] [10 11] [15 16] [20 21]]
Activity
Modify the previous command to print all of the columns of the first two rows.
For loops¶
- If
for
loops work for lists, do you think they’ll work for arrays?
Activity
Write a function printeach(arr)
that uses a for
loop to print each element of an array that is passed in as a parameter.
Test it on a 1D array.
Now try a 2D array.
If you’re feeling bold, how about a 3D array?
NumPy Matrices¶
NumPy makes a distinction between a 2D array and a matrix.
Every matrix is definitely a 2D array, but not every 2D array is a matrix.
>>> a = numpy.eye(4) >>> b=numpy.array([1,2,3,4]) >>> b*a array([[ 1., 0., 0., 0.], [ 0., 2., 0., 0.], [ 0., 0., 3., 0.], [ 0., 0., 0., 4.]])
That... wasn’t what we expected. It’s a perfectly reasonable interpretation of our request, but here we really wanted matrix-vector multiplication.
If we want Python to treat the NumPy arrays as matrices and vectors, we have to explicitly say so. Like this:
>>> numpy.dot(a,b) array([ 1., 2., 3., 4.])
Another option is to convert
a
to thematrix
type:>>> a = numpy.matrix(a) >>> b *a matrix([[ 1., 2., 3., 4.]])
The preferred option is the first one. Keep everything as an array and use
numpy.dot
when you want matrix/vector multiplication.Moral of the story: if you want your 2D array to behave like a matrix when you do things like multiplication... you need to tell Python that it’s a matrix.
Activity
Write a function chain(matrix,n)
that will take an input matrix, and
return the result of multiplying it by itself n
times. Test it out on
a square matrix with every entry between 0
and 1
and the entries in each row
summing exactly to 1
. Like this:
>>> a=numpy.array([[0.2,0.8,0.0],[0.1,0.3,0.6],[0.0,0.8,0.1]])
This is an example of something called a Right Stochastic Matrix
Note
That last activity took you most of the way to implementing a Markov Chain . If your interests run towards state-based probabilistic simulation... you’ll want to follow up on this.
NumPy Linear Algebra¶
As you might expect, NumPy already has a whoooole bunch of linear algebra routines built right in:
>>> a=numpy.random.rand(5,5) # Create a random matrix >>> matshow(a) # Visualize it >>> numpy.linalg.eigvals(a) # Find its eigenvalues >>> numpy.linalg.det(a) # Find its determinant >>> numpy.linalg.inv(a) # Find its inverse
>>> b=numpy.random.rand(5) # Create a random vector >>> numpy.linalg.solve(a,b) # Solve the system of linear equations ax = b
And much, much, more.
Bottom line: If there’s linear algebra you need to do, NumPy/SciPy almost certainly already have built-in routines to do it. Google and the NumPy docs are your friends!
Going further with NumPy¶
- We’ve only just scratched the surface of what NumPy can do.
- If you foresee yourself using NumPy frequently, read the assigned reading carefully.
- Even that is just the bare-bones basics.
- You’ll eventually want to dig in to The NumPy docs. Don’t try to read those “cover-to-cover” though. The best way to use the docs is to look up stuff that you actually need and just read about that. Eventually you’ll get a feel for the capabilities of the package.
For next class¶
- If you think you’ll use NumPy a lot (and you probably will), read the NumPy Tutorial.
- Just Skim. Reading it in detail at this point might give you a headache.
- Read the parts 1 through 4 of chapter 11
- Read a little bit of chapter 12
- Read chapter 13 of the text to get an idea of wtf objects are Don’t panic. Just get an idea of wtf they are.