3-3-2-4. Creating and Saving NumPy ndarrays
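In general, there are two ways to create NumPy arrays.
First, using NumPy’s array function to create them from
other array-like objects such as regular Python lists.
And second, using a variety of
built-in NumPy functions that quickly generate specific types of arrays.
In this section, we will start with the first way.
Let’s import NumPy and create our first array.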
Here’s a one-dimensional array that contains integers.
Note that for clarity,
the examples throughout these lessons will use small, simple arrays.
We’ll start by creating one-dimensional or 1D NumPy arrays.
Let’s print the array we just created,
as well as its type.
You can see that the type is NumPy’s ndarray or n-dimensional array.
NumPy arrays have useful attributes that give
us information about them in a very intuitive way.
For example, this dtype attribute.
Dtype returns the data type of the elements in that array.
Notice, dtype is different from the datatype of the array itself.
This dtype lets us know that the elements of X
are stored in memory as signed 64-bit integers.
An additional advantage of NumPy is that it handles more datatypes than Python.
You can check out all the different datatypes supported by NumPy in its documentation.
Another useful attribute is shape.
This returns a tuple of n positive integers that specifies
the size of each dimension, n being the number of dimensions in the array.
X has one dimension.
So, shape returns a tuple with a single integer indicating the length of the array, five.
If we had a two-dimensional array,
this shape attribute would return a tuple with two values,
one for the number of rows and one for the number of columns.
To see this, let’s create a two-dimensional array from a nested Python list.
Here is one that contains integers.
And let’s print an additional attribute, size.
Looking at the tuple returned by shape,
we know that Y has two dimensions since there are two elements.
One is for the size of the first dimension which is the number of rows,
four, and the other is for the second dimension,
which is the number of columns, three.
The size attribute gives us the total number of elements in Y which is 12.
Let’s pause for a second to introduce some useful terminology.
In general, we say that an array with n dimensions has rank n. So,
we refer to the 1D array we created earlier as
a rank 1 array and we refer to the 2D array we just created as a rank 2 array.
For our next example,
let’s create a rank 1 array that contains strings,
and let’s print those same attributes.
The type of the array object itself isn’t any different,
it’s still just a NumPy array.
However, the dtype of this array is different.
Here, the elements are stored as Unicode strings of five characters.
Notice that when NumPy creates an array,
it automatically assigns its dtype
based on the type of the elements you used to create the array.
But what happens when we try to create a NumPy array
with a list that contains both integers and strings?
We can see that even though the Python list had mixed datatypes,
the array function created a NumPy array with elements of all the same datatype, namely,
Unicode strings of 21 characters.
Remember, unlike Python lists,
NumPy arrays must contain elements of the same type.
Up until now, we’ve only used elements that were integers or strings.
Let’s try another example with mixed datatypes using integers and floats.
When we input a list with both integers and floats,
NumPy assigns all elements
the float64 dtype.
This is called upcasting.
Since all the elements of a NumPy array must be of the same type,
NumPy upcasts the integers in the array to
floats in order to avoid losing precision in numerical computations.
NumPy also allows you to specify
a particular dtype you want to assign to the elements of an array.
You can do this using the keyword dtype in the array function.
Here, you can see NumPy created an array of
ints even though we passed it a list of floats.
Specifying the datatype of the elements in
a NumPy array can be useful in cases where you don’t want NumPy to accidentally
choose the wrong datatype or when you only need
a certain amount of precision in your calculations and want to save memory.
Once you create a NumPy array,
you may want to save it to a file to be read later or to be used by another program.
NumPy provides a way to save arrays into files for later use.
We can save X into the current directory like this.
This saves the array into a file named my_array.npy.
You can later load this file into a variable by using the load function like this.
Creating NumPy ndarrays
At the core of NumPy is the ndarray, where nd stands for n-dimensional. An ndarray is a multidimensional array of elements all of the same type. In other words, an ndarray is a grid that can take on many shapes and can hold either numbers or strings. In many Machine Learning problems you will often find yourself using ndarrays in many different ways. For instance, you might use an ndarray to hold the pixel values of an image that will be fed into a Neural Network for image classification.
But before we can dive in and start using NumPy to create ndarrays we need to import it into Python. We can import packages into Python using the import command, and it has become a convention to import NumPy as np. Therefore, you can import NumPy by typing the following command in your Jupyter notebook:
    import numpy as np
There are several ways to create ndarrays in NumPy. In the following lessons we will see two ways to create ndarrays:
- Using regular Python lists
- Using built-in NumPy functions
In this section, we will create ndarrays by providing Python lists to the NumPy np.array() function. This can create some confusion for beginners, but it is important to remember that np.array() is NOT a class, it is just a function that returns an ndarray. Note that, for clarity, the examples throughout these lessons will use small and simple ndarrays. Let’s start by creating 1-Dimensional (1D) ndarrays.
    # We import NumPy into Python
    import numpy as np

    # We create a 1D ndarray that contains only integers
    x = np.array([1, 2, 3, 4, 5])

    # Let's print the ndarray we just created using the print() command
    print('x = ', x)
x =  [1 2 3 4 5]
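To make the function-versus-class distinction concrete, here is a minimal sketch. It uses Python's standard inspect module, which is not part of the original lesson, to check that np.array is a plain function while np.ndarray is the class of the object it returns. The variable names are only illustrative.

```python
import inspect
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# np.array is a factory function; np.ndarray is the class of what it returns
array_is_class = inspect.isclass(np.array)
ndarray_is_class = inspect.isclass(np.ndarray)
x_is_ndarray = isinstance(x, np.ndarray)

print('np.array is a class:  ', array_is_class)
print('np.ndarray is a class:', ndarray_is_class)
print('x is an ndarray:      ', x_is_ndarray)
```

In everyday code we rarely need this check; it is shown only to illustrate why type(x) reports numpy.ndarray rather than anything named "array".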
Rank of an Array (numpy.ndarray.ndim)
Syntax:
ndarray.ndim
It returns the number of array dimensions.
Let’s pause for a second to introduce some useful terminology. We refer to 1D arrays as rank 1 arrays. In general N-Dimensional arrays have rank N. Therefore, we refer to a 2D array as a rank 2 array.
    # 1-D array
    x = np.array([1, 2, 3])
    print(x.ndim)

    # 2-D array
    Y = np.array([[1,2,3],[4,5,6],[7,8,9], [10,11,12]])
    print(Y.ndim)

    # Here zeros() is a built-in function that you'll study on the next page.
    # The tuple (2, 3, 4) passed as an argument represents the shape of the ndarray
    y = np.zeros((2, 3, 4))
    print(y.ndim)
1
2
3
numpy.ndarray.shape
Syntax:
ndarray.shape
It returns a tuple representing the array dimensions.
Another important property of arrays is their shape. The shape of an array is the size along each of its dimensions. For example, the shape of a rank 2 array will correspond to the number of rows and columns of the array. As you will see, NumPy ndarrays have attributes that allow us to get information about them in a very intuitive way. For example, the shape of an ndarray can be obtained using the .shape attribute. The shape attribute returns a tuple of N positive integers that specify the size of each dimension.
numpy.dtype
The dtype tells us the data-type of the elements. Remember, a NumPy array is homogeneous, meaning all elements will have the same data-type. In the example below, we will create a rank 1 array and learn how to obtain its shape, its type, and the data-type (dtype) of its elements.
Example 1.a – Using a 1-D Array of Integers
    # We create a 1D ndarray that contains only integers
    x = np.array([1, 2, 3, 4, 5])

    # We print information about x
    print('x = ', x)
    print('x has dimensions:', x.shape)
    print('x is an object of type:', type(x))
    print('The elements in x are of type:', x.dtype)
x =  [1 2 3 4 5]
x has dimensions: (5,)
x is an object of type: <class 'numpy.ndarray'>
The elements in x are of type: int64
We can see that the shape attribute returns the tuple (5,), telling us that x is of rank 1 (i.e. x only has 1 dimension) and that it has 5 elements. The type() function tells us that x is indeed a NumPy ndarray. Finally, the .dtype attribute tells us that the elements of x are stored in memory as signed 64-bit integers. Another great advantage of NumPy is that it can handle more data-types than Python lists. You can check out all the different data types NumPy supports in the NumPy documentation.
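As a rough illustration of the wider range of data types, the sketch below stores the same three values under several NumPy dtypes and reports how many bytes each element takes. The particular dtypes chosen, and the pixels array, are our own picks for illustration and do not come from the original notebook.

```python
import numpy as np

# The same three values stored with different element types
values = [1, 2, 3]
sizes = {}
for dt in (np.int8, np.uint8, np.int64, np.float32, np.float64):
    a = np.array(values, dtype=dt)
    sizes[a.dtype.name] = a.itemsize
    print(a.dtype, 'uses', a.itemsize, 'byte(s) per element')

# Unsigned 8-bit integers cover 0..255, a common choice for image pixel values
pixels = np.array([0, 128, 255], dtype=np.uint8)
print('pixels =', pixels, 'dtype:', pixels.dtype)
```

Plain Python has a single int type and a single float type, so this kind of control over element size is specific to NumPy.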
As mentioned earlier, ndarrays can also hold strings. Let’s see how we can create a rank 1 ndarray of strings in the same manner as before, by providing the np.array() function a Python list of strings.
Example 1.b – Using 1-D Array of Strings
    # We create a rank 1 ndarray that only contains strings
    x = np.array(['Hello', 'World'])

    # We print information about x
    print('x = ', x)
    print('x has dimensions:', x.shape)
    print('x is an object of type:', type(x))
    print('The elements in x are of type:', x.dtype)
x =  ['Hello' 'World']
x has dimensions: (2,)
x is an object of type: <class 'numpy.ndarray'>
The elements in x are of type: <U5
As we can see, the shape attribute tells us that x now has only 2 elements, and even though x now holds strings, the type() function tells us that x is still an ndarray as before. In this case, however, the .dtype attribute tells us that the elements in x are stored in memory as Unicode strings of 5 characters.
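One practical consequence of the fixed 5-character width is worth a quick sketch. This example is not part of the original notebook: it shows that the width is fixed when the array is created, so a longer string assigned later is cut to fit rather than growing the array.

```python
import numpy as np

x = np.array(['Hello', 'World'])

# Characters per element: total bytes per element / bytes per Unicode character
width = x.dtype.itemsize // np.dtype('U1').itemsize

# Assigning a longer string does not widen the array; the value is truncated
x[0] = 'Wonderful'

print('characters per element:', width)
print('x = ', x)
```

If longer strings are expected later, one option is to request a wider dtype up front, for example dtype='U20'.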
It is important to remember that one big difference between Python lists and ndarrays is that, unlike Python lists, all the elements of an ndarray must be of the same type. So, while we can create Python lists with both integers and strings, we can’t mix types in ndarrays. If you provide the np.array() function with a Python list that has both integers and strings, NumPy will interpret all elements as strings. We can see this in the next example:
Example 1.c – Using a 1-D Array of Mixed Datatype
    # We create a rank 1 ndarray from a Python list that contains integers and strings
    x = np.array([1, 2, 'World'])

    # We print information about x
    print('x = ', x)
    print('x has dimensions:', x.shape)
    print('x is an object of type:', type(x))
    print('The elements in x are of type:', x.dtype)
x =  ['1' '2' 'World']
x has dimensions: (3,)
x is an object of type: <class 'numpy.ndarray'>
The elements in x are of type: <U21
We can see that even though the Python list had mixed data types, the elements in x are all of the same type, namely, Unicode strings of 21 characters. We won’t be using ndarrays with strings for the remainder of this introduction to NumPy, but it’s important to remember that ndarrays can hold strings as well.
Using a 1-D Array to Demonstrate Upcasting in Numeric datatype
Up till now, we have only created ndarrays with integers and strings. We saw that when we create an ndarray with only integers, NumPy will automatically assign the dtype int64 to its elements. Let’s see what happens when we create ndarrays with floats and integers.
Example 1.d – Using a 1-D Array of Int and Float
    # We create a rank 1 ndarray that contains integers
    x = np.array([1,2,3])

    # We create a rank 1 ndarray that contains floats
    y = np.array([1.0,2.0,3.0])

    # We create a rank 1 ndarray that contains integers and floats
    z = np.array([1, 2.5, 4])

    # We print the dtype of each ndarray
    print('The elements in x are of type:', x.dtype)
    print('The elements in y are of type:', y.dtype)
    print('The elements in z are of type:', z.dtype)
The elements in x are of type: int64
The elements in y are of type: float64
The elements in z are of type: float64
We can see that when we create an ndarray with only floats, NumPy stores the elements in memory as 64-bit floating point numbers (float64). However, notice that when we create an ndarray with both floats and integers, as we did with the z ndarray above, NumPy assigns its elements a float64 dtype as well. This is called upcasting. Since all the elements of an ndarray must be of the same type, in this case NumPy upcasts the integers in z to floats in order to avoid losing precision in numerical computations.
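As a small sketch of the upcasting rule, we can ask NumPy directly which dtype it would pick when integers and floats meet. The np.result_type() call is not used in the original lesson; it appears here only to make the choice visible.

```python
import numpy as np

z = np.array([1, 2.5, 4])

# np.result_type reports the dtype NumPy selects when combining two types
common = np.result_type(np.int64, np.float64)

print('z = ', z)  # the integers 1 and 4 are now stored as floats
print('int64 combined with float64 gives:', common)
```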
Using a 1-D Array of Float, and specifying the dtype of each element
Even though NumPy automatically selects the dtype of the ndarray, NumPy also allows you to specify the particular dtype you want to assign to the elements of the ndarray. You can specify the dtype when you create the ndarray using the keyword dtype in the np.array() function. Let’s see an example:
Example 1.e – Using a 1-D Array of Float, and specifying the datatype of each element as int64
    # We create a rank 1 ndarray of floats but set the dtype to int64
    x = np.array([1.5, 2.2, 3.7, 4.0, 5.9], dtype = np.int64)

    # We print x and the dtype of its elements
    print('x = ', x)
    print('The elements in x are of type:', x.dtype)
x =  [1 2 3 4 5]
The elements in x are of type: int64
We can see that even though we created the ndarray with floats, by specifying the dtype to be int64, NumPy converted the floating point numbers into integers by removing their decimals. Specifying the data type of the ndarray can be useful in cases when you don’t want NumPy to accidentally choose the wrong data type, or when you only need a certain amount of precision in your calculations and you want to save memory.
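Two points from this paragraph can be sketched briefly. First, "removing the decimals" means truncation toward zero, not rounding to the nearest integer. Second, a narrower dtype really does use less memory. The example uses astype() and np.rint(), neither of which appears in the original notebook, and the array contents are made up for illustration.

```python
import numpy as np

floats = np.array([1.5, 2.2, 3.7, -1.5])

# Conversion to an integer dtype truncates toward zero; it does not round
truncated = floats.astype(np.int64)

# Rounding first is one option when nearest-integer values are wanted
rounded = np.rint(floats).astype(np.int64)

print('truncated:', truncated)
print('rounded:  ', rounded)

# The same number of elements takes half the memory in float32 as in float64
big = np.zeros(1000, dtype=np.float64)
small = np.zeros(1000, dtype=np.float32)
print('float64 bytes:', big.nbytes, '| float32 bytes:', small.nbytes)
```

Note that np.rint() rounds halves to the nearest even integer, so 1.5 becomes 2 and -1.5 becomes -2.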
numpy.ndarray.size and Creating a 2-D array
Another useful attribute is ndarray.size, which returns the number of elements in the array. Let us now look at how we can create a rank 2 ndarray from a nested Python list.
Example 2 – Using a 2-D Array (Rank #2 Array)
    # We create a rank 2 ndarray that only contains integers
    Y = np.array([[1,2,3],[4,5,6],[7,8,9], [10,11,12]])
    print('Y = \n', Y)

    # We print information about Y
    print('Y has dimensions:', Y.shape)
    print('Y has a total of', Y.size, 'elements')
    print('Y is an object of type:', type(Y))
    print('The elements in Y are of type:', Y.dtype)
Y = 
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
Y has dimensions: (4, 3)
Y has a total of 12 elements
Y is an object of type: <class 'numpy.ndarray'>
The elements in Y are of type: int64
We can see that now the shape attribute returns the tuple (4, 3), telling us that Y is of rank 2 and that it has 4 rows and 3 columns. The .size attribute tells us that Y has a total of 12 elements.
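The three attributes we have met so far are related, and a short sketch makes this visible: the rank is the length of the shape tuple, and the size is the product of its entries. The helper names rank and total are only illustrative.

```python
import numpy as np

Y = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

rank = len(Y.shape)            # number of dimensions, same value as Y.ndim
total = int(np.prod(Y.shape))  # rows * columns, same value as Y.size

print('shape:', Y.shape)
print('rank from shape:', rank, '| Y.ndim:', Y.ndim)
print('size from shape:', total, '| Y.size:', Y.size)
```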
Notice that when NumPy creates an ndarray it automatically assigns its dtype based on the type of the elements you used to create the ndarray.
Save the NumPy array to a File
Once you create an ndarray, you may want to save it to a file to be read later or to be used by another program. NumPy provides a way to save the arrays into files for later use – let’s see how this is done.
Example 3 – Save the NumPy array to a File
    # We create a rank 1 ndarray
    x = np.array([1, 2, 3, 4, 5])

    # We save x into the current directory as my_array.npy
    np.save('my_array', x)
The above saves the x ndarray into a file named my_array.npy. You can load the saved ndarray into a variable by using the load() function.
    # We load the saved array from our current directory into variable y
    y = np.load('my_array.npy')

    # We print y
    print()
    print('y = ', y)
    print()

    # We print information about the ndarray we loaded
    print('y is an object of type:', type(y))
    print('The elements in y are of type:', y.dtype)

y =  [1 2 3 4 5]

y is an object of type: <class 'numpy.ndarray'>
The elements in y are of type: int64
When loading an array from a file, make sure you include the name of the file together with the extension .npy, otherwise you will get an error.
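The save and load steps can be sketched as one round trip. To avoid leaving files behind, this version writes into a temporary directory created with Python's tempfile module. That is our own choice for the sketch, whereas the lesson saves into the current directory. It also shows the error raised when the .npy extension is left off.

```python
import os
import tempfile
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Work in a throwaway directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, 'my_array')

    np.save(base, x)              # NumPy appends '.npy' to the file name
    saved_as = os.listdir(tmp)

    y = np.load(base + '.npy')    # extension included: the array is loaded

    try:
        np.load(base)             # extension missing: no such file exists
        missing_ext_error = None
    except FileNotFoundError as err:
        missing_ext_error = err

print('file written:', saved_as)
print('y = ', y)
print('loading without .npy raised:', type(missing_ext_error).__name__)
```

The loaded array has the same values and the same dtype as the one we saved, so nothing needs to be converted after loading.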
Additional Resource
- For more examples, see NumPy.org – How to create a basic array