1.3.1. The NumPy array object

1.3.1.1. What are NumPy and NumPy arrays?

NumPy arrays

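Python objects:

high-level number objects: integers, floating point
containers: lists (cheap insertion and append), dictionaries (fast lookup)

NumPy provides:

an extension package to Python for multi-dimensional arrays
closer to hardware (efficiency)
designed for scientific computation (convenience)
also known as array-oriented computing

>>> import numpy as np
>>> a = np.array([0, 1, 2, 3])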
>>> a
array([0, 1, 2, 3])

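Tip

For example, an array containing:

values of an experiment/simulation at discrete time steps
a signal recorded by a measurement device, e.g. a sound wave
the pixels of an image, grey-level or colour
3-D data measured at different X-Y-Z positions, e.g. an MRI scan
…

Why it is useful: a memory-efficient container that provides fast numerical operations.

In [1]: L = range(1000)
In [2]: %timeit [i**2 for i in L]
50.6 us +- 725 ns per loop (mean +- std. dev. of 7 runs, 10,000 loops each)
In [3]: a = np.arange(1000)
In [4]: %timeit a**2
920 ns +- 7.16 ns per loop (mean +- std. dev. of 7 runs, 1,000,000 loops each)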
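
Outside IPython, the same comparison can be sketched with the standard-library timeit module. This is only an illustration: the variable names are chosen here, and the absolute timings depend on the machine and the NumPy version.

```python
import timeit

import numpy as np

n = 1000
values = range(n)
arr = np.arange(n)

# Both approaches compute the same squares.
squares_list = [i**2 for i in values]
squares_arr = arr**2

# Time 1000 repetitions of each approach and report the time per loop.
repeats = 1000
t_list = timeit.timeit(lambda: [i**2 for i in values], number=repeats)
t_arr = timeit.timeit(lambda: arr**2, number=repeats)
print(f"list comprehension: {t_list / repeats * 1e6:.1f} us per loop")
print(f"NumPy array:        {t_arr / repeats * 1e6:.1f} us per loop")
```

The array version is usually much faster because the loop over the elements runs in compiled code rather than in the Python interpreter.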

NumPy reference documentation

On the web: https://numpy.com.cn/doc/

Interactive help:

In [5]: np.array?
Docstring:
array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0,
      like=None)

Create an array.

Parameters
----------
object : array_like
    An array, any object exposing the array interface, an object whose
    ``__array__`` method returns an array, or any (nested) sequence.
    If object is a scalar, a 0-dimensional array containing object is
    returned.
dtype : data-type, optional
    The desired data-type for the array. If not given, NumPy will try to use
    a default ``dtype`` that can represent the values (by applying promotion
    rules when necessary.)
copy : bool, optional
    If ``True`` (default), then the array data is copied. If ``None``,
    a copy will only be made if ``__array__`` returns a copy, if obj is
    a nested sequence, or if a copy is needed to satisfy any of the other
    requirements (``dtype``, ``order``, etc.). Note that any copy of
    the data is shallow, i.e., for arrays with object dtype, the new
    array will point to the same objects. See Examples for `ndarray.copy`.
    For ``False`` it raises a ``ValueError`` if a copy cannot be avoided.
    Default: ``True``.
order : {'K', 'A', 'C', 'F'}, optional
    Specify the memory layout of the array. If object is not an array, the
    newly created array will be in C order (row major) unless 'F' is
    specified, in which case it will be in Fortran order (column major).
    If object is an array the following holds.

    ===== ========= ===================================================
    order  no copy                     copy=True
    ===== ========= ===================================================
    'K'   unchanged F & C order preserved, otherwise most similar order
    'A'   unchanged F order if input is F and not C, otherwise C order
    'C'   C order   C order
    'F'   F order   F order
    ===== ========= ===================================================

    When ``copy=None`` and a copy is made for other reasons, the result is
    the same as if ``copy=True``, with some exceptions for 'A', see the
    Notes section. The default order is 'K'.
subok : bool, optional
    If True, then sub-classes will be passed-through, otherwise
    the returned array will be forced to be a base-class array (default).
ndmin : int, optional
    Specifies the minimum number of dimensions that the resulting
    array should have. Ones will be prepended to the shape as
    needed to meet this requirement.
like : array_like, optional
    Reference object to allow the creation of arrays which are not
    NumPy arrays. If an array-like passed in as ``like`` supports
    the ``__array_function__`` protocol, the result will be defined
    by it. In this case, it ensures the creation of an array object
    compatible with that passed in via this argument.

    .. versionadded:: 1.20.0

Returns
-------
out : ndarray
    An array object satisfying the specified requirements.

See Also
--------
empty_like : Return an empty array with shape and type of input.
ones_like : Return an array of ones with shape and type of input.
zeros_like : Return an array of zeros with shape and type of input.
full_like : Return a new array with shape of input filled with value.
empty : Return a new uninitialized array.
ones : Return a new array setting values to one.
zeros : Return a new array setting values to zero.
full : Return a new array of given shape filled with value.
copy : Return an array copy of the given object.

Notes
-----
When order is 'A' and ``object`` is an array in neither 'C' nor 'F' order,
and a copy is forced by a change in dtype, then the order of the result is
not necessarily 'C' as expected. This is likely a bug.

Examples
--------
>>> import numpy as np
>>> np.array([1, 2, 3])
array([1, 2, 3])

Upcasting:

>>> np.array([1, 2, 3.0])
array([1., 2., 3.])

More than one dimension:

>>> np.array([[1, 2], [3, 4]])
array([[1, 2],
       [3, 4]])

Minimum dimensions 2:

>>> np.array([1, 2, 3], ndmin=2)
array([[1, 2, 3]])

Type provided:

>>> np.array([1, 2, 3], dtype=complex)
array([1.+0.j, 2.+0.j, 3.+0.j])

Data-type consisting of more than one element:

>>> x = np.array([(1, 2), (3, 4)], dtype=[('a', '<i4'), ('b', '<i4')])
>>> x['a']
array([1, 3])

Creating an array from sub-classes:

>>> np.array(np.asmatrix('1 2; 3 4'))
array([[1, 2],
       [3, 4]])

>>> np.array(np.asmatrix('1 2; 3 4'), subok=True)
matrix([[1, 2],
        [3, 4]])
Type:      builtin_function_or_method

Tip

>>> help(np.array)
Help on built-in function array in module numpy:
array(...)
    array(object, dtype=None, ...

Looking for something:

In [6]: np.con*?
np.concat
np.concatenate
np.conj
np.conjugate
np.convolve

Import conventions

The recommended convention to import NumPy is:

>>> import numpy as np

1.3.1.2. Creating arrays

Manual construction of arrays

1-D:

>>> a = np.array([0, 1, 2, 3])
>>> a
array([0, 1, 2, 3])
>>> a.ndim
1
>>> a.shape
(4,)
>>> len(a)
4

2-D, 3-D, …:

>>> b = np.array([[0, 1, 2], [3, 4, 5]])  # 2 x 3 array
>>> b
array([[0, 1, 2],
       [3, 4, 5]])
>>> b.ndim
2
>>> b.shape
(2, 3)
>>> len(b)  # returns the size of the first dimension
2
>>> c = np.array([[[1], [2]], [[3], [4]]])
>>> c
array([[[1],
        [2]],
       [[3],
        [4]]])
>>> c.shape
(2, 2, 1)
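
The attributes used above are related to one another. The following sketch, which also uses the size attribute (not shown in the text above), illustrates how they fit together for the 3-D array c.

```python
import numpy as np

c = np.array([[[1], [2]], [[3], [4]]])

ndim = c.ndim        # number of dimensions
shape = c.shape      # length along each dimension
size = c.size        # total number of elements (product of the shape)
first_len = len(c)   # length of the first dimension only

print(ndim, shape, size, first_len)
```

Note that len(c) only reports the first entry of the shape, so size is the attribute to use for the total element count.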

Functions for creating arrays

Tip

In practice, we rarely enter items one by one…

Evenly spaced:

>>> a = np.arange(10)  # 0 .. n-1  (!)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = np.arange(1, 9, 2)  # start, end (exclusive), step
>>> b
array([1, 3, 5, 7])

or by number of points:

>>> c = np.linspace(0, 1, 6)  # start, end, num-points
>>> c
array([0. ,  0.2,  0.4,  0.6,  0.8,  1. ])
>>> d = np.linspace(0, 1, 5, endpoint=False)
>>> d
array([0. ,  0.2,  0.4,  0.6,  0.8])
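
The difference between the two functions is which quantity is fixed: np.arange takes a step and derives the number of points, while np.linspace takes the number of points and derives the step. A minimal sketch, using the retstep argument of np.linspace to recover the derived step:

```python
import numpy as np

# linspace: endpoints and number of points are given, the step is derived.
points, step = np.linspace(0, 1, 6, retstep=True)

# arange: start, stop (exclusive) and step are given, the length is derived.
grid = np.arange(0, 1, 0.2)

print(points, step)
print(grid)
```

With non-integer steps, np.linspace is generally the safer choice, because the length of an np.arange result can be affected by floating point rounding.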

Common arrays:

>>> a = np.ones((3, 3))  # reminder: (3, 3) is a tuple
>>> a
array([[1.,  1.,  1.],
       [1.,  1.,  1.],
       [1.,  1.,  1.]])
>>> b = np.zeros((2, 2))
>>> b
array([[0.,  0.],
       [0.,  0.]])
>>> c = np.eye(3)
>>> c
array([[1.,  0.,  0.],
       [0.,  1.,  0.],
       [0.,  0.,  1.]])
>>> d = np.diag(np.array([1, 2, 3, 4]))
>>> d
array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])
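
A few related constructors are listed in the "See Also" section of the np.array docstring. The sketch below shows np.full and np.zeros_like, and also that np.diag works in both directions; the variable names are chosen for illustration only.

```python
import numpy as np

filled = np.full((2, 3), 7.0)        # constant-valued array
like = np.zeros_like(filled)         # zeros with the same shape and dtype

d = np.diag(np.array([1, 2, 3, 4]))  # 1-D input: build a diagonal matrix
diagonal = np.diag(d)                # 2-D input: extract the diagonal

print(filled)
print(like)
print(diagonal)
```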

np.random: random numbers (default_rng uses the PCG64 generator, not the Mersenne Twister):

>>> rng = np.random.default_rng(27446968)
>>> a = rng.random(4)  # uniform in [0, 1)
>>> a
array([0.64613018, 0.48984931, 0.50851229, 0.22563948])
>>> b = rng.standard_normal(4)  # Gaussian
>>> b
array([-0.38250769, -0.61536465,  0.98131732,  0.59353096])
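
Passing a seed to default_rng makes the sequence reproducible. The sketch below, with illustrative variable names, creates two generators from the same seed and checks that they produce identical draws.

```python
import numpy as np

seed = 27446968
rng1 = np.random.default_rng(seed)
rng2 = np.random.default_rng(seed)

a1 = rng1.random(4)
a2 = rng2.random(4)
same = np.array_equal(a1, a2)   # identical seeds give identical draws

# A larger Gaussian sample has mean close to 0 and standard deviation close to 1.
b = rng1.standard_normal(1000)
print(same, b.mean(), b.std())
```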

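1.3.1.3. Basic data types

You may have noticed that, in some instances, array elements are displayed with a trailing dot (e.g. 2. vs 2). This is due to a difference in the data type used:

>>> a = np.array([1, 2, 3])
>>> a.dtype
dtype('int64')
>>> b = np.array([1., 2., 3.])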
>>> b.dtype
dtype('float64')

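Tip

Different data types allow us to store data more compactly in memory, but most of the time we simply work with floating point numbers. Note that, in the example above, NumPy auto-detects the data type from the input.

You can explicitly specify which data type you want:

>>> c = np.array([1, 2, 3], dtype=float)
>>> c.dtype
dtype('float64')

The default data type of array-creation functions such as np.ones is floating point:

>>> a = np.ones((3, 3))
>>> a.dtype
dtype('float64')

There are also other types:

Complex:

>>> d = np.array([1+2j, 3+4j, 5+6*1j])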
>>> d.dtype
dtype('complex128')

Bool:

>>> e = np.array([True, False, False, True])
>>> e.dtype
dtype('bool')

Strings:

>>> f = np.array(['Bonjour', 'Hello', 'Hallo'])
>>> f.dtype  # <--- strings containing max. 7 letters
dtype('<U7')

Much more:

int32
int64
uint32
uint64

1.3.1.4. Basic visualization

Now that we have our first data arrays, we are going to visualize them.

Start by launching IPython:

$ ipython # or ipython3 depending on your install

Or the notebook:

$ jupyter notebook

Once IPython has started, enable interactive plots:

>>> %matplotlib

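Or, from the notebook, enable plots in the notebook:

>>> %matplotlib inline

The inline is important for the notebook, so that plots are displayed in the notebook and not in a new window.

Matplotlib is a 2D plotting package. We can import its functions as below:

>>> import matplotlib.pyplot as plt  # the tidy way

And then use (note that you have to use show explicitly if you have not enabled interactive plots with %matplotlib):

>>> plt.plot(x, y)  # line plot
>>> plt.show()  # <-- shows the plot (not needed with interactive plots)

Or, if you have enabled interactive plots with %matplotlib:

>>> plt.plot(x, y)  # line plot

1D plotting:

>>> x = np.linspace(0, 3, 20)
>>> y = np.linspace(0, 9, 20)
>>> plt.plot(x, y)  # line plot
[<matplotlib.lines.Line2D object at ...>]
>>> plt.plot(x, y, 'o')  # dot plot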
[<matplotlib.lines.Line2D object at ...>]

[Figure: basic 1D plot]

2D arrays (such as images):

>>> rng = np.random.default_rng(27446968)
>>> image = rng.random((30, 30))
>>> plt.imshow(image, cmap=plt.cm.hot)
<matplotlib.image.AxesImage object at ...>
>>> plt.colorbar()
<matplotlib.colorbar.Colorbar object at ...>

[Figure: basic 2D plot]

See also

More information: the matplotlib chapter

1.3.1.5. Indexing and slicing

The items of an array can be accessed and assigned to in the same way as other Python sequences (e.g. lists):

>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[0], a[2], a[-1]
(np.int64(0), np.int64(2), np.int64(9))

Warning

Indices begin at 0, like other Python sequences (and C/C++). In contrast, in Fortran or Matlab, indices begin at 1.

The usual Python idiom for reversing a sequence is supported:

>>> a[::-1]
array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

For multidimensional arrays, indices are tuples of integers:

>>> a = np.diag(np.arange(3))
>>> a
array([[0, 0, 0],
       [0, 1, 0],
       [0, 0, 2]])
>>> a[1, 1]
np.int64(1)
>>> a[2, 1] = 10  # third row, second column
>>> a
array([[ 0,  0,  0],
       [ 0,  1,  0],
       [ 0, 10,  2]])
>>> a[1]
array([0, 1, 0])

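Note

In 2D, the first dimension corresponds to rows, the second to columns.
For multidimensional a, a[0] is interpreted by taking all elements in the unspecified dimensions.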
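
The rule for unspecified dimensions can be checked directly. In this sketch the arrays are built with reshape, which is used here only to get small test arrays with distinct values.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns

row = a[0]          # first row; equivalent to a[0, :]
col = a[:, 1]       # second column
element = a[2, 1]   # row index 2, column index 1

b = np.arange(24).reshape(2, 3, 4)
block = b[0]        # equivalent to b[0, :, :], a 3 x 4 array

print(row, col, element, block.shape)
```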

Slicing: arrays, like other Python sequences, can also be sliced:

>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[2:9:3]  # [start:end:step]
array([2, 5, 8])

Note that the last index is not included!

>>> a[:4]
array([0, 1, 2, 3])

All three slice components are optional: by default, start is 0, end is one past the last element, and step is 1:

>>> a[1:3]
array([1, 2])
>>> a[::2]
array([0, 2, 4, 6, 8])
>>> a[3:]
array([3, 4, 5, 6, 7, 8, 9])

A small illustrated summary of NumPy indexing and slicing…

You can also combine assignment and slicing:

>>> a = np.arange(10)
>>> a[5:] = 10
>>> a
array([ 0,  1,  2,  3,  4, 10, 10, 10, 10, 10])
>>> b = np.arange(5)
>>> a[5:] = b[::-1]
>>> a
array([0, 1, 2, 3, 4, 4, 3, 2, 1, 0])
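
The same combination of slicing and assignment works along several dimensions at once. A minimal sketch on a 2-D array, with one slice per dimension:

```python
import numpy as np

a = np.zeros((4, 6), dtype=int)

a[1:3, ::2] = 1          # rows 1 and 2, every second column
a[-1, :] = np.arange(6)  # the last row receives a whole array

print(a)
```

A scalar on the right-hand side is repeated over the selected region, while an array must match the shape of the region it is assigned to.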

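1.3.1.6. Copies and views

A slicing operation creates a view on the original array, which is just a way of accessing array data. Thus the original array is not copied in memory. You can use np.may_share_memory() to check if two arrays share the same memory block. Note however, that this uses heuristics and may give you false positives.

When modifying the view, the original array is modified as well:

>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = a[::2]
>>> b
array([0, 2, 4, 6, 8])
>>> np.may_share_memory(a, b)
True
>>> b[0] = 12
>>> b
array([12,  2,  4,  6,  8])
>>> a  # (!)
array([12,  1,  2,  3,  4,  5,  6,  7,  8,  9])
>>> a = np.arange(10)
>>> c = a[::2].copy()  # force a copy
>>> c[0] = 12
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.may_share_memory(a, c)
False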
False

This behavior can be surprising at first sight, but it saves both memory and time.
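
Two other tools can help tell views from copies: the base attribute of an array, and np.shares_memory(), which performs an exact check instead of the bounds-based heuristic. The sketch below also shows a false positive of np.may_share_memory(): the even and odd elements of an array interleave in memory but never overlap.

```python
import numpy as np

a = np.arange(10)
view = a[::2]          # slicing returns a view
copy = a[::2].copy()   # an explicit copy owns its data

view_has_base = view.base is a      # the view refers back to a
copy_has_base = copy.base is None   # the copy has no base array

evens, odds = a[::2], a[1::2]
heuristic = np.may_share_memory(evens, odds)   # memory bounds overlap
exact = np.shares_memory(evens, odds)          # no element is shared

print(view_has_base, copy_has_base, heuristic, exact)
```

The exact check can be slower for complicated strides, which is why the cheaper heuristic exists.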

1.3.1.7. Fancy indexing

Tip

NumPy arrays can be indexed with slices, but also with boolean or integer arrays (**masks**). This method is called *fancy indexing*. It creates **copies, not views**.

Using boolean masks

>>> rng = np.random.default_rng(27446968)
>>> a = rng.integers(0, 21, 15)
>>> a
array([ 3, 13, 12, 10, 10, 10, 18,  4,  8,  5,  6, 11, 12, 17,  3])
>>> (a % 3 == 0)
array([ True, False,  True, False, False, False,  True, False, False,
       False,  True, False,  True, False,  True])
>>> mask = (a % 3 == 0)
>>> extract_from_a = a[mask]  # or,  a[a % 3 == 0]
>>> extract_from_a  # extract a sub-array with the mask
array([ 3, 12, 18,  6, 12,  3])

Indexing with a mask can be very useful to assign a new value to a sub-array:

>>> a[a % 3 == 0] = -1
>>> a
array([-1, 13, -1, 10, 10, 10, -1,  4,  8,  5, -1, 11, -1, 17, -1])
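
Masks can be combined element-wise, and the array extracted with a mask is independent of the original. The sketch below starts from the same values as the example above, typed in by hand so that it does not depend on the random generator.

```python
import numpy as np

a = np.array([3, 13, 12, 10, 10, 10, 18, 4, 8, 5, 6, 11, 12, 17, 3])

mask = (a % 3 == 0) & (a > 5)    # combine conditions element-wise with &
count = np.count_nonzero(mask)   # number of True entries in the mask

selected = a[mask]               # fancy indexing returns a copy
selected[0] = -99                # so this does not modify a

print(count, selected, a[2])
```

The parentheses around each condition are required, because & binds more tightly than the comparison operators.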

Indexing with an array of integers

>>> a = np.arange(0, 100, 10)
>>> a
array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

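Indexing can be done with an array of integers, where the same index may be repeated several times:

>>> a[[2, 3, 2, 4, 2]]  # note: [2, 3, 2, 4, 2] is a Python list
array([20, 30, 20, 40, 20])

New values can be assigned with this kind of indexing:

>>> a[[9, 7]] = -100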
>>> a
array([   0,   10,   20,   30,   40,   50,   60, -100,   80, -100])

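Tip

When a new array is created by indexing with an array of integers, the new array has the same shape as the array of integers:

>>> a = np.arange(10)
>>> idx = np.array([[3, 4], [9, 7]])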
>>> idx.shape
(2, 2)
>>> a[idx]
array([[3, 4],
       [9, 7]])
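
Integer-array indexing also extends to several dimensions. In this sketch (arrays built with reshape for illustration), one index array is given per dimension and the entries are paired up position by position.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

rows = np.array([0, 1, 2])
cols = np.array([1, 2, 3])
picked = a[rows, cols]   # elements at (0, 1), (1, 2) and (2, 3)

sub = a[[0, 2]]          # a single index list selects whole rows 0 and 2

print(picked)
print(sub)
```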

The image below illustrates various fancy indexing applications