1.3.3. More elaborate arrays

1.3.3.1. More data types

Casting

The “bigger” type wins in mixed-type operations:

>>> np.array([1, 2, 3]) + 1.5
array([2.5,  3.5,  4.5])
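The promotion rule can be inspected without doing any arithmetic. The following is a minimal sketch, assuming only that `np.result_type` reports the dtype NumPy would choose for a combination; the variable names are illustrative.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = ints + 1.5  # integer array combined with a Python float

# The result is promoted to a floating-point dtype.
print(mixed.dtype)  # float64

# np.result_type reports the dtype NumPy would pick for a combination.
promoted = np.result_type(np.int32, np.float64)
print(promoted)  # float64

# Two integer types of different widths promote to the wider one.
wider = np.result_type(np.int8, np.int32)
print(wider)  # int32
```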

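Assignment never changes the type of an existing array; the assigned value is cast to the array's dtype:

>>> a = np.array([1, 2, 3])
>>> a.dtype
dtype('int64')
>>> a[0] = 1.9  # <-- float is truncated to integer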
>>> a
array([1, 2, 3])
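If the fractional part must be kept, one option is to convert to a floating-point array before assigning. This is a small sketch of that workaround, not the only way to do it:

```python
import numpy as np

a = np.array([1, 2, 3])

# Assigning into `a` would truncate 1.9, so convert to float first.
b = a.astype(float)  # new array; `a` is left unchanged
b[0] = 1.9

print(a)  # [1 2 3]
print(b[0])  # 1.9
```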

Forced casts:

>>> a = np.array([1.7, 1.2, 1.6])
>>> b = a.astype(int)  # <-- truncates to integer
>>> b
array([1, 1, 1])

Rounding:

>>> a = np.array([1.2, 1.5, 1.6, 2.5, 3.5, 4.5])
>>> b = np.around(a)
>>> b  # still floating-point
array([1., 2., 2., 2., 4., 4.])
>>> c = np.around(a).astype(int)
>>> c
array([1, 2, 2, 2, 4, 4])
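Note that `np.around` sends exact halves to the nearest even integer, which is why 2.5 becomes 2 and 3.5 becomes 4 above. The sketch below contrasts the common ways of dropping a fractional part; the sample values are chosen only for illustration.

```python
import numpy as np

a = np.array([-1.7, -0.5, 0.5, 1.5, 2.5])

truncated = a.astype(int)  # drops the fractional part (toward zero)
floored = np.floor(a)      # largest integer <= value
ceiled = np.ceil(a)        # smallest integer >= value
rounded = np.around(a)     # exact halves go to the nearest even integer

print(truncated)  # [-1  0  0  1  2]
```

Only `astype(int)` changes the dtype; the other three return floating-point arrays.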

Different data type sizes

Integers (signed):

`int8`	8 bits
`int16`	16 bits
`int32`	32 bits (same as `int` on a 32-bit platform)
`int64`	64 bits (same as `int` on a 64-bit platform)

>>> np.array([1], dtype=int).dtype
dtype('int64')
>>> np.iinfo(np.int32).max, 2**31 - 1
(2147483647, 2147483647)

Unsigned integers:

`uint8`	8 bits
`uint16`	16 bits
`uint32`	32 bits
`uint64`	64 bits

>>> np.iinfo(np.uint32).max, 2**32 - 1
(4294967295, 4294967295)
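A practical consequence of fixed sizes is that array arithmetic wraps around when a result falls outside the representable range. The following sketch illustrates this with 8-bit types; it assumes array-with-array operations, which keep the arrays' dtype.

```python
import numpy as np

info = np.iinfo(np.int8)
print(info.min, info.max)  # -128 127

# Array arithmetic stays in the array's dtype, so results outside the
# representable range wrap around.
top = np.array([127], dtype=np.int8)
one = np.array([1], dtype=np.int8)
wrapped = top + one
print(wrapped)  # [-128]

# Same for unsigned types: 255 + 1 wraps to 0 in uint8.
u = np.array([255], dtype=np.uint8) + np.array([1], dtype=np.uint8)
print(u)  # [0]

# Converting to a wider type before the operation avoids the wrap.
safe = top.astype(np.int16) + one
print(safe)  # [128]
```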

Floating-point numbers:

`float16`	16 bits
`float32`	32 bits
`float64`	64 bits (same as `float`)
`float96`	96 bits, platform-dependent (same as `np.longdouble`)
`float128`	128 bits, platform-dependent (same as `np.longdouble`)

>>> np.finfo(np.float32).eps
np.float32(1.1920929e-07)
>>> np.finfo(np.float64).eps
np.float64(2.220446049250313e-16)
>>> np.float32(1e-8) + np.float32(1) == 1
np.True_
>>> np.float64(1e-8) + np.float64(1) == 1
np.False_
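The `eps` value is the gap between 1.0 and the next representable number, which explains why `1e-8` vanishes in `float32` but not in `float64`. A short sketch of the same effect, using only `float32` scalars so that no promotion to `float64` takes place:

```python
import numpy as np

eps32 = np.finfo(np.float32).eps  # spacing between 1.0 and the next float32
one = np.float32(1)

# Adding eps gives a distinguishable value...
above = one + eps32
# ...but adding half of eps rounds back to exactly 1.0.
half = eps32 / np.float32(2)
same = one + half

print(above > one)  # True
print(same == one)  # True

# Comparing with a tolerance is usually more robust than `==`.
close = np.isclose(np.float32(0.1) + np.float32(0.2), np.float32(0.3))
print(close)
```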

Complex floating-point numbers:

`complex64`	two 32-bit floats
`complex128`	two 64-bit floats
`complex192`	two 96-bit floats, platform-dependent
`complex256`	two 128-bit floats, platform-dependent
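The "two floats" layout can be checked from the item size and from the dtype of the real part. A minimal sketch, with illustrative values:

```python
import numpy as np

z = np.array([1 + 2j, 3 - 1j], dtype=np.complex64)

print(z.dtype.itemsize)  # 8 bytes: two 4-byte floats
print(z.real.dtype)      # float32

# Widening to complex128 doubles the storage per element.
z128 = z.astype(np.complex128)
print(z128.dtype.itemsize)  # 16
```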

1.3.3.2. Structured data types

`sensor_code`	(4-character string)
`position`	(float)
`value`	(float)

>>> samples = np.zeros((6,), dtype=[('sensor_code', 'S4'),
...                                 ('position', float), ('value', float)])
>>> samples.ndim
1
>>> samples.shape
(6,)
>>> samples.dtype.names
('sensor_code', 'position', 'value')
>>> samples[:] = [('ALFA', 1, 0.37), ('BETA', 1, 0.11), ('TAU', 1, 0.13),
...               ('ALFA', 1.5, 0.37), ('ALFA', 3, 0.11), ('TAU', 1.2, 0.13)]
>>> samples
array([(b'ALFA', 1. , 0.37), (b'BETA', 1. , 0.11), (b'TAU', 1. , 0.13),
       (b'ALFA', 1.5, 0.37), (b'ALFA', 3. , 0.11), (b'TAU', 1.2, 0.13)],
      dtype=[('sensor_code', 'S4'), ('position', '<f8'), ('value', '<f8')])

Field access works by indexing with field names:

>>> samples['sensor_code']
array([b'ALFA', b'BETA', b'TAU', b'ALFA', b'ALFA', b'TAU'], dtype='|S4')
>>> samples['value']
array([0.37,  0.11,  0.13,  0.37,  0.11,  0.13])
>>> samples[0]
np.void((b'ALFA', 1.0, 0.37), dtype=[('sensor_code', 'S4'), ('position', '<f8'), ('value', '<f8')])
>>> samples[0]['sensor_code'] = 'TAU'
>>> samples[0]
np.void((b'TAU', 1.0, 0.37), dtype=[('sensor_code', 'S4'), ('position', '<f8'), ('value', '<f8')])

Multiple fields at once:

>>> samples[['position', 'value']]
array([(1. ,  0.37), (1. ,  0.11), (1. ,  0.13), (1.5,  0.37),
       (3. ,  0.11), (1.2,  0.13)],
      dtype={'names': ['position', 'value'], 'formats': ['<f8', '<f8'], 'offsets': [4, 12], 'itemsize': 20})

Fancy indexing works, as usual:

>>> samples[samples['sensor_code'] == b'ALFA']
array([(b'ALFA', 1.5, 0.37), (b'ALFA', 3. , 0.11)],
      dtype=[('sensor_code', 'S4'), ('position', '<f8'), ('value', '<f8')])
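Field-based selection combines naturally with reductions and sorting. The sketch below rebuilds the sample data so that it is self-contained; the per-sensor mean loop and the `means` dictionary are illustrative, not part of any NumPy API.

```python
import numpy as np

dt = [('sensor_code', 'S4'), ('position', float), ('value', float)]
samples = np.array([(b'ALFA', 1.0, 0.37), (b'BETA', 1.0, 0.11),
                    (b'TAU', 1.0, 0.13), (b'ALFA', 1.5, 0.37),
                    (b'ALFA', 3.0, 0.11), (b'TAU', 1.2, 0.13)], dtype=dt)

# A boolean mask built from one field selects whole records.
far = samples[samples['position'] > 1.0]
print(far['sensor_code'])  # [b'ALFA' b'ALFA' b'TAU']

# Mean of `value` per sensor code.
means = {}
for code in np.unique(samples['sensor_code']):
    selected = samples['value'][samples['sensor_code'] == code]
    means[code.decode()] = float(selected.mean())

# Sort records by a field name.
by_position = np.sort(samples, order='position')
```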

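Note

There are many other syntaxes for constructing structured arrays; see the NumPy documentation on structured arrays and on data type objects.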
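As an illustration, here are three equivalent ways to spell a structured dtype. This is a sketch of a few common forms rather than an exhaustive list:

```python
import numpy as np

# List of (name, type) tuples, as used above.
dt_list = np.dtype([('sensor_code', 'S4'), ('position', 'f8'), ('value', 'f8')])

# Dictionary with parallel 'names' and 'formats' lists.
dt_dict = np.dtype({'names': ['sensor_code', 'position', 'value'],
                    'formats': ['S4', 'f8', 'f8']})

# Comma-separated string; fields get default names f0, f1, f2.
dt_str = np.dtype('S4, f8, f8')

print(dt_list == dt_dict)  # True
print(dt_str.names)        # ('f0', 'f1', 'f2')
```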

1.3.3.3. `maskedarray`: dealing with (propagation of) missing data

For floats one could use NaNs, but masks work for all types:

>>> x = np.ma.array([1, 2, 3, 4], mask=[0, 1, 0, 1])
>>> x
masked_array(data=[1, --, 3, --],
             mask=[False,  True, False,  True],
       fill_value=999999)
>>> y = np.ma.array([1, 2, 3, 4], mask=[0, 1, 1, 1])
>>> x + y
masked_array(data=[2, --, --, --],
             mask=[False,  True,  True,  True],
       fill_value=999999)
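Reductions on a masked array skip the masked entries, and the mask can be removed again when a plain array is needed. A minimal sketch using the same `x` as above:

```python
import numpy as np

x = np.ma.array([1, 2, 3, 4], mask=[0, 1, 0, 1])

print(x.count())  # 2 unmasked entries
print(x.sum())    # 4  (1 + 3)
print(x.mean())   # 2.0

# Replace masked entries with a chosen value to get a plain ndarray.
filled = x.filled(0)
print(filled)  # [1 0 3 0]

# Keep only the unmasked values.
valid = x.compressed()
print(valid)  # [1 3]
```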

Masking versions of common functions:

>>> np.ma.sqrt([1, -1, 2, -2])
masked_array(data=[1.0, --, 1.41421356237... --],
             mask=[False,  True, False,  True],
       fill_value=1e+20)
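Masks can also be derived from the data itself. The sketch below compares the NaN route with the masked-array route for floats, then masks an integer array by condition; the use of `-1` as a "missing" marker is a made-up convention for this example.

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0])

# NaN-aware reduction on a plain float array.
nan_mean = np.nanmean(data)

# Masked-array route: mask the invalid entries first.
masked = np.ma.masked_invalid(data)
ma_mean = masked.mean()

print(nan_mean, ma_mean)  # 2.0 2.0

# Integers have no NaN, but any condition can define a mask.
counts = np.array([4, -1, 7, -1])  # -1 is a hypothetical "missing" marker
valid_counts = np.ma.masked_where(counts < 0, counts)
print(valid_counts.sum())  # 11
```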

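Note

There are other useful array siblings.

While it is off topic in a chapter on NumPy, let's take a moment to recall good coding practices, which really do pay off in the long run.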
