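# 3.1.6: Numerical Operations

## Universal Functions: Index Preservation

Because `Pandas` is built on top of `NumPy`, `NumPy`'s universal functions (ufuncs) also work on `Pandas` `Series` and `DataFrame` objects.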

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(42)  # seeded random number generator for reproducible results
ser = pd.Series(rng.randint(0, 10, 4))
df = pd.DataFrame(rng.randint(0, 10, (3, 4)), columns=['A', 'B', 'C', 'D'])

# Applying a NumPy ufunc to a Series returns another Pandas object with the index preserved
print(np.exp(ser))
# Out:
# 0     403.428793
# 1      20.085537
# 2    1096.633158
# 3      54.598150
# dtype: float64

# Applying a NumPy ufunc to a DataFrame
print(np.sin(df * np.pi / 4))
# Out:
#           A             B         C             D
# 0 -1.000000  7.071068e-01  1.000000 -1.000000e+00
# 1 -0.707107  1.224647e-16  0.707107 -7.071068e-01
# 2 -0.707107  1.000000e+00 -0.707107  1.224647e-16
```
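
Index preservation is easier to see with non-default labels. The following is a minimal sketch; the `labeled` Series and its string labels are hypothetical values chosen for illustration, not data from the text.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with string labels instead of the default integer index
labeled = pd.Series([0.0, 1.0, 2.0], index=['a', 'b', 'c'])

result = np.exp(labeled)

# The ufunc is applied element-wise and the labels carry over unchanged
print(type(result).__name__)
print(result.index.tolist())
```

The result is still a `Series`, and its index is the same as the input's, so it can be used directly in later label-based operations.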

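## Universal Functions: Index Alignment

### Index Alignment in Series

Suppose you are combining data from two sources: the areas of the three largest US states and the populations of the three most populous US states: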
```python
# Area
area = pd.Series({'Alaska': 1723337, 'Texas': 695662, 'California': 423967}, name='area')
# Population
population = pd.Series({'California': 38332521, 'Texas': 26448193, 'New York': 19651127}, name='population')
```
Dividing population by area gives:
```python
print(population / area)
# Out:
# Alaska              NaN
# California    90.413926
# New York            NaN
# Texas         38.018740
# dtype: float64
```

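For any label that is missing from one of the inputs, `Pandas` fills the result with `NaN`, which is how `Pandas` marks missing data. The index of the result is the union of the two input indices, the same set of labels that `Python`'s built-in set operations would produce; every missing entry is filled with `NaN` by default.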
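
The union behavior can be checked directly. This sketch repeats the two Series from above so that it runs on its own, and uses `Index.union` to compute the combined labels; the variable names `union_index` and `density` are introduced here for illustration.

```python
import pandas as pd

area = pd.Series({'Alaska': 1723337, 'Texas': 695662, 'California': 423967}, name='area')
population = pd.Series({'California': 38332521, 'Texas': 26448193, 'New York': 19651127}, name='population')

# Labels that appear in either Series
union_index = area.index.union(population.index)
print(list(union_index))

density = population / area

# Labels present in only one input come out as NaN
print(density.isna())
```

Only `California` and `Texas` appear in both inputs, so only those two labels get a numeric result.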

### Index Alignment in DataFrame

When operating on two `DataFrame` objects, a similar alignment takes place on both the columns and the row index:
```python
A = pd.DataFrame(rng.randint(0, 20, (2, 2)), columns=list('AB'))
# A:
#    A   B
# 0  1  11
# 1  5   1

B = pd.DataFrame(rng.randint(0, 10, (3, 3)), columns=list('BAC'))
# B:
#    B  A  C
# 0  4  0  9
# 1  5  8  0
# 2  9  2  6

print(A + B)
# Out:
#       A     B   C
# 0   1.0  15.0 NaN
# 1  13.0   6.0 NaN
# 2   NaN   NaN NaN
```

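As the example shows, the row and column labels of the two objects may be in different orders; the labels in the result are sorted automatically. As with `Series`, we can pass a `fill_value` argument to the operator method to choose the value used in place of missing entries; here we fill with the mean of all values in `A`.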

```python
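fill = A.stack().mean()  # stack() moves the columns into the row index, flattening A into one Series
print(A.add(B, fill_value=fill))
# Out:
#       A     B     C
# 0   1.0  15.0  13.5
# 1  13.0   6.0   4.5
# 2   6.5  13.5  10.5
# Entries missing from A were replaced by the mean (4.5) before the addition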
```
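
For comparison, here is the same `fill_value` mechanism on a pair of `Series`. This is a minimal sketch with hypothetical data; `s1`, `s2`, `plain`, and `filled` are names introduced for illustration.

```python
import pandas as pd

# Two hypothetical Series whose indices only partly overlap
s1 = pd.Series([2, 4, 6], index=[0, 1, 2])
s2 = pd.Series([1, 3, 5], index=[1, 2, 3])

# With the plain operator, labels found in only one Series give NaN
plain = s1 + s2
print(plain)

# With fill_value=0, a label missing from one side is treated as 0 before adding
filled = s1.add(s2, fill_value=0)
print(filled)
```

Labels `0` and `3` each exist in only one input, so they are `NaN` in `plain` but keep the single available value in `filled`.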

The following table lists the `Pandas` object methods that correspond to `Python` operators.

|Python operator|Pandas method(s)|
|:-:|:-:|
|+|add()|
|-|sub(), subtract()|
|\*|mul(), multiply()|
|/|truediv(), div(), divide()|
|//|floordiv()|
|%|mod()|
|\*\*|pow()|
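
The correspondence in the table can be verified with a short check. This sketch uses two small hypothetical DataFrames (`df1`, `df2`) and compares each operator, taken from the standard `operator` module, with the method of the same name.

```python
import operator
import pandas as pd

# Hypothetical data; df2 contains no zeros, so division and modulo are well defined
df1 = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})
df2 = pd.DataFrame({'x': [2, 3, 4], 'y': [1, 2, 3]})

# Map each Pandas method name to the matching Python operator
pairs = {
    'add': operator.add,
    'sub': operator.sub,
    'mul': operator.mul,
    'truediv': operator.truediv,
    'floordiv': operator.floordiv,
    'mod': operator.mod,
    'pow': operator.pow,
}

# True where the operator and the method give identical results
results = {name: op(df1, df2).equals(getattr(df1, name)(df2))
           for name, op in pairs.items()}
print(results)

# The longer names are aliases for the short ones
aliases_match = (df1.subtract(df2).equals(df1.sub(df2))
                 and df1.multiply(df2).equals(df1.mul(df2))
                 and df1.divide(df2).equals(df1.div(df2)))
print(aliases_match)
```

The practical reason to prefer the method form is that it accepts extra arguments such as `fill_value` and `axis`, which the bare operators cannot take.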

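## Universal Functions: Operations Between DataFrame and Series

Operations between a `DataFrame` and a `Series` follow the same rules as operations between a two-dimensional and a one-dimensional `NumPy` array. Consider a common operation: subtracting one row of a two-dimensional array from the array itself.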

```python
A = rng.randint(10, size=(3, 4))
print(A - A[0])
# Out:
# [[ 0  0  0  0]
#  [-1 -2  2  4]
#  [ 3 -7  1  4]]
# By NumPy's broadcasting rules, the subtraction is applied row by row
```

`Pandas` also operates row by row by default. To operate column by column instead, use the operator methods introduced above and set the `axis` argument.

```python
df = pd.DataFrame(A, columns=list('QRST'))
print(df - df.iloc[0])
# Out:
#    Q  R  S  T
# 0  0  0  0  0
# 1 -1 -2  2  4
# 2  3 -7  1  4

print(df.subtract(df['R'], axis=0))
# Out:
#    Q  R  S  T
# 0 -5  0 -6 -4
# 1 -4  0 -2  2
# 2  5  0  2  7
```

As with the operations described earlier, operations between a `DataFrame` and a `Series` automatically align the indices of the two operands.
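
A short sketch of this alignment follows. The DataFrame values are hard-coded so the block runs on its own; they are an assumption, not taken from the text. `halfrow` is an illustrative name for a `Series` that holds only some of the columns.

```python
import pandas as pd

# Hypothetical values, hard-coded for a self-contained example
df = pd.DataFrame([[3, 8, 2, 4],
                   [2, 6, 4, 8],
                   [6, 1, 3, 8]], columns=list('QRST'))

# Take every second column of the first row: only Q and S
halfrow = df.iloc[0, ::2]

# Subtraction aligns on column labels; R and T have no partner in halfrow
result = df - halfrow
print(result)
```

Columns `Q` and `S` are subtracted as usual, while `R` and `T` become `NaN` because `halfrow` has no matching labels for them.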