# Parity Classification with Machine Learning

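## Steps

1. Collect the data
2. Load the data
3. Preprocess the data (remove invalid records, adjust data types, select features)
4. Split the data into a training set and a test set
5. Train a model with specific parameters on the training set
6. Evaluate the trained model on the test set
7. Adjust the features and parameters, then repeat the steps above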

## Experiments

### First experiment

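```python
import pandas as pd
import random

def g():
    # Draw a random integer and pair it with its parity label (0 = even, 1 = odd).
    i = random.randint(0, 999999)
    return [i, i % 2]

# 100,000 samples: column 0 is the integer, column 1 is the label.
data = pd.DataFrame([g() for _ in range(100000)])
```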

Sample rows of `data`:

|   | 0 | 1 |
|---|---|---|
| 0 | 319255 | 1 |
| 1 | 430655 | 1 |
| 2 | 286442 | 0 |
| 3 | 709373 | 1 |
| 4 | 589398 | 0 |

```python
# Reshape the single feature into the 2-D (n_samples, 1) array that scikit-learn expects.
x = data[0].to_numpy().reshape(-1, 1)
y = data[1]
```

```python
from sklearn.model_selection import train_test_split

# Hold out 30% of the samples for testing.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)
```
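The split above does not fix a seed, so each run produces a different partition and the scores are not directly comparable between runs. The sketch below shows one possible way to make the split repeatable with the `random_state` and `stratify` parameters of `train_test_split`. The seed values and the smaller sample size of 1000 are assumptions chosen for illustration and are not part of the original experiment.

```python
import random
from sklearn.model_selection import train_test_split

random.seed(0)  # assumed seed, only to make this sketch repeatable

# Same kind of data as the first experiment, but only 1000 samples (assumption).
values = [random.randint(0, 999999) for _ in range(1000)]
x = [[v] for v in values]    # one feature column per sample
y = [v % 2 for v in values]  # parity label

# random_state fixes the shuffle; stratify=y keeps the even/odd ratio
# the same in the training and test sets.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0, stratify=y
)

print(len(x_train), len(x_test))  # 700 300
```

Calling `train_test_split` again with the same `random_state` returns the same partition, which helps when comparing models or parameters in step 7.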

```python
from sklearn.neighbors import KNeighborsClassifier

# K-nearest neighbors with default parameters; score() returns accuracy on the test set.
knn = KNeighborsClassifier()
knn.fit(x_train, y_train)
knn.score(x_test, y_test)
```

```python
from sklearn.ensemble import RandomForestClassifier

# Random forest with default parameters, scored on the same test set.
rfc = RandomForestClassifier()
rfc.fit(x_train, y_train)
rfc.score(x_test, y_test)
```

### Second experiment

```python
def g():
    # Encode the integer as 10 zero-padded decimal digits, followed by its parity label.
    i = random.randint(0, 999999)
    return list(map(int, "%010d" % i)) + [i % 2]
```

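Sample rows of `data`:

|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|----|
| 0 | 0 | 0 | 0 | 0 | 8 | 5 | 1 | 1 | 9 | 1 | 1 |
| 1 | 0 | 0 | 0 | 0 | 8 | 5 | 3 | 9 | 1 | 1 | 1 |
| 2 | 0 | 0 | 0 | 0 | 6 | 0 | 9 | 2 | 6 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 4 | 7 | 8 | 7 | 2 | 8 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 1 | 2 | 5 | 1 |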
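To make the digit encoding concrete, the sketch below applies the same `"%010d"` formatting to a single value. The helper name `to_digit_features` is hypothetical and only used here for illustration. Because the generated values never exceed 999999 (six digits), the first four digit columns are always 0, and the last digit column alone determines the label.

```python
def to_digit_features(n):
    # Same formatting as g(): zero-pad to 10 decimal digits,
    # then turn each character into an int.
    padded = "%010d" % n
    return [int(ch) for ch in padded]

digits = to_digit_features(851191)
label = 851191 % 2

print(digits, label)  # [0, 0, 0, 0, 8, 5, 1, 1, 9, 1] 1
```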

```python
# Columns 0-9 hold the digits; column 10 holds the label.
x = data.iloc[:, :10]
y = data[10]
```

```python
rfc = RandomForestClassifier(
    n_estimators=81,
    max_features=6,
    oob_score=True,
    random_state=10,
)
```
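With `oob_score=True`, a fitted forest exposes an out-of-bag accuracy estimate in `oob_score_`, and `feature_importances_` gives one weight per input column. The sketch below shows how these might be inspected with the same forest parameters. It is self-contained, uses plain lists instead of a DataFrame, and trains on only 2000 samples with a fixed seed. Both of those choices are assumptions made to keep the example fast and are not part of the original experiment.

```python
import random
from sklearn.ensemble import RandomForestClassifier

random.seed(0)  # assumed seed, only to make this sketch repeatable

def g():
    # Ten zero-padded digits as features, parity as the label.
    i = random.randint(0, 999999)
    return [int(ch) for ch in "%010d" % i], i % 2

samples = [g() for _ in range(2000)]  # far fewer than the original 100000 (assumption)
x = [digits for digits, _ in samples]
y = [label for _, label in samples]

rfc = RandomForestClassifier(
    n_estimators=81,
    max_features=6,
    oob_score=True,
    random_state=10,
)
rfc.fit(x, y)

print(rfc.oob_score_)            # accuracy estimated on out-of-bag samples
print(rfc.feature_importances_)  # one weight per digit column, summing to 1
```

Since parity depends only on the last digit, one would expect column 9 to carry most of the importance, which is a quick sanity check on the chosen features.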

## Full code

```python
# To add a new cell, type '# %%'
# To add a new markdown cell, type '# %% [markdown]'

# %% [markdown]
# # Parity classification
#
# Generate data from random numbers and classify each number's parity
# %% [markdown]
# ## Load the data
#
# %%
import pandas as pd
import random

def g():
    i = random.randint(0, 999999)
    # First experiment: return [i, i % 2]
    # Second experiment: 10 zero-padded digits followed by the parity label.
    return list(map(int, "%010d" % i)) + [i % 2]

data = pd.DataFrame([g() for _ in range(100000)])

# %%

# %% [markdown]
# ## Preprocess the data
#
# Separate the features from the labels
# %%
# First experiment:
# x = data[0].to_numpy().reshape(-1, 1)
# y = data[1]
x = data.iloc[:, :10]
y = data[10]

# %%
x

# %% [markdown]
# ## Split into training and test sets
#
# Split at a 7:3 ratio

# %%
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)

print(x_train)

# %% [markdown]
# ## Train and test the models
#
# Use the K-nearest neighbors and random forest algorithms in turn
# %%
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier()
knn.fit(x_train, y_train)
knn.score(x_test, y_test)

# %%
from sklearn.ensemble import RandomForestClassifier

rfc = RandomForestClassifier(
    n_estimators=81,
    max_features=6,
    oob_score=True,
    random_state=10,
)
rfc.fit(x_train, y_train)
rfc.score(x_test, y_test)
```