5. After-class exercise: using more classifiers
Last updated on the evening of April 25, 2025
After-class exercise 5
Tasks:
- Study k-Nearest Neighbours classifiers sklearn.neighbors.KNeighborsClassifier — scikit-learn 0.24.1 documentation (scikit-learn.org)
- Study Random Forest classifiers sklearn.ensemble.RandomForestClassifier — scikit-learn 0.24.1 documentation (scikit-learn.org)
- Study Naive Bayes classifiers 1.9. Naive Bayes — scikit-learn 0.24.1 documentation (scikit-learn.org)
Programming exercise:
This tutorial uses the MNIST dataset, which was explored in Tutorial 3.
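As a quick orientation, the sketch below inspects the data that the listings in this post actually load. They call `datasets.load_digits()`, scikit-learn's bundled handwritten-digit set, whose images are 8x8 pixels rather than the 28x28 images of the original MNIST files. The variable names are the ones used in the exercises.

```python
import numpy as np
from sklearn import datasets

# Load scikit-learn's bundled handwritten-digit set (the loader used in this post).
digits = datasets.load_digits()

# Each sample is an 8x8 grayscale image; flatten it into a 64-element feature vector.
images = digits.images
data = images.reshape(len(images), -1)
labels = digits.target

print(images.shape)       # (n_samples, 8, 8)
print(data.shape)         # (n_samples, 64)
print(np.unique(labels))  # the ten digit classes
```

Flattening is needed because the scikit-learn classifiers used below expect a 2-D array of shape (n_samples, n_features).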
Q1. Train a k-Nearest Neighbours classifier for handwritten digit recognition with the MNIST dataset.
Try different parameter settings and study how the performance varies.
- Plot the accuracy vs k while changing the number of neighbours (k) with values [1, 3, 5, 7, 9]
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the digits and flatten each 8x8 image into a 64-element feature vector
digits = datasets.load_digits()
images = digits.images
labels = digits.target
data = images.reshape(len(images), -1)
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, shuffle=False)

# Train one classifier per value of k and record its test accuracy
g = [1, 3, 5, 7, 9]
accuracy = []
for g_ in g:
    clf = KNeighborsClassifier(n_neighbors=g_)
    clf.fit(x_train, y_train)
    acc = clf.score(x_test, y_test)  # score() returns the mean accuracy
    accuracy.append(acc)

plt.plot(g, accuracy)
plt.xlabel("k")
plt.ylabel("accuracy")
plt.show()
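Q1 also asks to try other parameter settings. One possible variation, sketched here as an illustration rather than as the expected answer, is to compare the two built-in `weights` options of `KNeighborsClassifier` over the same values of k. The name `knn_results` is introduced only for this example.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = datasets.load_digits()
data = digits.images.reshape(len(digits.images), -1)
x_train, x_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.2, shuffle=False)

# Test accuracy for every (weights, k) combination.
knn_results = {}
for weights in ["uniform", "distance"]:
    for k in [1, 3, 5, 7, 9]:
        clf = KNeighborsClassifier(n_neighbors=k, weights=weights)
        clf.fit(x_train, y_train)
        knn_results[(weights, k)] = clf.score(x_test, y_test)

for (weights, k), acc in knn_results.items():
    print(f"weights={weights:8s} k={k}  accuracy={acc:.3f}")
```

With `weights="distance"`, closer neighbours get a larger vote; for k = 1 the two settings make the same predictions, since only one neighbour votes.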
Q2. Train a Random Forest classifier for handwritten digit recognition with the MNIST dataset.
Try different parameter settings and study how the performance varies.
- Plot the accuracy vs max_depth while changing the max depth parameter with values [1, 2, 4, 8, 16]
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Load the digits and flatten each 8x8 image into a 64-element feature vector
digits = datasets.load_digits()
images = digits.images
labels = digits.target
data = images.reshape(len(images), -1)
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, shuffle=False)

# Train one forest per max_depth value and record its test accuracy
g = [1, 2, 4, 8, 16]
accuracy = []
for g_ in g:
    clf = RandomForestClassifier(max_depth=g_)
    clf.fit(x_train, y_train)
    acc = clf.score(x_test, y_test)  # score() returns the mean accuracy
    accuracy.append(acc)

plt.plot(g, accuracy)
plt.xlabel("max_depth")
plt.ylabel("accuracy")
plt.show()
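For the "try different parameter settings" part of Q2, another parameter that could be varied is the number of trees. The sketch below is only one possible experiment: the tree counts, the fixed `max_depth=8`, and `random_state=0` (used so that repeated runs give the same numbers) are assumptions made for illustration, and `rf_results` is a name introduced here.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape(len(digits.images), -1)
x_train, x_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.2, shuffle=False)

# Test accuracy for each forest size, with the depth held fixed.
rf_results = {}
for n_trees in [10, 50, 100]:
    clf = RandomForestClassifier(n_estimators=n_trees, max_depth=8, random_state=0)
    clf.fit(x_train, y_train)
    rf_results[n_trees] = clf.score(x_test, y_test)

for n_trees, acc in rf_results.items():
    print(f"n_estimators={n_trees:3d}  accuracy={acc:.3f}")
```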
Q3. Train a Gaussian Naive Bayes classifier for handwritten digit recognition with the MNIST dataset.
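A minimal sketch for Q3, assuming the same data loading and train/test split as in Q1 and Q2. `GaussianNB` is used with its default settings, and `nb_accuracy` is a name chosen for this example.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

digits = datasets.load_digits()
data = digits.images.reshape(len(digits.images), -1)
x_train, x_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.2, shuffle=False)

# Gaussian Naive Bayes models each pixel as a per-class normal distribution.
clf = GaussianNB()
clf.fit(x_train, y_train)
nb_accuracy = clf.score(x_test, y_test)
print(f"Gaussian Naive Bayes accuracy: {nb_accuracy:.3f}")
```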
- Bonus: displaying the misclassified images
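# Display the misclassified images (reuses the imports and the train/test split from Q2)
clf = RandomForestClassifier(max_depth=16)  # 16 is the largest depth tried in Q2
clf.fit(x_train, y_train)
predictions = clf.predict(x_test)
# clf.predict_proba() would give, for each image, the probability of each label
print(predictions)  # predicted labels for all test images
print(y_test)       # true labels, for comparison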
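Printing both label arrays leaves the comparison to the reader. The sketch below goes one step further and draws the images whose prediction differs from the true label. The choice of `max_depth=16`, `random_state=0`, the limit of eight images, and the output file name are assumptions made for this illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape(len(digits.images), -1)
x_train, x_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.2, shuffle=False)

clf = RandomForestClassifier(max_depth=16, random_state=0)
clf.fit(x_train, y_train)
predictions = clf.predict(x_test)

# Indices of the test images whose predicted label differs from the true one.
wrong = np.flatnonzero(predictions != y_test)
print(f"{len(wrong)} of {len(y_test)} test images misclassified")

# Draw up to eight of them, titled with the true and predicted labels.
n_show = min(8, len(wrong))
if n_show > 0:
    fig, axes = plt.subplots(1, n_show, figsize=(2 * n_show, 2.5), squeeze=False)
    for ax, idx in zip(axes[0], wrong[:n_show]):
        ax.imshow(x_test[idx].reshape(8, 8), cmap="gray_r")
        ax.set_title(f"true {y_test[idx]}, pred {predictions[idx]}")
        ax.axis("off")
    fig.savefig("misclassified_digits.png")  # or plt.show() in an interactive session
```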
Q4. Compare the four classifiers (SVM – Tutorial 3, kNN, Random Forest and Naive Bayes) by plotting the best accuracy achieved by each classifier in a bar chart.
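One way Q4 could be approached is sketched below. The SVM settings from Tutorial 3 are not given in this post, so `SVC(gamma=0.001)` is an assumption; the kNN and Random Forest candidates reuse the parameter values from Q1 and Q2. The helper `best_accuracy`, the dictionary `best`, and the output file name are names introduced for this example.

```python
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

digits = datasets.load_digits()
data = digits.images.reshape(len(digits.images), -1)
x_train, x_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.2, shuffle=False)


def best_accuracy(models):
    """Fit each candidate model and return the highest test accuracy."""
    return max(m.fit(x_train, y_train).score(x_test, y_test) for m in models)


best = {
    "SVM": best_accuracy([SVC(gamma=0.001)]),  # assumed Tutorial 3 setting
    "kNN": best_accuracy(
        [KNeighborsClassifier(n_neighbors=k) for k in [1, 3, 5, 7, 9]]),
    "RandomForest": best_accuracy(
        [RandomForestClassifier(max_depth=d, random_state=0) for d in [1, 2, 4, 8, 16]]),
    "NaiveBayes": best_accuracy([GaussianNB()]),
}

# One bar per classifier, showing its best test accuracy.
fig, ax = plt.subplots()
ax.bar(list(best.keys()), list(best.values()))
ax.set_ylabel("Best test accuracy")
ax.set_ylim(0, 1)
fig.savefig("classifier_comparison.png")  # or plt.show() in an interactive session
```

Picking the best setting by test accuracy follows the wording of the exercise, but it gives a slightly optimistic figure; a separate validation split would be the more careful choice.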
https://l61012345.top/2021/01/28/Machine Learning-NAU/5.a 课后练习-手写字符识别/