5. 课后练习-使用更多的分类器

本文最后更新于 2025年9月1日上午

课后练习 5

Tasks:

Study k-Nearest Neighbours classifiers sklearn.neighbors.KNeighborsClassifier — scikit-learn 0.24.1 documentation (scikit-learn.org)
Study RandomForrest classifiers sklearn.ensemble.RandomForestClassifier — scikit-learn 0.24.1 documentation (scikit-learn.org)
Study Naïve Bayes classifiers 1.9. Naive Bayes — scikit-learn 0.24.1 documentation (scikitlearn.org)

Programming exercise:

This tutorial will use the MNIST dataset which was explored in tutorial 3.

Q1. Train a k-Nearest Neighbours classifier for handwritten digit recognition with MNIST dataset.

Try different parameter settings and study how the performance varies.

Plot the accuracy vs k while changing the number of neighbours (k) with values [1, 3, 5, 7, 9]

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = datasets.load_digits()
labels = digits.target

data = images.reshape(len(images), -1)
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, shuffle=False)
g = [1, 3, 5, 7, 9]
accurancy = []
for g_ in g:
    clf = KNeighborsClassifier(n_neighbors = g)
    clf.fit(x_train, y_train)
    acc = clf.predict(x_test, y_test)
    accurancy.append(acc)

plt.plot(g, accurancy)
plt.show()

Q2. Train a RandomForrest classifier for handwritten digit recognition with MNIST dataset.

Try different parameter settings and study how the performance varies.

Plot the accuracy vs max_depth while changing the max depth parameter with values [1, 2, 4, 8, 16]

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForrest

digits = datasets.load_digits()
labels = digits.target

data = images.reshape(len(images), -1)
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, shuffle=False)
g = [1, 2, 4, 8, 16]
accurancy = []
for g_ in g:
    clf = RandomForrest(max_depth = g)
    clf.fit(x_train, y_train)
    acc = clf.predict(x_test, y_test)
    accurancy.append(acc)
plt.plot(g, accurancy)
plt.show()

Q3. Train a Gaussian Naive Bayes classifier for handwritten digit recognition with the MNIST dataset.

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForrest

digits = datasets.load_digits()
labels = digits.target

data = images.reshape(len(images), -1)
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, shuffle=False)
clf1 = sk_bayes.BernoulliNB(alpha=1.0,
                            binarize=0.0,
                            fit_prior=True,
                            class_prior=None)
clf1.fit(x_train, y_train)  
acc_BN = clf1.score(x_test, y_test) 
acc.append(acc_BN)

Plus： Displaying the wrong images

# 显示错误的图片
clf = RandomForrest(max_depth = g)
clf.fit(x_train, y_train)
predictions = clf.predict(x_test)
# clf.predict_proba() 显示每张图有多少概率是哪个标签
print(predictions) # 这样会输出所有图片的预测标签
print(y_test)

Q4. Do a comparison between the four classifiers (SVM – Tutorial 3, kNN, RandomForrest and NaïveBayes) by plotting the best performing accuracy value for each classifier in a bar chart.

学习笔记 > Machine Learning-NUS 2021 > 课后练习

5. 课后练习-使用更多的分类器

https://l61012345.top/2021/01/28/Machine Learning-NAU/5.a 课后练习-手写字符识别/

作者

Oreki Kigiha

发布于

2021年1月28日

更新于

2025年9月1日

许可协议

6. 大作业-创建一个交通标志分类器上一篇

5. 特征下一篇