5. 课后练习-使用更多的分类器

本文最后更新于 2023年10月20日 上午

课后练习 5

Tasks:

  • Study k-Nearest Neighbours classifiers sklearn.neighbors.KNeighborsClassifier — scikit-learn 0.24.1 documentation (scikit-learn.org)
  • Study RandomForrest classifiers sklearn.ensemble.RandomForestClassifier — scikit-learn 0.24.1 documentation (scikit-learn.org)
  • Study Naïve Bayes classifiers 1.9. Naive Bayes — scikit-learn 0.24.1 documentation (scikitlearn.org) ## Programming exercise: This tutorial will use the MNIST dataset which was explored in tutorial 3. ### Q1. Train a k-Nearest Neighbours classifier for handwritten digit recognition with MNIST dataset. Try different parameter settings and study how the performance varies.
  • Plot the accuracy vs k while changing the number of neighbours (k) with values [1, 3, 5, 7, 9]
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    import matplotlib.pyplot as plt
    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    digits = datasets.load_digits()
    labels = digits.target

    data = images.reshape(len(images), -1)
    x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, shuffle=False)
    g = [1, 3, 5, 7, 9]
    accurancy = []
    for g_ in g:
    clf = KNeighborsClassifier(n_neighbors = g)
    clf.fit(x_train, y_train)
    acc = clf.predict(x_test, y_test)
    accurancy.append(acc)

    plt.plot(g, accurancy)
    plt.show()
    ### Q2. Train a RandomForrest classifier for handwritten digit recognition with MNIST dataset. Try different parameter settings and study how the performance varies.
  • Plot the accuracy vs max_depth while changing the max depth parameter with values [1, 2, 4, 8, 16]
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    import matplotlib.pyplot as plt
    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForrest

    digits = datasets.load_digits()
    labels = digits.target

    data = images.reshape(len(images), -1)
    x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, shuffle=False)
    g = [1, 2, 4, 8, 16]
    accurancy = []
    for g_ in g:
    clf = RandomForrest(max_depth = g)
    clf.fit(x_train, y_train)
    acc = clf.predict(x_test, y_test)
    accurancy.append(acc)
    plt.plot(g, accurancy)
    plt.show()
    ### Q3. Train a Gaussian Naive Bayes classifier for handwritten digit recognition with the MNIST dataset.
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    import matplotlib.pyplot as plt
    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForrest

    digits = datasets.load_digits()
    labels = digits.target

    data = images.reshape(len(images), -1)
    x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, shuffle=False)
    clf1 = sk_bayes.BernoulliNB(alpha=1.0,
    binarize=0.0,
    fit_prior=True,
    class_prior=None)
    clf1.fit(x_train, y_train)
    acc_BN = clf1.score(x_test, y_test)
    acc.append(acc_BN)
  • Plus: Displaying the wrong images
    1
    2
    3
    4
    5
    6
    7
    # 显示错误的图片
    clf = RandomForrest(max_depth = g)
    clf.fit(x_train, y_train)
    predictions = clf.predict(x_test)
    # clf.predict_proba() 显示每张图有多少概率是哪个标签
    print(predictions) # 这样会输出所有图片的预测标签
    print(y_test)
    ### Q4. Do a comparison between the four classifiers (SVM – Tutorial 3, kNN, RandomForrest and NaïveBayes) by plotting the best performing accuracy value for each classifier in a bar chart.

5. 课后练习-使用更多的分类器
https://l61012345.top/2021/01/28/Machine Learning-NAU/5.a 课后练习-手写字符识别/
作者
Oreki Kigiha
发布于
2021年1月28日
更新于
2023年10月20日
许可协议