
Scikit Learn Mix of Classification and Continuous Data

In this Python tutorial, we will learn how scikit learn classification works, and we will work through different examples related to scikit learn classification. We will cover the following topics.

  • Scikit learn classification
  • Scikit learn classification report
  • Scikit learn classification metrics
  • Scikit learn classification example
  • Scikit learn classification tree
  • Scikit learn classification accuracy
  • Scikit learn classification report support

Scikit learn Classification

In this section, we will learn about how Scikit learn classification works in Python.

  • Classification is a form of data analysis that builds a model describing important data classes.
  • A classifier learns from labelled examples and then assigns each new sample to one of a fixed set of classes (categories).
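As a minimal sketch of the idea above, the snippet below fits a classifier on labelled samples and then assigns a class to unseen samples. The iris dataset, KNeighborsClassifier, and the stratified split are illustrative assumptions chosen for brevity; any scikit learn classifier follows the same fit and predict pattern.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small labelled dataset: 150 flowers, 3 classes (assumed example data).
iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.4, random_state=42, stratify=iris.target
)

# Learn the mapping from features to class labels.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(x_train, y_train)

# Assign a class to each unseen sample.
predicted = model.predict(x_test)
print(predicted[:10])                      # integer class labels
print(iris.target_names[predicted[:10]])   # the matching class names
```

The predicted values are always drawn from the classes seen during training, which is what distinguishes classification from regression.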

Code:

In the following code, we will import some libraries from which we can perform the classification task.

  • x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=42) is used to split the data into training and testing parts.
  • axis.scatter(x_train[:, 0], x_train[:, 1], c=y_train, cmap=cm_bright, edgecolors="r") is used to plot the training points.
  • axis.scatter(x_test[:, 0], x_test[:, 1], c=y_test, cmap=cm_bright, alpha=0.6, edgecolors="r") is used to plot the testing points.
    import numpy as num
    import matplotlib.pyplot as plot
    from matplotlib.colors import ListedColormap
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.datasets import make_moons, make_circles, make_classification
    from sklearn.neural_network import MLPClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.gaussian_process import GaussianProcessClassifier
    from sklearn.gaussian_process.kernels import RBF
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

    h = 0.02  # step size of the mesh grid

    name = [
        "Nearest Neighbors",
        "Linear SVM",
        "RBF SVM",
        "Gaussian Process",
        "Decision Tree",
        "Random Forest",
        "Neural Net",
        "AdaBoost",
        "Naive Bayes",
        "QDA",
    ]

    classifier = [
        KNeighborsClassifier(3),
        SVC(kernel="linear", C=0.025),
        SVC(gamma=2, C=1),
        GaussianProcessClassifier(1.0 * RBF(1.0)),
        DecisionTreeClassifier(max_depth=5),
        RandomForestClassifier(max_depth=5, n_estimators=10, max_features=1),
        MLPClassifier(alpha=1, max_iter=1000),
        AdaBoostClassifier(),
        GaussianNB(),
        QuadraticDiscriminantAnalysis(),
    ]

    # Build a linearly separable dataset and add uniform noise to it
    x, y = make_classification(
        n_features=2, n_redundant=0, n_informative=2, random_state=1, n_clusters_per_class=1
    )
    rang = num.random.RandomState(2)
    x += 2 * rang.uniform(size=x.shape)
    linearly_separable = (x, y)

    datasets = [
        make_moons(noise=0.3, random_state=0),
        make_circles(noise=0.2, factor=0.5, random_state=1),
        linearly_separable,
    ]

    figure = plot.figure(figsize=(27, 9))
    i = 1

    # One row of plots per dataset
    for ds_cnt, ds in enumerate(datasets):
        x, y = ds
        x = StandardScaler().fit_transform(x)
        x_train, x_test, y_train, y_test = train_test_split(
            x, y, test_size=0.4, random_state=42
        )

        x_min, x_max = x[:, 0].min() - 0.5, x[:, 0].max() + 0.5
        y_min, y_max = x[:, 1].min() - 0.5, x[:, 1].max() + 0.5
        xx, yy = num.meshgrid(num.arange(x_min, x_max, h), num.arange(y_min, y_max, h))

        # First column: the input data only
        cm = plot.cm.RdBu
        cm_bright = ListedColormap(["pink", "green"])
        axis = plot.subplot(len(datasets), len(classifier) + 1, i)
        if ds_cnt == 0:
            axis.set_title("Input data")
        # Training points
        axis.scatter(x_train[:, 0], x_train[:, 1], c=y_train, cmap=cm_bright, edgecolors="r")
        # Testing points
        axis.scatter(
            x_test[:, 0], x_test[:, 1], c=y_test, cmap=cm_bright, alpha=0.6, edgecolors="r"
        )
        axis.set_xlim(xx.min(), xx.max())
        axis.set_ylim(yy.min(), yy.max())
        axis.set_xticks(())
        axis.set_yticks(())
        i += 1

        # Remaining columns: one classifier each.
        # The loop variable is clf_name so that it does not overwrite the name list.
        for clf_name, clf in zip(name, classifier):
            axis = plot.subplot(len(datasets), len(classifier) + 1, i)
            clf.fit(x_train, y_train)
            score = clf.score(x_test, y_test)

            # Evaluate the classifier on every point of the mesh grid
            if hasattr(clf, "decision_function"):
                Z = clf.decision_function(num.c_[xx.ravel(), yy.ravel()])
            else:
                Z = clf.predict_proba(num.c_[xx.ravel(), yy.ravel()])[:, 1]

            Z = Z.reshape(xx.shape)
            axis.contourf(xx, yy, Z, cmap=cm, alpha=0.8)

            axis.scatter(
                x_train[:, 0], x_train[:, 1], c=y_train, cmap=cm_bright, edgecolors="r"
            )
            axis.scatter(
                x_test[:, 0],
                x_test[:, 1],
                c=y_test,
                cmap=cm_bright,
                edgecolors="r",
                alpha=0.6,
            )

            axis.set_xlim(xx.min(), xx.max())
            axis.set_ylim(yy.min(), yy.max())
            axis.set_xticks(())
            axis.set_yticks(())
            if ds_cnt == 0:
                axis.set_title(clf_name)
            # Test accuracy in the lower-right corner of the panel
            axis.text(
                xx.max() - 0.3,
                yy.min() + 0.3,
                ("%.2f" % score).lstrip("0"),
                size=15,
                horizontalalignment="right",
            )
            i += 1

    plot.tight_layout()
    plot.show()

Output:

After running the above code, we get the following output, in which each row is a dataset and each column is a classifier. Every panel shows the decision boundary of the classifier, with its accuracy on the test set in the lower-right corner.

scikit learn classification
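When only the scores matter and not the plots, the comparison above can be reduced to a short loop. This is a sketch under assumptions: it uses just the make_moons dataset and three of the ten classifiers, and the dictionary name models is introduced here for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Same preprocessing and split settings as the plotting example.
x, y = make_moons(noise=0.3, random_state=0)
x = StandardScaler().fit_transform(x)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.4, random_state=42
)

# A hypothetical subset of the classifiers compared above.
models = {
    "Nearest Neighbors": KNeighborsClassifier(3),
    "RBF SVM": SVC(gamma=2, C=1),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
}

results = {}
for label, clf in models.items():
    clf.fit(x_train, y_train)
    # score() returns the mean accuracy on the given test data.
    results[label] = clf.score(x_test, y_test)
    print(f"{label}: {results[label]:.2f}")
```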

Also, check: Scikit-learn logistic regression

Scikit learn Classification Report

In this section, we will learn how the scikit learn classification report works in Python.

A classification report is a text summary of the quality of a classifier's predictions. For each class it lists the precision, recall, F1-score, and support.

Code:

In the following code, we will import classification_report from sklearn.metrics, which we use to summarize the quality of the predictions made by a classification algorithm.

  • targetnames = ['Class 1', 'Class 2', 'Class 3'] provides the display names for the three classes.
  • print(classification_report(y_true, y_pred, target_names=targetnames)) is used to print the classification report.
    from sklearn.metrics import classification_report

    y_true = [0, 1, 2, 2, 1]
    y_pred = [0, 0, 2, 2, 1]
    targetnames = ['Class 1', 'Class 2', 'Class 3']
    print(classification_report(y_true, y_pred, target_names=targetnames))

Output:

After running the above code we get the following output in which we can see that the classification report is printed on the screen.

scikit learn classification report
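The printed report is convenient to read, but the individual numbers are easier to reuse when the report is returned as a dictionary. The sketch below reuses the same labels and passes output_dict=True; picking out "Class 2" is only an illustration.

```python
from sklearn.metrics import classification_report

y_true = [0, 1, 2, 2, 1]
y_pred = [0, 0, 2, 2, 1]
targetnames = ["Class 1", "Class 2", "Class 3"]

# output_dict=True returns nested dictionaries instead of a formatted string.
report = classification_report(
    y_true, y_pred, target_names=targetnames, output_dict=True
)

class_2 = report["Class 2"]
# Label 1 is predicted once and that prediction is correct: precision 1.0.
# Label 1 occurs twice in y_true and is found once: recall 0.5, support 2.
print(class_2["precision"], class_2["recall"], class_2["support"])

# Four of the five predictions match the true labels.
print(report["accuracy"])
```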

Read: Scikit learn Decision Tree

Scikit learn Classification Metrics

In this section, we will learn how scikit learn classification metrics work in Python.

  • Classification metrics measure how well the predicted labels match the true labels. Some metrics need only the predicted classes, while others require probability estimates of the positive class.
  • sklearn.metrics is a module that implements score, loss, and utility functions to measure classification performance.

Code:

In the following code, we will import fbeta_score and make_scorer from sklearn.metrics. make_scorer wraps a metric or loss function in a scorer object that tools such as GridSearchCV can use.

  • scores = make_scorer(mycustomlossfunc, greater_is_better=False) wraps the custom loss function in a scorer; greater_is_better=False tells scikit learn that lower values are better.
  • classifier = classifier.fit(x, y) is used to fit the classifier.
    import numpy as num
    from sklearn.dummy import DummyClassifier
    from sklearn.metrics import fbeta_score, make_scorer
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import LinearSVC

    # Scorer built from an existing metric: the F-beta score with beta=2
    ftwoscorer = make_scorer(fbeta_score, beta=2)
    Grid = GridSearchCV(LinearSVC(), param_grid={'C': [1, 10]},
                        scoring=ftwoscorer, cv=5)

    # Custom loss: log(1 + largest absolute difference between labels)
    def mycustomlossfunc(y_true, y_pred):
        differ = num.abs(num.asarray(y_true) - num.asarray(y_pred)).max()
        return num.log1p(differ)

    # Scorer built from the custom loss (lower is better)
    scores = make_scorer(mycustomlossfunc, greater_is_better=False)

    x = [[1], [0]]
    y = [0, 1]
    classifier = DummyClassifier(strategy='most_frequent', random_state=0)
    classifier = classifier.fit(x, y)
    print(mycustomlossfunc(y, classifier.predict(x)))

Output:

After running the above code, we get the following output, in which we can see that the value of the custom loss function is printed on the screen.

scikit learn classification metrics
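The effect of greater_is_better=False becomes visible when the scorer itself is called: it returns the negated loss, so that a larger scorer value always means a better model. The following sketch repeats the custom loss on the same toy data; the variable names loss_value and scorer_value are introduced here for illustration.

```python
import numpy as num
from sklearn.dummy import DummyClassifier
from sklearn.metrics import make_scorer


def mycustomlossfunc(y_true, y_pred):
    # log(1 + largest absolute difference between true and predicted labels)
    differ = num.abs(num.asarray(y_true) - num.asarray(y_pred)).max()
    return num.log1p(differ)


scores = make_scorer(mycustomlossfunc, greater_is_better=False)

x = [[1], [0]]
y = [0, 1]
classifier = DummyClassifier(strategy="most_frequent", random_state=0).fit(x, y)

# Calling the loss directly gives a positive value ...
loss_value = mycustomlossfunc(y, classifier.predict(x))
# ... while the scorer is called as scorer(estimator, X, y) and flips the sign.
scorer_value = scores(classifier, x, y)

print(loss_value, scorer_value)
```

This sign convention is why model selection tools can always maximize the scorer, whether it was built from a score or from a loss.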

Read: Scikit learn Hierarchical Clustering

Scikit learn Classification Example

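In this section, we will learn how a scikit learn classification example works in Python.

Classification is a form of data analysis that builds a model describing important data classes. In this example, several classifiers are fitted to the iris dataset and the class probabilities they predict are plotted.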

Code:

In the following code, we will import GaussianProcessClassifier from sklearn.gaussian_process and matplotlib.pyplot as plot, which we use to plot the predicted class probabilities.

  • iris = datasets.load_iris() is used to load the iris dataset.
  • x = iris.data[:, 0:2] is used to take only first two features for visualization.
  • plot.figure(figsize=(3 * 2, nclassifiers * 2)) is used to plot the figure on the screen.
  • probab = classifier.predict_proba(xfull) is used to view probabilities.
  • axis = plot.axes([0.16, 0.05, 0.8, 0.06]) is used to plot the axes on the graph.
  • plot.title("Probability of classes") is used to give the title to the graph.
    import matplotlib.pyplot as plot
    import numpy as num

    from sklearn import datasets
    from sklearn.gaussian_process import GaussianProcessClassifier
    from sklearn.gaussian_process.kernels import RBF
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.svm import SVC

    iris = datasets.load_iris()
    x = iris.data[:, 0:2]  # only the first two features, for visualization
    y = iris.target

    nfeatures = x.shape[1]

    c = 10
    kernel = 1.0 * RBF([1.0, 1.0])  # kernel for the Gaussian process classifier

    # The multi_class argument is deprecated in recent scikit learn releases.
    # The multinomial models therefore rely on the default behaviour, and the
    # one-vs-rest model is built with OneVsRestClassifier.
    classifiers = {
        "L1 logistic": LogisticRegression(
            C=c, penalty="l1", solver="saga", max_iter=10000
        ),
        "L2 logistic (Multinomial)": LogisticRegression(
            C=c, penalty="l2", solver="saga", max_iter=10000
        ),
        "L2 logistic (OvR)": OneVsRestClassifier(
            LogisticRegression(C=c, penalty="l2", solver="saga", max_iter=10000)
        ),
        "Linear SVC": SVC(kernel="linear", C=c, probability=True, random_state=0),
        "GPC": GaussianProcessClassifier(kernel),
    }

    nclassifiers = len(classifiers)

    plot.figure(figsize=(3 * 2, nclassifiers * 2))
    plot.subplots_adjust(bottom=0.2, top=0.95)

    # Grid of points on which the probabilities are evaluated
    xx = num.linspace(3, 9, 100)
    yy = num.linspace(1, 5, 100).T
    xx, yy = num.meshgrid(xx, yy)
    xfull = num.c_[xx.ravel(), yy.ravel()]

    for index, (name, classifier) in enumerate(classifiers.items()):
        classifier.fit(x, y)

        y_pred = classifier.predict(x)
        accuracy = accuracy_score(y, y_pred)
        print("Accuracy (train) for %s: %0.1f%% " % (name, accuracy * 100))

        # View the predicted probabilities, one panel per class
        probab = classifier.predict_proba(xfull)
        nclasses = num.unique(y_pred).size
        for k in range(nclasses):
            plot.subplot(nclassifiers, nclasses, index * nclasses + k + 1)
            plot.title("Class %d" % k)
            if k == 0:
                plot.ylabel(name)
            imshowhandle = plot.imshow(
                probab[:, k].reshape((100, 100)), extent=(3, 9, 1, 5), origin="lower"
            )
            plot.xticks(())
            plot.yticks(())
            # Mark the training samples predicted as class k
            idx = y_pred == k
            if idx.any():
                plot.scatter(x[idx, 0], x[idx, 1], marker="o", c="b", edgecolor="r")

    axis = plot.axes([0.16, 0.05, 0.8, 0.06])
    plot.title("Probability of classes")
    plot.colorbar(imshowhandle, cax=axis, orientation="horizontal")

    plot.show()

Output:

After running the above code, we get the following output in which we can see that accuracy and probability of the model are shown on the screen.

scikit learn classification example
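To see what predict_proba returns without any plotting, the sketch below fits one model on the same two iris features and inspects the probabilities for a few samples. Using a single LogisticRegression with default settings and only the first five samples is an assumption made to keep the example short.

```python
import numpy as num
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
x = iris.data[:, 0:2]  # first two features, as in the plotting example
y = iris.target

classifier = LogisticRegression(C=10, max_iter=10000)
classifier.fit(x, y)

# One row per sample, one column per class.
probab = classifier.predict_proba(x[:5])
print(probab.round(3))

# Each row is a probability distribution over the classes.
print(probab.sum(axis=1))

# The predicted label is the class with the highest probability.
most_likely = classifier.classes_[probab.argmax(axis=1)]
print(most_likely, classifier.predict(x[:5]))
```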

Read: Scikit learn Random Forest

Scikit learn Classification Tree

In this section, we will learn how a scikit learn classification tree works in Python.

A classification tree is a supervised learning method. It aims to create a model that predicts the value of a target variable by learning simple decision rules from the data features.

Code:

In the following code, we will import cross_val_score from sklearn.model_selection, which we use to calculate the cross-validation scores.

  • classifier = DecisionTreeClassifier(random_state=1) is used to create the model that predicts the target value.
  • cross_val_score(classifier, iris.data, iris.target, cv=20) is used to calculate the cross-validation score on each of the 20 folds.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    classifier = DecisionTreeClassifier(random_state=1)
    iris = load_iris()
    # One accuracy value per fold
    print(cross_val_score(classifier, iris.data, iris.target, cv=20))

Output:

After running the above code, we get the following output, in which we can see that the cross-validation scores are printed on the screen.

scikit learn classification tree
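The decision rules that a classification tree learns can be printed as text with export_text from sklearn.tree. The sketch below is illustrative only: max_depth=2 is an assumption added here to keep the printed tree small, and the per-fold scores are summarized by their mean.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
# max_depth=2 is an illustrative choice, not a setting from the example above.
classifier = DecisionTreeClassifier(max_depth=2, random_state=1)

# Summarize the 20 per-fold accuracies by their mean and spread.
scores = cross_val_score(classifier, iris.data, iris.target, cv=20)
print(f"mean accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")

# cross_val_score fits clones of the model, so fit it once more before
# inspecting the rules it has learned.
classifier.fit(iris.data, iris.target)
rules = export_text(classifier, feature_names=iris.feature_names)
print(rules)
```

Each indented line of the printed rules is one threshold test on a feature, and each leaf ends in the class predicted for samples that reach it.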

Read: Scikit learn Hidden Markov Model

Scikit learn Classification Accuracy

In this section, we will learn how scikit learn classification accuracy works in Python.

  • Classification is the task of assigning each sample to one of a fixed set of classes.
  • Accuracy in classification is defined as the number of correct predictions divided by the total number of predictions.

Code:

In the following code, we will import accuracy_score from sklearn.metrics, which computes the fraction of correctly classified samples.

  • accuracy_score(ytrue, ypred) is used to calculate the accuracy score.

    from sklearn.metrics import accuracy_score

    ypred = [0, 1, 2, 3]
    ytrue = [0, 3, 2, 1]
    print(accuracy_score(ytrue, ypred))

Output:

After running the above code we get the following output in which we can see that the accuracy score is printed on the screen.

scikit learn classification accuracy
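The definition of accuracy can be checked by hand on the same labels. The sketch below counts the matching positions and compares the result with accuracy_score; the names correct and manual_accuracy are introduced here for illustration.

```python
from sklearn.metrics import accuracy_score

ypred = [0, 1, 2, 3]
ytrue = [0, 3, 2, 1]

# Count the positions where the prediction equals the true label.
correct = sum(t == p for t, p in zip(ytrue, ypred))
manual_accuracy = correct / len(ytrue)

print(manual_accuracy)                                 # 0.5
print(accuracy_score(ytrue, ypred))                    # 0.5
# normalize=False returns the number of correct predictions instead.
print(accuracy_score(ytrue, ypred, normalize=False))   # 2
```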

For details on accuracy_score, please check the following tutorial: Scikit learn accuracy_score.

Scikit learn Classification Report Support

In this section, we will learn how scikit learn classification report support works in Python.

As we know, a classification report is used to assess the quality of the predictions. Support is defined as the number of samples of the true response that lie in the given class.

Code:

  • In the following code, we will import precision_recall_fscore_support from sklearn.metrics, which returns the precision, recall, F-score, and support.
  • precision_recall_fscore_support(y_true, y_pred, average='micro') is used to compute these values over all classes together. Because an average is requested, the support entry of the result is None.
    import numpy as num
    from sklearn.metrics import precision_recall_fscore_support

    y_true = num.array(['rat', 'cat', 'rat', 'bat', 'rat', 'cat'])
    y_pred = num.array(['rat', 'rat', 'bat', 'cat', 'bat', 'bat'])
    # Returns the tuple (precision, recall, fscore, support)
    print(precision_recall_fscore_support(y_true, y_pred, average='micro'))

Output:

After running the above code, we get the following output, in which we can see that the precision, recall, F-score, and support values are printed on the screen.

scikit learn classification report support
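To get the support of each class, the same function can be called with average=None, which returns one value per class instead of a single aggregate. The sketch below reuses the labels from above; passing an explicit labels list is an assumption made here so that the order of the returned arrays is obvious.

```python
import numpy as num
from sklearn.metrics import precision_recall_fscore_support

y_true = num.array(["rat", "cat", "rat", "bat", "rat", "cat"])
y_pred = num.array(["rat", "rat", "bat", "cat", "bat", "bat"])
labels = ["bat", "cat", "rat"]

# average=None keeps the per-class values, including the support.
precision, recall, fscore, support = precision_recall_fscore_support(
    y_true, y_pred, average=None, labels=labels
)
for label, p, r, s in zip(labels, precision, recall, support):
    print(f"{label}: precision={p:.2f} recall={r:.2f} support={s}")

# With an average, the four values are aggregated and support is None.
micro = precision_recall_fscore_support(y_true, y_pred, average="micro")
print(micro)
```

Here y_true contains one bat, two cat, and three rat samples, so the support array is [1, 2, 3] regardless of what was predicted.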

Also, take a look at some more articles on Scikit learn.

  • Scikit learn Feature Selection
  • Scikit learn Ridge Regression
  • Scikit learn Linear Regression
  • Scikit learn Genetic algorithm

So, in this tutorial, we discussed scikit learn classification and we have also covered different examples related to its implementation. Here is the list of examples that we have covered.

  • Scikit learn classification
  • Scikit learn classification report
  • Scikit learn classification metrics
  • Scikit learn classification example
  • Scikit learn classification tree
  • Scikit learn classification accuracy
  • Scikit learn classification report support


Source: https://pythonguides.com/scikit-learn-classification/