Iris Dataset Analysis with Decision Tree

The document outlines a data analysis project using the Iris dataset, which includes loading the dataset, exploring its structure, and visualizing relationships between features. It further details the implementation of a Decision Tree Classifier to predict species based on sepal and petal measurements, achieving perfect accuracy on the test set. The results are presented through confusion matrices and classification reports, along with visualizations of the decision tree.


Devanshi Srivastava

ENG23CS0300

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

import warnings
warnings.filterwarnings('ignore')

df = sns.load_dataset('iris')

df.head()

   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

df.size

750

df.shape

(150, 5)

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   sepal_length  150 non-null    float64
 1   sepal_width   150 non-null    float64
 2   petal_length  150 non-null    float64
 3   petal_width   150 non-null    float64
 4   species       150 non-null    object
dtypes: float64(4), object(1)
memory usage: 6.0+ KB

df['species'].value_counts()

species
setosa 50
versicolor 50
virginica 50
Name: count, dtype: int64

sns.pairplot(data=df, hue='species')

<seaborn.axisgrid.PairGrid at 0x1f4e3ec3da0>
sns.heatmap(df.drop('species', axis=1).corr(), annot=True)

<Axes: >
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
y = le.fit_transform(df['species'])

y[0:5]

array([0, 0, 0, 0, 0])
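The integer codes can be mapped back to species names through the encoder's `classes_` attribute (index `i` holds the label encoded as `i`) or with `inverse_transform`. A small standalone example, separate from the notebook's `df`:

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
# fit_transform learns the sorted label set and returns the integer codes
y = le.fit_transform(["setosa", "versicolor", "virginica", "setosa"])

print(le.classes_)                   # ['setosa' 'versicolor' 'virginica']
print(le.inverse_transform([0, 2]))  # ['setosa' 'virginica']
```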

x = df.drop('species', axis=1)

from sklearn.model_selection import train_test_split


x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
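Because the three species are perfectly balanced (50 rows each), a plain random split usually lands close to the true class ratio, but passing `stratify=y` to `train_test_split` guarantees the proportions are preserved in both splits. A minimal sketch on toy data (not the notebook's `df`):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X_demo = np.arange(30).reshape(15, 2)           # 15 toy samples, 2 features
y_demo = np.array([0] * 5 + [1] * 5 + [2] * 5)  # 3 balanced classes

# stratify=y_demo keeps the 1:1:1 class ratio in the train and test splits
Xtr, Xte, ytr, yte = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)
print(np.bincount(yte))  # one sample of each class in the 3-sample test set
```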

from sklearn.tree import DecisionTreeClassifier


dtree = DecisionTreeClassifier()
dtree.fit(x_train, y_train)
y_predicted = dtree.predict(x_test)

from sklearn.metrics import classification_report, confusion_matrix


cm = confusion_matrix(y_test, y_predicted)
print(cm)

[[10 0 0]
[ 0 9 0]
[ 0 0 11]]
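All 30 test samples sit on the diagonal of the matrix above, which is why every metric below comes out at 1.00. In general, accuracy is the diagonal sum divided by the total count:

```python
import numpy as np

# confusion matrix from the test split above
cm = np.array([[10, 0, 0],
               [0,  9, 0],
               [0,  0, 11]])

# accuracy = correctly classified (diagonal) / all samples
accuracy = np.trace(cm) / cm.sum()
print(accuracy)  # 1.0
```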

print(classification_report(y_test , y_predicted))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30

sns.heatmap(cm, annot=True, cmap='Greens', cbar=False, annot_kws={"fontsize": 18})


plt.xlabel("Predicted Value")
plt.ylabel("Actual Value")
plt.show()
from sklearn.tree import plot_tree

plot = plot_tree(decision_tree=dtree, feature_names=x.columns, class_names=["setosa", "versicolor", "virginica"])
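Perfect accuracy on a single 30-sample test split can be optimistic for an unpruned decision tree. A quick cross-validation gives a more robust estimate; this sketch uses `cross_val_score` and sklearn's built-in copy of the Iris data, neither of which appears in the original notebook:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 5-fold CV: fit on 4 folds, score on the held-out fold, repeated 5 times
scores = cross_val_score(DecisionTreeClassifier(random_state=42), X, y, cv=5)
print(scores.mean())  # typically around 0.95, not a perfect 1.0
```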
