Decision Tree And Random Forest

Naman Mehra
2 min read · Jul 2, 2021

A decision tree is a type of supervised machine learning algorithm used for classification problems. In it, the data is repeatedly split according to the value of a feature. In a decision tree, each internal node represents a test on a feature and each leaf node represents a class label.

Entropy

How a decision tree is constructed depends on entropy and information gain (or Gini impurity). The best possible split of a node results in an entropy of 0, while the worst split leaves the entropy at 1 (for a binary problem), meaning the split is of no use. The formula for the entropy of a node in a decision tree is Entropy = -(p(0) * log2(p(0)) + p(1) * log2(p(1))), where p(0) and p(1) are the proportions of samples at that node belonging to class 0 and class 1.
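As a quick illustration, here is a minimal sketch of that formula in Python (the labels are made up, and entropy is just an illustrative helper name):

```python
import numpy as np

def entropy(labels):
    """Binary entropy of a node (log base 2), per the formula above."""
    p1 = np.mean(labels)      # proportion of samples in class 1
    p0 = 1 - p1               # proportion of samples in class 0
    if p0 == 0 or p1 == 0:    # a pure node carries no uncertainty
        return 0.0
    return -(p0 * np.log2(p0) + p1 * np.log2(p1))

print(entropy([0, 0, 1, 1]))  # worst case, 50/50 mix -> 1.0
print(entropy([1, 1, 1, 1]))  # pure node -> 0.0
```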

Information Gain

After computing the entropy, we need to compute the information gain. Information gain tells us how much a split reduces entropy: the higher the information gain, the better the feature we have chosen for the split. The formula for calculating information gain is Entropy(Dataset) - (Count(Group1) / Count(Dataset) * Entropy(Group1) + Count(Group2) / Count(Dataset) * Entropy(Group2)), where Entropy(Dataset) is the entropy of the parent node and Group1 and Group2 are the two child nodes produced by the split.
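Continuing the sketch above (it reuses the entropy helper, and the groups here are again made up):

```python
def information_gain(parent, group1, group2):
    """Parent entropy minus the size-weighted entropy of the two child groups."""
    n = len(parent)
    weighted = (len(group1) / n) * entropy(group1) + (len(group2) / n) * entropy(group2)
    return entropy(parent) - weighted

parent = [0, 0, 1, 1]
# Splitting a 50/50 parent into two pure children recovers the full 1 bit.
print(information_gain(parent, [0, 0], [1, 1]))  # -> 1.0
```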

Gini Impurity

Gini impurity is an alternative to entropy and information gain as a splitting criterion. The main difference is that Gini impurity is cheaper to compute, since it avoids the logarithm: Gini = 1 - (p(0)^2 + p(1)^2). The decision tree classifier in sklearn uses Gini by default for determining the best split.
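For example, here is a minimal sketch of switching the criterion in sklearn, assuming a synthetic toy dataset:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic toy data, just for illustration.
X, y = make_classification(n_samples=200, random_state=42)

# criterion="gini" is the sklearn default; criterion="entropy" switches to information gain.
gini_tree = DecisionTreeClassifier(random_state=0).fit(X, y)
entropy_tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

print(gini_tree.get_params()["criterion"])  # -> 'gini'
```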

Disadvantages of decision trees

Decision trees are very sensitive to outliers; even a single outlier can have a major effect on the accuracy of our model.
Decision trees have very high variance and very low bias, which makes them prone to overfitting.
Training the model on a large dataset can be time-consuming.

Random Forest Classifier

Random forest builds multiple decision trees and merges them together to get a more accurate and stable prediction. Random forest is trained with a bagging method.
The bagging method is based on the idea that combining multiple learning models improves the overall result.
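To close, here is a minimal sketch of training one with sklearn, again on a synthetic toy dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic toy data, just for illustration.
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 100 trees is trained on a bootstrap sample (bagging);
# their predictions are combined by majority vote.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))
```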
