Gini Impurity

Gini impurity is a measure of how mixed (impure) a set of labeled items is. If you have a set of items, such as [A, A, B, B, B, C], then Gini impurity tells you the probability that you would be wrong if you picked one item at random and guessed its label at random, choosing each label in proportion to how often it appears in the set. If the set were all As, you would always guess A and never be wrong, so the set would be totally pure and its Gini impurity would be 0.
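
As a worked example for the set [A, A, B, B, B, C], assume the guess is drawn in proportion to how often each label appears. Each term below is the chance of picking an item with a given label (A, B, then C) multiplied by the chance of guessing any other label:

```latex
\begin{aligned}
P(\text{wrong}) &= \tfrac{2}{6}\cdot\tfrac{4}{6} + \tfrac{3}{6}\cdot\tfrac{3}{6} + \tfrac{1}{6}\cdot\tfrac{5}{6} \\
                &= \tfrac{8 + 9 + 5}{36} = \tfrac{22}{36} \approx 0.611
\end{aligned}
```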

Figure B-6 shows the formula for Gini impurity.

Figure B-6. Gini impurity
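
In one common notation, Gini impurity can be written as follows. The symbols are assumptions chosen for this sketch and may differ from those in Figure B-6: f_j is the fraction of items carrying label j, and m is the number of distinct labels. The first sum runs over every ordered pair of different labels, and the two forms are equal because the fractions sum to 1.

```latex
I_G = \sum_{j \neq k} f_j \, f_k = 1 - \sum_{j=1}^{m} f_j^{2}
```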

This function takes a list of items and calculates the Gini impurity:

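def giniimpurity(l):
  # Count how many times each distinct label occurs
  total=len(l)
  counts={}
  for item in l:
    counts.setdefault(item,0)
    counts[item]+=1

  # Sum f1*f2 over every ordered pair of different labels.
  # Iterate over the distinct labels in counts, not over the items in l,
  # so that repeated items do not add the same pair more than once.
  imp=0
  for j in counts:
    f1=float(counts[j])/total
    for k in counts:
      if j==k: continue
      f2=float(counts[k])/total
      imp+=f1*f2
  return imp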
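
Because the label fractions sum to 1, the sum of products over pairs of different labels equals 1 minus the sum of the squared fractions. The sketch below uses that shortcut as an alternative way to compute the same quantity. The name gini_from_counts is hypothetical (it is not from the book), and returning 0.0 for an empty list is an assumption made here for convenience.

```python
from collections import Counter

def gini_from_counts(labels):
    """Gini impurity as 1 minus the sum of squared label fractions."""
    total = len(labels)
    if total == 0:
        return 0.0  # assumption: treat an empty set as pure
    # Counter maps each distinct label to the number of times it occurs
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

mixed = gini_from_counts(["A", "A", "B", "B", "B", "C"])
pure = gini_from_counts(["A", "A", "A"])
print(round(mixed, 4))  # 0.6111
print(pure)             # 0.0
```

This version makes a single pass over the distinct labels rather than a nested loop, which matters only when there are many labels.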

  
