Entropy

Entropy is another way to measure how mixed a set is. It comes from information theory, and it measures the amount of disorder in a set. Loosely defined, entropy captures how surprising a randomly selected item from the set is: if the entire set were As, you would never be surprised to see an A, so the entropy would be 0. The formula is shown in Figure B-7.

Figure B-7. Entropy
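Written out the way the function below computes it, each distinct item i occurs with frequency p(i), and the entropy is the sum over all items of minus that frequency times its base-2 logarithm:

p(i) = \frac{\mathrm{count}(i)}{\mathrm{total}}, \qquad H = -\sum_{i} p(i)\,\log_2 p(i)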

This function takes a list of items and calculates the entropy:

def entropy(l):
  from math import log
  log2=lambda x:log(x)/log(2)

  # Count how many times each distinct item appears in the list
  total=len(l)
  counts={}
  for item in l:
    counts.setdefault(item,0)
    counts[item]+=1

  # Sum -p*log2(p) over the distinct items, where p is the item's frequency
  ent=0
  for i in counts:
    p=float(counts[i])/total
    ent-=p*log2(p)
  return ent
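For example (these calls are a quick illustrative check, not part of the original listing), a list of identical items has an entropy of 0, while a list that is half As and half Bs has a full bit of entropy:

print(entropy(['A','A','A','A']))  # 0.0 -- a pure set, no surprise at all
print(entropy(['A','A','B','B']))  # 1.0 -- maximally mixed
print(entropy(['A','A','A','B']))  # roughly 0.81 -- mostly pure, so less disorder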

In Chapter 7, entropy is used in decision tree modeling to determine whether dividing a set reduces the amount of disorder.
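A minimal sketch of that idea (the information_gain helper below is an illustration, not the Chapter 7 code itself) scores a split by comparing the entropy of the whole set with the weighted entropy of the two resulting halves:

def information_gain(whole,set1,set2):
  # Weight each half's entropy by the fraction of the items it holds
  p=float(len(set1))/len(whole)
  return entropy(whole)-p*entropy(set1)-(1-p)*entropy(set2)

# Splitting ['A','A','B','B'] into ['A','A'] and ['B','B'] removes all the
# disorder, so the gain is the full 1.0 bits of the original set's entropy.

A positive gain means the division produced purer subsets; a gain near 0 means the split did little to reduce the disorder.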


  
