Summarizing Data by Groups

Problem

You want to summarize your data, based on one or more grouping variables.

Solution

Use ddply() from the plyr package with the summarise() function, and specify the summary operations to compute for each group:

library(MASS) # For the data set
library(plyr)

# Mean HeadWt and VitC for each combination of Cult and Date
ddply(cabbages, c("Cult", "Date"), summarise, Weight = mean(HeadWt),
      VitC = mean(VitC))

 Cult Date Weight VitC
  c39  d16   3.18 50.3
  c39  d20   2.80 49.4
  c39  d21   2.74 54.8
  c52  d16   2.26 62.5
  c52  d20   3.11 58.9
  c52  d21   1.47 71.8
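
The same call can compute more than one kind of summary per group. The following is a minimal sketch, not taken from the text above: the result name cab_summary and the output column names SD and N are chosen here for illustration only.

```r
library(MASS)  # For the cabbages data set
library(plyr)

# Several summaries of HeadWt for each Cult/Date group in one call.
# cab_summary, SD, and N are illustrative names, not from the original text.
cab_summary <- ddply(cabbages, c("Cult", "Date"), summarise,
                     Weight = mean(HeadWt),    # group mean
                     SD     = sd(HeadWt),      # group standard deviation
                     N      = length(HeadWt))  # number of rows in the group
cab_summary
```

Each argument after summarise becomes one column of the result, evaluated separately within every combination of the grouping variables.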

Discussion

Let’s take a closer look at the cabbages data set. It has two factors that can be used as grouping variables: Cult, which has levels c39 and c52, and Date, which has levels d16, d20, and d21. It also has two numeric variables, HeadWt and VitC.
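
One way to check this structure yourself is sketched below, assuming only that the MASS package is installed. It uses base R inspection functions and prints the column types, the factor levels, and the first few rows.

```r
library(MASS)  # For the cabbages data set

str(cabbages)          # Column types, including the two factors
levels(cabbages$Cult)  # Levels of the first grouping variable
levels(cabbages$Date)  # Levels of the second grouping variable
head(cabbages)         # First few rows
```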


  
