Introduction to Robust Estimation and Hypothesis Testing

sample median will be estimated. So, for example, if data are stored in the R variable blob, the command bootse(blob) will return the estimated standard error of the usual sample median.

3.2 Density Estimators

Before continuing with the main issues covered in this chapter, it helps to first touch on a related problem that plays a role here as well as in subsequent chapters. The problem is estimating $f(x)$, the probability density function, based on a random sample of observations. Such estimators play an explicit role when trying to estimate the standard error of certain location estimators to be described. More generally, density estimators provide a useful perspective when trying to assess how groups differ and by how much.

Generally, kernel density estimators take the form

$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right),$$

where $K$ is some probability density function and $h$ is a constant to be determined. The constant $h$ has been given several names including the span, the window width, the smoothing parameter, and the bandwidth. Some explicit choices for $h$ are discussed later in this section. Often $K$ is taken to be a distribution symmetric about zero, but there are exceptions.

There is a vast literature on kernel density estimators (Silverman, 1986; Scott, 1992; Wand & Jones, 1995; Simonoff, 1996) and research in this area remains active. (For some recent results, see for example, Clements, Hurn, & Lindsay, 2003; Devroye & Lugosi, 2001; Messer & Goldstein, 1993; Yang & Marron, 1999; cf. Liu & Brown, 1993.) Here, four types of kernel density estimators are summarized for later reference.

3.2.1 Normal Kernel

The first of the four methods covered here simply takes $K$ to be the standard normal density. For reasons to be illustrated, the method can be unsatisfactory, but it is the default method used by some software packages, so it is included merely to illustrate potential problems.
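As a rough illustration of the kernel estimator with K taken to be the standard normal density, the following Python sketch evaluates the estimate at a single point x. It is written in Python and is not the book's R code. The function names `normal_kernel` and `kernel_density` are hypothetical, chosen only for this example, and the span `h` is assumed to be supplied by the caller.

```python
import math


def normal_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_density(x, sample, h, kernel=normal_kernel):
    """Evaluate (1 / (n * h)) * sum over i of K((x - X_i) / h).

    x      : point at which the density is estimated
    sample : sequence of observations X_1, ..., X_n
    h      : span (bandwidth), must be positive
    kernel : any probability density function; defaults to the normal kernel
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must contain at least one observation")
    if h <= 0:
        raise ValueError("h must be positive")
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)
```

For example, `kernel_density(0.0, [-1.0, 0.0, 2.0], 0.5)` returns the estimated density at zero for a three-point sample with span 0.5. Passing a different density as `kernel` gives other estimators of the same general form.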
Following Silverman (1986), as well as the recommendation made by Venables and Ripley (2002, p. 127), the span is taken to be

$$h = 1.06 \min(s, \mathrm{IQR}/1.34)\, n^{-1/5},$$

where $s$ is the usual sample standard deviation and IQR is some estimate of the interquartile range. That is, IQR estimates the difference between the 0.75 and 0.25 quantiles.
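The span rule above can be sketched in Python as follows. The text asks only for "some estimate" of the interquartile range, so using the default quartile method of `statistics.quantiles` is an assumption made here for illustration; other quantile estimators give slightly different values of h. The function name `normal_kernel_span` is hypothetical.

```python
import statistics


def normal_kernel_span(sample):
    """Compute h = 1.06 * min(s, IQR / 1.34) * n ** (-1/5).

    s is the usual sample standard deviation (n - 1 in the denominator).
    IQR is estimated from the quartiles returned by statistics.quantiles,
    which is an assumed choice and not necessarily the book's estimator.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are required")
    s = statistics.stdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)  # 0.25, 0.5, 0.75 cut points
    iqr = q3 - q1
    return 1.06 * min(s, iqr / 1.34) * n ** (-1 / 5)
```

Taking the minimum of s and IQR/1.34 keeps a few extreme values from inflating the span: one large outlier can increase s substantially while leaving the interquartile range almost unchanged.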