Converting to Real-World Distances

But what about these pixels’ actual depth? We know that the color of each pixel in the depth image is supposed to relate to the distance of that part of the scene, but how far away is 96 or 115 or 224? How do we translate from these brightness values to a distance measurement in some kind of useful units like inches or centimeters?

Well, let’s start with some basic proportions. We know from our experiments earlier in this chapter that the Kinect can detect objects within a range of about 20 inches to 25 feet. And we know that the pixels in our depth image have brightness values that range between 255 and 0. So, logically, it must be the case that pixels in the depth image that have a brightness of 255 must correspond to objects that are 20 inches away, pixels with a value of 0 must be at least 25 feet away, and values in between must represent the range in between, getting farther away as their values descend toward 0.

So, how far away is the back of our chair with the depth pixel reading of 96? Given the logic just described, we can make an estimate. Doing a bit of back-of-the-envelope calculation tells me that 96 is approximately 37% of the way between 0 and 255. Hence the simplest possible distance calculation would hold that the chair back is 37% of the way from 25 feet back toward 20 inches, or about 16 feet. This is clearly not right. I’m sitting in front of the chair now and it’s no more than 8 or 10 feet away from the Kinect.

Did we go wrong somewhere in our calculation? Is our Kinect broken? No. The answer is much simpler. The relationship between the brightness value of the depth pixel and the real-world distance it represents is more complicated than a simple linear ratio. There are really good reasons for this to be the case.

First of all, consider the two ranges of values being covered. As we’ve thoroughly discussed, the depth pixels represent grayscale values between 0 and 255. The real-world distance covered by the Kinect, on the other hand, ranges between 20 inches and 25 feet. Further, the Kinect’s depth readings have millimeter precision. That means they need to report not just a number ranging from 0 to a few hundred inches, but a number ranging from 0 to around 8,000 millimeters to cover the entirety of the distance the Kinect can see. That’s obviously a much larger range than can fit in the bit depth of the pixels in our depth image.

Now, given these differences in range, we could still think of some simple ways of converting between the two. For example, it might have occurred to you to use Processing’s map function, which scales a variable from an expected range of input values to a different set of output values. However, using map is a little bit like pulling on a piece of stretchy cloth to get it to cover a larger area: you’re not actually creating more cloth, you’re just pulling the individual strands of the cloth apart, putting space between them to increase the area covered by the whole. If there were a pattern on the surface of the cloth, it would get bigger, but you’d also start to see spaces within it where the strands separated. Processing’s map function does something similar. When you use it to convert from a small range of input values to a larger output, it can’t create new values out of thin air. It just stretches out the existing values so that they cover the new range. Your highest input values will get stretched up to your highest output value and your lowest input to your lowest output, but just like with the piece of cloth, there won’t be enough material to cover all the intermediate values. There will be holes. And in the case of mapping our depth pixels from their brightness range of 0 to 255 to the physical range of 0 to 8,000 millimeters, there will be a lot of holes. Those 256 brightness values will only cover a small minority (roughly 3%) of the 8,000 possible distance values.

To cover all of those distance values without holes, we’d need to access the depth data from the Kinect in some higher resolution form. As I mentioned in the introduction, the Kinect actually captures the depth information at a resolution of 11 bits per pixel. This means that these raw depth readings have a range of 0 to 2,047, which is much better than the 0 to 255 available in the depth pixels we’ve looked at so far. And the SimpleOpenNI library gives us a way to access these raw depth values.

But wait! Have we been cheated? Why don’t the depth image pixels have this full range of values? Why haven’t we been working with them all along?

Remember the discussion of images and pixels at the start of this chapter? Back then, I explained that the bigger the number we use to store each pixel in an image, the more memory it takes to store that image and the slower any code runs that has to work with it. Also, it’s very hard for people to visually distinguish between more than 256 shades of gray anyway. Think back to our investigations of the depth image. There were areas in the back of the scene that looked like flat expanses of black pixels, but when we clicked around on them to investigate, it turned out that even they had different brightness values, simply with differences that were too small for us to see. For all of these reasons, all images in Processing have pixels that are 8 bits in depth, i.e., whose values only range from 0 to 255. The PImage class we’ve used in our examples that allows us to load images from the Kinect library and display them on the screen enforces the use of these smaller pixels.

What it really comes down to is this: when we’re displaying depth information on the screen as images, we make a set of trade-offs. We use a rougher representation of the data because it’s easier to work with and we can’t really see the differences anyway. But now that we want to use the Kinect to make precise distance measurements, we want to make a different set of trade-offs. We need the full-resolution data in order to make more accurate measurements, but since that’s more unwieldy to work with, we’re going to use it more sparingly.

Let’s make a new version of our Processing sketch that uses the higher-resolution depth data to turn our Kinect into a wireless tape measure. With this higher-resolution data, we actually have enough information to calculate the precise distance between our Kinect and any object in its field of view and to display it in real-world units like inches and millimeters.
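
Before we do, it may help to see the naive linear estimate from earlier in this section written out. The following is only a sketch in plain Java: the class name, the naiveInches helper, and the local map function are my own stand-ins (the map here imitates the arithmetic of Processing’s map function rather than calling it), and the 20-inch and 300-inch endpoints are the rough range limits quoted above, not calibrated values.

```java
class LinearDepthEstimate {
    // Plain-Java stand-in for Processing's map(value, start1, stop1, start2, stop2):
    // rescales value from the range [start1, stop1] to the range [start2, stop2].
    static float map(float value, float start1, float stop1, float start2, float stop2) {
        return start2 + (stop2 - start2) * ((value - start1) / (stop1 - start1));
    }

    // Naive assumption: brightness 255 means 20 inches away,
    // brightness 0 means 300 inches (25 feet) away, and everything
    // in between falls on a straight line.
    static float naiveInches(int brightness) {
        return map(brightness, 255, 0, 20, 300);
    }

    public static void main(String[] args) {
        float inches = naiveInches(96);
        System.out.println("Brightness 96 -> " + inches + " inches, or "
            + (inches / 12) + " feet");
    }
}
```

For the chair back’s reading of 96, this works out to a little under 195 inches, or a bit more than 16 feet, which is nowhere near the 8 to 10 feet I can measure by eye. The straight-line assumption is the problem, not the arithmetic.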

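The stretchy-cloth problem with map described in this section can also be counted directly. This is a rough illustration in plain Java, assuming we round each mapped value to a whole millimeter; the class name, the reachableMillimeters helper, and the local map function are hypothetical stand-ins that imitate Processing’s map rather than the real thing.

```java
import java.util.TreeSet;

class MapHoles {
    // Plain-Java stand-in for Processing's map(value, start1, stop1, start2, stop2).
    static float map(float value, float start1, float stop1, float start2, float stop2) {
        return start2 + (stop2 - start2) * ((value - start1) / (stop1 - start1));
    }

    // Collects every whole-millimeter value that some brightness
    // level (0 to 255) lands on when stretched across 0 to 8,000.
    static TreeSet<Integer> reachableMillimeters() {
        TreeSet<Integer> reached = new TreeSet<>();
        for (int brightness = 0; brightness <= 255; brightness++) {
            reached.add(Math.round(map(brightness, 0, 255, 0, 8000)));
        }
        return reached;
    }

    public static void main(String[] args) {
        TreeSet<Integer> reached = reachableMillimeters();
        System.out.println("Distinct distances reached: " + reached.size());
        System.out.println("Percent of 0..8000 covered: "
            + (100.0f * reached.size() / 8001));
        // The first few values show the gaps between neighbors.
        System.out.println("First values: " + reached.headSet(130));
    }
}
```

Only 256 distinct distances come out, each about 31 millimeters from its neighbor. Every millimeter value in between is a hole that no brightness level can ever land on.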
  
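The numeric ranges mentioned in this section follow from the bit depths, and the unit conversion we’ll want for a tape measure is a single division. Here is a small plain-Java sketch of both; the class and method names are hypothetical, and nothing here talks to the Kinect or to SimpleOpenNI.

```java
class DepthUnits {
    // Largest unsigned value that fits in the given number of bits.
    static int maxValueForBits(int bits) {
        return (1 << bits) - 1;
    }

    // One inch is defined as exactly 25.4 millimeters.
    static float millimetersToInches(float millimeters) {
        return millimeters / 25.4f;
    }

    public static void main(String[] args) {
        System.out.println("8-bit pixel range:  0 to " + maxValueForBits(8));
        System.out.println("11-bit depth range: 0 to " + maxValueForBits(11));

        // The far end of the range discussed in the text, in friendlier units.
        float farInches = millimetersToInches(8000);
        System.out.println("8,000 mm = " + farInches + " inches ("
            + farInches / 12 + " feet)");
    }
}
```

Eight bits top out at 255 and eleven bits at 2,047, matching the ranges above, and 8,000 millimeters comes to a little over 26 feet, which is why it makes a convenient round upper bound for the Kinect’s roughly 25-foot reach.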
