How to estimate/determine surface normals and tangent planes at points of a depth image?

I have a depth image that I generated from 3D CAD data. Such an image could also come from a depth imaging sensor such as a Kinect or a stereo camera. So basically it is a depth map of the points visible in the imaging view. In other words, it is a segmented point cloud of an object seen from a certain view.

I would like to determine (an estimate will also do) the surface normal at each point, and then find the tangent plane at that point.

How can I do this? I did some research and found some techniques, but I didn't understand them well enough to implement them. More importantly, how can I do this in MATLAB or OpenCV? I couldn't manage to do it with the surfnorm command. AFAIK it needs a single surface, and I have partial surfaces in my depth image.

This is an example depth image.


What I want to do is this: after I get the surface normal at each point, I will create tangent planes at those points. Then I will use those tangent planes to decide whether a point comes from a flat region, by taking the sum of the distances of its neighbor points to the tangent plane.



So there are a couple of things that are undefined in your question, but I'll do my best to outline an answer.

The basic idea for what you want to do is to take the gradient of the image, and then apply a transformation to the gradient to get the normal vectors. Taking the gradient in MATLAB is easy:

[m, g] = imgradient(d);

gives us the magnitude (m) and the direction (g) of the gradient (relative to the horizontal and measured in degrees) of the image at every point. For instance, if we display the magnitude of the gradient for your image it looks like this:

Now, the harder part is to take this information we have about the gradient and turn it into a normal vector. In order to do this properly we need to know how to transform from image coordinates to world coordinates. For a CAD-generated image like yours, this information is contained in the projection transformation used to make the image. For a real-world image like one you'd get from a Kinect, you would have to look up the spec for the image-capture device.

The key piece of information we need is this: just how wide is each pixel in real-world coordinates? For non-orthographic (perspective) projections, like those used by real-world image capture devices, we can approximate this by assuming each pixel covers a fixed angle of the real world. If we know this angle (call it p and measure it in radians), then the real-world distance covered by a pixel is just sin(p) .* d, or approximately p .* d, where d is the depth of the image at each pixel.
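To make the pixel-width idea concrete, here is a small Python sketch. It is only an illustration: the function names are made up, it assumes the pixels share the field of view evenly, and the 57 degree / 640 pixel numbers are placeholders rather than values from any spec sheet.

```python
import math

def pixel_angle(fov_degrees, pixels_across):
    """Angle p (radians) covered by one pixel, assuming an even split of the field of view."""
    return math.radians(fov_degrees) / pixels_across

def pixel_width(p, depth):
    """Approximate real-world width of one pixel at the given depth (small-angle form p * d)."""
    return p * depth

# Illustrative values only: check your own device or projection settings.
p = pixel_angle(57.0, 640)
print(pixel_width(p, 2.0))
```

For angles this small, sin(p) * d and p * d agree to many decimal places, which is why the shorter form is usually good enough.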

Now if we have this info, we can construct the 3 components of the normal vectors:

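width = p .* d;                  % real-world width of one pixel at each depth
gradx = m .* cosd(g) ./ width;   % g is in degrees, so use cosd/sind
grady = m .* sind(g) ./ width;   % dividing by width gives depth change per unit length

normx = - gradx;
normy = - grady;
normz = 1;

len = sqrt(normx .^ 2 + normy .^ 2 + normz .^ 2);
x = normx ./ len;
y = normy ./ len;
z = normz ./ len;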
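For anyone without MATLAB at hand, here is a minimal plain-Python sketch of the same idea. Treat it as an illustration under assumptions, not as the method above verbatim: it uses central differences instead of imgradient, takes the depth image as a list of rows with positive depths, and the name normals_from_depth is made up for this example. It divides the per-pixel depth change by the pixel width p * d to get a slope, then normalizes (-dz/dx, -dz/dy, 1).

```python
import math

def normals_from_depth(depth, p):
    """Unit normals for a depth image (list of rows); p is the angle per pixel in radians.

    Border pixels are left as None because central differences need both neighbours.
    """
    rows, cols = len(depth), len(depth[0])
    normals = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            width = p * depth[r][c]  # approximate real-world pixel width
            dzdx = (depth[r][c + 1] - depth[r][c - 1]) / (2.0 * width)
            dzdy = (depth[r + 1][c] - depth[r - 1][c]) / (2.0 * width)
            length = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            normals[r][c] = (-dzdx / length, -dzdy / length, 1.0 / length)
    return normals

# A surface at constant depth faces the camera, so its normal is (0, 0, 1).
flat = [[2.0] * 5 for _ in range(5)]
print(normals_from_depth(flat, 0.01)[2][2])
```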

What mattnewport is suggesting can be done in a pixel shader. In each pixel shader invocation you calculate two vectors A and B, and the cross product of those vectors gives you the normal. The two vectors are calculated like so:

float2 du; // values sent to the shader based on the depth image's width and height
float2 dv; // normally du = float2(1/width, 0) and dv = float2(0, 1/height)
float D = sample(depthtex, uv);
float D1 = sample(depthtex, uv + du);
float D2 = sample(depthtex, uv + dv);
float3 A = float3(du.x * width_of_image, 0, D1 - D);  // one pixel step in x
float3 B = float3(0, dv.y * height_of_image, D2 - D); // one pixel step in y
float3 normal = cross(A, B); // normalize this if you need a unit vector
return normal;

This will break when there're discontinuities in the depth values.
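If you want to try the cross-product approach on the CPU first, a rough Python sketch could look like this. The function names and the max_jump threshold are my own assumptions for illustration: the threshold is just one simple way to skip pixels that sit on a depth discontinuity, and a sensible value depends on your data.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normal_at(depth, r, c, max_jump=float("inf")):
    """Unit normal at pixel (r, c) from forward differences, or None at a depth jump."""
    d = depth[r][c]
    dx = depth[r][c + 1] - d
    dy = depth[r + 1][c] - d
    if abs(dx) > max_jump or abs(dy) > max_jump:
        return None  # likely an object boundary, so the cross product is meaningless
    a = (1.0, 0.0, dx)  # one pixel step in x
    b = (0.0, 1.0, dy)  # one pixel step in y
    n = cross(a, b)
    length = math.sqrt(sum(v * v for v in n))
    return tuple(v / length for v in n)
```

Note that A x B works out to (-dx, -dy, 1), so this gives the same direction as the gradient-based construction, just without the pixel-width scaling.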

To decide whether a surface is flat in the pixel shader, the second-order partial derivatives can be used. You calculate them by taking finite differences and then taking the difference of those differences, like so:

float D = sample(depthtex, uv);
float D1 = sample(depthtex, uv + du);
float D3 = sample(depthtex, uv - du);

float dx1 = (D1 - D) / du.x;    // forward difference
float dx2 = (D - D3) / du.x;    // backward difference
float dxx = (dx1 - dx2) / du.x; // second derivative in x

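In the same way you have to calculate dyy, dxy and dyx. The surface is flat if dxx = dyy = dxy = dyx = 0. With noisy depth data these will rarely be exactly zero, so in practice you compare their magnitudes against a small tolerance.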

Typically, you'd choose du and dv to be 1/width and 1/height of the depth image.
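Here is a hedged CPU sketch of that flatness test in Python. The function names, the pixel spacing h and the tolerance tol are assumptions made for this example. It uses the standard central-difference stencils, for which dxy and dyx come out identical, so only one of them is computed.

```python
def second_derivatives(depth, r, c, h=1.0):
    """Central-difference estimates of dxx, dyy and dxy at pixel (r, c)."""
    d = depth
    dxx = (d[r][c + 1] - 2.0 * d[r][c] + d[r][c - 1]) / (h * h)
    dyy = (d[r + 1][c] - 2.0 * d[r][c] + d[r - 1][c]) / (h * h)
    dxy = (d[r + 1][c + 1] - d[r + 1][c - 1]
           - d[r - 1][c + 1] + d[r - 1][c - 1]) / (4.0 * h * h)
    return dxx, dyy, dxy

def is_flat(depth, r, c, tol=1e-6):
    """True if every second derivative is within tol of zero."""
    return all(abs(v) <= tol for v in second_derivatives(depth, r, c))

# A tilted plane is still flat; a saddle has dxx = dyy = 0 but a non-zero dxy.
plane = [[1.0 + 0.5 * c + 0.25 * r for c in range(3)] for r in range(3)]
saddle = [[float(r * c) for c in range(3)] for r in range(3)]
print(is_flat(plane, 1, 1), is_flat(saddle, 1, 1))
```

The saddle case is the reason the mixed derivative is part of the test: checking dxx and dyy alone would wrongly call it flat.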

All of this happens on the GPU, which makes everything really fast. But if you don't care about that, you can run this method on the CPU as well. The only thing you have to do is replace a function like sample with your own implementation. It takes the depth image and the u, v values as input and returns the depth value at the sampled point.


Here's a hypothetical sampling function that does nearest neighbour sampling on the CPU.

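float Sample(const Texture& texture, vector_2d uv){
    // texture.data is a hypothetical 2D array of depth values; uv is assumed to stay in range
    return texture.data[(int)(uv.x * texture.width + 0.5)][(int)(uv.y * texture.height + 0.5)];
}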

I will describe what I think you have to do conceptually and provide links to the relevant parts of OpenCV.

To determine the normal at a given (3D) point in a point cloud:

  1. Create a kd-tree (or possibly a ball tree) representation of your point cloud so that you can efficiently compute the k nearest neighbors. Your choice of k should depend on the density of your data.

  2. After querying for the k nearest neighbors of a given point p, use them to find a best-fit plane. You can use PCA to do this. Set maxComponents=2 so that only the two principal directions are kept.

  3. Step 2 should return two eigenvectors which span the plane you are interested in. The cross product of these two vectors should be (an estimate of) your desired normal vector. You can find info on how to calculate this in OpenCV (Mat::cross).
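To show how the three steps fit together, here is a self-contained Python sketch. It is a stand-in under assumptions, not the OpenCV route itself: a brute-force search replaces the kd-tree, a small Jacobi iteration replaces cv::PCA, and all function names are invented for this example. The last function computes the flatness measure from the question (sum of neighbor distances to the tangent plane through the query point).

```python
import math

def nearest_neighbors(points, query, k):
    """Brute-force k nearest neighbours (a kd-tree would replace this for big clouds)."""
    return sorted(points, key=lambda q: sum((q[i] - query[i]) ** 2 for i in range(3)))[:k]

def covariance(points):
    """3x3 covariance matrix of a list of 3D points."""
    n = len(points)
    mean = [sum(q[i] for q in points) / n for i in range(3)]
    return [[sum((q[i] - mean[i]) * (q[j] - mean[j]) for q in points) / n
             for j in range(3)] for i in range(3)]

def eigen_symmetric_3x3(a, sweeps=30):
    """Jacobi iteration; returns (eigenvalues, eigenvectors), vectors[i] pairs with values[i]."""
    a = [row[:] for row in a]
    v = [[float(i == j) for j in range(3)] for i in range(3)]
    for _ in range(sweeps):
        for p, q in ((0, 1), (0, 2), (1, 2)):
            if abs(a[p][q]) < 1e-15:
                continue
            theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
            if theta > math.pi / 4:      # keep the rotation angle within +/- 45 degrees
                theta -= math.pi / 2
            elif theta < -math.pi / 4:
                theta += math.pi / 2
            c, s = math.cos(theta), math.sin(theta)
            for m in (a, v):             # column update: M <- M J
                for k in range(3):
                    mp, mq = m[k][p], m[k][q]
                    m[k][p], m[k][q] = c * mp - s * mq, s * mp + c * mq
            for k in range(3):           # row update: A <- J^T A
                ap, aq = a[p][k], a[q][k]
                a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    return [a[i][i] for i in range(3)], [[v[k][i] for k in range(3)] for i in range(3)]

def cross(u, w):
    return (u[1] * w[2] - u[2] * w[1], u[2] * w[0] - u[0] * w[2], u[0] * w[1] - u[1] * w[0])

def estimate_normal(points, query, k):
    """Steps 1 to 3: neighbours, two principal directions, cross product."""
    neighbors = nearest_neighbors(points, query, k)
    values, vectors = eigen_symmetric_3x3(covariance(neighbors))
    order = sorted(range(3), key=lambda i: values[i], reverse=True)
    n = cross(vectors[order[0]], vectors[order[1]])
    length = math.sqrt(sum(x * x for x in n))
    return tuple(x / length for x in n), neighbors

def flatness(query, normal, neighbors):
    """Sum of distances from the neighbours to the tangent plane through query."""
    return sum(abs(sum(normal[i] * (q[i] - query[i]) for i in range(3))) for q in neighbors)

# Points sampled from the plane z = 0.5 * x + 2 should give a flatness close to zero.
plane = [(float(x), float(y), 0.5 * x + 2.0) for x in range(-2, 3) for y in range(-2, 3)]
normal, nbrs = estimate_normal(plane, (0.0, 0.0, 2.0), 9)
print(normal, flatness((0.0, 0.0, 2.0), normal, nbrs))
```

The sign of the normal is arbitrary with this approach, so if you need consistent orientation you would flip any normal that points away from the camera. You also need a threshold on the flatness value, which depends on k and on the noise level of your data.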
