Friday, March 6, 2015

3D Photos - Motorcycle

There appears to be a new Middlebury stereo dataset: 2014 Stereo datasets. The motorcycle below is one of the stereo pairs on offer.


Left view of motorcycle. This image is 1200 pixels wide. The original stereo pair is much larger. When doing stereo matching, I always try to limit the largest dimension to be around 1200 pixels.
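If you want to do the same downsizing before matching, here is a minimal sketch using Pillow (the filenames are made up). Note that resizing both images by the same factor also scales the disparities by that factor, so the min and max disparities have to be found on the resized pair.

    # Downscale a stereo pair so the largest dimension is about 1200 pixels
    # before stereo matching (filenames are hypothetical).
    from PIL import Image

    def limit_size(path_in, path_out, max_dim=1200):
        img = Image.open(path_in)
        scale = max_dim / max(img.size)
        if scale < 1.0:
            new_size = (round(img.width * scale), round(img.height * scale))
            img = img.resize(new_size, Image.LANCZOS)
        img.save(path_out)

    limit_size("motorcycle_left.png", "left_small.png")
    limit_size("motorcycle_right.png", "right_small.png")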


Right view of motorcycle.

Alright, let's fire up the old Depth Map Automatic Generator 5 (DMAG5). First, with the default parameters. The min and max disparities were of course obtained with Disparity Finder 2 (DF2).



Depth map and occlusion map for min disparity = 0, max disparity = 100, window radius = 12, alpha = 0.9, truncation value (color) = 7, truncation value (gradient) = 2, epsilon = 4, number of smoothing iterations = 1, and disparity tolerance (occlusion detection) = 0.

It's actually a pretty decent depth map, but it doesn't hurt to fiddle with the parameters and see if things can be improved. It's not always easy to tell whether one depth map is better than another (often, you have to render the 3d scene to really tell), and there is a lot of trial and error involved. Let's see if we can get a better depth map by relying solely on color for stereo matching, which is done by setting alpha to 0.
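DMAG5's internals aren't spelled out in this post, but the parameter names suggest a per-pixel matching cost that blends a truncated color difference with a truncated gradient difference, with alpha as the blending weight (that's the form used in cost-volume filtering methods, so take this sketch as an educated guess, not gospel):

    # Hypothetical matching cost suggested by DMAG5's parameter names:
    # a blend of truncated color and gradient differences. With alpha = 0,
    # the cost depends on color alone; with alpha = 0.9, it leans mostly
    # on the gradient, which is more robust to exposure differences.
    def matching_cost(color_diff, grad_diff, alpha=0.9,
                      trunc_color=7.0, trunc_grad=2.0):
        c = min(color_diff, trunc_color)  # truncation value (color)
        g = min(grad_diff, trunc_grad)    # truncation value (gradient)
        return (1.0 - alpha) * c + alpha * g

With alpha = 0, the truncation value (color) becomes the only cap on the cost, which is presumably why it gets bumped from 7 to 128 in the next run.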



Depth map and occlusion map for min disparity = 0, max disparity = 100, window radius = 12, alpha = 0.0, truncation value (color) = 128, truncation value (gradient) = 2, epsilon = 4, number of smoothing iterations = 1, and disparity tolerance (occlusion detection) = 0.

Ok, not a good idea. Maybe decreasing the truncation value (color) would help, although I haven't tried. Let's go back to the previous parameters and increase the window radius.



Depth map and occlusion map for min disparity = 0, max disparity = 100, window radius = 24, alpha = 0.9, truncation value (color) = 7, truncation value (gradient) = 2, epsilon = 4, number of smoothing iterations = 1, and disparity tolerance (occlusion detection) = 0.

Again, not a tremendous idea. Let's increase the disparity tolerance so that we get fewer occlusions.
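As an aside, occlusion detection is usually done with a left-right consistency check, and I am assuming the disparity tolerance is the threshold in that check. Roughly:

    import numpy as np

    # Hedged sketch of left-right consistency checking (assumed to be what
    # DMAG5's disparity tolerance controls): a left pixel is flagged as
    # occluded when the right view's disparity at the matching location
    # disagrees by more than the tolerance.
    def occlusion_map(disp_left, disp_right, tolerance=0):
        h, w = disp_left.shape
        occluded = np.zeros((h, w), dtype=bool)
        for y in range(h):
            for x in range(w):
                xr = x - int(round(disp_left[y, x]))  # match in right view
                if xr < 0 or xr >= w:
                    occluded[y, x] = True
                elif abs(disp_left[y, x] - disp_right[y, xr]) > tolerance:
                    occluded[y, x] = True
        return occluded

Raising the tolerance lets near-misses pass the check, so fewer pixels get flagged as occluded.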



Depth map and occlusion map for min disparity = 0, max disparity = 100, window radius = 12, alpha = 0.9, truncation value (color) = 7, truncation value (gradient) = 2, epsilon = 4, number of smoothing iterations = 1, and disparity tolerance (occlusion detection) = 4.

I think we can increase the disparity tolerance some more and see what happens.



Depth map and occlusion map for min disparity = 0, max disparity = 100, window radius = 12, alpha = 0.9, truncation value (color) = 7, truncation value (gradient) = 2, epsilon = 4, number of smoothing iterations = 1, and disparity tolerance (occlusion detection) = 8.

It's probably worth a try to decrease the window radius.



Depth map and occlusion map for min disparity = 0, max disparity = 100, window radius = 6, alpha = 0.9, truncation value (color) = 7, truncation value (gradient) = 2, epsilon = 4, number of smoothing iterations = 1, and disparity tolerance (occlusion detection) = 8.

Ok, let's settle on that depth map even if it's not the best one, because one could spend a lot of time playing around with the parameters without much guarantee of a payoff.


This is what you can see when you load the left image and the depth map into Gimpel3d. The depth map is obviously not perfect. The spikes could probably be ironed out by applying a little dose of Edge Preserving Smoothing (EPS5), and the areas that are clearly wrong could be painted over in Photoshop or Gimp with your favorite brush. Of course, the "things" that connect objects at different depths are a bit of an eyesore, but they cannot be helped, I'm afraid. These are the areas that get uncovered when you don't look at the scene straight on.
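I don't know exactly what EPS5 does under the hood, but plain bilateral filtering is one simple way to iron out spikes without blurring the depth edges. A quick sketch with OpenCV (filenames and parameter values are illustrative):

    import cv2

    # Bilateral filtering smooths the depth map while preserving depth
    # edges (a stand-in for EPS5, whose algorithm isn't given here).
    depth = cv2.imread("depth_map.png", cv2.IMREAD_GRAYSCALE)
    smoothed = cv2.bilateralFilter(depth, d=9, sigmaColor=30, sigmaSpace=9)
    cv2.imwrite("depth_map_smoothed.png", smoothed)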


This is the animated gif created by DEPTHY using the last depth map we created (make sure you invert the depth map colors because depthy.me expects black to represent the foreground, not white; see the snippet below). If you're not too picky, it looks ok. Actually, more than ok. Even though the scene in Gimpel3d showed quite a few problems, the animated gif is rather pleasant to look at (and get hypnotized by). It goes to show that you don't need a perfect depth map to get a smooth and relatively accurate 3d wiggle.

Now, to get a fully rendered 3d scene, that's a whole different story, as you can see in the Gimpel3d video above. In my opinion, one would probably need to cut up the image (and depth map) into layers (one layer per object in the image as well as the corresponding layer in the depth map). Kinda like what you do in a 2d to 3d conversion. So, in our case, maybe one pair of layers for the bike and another pair for the background. You might also want to do some in-painting to extend the background objects, just like in a 2d to 3d conversion. I have never tried to load multiple images and multiple depth maps into Blender, but I think it should be doable and not too painful. Maybe I'll do that in an upcoming post, who knows?
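By the way, inverting the depth map for DEPTHY is a one-liner, for example with Pillow (filenames are made up):

    from PIL import Image, ImageOps

    # depthy.me expects black = foreground; the depth map we made uses
    # white = foreground, so invert the grayscale values before uploading.
    depth = Image.open("depth_map.png").convert("L")
    ImageOps.invert(depth).save("depth_map_inverted.png")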

Just for fun, let's see what Depth Map Automatic Generator 3 (DMAG3) does with the same stereo pair as input. Recall that DMAG3 is a "graph cuts" stereo matching algorithm, which means it's gonna be (much) slower than DMAG5.
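For the curious, graph-cuts stereo matchers minimize an energy that adds a data term (how well each pixel's disparity matches the other view) to a smoothness term that penalizes neighboring pixels with different disparities. Lambda and K presumably shape that smoothness penalty, and the gamma parameters likely weight the data term by proximity and color similarity. DMAG3's exact energy isn't documented here, so this sketch only shows the generic form:

    import numpy as np

    # Generic energy minimized by graph-cuts stereo: data term plus a
    # truncated linear smoothness term over 4-connected neighbors.
    # (lam and K mirror DMAG3's lambda and K in spirit only.)
    def energy(disparity, data_cost, lam=10.0, K=30.0):
        h, w = disparity.shape
        e = 0.0
        for y in range(h):
            for x in range(w):
                e += data_cost[y, x, disparity[y, x]]
                if x + 1 < w:  # right neighbor
                    e += min(lam * abs(int(disparity[y, x]) - int(disparity[y, x + 1])), K)
                if y + 1 < h:  # bottom neighbor
                    e += min(lam * abs(int(disparity[y, x]) - int(disparity[y + 1, x])), K)
        return e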


Depth map obtained with DMAG3 using window radius = 2, gamma proximity = 17, gamma color similarity = 14, lambda = 10, and K = 30.
