Accurate, Dense, and Robust Multiview Stereopsis

Abstract: This paper proposes a novel algorithm for calibrated multi-view stereopsis that outputs a quasi-dense set of small rectangular patches covering the surfaces visible in the images. Stereopsis is implemented as a match, expand, and filter procedure, starting from a sparse set of matched keypoints and repeatedly expanding these to nearby pixel correspondences before using visibility constraints to filter away false matches. The keys to the performance of the proposed algorithm are effective techniques for enforcing local photometric consistency and global visibility constraints. Simple but effective methods are also proposed to turn the resulting patch model into a mesh, which can be further refined by an algorithm that enforces both photometric consistency and regularization constraints. The proposed approach automatically detects and discards outliers and obstacles, and does not require any initialization in the form of a visual hull, a bounding box, or valid depth ranges.
Introduction

As in the binocular case, most early work in multi-view stereopsis (e.g., ) was aimed at recovering depth maps rather than full object models. Competing approaches mostly differ in the type of optimization techniques that they use, ranging from local methods such as gradient descent [3, 4, 7], level sets [1, 9, 18], or expectation maximization , to global ones such as graph cuts [3, 8, 17, 22, 23].
The variational approach has led to impressive progress, and several of the methods recently surveyed by Seitz et al. achieve remarkably high accuracy. The algorithm proposed in this paper does not perform any smoothing across nearby features, yet is currently the top performer in terms of both coverage and accuracy for four of the six benchmark datasets provided in .
A simple but effective method for turning the resulting patch model into a mesh suitable for image-based modeling is also presented. The proposed approach is applied to three classes of datasets: objects, where a single compact object is usually fully visible in a set of uncluttered images; scenes, where the target structure may be partially occluded and is observed from a restricted set of viewpoints; and crowded scenes, where moving obstacles appear in front of the structure of interest. We will revisit the tradeoffs between computational efficiency and reconstruction quality across these settings.

Figure 1. Overall approach. From left to right: a sample input image; detected features; reconstructed patches after the initial matching; final patches after expansion and filtering; and the polygonal surface extracted from the reconstructed patches.

Object datasets are the ideal input for these algorithms, but methods using multiple depth maps [5, 21] or small, independent surface elements [6, 13] are better suited to the more challenging scene datasets.
Crowded scenes are even more difficult. The method proposed in  uses expectation maximization and multiple depth maps to reconstruct a crowded scene despite the presence of occluders, but it is limited to a small number of images (typically three).
As shown by qualitative and quantitative experiments in the rest of this paper, our algorithm effectively handles all three types of data and, in particular, outputs accurate object and scene models with fine surface detail. As noted earlier, it implements multi-view stereopsis as a simple match, expand, and filter procedure. This approach is similar to the method proposed by Lhuillier and Quan , but their expansion procedure is greedy, while our algorithm iterates between expansion and filtering steps.
Furthermore, their method cannot handle outliers. The same differences hold when comparing the approach of Kushal and Ponce  to ours. In addition, that method can only handle a pair of images at once, while ours processes an arbitrary number of images uniformly.
We also introduce two other fundamental building blocks of our approach, namely the methods used to accurately reconstruct a patch once the corresponding image fragments have been matched, and to determine its visibility.

Patch Models

A patch p is a rectangle with center c(p) and unit normal vector n(p) oriented toward the cameras observing it (Fig. 2). We associate with p a reference image R(p), chosen so that its retinal plane is close to parallel to p, with little distortion.
In turn, R(p) determines the orientation and extent of the rectangle p in the plane orthogonal to n(p): the projection of one of its edges into R(p) is parallel to the image rows, and the smallest axis-aligned square containing its image covers a small, fixed number of pixels (μ × μ, with μ = 5 or 7 in our experiments). Two sets of images are also attached to each patch p: the images S(p) where p should be visible (despite self-occlusion), but may in practice not be recognizable due to highlights, motion blur, etc., and the images T(p) where it is truly found to be visible.
First, we enforce local photometric consistency in the images of T(p). Second, we enforce global visibility consistency by requiring that no patch p be occluded by any other patch in any image in S(p). Each image is also divided into a grid of cells C(i, j); we write Qt(i, j) for the patches projecting into cell C(i, j) of an image where they are truly visible, and Qf(i, j) for those projecting into cells of images where they are only potentially visible. We associate with C(i, j) the depth of the center of the patch in Qt(i, j) closest to the optical center of the corresponding camera. This amounts to attaching a depth map to each image I, which will prove useful in the visibility calculations of Sect. .
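To make the bookkeeping concrete, the patch model and the per-image grid of cells can be sketched as follows. This is a minimal Python illustration; the class names, the dict-based cell storage, and the cell size `beta` are our own assumptions, not the authors' implementation:

```python
from dataclasses import dataclass, field

import numpy as np

@dataclass
class Patch:
    center: np.ndarray   # c(p): 3D position of the patch center
    normal: np.ndarray   # n(p): unit normal, oriented toward the cameras
    ref_image: int       # index of the reference image R(p)
    S: set = field(default_factory=set)  # images where p should be visible
    T: set = field(default_factory=set)  # images where p is truly visible

class CellGrid:
    """Grid of beta x beta pixel cells C(i, j) overlaid on one image; each cell
    tracks the patches projecting into it and the depth of the closest one."""
    def __init__(self, width, height, beta=2):
        self.beta = beta
        self.cols = (width + beta - 1) // beta
        self.rows = (height + beta - 1) // beta
        self.patches = {}  # (i, j) -> list of patches projecting into the cell
        self.depth = {}    # (i, j) -> depth of the patch closest to the camera

    def register(self, patch, x, y, depth):
        # Store the patch in the cell covering pixel (x, y), update depth map.
        key = (int(x) // self.beta, int(y) // self.beta)
        self.patches.setdefault(key, []).append(patch)
        if depth < self.depth.get(key, float("inf")):
            self.depth[key] = depth
```

Registering every reconstructed patch this way yields exactly the kind of per-image depth map mentioned above.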
Concretely, a patch is reconstructed by optimizing its geometric parameters so as to maximize the photometric consistency of its projections. Given a patch p, its reference image R(p), and the set of images T(p) where it is truly visible, we estimate its position c(p) and its surface normal n(p) by maximizing the average NCC score between the patch's projection into R(p) and its projections into the other images of T(p). In the matching phase, T(p) is initialized using photometric consistency alone; in the expansion phase of our algorithm (Sect. ), on the other hand, visibility is initialized from the depth-map information attached to each image.
This process may fail when the reference image R(p) is itself an outlier but, as explained in the next section, our algorithm is designed to handle this problem.
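The average NCC score can be sketched directly, assuming each patch projection has already been sampled into a μ × μ grid of intensities (the function names here are ours):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized intensity grids.
    Returns a value in [-1, 1]; 1 means a perfect photometric match."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def average_ncc(ref_samples, other_samples):
    """Average NCC between the patch's projection into the reference image
    and its projections into the other images where it is truly visible."""
    return float(np.mean([ncc(ref_samples, s) for s in other_samples]))
```

Because NCC is invariant to affine intensity changes, the score tolerates moderate exposure differences between views, which is one reason it is a common choice for photometric consistency.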
Iterating the matching and filtering steps makes the algorithm robust to such failures.

Algorithm

Matching

As the first step, corner and blob features are detected in each image with the Harris and Difference-of-Gaussians (DoG) operators. After these features have been found in each image, they are matched across multiple pictures to reconstruct a sparse set of patches, which are then stored in the grid of cells C(i, j) overlaid on each image (Fig. 3).
More concretely, for each feature f detected in an image, we collect the features of the same type lying near the corresponding epipolar lines in the other images, and triangulate each tentative match into a 3D point. We then consider these points in order of increasing distance from the optical center O as potential patch centers, and return the first patch that can be successfully reconstructed. (Briefly, the response of the Harris filter is computed from the matrix of smoothed products of image derivatives and fires on corners; the response of the DoG filter is the difference between two Gaussian-smoothed versions of the image and fires on blobs.) To simplify computations, we constrain c(p) to lie on the ray joining the optical center of the reference camera to the corresponding image point, reducing the number of degrees of freedom of this optimization problem to three: depth along the ray, plus yaw and pitch angles for n(p). We use a conjugate gradient method  to solve the resulting nonlinear optimization problem.
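The three-parameter formulation can be illustrated as below. This is a sketch: the yaw/pitch convention and the function name are our own choices, and a generic optimizer (e.g., a conjugate gradient routine) would be run over these three variables:

```python
import numpy as np

def params_to_patch(depth, yaw, pitch, cam_center, ray_dir):
    """Map the three optimization variables (depth along the viewing ray,
    plus yaw and pitch angles for the normal) to patch geometry (c_p, n_p)."""
    d = ray_dir / np.linalg.norm(ray_dir)
    center = cam_center + depth * d          # c(p) constrained to the ray
    normal = np.array([                      # unit normal from two angles
        np.cos(pitch) * np.sin(yaw),
        np.sin(pitch),
        np.cos(pitch) * np.cos(yaw),
    ])
    return center, normal
```

Maximizing the average NCC score over (depth, yaw, pitch) through this mapping keeps the search three-dimensional instead of optimizing all six degrees of freedom of a rigid rectangle.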
Simple methods for computing reasonable initial guesses for c(p) and n(p) are given in Sects. .

Enforcing Visibility Consistency

The visibility of each patch p is determined by the images S(p) and T(p) where it is potentially or truly observed. We use two slightly different methods for constructing S(p) and T(p), depending on the stage of our reconstruction algorithm. In the matching phase (Sect. ), where no depth information is available yet, visibility is determined by photometric consistency alone; afterwards, the depth maps attached to the images are used as well. The second part of our algorithm iterates (three times in all our experiments) between an expansion step, to obtain dense patches, and a filtering step, to remove erroneous matches.
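A depth-map visibility test of the kind described above might look like this (hypothetical interface: the patch's depth in each image and the per-cell depth maps are keyed by image id, and the tolerance is an illustrative value):

```python
def compute_S(patch_depths, depth_maps, tol=0.01):
    """Build S(p): keep the images where the patch is not occluded, i.e. its
    depth does not exceed the recorded depth of the closest patch in its
    cell by more than a small relative tolerance."""
    return {
        img for img, d in patch_depths.items()
        if d <= depth_maps.get(img, float("inf")) * (1.0 + tol)
    }
```

An image with no recorded depth yet places no constraint, so the patch is optimistically kept as potentially visible there.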
Expansion

At this stage, we iteratively add new neighbors to existing patches until they cover the surfaces visible in the scene.
Intuitively, two patches p and p′ are considered to be neighbors when they are stored in adjacent cells C(i, j) and C(i′, j′) of the same image I in S(p), and their tangent planes are close to each other. We only attempt to create new neighbors when necessary, that is, when Qt(i′, j′) is empty and none of the elements of Qf(i′, j′) is n-adjacent to p, where two patches p and p′ are said to be n-adjacent when |(c(p) − c(p′)) · n(p)| + |(c(p) − c(p′)) · n(p′)| < 2ρ, for a threshold ρ commensurate with the pixel footprint at the patch.
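The n-adjacency test reduces to a few dot products. In this sketch the threshold `rho` is just a default, whereas in practice it would be derived from the pixel footprint at the patch:

```python
import numpy as np

def n_adjacent(c_p, n_p, c_q, n_q, rho=0.1):
    """Two patches are n-adjacent when each center lies close to the other's
    tangent plane: |(c_p - c_q) . n_p| + |(c_p - c_q) . n_q| < 2 * rho."""
    d = c_p - c_q
    return abs(float(d @ n_p)) + abs(float(d @ n_q)) < 2.0 * rho
```

Note that two coplanar patches with a shared tangent plane pass the test regardless of their in-plane distance; the cell-adjacency requirement supplies the missing proximity constraint.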
When these two conditions are verified, a new patch p′ is initialized from p: c(p′) is obtained by intersecting the viewing ray through the center of cell C(i′, j′) with the plane containing p, while n(p′), R(p′), and S(p′) are copied from p. Next, c(p′) and n(p′) are refined by the optimization procedure described above. Since some matches (and thus the corresponding depth map information) may be incorrect at this point, the elements of T(p′) are added to S(p′) to avoid missing any image where p′ may be visible.
Finally, after updating T(p′) using photometric constraints as in Sect. , the new patch is stored in P and in the corresponding cells of the images where it is visible (see Fig. 4).

Figure 3. Feature matching algorithm. Bottom: the matching procedure. Output: initial sparse set of patches P.

In the matching step itself, after initializing T(p) by using photometric consistency as in Sect. , the patch is refined by the optimization procedure described above. Finally, if p satisfies the visibility and consistency tests, it is accepted and stored in the corresponding cells. Note that since the purpose of this step is only to reconstruct an initial, sparse set of patches, features lying in non-empty cells are skipped for efficiency.
Also note that the patch generation process may fail if the reference image R(p) is an outlier, for example when the feature f corresponds to a highlight.
This does not prevent, however, the reconstruction of the corresponding surface region from other nearby features.

Input: Patches P from the feature matching step. Output: Expanded set of reconstructed patches. Use P to initialize, for each image, Qf, Qt, and its depth map.

Figure 4. Patch expansion algorithm.

Filtering

Filters are applied to the reconstructed patches to remove outliers. We remove p0 as an outlier when the patches in T(p0) are inconsistent with the current visibility information, in particular when p0 is truly visible in too few images. The second filter handles patches occluded by an outlier (the set denoted U in Fig. 5), removing outliers lying outside or inside the correct surface.

Figure 5. Polygonal surface reconstruction. Left: bounding volumes for the dino (visual hull), steps (convex hull), and city-hall (union of hemispheres) datasets featured in Figs. . Right: geometric elements driving the deformation process; outliers lying outside (left) or inside (right) the correct surface; U denotes a set of patches occluded by an outlier. See text for details.

Note that the recomputed values of S(p0) and T(p0) may be different from those obtained in the expansion step, since more patches have been computed after the reconstruction of p0. Finally, we enforce a weak form of regularization as follows: for each patch p, we collect the patches lying in its own and adjacent cells in all images of S(p); if too few of these patches are neighbors of p, p is removed as an outlier.

Polygonal Surface Reconstruction

The reconstructed patches form an oriented point, or surfel, model.
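The weak regularization filter can be sketched as follows. The 0.25 ratio is our reading of the description above and should be treated as an assumption, and `is_neighbor` stands in for the n-adjacency test:

```python
def regularization_filter(patch, nearby, is_neighbor, min_ratio=0.25):
    """Keep a patch only if a minimum fraction of the patches collected from
    its own and adjacent cells (across the images of S(p)) are its geometric
    neighbors; isolated patches are discarded as outliers."""
    if not nearby:
        return False  # no support at all: treat as an outlier
    good = sum(1 for q in nearby if is_neighbor(patch, q))
    return good / len(nearby) >= min_ratio
```

Because the test is a ratio rather than an absolute count, densely and sparsely covered surface regions are filtered with the same leniency.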
Despite the growing popularity of this type of model in the computer graphics community , it remains desirable to turn our collection of patches into surface meshes for image-based modeling applications. The approach that we have adopted is a variant of the iterative deformation algorithm presented in , and consists of two phases.
In both phases, the mesh of a bounding volume is iteratively deformed by moving each vertex according to a data term and a smoothness term; concretely, the smoothness term regularizes the local shape of the surface. In the first phase, the data term attracts each vertex toward the reconstructed patches. In the second phase, the photometric consistency term is computed for each vertex by using the patch optimization routine as follows: at each vertex v, we create a patch p by initializing c(p) with v, n(p) with a surface normal estimated at v on S, and a set of visible images S(p) from a depth-map test on the mesh S at v, then apply the patch optimization routine described in Sect. .
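The smoothness part of such a deformation can be illustrated with one umbrella-operator step. This is a generic Laplacian smoothing sketch, not the paper's exact regularization term; vertex positions are rows of an array and `neighbors` maps a vertex index to its one-ring:

```python
import numpy as np

def laplacian_smoothing_step(vertices, neighbors, step=0.5):
    """Move each listed vertex a fraction `step` of the way toward the
    centroid of its neighbors; combined with a data term pulling vertices
    toward the patches, this regularizes the deforming mesh."""
    out = vertices.copy()
    for v, nbrs in neighbors.items():
        if nbrs:
            centroid = np.mean([vertices[n] for n in nbrs], axis=0)
            out[v] = vertices[v] + step * (centroid - vertices[v])
    return out
```

Alternating such smoothing steps with data-driven displacements is the standard way to keep an iteratively deformed mesh from folding or developing spikes.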
Let c* denote the optimized patch center returned by this routine; the photometric consistency term then moves v toward c*.

Table 1. Characteristics of the datasets used in our experiments. Dataset credits: S. Seitz, B. Curless, J. Diebel, D. Scharstein, and R. Szeliski (temple and dino, see also ); C. Schmitt and the Museum of Cherbourg (polynesian); S.