The computation of the essential matrix using the five-point algorithm is a staple task that is usually considered solved. However, we show that the algorithm frequently selects erroneous solutions in the presence of noise and outliers. These errors arise when the supporting point correspondences supplied to the algorithm do not adequately cover all essential planes in the scene, leading to ambiguous essential matrix solutions. This is not merely a theoretical problem: such scene conditions often occur in 3D reconstruction of real-world data when fronto-parallel point correspondences, such as points on building facades, are captured but correspondences on obliquely observed planes, such as the ground plane, are missed. To solve this problem, we propose to leverage semantic labelings of image features to guide hypothesis selection in the five-point algorithm. More specifically, we propose a two-stage RANSAC procedure. In the first stage, only features classified as ground points are processed. The resulting inlier ground features are then used in the second stage to score two-view geometry hypotheses that the five-point algorithm generates from samples of non-ground points. Results for scenes with prominent ground regions demonstrate the ability of our approach to recover epipolar geometries that describe the entire scene, rather than only well-sampled scene planes.
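As a rough illustration of the second stage only, the following Python sketch scores essential matrix hypotheses by their support among ground inliers. It rests on several assumptions that the text above does not specify: the ground inliers from the first stage are taken as given, correspondences are pairs of normalized image points, the score is an inlier count under the Sampson distance with an arbitrary threshold, and `solve_five_point` is a hypothetical caller-supplied function that returns candidate 3x3 matrices for a five-correspondence sample.

```python
import random


def sampson_distance(E, x1, x2):
    """First-order geometric error of x2^T E x1 = 0 for normalized points (u, v)."""
    p1 = (x1[0], x1[1], 1.0)
    p2 = (x2[0], x2[1], 1.0)
    # E * p1 and E^T * p2, written out for a 3x3 nested list
    Ex1 = [sum(E[r][c] * p1[c] for c in range(3)) for r in range(3)]
    Etx2 = [sum(E[r][c] * p2[r] for r in range(3)) for c in range(3)]
    residual = sum(p2[r] * Ex1[r] for r in range(3))
    denom = Ex1[0] ** 2 + Ex1[1] ** 2 + Etx2[0] ** 2 + Etx2[1] ** 2
    return residual ** 2 / denom if denom > 0.0 else float("inf")


def ground_support(E, ground_inliers, threshold):
    """Count ground correspondences consistent with E (assumed scoring rule)."""
    return sum(
        1 for x1, x2 in ground_inliers
        if sampson_distance(E, x1, x2) < threshold
    )


def select_hypothesis(non_ground, ground_inliers, solve_five_point,
                      iterations=100, threshold=1e-4, seed=0):
    """Sample non-ground points, keep the hypothesis best supported by ground points.

    solve_five_point is a hypothetical callable: it takes five correspondences
    and returns a list of candidate essential matrices (3x3 nested lists).
    """
    rng = random.Random(seed)
    best_E, best_score = None, -1
    for _ in range(iterations):
        sample = rng.sample(non_ground, 5)
        for E in solve_five_point(sample):
            score = ground_support(E, ground_inliers, threshold)
            if score > best_score:
                best_E, best_score = E, score
    return best_E, best_score
```

In this sketch the non-ground points only generate hypotheses and the ground points only score them, which mirrors the division of roles described above. A practical implementation would likely combine ground and non-ground support and would use an established five-point solver.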