Report

Toru Tamaki, Miho Abe, Bisser Raytchev, Kazufumi Kaneda 19th Nov. 2010 Contribution of this talk Fast GPU implementations of registration algorithms for 3D point sets. Softassign [Gold et al., 1998] EM-ICP [Granger et al., 2002] (Weighted) Horn’s method [Horn, 1987] So, what is “registartion” ? What is “Registration” or “Alignment” ? Image registration A set of images 3D registration algorithm Input Two point sets: and Output Rotation matrix Translation vector and X Y Algorithms for registration Horn’s method ICP (Iterative closest point) • Corresponding point sets are given. • Estimate R and t. • • • • Unknown correspondence. Fast, standard. Easily fail due to local minimum. A lot of variants follow. Registration algorithm Softassign EM-ICP • Unknown correspondence. • Robust. • Very slow because of iterations. • Unknown correspondence. • Robust. • Very slow because of iterations. Algorithms for registration Horn’s method ICP (Iterative closest point) • Corresponding point sets are given. • Estimate R and t. • • • • Unknown correspondence. Fast, standard. Easily fail due to local minimum. A lot of variants follow. Registration algorithm Softassign EM-ICP • Unknown correspondence. • Robust. • Very slow because of iterations. • Unknown correspondence. • Robust. • Very slow because of iterations. Horn’s method: correspondence is known. 1 2 ⋮ 1 2 ⋮ Known correspondence X Y 1 = (1 , 1 , 1 ) Unknown correspondence ? X Y Horn’s method: correspondence is known. 1 2 ⋮ 3 1 2 ⋮ = 4 = − − Compute centers 1 Centering 2 5 Computer 1st Eigenvector : quaternion Convert to = − Algorithms for registration Horn’s method ICP (Iterative closest point) • Corresponding point sets are given. • Estimate R and t. • • • • Unknown correspondence. Fast, standard. Easily fail due to local minimum. A lot of variants follow. Registration algorithm Softassign EM-ICP • Unknown correspondence. • Robust. • Very slow because of iterations. • Unknown correspondence. • Robust. • Very slow because of iterations. ICP: correspondence is unknown. 1 2 ⋮ 1 2 ⋮ ∗ Find closest (nearest) point to 1 in Put the point to ∗ ICP: correspondence is unknown. 1 2 ⋮ 1 2 ⋮ ⋮ Horn’s method with and ∗ ∗ Estimate and Find closest (nearest) point to 1 in Put the point to ∗ ICP: correspondence is unknown. 1 2 ⋮ 1 2 ⋮ Horn’s method with and ∗ ⋮ + ∗ Estimate and Repeat Find closest (nearest) point to 1 in Put the point to ∗ Fast, but easy to fail due to hard correspondence. Algorithms for registration Horn’s method ICP (Iterative closest point) • Corresponding point sets are given. • Estimate R and t. • • • • Unknown correspondence. Fast, standard. Easily fail due to local minimum. A lot of variants follow. Registration algorithm Softassign EM-ICP • Unknown correspondence. • Robust. • Very slow because of iterations. • Unknown correspondence. • Robust. • Very slow because of iterations. Softassign: soft correspondence. Weighted Horn’s method with and Each row and column should be normalized to 1 by Shinkhorn iterations = || − + || Estimate and Repeat Shinkhorn iterations sum up to 1 sum up to 1 sum up to 1 ⋮ Each row and column should be normalized to 1 by Shinkhorn iterations sum up to 1 Repeat row and column normalization until converge. Shinkhorn iterations Each row and column should be normalized to 1 by Shinkhorn iterations sum up to 1 ⋮ sum up to 1 sum up to 1 sum up to 1 Repeat row and column normalization until converge. Shinkhorn.GPU (row normalization) Using sgemv of CUBLAS 1 1 1 ⋮ Each row and column should be normalized to 1 by Shinkhorn iterations Shinkhorn.GPU (row normalization) Using CUDA kernel Row-wise division Each row and column should be normalized to 1 by Shinkhorn iterations Column normalization is done by the same way. Weighted Horn’s method Normal version Weighted version 3 3 = = Using CUBLAS sgemv twice. Centering.GPU (weighted version) CUDA kernel ∗ ∗ 1 1 1 ⋮ CUBLAS sasum Weighted sum CUBLAS sasum ∗ Weighted center Same as for Pipeline of Softassing.GPU Compute with CUDA kernel Shinkhorn.GPU , and Solve Eigenvalue problem = Centering.GPU Weighted Horn’s method Algorithms for registration Horn’s method ICP (Iterative closest point) • Corresponding point sets are given. • Estimate R and t. • • • • Unknown correspondence. Fast, standard. Easily fail due to local minimum. A lot of variants follow. Registration algorithm Softassign EM-ICP • Unknown correspondence. • Robust. • Very slow because of iterations. • Unknown correspondence. • Robust. • Very slow because of iterations. EM-ICP: soft correspondence. Pseudo correspondence ′ Weighted Horn’s method with ′ and ′ ′ Estimate and Each row is normalized once. = || − + || Repeat Row normalization on GPU Using sgemv of CUBLAS 1 1 1 ⋮ Not normalized yet. Row normalization on GPU Using CUDA kernel Row-wise division + sqrt Now normalized. Computing weights Using sgemv of CUBLAS 1 1 1 ⋮ Now normalized. Pseudo correspondence CUBLAS sgemv Now normalized. Centering: same with Softassing.GPU ′ Weighted Horn’s method Weighted version (not efficient) 3 1 = 0 2 ′ 0 ⋱ Weighted version (2 steps) 3 CUDA kernel CUBLAS sgemm ∗ ′ = ’ Pipeline of EM-ICP.GPU Compute with CUDA kernel Row normalization on GPU , and Solve Eigenvalue problem ∗ ′ Centering.GPU = ′ 2 step weighted Horn’s method Computing time over different number of points GPU: GeForce8800GT CPU: Intel Core2 Quad + OpenMP (4 cores) Successfully aligned 5000 points less than 7 seconds. Slightly fast, but failed. Summary Implemented 3D registration algorithms on a GPU are: Softassign, EM-ICP, Weighted Horn’s method. EM-ICP.GPU is able to align 5000 points within 7 seconds, 60 times faster than EM-ICP.CPU, more robust than ICP.CPU. Code, binary, and movies are available at: http://home.hiroshima-u.ac.jp/tamaki/study/cuda_softassign_emicp/ Limitations Number of points Should be less than 8000 for GeForce8800GT with 512MB memory. More memory, more points. Stopping condition requires to store whole matrix or , and compare with previous ones: inefficient. Hence, currently, number of iterations is fixed. Algorithms for registration Horn’s method ICP (Iterative closest point) • Corresponding point sets are given. • Estimate R and t. • • • • Unknown correspondence. Fast, standard. Easily fail due to local minimum. A lot of variants follow. Registration algorithm Softassign EM-ICP • Unknown correspondence. • Robust. • Very slow because of iterations. • Unknown correspondence. • Robust. • Very slow because of iterations.