In this release, we focused our efforts on improving the quality of Open3D documentation and paving the way for upcoming GPU support.
Documentation is a critical aspect of any software project, and it is especially critical in open-source projects, where it is one of the main ways we engage with the community. For this reason, we improved our internal infrastructure to automatically generate documentation in a new format that makes the Python API more readable and easier to understand. Please take a look at the Open3D docs.
The team has also been working on bringing multi-GPU support to Open3D. We will start rolling this out in upcoming releases. In the meantime, feedback and suggestions are welcome; please check our GPU integration branch here. This release also includes new data types that serve as the foundation for new meshing algorithms that will be rolled out in our next release.
The full list of changes can be seen below. Please send us feedback at info@open3d.org and join our Discord network to participate in the discussions.
Enjoy!
The Open3D team
Legend:
[Added]: Used to indicate addition of new features
[Changed]: Updates of existing functionalities
[Deprecated]: Functionalities / features that will be removed in future releases
[Removed]: Functionalities / features removed in this release
[Fixed]: For any bug fixes
[Breaking]: This functionality breaks the previous API; you may need to update your code
Installation and project structure
[Changed] simplified cmake include directory structure #839
[Changed] new installation default behavior: don’t install 3rd party header except Eigen and GL #840
[Breaking] new project directory structure #842 #850 #855
CORE features and applications
[Added] Travis build docs, use 16.04, and other fixes #885
[Added] update adjacency list after mesh operations #843
[Added] HalfEdgeTriangleMesh data type support #851 #868
[Added] STL file support #786
[Added] Compute vertex adjacency map #830
[Changed] Standardize API of SolveLinearSystemPSD #821
[Changed] upgraded pybind11 #837
[Changed] Upgrade OpenGL GLSL version #854
[Fixed] path in the comments of python_binding.py #878
[Fixed] clang format discrepancy and links #793 #795 #816
[Fixed] autocomplete for python modules #799
[Fixed] intrinsic parameters for Kinect2 #801
[Fixed] initializers for FastGlobalRegistration class #807 #808
[Fixed] Travis fails when unit tests fail in a docker container #810
[Fixed] ColorMap divergence and other issues #819 #860
[Fixed] add minus sign in SolveJacobianSystemAndObtainExtrinsicMatrixArray #822
[Fixed] STL mesh write vertex index #829
Documentation, tutorials, and examples
[Added] pybind docs parser and Google-style docstring generator #864
The Open3D team and the Open Source Vision Foundation (http://www.osvf.org) are excited to announce the 0.5.0 release of the Open3D library.
In this release we show the power of Open3D as a core tool to create machine learning solutions for 3D data. We introduce a re-implementation of the PointNet++ architecture to perform point cloud semantic segmentation using Open3D and TensorFlow. Our Open3D-PointNet++ produces highly accurate results on the Semantic3D benchmark, surpassing those of the original PointNet++ implementation. Even more exciting is that our re-designed Open3D-PointNet++ is able to perform real-time inference (10+ FPS) on the KITTI dataset. We show how to perform training and inference of Open3D-PointNet++ on both Semantic3D and KITTI. Check out this blog post for more information!
We have also added a new VoxelGrid representation and tooling to convert from point clouds to a VoxelGrid structure. This functionality is extremely useful to produce representations that are easier to digest by neural networks.
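For illustration, here is a minimal sketch of this conversion with the current open3d.geometry Python API; the factory-function name may differ slightly in the 0.5.0 release, and the input file is hypothetical.

import open3d

# Minimal sketch: point cloud -> VoxelGrid. "fragment.ply" is a hypothetical
# input file; voxel_size is in the point cloud's units (here, meters).
pcd = open3d.io.read_point_cloud("fragment.ply")
voxel_grid = open3d.geometry.VoxelGrid.create_from_point_cloud(pcd, voxel_size=0.05)
open3d.visualization.draw_geometries([voxel_grid])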
We have also made significant improvements to our internal infrastructure, including a simplified CI testing mechanism via docker images, enhanced test coverage, and easier installation of the library.
The full list of changes can be seen below. Please send us feedback at info@open3d.org and join our Discord network to participate in the discussions.
Enjoy!
The Open3D team
Legend:
[Added]: Used to indicate addition of new features
[Changed]: Updates of existing functionalities
[Deprecated]: Functionalities / features that will be removed in future releases
[Removed]: Functionalities / features removed in this release
[Fixed]: For any bug fixes
[Breaking]: This functionality breaks the previous API; you may need to update your code
Installation and project structure
[Added] docker images for Open3D in dockerhub
[Added] option to disable jupyter build
[Added] new way of detecting conda active environment
[Added] option to link to static Windows runtime
[Changed] 3rdparty folder moved to Open3D-3rdparty repository
[Changed] bug_report.md to improve communication with users when issues are reported
[Fixed] Conda and Pip packaging issues to build platform-specific targets
[Fixed] conda dependency conflicts resulting in forced downgrade
[Fixed] python 2.7 import JVisualizer
[Fixed] Disabled conda executable check (the conda command can be a bash function instead of an executable, which makes CMake complain)
[Fixed] Windows compilation warning with py::ssize_t
[Removed] mac flag -Wno-expansion-to-defined in CI
CORE features and applications
[Added] New Open3D Point cloud semantic segmentation architecture based on PointNet++
[Added] New training code for Point cloud semantic segmentation
[Added] New real-time inference code for Point cloud semantic segmentation
[Added] New compatibility with TensorFlow operators
[Added] New function for building Jacobian matrices that follows RGBDOdometry structure
[Added] Non-rigid optimization for more than 6 variables (6D camera pose + anchor points)
[Added] A new general purpose image processing function: CreateDepthBoundaryMask
[Added] "shift + +/-" key event that can change width of LineSet for the visualization
[Added] line_width in RenderOption and corresponding Python binding (Applies to C++/Python API)
[Added] I/O functions for LineSet
[Added] "lineset" option into ViewGeometry application
[Added] New box primitive
[Added] VoxelGrid structure
[Added] I/O functions for VoxelGrid
[Added] Utility function to transform point clouds to voxels
[Added] new shader to render voxel clouds
[Added] warning output for "degenerate" TriangleMeshes
[Added] Promote compiled extension for PyCharm autocomplete
[Changed] Image class to use namespace directive in order to reduce code line length
[Changed] Image class to remove global variables
[Changed] Image class to shorten local variable names
[Changed] Image class to simplify comparisons using unit_test::ExpectEQ(...)
[Changed] KDTreeFlann class to use namespace directive in order to reduce code line length
[Changed] KDTreeFlann class to simplify comparisons using unit_test::ExpectEQ(...)
[Changed] TriangleMesh class to use namespace directive in order to reduce code line length
[Changed] TriangleMesh class to simplify comparisons using unit_test::ExpectEQ(...)
[Changed] Relative paths in CMake package config
[Changed] Factorization of internal functions in ColormapOptimization module as public functions
[Changed] RGBDImage class to use namespace directive in order to reduce code line length
[Changed] RGBDImage class to simplify comparisons using unit_test::ExpectEQ(...)
[Changed] RGBDImage class to fix Rand float/double to return unscaled values between 0.0 and 1.0
[Changed] PointCloud class to use namespace directive in order to reduce code line length
[Changed] PointCloud class to simplify comparisons using unit_test::ExpectEQ(...)
[Changed] LineSet class to use namespace directive in order to reduce code line length
[Changed] LineSet class to simplify comparisons using unit_test::ExpectEQ(...)
[Changed] Vector3dVector and other Eigen vector bindings to improve performance (speedup of 40-200x)
[Fixed] Bug due to PinholeCameraIntrinsic constructor not initializing member data
[Fixed] Bug in PinholeCameraTrajectory
[Fixed] Bug in ConvertToJsonValue
[Fixed] Bug in ConvertFromJsonValue
[Fixed] Bug in TransformationEstimationPointToPlane::ComputeRMSE
[Fixed] typos in FilePLY.cpp: from ply_poincloud_reader to ply_pointcloud_reader
[Fixed] parameter name of create_window
[Removed] Unneeded std::move calls
Documentation and tutorials
[Added] New tutorial on how to perform real-time PointCloud semantic segmentation using Open3D
[Added] Documentation on supported point cloud formats
Testing and benchmarking
[Added] Test case for IJsonConvertible
[Added] Test case for Core/Utility/Eigen
[Added] Test case for Core/Utility/FileSystem
[Added] Test case for PinholeCameraTrajectory
[Added] Test case for PinholeCameraIntrinsic
[Added] Test case for RGBDOdometryJacobianFromHybridTerm
[Added] Test case for RGBDOdometryJacobianFromColorTerm
[Added] New reference data for RGBDImage based on fixes to Rand float/double
[Added] New utilities for generating input data for the unit tests
[Changed] UnitTest/Utility moved to its own folder
[Changed] unit_test::ExpectEQ to remove unused code
In this post, we walk you through how Open3D can be used to perform real-time semantic segmentation of point clouds for autonomous driving purposes. We demonstrate our results on the KITTI and Semantic3D benchmarks. Please use the following link to access our demo project. See Figure 1 for an example of semantic segmentation of point clouds in the Semantic3D dataset.
Figure 1. Example of PointCloud semantic segmentation. Left, input dense point cloud with RGB information. Right, semantic segmentation prediction map using Open3D-PointNet++.
The main purpose of this project is to showcase how to build a state-of-the-art machine learning pipeline for 3D inference by leveraging the building blocks available in Open3D. For this purpose we have to deal with several stages: 1) pre-processing, 2) custom TensorFlow op integration, 3) post-processing, and 4) visualization. Furthermore, we want to demonstrate how critical the correct design of these modules is to achieving maximum accuracy and run-time performance, and how Open3D can help simplify this process.
Segmenting PointClouds
We based our development on the well-known PointNet++ architecture, following Mathieu Orhan and Guillaume Dekeyser's repo and the original PointNet++ implementation as references. We thank the authors for sharing their methods. Our implementation was rebuilt using Open3D, and we deviated from the reference design when needed in order to improve performance, as described in the following section.
For our experiments we made use of the state-of-the-art Semantic3D and KITTI datasets. Semantic3D provides ground-truth labels for 8 semantic classes: 1) man-made terrain, 2) natural terrain, 3) high vegetation, 4) low vegetation, 5) buildings, 6) remaining hardscape, 7) scanning artifacts, and 8) cars and trucks. The goal of the point cloud classification task is to output per-point class labels given the point cloud.
Figure 3. Semantic 3D snapshot
Figure 4. KITTI snapshot
Since the Semantic3D dataset contains a huge number of points per point cloud (up to 5e8; see the dataset stats), we first run voxel downsampling with Open3D to reduce the dataset size. During both training and inference, PointNet++ is fed fixed-size point clouds cropped within boxes; we set the box size to 60m x 20m x Inf, with the Z-axis allowing all values. During inference on KITTI, we set the region of interest to 30m in front of and behind the car and 10m to the left and right of the car center, to fit the box size. This allows the PointNet++ model to predict just one sample per frame.
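As a rough sketch of this pre-processing, written against the current open3d.geometry Python API (the 0.x releases used slightly different, flat function names; the input path and box center here are hypothetical):

import numpy as np
import open3d

# Voxel-downsample, then crop a 60m x 20m box around a hypothetical center.
# Large finite Z bounds stand in for the "Inf" Z extent described above.
pcd = open3d.io.read_point_cloud("scene.pcd")
pcd = pcd.voxel_down_sample(voxel_size=0.05)

center = np.array([0.0, 0.0, 0.0])
min_bound = center + np.array([-30.0, -10.0, -1000.0])
max_bound = center + np.array([30.0, 10.0, 1000.0])
box = open3d.geometry.AxisAlignedBoundingBox(min_bound, max_bound)
cropped = pcd.crop(box)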
Our semantic segmentation model is trained on the Semantic3D dataset and used to perform inference on both the Semantic3D and KITTI datasets. In this document, we focus on the techniques that enable real-time inference on KITTI.
Accelerating PointNet++ with Open3D-enabled TensorFlow op
In PointNet++'s set abstraction layer, the original points are subsampled, and the features of the subsampled points must be propagated to all of the original points by interpolation (see Section 3.4 of the PointNet++ paper). This is achieved by a 3-nearest-neighbor search, for which the authors provided a simple C++ implementation via a custom TensorFlow op called ThreeNN. However, this op turns out to be the bottleneck of the PointNet++ prediction model.
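To make the interpolation step concrete, here is a brute-force NumPy sketch of the inverse-distance-weighted 3-NN feature propagation; the function and variable names are ours, for illustration, and the actual op performs the neighbor search in C++.

import numpy as np

# Features at the subsampled (sparse) points are spread back to the dense
# points by inverse-distance-weighted interpolation over each dense point's
# 3 nearest sparse neighbors (PointNet++ Section 3.4).
def three_nn_interpolate(dense_xyz, sparse_xyz, sparse_feat):
    # Pairwise squared distances, shape (n_dense, n_sparse).
    d2 = ((dense_xyz[:, None, :] - sparse_xyz[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, :3]                 # 3 nearest sparse points
    dist = np.sqrt(np.take_along_axis(d2, idx, axis=1)) + 1e-8
    weights = 1.0 / dist
    weights /= weights.sum(axis=1, keepdims=True)       # normalize the weights
    return (sparse_feat[idx] * weights[..., None]).sum(axis=1)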
The following benchmark was obtained by running the benchmark script for inference on a batch of 64 samples from the colored Semantic3D dataset. As we can see, the ThreeNN op accounts for 87% of the graph execution time.
// Batch time
Batch size: 64, batch_time: 1.8208365440368652
// Per-op time
node name | total execution time | accelerator execution time | cpu execution time |
ThreeNN 1.73sec (100.00%, 87.61%), 0us (100.00%, 0.00%), 1.73sec (100.00%, 95.87%)
ThreeInterpolate 60.68ms (12.39%, 3.07%), 0us (100.00%, 0.00%), 60.68ms (4.13%, 3.36%)
GroupPoint 27.31ms (9.32%, 1.38%), 27.03ms (100.00%, 15.85%), 275us (0.77%, 0.02%)
Conv2D 26.91ms (7.94%, 1.36%), 23.99ms (84.15%, 14.07%), 2.91ms (0.76%, 0.16%)
Open3D uses FLANN to build KD-trees for fast nearest-neighbor retrieval, which can be used to accelerate the ThreeNN op. This custom TensorFlow op implementation must be linked against both Open3D and the TensorFlow library. To conveniently link the various dependencies, we provide a CMake file that automatically downloads, builds, and links Open3D. Once Open3D is properly installed (in this case automatically), one can simply use Open3D's CMake finder to include headers and link Open3D as follows:
target_include_directories(tf_interpolate PUBLIC ${Open3D_INCLUDE_DIRS})
target_link_libraries(tf_interpolate tensorflow_framework ${Open3D_LIBRARIES})
For more details on how to link a C++ project to Open3D, please see this documentation.
Then, for each target point, we search for the 3 nearest neighbors in the KD-tree:
// For each target point, query its 3 nearest neighbors in the reference KD-tree;
// three_indices and three_dists are std::vector<int> / std::vector<double>.
for (size_t j = 0; j < target_pcd.points_.size(); ++j)
    reference_kd_tree.SearchKNN(target_pcd.points_[j], 3, three_indices, three_dists);
After refactoring ThreeNN with Open3D, we see a ~2x speedup in both the ThreeNN op and the full model run time with batch size 64.
// Batch time
Batch size: 64, batch_time: 0.7777869701385498
// Per-op time
node name | total execution time | accelerator execution time | cpu execution time |
ThreeNN 694.14ms (100.00%, 73.72%), 0us (100.00%, 0.00%), 694.14ms (100.00%, 90.20%)
ThreeInterpolate 62.94ms (26.28%, 6.68%), 0us (100.00%, 0.00%), 62.94ms (9.80%, 8.18%)
GroupPoint 27.18ms (19.60%, 2.89%), 26.90ms (100.00%, 15.63%), 287us (1.62%, 0.04%)
Conv2D 26.39ms (16.71%, 2.80%), 23.83ms (84.37%, 13.85%), 2.56ms (1.58%, 0.33%)
Post-processing: accelerating label interpolation
Since we subsampled the original dataset before feeding points to PointNet++, the network outputs only correspond to a sparse subset of the original point cloud.
Figure 5. Inference on sparse pointcloud (KITTI).
Figure 6. Inference results after interpolation.
The sparse labels need to be interpolated to generate labels for all input points. This interpolation can be achieved with nearest neighbor search using open3d.KDTreeFlann and majority voting, similar to what we did above in the ThreeNN op.
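As an illustration, here is a pure-Python sketch of this majority-vote interpolation. The helper name is hypothetical, and we use the namespaced API of current Open3D releases; the 0.x releases expose the same classes flat, as open3d.PointCloud, open3d.Vector3dVector, and open3d.KDTreeFlann.

import numpy as np
import open3d

# Hypothetical helper: propagate sparse per-point labels onto a dense cloud
# by majority vote among each dense point's knn nearest sparse neighbors.
def interpolate_labels(sparse_points, sparse_labels, dense_points, knn=3):
    sparse_pcd = open3d.geometry.PointCloud()
    sparse_pcd.points = open3d.utility.Vector3dVector(sparse_points)
    kd_tree = open3d.geometry.KDTreeFlann(sparse_pcd)
    dense_labels = np.empty(len(dense_points), dtype=np.int64)
    for i, point in enumerate(dense_points):
        _, idx, _ = kd_tree.search_knn_vector_3d(point, knn)
        votes = np.bincount(sparse_labels[np.asarray(idx)])  # count neighbor labels
        dense_labels[i] = votes.argmax()                     # majority vote
    return dense_labels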
However, doing so in Python could be a major performance hit. We ran the full kitti_predict.py inference on the KITTI dataset as a benchmark: the interpolation step took about 90% of the total run time and slowed the full pipeline down to about 1 FPS.
To address this performance issue, another custom TensorFlow C++ op, InterpolateLabel, was added. The op takes sparse_points, sparse_labels, and dense_points, and outputs dense_labels. OpenMP is used to parallelize the KNN tree search. A dense_colors output was also added to the op to directly output label-colored dense points. Please refer to the source code for details.
Another benefit of this approach is that the full pipeline of prediction and interpolation is now implemented as a single TensorFlow op graph: the TensorFlow session takes in the original dense points and directly returns dense labels and label-colored dense points. This is more modular and efficient than doing the interpolation outside of the TensorFlow graph. After optimization, the end-to-end pipeline achieves an average of 10+ FPS on the KITTI dataset, which is faster than KITTI's capture rate of 10 FPS.
We need to build the TF kernels in tf_ops. First, activate the virtualenv and make sure TF can be found by the current Python. The following line should run without error:
python -c "import tensorflow as tf"
Then build the TF ops. You'll need CUDA and CMake 3.8+.
cd tf_ops
mkdir build
cd build
cmake ..
make
After compilation, the resulting .so files should be in the build directory.
Verify that the TF kernels are working by running:
cd .. # Now we're at Open3D-PointNet2-Semantic3D/tf_ops
python test_tf_ops.py
5. Train
Run
python train.py
By default, the training set will be used for training and the validation set will be used for validation. To train with both the training and validation sets, use the --train_set=train_full flag. Checkpoints will be written to log/semantic.
6. Predict
Pick a checkpoint and run the predict.py script. The prediction dataset is configured with the --set flag. Since PointNet2 only takes a few thousand points per forward pass, we need to sample from the prediction dataset multiple times to get good coverage of the points. Each sample contains the few thousand points required by PointNet2. To specify the number of such samples per scene, use the --num_samples flag, as in the example below.
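For example, a hypothetical invocation that predicts on the validation set using 500 samples per scene (both flag values are illustrative):

python predict.py --set=validation --num_samples=500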