
My Kinect 3D Background and Foreground Subtraction Demo



Background subtraction is a classic computer vision technique used to separate the foreground (the subject of interest) from the background in images or video streams. While this is a well-established task in 2D image processing, accurately distinguishing between foreground and background remains challenging. Our brains can easily interpret depth and context from a flat image, but computers working with only 2D data, limited to color and intensity, often struggle, especially in complex or cluttered scenes.

Why 3D Changes the Game 

The real world is three-dimensional, but most computer vision applications still rely on 2D images due to hardware and computational constraints. The main limitation of 2D images is the absence of explicit depth information. While 2D images may contain visual cues about depth, they cannot provide the actual distance of each pixel from the camera, making precise foreground-background separation difficult. 
By introducing 3D data, specifically the depth of each pixel, background subtraction becomes much more robust. With depth information, it’s possible to segment a scene based on how far objects are from the camera, making it easy to distinguish between myself (as the foreground) and everything else (the background).

Live 3D Reconstruction and Background Subtraction 

In my demonstration, I sit in front of a Kinect sensor, which captures not only the color and intensity of each pixel but also its depth, allowing for a full 3D reconstruction of me in the scene. This 3D information enables much more accurate and intuitive separation of foreground and background. 
Here’s how the process works in my setup: 
  • 3D Reconstruction: The Kinect captures me and reconstructs my 3D representation, assigning a depth value to each pixel. 
  • Depth-Based Segmentation: By analyzing the depth map, the system segments the scene based on distance. Only when I am within a specific distance range from the Kinect (for example, between 1.5 meters and 5 meters) do I appear as the foreground and get reconstructed in 3D (a minimal code sketch of this filter follows after this list). 
  • Dynamic Filtering: If I move forward and get too close to the sensor, or if I move backward beyond the set distance range, I am filtered out from the scene. In other words, I am only reconstructed and visible as long as I remain within the defined distance boundaries. 
  • Live Visualization: This process happens in real time, so you can see me being reconstructed in 3D, with the background subtracted, as I move closer or farther from the Kinect. 
This approach not only makes background subtraction more accurate but also enables new possibilities for live video applications, such as virtual backgrounds, privacy filters, or immersive AR experiences.
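To make the depth-based segmentation step concrete, here is a minimal PCL sketch of the kind of depth filter described above. The function name and setup are illustrative (not the exact code behind my demo), and the 1.5 m to 5 m limits simply match the example range:

```cpp
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/filters/passthrough.h>

// Keep only the points whose depth (Z, metres from the sensor) lies in [min_z, max_z].
// Everything closer or farther away is treated as background and removed.
pcl::PointCloud<pcl::PointXYZRGB>::Ptr
segmentByDepth(const pcl::PointCloud<pcl::PointXYZRGB>::Ptr &cloud,
               float min_z = 1.5f, float max_z = 5.0f)
{
  pcl::PassThrough<pcl::PointXYZRGB> pass;
  pass.setInputCloud(cloud);
  pass.setFilterFieldName("z");        // Kinect depth is measured along the camera's Z axis
  pass.setFilterLimits(min_z, max_z);  // the foreground distance window

  pcl::PointCloud<pcl::PointXYZRGB>::Ptr foreground(new pcl::PointCloud<pcl::PointXYZRGB>);
  pass.filter(*foreground);
  return foreground;
}
```

Running a filter like this on every incoming Kinect frame gives the behaviour described above: as soon as the subject leaves the [min_z, max_z] window, their points drop out of the reconstruction.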
Why This Matters 
Traditional 2D background subtraction methods are sensitive to lighting changes, color similarities, and shadows, often leading to errors in segmentation. By leveraging 3D data, I can overcome these limitations and achieve reliable foreground and background separation, even in challenging environments. If you’d like to see this in action, check out the live capture demo video below, where I am reconstructed in 3D and both background and foreground subtraction are demonstrated in real time as I move within and outside the defined distance range. 
Curious about 3D vision or want to learn more about practical computer vision? Let me know in the comments!

How to search for a given 3D point and its index in PCL point cloud data?

In PCL, the underlying data structure of a point cloud is a vector of user-defined 3D points. Here we consider point cloud data with the point type pcl::PointXYZ. In the source code we create dummy point cloud data using the rand() function. Because the underlying container is a vector, we can use iterators to search for a specific point in the cloud.
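Since the original source code is not reproduced here, the following is a minimal sketch of the idea: fill a cloud with rand()-generated points, then use std::find_if and std::distance to recover an iterator and an index for a query point. For nearest-neighbour (rather than exact-match) search you would normally use pcl::KdTreeFLANN instead.

```cpp
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>

int main()
{
  // Dummy cloud filled with random points (stand-in for the post's rand()-generated data).
  pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>);
  cloud->width  = 100;
  cloud->height = 1;
  cloud->points.resize(cloud->width * cloud->height);
  for (auto &p : cloud->points)
  {
    p.x = 10.0f * rand() / (RAND_MAX + 1.0f);
    p.y = 10.0f * rand() / (RAND_MAX + 1.0f);
    p.z = 10.0f * rand() / (RAND_MAX + 1.0f);
  }

  // The point we want to locate (copied from the cloud so the exact match succeeds).
  pcl::PointXYZ query = cloud->points[42];

  // The cloud stores its points in a std::vector, so std::find_if gives us an iterator,
  // and std::distance converts that iterator into the point's index.
  auto it = std::find_if(cloud->points.begin(), cloud->points.end(),
                         [&query](const pcl::PointXYZ &p)
                         { return p.x == query.x && p.y == query.y && p.z == query.z; });

  if (it != cloud->points.end())
    std::cout << "Found point at index "
              << std::distance(cloud->points.begin(), it) << std::endl;
  else
    std::cout << "Point not found" << std::endl;

  return 0;
}
```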

11 ways to get command on LIDAR Data processing

LIDAR technology has been widely adopted in recent years, and for good reason: it can deliver higher-quality results than traditional photogrammetric techniques at lower cost. But switching to LIDAR is still challenging. A major challenge is the data volume, which keeps growing day by day. So how do you put this massive amount of data to good use? The following are different ways to take charge of LIDAR data.



3D Spline fitting

This post is dedicated to interpolating a spline from 3D point cloud data. In computer graphics, a spline is a smooth curve passing through two or more specified points. If the points are two-dimensional, we can use regression techniques to interpolate the curve. But if the data has more than two dimensions, fitting a spline through it becomes more difficult. In this post we deal with noisy 3D data. 
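As a hedged illustration of what "a smooth curve through 3D points" means, here is a minimal Catmull-Rom spline sketch. This is just one simple interpolating spline, not necessarily the fitting method used in the post; for noisy data you would typically fit a smoothing (least-squares) B-spline rather than forcing the curve through every sample.

```cpp
#include <cstdio>
#include <vector>

struct Point3 { double x, y, z; };

// Evaluate a uniform Catmull-Rom segment between p1 and p2 at parameter t in [0, 1].
// p0 and p3 are the neighbouring control points that shape the tangents.
static Point3 catmullRom(const Point3 &p0, const Point3 &p1,
                         const Point3 &p2, const Point3 &p3, double t)
{
  double t2 = t * t, t3 = t2 * t;
  auto blend = [&](double a, double b, double c, double d)
  {
    return 0.5 * ((2.0 * b) + (-a + c) * t +
                  (2.0 * a - 5.0 * b + 4.0 * c - d) * t2 +
                  (-a + 3.0 * b - 3.0 * c + d) * t3);
  };
  return { blend(p0.x, p1.x, p2.x, p3.x),
           blend(p0.y, p1.y, p2.y, p3.y),
           blend(p0.z, p1.z, p2.z, p3.z) };
}

int main()
{
  // A few 3D control points the curve should pass through (illustrative values).
  std::vector<Point3> pts = { {0,0,0}, {1,2,1}, {2,3,0}, {4,2,2}, {5,0,1} };

  // Sample each segment; the boundary points are reused as tangent helpers at the ends.
  for (std::size_t i = 0; i + 1 < pts.size(); ++i)
  {
    const Point3 &p0 = pts[i == 0 ? 0 : i - 1];
    const Point3 &p1 = pts[i];
    const Point3 &p2 = pts[i + 1];
    const Point3 &p3 = pts[i + 2 < pts.size() ? i + 2 : pts.size() - 1];
    for (double t = 0.0; t < 1.0; t += 0.1)
    {
      Point3 q = catmullRom(p0, p1, p2, p3, t);
      std::printf("%f %f %f\n", q.x, q.y, q.z);
    }
  }
  return 0;
}
```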

Object Detection Using Machine Learning Techniques: A Fast Review

What is object detection?
Predicting the location of an object along with its class is called object detection. So, object detection = locating the object (localization) + classifying the object (classification). In short, classification answers ‘what’, and localization answers ‘where’. 
For example, suppose you have an image and you have to tell whether it contains a cat or not. This is a classification problem, since we are asking ‘what’ the image contains. Outlining the region within the image ‘where’ the cat appears, however, is a localization problem.
For example, in the image below, classification gives us the ‘elephant’ class.
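To make the "localization + classification" split concrete, here is a tiny illustrative C++ sketch. The type names are hypothetical and not tied to any particular detection framework: a detector's output is simply a bounding box (where) paired with a class label and a confidence score (what).

```cpp
#include <iostream>
#include <string>
#include <vector>

// The "where": a bounding box answers the localization question.
struct BoundingBox {
  int x, y;           // top-left corner, in pixels
  int width, height;  // box size, in pixels
};

// The "what": a detection pairs the box with a class label and a confidence score.
struct Detection {
  BoundingBox box;    // where the object is
  std::string label;  // what the object is, e.g. "cat" or "elephant"
  float score;        // classifier confidence in [0, 1]
};

int main() {
  // An object detector maps an image to a list of such detections.
  std::vector<Detection> detections = { { {120, 80, 200, 150}, "elephant", 0.97f } };
  for (const auto &d : detections)
    std::cout << d.label << " at (" << d.box.x << ", " << d.box.y << "), "
              << d.box.width << "x" << d.box.height
              << ", score " << d.score << "\n";
  return 0;
}
```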

Filter a Point Cloud for a given X, Y or Z value range in CloudCompare

Suppose we have to select and delete points at a specific elevation:

  • First convert the Z coordinate to a scalar field with 'Edit > Scalar fields > Export coordinate(s) to SF'. 
  • Then use 'Edit > Scalar fields > Filter by value' to keep only the points whose scalar value (the Z coordinate here) falls within a specific range.
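If you would rather do the same coordinate-range filtering in code instead of the CloudCompare GUI, here is a minimal sketch using PCL's ConditionalRemoval filter. It assumes the cloud has already been loaded as pcl::PointXYZ, and the elevation limits are just example values:

```cpp
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/filters/conditional_removal.h>

// Keep only the points whose Z coordinate lies inside (z_min, z_max).
pcl::PointCloud<pcl::PointXYZ>::Ptr
filterByElevation(const pcl::PointCloud<pcl::PointXYZ>::Ptr &cloud,
                  float z_min, float z_max)
{
  // Build the condition: z > z_min AND z < z_max.
  pcl::ConditionAnd<pcl::PointXYZ>::Ptr condition(new pcl::ConditionAnd<pcl::PointXYZ>());
  condition->addComparison(pcl::FieldComparison<pcl::PointXYZ>::ConstPtr(
      new pcl::FieldComparison<pcl::PointXYZ>("z", pcl::ComparisonOps::GT, z_min)));
  condition->addComparison(pcl::FieldComparison<pcl::PointXYZ>::ConstPtr(
      new pcl::FieldComparison<pcl::PointXYZ>("z", pcl::ComparisonOps::LT, z_max)));

  // Run the conditional removal filter and return the surviving points.
  pcl::ConditionalRemoval<pcl::PointXYZ> removal;
  removal.setCondition(condition);
  removal.setInputCloud(cloud);

  pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);
  removal.filter(*filtered);
  return filtered;
}
```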

Install Latest PCL-trunk version on Ubuntu 16.04

This post is dedicated to installing the latest (trunk) version of the Point Cloud Library (PCL). PCL is an open-source library of algorithms for point cloud processing and 3D geometry processing, such as occur in three-dimensional computer vision. The library contains algorithms for feature estimation, surface reconstruction, 3D registration, model fitting, and segmentation. Since we want the latest version of PCL, we will compile the latest source code from the official GitHub repository.

Real-time 3D Kinect Point Cloud Viewer.

In this video I render real-time 3D point cloud data captured by a Kinect sensor. The Kinect provides a depth image as well as an RGB image. Using the RGB image and the depth image, I create a point cloud and render it with the Point Cloud Library.
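The post does not include the source, but the core idea can be sketched as follows: back-project each depth pixel into 3D using the pinhole camera model, attach the RGB colour, and hand the resulting cloud to PCL's CloudViewer. The buffer layout and the intrinsic values below are assumptions (typical Kinect v1 defaults), not the exact code behind the video:

```cpp
#include <cstdint>
#include <vector>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/visualization/cloud_viewer.h>

// Convert one registered RGB-D frame into a coloured point cloud.
// depth_mm: width*height depth values in millimetres; rgb: interleaved 8-bit RGB.
pcl::PointCloud<pcl::PointXYZRGB>::Ptr
frameToCloud(const std::vector<uint16_t> &depth_mm,
             const std::vector<uint8_t>  &rgb,
             int width = 640, int height = 480)
{
  // Typical Kinect v1 intrinsics (assumed values; replace with your own calibration).
  const float fx = 525.0f, fy = 525.0f, cx = 319.5f, cy = 239.5f;

  pcl::PointCloud<pcl::PointXYZRGB>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZRGB>);
  cloud->points.reserve(static_cast<std::size_t>(width) * height);

  for (int v = 0; v < height; ++v)
    for (int u = 0; u < width; ++u)
    {
      uint16_t d = depth_mm[v * width + u];
      if (d == 0) continue;                 // 0 means "no depth reading" for this pixel

      pcl::PointXYZRGB p;
      p.z = d * 0.001f;                     // millimetres -> metres
      p.x = (u - cx) * p.z / fx;            // pinhole back-projection
      p.y = (v - cy) * p.z / fy;
      p.r = rgb[3 * (v * width + u) + 0];
      p.g = rgb[3 * (v * width + u) + 1];
      p.b = rgb[3 * (v * width + u) + 2];
      cloud->push_back(p);
    }
  return cloud;
}

// Usage sketch:
//   pcl::visualization::CloudViewer viewer("Kinect cloud");
//   viewer.showCloud(frameToCloud(depth_mm, rgb));   // call once per captured frame
```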

Real-time 3D mapping using Kinect sensor.




Implementation of RTAB-Map

RTAB-Map (Real-Time Appearance-Based Mapping) is an RGB-D, stereo and lidar graph-based SLAM approach built on an incremental appearance-based loop closure detector. The loop closure detector uses a bag-of-words approach to determine how likely it is that a new image comes from a previously visited location or from a new one. When a loop closure hypothesis is accepted, a new constraint is added to the map’s graph, and a graph optimizer then minimizes the errors in the map. A memory management approach limits the number of locations used for loop closure detection and graph optimization, so that real-time constraints are always respected in large-scale environments. RTAB-Map can be used alone with a handheld Kinect, a stereo camera or a 3D lidar for 6DoF mapping, or on a robot equipped with a laser rangefinder for 3DoF mapping.
