My Kinect 3D Background and Foreground Subtraction Demo



Background subtraction is a classic computer vision technique used to separate the foreground (the subject of interest) from the background in images or video streams. While this is a well-established task in 2D image processing, accurately distinguishing between foreground and background remains challenging. Our brains can easily interpret depth and context from a flat image, but computers working with only 2D data (limited to color and intensity) often struggle, especially in complex or cluttered scenes.

Why 3D Changes the Game 

The real world is three-dimensional, but most computer vision applications still rely on 2D images due to hardware and computational constraints. The main limitation of 2D images is the absence of explicit depth information. While 2D images may contain visual cues about depth, they cannot provide the actual distance of each pixel from the camera, making precise foreground-background separation difficult. 
By introducing 3D data, specifically the depth of each pixel, background subtraction becomes much more robust. With depth information, it’s possible to segment a scene based on how far objects are from the camera, making it easy to distinguish between myself (as the foreground) and everything else (the background).

Live 3D Reconstruction and Background Subtraction 

In my demonstration, I sit in front of a Kinect sensor, which captures not only the color and intensity of each pixel but also its depth, allowing for a full 3D reconstruction of me in the scene. This 3D information enables much more accurate and intuitive separation of foreground and background. 
Here’s how the process works in my setup: 
  • 3D Reconstruction: The Kinect captures me and reconstructs my 3D representation, assigning a depth value to each pixel. 
  • Depth-Based Segmentation: By analyzing the depth map, the system segments the scene based on distance. Only when I am within a specific distance range from the Kinect (for example, between 1.5 meters and 5 meters) do I appear as the foreground and get reconstructed in 3D (see the short code sketch below).
  • Dynamic Filtering: If I move forward and get too close to the sensor, or if I move backward beyond the set distance range, I am filtered out from the scene. In other words, I am only reconstructed and visible as long as I remain within the defined distance boundaries. 
  • Live Visualization: This process happens in real time, so you can see me being reconstructed in 3D, with the background subtracted, as I move closer or farther from the Kinect. 
This approach not only makes background subtraction more accurate but also enables new possibilities for live video applications, such as virtual backgrounds, privacy filters, or immersive AR experiences.
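
To make the depth-based segmentation step concrete, here is a minimal sketch of the idea (not my actual Kinect pipeline). It assumes you already have a depth map in millimeters aligned to the color image, and it uses the 1.5 m to 5 m range mentioned above:

import numpy as np

def segment_foreground(depth_mm, rgb, near_mm=1500, far_mm=5000):
    # depth_mm: HxW depth map in millimeters (0 where the sensor has no reading)
    # rgb:      HxWx3 color image aligned to the depth map
    # Keep only pixels whose depth lies inside [near_mm, far_mm]; black out the rest.
    mask = (depth_mm >= near_mm) & (depth_mm <= far_mm)
    out = np.zeros_like(rgb)
    out[mask] = rgb[mask]
    return out, mask

Anything closer than 1.5 meters or farther than 5 meters, including a person who steps outside that range, simply disappears from the output, which is exactly the filtering behavior described above.
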
Why This Matters 
Traditional 2D background subtraction methods are sensitive to lighting changes, color similarities, and shadows, often leading to errors in segmentation. By leveraging 3D data, I can overcome these limitations and achieve reliable foreground and background separation, even in challenging environments. If you’d like to see this in action, check out the live capture demo video below, where I am reconstructed in 3D and both background and foreground subtraction are demonstrated in real time as I move within and outside the defined distance range.
Curious about 3D vision or want to learn more about practical computer vision? Let me know in the comments!

ONNX Simplified


If you're working with machine learning models in frameworks like PyTorch or TensorFlow, you've likely heard of ONNX. But what exactly is ONNX, and why should you care about it when you're ready to move from model training to deployment?

Let’s break it down in a way that makes sense, even if you're not a deep learning expert.


What is ONNX?

ONNX stands for Open Neural Network Exchange. It’s an open-source format that allows you to move models between different frameworks and run them efficiently on various hardware. Think of it like exporting your Word document as a PDF so it can be viewed the same way on any device—ONNX lets your trained models be used across platforms regardless of the original training environment.


Why Do You Need ONNX?

When you train a model in TensorFlow or PyTorch, the model is saved in that framework’s own format. To run it elsewhere, you typically need the same framework installed. That can be bulky, complex, or incompatible with your production setup.

ONNX solves this by making your model framework-independent. This means:

  • You don’t need to install heavy ML libraries to run the model.

  • You can deploy the same model across different environments.

  • It’s easier to scale or integrate with production systems.
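
As a concrete illustration of this workflow, here is a minimal sketch of exporting a PyTorch model to ONNX; the model, file name, and input shape below are placeholders, not a recommendation:

import torch
import torchvision

model = torchvision.models.resnet18(weights=None)   # placeholder, untrained model
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)           # example input with the expected shape
torch.onnx.export(
    model,
    dummy_input,
    "resnet18.onnx",                                 # the exported, framework-independent file
    input_names=["input"],
    output_names=["output"],
    opset_version=17,
)

The resulting resnet18.onnx file no longer depends on PyTorch at all; any ONNX-capable runtime can load it.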


Benefits of Converting to ONNX

  1. Framework Freedom: Train in one tool, deploy in another.

  2. Optimized Inference: Use efficient runtimes like ONNX Runtime or TensorRT (a short ONNX Runtime sketch follows this list).

  3. Cross-platform Support: Run on cloud, edge, or mobile devices.

  4. Lower Latency & Memory Usage: Ideal for real-time systems.

  5. Hardware Acceleration: Take advantage of NVIDIA GPUs, Intel chips, and more.
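
Here is the ONNX Runtime sketch referenced in point 2. It assumes the resnet18.onnx file from the export sketch earlier and needs only numpy and onnxruntime, with no training framework installed:

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("resnet18.onnx", providers=["CPUExecutionProvider"])

x = np.random.rand(1, 3, 224, 224).astype(np.float32)   # dummy input for demonstration
outputs = session.run(None, {"input": x})                # None means "return all outputs"
print(outputs[0].shape)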


When Should You Convert to ONNX?

  • When your model is trained and ready for deployment.

  • When your production environment doesn’t support your training framework.

  • When you want to run your model on devices with limited resources.

  • When performance (speed, memory) matters.


What About TensorFlow Models?

TensorFlow models can also be converted to ONNX using tools like tf2onnx. This opens up the same portability and performance benefits:

  • Share models with teams using other frameworks.

  • Use the model on devices that don’t support TensorFlow.

  • Optimize inference using non-TensorFlow runtimes.
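
As a rough sketch (the model and file names are placeholders), a Keras model can be converted either from the command line or directly from Python:

# Command-line conversion of a TensorFlow SavedModel:
#   python -m tf2onnx.convert --saved-model ./my_saved_model --output model.onnx --opset 13

import tensorflow as tf
import tf2onnx

model = tf.keras.applications.MobileNetV2(weights=None)                  # placeholder model
spec = (tf.TensorSpec((None, 224, 224, 3), tf.float32, name="input"),)
onnx_model, _ = tf2onnx.convert.from_keras(model, input_signature=spec,
                                           opset=13, output_path="model.onnx")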


What is TensorRT and How Does It Fit In?

TensorRT is a high-performance inference engine by NVIDIA. It makes deep learning models run faster on NVIDIA GPUs. Here’s what it does:

  • Reduces precision (e.g., from FP32 to FP16 or INT8) to save memory.

  • Optimizes operations for better performance.

  • Speeds up model inference in real-time applications.

Do You Need ONNX for TensorRT?

Not necessarily, but ONNX is the easiest and most versatile way to use TensorRT. TensorRT supports TensorFlow and Caffe as well, but ONNX provides:

  • Simpler integration

  • More framework compatibility

  • Less boilerplate code

So yes, TensorRT can work without ONNX, but ONNX makes it much more convenient and effective.
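
To illustrate that convenience, here is a minimal sketch of two common ONNX-to-TensorRT paths; the file names are placeholders, and an NVIDIA GPU with TensorRT installed (plus the onnxruntime-gpu build for the second option) is assumed:

# Option 1: build a standalone engine with the trtexec tool that ships with TensorRT:
#   trtexec --onnx=resnet18.onnx --saveEngine=resnet18.engine --fp16

# Option 2: let ONNX Runtime delegate supported operators to TensorRT at load time.
import onnxruntime as ort

session = ort.InferenceSession(
    "resnet18.onnx",
    providers=["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"],
)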


In Short

  • ONNX is your universal model format for deployment.

  • It makes AI models portable, fast, and production-ready.

  • It supports both PyTorch and TensorFlow workflows.

  • With ONNX, tools like TensorRT become easily accessible for further performance gains.

Whether you're working with PyTorch or TensorFlow, ONNX helps you bridge the gap between research and real-world use.

That’s all for this post. Thank you for reading! If you found this post helpful or have any questions about ONNX and AI deployment, please leave a comment below. Stay tuned for more insights on making AI easier and more accessible!


Function Overloading Vs Function Overriding in C++

Function Overloading means defining multiple functions with the same name but different signatures (a different number or different types of parameters) in the same scope.

float area(int a);
float area(int a, int b); 

Function Overriding means redefining a base class function in its derived class with the same signature, i.e., the same return type and parameters. The base class function is usually declared virtual so that the derived class version is selected at run time.
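
A minimal example (the class names here are only for illustration); because area is virtual, the derived version is chosen at run time through a base-class pointer or reference:

class Shape {
public:
    virtual float area(int a) { return 0.0f; }                         // base class version
    virtual ~Shape() {}
};

class Square : public Shape {
public:
    float area(int a) override { return static_cast<float>(a * a); }   // same signature, new behavior
};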

Difference between lazy learning and eager learning

Eager Learning vs Lazy Learning
When a machine learning algorithm builds a model soon after receiving the training data set, it is called eager learning. It is called eager because, as soon as it gets the data set, the first thing it does is build the model. After that it can forget the training data; when a new input arrives, it uses the model to evaluate it. Most machine learning algorithms are eager learners. In contrast, a lazy learning algorithm (k-nearest neighbors is the classic example) does not build a model up front: it simply stores the training data and defers the work until a query arrives, comparing the new input against the stored examples at prediction time.
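
A small sketch using scikit-learn (my choice here, not part of the original definitions) shows the contrast: a decision tree is eager and builds its model inside fit(), while k-nearest neighbors is lazy and essentially just stores the data until predict() is called:

from sklearn.tree import DecisionTreeClassifier      # eager: builds the tree at fit() time
from sklearn.neighbors import KNeighborsClassifier   # lazy: fit() mostly stores the data

X_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_train = [0, 1, 1, 0]

eager = DecisionTreeClassifier().fit(X_train, y_train)             # model built now
lazy = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)   # work deferred to predict()

print(eager.predict([[0, 1]]), lazy.predict([[0, 1]]))             # both answer at query time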
