PyTorchGeoNodes

1 LIGM, Ecole des Ponts, Institut Polytechnique de Paris; 2 Institute for Computer Graphics and Vision, Graz University of Technology

PyTorchGeoNodes is a differentiable module that enables reconstruction of 3D objects and their semantic parameters using interpretable shape programs.

In the teaser video, for an input scene (blue mesh), we reconstruct objects together with their geometry and semantic parameters (shown in the red display). These parameters include semantic properties such as the type of legs, the existence of chair armrests, and the tabletop type, as well as measurements (in meters).

Abstract

We propose PyTorchGeoNodes, a differentiable module for reconstructing 3D objects and their parameters from images using interpretable shape programs. Unlike traditional CAD model retrieval, shape programs allow reasoning about semantic parameters and editing, and they have a low memory footprint. Despite their potential, shape programs for 3D scene understanding have been largely overlooked. Our key contribution is enabling gradient-based optimization by parsing shape programs, or more precisely procedural models designed in Blender, into efficient PyTorch code. While there are many possible applications of PyTorchGeoNodes, we show that combining PyTorchGeoNodes with a genetic algorithm is a method of choice for optimizing both the discrete and the continuous parameters of a shape program for 3D reconstruction and understanding of 3D object parameters. Our modular framework can be further integrated with other reconstruction algorithms, and we demonstrate one such integration to enable procedural Gaussian splatting. Our experiments on the ScanNet dataset show that our method achieves accurate reconstructions while enabling a previously unseen level of 3D scene understanding.

Teaser image.

Given interpretable input shape parameters, a shape program generates a mesh for a 3D object. Our differentiable pipeline lets gradients flow from the generated shape back to the continuous parameters. We show a variety of sofas and armchairs generated by a single shape program. The method we propose can retrieve these shapes using a sequence of RGB-D scans as input, estimating both the continuous parameters (Width, Depth, etc.) and the discrete parameters (Has Left Arm, Is L-Shaped, etc.).

From Blender to PyTorchGeoNodes

Our framework provides computational nodes that reimplement the functionality of the individual geometry nodes in Blender. More precisely, for every node type in Blender, we implement a corresponding node type with the same functionality using PyTorch, or PyTorch3D in the case of geometric operations.

blender2pytorchgeonodes image.

In the visualization above, we show the computational graph designed and visualized in Blender using the Geometry Nodes feature. Underneath, we show how we abstract the different nodes using PyTorch tensors and PyTorch3D meshes. The input node takes the input parameters, here {Width: 0.5, Dividing Board Thickness: 0.04, Height: 0.6, Number of Dividing Boards: 5, Board Thickness: 0.04}, and feeds them to a series of operations. The blue nodes are arithmetic and concatenation nodes, which transform the input parameters and feed the results to the geometry nodes, shown in green. In this example, we generate a cuboid mesh and instantiate it along a line of points, which produces the final geometry of the dividing boards. In practice, this shape program is part of a larger shape program for modeling cabinets.
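The node abstraction described above can be illustrated with a minimal sketch in plain PyTorch. This is not the PyTorchGeoNodes API: the function names (`cuboid_node`, `line_of_points_node`, `instance_on_points_node`, `dividing_boards`), the `depth` value, and the exact placement of the boards are assumptions made for illustration, and only vertices are generated (no faces, no PyTorch3D meshes).

```python
import torch

# Corners of a unit cube centered at the origin, shape (8, 3).
UNIT_CORNERS = torch.tensor(
    [[x, y, z] for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]
)

def cuboid_node(size):
    """Geometry node (hypothetical): scale the unit cube by `size` (3,) -> (8, 3)."""
    return UNIT_CORNERS * size

def line_of_points_node(count, start, end):
    """Geometry node (hypothetical): `count` evenly spaced points, shape (count, 3)."""
    t = torch.linspace(0.0, 1.0, count).unsqueeze(1)
    return start + t * (end - start)

def instance_on_points_node(vertices, points):
    """Geometry node (hypothetical): copy `vertices` to each point -> (P, 8, 3)."""
    return points.unsqueeze(1) + vertices.unsqueeze(0)

def dividing_boards(params, depth=0.4):
    """Toy shape program; `depth` and the board layout are assumptions."""
    width = params["Width"]
    height = params["Height"]
    board = params["Board Thickness"]
    divider = params["Dividing Board Thickness"]
    count = params["Number of Dividing Boards"]  # discrete, plain int

    # Arithmetic nodes: space left inside the outer boards.
    inner_width = width - 2 * board
    inner_height = height - 2 * board
    step = inner_height / (count + 1)

    # Concatenation nodes: pack scalars into vectors.
    zero = torch.zeros(())
    size = torch.stack([inner_width, torch.as_tensor(depth), divider])
    start = torch.stack([zero, zero, board + step])
    end = torch.stack([zero, zero, board + step * count])

    # Geometry nodes: one cuboid instantiated along a line of points.
    points = line_of_points_node(count, start, end)
    return instance_on_points_node(cuboid_node(size), points)

params = {
    "Width": torch.tensor(0.5, requires_grad=True),
    "Height": torch.tensor(0.6, requires_grad=True),
    "Board Thickness": torch.tensor(0.04, requires_grad=True),
    "Dividing Board Thickness": torch.tensor(0.04, requires_grad=True),
    "Number of Dividing Boards": 5,
}
boards = dividing_boards(params)
print(boards.shape)  # torch.Size([5, 8, 3])

# Gradients flow from the generated vertices back to the continuous inputs.
extent = boards[..., 0].max() - boards[..., 0].min()
extent.backward()
print(params["Width"].grad)
```

Because every node is an ordinary tensor operation, autograd tracks the whole graph; the discrete board count only controls how many instances are created and receives no gradient.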

Gradient-based optimization of continuous parameters of a shape program

Starting from an initial estimate of the object's parameters, we can perform gradient descent on these parameters using a 3D geometric loss term. In contrast to methods that directly optimize the reconstructed mesh, PyTorchGeoNodes optimizes in the parameter space, which has several benefits. The resulting shapes in the example below show that individual parameters can be scaled independently, targeting only specific parts of the shape geometry while preserving the compactness of the 3D shape.
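As a rough illustration of optimizing in parameter space, the sketch below fits the three parameters of a toy box to a target point set with plain gradient descent. The shape program (`toy_shape_program`), the Chamfer-style loss, the learning rate, and the step count are all assumptions chosen for this example, not the settings used in the paper.

```python
import torch

# Box corners: x and y centered at the origin, z from the floor up, shape (8, 3).
CORNERS = torch.tensor(
    [[x, y, z] for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (0.0, 1.0)]
)

def toy_shape_program(params):
    """Hypothetical stand-in for a parsed shape program: (width, depth, height) -> (8, 3)."""
    return CORNERS * params

def chamfer(a, b):
    """Symmetric Chamfer distance with squared Euclidean distances."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return d2.min(dim=1).values.mean() + d2.min(dim=0).values.mean()

def fit(target_points, init, steps=100, lr=0.3):
    """Gradient descent on the shape parameters, not on the vertices."""
    params = init.clone().requires_grad_(True)
    optimizer = torch.optim.SGD([params], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = chamfer(toy_shape_program(params), target_points)
        loss.backward()
        optimizer.step()
    return params.detach()

# Target generated from known parameters so the result can be checked.
target = toy_shape_program(torch.tensor([1.2, 0.5, 0.6]))
fitted = fit(target, init=torch.tensor([0.8, 0.8, 0.8]))
print(fitted)
```

Only three numbers are updated, so the output stays a valid box at every step; a per-vertex optimization would have no such guarantee.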

Interactive 3D viewers: Initialization and Final result.

Joint discrete and continuous parameter optimization

At every iteration, our genetic-algorithm-based search generates individuals, each consisting of a set of shape parameters and an object pose. Each individual is passed through the differentiable computational graph that PyTorchGeoNodes parses from the shape program, which generates a 3D mesh for the selected shape parameters. We then compute a loss that evaluates how well the shape fits the input scene. Finally, we further optimize the continuous parameters of the shape program using gradient descent and update the individuals based on this loss.
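The loop described above can be sketched on a toy problem with one discrete parameter and one continuous parameter. Everything here is a simplified assumption: `toy_sofa`, the selection and mutation rules, the population size, and the step sizes are hypothetical, and the object pose is left out for brevity.

```python
import random
import torch

def toy_sofa(width, has_arms):
    """Hypothetical shape program: two seat points, plus two armrest points if enabled."""
    half, zero = width / 2, torch.zeros(())
    pts = [torch.stack([-half, zero, zero]), torch.stack([half, zero, zero])]
    if has_arms:
        arm = torch.tensor(0.3)
        pts += [torch.stack([-half, zero, arm]), torch.stack([half, zero, arm])]
    return torch.stack(pts)

def chamfer(a, b):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return d2.min(dim=1).values.mean() + d2.min(dim=0).values.mean()

def refine(ind, target, steps=20, lr=0.2):
    """Gradient descent on the continuous parameter; the discrete one stays fixed."""
    width = torch.tensor(ind["width"], requires_grad=True)
    for _ in range(steps):
        loss = chamfer(toy_sofa(width, ind["has_arms"]), target)
        (grad,) = torch.autograd.grad(loss, width)
        with torch.no_grad():
            width -= lr * grad
    with torch.no_grad():
        final_loss = chamfer(toy_sofa(width, ind["has_arms"]), target).item()
    return width.item(), final_loss

def search(target, generations=10, pop_size=6, seed=0):
    rng = random.Random(seed)
    population = [
        {"has_arms": rng.random() < 0.5, "width": rng.uniform(0.5, 2.5)}
        for _ in range(pop_size)
    ]
    best = None
    for _ in range(generations):
        # Evaluate every individual after refining its continuous parameter.
        scored = []
        for ind in population:
            width, loss = refine(ind, target)
            scored.append((loss, {"has_arms": ind["has_arms"], "width": width}))
        scored.sort(key=lambda item: item[0])
        if best is None or scored[0][0] < best[0]:
            best = scored[0]
        # Keep the better half, then refill the population with mutated copies.
        parents = [ind for _, ind in scored[: pop_size // 2]]
        population = list(parents)
        while len(population) < pop_size:
            child = dict(rng.choice(parents))
            if rng.random() < 0.3:
                child["has_arms"] = not child["has_arms"]
            child["width"] += rng.gauss(0.0, 0.1)
            population.append(child)
    return best

target = toy_sofa(torch.tensor(1.6), has_arms=True)
best_loss, best_ind = search(target)
print(best_loss, best_ind)
```

The split of work is the point of the sketch: mutation explores the discrete choice, which has no gradient, while gradient descent quickly settles the continuous parameter for each candidate.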

Search overview image.

Demonstrative examples

Six interactive examples, each showing Iteration 1 and the Final result.

Semantic parts segmentation

Segmented partial point clouds.

Our reconstructions can be used to segment semantic parts in partial point clouds. In this application, the PyTorchGeoNodes graph directly assigns each point of the point cloud to a base primitive of our procedural graph based on a simple Chamfer distance.
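One simple way to realize such an assignment is to label each scan point with the primitive whose sampled surface points lie closest to it. The sketch below assumes each base primitive is available as a point set; the function name `segment_points` and the toy table geometry are hypothetical.

```python
import torch

def segment_points(scan_points, primitive_points):
    """Label each scan point with its nearest primitive.

    scan_points: (N, 3) tensor.
    primitive_points: dict mapping a part name to an (M_i, 3) tensor of
    points sampled on that primitive (assumed to be given).
    """
    names = list(primitive_points)
    # Distance from every scan point to the closest point of each primitive: (N, P).
    dists = torch.stack(
        [torch.cdist(scan_points, pts).min(dim=1).values
         for pts in primitive_points.values()],
        dim=1,
    )
    return [names[i] for i in dists.argmin(dim=1).tolist()]

# Toy table: a flat tabletop at z = 0.75 and one vertical leg.
grid = torch.linspace(-0.5, 0.5, 5)
xy = torch.cartesian_prod(grid, grid)                        # (25, 2)
tabletop = torch.cat([xy, torch.full((25, 1), 0.75)], dim=1)
z = torch.linspace(0.0, 0.7, 8).unsqueeze(1)                  # (8, 1)
leg = torch.cat([torch.full((8, 2), 0.4), z], dim=1)

scan = torch.tensor([[0.0, 0.0, 0.76], [0.41, 0.4, 0.3]])
labels = segment_points(scan, {"tabletop": tabletop, "leg": leg})
print(labels)  # ['tabletop', 'leg']
```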

Procedural Gaussian splatting

Reconstructed Gaussians.

We enable procedural Gaussian splats within PyTorchGeoNodes by extending the nodes that create mesh primitives to also create trainable Gaussian parameters. The key difference from regular Gaussian Splatting is that our Gaussians are constrained by the procedural model, which enables interesting properties such as 'cloning' of Gaussians: the Gaussians sampled on 'cloned' base primitives are tied together. For example, because the armrests of a sofa are 'clones' of the same base primitive, the occluded part of one armrest can still be recovered if it is visible on another armrest. (Implementation will be released soon.)
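Since the implementation is not yet released, the following is only a guess at how tied parameters could behave, reduced to Gaussian means. A single set of trainable local means is shared by two clones that differ only by a fixed offset; the loss sees one clone, yet the other follows. The variable names, the pure-translation clone transform, and the squared-error loss are assumptions.

```python
import torch

# Trainable Gaussian means defined once, in the local frame of the base primitive.
local_means = torch.zeros(4, 3, requires_grad=True)

# Fixed placement of the two clones (e.g. left and right armrest); hypothetical values.
offsets = torch.tensor([[-0.8, 0.0, 0.0], [0.8, 0.0, 0.0]])

def world_means():
    """Place the shared local means at every clone -> (2, 4, 3)."""
    return local_means.unsqueeze(0) + offsets.unsqueeze(1)

# Toy observation: only the left clone is visible.
true_local = torch.tensor(
    [[0.00, 0.10, 0.20], [0.05, 0.10, 0.25], [0.00, 0.30, 0.20], [0.05, 0.30, 0.25]]
)
observed_left = true_local + offsets[0]

optimizer = torch.optim.SGD([local_means], lr=0.25)
for _ in range(50):
    optimizer.zero_grad()
    loss = ((world_means()[0] - observed_left) ** 2).sum()  # left clone only
    loss.backward()
    optimizer.step()

# The unobserved right clone has moved too, because it shares the parameters.
right = world_means()[1].detach()
print(right)
```

A full splatting setup would tie scales, rotations, opacities, and colors in the same way and would use a rendering loss instead of direct supervision on the means.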

In the video, we show how reconstructed Gaussians can be edited by modifying parameters of the shape program. As our reconstructions are procedural, all changes in the parameter space are propagated directly to the final Gaussians.

Related research

If you are interested in further reading on this topic, we suggest the following works (current as of the project release).

SceneScript proposes to represent 3D scenes using a structured language and introduces an auto-regressive model to infer scene geometry based on this representation. Interestingly, in the supplementary material, the authors suggest the possibility of incorporating Blender Geometry Nodes into the SceneScript language, which could be a future use case for PyTorchGeoNodes.

GeoCode proposes to use interpretable shape programs for 3D shape synthesis. In addition, this work proposes a deep learning-based method to infer shape parameters from 3D point clouds or 2D sketches.

Our previous work MonteFloor introduces a general method for problems in 3D scene understanding based on joint MCTS and gradient-based optimization. Our HOC-Search applies similar principles for efficient CAD Model and Pose Retrieval from RGB-D Scans.

BibTeX

@article{stekovic2025pytorchgeonodes,
  author    = {Stekovic, Sinisa and Artykov, Arslan and Ainetter, Stefan and D'Urso, Mattia and Fraundorfer, Friedrich},
  title     = {PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction},
  journal   = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2025}
}