PyTorchGeoNodes

(1) Institute for Computer Graphics and Vision, Graz University of Technology; (2) LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS

PyTorchGeoNodes is a differentiable module for reconstructing 3D objects from images using interpretable shape programs.

Abstract

We propose PyTorchGeoNodes, a differentiable module for reconstructing 3D objects from images using interpretable shape programs. Compared to traditional CAD model retrieval methods, shape programs make it possible to reason about the semantic properties of reconstructed objects, to edit them, and to store them with a low memory footprint. However, the use of shape programs for 3D scene understanding has been largely neglected in past work. As our main contribution, we enable gradient-based optimization by introducing a module that translates shape programs, designed for example in Blender, into efficient PyTorch code. We also provide a method that relies on PyTorchGeoNodes and is inspired by Monte Carlo Tree Search (MCTS) to jointly optimize the discrete and continuous parameters of shape programs and reconstruct 3D objects for input scenes. In our experiments, we apply our algorithm to reconstruct 3D objects in the ScanNet dataset and evaluate our results against CAD model retrieval-based reconstructions. Our experiments indicate that our reconstructions match the input scenes well while enabling semantic reasoning about the reconstructed objects.

Teaser image.

Given interpretable input shape parameters, a shape program generates a mesh for a 3D object. Our differentiable pipeline lets gradients flow from the generated shape back to the continuous parameters. We show a variety of sofas and armchairs generated by a single shape program. The method we propose can retrieve these shapes using a sequence of RGB-D scans as input, estimating both the continuous parameters (Width, Depth, etc.) and the discrete parameters (Has Left Arm, Is L-Shaped, etc.).

From Blender to PyTorchGeoNodes

Our framework provides computational nodes that reimplement the functionality of the individual geometry nodes in Blender. More precisely, for every node type in Blender, we implement a corresponding node type with the same functionality using PyTorch, or PyTorch3D in the case of geometric operations.

blender2pytorchgeonodes image.

The visualization above shows a computational graph designed and visualized in Blender using the Geometry Nodes feature. Underneath, we show how we abstract the different nodes using PyTorch tensors and PyTorch3D meshes. The input node takes the input parameters, here {Width: 0.5, Dividing Board Thickness: 0.04, Height: 0.6, Number of Dividing Boards: 5, Board Thickness: 0.04}, and feeds them to a series of operations. The blue nodes are arithmetic and concatenation nodes, which transform the input parameters and feed the results to the geometry nodes, shown in green. In this example, we generate a cuboid mesh and instantiate a line of points, which together produce the final geometry for the dividing boards. In practice, this shape program is part of a larger shape program for modeling cabinets.
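The idea of evaluating such a node graph can be sketched in a few lines. This is only an illustration under stated assumptions, not the PyTorchGeoNodes API: plain Python numbers and tuples stand in for PyTorch tensors and PyTorch3D meshes so that the sketch runs without dependencies, and the graph layout, the node functions (`cuboid`, `line_of_points`, `instance_on_points`), the `evaluate` helper, and the `Depth` value are all hypothetical.

```python
from itertools import product

def cuboid(size_x, size_y, size_z):
    """Geometry node: 8 corner vertices of a cuboid centred at the origin."""
    return [(sx * size_x / 2, sy * size_y / 2, sz * size_z / 2)
            for sx, sy, sz in product((-1, 1), repeat=3)]

def line_of_points(count, spacing):
    """Geometry node: `count` points stacked along z, `spacing` apart."""
    return [(0.0, 0.0, spacing * (i + 1)) for i in range(count)]

def instance_on_points(vertices, points):
    """Geometry node: place one copy of `vertices` at every point."""
    return [[(vx + px, vy + py, vz + pz) for vx, vy, vz in vertices]
            for px, py, pz in points]

# Hypothetical graph: node name -> (function, names of its inputs).
GRAPH = {
    # arithmetic nodes (blue in the figure)
    "inner_width": (lambda width, board: width - 2 * board,
                    ["Width", "Board Thickness"]),
    "spacing": (lambda height, count: height / (count + 1),
                ["Height", "Number of Dividing Boards"]),
    # geometry nodes (green in the figure)
    "board": (cuboid, ["inner_width", "Depth", "Dividing Board Thickness"]),
    "points": (line_of_points, ["Number of Dividing Boards", "spacing"]),
    "boards": (instance_on_points, ["board", "points"]),
}

def evaluate(graph, params, name, cache=None):
    """Evaluate node `name`, recursively evaluating each input once."""
    cache = {} if cache is None else cache
    if name in params:            # input node: read the parameter directly
        return params[name]
    if name not in cache:
        fn, inputs = graph[name]
        cache[name] = fn(*(evaluate(graph, params, i, cache) for i in inputs))
    return cache[name]

PARAMS = {
    "Width": 0.5, "Dividing Board Thickness": 0.04, "Height": 0.6,
    "Number of Dividing Boards": 5, "Board Thickness": 0.04,
    "Depth": 0.3,  # assumed value, not among the parameters listed above
}

boards = evaluate(GRAPH, PARAMS, "boards")
print(len(boards), "boards,", len(boards[0]), "vertices each")
```

If every node function were written with differentiable tensor operations instead of Python arithmetic, the same traversal would yield a graph through which gradients can flow back to the continuous inputs.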

Gradient-based optimization of continuous parameters of a shape program

Starting from an initial estimate of the object's parameters, we can perform gradient descent on these parameters using a 3D geometric loss term. In contrast to methods that directly optimize the reconstructed mesh, PyTorchGeoNodes optimizes in parameter space, which has several benefits. As the resulting shapes in the example below show, individual parameters can be scaled independently, affecting only specific parts of the shape geometry while preserving the compactness of the 3D shape.
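A minimal sketch of parameter-space optimization follows. It makes strong simplifying assumptions: the "shape program" is a single hypothetical cuboid with parameters [width, depth, height], the loss is a mean squared distance between corresponding vertices rather than a realistic 3D geometric loss, and the gradient is written out by hand via the chain rule so that the sketch needs no autograd library. With PyTorch code, automatic differentiation would supply this gradient instead.

```python
from itertools import product

SIGNS = list(product((-1.0, 1.0), repeat=3))  # corner signs of a cuboid

def cuboid_vertices(size):
    """Toy shape program: size = [width, depth, height] -> 8 corner vertices."""
    return [[s * p / 2 for s, p in zip(sign, size)] for sign in SIGNS]

def loss_and_grad(size, target):
    """Mean squared vertex distance and its gradient w.r.t. the 3 parameters."""
    verts = cuboid_vertices(size)
    n = len(verts)
    loss = 0.0
    grad = [0.0, 0.0, 0.0]
    for sign, v, t in zip(SIGNS, verts, target):
        for k in range(3):
            diff = v[k] - t[k]
            loss += diff * diff / n
            # chain rule: d v[k] / d size[k] = sign[k] / 2
            grad[k] += 2 * diff * (sign[k] / 2) / n
    return loss, grad

target = cuboid_vertices([2.0, 0.9, 0.8])  # hypothetical observed shape
size = [1.0, 1.0, 1.0]                     # initial estimate
for step in range(200):
    loss, grad = loss_and_grad(size, target)
    # Update the three parameters only; vertices are always regenerated
    # from them, so the shape remains a valid cuboid at every step.
    size = [p - 0.5 * g for p, g in zip(size, grad)]

print([round(p, 4) for p in size], loss)
```

Because only three parameters are updated and the vertices are always regenerated from them, each parameter changes its own part of the geometry and the result cannot drift into an irregular mesh, which is the contrast with direct mesh optimization drawn above.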

Interactive 3D viewers: Initialization and Final result.


Joint discrete and continuous parameter optimization

At every iteration, our MCTS-based search algorithm first selects a set of shape parameters and an object pose. These are passed through the differentiable computational graph that our PyTorchGeoNodes framework parses from a shape program, which generates a 3D mesh for the selected shape parameters. We then compute a loss that evaluates how well the shape fits the input scene. Finally, we further optimize the continuous parameters of the shape program using gradient descent and update the search tree based on this loss.
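The loop described above can be sketched on a toy problem. Everything here is a hypothetical stand-in rather than the actual method: the discrete parameters (`has_arms`, `num_seats`), the single continuous parameter (`seat_width`), the two scene measurements in `OBSERVED`, the hand-written gradient, and the UCB1 selection rule are assumptions chosen only to show how tree search over discrete choices can be combined with gradient descent on continuous ones. Object pose is omitted.

```python
import math

# Hypothetical discrete parameters, one tree level each.
CHOICES = [("has_arms", [False, True]), ("num_seats", [1, 2, 3])]
OBSERVED = (2.2, 0.2)  # hypothetical measurements: total width, arm width
ARM_WIDTH = 0.2

def fit_loss(has_arms, num_seats, seat_width):
    """How well the toy shape explains the observed measurements."""
    arm = ARM_WIDTH if has_arms else 0.0
    total = num_seats * seat_width + 2 * arm
    return (total - OBSERVED[0]) ** 2 + (arm - OBSERVED[1]) ** 2

def refine(has_arms, num_seats, seat_width=0.65, steps=100, lr=0.05):
    """Gradient descent on the continuous parameter, discrete ones fixed."""
    arm = ARM_WIDTH if has_arms else 0.0
    for _ in range(steps):
        total = num_seats * seat_width + 2 * arm
        grad = 2 * (total - OBSERVED[0]) * num_seats  # d loss / d seat_width
        # Projected step: keep the seat width within assumed plausible bounds.
        seat_width = min(0.8, max(0.5, seat_width - lr * grad))
    return seat_width, fit_loss(has_arms, num_seats, seat_width)

class Node:
    def __init__(self):
        self.visits = 0
        self.value = 0.0    # sum of rewards seen below this node
        self.children = {}  # chosen option -> Node

def search(iterations=30, c=0.5):
    root, best = Node(), (math.inf, None)
    for _ in range(iterations):
        node, path, assignment = root, [root], {}
        # Selection: pick one option per discrete parameter using UCB1.
        for name, options in CHOICES:
            def ucb(option):
                child = node.children.get(option)
                if child is None or child.visits == 0:
                    return math.inf  # try unvisited options first
                explore = math.sqrt(math.log(node.visits) / child.visits)
                return child.value / child.visits + c * explore
            choice = max(options, key=ucb)
            node = node.children.setdefault(choice, Node())
            path.append(node)
            assignment[name] = choice
        # Evaluation: refine the continuous parameter, then score the fit.
        seat_width, loss = refine(**assignment)
        if loss < best[0]:
            best = (loss, {**assignment, "seat_width": seat_width})
        # Update the search tree along the selected path.
        reward = math.exp(-loss)  # maps the loss to a reward in (0, 1]
        for visited in path:
            visited.visits += 1
            visited.value += reward
    return best

best_loss, best_params = search()
print(best_loss, best_params)
```

In this toy setting the bounds on `seat_width` are what make the discrete choice matter: a two-seat sofa cannot stretch its seats far enough to explain the observed width, so the search tree learns to prefer the three-seat branch.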

Search overview image.

Demonstrative examples

Six interactive examples, each with an iteration viewer (starting at Iteration 1) and a Final result viewer.


Related Research

If you are interested in further reading on this topic, we suggest the following works (as of the time of the project release).

SceneScript proposes to represent 3D scenes using a structured language and introduces an auto-regressive model to infer scene geometry based on this representation. Interestingly, in the supplementary material, the authors suggest the possibility of incorporating Blender Geometry Nodes into the SceneScript language, which could be a future use case for PyTorchGeoNodes.

GeoCode proposes to use interpretable shape programs for 3D shape synthesis. In addition, this work proposes a deep learning-based method to infer shape parameters from 3D point clouds or 2D sketches.

Our previous work MonteFloor introduces a general method for problems in 3D scene understanding based on joint MCTS and gradient-based optimization. Our HOC-Search applies similar principles for efficient CAD Model and Pose Retrieval from RGB-D Scans.

BibTeX

@article{stekovic2024pytorchgeonodes,
  author    = {Stekovic, Sinisa and Ainetter, Stefan and D'Urso, Mattia and Fraundorfer, Friedrich and Lepetit, Vincent},
  title     = {PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction},
  journal   = {arxiv},
  year      = {2024}
}