pytorch3d.structures

pytorch3d.structures.join_meshes_as_batch(meshes: List[Meshes], include_textures: bool = True) Meshes[source]

Merge multiple Meshes objects, i.e. concatenate them into a single batch. They must all be on the same device. If include_textures is True, the meshes must all be compatible: either all or none of them have textures, and all the Textures objects must be of the same type. If include_textures is False, textures are ignored.

If the textures are TexturesAtlas then being the same type includes having the same resolution. If they are TexturesUV then it includes having the same align_corners and padding_mode.

Parameters:
  • meshes – list of meshes.

  • include_textures – (bool) whether to try to join the textures.

Returns:

new Meshes object containing all the meshes from all the inputs.

pytorch3d.structures.join_meshes_as_scene(meshes: Meshes | List[Meshes], include_textures: bool = True) Meshes[source]

Joins a batch of meshes in the form of a Meshes object or a list of Meshes objects as a single mesh. If the input is a list, the Meshes objects in the list must all be on the same device. Unless include_textures is False, the meshes must all have the same type of texture or must all not have textures.

If textures are included, then the textures are joined as a single scene in addition to the meshes. For this, texture types have an appropriate method called join_scene which joins mesh textures into a single texture. If the textures are TexturesAtlas then they must have the same resolution. If they are TexturesUV then they must have the same align_corners and padding_mode. Values in verts_uvs outside [0, 1] will not be respected.

Parameters:
  • meshes – Meshes object that contains a batch of meshes, or a list of Meshes objects.

  • include_textures – (bool) whether to try to join the textures.

Returns:

new Meshes object containing a single mesh

class pytorch3d.structures.Meshes(verts, faces, textures=None, *, verts_normals=None)[source]

This class provides functions for working with batches of triangulated meshes with varying numbers of faces and vertices, and converting between representations.

Within Meshes, there are three different representations of the faces and verts data:

List
  • only used for input as a starting point to convert to other representations.

Padded
  • has specific batch dimension.

Packed
  • no batch dimension.

  • has auxiliary variables used to index into the padded representation.

Example:

Input list of verts V_n = [[V_1], [V_2], … , [V_N]] where V_1, … , V_N are the number of verts in each mesh and N is the number of meshes.

Input list of faces F_n = [[F_1], [F_2], … , [F_N]] where F_1, … , F_N are the number of faces in each mesh.

From the faces, edges are computed and have packed and padded representations with auxiliary variables.

E_n = [[E_1], … , [E_N]] where E_1, … , E_N are the number of unique edges in each mesh. Total number of unique edges = sum(E_n)

__init__(verts, faces, textures=None, *, verts_normals=None) None[source]
Parameters:
  • verts

    Can be either

    • List where each element is a tensor of shape (num_verts, 3) containing the (x, y, z) coordinates of each vertex.

    • Padded float tensor with shape (num_meshes, max_num_verts, 3). Meshes should be padded with fill value of 0 so they all have the same number of vertices.

  • faces

    Can be either

    • List where each element is a tensor of shape (num_faces, 3) containing the indices of the 3 vertices in the corresponding mesh in verts which form the triangular face.

    • Padded long tensor of shape (num_meshes, max_num_faces, 3). Meshes should be padded with fill value of -1 so they have the same number of faces.

  • textures – Optional instance of the Textures class with mesh texture properties.

  • verts_normals

    Optional. Can be either

    • List where each element is a tensor of shape (num_verts, 3) containing the normals of each vertex.

    • Padded float tensor with shape (num_meshes, max_num_verts, 3). They should be padded with fill value of 0 so they all have the same number of vertices.

    Note that modifying the mesh later, e.g. with offset_verts_, can cause these normals to be forgotten and normals to be recalculated based on the new vertex positions.

Refer to comments above for descriptions of List and Padded representations.

__getitem__(index: int | List[int] | slice | BoolTensor | LongTensor) Meshes[source]
Parameters:

index – Specifying the index of the mesh to retrieve. Can be an int, slice, list of ints or a boolean tensor.

Returns:

Meshes object with selected meshes. The mesh tensors are not cloned.

isempty() bool[source]

Checks whether the batch contains any valid mesh.

Returns:

bool; True if there is no data (the structure is empty).

verts_list()[source]

Get the list representation of the vertices.

Returns:

list of tensors of vertices of shape (V_n, 3).

faces_list()[source]

Get the list representation of the faces.

Returns:

list of tensors of faces of shape (F_n, 3).

verts_packed()[source]

Get the packed representation of the vertices.

Returns:

tensor of vertices of shape (sum(V_n), 3).

verts_packed_to_mesh_idx()[source]

Return a 1D tensor with the same first dimension as verts_packed. verts_packed_to_mesh_idx[i] gives the index of the mesh which contains verts_packed[i].

Returns:

1D tensor of indices.

mesh_to_verts_packed_first_idx()[source]

Return a 1D tensor x with length equal to the number of meshes such that the first vertex of the ith mesh is verts_packed[x[i]].

Returns:

1D tensor of indices of first items.

num_verts_per_mesh()[source]

Return a 1D tensor x with length equal to the number of meshes giving the number of vertices in each mesh.

Returns:

1D tensor of sizes.

faces_packed()[source]

Get the packed representation of the faces. Faces are given by the indices of the three vertices in verts_packed.

Returns:

tensor of faces of shape (sum(F_n), 3).

faces_packed_to_mesh_idx()[source]

Return a 1D tensor with the same first dimension as faces_packed. faces_packed_to_mesh_idx[i] gives the index of the mesh which contains faces_packed[i].

Returns:

1D tensor of indices.

mesh_to_faces_packed_first_idx()[source]

Return a 1D tensor x with length equal to the number of meshes such that the first face of the ith mesh is faces_packed[x[i]].

Returns:

1D tensor of indices of first items.

verts_padded()[source]

Get the padded representation of the vertices.

Returns:

tensor of vertices of shape (N, max(V_n), 3).

faces_padded()[source]

Get the padded representation of the faces.

Returns:

tensor of faces of shape (N, max(F_n), 3).

num_faces_per_mesh()[source]

Return a 1D tensor x with length equal to the number of meshes giving the number of faces in each mesh.

Returns:

1D tensor of sizes.

edges_packed()[source]

Get the packed representation of the edges.

Returns:

tensor of edges of shape (sum(E_n), 2).

edges_packed_to_mesh_idx()[source]

Return a 1D tensor with the same first dimension as edges_packed. edges_packed_to_mesh_idx[i] gives the index of the mesh which contains edges_packed[i].

Returns:

1D tensor of indices.

mesh_to_edges_packed_first_idx()[source]

Return a 1D tensor x with length equal to the number of meshes such that the first edge of the ith mesh is edges_packed[x[i]].

Returns:

1D tensor of indices of first items.

faces_packed_to_edges_packed()[source]

Get the packed representation of the faces in terms of edges. Faces are given by the indices of the three edges in the packed representation of the edges.

Returns:

tensor of faces of shape (sum(F_n), 3).

num_edges_per_mesh()[source]

Return a 1D tensor x with length equal to the number of meshes giving the number of edges in each mesh.

Returns:

1D tensor of sizes.

verts_padded_to_packed_idx()[source]

Return a 1D tensor x with length equal to the total number of vertices such that verts_packed()[i] is element x[i] of the flattened padded representation. The packed representation can be calculated as follows.

p = verts_padded().reshape(-1, 3)
verts_packed = p[x]
Returns:

1D tensor of indices.

has_verts_normals() bool[source]

Check whether vertex normals are already present.

verts_normals_packed()[source]

Get the packed representation of the vertex normals.

Returns:

tensor of normals of shape (sum(V_n), 3).

verts_normals_list()[source]

Get the list representation of the vertex normals.

Returns:

list of tensors of normals of shape (V_n, 3).

verts_normals_padded()[source]

Get the padded representation of the vertex normals.

Returns:

tensor of normals of shape (N, max(V_n), 3).

faces_normals_packed()[source]

Get the packed representation of the face normals.

Returns:

tensor of normals of shape (sum(F_n), 3).

faces_normals_list()[source]

Get the list representation of the face normals.

Returns:

list of tensors of normals of shape (F_n, 3).

faces_normals_padded()[source]

Get the padded representation of the face normals.

Returns:

tensor of normals of shape (N, max(F_n), 3).

faces_areas_packed()[source]

Get the packed representation of the face areas.

Returns:

tensor of areas of shape (sum(F_n),).

laplacian_packed()[source]
clone()[source]

Deep copy of Meshes object. All internal tensors are cloned individually.

Returns:

new Meshes object.

detach()[source]

Detach Meshes object. All internal tensors are detached individually.

Returns:

new Meshes object.

to(device: str | device, copy: bool = False)[source]

Match functionality of torch.Tensor.to(). If copy = True or self is on a different device, the returned Meshes is a copy of self on the desired torch.device. If copy = False and self already has the correct torch.device, then self is returned.

Parameters:
  • device – Device (as str or torch.device) for the new tensor.

  • copy – Boolean indicator whether or not to clone self. Default False.

Returns:

Meshes object.

cpu()[source]
cuda()[source]
get_mesh_verts_faces(index: int)[source]

Get tensors for a single mesh from the list representation.

Parameters:

index – Integer in the range [0, N).

Returns:

verts – Tensor of shape (V, 3). faces – LongTensor of shape (F, 3).

split(split_sizes: list)[source]

Splits Meshes object of size N into a list of Meshes objects of size len(split_sizes), where the i-th Meshes object is of size split_sizes[i]. Similar to torch.split().

Parameters:

split_sizes – List of integer sizes of Meshes objects to be returned.

Returns:

list[Meshes].

offset_verts_(vert_offsets_packed)[source]

Add an offset to the vertices of this Meshes. In place operation. If normals are present they may be recalculated.

Parameters:

vert_offsets_packed – A Tensor of shape (3,) or the same shape as self.verts_packed, giving offsets to be added to all vertices.

Returns:

self.

offset_verts(vert_offsets_packed)[source]

Out of place offset_verts.

Parameters:

vert_offsets_packed – A Tensor of the same shape as self.verts_packed giving offsets to be added to all vertices.

Returns:

new Meshes object.

scale_verts_(scale)[source]

Multiply the vertices of this Meshes object by a scalar value. In place operation.

Parameters:

scale – A scalar, or a Tensor of shape (N,).

Returns:

self.

scale_verts(scale)[source]

Out of place scale_verts.

Parameters:

scale – A scalar, or a Tensor of shape (N,).

Returns:

new Meshes object.

update_padded(new_verts_padded)[source]

This function allows for an update of verts_padded without having to explicitly convert it to the list representation for heterogeneous batches. Returns a Meshes structure with updated padded tensors and copies of the auxiliary tensors at construction time. It updates self._verts_padded with new_verts_padded, and does a shallow copy of (faces_padded, faces_list, num_verts_per_mesh, num_faces_per_mesh). If packed representations are computed in self, they are updated as well.

Parameters:

new_verts_padded – FloatTensor of shape (N, V, 3)

Returns:

Meshes with updated padded representations

get_bounding_boxes()[source]

Compute an axis-aligned bounding box for each mesh in this Meshes object.

Returns:

bboxes – Tensor of shape (N, 3, 2) where bbox[i, j] gives the min and max values of mesh i along the jth coordinate axis.

extend(N: int)[source]

Create a new Meshes object which contains each input mesh N times.

Parameters:

N – number of new copies of each mesh.

Returns:

new Meshes object.

sample_textures(fragments)[source]
submeshes(face_indices: List[List[LongTensor]] | List[LongTensor] | LongTensor) Meshes[source]

Split a batch of meshes into a batch of submeshes.

The return value is a Meshes object representing

[mesh restricted to only the faces indexed by selected_faces
 for mesh, selected_faces_list in zip(self, face_indices)
 for selected_faces in selected_faces_list]

Parameters:

face_indices

Let the original mesh have verts_list() of length N. Can be either

  • List of lists of LongTensors. The n-th element is a list of length num_submeshes_n (empty lists are allowed). The k-th element of the n-th sublist is a LongTensor of length num_faces_submesh_n_k.

  • List of LongTensors. The n-th element is a (possibly empty) LongTensor of shape (num_submeshes_n, num_faces_n).

  • A LongTensor of shape (N, num_submeshes_per_mesh, num_faces_per_submesh) where all meshes in the batch will have the same number of submeshes. This will result in an output Meshes object with batch size equal to N * num_submeshes_per_mesh.

Returns:

Meshes object of length sum(len(ids) for ids in face_indices).

Example 1:

If meshes has batch size 1, and face_indices is a 1D LongTensor, then meshes.submeshes([[face_indices]]) and meshes.submeshes(face_indices[None, None]) both produce a Meshes of length 1, containing a single submesh with a subset of meshes’ faces, whose indices are specified by face_indices.

Example 2:

Take a Meshes object cubes with 4 meshes, each a translated cube. Then:
  • len(cubes) is 4, len(cubes.verts_list()) is 4, and len(cubes.faces_list()) is 4,

  • [cube_verts.shape[0] for cube_verts in cubes.verts_list()] is [8, 8, 8, 8],

  • [cube_faces.shape[0] for cube_faces in cubes.faces_list()] is [12, 12, 12, 12],

Now let front_facet, top_and_bottom, and all_facets be LongTensors of sizes (2,), (4,), and (12,), each selecting a number of facets of a cube by listing the indices of the appropriate triangular faces.

Then let subcubes = cubes.submeshes([[front_facet, top_and_bottom], [], [all_facets], []]).

  • len(subcubes) is 3.

  • subcubes[0] is the front facet of the cube contained in cubes[0].

  • subcubes[1] is a mesh containing the (disconnected) top and bottom facets of cubes[0].

  • subcubes[2] is cubes[2].

  • There are no submeshes of cubes[1] and cubes[3] in subcubes.

  • subcubes[0] and subcubes[1] are not watertight. subcubes[2] is.

pytorch3d.structures.join_pointclouds_as_batch(pointclouds: Sequence[Pointclouds]) Pointclouds[source]

Merge a list of Pointclouds objects into a single batched Pointclouds object. All pointclouds must be on the same device.

Parameters:

batch – List of Pointclouds objects each with batch dim [b1, b2, …, bN]

Returns:

pointcloud – Pointclouds object with all input pointclouds collated into a single object with batch dim = sum(b1, b2, …, bN)

pytorch3d.structures.join_pointclouds_as_scene(pointclouds: Pointclouds | List[Pointclouds]) Pointclouds[source]

Joins a batch of point clouds in the form of a Pointclouds object or a list of Pointclouds objects as a single point cloud. If the input is a list, the Pointclouds objects in the list must all be on the same device, and they must either all have or all lack features, and either all have or all lack normals.

Parameters:

pointclouds – Pointclouds object that contains a batch of point clouds, or a list of Pointclouds objects.

Returns:

new Pointclouds object containing a single point cloud

class pytorch3d.structures.Pointclouds(points, normals=None, features=None)[source]

This class provides functions for working with batches of 3d point clouds, and converting between representations.

Within Pointclouds, there are three different representations of the data.

List
  • only used for input as a starting point to convert to other representations.

Padded
  • has specific batch dimension.

Packed
  • no batch dimension.

  • has auxiliary variables used to index into the padded representation.

Example

Input list of points = [[P_1], [P_2], … , [P_N]] where P_1, … , P_N are the number of points in each cloud and N is the number of clouds.

__init__(points, normals=None, features=None) None[source]
Parameters:
  • points

    Can be either

    • List where each element is a tensor of shape (num_points, 3) containing the (x, y, z) coordinates of each point.

    • Padded float tensor with shape (num_clouds, num_points, 3).

  • normals

    Can be either

    • None

    • List where each element is a tensor of shape (num_points, 3) containing the normal vector for each point.

    • Padded float tensor of shape (num_clouds, num_points, 3).

  • features

    Can be either

    • None

    • List where each element is a tensor of shape (num_points, C) containing the features for the points in the cloud.

    • Padded float tensor of shape (num_clouds, num_points, C).

    where C is the number of channels in the features. For example 3 for RGB color.

Refer to comments above for descriptions of List and Padded representations.

__getitem__(index: int | List[int] | slice | BoolTensor | LongTensor) Pointclouds[source]
Parameters:

index – Specifying the index of the cloud to retrieve. Can be an int, slice, list of ints or a boolean tensor.

Returns:

Pointclouds object with selected clouds. The tensors are not cloned.

isempty() bool[source]

Checks whether the batch contains any valid cloud.

Returns:

bool; True if there is no data (the structure is empty).

points_list() List[Tensor][source]

Get the list representation of the points.

Returns:

list of tensors of points of shape (P_n, 3).

normals_list() List[Tensor] | None[source]

Get the list representation of the normals, or None if there are no normals.

Returns:

list of tensors of normals of shape (P_n, 3).

features_list() List[Tensor] | None[source]

Get the list representation of the features, or None if there are no features.

Returns:

list of tensors of features of shape (P_n, C).

points_packed() Tensor[source]

Get the packed representation of the points.

Returns:

tensor of points of shape (sum(P_n), 3).

normals_packed() Tensor | None[source]

Get the packed representation of the normals.

Returns:

tensor of normals of shape (sum(P_n), 3), or None if there are no normals.

features_packed() Tensor | None[source]

Get the packed representation of the features.

Returns:

tensor of features of shape (sum(P_n), C), or None if there are no features

packed_to_cloud_idx()[source]

Return a 1D tensor x with length equal to the total number of points. packed_to_cloud_idx()[i] gives the index of the cloud which contains points_packed()[i].

Returns:

1D tensor of indices.

cloud_to_packed_first_idx()[source]

Return a 1D tensor x with length equal to the number of clouds such that the first point of the ith cloud is points_packed[x[i]].

Returns:

1D tensor of indices of first items.

num_points_per_cloud() Tensor[source]

Return a 1D tensor x with length equal to the number of clouds giving the number of points in each cloud.

Returns:

1D tensor of sizes.

points_padded() Tensor[source]

Get the padded representation of the points.

Returns:

tensor of points of shape (N, max(P_n), 3).

normals_padded() Tensor | None[source]

Get the padded representation of the normals, or None if there are no normals.

Returns:

tensor of normals of shape (N, max(P_n), 3).

features_padded() Tensor | None[source]

Get the padded representation of the features, or None if there are no features.

Returns:

tensor of features of shape (N, max(P_n), C).

padded_to_packed_idx()[source]

Return a 1D tensor x with length equal to the total number of points such that points_packed()[i] is element x[i] of the flattened padded representation. The packed representation can be calculated as follows.

p = points_padded().reshape(-1, 3)
points_packed = p[x]
Returns:

1D tensor of indices.

clone()[source]

Deep copy of Pointclouds object. All internal tensors are cloned individually.

Returns:

new Pointclouds object.

detach()[source]

Detach Pointclouds object. All internal tensors are detached individually.

Returns:

new Pointclouds object.

to(device: str | device, copy: bool = False)[source]

Match functionality of torch.Tensor.to(). If copy = True or self is on a different device, the returned Pointclouds is a copy of self on the desired torch.device. If copy = False and self already has the correct torch.device, then self is returned.

Parameters:
  • device – Device (as str or torch.device) for the new tensor.

  • copy – Boolean indicator whether or not to clone self. Default False.

Returns:

Pointclouds object.

cpu()[source]
cuda()[source]
get_cloud(index: int)[source]

Get tensors for a single cloud from the list representation.

Parameters:

index – Integer in the range [0, N).

Returns:

points – Tensor of shape (P, 3). normals – Tensor of shape (P, 3). features – Tensor of shape (P, C).

split(split_sizes: list)[source]

Splits Pointclouds object of size N into a list of Pointclouds objects of size len(split_sizes), where the i-th Pointclouds object is of size split_sizes[i]. Similar to torch.split().

Parameters:
split_sizes – List of integer sizes of Pointclouds objects to be returned.

Returns:

list[Pointclouds].

offset_(offsets_packed)[source]

Translate the point clouds by an offset. In place operation.

Parameters:

offsets_packed – A Tensor of shape (3,) or the same shape as self.points_packed giving offsets to be added to all points.

Returns:

self.

offset(offsets_packed)[source]

Out of place offset.

Parameters:

offsets_packed – A Tensor of the same shape as self.points_packed giving offsets to be added to all points.

Returns:

new Pointclouds object.

subsample(max_points: int | Sequence[int]) Pointclouds[source]

Subsample each cloud so that it has at most max_points points.

Parameters:

max_points – maximum number of points in each cloud.

Returns:

new Pointclouds object, or self if nothing to be done.

scale_(scale)[source]

Multiply the coordinates of this object by a scalar value (i.e. enlarge/dilate). In place operation.

Parameters:

scale – A scalar, or a Tensor of shape (N,).

Returns:

self.

scale(scale)[source]

Out of place scale_.

Parameters:

scale – A scalar, or a Tensor of shape (N,).

Returns:

new Pointclouds object.

get_bounding_boxes()[source]

Compute an axis-aligned bounding box for each cloud.

Returns:

bboxes – Tensor of shape (N, 3, 2) where bbox[i, j] gives the min and max values of cloud i along the jth coordinate axis.

estimate_normals(neighborhood_size: int = 50, disambiguate_directions: bool = True, assign_to_self: bool = False)[source]

Estimates the normals of each point in each cloud and assigns them to the internal tensors self._normals_list and self._normals_padded.

The function uses ops.estimate_pointcloud_local_coord_frames to estimate the normals. Please refer to that function for more detailed information about the implemented algorithm.

Parameters:
  • neighborhood_size – The size of the neighborhood used to estimate the geometry around each point.

  • disambiguate_directions – If True, uses the algorithm from [1] to ensure sign consistency of the normals of neighboring points.

  • assign_to_self – If True, assigns the computed normals to the internal buffers, overwriting any previously stored normals.

Returns:

normals – A tensor of normals for each input point of shape (minibatch, num_point, 3). If pointclouds are of Pointclouds class, returns a padded tensor.

References

[1] Tombari, Salti, Di Stefano: Unique Signatures of Histograms for Local Surface Description, ECCV 2010.

extend(N: int)[source]

Create new Pointclouds which contains each cloud N times.

Parameters:

N – number of new copies of each cloud.

Returns:

new Pointclouds object.

update_padded(new_points_padded, new_normals_padded=None, new_features_padded=None)[source]

Returns a Pointcloud structure with updated padded tensors and copies of the auxiliary tensors. This function allows for an update of points_padded (and normals and features) without having to explicitly convert it to the list representation for heterogeneous batches.

Parameters:
  • new_points_padded – FloatTensor of shape (N, P, 3)

  • new_normals_padded – (optional) FloatTensor of shape (N, P, 3)

  • new_features_padded – (optional) FloatTensor of shape (N, P, C)

Returns:

Pointcloud with updated padded representations

inside_box(box)[source]

Finds the points inside a 3D box.

Parameters:

box

FloatTensor of shape (2, 3) or (N, 2, 3) where N is the number of clouds.

box[…, 0, :] gives the min x, y & z. box[…, 1, :] gives the max x, y & z.

Returns:

idx – BoolTensor of length sum(P_i) indicating whether the packed points are within the input box.

pytorch3d.structures.list_to_packed(x: List[Tensor])[source]

Transforms a list of N tensors each of shape (Mi, K, …) into a single tensor of shape (sum(Mi), K, …).

Parameters:

x – list of tensors.

Returns:

4-element tuple containing

  • x_packed: tensor consisting of packed input tensors along the 1st dimension.

  • num_items: tensor of shape N containing Mi for each element in x.

  • item_packed_first_idx: tensor of shape N indicating the index of the first item belonging to the same element in the original list.

  • item_packed_to_list_idx: tensor of shape sum(Mi) containing the index of the element in the list the item belongs to.

pytorch3d.structures.list_to_padded(x: List[Tensor] | Tuple[Tensor], pad_size: Sequence[int] | None = None, pad_value: float = 0.0, equisized: bool = False) Tensor[source]

Transforms a list of N tensors each of shape (Si_0, Si_1, …, Si_D) into:

  • a single tensor of shape (N, pad_size(0), pad_size(1), …, pad_size(D)) if pad_size is provided,

  • or a tensor of shape (N, max(Si_0), max(Si_1), …, max(Si_D)) if pad_size is None.

Parameters:
  • x – list of Tensors

  • pad_size – list(int) specifying the size of the padded tensor. If None (default), the largest size of each dimension is set as the pad_size.

  • pad_value – float value to be used to fill the padded tensor

  • equisized – bool indicating whether the items in x are of equal size (sometimes this is known and if provided saves computation)

Returns:

x_padded – tensor consisting of padded input tensors stored over the newly allocated memory.

pytorch3d.structures.packed_to_list(x: Tensor, split_size: list | int)[source]

Transforms a tensor of shape (sum(Mi), K, L, …) into a list of N tensors of shape (Mi, K, L, …), where the Mi’s are defined by split_size.

Parameters:
  • x – tensor

  • split_size – list, tuple or int defining the number of items for each tensor in the output list.

Returns:

x_list – A list of Tensors

pytorch3d.structures.padded_to_list(x: Tensor, split_size: Sequence[int] | Sequence[Sequence[int]] | None = None)[source]

Transforms a padded tensor of shape (N, S_1, S_2, …, S_D) into a list of N tensors of shape:

  • (Si_1, Si_2, …, Si_D) where (Si_1, Si_2, …, Si_D) is specified in split_size(i),

  • or (S_1, S_2, …, S_D) if split_size is None,

  • or (Si_1, S_2, …, S_D) if split_size(i) is an integer.

Parameters:
  • x – tensor

  • split_size – optional 1D or 2D list/tuple of ints defining the number of items for each tensor.

Returns:

x_list – a list of tensors sharing the memory with the input.

class pytorch3d.structures.Volumes(densities: Tensor | List[Tensor] | Tuple[Tensor], features: Tensor | List[Tensor] | Tuple[Tensor] | None = None, voxel_size: int | float | Tensor | Tuple[int | float, ...] | List[int | float] = 1.0, volume_translation: Tensor | Tuple[int | float, ...] | List[int | float] = (0.0, 0.0, 0.0), align_corners: bool = True)[source]

This class provides functions for working with batches of volumetric grids of possibly varying spatial sizes.

VOLUME DENSITIES

The Volumes class can be either constructed from a 5D tensor of densities of size batch x density_dim x depth x height x width or from a list of differently-sized 4D tensors [D_1, …, D_batch], where each D_i is of size [density_dim x depth_i x height_i x width_i].

In case the Volumes object is initialized from the list of densities, the list of tensors is internally converted to a single 5D tensor by zero-padding the relevant dimensions. Both list and padded representations can be accessed with the Volumes.densities() or Volumes.densities_list() getters. The sizes of the individual volumes in the structure can be retrieved with the Volumes.get_grid_sizes() getter.

The Volumes class is immutable. I.e. after generating a Volumes object, one cannot change its properties, such as self._densities or self._features anymore.

VOLUME FEATURES

While the densities field is intended to represent various measures of the “density” of the volume cells (opacity, signed/unsigned distances from the nearest surface, …), one can additionally initialize the object with the features argument. features are either a 5D tensor of shape batch x feature_dim x depth x height x width or a list of differently-sized 4D tensors [F_1, …, F_batch], where each F_i is of size [feature_dim x depth_i x height_i x width_i]. features are intended to describe other properties of volume cells, such as per-voxel 3D vectors of RGB colors that can later be used for rendering the volume.

VOLUME COORDINATES

Additionally, using the VolumeLocator class the Volumes class keeps track of the locations of the centers of the volume cells in the local volume coordinates as well as in the world coordinates.

Local coordinates:
  • Represent the locations of the volume cells in the local coordinate frame of the volume.

  • The center of the voxel indexed with [·, ·, 0, 0, 0] in the volume has its 3D local coordinate set to [-1, -1, -1], while the voxel at index [·, ·, depth_i-1, height_i-1, width_i-1] has its 3D local coordinate set to [1, 1, 1].

  • The first/second/third coordinate of each of the 3D per-voxel XYZ vector denotes the horizontal/vertical/depth-wise position respectively. I.e the order of the coordinate dimensions in the volume is reversed w.r.t. the order of the 3D coordinate vectors.

  • The intermediate coordinates between [-1, -1, -1] and [1, 1, 1] are linearly interpolated over the spatial dimensions of the volume.

  • Note that the convention is the same as for the 5D version of the torch.nn.functional.grid_sample function called with the same value of align_corners argument.

  • Note that the local coordinate convention of Volumes (+X = left to right, +Y = top to bottom, +Z = away from the user) is different from the world coordinate convention of the renderer for Meshes or Pointclouds (+X = right to left, +Y = bottom to top, +Z = away from the user).

World coordinates:
  • These define the locations of the centers of the volume cells in the world coordinates.

  • They are specified with the following mapping that converts points x_local in the local coordinates to points x_world in the world coordinates:

    x_world = (
        x_local * (volume_size - 1) * 0.5 * voxel_size
    ) - volume_translation,
    

    here voxel_size specifies the size of each voxel of the volume, and volume_translation is the 3D offset of the central voxel of the volume w.r.t. the origin of the world coordinate frame. Both voxel_size and volume_translation are specified in the world coordinate units. volume_size is the spatial size of the volume in form of a 3D vector [width, height, depth].

  • Given the above definition of x_world, one can derive the inverse mapping from x_world to x_local as follows:

    x_local = (
        (x_world + volume_translation) / (0.5 * voxel_size)
    ) / (volume_size - 1)
    
  • For a trivial volume with volume_translation==[0, 0, 0] and voxel_size==1, x_world would range from -(volume_size-1)/2 to +(volume_size-1)/2.
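The forward and inverse mappings above can be checked with a short plain-Python sketch, assuming a "trivial" 5×5×5 volume with voxel_size = 1 and zero translation (helper names are illustrative):

```python
voxel_size = 1.0
volume_translation = (0.0, 0.0, 0.0)
volume_size = (5, 5, 5)  # (width, height, depth)

def local_to_world(x_local):
    # x_world = x_local * (volume_size - 1) * 0.5 * voxel_size - volume_translation
    return tuple(xl * (s - 1) * 0.5 * voxel_size - t
                 for xl, s, t in zip(x_local, volume_size, volume_translation))

def world_to_local(x_world):
    # inverse mapping of local_to_world
    return tuple((xw + t) / (0.5 * voxel_size) / (s - 1)
                 for xw, s, t in zip(x_world, volume_size, volume_translation))

# e.g. x_world = (-2, 0, 2) maps to x_local = (-1, 0, 1), and back.
assert world_to_local((-2.0, 0.0, 2.0)) == (-1.0, 0.0, 1.0)
assert local_to_world((-1.0, 0.0, 1.0)) == (-2.0, 0.0, 2.0)
```

The corner voxel at x_local = (1, 1, 1) indeed lands at +(volume_size-1)/2 = (2, 2, 2) in world coordinates.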

Coordinate tensors that denote the locations of each of the volume cells in local / world coordinates (with shape (depth x height x width x 3)) can be retrieved by calling the Volumes.get_coord_grid() getter with the appropriate world_coordinates argument.

Internally, the mapping between x_local and x_world is represented as a Transform3d object Volumes.VolumeLocator._local_to_world_transform. Users can access the relevant transformations with the Volumes.get_world_to_local_coords_transform() and Volumes.get_local_to_world_coords_transform() functions.

Example coordinate conversion:
  • For a “trivial” volume with voxel_size = 1., volume_translation=[0., 0., 0.], and the spatial size of DxHxW = 5x5x5, the point x_world = (-2, 0, 2) gets mapped to x_local=(-1, 0, 1).

  • For a “trivial” volume v with voxel_size = 1., volume_translation=[0., 0., 0.], the following holds:

    torch.nn.functional.grid_sample(
        v.densities(),
        v.get_coord_grid(world_coordinates=False),
        align_corners=align_corners,
    ) == v.densities(),

    i.e. sampling the volume at trivial local coordinates (no scaling with voxel_size or shift with volume_translation) results in the same volume.
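This identity can be verified with a small self-contained torch sketch (hypothetical sizes; the trivial local grid is built by hand rather than via Volumes.get_coord_grid(), with align_corners=True):

```python
import torch
import torch.nn.functional as F

# A (D, H, W) = (3, 4, 5) volume with one channel.
N, C, D, H, W = 1, 1, 3, 4, 5
v = torch.arange(N * C * D * H * W, dtype=torch.float32).reshape(N, C, D, H, W)

# Trivial local coordinate grid: voxel centers at linspace(-1, 1, n) per axis,
# stacked in XYZ order (x = width, y = height, z = depth).
zs, ys, xs = (torch.linspace(-1, 1, n) for n in (D, H, W))
gz, gy, gx = torch.meshgrid(zs, ys, xs, indexing="ij")
grid = torch.stack([gx, gy, gz], dim=-1)[None]  # (N, D, H, W, 3)

out = F.grid_sample(v, grid, align_corners=True)
assert torch.allclose(out, v, atol=1e-5)  # sampling at voxel centers is the identity
```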

__init__(densities: Tensor | List[Tensor] | Tuple[Tensor], features: Tensor | List[Tensor] | Tuple[Tensor] | None = None, voxel_size: int | float | Tensor | Tuple[int | float, ...] | List[int | float] = 1.0, volume_translation: Tensor | Tuple[int | float, ...] | List[int | float] = (0.0, 0.0, 0.0), align_corners: bool = True) None[source]
Parameters:
  • **densities** – Batch of input feature volume occupancies of shape (minibatch, density_dim, depth, height, width), or a list of 4D tensors [D_1, …, D_minibatch] where each D_i has shape (density_dim, depth_i, height_i, width_i). Typically, each voxel contains a non-negative number corresponding to its opaqueness.

  • **features** – Batch of input feature volumes of shape: (minibatch, feature_dim, depth, height, width) or a list of 4D tensors [F_1, …, F_minibatch] where each F_i has shape (feature_dim, depth_i, height_i, width_i). The field is optional and can be set to None in case features are not required.

  • **voxel_size** – Denotes the size of each volume voxel in world units. Has to be one of: a) A scalar (square voxels) b) 3-tuple or a 3-list of scalars c) a Tensor of shape (3,) d) a Tensor of shape (minibatch, 3) e) a Tensor of shape (minibatch, 1) f) a Tensor of shape (1,) (square voxels)

  • **volume_translation** – Denotes the 3D translation of the center of the volume in world units. Has to be one of: a) 3-tuple or a 3-list of scalars b) a Tensor of shape (3,) c) a Tensor of shape (minibatch, 3) d) a Tensor of shape (1,)

  • **align_corners** – If set (default), the coordinates of the corner voxels are exactly −1 or +1 in the local coordinate system. Otherwise, the coordinates correspond to the centers of the corner voxels. Cf. the namesake argument to torch.nn.functional.grid_sample.
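A minimal construction sketch, assuming a batch of two 8×8×8 volumes with one density channel and RGB features (the voxel_size and volume_translation values are arbitrary; the pytorch3d import is guarded so the shape illustration stands on its own):

```python
import torch

densities = torch.rand(2, 1, 8, 8, 8)  # (minibatch, density_dim, D, H, W)
features = torch.rand(2, 3, 8, 8, 8)   # (minibatch, feature_dim, D, H, W)

try:
    from pytorch3d.structures import Volumes
    v = Volumes(densities=densities, features=features,
                voxel_size=0.1, volume_translation=(0.0, 0.0, 0.0))
except ImportError:
    v = None  # pytorch3d not installed; the tensors above still illustrate the expected shapes
```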

__getitem__(index: int | List[int] | Tuple[int] | slice | BoolTensor | LongTensor) Volumes[source]
Parameters:

index – Specifying the index of the volume to retrieve. Can be an int, a slice, a list of ints, or a boolean or long tensor.

Returns:

Volumes object with selected volumes. The tensors are not cloned.

features() Tensor | None[source]

Returns the features of the volume.

Returns:

*features* – The tensor of volume features.

densities() Tensor[source]

Returns the densities of the volume.

Returns:

*densities* – The tensor of volume densities.

densities_list() List[Tensor][source]

Get the list representation of the densities.

Returns:

list of tensors of densities of shape (dim_i, D_i, H_i, W_i).

features_list() List[Tensor][source]

Get the list representation of the features.

Returns:

list of tensors of features of shape (dim_i, D_i, H_i, W_i) or None for feature-less volumes.

get_align_corners() bool[source]

Return whether the corners of the voxels should be aligned with the image pixels.

update_padded(new_densities: Tensor, new_features: Tensor | None = None) Volumes[source]

Returns a Volumes structure with updated padded tensors and copies of the auxiliary tensors self._local_to_world_transform, device and self._grid_sizes. This function allows for an update of densities (and features) without having to explicitly convert it to the list representation for heterogeneous batches.

Parameters:
  • new_densities – FloatTensor of shape (N, dim_density, D, H, W)

  • new_features – (optional) FloatTensor of shape (N, dim_feature, D, H, W)

Returns:

Volumes with updated features and densities

clone() Volumes[source]

Deep copy of Volumes object. All internal tensors are cloned individually.

Returns:

new Volumes object.

to(device: str | device, copy: bool = False) Volumes[source]

Match the functionality of torch.Tensor.to(). If copy = True or the self Tensor is on a different device, the returned tensor is a copy of self with the desired torch.device. If copy = False and the self Tensor already has the correct torch.device, then self is returned.

Parameters:
  • device – Device (as str or torch.device) for the new tensor.

  • copy – Boolean indicator whether or not to clone self. Default False.

Returns:

Volumes object.

cpu() Volumes[source]
cuda() Volumes[source]
get_grid_sizes() LongTensor[source]

Returns the sizes of individual volumetric grids in the structure.

Returns:

*grid_sizes* – a LongTensor of spatial sizes of each of the volumes of size (batchsize, 3), where the i-th row holds (D_i, H_i, W_i).

get_local_to_world_coords_transform() Transform3d[source]

Return a Transform3d object that converts points in the local coordinate frame of the volume to world coordinates. Local volume coordinates are scaled s.t. the coordinates along one side of the volume are in range [-1, 1].

Returns:

*local_to_world_transform* – a Transform3d object converting points from local coordinates to world coordinates.

get_world_to_local_coords_transform() Transform3d[source]

Return a Transform3d object that converts points in the world coordinates to the local coordinate frame of the volume. Local volume coordinates are scaled s.t. the coordinates along one side of the volume are in range [-1, 1].

Returns:

*world_to_local_transform* – a Transform3d object converting points from world coordinates to local coordinates.

world_to_local_coords(points_3d_world: Tensor) Tensor[source]

Convert a batch of 3D point coordinates points_3d_world of shape (minibatch, …, dim) in the world coordinates to the local coordinate frame of the volume. Local volume coordinates are scaled s.t. the coordinates along one side of the volume are in range [-1, 1].

Parameters:

**points_3d_world** – A tensor of shape (minibatch, …, 3) containing the 3D coordinates of a set of points that will be converted from the world coordinates (defined by the self.center and self.voxel_size parameters) to the local volume coordinates, ranging within [-1, 1].

Returns:

*points_3d_local* – points_3d_world converted to the local volume coordinates, of shape (minibatch, …, 3).

local_to_world_coords(points_3d_local: Tensor) Tensor[source]

Convert a batch of 3D point coordinates points_3d_local of shape (minibatch, …, dim) in the local coordinate frame of the volume to the world coordinates.

Parameters:

**points_3d_local** – A tensor of shape (minibatch, …, 3) containing the 3D coordinates of a set of points that will be converted from the local volume coordinates (ranging within [-1, 1]) to the world coordinates defined by the self.center and self.voxel_size parameters.

Returns:

*points_3d_world* – points_3d_local converted to the world coordinates of the volume, of shape (minibatch, …, 3).

get_coord_grid(world_coordinates: bool = True) Tensor[source]

Return the 3D coordinate grid of the volumetric grid in local (world_coordinates=False) or world coordinates (world_coordinates=True).

The grid records location of each center of the corresponding volume voxel.

Local coordinates are scaled s.t. the values along one side of the volume are in range [-1, 1].

Parameters:

**world_coordinates** – if True, the method returns the grid in the world coordinates, otherwise, in local coordinates.

Returns:

*coordinate_grid* – the grid of coordinates of shape (minibatch, depth, height, width, 3), where minibatch, depth, height and width are the batch size, depth, height and width of the volume features or densities.
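For intuition, the world-coordinate voxel centers along a single axis can be reproduced in plain Python from the local-to-world formula (helper name illustrative; align_corners=True assumed):

```python
def voxel_centers_1d(n, voxel_size=1.0, translation=0.0):
    # World coordinate of the i-th voxel center along an axis with n voxels:
    # x_local = -1 + 2*i/(n-1);  x_world = x_local * (n-1) * 0.5 * voxel_size - translation
    return [(-1 + 2 * i / (n - 1)) * (n - 1) * 0.5 * voxel_size - translation
            for i in range(n)]

# A "trivial" 5-voxel axis: centers sit at integer offsets around the origin.
assert voxel_centers_1d(5) == [-2.0, -1.0, 0.0, 1.0, 2.0]
# Scaling and translating moves the centers accordingly.
assert voxel_centers_1d(3, voxel_size=2.0, translation=1.0) == [-3.0, -1.0, 1.0]
```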