pytorch3d.ops

pytorch3d.ops.cubify(voxels, thresh, device=None, align: str = 'topleft') → pytorch3d.structures.meshes.Meshes[source]

Converts a voxel occupancy grid to a mesh by replacing each occupied voxel with a cube consisting of 12 faces and 8 vertices. Shared vertices are merged, and internal faces are removed.

Parameters:
  • voxels – A FloatTensor of shape (N, D, H, W) containing occupancy probabilities.
  • thresh – A scalar threshold. If a voxel occupancy is larger than thresh, the voxel is considered occupied.
  • device – The device of the output meshes.
  • align – Defines the alignment of the mesh vertices and the grid locations. Has to be one of {"topleft", "corner", "center"}. See below for explanation. Default is "topleft".
Returns:

meshes – A Meshes object of the corresponding meshes.

The alignment between the vertices of the cubified mesh and the voxel locations (or pixels) is defined by the choice of align. We support three modes, as shown below for a 2x2 grid:

X---X----        X-------X        ---------
|   |   |        |   |   |        | X | X |
X---X----        ---------        ---------
|   |   |        |   |   |        | X | X |
---------        X-------X        ---------

 topleft           corner           center

In the figure, the X's denote the grid locations and the squares represent the added cuboids. When align="topleft", the top left corner of each cuboid corresponds to the pixel coordinate of the input grid. When align="corner", the corners of the output mesh span the whole grid. When align="center", the grid locations form the centers of the cuboids.
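A minimal usage sketch (the input shape and threshold below are illustrative assumptions, not values from the documentation):

import torch
from pytorch3d.ops import cubify

voxels = torch.rand(2, 16, 16, 16)            # (N, D, H, W) occupancy probabilities
meshes = cubify(voxels, thresh=0.5, align="topleft")
verts_list = meshes.verts_list()              # one (V_n, 3) verts tensor per mesh in the batch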

class pytorch3d.ops.GraphConv(input_dim: int, output_dim: int, init: str = 'normal', directed: bool = False)[source]

A single graph convolution layer.

__init__(input_dim: int, output_dim: int, init: str = 'normal', directed: bool = False)[source]
Parameters:
  • input_dim – Number of input features per vertex.
  • output_dim – Number of output features per vertex.
  • init – Weight initialization method. Can be one of [‘zero’, ‘normal’].
  • directed – Bool indicating if edges in the graph are directed.
forward(verts, edges)[source]
Parameters:
  • verts – FloatTensor of shape (V, input_dim) where V is the number of vertices and input_dim is the number of input features per vertex. input_dim has to match the input_dim specified in __init__.
  • edges – LongTensor of shape (E, 2) where E is the number of edges where each edge has the indices of the two vertices which form the edge.
Returns:

out – FloatTensor of shape (V, output_dim) where output_dim is the number of output features per vertex.
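A minimal usage sketch (the feature dimensions and toy graph are illustrative assumptions):

import torch
from pytorch3d.ops import GraphConv

conv = GraphConv(input_dim=3, output_dim=16)
verts = torch.rand(4, 3)                          # (V, input_dim)
edges = torch.tensor([[0, 1], [1, 2], [2, 3]])    # (E, 2) vertex index pairs
out = conv(verts, edges)                          # (V, output_dim) = (4, 16)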

pytorch3d.ops.knn_gather(x: torch.Tensor, idx: torch.Tensor, lengths: Optional[torch.Tensor] = None)[source]

A helper function for knn that allows indexing a tensor x with the indices idx returned by knn_points.

For example, if dists, idx = knn_points(p, x, lengths_p, lengths, K) where p is a tensor of shape (N, L, D) and x a tensor of shape (N, M, D), then one can compute the K nearest neighbors of p with p_nn = knn_gather(x, idx, lengths). It can also be applied for any tensor x of shape (N, M, U) where U != D.

Parameters:
  • x – Tensor of shape (N, M, U) containing U-dimensional features to be gathered.
  • idx – LongTensor of shape (N, L, K) giving the indices returned by knn_points.
  • lengths – LongTensor of shape (N,) of values in the range [0, M], giving the length of each example in the batch in x. Or None to indicate that every example has length M.
Returns:

x_out – Tensor of shape (N, L, K, U) resulting from gathering the elements of x with idx, s.t. x_out[n, l, k] = x[n, idx[n, l, k]]. If k > lengths[n] then x_out[n, l, k] is filled with 0.0.
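A minimal sketch of the pattern described above, gathering per-point features of a dimension U different from D (the shapes are illustrative assumptions):

import torch
from pytorch3d.ops import knn_points, knn_gather

p = torch.rand(2, 100, 3)                # (N, L, D) query points
x = torch.rand(2, 500, 3)                # (N, M, D) reference points
feats = torch.rand(2, 500, 8)            # (N, M, U) per-point features with U != D
dists, idx, _ = knn_points(p, x, K=4)
feats_nn = knn_gather(feats, idx)        # (N, L, K, U) = (2, 100, 4, 8)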

pytorch3d.ops.knn_points(p1: torch.Tensor, p2: torch.Tensor, lengths1: Optional[torch.Tensor] = None, lengths2: Optional[torch.Tensor] = None, K: int = 1, version: int = -1, return_nn: bool = False, return_sorted: bool = True)[source]

K-Nearest neighbors on point clouds.

Parameters:
  • p1 – Tensor of shape (N, P1, D) giving a batch of N point clouds, each containing up to P1 points of dimension D.
  • p2 – Tensor of shape (N, P2, D) giving a batch of N point clouds, each containing up to P2 points of dimension D.
  • lengths1 – LongTensor of shape (N,) of values in the range [0, P1], giving the length of each pointcloud in p1. Or None to indicate that every cloud has length P1.
  • lengths2 – LongTensor of shape (N,) of values in the range [0, P2], giving the length of each pointcloud in p2. Or None to indicate that every cloud has length P2.
  • K – Integer giving the number of nearest neighbors to return.
  • version – Which KNN implementation to use in the backend. If version=-1, the correct implementation is selected based on the shapes of the inputs.
  • return_nn – If set to True returns the K nearest neighbors in p2 for each point in p1.
  • return_sorted – (bool) whether to return the nearest neighbors sorted in ascending order of distance.
Returns:

dists: Tensor of shape (N, P1, K) giving the squared distances to the nearest neighbors. This is padded with zeros both where a cloud in p2 has fewer than K points and where a cloud in p1 has fewer than P1 points.

idx: LongTensor of shape (N, P1, K) giving the indices of the K nearest neighbors from points in p1 to points in p2. Concretely, if p1_idx[n, i, k] = j then p2[n, j] is the k-th nearest neighbor to p1[n, i] in p2[n]. This is padded with zeros both where a cloud in p2 has fewer than K points and where a cloud in p1 has fewer than P1 points.

nn: Tensor of shape (N, P1, K, D) giving the K nearest neighbors in p2 for each point in p1. Concretely, p2_nn[n, i, k] gives the k-th nearest neighbor for p1[n, i]. Returned if return_nn is True. The nearest neighbors are collected using knn_gather, which is a helper function that allows indexing any tensor of shape (N, P2, U) with the indices p1_idx returned by knn_points. The output is a tensor of shape (N, P1, K, U).
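A minimal usage sketch (point counts are illustrative assumptions; the result is a named tuple with the fields described above):

import torch
from pytorch3d.ops import knn_points

p1 = torch.rand(2, 128, 3)
p2 = torch.rand(2, 256, 3)
knn = knn_points(p1, p2, K=3, return_nn=True)
# knn.dists: (2, 128, 3) squared distances
# knn.idx:   (2, 128, 3) neighbor indices into p2
# knn.knn:   (2, 128, 3, 3) neighbor coordinates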

pytorch3d.ops.mesh_face_areas_normals()
pytorch3d.ops.packed_to_padded(inputs, first_idxs, max_size)[source]

Torch wrapper that handles allowed input shapes. See description below.

Parameters:
  • inputs – FloatTensor of shape (F,) or (F, D), representing the packed batch tensor, e.g. areas for faces in a batch of meshes.
  • first_idxs – LongTensor of shape (N,) where N is the number of elements in the batch and first_idxs[i] = f means that the inputs for batch element i begin at inputs[f].
  • max_size – Max length of an element in the batch.
Returns:

inputs_padded – FloatTensor of shape (N, max_size) or (N, max_size, D) where max_size is the maximum element size. The values for batch element i, which start at inputs[first_idxs[i]], will be copied to inputs_padded[i, :], with zeros padding out the extra inputs.

To handle the allowed input shapes, we convert the inputs tensor of shape (F,) to (F, 1). We reshape the output back to (N, max_size) from (N, max_size, 1).
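A minimal sketch on a toy packed tensor (the values and split points are illustrative assumptions):

import torch
from pytorch3d.ops import packed_to_padded

inputs = torch.arange(5, dtype=torch.float32)   # packed tensor of shape (F,) = (5,)
first_idxs = torch.tensor([0, 2])               # element 0 starts at inputs[0], element 1 at inputs[2]
padded = packed_to_padded(inputs, first_idxs, 3)
# padded == [[0., 1., 0.], [2., 3., 4.]]        # shape (N, max_size) = (2, 3)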

pytorch3d.ops.padded_to_packed(inputs, first_idxs, num_inputs)[source]

Torch wrapper that handles allowed input shapes. See description below.

Parameters:
  • inputs – FloatTensor of shape (N, max_size) or (N, max_size, D), representing the padded tensor, e.g. areas for faces in a batch of meshes.
  • first_idxs – LongTensor of shape (N,) where N is the number of elements in the batch and first_idxs[i] = f means that the inputs for batch element i begin at inputs_packed[f].
  • num_inputs – Number of packed entries (= F)
Returns:

inputs_packed – FloatTensor of shape (F,) or (F, D) where inputs_packed[first_idxs[i]:] = inputs[i, :].

To handle the allowed input shapes, we convert the inputs tensor of shape (N, max_size) to (N, max_size, 1). We reshape the output back to (F,) from (F, 1).
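A minimal sketch, the inverse of the packed_to_padded example above (the values are illustrative assumptions):

import torch
from pytorch3d.ops import padded_to_packed

padded = torch.tensor([[0., 1., 0.], [2., 3., 4.]])   # (N, max_size) = (2, 3)
first_idxs = torch.tensor([0, 2])
packed = padded_to_packed(padded, first_idxs, 5)      # num_inputs = F = 5
# packed == [0., 1., 2., 3., 4.]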

pytorch3d.ops.efficient_pnp(x: torch.Tensor, y: torch.Tensor, weights: Optional[torch.Tensor] = None, skip_quadratic_eq: bool = False) → pytorch3d.ops.perspective_n_points.EpnpSolution[source]

Implements Efficient PnP algorithm [1] for Perspective-n-Points problem: finds a camera position (defined by rotation R and translation T) that minimizes re-projection error between the given 3D points x and the corresponding uncalibrated 2D points y, i.e. solves

y[i] = Proj(x[i] R[i] + T[i])

in the least-squares sense, where i are indices within the batch, and Proj is the perspective projection operator: Proj([x y z]) = [x/z y/z]. In the noise-less case, 4 points are enough to find the solution as long as they are not co-planar.

Parameters:
  • x – Batch of 3-dimensional points of shape (minibatch, num_points, 3).
  • y – Batch of 2-dimensional points of shape (minibatch, num_points, 2).
  • weights – Batch of non-negative weights of shape (minibatch, num_point). None means equal weights.
  • skip_quadratic_eq – If True, assumes the solution space for the linear system is one-dimensional, i.e. takes the scaled eigenvector that corresponds to the smallest eigenvalue as a solution. If False, finds the candidate coordinates in the potentially 4D null space by approximately solving the systems of quadratic equations. The best candidate is chosen by examining the 2D re-projection error. While this option finds a better solution, especially when the number of points is small or perspective distortions are low (the points are far away), it may be more difficult to back-propagate through.
Returns:

EpnpSolution namedtuple containing the following elements:

x_cam: Batch of transformed points x that is used to find the camera parameters, of shape (minibatch, num_points, 3). In the general (noisy) case, they are not exactly equal to x[i] R[i] + T[i] but are some affine transform of x[i]s.

R: Batch of rotation matrices of shape (minibatch, 3, 3).

T: Batch of translation vectors of shape (minibatch, 3).

err_2d: Batch of mean 2D re-projection errors of shape (minibatch,). Specifically, if yhat is the re-projection for the i-th batch element, it returns sum_j norm(yhat_j - y_j) where j iterates over points and norm denotes the L2 norm.

err_3d: Batch of mean algebraic errors of shape (minibatch,). Specifically, those are squared distances between x_world and estimated points on the rays defined by y.

[1] Moreno-Noguer, F., Lepetit, V., & Fua, P. (2009). EPnP: An Accurate O(n) solution to the PnP problem. International Journal of Computer Vision. https://www.epfl.ch/labs/cvlab/software/multi-view-stereo/epnp/
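A minimal sketch that builds a synthetic, noise-free problem and solves it (the point counts and depth offset are illustrative assumptions):

import torch
from pytorch3d.ops import efficient_pnp

x = torch.rand(2, 10, 3)
x[..., 2] += 1.0                           # keep points in front of the camera
y = x[..., :2] / x[..., 2:]                # Proj([x y z]) = [x/z y/z] with an identity pose
sol = efficient_pnp(x, y)
# sol.R: (2, 3, 3), sol.T: (2, 3), sol.err_2d: (2,), sol.err_3d: (2,)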

pytorch3d.ops.corresponding_points_alignment(X: Union[torch.Tensor, Pointclouds], Y: Union[torch.Tensor, Pointclouds], weights: Union[torch.Tensor, List[torch.Tensor], None] = None, estimate_scale: bool = False, allow_reflection: bool = False, eps: float = 1e-09) → pytorch3d.ops.points_alignment.SimilarityTransform[source]

Finds a similarity transformation (rotation R, translation T and optionally scale s) between two given sets of corresponding d-dimensional points X and Y such that:

s[i] X[i] R[i] + T[i] = Y[i],

for all batch indexes i in the least squares sense.

The algorithm is also known as Umeyama [1].

Parameters:
  • X – Batch of d-dimensional points of shape (minibatch, num_point, d) or a Pointclouds object.
  • Y – Batch of d-dimensional points of shape (minibatch, num_point, d) or a Pointclouds object.
  • weights – Batch of non-negative weights of shape (minibatch, num_point) or list of minibatch 1-dimensional tensors that may have different shapes; in that case, the length of the i-th tensor should be equal to the number of points in X_i and Y_i. Passing None means uniform weights.
  • estimate_scale – If True, also estimates a scaling component s of the transformation. Otherwise assumes an identity scale and returns a tensor of ones.
  • allow_reflection – If True, allows the algorithm to return R which is orthonormal but has determinant==-1.
  • eps – A scalar for clamping to avoid dividing by zero. Active for the code that estimates the output scale s.
Returns:

3-element named tuple SimilarityTransform containing:

  • R: Batch of orthonormal matrices of shape (minibatch, d, d).
  • T: Batch of translations of shape (minibatch, d).
  • s: Batch of scaling factors of shape (minibatch,).

References

[1] Shinji Umeyama: Least-Squares Estimation of Transformation Parameters Between Two Point Patterns. TPAMI, 1991.
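A minimal sketch that recovers a synthetic rigid transform (the use of pytorch3d.transforms.random_rotations and the batch sizes are illustrative assumptions):

import torch
from pytorch3d.ops import corresponding_points_alignment
from pytorch3d.transforms import random_rotations

X = torch.rand(4, 100, 3)
R_gt = random_rotations(4)                       # synthetic ground-truth rotations
T_gt = torch.rand(4, 1, 3)
Y = torch.bmm(X, R_gt) + T_gt                    # Y[i] = X[i] R_gt[i] + T_gt[i]
R, T, s = corresponding_points_alignment(X, Y)
# R should approximately recover R_gt; with estimate_scale=False, s is a tensor of ones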

pytorch3d.ops.iterative_closest_point(X: Union[torch.Tensor, Pointclouds], Y: Union[torch.Tensor, Pointclouds], init_transform: Optional[pytorch3d.ops.points_alignment.SimilarityTransform] = None, max_iterations: int = 100, relative_rmse_thr: float = 1e-06, estimate_scale: bool = False, allow_reflection: bool = False, verbose: bool = False) → pytorch3d.ops.points_alignment.ICPSolution[source]

Executes the iterative closest point (ICP) algorithm [1, 2] in order to find a similarity transformation (rotation R, translation T, and optionally scale s) between two given differently-sized sets of d-dimensional points X and Y, such that:

s[i] X[i] R[i] + T[i] = Y[NN[i]],

for all batch indices i in the least squares sense. Here, Y[NN[i]] stands for the nearest neighbors in Y of each point in X[i]. Note, however, that the solution is only a local optimum.

Parameters:
  • X – Batch of d-dimensional points of shape (minibatch, num_points_X, d) or a Pointclouds object.
  • Y – Batch of d-dimensional points of shape (minibatch, num_points_Y, d) or a Pointclouds object.
  • init_transform – A named tuple SimilarityTransform of tensors R, T, s, where R is a batch of orthonormal matrices of shape (minibatch, d, d), T is a batch of translations of shape (minibatch, d) and s is a batch of scaling factors of shape (minibatch,).
  • max_iterations – The maximum number of ICP iterations.
  • relative_rmse_thr – A threshold on the relative root mean squared error used to terminate the algorithm.
  • estimate_scale – If True, also estimates a scaling component s of the transformation. Otherwise assumes the identity scale and returns a tensor of ones.
  • allow_reflection – If True, allows the algorithm to return R which is orthonormal but has determinant==-1.
  • verbose – If True, prints status messages during each ICP iteration.
Returns:

A named tuple ICPSolution with the following fields:

converged: A boolean flag denoting whether the algorithm converged successfully (=True) or not (=False).

rmse: Attained root mean squared error after termination of ICP.

Xt: The point cloud X transformed with the final transformation (R, T, s). If X is a Pointclouds object, returns an instance of Pointclouds, otherwise returns torch.Tensor.

RTs: A named tuple SimilarityTransform containing a batch of similarity transforms with fields:
  R: Batch of orthonormal matrices of shape (minibatch, d, d).
  T: Batch of translations of shape (minibatch, d).
  s: Batch of scaling factors of shape (minibatch,).

t_history: A list of named tuples SimilarityTransform containing the transformation parameters after each ICP iteration.

References

[1] Besl & McKay: A Method for Registration of 3-D Shapes. TPAMI, 1992. [2] https://en.wikipedia.org/wiki/Iterative_closest_point
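A minimal usage sketch (the point counts and iteration cap are illustrative assumptions):

import torch
from pytorch3d.ops import iterative_closest_point

X = torch.rand(1, 200, 3)
Y = torch.rand(1, 300, 3)
sol = iterative_closest_point(X, Y, max_iterations=50)
# sol.converged (bool), sol.rmse: (1,), sol.Xt: (1, 200, 3), sol.RTs.R: (1, 3, 3)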

pytorch3d.ops.estimate_pointcloud_local_coord_frames(pointclouds: Union[torch.Tensor, Pointclouds], neighborhood_size: int = 50, disambiguate_directions: bool = True) → Tuple[torch.Tensor, torch.Tensor][source]

Estimates the principal directions of curvature (which includes normals) of a batch of pointclouds.

The algorithm first finds neighborhood_size nearest neighbors for each point of the point clouds, followed by obtaining principal vectors of covariance matrices of each of the point neighborhoods. The main principal vector corresponds to the normal, while the other two are the directions of the highest and second highest curvature.

Note that each principal direction is given up to a sign. Hence, the function implements a disambiguate_directions switch that allows ensuring consistency of the sign of neighboring normals. The implementation follows the sign disambiguation from SHOT descriptors [1].

The algorithm also returns the curvature values themselves. These are the eigenvalues of the estimated covariance matrices of each point neighborhood.

Parameters:
  • pointclouds – Batch of 3-dimensional points of shape (minibatch, num_point, 3) or a Pointclouds object.
  • neighborhood_size – The size of the neighborhood used to estimate the geometry around each point.
  • disambiguate_directions – If True, uses the algorithm from [1] to ensure sign consistency of the normals of neighboring points.
Returns:

curvatures: The three principal curvatures of each point of shape (minibatch, num_point, 3). If pointclouds are of Pointclouds class, returns a padded tensor.

local_coord_frames: The three principal directions of the curvature around each point of shape (minibatch, num_point, 3, 3). The principal directions are stored in columns of the output. E.g. local_coord_frames[i, j, :, 0] is the normal of the j-th point in the i-th pointcloud. If pointclouds are of Pointclouds class, returns a padded tensor.

References

[1] Tombari, Salti, Di Stefano: Unique Signatures of Histograms for Local Surface Description, ECCV 2010.
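A minimal usage sketch (the batch shape and neighborhood size are illustrative assumptions):

import torch
from pytorch3d.ops import estimate_pointcloud_local_coord_frames

points = torch.rand(2, 1000, 3)
curvatures, frames = estimate_pointcloud_local_coord_frames(points, neighborhood_size=30)
normals = frames[:, :, :, 0]       # (2, 1000, 3): the first column of each local frame is the normal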

pytorch3d.ops.estimate_pointcloud_normals(pointclouds: Union[torch.Tensor, Pointclouds], neighborhood_size: int = 50, disambiguate_directions: bool = True) → torch.Tensor[source]

Estimates the normals of a batch of pointclouds.

The function uses estimate_pointcloud_local_coord_frames to estimate the normals. Please refer to this function for more detailed information.

Parameters:
  • pointclouds – Batch of 3-dimensional points of shape (minibatch, num_point, 3) or a Pointclouds object.
  • neighborhood_size – The size of the neighborhood used to estimate the geometry around each point.
  • disambiguate_directions – If True, uses the algorithm from [1] to ensure sign consistency of the normals of neighboring points.
Returns:

normals – A tensor of normals for each input point of shape (minibatch, num_point, 3). If pointclouds are of Pointclouds class, returns a padded tensor.

References

[1] Tombari, Salti, Di Stefano: Unique Signatures of Histograms for Local Surface Description, ECCV 2010.
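A short sketch (using the same assumed inputs as the previous example):

from pytorch3d.ops import estimate_pointcloud_normals

normals = estimate_pointcloud_normals(points, neighborhood_size=30)   # (2, 1000, 3)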

pytorch3d.ops.sample_points_from_meshes(meshes, num_samples: int = 10000, return_normals: bool = False) → Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Convert a batch of meshes to a pointcloud by uniformly sampling points on the surface of the mesh with probability proportional to the face area.

Parameters:
  • meshes – A Meshes object with a batch of N meshes.
  • num_samples – Integer giving the number of point samples per mesh.
  • return_normals – If True, return normals for the sampled points.
  • eps – (float) used to clamp the norm of the normals to avoid dividing by 0.
Returns:

2-element tuple containing

  • samples: FloatTensor of shape (N, num_samples, 3) giving the coordinates of sampled points for each mesh in the batch. For empty meshes the corresponding row in the samples array will be filled with 0.
  • normals: FloatTensor of shape (N, num_samples, 3) giving a normal vector to each sampled point. Only returned if return_normals is True. For empty meshes the corresponding row in the normals array will be filled with 0.
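A minimal usage sketch (pytorch3d.utils.ico_sphere is used here only to build a toy input mesh):

from pytorch3d.ops import sample_points_from_meshes
from pytorch3d.utils import ico_sphere

meshes = ico_sphere(level=2)       # a batch of 1 unit ico-sphere mesh
samples, normals = sample_points_from_meshes(meshes, num_samples=5000, return_normals=True)
# samples: (1, 5000, 3), normals: (1, 5000, 3)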

class pytorch3d.ops.SubdivideMeshes(meshes=None)[source]

Subdivide a triangle mesh by adding a new vertex at the center of each edge and dividing each face into four new faces. Vectors of vertex attributes can also be subdivided by averaging the values of the attributes at the two vertices which form each edge. This implementation preserves face orientation - if the vertices of a face are all ordered counter-clockwise, then the faces in the subdivided meshes will also have their vertices ordered counter-clockwise.

If meshes is provided as an input, the initializer performs the relatively expensive computation of determining the new face indices. This one-time computation can be reused for all meshes with the same face topology but different vertex positions.

__init__(meshes=None)[source]
Parameters:meshes – Meshes object or None. If a meshes object is provided, the first mesh is used to compute the new faces of the subdivided topology which can be reused for meshes with the same input topology.
subdivide_faces(meshes)[source]
Parameters:meshes – a Meshes object.
Returns:subdivided_faces_packed – (4*sum(F_n), 3) shape LongTensor of original and new faces.

Refer to pytorch3d.structures.meshes.py for more details on packed representations of faces.

Each face is split into 4 faces e.g. Input face

         v0
         /\
        /  \
       /    \
   e1 /      \ e0
     /        \
    /          \
   /            \
  /______________\
v2       e2       v1

faces_packed = [[0, 1, 2]]
faces_packed_to_edges_packed = [[2, 1, 0]]

faces_packed_to_edges_packed is used to represent all the new vertex indices corresponding to the mid-points of edges in the mesh. The actual vertex coordinates will be computed in the forward function. To get the indices of the new vertices, offset faces_packed_to_edges_packed by the total number of vertices.

faces_packed_to_edges_packed = [[2, 1, 0]] + 3 = [[5, 4, 3]]

e.g. subdivided face

        v0
        /\
       /  \
      / f0 \
  v4 /______\ v3
    /\      /\
   /  \ f3 /  \
  / f2 \  / f1 \
 /______\/______\
v2       v5       v1

f0 = [0, 3, 4]
f1 = [1, 5, 3]
f2 = [2, 4, 5]
f3 = [5, 4, 3]
forward(meshes, feats=None)[source]

Subdivide a batch of meshes by adding a new vertex on each edge and dividing each face into four new faces. The new meshes contain two types of vertices:
  1. Vertices that appear in the input meshes. Data for these vertices are copied from the input meshes.
  2. New vertices at the midpoint of each edge. Data for these vertices is the average of the data for the two vertices that make up the edge.
Parameters:
  • meshes – Meshes object representing a batch of meshes.
  • feats – Per-vertex features to be subdivided along with the verts. Should be parallel to the packed vert representation of the input meshes; so it should have shape (V, D) where V is the total number of verts in the input meshes. Default: None.
Returns:

2-element tuple containing

  • new_meshes: Meshes object of a batch of subdivided meshes.
  • new_feats: (optional) Tensor of subdivided feats, parallel to the (packed) vertices of the subdivided meshes. Only returned if feats is not None.
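A minimal usage sketch (pytorch3d.utils.ico_sphere supplies a toy input mesh; the feature dimension is an illustrative assumption):

import torch
from pytorch3d.ops import SubdivideMeshes
from pytorch3d.utils import ico_sphere

meshes = ico_sphere(level=0)                   # 12 verts, 20 faces
subdivide = SubdivideMeshes()
new_meshes = subdivide(meshes)                 # 42 verts, 80 faces
feats = torch.rand(meshes.verts_packed().shape[0], 8)
new_meshes, new_feats = subdivide(meshes, feats)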

subdivide_homogeneous(meshes, feats=None)[source]

Subdivide verts (and optionally features) of a batch of meshes where each mesh has the same topology of faces. The subdivided faces are precomputed in the initializer.

Parameters:
  • meshes – Meshes object representing a batch of meshes.
  • feats – Per-vertex features to be subdivided along with the verts.
Returns:

2-element tuple containing

  • new_meshes: Meshes object of a batch of subdivided meshes.
  • new_feats: (optional) Tensor of subdivided feats, parallel to the (packed) vertices of the subdivided meshes. Only returned if feats is not None.

subdivide_heterogenerous(meshes, feats=None)[source]

Subdivide faces, verts (and optionally features) of a batch of meshes where each mesh can have different face topologies.

Parameters:
  • meshes – Meshes object representing a batch of meshes.
  • feats – Per-vertex features to be subdivided along with the verts.
Returns:

2-element tuple containing

  • new_meshes: Meshes object of a batch of subdivided meshes.
  • new_feats: (optional) Tensor of subdivided feats, parallel to the (packed) vertices of the subdivided meshes. Only returned if feats is not None.

pytorch3d.ops.convert_pointclouds_to_tensor(pcl: Union[torch.Tensor, Pointclouds])[source]

If type(pcl)==Pointclouds, converts a pcl object to a padded representation and returns it together with the number of points per batch. Otherwise, returns the input itself with the number of points set to the size of the second dimension of pcl.

pytorch3d.ops.eyes(dim: int, N: int, device: Optional[torch.device] = None, dtype: torch.dtype = torch.float32) → torch.Tensor[source]

Generates a batch of N identity matrices of shape (N, dim, dim).

Parameters:
  • dim – The dimensionality of the identity matrices.
  • N – The number of identity matrices.
  • device – The device to be used for allocating the matrices.
  • dtype – The datatype of the matrices.
Returns:

identities – A batch of identity matrices of shape (N, dim, dim).
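A short sketch:

from pytorch3d.ops import eyes

identities = eyes(dim=3, N=4)      # (4, 3, 3) batch of 3x3 identity matrices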

pytorch3d.ops.get_point_covariances(points_padded: torch.Tensor, num_points_per_cloud: torch.Tensor, neighborhood_size: int) → Tuple[torch.Tensor, torch.Tensor][source]

Computes the per-point covariance matrices of the 3D locations of the K nearest neighbors of each point.

Parameters:
  • points_padded – Input point clouds as a padded tensor of shape (minibatch, num_points, dim).
  • num_points_per_cloud – Number of points per cloud of shape (minibatch,).
  • neighborhood_size – Number of nearest neighbors for each point used to estimate the covariance matrices.
Returns:

covariances: A batch of per-point covariance matrices of shape (minibatch, dim, dim).

k_nearest_neighbors: A batch of neighborhood_size nearest neighbors for each of the point cloud points of shape (minibatch, num_points, neighborhood_size, dim).
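A minimal sketch (the cloud size and neighborhood size are illustrative assumptions):

import torch
from pytorch3d.ops import get_point_covariances

points = torch.rand(2, 500, 3)
num_points = torch.full((2,), 500, dtype=torch.int64)
covariances, knn = get_point_covariances(points, num_points, neighborhood_size=20)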

pytorch3d.ops.is_pointclouds(pcl: Union[torch.Tensor, Pointclouds])[source]

Checks whether the input pcl is an instance of Pointclouds by checking the existence of points_padded and num_points_per_cloud functions.

pytorch3d.ops.wmean(x: torch.Tensor, weight: Optional[torch.Tensor] = None, dim: Union[int, Tuple[int]] = -2, keepdim: bool = True, eps: float = 1e-09) → torch.Tensor[source]

Finds the mean of the input tensor across the specified dimension. If the weight argument is provided, computes a weighted mean.

Parameters:
  • x – tensor of shape (*, D), where D is assumed to be spatial;
  • weight – if given, non-negative tensor of shape (*,). It must be broadcastable to x.shape[:-1]. Note that the weights for the last (spatial) dimension are assumed same;
  • dim – dimension(s) in x to average over;
  • keepdim – tells whether to keep the resulting singleton dimension.
  • eps – minimum clamping value in the denominator.
Returns:

the mean tensor

  • if weights is None => mean(x, dim),
  • otherwise => sum(x*w, dim) / max{sum(w, dim), eps}.
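A short sketch (the shapes are illustrative assumptions):

import torch
from pytorch3d.ops import wmean

x = torch.rand(4, 100, 3)          # (*, D) with the points along dim=-2
w = torch.rand(4, 100)             # non-negative weights, broadcastable to x.shape[:-1]
mu = wmean(x, weight=w)            # (4, 1, 3) weighted mean with keepdim=True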

pytorch3d.ops.vert_align(feats, verts, return_packed: bool = False, interp_mode: str = 'bilinear', padding_mode: str = 'zeros', align_corners: bool = True) → torch.Tensor[source]

Sample vertex features from a feature map. This operation is called "perceptual feature pooling" in [1] or "vert align" in [2].

[1] Wang et al, "Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images", ECCV 2018.

[2] Gkioxari et al, “Mesh R-CNN”, ICCV 2019

Parameters:
  • feats – FloatTensor of shape (N, C, H, W) representing image features from which to sample or a list of features each with potentially different C, H or W dimensions.
  • verts – FloatTensor of shape (N, V, 3) or an object (e.g. Meshes or Pointclouds) with verts_padded or points_padded as an attribute giving the (x, y, z) vertex positions for which to sample. (x, y) verts should be normalized such that (-1, -1) corresponds to top-left and (+1, +1) to bottom-right location in the input feature map.
  • return_packed – (bool) Indicates whether to return packed features
  • interp_mode – (str) Specifies how to interpolate features. (‘bilinear’ or ‘nearest’)
  • padding_mode – (str) Specifies how to handle vertices outside of the [-1, 1] range. (‘zeros’, ‘reflection’, or ‘border’)
  • align_corners (bool) – Geometrically, we consider the pixels of the input as squares rather than points. If set to True, the extrema (-1 and 1) are considered as referring to the center points of the input’s corner pixels. If set to False, they are instead considered as referring to the corner points of the input’s corner pixels, making the sampling more resolution agnostic. Default: True
Returns:

feats_sampled – FloatTensor of shape (N, V, C) giving sampled features for each vertex. If feats is a list, we return concatenated features in axis=2 of shape (N, V, sum(C_n)) where C_n = feats[n].shape[1]. If return_packed = True, the features are transformed to a packed representation of shape (sum(V), C).
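A minimal usage sketch (the feature map size is an illustrative assumption; pytorch3d.utils.ico_sphere provides a toy mesh whose vertices already lie in the normalized [-1, 1] range):

import torch
from pytorch3d.ops import vert_align
from pytorch3d.utils import ico_sphere

feats = torch.rand(1, 64, 32, 32)          # (N, C, H, W) image features
meshes = ico_sphere(level=1)               # unit sphere verts lie in [-1, 1]
vert_feats = vert_align(feats, meshes)     # (N, V, C) = (1, 42, 64)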