pytorch3d.implicitron.models.view_pooler.view_sampler

class pytorch3d.implicitron.models.view_pooler.view_sampler.ViewSampler(*args, **kwargs)[source]

Bases: Configurable, Module

Implements sampling of image-based features at the 2d projections of a set of 3D points.

Parameters:
  • masked_sampling – If True, the sampled_masks output of self.forward contains the input masks sampled at the 2d projections. Otherwise, all entries of sampled_masks are set to 1.

  • sampling_mode – Controls the mode of the torch.nn.functional.grid_sample function used to interpolate the sampled feature tensors at the locations of the 2d projections.

masked_sampling: bool = False
sampling_mode: str = 'bilinear'
forward(*, pts: Tensor, seq_id_pts: List[int] | List[str] | LongTensor, camera: CamerasBase, seq_id_camera: List[int] | List[str] | LongTensor, feats: Dict[str, Tensor], masks: Tensor | None, **kwargs) → Tuple[Dict[str, Tensor], Tensor][source]

Project each point cloud from a batch of point clouds to corresponding input cameras and sample features at the 2D projection locations.

Parameters:
  • pts – A tensor of shape [pts_batch x n_pts x 3] in world coords.

  • seq_id_pts – LongTensor of shape [pts_batch] denoting the ids of the scenes from which pts were extracted, or a list of string names.

  • camera – ‘n_cameras’ cameras, each corresponding to a batch element of feats.

  • seq_id_camera – LongTensor of shape [n_cameras] denoting the ids of the scenes corresponding to cameras in camera, or a list of string names.

  • feats – a dict of tensors of per-image features {feat_i: T_i}. Each tensor T_i is of shape [n_cameras x dim_i x H_i x W_i].

  • masks – A tensor of shape [n_cameras x 1 x H x W] defining valid image regions for sampling feats.

Returns:

sampled_feats: Dict of sampled features {feat_i: sampled_T_i}. Each sampled_T_i is of shape [pts_batch, n_cameras, n_pts, dim_i].

sampled_masks: A tensor with the mask of the sampled features, of shape (pts_batch, n_cameras, n_pts, 1).
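The pairing of point clouds with cameras is driven by the sequence ids: samples are only meaningful for (point cloud, camera) pairs that come from the same scene. A minimal sketch of this matching, assuming integer sequence ids (the helper name seq_matching_sketch is illustrative, not part of the API):

```python
import torch

def seq_matching_sketch(seq_id_pts, seq_id_camera):
    # Returns a (pts_batch, n_cameras) boolean mask whose entry [i, j]
    # is True when point cloud i and camera j share a scene, i.e. the
    # pairs for which sampled features are meaningful.
    return seq_id_pts[:, None] == seq_id_camera[None, :]

# e.g. two point clouds from scenes 0 and 1, three cameras from scenes 0, 0, 1
match = seq_matching_sketch(torch.tensor([0, 1]), torch.tensor([0, 0, 1]))
```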

pytorch3d.implicitron.models.view_pooler.view_sampler.project_points_and_sample(pts: Tensor, feats: Dict[str, Tensor], camera: CamerasBase, masks: Tensor | None, eps: float = 0.01, sampling_mode: str = 'bilinear') → Tuple[Dict[str, Tensor], Tensor][source]

Project each point cloud from a batch of point clouds to all input cameras and sample features at the 2D projection locations.

Parameters:
  • pts – A tensor of shape (pts_batch, n_pts, 3) containing a batch of 3D point clouds.

  • feats – A dict {feat_i: feat_T_i} of features to sample, where each feat_T_i is a tensor of shape (n_cameras, feat_i_dim, feat_i_H, feat_i_W) of feat_i_dim-dimensional features extracted from n_cameras source views.

  • camera – A batch of n_cameras cameras corresponding to their feature tensors feat_T_i from feats.

  • masks – A tensor of shape (n_cameras, 1, mask_H, mask_W) denoting valid locations for sampling.

  • eps – A small constant controlling the minimum depth of projections of pts to avoid divisions by zero in the projection operation.

  • sampling_mode – Sampling mode of the grid sampler.

Returns:

sampled_feats: Dict of sampled features {feat_i: sampled_T_i}. Each sampled_T_i is of shape (pts_batch, n_cameras, n_pts, feat_i_dim).

sampled_masks: A tensor with the mask of the sampled features, of shape (pts_batch, n_cameras, n_pts, 1). If masks is None, the returned sampled_masks will be filled with 1s.
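The projection-and-sampling step can be sketched in plain PyTorch for a single pinhole camera. This is a simplified stand-in, assuming points already expressed in the camera frame and a 3x3 intrinsics matrix K, whereas the real function projects world-space points through a CamerasBase batch:

```python
import torch
import torch.nn.functional as F

def project_and_sample_sketch(pts, feats, K, eps=0.01, sampling_mode="bilinear"):
    # pts: (n_pts, 3) camera-frame points; feats: (1, dim, H, W) feature map;
    # K: (3, 3) pinhole intrinsics. Illustrative only.
    z = pts[:, 2:].clamp(min=eps)                    # clamp depth, mirroring eps
    uv = (pts[:, :2] / z) @ K[:2, :2].T + K[:2, 2]   # perspective projection to pixels
    H, W = feats.shape[-2:]
    # normalize pixel coords to [-1, 1] as required by grid_sample
    grid = torch.stack([2 * uv[:, 0] / (W - 1) - 1,
                        2 * uv[:, 1] / (H - 1) - 1], dim=-1)
    sampled = F.grid_sample(feats, grid[None, None], mode=sampling_mode,
                            align_corners=True)       # (1, dim, 1, n_pts)
    return sampled[0, :, 0].T                         # (n_pts, dim)
```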

pytorch3d.implicitron.models.view_pooler.view_sampler.handle_seq_id(seq_id: LongTensor | List[str] | List[int], device) → LongTensor[source]

Converts the input sequence id to a LongTensor.

Parameters:
  • seq_id – A sequence of sequence ids.

  • device – The target device of the output.

Returns:

long_seq_id: seq_id converted to a LongTensor and moved to device.
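A minimal sketch of the conversion, assuming string ids may be hashed to integers (the real implementation's hashing scheme may differ):

```python
import torch

def handle_seq_id_sketch(seq_id, device):
    # Accepts a LongTensor, a list of ints, or a list of strings and
    # returns a LongTensor on the requested device.
    if not torch.is_tensor(seq_id):
        if len(seq_id) > 0 and isinstance(seq_id[0], str):
            seq_id = [hash(s) for s in seq_id]  # illustrative hash, not the library's
        seq_id = torch.LongTensor(seq_id)
    return seq_id.to(device)
```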

pytorch3d.implicitron.models.view_pooler.view_sampler.cameras_points_cartesian_product(camera: CamerasBase, pts: Tensor) → Tuple[CamerasBase, Tensor][source]

Generates all pairs of elements from ‘camera’ and ‘pts’ and returns camera_rep and pts_rep such that:

camera_rep = [                 pts_rep = [
    camera[0]                      pts[0],
    camera[0]                      pts[1],
    camera[0]                      ...,
    ...                            pts[batch_pts-1],
    camera[1]                      pts[0],
    camera[1]                      pts[1],
    camera[1]                      ...,
    ...                            pts[batch_pts-1],
    ...                            ...,
    camera[n_cameras-1]            pts[0],
    camera[n_cameras-1]            pts[1],
    camera[n_cameras-1]            ...,
    ...                            pts[batch_pts-1],
]                              ]
Parameters:
  • camera – A batch of n_cameras cameras.

  • pts – A batch of batch_pts points of shape (batch_pts, …, dim)

Returns:

camera_rep

A batch of batch_pts*n_cameras cameras such that:

camera_rep = [
    camera[0]
    camera[0]
    camera[0]
    ...
    camera[1]
    camera[1]
    camera[1]
    ...
    ...
    camera[n_cameras-1]
    camera[n_cameras-1]
    camera[n_cameras-1]
]
pts_rep: Repeated pts of shape (batch_pts*n_cameras, …, dim),

such that:

pts_rep = [
    pts[0], pts[1], ..., pts[batch_pts-1],
    pts[0], pts[1], ..., pts[batch_pts-1],
    ...,
    pts[0], pts[1], ..., pts[batch_pts-1],
]
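The repetition pattern above can be sketched with plain tensor ops, using a per-camera feature tensor as a stand-in for the camera batch (real CamerasBase objects are indexed and concatenated rather than repeat_interleave'd):

```python
import torch

def cameras_points_cartesian_product_sketch(cam_feats, pts):
    # cam_feats: (n_cameras, ...) stand-in for camera parameters
    # pts: (batch_pts, ..., dim) batch of point clouds
    n_cameras, batch_pts = cam_feats.shape[0], pts.shape[0]
    # each camera repeated batch_pts times consecutively
    cam_rep = cam_feats.repeat_interleave(batch_pts, dim=0)
    # the whole pts batch tiled n_cameras times
    pts_rep = pts.repeat(n_cameras, *([1] * (pts.dim() - 1)))
    return cam_rep, pts_rep
```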