pytorch3d.implicitron.models.view_pooler.view_pooler
- class pytorch3d.implicitron.models.view_pooler.view_pooler.ViewPooler(*args, **kwargs)[source]
Bases: Configurable, Module
Implements sampling of image-based features at the 2d projections of a set of 3D points, and a subsequent aggregation of the resulting set of features per-point.
- Parameters:
view_sampler – An instance of ViewSampler which is used for sampling of image-based features at the 2D projections of a set of 3D points.
feature_aggregator_class_type – The name of the feature aggregator class which is available in the global registry.
feature_aggregator – A feature aggregator class which inherits from FeatureAggregatorBase. Typically, the sampled features and their masks are produced by a ViewSampler, which samples feature tensors extracted from a set of source images; FeatureAggregator then performs the final per-point aggregation of these sampled features.
- view_sampler: ViewSampler
- feature_aggregator_class_type: str = 'AngleWeightedReductionFeatureAggregator'
- feature_aggregator: FeatureAggregatorBase
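In Implicitron's OmegaConf-driven configuration system, these fields map onto config keys. A hypothetical fragment is sketched below; the key names follow Implicitron's `*_args` naming convention, but the exact nesting depends on the enclosing model config and is an assumption here:

```yaml
# Hypothetical config fragment (key paths are assumptions)
view_pooler_args:
  feature_aggregator_class_type: AngleWeightedReductionFeatureAggregator
  view_sampler_args:
    masked_sampling: false
```

Setting feature_aggregator_class_type selects the aggregator implementation from the global registry; any class inheriting from FeatureAggregatorBase that is registered there can be named here.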
- get_aggregated_feature_dim(feats: Dict[str, Tensor] | int)[source]
Returns the final dimensionality of the output aggregated features.
- Parameters:
feats – Either a dict of sampled features {f_i: t_i} corresponding to the feats_sampled argument of feature_aggregator.forward, or an int representing the sum of the dimensionalities of each t_i.
- Returns:
aggregated_feature_dim – The final dimensionality of the output aggregated features.
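The dimensionality bookkeeping can be illustrated with a minimal, self-contained sketch. This is not the PyTorch3D implementation; it assumes a simple concatenating aggregator whose output dimension is the sum of the per-feature channel dims, and it represents tensors by their shape tuples:

```python
from typing import Dict, Tuple, Union

def aggregated_feature_dim(feats: Union[Dict[str, Tuple[int, ...]], int]) -> int:
    # If an int is passed, it is already the summed dimensionality.
    if isinstance(feats, int):
        return feats
    # Each sampled feature tensor t_i has shape (n_cameras, dim_i, H_i, W_i);
    # a concatenating aggregator outputs sum(dim_i) channels.
    return sum(shape[1] for shape in feats.values())

print(aggregated_feature_dim({"f1": (4, 16, 32, 32), "f2": (4, 32, 32, 32)}))  # 48
print(aggregated_feature_dim(48))  # 48
```

Passing an int through unchanged mirrors the documented behaviour of accepting either a feature dict or a precomputed summed dimensionality.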
- has_aggregation()[source]
Specifies whether the feature_aggregator reduces the reduce_dim dimension of the output to 1.
- Returns:
has_aggregation – True if reduce_dim==1, else False.
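As a minimal sketch (an assumption about the check, not the library's code), the predicate simply tests whether the aggregator collapses the per-camera axis to a single entry:

```python
def has_aggregation(reduce_dim: int) -> bool:
    # True when the aggregator collapses the per-camera axis to one entry.
    return reduce_dim == 1

print(has_aggregation(1), has_aggregation(5))  # True False
```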
- forward(*, pts: Tensor, seq_id_pts: List[int] | List[str] | LongTensor, camera: CamerasBase, seq_id_camera: List[int] | List[str] | LongTensor, feats: Dict[str, Tensor], masks: Tensor | None, **kwargs) Tensor | Dict[str, Tensor] [source]
Project each point cloud from a batch of point clouds to corresponding input cameras, sample features at the 2D projection locations in a batch of source images, and aggregate the pointwise sampled features.
- Parameters:
pts – A tensor of shape [pts_batch x n_pts x 3] in world coords.
seq_id_pts – LongTensor of shape [pts_batch] denoting the ids of the scenes from which pts were extracted, or a list of string names.
camera – A batch of n_cameras cameras, each corresponding to a batch element of feats.
seq_id_camera – LongTensor of shape [n_cameras] denoting the ids of the scenes corresponding to cameras in camera, or a list of string names.
feats – a dict of tensors of per-image features {feat_i: T_i}. Each tensor T_i is of shape [n_cameras x dim_i x H_i x W_i].
masks – A tensor of shape [n_cameras x 1 x H x W] defining the valid image regions for sampling feats.
- Returns:
feats_aggregated – If feature_aggregator.concatenate_output==True, a tensor of shape (pts_batch, reduce_dim, n_pts, sum(dim_1, …, dim_N)) containing the aggregated features, where reduce_dim depends on the specific feature aggregator implementation and typically equals 1 or n_cameras. If feature_aggregator.concatenate_output==False, the aggregator does not concatenate the aggregated features and instead returns a dictionary of per-feature aggregations {f_i: t_i_aggregated}, where each t_i_aggregated is of shape (pts_batch, reduce_dim, n_pts, aggr_dim_i).
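The two output layouts can be summarised with a shape-bookkeeping sketch. This is purely illustrative (the function name and the assumption aggr_dim_i == dim_i are hypothetical); it shows how concatenate_output switches between a single concatenated tensor and a per-feature dict:

```python
from typing import Dict, Tuple, Union

def forward_output_shape(
    pts_batch: int,
    n_pts: int,
    feat_dims: Dict[str, int],
    reduce_dim: int = 1,
    concatenate_output: bool = True,
) -> Union[Tuple[int, ...], Dict[str, Tuple[int, ...]]]:
    if concatenate_output:
        # Single tensor: all aggregated feature channels concatenated.
        return (pts_batch, reduce_dim, n_pts, sum(feat_dims.values()))
    # One aggregated tensor per input feature (aggr_dim_i == dim_i assumed here).
    return {name: (pts_batch, reduce_dim, n_pts, d) for name, d in feat_dims.items()}

print(forward_output_shape(2, 1024, {"f1": 16, "f2": 32}))  # (2, 1, 1024, 48)
```

With reduce_dim left at its default of 1, this corresponds to an aggregator for which has_aggregation() is True; an aggregator that keeps per-camera features would instead use reduce_dim == n_cameras.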