pytorch3d.implicitron.models.view_pooler.view_pooler

view_pooler

class pytorch3d.implicitron.models.view_pooler.view_pooler.ViewPooler(*args, **kwargs)

Bases: Configurable, Module

Implements sampling of image-based features at the 2D projections of a set of 3D points, followed by per-point aggregation of the sampled features.

Parameters:

view_sampler – An instance of ViewSampler which is used for sampling of image-based features at the 2D projections of a set of 3D points.
feature_aggregator_class_type – The name of the feature aggregator class, which must be available in the global registry.
feature_aggregator – A feature aggregator whose class inherits from FeatureAggregatorBase. Typically, the features to be aggregated and their masks are output by a ViewSampler, which samples feature tensors extracted from a set of source images. The feature aggregator then performs the per-point aggregation of these sampled features.

view_sampler: ViewSampler

feature_aggregator_class_type: str = 'AngleWeightedReductionFeatureAggregator'

feature_aggregator: FeatureAggregatorBase

get_aggregated_feature_dim(feats: Dict[str, Tensor] | int)

Returns the final dimensionality of the output aggregated features.

Parameters:

feats – Either a dict of sampled features {f_i: t_i} corresponding to the feats_sampled argument of feature_aggregator.forward, or an int representing the sum of the dimensionalities of each t_i.

Returns:

aggregated_feature_dim – The final dimensionality of the output aggregated features.
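
The two accepted forms of feats can be illustrated with a minimal sketch. The helper total_feature_dim and its channel_axis argument are hypothetical and not part of PyTorch3D. Shapes are passed as plain tuples so the sketch runs without PyTorch, and the position of the channel axis is an assumption that depends on the tensor layout. The value computed here is only the summed input dimensionality; the aggregator may map it to a different output size, which the sketch does not model.

```python
from typing import Mapping, Sequence, Union


def total_feature_dim(
    feats: Union[Mapping[str, Sequence[int]], int],
    channel_axis: int,
) -> int:
    """Hypothetical helper: sum the channel sizes of a dict of feature shapes.

    `feats` is either {name: shape} or an int that already holds the sum.
    `channel_axis` is an assumption about where the channel size lives.
    """
    if isinstance(feats, int):
        # The caller already summed the dimensionalities; pass it through.
        return feats
    return sum(shape[channel_axis] for shape in feats.values())


# Shapes laid out as [n_cameras x dim_i x H_i x W_i], so the channel axis is 1.
feat_shapes = {"rgb": (5, 3, 64, 64), "latent": (5, 64, 16, 16)}
total_from_dict = total_feature_dim(feat_shapes, channel_axis=1)
total_from_int = total_feature_dim(67, channel_axis=1)
```

Both calls describe the same input, so they yield the same total.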

has_aggregation()

Specifies whether the feature_aggregator reduces the output reduce_dim dimension to 1.

Returns:

has_aggregation – True if reduce_dim==1, else False.
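
As a rough illustration of what reduce_dim means, the sketch below contrasts two toy aggregators acting on the features of a single point, stored as a nested list of shape [n_views][dim]. The functions mean_over_views, keep_views and reduces_to_one are hypothetical stand-ins, not PyTorch3D classes; they only mimic the distinction between an aggregator that collapses the source-view dimension to 1 and one that keeps it.

```python
from typing import List

Features = List[List[float]]  # [n_views][dim] features of one 3D point


def mean_over_views(per_view: Features) -> Features:
    """Toy reducing aggregator: average over views, giving [1][dim]."""
    n_views = len(per_view)
    dim = len(per_view[0])
    mean = [sum(view[d] for view in per_view) / n_views for d in range(dim)]
    return [mean]


def keep_views(per_view: Features) -> Features:
    """Toy non-reducing aggregator: output stays [n_views][dim]."""
    return [list(view) for view in per_view]


def reduces_to_one(aggregated: Features) -> bool:
    """Mirrors the documented rule: True if reduce_dim == 1, else False."""
    return len(aggregated) == 1


per_view_feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 views, dim 2
pooled = mean_over_views(per_view_feats)
unpooled = keep_views(per_view_feats)
```

Here pooled has a leading dimension of 1, while unpooled keeps one entry per source view.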

forward(pts, seq_id_pts, camera, seq_id_camera, feats, masks)

Projects each point cloud in a batch of point clouds to the corresponding input cameras, samples features at the 2D projection locations in a batch of source images, and aggregates the pointwise sampled features.

Parameters:

pts – A tensor of shape [pts_batch x n_pts x 3] in world coords.
seq_id_pts – LongTensor of shape [pts_batch] denoting the ids of the scenes from which pts were extracted, or a list of string names.
camera – n_cameras cameras, each corresponding to a batch element of feats.
seq_id_camera – LongTensor of shape [n_cameras] denoting the ids of the scenes corresponding to cameras in camera, or a list of string names.
feats – a dict of tensors of per-image features {feat_i: T_i}. Each tensor T_i is of shape [n_cameras x dim_i x H_i x W_i].
masks – A tensor of shape [n_cameras x 1 x H x W] defining the valid image regions for sampling feats.
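
The shape constraints listed above can be summarized as a small consistency check. This is only a sketch: check_input_shapes is a hypothetical helper, not a PyTorch3D function, and it works on shape tuples and id-list lengths so that it runs without PyTorch. It assumes that the number of cameras equals the length of seq_id_camera.

```python
from typing import Dict, Tuple

Shape = Tuple[int, ...]


def check_input_shapes(
    pts: Shape,
    n_seq_id_pts: int,
    n_seq_id_camera: int,
    feats: Dict[str, Shape],
    masks: Shape,
) -> Tuple[int, int, int]:
    """Hypothetical check; returns (pts_batch, n_pts, n_cameras) or raises."""
    if len(pts) != 3 or pts[-1] != 3:
        raise ValueError(f"pts must be [pts_batch x n_pts x 3], got {pts}")
    pts_batch, n_pts, _ = pts
    if n_seq_id_pts != pts_batch:
        raise ValueError("seq_id_pts must have one id per point cloud")
    # Assumption: one sequence id per camera defines n_cameras.
    n_cameras = n_seq_id_camera
    for name, shape in feats.items():
        if len(shape) != 4 or shape[0] != n_cameras:
            raise ValueError(f"feats[{name!r}] must be [n_cameras x dim x H x W]")
    if len(masks) != 4 or masks[0] != n_cameras or masks[1] != 1:
        raise ValueError("masks must be [n_cameras x 1 x H x W]")
    return pts_batch, n_pts, n_cameras


sizes = check_input_shapes(
    pts=(2, 100, 3),
    n_seq_id_pts=2,
    n_seq_id_camera=5,
    feats={"rgb": (5, 3, 64, 64), "latent": (5, 64, 16, 16)},
    masks=(5, 1, 64, 64),
)
```

Note that each feature map may have its own spatial size H_i x W_i, so the check does not compare spatial sizes across feats and masks.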

Returns:

feats_aggregated – If feature_aggregator.concatenate_output==True, a tensor of shape (pts_batch, reduce_dim, n_pts, sum(dim_1, … dim_N)) containing the aggregated features. reduce_dim depends on the specific feature aggregator implementation and typically equals 1 or n_cameras.

If feature_aggregator.concatenate_output==False, the aggregator does not concatenate the aggregated features and returns a dictionary of per-feature aggregations {f_i: t_i_aggregated} instead. Each t_i_aggregated is of shape (pts_batch, reduce_dim, n_pts, aggr_dim_i).
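
The two return layouts can be compared with a short shape-bookkeeping sketch. The function expected_output_shapes is hypothetical and not part of PyTorch3D. It takes the per-feature output sizes as given and assumes, for simplicity, that the last dimension of the concatenated tensor is their sum.

```python
from typing import Dict, Tuple, Union

Shape = Tuple[int, int, int, int]


def expected_output_shapes(
    pts_batch: int,
    n_pts: int,
    reduce_dim: int,
    feat_dims: Dict[str, int],
    concatenate_output: bool,
) -> Union[Shape, Dict[str, Shape]]:
    """Hypothetical helper: shape(s) of the aggregated output."""
    if concatenate_output:
        # One tensor whose last dimension stacks all per-feature sizes.
        return (pts_batch, reduce_dim, n_pts, sum(feat_dims.values()))
    # One entry per feature, each keeping its own last dimension.
    return {
        name: (pts_batch, reduce_dim, n_pts, dim)
        for name, dim in feat_dims.items()
    }


dims = {"rgb": 3, "latent": 64}
concatenated = expected_output_shapes(2, 100, 1, dims, concatenate_output=True)
per_feature = expected_output_shapes(2, 100, 5, dims, concatenate_output=False)
```

The first call models a reducing aggregator (reduce_dim of 1), the second an aggregator that keeps one slice per camera (reduce_dim equal to n_cameras, here 5).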