pytorch3d.implicitron.models.renderer.base


class pytorch3d.implicitron.models.renderer.base.EvaluationMode(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

TRAINING = 'training'
EVALUATION = 'evaluation'
class pytorch3d.implicitron.models.renderer.base.RenderSamplingMode(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

MASK_SAMPLE = 'mask_sample'
FULL_GRID = 'full_grid'
class pytorch3d.implicitron.models.renderer.base.ImplicitronRayBundle(origins: Tensor, directions: Tensor, lengths: Tensor | None, xys: Tensor, camera_ids: LongTensor | None = None, camera_counts: LongTensor | None = None, bins: Tensor | None = None, pixel_radii_2d: Tensor | None = None)[source]

Bases: object

Parametrizes points along projection rays by storing ray origins, direction vectors, and the lengths at which the ray points are sampled. Furthermore, the xy-locations (xys) of the ray pixels are stored as well. Note that directions don’t have to be normalized; they define unit vectors in the respective 1D coordinate systems; see the documentation for ray_bundle_to_ray_points() for the conversion formula.

A ray bundle may represent rays from multiple cameras. In that case, cameras are stored in packed form (i.e. rays from the same camera are stored in consecutive elements). The following indices will be set:

camera_ids: A tensor of shape (N, ) which indicates which camera was used to sample the rays. N is the number of unique sampled cameras.

camera_counts: A tensor of shape (N, ) indicating how many times the corresponding camera in camera_ids was sampled. sum(camera_counts) == minibatch, where minibatch = origins.shape[0].
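The packed layout can be illustrated with a small, self-contained sketch (plain Python with hypothetical values, not the pytorch3d implementation):

```python
# Hypothetical packed layout: 5 rays sampled from 2 cameras.
camera_ids = [3, 7]      # ids of the sampled cameras, in packed order
camera_counts = [2, 3]   # how many rays each camera contributed

# Rays 0-1 belong to camera 3, rays 2-4 to camera 7.
ray_to_camera = [cid for cid, n in zip(camera_ids, camera_counts) for _ in range(n)]

# The packed invariant: counts sum to the minibatch size (origins.shape[0]).
minibatch = len(ray_to_camera)
assert sum(camera_counts) == minibatch
```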

origins

A tensor of shape (…, 3) denoting the origins of the sampling rays in world coords.

directions

A tensor of shape (…, 3) containing the direction vectors of sampling rays in world coords. They don’t have to be normalized; they define unit vectors in the respective 1D coordinate systems; see documentation for ray_bundle_to_ray_points() for the conversion formula.

lengths

A tensor of shape (…, num_points_per_ray) containing the lengths at which the rays are sampled.

xys

A tensor of shape (…, 2) containing the xy-locations of the ray pixels.

camera_ids

An optional tensor of shape (N, ) which indicates which camera was used to sample the rays. N is the number of unique sampled cameras.

camera_counts

An optional tensor of shape (N, ) indicating how many times the corresponding camera in camera_ids was sampled. sum(camera_counts) == total_number_of_rays.

bins

An optional tensor of shape (…, num_points_per_ray + 1) containing the bins at which the rays are sampled. In this case, lengths should be equal to the midpoints of bins, of shape (…, num_points_per_ray).

pixel_radii_2d

An optional tensor of shape (…, 1) containing the base radii of the conical frustums.

Raises:
  • ValueError – If neither bins nor lengths is provided.

  • ValueError – If bins is provided and its last dimension is of size 1 or less.
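As a reminder of the conversion formula referenced above (see ray_bundle_to_ray_points()), the sample points are origins + lengths * directions. A minimal pure-Python sketch for a single ray (the helper name is ours, not the pytorch3d API):

```python
def ray_points(origin, direction, lengths):
    """Sample points along one ray: point_i = origin + lengths[i] * direction.

    The direction need not be normalized; lengths are expressed in units of
    the direction vector.
    """
    return [[o + t * d for o, d in zip(origin, direction)] for t in lengths]

# A ray from the origin along a non-unit direction (0, 0, 2):
pts = ray_points([0.0, 0.0, 0.0], [0.0, 0.0, 2.0], [1.0, 2.0])
```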

property lengths: Tensor
is_packed() bool[source]

Returns whether the ImplicitronRayBundle carries data in a packed state.

get_padded_xys() Tuple[Tensor, LongTensor, int][source]

For a packed ray bundle, returns padded rays. Assumes the input bundle is packed (i.e. camera_ids and camera_counts are set).

Returns:

  • xys: Tensor of shape (N, max_size, …) containing the padded representation of the pixel coordinates, where max_size is the max of camera_counts. The values for camera id i are copied to xys[i, :], with zeros padding out the extra entries.

  • first_idxs: cumulative sum of camera_counts defining the boundaries between cameras in the packed representation.

  • num_inputs: the number of cameras in the bundle.
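The packed-to-padded conversion can be sketched in plain Python (a simplified, hypothetical pad_packed helper, not the pytorch3d implementation):

```python
def pad_packed(values, counts):
    """Split a packed list into per-camera rows, zero-padded to the longest.

    Returns (padded_rows, first_idxs, num_cameras), mirroring the spirit of
    get_padded_xys()'s output.
    """
    max_size = max(counts)
    padded, first_idxs, start = [], [], 0
    for n in counts:
        first_idxs.append(start)  # boundary of this camera in the packed data
        padded.append(values[start:start + n] + [0] * (max_size - n))
        start += n
    return padded, first_idxs, len(counts)

# 5 packed values from 2 cameras with counts (2, 3):
rows, first_idxs, num_cameras = pad_packed([1, 2, 3, 4, 5], [2, 3])
```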

class pytorch3d.implicitron.models.renderer.base.RendererOutput(features: ~torch.Tensor, depths: ~torch.Tensor, masks: ~torch.Tensor, prev_stage: ~pytorch3d.implicitron.models.renderer.base.RendererOutput | None = None, normals: ~torch.Tensor | None = None, points: ~torch.Tensor | None = None, weights: ~torch.Tensor | None = None, aux: ~typing.Dict[str, ~typing.Any] = <factory>)[source]

Bases: object

A structure for storing the output of a renderer.

Parameters:
  • features – rendered features (usually RGB colors), (B, …, C) tensor.

  • depths – rendered ray-termination depth map, in NDC coordinates, (B, …, 1) tensor.

  • masks – rendered object mask, values in [0, 1], (B, …, 1) tensor.

  • prev_stage – for multi-pass renderers (e.g. in NeRF), a reference to the output of the previous stage.

  • normals – surface normals, for renderers that estimate them; (B, …, 3) tensor.

  • points – ray-termination points in the world coordinates, (B, …, 3) tensor.

  • aux – dict for implementation-specific renderer outputs.

features: Tensor
depths: Tensor
masks: Tensor
prev_stage: RendererOutput | None = None
normals: Tensor | None = None
points: Tensor | None = None
weights: Tensor | None = None
aux: Dict[str, Any]
class pytorch3d.implicitron.models.renderer.base.ImplicitFunctionWrapper(fn: Module)[source]

Bases: Module

bind_args(**bound_args)[source]
unbind_args()[source]
forward(*args, **kwargs)[source]
class pytorch3d.implicitron.models.renderer.base.BaseRenderer(*args, **kwargs)[source]

Bases: ABC, ReplaceableBase

Base class for all Renderer implementations.

requires_object_mask() bool[source]

Whether forward needs the object_mask.

abstract forward(ray_bundle: ImplicitronRayBundle, implicit_functions: List[ImplicitFunctionWrapper], evaluation_mode: EvaluationMode = EvaluationMode.EVALUATION, **kwargs) RendererOutput[source]

Each Renderer should implement its own forward function that returns an instance of RendererOutput.

Parameters:
  • ray_bundle – An ImplicitronRayBundle object containing the following variables:

    origins: A tensor of shape (minibatch, …, 3) denoting the origins of the rendering rays.

    directions: A tensor of shape (minibatch, …, 3) containing the direction vectors of rendering rays.

    lengths: A tensor of shape (minibatch, …, num_points_per_ray) containing the lengths at which the ray points are sampled. The coordinates of the points on the rays are thus computed as origins + lengths * directions.

    xys: A tensor of shape (minibatch, …, 2) containing the xy locations of each ray’s pixel in the NDC screen space.

    camera_ids: An optional tensor of shape (N, ) which indicates which camera was used to sample the rays. N is the number of unique sampled cameras.

    camera_counts: An optional tensor of shape (N, ) indicating how many times the corresponding camera in camera_ids was sampled. sum(camera_counts) == minibatch.

  • implicit_functions – List of ImplicitFunctionWrappers which define the implicit function methods to be used. Most Renderers only allow a single implicit function. Currently, only the MultiPassEmissionAbsorptionRenderer allows specifying multiple values in the list.

  • evaluation_mode – one of EvaluationMode.TRAINING or EvaluationMode.EVALUATION which determines the settings used for rendering.

  • **kwargs – In addition to the named args, custom keyword args can be specified. For example, in the SignedDistanceFunctionRenderer, an object_mask is required, which needs to be passed via the kwargs.

Returns:

instance of RendererOutput
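The shape of the forward contract can be sketched with self-contained stand-ins (toy classes, not the pytorch3d types; an implicit function is modeled here as a plain callable):

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

@dataclass
class ToyRendererOutput:
    # Mirrors the RendererOutput fields used below; the rest are omitted.
    features: Any
    depths: Any = None
    masks: Any = None
    prev_stage: Optional["ToyRendererOutput"] = None
    aux: Dict[str, Any] = field(default_factory=dict)

class ToyRenderer:
    """Illustrates the contract of BaseRenderer.forward: consume a ray bundle
    and a list of implicit functions, return a RendererOutput-like object."""

    def forward(self, ray_bundle, implicit_functions,
                evaluation_mode="evaluation", **kwargs):
        # A real renderer would march along ray_bundle.lengths and blend the
        # implicit function's outputs; here we just query the first function.
        fn = implicit_functions[0]
        return ToyRendererOutput(features=fn(ray_bundle))

out = ToyRenderer().forward("rays", [lambda rb: rb.upper()])
```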

pytorch3d.implicitron.models.renderer.base.compute_3d_diagonal_covariance_gaussian(rays_directions: Tensor, rays_dir_variance: Tensor, radii_variance: Tensor, eps: float = 1e-06) Tensor[source]

Transform the variances (rays_dir_variance, radii_variance) of the gaussians from the coordinate frame of the conical frustum to 3D world coordinates.

It follows equation 16 of MIP-NeRF.

Parameters:
  • rays_directions – A tensor of shape (…, 3)

  • rays_dir_variance – A tensor of shape (…, num_intervals) representing the variance of the conical frustum with respect to the rays direction.

  • radii_variance – A tensor of shape (…, num_intervals) representing the variance of the conical frustum with respect to its radius.

  • eps – a small number to prevent division by zero.

Returns:

A tensor of shape (…, num_intervals, 3) containing the diagonal of the covariance matrix.
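The diagonal of equation 16, Σ = σ_t² d dᵀ + σ_r² (I − d dᵀ/‖d‖²), reduces per component to σ_t² dᵢ² + σ_r² (1 − dᵢ²/‖d‖²). A scalar pure-Python sketch for a single ray and interval (the helper name is ours, not the pytorch3d API):

```python
def diag_covariance_3d(direction, rays_dir_var, radii_var, eps=1e-6):
    """Diagonal of Sigma = sigma_t^2 d d^T + sigma_r^2 (I - d d^T / ||d||^2),
    i.e. equation 16 of mip-NeRF, for one ray direction d and one interval."""
    norm_sq = max(sum(c * c for c in direction), eps)  # eps guards division by zero
    return [rays_dir_var * c * c + radii_var * (1.0 - c * c / norm_sq)
            for c in direction]

# Ray along +z: the ray-direction variance lands on z, the radial one on x, y.
diag = diag_covariance_3d([0.0, 0.0, 1.0], rays_dir_var=2.0, radii_var=3.0)
```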

pytorch3d.implicitron.models.renderer.base.approximate_conical_frustum_as_gaussians(bins: Tensor, radii: Tensor) Tuple[Tensor, Tensor, Tensor][source]

Approximates a conical frustum as two Gaussian distributions.

The Gaussian distributions are characterized by three values:

  • rays_dir_mean: mean along the rays direction

    (defined as t in the parametric representation of a cone).

  • rays_dir_variance: the variance of the conical frustum along the rays direction.

  • radii_variance: variance of the conical frustum with respect to its radius.

The computation is stable and follows equation 7 of MIP-NeRF.

For more information on how the mean and variances are computed, refer to the appendix of the paper.

Parameters:
  • bins – A tensor of shape (…, num_points_per_ray + 1) containing the bins at which the rays are sampled. bins[…, t] and bins[…, t + 1] represent, respectively, the left (t0) and right (t1) coordinates of the interval on which the ray points are sampled.

  • radii – A tensor of shape (…, 1) containing the base radii of the conical frustums.

Returns:

rays_dir_mean: A tensor of shape (…, num_intervals) representing the mean along the rays direction (t in the parametric representation of the cone).

rays_dir_variance: A tensor of shape (…, num_intervals) representing the variance of the conical frustum along the rays direction.

radii_variance: A tensor of shape (…, num_intervals) representing the variance of the conical frustum with respect to its radius.
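Equation 7 of mip-NeRF can be written out for a single interval [t0, t1] as a pure-Python sketch (the helper name is ours; the real function operates on whole tensors of bins):

```python
def frustum_gaussian_moments(t0, t1, radius):
    """Stable mean/variances of a conical frustum over [t0, t1] (mip-NeRF eq. 7)."""
    mu = (t0 + t1) / 2.0                 # interval midpoint
    hw = (t1 - t0) / 2.0                 # interval half-width
    denom = 3.0 * mu ** 2 + hw ** 2
    rays_dir_mean = mu + (2.0 * mu * hw ** 2) / denom
    rays_dir_variance = hw ** 2 / 3.0 - (
        4.0 * hw ** 4 * (12.0 * mu ** 2 - hw ** 2)
    ) / (15.0 * denom ** 2)
    radii_variance = radius ** 2 * (
        mu ** 2 / 4.0 + 5.0 * hw ** 2 / 12.0 - 4.0 * hw ** 4 / (15.0 * denom)
    )
    return rays_dir_mean, rays_dir_variance, radii_variance

t_mean, t_var, r_var = frustum_gaussian_moments(1.0, 2.0, 1.0)
# The mean is pushed past the midpoint 1.5, toward the wider end of the frustum.
```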

pytorch3d.implicitron.models.renderer.base.conical_frustum_to_gaussian(ray_bundle: ImplicitronRayBundle) Tuple[Tensor, Tensor][source]

Approximate a conical frustum following a ray bundle as a Gaussian.

Parameters:

ray_bundle – An ImplicitronRayBundle object with fields:

origins: A tensor of shape (…, 3)

directions: A tensor of shape (…, 3)

lengths: A tensor of shape (…, num_points_per_ray)

bins: A tensor of shape (…, num_points_per_ray + 1) containing the bins at which the rays are sampled.

pixel_radii_2d: A tensor of shape (…, 1) containing the base radii of the conical frustums.

Returns:

means: A tensor of shape (…, num_points_per_ray - 1, 3) representing the means of the Gaussians approximating the conical frustums.

diag_covariances: A tensor of shape (…, num_points_per_ray - 1, 3) representing the diagonal covariance matrices of our Gaussians.
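Putting the pieces together for a single ray, the Gaussian means land at origin + rays_dir_mean * direction for each interval of bins. A pure-Python sketch with a hypothetical helper name:

```python
def frustum_means_world(origin, direction, bins):
    """World-space means of per-interval Gaussians along one ray.

    For each interval [bins[i], bins[i + 1]] the mean distance t_mean follows
    mip-NeRF eq. 7; the 3D mean is then origin + t_mean * direction.
    """
    means = []
    for t0, t1 in zip(bins[:-1], bins[1:]):
        mu, hw = (t0 + t1) / 2.0, (t1 - t0) / 2.0
        t_mean = mu + (2.0 * mu * hw ** 2) / (3.0 * mu ** 2 + hw ** 2)
        means.append([o + t_mean * c for o, c in zip(origin, direction)])
    return means

# One interval [1, 2] along +z from the origin:
mean_pts = frustum_means_world([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [1.0, 2.0])
```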