pytorch3d.datasets

Dataset loaders for datasets including ShapeNetCore.

class pytorch3d.datasets.R2N2(split: str, shapenet_dir, r2n2_dir, splits_file, return_all_views: bool = True, return_voxels: bool = False, views_rel_path: str = 'ShapeNetRendering', voxels_rel_path: str = 'ShapeNetVoxels', load_textures: bool = True, texture_resolution: int = 4)[source]

Bases: pytorch3d.datasets.shapenet_base.ShapeNetBase

This class loads the R2N2 dataset from a given directory into a Dataset object. The R2N2 dataset contains 13 categories that are a subset of the ShapeNetCore v.1 dataset. The R2N2 dataset also contains its own 24 renderings of each object and voxelized models. Most of the models have all 24 views in the same split, but there are eight of them that divide their views between train and test splits.

__init__(split: str, shapenet_dir, r2n2_dir, splits_file, return_all_views: bool = True, return_voxels: bool = False, views_rel_path: str = 'ShapeNetRendering', voxels_rel_path: str = 'ShapeNetVoxels', load_textures: bool = True, texture_resolution: int = 4)[source]

Store each object’s synset id and model id from the given directories.

Parameters:
  • split (str) – One of (train, val, test).
  • shapenet_dir (path) – Path to ShapeNet core v1.
  • r2n2_dir (path) – Path to the R2N2 dataset.
  • splits_file (path) – File containing the train/val/test splits.
  • return_all_views (bool) – Indicator of whether or not to load all the views in the split. If set to False, one of the views in the split will be randomly selected and loaded.
  • return_voxels (bool) – Indicator of whether or not to return voxels as a tensor of shape (D, D, D) where D is the number of voxels along each dimension.
  • views_rel_path – path to the rendered views within r2n2_dir. If not specified, the renderings are assumed to be at os.path.join(r2n2_dir, “ShapeNetRendering”).
  • voxels_rel_path – path to the voxelized models within r2n2_dir. If not specified, the voxels are assumed to be at os.path.join(r2n2_dir, “ShapeNetVoxels”).
  • load_textures – Boolean indicating whether textures should be loaded for the model. Textures will be of type TexturesAtlas i.e. a texture map per face.
  • texture_resolution – Int specifying the resolution of the texture map per face created using the textures in the obj file. A (texture_resolution, texture_resolution, 3) map is created per face.
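The constructor stores, for each object in the chosen split, its synset id and model id. As a rough illustration of that bookkeeping, the sketch below filters a nested split mapping down to (synset_id, model_id) pairs. The {split: {synset_id: {model_id: view_ids}}} layout is an assumption for illustration only, not the exact schema of the R2N2 splits file:

```python
# Hypothetical sketch: collect the models belonging to one split.
# The nested dict layout here is an assumption, not the R2N2 schema.
splits = {
    "train": {"02691156": {"model_a": [0, 1, 2], "model_b": [3]}},
    "test": {"02691156": {"model_a": [4, 5]}},
}

def models_in_split(splits, split):
    """Collect (synset_id, model_id) pairs belonging to one split."""
    return [
        (synset_id, model_id)
        for synset_id, models in splits[split].items()
        for model_id in models
    ]

print(models_in_split(splits, "train"))
# -> [('02691156', 'model_a'), ('02691156', 'model_b')]
```

Note that the same model id can appear in more than one split here, matching the eight R2N2 models whose views are divided between train and test.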
__getitem__(model_idx, view_idxs: Optional[List[int]] = None) → Dict[KT, VT][source]

Read a model by the given index.

Parameters:
  • model_idx – The idx of the model to be retrieved in the dataset.
  • view_idxs – List of indices of the views to be returned. Each index needs to be contained in the loaded split (always between 0 and 23, inclusive). If an invalid index is supplied, view_idxs is ignored and all the loaded views are returned.
Returns:

dictionary with the following keys:

  • verts: FloatTensor of shape (V, 3).
  • faces: faces.verts_idx, LongTensor of shape (F, 3).
  • synset_id (str): synset id.
  • model_id (str): model id.
  • label (str): synset label.
  • images: FloatTensor of shape (V, H, W, C), where V is the number of views
    returned. A batch of the renderings of the model from the R2N2 dataset.
  • R: Rotation matrix of shape (V, 3, 3), where V is number of views returned.
  • T: Translation matrix of shape (V, 3), where V is number of views returned.
  • K: Intrinsic matrix of shape (V, 4, 4), where V is number of views returned.
  • voxels: Voxels of shape (D, D, D), where D is the number of voxels along each
    dimension.
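
The view_idxs fallback described above (invalid indices cause all loaded views to be returned) can be sketched as follows. This is a simplified stand-in for illustration, not the library code:

```python
# Simplified stand-in (not the pytorch3d implementation) for the
# view_idxs fallback: if any requested index is not among the views
# loaded for this model's split, every loaded view is returned instead.
def select_views(loaded_views, view_idxs=None):
    if view_idxs is None:
        return loaded_views
    if all(idx in loaded_views for idx in view_idxs):
        return list(view_idxs)
    return loaded_views  # invalid request: fall back to all loaded views

loaded = [0, 5, 6, 23]                # views of this model in the split
print(select_views(loaded, [5, 6]))   # -> [5, 6]
print(select_views(loaded, [99]))     # -> [0, 5, 6, 23]
```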

render(model_ids: Optional[List[str]] = None, categories: Optional[List[str]] = None, sample_nums: Optional[List[int]] = None, idxs: Optional[List[int]] = None, view_idxs: Optional[List[int]] = None, shader_type=<class 'pytorch3d.renderer.mesh.shader.HardPhongShader'>, device='cpu', **kwargs) → torch.Tensor[source]

Render models with BlenderCamera by default to achieve the same orientations as the R2N2 renderings. Also accepts other types of cameras and any of the args that the render function in the ShapeNetBase class accepts.

Parameters:
  • view_idxs – each model will be rendered with the orientation(s) of the specified views. view_idxs is only used if no camera or args for BlenderCamera are supplied.
  • model_ids – List[str] of model_ids of models intended to be rendered.
  • categories – List[str] of categories intended to be rendered. categories and sample_nums must be specified at the same time. categories can be given in the form of synset offsets or labels, or a combination of both.
  • sample_nums – List[int] of numbers of models to be randomly sampled from each category. Could also contain one single integer, in which case it will be broadcast for every category.
  • idxs – List[int] of indices of models to be rendered in the dataset.
  • shader_type – Shader to use for rendering. Examples include HardPhongShader (default), SoftPhongShader, or any other valid Shader class.
  • device – torch.device on which the tensors should be located.
  • **kwargs – Accepts any of the kwargs that the renderer supports and any of the args that BlenderCamera supports.
Returns:

Batch of rendered images of shape (N, H, W, 3).
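The sample_nums broadcasting rule above (a single integer is repeated for every category) can be sketched like this; the helper name and error message are hypothetical, not part of the pytorch3d API:

```python
# Sketch (assumption, not library code) of broadcasting sample_nums:
# a single integer is repeated once per category, otherwise the two
# lists must have matching lengths.
def broadcast_sample_nums(categories, sample_nums):
    if len(sample_nums) == 1:
        return sample_nums * len(categories)
    if len(sample_nums) != len(categories):
        raise ValueError("sample_nums must match categories or have length 1")
    return list(sample_nums)

print(broadcast_sample_nums(["chair", "plane", "car"], [2]))  # -> [2, 2, 2]
print(broadcast_sample_nums(["chair", "plane"], [1, 3]))      # -> [1, 3]
```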

class pytorch3d.datasets.BlenderCamera(R, T, K, device='cpu')[source]

Bases: pytorch3d.renderer.cameras.CamerasBase

Camera for rendering objects with calibration matrices from the R2N2 dataset (which uses Blender for rendering the views for each model).

__init__(R, T, K, device='cpu')[source]
Parameters:
  • R – Rotation matrix of shape (N, 3, 3).
  • T – Translation matrix of shape (N, 3).
  • K – Intrinsic matrix of shape (N, 4, 4).
  • device – torch.device or str.
get_projection_transform(**kwargs) → pytorch3d.transforms.transform3d.Transform3d[source]
pytorch3d.datasets.collate_batched_R2N2(batch: List[Dict[KT, VT]])[source]

Take a list of objects in the form of dictionaries and merge them into a single dictionary. This function can be used with a Dataset object to create a torch.utils.data.DataLoader which directly returns Meshes objects. TODO: Add support for textures.

Parameters:batch – List of dictionaries containing information about objects in the dataset.
Returns:collated_dict – Dictionary of collated lists. If batch contains both verts and faces, a collated mesh batch is also returned.
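
The collation pattern can be sketched without pytorch3d: a list of per-object dicts is merged into one dict of lists keyed by field name. The real function additionally builds a Meshes batch from the verts and faces entries; that step is omitted here:

```python
# Minimal sketch of the collation pattern used by the collate functions:
# a list of per-object dicts becomes one dict of lists. Building the
# Meshes batch from verts/faces is intentionally left out.
def collate_dicts(batch):
    collated = {}
    for key in batch[0]:
        collated[key] = [item[key] for item in batch]
    return collated

batch = [
    {"synset_id": "02691156", "model_id": "a", "label": "airplane"},
    {"synset_id": "02691156", "model_id": "b", "label": "airplane"},
]
print(collate_dicts(batch)["model_id"])  # -> ['a', 'b']
```
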
pytorch3d.datasets.render_cubified_voxels(voxels: torch.Tensor, shader_type=<class 'pytorch3d.renderer.mesh.shader.HardPhongShader'>, device='cpu', **kwargs)[source]

Use the Cubify operator to convert input voxels to a mesh and then render that mesh.

Parameters:
  • voxels – FloatTensor of shape (N, D, D, D) where N is the batch size and D is the number of voxels along each dimension.
  • shader_type – Shader to use for rendering. Examples include HardPhongShader (default), SoftPhongShader, or any other valid Shader class.
  • device – torch.device on which the tensors should be located.
  • **kwargs – Accepts any of the kwargs that the renderer supports.
Returns:

Batch of rendered images of shape (N, H, W, 3).

class pytorch3d.datasets.ShapeNetCore(data_dir, synsets=None, version: int = 1, load_textures: bool = True, texture_resolution: int = 4)[source]

Bases: pytorch3d.datasets.shapenet_base.ShapeNetBase

This class loads ShapeNetCore from a given directory into a Dataset object. ShapeNetCore is a subset of the ShapeNet dataset and can be downloaded from https://www.shapenet.org/.

__init__(data_dir, synsets=None, version: int = 1, load_textures: bool = True, texture_resolution: int = 4)[source]

Store each object’s synset id and model id from data_dir.

Parameters:
  • data_dir – Path to ShapeNetCore data.
  • synsets – List of synset categories to load from ShapeNetCore in the form of synset offsets or labels. A combination of both is also accepted. When no category is specified, all categories in data_dir are loaded.
  • version – (int) version of ShapeNetCore data in data_dir, 1 or 2. Default is set to be 1. Version 1 has 57 categories and version 2 has 55 categories. Note: version 1 has two categories 02858304 (boat) and 02992529 (cellphone) that are hyponyms of categories 04530566 (watercraft) and 04401088 (telephone) respectively. You can combine the categories manually if needed. Version 2 doesn’t have 02858304 (boat) or 02834778 (bicycle) compared to version 1.
  • load_textures – Boolean indicating whether textures should be loaded for the model. Textures will be of type TexturesAtlas i.e. a texture map per face.
  • texture_resolution – Int specifying the resolution of the texture map per face created using the textures in the obj file. A (texture_resolution, texture_resolution, 3) map is created per face.
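As described under synsets above, categories may be given as synset offsets, labels, or a mix of both. A simplified sketch of that resolution step follows; the two-entry mapping and the helper name are illustrative assumptions, not the library implementation:

```python
# Sketch (assumption, not the pytorch3d implementation) of accepting
# synset categories as either numeric offsets or human-readable labels.
SYNSET_TO_LABEL = {"02691156": "airplane", "03001627": "chair"}
LABEL_TO_SYNSET = {v: k for k, v in SYNSET_TO_LABEL.items()}

def resolve_synsets(categories):
    resolved = []
    for cat in categories:
        if cat in SYNSET_TO_LABEL:        # already a synset offset
            resolved.append(cat)
        elif cat in LABEL_TO_SYNSET:      # a label: map it to its offset
            resolved.append(LABEL_TO_SYNSET[cat])
        else:
            raise KeyError("Unknown category: %s" % cat)
    return resolved

print(resolve_synsets(["airplane", "03001627"]))  # -> ['02691156', '03001627']
```
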
__getitem__(idx: int) → Dict[KT, VT][source]

Read a model by the given index.

Parameters:idx – The idx of the model to be retrieved in the dataset.
Returns:dictionary with the following keys:
  • verts: FloatTensor of shape (V, 3).
  • faces: LongTensor of shape (F, 3) which indexes into the verts tensor.
  • synset_id (str): synset id
  • model_id (str): model id
  • label (str): synset label.
pytorch3d.datasets.collate_batched_meshes(batch: List[Dict[KT, VT]])[source]

Take a list of objects in the form of dictionaries and merge them into a single dictionary. This function can be used with a Dataset object to create a torch.utils.data.DataLoader which directly returns Meshes objects. TODO: Add support for textures.

Parameters:batch – List of dictionaries containing information about objects in the dataset.
Returns:collated_dict – Dictionary of collated lists. If batch contains both verts and faces, a collated mesh batch is also returned.