d3d.dataset

This module contains loaders for various datasets.

class d3d.dataset.base.NumberPool(processes, offset=0, *args, **kargs)[source]

Bases: object

This class is a utility for multiprocessing with tqdm progress bars. Define the task as

def task(ntqdm, ...):
    ...
    for data in tqdm(..., position=ntqdm, leave=False):
        ...

Then the parallel progress bars will be displayed in place.

Parameters
  • processes (int) – Number of processes available in the pool. If processes < 1, then the functions will be executed in the current thread.

  • offset (int) – The offset added to the ntqdm value of all processes. This is useful when you want to display a progress bar in an outer loop.

apply_async(func, args=(), callback=None)[source]
wait_for_once(margin=0)[source]

Block the current thread and wait until a process becomes available.

Parameters

margin (int) – Defines when a process is considered available. The method will return when there are nprocess + margin processes in the pool.
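
For illustration, a minimal usage sketch (the convert task and its inputs are hypothetical; it is assumed here that apply_async injects the ntqdm slot as the first argument of the task):

from tqdm import tqdm
from d3d.dataset.base import NumberPool

def convert(ntqdm, chunk):
    # hypothetical worker; ntqdm is the progress bar slot assigned by the pool
    total = 0
    for item in tqdm(chunk, position=ntqdm, leave=False):
        total += item
    return total

if __name__ == "__main__":
    pool = NumberPool(processes=2)
    for chunk in ([1, 2, 3], [4, 5, 6], [7, 8, 9]):
        pool.wait_for_once()  # block until a worker slot becomes available
        pool.apply_async(convert, (chunk,), callback=print)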

class d3d.dataset.base.DatasetBase(base_path, inzip=False, phase='training', trainval_split=1.0, trainval_random=False)[source]

Bases: object

This class acts as the base for all dataset loaders

Parameters
  • base_path (Union[str, pathlib.Path]) – directory containing the zip files, or the required data

  • inzip (bool) – whether the dataset is stored in the original zip archives or unzipped

  • phase (str) – training, validation or testing

  • trainval_split (Union[float, List[int]]) – the ratio used to split the training dataset. See the documentation of split_trainval() for details.

  • trainval_random (Union[bool, int, str]) – whether to select the train/val split randomly. See the documentation of split_trainval() for details.

identity(idx)[source]
Return something that can track the data back to the original dataset. The result tuple can be passed to any accessor above to directly access the given data.

Parameters

idx (int) – index of requested frame to be parsed

Return type

tuple

return_path()[source]

Make the dataset return the raw paths to the data instead of parsing them. This method returns a context manager.

Return type

AbstractContextManager
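
A minimal usage sketch, assuming loader is a concrete dataset instance (hypothetical here) that also provides the multi-modal accessors:

# loader: any concrete DatasetBase subclass instance (hypothetical)
with loader.return_path():
    raw = loader.camera_data(0)  # a path to the image file instead of a parsed PIL image
print(raw)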

class d3d.dataset.base.DetectionDatasetBase(base_path, inzip=False, phase='training', trainval_split=1.0, trainval_random=False)[source]

Bases: d3d.dataset.base.DatasetBase, d3d.dataset.base.MultiModalDatasetMixin

This class defines basic interface for object detection datasets

Parameters
  • base_path (Union[str, pathlib.Path]) – directory containing the zip files, or the required data

  • inzip (bool) – whether the dataset is stored in the original zip archives or unzipped

  • phase (str) – training, validation or testing

  • trainval_split (Union[float, List[int]]) – the ratio used to split the training dataset. See the documentation of split_trainval() for details.

  • trainval_random (Union[bool, int, str]) – whether to select the train/val split randomly. See the documentation of split_trainval() for details.

VALID_OBJ_CLASSES: enum.Enum

List of valid object labels

analyze_3dobject()[source]

Report statistics on 3D object labels

Returns

Statistics containing mean dimension

Return type

dict

annotation_3dobject(idx, raw=None)[source]

Return the list of converted ground truth targets in the lidar frame.

Parameters
Return type

Union[d3d.abstraction.Target3DArray, Any]
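
A hedged end-to-end sketch; SomeDetectionLoader is a placeholder for any concrete DetectionDatasetBase subclass and the path is illustrative:

# SomeDetectionLoader is a placeholder name, not an actual d3d class
loader = SomeDetectionLoader("/path/to/dataset", inzip=True,
                             phase="training", trainval_split=0.8)

cloud = loader.lidar_data(0)             # point cloud from the default lidar sensor
targets = loader.annotation_3dobject(0)  # Target3DArray of ground-truth boxes in the lidar frame
calib = loader.calibration_data(0)       # TransformSet with the sensor calibrations
uid = loader.identity(0)                 # tuple that can be passed back to the accessors above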

class d3d.dataset.base.MultiModalDatasetMixin[source]

Bases: object

This class defines basic interface for multi-modal datasets

VALID_CAM_NAMES: List[str]

List of valid camera sensor names

VALID_LIDAR_NAMES: List[str]

List of valid lidar sensor names

calibration_data(idx, raw=None)[source]

Return the calibration data

Parameters
Return type

Union[d3d.abstraction.TransformSet, Any]

camera_data(idx, names=None)[source]

Return the camera image data

Parameters
Return type

Union[PIL.Image.Image, List[PIL.Image.Image]]

lidar_data(idx, names=None, formatted=False)[source]

Return the lidar point cloud data

Parameters
  • names (Optional[Union[str, List[str]]]) – name of requested lidar sensors. The default sensor is the first element in VALID_LIDAR_NAMES.

  • idx (Union[int, tuple]) – index of requested lidar frames

  • formatted (bool) – if true, the point cloud will be returned wrapped in a numpy record array

Return type

Union[numpy.ndarray, List[numpy.ndarray]]
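
A hedged sketch of the naming and formatting options above (loader is a hypothetical instance of a class mixing in MultiModalDatasetMixin; the record array field names depend on the dataset):

cloud = loader.lidar_data(10)                                   # default sensor, a single ndarray
clouds = loader.lidar_data(10, names=loader.VALID_LIDAR_NAMES)  # list with one ndarray per sensor
rec = loader.lidar_data(10, formatted=True)                     # numpy record array, e.g. rec["x"]
image = loader.camera_data(10, names=loader.VALID_CAM_NAMES[0]) # a single PIL image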

class d3d.dataset.base.MultiModalSequenceDatasetMixin[source]

Bases: object

This class defines basic interface for multi-modal datasets of sequences

calibration_data(idx, raw=False)[source]

Return the calibration data. Note that the calibration is assumed to be fixed within a sequence, so this method always returns a single object.

Parameters
Return type

Union[d3d.abstraction.TransformSet, Any]

camera_data(idx, names=None)[source]

Return the camera image data

Parameters
  • names (Optional[Union[str, List[str]]]) – name of requested camera sensors. The default sensor is the first element in VALID_CAM_NAMES.

  • idx (Union[int, tuple]) – index of requested image frames, see description in lidar_data() method.

Return type

Union[PIL.Image.Image, List[PIL.Image.Image], List[List[PIL.Image.Image]]]

lidar_data(idx, names=None, formatted=False)[source]

If multiple frames are requested, the result will be a list of lists: the outer list corresponds to the requested frame names and the inner list to the time sequence, so len(names) × len(frames) data objects will be returned in total.

Parameters
  • names (Optional[Union[str, List[str]]]) – name of requested lidar sensors. The default frame is the first element in VALID_LIDAR_NAMES.

  • idx (Union[int, tuple]) – index of requested lidar frames

    • If a single integer index is given, the frame indexing is done on the whole dataset with the trainval split applied

    • If a tuple is given, it is considered to be the unique id of the frame (from the identity() method); in this case the trainval split is ignored and the nframes offset is not added

  • formatted (bool) – if true, the point cloud will be returned wrapped in a numpy record array

Return type

Union[numpy.ndarray, List[numpy.ndarray], List[List[numpy.ndarray]]]
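
A hedged sketch of the nesting described above, assuming a sequence loader constructed with nframes > 0 and two lidar sensors (the loader instance and sensor names are illustrative):

clouds = loader.lidar_data(5, names=["lidar_front", "lidar_rear"])
assert len(clouds) == 2          # outer list: one entry per requested sensor name
front_frames = clouds[0]         # inner list: the consecutive frames of that sensor
latest_front = front_frames[-1]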

class d3d.dataset.base.SegmentationDatasetMixin[source]

Bases: object

This class defines the basic interface for point cloud segmentation datasets

VALID_PTS_CLASSES: enum.Enum

List of valid point labels

annotation_3dpoints(idx, names=None, formatted=False)[source]

Return the point-wise labels in the lidar frame.

Parameters
  • idx (Union[int, tuple]) – index of requested frame

  • formatted (Optional[bool]) – if True, the point labels will be represented as a numpy record array; otherwise, the returned object will be a dictionary of numpy arrays (see the sketch after this parameter list).

  • names (Optional[Union[str, List[str]]]) – name of requested lidar sensors
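
A hedged sketch of the two return styles (loader is a hypothetical instance; the "semantic" key is illustrative and the actual keys depend on the dataset):

labels = loader.annotation_3dpoints(0)                  # dict of numpy arrays, e.g. labels["semantic"]
packed = loader.annotation_3dpoints(0, formatted=True)  # the same labels as a numpy record array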

class d3d.dataset.base.SequenceDatasetBase(base_path, inzip=False, phase='training', trainval_split=1.0, trainval_random=False, trainval_byseq=False, nframes=0)[source]

Bases: d3d.dataset.base.DatasetBase

This class defines basic interface for datasets of sequences

Parameters
  • base_path (Union[str, pathlib.Path]) – directory containing the zip files, or the required data

  • inzip (bool) – whether the dataset is stored in the original zip archives or unzipped

  • phase (str) – training, validation or testing

  • trainval_split (Union[float, List[int]]) – the ratio used to split the training dataset. See the documentation of split_trainval_seq() for details.

  • trainval_random (Union[bool, int, str]) – whether to select the train/val split randomly. See the documentation of split_trainval_seq() for details.

  • nframes (int) –

    number of consecutive frames returned from the accessors (see the sketch after this parameter list)

    • If it’s a positive number, then adjacent frames are returned and the total number of indexable frames is reduced accordingly

    • If it’s a negative number, its absolute value is consumed

    • If it’s zero, then the dataset acts like an object detection dataset, which means the methods will return unpacked data

  • trainval_byseq – whether to split the trainval partitions by sequences instead of frames
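
A hedged sketch of the nframes behavior (SomeSequenceLoader and the path are placeholders, not actual d3d names):

single = SomeSequenceLoader("/path/to/dataset", nframes=0)  # accessors return unpacked single-frame data
multi = SomeSequenceLoader("/path/to/dataset", nframes=2)   # accessors return lists of consecutive frames

print(type(single.timestamp(0)))  # e.g. int
print(type(multi.timestamp(0)))   # e.g. list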

_locate_frame(idx)[source]

Subclasses should implement this function to convert an overall index into (sequence_id, frame_idx) in order to support the decorators expand_idx() and expand_idx_name()

Returns

(seq_id, frame_idx), where frame_idx is the index of the starting frame

Parameters

idx (int) –

Return type

Tuple[Any, int]
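
A hedged sketch of what a subclass might do, assuming sequence_ids and sequence_sizes are already populated; the subtraction of nframes mirrors the "total number reduced" note above and is an assumption, not the library's actual implementation:

def _locate_frame(self, idx):
    # walk the sequences in order and find which one contains the flat index;
    # self.nframes is assumed to hold the constructor argument of the same name
    for seq_id in self.sequence_ids:
        usable = self.sequence_sizes[seq_id] - self.nframes  # frames usable as starting frames
        if idx < usable:
            return seq_id, idx
        idx -= usable
    raise IndexError("frame index out of range")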

identity(idx)[source]

Return something that can track the data back to the original dataset

Parameters

idx (int) – index of requested frame to be parsed

Returns

if nframes > 0, the function returns a list of ids that is consistent with the other accessors.

Return type

Union[tuple, List[tuple]]

intermediate_data(idx, names=None, ninter_frames=1)[source]

Return the intermediate data (and annotations) between keyframes. For keyframe data, please use the corresponding functions to load them.

Parameters
  • idx (Union[int, tuple]) – index of requested data frames

  • names (Optional[Union[str, List[str]]]) – name of requested sensors.

  • ninter_frames (int) – number of intermediate frames. If set to None, then all frames will be returned.

Return type

dict

property sequence_ids: List[Any]

Return the list of sequence ids

property sequence_sizes: Dict[Any, int]

Return the mapping from sequence id to sequence sizes

timestamp(idx, names=None)[source]

Return the timestamp of the frame specified by the index, represented as a Unix timestamp in microseconds (usually a 16-digit integer)

Parameters
  • idx (Union[int, tuple]) – index of requested frame

  • names (Optional[Union[str, List[str]]]) – specify the sensor whose timestamp is requested. This option only makes sense when the dataset contains separate timestamps for the data from each sensor.

Return type

Union[int, List[int]]

class d3d.dataset.base.TrackingDatasetBase(base_path, inzip=False, phase='training', trainval_split=1.0, trainval_random=False, trainval_byseq=False, nframes=0)[source]

Bases: d3d.dataset.base.SequenceDatasetBase, d3d.dataset.base.MultiModalSequenceDatasetMixin

A tracking dataset is defined similarly to a detection dataset. The two major differences are: 1. a tracking dataset uses (sequence_id, frame_id) as the identifier; 2. a tracking dataset provides unique object ids across time frames.

Parameters
  • base_path (Union[str, pathlib.Path]) – directory containing the zip files, or the required data

  • inzip (bool) – whether the dataset is stored in the original zip archives or unzipped

  • phase (str) – training, validation or testing

  • trainval_split (Union[float, List[int]]) – the ratio used to split the training dataset. See the documentation of split_trainval_seq() for details.

  • trainval_random (Union[bool, int, str]) – whether to select the train/val split randomly. See the documentation of split_trainval_seq() for details.

  • nframes (int) –

    number of consecutive frames returned from the accessors

    • If it’s a positive number, then adjacent frames are returned and the total number of indexable frames is reduced accordingly

    • If it’s a negative number, its absolute value is consumed

    • If it’s zero, then the dataset acts like an object detection dataset, which means the methods will return unpacked data

  • trainval_byseq – whether to split the trainval partitions by sequences instead of frames

annotation_3dobject(idx, raw=False)[source]

Return the list of converted ground truth targets in the lidar frame.

Parameters
Return type

Union[d3d.abstraction.Target3DArray, List[d3d.abstraction.Target3DArray]]

pose(idx, raw=False, names=None)[source]

Return the (relative) pose of the vehicle for the frame. The base frame should be attached to the ground, which means it follows an East-North-Up axis order.

Parameters
  • idx (Union[int, tuple]) – index of requested frame

  • names (Optional[Union[str, List[str]]]) – specify the sensor whose pose is requested. This option only makes sense when the dataset contains separate timestamps for the data from each sensor. In this case, the pose comes either from the dataset or from interpolation.

  • raw (Optional[bool]) – if false, the pose will be converted to the d3d.abstraction.EgoPose format; otherwise the raw data will be returned in its original format.

Return type

Union[d3d.abstraction.EgoPose, Any]

property pose_name: str

Return the name of the sensor frame in whose coordinates the pose is reported. This frame can be different from the default frame in the calibration TransformSet.

d3d.dataset.base.check_frames(names, valid)[source]

Check whether names is inside the valid options.

Parameters
Returns

unpack_result – whether the results need to be unpacked

names – the frame names converted to a list

d3d.dataset.base.expand_idx(func)[source]
This decorator wraps SequenceDatasetBase member functions that take an index input. It delegates the case where self.nframes > 0 to the original function, so that the original function only needs to support a single index.

A parameter bypass is added to the decorated function, which can be used to call the original underlying method without expansion.

d3d.dataset.base.expand_idx_name(valid_names)[source]

This decorator works similarly to expand_idx(), with added support for distributing both indices and frame names. Note that this function acts as a decorator factory instead of a decorator.

A parameter bypass is added to the decorated function, which can be used to call the original underlying method without expansion.

Parameters

valid_names (List[str]) – List of valid sensor names

Return type

Callable[[Callable], Callable]
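
A hedged sketch of how a loader method might be decorated; the exact calling convention of the expanded method (one (seq_id, frame_idx) pair and one sensor name per call) is an assumption:

from d3d.dataset.base import TrackingDatasetBase, expand_idx_name

class MyLoader(TrackingDatasetBase):  # hypothetical subclass for illustration
    VALID_LIDAR_NAMES = ["lidar_top"]
    VALID_CAM_NAMES = ["cam_front"]

    @expand_idx_name(VALID_LIDAR_NAMES)
    def lidar_data(self, idx, names=None, formatted=False):
        # the decorator is assumed to hand over one (seq_id, frame_idx) pair and one sensor name
        # at a time; list inputs, the nframes expansion and the bypass keyword are handled outside
        seq_id, frame_idx = idx
        ...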

d3d.dataset.base.expand_name(valid_names)[source]

This decorator works similarly to expand_idx(), with support for distributing frame names. Note that this function acts as a decorator factory instead of a decorator.

Parameters

valid_names (List[str]) – List of valid sensor names

Return type

Callable[[Callable], Callable]

d3d.dataset.base.split_trainval(phase, total_count, trainval_split, trainval_random)[source]

Split frames into the training or validation set

Parameters
  • phase (str) – training or validation

  • total_count (int) – total number of frames in trainval part of the dataset

  • trainval_split (Union[float, List[int]]) –

    the ratio used to split the training dataset.

    • If it’s a number, then it’s the ratio used to split the training dataset.

    • If it’s 1, then the validation set is empty; if it’s 0, then the training set is empty

    • If it’s a list of numbers, then it directly defines the indices to report (trainval_random is ignored)

  • trainval_random (Union[bool, int, str]) –

    whether to select the train/val split randomly.

    • If it’s a bool, then the trainval split is made with or without shuffling

    • If it’s a number, then it’s used as the seed for random shuffling

    • If it’s a string, then a predefined order is used ({r: reverse})

Return type

Iterable[int]
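
For example, assuming the same seed yields complementary partitions (the counts and ratio are arbitrary):

from d3d.dataset.base import split_trainval

train_idx = list(split_trainval("training", 100, trainval_split=0.8, trainval_random=42))
val_idx = list(split_trainval("validation", 100, trainval_split=0.8, trainval_random=42))
# roughly 80 training indices and 20 validation indices drawn from range(100)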

d3d.dataset.base.split_trainval_seq(phase, seq_counts, trainval_split, trainval_random, by_seq=False)[source]

Split frames for training or validation, either by frames or by sequences. TODO: consider nframes

Parameters
  • phase (str) – training or validation

  • seq_counts (SortedCountDict) – number of frames in each sequence of the trainval part of the dataset

  • trainval_split (Union[float, List[int]]) –

    the ratio used to split the training dataset.

    • If it’s a number, then it’s the ratio used to split the training dataset.

    • If it’s 1, then the validation set is empty; if it’s 0, then the training set is empty

    • If it’s a list of numbers and by_seq is True, then it directly defines the indices to report (trainval_random is ignored)

    • If it’s a list of sequence names and by_seq is False, then it directly defines the sequences to be chosen

  • trainval_random (Union[bool, int, str]) –

    whether to select the train/val split randomly.

    • If it’s a bool, then the trainval split is made with or without shuffling

    • If it’s a number, then it’s used as the seed for random shuffling

    • If it’s a string, then a predefined order is used ({r: reverse})

  • by_seq (bool) – whether to split the trainval partitions by sequences instead of frames

Return type

Iterable[int]

class d3d.dataset.zip.PatchedZipFile(file, mode='r', compression=0, allowZip64=True, to_extract=[])[source]

Bases: zipfile.ZipFile

This class is based on the built-in ZipFile class, which is further patched for better reading speed. The improvement is achieved by skipping the reading of metadata for files that are not of interest.

Parameters

to_extract (Union[List[str], str]) – specify the paths (inside the zip) of the files to be extracted

Open the ZIP file with mode read ‘r’, write ‘w’, exclusive create ‘x’, or append ‘a’.
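
A minimal usage sketch (the archive name and member path are illustrative):

from d3d.dataset.zip import PatchedZipFile

# only the metadata of the requested member is parsed, which speeds up random access in large archives
with PatchedZipFile("dataset_part1.zip", to_extract="seq00/lidar/000000.bin") as archive:
    raw_bytes = archive.read("seq00/lidar/000000.bin")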