d3d.dataset

This module contains loaders for various datasets.

class d3d.dataset.base.NumberPool(processes, offset=0, *args, **kargs)[source]

Bases: object

This class is a utility for multiprocessing with tqdm progress bars. Define the task as

def task(ntqdm, ...):
    ...
    for data in tqdm(..., position=ntqdm, leave=False):
        ...

Then the parallel progress bars will be displayed in place.

Parameters
  • processes (int) – Number of processes available in the pool. If processes < 1, then the functions will be executed in the current thread.

  • offset (int) – The offset added to the ntqdm value of all processes. This is useful when you want to display a progress bar in an outer loop.

apply_async(func, args=(), callback=None)[source]
wait_for_once(margin=0)[source]

Block the current thread and wait until a process becomes available.

Parameters

margin (int) – Defines when a process is considered available. The method will return when there are nprocess + margin processes in the pool.
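
For illustration, a minimal usage sketch (the convert task and its inputs are hypothetical; it is assumed here that apply_async injects the ntqdm slot as the first argument of the task):

from tqdm import tqdm
from d3d.dataset.base import NumberPool

def convert(ntqdm, chunk):
    # hypothetical worker; ntqdm is the progress bar slot assigned by the pool
    total = 0
    for item in tqdm(chunk, position=ntqdm, leave=False):
        total += item
    return total

if __name__ == "__main__":
    pool = NumberPool(processes=2)
    for chunk in ([1, 2, 3], [4, 5, 6], [7, 8, 9]):
        pool.wait_for_once()  # block until a worker slot becomes available
        pool.apply_async(convert, (chunk,), callback=print)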

class d3d.dataset.base.DatasetBase(base_path, inzip=False, phase='training', trainval_split=1.0, trainval_random=False)[source]

Bases: object

This class acts as the base for all dataset loaders

Parameters
  • base_path (Union[str, pathlib.Path]) – directory containing the zip files, or the required data

  • inzip (bool) – whether the dataset is stored in the original zip archives or unzipped

  • phase (str) – training, validation or testing

  • trainval_split (Union[float, List[int]]) – the ratio used to split the training dataset. See the documentation of split_trainval() for details.

  • trainval_random (Union[bool, int, str]) – whether to select the train/val split randomly. See the documentation of split_trainval() for details.

identity(idx)[source]
Return something that can track the data back to the original dataset. The result tuple can be passed to any accessor above to directly access the given data.

Parameters

idx (int) – index of requested frame to be parsed

Return type

tuple

return_path()[source]

Make the dataset return the raw paths to the data instead of parsing them. This method returns a context manager.

Return type

AbstractContextManager
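
A minimal usage sketch, assuming loader is a concrete dataset instance (hypothetical here) that also provides the multi-modal accessors:

# loader: any concrete DatasetBase subclass instance (hypothetical)
with loader.return_path():
    raw = loader.camera_data(0)  # a path to the image file instead of a parsed PIL image
print(raw)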

class d3d.dataset.base.DetectionDatasetBase(base_path, inzip=False, phase='training', trainval_split=1.0, trainval_random=False)[source]

Bases: d3d.dataset.base.DatasetBase, d3d.dataset.base.MultiModalDatasetMixin

This class defines basic interface for object detection datasets

Parameters
  • base_path (Union[str, pathlib.Path]) – directory containing the zip files, or the required data

  • inzip (bool) – whether the dataset is stored in the original zip archives or unzipped

  • phase (str) – training, validation or testing

  • trainval_split (Union[float, List[int]]) – the ratio used to split the training dataset. See the documentation of split_trainval() for details.

  • trainval_random (Union[bool, int, str]) – whether to select the train/val split randomly. See the documentation of split_trainval() for details.

VALID_OBJ_CLASSES: enum.Enum

List of valid object labels

analyze_3dobject()[source]

Report statistics on 3D object labels

Returns

Statistics containing mean dimension

Return type

dict

annotation_3dobject(idx, raw=None)[source]

Return the list of converted ground truth targets in the lidar frame.

Parameters
Return type

Union[d3d.abstraction.Target3DArray, Any]
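
A hedged end-to-end sketch; SomeDetectionLoader is a placeholder for any concrete DetectionDatasetBase subclass and the path is illustrative:

# SomeDetectionLoader is a placeholder name, not an actual d3d class
loader = SomeDetectionLoader("/path/to/dataset", inzip=True,
                             phase="training", trainval_split=0.8)

cloud = loader.lidar_data(0)             # point cloud from the default lidar sensor
targets = loader.annotation_3dobject(0)  # Target3DArray of ground-truth boxes in the lidar frame
calib = loader.calibration_data(0)       # TransformSet with the sensor calibrations
uid = loader.identity(0)                 # tuple that can be passed back to the accessors above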

class d3d.dataset.base.MultiModalDatasetMixin[source]

Bases: object

This class defines basic interface for multi-modal datasets

VALID_CAM_NAMES: List[str]

List of valid camera sensor names

VALID_LIDAR_NAMES: List[str]

List of valid lidar sensor names

calibration_data(idx, raw=None)[source]

Return the calibration data

Parameters
Return type

Union[d3d.abstraction.TransformSet, Any]

camera_data(idx, names=None)[source]

Return the camera image data

Parameters
Return type

Union[PIL.Image.Image, List[PIL.Image.Image]]

lidar_data(idx, names=None, formatted=False)[source]

Return the lidar point cloud data

Parameters
  • names (Optional[Union[str, List[str]]]) – name of requested lidar sensors. The default sensor is the first element in VALID_LIDAR_NAMES.

  • idx (Union[int, tuple]) – index of requested lidar frames

  • formatted (bool) – if true, the point cloud will be returned wrapped in a numpy record array

Return type

Union[numpy.ndarray, List[numpy.ndarray]]
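
A hedged sketch of the naming and formatting options above (loader is a hypothetical instance of a class mixing in MultiModalDatasetMixin; the record array field names depend on the dataset):

cloud = loader.lidar_data(10)                                   # default sensor, a single ndarray
clouds = loader.lidar_data(10, names=loader.VALID_LIDAR_NAMES)  # list with one ndarray per sensor
rec = loader.lidar_data(10, formatted=True)                     # numpy record array, e.g. rec["x"]
image = loader.camera_data(10, names=loader.VALID_CAM_NAMES[0]) # a single PIL image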

class d3d.dataset.base.MultiModalSequenceDatasetMixin[source]

Bases: object

This class defines basic interface for multi-modal datasets of sequences

calibration_data(idx, raw=False)[source]

Return the calibration data. Note that the calibration is assumed to be fixed within a sequence, so this method always returns a single object.

Parameters
Return type

Union[d3d.abstraction.TransformSet, Any]

camera_data(idx, names=None)[source]

Return the camera image data

Parameters
  • names (Optional[Union[str, List[str]]]) – name of requested camera sensors. The default sensor is the first element in VALID_CAM_NAMES.

  • idx (Union[int, tuple]) – index of requested image frames, see description in lidar_data() method.

Return type

Union[PIL.Image.Image, List[PIL.Image.Image], List[List[PIL.Image.Image]]]

lidar_data(idx, names=None, formatted=False)[source]

If multiple frames are requested, the result will be a list of lists: the outer list corresponds to the requested frame names and the inner list to the time sequence, so len(names) × len(frames) data objects will be returned in total.

Parameters
  • names (Optional[Union[str, List[str]]]) – name of requested lidar sensors. The default frame is the first element in VALID_LIDAR_NAMES.

  • idx (Union[int, tuple]) – index of requested lidar frames

    • If a single integer index is given, the frame indexing is done on the whole dataset with the trainval split applied

    • If a tuple is given, it is considered to be the unique id of the frame (from the identity() method); in this case the trainval split is ignored and the nframes offset is not added

  • formatted (bool) – if true, the point cloud will be returned wrapped in a numpy record array

Return type

Union[numpy.ndarray, List[numpy.ndarray], List[List[numpy.ndarray]]]
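
A hedged sketch of the nesting described above, assuming a sequence loader constructed with nframes > 0 and two lidar sensors (the loader instance and sensor names are illustrative):

clouds = loader.lidar_data(5, names=["lidar_front", "lidar_rear"])
assert len(clouds) == 2          # outer list: one entry per requested sensor name
front_frames = clouds[0]         # inner list: the consecutive frames of that sensor
latest_front = front_frames[-1]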

class d3d.dataset.base.SegmentationDatasetMixin[source]

Bases: object

This class defines the basic interface for point cloud segmentation datasets

VALID_PTS_CLASSES: enum.Enum

List of valid point labels

annotation_3dpoints(idx, names=None, formatted=False)[source]

Return the point-wise labels in the lidar frame.

Parameters
  • idx (Union[int, tuple]) – index of requested frame

  • formatted (Optional[bool]) – if True, the point labels will be represented as a numpy record array; otherwise, the returned object will be a dictionary of numpy arrays (see the sketch after this parameter list).

  • names (Optional[Union[str, List[str]]]) – name of requested lidar sensors
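
A hedged sketch of the two return styles (loader is a hypothetical instance; the "semantic" key is illustrative and the actual keys depend on the dataset):

labels = loader.annotation_3dpoints(0)                  # dict of numpy arrays, e.g. labels["semantic"]
packed = loader.annotation_3dpoints(0, formatted=True)  # the same labels as a numpy record array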

class d3d.dataset.base.SequenceDatasetBase(base_path, inzip=False, phase='training', trainval_split=1.0, trainval_random=False, trainval_byseq=False, nframes=0)[source]

Bases: d3d.dataset.base.DatasetBase

This class defines basic interface for datasets of sequences

Parameters
  • base_path (Union[str, pathlib.Path]) – directory containing the zip files, or the required data

  • inzip (bool) – whether the dataset is stored in the original zip archives or unzipped

  • phase (str) – training, validation or testing

  • trainval_split (Union[float, List[int]]) – the ratio used to split the training dataset. See the documentation of split_trainval_seq() for details.

  • trainval_random (Union[bool, int, str]) – whether to select the train/val split randomly. See the documentation of split_trainval_seq() for details.

  • nframes (int) –

    number of consecutive frames returned from the accessors (see the sketch after this parameter list)

    • If it’s a positive number, then adjacent frames are returned and the total number of indexable frames is reduced accordingly

    • If it’s a negative number, its absolute value is consumed

    • If it’s zero, then the dataset acts like an object detection dataset, which means the methods will return unpacked data

  • trainval_byseq – whether to split the trainval partitions by sequences instead of frames
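
A hedged sketch of the nframes behavior (SomeSequenceLoader and the path are placeholders, not actual d3d names):

single = SomeSequenceLoader("/path/to/dataset", nframes=0)  # accessors return unpacked single-frame data
multi = SomeSequenceLoader("/path/to/dataset", nframes=2)   # accessors return lists of consecutive frames

print(type(single.timestamp(0)))  # e.g. int
print(type(multi.timestamp(0)))   # e.g. list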

_locate_frame(idx)[source]

Subclasses should implement this function to convert an overall index into (sequence_id, frame_idx) in order to support the decorators expand_idx() and expand_idx_name()

Returns

(seq_id, frame_idx), where frame_idx is the index of the starting frame

Parameters

idx (int) –

Return type

Tuple[Any, int]
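
A hedged sketch of what a subclass might do, assuming sequence_ids and sequence_sizes are already populated; the subtraction of nframes mirrors the "total number reduced" note above and is an assumption, not the library's actual implementation:

def _locate_frame(self, idx):
    # walk the sequences in order and find which one contains the flat index;
    # self.nframes is assumed to hold the constructor argument of the same name
    for seq_id in self.sequence_ids:
        usable = self.sequence_sizes[seq_id] - self.nframes  # frames usable as starting frames
        if idx < usable:
            return seq_id, idx
        idx -= usable
    raise IndexError("frame index out of range")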

identity(idx)[source]

Return something that can track the data back to the original dataset

Parameters

idx (int) – index of requested frame to be parsed

Returns

if nframes > 0, the function returns a list of ids that is consistent with the other accessors.

Return type

Union[tuple, List[tuple]]

intermediate_data(idx, names=None, ninter_frames=1)[source]

Return the intermediate data (and annotations) between keyframes. For keyframe data, please use the corresponding functions to load them.

Parameters
  • idx (Union[int, tuple]) – index of requested data frames

  • names (Optional[Union[str, List[str]]]) – name of requested sensors.

  • ninter_frames (int) – number of intermediate frames. If set to None, then all frames will be returned.

Return type

dict

property sequence_ids: List[Any]

Return the list of sequence ids

property sequence_sizes: Dict[Any, int]

Return the mapping from sequence id to sequence sizes

timestamp(idx, names=None)[source]

Return the timestamp of the frame specified by the index, represented as a Unix timestamp in microseconds (usually a 16-digit integer)

Parameters
  • idx (Union[int, tuple]) – index of requested frame

  • names (Optional[Union[str, List[str]]]) – specify the sensor whose timestamp is requested. This option only makes sense when the dataset contains separate timestamps for the data from each sensor.

Return type

Union[int, List[int]]

class d3d.dataset.base.TrackingDatasetBase(base_path, inzip=False, phase='training', trainval_split=1.0, trainval_random=False, trainval_byseq=False, nframes=0)[source]

Bases: d3d.dataset.base.SequenceDatasetBase, d3d.dataset.base.MultiModalSequenceDatasetMixin

A tracking dataset is defined similarly to a detection dataset. The two major differences are: 1. a tracking dataset uses (sequence_id, frame_id) as the identifier; 2. a tracking dataset provides unique object ids across time frames.

Parameters
  • base_path (Union[str, pathlib.Path]) – directory containing the zip files, or the required data

  • inzip (bool) – whether the dataset is stored in the original zip archives or unzipped

  • phase (str) – training, validation or testing

  • trainval_split (Union[float, List[int]]) – the ratio used to split the training dataset. See the documentation of split_trainval_seq() for details.

  • trainval_random (Union[bool, int, str]) – whether to select the train/val split randomly. See the documentation of split_trainval_seq() for details.

  • nframes (int) –

    number of consecutive frames returned from the accessors

    • If it’s a positive number, then adjacent frames are returned and the total number of indexable frames is reduced accordingly

    • If it’s a negative number, its absolute value is consumed

    • If it’s zero, then the dataset acts like an object detection dataset, which means the methods will return unpacked data

  • trainval_byseq – whether to split the trainval partitions by sequences instead of frames

annotation_3dobject(idx, raw=False)[source]

Return the list of converted ground truth targets in the lidar frame.

Parameters
Return type

Union[d3d.abstraction.Target3DArray, List[d3d.abstraction.Target3DArray]]

pose(idx, raw=False, names=None)[source]

Return the (relative) pose of the vehicle for the frame. The base frame should be attached to the ground, which means it follows an East-North-Up axis order.

Parameters
  • idx (Union[int, tuple]) – index of requested frame

  • names (Optional[Union[str, List[str]]]) – specify the sensor whose pose is requested. This option only makes sense when the dataset contains separate timestamps for the data from each sensor. In this case, the pose comes either from the dataset or from interpolation.

  • raw (Optional[bool]) – if false, the pose will be converted to the d3d.abstraction.EgoPose format; otherwise the raw data will be returned in its original format.

Return type

Union[d3d.abstraction.EgoPose, Any]

property pose_name: str

Return the name of the sensor frame in whose coordinates the pose is reported. This frame can be different from the default frame in the calibration TransformSet.

d3d.dataset.base.check_frames(names, valid)[source]

Check whether names is inside the valid options.

Parameters
Returns

unpack_result – whether the results need to be unpacked

names – the frame names converted to a list

d3d.dataset.base.expand_idx(func)[source]
This decorator wraps SequenceDatasetBase member functions that take an index input. It delegates the case where self.nframes > 0 to the original function, so that the original function only needs to support a single index.

A parameter bypass is added to the decorated function, which can be used to call the original underlying method without expansion.

d3d.dataset.base.expand_idx_name(valid_names)[source]

This decorator works similarly to expand_idx(), with added support for distributing both indices and frame names. Note that this function acts as a decorator factory instead of a decorator.

A parameter bypass is added to the decorated function, which can be used to call the original underlying method without expansion.

Parameters

valid_names (List[str]) – List of valid sensor names

Return type

Callable[[Callable], Callable]
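
A hedged sketch of how a loader method might be decorated; the exact calling convention of the expanded method (one (seq_id, frame_idx) pair and one sensor name per call) is an assumption:

from d3d.dataset.base import TrackingDatasetBase, expand_idx_name

class MyLoader(TrackingDatasetBase):  # hypothetical subclass for illustration
    VALID_LIDAR_NAMES = ["lidar_top"]
    VALID_CAM_NAMES = ["cam_front"]

    @expand_idx_name(VALID_LIDAR_NAMES)
    def lidar_data(self, idx, names=None, formatted=False):
        # the decorator is assumed to hand over one (seq_id, frame_idx) pair and one sensor name
        # at a time; list inputs, the nframes expansion and the bypass keyword are handled outside
        seq_id, frame_idx = idx
        ...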

d3d.dataset.base.expand_name(valid_names)[source]

This decorator works similarly to expand_idx(), with support for distributing frame names. Note that this function acts as a decorator factory instead of a decorator.

Parameters

valid_names (List[str]) – List of valid sensor names

Return type

Callable[[Callable], Callable]

d3d.dataset.base.split_trainval(phase, total_count, trainval_split, trainval_random)[source]

Split frames into the training or validation set

Parameters
  • phase (str) – training or validation

  • total_count (int) – total number of frames in trainval part of the dataset

  • trainval_split (Union[float, List[int]]) –

    the ratio used to split the training dataset.

    • If it’s a number, then it’s the ratio used to split the training dataset.

    • If it’s 1, then the validation set is empty; if it’s 0, then the training set is empty

    • If it’s a list of numbers, then it directly defines the indices to report (trainval_random is ignored)

  • trainval_random (Union[bool, int, str]) –

    whether to select the train/val split randomly.

    • If it’s a bool, then the trainval split is made with or without shuffling

    • If it’s a number, then it’s used as the seed for random shuffling

    • If it’s a string, then a predefined order is used ({r: reverse})

Return type

Iterable[int]
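
For example, assuming the same seed yields complementary partitions (the counts and ratio are arbitrary):

from d3d.dataset.base import split_trainval

train_idx = list(split_trainval("training", 100, trainval_split=0.8, trainval_random=42))
val_idx = list(split_trainval("validation", 100, trainval_split=0.8, trainval_random=42))
# roughly 80 training indices and 20 validation indices drawn from range(100)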

d3d.dataset.base.split_trainval_seq(phase, seq_counts, trainval_split, trainval_random, by_seq=False)[source]

Split frames for training or validation, either by frames or by sequences. TODO: consider nframes

Parameters
  • phase (str) – training or validation

  • seq_counts (SortedCountDict) – number of frames in each sequence of the trainval part of the dataset

  • trainval_split (Union[float, List[int]]) –

    the ratio used to split the training dataset.

    • If it’s a number, then it’s the ratio used to split the training dataset.

    • If it’s 1, then the validation set is empty; if it’s 0, then the training set is empty

    • If it’s a list of numbers and by_seq is True, then it directly defines the indices to report (trainval_random is ignored)

    • If it’s a list of sequence names and by_seq is False, then it directly defines the sequences to be chosen

  • trainval_random (Union[bool, int, str]) –

    whether to select the train/val split randomly.

    • If it’s a bool, then the trainval split is made with or without shuffling

    • If it’s a number, then it’s used as the seed for random shuffling

    • If it’s a string, then a predefined order is used ({r: reverse})

  • by_seq (bool) – whether to split the trainval partitions by sequences instead of frames

Return type

Iterable[int]

class d3d.dataset.zip.PatchedZipFile(file, mode='r', compression=0, allowZip64=True, to_extract=[])[source]

Bases: zipfile.ZipFile

This class is based on the built-in ZipFile class, which is further patched for better reading speed. The improvement is achieved by skipping the reading of metadata for files that are not of interest.

Parameters

to_extract (Union[List[str], str]) – specify the paths (inside the zip) of the files to be extracted

Open the ZIP file with mode read ‘r’, write ‘w’, exclusive create ‘x’, or append ‘a’.
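
A minimal usage sketch (the archive name and member path are illustrative):

from d3d.dataset.zip import PatchedZipFile

# only the metadata of the requested member is parsed, which speeds up random access in large archives
with PatchedZipFile("dataset_part1.zip", to_extract="seq00/lidar/000000.bin") as archive:
    raw_bytes = archive.read("seq00/lidar/000000.bin")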