TypeError: unhashable type: 'numpy.ndarray' when saving test results with Custom3D dataset
Checklist
- [X] I have searched for similar issues.
- [X] I have tested with the latest development wheel.
- [X] I have checked the release documentation and the latest documentation (for master branch).
Describe the issue
Hello,
First of all, thank you for the awesome library.
We are trying to train the RandLANet model with a custom dataset to perform 3D semantic segmentation. For that we are using the Custom3D dataset which, from our understanding, is the class to use if we need our own dataset.
We created a folder for our dataset with three folders inside, like expected by the Custom3D dataset class ("Expect point clouds to be in npy format with train, val and test files in separate folders.").
In each of these folders we saved our training, validation, and test files in the expected format (x, y, z, class, feat_1, feat_2, ..., feat_n).
We also had to override the get_label_to_names static method in the Custom3D class so that it returned our own label mappings, instead of the ones that were implemented.
We managed to train the model on the custom dataset using this class via the pipeline's run_train function, but when we tried to evaluate the model's performance with the pipeline's run_test function, it threw the error described below.
Could you help out with this issue?
Also, is this the correct approach to train models on custom datasets in Open3D?
Thanks for your help.
Steps to reproduce the bug
import pathlib
import numpy as np
import open3d as o3d
import open3d.ml as ml3d
import open3d.ml.tf
@staticmethod
def get_label_to_names():
    return {
        0: "unclassified",
        1: "bunny",
    }
# Create custom dataset
bunny_mesh = o3d.data.BunnyMesh()
mesh = o3d.io.read_triangle_mesh(bunny_mesh.path)
pcd = mesh.sample_points_uniformly(number_of_points=500)
points = np.asarray(pcd.points)
labels = np.ones(shape=500)
train_dataset = np.column_stack((points, labels))
val_dataset = train_dataset
test_dataset = points
pathlib.Path("custom/train").mkdir(parents=True, exist_ok=True)
np.save("custom/train/bunny.npy", train_dataset)
pathlib.Path("custom/val").mkdir(parents=True, exist_ok=True)
np.save("custom/val/bunny.npy", val_dataset)
pathlib.Path("custom/test").mkdir(parents=True, exist_ok=True)
np.save("custom/test/bunny.npy", test_dataset)
# Train model using custom dataset
framework = "tf"
config = ml3d.utils.Config.load_from_file("randlanet.yml")
Dataset = ml3d.utils.get_module("dataset", config.dataset.name)
Dataset.get_label_to_names = get_label_to_names
dataset = Dataset(**config.dataset)
Model = ml3d.utils.get_module("model", config.model.name, framework)
model = Model(**config.model)
Pipeline = ml3d.utils.get_module("pipeline", config.pipeline.name, framework)
pipeline = Pipeline(model, dataset, **config.pipeline)
pipeline.run_train()
pipeline.run_test()
### randlanet.yml ###
dataset:
  name: Custom3D
  dataset_path: custom
  train_dir: train
  val_dir: val
  test_dir: test
model:
  name: RandLANet
  batcher: DefaultBatcher
  ckpt_path: # path/to/your/checkpoint
  num_neighbors: 16
  num_layers: 4
  num_points: 65536
  num_classes: 1
  ignored_label_inds: [0]
  sub_sampling_ratio: [4, 4, 4, 4]
  in_channels: 3
  dim_features: 8
  dim_output: [16, 64, 128, 256]
  grid_size: 0.06
  augment:
    recenter:
      dim: [0, 1]
pipeline:
  name: SemanticSegmentation
  optimizer:
    lr: 0.001
  batch_size: 4
  main_log_dir: ./logs
  max_epoch: 1
  save_ckpt_freq: 5
  scheduler_gamma: 0.9886
  test_batch_size: 1
  train_sum_dir: ./train_log
  val_batch_size: 2
  summary:
    record_for: []
    max_pts:
    use_reference: false
    max_outputs: 1
Error message
File .venv/lib/python3.10/site-packages/open3d/_ml3d/datasets/customdataset.py:223, in Custom3D.save_test_result(self, results, attr)
    220     make_dir(path)
    222     pred = results['predict_labels']
--> 223     pred = np.array(self.label_to_names[pred])
    225     store_path = join(path, name + '.npy')
    226     np.save(store_path, pred)
TypeError: unhashable type: 'numpy.ndarray'
Expected behavior
We would expect the save_test_result function of the Custom3D dataset not to fail: it currently tries to index the label_to_names dict with an entire numpy array, rather than with individual integer labels.
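The root cause can be reproduced in isolation: a numpy array is unhashable, so it cannot be used as a dict key, which is exactly what self.label_to_names[pred] attempts. A minimal sketch (label names taken from the reproduction script above):

```python
import numpy as np

label_to_names = {0: "unclassified", 1: "bunny"}
pred = np.ones(5, dtype=np.int64)  # an array of predicted label indices

# Indexing a dict with a whole array raises the reported error,
# because ndarray does not implement __hash__.
try:
    label_to_names[pred]
except TypeError as e:
    print(e)  # unhashable type: 'numpy.ndarray'

# Indexing element by element works, since each element converts to int.
print([label_to_names[int(i)] for i in pred])  # ['bunny', 'bunny', ...]
```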
Open3D, Python and System information
- Operating system: macOS Ventura 13.4.1
- Python version: 3.10.12
- Open3D version: 0.17.0
- System type: x64
- Is this remote workstation?: no
- How did you install Open3D?: pip
Additional information
No response
As far as I can tell, this error has not been fixed yet. You can try the following modification to Custom3D.save_test_result as a workaround.
def save_test_result(self, results, attr):
    cfg = self.cfg
    name = attr['name']
    path = cfg.test_result_folder
    make_dir(path)
    pred = results['predict_labels']
    # Map each predicted label index to its name individually;
    # indexing label_to_names with the whole array is what raises TypeError.
    if isinstance(pred, np.ndarray):
        pred_names = np.vectorize(lambda x: self.label_to_names[int(x)])(pred)
    else:
        pred_names = self.label_to_names[int(pred)]
    store_path = join(path, name + '.npy')
    np.save(store_path, pred_names)
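The element-wise mapping at the heart of that patch can be exercised on its own; the label mapping below is a hypothetical one mirroring get_label_to_names from the report:

```python
import numpy as np

# Hypothetical mapping, mirroring get_label_to_names in the report.
label_to_names = {0: "unclassified", 1: "bunny"}

pred = np.array([1, 0, 1, 1])  # predicted label indices

# np.vectorize applies the Python-level dict lookup to each element,
# returning an array of label names instead of raising TypeError.
pred_names = np.vectorize(lambda x: label_to_names[int(x)])(pred)
print(pred_names)  # ['bunny' 'unclassified' 'bunny' 'bunny']
```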