
"ValueError: Configuration passed does not come from the same configuration space." encountered when evaluating configurations in time-series forecasting.

Open RobbyW551 opened this issue 3 years ago • 4 comments

NOTE: ISSUES ARE NOT FOR CODE HELP - Ask for Help at https://stackoverflow.com

Your issue may already be reported! Also, please search on the issue tracker before creating one.

  • I'm submitting a ...
    • [x] bug report

Issue Description

  • When Issue Happens: Running the time-series forecasting example.
  • Steps To Reproduce
    1. Install autoPyTorch v0.2 with the time-series forecasting requirements.
    2. Copy https://github.com/automl/Auto-PyTorch/blob/master/autoPyTorch/configs/forecasting_init_cfgs.json to the configs module in the autoPyTorch site-packages directory and change feature_encoding:__choice__: "OneHotEncoder" to feature_encoding:__choice__: "NoEncoder".
    3. Change the default device from cpu to cuda in default_pipeline_options.json.
    4. Set autoPyTorch.api.BaseTask._multiprocessing_context to spawn (if not set, a CUDA runtime error is raised); see the sketch after this list.
    5. Comment out lines 112-113 of pipeline.base_pipeline.py (the two warnings clauses lead to a TypeError).
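
A minimal sketch of step 4 from user code (the attribute name _multiprocessing_context is taken from the step above; overriding it on the task instance rather than on BaseTask is an assumption):

  from autoPyTorch.api.time_series_forecasting import TimeSeriesForecastingTask

  api = TimeSeriesForecastingTask()
  # Have worker processes spawned instead of forked; this avoids the CUDA
  # runtime error mentioned in step 4.
  api._multiprocessing_context = 'spawn'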

Expected Behavior

Multiple initial configurations should be evaluated properly.

Current Behavior

Some or all of the initial configurations hit a ValueError raised when the evaluator creates a pipeline and performs the configuration check. Which configurations raise this error is not stable; for example, sometimes the MLP configuration is evaluated successfully and sometimes it crashes.

Possible Solution

Your Code

import os
import tempfile as tmp
import warnings
import copy

if __name__ == '__main__':
  os.environ['JOBLIB_TEMP_FOLDER'] = tmp.gettempdir()
  os.environ['OMP_NUM_THREADS'] = '1'
  os.environ['OPENBLAS_NUM_THREADS'] = '1'
  os.environ['MKL_NUM_THREADS'] = '1'
  
  warnings.simplefilter(action='ignore', category=UserWarning)
  warnings.simplefilter(action='ignore', category=FutureWarning)
  
  from sktime.datasets import load_longley
  targets, features = load_longley()
  
  forecasting_horizon = 3
  
  # The dataset passed to APT-TS can be a list of np.ndarray / pd.DataFrame where each element of the
  # list represents one series, or a single pd.DataFrame that records all series together with index
  # information indicating which series each timestep belongs to (stored as the DataFrame's index or
  # as a separate column).
  # Within each series, we take the last forecasting_horizon values as test targets and everything
  # before that as training targets. Normally the values to be forecasted follow the training set.
  y_train = [targets[: -forecasting_horizon]]
  y_test = [targets[-forecasting_horizon:]]
  
  # The same applies to features. For univariate models, X_train and X_test can be omitted (set to None).
  # Here X_test contains the 'known future features': features whose values are known in advance.
  # Features that are unknown in the future can be replaced with NaN or zeros (they will not be used by
  # our networks). If no feature is known beforehand, X_test can also be omitted.
  known_future_features = list(features.columns)
  X_test = [features[-forecasting_horizon:]]
  
  start_times = [targets.index.to_timestamp()[0]]
  freq = '1Y'
  
  from autoPyTorch.api.time_series_forecasting import TimeSeriesForecastingTask
  ############################################################################
  # Build and fit a forecaster
  # ==========================
  api = TimeSeriesForecastingTask()
  
  ############################################################################
  # Search for an ensemble of machine learning algorithms
  # =====================================================
  api.search(
      X_train=X_train,
      y_train=copy.deepcopy(y_train),
      X_test=X_test,
      optimize_metric='mean_MASE_forecasting',
      n_prediction_steps=forecasting_horizon,
      memory_limit=16 * 1024,   # currently, forecasting models use considerably more memory
      freq=freq,
      start_times=start_times,
      func_eval_time_limit_secs=50,
      total_walltime_limit=60,
      min_num_test_instances=1000,  # proxy validation set; only applied to tasks with more than 1000 series
      known_future_features=known_future_features,
  )
  
  
  from autoPyTorch.datasets.time_series_dataset import TimeSeriesSequence
  
  test_sets = []
  
  # We could construct test sets from scratch
  for feature, future_feature, target, start_time in zip(X_train, X_test, y_train, start_times):
      test_sets.append(
          TimeSeriesSequence(X=feature.values,
                             Y=target.values,
                             X_test=future_feature.values,
                             start_time=start_time,
                             is_test_set=True,
                             # additional information required to construct a new time series sequence
                             **api.dataset.sequences_builder_kwargs
                             )
      )
  # Alternatively, if we only want to forecast the value after the X_train, we could directly ask datamanager to
  # generate a test set:
  # test_sets2 = api.dataset.generate_test_seqs()
  
  pred = api.predict(test_sets)

Error Message

snapshot of configuration 2:

[INFO] [2022-08-03 23:10:23,436:Client-TAE] Starting to evaluate configuration 2
[DEBUG] [2022-08-03 23:10:23,440:Client-TAE] Search space updates for 3: <autoPyTorch.utils.hyperparameter_search_space_update.HyperparameterSearchSpaceUpdates object at 0x7f69ba641dc0>
[DEBUG] [2022-08-03 23:10:23,441:Client-pynisher] Restricting your function to 16384 mb memory.
[DEBUG] [2022-08-03 23:10:23,441:Client-pynisher] Restricting your function to 498 seconds wall time.
[DEBUG] [2022-08-03 23:10:23,441:Client-pynisher] Allowing a grace period of 0 seconds.
[DEBUG] [2022-08-03 23:10:23,441:Client-pynisher] Function called with argument: (), {'queue': <multiprocessing.queues.Queue object at 0x7f6a98d370a0>, 'config': Configuration(values={
  'data_loader:backcast': False,
  'data_loader:batch_size': 164,
  'data_loader:num_batches_per_epoch': 49,
  'data_loader:sample_strategy': 'SeqUniform',
  'data_loader:transform_time_features': False,
  'data_loader:window_size': 3,
  'feature_encoding:__choice__': 'NoEncoder',
  'loss:DistributionLoss:dist_cls': 'studentT',
  'loss:DistributionLoss:forecast_strategy': 'mean',
  'loss:__choice__': 'DistributionLoss',
  'lr_scheduler:__choice__': 'NoScheduler',
  'network_backbone:__choice__': 'seq_encoder',
  'network_backbone:seq_encoder:block_1:RNNDecoder:decoder_type': 'RNNDecoder',
  'network_backbone:seq_encoder:block_1:RNNEncoder:bidirectional': False,
  'network_backbone:seq_encoder:block_1:RNNEncoder:cell_type': 'lstm',
  'network_backbone:seq_encoder:block_1:RNNEncoder:decoder_type': 'RNNDecoder',
  'network_backbone:seq_encoder:block_1:RNNEncoder:hidden_size': 300,
  'network_backbone:seq_encoder:block_1:RNNEncoder:num_layers': 1,
  'network_backbone:seq_encoder:block_1:RNNEncoder:use_dropout': False,
  'network_backbone:seq_encoder:block_1:__choice__': 'RNNEncoder',
  'network_backbone:seq_encoder:decoder_auto_regressive': True,
  'network_backbone:seq_encoder:grn_dropout_rate': 0.7492525532132399,
  'network_backbone:seq_encoder:grn_use_dropout': True,
  'network_backbone:seq_encoder:num_blocks': 1,
  'network_backbone:seq_encoder:share_single_variable_networks': False,
  'network_backbone:seq_encoder:skip_connection': True,
  'network_backbone:seq_encoder:skip_connection_type': 'gate_add_norm',
  'network_backbone:seq_encoder:use_temporal_fusion': False,
  'network_backbone:seq_encoder:variable_selection': True,
  'network_backbone:seq_encoder:variable_selection_use_dropout': False,
  'network_embedding:__choice__': 'NoEmbedding',
  'network_init:SparseInit:bias_strategy': 'Zero',
  'network_init:__choice__': 'SparseInit',
  'optimizer:AdamOptimizer:beta1': 0.9871557825406481,
  'optimizer:AdamOptimizer:beta2': 0.9826622585337917,
  'optimizer:AdamOptimizer:lr': 3.850178418868606e-05,
  'optimizer:AdamOptimizer:weight_decay': 0.024026630796054407,
  'optimizer:__choice__': 'AdamOptimizer',
  'scaler:scaling_mode': 'mean_abs',
  'target_scaler:scaling_mode': 'none',
  'trainer:__choice__': 'ForecastingStandardTrainer',
})
, 'backend': <autoPyTorch.automl_common.common.utils.backend.Backend object at 0x7f69ba641670>, 'metric': mean_MASE_forecasting, 'seed': 1, 'num_run': 3, 'output_y_hat_optimization': True, 'include': {}, 'exclude': {}, 'disable_file_output': [], 'instance': '{"task_id": "5c07a09c-133e-11ed-92c3-e163018ae3d0"}', 'init_params': {'instance': '{"task_id": "5c07a09c-133e-11ed-92c3-e163018ae3d0"}'}, 'budget': 5.555555555555555, 'budget_type': 'epochs', 'pipeline_config': {'device': 'cuda', 'budget_type': 'epochs', 'epochs': 50, 'runtime': 3600, 'torch_num_threads': 1, 'early_stopping': 20, 'use_tensorboard_logger': False, 'metrics_during_training': True, 'optimize_metric': 'mean_MASE_forecasting'}, 'logger_port': 33427, 'all_supported_metrics': True, 'search_space_updates': <autoPyTorch.utils.hyperparameter_search_space_update.HyperparameterSearchSpaceUpdates object at 0x7f69ba641dc0>}
[DEBUG] [2022-08-03 23:10:26,067:Client-pynisher] Redirecting output of the function to files. Access them via the stdout and stderr attributes of the wrapped function.
[DEBUG] [2022-08-03 23:10:26,068:Client-pynisher] call function
[DEBUG] [2022-08-03 23:10:26,430:Client-TimeSeriesForecastingTrainEvaluator(1)] Fit dictionary in Abstract evaluator: dataset_properties: {'categories': [], 'categorical_columns': [], 'known_future_features': ('GNPDEFL', 'GNP', 'UNEMP', 'ARMED', 'POP'), 'is_small_preprocess': True, 'feature_shapes': {'GNPDEFL': 1, 'GNP': 1, 'UNEMP': 1, 'ARMED': 1, 'POP': 1}, 'numerical_features': [0, 1, 2, 3, 4], 'sequence_lengths_train': array([10]), 'input_shape': (10, 5), 'task_type': 'time_series_forecasting', 'categorical_features': [], 'numerical_columns': [0, 1, 2, 3, 4], 'static_features': (), 'time_feature_names': ('time_feature_Constant',), 'output_shape': [3, 1], 'n_prediction_steps': 3, 'freq': '1Y', 'feature_names': ('GNPDEFL', 'GNP', 'UNEMP', 'ARMED', 'POP'), 'output_type': 'continuous', 'issparse': False, 'sp': 1, 'time_feature_transform': [Constant()], 'uni_variant': False, 'static_features_shape': 0, 'future_feature_shapes': (3, 5), 'targets_have_missing_values': False, 'encoder_can_be_auto_regressive': True, 'features_have_missing_values': False}
additional_metrics: ['mean_MASE_forecasting', 'mean_MASE_forecasting', 'median_MASE_forecasting', 'mean_MAE_forecasting', 'median_MAE_forecasting', 'mean_MAPE_forecasting', 'median_MAPE_forecasting', 'mean_MSE_forecasting', 'median_MSE_forecasting']
X_train:    GNPDEFL       GNP   UNEMP   ARMED       POP
0     83.0  234289.0  2356.0  1590.0  107608.0
0     88.5  259426.0  2325.0  1456.0  108632.0
0     88.2  258054.0  3682.0  1616.0  109773.0
0     89.5  284599.0  3351.0  1650.0  110929.0
0     96.2  328975.0  2099.0  3099.0  112075.0
0     98.1  346999.0  1932.0  3594.0  113270.0
0     99.0  365385.0  1870.0  3547.0  115094.0
0    100.0  363112.0  3578.0  3350.0  116219.0
0    101.2  397469.0  2904.0  3048.0  117388.0
0    104.6  419180.0  2822.0  2857.0  118734.0
0    108.4  442769.0  2936.0  2798.0  120445.0
0    110.8  444546.0  4681.0  2637.0  121950.0
0    112.6  482704.0  3813.0  2552.0  123366.0
y_train:          0
0  60323.0
0  61122.0
0  60171.0
0  61187.0
0  63221.0
0  63639.0
0  64989.0
0  63761.0
0  66019.0
0  67857.0
0  68169.0
0  66513.0
0  68655.0
X_test: None
y_test: None
backend: <autoPyTorch.automl_common.common.utils.backend.Backend object at 0x7f5295a80d00>
logger_port: 33427
optimize_metric: mean_MASE_forecasting
device: cuda
budget_type: epochs
epochs: 5.555555555555555
torch_num_threads: 1
early_stopping: 20
use_tensorboard_logger: False
metrics_during_training: True
[DEBUG] [2022-08-03 23:10:26,430:Client-TimeSeriesForecastingTrainEvaluator(1)] Search space updates :<autoPyTorch.utils.hyperparameter_search_space_update.HyperparameterSearchSpaceUpdates object at 0x7f5295a98100>
[DEBUG] [2022-08-03 23:10:26,430:Client-TimeSeriesForecastingTrainEvaluator(1)] Search space updates :<autoPyTorch.utils.hyperparameter_search_space_update.HyperparameterSearchSpaceUpdates object at 0x7f5295a98100>
[INFO] [2022-08-03 23:10:26,431:Client-TimeSeriesForecastingTrainEvaluator(1)] Starting fit 0
[DEBUG] [2022-08-03 23:10:26,624:Client-pynisher] function returned properly: (None, 0)
[DEBUG] [2022-08-03 23:10:26,624:Client-pynisher] return value: (None, 0)
[DEBUG] [2022-08-03 23:10:27,065:Client-TAE] Finish function evaluation 3.
Status: StatusType.CRASHED, Cost: 2147483647.0, Runtime: 3.1836764812469482,
Additional information:
traceback: Traceback (most recent call last):
  File "/home/robby/miniconda3/envs/auto-pytorch/lib/python3.8/site-packages/autoPyTorch/evaluation/tae.py", line 61, in fit_predict_try_except_decorator
    ta(queue=queue, **kwargs)
  File "/home/robby/miniconda3/envs/auto-pytorch/lib/python3.8/site-packages/autoPyTorch/evaluation/time_series_forecasting_train_evaluator.py", line 558, in forecasting_eval_train_function
    evaluator.fit_predict_and_loss()
  File "/home/robby/miniconda3/envs/auto-pytorch/lib/python3.8/site-packages/autoPyTorch/evaluation/time_series_forecasting_train_evaluator.py", line 169, in fit_predict_and_loss
    pipeline = self._get_pipeline()
  File "/home/robby/miniconda3/envs/auto-pytorch/lib/python3.8/site-packages/autoPyTorch/evaluation/abstract_evaluator.py", line 685, in _get_pipeline
    pipeline = self.pipeline_class(config=self.configuration,
  File "/home/robby/miniconda3/envs/auto-pytorch/lib/python3.8/site-packages/autoPyTorch/pipeline/time_series_forecasting.py", line 91, in __init__
    BasePipeline.__init__(self,
  File "/home/robby/miniconda3/envs/auto-pytorch/lib/python3.8/site-packages/autoPyTorch/pipeline/base_pipeline.py", line 119, in __init__
    raise ValueError('Configuration passed does not come from the '
ValueError: Configuration passed does not come from the same configuration space. Differences are: --- 

+++ 

@@ -240,8 +240,8 @@

     (network_backbone:flat_encoder:NBEATSDecoder:width_i_2 | network_backbone:flat_encoder:NBEATSDecoder:num_stacks_i > 1 && network_backbone:flat_encoder:NBEATSDecoder:width_i_2 | network_backbone:flat_encoder:NBEATSDecoder:n_beats_type == 'I')
     (network_backbone:flat_encoder:NBEATSDecoder:width_i_3 | network_backbone:flat_encoder:NBEATSDecoder:num_stacks_i > 2 && network_backbone:flat_encoder:NBEATSDecoder:width_i_3 | network_backbone:flat_encoder:NBEATSDecoder:n_beats_type == 'I')
     (network_backbone:flat_encoder:NBEATSDecoder:width_i_4 | network_backbone:flat_encoder:NBEATSDecoder:num_stacks_i > 3 && network_backbone:flat_encoder:NBEATSDecoder:width_i_4 | network_backbone:flat_encoder:NBEATSDecoder:n_beats_type == 'I')
-    (network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:__choice__ == 'TCNEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:__choice__ == 'InceptionTimeEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:TransformerEncoder:decoder_type == 'MLPDecoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:RNNEncoder:decoder_type == 'MLPDecoder')
-    (network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:__choice__ == 'TCNEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:__choice__ == 'InceptionTimeEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:TransformerEncoder:decoder_type == 'MLPDecoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:RNNEncoder:decoder_type == 'MLPDecoder')
+    (network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:__choice__ == 'InceptionTimeEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:__choice__ == 'TCNEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:TransformerEncoder:decoder_type == 'MLPDecoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:RNNEncoder:decoder_type == 'MLPDecoder')
+    (network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:__choice__ == 'InceptionTimeEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:__choice__ == 'TCNEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:TransformerEncoder:decoder_type == 'MLPDecoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:RNNEncoder:decoder_type == 'MLPDecoder')
     (network_backbone:seq_encoder:block_1:RNNEncoder:dropout | network_backbone:seq_encoder:block_1:RNNEncoder:use_dropout == True && network_backbone:seq_encoder:block_1:RNNEncoder:dropout | network_backbone:seq_encoder:block_1:RNNEncoder:num_layers > 1)
     (network_backbone:seq_encoder:block_1:TransformerDecoder:dropout_positional_decoder | network_backbone:seq_encoder:block_1:TransformerDecoder:use_dropout == True && network_backbone:seq_encoder:block_1:TransformerDecoder:dropout_positional_decoder | network_backbone:seq_encoder:block_1:TransformerDecoder:use_positional_decoder == True)
     (network_backbone:seq_encoder:block_1:TransformerEncoder:dropout_positional_encoder | network_backbone:seq_encoder:block_1:TransformerEncoder:use_dropout == True && network_backbone:seq_encoder:block_1:TransformerEncoder:dropout_positional_encoder | network_backbone:seq_encoder:block_1:TransformerEncoder:use_positional_encoder == True)

error: ValueError("Configuration passed does not come from the same configuration space. Differences are: --- \n\n+++ \n\n@@ -240,8 +240,8 @@\n\n     (network_backbone:flat_encoder:NBEATSDecoder:width_i_2 | network_backbone:flat_encoder:NBEATSDecoder:num_stacks_i > 1 && network_backbone:flat_encoder:NBEATSDecoder:width_i_2 | network_backbone:flat_encoder:NBEATSDecoder:n_beats_type == 'I')\n     (network_backbone:flat_encoder:NBEATSDecoder:width_i_3 | network_backbone:flat_encoder:NBEATSDecoder:num_stacks_i > 2 && network_backbone:flat_encoder:NBEATSDecoder:width_i_3 | network_backbone:flat_encoder:NBEATSDecoder:n_beats_type == 'I')\n     (network_backbone:flat_encoder:NBEATSDecoder:width_i_4 | network_backbone:flat_encoder:NBEATSDecoder:num_stacks_i > 3 && network_backbone:flat_encoder:NBEATSDecoder:width_i_4 | network_backbone:flat_encoder:NBEATSDecoder:n_beats_type == 'I')\n-    (network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:__choice__ == 'TCNEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:__choice__ == 'InceptionTimeEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:TransformerEncoder:decoder_type == 'MLPDecoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:RNNEncoder:decoder_type == 'MLPDecoder')\n-    (network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:__choice__ == 'TCNEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:__choice__ == 'InceptionTimeEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:TransformerEncoder:decoder_type == 'MLPDecoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:RNNEncoder:decoder_type == 'MLPDecoder')\n+    (network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:__choice__ == 'InceptionTimeEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:__choice__ == 'TCNEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:TransformerEncoder:decoder_type == 'MLPDecoder' || network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive | network_backbone:seq_encoder:block_1:RNNEncoder:decoder_type == 'MLPDecoder')\n+    (network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:__choice__ == 'InceptionTimeEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:__choice__ == 'TCNEncoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:TransformerEncoder:decoder_type == 'MLPDecoder' || network_backbone:seq_encoder:block_1:MLPDecoder:num_layers | network_backbone:seq_encoder:block_1:RNNEncoder:decoder_type == 'MLPDecoder')\n     (network_backbone:seq_encoder:block_1:RNNEncoder:dropout | network_backbone:seq_encoder:block_1:RNNEncoder:use_dropout == True && network_backbone:seq_encoder:block_1:RNNEncoder:dropout | network_backbone:seq_encoder:block_1:RNNEncoder:num_layers > 1)\n     (network_backbone:seq_encoder:block_1:TransformerDecoder:dropout_positional_decoder | 
network_backbone:seq_encoder:block_1:TransformerDecoder:use_dropout == True && network_backbone:seq_encoder:block_1:TransformerDecoder:dropout_positional_decoder | network_backbone:seq_encoder:block_1:TransformerDecoder:use_positional_decoder == True)\n     (network_backbone:seq_encoder:block_1:TransformerEncoder:dropout_positional_encoder | network_backbone:seq_encoder:block_1:TransformerEncoder:use_dropout == True && network_backbone:seq_encoder:block_1:TransformerEncoder:dropout_positional_encoder | network_backbone:seq_encoder:block_1:TransformerEncoder:use_positional_encoder == True)")
configuration_origin: Random Search (sorted)
[DEBUG] [2022-08-03 23:10:27,067:smac.intensification.successive_halving.0._SuccessiveHalving] Incumbent (2147483647.0000) is at least as good as the challenger (2147483647.0000) on budget 5.5556.
[DEBUG] [2022-08-03 23:10:27,067:Client-EnsembleBuilder] iteration=1 @ elapsed_time=11.821754693984985 has history=[{'Timestamp': Timestamp('2022-08-03 23:10:20.352246'), 'train_mean_MASE_forecasting': 0.6190706888834635}]
[DEBUG] [2022-08-03 23:10:27,072:Client-EnsembleBuilder] Restricting your function to 16384 mb memory.
[DEBUG] [2022-08-03 23:10:27,072:Client-EnsembleBuilder] Restricting your function to 983 seconds wall time.
[DEBUG] [2022-08-03 23:10:27,072:Client-EnsembleBuilder] Allowing a grace period of 0 seconds.
[DEBUG] [2022-08-03 23:10:27,073:Client-EnsembleBuilder] Function called with argument: (988.1762018203735, 1, False), {}
[DEBUG] [2022-08-03 23:10:28,535:Client-EnsembleBuilder] call function
[DEBUG] [2022-08-03 23:10:28,535:Client-EnsembleBuilder] Starting iteration 1, time left: 988.176202
[DEBUG] [2022-08-03 23:10:28,535:Client-EnsembleBuilder] Read ensemble data set predictions
[DEBUG] [2022-08-03 23:10:28,536:Client-EnsembleBuilder] Done reading 0 new prediction files. Loaded 1 predictions in total.
[DEBUG] [2022-08-03 23:10:28,536:Client-EnsembleBuilder] Use 0.619071 as dummy loss
[DEBUG] [2022-08-03 23:10:28,536:Client-EnsembleBuilder] Library Pruning: using for ensemble only  1 (out of 1) models
[DEBUG] [2022-08-03 23:10:28,536:Client-EnsembleBuilder] No new model predictions selected -- skip ensemble building -- current performance: inf
[DEBUG] [2022-08-03 23:10:28,537:Client-EnsembleBuilder] function returned properly: (([], 50, None, None), 0)
[DEBUG] [2022-08-03 23:10:28,537:Client-EnsembleBuilder] return value: (([], 50, None, None), 0)
[INFO] [2022-08-03 23:10:28,812:Client-EnsembleBuilder] DummyFuture: ([], 50, None, None)/SingleThreadedClient() Started Ensemble builder job at 2022.08.03-23.10.28 for iteration 1.
[DEBUG] [2022-08-03 23:10:28,812:smac.stats.stats.Stats] Saving stats to /tmp/autoPyTorch_tmp_5c065547-133e-11ed-92c3-e163018ae3d0/smac3-output/run_1/stats.json
[DEBUG] [2022-08-03 23:10:28,813:smac.intensification.successive_halving.0._SuccessiveHalving] Generating new challenger from optimizer
[DEBUG] [2022-08-03 23:10:28,813:smac.optimizer.configuration_chooser.epm_chooser.EPMChooser] Search for next configuration
[DEBUG] [2022-08-03 23:10:28,813:smac.runhistory.runhistory2epm.RunHistory2EPM4LogCost] Transform runhistory into X,y format
[DEBUG] [2022-08-03 23:10:28,814:smac.runhistory.runhistory2epm.RunHistory2EPM4LogCost] Converted 2 observations
[DEBUG] [2022-08-03 23:10:28,816:smac.intensification.successive_halving.0._SuccessiveHalving] Time to select next challenger: 0.0033
[DEBUG] [2022-08-03 23:10:30,340:smac.optimizer.acquisition.maximizer.LocalSearch] Active hyperparameter 'network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive' not specified!
[DEBUG] [2022-08-03 23:10:30,432:smac.optimizer.acquisition.maximizer.LocalSearch] Active hyperparameter 'network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive' not specified!
[DEBUG] [2022-08-03 23:10:31,509:smac.optimizer.acquisition.maximizer.LocalSearch] Active hyperparameter 'network_backbone:seq_encoder:block_1:MLPDecoder:auto_regressive' not specified!
[DEBUG] [2022-08-03 23:10:31,841:smac.optimizer.acquisition.maximizer.LocalSearch] Local searches took [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] steps and looked at [1728, 1148, 1957, 1368, 1727, 1774, 1732, 2236, 1874, 1543, 1192, 1190] configurations. Computing the acquisition function in vectorized for took 0.008060 seconds on average.
[DEBUG] [2022-08-03 23:10:31,842:smac.optimizer.acquisition.maximizer.LocalAndSortedRandomSearch] First 5 acq func (origin) values of selected configurations: [[3.989422804014327e-06, 'Random Search (sorted)'], [3.989422804014327e-06, 'Random Search (sorted)'], [3.989422804014327e-06, 'Random Search (sorted)'], [3.989422804014327e-06, 'Random Search (sorted)'], [3.989422804014327e-06, 'Random Search (sorted)']]
[DEBUG] [2022-08-03 23:10:31,844:smac.intensification.successive_halving.0._SuccessiveHalving]  Running challenger  -  Configuration(values={
  'data_loader:backcast': False,
  'data_loader:batch_size': 37,
  'data_loader:num_batches_per_epoch': 72,
  'data_loader:sample_strategy': 'LengthUniform',
  'data_loader:transform_time_features': False,
  'data_loader:window_size': 2,
  'feature_encoding:__choice__': 'NoEncoder',
  'loss:QuantileLoss:lower_quantile': 0.11201002285042878,
  'loss:QuantileLoss:upper_quantile': 0.6622526858014545,
  'loss:__choice__': 'QuantileLoss',
  'lr_scheduler:CyclicLR:base_lr': 0.04869660265666608,
  'lr_scheduler:CyclicLR:max_lr': 0.0814056329224693,
  'lr_scheduler:CyclicLR:mode': 'exp_range',
  'lr_scheduler:CyclicLR:step_size_up': 2572,
  'lr_scheduler:__choice__': 'CyclicLR',
  'network_backbone:__choice__': 'flat_encoder',
  'network_backbone:flat_encoder:MLPDecoder:activation': 'tanh',
  'network_backbone:flat_encoder:MLPDecoder:has_local_layer': False,
  'network_backbone:flat_encoder:MLPDecoder:num_layers': 3,
  'network_backbone:flat_encoder:MLPDecoder:units_layer_1': 199,
  'network_backbone:flat_encoder:MLPDecoder:units_layer_2': 22,
  'network_backbone:flat_encoder:MLPDecoder:units_layer_3': 35,
  'network_backbone:flat_encoder:MLPEncoder:activation': 'relu',
  'network_backbone:flat_encoder:MLPEncoder:dropout_1': 0.011953412343664472,
  'network_backbone:flat_encoder:MLPEncoder:dropout_2': 0.18438733874587002,
  'network_backbone:flat_encoder:MLPEncoder:dropout_3': 0.7459000202220012,
  'network_backbone:flat_encoder:MLPEncoder:dropout_4': 0.08573347786995066,
  'network_backbone:flat_encoder:MLPEncoder:normalization': 'NoNorm',
  'network_backbone:flat_encoder:MLPEncoder:num_groups': 4,
  'network_backbone:flat_encoder:MLPEncoder:num_units_1': 123,
  'network_backbone:flat_encoder:MLPEncoder:num_units_2': 129,
  'network_backbone:flat_encoder:MLPEncoder:num_units_3': 86,
  'network_backbone:flat_encoder:MLPEncoder:num_units_4': 42,
  'network_backbone:flat_encoder:MLPEncoder:use_dropout': True,
  'network_backbone:flat_encoder:__choice__': 'MLPEncoder',
  'network_embedding:__choice__': 'NoEmbedding',
  'network_init:XavierInit:bias_strategy': 'Normal',
  'network_init:__choice__': 'XavierInit',
  'optimizer:AdamOptimizer:beta1': 0.9607093001993826,
  'optimizer:AdamOptimizer:beta2': 0.9128641267778828,
  'optimizer:AdamOptimizer:lr': 0.08763240845811145,
  'optimizer:AdamOptimizer:weight_decay': 0.009756264281097361,
  'optimizer:__choice__': 'AdamOptimizer',
  'scaler:scaling_mode': 'none',
  'target_scaler:scaling_mode': 'min_max',
  'trainer:__choice__': 'ForecastingStandardTrainer',
})

[DEBUG] [2022-08-03 23:10:31,844:smac.intensification.successive_halving.0._SuccessiveHalving] Cutoff for challenger: 498.0

Your Local environment

  • Operating System, version: Ubuntu 20.04
  • Python, version: 3.8
  • Outputs of pip freeze or conda list: pytorch=1.12+cu116

Make sure to add all the information needed to understand the bug so that someone can help. If the info is missing, we'll add the 'Needs more information' label and close the issue until there is enough information.

RobbyW551 · Aug 04 '22 02:08

Hi,

thanks for the bug report. I think the issue might be caused by the spawn multiprocessing setting. Could you please

  1. set autoPyTorch.api.BaseTask._multiprocessing_context back to fork and then
  2. set memory_limit=None under api.search

to see if the CUDA error (and this error) still occurs?
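
A minimal sketch of these two changes (assuming the private attribute can be reset on the task instance before calling search; the remaining arguments mirror the original script):

  api = TimeSeriesForecastingTask()
  # 1. restore the default 'fork'-based multiprocessing context
  api._multiprocessing_context = 'fork'
  # 2. drop the per-run memory limit
  api.search(
      X_train=X_train,
      y_train=copy.deepcopy(y_train),
      X_test=X_test,
      optimize_metric='mean_MASE_forecasting',
      n_prediction_steps=forecasting_horizon,
      memory_limit=None,            # instead of 16 * 1024
      freq=freq,
      start_times=start_times,
      func_eval_time_limit_secs=50,
      total_walltime_limit=60,
      min_num_test_instances=1000,
      known_future_features=known_future_features,
  )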

Best, Difan

dengdifan · Aug 04 '22 15:08

Hi! Many thanks for replying. I've set the multiprocessing context back to fork, but a CUDA runtime error is still encountered when evaluating the initial configurations, regardless of setting memory_limit=None under api.search. When I use spawn, this error disappears (which is the main reason I modified the context manually), but the ValueError above emerges instead. I've uploaded the whole log file in the Error Message section below; maybe it can help find the cause.

Error Message: a complete log after making the above two changes: AutoPyTorch_1b7042cb-1453-11ed-9574-e5ec019f8c19_1.log

Environment: Ubuntu 20.04, Python 3.8, CUDA 11.6

RobbyW551 · Aug 05 '22 00:08

Hmmm...Then I have no clue about the CUDA initialization error. I have never encountered it before...

Let's come back to the spawn setting. As a workaround, you could try to add the following code after line 110 of pipeline.base_pipeline (https://github.com/automl/Auto-PyTorch/blob/master/autoPyTorch/pipeline/base_pipeline.py#L110):

            elif isinstance(config, Configuration):
                config = Configuration(configuration_space=self.config_space, values=config.get_dictionary())

This should partially solve the issue of unmatched configuration space.
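
For illustration only, a self-contained sketch of the same rebuild trick with plain ConfigSpace (the hyperparameter name num_layers and the two spaces are made up for the example; this is not Auto-PyTorch code):

  from ConfigSpace import Configuration, ConfigurationSpace
  from ConfigSpace.hyperparameters import UniformIntegerHyperparameter

  def make_space() -> ConfigurationSpace:
      cs = ConfigurationSpace()
      cs.add_hyperparameter(UniformIntegerHyperparameter('num_layers', 1, 4))
      return cs

  space_a = make_space()
  space_b = make_space()  # same content, but a different Python object

  config_a = space_a.sample_configuration()
  # Recreate the configuration against space_b, mirroring the patch above
  config_b = Configuration(configuration_space=space_b, values=config_a.get_dictionary())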

dengdifan · Aug 05 '22 10:08

Thanks for your reply. I will follow your suggestion and do some further investigation on the CUDA error.

RobbyW551 · Aug 05 '22 13:08

I am closing this issue due to inactivity. Feel free to reopen if the issue persists.

ravinkohli · Aug 23 '22 17:08