edelweiss package

Submodules

edelweiss.classifier module

class edelweiss.classifier.Classifier(scaler='standard', clf='XGB', calibrate=True, cv=0, cv_scoring='f1', params=None, **clf_kwargs)[source]

Bases: object

The detection classifier class that wraps a sklearn classifier.

Parameters:
  • scaler – the scaler to use for the classifier, options: standard, minmax, maxabs, robust, quantile

  • clf – the classifier to use, options are: XGB, MLP, RandomForest, NeuralNetwork, LogisticRegression, LinearSVC, DecisionTree, AdaBoost, GaussianNB, QDA, KNN

  • calibrate – whether to calibrate the probabilities

  • cv – number of cross validation folds, if 0 no cross validation is performed

  • cv_scoring – the scoring method to use for cross validation

  • params – the names of the parameters

  • clf_kwargs – additional keyword arguments for the classifier

fit(X, y, **args)

Train the classifier.

Parameters:
  • X – the features to train on (array or recarray)

  • y – the labels to train on

  • args – additional arguments for the classifier

predict(X, prob_multiplier=1.0)[source]

Predict the labels for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted labels

predict_non_proba(X)[source]

Predict the labels non-probabilistically for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted labels

predict_proba(X)[source]

Predict the probabilities for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted probabilities

save(path, subfolder=None)[source]

Save the classifier to a given path.

Parameters:
  • path – path to the folder where the emulator is saved

  • subfolder – subfolder of the emulator folder where the classifier is stored

test(X_test, y_test, non_proba=False)[source]

Tests the classifier on the test data.

Parameters:
  • X_test – test data

  • y_test – test labels

  • non_proba – whether to use non-probabilistic predictions

train(X, y, **args)[source]

Train the classifier.

Parameters:
  • X – the features to train on (array or recarray)

  • y – the labels to train on

  • args – additional arguments for the classifier

class edelweiss.classifier.MultiClassClassifier(scaler='standard', clf='XGB', calibrate=True, cv=0, cv_scoring='f1', params=None, **clf_kwargs)[source]

Bases: Classifier

The detection classifier class that wraps a sklearn classifier for multiple classes.

Parameters:
  • scaler – the scaler to use for the classifier, options: standard, minmax, maxabs, robust, quantile

  • clf – the classifier to use, options are: XGB, MLP, RandomForest, NeuralNetwork, LogisticRegression, LinearSVC, DecisionTree, AdaBoost, GaussianNB, QDA, KNN

  • calibrate – whether to calibrate the probabilities

  • cv – number of cross validation folds, if 0 no cross validation is performed

  • cv_scoring – the scoring method to use for cross validation

  • params – the names of the parameters

  • clf_kwargs – additional keyword arguments for the classifier

predict(X)[source]

Predict the labels for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted labels

predict_non_proba(X)[source]

Predict the class non-probabilistically for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted classes

predict_proba(X)[source]

Predict the probabilities for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted probabilities

test(X_test, y_test, non_proba=False)[source]

Tests the classifier on the test data.

Parameters:
  • X_test – test data

  • y_test – test labels

  • non_proba – whether to use non-probabilistic predictions

class edelweiss.classifier.MultiClassifier(split_label='galaxy_type', labels=None, scaler='standard', clf='XGB', calibrate=True, cv=0, cv_scoring='f1', params=None, **clf_kwargs)[source]

Bases: object

A classifier class that trains multiple classifiers for a specific label. This label could e.g. be the galaxy type (star, red galaxy, blue galaxy).

Parameters:
  • split_label – the label used to split the data among the different classifiers

  • labels – the possible values of the split label

  • scaler – the scaler to use for the classifier

  • clf – the classifier to use

  • calibrate – whether to calibrate the probabilities

  • cv – number of cross validation folds, if 0 no cross validation is performed

  • cv_scoring – the scoring method to use for cross validation

  • params – the names of the parameters

  • clf_kwargs – additional keyword arguments for the classifier

fit(X, y)

Train the classifier.

predict(X)[source]

Predict the labels for a given set of features.

predict_non_proba(X)[source]

Predict the labels non-probabilistically for a given set of features.

predict_proba(X)[source]

Predict the probabilities for a given set of features.

save(path, subfolder=None)[source]

Save the classifier to a given path.

Parameters:
  • path – path to the folder where the emulator is saved

  • subfolder – subfolder of the emulator folder where the classifier is stored

test(X_test, y_test, non_proba=False)[source]

Tests the classifier on the test data.

Parameters:
  • X_test – test data

  • y_test – test labels

  • non_proba – whether to use non-probabilistic predictions

train(X, y)[source]

Train the classifier.

edelweiss.classifier.load_classifier(path, subfolder=None)[source]

Load a classifier from a given path.

Parameters:
  • path – path to the folder containing the emulator

  • subfolder – subfolder of the emulator folder where the classifier is stored

Returns:

the loaded classifier

edelweiss.classifier.load_multiclassifier(path, subfolder=None)[source]

Load a multiclassifier from a given path.

Parameters:
  • path – path to the folder containing the emulator

  • subfolder – subfolder of the emulator folder where the classifier is stored

Returns:

the loaded classifier

edelweiss.clf_diagnostics module

edelweiss.clf_diagnostics.add_range_to_name(field_names, ranges)[source]

Add the range to the name of the variable such that the range is visible in the spider plot.

Parameters:
  • field_names – list with the names of the variables

  • ranges – dictionary with the ranges for each variable

edelweiss.clf_diagnostics.get_all_scores(test_arr, y_test, y_pred, y_prob)[source]

Calculates all the scores and appends them to the test_arr dict.

Parameters:
  • test_arr – dict where the test scores will be saved

  • y_test – test labels

  • y_pred – predicted labels

  • y_prob – probability of being detected

edelweiss.clf_diagnostics.get_all_scores_multiclass(test_arr, y_test, y_pred, y_prob)[source]

Calculates all the scores for a multiclass classifier and appends them to the test_arr dict.

Parameters:
  • test_arr – dict where the test scores will be saved

  • y_test – test labels

  • y_pred – predicted labels

  • y_prob – probability of being detected

edelweiss.clf_diagnostics.get_confusion_matrix(y_true, y_pred)[source]

Get the confusion matrix for the classifier.

Parameters:
  • y_true – true labels

  • y_pred – predicted labels

Returns:

True Positives, True Negatives, False Positives, False Negatives
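The four return values can be illustrated with a small standalone sketch. It is not the edelweiss implementation; the function name confusion_counts is hypothetical, and it assumes the labels are booleans (or 0/1) where True means "detected".

```python
def confusion_counts(y_true, y_pred):
    """Count (TP, TN, FP, FN) for two equally long sequences of boolean labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1  # detected and predicted as detected
        elif not truth and not pred:
            tn += 1  # not detected and predicted as not detected
        elif pred:
            fp += 1  # predicted as detected, but not detected
        else:
            fn += 1  # detected, but predicted as not detected
    return tp, tn, fp, fn


y_true = [True, True, False, False, True]
y_pred = [True, False, False, True, True]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)
```

The order of the returned tuple follows the order given in the Returns entry above.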

edelweiss.clf_diagnostics.get_default_ranges_for_spider()[source]

Get the default ranges for the spider plot.

Returns:

dictionary with the ranges for each variable

edelweiss.clf_diagnostics.get_name(clf, final=False)[source]

Get the name to add to the classifier

Parameters:
  • clf – classifier object (from sklearn) or name of the classifier

  • final – if True, the classifier was tested on the test data.

Returns:

name

edelweiss.clf_diagnostics.plot_all_scores(scores, path_labels=None)[source]

Plot all scores for the classifiers. Input can either be directly a recarray with the scores or the path to the scores or a list of paths to the scores. If a list is given, the scores of the different paths are combined and plotted with different colors.

Parameters:
  • scores – recarray with the scores or path to the scores or list of paths

  • path_labels – list of labels for the different paths

edelweiss.clf_diagnostics.plot_calibration_curve(y_true, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]

Plot the calibration curve for the classifier.

Parameters:
  • y_true – true labels

  • y_prob – predicted probabilities

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

  • fig – figure object, if None, create a new figure

edelweiss.clf_diagnostics.plot_classifier_comparison(clfs, conf, path, spider_ranges=None, labels=None, print_scores=False, special_param='mag_i')[source]

Plot the diagnostics for chosen classifiers. If the classifiers are not all from same path, the conf and path parameters should be lists of the same length as clfs.

Parameters:
  • clfs – list of classifier names

  • conf – configuration dictionary or list of dictionaries

  • path – path to the data or list of paths

  • spider_ranges – dictionary with the ranges for the spider plot

  • labels – list of labels for the different paths

  • print_scores – if True, print the scores for the different classifiers

  • special_param – param to plot the histogram for

edelweiss.clf_diagnostics.plot_diagnostics(clf, X_test, y_test, output_directory='.', final=False, save_plot=False, special_param='mag_i')[source]

Plot the diagnostics for the classifier.

Parameters:
  • clf – classifier object

  • X_test – test data

  • y_test – true labels

  • output_directory – directory to save the plots

  • final – if True, the classifier was tested on the test data.

  • save_plot – if True, save the plots

  • special_param – param to plot the histogram for

edelweiss.clf_diagnostics.plot_feature_importances(clf, clf_name='classifier', summed=False)[source]

Plots the feature importances for the classifier.

Parameters:
  • clf – classifier object

  • clf_name – name of the classifier

  • summed – if True, the summed feature importances are plotted

edelweiss.clf_diagnostics.plot_hist_fp_fn_tp_tn(param, y_true, y_pred, output_directory='.', clf='classifier', final=False, save_plot=False)[source]

Plot the stacked histogram of one parameter (e.g. i-band magnitude) for the different confusion matrix elements.

Parameters:
  • param – parameter to plot

  • y_true – true labels

  • y_pred – predicted labels

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

edelweiss.clf_diagnostics.plot_hist_n_gal(param, y_true, y_pred, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]

Plot the histogram of the galaxies detected by the classifier and of the truly detected galaxies for one parameter (e.g. i-band magnitude).

Parameters:
  • param – parameter to plot

  • y_true – true labels

  • y_pred – predicted labels

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

  • fig – figure object, if None, create a new figure

edelweiss.clf_diagnostics.plot_pr_curve(y_true, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]

Plot the precision-recall curve for the classifier.

Parameters:
  • y_true – true labels

  • y_prob – predicted probabilities

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

  • fig – figure object, if None, create a new figure

Returns:

figure object

edelweiss.clf_diagnostics.plot_roc_curve(y_true, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]

Plot the ROC curve for the classifier.

Parameters:
  • y_true – true labels

  • y_prob – predicted probabilities

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

  • fig – figure object, if None, create a new figure

edelweiss.clf_diagnostics.plot_spider_scores(y_true, y_pred, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None, ranges=None, print_scores=False)[source]

Plot the spider scores for the classifier.

Parameters:
  • y_true – true labels

  • y_pred – predicted labels

  • y_prob – predicted probabilities

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

  • fig – figure object, if None, create a new figure

  • ranges – dictionary of ranges for each score

  • print_scores – if True, print the scores

Returns:

figure object

edelweiss.clf_diagnostics.scale_data_for_spider(data, ranges=None)[source]

Scale the data for the spider plot such that the chosen range corresponds to the 0-1 range of the spider plot.

If the lower value of the range is higher than the upper value, the data is inverted.

Parameters:
  • data – data to scale

  • ranges – dictionary with the ranges for each variable; if a parameter is not in the dictionary, the default range is (0, 1)

Returns:

scaled data
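The scaling rule can be sketched as a linear map from the chosen range onto 0-1. This is a standalone illustration, not the edelweiss code: the function name scale_for_spider, the dict-based data layout, and the score names in the example are all assumptions.

```python
def scale_for_spider(data, ranges=None, default_range=(0.0, 1.0)):
    """Map each value linearly so that its (low, high) range becomes (0, 1).

    A range given as (high, low) inverts the axis, because the same
    formula then maps the larger raw value to 0 and the smaller to 1.
    """
    ranges = ranges or {}
    scaled = {}
    for name, value in data.items():
        low, high = ranges.get(name, default_range)
        if low == high:
            raise ValueError(f"range for {name!r} has zero width")
        scaled[name] = (value - low) / (high - low)
    return scaled


# Hypothetical scores: for the second one, lower is better, so its range is inverted.
scores = {"accuracy": 0.9, "brier_score": 0.05}
ranges = {"accuracy": (0.5, 1.0), "brier_score": (0.25, 0.0)}
scaled = scale_for_spider(scores, ranges)
print(scaled)
```

With an inverted range, a score where smaller is better ends up near the outer edge of the plot when it is good, which keeps all axes comparable.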

edelweiss.clf_diagnostics.setup_test(multi_class=False)[source]

Returns a dict where the test scores will be saved.

edelweiss.clf_utils module

edelweiss.clf_utils.custom_roc_auc_score(y_true, y_prob)[source]

Scorer for the ROC AUC score using y_prob

Parameters:
  • y_true – true labels (detected or not)

  • y_prob – predicted probabilities (2D array)

Returns:

score
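As a rough illustration of scoring from a 2D probability array, the sketch below computes the ROC AUC as the fraction of (positive, negative) pairs that are ranked correctly. It is not the edelweiss implementation; the function name is hypothetical, and it assumes that the second column of y_prob holds the probability of the positive (detected) class.

```python
def roc_auc_from_proba(y_true, y_prob):
    """ROC AUC computed by pair counting on the positive-class column of y_prob."""
    # Assumption: column 1 is the probability of the positive class.
    positive = [row[1] for row, label in zip(y_prob, y_true) if label]
    negative = [row[1] for row, label in zip(y_prob, y_true) if not label]
    if not positive or not negative:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = 0.0
    for pos in positive:
        for neg in negative:
            if pos > neg:
                wins += 1.0  # pair ranked correctly
            elif pos == neg:
                wins += 0.5  # ties count half
    return wins / (len(positive) * len(negative))


y_true = [1, 0, 1, 0]
y_prob = [[0.2, 0.8], [0.6, 0.4], [0.7, 0.3], [0.9, 0.1]]
auc = roc_auc_from_proba(y_true, y_prob)
print(auc)
```

The pair-counting form is quadratic in the sample size and is meant for explanation only; for real data a library routine is the better choice.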

edelweiss.clf_utils.get_classifier(classifier, scaler=None, **kwargs)[source]

Returns the classifier object

Parameters:
  • classifier – name of the classifier

  • scaler – scaler object

  • kwargs – additional arguments for the classifier

Returns:

classifier object (sklearn pipeline)

Raises:

ValueError if classifier is not known

edelweiss.clf_utils.get_classifier_args(clf, conf)[source]

Returns the arguments for the classifier defined in the config file

Parameters:
  • clf – classifier name

  • conf – config file

Returns:

arguments for the classifier

edelweiss.clf_utils.get_clf_name(index=None)[source]

Returns the name of the classifier file.

Parameters:

index – index of the classifier

Returns:

name of the classifier file

edelweiss.clf_utils.get_detection_label(clf, bands, n_detected_bands=None)[source]

Get the detection label for the classifier.

Parameters:
  • clf – classification data (rec array)

  • bands – which bands the data has

  • n_detected_bands – how many bands have to be detected such that the event is classified as detected; if None, the detection label is already given in clf

Returns:

detection label (bool array) and the names of the detection labels
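The role of n_detected_bands can be sketched as a per-object count over bands. This is an illustration under assumptions, not the edelweiss code: the function name, the dict-of-lists layout, and the choice of "at least n_detected_bands" (rather than exactly that many) are all hypothetical.

```python
def detection_label(detected_per_band, bands, n_detected_bands):
    """Label an object as detected if it is detected in enough bands.

    detected_per_band maps a band name to a list of booleans, one per object.
    """
    n_objects = len(detected_per_band[bands[0]])
    labels = []
    for i in range(n_objects):
        n_detected = sum(1 for band in bands if detected_per_band[band][i])
        # Assumption: "at least n_detected_bands" bands are required.
        labels.append(n_detected >= n_detected_bands)
    return labels


bands = ["g", "r", "i"]
detected = {
    "g": [True, False, False],
    "r": [True, True, False],
    "i": [True, True, False],
}
labels_two = detection_label(detected, bands, n_detected_bands=2)
labels_all = detection_label(detected, bands, n_detected_bands=3)
print(labels_two, labels_all)
```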

edelweiss.clf_utils.get_scaler(scaler)[source]

Returns the scaler object

Parameters:

scaler – name of the scaler

Returns:

scaler object

Raises:

ValueError if scaler is not known

edelweiss.clf_utils.get_scorer(score, **kwargs)[source]

Returns the scorer object for the given input string. If the string is not one of the known self-defined scorers, it is returned unchanged on the assumption that it names a sklearn scorer.

Parameters:
  • score – name of the scorer

  • kwargs – additional arguments for the scorer

Returns:

scorer object

edelweiss.clf_utils.load_hyperparams(clf)[source]

Loads the hyperparameters for the classifier for the CV search.

Parameters:

clf – classifier object

Returns:

hyperparameter grid

edelweiss.clf_utils.ngal_hist_scorer(y_true, y_pred, mag, bins=100, range=(15, 30))[source]

Scorer accounting for the number of galaxies in the sample on a histogram level. score = (N_pred - N_true)**2

Parameters:
  • y_true – true labels (detected or not)

  • y_pred – predicted labels (detected or not)

  • mag – magnitude of the galaxies

  • bins – number of histogram bins

  • range – range of the histogram

Returns:

score
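One way to read "on a histogram level" is to apply the squared count difference per magnitude bin and sum over bins. The sketch below follows that reading; it is an assumption, not the edelweiss implementation, and the function names, the summation over bins, and the handling of values outside the range are hypothetical.

```python
def histogram_counts(values, bins, value_range):
    """Count values in equally wide bins over [low, high); values outside are ignored."""
    low, high = value_range
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        if low <= value < high:
            index = min(int((value - low) / width), bins - 1)
            counts[index] += 1
    return counts


def ngal_hist_score(y_true, y_pred, mag, bins=100, value_range=(15, 30)):
    """Sum over magnitude bins of (N_pred - N_true)**2 for the detected objects."""
    mag_true = [m for m, label in zip(mag, y_true) if label]
    mag_pred = [m for m, label in zip(mag, y_pred) if label]
    n_true = histogram_counts(mag_true, bins, value_range)
    n_pred = histogram_counts(mag_pred, bins, value_range)
    # Assumption: the per-bin squared differences are summed.
    return sum((p - t) ** 2 for p, t in zip(n_pred, n_true))


mag = [18.0, 18.5, 24.0, 24.5]
y_true = [1, 1, 0, 0]
y_pred = [0, 0, 1, 1]
score = ngal_hist_score(y_true, y_pred, mag, bins=3, value_range=(15, 30))
print(score)
```

In this example the total number of detections agrees, yet the score is non-zero because the detections fall into different magnitude bins.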

edelweiss.clf_utils.ngal_scorer(y_true, y_pred)[source]

Scorer accounting for the number of galaxies in the sample. score = (N_pred - N_true)**2

Parameters:
  • y_true – true labels (detected or not)

  • y_pred – predicted labels (detected or not)

Returns:

score
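The formula above can be written out as a short standalone sketch. It is not the edelweiss source; the function name ngal_score is hypothetical, and labels are assumed to be booleans or 0/1 with 1 meaning "detected".

```python
def ngal_score(y_true, y_pred):
    """Return (N_pred - N_true)**2, where N counts the objects labelled as detected."""
    n_true = sum(1 for label in y_true if label)
    n_pred = sum(1 for label in y_pred if label)
    return (n_pred - n_true) ** 2


y_true = [1, 1, 0, 1, 0]
too_few = ngal_score(y_true, [1, 0, 0, 0, 0])      # 1 predicted vs 3 true detections
same_count = ngal_score(y_true, [0, 1, 1, 0, 1])   # 3 predicted vs 3 true detections
print(too_few, same_count)
```

The second call shows that the score depends only on the number of detections: it is zero even though several individual labels disagree.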

edelweiss.custom_clfs module

class edelweiss.custom_clfs.NeuralNetworkClassifier(hidden_units=(64, 32), learning_rate=0.001, epochs=10, batch_size=32, loss='auto', activation='relu', activation_output='auto')[source]

Bases: BaseEstimator, ClassifierMixin

Neural network classifier based on Keras Sequential model

Parameters:
  • hidden_units – tuple/list, optional (default=(64, 32)) The number of units per hidden layer

  • learning_rate – float, optional (default=0.001) The learning rate for the Adam optimizer

  • epochs – int, optional (default=10) The number of epochs to train the model

  • batch_size – int, optional (default=32) The batch size for training the model

  • loss – str, optional (default="auto") The loss function to use, defaults to binary_crossentropy if binary and sparse_categorical_crossentropy if multiclass

  • activation – str, optional (default="relu") The activation function to use for the hidden layers

  • activation_output – str, optional (default="auto") The activation function to use for the output layer, defaults to sigmoid for single class and softmax for multiclass

  • sample_weight_col – int, optional (default=None)
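The "auto" defaults for loss and activation_output can be summarized in a small helper. This is a sketch of the rule as documented above, not the class's actual code; the function name and the use of the number of classes as the deciding input are assumptions.

```python
def resolve_auto_settings(n_classes, loss="auto", activation_output="auto"):
    """Resolve "auto" to the documented defaults; explicit values are kept as given."""
    if n_classes < 2:
        raise ValueError("a classifier needs at least two classes")
    is_binary = n_classes == 2
    if loss == "auto":
        loss = "binary_crossentropy" if is_binary else "sparse_categorical_crossentropy"
    if activation_output == "auto":
        activation_output = "sigmoid" if is_binary else "softmax"
    return loss, activation_output


binary = resolve_auto_settings(2)
multiclass = resolve_auto_settings(3)
explicit = resolve_auto_settings(3, loss="categorical_crossentropy")
print(binary, multiclass, explicit)
```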

fit(X, y, sample_weight=None, early_stopping_patience=10)[source]

Fit the neural network model

Parameters:
  • X – array-like, shape (n_samples, n_features) The training input samples

  • y – array-like, shape (n_samples,) The target values

  • sample_weight – array-like, shape (n_samples,), optional (default=None) Sample weights

  • early_stopping_patience – int, optional (default=10) The number of epochs with no improvement after which training will be stopped
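The patience rule for early stopping can be sketched without any deep learning library. The helper below is hypothetical and only illustrates the counting; which loss the class monitors (training or validation) is not stated here, so the input is simply a list of per-epoch losses.

```python
def stopping_epoch(losses, patience=10):
    """Return the epoch index at which training would stop, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0  # improvement resets the counter
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.6]
stop_short = stopping_epoch(losses, patience=3)
stop_long = stopping_epoch(losses, patience=10)
print(stop_short, stop_long)
```

With a patience of 3 the run stops before the later improvement to 0.6 is reached, which is the usual trade-off between training time and the risk of stopping on a plateau.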

predict(X)[source]

Predict the class labels for the provided data

Parameters:

X – array-like, shape (n_samples, n_features) The input samples

Returns:

array-like, shape (n_samples,) The predicted class labels

predict_proba(X)[source]

Predict the class probabilities for the provided data

Parameters:

X – array-like, shape (n_samples, n_features) The input samples

Returns:

array-like, shape (n_samples, n_classes) The predicted class probabilities

set_fit_request(*, early_stopping_patience: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → NeuralNetworkClassifier

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • early_stopping_patience – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for the early_stopping_patience parameter in fit.

  • sample_weight – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for the sample_weight parameter in fit.

Returns:

self (object) – the updated object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → NeuralNetworkClassifier

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for the sample_weight parameter in score.

Returns:

self (object) – the updated object

edelweiss.custom_regs module

class edelweiss.custom_regs.NeuralNetworkRegressor(hidden_units=(64, 64), learning_rate=0.001, epochs=10, batch_size=32, loss='mse', activation='relu', activation_output='linear', dropout_prob=0.0)[source]

Bases: BaseEstimator

Neural network regressor based on Keras Sequential model

Parameters:
  • hidden_units – tuple/list, optional (default=(64, 64)) The number of units per hidden layer

  • learning_rate – float, optional (default=0.001) The learning rate for the Adam optimizer

  • epochs – int, optional (default=10) The number of epochs to train the model

  • batch_size – int, optional (default=32) The batch size for training the model

  • loss – str, optional (default="mse") The loss function to use

  • activation – str, optional (default="relu") The activation function to use for the hidden layers

  • activation_output – str, optional (default="linear") The activation function to use for the output layer

fit(X, y, sample_weight=None, early_stopping_patience=10)[source]

Fit the neural network model

Parameters:
  • X – array-like, shape (n_samples, n_features) The training input samples

  • y – array-like, shape (n_samples, n_outputs) The target values

  • sample_weight – array-like, shape (n_samples,), optional (default=None) Sample weights

  • early_stopping_patience – int, optional (default=10) The number of epochs with no improvement after which training will be stopped

predict(X)[source]

Predict the output from the input.

Parameters:

X – the input data

Returns:

the predicted output

set_fit_request(*, early_stopping_patience: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → NeuralNetworkRegressor

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • early_stopping_patience – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for the early_stopping_patience parameter in fit.

  • sample_weight – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for the sample_weight parameter in fit.

Returns:

self (object) – the updated object

edelweiss.emulator module

edelweiss.emulator.load_emulator(path, bands=('g', 'r', 'i', 'z', 'y'), multiclassifier=False, subfolder_clf=None, subfolder_nflow=None)[source]

Load an emulator from a given path. If bands is None, returns the classifier and normalizing flow. If bands is not None, returns the classifier and a dictionary of normalizing flows for each band.

Parameters:
  • path – path to the folder containing the emulator

  • bands – the bands to load (if None, assumes that there is only one nflow)

  • multiclassifier – whether to load a multiclassifier or not

  • subfolder_clf – subfolder of the emulator folder where the classifier is stored

  • subfolder_nflow – subfolder of the emulator folder where the normalizing flow is stored

Returns:

the loaded classifier and normalizing flow

edelweiss.nflow module

class edelweiss.nflow.Nflow(output=None, input=None, scaler='standard')[source]

Bases: object

The normalizing flow class that wraps a pzflow normalizing flow.

Parameters:
  • output – the names of the output parameters

  • input – the names of the input parameters (=conditional parameters)

  • scaler – the scaler to use for the normalizing flow

fit(X, epochs=100, batch_size=1024, progress_bar=True, verbose=True, min_loss=5)

Train the normalizing flow.

Parameters:
  • X – the features to train on (recarray)

  • epochs – number of epochs

  • batch_size – batch size

  • progress_bar – whether to show a progress bar

  • verbose – whether to print the losses

  • min_loss – minimum loss that is allowed for convergence

sample(X=None, n_samples=1)[source]

Sample from the normalizing flow.

Parameters:
  • X – the features to sample from (recarray or None for non-conditional sampling)

  • n_samples – number of samples to draw, number of total samples is n_samples * len(X)

Returns:

the sampled features (including the conditional parameters)

save(path, band=None, subfolder=None)[source]

Save the normalizing flow to a given path.

Parameters:
  • path – path to the folder where the emulator is saved

  • subfolder – subfolder of the emulator folder where the normalizing flow is stored

train(X, epochs=100, batch_size=1024, progress_bar=True, verbose=True, min_loss=5)[source]

Train the normalizing flow.

Parameters:
  • X – the features to train on (recarray)

  • epochs – number of epochs

  • batch_size – batch size

  • progress_bar – whether to show a progress bar

  • verbose – whether to print the losses

  • min_loss – minimum loss that is allowed for convergence

edelweiss.nflow.load_nflow(path, band=None, subfolder=None)[source]

Load a normalizing flow from a given path.

Parameters:
  • path – path to the folder containing the emulator

  • band – the band to load (if None, assumes that there is only one nflow)

  • subfolder – subfolder of the emulator folder where the normalizing flow is stored

Returns:

the loaded normalizing flow

edelweiss.nflow_utils module

exception edelweiss.nflow_utils.ModelNotConvergedError(model_name, reason=None)[source]

Bases: Exception

Custom error class for when a model has not converged.

edelweiss.nflow_utils.check_convergence(losses, min_loss=5)[source]

Check if the model has converged.

Parameters:
  • losses – list of losses

  • min_loss – minimum loss; if the loss is higher than this, the model has not converged

Raises:

ModelNotConvergedError – if the model has not converged
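A threshold check of this kind can be sketched as follows. This is not the edelweiss implementation: the exception class is a hypothetical stand-in for ModelNotConvergedError, and testing only the final loss (and treating NaN as not converged) is an assumption about which loss is compared with min_loss.

```python
import math


class NotConvergedError(Exception):
    """Hypothetical stand-in for ModelNotConvergedError."""


def check_final_loss(losses, min_loss=5):
    """Raise NotConvergedError if the last recorded loss is NaN or above min_loss."""
    if not losses:
        raise NotConvergedError("no losses were recorded")
    final_loss = losses[-1]
    # Assumption: only the final loss is compared with the threshold.
    if math.isnan(final_loss) or final_loss > min_loss:
        raise NotConvergedError(f"final loss {final_loss} is not below {min_loss}")


check_final_loss([10.0, 6.0, 3.0])  # passes silently: 3.0 <= 5

try:
    check_final_loss([10.0, 8.0, 7.0])
    diverged = False
except NotConvergedError as error:
    diverged = True
    print(error)
```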

edelweiss.nflow_utils.get_scalers(scaler)[source]

Get the scalers from the name.

Parameters:

scaler – name of the scaler (str)

Returns:

scaler

Raises:

ValueError – if the scaler is not implemented

edelweiss.nflow_utils.prepare_columns(args, bands=None)[source]

Prepare the columns for the training of the normalizing flow.

Parameters:
  • args – argparse arguments

  • bands – list of bands to use, if None, no bands are used

Returns:

input and output columns

edelweiss.nflow_utils.prepare_data(args, X)[source]

Prepare the data for the training of the normalizing flow by combining the different bands to one array.

Parameters:
  • args – argparse arguments

  • X – dictionary with the data (keys are the bands)

Returns:

rec array with the data

edelweiss.reg_utils module

edelweiss.reg_utils.get_regressor(regressor, scaler, **kwargs)[source]

Returns the regressor object

Parameters:
  • regressor – name of the regressor

  • scaler – scaler object

  • kwargs – additional arguments for the regressor

Returns:

regressor object (sklearn pipeline)

Raises:

ValueError if regressor is not known

edelweiss.regressor module

class edelweiss.regressor.Regressor(scaler='standard', reg='linear', cv=0, cv_scoring='neg_mean_squared_error', input_params=None, output_params=None, **reg_kwargs)[source]

Bases: object

Wrapper class for several regression models.

Parameters:
  • scaler – the scaler to use for the regressor

  • reg – the regressor to use

  • cv – number of cross validation folds, if 0 no cross validation is performed

  • cv_scoring – the scoring method to use for cross validation

  • input_params – the names of the input parameters

  • output_params – the names of the output parameters

  • reg_kwargs – additional keyword arguments for the regressor

fit(X, y, flat_param=None, **args)

Train the regressor.

Parameters:
  • X – the training data

  • y – the training labels

predict(X)[source]

Predict the output from the input.

Parameters:

X – the input data

Returns:

the predicted output as a recarray

save(path, name='regressor')[source]

Save the regressor to a given path.

Parameters:
  • path – the path where to save the regressor

  • name – the name of the regressor

test(X, y)[source]

Test the regressor.

Parameters:
  • X – the test data

  • y – the test labels

train(X, y, flat_param=None, **args)[source]

Train the regressor.

Parameters:
  • X – the training data

  • y – the training labels

edelweiss.regressor.load_regressor(path, name='regressor')[source]

Load a regressor from a given path.

Parameters:
  • path – path to the folder containing the regressor

  • name – the name of the regressor

Returns:

the loaded regressor

edelweiss.tf_utils module

class edelweiss.tf_utils.EpochProgressCallback(total_epochs)[source]

Bases: Callback

Class to implement a tqdm progress bar over epochs, written by ChatGPT, provided by Arne Thomsen

on_epoch_end(epoch, logs=None)[source]

Called at the end of an epoch.

Subclasses should override for any actions to run. This function should only be called during TRAIN mode.

Parameters:
  • epoch – Integer, index of epoch.

  • logs – Dict, metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val_. For training epoch, the values of the Model's metrics are returned. Example: {'loss': 0.2, 'accuracy': 0.7}.

on_train_begin(logs=None)[source]

Called at the beginning of training.

Subclasses should override for any actions to run.

Parameters:

logs – Dict. Currently no data is passed to this argument for this method but that may change in the future.

on_train_end(logs=None)[source]

Called at the end of training.

Subclasses should override for any actions to run.

Parameters:

logs – Dict. Currently the output of the last call to on_epoch_end() is passed to this argument for this method but that may change in the future.

Module contents