edelweiss package

Submodules

edelweiss.classifier module

class edelweiss.classifier.Classifier(scaler='standard', clf='XGB', calibrate=True, cv=0, cv_scoring='f1', params=None, **clf_kwargs)[source]

Bases: object

The detection classifier class that wraps a sklearn classifier.

Parameters:
  • scaler – the scaler to use for the classifier, options: standard, minmax, maxabs, robust, quantile

  • clf – the classifier to use, options are: XGB, MLP, RandomForest, NeuralNetwork, LogisticRegression, LinearSVC, DecisionTree, AdaBoost, GaussianNB, QDA, KNN

  • calibrate – whether to calibrate the probabilities

  • cv – number of cross validation folds, if 0 no cross validation is performed

  • cv_scoring – the scoring method to use for cross validation

  • params – the names of the parameters

  • clf_kwargs – additional keyword arguments for the classifier

fit(X, y, param_grid=None, **args)

Train the classifier.

Parameters:
  • X – the features to train on (array or recarray)

  • y – the labels to train on

  • param_grid – the hyperparameter grid to search over

  • args – additional arguments for the classifier

predict(X, prob_multiplier=1.0)[source]

Predict the labels for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted labels

predict_non_proba(X)[source]

Predict the probabilities for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted probabilities

predict_proba(X)[source]

Predict the probabilities for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted probabilities

save(path, subfolder=None)[source]

Save the classifier to a given path.

test(X_test, y_test, non_proba=False)[source]

Tests the classifier on the test data

Parameters:
  • test_arr – dict where the test scores will be saved

  • clf – classifier

  • X_test – test data

  • y_test – test labels

  • non_proba – whether to use non-probabilistic predictions

train(X, y, param_grid=None, **args)[source]

Train the classifier.

Parameters:
  • X – the features to train on (array or recarray)

  • y – the labels to train on

  • param_grid – the hyperparameter grid to search over

  • args – additional arguments for the classifier

class edelweiss.classifier.MultiClassClassifier(scaler='standard', clf='XGB', calibrate=True, cv=0, cv_scoring='f1', params=None, **clf_kwargs)[source]

Bases: Classifier

The detection classifier class that wraps a sklearn classifier for multiple classes.

Parameters:
  • scaler – the scaler to use for the classifier, options: standard, minmax, maxabs, robust, quantile

  • clf – the classifier to use, options are: XGB, MLP, RandomForest, NeuralNetwork, LogisticRegression, LinearSVC, DecisionTree, AdaBoost, GaussianNB, QDA, KNN

  • calibrate – whether to calibrate the probabilities

  • cv – number of cross validation folds, if 0 no cross validation is performed

  • cv_scoring – the scoring method to use for cross validation

  • params – the names of the parameters

  • clf_kwargs – additional keyword arguments for the classifier

predict(X)[source]

Predict the labels for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted labels

predict_non_proba(X)[source]

Predict the class non-probabilistically for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted class labels

predict_proba(X)[source]

Predict the probabilities for a given set of features.

Parameters:

X – the features to predict on (array or recarray)

Returns:

the predicted probabilities

test(X_test, y_test, non_proba=False)[source]

Tests the classifier on the test data

Parameters:
  • test_arr – dict where the test scores will be saved

  • clf – classifier

  • X_test – test data

  • y_test – test labels

  • non_proba – whether to use non-probabilistic predictions

class edelweiss.classifier.MultiClassifier(split_label='galaxy_type', labels=None, scaler='standard', clf='XGB', calibrate=True, cv=0, cv_scoring='f1', params=None, **clf_kwargs)[source]

Bases: object

A classifier class that trains multiple classifiers for a specific label. This label could e.g. be the galaxy type (star, red galaxy, blue galaxy).

Parameters:
  • split_label – the label used to split the data among the different classifiers

  • labels – the different labels of the split label

  • scaler – the scaler to use for the classifier

  • clf – the classifier to use

  • calibrate – whether to calibrate the probabilities

  • cv – number of cross validation folds, if 0 no cross validation is performed

  • cv_scoring – the scoring method to use for cross validation

  • params – the names of the parameters

  • clf_kwargs – additional keyword arguments for the classifier
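
The idea of training one classifier per value of split_label can be sketched without edelweiss as follows. This is only an illustration under stated assumptions: MajorityModel and SplitByLabel are hypothetical stand-ins, rows are plain dictionaries rather than recarrays, and the real class may route samples differently.

```python
from collections import Counter, defaultdict


class MajorityModel:
    """Hypothetical stand-in for a real classifier: predicts the most common label."""

    def fit(self, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict_one(self):
        return self.label_


class SplitByLabel:
    """Train one model per value of a split column (illustrative only)."""

    def __init__(self, split_label="galaxy_type"):
        self.split_label = split_label
        self.models = {}

    def fit(self, rows, y):
        # Group the targets by the value of the split column.
        groups = defaultdict(list)
        for row, target in zip(rows, y):
            groups[row[self.split_label]].append(target)
        self.models = {key: MajorityModel().fit(values) for key, values in groups.items()}
        return self

    def predict(self, rows):
        # Each row is answered by the model trained on its own group.
        return [self.models[row[self.split_label]].predict_one() for row in rows]


rows = [{"galaxy_type": "star"}, {"galaxy_type": "star"}, {"galaxy_type": "red"}]
y = [True, True, False]
clf = SplitByLabel().fit(rows, y)
predictions = clf.predict([{"galaxy_type": "red"}, {"galaxy_type": "star"}])
```

Here each prediction comes from the model trained on the matching galaxy type, so the "red" row and the "star" row are handled by different models.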

fit(X, y)

Train the classifier.

predict(X)[source]

Predict the labels for a given set of features.

predict_non_proba(X)[source]

Predict the probabilities for a given set of features.

predict_proba(X)[source]

Predict the probabilities for a given set of features.

save(path, subfolder=None)[source]

Save the classifier to a given path.

Parameters:
  • path – path to the folder where the emulator is saved

  • subfolder – subfolder of the emulator folder where the classifier is stored

test(X_test, y_test, non_proba=False)[source]

Tests the classifier on the test data

Parameters:
  • test_arr – dict where the test scores will be saved

  • clf – classifier

  • X_test – test data

  • y_test – test labels

  • non_proba – whether to use non-probabilistic predictions

train(X, y)[source]

Train the classifier.

edelweiss.classifier.load_classifier(path, subfolder=None)[source]

Load a classifier from a given path.

Parameters:
  • path – path to the folder containing the emulator

  • subfolder – subfolder of the emulator folder where the classifier is stored

Returns:

the loaded classifier

edelweiss.classifier.load_multiclassifier(path, subfolder=None)[source]

Load a multiclassifier from a given path.

Parameters:
  • path – path to the folder containing the emulator

  • subfolder – subfolder of the emulator folder where the classifier is stored

Returns:

the loaded classifier

edelweiss.clf_diagnostics module

edelweiss.clf_diagnostics.add_range_to_name(field_names, ranges)[source]

Add the range to the name of the variable such that the range is visible in the spider plot.

Parameters:
  • field_names – list with the names of the variables

  • ranges – dictionary with the ranges for each variable
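
A minimal sketch of what adding the range to a variable name could look like. The label format and the fallback range of (0, 1) for variables missing from ranges are assumptions made for illustration; the function name is hypothetical.

```python
def add_range_to_name_sketch(field_names, ranges):
    """Append each variable's range to its name (label format is an assumption)."""
    labelled = []
    for name in field_names:
        # Assumed fallback: variables without an entry use (0, 1).
        low, high = ranges.get(name, (0, 1))
        labelled.append(f"{name} ({low}, {high})")
    return labelled


labels = add_range_to_name_sketch(["f1", "accuracy"], {"f1": (0.5, 1)})
```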

edelweiss.clf_diagnostics.get_all_scores(test_arr, y_test, y_pred, y_prob)[source]

Calculates all the scores and appends them to the test_arr dict.

Parameters:
  • test_arr – dict where the test scores will be saved

  • y_test – test labels

  • y_pred – predicted labels

  • y_prob – probability of being detected

edelweiss.clf_diagnostics.get_all_scores_multiclass(test_arr, y_test, y_pred, y_prob)[source]

Calculates all the scores and appends them to the test_arr dict for a multiclass classifier.

Parameters:
  • test_arr – dict where the test scores will be saved

  • y_test – test labels

  • y_pred – predicted labels

  • y_prob – probability of being detected

edelweiss.clf_diagnostics.get_confusion_matrix(y_true, y_pred)[source]

Get the confusion matrix for the classifier.

Parameters:
  • y_true – true labels

  • y_pred – predicted labels

Returns:

True Positives, True Negatives, False Positives, False Negatives
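
For binary labels, the four returned counts can be illustrated with a plain-Python sketch. The function name is hypothetical and the code does not reproduce the library's implementation; only the return order (TP, TN, FP, FN) is taken from the description above.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels, in that order."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif not truth and not pred:
            tn += 1
        elif pred:
            fp += 1  # predicted detected, but truly undetected
        else:
            fn += 1  # predicted undetected, but truly detected
    return tp, tn, fp, fn


counts = confusion_counts([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```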

edelweiss.clf_diagnostics.get_default_ranges_for_spider()[source]

Get the default ranges for the spider plot.

Returns:

dictionary with the ranges for each variable

edelweiss.clf_diagnostics.get_name(clf, final=False)[source]

Get the name to add to the classifier

Parameters:
  • clf – classifier object (from sklearn) or name of the classifier

  • final – if True, the classifier was tested on the test data.

Returns:

name

edelweiss.clf_diagnostics.plot_all_scores(scores, path_labels=None)[source]

Plot all scores for the classifiers. Input can either be directly a recarray with the scores or the path to the scores or a list of paths to the scores. If a list is given, the scores of the different paths are combined and plotted with different colors.

Parameters:
  • scores – recarray with the scores or path to the scores or list of paths

  • path_labels – list of labels for the different paths

edelweiss.clf_diagnostics.plot_calibration_curve(y_true, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]

Plot the calibration curve for the classifier.

Parameters:
  • y_true – true labels

  • y_prob – predicted probabilities

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

  • fig – figure object, if None, create a new figure
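
The quantity shown in a calibration curve can be sketched as follows: predictions are grouped into probability bins, and for each bin the mean predicted probability is compared with the observed fraction of positives. How plot_calibration_curve bins the data internally is not stated here, so the equal-width binning and the n_bins parameter below are assumptions.

```python
def calibration_points(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, fraction of positives) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for truth, prob in zip(y_true, y_prob):
        # Equal-width bins on [0, 1]; prob == 1.0 falls into the last bin.
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((truth, prob))
    points = []
    for members in bins:
        if members:
            mean_prob = sum(p for _, p in members) / len(members)
            frac_pos = sum(t for t, _ in members) / len(members)
            points.append((mean_prob, frac_pos))
    return points


points = calibration_points([0, 0, 1, 1], [0.1, 0.1, 0.9, 0.9], n_bins=2)
```

A well-calibrated classifier produces points close to the diagonal, where the mean predicted probability equals the fraction of positives.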

edelweiss.clf_diagnostics.plot_classifier_comparison(clfs, conf, path, spider_ranges=None, labels=None, print_scores=False, special_param='mag_i')[source]

Plot the diagnostics for chosen classifiers. If the classifiers are not all from same path, the conf and path parameters should be lists of the same length as clfs.

Parameters:
  • clfs – list of classifier names

  • conf – configuration dictionary or list of dictionaries

  • path – path to the data or list of paths

  • spider_ranges – dictionary with the ranges for the spider plot

  • labels – list of labels for the different paths

  • print_scores – if True, print the scores for the different classifiers

  • special_param – param to plot the histogram for

edelweiss.clf_diagnostics.plot_diagnostics(clf, X_test, y_test, output_directory='.', final=False, save_plot=False, special_param='mag_i')[source]

Plot the diagnostics for the classifier.

Parameters:
  • clf – classifier object

  • X_test – test data

  • y_test – true labels

  • output_directory – directory to save the plots

  • final – if True, the classifier was tested on the test data.

  • save_plot – if True, save the plots

  • special_param – param to plot the histogram for

edelweiss.clf_diagnostics.plot_feature_importances(clf, clf_name='classifier', summed=False)[source]

Plots the feature importances for the classifier.

Parameters:
  • clf – classifier object

  • names – names of the features

  • clf_name – name of the classifier

  • summed – if True, the summed feature importances are plotted

edelweiss.clf_diagnostics.plot_hist_fp_fn_tp_tn(param, y_true, y_pred, output_directory='.', clf='classifier', final=False, save_plot=False)[source]

Plot the stacked histogram of one parameter (e.g. i-band magnitude) for the different confusion matrix elements.

Parameters:
  • param – parameter to plot

  • y_true – true labels

  • y_pred – predicted labels

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

edelweiss.clf_diagnostics.plot_hist_n_gal(param, y_true, y_pred, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]

Plot the histogram of detected galaxies for the classifier and the true detected galaxies for one parameter (e.g. i-band magnitude).

Parameters:
  • param – parameter to plot

  • y_true – true labels

  • y_pred – predicted labels

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

  • fig – figure object, if None, create a new figure

edelweiss.clf_diagnostics.plot_pr_curve(y_true, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]

Plot the precision-recall curve for the classifier.

Parameters:
  • y_true – true labels

  • y_prob – predicted probabilities

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

  • fig – figure object, if None, create a new figure

Returns:

figure object

edelweiss.clf_diagnostics.plot_roc_curve(y_true, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]

Plot the ROC curve for the classifier.

Parameters:
  • y_true – true labels

  • y_prob – predicted probabilities

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

  • fig – figure object, if None, create a new figure

edelweiss.clf_diagnostics.plot_spider_scores(y_true, y_pred, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None, ranges=None, print_scores=False)[source]

Plot the spider scores for the classifier.

Parameters:
  • y_true – true labels

  • y_pred – predicted labels

  • y_prob – predicted probabilities

  • output_directory – directory to save the plot

  • clf – classifier object or name of the classifier

  • final – if True, the plot is for the final classifier

  • save_plot – if True, save the plot

  • fig – figure object, if None, create a new figure

  • ranges – dictionary of ranges for each score

  • print_scores – if True, print the scores

Returns:

figure object

edelweiss.clf_diagnostics.scale_data_for_spider(data, ranges=None)[source]

Scale the data for the spider plot such that the chosen range corresponds to the 0-1 range of the spider plot.

If the lower value of the range is higher than the upper value, the data is inverted.

Parameters:
  • data – data to scale

  • ranges – dictionary with the ranges for each variable, if a parameter is not in the dictionary, the default range is (0, 1)

Returns:

scaled data
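
The scaling described above can be sketched with a linear map of each range onto [0, 1]. Representing data as a dictionary of scalars is an assumption made for this illustration, and the function name is hypothetical. Note that the same formula handles the inverted case: when the lower value of the range exceeds the upper one, both numerator and denominator change sign.

```python
def scale_for_spider(data, ranges=None):
    """Map each value linearly so that its range (low, high) becomes (0, 1)."""
    ranges = ranges or {}
    scaled = {}
    for name, value in data.items():
        low, high = ranges.get(name, (0, 1))  # default range from the description
        scaled[name] = (value - low) / (high - low)
    return scaled


scaled = scale_for_spider(
    {"f1": 0.75, "brier": 0.1},
    {"f1": (0.5, 1.0), "brier": (0.5, 0.0)},  # inverted range: lower is better
)
```

With the inverted range (0.5, 0.0), a small value such as 0.1 ends up near the outer edge of the plot.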

edelweiss.clf_diagnostics.setup_test(multi_class=False)[source]

Returns a dict where the test scores will be saved.

edelweiss.clf_utils module

edelweiss.clf_utils.custom_roc_auc_score(y_true, y_prob)[source]

Scorer for the ROC AUC score using y_prob

Parameters:
  • y_true – true labels (detected or not)

  • y_prob – predicted probabilities (2D array)

Returns:

score
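
Since y_prob is a 2D array, a scorer of this kind has to pick one column before computing the ROC AUC. The sketch below assumes that column 1 holds the probability of the positive (detected) class, which is the usual predict_proba layout but is not confirmed here. It computes the AUC as the fraction of positive/negative pairs ranked correctly, with ties counted as one half.

```python
def roc_auc_from_proba(y_true, y_prob):
    """ROC AUC from a 2D probability array, via pairwise ranking."""
    # Assumption: column 1 is the probability of the positive class.
    scores = [row[1] for row in y_prob]
    positives = [s for s, t in zip(scores, y_true) if t]
    negatives = [s for s, t in zip(scores, y_true) if not t]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


auc = roc_auc_from_proba(
    [0, 0, 1, 1],
    [[0.9, 0.1], [0.6, 0.4], [0.65, 0.35], [0.2, 0.8]],
)
```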

edelweiss.clf_utils.get_classifier(classifier, scaler=None, **kwargs)[source]

Returns the classifier object

Parameters:
  • classifier – name of the classifier

  • scaler – scaler object

  • kwargs – additional arguments for the classifier

Returns:

classifier object (sklearn pipeline)

Raises:

ValueError if classifier is not known

edelweiss.clf_utils.get_classifier_args(clf, conf)[source]

Returns the arguments for the classifier defined in the config file

Parameters:
  • clf – classifier name

  • conf – config file

Returns:

arguments for the classifier

edelweiss.clf_utils.get_clf_name(index=None)[source]

Returns the name of the classifier file.

Parameters:

index – index of the classifier

Returns:

name of the classifier file

edelweiss.clf_utils.get_detection_label(clf, bands, n_detected_bands=None)[source]

Get the detection label for the classifier.

Parameters:
  • clf – classification data (rec array)

  • bands – which bands the data has

  • n_detected_bands – how many bands have to be detected such that the event is classified as detected, if None, the detection label is already given in clf

Returns:

detection label (bool array) and the names of the detection labels
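
The role of n_detected_bands can be illustrated with a small sketch. The per-band column names ("detected_g", ...), the dictionary-of-lists catalog, and the "at least n bands" rule are all assumptions made for illustration; the function name is hypothetical.

```python
def detection_label_sketch(catalog, bands, n_detected_bands):
    """Label an object as detected if enough bands report a detection."""
    names = [f"detected_{band}" for band in bands]  # hypothetical column names
    n_rows = len(catalog[names[0]])
    label = []
    for i in range(n_rows):
        n_detected = sum(bool(catalog[name][i]) for name in names)
        # Assumed rule: at least n_detected_bands bands must be detected.
        label.append(n_detected >= n_detected_bands)
    return label, names


catalog = {
    "detected_g": [1, 0, 1],
    "detected_r": [1, 0, 0],
    "detected_i": [0, 0, 1],
}
label, names = detection_label_sketch(catalog, ["g", "r", "i"], n_detected_bands=2)
```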

edelweiss.clf_utils.get_scaler(scaler)[source]

Returns the scaler object

Parameters:

scaler – name of the scaler

Returns:

scaler object

Raises:

ValueError if scaler is not known

edelweiss.clf_utils.get_scorer(score, **kwargs)[source]

Returns the scorer object given input string. If not one of the known self defined scorers, returns the input string assuming it is a sklearn scorer.

Parameters:
  • score – name of the scorer

  • kwargs – additional arguments for the scorer

Returns:

scorer object
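
The lookup-with-fallback behaviour described above can be sketched as a dictionary lookup that returns the input string when no custom scorer matches. The registry and the scorer name below are hypothetical; the names edelweiss actually recognises may differ.

```python
def count_mismatch(y_true, y_pred):
    """Hypothetical custom scorer: squared difference in positive counts."""
    return (sum(map(bool, y_pred)) - sum(map(bool, y_true))) ** 2


# Hypothetical registry of self-defined scorers.
CUSTOM_SCORERS = {"count_mismatch": count_mismatch}


def get_scorer_sketch(score):
    """Return a known custom scorer, else pass the string through unchanged."""
    return CUSTOM_SCORERS.get(score, score)


custom = get_scorer_sketch("count_mismatch")
passthrough = get_scorer_sketch("f1")  # left as a string for sklearn to resolve
```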

edelweiss.clf_utils.load_hyperparams(clf)[source]

Loads the hyperparameters for the classifier for the CV search.

Parameters:

clf – classifier object

Returns:

hyperparameter grid

edelweiss.clf_utils.ngal_hist_scorer(y_true, y_pred, mag, bins=100, range=(15, 30))[source]

Scorer accounting for the number of galaxies in the sample on a histogram level. score = (N_pred - N_true)**2

Parameters:
  • y_true – true labels (detected or not)

  • y_pred – predicted labels (detected or not)

  • mag – magnitude of the galaxies

Returns:

score
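
One way to read "on a histogram level" is that the squared count difference is evaluated per magnitude bin and then summed. That reading is an assumption, as are the function name and the pure-Python binning; the sketch below only illustrates why such a score is stricter than a single total count.

```python
def ngal_hist_score(y_true, y_pred, mag, bins=100, value_range=(15, 30)):
    """Sum over magnitude bins of (N_pred - N_true)**2 (assumed aggregation)."""
    low, high = value_range
    width = (high - low) / bins

    def histogram(selected):
        counts = [0] * bins
        for m, keep in zip(mag, selected):
            if keep and low <= m <= high:
                counts[min(int((m - low) / width), bins - 1)] += 1
        return counts

    true_counts = histogram(y_true)
    pred_counts = histogram(y_pred)
    return sum((p - t) ** 2 for p, t in zip(pred_counts, true_counts))


mag = [20.0, 20.0, 25.0, 25.0]
score = ngal_hist_score([1, 0, 1, 0], [0, 0, 1, 1], mag, bins=3)
```

In this example both label sets contain two detected galaxies, so a total-count score would be zero, while the per-bin comparison still penalises the mismatch in magnitude.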

edelweiss.clf_utils.ngal_scorer(y_true, y_pred)[source]

Scorer accounting for the number of galaxies in the sample. score = (N_pred - N_true)**2

Parameters:
  • y_true – true labels (detected or not)

  • y_pred – predicted labels (detected or not)

Returns:

score
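
The formula above translates directly into code. This is a minimal sketch with a hypothetical function name, treating any truthy label as "detected".

```python
def ngal_score(y_true, y_pred):
    """Squared difference between predicted and true number of detected galaxies."""
    n_true = sum(1 for t in y_true if t)
    n_pred = sum(1 for p in y_pred if p)
    return (n_pred - n_true) ** 2


score = ngal_score([1, 1, 0, 1], [1, 0, 0, 0])
```

The score depends only on the totals, so it is zero whenever the predicted and true counts agree, regardless of which individual objects were labelled as detected.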

edelweiss.custom_clfs module

class edelweiss.custom_clfs.NeuralNetworkClassifier(hidden_units=(64, 32), learning_rate=0.001, epochs=10, batch_size=32, loss='auto', activation='relu', activation_output='auto')[source]

Bases: BaseEstimator, ClassifierMixin

Neural network classifier based on Keras Sequential model

Parameters:
  • hidden_units – tuple/list, optional (default=(64, 32)) The number of units per hidden layer

  • learning_rate – float, optional (default=0.001) The learning rate for the Adam optimizer

  • epochs – int, optional (default=10) The number of epochs to train the model

  • batch_size – int, optional (default=32) The batch size for training the model

  • loss – str, optional (default='auto') The loss function to use, defaults to binary_crossentropy if binary and sparse_categorical_crossentropy if multiclass

  • activation – str, optional (default='relu') The activation function to use for the hidden layers

  • activation_output – str, optional (default='auto') The activation function to use for the output layer, defaults to sigmoid for single class and softmax for multiclass

  • sample_weight_col – int, optional (default=None)
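
The "auto" defaults for loss and activation_output can be sketched as a small resolution step. Deciding between binary and multiclass by counting the distinct labels in y is an assumption, and the function name is hypothetical; only the resulting loss and activation names come from the parameter descriptions above.

```python
def resolve_auto_settings(y, loss="auto", activation_output="auto"):
    """Pick loss and output activation from the number of classes in y."""
    # Assumption: two or fewer distinct labels means a binary problem.
    binary = len(set(y)) <= 2
    if loss == "auto":
        loss = "binary_crossentropy" if binary else "sparse_categorical_crossentropy"
    if activation_output == "auto":
        activation_output = "sigmoid" if binary else "softmax"
    return loss, activation_output


binary_settings = resolve_auto_settings([0, 1, 1, 0])
multiclass_settings = resolve_auto_settings([0, 1, 2, 1])
explicit_settings = resolve_auto_settings([0, 1], loss="hinge")
```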

fit(X, y, sample_weight=None, early_stopping_patience=10)[source]

Fit the neural network model

Parameters:
  • X – array-like, shape (n_samples, n_features) The training input samples

  • y – array-like, shape (n_samples,) The target values

  • sample_weight – array-like, shape (n_samples,), optional (default=None) Sample weights

  • early_stopping_patience – int, optional (default=10) The number of epochs with no improvement after which training will be stopped
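
The meaning of early_stopping_patience can be illustrated with a plain-Python loop over a sequence of monitored losses. Which quantity the class actually monitors is not stated here, so the list of losses below is purely illustrative and the function name is hypothetical.

```python
def epochs_run(monitored_losses, patience):
    """Return the epoch at which training stops under patience-based early stopping."""
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(monitored_losses, start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(monitored_losses)


stopped_at = epochs_run([1.0, 0.8, 0.9, 0.85, 0.95, 0.7], patience=3)
```

With a patience of 3, training stops after the third consecutive epoch that fails to improve on the best loss of 0.8, so the final, better epoch is never reached.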

predict(X)[source]

Predict the class labels for the provided data

Parameters:

X – array-like, shape (n_samples, n_features) The input samples

Returns:

array-like, shape (n_samples,) The predicted class labels

predict_proba(X)[source]

Predict the class probabilities for the provided data

Parameters:

X – array-like, shape (n_samples, n_features) The input samples

Returns:

array-like, shape (n_samples, n_classes) The predicted class probabilities

set_fit_request(*, early_stopping_patience: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') NeuralNetworkClassifier

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • early_stopping_patience – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED Metadata routing for early_stopping_patience parameter in fit.

  • sample_weight – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED Metadata routing for sample_weight parameter in fit.

Returns:

self – object The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') NeuralNetworkClassifier

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED Metadata routing for sample_weight parameter in score.

Returns:

self – object The updated object.

edelweiss.custom_regs module

class edelweiss.custom_regs.NeuralNetworkRegressor(hidden_units=(64, 64), learning_rate=0.001, epochs=10, batch_size=32, loss='mse', activation='relu', activation_output='linear', dropout_prob=0.0)[source]

Bases: BaseEstimator

Neural network regressor based on Keras Sequential model

Parameters:
  • hidden_units – tuple/list, optional (default=(64, 64)) The number of units per hidden layer

  • learning_rate – float, optional (default=0.001) The learning rate for the Adam optimizer

  • epochs – int, optional (default=10) The number of epochs to train the model

  • batch_size – int, optional (default=32) The batch size for training the model

  • loss – str, optional (default='mse') The loss function to use

  • activation – str, optional (default='relu') The activation function to use for the hidden layers

  • activation_output – str, optional (default='linear') The activation function to use for the output layer

fit(X, y, sample_weight=None, early_stopping_patience=10)[source]

Fit the neural network model

Parameters:
  • X – array-like, shape (n_samples, n_features) The training input samples

  • y – array-like, shape (n_samples, n_outputs) The target values

  • sample_weight – array-like, shape (n_samples,), optional (default=None)

  • early_stopping_patience – int, optional (default=10) The number of epochs with no improvement after which training will be stopped

predict(X)[source]

Predict the output from the input.

Parameters:

X – the input data

Returns:

the predicted output

set_fit_request(*, early_stopping_patience: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') NeuralNetworkRegressor

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • early_stopping_patience – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED Metadata routing for early_stopping_patience parameter in fit.

  • sample_weight – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED Metadata routing for sample_weight parameter in fit.

Returns:

self – object The updated object.

edelweiss.emulator module

edelweiss.emulator.load_emulator(path, bands=('g', 'r', 'i', 'z', 'y'), multiclassifier=False, subfolder_clf=None, subfolder_nflow=None)[source]

Load an emulator from a given path. If bands is None, returns the classifier and normalizing flow. If bands is not None, returns the classifier and a dictionary of normalizing flows for each band.

Parameters:
  • path – path to the folder containing the emulator

  • bands – the bands to load (if None, assumes that there is only one nflow)

  • multiclassifier – whether to load a multiclassifier or not

  • subfolder_clf – subfolder of the emulator folder where the classifier is stored

  • subfolder_nflow – subfolder of the emulator folder where the normalizing flow is stored

Returns:

the loaded classifier and normalizing flow

edelweiss.nflow module

class edelweiss.nflow.Nflow(output=None, input=None, scaler='standard')[source]

Bases: object

The normalizing flow class that wraps a pzflow normalizing flow.

Parameters:
  • output – the names of the output parameters

  • input – the names of the input parameters (=conditional parameters)

  • scaler – the scaler to use for the normalizing flow

fit(X, epochs=100, batch_size=1024, progress_bar=True, verbose=True, min_loss=5)

Train the normalizing flow.

Parameters:
  • X – the features to train on (recarray)

  • epochs – number of epochs

  • batch_size – batch size

  • progress_bar – whether to show a progress bar

  • verbose – whether to print the losses

  • min_loss – minimum loss that is allowed for convergence

sample(X=None, n_samples=1)[source]

Sample from the normalizing flow.

Parameters:
  • X – the features to sample from (recarray or None for non-conditional sampling)

  • n_samples – number of samples to draw, number of total samples is n_samples * len(X)

Returns:

the sampled features (including the conditional parameters)
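
The relation between n_samples and the total number of samples can be sketched by repeating each conditional row n_samples times. The ordering of the repeated rows is an assumption, and the function name is hypothetical; no normalizing flow is involved in this illustration.

```python
def repeat_conditions(X, n_samples=1):
    """Repeat every conditional row n_samples times (assumed ordering)."""
    return [row for row in X for _ in range(n_samples)]


conditions = [{"mag_i": 22.1}, {"mag_i": 24.3}, {"mag_i": 25.0}]
repeated = repeat_conditions(conditions, n_samples=2)
```

Three conditional rows with n_samples=2 give six rows in total, matching n_samples * len(X).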

save(path, band=None, subfolder=None)[source]

Save the normalizing flow to a given path.

Parameters:
  • path – path to the folder where the emulator is saved

  • subfolder – subfolder of the emulator folder where the normalizing flow is stored

train(X, epochs=100, batch_size=1024, progress_bar=True, verbose=True, min_loss=5)[source]

Train the normalizing flow.

Parameters:
  • X – the features to train on (recarray)

  • epochs – number of epochs

  • batch_size – batch size

  • progress_bar – whether to show a progress bar

  • verbose – whether to print the losses

  • min_loss – minimum loss that is allowed for convergence

edelweiss.nflow.load_nflow(path, band=None, subfolder=None)[source]

Load a normalizing flow from a given path.

Parameters:
  • path – path to the folder containing the emulator

  • band – the band to load (if None, assumes that there is only one nflow)

  • subfolder – subfolder of the emulator folder where the normalizing flow is stored

Returns:

the loaded normalizing flow

edelweiss.nflow_utils module

exception edelweiss.nflow_utils.ModelNotConvergedError(model_name, reason=None)[source]

Bases: Exception

Custom error class for when a model has not converged.

edelweiss.nflow_utils.check_convergence(losses, min_loss=5)[source]

Check if the model has converged.

Parameters:
  • losses – list of losses

  • min_loss – minimum loss, if the loss is higher than this, the model has not converged

Raises:

ModelNotConvergedError – if the model has not converged
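
A minimal sketch of such a convergence check is shown below. It assumes that only the final loss is compared with min_loss and that a NaN loss also counts as not converged; both choices, the error message format, and the "nflow" model name are illustrative rather than taken from the library.

```python
import math


class ModelNotConvergedError(Exception):
    """Stand-in for the exception described above (message format is assumed)."""

    def __init__(self, model_name, reason=None):
        message = f"The {model_name} model did not converge."
        if reason is not None:
            message += f" Reason: {reason}"
        super().__init__(message)


def check_convergence_sketch(losses, min_loss=5):
    """Raise if the final loss is NaN or above min_loss (assumed criterion)."""
    final = losses[-1]
    if math.isnan(final):
        raise ModelNotConvergedError("nflow", reason="loss is NaN")
    if final > min_loss:
        raise ModelNotConvergedError("nflow", reason=f"loss {final} > {min_loss}")


check_convergence_sketch([10.0, 6.0, 3.0])  # passes silently

try:
    check_convergence_sketch([10.0, 8.0, 7.0])
    raised = False
except ModelNotConvergedError:
    raised = True
```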

edelweiss.nflow_utils.get_scalers(scaler)[source]

Get the scalers from the name.

Parameters:

scaler – name of the scaler (str)

Returns:

scaler

Raises:

ValueError – if the scaler is not implemented

edelweiss.nflow_utils.prepare_columns(args, bands=None)[source]

Prepare the columns for the training of the normalizing flow.

Parameters:
  • args – argparse arguments

  • bands – list of bands to use, if None, no bands are used

Returns:

input and output columns

edelweiss.nflow_utils.prepare_data(args, X)[source]

Prepare the data for the training of the normalizing flow by combining the different bands to one array.

Parameters:
  • args – argparse arguments

  • X – dictionary with the data (keys are the bands)

Returns:

rec array with the data

edelweiss.reg_utils module

edelweiss.reg_utils.get_regressor(regressor, scaler, **kwargs)[source]

Returns the regressor object

Parameters:
  • regressor – name of the regressor

  • scaler – scaler object

  • kwargs – additional arguments for the regressor

Returns:

regressor object (sklearn pipeline)

Raises:

ValueError if regressor is not known

edelweiss.regressor module

class edelweiss.regressor.Regressor(scaler='standard', reg='linear', cv=0, cv_scoring='neg_mean_squared_error', input_params=None, output_params=None, **reg_kwargs)[source]

Bases: object

Wrapper class for several regression models.

Parameters:
  • scaler – the scaler to use for the regressor

  • reg – the regressor to use

  • cv – number of cross validation folds, if 0 no cross validation is performed

  • cv_scoring – the scoring method to use for cross validation

  • input_params – the names of the input parameters

  • output_params – the names of the output parameters

  • reg_kwargs – additional keyword arguments for the regressor

fit(X, y, flat_param=None, **args)

Train the regressor.

Parameters:
  • X – the training data

  • y – the training labels

predict(X)[source]

Predict the output from the input.

Parameters:

X – the input data

Returns:

the predicted output as a recarray

save(path, name='regressor')[source]

Save the regressor to a given path.

Parameters:
  • path – the path where to save the regressor

  • name – the name of the regressor

test(X, y)[source]

Test the regressor.

Parameters:
  • X – the test data

  • y – the test labels

train(X, y, flat_param=None, **args)[source]

Train the regressor.

Parameters:
  • X – the training data

  • y – the training labels

edelweiss.regressor.load_regressor(path, name='regressor')[source]

Load a regressor from a given path.

Parameters:
  • path – path to the folder containing the regressor

  • name – the name of the regressor

Returns:

the loaded regressor

edelweiss.tf_utils module

class edelweiss.tf_utils.EpochProgressCallback(total_epochs)[source]

Bases: Callback

Class to implement a tqdm progress bar over epochs, written by ChatGPT, provided by Arne Thomsen

on_epoch_end(epoch, logs=None)[source]

Called at the end of an epoch.

Subclasses should override for any actions to run. This function should only be called during TRAIN mode.

Parameters:
  • epoch – Integer, index of epoch.

  • logs – Dict, metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val_. For training epoch, the values of the Model's metrics are returned. Example: {'loss': 0.2, 'accuracy': 0.7}.

on_train_begin(logs=None)[source]

Called at the beginning of training.

Subclasses should override for any actions to run.

Parameters:

logs – Dict. Currently no data is passed to this argument for this method but that may change in the future.

on_train_end(logs=None)[source]

Called at the end of training.

Subclasses should override for any actions to run.

Parameters:

logs – Dict. Currently the output of the last call to on_epoch_end() is passed to this argument for this method but that may change in the future.

Module contents