edelweiss package¶

Submodules¶

edelweiss.classifier module¶

class edelweiss.classifier.Classifier(scaler='standard', clf='XGB', calibrate=True, cv=0, cv_scoring='f1', params=None, **clf_kwargs)[source]¶

Bases: object

The detection classifier class that wraps a sklearn classifier.

Parameters:

scaler – the scaler to use for the classifier, options: standard, minmax, maxabs, robust, quantile
clf – the classifier to use, options are: XGB, MLP, RandomForest, NeuralNetwork, LogisticRegression, LinearSVC, DecisionTree, AdaBoost, GaussianNB, QDA, KNN
calibrate – whether to calibrate the probabilities
cv – number of cross validation folds, if 0 no cross validation is performed
cv_scoring – the scoring method to use for cross validation
params – the names of the parameters
clf_kwargs – additional keyword arguments for the classifier

fit(X, y, **args)¶

Train the classifier.

Parameters:

X – the features to train on (array or recarray)
y – the labels to train on
args – additional arguments for the classifier

predict(X, prob_multiplier=1.0)[source]¶

Predict the labels for a given set of features.

Parameters:: X – the features to predict on (array or recarray)
Returns:: the predicted labels

predict_non_proba(X)[source]¶

Predict the labels non-probabilistically for a given set of features.

Parameters:: X – the features to predict on (array or recarray)
Returns:: the predicted labels

predict_proba(X)[source]¶

Predict the probabilities for a given set of features.

Parameters:: X – the features to predict on (array or recarray)
Returns:: the predicted probabilities
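
The difference between drawing labels from probabilities and thresholding them can be sketched in plain Python. This is an illustration under assumptions, not the edelweiss implementation: it assumes that predict compares each probability (scaled by prob_multiplier) with a uniform random number and that predict_non_proba applies a fixed 0.5 cut. The helper names sample_labels and threshold_labels are hypothetical.

```python
import random


def sample_labels(probs, prob_multiplier=1.0, seed=None):
    """Draw one boolean label per probability (assumed behaviour of predict)."""
    rng = random.Random(seed)
    # A label is True when the scaled probability exceeds a uniform draw in [0, 1).
    return [p * prob_multiplier > rng.random() for p in probs]


def threshold_labels(probs, threshold=0.5):
    """Deterministic labels from a fixed cut (assumed behaviour of predict_non_proba)."""
    return [p > threshold for p in probs]


probs = [0.0, 0.3, 0.7, 1.0]
sampled = sample_labels(probs, seed=42)
fixed = threshold_labels(probs)
print(sampled, fixed)
```

Under these assumptions, sampling keeps the expected number of positive labels equal to the sum of the probabilities, whereas a fixed cut does not.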

save(path, subfolder=None)[source]¶

Save the classifier to a given path.

Parameters:

path – path to the folder where the emulator is saved
subfolder – subfolder of the emulator folder where the classifier is stored

test(X_test, y_test, non_proba=False)[source]¶

Test the classifier on the test data.

Parameters:

X_test – test data
y_test – test labels
non_proba – whether to use non-probabilistic predictions

train(X, y, **args)[source]¶

Train the classifier.

Parameters:

X – the features to train on (array or recarray)
y – the labels to train on
args – additional arguments for the classifier

class edelweiss.classifier.MultiClassClassifier(scaler='standard', clf='XGB', calibrate=True, cv=0, cv_scoring='f1', params=None, **clf_kwargs)[source]¶

Bases: Classifier

The detection classifier class that wraps a sklearn classifier for multiple classes.

Parameters:

scaler – the scaler to use for the classifier, options: standard, minmax, maxabs, robust, quantile
clf – the classifier to use, options are: XGB, MLP, RandomForest, NeuralNetwork, LogisticRegression, LinearSVC, DecisionTree, AdaBoost, GaussianNB, QDA, KNN
calibrate – whether to calibrate the probabilities
cv – number of cross validation folds, if 0 no cross validation is performed
cv_scoring – the scoring method to use for cross validation
params – the names of the parameters
clf_kwargs – additional keyword arguments for the classifier

predict(X)[source]¶

Predict the labels for a given set of features.

Parameters:: X – the features to predict on (array or recarray)
Returns:: the predicted labels

predict_non_proba(X)[source]¶

Predict the class non-probabilistically for a given set of features.

Parameters:: X – the features to predict on (array or recarray)
Returns:: the predicted classes

predict_proba(X)[source]¶

Predict the probabilities for a given set of features.

Parameters:: X – the features to predict on (array or recarray)
Returns:: the predicted probabilities

test(X_test, y_test, non_proba=False)[source]¶

Test the classifier on the test data.

Parameters:

X_test – test data
y_test – test labels
non_proba – whether to use non-probabilistic predictions

class edelweiss.classifier.MultiClassifier(split_label='galaxy_type', labels=None, scaler='standard', clf='XGB', calibrate=True, cv=0, cv_scoring='f1', params=None, **clf_kwargs)[source]¶

Bases: object

A classifier class that trains multiple classifiers for a specific label. This label could e.g. be the galaxy type (star, red galaxy, blue galaxy).

Parameters:

split_label – the label used to split the data among the different classifiers
labels – the different labels of the split label
scaler – the scaler to use for the classifier
clf – the classifier to use
calibrate – whether to calibrate the probabilities
cv – number of cross validation folds, if 0 no cross validation is performed
cv_scoring – the scoring method to use for cross validation
params – the names of the parameters
clf_kwargs – additional keyword arguments for the classifier
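
The idea of training one classifier per value of split_label can be sketched as follows. This is a minimal illustration, not the edelweiss implementation: MajorityClassifier is a trivial stand-in for a real classifier, the rows are plain dictionaries rather than recarrays, and the helper names fit_per_group and predict_per_group as well as the string labels "star" and "red" are hypothetical.

```python
from collections import Counter, defaultdict


class MajorityClassifier:
    """Trivial stand-in for a real classifier: always predicts the most common label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def fit_per_group(rows, y, split_label):
    """Train one classifier per distinct value of row[split_label]."""
    grouped = defaultdict(lambda: ([], []))
    for row, target in zip(rows, y):
        features, targets = grouped[row[split_label]]
        features.append(row)
        targets.append(target)
    return {key: MajorityClassifier().fit(X, t) for key, (X, t) in grouped.items()}


def predict_per_group(models, rows, split_label):
    """Route every row to the classifier trained for its group."""
    return [models[row[split_label]].predict([row])[0] for row in rows]


rows = [
    {"galaxy_type": "star", "mag_i": 18.0},
    {"galaxy_type": "star", "mag_i": 24.5},
    {"galaxy_type": "red", "mag_i": 25.0},
    {"galaxy_type": "red", "mag_i": 26.0},
]
y = [True, True, False, False]
models = fit_per_group(rows, y, split_label="galaxy_type")
predictions = predict_per_group(models, rows, split_label="galaxy_type")
print(predictions)
```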

fit(X, y)¶: Train the classifier.

predict(X)[source]¶: Predict the labels for a given set of features.

predict_non_proba(X)[source]¶: Predict the labels non-probabilistically for a given set of features.

predict_proba(X)[source]¶: Predict the probabilities for a given set of features.

save(path, subfolder=None)[source]¶

Save the classifier to a given path.

Parameters:

path – path to the folder where the emulator is saved
subfolder – subfolder of the emulator folder where the classifier is stored

test(X_test, y_test, non_proba=False)[source]¶

Test the classifier on the test data.

Parameters:

X_test – test data
y_test – test labels
non_proba – whether to use non-probabilistic predictions

train(X, y)[source]¶: Train the classifier.

edelweiss.classifier.load_classifier(path, subfolder=None)[source]¶

Load a classifier from a given path.

Parameters:

path – path to the folder containing the emulator
subfolder – subfolder of the emulator folder where the classifier is stored

Returns:

the loaded classifier

edelweiss.classifier.load_multiclassifier(path, subfolder=None)[source]¶

Load a multiclassifier from a given path.

Parameters:

path – path to the folder containing the emulator
subfolder – subfolder of the emulator folder where the classifier is stored

Returns:

the loaded classifier

edelweiss.clf_diagnostics module¶

edelweiss.clf_diagnostics.add_range_to_name(field_names, ranges)[source]¶

Add the range to the name of the variable such that the range is visible in the spider plot.

Parameters:

field_names – list with the names of the variables
ranges – dictionary with the ranges for each variable

edelweiss.clf_diagnostics.get_all_scores(test_arr, y_test, y_pred, y_prob)[source]¶

Calculates all scores and appends them to the test_arr dict.

Parameters:

test_arr – dict where the test scores will be saved
y_test – test labels
y_pred – predicted labels
y_prob – probability of being detected

edelweiss.clf_diagnostics.get_all_scores_multiclass(test_arr, y_test, y_pred, y_prob)[source]¶

Calculates all scores for a multiclass classifier and appends them to the test_arr dict.

Parameters:

test_arr – dict where the test scores will be saved
y_test – test labels
y_pred – predicted labels
y_prob – probability of being detected

edelweiss.clf_diagnostics.get_confusion_matrix(y_true, y_pred)[source]¶

Get the confusion matrix for the classifier.

Parameters:

y_true – true labels
y_pred – predicted labels

Returns:

True Positives, True Negatives, False Positives, False Negatives
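
For binary labels, the four returned counts can be computed as in the sketch below. It uses plain Python sequences and returns the counts in the order documented above; the function name confusion_counts is hypothetical and this is not necessarily how edelweiss computes them.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for two equally long sequences of booleans."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif not truth and not pred:
            tn += 1
        elif pred:
            fp += 1  # predicted detected, but truly undetected
        else:
            fn += 1  # predicted undetected, but truly detected
    return tp, tn, fp, fn


y_true = [True, True, False, False, True]
y_pred = [True, False, False, True, True]
counts = confusion_counts(y_true, y_pred)
print(counts)
```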

edelweiss.clf_diagnostics.get_default_ranges_for_spider()[source]¶

Get the default ranges for the spider plot.

Returns:: dictionary with the ranges for each variable

edelweiss.clf_diagnostics.get_name(clf, final=False)[source]¶

Get the name to add to the classifier

Parameters:

clf – classifier object (from sklearn) or name of the classifier
final – if True, the classifier was tested on the test data.

Returns:

name

edelweiss.clf_diagnostics.plot_all_scores(scores, path_labels=None)[source]¶

Plot all scores for the classifiers. Input can either be directly a recarray with the scores or the path to the scores or a list of paths to the scores. If a list is given, the scores of the different paths are combined and plotted with different colors.

Parameters:

scores – recarray with the scores or path to the scores or list of paths
path_labels – list of labels for the different paths

edelweiss.clf_diagnostics.plot_calibration_curve(y_true, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]¶

Plot the calibration curve for the classifier.

Parameters:

y_true – true labels
y_prob – predicted probabilities
output_directory – directory to save the plot
clf – classifier object or name of the classifier
final – if True, the plot is for the final classifier
save_plot – if True, save the plot
fig – figure object, if None, create a new figure

edelweiss.clf_diagnostics.plot_classifier_comparison(clfs, conf, path, spider_ranges=None, labels=None, print_scores=False, special_param='mag_i')[source]¶

Plot the diagnostics for chosen classifiers. If the classifiers are not all from same path, the conf and path parameters should be lists of the same length as clfs.

Parameters:

clfs – list of classifier names
conf – configuration dictionary or list of dictionaries
path – path to the data or list of paths
spider_ranges – dictionary with the ranges for the spider plot
labels – list of labels for the different paths
print_scores – if True, print the scores for the different classifiers
special_param – param to plot the histogram for

edelweiss.clf_diagnostics.plot_diagnostics(clf, X_test, y_test, output_directory='.', final=False, save_plot=False, special_param='mag_i')[source]¶

Plot the diagnostics for the classifier.

Parameters:

clf – classifier object
X_test – test data
y_test – true labels
output_directory – directory to save the plots
final – if True, the classifier was tested on the test data.
save_plot – if True, save the plots
special_param – param to plot the histogram for

edelweiss.clf_diagnostics.plot_feature_importances(clf, clf_name='classifier', summed=False)[source]¶

Plots the feature importances for the classifier.

Parameters:

clf – classifier object
clf_name – name of the classifier
summed – if True, the summed feature importances are plotted

edelweiss.clf_diagnostics.plot_hist_fp_fn_tp_tn(param, y_true, y_pred, output_directory='.', clf='classifier', final=False, save_plot=False)[source]¶

Plot the stacked histogram of one parameter (e.g. i-band magnitude) for the different confusion matrix elements.

Parameters:

param – parameter to plot
y_true – true labels
y_pred – predicted labels
output_directory – directory to save the plot
clf – classifier object or name of the classifier
final – if True, the plot is for the final classifier
save_plot – if True, save the plot

edelweiss.clf_diagnostics.plot_hist_n_gal(param, y_true, y_pred, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]¶

Plot the histogram of detected galaxies for the classifier and the true detected galaxies for one parameter (e.g. i-band magnitude).

Parameters:

param – parameter to plot
y_true – true labels
y_pred – predicted labels
output_directory – directory to save the plot
clf – classifier object or name of the classifier
final – if True, the plot is for the final classifier
save_plot – if True, save the plot
fig – figure object, if None, create a new figure

edelweiss.clf_diagnostics.plot_pr_curve(y_true, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]¶

Plot the precision-recall curve for the classifier.

Parameters:

y_true – true labels
y_prob – predicted probabilities
output_directory – directory to save the plot
clf – classifier object or name of the classifier
final – if True, the plot is for the final classifier
save_plot – if True, save the plot
fig – figure object, if None, create a new figure

Returns:

figure object

edelweiss.clf_diagnostics.plot_roc_curve(y_true, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None)[source]¶

Plot the ROC curve for the classifier.

Parameters:

y_true – true labels
y_prob – predicted probabilities
output_directory – directory to save the plot
clf – classifier object or name of the classifier
final – if True, the plot is for the final classifier
save_plot – if True, save the plot
fig – figure object, if None, create a new figure

edelweiss.clf_diagnostics.plot_spider_scores(y_true, y_pred, y_prob, output_directory='.', clf='classifier', final=False, save_plot=False, fig=None, ranges=None, print_scores=False)[source]¶

Plot the spider scores for the classifier.

Parameters:

y_true – true labels
y_pred – predicted labels
y_prob – predicted probabilities
output_directory – directory to save the plot
clf – classifier object or name of the classifier
final – if True, the plot is for the final classifier
save_plot – if True, save the plot
fig – figure object, if None, create a new figure
ranges – dictionary of ranges for each score
print_scores – if True, print the scores

Returns:

figure object

edelweiss.clf_diagnostics.scale_data_for_spider(data, ranges=None)[source]¶

Scale the data for the spider plot such that the chosen range corresponds to the 0-1 range of the spider plot.

If the lower value of the range is higher than the upper value, the data is inverted.

Parameters:

data – data to scale
ranges – dictionary with the ranges for each variable; if a parameter is not in the dictionary, the default range is (0, 1)

Returns:

scaled data
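
The scaling rule described above, including the inversion when the lower bound exceeds the upper bound, can be sketched with a single linear map. This is an illustration only: the data are held in a plain dictionary, and the function name scale_for_spider and the score names used in the example are hypothetical.

```python
def scale_for_spider(data, ranges=None, default_range=(0.0, 1.0)):
    """Map each value so that its (lower, upper) range spans 0 to 1."""
    ranges = ranges or {}
    scaled = {}
    for name, value in data.items():
        lower, upper = ranges.get(name, default_range)
        # With lower > upper the denominator is negative, which flips the axis.
        scaled[name] = (value - lower) / (upper - lower)
    return scaled


scores = {"accuracy": 0.9, "brier": 0.25, "f1": 0.8}
ranges = {"accuracy": (0.8, 1.0), "brier": (1.0, 0.0)}
scaled = scale_for_spider(scores, ranges)
print(scaled)
```

In this example the score named brier uses the inverted range (1, 0), so a low raw value ends up near the outer edge of the plot.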

edelweiss.clf_diagnostics.setup_test(multi_class=False)[source]¶: Returns a dict where the test scores will be saved.

edelweiss.clf_utils module¶

edelweiss.clf_utils.custom_roc_auc_score(y_true, y_prob)[source]¶

Scorer for the ROC AUC score using y_prob

Parameters:

y_true – true labels (detected or not)
y_prob – predicted probabilities (2D array)

Returns:

score
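
A ROC AUC computed from a 2D probability array can be sketched without any dependencies by using the rank form of the statistic: the fraction of (positive, negative) pairs in which the positive object receives the higher score. The sketch assumes that column 1 of y_prob holds the probability of the positive class, which this page does not state; the function name roc_auc_from_2d_proba is hypothetical.

```python
def roc_auc_from_2d_proba(y_true, y_prob):
    """ROC AUC from a 2D probability array, using the rank (Mann-Whitney) form."""
    # Assumption: column 1 holds the probability of the positive class.
    pos_scores = [row[1] for row, label in zip(y_prob, y_true) if label]
    neg_scores = [row[1] for row, label in zip(y_prob, y_true) if not label]
    if not pos_scores or not neg_scores:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


y_true = [1, 0, 1, 0]
y_prob = [[0.2, 0.8], [0.7, 0.3], [0.6, 0.4], [0.5, 0.5]]
auc = roc_auc_from_2d_proba(y_true, y_prob)
print(auc)
```

The pairwise loop is quadratic in the sample size, so it is only meant to show the definition, not to replace a library routine.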

edelweiss.clf_utils.get_classifier(classifier, scaler=None, **kwargs)[source]¶

Returns the classifier object

Parameters:

classifier – name of the classifier
scaler – scaler object
kwargs – additional arguments for the classifier

Returns:

classifier object (sklearn pipeline)

Raises:

ValueError if classifier is not known

edelweiss.clf_utils.get_classifier_args(clf, conf)[source]¶

Returns the arguments for the classifier defined in the config file

Parameters:

clf – classifier name
conf – config file

Returns:

arguments for the classifier

edelweiss.clf_utils.get_clf_name(index=None)[source]¶

Returns the name of the classifier file.

Parameters:: index – index of the classifier
Returns:: name of the classifier file

edelweiss.clf_utils.get_detection_label(clf, bands, n_detected_bands=None)[source]¶

Get the detection label for the classifier.

Parameters:

clf – classification data (rec array)
bands – which bands the data has
n_detected_bands – how many bands have to be detected such that the event is classified as detected; if None, the detection label is already given in clf

Returns:

detection label (bool array) and the names of the detection labels
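
The role of n_detected_bands can be illustrated with a small sketch. It rests on several assumptions that are not confirmed by this page: the per-band detection flags are stored in columns named detected_<band>, a combined column named detected exists when n_detected_bands is None, and "detected" means detected in at least n_detected_bands bands. The catalog is a plain dictionary of lists rather than a recarray.

```python
def detection_label(catalog, bands, n_detected_bands=None):
    """Return one boolean per object and the column names that were used."""
    if n_detected_bands is None:
        # Assumption: a combined "detected" column already exists.
        return [bool(v) for v in catalog["detected"]], ["detected"]
    names = [f"detected_{band}" for band in bands]  # hypothetical column naming
    n_objects = len(catalog[names[0]])
    # Count, for every object, in how many bands it was detected.
    counts = [sum(bool(catalog[name][i]) for name in names) for i in range(n_objects)]
    return [count >= n_detected_bands for count in counts], names


catalog = {
    "detected_g": [1, 0, 1],
    "detected_r": [1, 0, 0],
    "detected_i": [1, 1, 0],
}
labels, names = detection_label(catalog, bands=["g", "r", "i"], n_detected_bands=2)
print(labels, names)
```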

edelweiss.clf_utils.get_scaler(scaler)[source]¶

Returns the scaler object

Parameters:: scaler – name of the scaler
Returns:: scaler object
Raises:: ValueError if scaler is not known

edelweiss.clf_utils.get_scorer(score, **kwargs)[source]¶

Returns the scorer object for a given input string. If the string is not one of the known custom scorers, it is returned unchanged on the assumption that it names a sklearn scorer.

Parameters:

score – name of the scorer
kwargs – additional arguments for the scorer

Returns:

scorer object

edelweiss.clf_utils.load_hyperparams(clf)[source]¶

Loads the hyperparameters for the classifier for the CV search.

Parameters:: clf – classifier object
Returns:: hyperparameter grid

edelweiss.clf_utils.ngal_hist_scorer(y_true, y_pred, mag, bins=100, range=(15, 30))[source]¶

Scorer accounting for the number of galaxies in the sample on a histogram level. score = (N_pred - N_true)**2

Parameters:

y_true – true labels (detected or not)
y_pred – predicted labels (detected or not)
mag – magnitude of the galaxies

Returns:

score

edelweiss.clf_utils.ngal_scorer(y_true, y_pred)[source]¶

Scorer accounting for the number of galaxies in the sample. score = (N_pred - N_true)**2

Parameters:

y_true – true labels (detected or not)
y_pred – predicted labels (detected or not)

Returns:

score
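
The two count-based scores above differ in where the counts are compared: once for the whole sample, once per magnitude bin. The sketch below illustrates both in plain Python. Summing the squared differences over the bins is an assumption about how the histogram-level score is aggregated, the default bins and range are taken from the signature above, and the parameter is called hist_range here only to avoid shadowing the Python built-in.

```python
def ngal_score(y_true, y_pred):
    """Squared difference between predicted and true number of detections."""
    return (sum(map(bool, y_pred)) - sum(map(bool, y_true))) ** 2


def ngal_hist_score(y_true, y_pred, mag, bins=100, hist_range=(15, 30)):
    """Sum of squared count differences over magnitude bins (assumed aggregation)."""
    low, high = hist_range
    width = (high - low) / bins
    n_true = [0] * bins
    n_pred = [0] * bins
    for truth, pred, m in zip(y_true, y_pred, mag):
        if not low <= m <= high:
            continue  # objects outside the histogram range are ignored
        index = min(int((m - low) / width), bins - 1)  # right edge joins the last bin
        n_true[index] += bool(truth)
        n_pred[index] += bool(pred)
    return sum((p - t) ** 2 for p, t in zip(n_pred, n_true))


y_true = [1, 0, 1, 0]
y_pred = [0, 1, 1, 0]
mag = [20.0, 26.0, 22.0, 28.0]
total = ngal_score(y_true, y_pred)
binned = ngal_hist_score(y_true, y_pred, mag, bins=3)
print(total, binned)
```

In this example the total number of detections matches, so the first score is zero, while the histogram-level score is non-zero because the detections fall into the wrong magnitude bins.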

edelweiss.custom_clfs module¶

class edelweiss.custom_clfs.NeuralNetworkClassifier(hidden_units=(64, 32), learning_rate=0.001, epochs=10, batch_size=32, loss='auto', activation='relu', activation_output='auto')[source]¶

Bases: BaseEstimator, ClassifierMixin

Neural network classifier based on Keras Sequential model

Parameters:

hidden_units – tuple/list, optional (default=(64, 32)) The number of units per hidden layer
learning_rate – float, optional (default=0.001) The learning rate for the Adam optimizer
epochs – int, optional (default=10) The number of epochs to train the model
batch_size – int, optional (default=32) The batch size for training the model
loss – str, optional (default="auto") The loss function to use, defaults to binary_crossentropy if binary and sparse_categorical_crossentropy if multiclass
activation – str, optional (default="relu") The activation function to use for the hidden layers
activation_output – str, optional (default="auto") The activation function to use for the output layer, defaults to sigmoid for single class and softmax for multiclass
sample_weight_col – int, optional (default=None)
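
How the "auto" settings for loss and activation_output resolve can be written down as a small helper. The mapping follows the parameter descriptions above; the function name resolve_auto_settings and the number of output units it returns are assumptions made for this sketch, not part of the documented class.

```python
def resolve_auto_settings(n_classes, loss="auto", activation_output="auto"):
    """Pick the loss and output activation from the number of classes."""
    if n_classes < 2:
        raise ValueError("a classifier needs at least two classes")
    binary = n_classes == 2
    if loss == "auto":
        loss = "binary_crossentropy" if binary else "sparse_categorical_crossentropy"
    if activation_output == "auto":
        activation_output = "sigmoid" if binary else "softmax"
    # Assumption: one sigmoid unit for the binary case, one softmax unit per class otherwise.
    n_output_units = 1 if binary else n_classes
    return loss, activation_output, n_output_units


binary_settings = resolve_auto_settings(2)
multiclass_settings = resolve_auto_settings(3)
print(binary_settings, multiclass_settings)
```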

fit(X, y, sample_weight=None, early_stopping_patience=10)[source]¶

Fit the neural network model

Parameters:

X – array-like, shape (n_samples, n_features) The training input samples
y – array-like, shape (n_samples,) The target values
sample_weight – array-like, shape (n_samples,), optional (default=None) Sample weights
early_stopping_patience – int, optional (default=10) The number of epochs with no improvement after which training will be stopped
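
The meaning of early_stopping_patience can be illustrated with a plain-Python sketch of patience-based early stopping. It assumes that one loss value is monitored per epoch and that any decrease counts as an improvement; which quantity the class actually monitors is not stated on this page. The function name stopping_epoch is hypothetical.

```python
def stopping_epoch(losses, patience=10):
    """Return the index of the epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None  # patience was never exhausted


losses = [1.00, 0.90, 0.95, 0.96, 0.97]
stop = stopping_epoch(losses, patience=2)
print(stop)
```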

predict(X)[source]¶

Predict the class labels for the provided data

Parameters:: X – array-like, shape (n_samples, n_features) The input samples
Returns:: array-like, shape (n_samples,) The predicted class labels

predict_proba(X)[source]¶

Predict the class probabilities for the provided data

Parameters:: X – array-like, shape (n_samples, n_features) The input samples
Returns:: array-like, shape (n_samples, n_classes) The predicted class probabilities

set_fit_request(*, early_stopping_patience: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → NeuralNetworkClassifier¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

early_stopping_patience – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for early_stopping_patience parameter in fit.
sample_weight – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for sample_weight parameter in fit.

Returns:

self – object. The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → NeuralNetworkClassifier¶

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for sample_weight parameter in score.

Returns:

self – object. The updated object.

edelweiss.custom_regs module¶

class edelweiss.custom_regs.NeuralNetworkRegressor(hidden_units=(64, 64), learning_rate=0.001, epochs=10, batch_size=32, loss='mse', activation='relu', activation_output='linear', dropout_prob=0.0)[source]¶

Bases: BaseEstimator

Neural network regressor based on Keras Sequential model

Parameters:

hidden_units – tuple/list, optional (default=(64, 64)) The number of units per hidden layer
learning_rate – float, optional (default=0.001) The learning rate for the Adam optimizer
epochs – int, optional (default=10) The number of epochs to train the model
batch_size – int, optional (default=32) The batch size for training the model
loss – str, optional (default="mse") The loss function to use
activation – str, optional (default="relu") The activation function to use for the hidden layers
activation_output – str, optional (default="linear") The activation function to use for the output layer

fit(X, y, sample_weight=None, early_stopping_patience=10)[source]¶

Fit the neural network model

Parameters:

X – array-like, shape (n_samples, n_features) The training input samples
y – array-like, shape (n_samples, n_outputs) The target values
sample_weight – array-like, shape (n_samples,), optional (default=None)
early_stopping_patience – int, optional (default=10) The number of epochs with no improvement after which training will be stopped

predict(X)[source]¶

Predict the output from the input.

Parameters:: X – the input data
Returns:: the predicted output

set_fit_request(*, early_stopping_patience: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → NeuralNetworkRegressor¶

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

early_stopping_patience – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for early_stopping_patience parameter in fit.
sample_weight – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for sample_weight parameter in fit.

Returns:

self – object. The updated object.

edelweiss.emulator module¶

edelweiss.emulator.load_emulator(path, bands=('g', 'r', 'i', 'z', 'y'), multiclassifier=False, subfolder_clf=None, subfolder_nflow=None)[source]¶

Load an emulator from a given path. If bands is None, returns the classifier and normalizing flow. If bands is not None, returns the classifier and a dictionary of normalizing flows for each band.

Parameters:

path – path to the folder containing the emulator
bands – the bands to load (if None, assumes that there is only one nflow)
multiclassifier – whether to load a multiclassifier or not
subfolder_clf – subfolder of the emulator folder where the classifier is stored
subfolder_nflow – subfolder of the emulator folder where the normalizing flow is stored

Returns:

the loaded classifier and normalizing flow

edelweiss.nflow module¶

class edelweiss.nflow.Nflow(output=None, input=None, scaler='standard')[source]¶

Bases: object

The normalizing flow class that wraps a pzflow normalizing flow.

Parameters:

output – the names of the output parameters
input – the names of the input parameters (=conditional parameters)
scaler – the scaler to use for the normalizing flow

fit(X, epochs=100, batch_size=1024, progress_bar=True, verbose=True, min_loss=5)¶

Train the normalizing flow.

Parameters:

X – the features to train on (recarray)
epochs – number of epochs
batch_size – batch size
progress_bar – whether to show a progress bar
verbose – whether to print the losses
min_loss – minimum loss that is allowed for convergence

sample(X=None, n_samples=1)[source]¶

Sample from the normalizing flow.

Parameters:

X – the features to sample from (recarray or None for non-conditional sampling)
n_samples – number of samples to draw, number of total samples is n_samples * len(X)

Returns:

the sampled features (including the conditional parameters)

save(path, band=None, subfolder=None)[source]¶

Save the normalizing flow to a given path.

Parameters:

path – path to the folder where the emulator is saved
subfolder – subfolder of the emulator folder where the normalizing flow is stored

train(X, epochs=100, batch_size=1024, progress_bar=True, verbose=True, min_loss=5)[source]¶

Train the normalizing flow.

Parameters:

X – the features to train on (recarray)
epochs – number of epochs
batch_size – batch size
progress_bar – whether to show a progress bar
verbose – whether to print the losses
min_loss – minimum loss that is allowed for convergence

edelweiss.nflow.load_nflow(path, band=None, subfolder=None)[source]¶

Load a normalizing flow from a given path.

Parameters:

path – path to the folder containing the emulator
band – the band to load (if None, assumes that there is only one nflow)
subfolder – subfolder of the emulator folder where the normalizing flow is stored

Returns:

the loaded normalizing flow

edelweiss.nflow_utils module¶

exception edelweiss.nflow_utils.ModelNotConvergedError(model_name, reason=None)[source]¶

Bases: Exception

Custom error class for when a model has not converged.

edelweiss.nflow_utils.check_convergence(losses, min_loss=5)[source]¶

Check if the model has converged.

Parameters:

losses – list of losses
min_loss – minimum loss; if the loss is higher than this, the model has not converged

Raises:

ModelNotConvergedError – if the model has not converged
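
A convergence check of this kind can be sketched as follows. The sketch assumes that only the last recorded loss is inspected and that a non-finite loss also counts as a failure; neither detail is stated on this page. The exception class is a local stand-in with the documented constructor signature, and the message text is invented for the example.

```python
import math


class ModelNotConvergedError(Exception):
    """Local stand-in for the exception documented above."""

    def __init__(self, model_name, reason=None):
        message = f"The {model_name} model did not converge."
        if reason is not None:
            message += f" Reason: {reason}"
        super().__init__(message)


def check_convergence(losses, min_loss=5):
    """Raise if the final loss is not finite or lies above min_loss."""
    final = losses[-1]  # assumption: only the last loss is inspected
    if not math.isfinite(final):
        raise ModelNotConvergedError("normalizing flow", reason="loss is not finite")
    if final > min_loss:
        raise ModelNotConvergedError("normalizing flow", reason=f"loss {final} > {min_loss}")


check_convergence([12.0, 7.5, 3.2])  # passes silently
try:
    check_convergence([12.0, 9.0, 6.1])
except ModelNotConvergedError as error:
    outcome = str(error)
    print(outcome)
```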

edelweiss.nflow_utils.get_scalers(scaler)[source]¶

Get the scalers from the name.

Parameters:: scaler – name of the scaler (str)
Returns:: scaler
Raises:: ValueError – if the scaler is not implemented

edelweiss.nflow_utils.prepare_columns(args, bands=None)[source]¶

Prepare the columns for the training of the normalizing flow.

Parameters:

args – argparse arguments
bands – list of bands to use, if None, no bands are used

Returns:

input and output columns

edelweiss.nflow_utils.prepare_data(args, X)[source]¶

Prepare the data for the training of the normalizing flow by combining the different bands to one array.

Parameters:

args – argparse arguments
X – dictionary with the data (keys are the bands)

Returns:

rec array with the data

edelweiss.reg_utils module¶

edelweiss.reg_utils.get_regressor(regressor, scaler, **kwargs)[source]¶

Returns the regressor object

Parameters:

regressor – name of the regressor
scaler – scaler object
kwargs – additional arguments for the regressor

Returns:

regressor object (sklearn pipeline)

Raises:

ValueError if regressor is not known

edelweiss.regressor module¶

class edelweiss.regressor.Regressor(scaler='standard', reg='linear', cv=0, cv_scoring='neg_mean_squared_error', input_params=None, output_params=None, **reg_kwargs)[source]¶

Bases: object

Wrapper class for several regression models.

Parameters:

scaler – the scaler to use for the regressor
reg – the regressor to use
cv – number of cross validation folds, if 0 no cross validation is performed
cv_scoring – the scoring method to use for cross validation
input_params – the names of the input parameters
output_params – the names of the output parameters
reg_kwargs – additional keyword arguments for the regressor

fit(X, y, flat_param=None, **args)¶

Train the regressor.

Parameters:

X – the training data
y – the training labels

predict(X)[source]¶

Predict the output from the input.

Parameters:: X – the input data
Returns:: the predicted output as a recarray

save(path, name='regressor')[source]¶

Save the regressor to a given path.

Parameters:

path – the path where to save the regressor
name – the name of the regressor

test(X, y)[source]¶

Test the regressor.

Parameters:

X – the test data
y – the test labels

train(X, y, flat_param=None, **args)[source]¶

Train the regressor.

Parameters:

X – the training data
y – the training labels

edelweiss.regressor.load_regressor(path, name='regressor')[source]¶

Load a regressor from a given path.

Parameters:

path – path to the folder containing the regressor
name – the name of the regressor

Returns:

the loaded regressor

edelweiss.tf_utils module¶

class edelweiss.tf_utils.EpochProgressCallback(total_epochs)[source]¶

Bases: Callback

Class to implement a tqdm progress bar over epochs, written by ChatGPT, provided by Arne Thomsen

on_epoch_end(epoch, logs=None)[source]¶

Called at the end of an epoch.

Subclasses should override for any actions to run. This function should only be called during TRAIN mode.

Args:

epoch: Integer, index of epoch.
logs: Dict, metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val_. For training epoch, the values of the Model's metrics are returned. Example: {'loss': 0.2, 'accuracy': 0.7}.

on_train_begin(logs=None)[source]¶

Called at the beginning of training.

Subclasses should override for any actions to run.

Args:

logs: Dict. Currently no data is passed to this argument for this method but that may change in the future.

on_train_end(logs=None)[source]¶

Called at the end of training.

Subclasses should override for any actions to run.

Args:

logs: Dict. Currently the output of the last call to on_epoch_end() is passed to this argument for this method but that may change in the future.
