NegativeBinomialVAE
latent.models.vae.NegativeBinomialVAE
Negative binomial variational autoencoder with fixed variational encoder and negative binomial decoder networks.
__init__(self, name='nb_vae', x_dim=None, latent_dim=50, encoder_units=[128, 128], decoder_units=[128, 128], reconstruction_loss=None, use_conditions=False, dispersion='constant', kld_weight=1e-05, prior='normal', latent_dist='independent', iaf_units=[256, 256], n_pseudoinputs=200, **kwargs)
special
Parameters:
Name | Type | Description | Default |
---|---|---|---|
decoder | | Keras/tensorflow model object that inputs the latent space and outputs the reconstructed data. If not provided, a default model will be constructed from the arguments. | required |
name | str | String indicating the name of the model. | 'nb_vae' |
x_dim | int | Integer indicating the number of features in the input data. | None |
latent_dim | int | Integer indicating the number of dimensions in the latent space. | 50 |
encoder_units | Iterable[int] | Integer list indicating the number of units of the encoder layers. Only used if | [128, 128] |
decoder_units | Iterable[int] | An integer list indicating the number of units of the decoder layers. Only used if | [128, 128] |
reconstruction_loss | Callable | Loss function applied to the reconstructed data and to be added by the decoder. Only used if | None |
use_conditions | bool | Boolean, whether to force the unpacking of conditions from the inputs. | False |
dispersion | Union[Literal['gene', 'cell-gene', 'constant'], float] | One of the following: | 'constant' |
kld_weight | float | Float indicating the weight of the KL Divergence regularization loss. | 1e-05 |
prior | Literal['normal', 'iaf', 'vamp'] | The choice of prior distribution. One of the following: | 'normal' |
latent_dist | Literal['independent', 'multivariate'] | The choice of latent distribution. One of the following: | 'independent' |
iaf_units | Iterable[int] | Integer list indicating the units in the IAF bijector network. Only used if | [256, 256] |
n_pseudoinputs | int | Integer indicating the number of pseudoinputs for the VAMP prior. Only used if | 200 |
**kwargs | | Other arguments passed on to | {} |
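The following is a minimal sketch of how a negative binomial reconstruction term and a KL term weighted by kld_weight could combine into one objective. It is not the library's implementation. The mean/inverse-dispersion parameterization (mu, theta), the standard normal prior with a diagonal Gaussian posterior, the single shared theta, and all function names are assumptions made for illustration.

```python
import math


def nb_log_prob(x, mu, theta):
    """Log-probability of count x under a negative binomial distribution.

    Assumed parameterization: mean mu > 0 and inverse dispersion theta > 0.
    """
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def gaussian_kl(mean, log_var):
    """KL divergence from N(mean, diag(exp(log_var))) to N(0, I)."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mean, log_var)
    )


def weighted_objective(counts, mus, theta, mean, log_var, kld_weight=1e-5):
    """Negative log-likelihood plus the KL term scaled by kld_weight."""
    nll = -sum(nb_log_prob(x, mu, theta) for x, mu in zip(counts, mus))
    return nll + kld_weight * gaussian_kl(mean, log_var)
```

With the default kld_weight of 1e-05 the KL term contributes very little, so in this sketch the objective is dominated by the reconstruction term.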
call(self, inputs, training=None)
inherited
Full forward pass through model
compile(self, optimizer='adam', loss=None, **kwargs)
inherited
Configures the model for training.
Examples:
import tensorflow as tf

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=[tf.keras.metrics.BinaryAccuracy(),
                       tf.keras.metrics.FalseNegatives()])
Parameters:
Name | Type | Description | Default |
---|---|---|---|
optimizer | | String (name of optimizer) or optimizer instance. See | 'adam' |
loss | | Loss function. May be a string (name of loss function), or a | None |
metrics | | List of metrics to be evaluated by the model during training and testing. Each of these can be a string (name of a built-in function), function or a | required |
loss_weights | | Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the | required |
weighted_metrics | | List of metrics to be evaluated and weighted by | required |
run_eagerly | | Bool. Defaults to | required |
steps_per_execution | | Int. Defaults to 1. The number of batches to run during each | required |
**kwargs | | Arguments supported for backwards compatibility only. | {} |
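The weighted sum described for loss_weights can be illustrated with a small pure-Python sketch. This is not Keras code. The function name total_loss, the use of a dict of per-output loss values, and the choice to treat outputs missing from a weight dict as having weight 1.0 are assumptions for illustration only.

```python
def total_loss(losses, loss_weights=None):
    """Combine per-output loss values into one scalar.

    losses: dict mapping output name to its loss value.
    loss_weights: None, a dict keyed by output name, or a list in the
    same order as the outputs in `losses`.
    """
    if loss_weights is None:
        return sum(losses.values())
    if isinstance(loss_weights, dict):
        # Assumption: outputs without an explicit weight count fully.
        return sum(
            loss_weights.get(name, 1.0) * value
            for name, value in losses.items()
        )
    names = list(losses)
    if len(loss_weights) != len(names):
        raise ValueError("Expected one weight per model output.")
    return sum(w * losses[n] for w, n in zip(loss_weights, names))
```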
Exceptions:
Type | Description |
---|---|
ValueError | In case of invalid arguments for |
fit(self, x, y=None, **kwargs)
inherited
Trains the model for a fixed number of epochs (iterations on a dataset).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | | Input data. It could be: - A Numpy array (or array-like), or a list of arrays (in case the model has multiple inputs). - A TensorFlow tensor, or a list of tensors (in case the model has multiple inputs). - A dict mapping input names to the corresponding array/tensors, if the model has named inputs. - A | required |
y | | Target data. Like the input data | None |
batch_size | | Integer or | required |
epochs | | Integer. Number of epochs to train the model. An epoch is an iteration over the entire | required |
verbose | | 'auto', 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. 'auto' defaults to 1 for most cases, but 2 when used with | required |
callbacks | | List of | required |
validation_split | | Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the | required |
validation_data | | Data on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. Thus, note the fact that the validation loss of data provided using | required |
shuffle | | Boolean (whether to shuffle the training data before each epoch) or str (for 'batch'). This argument is ignored when | required |
class_weight | | Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class. | required |
sample_weight | | Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape | required |
initial_epoch | | Integer. Epoch at which to start training (useful for resuming a previous training run). | required |
steps_per_epoch | | Integer or | required |
validation_steps | | Only relevant if | required |
validation_batch_size | | Integer or | required |
validation_freq | | Only relevant if validation data is provided. Integer or | required |
max_queue_size | | Integer. Used for generator or | required |
workers | | Integer. Used for generator or | required |
use_multiprocessing | | Boolean. Used for generator or | required |
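The validation_split behavior described in the table (the held-out fraction comes from the last samples) can be sketched in plain Python. This is an illustration under assumptions, not the Keras implementation: the function name split_last_fraction is hypothetical, and rounding the split point down is an assumed detail.

```python
import math


def split_last_fraction(samples, validation_split):
    """Hold out the trailing fraction of `samples` for validation.

    Returns (training_part, validation_part). The order of the samples
    is preserved, so the validation part is taken from the end.
    """
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be between 0 and 1.")
    # Assumption: the split index is rounded down.
    split_at = int(math.floor(len(samples) * (1.0 - validation_split)))
    return samples[:split_at], samples[split_at:]
```

Because the held-out part is taken from the end, data that is sorted (for example by class or by batch) may yield an unrepresentative validation set unless it is shuffled beforehand.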
Unpacking behavior for iterator-like inputs:
A common pattern is to pass a tf.data.Dataset, generator, or
tf.keras.utils.Sequence to the x
argument of fit, which will in fact
yield not only features (x) but optionally targets (y) and sample weights.
Keras requires that the output of such iterator-likes be unambiguous. The
iterator should return a tuple of length 1, 2, or 3, where the optional
second and third elements will be used for y and sample_weight
respectively. Any other type provided will be wrapped in a length one
tuple, effectively treating everything as 'x'. When yielding dicts, they
should still adhere to the top-level tuple structure,
e.g. ({"x0": x0, "x1": x1}, y). Keras will not attempt to separate
features, targets, and weights from the keys of a single dict.
A notable unsupported data type is the namedtuple. The reason is that
it behaves like both an ordered datatype (tuple) and a mapping
datatype (dict). So given a namedtuple of the form:
namedtuple("example_tuple", ["y", "x"])
it is ambiguous whether to reverse the order of the elements when
interpreting the value. Even worse is a tuple of the form:
namedtuple("other_tuple", ["x", "y", "z"])
where it is unclear if the tuple was intended to be unpacked into x, y,
and sample_weight or passed through as a single element to x. As a
result the data processing code will simply raise a ValueError if it
encounters a namedtuple. (Along with instructions to remedy the issue.)
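The unpacking rules above can be summarized in a simplified pure-Python sketch. It is not the Keras data-handling code. The function name unpack_batch and the namedtuple check via the _fields attribute are assumptions chosen for illustration.

```python
from collections import namedtuple


def unpack_batch(data):
    """Split one yielded element into (x, y, sample_weight)."""
    # namedtuples are tuples that also carry field names, which makes
    # their intended ordering ambiguous, so they are rejected.
    if isinstance(data, tuple) and hasattr(data, "_fields"):
        raise ValueError("namedtuple inputs are ambiguous; use a plain tuple.")
    # Anything that is not a tuple is treated entirely as x.
    if not isinstance(data, tuple):
        return data, None, None
    if len(data) == 1:
        return data[0], None, None
    if len(data) == 2:
        return data[0], data[1], None
    if len(data) == 3:
        return data[0], data[1], data[2]
    raise ValueError("Expected a tuple of length 1, 2, or 3.")


# The ambiguous case from the text: field order is y, x.
Example = namedtuple("example_tuple", ["y", "x"])
```

A dict of features stays intact as x when wrapped in a tuple, so unpack_batch(({"x0": 1, "x1": 2}, 3)) treats the dict as features and 3 as the target.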
Returns:
Type | Description |
---|---|
| A |
Exceptions:
Type | Description |
---|---|
RuntimeError | |
ValueError | In case of mismatch between the provided input data and what the model expects or when the input data is empty. |
transform(self, x, conditions=None)
inherited
Map data (x) to latent space (z).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | | A numpy array with input data. | required |
conditions | | A numpy array with conditions. | None |
Returns:
Type | Description |
---|---|
| A numpy array with the coordinates of the input data in latent space. |