
deel.lipdp.sensitivity module

get_max_epochs(epsilon_max, model, epochs_max=1024, safe=True, atol=0.01)

Return the maximum number of epochs to reach a given epsilon_max value.

The computation of (epsilon, delta) is slow since it involves solving a minimization problem (mandatory to go from the RDP accountant to DP results); hence each call typically takes around 1 s. This function avoids calling get_eps_delta too many times by leveraging the fact that epsilon is a non-decreasing function of the number of epochs, which unlocks a dichotomy (binary) search.

Hence the number of calls is typically log2(epochs_max) + 1. The maximum number of epochs is set to 1024 by default to avoid overly long computation times, even in high privacy regimes.

Parameters:

    epsilon_max: The maximum value of epsilon we want to reach. (required)
    model: The model used to compute the values of epsilon. (required)
    epochs_max: The maximum number of epochs allowed to reach epsilon_max. If None, an exponential (doubling) search is first used to find an upper bound before the dichotomy search. (default: 1024)
    safe: If True, the dichotomy search returns the largest number of epochs such that epsilon <= epsilon_max. Otherwise, it returns the smallest number of epochs such that epsilon >= epsilon_max. (default: True)
    atol: The absolute tolerance on numerical inaccuracy before raising an error. (default: 0.01)

Returns:

    The maximum number of epochs to reach epsilon_max. It may be zero if epsilon_max is too small.
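
A minimal usage sketch is shown below. It assumes that model is a deel-lipdp (Keras-style) model whose dataset metadata and noise configuration are already set, so that get_eps_delta(model, epoch) can be evaluated; the names train_ds and the target budget of 8.0 are illustrative, not part of the library.

# Usage sketch: `model` and `train_ds` are assumed to exist; names are illustrative.
from deel.lipdp.sensitivity import get_max_epochs

epsilon_max = 8.0  # illustrative target privacy budget
num_epochs = get_max_epochs(epsilon_max, model, epochs_max=1024, safe=True)

# With epochs_max=1024 the search needs roughly log2(1024) + 1 = 11 calls to
# get_eps_delta, i.e. about 11 seconds instead of one call per candidate epoch.
if num_epochs == 0:
    raise ValueError("epsilon_max is too small: even one epoch exceeds the budget.")
model.fit(train_ds, epochs=num_epochs)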

Source code in deel/lipdp/sensitivity.py
def get_max_epochs(epsilon_max, model, epochs_max=1024, safe=True, atol=1e-2):
    """Return the maximum number of epochs to reach a given epsilon_max value.

    The computation of (epsilon, delta) is slow since it involves solving a minimization problem
    (mandatory to go from the RDP accountant to DP results). Hence each call typically takes around 1s.
    This function avoids calling get_eps_delta too many times by leveraging the fact that
    epsilon is a non-decreasing function of the number of epochs, which unlocks a dichotomy search.

    Hence the number of calls is typically log2(epochs_max) + 1.
    The maximum number of epochs is set to 1024 by default to avoid overly long computation times, even in high
    privacy regimes.

    Args:
        epsilon_max: The maximum value of epsilon we want to reach.
        model: The model used to compute the values of epsilon.
        epochs_max: The maximum number of epochs allowed to reach epsilon_max. Defaults to 1024.
                    If None, an exponential (doubling) search is first used to find an upper bound.
        safe: If True, the dichotomy search returns the largest number of epochs such that epsilon <= epsilon_max.
              Otherwise, it returns the smallest number of epochs such that epsilon >= epsilon_max.
        atol: The absolute tolerance on numerical inaccuracy before raising an error. Defaults to 1e-2.

    Returns:
        The maximum number of epochs to reach epsilon_max. It may be zero if epsilon_max is too small.
    """
    steps_per_epoch = model.dataset_metadata.nb_steps_per_epochs

    def fun(epoch):
        if epoch == 0:
            epsilon = 0
        else:
            epsilon, _ = get_eps_delta(model, epoch)
        return epsilon

    # dichotomy search on the number of epochs.
    if epochs_max is None:
        epochs_max = 512
        epsilon = 0
        while epsilon < epsilon_max:
            epochs_max *= 2
            epsilon = fun(epochs_max)
            print(f"epochs_max = {epochs_max} at epsilon = {epsilon}")

    epochs_min = 0

    while epochs_max - epochs_min > 1:
        epoch = (epochs_max + epochs_min) // 2
        epsilon = fun(epoch)
        if epsilon < epsilon_max:
            epochs_min = epoch
        else:
            epochs_max = epoch
        print(
            f"epoch bounds = {epochs_min, epochs_max} and epsilon = {epsilon} at epoch {epoch}"
        )

    if safe:
        last_epsilon = fun(epochs_min)
        error = last_epsilon - epsilon_max
        if error <= 0:
            return epochs_min
        elif error < atol:
            # This branch should never be taken if fun is a non-decreasing function of the number of epochs.
            # fun is mathematically non-decreasing, but numerical inaccuracy can lead to this case.
            print(f"Numerical inaccuracy with error {error:.7f} in the dichotomy search: using a conservative value.")
            return epochs_min - 1
        else:
            assert False, f"Numerical inaccuracy with error {error:.7f}>{atol:.3f} in the dichotomy search."

    return epochs_max

gradient_norm_check(upper_bounds, model, examples)

Verifies that the values of per-sample gradients on a layer never exceed the bound determined by the theoretical analysis.

Args:

    upper_bounds: maximum gradient bounds for each layer (a dictionary of 'layer name': 'bound' pairs).
    model: The model containing the layers we are interested in. Each layer must have at most one trainable variable.
    examples: a batch of examples to test on.

Returns:

    Boolean value. True means the upper bound has been validated.
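
A usage sketch follows, assuming model is a deel-lipdp model whose layers each have at most one trainable variable and train_ds yields (inputs, labels) batches; the variable names and bound values in upper_bounds are hypothetical and must match the names of the model's trainable variables.

# Usage sketch: variable names and bound values below are hypothetical placeholders.
from deel.lipdp.sensitivity import gradient_norm_check

examples, _ = next(iter(train_ds))  # one batch of inputs (labels unused)
upper_bounds = {
    "dp_dense/kernel:0": 1.0,    # hypothetical 'variable name': 'bound' pairs
    "dp_dense_1/kernel:0": 1.0,
}
gradient_norm_check(upper_bounds, model, examples)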

Source code in deel/lipdp/sensitivity.py
def gradient_norm_check(upper_bounds, model, examples):
    """Verifies that the values of per-sample gradients on a layer never exceede a value
    determined by the theoretical work.

    Args :
        upper_bounds: maximum gradient bounds for each layer (dictionnary of 'layers name ': 'bounds' pairs).
        model: The model containing the layers we are interested in. Layers must only have one trainable variable.
        examples: a batch of examples to test on.  
    Returns :
        Boolean value. True corresponds to upper bound has been validated.
    """
    activations = examples
    var_seen = set()
    for layer in model.layers:
        post_activations = layer(activations, training=True)
        assert len(layer.trainable_variables) < 2
        if len(layer.trainable_variables) == 1:
            train_var = layer.trainable_variables[0]
            var_name = train_var.name
            var_seen.add(var_name)
            bound = upper_bounds[var_name]
            check_layer_gradient_norm(bound, layer, activations)
        activations = post_activations
    for var_name in upper_bounds:
        assert var_name in var_seen