PIT is a mask-based DNAS tool that optimizes the most relevant hyper-parameters of 1D and 2D CNNs, balancing task performance with other cost metrics, referred to in general as regularizers. The list of supported regularization targets is available in the section Supported Regularizers.
For more technical details about the algorithm, please refer to our publication Lightweight Neural Architecture Search for Temporal Convolutional Networks at the Edge.
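At its core, a mask-based DNAS couples each searchable hyper-parameter with trainable mask values that gate part of a layer's computation, so that standard gradient-based training can also decide which parts of the layer to keep. A conceptual sketch of an output-channel mask follows (an illustration of the general idea, not PIT's actual internals):

```python
import torch
import torch.nn as nn

# conceptual illustration only, not PIT's actual implementation
conv = nn.Conv1d(8, 16, kernel_size=3, padding=1)
alpha = nn.Parameter(torch.ones(16))  # one trainable mask value per output channel
x = torch.randn(1, 8, 32)
y = conv(x) * alpha.view(1, -1, 1)    # channels whose mask goes to ~0 are effectively pruned
```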
To optimize your model with PIT, in most situations you only need three additional steps with respect to a normal PyTorch training loop:
- Import the `PIT` conversion module and use it to automatically convert your model into an optimizable format. In its basic usage, `PIT` requires as arguments:
  - the `model` to be optimized;
  - the `input_shape` of an input tensor (without the batch size);
  - the `regularizer` to be used (see Supported Regularizers for the available alternatives), which dictates the cost metric that will be optimized.

  ```python
  from plinio.methods import PIT
  pit_model = PIT(model, input_shape=input_shape, regularizer='size')
  ```
- Inside the training loop, compute the regularization loss and add it to the task loss, so that the two quantities are optimized together. N.B., we suggest controlling the relative balance between the two losses by multiplying the regularization loss by a scalar `strength` value: larger values steer the search toward cheaper architectures, possibly at some cost in accuracy.

  ```python
  strength = 1e-6  # depends on user-specific requirements
  for epoch in range(N_EPOCHS):
      for sample, target in data:
          output = pit_model(sample)
          task_loss = criterion(output, target)
          reg_loss = strength * pit_model.get_regularization_loss()
          loss = task_loss + reg_loss
          optimizer.zero_grad()
          loss.backward()
          optimizer.step()
  ```
- Finally, export the optimized model. After the export, we suggest running some additional epochs of fine-tuning on the `exported_model` (a minimal sketch follows this list).

  ```python
  exported_model = pit_model.arch_export()
  ```
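Fine-tuning is a standard PyTorch training loop run on the exported architecture. A minimal sketch, assuming the same `data` and `criterion` used during the search; the optimizer settings and `N_FT_EPOCHS` are placeholders to be tuned per task:

```python
import torch

# re-create the optimizer for the exported model's parameters
optimizer = torch.optim.Adam(exported_model.parameters(), lr=1e-4)  # lr is an assumption
N_FT_EPOCHS = 10  # placeholder value
for epoch in range(N_FT_EPOCHS):
    for sample, target in data:
        output = exported_model(sample)
        loss = criterion(output, target)  # task loss only: the architecture search is over
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```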
In general, when PIT is applied to a network, all supported layers are automatically marked as optimizable.
In the spirit of giving maximum flexibility to the user, PIT allows excluding layers from the optimization process.
In particular, PIT exposes two arguments that can be used for this purpose:
- `exclude_names`: an optional tuple of identifiers of the layers that we want to exclude; only the specified layers will not be optimized. E.g.,

  ```python
  import torch.nn as nn
  from plinio.methods import PIT

  class Net(nn.Module):
      def __init__(self):
          super().__init__()
          self.c0 = nn.Conv1d()
          self.lin0 = nn.Linear()
          self.lin1 = nn.Linear()
          self.lin2 = nn.Linear()

  net = Net()
  exclude_names = ('net.lin1', 'net.lin2')
  pit_net = PIT(net, exclude_names=exclude_names)
  ```

  In the example, the `Linear` layers `lin1` and `lin2` will not be optimized.
- `exclude_types`: an optional tuple of layer types that we want to exclude; all the layers of those types will not be optimized. E.g.,

  ```python
  class Net(nn.Module):
      def __init__(self):
          super().__init__()
          self.c0 = nn.Conv1d()
          self.lin0 = nn.Linear()
          self.lin1 = nn.Linear()
          self.lin2 = nn.Linear()

  net = Net()
  exclude_types = (nn.Conv1d,)  # note the trailing comma: the argument must be a tuple
  pit_net = PIT(net, exclude_types=exclude_types)
  ```

  In the example, all `nn.Conv1d` layers will not be optimized, i.e., the layer `c0` will be excluded from the optimization process.
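Since `exclude_names` and `exclude_types` are independent arguments of `PIT()`, they can presumably also be combined; a minimal sketch reusing the `Net` class from the examples above (the combination itself is our assumption, not shown in the examples):

```python
# exclude all Conv1d layers plus the specific layer 'net.lin2'
pit_net = PIT(net,
              exclude_types=(nn.Conv1d,),
              exclude_names=('net.lin2',))
```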
Conversely, it may happen that we want to optimize only specific layers of our network. In this case, the user can directly define and use the optimizable version of such layers in the net, i.e., the layers defined in `plinio.methods.pit.nn`. E.g.,
```python
import torch.nn as nn
from plinio.methods import PIT
from plinio.methods.pit.nn import PITConv1d

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.c0 = PITConv1d()
        self.lin0 = nn.Linear()
        self.lin1 = nn.Linear()
        self.lin2 = nn.Linear()

net = Net()
pit_net = PIT(net, autoconvert_layers=False)
```
In this example, only the layer `c0` will be optimized.
Please note that in this case we need to pass the `autoconvert_layers=False` argument to `PIT`, to specify that we do not want to automatically convert all supported layers.
### Supported Regularizers

Currently, the following regularization strategies are supported:

- Size: this strategy tries to reduce the total number of parameters of the target layers. It can be selected by specifying the `regularizer` argument of `PIT()` as the string `'size'`.
- MACs: this strategy tries to reduce the total number of operations (multiply-and-accumulate) of the target layers. It can be selected by specifying the `regularizer` argument of `PIT()` as the string `'macs'`.
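For instance, to drive the search with the operation count instead of the model size, it is enough to change the string passed at conversion time:

```python
# identical to the conversion shown earlier, but optimizing MACs
pit_model = PIT(model, input_shape=input_shape, regularizer='macs')
```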
### Supported Layers

Currently, the optimization of the following layers is supported:

| Layer | Hyper-Parameters |
|---|---|
| Conv1d | Output Channels, Kernel Size, Dilation |
| Conv2d | Output Channels |
| Linear | Output Features |
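As a closing illustration, in a hypothetical TCN-style network like the one below, PIT would search over the output channels, kernel sizes, and dilations of the two `Conv1d` layers (the architecture itself is made up for this example):

```python
import torch.nn as nn
import torch.nn.functional as F
from plinio.methods import PIT

class TCNBlock(nn.Module):
    """Hypothetical temporal convolutional block, for illustration only."""
    def __init__(self):
        super().__init__()
        self.conv0 = nn.Conv1d(2, 64, kernel_size=9, dilation=1, padding=4)
        self.conv1 = nn.Conv1d(64, 64, kernel_size=9, dilation=2, padding=8)
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.fc = nn.Linear(64, 5)

    def forward(self, x):
        x = F.relu(self.conv0(x))
        x = F.relu(self.conv1(x))
        return self.fc(self.pool(x).flatten(1))

# input_shape = (channels, sequence_length), without the batch dimension
pit_model = PIT(TCNBlock(), input_shape=(2, 256), regularizer='size')
```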