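Bayesian-Torch is a library of neural network layers and utilities that extends the core of PyTorch to enable Bayesian inference in deep learning models, providing principled uncertainty estimates for model predictions.
Overview
Bayesian-Torch is designed to be flexible: a deterministic deep neural network model can be extended to its Bayesian form simply by replacing the deterministic layers with Bayesian layers. It enables users to perform stochastic variational inference in deep neural networks.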
Bayesian layers:
- Variational layers with reparameterized Monte Carlo estimators [Blundell et al. 2015]:
  LinearReparameterization, Conv1dReparameterization, Conv2dReparameterization, Conv3dReparameterization, ConvTranspose1dReparameterization, ConvTranspose2dReparameterization, ConvTranspose3dReparameterization, LSTMReparameterization
- Variational layers with Flipout Monte Carlo estimators [Wen et al. 2018]:
  LinearFlipout, Conv1dFlipout, Conv2dFlipout, Conv3dFlipout, ConvTranspose1dFlipout, ConvTranspose2dFlipout, ConvTranspose3dFlipout, LSTMFlipout
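To make the reparameterized estimator concrete, here is a minimal standard-library sketch of how a single weight could be sampled. It assumes a Gaussian posterior whose standard deviation is obtained from an unconstrained parameter rho through a softplus; the helper names softplus and sample_weight are hypothetical and are not part of the Bayesian-Torch API.

```python
import math
import random


def softplus(rho: float) -> float:
    """Map an unconstrained parameter rho to a positive standard deviation."""
    return math.log1p(math.exp(rho))


def sample_weight(mu: float, rho: float, rng: random.Random) -> float:
    """Draw one weight as w = mu + sigma * eps, with eps ~ N(0, 1)."""
    sigma = softplus(rho)
    eps = rng.gauss(0.0, 1.0)
    return mu + sigma * eps


rng = random.Random(0)
mu, rho = 0.0, -3.0  # illustrative values for the posterior mean and rho
samples = [sample_weight(mu, rho, rng) for _ in range(10000)]
sample_mean = sum(samples) / len(samples)
sample_std = math.sqrt(sum((s - sample_mean) ** 2 for s in samples) / len(samples))
print(f"sigma = {softplus(rho):.4f}, empirical std = {sample_std:.4f}")
```

Because the randomness enters only through eps, the sampled weight stays differentiable with respect to mu and rho, which is what allows gradient-based training of the variational parameters.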
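Key features:
- dnn_to_bnn(): Seamlessly converts a model to be uncertainty-aware with a single line of code. This API converts a deterministic deep neural network (dnn) model of any architecture into a Bayesian deep neural network (bnn) model, simplifying model definition by replacing the convolutional, linear, and LSTM layers with their Bayesian counterparts. This lets the topology of existing large models be converted to Bayesian deep neural network models and extended to uncertainty-aware applications.
- MOPED: Specifies weight priors and variational posteriors in Bayesian neural networks with an empirical Bayes approach, making Bayesian inference scalable to large models [Krishnan et al. 2020]
- Quantization: Post-training quantization of Bayesian deep neural network models through a simple API, enabling INT8 inference [Lin et al. 2023]
- AvUC: Accuracy versus Uncertainty Calibration loss [Krishnan and Tickoo 2020]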
Installing Bayesian-Torch
Using pip
To install the core library:
pip install bayesian-torch
To install the latest development version from source:
git clone https://github.com/IntelLabs/bayesian-torch
cd bayesian-torch
pip install .
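Usage
There are two ways to build Bayesian deep neural networks with Bayesian-Torch:
(1) Convert an existing deterministic model with dnn_to_bnn(). For instance, building a Bayesian ResNet18 from the deterministic torchvision ResNet18 model takes only a few lines: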
import torch
import torchvision
from bayesian_torch.models.dnn_to_bnn import dnn_to_bnn, get_kl_loss
const_bnn_prior_parameters = {
"prior_mu": 0.0,
"prior_sigma": 1.0,
"posterior_mu_init": 0.0,
"posterior_rho_init": -3.0,
"type": "Reparameterization", # Flipout or Reparameterization
"moped_enable": False, # set to True to initialize mu/sigma from the pretrained dnn weights
"moped_delta": 0.5,
}
model = torchvision.models.resnet18()
dnn_to_bnn(model, const_bnn_prior_parameters)
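To use the MOPED method, which sets the prior and initializes the variational parameters from a pretrained deterministic model (this helps training convergence for larger models):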
const_bnn_prior_parameters = {
"prior_mu": 0.0,
"prior_sigma": 1.0,
"posterior_mu_init": 0.0,
"posterior_rho_init": -3.0,
"type": "Reparameterization", # Flipout or Reparameterization
"moped_enable": True, # set to True to initialize mu/sigma from the pretrained dnn weights
"moped_delta": 0.5,
}
model = torchvision.models.resnet18(pretrained=True)
dnn_to_bnn(model, const_bnn_prior_parameters)
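As a rough illustration of what MOPED initialization does for a single weight, the sketch below follows the idea in Krishnan et al. 2020: the posterior mean starts at the pretrained weight and the posterior standard deviation is a fraction delta of its magnitude. It assumes sigma is parameterized as softplus(rho); moped_init is a hypothetical helper, and the library's own routine may differ in detail.

```python
import math


def softplus(rho: float) -> float:
    """Map rho to a positive standard deviation."""
    return math.log1p(math.exp(rho))


def moped_init(pretrained_weight: float, delta: float):
    """Return (mu, rho) for one weight, assuming sigma = delta * |w|."""
    mu = pretrained_weight
    sigma = delta * abs(pretrained_weight)
    # Invert the softplus: rho = log(exp(sigma) - 1).
    # The tiny constant avoids log(0) when the pretrained weight is exactly 0.
    rho = math.log(math.expm1(sigma) + 1e-20)
    return mu, rho


mu, rho = moped_init(pretrained_weight=-0.8, delta=0.5)
print(f"mu = {mu}, sigma = {softplus(rho):.4f}")
```

A smaller delta keeps the initial posterior tighter around the pretrained weights, while a larger delta starts with more spread.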
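Training snippet:
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), args.learning_rate)
optimizer.zero_grad()  # clear gradients left over from the previous step
output = model(x_train)
kl = get_kl_loss(model)  # KL divergence summed over the Bayesian layers
ce_loss = criterion(output, y_train)
loss = ce_loss + kl / args.batch_size  # scale the KL term by the batch size
loss.backward()
optimizer.step()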
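The KL term in this loss measures how far each weight's variational posterior has moved from its prior. For Gaussian posteriors and priors it has a closed form, sketched below for a single weight using only the standard library. The function names kl_gaussian and minibatch_loss are hypothetical; this is an illustration of the objective under those assumptions, not the library's implementation.

```python
import math


def kl_gaussian(mu_q: float, sigma_q: float, mu_p: float, sigma_p: float) -> float:
    """KL(q || p) between two univariate Gaussians q = N(mu_q, sigma_q^2), p = N(mu_p, sigma_p^2)."""
    return (
        math.log(sigma_p / sigma_q)
        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
        - 0.5
    )


def minibatch_loss(ce_loss: float, kl_total: float, batch_size: int) -> float:
    """Combine the data-fit term and the scaled KL term, mirroring the snippet above."""
    return ce_loss + kl_total / batch_size


# One weight with posterior N(0.2, 0.05^2) against a standard normal prior.
kl_one_weight = kl_gaussian(mu_q=0.2, sigma_q=0.05, mu_p=0.0, sigma_p=1.0)
print(f"KL for one weight: {kl_one_weight:.4f}")
print(f"loss: {minibatch_loss(ce_loss=1.2, kl_total=kl_one_weight, batch_size=128):.4f}")
```

The KL is zero when posterior and prior coincide and grows as the posterior mean drifts from the prior mean or the posterior becomes much narrower than the prior.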
Testing snippet:
model.eval()
with torch.no_grad():
    output_mc = []
    for mc_run in range(args.num_monte_carlo):
        logits = model(x_test)  # each forward pass draws new weight samples
        probs = torch.nn.functional.softmax(logits, dim=-1)
        output_mc.append(probs)
    output = torch.stack(output_mc)  # shape: (num_monte_carlo, batch, classes)
    pred_mean = output.mean(dim=0)  # average over the Monte Carlo samples
    y_pred = torch.argmax(pred_mean, dim=-1)
    test_acc = (y_pred.data.cpu().numpy() == y_test.data.cpu().numpy()).mean()
Uncertainty quantification:
from bayesian_torch.utils.util import predictive_entropy, mutual_information
predictive_uncertainty = predictive_entropy(output.data.cpu().numpy())
model_uncertainty = mutual_information(output.data.cpu().numpy())
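To show what these two quantities measure, here is a small standard-library sketch for a single input, written from the usual definitions: predictive entropy is the entropy of the averaged prediction, and mutual information is that entropy minus the average entropy of the individual Monte Carlo predictions. The *_sketch function names are hypothetical and only illustrate the definitions; use the library functions in practice.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_prediction(mc_probs):
    """Average class probabilities over the Monte Carlo samples for one input."""
    n = len(mc_probs)
    return [sum(sample[c] for sample in mc_probs) / n for c in range(len(mc_probs[0]))]


def predictive_entropy_sketch(mc_probs):
    """Total uncertainty: entropy of the averaged prediction."""
    return entropy(mean_prediction(mc_probs))


def mutual_information_sketch(mc_probs):
    """Model uncertainty: total entropy minus the expected per-sample entropy."""
    expected_entropy = sum(entropy(sample) for sample in mc_probs) / len(mc_probs)
    return predictive_entropy_sketch(mc_probs) - expected_entropy


# Two Monte Carlo samples for one input with two classes.
agree = [[0.5, 0.5], [0.5, 0.5]]     # every sample is equally unsure
disagree = [[1.0, 0.0], [0.0, 1.0]]  # samples are confident but contradict each other
for name, mc in (("agree", agree), ("disagree", disagree)):
    print(name, predictive_entropy_sketch(mc), mutual_information_sketch(mc))
```

Both cases have the same predictive entropy, but only the second has non-zero mutual information, since the disagreement between weight samples is what signals model uncertainty.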
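(2) To build custom models, we provide example model implementations that use the Bayesian layers.
Example usage (training and evaluation of models)
We provide example usages and scripts to train and evaluate models. The instructions below cover the CIFAR10 example; similar scripts are available for ImageNet and MNIST.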
cd bayesian_torch
Training
To train a Bayesian ResNet on CIFAR10, run one of these commands:
Mean-field variational inference (reparameterized Monte Carlo estimator)
sh scripts/train_bayesian_cifar.sh
Mean-field variational inference (Flipout Monte Carlo estimator)
sh scripts/train_bayesian_flipout_cifar.sh
To train a deterministic ResNet on CIFAR10, run this command:
Vanilla
sh scripts/train_deterministic_cifar.sh
Evaluation
To evaluate a Bayesian ResNet on CIFAR10, run one of these commands:
Mean-field variational inference (reparameterized Monte Carlo estimator)
sh scripts/test_bayesian_cifar.sh
Mean-field variational inference (Flipout Monte Carlo estimator)
sh scripts/test_bayesian_flipout_cifar.sh
To evaluate a deterministic ResNet on CIFAR10, run this command:
Vanilla
sh scripts/test_deterministic_cifar.sh
Post-Training Quantization (PTQ)
To quantize a Bayesian ResNet (convert it to INT8) and evaluate it on CIFAR10, run this command:
sh scripts/quantize_bayesian_cifar.sh
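For readers unfamiliar with INT8 conversion, the sketch below shows generic symmetric per-tensor quantization of a list of floats using only the standard library. It illustrates the general idea of mapping floats to 8-bit integers with a scale factor; it is an assumption-laden toy and not the procedure implemented by the script above or described in Lin et al. 2023. The helper names quantize_int8 and dequantize are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.4, -1.0, 0.25, 0.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q_weights, f"scale = {scale:.6f}", f"max error = {max_error:.6f}")
```

The rounding error per value is bounded by half the scale, which is why tensors with a few very large entries tend to lose more precision under a single per-tensor scale.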
Citing
If you use this code, please cite it as follows:
@software{krishnan2022bayesiantorch,
author = {Ranganath Krishnan and Pi Esposito and Mahesh Subedar},
title = {Bayesian-Torch: Bayesian neural network layers for uncertainty estimation},
month = jan,
year = 2022,
doi = {10.5281/zenodo.5908307},
url = {https://doi.org/10.5281/zenodo.5908307},
howpublished = {\url{https://github.com/IntelLabs/bayesian-torch}}
}
Accuracy versus Uncertainty Calibration (AvUC) loss
@inproceedings{NEURIPS2020_d3d94468,
title = {Improving model calibration with accuracy versus uncertainty optimization},
author = {Krishnan, Ranganath and Tickoo, Omesh},
booktitle = {Advances in Neural Information Processing Systems},
volume = {33},
pages = {18237--18248},
year = {2020},
url = {https://proceedings.neurips.cc/paper/2020/file/d3d9446802a44259755d38e6d163e820-Paper.pdf}
}
Quantization framework for Bayesian deep learning
@inproceedings{lin2023quantization,
title={Quantization for Bayesian Deep Learning: Low-Precision Characterization and Robustness},
author={Lin, Jun-Liang and Krishnan, Ranganath and Ranipa, Keyur Ruganathbhai and Subedar, Mahesh and Sanghavi, Vrushabh and Arunachalam, Meena and Tickoo, Omesh and Iyer, Ravishankar and Kandemir, Mahmut Taylan},
booktitle={2023 IEEE International Symposium on Workload Characterization (IISWC)},
pages={180--192},
year={2023},
organization={IEEE}
}
Model Priors with Empirical Bayes using DNN (MOPED)
@inproceedings{krishnan2020specifying,
title={Specifying weight priors in bayesian deep neural networks with empirical bayes},
author={Krishnan, Ranganath and Subedar, Mahesh and Tickoo, Omesh},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={34},
number={04},
pages={4477--4484},
year={2020},
url = {https://ojs.aaai.org/index.php/AAAI/article/view/5875}
}
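This library and code are intended for researchers and developers. They make it possible to obtain principled uncertainty estimates from deep learning models and to develop uncertainty-aware AI models. Feedback, issues, and contributions are welcome.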