mobilenetv3_small_075.lamb_in1k - Image classification and training recipe for the MobileNet-V3 Small model

Project overview: mobilenetv3_small_075.lamb_in1k

Background

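mobilenetv3_small_075.lamb_in1k is a deep learning model for image classification. It belongs to the MobileNet-V3 family, a lightweight architecture designed for fast inference on mobile devices and other resource-constrained environments. The model was trained on the ImageNet-1k dataset with a specific optimization recipe intended to improve its classification accuracy.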

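Model details

Model type: image classification / feature backbone
Model statistics:
- Parameters (M): 2.0
- GMACs: 0.0 (as reported, rounded to one decimal place)
- Activations (M): 1.3
- Image size: 224 x 224 pixels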

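Training recipe

The model was trained with a LAMB optimizer recipe. The recipe notes also list EMA weight averaging, the RMSProp optimizer, and a step learning-rate schedule (exponential decay with staircase) with warmup. The recipe is similar to the one described in the paper "ResNet Strikes Back", but it trains 50% longer and does not use CutMix augmentation.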
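
To make two of the ingredients above more concrete, here is a minimal, dependency-free sketch of a staircase exponential-decay schedule with linear warmup and of an EMA weight update. It is an illustration only, not the code used to train this model: the function names and every hyperparameter value (`base_lr`, `warmup_epochs`, `warmup_lr`, `decay_epochs`, `decay_rate`, `decay`) are hypothetical, and restarting the decay count after warmup is a simplification chosen for readability.

```python
def staircase_lr(epoch, base_lr=0.1, warmup_epochs=5, warmup_lr=1e-4,
                 decay_epochs=2, decay_rate=0.97):
    """Learning rate for an epoch: linear warmup, then stepwise decay.

    All default values are hypothetical and chosen for illustration.
    """
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr towards base_lr.
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    # Staircase: the rate drops only once every `decay_epochs` epochs.
    steps = (epoch - warmup_epochs) // decay_epochs
    return base_lr * decay_rate ** steps


def ema_update(ema_weights, weights, decay=0.999):
    """Return the new exponential moving average of a flat list of weights."""
    return [decay * e + (1.0 - decay) * w
            for e, w in zip(ema_weights, weights)]


if __name__ == "__main__":
    for epoch in range(10):
        print(epoch, staircase_lr(epoch))
    print(ema_update([1.0, 2.0], [0.0, 4.0], decay=0.9))
```

The staircase form keeps the rate constant within each block of `decay_epochs` epochs, while the EMA copy of the weights changes slowly and smooths out step-to-step noise in the trained weights.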

Usage

Image classification

from urllib.request import urlopen
from PIL import Image
import timm
import torch

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('mobilenetv3_small_075.lamb_in1k', pretrained=True)
model = model.eval()

# Get the model-specific transforms (normalization, resizing)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze the single image into a batch of 1

_top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
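
The final line of the example turns raw logits into percentages with softmax and keeps the five largest. The following sketch reproduces that computation in plain Python on a short, made-up list of logits, so it can be run without downloading the model. The helper names `softmax` and `top_k` and the logit values are hypothetical; the actual model produces one logit per ImageNet-1k class.

```python
import math


def softmax(logits):
    """Convert logits into probabilities that sum to 1."""
    largest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return (percentages, indices) of the k most probable classes."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)[:k]
    indices = [index for index, _ in ranked]
    percentages = [prob * 100 for _, prob in ranked]
    return percentages, indices


# Hypothetical logits for a 6-class problem, for illustration only.
logits = [0.5, 2.0, -1.0, 3.0, 0.0, 1.0]
percentages, indices = top_k(softmax(logits), k=3)
print(indices, percentages)
```

The returned indices are positions in the logit vector; mapping them to human-readable label names requires a separate ImageNet-1k class list.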

Feature map extraction

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'mobilenetv3_small_075.lamb_in1k',
    pretrained=True,
    features_only=True,
)
model = model.eval()

data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze the single image into a batch of 1

for o in output:
    print(o.shape)
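
As a rough guide to the shapes printed by the loop above, the sketch below computes the spatial size of each feature map under the assumption that the backbone returns five maps at reduction factors 2, 4, 8, 16 and 32. Those factors are an assumption made for illustration, and the helper name is hypothetical; the actual values for a loaded model can be checked with `model.feature_info.reduction()` and `model.feature_info.channels()`.

```python
def expected_spatial_sizes(input_size=224, reductions=(2, 4, 8, 16, 32)):
    """Height and width of each feature map for a square input.

    `reductions` is an assumed list of downsampling factors, not a value
    read from the model.
    """
    # Ceiling division, matching stride-2 stages that pad their input.
    return [(-(-input_size // r), -(-input_size // r)) for r in reductions]


if __name__ == "__main__":
    for size in expected_spatial_sizes():
        print(size)
```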

Image embeddings

from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'mobilenetv3_small_075.lamb_in1k',
    pretrained=True,
    num_classes=0,  # remove the classifier
)
model = model.eval()

data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))

# Equivalent calls that do not require setting num_classes=0
output = model.forward_features(transforms(img).unsqueeze(0))
output = model.forward_head(output, pre_logits=True)
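
One common use of such embeddings is to compare two images by the cosine similarity of their vectors. The sketch below shows the computation in plain Python on made-up vectors. The helper name and the example values are hypothetical; with real model output, each embedding could first be converted with `output[0].tolist()`, and on tensors `torch.nn.functional.cosine_similarity` performs the same calculation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional embeddings, for illustration only.
embedding_a = [0.2, 0.1, 0.7, 0.4]
embedding_b = [0.1, 0.0, 0.8, 0.5]
print(cosine_similarity(embedding_a, embedding_b))
```

Values close to 1 indicate vectors pointing in nearly the same direction; whether that corresponds to visually similar images depends on the model and the data.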

Model performance

For a comparison of this model's performance across datasets and runtime environments, see the timm model results.

Citation

To cite this project, please refer to the following references:

@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}

@inproceedings{howard2019searching,
  title={Searching for mobilenetv3},
  author={Howard, Andrew and others},
  booktitle={Proceedings of the IEEE/CVF international conference on computer vision},
  pages={1314--1324},
  year={2019}
}