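Collection of papers, datasets, code and other resources for object detection and tracking using deep learning
Research Data
I use DavidRM Journal to manage my research data because of its excellent hierarchical organization, cross-linking and tagging capabilities.
I provide a Journal entry export file that contains a tagged and categorized collection of papers, articles, tutorials, code and notes about computer vision and deep learning that I have collected over the last few years.
This is what the topic cloud looks like:
It requires Journal 8 and can be imported with the following steps:
- Import my user preferences using File -> Import -> Import User Preferences
- Import the research data using File -> Import -> Sync from The Journal Export File
Note that my user preferences must be imported before the research data for the tagged topics to work correctly.
(Optional) My global options file is also provided for those interested in a dark theme, and can be imported using File -> Import -> Import Global Options
Updated: 2023-11-22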
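Papers
Static Detection
Region Proposal
RCNN
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [tpami17] [pdf] [notes]
- R-FCN: Object Detection via Region-based Fully Convolutional Networks [nips16] [Microsoft Research] [pdf] [notes]
- Mask R-CNN [iccv17] [Facebook AI Research] [pdf] [notes] [arxiv] [code (keras)] [code (tensorflow)]
- SNIPER: Efficient Multi-Scale Training [ax1812/nips18] [pdf] [notes] [code]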
YOLO
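- You Only Look Once: Unified, Real-Time Object Detection [ax1605] [pdf] [notes]
- YOLO9000: Better, Faster, Stronger [ax1612] [pdf] [notes]
- YOLOv3: An Incremental Improvement [ax1804] [pdf] [notes]
- YOLOv4: Optimal Speed and Accuracy of Object Detection [ax2004] [pdf] [notes] [code]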
SSD
RetinaNet
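Anchor Free
- FoveaBox: Beyond Anchor-based Object Detector [ax1904] [pdf] [notes] [code]
- CornerNet: Detecting Objects as Paired Keypoints [ax1903/ijcv19] [pdf] [notes] [code]
- FCOS: Fully Convolutional One-Stage Object Detection [ax1908/iccv19] [pdf] [notes] [code] [code/FCOS_PLUS] [code/VoVNet] [code/HRNet] [code/NAS]
- Feature Selective Anchor-Free Module for Single-Shot Object Detection [ax1903/cvpr19] [pdf] [notes] [code]
- Bottom-up Object Detection by Grouping Extreme and Center Points [ax1901] [pdf] [notes] [code]
- Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection [ax1912/cvpr20] [pdf] [notes] [code]
- End-to-End Object Detection with Transformers [ax200528] [pdf] [notes] [code]
- Objects as Points [ax1904] [pdf] [notes] [code]
- RepPoints: Point Set Representation for Object Detection [iccv19] [pdf] [notes] [code]
Misc
- OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks [ax1402/iclr14] [pdf] [notes]
- LSDA: Large Scale Detection through Adaptation [ax1411/nips14] [pdf] [notes]
- Acquisition of Localization Confidence for Accurate Object Detection [ax1807/eccv18] [pdf] [notes] [code]
- EfficientDet: Scalable and Efficient Object Detection [cvpr20] [pdf]
- Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression [ax1902/cvpr19] [pdf] [notes] [code] [project]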
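As a rough illustration of the metric named in the last entry above, the sketch below computes Generalized IoU for two axis-aligned boxes in plain Python. The function names `giou` and `giou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for this example; it is not taken from the linked code.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes (positive area).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width and height are clamped at zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box C that encloses both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    area_c = enclose_w * enclose_h

    # GIoU = IoU - |C \ (A u B)| / |C|, which lies in (-1, 1].
    return iou - (area_c - union) / area_c


def giou_loss(box_a, box_b):
    """Loss form used for regression in this sketch: 1 - GIoU, in [0, 2)."""
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, the value keeps changing when the boxes do not overlap at all, which is what makes it usable as a regression loss for disjoint predictions.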
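Video Detection
Tubelet
- Object Detection from Video Tubelets with Convolutional Neural Networks [cvpr16] [pdf] [notes]
- Object Detection in Videos with Tubelet Proposal Networks [ax1704/cvpr17] [pdf] [notes]
FGFA
- Deep Feature Flow for Video Recognition [cvpr17] [Microsoft Research] [pdf] [arxiv] [code]
- Flow-Guided Feature Aggregation for Video Object Detection [ax1708/iccv17] [pdf] [notes]
- Towards High Performance Video Object Detection [ax1711] [Microsoft] [pdf] [notes]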
RNN
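Multi Object Tracking
Joint-Detection
Identity Embedding
- MOTS: Multi-Object Tracking and Segmentation [cvpr19] [pdf] [notes] [code] [project/data]
Association
Deep Learning
- Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism [ax1708/iccv17] [pdf] [arxiv] [notes]
- Online Multi-Object Tracking with Dual Matching Attention Networks [ax1902/eccv18] [pdf] [arxiv] [notes] [code]
RNN
Unsupervised Learning
Reinforcement Learning
- Learning to Track: Online Multi-Object Tracking by Decision Making [iccv15] [Stanford] [pdf] [notes] [code (matlab)] [project]
Network Flow
- Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor [iccv15] [NEC Labs] [pdf] [author] [notes]
- Deep Network Flow for Multi-Object Tracking [cvpr17] [NEC Labs] [pdf] [supplementary] [notes]
Graph Optimization
- A Multi-cut Formulation for Joint Segmentation and Tracking of Multiple Objects [ax1607] [highest MT on MOT2015] [University of Freiburg, Germany] [pdf] [arxiv] [author] [notes]
Baseline
Metrics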
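Single Object Tracking
Reinforcement Learning
- Deep Reinforcement Learning for Visual Object Tracking in Videos [ax1704] [USC-Santa Barbara, Samsung Research] [pdf] [arxiv] [author] [notes]
- Visual Tracking by Reinforced Decision Making [ax1702] [Seoul National University, Chung-Ang University] [pdf] [arxiv] [author] [notes]
- Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning [cvpr17] [Seoul National University] [pdf] [supplementary] [project] [notes] [code]
- End-to-end Active Object Tracking via Reinforcement Learning [ax1705] [Peking University, Tencent AI Lab] [pdf] [arxiv]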
Siamese
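- Fully-Convolutional Siamese Networks for Object Tracking [eccv16] [pdf] [project] [notes]
- High Performance Visual Tracking with Siamese Region Proposal Network [cvpr18] [pdf] [author] [notes]
- Siam R-CNN: Visual Tracking by Re-Detection [cvpr20] [pdf] [notes] [project] [code]
Correlation
- Accurate Tracking by Overlap Maximization (ATOM) [cvpr19] [pdf] [notes] [code]
- Learning Discriminative Model Prediction for Tracking (DiMP) [iccv19] [pdf] [notes] [code]
- D3S - A Discriminative Single Shot Segmentation Tracker [cvpr20] [pdf] [notes] [code]
Misc
Deep Learning
Synthetic Gradients
Efficient
Unsupervised Learning
Interpolation
- Video Frame Interpolation via Adaptive Convolution [cvpr17 / iccv17] [pdf (cvpr17)] [pdf (iccv17)] [ppt]
Autoencoder
Variational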
Datasets
Multi Object Tracking
- IDOT
- UA-DETRAC Benchmark Suite
- GRAM Road-Traffic Monitoring
- Ko-PER Intersection Dataset
- TRANCOS
- Urban Tracker
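- DARPA VIVID / PETS 2005 [non-stationary camera]
- KIT-AKS [no ground truth]
- CBCL StreetScenes Challenge Framework [no top-down viewpoint]
- MOT 2015 [mostly street-level viewpoint]
- MOT 2016 [mostly street-level viewpoint]
- MOT 2017 [mostly street-level viewpoint]
- MOT 2020 [mostly top-down viewpoint]
- MOTS: Multi-Object Tracking and Segmentation [MOT and KITTI]
- CVPR 2019 [mostly street-level viewpoint]
- PETS 2009 [no vehicles]
- PETS 2017 [low density] [mostly pedestrians]
- DukeMTMC [multi-camera] [static background] [pedestrians] [above-street-level viewpoint] [website not working]
- KITTI Tracking Dataset [no top-down viewpoint] [non-stationary camera]
- The WILDTRACK Seven-Camera HD Dataset [pedestrian detection and tracking]
- 3D Traffic Scene Understanding from Movable Platforms [intersection traffic] [stereo setup] [moving camera]
- LOST: Longterm Observation of Scenes with Tracks [top-down and street-level viewpoint] [no ground truth]
- JTA [top-down and street-level viewpoint] [synthetic/GTA 5] [pedestrians] [3D annotations]
- PathTrack: Fast Trajectory Annotation with Path Supervision [top-down and street-level viewpoint] [iccv17] [pedestrians]
- CityFlow [pole-mounted] [intersections] [vehicles] [re-id] [cvpr19]
- JackRabbot Dataset [RGBD] [head-on] [indoor/outdoor] [Stanford]
- TAO: A Large-Scale Benchmark for Tracking Any Object [eccv20] [code]
- Edinburgh office monitoring video dataset [indoor] [long-term] [mostly static people]
- Waymo Open Dataset [outdoor] [vehicles]
UAV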
- Stanford Drone Dataset
- UAVDT - The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking [uav] [intersections/highways] [vehicles] [eccv18]
- VisDrone
Synthetic
- MNIST-MOT / MNIST-Sprites [script generated] [cvpr19]
- TUB Multi-Object and Multi-Camera Tracking Dataset [avss16]
- Virtual KITTI [arxiv] [cvpr16] [link seems broken]
Microscopy / Cell Tracking
- Cell Tracking Challenge [nature methods/2017]
- CTMC: Cell Tracking with Mitosis Detection Dataset Challenge [cvprw20] [MOT]
Single Object Tracking
- TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild [eccv18]
- LaSOT: Large-scale Single Object Tracking [cvpr19]
- Need for speed: A benchmark for higher frame rate object tracking [iccv17]
- Long-term Tracking in the Wild A Benchmark [eccv18]
- UAV123: A benchmark and simulator for UAV tracking [eccv16] [project]
- Sim4CV A Photo-Realistic Simulator for Computer Vision Applications [ijcv18]
- CDTB: A Color and Depth Visual Object Tracking and Benchmark [iccv19] [RGBD]
- Temple Color 128 - Color Tracking Benchmark [tip15]
Video Detection
Video Understanding / Activity Recognition
- YouTube-8M
- AVA: A Video Dataset of Atomic Visual Action
- VIRAT Video Dataset
- Kinetics Action Recognition Dataset
Static Detection
- PASCAL Visual Object Classes
- A Large-Scale Dataset for Vehicle Re-Identification in the Wild [cvpr19]
- Object Detection-based annotations for some frames of the VIRAT dataset
- MIO-TCD: A new benchmark dataset for vehicle classification and localization [tip18]
- Tiny ImageNet
Animals
- Wildlife Image and Localization Dataset (species and bounding box labels) [wacv18]
- Stanford Dogs Dataset [cvpr11]
- Oxford-IIIT Pet Dataset [cvpr12]
- Caltech-UCSD Birds 200 [rough segmentation] [attributes]
- Gold Standard Snapshot Serengeti Bounding Box Coordinates
Boundary Detection
Static Segmentation
- COCO - Common Objects in Context
- Open Images
- ADE20K [cvpr17]
- SYNTHIA [cvpr16]
- UC Berkeley Computer Vision Group - Contour Detection and Image Segmentation
Video Segmentation
- DAVIS: Densely Annotated VIdeo Segmentation
- Mapillary Vistas Dataset [street scenes] [semi-free]
- BDD100K [street scenes] [autonomous driving]
- ApolloScape [street scenes] [autonomous driving]
- Cityscapes [street scenes] [instance-level]
- YouTube-VOS [iccv19]
Classification
- ImageNet Large Scale Visual Recognition Competition 2012
- Animals with Attributes 2
- CompCars Dataset
- ObjectNet [only test set]
Optical Flow
Motion Prediction
Code
General Vision
- Gluon CV Toolkit [mxnet] [pytorch]
- OpenMMLab Computer Vision Foundation [pytorch]
Multi Object Tracking
Framework
- OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework [pytorch]
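General
- Globally-Optimal Greedy Algorithms for Tracking a Variable Number of Objects [cvpr11] [matlab] [author]
- Continuous Energy Minimization for Multitarget Tracking [cvpr11 / iccv11 / tpami 2014] [matlab]
- Discrete-Continuous Energy Minimization for Multi-Target Tracking [cvpr12] [matlab] [project]
- The way they move: Tracking multiple targets with similar appearance [iccv13] [matlab]
- 3D Traffic Scene Understanding from Movable Platforms [2d_tracking] [pami14/kit13/iccv13/nips11] [c++/matlab]
- Multiple target tracking based on undirected hierarchical relation hypergraph [cvpr14] [C++] [author]
- Robust online multi-object tracking based on tracklet confidence and online discriminative appearance learning [cvpr14] [matlab] (project)
- Learning to Track: Online Multi-Object Tracking by Decision Making [iccv15] [matlab]
- Joint Tracking and Segmentation of Multiple Targets [cvpr15] [matlab]
- Multiple Hypothesis Tracking Revisited [iccv15] [highest MT among open-source trackers on MOT2015] [matlab]
- Combined Image- and World-Space Tracking in Traffic Scenes [icra 2017] [c++]
- Online Multi-Target Tracking with Recurrent Neural Networks [aaai17] [lua/torch7]
- Real-Time Multiple Object Tracking - A Study on the Importance of Speed [ax1710/masters thesis] [c++]
- Beyond Pixels: Leveraging Geometry and Shape Cues for Online Multi-Object Tracking [icra18] [matlab]
- Online Multi-Object Tracking with Dual Matching Attention Network [eccv18] [matlab/tensorflow]
- TrackR-CNN - Multi-Object Tracking and Segmentation [cvpr19] [tensorflow] [project]
- Eliminating Exposure Bias and Metric Mismatch in Multiple Object Tracking [cvpr19] [tensorflow]
- Robust Multi-Modality Multi-Object Tracking [iccv19] [pytorch]
- Towards Real-Time Multi-Object Tracking / Joint Detection and Embedding [ax1909] [pytorch] [CMU]
- Deep Affinity Network for Multiple Object Tracking [tpami19] [pytorch]
- Tracking without bells and whistles [iccv19] [pytorch]
- Lifted Disjoint Paths with Application in Multiple Object Tracking [icml20] [matlab] [mot15#1,mot16 #3,mot17#2]
- Learning a Neural Solver for Multiple Object Tracking [cvpr20] [pytorch] [mot15#2]
- Tracking Objects as Points [ax2004] [pytorch]
- Quasi-Dense Similarity Learning for Multiple Object Tracking [ax2006] [pytorch]
- DEFT: Detection Embeddings for Tracking [ax2102] [pytorch]
- How To Train Your Deep Multi-Object Tracker [ax1906/cvpr20] [pytorch] [traktor/gitlab]
- Track to Detect and Segment: An Online Multi-Object Tracker [cvpr21] [pytorch] [project]
- MOTR: End-to-End Multiple-Object Tracking with Transformer [ax2202] [pytorch]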
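Baseline
- Simple Online and Realtime Tracking [icip 2016] [python]
- Deep SORT: Simple Online Realtime Tracking with a Deep Association Metric [icip17] [python]
- High-Speed Tracking-by-Detection Without Using Image Information [avss17] [python]
- A Simple Baseline for One-Shot Multi-Object Tracking [ax2004] [pytorch] [winner on mot15,16,17,20]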
Siamese
- SiamMOT: Siamese Multi-Object Tracking [ax2105] [pytorch]
Unsupervised
- Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers [cvpr19] [python/c++/pytorch]
Re-ID
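- Torchreid: Deep learning person re-identification in PyTorch [ax1910] [pytorch]
- SMOT: Single-Shot Multi Object Tracking [ax2010] [pytorch] [gluon-cv]
- FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking [ax2004] [pytorch] [microsoft] [BDD100K] [face tracking]
- Rethinking the competition between detection and ReID in Multi-Object Tracking [ax2010] [pytorch]
Framework
- PyTorch open-source toolbox for unsupervised or domain adaptive object re-ID [pytorch]
Graph Neural Network
- Joint Object Detection and Multi-Object Tracking with Graph Neural Networks [ax2006/icra21] [pytorch]
Microscopy / Cell Tracking
- Baxter Algorithms / Viterbi Tracking [tmi14] [matlab]
- Deepcell: Accurate cell tracking and lineage construction in live-cell imaging experiments with deep learning [biorxiv1910] [tensorflow]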
3D
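- 3D Multi-Object Tracking: A Baseline and New Evaluation Metrics [iros20/eccvw20] [pytorch]
- GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with Multi-Feature Learning [iros20/eccvw20] [pytorch]
Evaluation
- HOTA: A Higher Order Metric for Evaluating Multi-Object Tracking [cvpr20] [python]
Single Object Tracking
- A collection of common tracking algorithms (2003-2012) [c++/matlab]
- SenseTime Research platform for single object tracking, implementing algorithms like SiamRPN and SiamMask [pytorch]
- In Defense of Color-based Model-free Tracking [cvpr15] [c++]
- Hierarchical Convolutional Features for Visual Tracking [iccv15] [matlab]
- Visual Tracking with Fully Convolutional Networks [iccv15] [matlab]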
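- DeepTracking: Seeing Beyond Seeing Using Recurrent Neural Networks [aaai16] [torch 7]
- Learning Multi-Domain Convolutional Neural Networks for Visual Tracking [cvpr16] [vot2015 winner] [matlab/matconvnet] [pytorch]
- Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking [eccv 2016] [matlab]
- Fully-Convolutional Siamese Networks for Object Tracking [eccvw 2016] [matlab/matconvnet] [project] [pytorch] [pytorch (only training)]
- DCFNet: Discriminant Correlation Filters Network for Visual Tracking [ax1704] [matlab/matconvnet] [pytorch]
- End-to-end representation learning for Correlation Filter based tracking [cvpr17] [matlab/matconvnet] [tensorflow/inference only] [project]
- Dual Deep Network for Visual Tracking [tip1704] [caffe]
- SiameseX: A simplified PyTorch implementation of Siamese networks for tracking: SiamFC, SiamRPN, SiamRPN++, SiamVGG, SiamDW, SiamRPN-VGG [pytorch]
- RATM: Recurrent Attentive Tracking Model [cvprw17] [python]
- ROLO: Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking [iscas 2017] [tensorflow]
- ECO: Efficient Convolution Operators for Tracking [cvpr17] [matlab] [python/cuda] [pytorch]
- Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning [cvpr17] [tensorflow]
- Detect to Track and Track to Detect [iccv17] [matlab]
- Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers [eccv18] [pytorch]
- Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking [cvpr18] [matlab]
- High Performance Visual Tracking with Siamese Region Proposal Network [cvpr18] [pytorch/195] [pytorch/313] [pytorch/no training/104] [pytorch/177]
- Distractor-aware Siamese Networks for Visual Object Tracking [eccv18] [vot18 winner] [pytorch]
- VITAL: VIsual Tracking via Adversarial Learning [cvpr18] [matlab] [pytorch] [project]
- Fast Online Object Tracking and Segmentation: A Unifying Approach (SiamMask) [cvpr19] [pytorch] [project]
- PyTracking: A general python framework for training and running visual object trackers, based on PyTorch [ECO/ATOM/DiMP/PrDiMP] [cvpr17/cvpr19/iccv19/cvpr20] [pytorch]
- Unsupervised Deep Tracking [cvpr19] [matlab/matconvnet] [pytorch]
- Deeper and Wider Siamese Networks for Real-Time Visual Tracking [cvpr19] [pytorch]
- GradNet: Gradient-Guided Network for Visual Object Tracking [iccv19] [tensorflow]
- 'Skimming-Perusal' Tracking: A Framework for Real-Time and Robust Long-term Tracking [iccv19] [tensorflow]
- Learning Aberrance Repressed Correlation Filters for Real-Time UAV Tracking [iccv19] [matlab]
- Learning the Model Update for Siamese Trackers [iccv19] [pytorch]
- SPM-Tracker: Series-Parallel Matching for Real-Time Visual Object Tracking [cvpr19] [pytorch] [inference only]
- Joint Group Feature Selection and Discriminative Filter Learning for Robust Visual Object Tracking [iccv19] [matlab]
- Siam R-CNN: Visual Tracking by Re-Detection [cvpr20] [tensorflow]
- D3S - Discriminative Single Shot Segmentation Tracker [cvpr20] [pytorch/pytracking]
- Discriminative and Robust Online Learning for Siamese Visual Tracking [aaai20] [pytorch/pysot]
- Siamese Box Adaptive Network for Visual Tracking [cvpr20] [pytorch/pysot]
- Ocean: Object-aware Anchor-free Tracking [ax2010] [pytorch]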
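GUI Application / Large Scale Tracking / Animals
- BioTracker: An Open-Source Computer Vision Framework for Visual Animal Tracking [opencv/c++]
- Tracktor: Image-based automated tracking of animal movement and behaviour [opencv/c++]
- MARGO (Massively Automated Real-time GUI for Object-tracking), a platform for high-throughput ethology [matlab]
- idtracker.ai: Tracking all individuals in large collectives of unmarked animals [tensorflow] [project]
Video Detection
- Flow-Guided Feature Aggregation for Video Object Detection [nips16 / iccv17] [mxnet]
- T-CNN: Tubelets with Convolutional Neural Networks [cvpr16] [python]
- TPN: Tubelet Proposal Network [cvpr17] [python]
- Deep Feature Flow for Video Recognition [cvpr17] [mxnet]
- Mobile Video Object Detection with Temporally-Aware Feature Maps [cvpr18] [Google] [tensorflow]
Action Detection
Framework
- OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark [pytorch]
Static Detection and Matching
Framework
- Tensorflow object detection API [tensorflow]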
- Detectron2 [pytorch]
- Detectron [pytorch]
- OpenMMLab Detection Toolbox with PyTorch [pytorch]
- SimpleDet [mxnet]
Region Proposal
- MCG: Multiscale Combinatorial Grouping - Object Proposals and Segmentation (project) [tpami16/cvpr14] [python]
- COB: Convolutional Oriented Boundaries (project) [tpami18/eccv16] [matlab/caffe]
FPN
- Feature Pyramid Networks for Object Detection [caffe/python]
RCNN
- RFCN (author) [caffe/matlab]
- RFCN-tensorflow [tensorflow]
- PVANet: Lightweight Deep Neural Networks for Real-time Object Detection [intel] [emdnn16(nips16)]
- Mask R-CNN [tensorflow] [keras]
- Light-head R-CNN [cvpr18] [tensorflow]
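- Evolving Boxes for Fast Vehicle Detection [icme18] [caffe/python]
- Cascade R-CNN (cvpr18) [detectron] [caffe]
- A MultiPath Network for Object Detection [torch] [bmvc16] [facebook]
- SNIPER: Efficient Multi-Scale Training / An Analysis of Scale Invariance in Object Detection - SNIP [nips18/cvpr18] [mxnet]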
SSD
- SSD-Tensorflow [tensorflow]
- SSD-Tensorflow (tf.estimator) [tensorflow]
- SSD-Tensorflow (tf.slim) [tensorflow]
- SSD-Keras [keras]
- SSD-Pytorch [pytorch]
- Enhanced SSD with Feature Fusion and Visual Reasoning [nca18] [tensorflow]
- RefineDet - Single-Shot Refinement Neural Network for Object Detection [cvpr18] [caffe]
RetinaNet
- 9.277.41 [pytorch]
- 31.857.212 [pytorch]
- 25.274.84 [pytorch] [nvidia]
- 22.869.302 [pytorch]
YOLO
- Darknet: Convolutional Neural Networks [c/python]
- YOLO9000: Better, Faster, Stronger - Real-Time Object Detection. 9000 classes! [c/python]
- Darkflow [tensorflow]
- Pytorch Yolov2 [pytorch]
- Yolo-v3 and Yolo-v2 for Windows and Linux [c/python]
- YOLOv3 in PyTorch [pytorch]
- pytorch-yolo-v3 [pytorch] [no training] [tutorial]
- YOLOv3_TensorFlow [tensorflow]
- tensorflow-yolo-v3 [tensorflow slim]
- tensorflow-yolov3 [tensorflow slim]
- keras-yolov3 [keras]
- YOLOv4 [darknet - c/python] [tensorflow] [pytorch/711] [pytorch/ONNX/TensorRT/1.9k] [pytorch 3D]
- YOLOv5 [pytorch]
- YOLOX [pytorch] [MegEngine] [ax2107]
Anchor Free
- FoveaBox: Beyond Anchor-based Object Detector [ax1904] [pytorch/mmdetection]
- CornerNet: Detecting Objects as Paired Keypoints [ax1903/eccv18] [pytorch]
- FCOS: Fully Convolutional One-Stage Object Detection [iccv19] [pytorch] [VoVNet] [HRNet] [NAS] [FCOS_PLUS]
- Feature Selective Anchor-Free Module for Single-Shot Object Detection [cvpr19] [pytorch]
- CenterNet: Objects as Points [ax1904] [pytorch]
- Bottom-up Object Detection by Grouping Extreme and Center Points [cvpr19] [pytorch]
- RepPoints: Point Set Representation for Object Detection [iccv19] [pytorch] [microsoft]
- DE⫶TR: End-to-End Object Detection with Transformers [ax200528] [pytorch] [facebook]
- Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection [cvpr20] [pytorch]
Misc
- Relation Networks for Object Detection [cvpr18] [mxnet]
- DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling [iccv17(poster)] [theano]
- Multi-scale Location-aware Kernel Representation for Object Detection [cvpr18] [caffe/python]
Matching
Boundary Detection
- Holistically-Nested Edge Detection (HED) (iccv15) [caffe]
- Edge-Detection-using-Deep-Learning (HED) [tensorflow]
- Holistically-Nested Edge Detection (HED) in OpenCV [python/c++]
- Crisp Boundary Detection Using Pointwise Mutual Information (eccv14) [matlab]
- Dense Extreme Inception Network: Towards a Robust CNN Model for Edge Detection [wacv20] [tensorflow] [pytorch]
Text Detection
- Real-time Scene Text Detection with Differentiable Binarization [pytorch] [aaai20]
Framework
3D Detection
Framework
- OpenMMLab's next-generation platform for general 3D object detection [pytorch]
- OpenPCDet Toolbox for LiDAR-based 3D Object Detection [pytorch]
Optical Flow
- FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks (cvpr17) [caffe] [pytorch/nvidia]
- SPyNet: Spatial Pyramid Network for Optical Flow (cvpr17) [lua] [pytorch]
- Guided Optical Flow Learning (cvprw17) [caffe] [tensorflow]
- Fast Optical Flow using Dense Inverse Search (DIS) [eccv16] [C++]
- A Filter Formulation for Computing Real Time Optical Flow [ral16] [c++/cuda - matlab,python wrappers]
- PatchBatch - a Batch Augmented Loss for Optical Flow [cvpr16] [python/theano]
- Piecewise Rigid Scene Flow [iccv13/eccv14/ijcv15] [c++/matlab]
- DeepFlow v2 [iccv13] [c++/python/matlab] [project]
- An Evaluation of Data Costs for Optical Flow [gcpr13] [matlab]
Framework
Instance Segmentation
- Fully Convolutional Instance-aware Semantic Segmentation [cvpr17] [coco16 winner] [mxnet]
- Instance-aware Semantic Segmentation via Multi-task Network Cascades [cvpr16] [caffe] [coco15 winner]
- DeepMask/SharpMask [nips15/eccv16] [facebook] [torch] [tensorflow] [pytorch/deepmask]
- Simultaneous Detection and Segmentation [eccv14] [matlab] [project]
- PANet [cvpr18] [pytorch]
- RetinaMask [arxiv1901] [pytorch]
- Mask Scoring R-CNN [cvpr19] [pytorch]
- DeepMAC [ax2104] [tensorflow]
- Swin Transformer [iccv21] [pytorch] [microsoft]
Framework
- Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch [pytorch] [facebook]
- PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle. [2019]
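Semantic Segmentation
- Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation [cvpr18] [spotlight] [pytorch]
- Few-Shot Segmentation Propagation with Guided Networks [ax1806] [pytorch] [incomplete]
- Pytorch-segmentation-toolbox [DeeplabV3 and PSPNet] [pytorch]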
- DeepLab [tensorflow]
- Auto-DeepLab [pytorch]
- DeepLab v3+ [pytorch]
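- Deep Extreme Cut (DEXTR): From Extreme Points to Object Segmentation [cvpr18] [project] [pytorch]
- FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation [ax1903] [project] [pytorch]
Framework
- OpenMMLab Semantic Segmentation Toolbox and Benchmark [pytorch]
Polyp Segmentation
Panoptic Segmentation
- Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation [cvpr20] [pytorch]
Video Segmentation
- Improving Semantic Segmentation via Video Propagation and Label Relaxation [cvpr19] [pytorch] [nvidia]
- PReMVOS: Proposal-generation, Refinement and Merging for Video Object Segmentation [accv18/cvprw18/eccvw18] [tensorflow]
- MaskTrackRCNN for video instance segmentation [iccv19] [pytorch/detectron]
- MaskTrackRCNN [iccv19] [pytorch/detectron]
- Video Instance Segmentation using Inter-Frame Communication Transformers [nips21] [pytorch/detectron]
- VNext: SeqFormer / IDOL [eccv22] [pytorch/detectron2]
- SeqFormer: Sequential Transformer for Video Instance Segmentation [eccv22] [pytorch/detectron2]
- VITA: Video Instance Segmentation via Object Token Association [nips22] [pytorch/detectron2]
Panoptic Video Segmentation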
- ViP-DeepLab [cvpr21]
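Motion Prediction
- Self-Supervised Learning via Conditional Motion Propagation [cvpr19] [pytorch]
- A Neural Temporal Model for Human Motion Prediction [cvpr19] [tensorflow]
- Learning Trajectory Dependencies for Human Motion Prediction [iccv19] [pytorch]
- Structural-RNN: Deep Learning on Spatio-Temporal Graphs [cvpr15] [tensorflow]
- A Keras multi-input multi-output LSTM-based RNN for object trajectory forecasting [keras]
- Transformer Networks for Trajectory Forecasting [ax2003] [pytorch]
- Regularizing neural networks for future trajectory prediction via IRL framework [ietcv1907] [tensorflow]
- Peeking into the Future: Predicting Future Person Activities and Locations in Videos [cvpr19] [tensorflow]
- DAG-Net: Double Attentive Graph Neural Network for Trajectory Forecasting [ax200526] [pytorch]
- MCENET: Multi-Context Encoder Network for Homogeneous Agent Trajectory Prediction in Mixed Traffic [ax200405] [tensorflow]
- Human Trajectory Prediction in Socially Interacting Crowds Using a CNN-based Architecture [pytorch]
- A tool set for trajectory prediction, ready for pip install [icai19/wacv19] [pytorch]
- RobustTP: End-to-End Trajectory Prediction for Heterogeneous Road-Agents in Dense Traffic with Noisy Sensor Inputs [acmcscs19] [pytorch/tensorflow]
- The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction [cvpr20] [dummy]
- Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction [cvpr19] [tensorflow]
- Adversarial Loss for Human Trajectory Prediction [hEART19] [pytorch]
- Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks [cvpr18] [pytorch]
- Forecasting Trajectory and Behavior of Road-Agents Using Spectral Clustering in Graph-LSTMs [ax1912] [pytorch]
- Study of attention mechanisms for trajectory prediction in Deep Learning [msc thesis] [python]
- Python implementation of multi-model estimation algorithms for trajectory tracking and prediction, from the BMW ABSOLUT self-driving bus project [python]
- Predicting human trajectories [theano]
- A recurrent neural network implementation for pedestrian future trajectory prediction [pytorch]
Pose Estimation
Framework
- OpenMMLab Pose Estimation Toolbox and Benchmark [pytorch]
Autoencoder
- β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework [iclr17] [deepmind] [tensorflow] [tensorflow] [pytorch]
- Disentangling by Factorising [ax1806] [pytorch]
Classification
- Learning Efficient Convolutional Networks through Network Slimming [iccv17] [pytorch]
Framework
- OpenMMLab Image Classification Toolbox and Benchmark [pytorch]
Deep RL
Annotation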
- LabelImg
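- ByLabel: A Boundary Based Semi-Automatic Image Annotation Tool
- Bounding Box Editor and Exporter
- VGG Image Annotator
- Visual Object Tagging Tool: An Electron app for building end-to-end object detection models from images and videos
- PixelAnnotationTool
- labelme: Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation)
- VATIC - Video Annotation Tool from Irvine, California [ijcv12] [project]
- Computer Vision Annotation Tool (CVAT)
- Image labelling tool
- Labelbox [paid]
- RectLabel: An image annotation tool to label images for bounding box object detection and segmentation [paid]
- Onepanel: Production scale vision AI platform with fully integrated components for model building, automated labeling, data processing and model training pipelines [docs]
Editing
Augmentation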
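- Augmentor: Image augmentation library in Python for machine learning
- Albumentations: Fast image augmentation library and easy to use wrapper around other libraries
- imgaug: Image augmentation for machine learning experiments
- solt: Streaming over lightweight data transformations
Deep Learning
Class Imbalance
- Imbalanced Dataset Sampler [pytorch]
- Iterable dataset resampling in PyTorch [pytorch]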
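A minimal sketch of the general idea behind class-balanced samplers: draw each example with probability inversely proportional to the frequency of its class, so that every class is picked about equally often. The function names `inverse_frequency_weights` and `resample_indices` are hypothetical and the code uses only the Python standard library; it is not the API of either repository listed above.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per example, equal to 1 / (count of that example's class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def resample_indices(labels, num_samples, seed=0):
    """Draw num_samples dataset indices with replacement, rebalanced by class."""
    rng = random.Random(seed)  # local generator so the draw is reproducible
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# Usage: class "rare" has 1 of 5 examples but roughly half of the total sampling weight.
example_labels = ["common", "common", "common", "common", "rare"]
example_indices = resample_indices(example_labels, num_samples=10, seed=42)
```

The per-example weights within a class sum to 1, so each class receives the same total probability mass regardless of how many examples it has.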
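Few-Shot Learning
- OpenMMLab Few-Shot Learning Toolbox and Benchmark [pytorch]
Unsupervised Learning
- Self-supervised learning toolbox and benchmark [pytorch]
Collections
Datasets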
- Awesome Public Datasets
- List of traffic surveillance datasets
- Machine learning datasets: A list of the biggest machine learning datasets from across the web
- Labeled Information Library of Alexandria: Biology and Conservation [other conservation data sets]
- THOTH: Data Sets & Images
- Google AI Datasets
- Google Cloud Storage public datasets
- Microsoft Research Open Data
- Earth Engine Data Catalog
- Registry of Open Data on AWS
- Kaggle Datasets
- CVonline: Image Databases
- Synthetic for Computer Vision: A list of synthetic dataset and tools for computer vision
- pgram machine learning datasets
- pgram vision datasets
Deep Learning
Static Detection
Video Detection
Single Object Tracking
- Visual Tracking Paper List
- List of deep learning based tracking papers
- List of single object trackers with results on OTB
- Collection of Correlation Filter based trackers with links to papers, codes, etc
- VOT2018 Trackers repository
- CUHK Datasets
- A Summary of CVPR19 Visual Tracking Papers
- Visual Trackers for Single Object
Multi Object Tracking
- List of multi object tracking papers
- A collection of Multiple Object Tracking (MOT) papers in recent years, with notes
- Papers with Code : Multiple Object Tracking
- Paper list and source code for multi-object-tracking
Static Segmentation
- Segmentation Papers and Code
- Segmentation.X : Papers and Benchmarks about semantic segmentation, instance segmentation, panoptic segmentation and video segmentation
- Instance Segmentation Papers with Code
Video Segmentation
Motion Prediction
- Awesome-Trajectory-Prediction
- Awesome Interaction-aware Behavior and Trajectory Prediction
- Human Trajectory Prediction Datasets
Deep Compressed Sensing
Misc
- Papers With Code : the latest in machine learning
- Awesome Deep Ecology
- List of Matlab frameworks, libraries and software
- Face Recognition
- A Month of Machine Learning Paper Summaries
- Awesome-model-compression-and-acceleration
- Model-Compression-Papers
Tutorials
Collections
Multi Object Tracking
Static Detection
- End-to-end object detection with Transformers
- Deep Learning for Object Detection: A Comprehensive Review
- Review of Deep Learning Algorithms for Object Detection
- A Simple Guide to the Versions of the Inception Network
- R-CNN, Fast R-CNN, Faster R-CNN, YOLO - Object Detection Algorithms
- A gentle guide to deep learning object detection
- The intuition behind RetinaNet
- YOLO—You only look once, real time object detection explained
- Understanding Feature Pyramid Networks for object detection (FPN)
- Fast object detection with SqueezeDet on Keras
- Region of interest pooling explained
Video Detection
Instance Segmentation
- Splash of Color: Instance Segmentation with Mask R-CNN and TensorFlow
- Simple Understanding of Mask RCNN
- Learning to Segment
- Analyzing The Papers Behind Facebook's Computer Vision Approach
- Review: MNC — Multi-task Network Cascade, Winner in 2015 COCO Segmentation
- Review: FCIS — Winner in 2016 COCO Segmentation
- Review: InstanceFCN — Instance-Sensitive Score Maps
Deep Learning
Optimization
Class Imbalance
- Learning from imbalanced data
- Learning from Imbalanced Classes
- Handling imbalanced datasets in machine learning [medium]
- How to handle Class Imbalance Problem [medium]
- Dealing with Imbalanced Data [towardsdatascience]
- How to Handle Imbalanced Classes in Machine Learning [elitedatascience]
- 7 Techniques to Handle Imbalanced Data [kdnuggets]
- 10 Techniques to deal with Imbalanced Classes in Machine Learning [analyticsvidhya]
RNN
Deep RL
Autoencoders
- Guide to Autoencoders
- Applied Deep Learning - Part 3: Autoencoders
- Denoising Autoencoders
- Stacked Denoising Autoencoders
- A Gentle Introduction to LSTM Autoencoders
- Variational Autoencoder in TensorFlow
- Variational Autoencoders with Tensorflow Probability Layers