Revamp docs for Faster RCNN #5918

Merged 6 commits on May 3, 2022
29 changes: 29 additions & 0 deletions docs/source/models/faster_rcnn.rst
@@ -0,0 +1,29 @@
Faster R-CNN
============

.. currentmodule:: torchvision.models.detection

The Faster R-CNN model is based on the `Faster R-CNN: Towards Real-Time Object Detection
with Region Proposal Networks <https://arxiv.org/abs/1506.01497>`__
paper.


Model builders
--------------

The following model builders can be used to instantiate a Faster R-CNN model, with or
without pre-trained weights. All the model builders internally rely on the
``torchvision.models.detection.faster_rcnn.FasterRCNN`` base class. Please refer to the `source
code
<https://github.com/pytorch/vision/blob/main/torchvision/models/detection/faster_rcnn.py>`_ for
more details about this class.

.. autosummary::
    :toctree: generated/
    :template: function.rst

    fasterrcnn_resnet50_fpn
    fasterrcnn_resnet50_fpn_v2
    fasterrcnn_mobilenet_v3_large_fpn
    fasterrcnn_mobilenet_v3_large_320_fpn

1 change: 1 addition & 0 deletions docs/source/models_new.rst
@@ -97,6 +97,7 @@ weights:
.. toctree::
   :maxdepth: 1

   models/faster_rcnn
   models/fcos
   models/mask_rcnn
   models/retinanet
120 changes: 86 additions & 34 deletions torchvision/models/detection/faster_rcnn.py
@@ -453,10 +453,9 @@ def fasterrcnn_resnet50_fpn(
**kwargs: Any,
) -> FasterRCNN:
"""
Constructs a Faster R-CNN model with a ResNet-50-FPN backbone.

Reference: `"Faster R-CNN: Towards Real-Time Object Detection with
Region Proposal Networks" <https://arxiv.org/abs/1506.01497>`_.
Faster R-CNN model with a ResNet-50-FPN backbone from the `Faster R-CNN: Towards Real-Time Object
Detection with Region Proposal Networks <https://arxiv.org/abs/1506.01497>`__
paper.

The input to the model is expected to be a list of tensors, each of shape ``[C, H, W]``, one for each
image, and should be in ``0-1`` range. Different images can have different sizes.
@@ -510,13 +509,26 @@ def fasterrcnn_resnet50_fpn(
>>> torch.onnx.export(model, x, "faster_rcnn.onnx", opset_version = 11)

Args:
weights (FasterRCNN_ResNet50_FPN_Weights, optional): The pretrained weights for the model
progress (bool): If True, displays a progress bar of the download to stderr
weights (:class:`~torchvision.models.detection.FasterRCNN_ResNet50_FPN_Weights`, optional): The
pretrained weights to use. See
:class:`~torchvision.models.detection.FasterRCNN_ResNet50_FPN_Weights` below for
more details, and possible values. By default, no pre-trained
weights are used.
progress (bool, optional): If True, displays a progress bar of the
download to stderr. Default is True.
num_classes (int, optional): number of output classes of the model (including the background)
weights_backbone (ResNet50_Weights, optional): The pretrained weights for the backbone
trainable_backbone_layers (int, optional): number of trainable (not frozen) layers starting from final block.
Valid values are between 0 and 5, with 5 meaning all backbone layers are trainable. If ``None`` is
passed (the default) this value is set to 3.
weights_backbone (:class:`~torchvision.models.ResNet50_Weights`, optional): The
pretrained weights for the backbone.
trainable_backbone_layers (int, optional): number of trainable (not frozen) layers starting from
final block. Valid values are between 0 and 5, with 5 meaning all backbone layers are
trainable. If ``None`` is passed (the default) this value is set to 3.
**kwargs: parameters passed to the ``torchvision.models.detection.faster_rcnn.FasterRCNN``
base class. Please refer to the `source code
<https://github.com/pytorch/vision/blob/main/torchvision/models/detection/faster_rcnn.py>`_
for more details about this class.

.. autoclass:: torchvision.models.detection.FasterRCNN_ResNet50_FPN_Weights
:members:
"""
weights = FasterRCNN_ResNet50_FPN_Weights.verify(weights)
weights_backbone = ResNet50_Weights.verify(weights_backbone)
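
The effect of ``trainable_backbone_layers`` described in the docstring above can be illustrated with a pure-Python sketch. It assumes the ResNet backbone's top-level blocks are named ``conv1``, ``bn1`` and ``layer1`` to ``layer4``, and that blocks are unfrozen from the final block backwards, as the docstring states. The helper ``trainable_resnet_blocks`` is hypothetical and is not part of torchvision.

```python
from typing import List, Optional

# Assumed ResNet block names, ordered from the final block back to the stem.
RESNET_BLOCKS = ["layer4", "layer3", "layer2", "layer1", "conv1"]


def trainable_resnet_blocks(trainable_backbone_layers: Optional[int] = None) -> List[str]:
    """Return the backbone blocks left trainable (hypothetical helper)."""
    # The docstring states that ``None`` falls back to 3.
    n = 3 if trainable_backbone_layers is None else trainable_backbone_layers
    if not 0 <= n <= 5:
        raise ValueError(f"expected a value between 0 and 5, got {n}")
    blocks = RESNET_BLOCKS[:n]  # slicing copies, so the constant is not mutated
    if n == 5:
        # Assumption: the stem's batch norm is unfrozen together with conv1.
        blocks.append("bn1")
    return blocks


print(trainable_resnet_blocks())   # default: the last three blocks
print(trainable_resnet_blocks(0))  # fully frozen backbone
print(trainable_resnet_blocks(5))  # every block is trainable
```

This mirrors only the behavior documented in the docstring; the real builder may apply further rules, for example when no pretrained weights are given.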
@@ -553,21 +565,34 @@ def fasterrcnn_resnet50_fpn_v2(
**kwargs: Any,
) -> FasterRCNN:
"""
Constructs an improved Faster R-CNN model with a ResNet-50-FPN backbone.

Reference: `"Benchmarking Detection Transfer Learning with Vision Transformers"
<https://arxiv.org/abs/2111.11429>`_.
Constructs an improved Faster R-CNN model with a ResNet-50-FPN backbone from the `Benchmarking Detection
Transfer Learning with Vision Transformers <https://arxiv.org/abs/2111.11429>`__ paper.

:func:`~torchvision.models.detection.fasterrcnn_resnet50_fpn` for more details.
It works similarly to Faster R-CNN with ResNet-50 FPN backbone. See
:func:`~torchvision.models.detection.fasterrcnn_resnet50_fpn` for more
details.

Args:
weights (FasterRCNN_ResNet50_FPN_V2_Weights, optional): The pretrained weights for the model
progress (bool): If True, displays a progress bar of the download to stderr
weights (:class:`~torchvision.models.detection.FasterRCNN_ResNet50_FPN_V2_Weights`, optional): The
pretrained weights to use. See
:class:`~torchvision.models.detection.FasterRCNN_ResNet50_FPN_V2_Weights` below for
more details, and possible values. By default, no pre-trained
weights are used.
progress (bool, optional): If True, displays a progress bar of the
download to stderr. Default is True.
num_classes (int, optional): number of output classes of the model (including the background)
weights_backbone (ResNet50_Weights, optional): The pretrained weights for the backbone
trainable_backbone_layers (int, optional): number of trainable (not frozen) layers starting from final block.
Valid values are between 0 and 5, with 5 meaning all backbone layers are trainable. If ``None`` is
passed (the default) this value is set to 3.
weights_backbone (:class:`~torchvision.models.ResNet50_Weights`, optional): The
pretrained weights for the backbone.
trainable_backbone_layers (int, optional): number of trainable (not frozen) layers starting from
final block. Valid values are between 0 and 5, with 5 meaning all backbone layers are
trainable. If ``None`` is passed (the default) this value is set to 3.
**kwargs: parameters passed to the ``torchvision.models.detection.faster_rcnn.FasterRCNN``
base class. Please refer to the `source code
<https://github.com/pytorch/vision/blob/main/torchvision/models/detection/faster_rcnn.py>`_
for more details about this class.

.. autoclass:: torchvision.models.detection.FasterRCNN_ResNet50_FPN_V2_Weights
:members:
"""
weights = FasterRCNN_ResNet50_FPN_V2_Weights.verify(weights)
weights_backbone = ResNet50_Weights.verify(weights_backbone)
@@ -658,7 +683,8 @@ def fasterrcnn_mobilenet_v3_large_320_fpn(
**kwargs: Any,
) -> FasterRCNN:
"""
Constructs a low resolution Faster R-CNN model with a MobileNetV3-Large FPN backbone tuned for mobile use-cases.
Low resolution Faster R-CNN model with a MobileNetV3-Large backbone tuned for mobile use cases.

It works similarly to Faster R-CNN with ResNet-50 FPN backbone. See
:func:`~torchvision.models.detection.fasterrcnn_resnet50_fpn` for more
details.
@@ -671,13 +697,26 @@ def fasterrcnn_mobilenet_v3_large_320_fpn(
>>> predictions = model(x)

Args:
weights (FasterRCNN_MobileNet_V3_Large_320_FPN_Weights, optional): The pretrained weights for the model
progress (bool): If True, displays a progress bar of the download to stderr
weights (:class:`~torchvision.models.detection.FasterRCNN_MobileNet_V3_Large_320_FPN_Weights`, optional): The
pretrained weights to use. See
:class:`~torchvision.models.detection.FasterRCNN_MobileNet_V3_Large_320_FPN_Weights` below for
more details, and possible values. By default, no pre-trained
weights are used.
progress (bool, optional): If True, displays a progress bar of the
download to stderr. Default is True.
num_classes (int, optional): number of output classes of the model (including the background)
weights_backbone (MobileNet_V3_Large_Weights, optional): The pretrained weights for the backbone
trainable_backbone_layers (int, optional): number of trainable (not frozen) layers starting from final block.
Valid values are between 0 and 6, with 6 meaning all backbone layers are trainable. If ``None`` is
passed (the default) this value is set to 3.
weights_backbone (:class:`~torchvision.models.MobileNet_V3_Large_Weights`, optional): The
pretrained weights for the backbone.
trainable_backbone_layers (int, optional): number of trainable (not frozen) layers starting from
final block. Valid values are between 0 and 6, with 6 meaning all backbone layers are
trainable. If ``None`` is passed (the default) this value is set to 3.
**kwargs: parameters passed to the ``torchvision.models.detection.faster_rcnn.FasterRCNN``
base class. Please refer to the `source code
<https://github.com/pytorch/vision/blob/main/torchvision/models/detection/faster_rcnn.py>`_
for more details about this class.

.. autoclass:: torchvision.models.detection.FasterRCNN_MobileNet_V3_Large_320_FPN_Weights
:members:
"""
weights = FasterRCNN_MobileNet_V3_Large_320_FPN_Weights.verify(weights)
weights_backbone = MobileNet_V3_Large_Weights.verify(weights_backbone)
@@ -728,13 +767,26 @@ def fasterrcnn_mobilenet_v3_large_fpn(
>>> predictions = model(x)

Args:
weights (FasterRCNN_MobileNet_V3_Large_FPN_Weights, optional): The pretrained weights for the model
progress (bool): If True, displays a progress bar of the download to stderr
weights (:class:`~torchvision.models.detection.FasterRCNN_MobileNet_V3_Large_FPN_Weights`, optional): The
pretrained weights to use. See
:class:`~torchvision.models.detection.FasterRCNN_MobileNet_V3_Large_FPN_Weights` below for
more details, and possible values. By default, no pre-trained
weights are used.
progress (bool, optional): If True, displays a progress bar of the
download to stderr. Default is True.
num_classes (int, optional): number of output classes of the model (including the background)
weights_backbone (MobileNet_V3_Large_Weights, optional): The pretrained weights for the backbone
trainable_backbone_layers (int, optional): number of trainable (not frozen) layers starting from final block.
Valid values are between 0 and 6, with 6 meaning all backbone layers are trainable. If ``None`` is
passed (the default) this value is set to 3.
weights_backbone (:class:`~torchvision.models.MobileNet_V3_Large_Weights`, optional): The
pretrained weights for the backbone.
trainable_backbone_layers (int, optional): number of trainable (not frozen) layers starting from
final block. Valid values are between 0 and 6, with 6 meaning all backbone layers are
trainable. If ``None`` is passed (the default) this value is set to 3.
**kwargs: parameters passed to the ``torchvision.models.detection.faster_rcnn.FasterRCNN``
base class. Please refer to the `source code
<https://github.com/pytorch/vision/blob/main/torchvision/models/detection/faster_rcnn.py>`_
for more details about this class.

.. autoclass:: torchvision.models.detection.FasterRCNN_MobileNet_V3_Large_FPN_Weights
:members:
"""
weights = FasterRCNN_MobileNet_V3_Large_FPN_Weights.verify(weights)
weights_backbone = MobileNet_V3_Large_Weights.verify(weights_backbone)
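
Because ``num_classes`` includes the background, a dataset with N object categories needs ``num_classes = N + 1``. A small sketch, assuming label 0 is reserved for the background class (the convention in torchvision's detection models) and using hypothetical category names:

```python
# Hypothetical object categories of a custom dataset.
object_categories = ["person", "bicycle"]

# Label 0 is assumed to be the background, so real categories start at 1.
label_to_name = {0: "__background__"}
label_to_name.update({i: name for i, name in enumerate(object_categories, start=1)})

# Value to pass as ``num_classes`` to a model builder.
num_classes = len(label_to_name)
print(num_classes, label_to_name)
```

The same count applies to all four builders documented in this PR, since each docstring describes ``num_classes`` identically.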