allennlp.modules.vision.region_detector
RegionDetectorOutput¶
class RegionDetectorOutput(NamedTuple)
The output type from the forward pass of a RegionDetector.
features¶
class RegionDetectorOutput(NamedTuple):
| ...
| features: List[Tensor] = None
A list of tensors, each with shape (num_boxes, feature_dim).
boxes¶
class RegionDetectorOutput(NamedTuple):
| ...
| boxes: List[Tensor] = None
A list of tensors containing the coordinates for each box. Each has shape (num_boxes, 4).
class_probs¶
class RegionDetectorOutput(NamedTuple):
| ...
| class_probs: Optional[List[Tensor]] = None
An optional list of tensors. These tensors can have shape (num_boxes,) or (num_boxes, *) if probabilities for multiple classes are given.
class_labels¶
class RegionDetectorOutput(NamedTuple):
| ...
| class_labels: Optional[List[Tensor]] = None
An optional list of tensors that give the labels corresponding to the class_probs tensors. This should be non-None whenever class_probs is, and each tensor should have the same shape as the corresponding tensor from class_probs.
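The four fields are parallel lists indexed by image. A minimal sketch of that contract, using a stand-in NamedTuple with plain nested Python lists where the real class holds torch tensors:

```python
from typing import List, NamedTuple, Optional

# Stand-in for RegionDetectorOutput using plain lists instead of torch
# tensors, to illustrate the shape contract only.
class FakeRegionDetectorOutput(NamedTuple):
    features: List[list]                     # each entry: (num_boxes, feature_dim)
    boxes: List[list]                        # each entry: (num_boxes, 4)
    class_probs: Optional[List[list]] = None
    class_labels: Optional[List[list]] = None

# Two images: the first has 2 proposed boxes, the second has 1.
output = FakeRegionDetectorOutput(
    features=[[[0.1, 0.2], [0.3, 0.4]], [[0.5, 0.6]]],
    boxes=[[[0, 0, 10, 10], [5, 5, 20, 20]], [[0, 0, 8, 8]]],
)

# The lists are parallel: entry i describes image i, and the leading
# (num_boxes) dimension matches between features and boxes.
for feats, bxs in zip(output.features, output.boxes):
    assert len(feats) == len(bxs)
assert output.class_probs is None  # optional fields default to None
```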
RegionDetector¶
class RegionDetector(nn.Module, Registrable)
A RegionDetector takes a batch of images, their sizes, and an ordered dictionary of image features as input, and finds regions of interest (or "boxes") within those images.
Those regions of interest are described by four values:
- features (List[Tensor]): A feature vector for each region, which is a tensor of shape (num_boxes, feature_dim).
- boxes (List[Tensor]): The coordinates of each region within the original image, with shape (num_boxes, 4).
- class_probs (Optional[List[Tensor]]): Class probabilities from some object detector that was used to find the regions of interest, with shape (num_boxes,) or (num_boxes, *) if probabilities for more than one class are given.
- class_labels (Optional[List[Tensor]]): The labels corresponding to class_probs. Each tensor in this list has the same shape as the corresponding tensor in class_probs.
forward¶
class RegionDetector(nn.Module, Registrable):
| ...
| def forward(
| self,
| images: FloatTensor,
| sizes: IntTensor,
| image_features: "OrderedDict[str, FloatTensor]"
| ) -> RegionDetectorOutput
RandomRegionDetector¶
@RegionDetector.register("random")
class RandomRegionDetector(RegionDetector):
| def __init__(self, seed: Optional[int] = None)
A RegionDetector
that returns two proposals per image, for testing purposes. The features for
the proposal are a random 10-dimensional vector, and the coordinates are the size of the image.
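The behavior described above can be sketched in plain Python without torch. Here `random_regions` is a hypothetical stand-in for the detector's forward pass, and the (x1, y1, x2, y2) box layout is an assumption:

```python
import random
from typing import List, Optional, Tuple

def random_regions(
    sizes: List[Tuple[int, int]],      # (height, width) per image
    seed: Optional[int] = None,
) -> Tuple[List[list], List[list]]:
    """Sketch of RandomRegionDetector's behavior: two proposals per image,
    each with a random 10-dimensional feature vector and box coordinates
    spanning the whole image."""
    rng = random.Random(seed)
    features, boxes = [], []
    for height, width in sizes:
        features.append([[rng.random() for _ in range(10)] for _ in range(2)])
        boxes.append([[0.0, 0.0, float(width), float(height)]] * 2)
    return features, boxes

feats, boxes = random_regions([(480, 640), (300, 400)], seed=0)
assert len(feats) == 2 and len(boxes) == 2      # one entry per image
assert all(len(f) == 2 for f in feats)          # two proposals each
assert boxes[0][0] == [0.0, 0.0, 640.0, 480.0]  # box spans the image
```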
forward¶
class RandomRegionDetector(RegionDetector):
| ...
| def forward(
| self,
| images: FloatTensor,
| sizes: IntTensor,
| image_features: "OrderedDict[str, FloatTensor]"
| ) -> RegionDetectorOutput
FasterRcnnRegionDetector¶
@RegionDetector.register("faster_rcnn")
class FasterRcnnRegionDetector(RegionDetector):
| def __init__(
| self,
| *,
| box_score_thresh: float = 0.05,
| box_nms_thresh: float = 0.5,
| max_boxes_per_image: int = 100
| )
A Faster R-CNN pretrained region detector.
Unless you really know what you're doing, this should be used with the image features created from the ResnetBackbone GridEmbedder, and on images loaded using the TorchImageLoader with the default settings.
Note
This module does not have any trainable parameters by default. All pretrained weights are frozen.
Parameters¶
- box_score_thresh : float, optional (default = 0.05)
  During inference, only proposal boxes / regions with a label classification score greater than box_score_thresh will be returned.
- box_nms_thresh : float, optional (default = 0.5)
  During inference, non-maximum suppression (NMS) will be applied to groups of boxes that share a common label. NMS iteratively removes lower-scoring boxes which have an intersection-over-union (IoU) greater than box_nms_thresh with another higher-scoring box.
- max_boxes_per_image : int, optional (default = 100)
  During inference, at most max_boxes_per_image boxes will be returned. The number of boxes returned will vary by image and will often be lower than max_boxes_per_image, depending on the values of box_score_thresh and box_nms_thresh.
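How the three parameters interact during inference can be sketched with a pure-Python greedy NMS over the boxes of a single label. The `iou` and `filter_boxes` helpers below are illustrative, not part of the library:

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def filter_boxes(
    boxes: List[Box],
    scores: List[float],
    box_score_thresh: float = 0.05,
    box_nms_thresh: float = 0.5,
    max_boxes_per_image: int = 100,
) -> List[int]:
    """Greedy NMS over one label's boxes: drop low-scoring boxes, then
    suppress boxes that overlap a higher-scoring kept box, then cap the
    number of results."""
    # Candidates above the score threshold, best score first.
    order = sorted(
        (i for i, s in enumerate(scores) if s > box_score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= box_nms_thresh for j in kept):
            kept.append(i)
        if len(kept) == max_boxes_per_image:
            break
    return kept

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.04]
# The second box overlaps the first heavily (IoU ~ 0.68 > 0.5) and is
# suppressed; the third falls below box_score_thresh.
assert filter_boxes(boxes, scores) == [0]
```

Raising box_nms_thresh keeps more overlapping boxes, while lowering box_score_thresh admits more low-confidence candidates before NMS runs.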
forward¶
class FasterRcnnRegionDetector(RegionDetector):
| ...
| def forward(
| self,
| images: FloatTensor,
| sizes: IntTensor,
| image_features: "OrderedDict[str, FloatTensor]"
| ) -> RegionDetectorOutput
Extract regions and region features from the given images.
In most cases image_features should come directly from the ResnetBackbone GridEmbedder. The images themselves should be standardized and resized using the default settings for the TorchImageLoader.