Object Detection
Datasets
ImageNet
large number of object categories (21,841 categories)
MS COCO
closer to real world scenarios
PASCAL VOC (2012)
Places
the largest dataset for scene recognition
Open Images
exhaustively annotated
Main Challenges
Efficiency and Scalability
handle unseen objects
quick inference
Accuracy
small inter-class variations
large intra-class variations
imaging conditions
intrinsic factors
Milestone Frameworks
One Stage
(region proposal free)
YOLO
YOLOv2 & YOLO9000
OverFeat
SSD
DetectorNet
Two Stage
(a pre-processing step generates region proposals)
Fast R-CNN
Drawbacks
external region proposals become the speed bottleneck
shares convolution computation across region proposals and adds a Region of Interest (RoI) pooling layer
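To illustrate the RoI pooling idea noted for Fast R-CNN, here is a minimal sketch in plain Python. It assumes a single-channel feature map stored as a list of rows, RoI coordinates given in feature-map cells with exclusive end indices, and floor/ceil bin edges; the function names `roi_max_pool` and `ceil_div` are hypothetical and this is not the reference implementation.

```python
def ceil_div(n, d):
    """Integer ceiling division for non-negative n and positive d."""
    return -(-n // d)


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool one RoI of a 2-D feature map into a fixed out_h x out_w grid.

    feature_map: list of rows (H x W), one channel only (a simplification).
    roi: (y0, x0, y1, x1) in feature-map cells, end-exclusive (assumed convention).
    """
    y0, x0, y1, x1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Floor for the bin start and ceil for the bin end, so no bin is empty.
        ys = y0 + (i * roi_h) // out_h
        ye = y0 + ceil_div((i + 1) * roi_h, out_h)
        row = []
        for j in range(out_w):
            xs = x0 + (j * roi_w) // out_w
            xe = x0 + ceil_div((j + 1) * roi_w, out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


# A 4 x 4 feature map, computed once and shared by every RoI.
fmap = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]

full = roi_max_pool(fmap, (0, 0, 4, 4), 2, 2)   # 4 x 4 RoI
small = roi_max_pool(fmap, (1, 1, 4, 4), 2, 2)  # 3 x 3 RoI
```

Both RoIs, although different in size, yield a 2 x 2 output from the same shared feature map, which is what lets the later layers take a fixed-size input.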
Faster R-CNN
selective search is replaced by a CNN, the Region Proposal Network (RPN), for producing region proposals
Drawbacks
the computation after the RoI pooling layer cannot be shared
SPPNet
Drawbacks
training speed is still slow
speeds up R-CNN with spatial pyramid pooling, which lets the CNN accept inputs of arbitrary size
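To illustrate why spatial pyramid pooling lets a CNN accept inputs of arbitrary size, here is a minimal sketch in plain Python. It assumes a single-channel feature map, max pooling in each bin, and a two-level pyramid (1 x 1 and 2 x 2); the names `spatial_pyramid_pool` and `ceil_div` and the floor/ceil bin edges are illustrative choices, not the SPPNet reference implementation.

```python
def ceil_div(n, d):
    """Integer ceiling division for non-negative n and positive d."""
    return -(-n // d)


def spatial_pyramid_pool(feature_map, levels=(1, 2)):
    """Max-pool a 2-D feature map over an n x n grid for each n in levels.

    Returns one flat list whose length is sum(n * n for n in levels),
    independent of the feature map's height and width.
    """
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for n in levels:
        for i in range(n):
            ys, ye = (i * h) // n, ceil_div((i + 1) * h, n)
            for j in range(n):
                xs, xe = (j * w) // n, ceil_div((j + 1) * w, n)
                out.append(max(feature_map[y][x]
                               for y in range(ys, ye)
                               for x in range(xs, xe)))
    return out


# Two feature maps of different sizes (4 x 4 and 3 x 5).
square = [[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11],
          [12, 13, 14, 15]]
wide = [[1, 2, 3, 4, 5],
        [6, 7, 8, 9, 10],
        [11, 12, 13, 14, 15]]

v_square = spatial_pyramid_pool(square)
v_wide = spatial_pyramid_pool(wide)
```

Both calls return a vector of length 5 (1 + 4 bins), so a fully connected layer placed after the pooling sees the same input length whatever the image size.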
R-FCN
a position-sensitive RoI pooling layer is added
Drawbacks
Mask R-CNN
tackles pixel-wise object instance segmentation by extending Faster R-CNN
Drawbacks
R-CNN
Drawbacks
the multistage pipeline is slow and hard to optimize
expensive in both space and time for SVM and bounding box regressor training
testing is slow, since CNN features are extracted per object proposal
the first to explore CNNs for generic object detection