DeepLab is a state-of-the-art semantic segmentation model designed and open-sourced by Google, and one of the most promising deep learning techniques for semantic image segmentation. Semantic segmentation means understanding an image at the pixel level: every pixel in the image is assigned a label, such that pixels with the same label share certain characteristics. The dense prediction is achieved by up-sampling the output of the last convolution layer to the input resolution and computing a pixel-wise loss.
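To make "up-sample, then compute a pixel-wise loss" concrete, here is a minimal pure-Python sketch. It is not DeepLab's actual implementation: nearest-neighbour up-sampling stands in for the interpolation a real model would use, softmax cross-entropy is assumed as the loss, and the helper names `upsample_nearest` and `pixelwise_cross_entropy` are made up for illustration.

```python
import math


def upsample_nearest(grid, factor):
    """Repeat every cell `factor` times along both axes (nearest neighbour)."""
    out = []
    for row in grid:
        wide = [cell for cell in row for _ in range(factor)]
        for _ in range(factor):
            out.append(list(wide))
    return out


def pixelwise_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over all pixels.

    logits: H x W x C nested lists of class scores.
    labels: H x W nested lists of integer class indices.
    """
    total, count = 0.0, 0
    for logit_row, label_row in zip(logits, labels):
        for scores, label in zip(logit_row, label_row):
            peak = max(scores)  # subtract the max for numerical stability
            log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
            total += log_sum - scores[label]
            count += 1
    return total / count


# Toy example: a coarse 1 x 2 score map with 2 classes, up-sampled by 2
# so that it matches a 2 x 4 label image.
coarse_logits = [[[2.0, 0.0], [0.0, 2.0]]]
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]

dense_logits = upsample_nearest(coarse_logits, 2)
loss = pixelwise_cross_entropy(dense_logits, labels)
print(len(dense_logits), len(dense_logits[0]), round(loss, 4))
```

Every pixel contributes one cross-entropy term, so the loss directly measures how well each pixel is labelled.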
- Set up the model and module. Clone the official model repository (tensorflow/models: Models and examples built with TensorFlow): git clone https://github.com/tensorflow/models
- Prepare the data. Prepare the following.
- Preprocess the data. If the label images are color-coded, convert them to single-channel (grayscale) label images in which each pixel value is the class index.
- Set up training.
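The preprocessing step above can be sketched as a palette lookup. This is only an illustration under stated assumptions: the palette `PALETTE`, the helper `color_to_index`, and the use of 255 as an "ignore" value for unknown colors are all hypothetical choices, not something taken from the DeepLab repository.

```python
# Hypothetical palette: RGB color -> class index. Replace with the colors
# actually used in your own label images.
PALETTE = {
    (0, 0, 0): 0,      # background
    (128, 0, 0): 1,    # first object class
    (0, 128, 0): 2,    # second object class
}


def color_to_index(rgb_image, palette=PALETTE, ignore_label=255):
    """Convert an H x W grid of RGB tuples into an H x W grid of class indices.

    Colors missing from the palette are mapped to `ignore_label`
    (an assumed convention for pixels that should not count in training).
    """
    return [
        [palette.get(tuple(pixel), ignore_label) for pixel in row]
        for row in rgb_image
    ]


rgb_label = [
    [(0, 0, 0), (128, 0, 0)],
    [(0, 128, 0), (1, 2, 3)],  # the last color is not in the palette
]
index_label = color_to_index(rgb_label)
print(index_label)
```

In practice the pixels would be read from and written back to image files with an imaging library; nested lists are used here only to keep the sketch self-contained.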
DeepLab v3 is a semantic segmentation architecture that improves upon DeepLab v2 with several modifications. To handle the problem of segmenting objects at multiple scales, it uses modules that apply atrous convolution in cascade or in parallel, capturing multi-scale context by adopting multiple atrous rates.
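To illustrate what "multiple atrous rates in parallel" means, here is a small 1-D sketch in pure Python. It is a simplification, not the DeepLab v3 module: real models work on 2-D feature maps and learn separate filters per branch, whereas this example reuses one fixed kernel, and the function names `atrous_conv1d` and `parallel_atrous` are made up for illustration.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution with zero padding.

    Kernel taps are spaced `rate` samples apart, so a larger rate widens
    the field of view without adding weights. Output length equals input length.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - center) * rate
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += weight * signal[j]
        out.append(acc)
    return out


def parallel_atrous(signal, kernel, rates):
    """Apply the same kernel at several rates; one output per rate."""
    return {rate: atrous_conv1d(signal, kernel, rate) for rate in rates}


# A single impulse shows how far each branch "sees".
signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
features = parallel_atrous(signal, [1.0, 1.0, 1.0], rates=(1, 2, 3))
for rate, values in features.items():
    print(rate, values)
```

With rate 1 the impulse only influences its direct neighbours, while rates 2 and 3 reach positions two and three samples away using the same three weights. Combining the branches gives context at several scales.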