Mask R-CNN - Single Class Instance Segmentation Tutorial

Hey everyone. I finally have time to write a blog post on Mask R-CNN, which I'm using for a dental project. Mask R-CNN is a popular architecture for instance segmentation.


On how Mask R-CNN works, my apprentice Mansour made a very nice slide:

But if you’re reading this, you’re probably looking to get the repo running right? Let’s start with cloning the original repo:
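Assuming you're working from the well-known matterport implementation (the repo this walkthrough matches — treat the URL as my assumption), cloning looks like this:

```shell
# Clone the Mask R-CNN repo and enter it.
# (URL assumes the matterport implementation, which this tutorial follows.)
git clone https://github.com/matterport/Mask_RCNN.git
cd Mask_RCNN
```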

It has great documentation on what Mask R-CNN is and how it works, but getting it running wasn't as straightforward. Let's go step by step.

Preparing the Environment

Unfortunately the original repo still uses TensorFlow 1, so it took some time to get it all sorted out. The repo ships with the following requirements.txt:

numpy
scipy
Pillow
cython
matplotlib
scikit-image
tensorflow>=1.3.0
keras>=2.0.8
opencv-python
h5py
imgaug
IPython[all]

TensorFlow is the most troublesome to get right; once it's installed, the rest of the libraries should play well with each other. If not, you can simply pip uninstall / pip install to update them. I've run the training on two machines, with the following configurations.


For a CPU:

(windows) tensorflow 1.5

For a GPU:

(windows) tensorflow-gpu 1.13.1 CUDA : 10.0 CUDNN : 7.6.0
(linux)   tensorflow-gpu 1.15.2 CUDA : 10.1 CUDNN : 7.6.5

If you’re not using one of those GPU configurations, you can check the compatible TensorFlow / CUDA / cuDNN combinations here:

Running an Example

The balloon example is the best starting point for single-class instance segmentation. Navigate to samples > balloon.

Read the README thoroughly. Download the .h5 weights and the dataset and place them in the base folder, then run through the two Jupyter notebooks (inspect_balloon_data.ipynb & inspect_balloon_model.ipynb). Make sure you've changed the paths in the notebooks to reflect where you put the .h5 file and the dataset. We’re not running any training yet, but it’s good to confirm the basic repo code works.

Once that’s done, try running the training command.

python balloon.py train --dataset=/path/to/balloon/dataset --weights=coco

If the environment is set up correctly, it should train a new .h5 file. Running on GPU took me around 1.5 hours on the default configuration (30 epochs × 100 steps) with an RTX 2060. You can reduce this by opening balloon.py and modifying STEPS_PER_EPOCH and VALIDATION_STEPS.
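To see how those two settings scale the run time, here's a back-of-the-envelope sketch. STEPS_PER_EPOCH and VALIDATION_STEPS are real attributes of the repo's Config class, but the toy class below is only illustrative, not the actual mrcnn config:

```python
# Rough cost model: total batches scale linearly with STEPS_PER_EPOCH and
# VALIDATION_STEPS, so halving them roughly halves the wall-clock time.
# This only mimics the repo's Config pattern -- it is NOT the real class.

class ToyConfig:
    STEPS_PER_EPOCH = 100    # training batches per epoch (balloon default)
    VALIDATION_STEPS = 50    # validation batches per epoch (base default)

EPOCHS = 30  # the balloon sample trains for 30 epochs

def total_batches(cfg, epochs):
    """Total train + validation batches processed in a full run."""
    return epochs * (cfg.STEPS_PER_EPOCH + cfg.VALIDATION_STEPS)

print(total_batches(ToyConfig, EPOCHS))        # 4500 with the defaults

class ShorterConfig(ToyConfig):
    STEPS_PER_EPOCH = 50     # half the steps -> roughly half the time
    VALIDATION_STEPS = 25

print(total_batches(ShorterConfig, EPOCHS))    # 2250
```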

Ran into an error? Refer to the troubleshooting section at the bottom. If not, congratulations! Try rerunning the inspect_balloon_model notebook, but this time point BALLOON_WEIGHTS_PATH to the newly created .h5 file and run through the cells.

Creating Your Own (Single Class) Instance Segmentation

Honestly, once you’ve gotten the balloon training working you can relax, since that was the most troublesome part. Modifying the code for your own purposes isn’t as difficult. I’ve modified mine for a dental project, the first phase being segmenting teeth from the background.

Start off by duplicating the balloon folder. You’ll mostly be modifying the balloon.py file. You can change every “balloon” into “teeth” or whatever you’re using it for.

To train and test on your own dataset, first divide it into train and val folders. Refer to the balloon dataset for how they're structured. Mask R-CNN doesn't mind images of different resolutions.
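If your images are all in one folder, a tiny helper like this can do the train/val split. It's my own sketch, not part of the repo — the repo only cares that images end up in train/ and val/ folders, each with its own annotation file:

```python
import random

def split_dataset(filenames, val_ratio=0.2, seed=42):
    """Shuffle filenames and split them into (train, val) lists.

    Illustrative helper -- copy/move the files into train/ and val/
    folders afterwards, following the balloon dataset layout.
    """
    rng = random.Random(seed)           # fixed seed -> reproducible split
    files = sorted(filenames)
    rng.shuffle(files)
    n_val = max(1, int(len(files) * val_ratio))
    return files[n_val:], files[:n_val]

train, val = split_dataset([f"tooth_{i:03d}.jpg" for i in range(50)])
print(len(train), len(val))  # 40 10
```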

For annotation, you can either modify the code to ingest your own annotation tool’s output OR use the annotator tool they use. If you’re going to use theirs, you should use VIA < 1.6. We used this one: http://www.robots.ox.ac.uk/~vgg/software/via/via-1.0.6.html

You’d want the annotation export to have this format:

{ 'filename': 'this_is_the_filename.jpg',
  'regions': {
      '0': {
          'region_attributes': {},
          'shape_attributes': {
              'all_points_x': [...],
              'all_points_y': [...],
              'name': 'polygon'}},
      ... more regions ...
  },
  'size': 554554
}
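To sanity-check your export, you can walk the JSON the same way the balloon sample does. A minimal sketch — the key names follow the VIA 1.x format shown above, and the top-level key is VIA's filename+size concatenation:

```python
import json

# A one-image VIA 1.x export, matching the structure shown above.
via_export = json.loads("""
{
  "this_is_the_filename.jpg554554": {
    "filename": "this_is_the_filename.jpg",
    "size": 554554,
    "regions": {
      "0": {
        "region_attributes": {},
        "shape_attributes": {
          "name": "polygon",
          "all_points_x": [10, 60, 35],
          "all_points_y": [10, 10, 50]
        }
      }
    }
  }
}
""")

# Each numbered region must be a polygon -- this mirrors how the sample
# script iterates the annotations before building masks.
for entry in via_export.values():
    polygons = [r["shape_attributes"] for r in entry["regions"].values()]
    for poly in polygons:
        assert poly["name"] == "polygon"
        points = list(zip(poly["all_points_x"], poly["all_points_y"]))
        print(entry["filename"], points)
```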

Export as a JSON file and ensure the generated output follows that convention, where each instance has its own number (running it through a JSON formatter makes it easier to read). Note that both train and val need their own via_region_data.json file.

Once you have the dataset prepared, simply run the

python teeth.py train --dataset=/path/to/teeth/dataset --weights=coco

command, and it should start training a new model for you. If successful, you'll have a new .h5 in your logs folder! Hurrah, you’ve got a Mask R-CNN working.

Some stuff you can get from the repo:

Instance Segmentation


Feature Map



Troubleshooting

Hopefully you’ll never need to consult this section, but these are the common mistakes we found when I had my apprentice run the code on his machine.

mrcnn is missing

This usually means you're running the Jupyter notebook / training script from a different directory than expected. Do NOT install mrcnn from pip or a manually built wheel; the mrcnn package is already in the base directory, alongside the scripts that depend on it.
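The usual fix — and essentially what the repo's own notebooks do — is to put the repo root on sys.path before importing. ROOT_DIR below is a placeholder you'd adjust for your machine:

```python
import os
import sys

# Placeholder: path to your cloned repo root (adjust for your machine).
ROOT_DIR = os.path.abspath("../../")

# The sample notebooks do essentially this before "import mrcnn", so the
# local package is found without any pip install.
if ROOT_DIR not in sys.path:
    sys.path.insert(0, ROOT_DIR)

# import mrcnn.model as modellib  # would now resolve to the in-repo copy
```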

“…” file is missing!
  • Read the Readmes thoroughly. The original repo provided almost everything you need to run their examples, from .h5 to dataset. Are you sure you have everything?
  • Double-check how you point to the dataset or weights. Make sure the paths point to the right folders.
  • Try running the balloon training from inside the balloon folder.
Training stuck at first step / epoch

Running on CPU, are you? The first step / epoch took me a long time too, around 30 mins. A full training run (30 epochs × 100 steps) on CPU would take 4 days non-stop. Switch to a GPU; it’s 100x faster.