Tree Identification with ResNet50 – Quick Update

This is just a quick update following the same format as the last post, with some minor modifications. In summary, the dataset has been expanded to include 66 of the tree species on the Cleveland Metroparks checklists, and the model has been trained using ResNet50 instead of ResNet34.

In [1]:
%matplotlib inline
%reload_ext autoreload
%autoreload 2
In [2]:
from fastai.vision import *
from fastai.datasets import *
from fastai.widgets import *

from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

Having a handful of these parameters collected in one place to tweak is useful, and it's a format I'll probably stick to for small notebook experiments like this one.

In [3]:
bs = 16
initial_dims = 224
workers = 2
valid = 0.2
In [4]:
IMG_PATH = Path("data/bark_66_categories")
In [5]:
data = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=valid,
                                  ds_tfms=get_transforms(), bs=bs, size=initial_dims,
                                  num_workers=workers).normalize(imagenet_stats)
In [6]:
data.show_batch(rows=4, figsize=(10, 10))
In [7]:
learn = cnn_learner(data, models.resnet50, metrics=error_rate)

Here we can already see some modest improvements from a model with larger capacity. We'll train with larger images immediately after this run to see just how much more performance we can get out of this model.

In [8]:
learn.fit_one_cycle(4)
epoch train_loss valid_loss error_rate time
0 4.757617 4.026068 0.869828 01:18
1 4.145549 3.616853 0.832759 01:19
2 3.440784 3.219274 0.793103 01:19
3 3.002847 3.121537 0.772414 01:18
In [9]:
learn.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
In [10]:
learn.recorder.plot()
In [11]:
learn.unfreeze()
learn.fit_one_cycle(4, max_lr=1e-4)
epoch train_loss valid_loss error_rate time
0 3.026054 3.164094 0.773276 01:19
1 3.100843 3.155196 0.770690 01:20
2 2.633893 2.938268 0.725000 01:18
3 2.292624 2.934356 0.718966 01:18
In [12]:
learn.save("cpi-0.0_66-categories_resnet50-1")
In [13]:
data_larger = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=valid,
                                         ds_tfms=get_transforms(), bs=bs, size=initial_dims*2,
                                         num_workers=workers).normalize(imagenet_stats)
In [16]:
learn_larger = cnn_learner(data_larger, models.resnet50, metrics=error_rate)
In [18]:
learn_larger.load("cpi-0.0_66-categories_resnet50-1")
In [19]:
learn_larger.fit_one_cycle(4)
epoch train_loss valid_loss error_rate time
0 3.055164 2.551959 0.679310 01:54
1 2.890265 2.595395 0.670690 01:52
2 2.633454 2.443143 0.643103 01:53
3 2.223217 2.389971 0.634483 01:54
In [20]:
interp = ClassificationInterpretation.from_learner(learn_larger)
In [21]:
interp.plot_confusion_matrix(figsize=(24, 24), dpi=60)

Saving this, because we might be able to use it in the future.

In [22]:
learn_larger.save("cpi-0.0_66-categories_resnet50-2")

Next Steps

So, we can do it. We can double the number of classes and, even though each has fewer than 100 images, still get modestly successful results. Looking at the error rate above, though, it's still… not good. As I mentioned in the last post, some of this is due to the taxonomic nature of the dataset: many of these species are in the same genus, and even a human would be likely to mix them up.
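One way to check that hypothesis would be to take the pairs from fastai's `interp.most_confused()` and count how many share a genus. A minimal sketch of that bookkeeping, assuming `(actual, predicted, count)` tuples like `most_confused()` returns — the species names here are made up, not the actual folder names in the dataset:

```python
# Count how many of the most-confused prediction pairs are within-genus.
# Tuples mimic the (actual, predicted, count) output of interp.most_confused().
confused = [
    ("pinus_strobus", "pinus_resinosa", 7),
    ("quercus_alba", "quercus_rubra", 5),
    ("acer_rubrum", "quercus_alba", 2),
]

def genus(label):
    # The genus is the first word of the binomial folder name.
    return label.split("_")[0]

same_genus = sum(n for actual, pred, n in confused if genus(actual) == genus(pred))
total = sum(n for _, _, n in confused)
print(f"{same_genus}/{total} confusions are within-genus")  # 12/14 confusions are within-genus
```

If most of the mass sits in within-genus pairs, that would support the idea that the model is failing at species resolution rather than at the broader category.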

At this point, a good next step is obvious to me: use more than one label for each category. The model mixing up conifers with each other suggests it has already learned structure for what a conifer broadly looks like; it just needs to be told that that's a relevant category. From there, it might be able to offer best guesses.
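Since the genus is just the first word of the binomial name, a second, coarser label can be derived mechanically from the species label. A minimal sketch of that mapping — the species name here is illustrative, not necessarily a folder name in the dataset:

```python
# Sketch: derive a genus-level label alongside each species label, so the
# model can be trained with multi-label targets.

def species_to_labels(species):
    """Return both the species label and its genus label."""
    genus = species.split("_")[0]
    return [species, genus]

labels = species_to_labels("pinus_strobus")
# fastai v1 accepts space- or delimiter-separated multi-labels via label_delim,
# e.g. ImageDataBunch.from_df(..., label_delim=" ")
print(" ".join(labels))  # pinus_strobus pinus
```

With both labels in play, a prediction that gets the genus right but the species wrong is no longer a total miss, and the genus probability can serve as the "best guess" when the species-level prediction is uncertain.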

With that target in mind, I will see you soon. Early Days!