This is a quick update following the same format as the last post, with some minor modifications. In summary, the dataset has been expanded to cover 66 of the tree species included in the Cleveland Metroparks checklists, and the model has been trained using ResNet50 instead of ResNet34.
%matplotlib inline
%reload_ext autoreload
%autoreload 2
from fastai.vision import *
from fastai.datasets import *
from fastai.widgets import *
from pathlib import Path
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
Having a handful of these parameters collected in one place to tweak is useful, and it’s a format I’m probably going to stick to for small experiments in notebooks like this.
bs = 16  # batch size
initial_dims = 224  # image side length in pixels
workers = 2  # DataLoader worker processes
valid = 0.2  # fraction of the data held out for validation
IMG_PATH = Path("data/bark_66_categories")
data = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=valid,
ds_tfms=get_transforms(), bs=bs, size=initial_dims,
num_workers=workers).normalize(imagenet_stats)
data.show_batch(rows=4, figsize=(10, 10))
learn = cnn_learner(data, models.resnet50, metrics=error_rate)
Here, we can already see some modest improvements from a model with larger capacity. We’ll train with larger images immediately after this first round to see just how much more performance we can get out of this model.
learn.fit_one_cycle(4)  # train the new head for four epochs
learn.lr_find()  # sweep learning rates to pick a value for fine-tuning
learn.recorder.plot()
learn.unfreeze()  # make the pretrained backbone trainable too
learn.fit_one_cycle(4, max_lr=1e-4)
learn.save("cpi-0.0_66-categories_resnet50-1")
# the same pipeline, but with images at twice the side length (448px)
data_larger = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=valid,
ds_tfms=get_transforms(), bs=bs, size=initial_dims*2,
num_workers=workers).normalize(imagenet_stats)
learn_larger = cnn_learner(data_larger, models.resnet50, metrics=error_rate)
learn_larger.load("cpi-0.0_66-categories_resnet50-1")  # resume from the 224px weights
learn_larger.fit_one_cycle(4)
interp = ClassificationInterpretation.from_learner(learn_larger)
interp.plot_confusion_matrix(figsize=(24, 24), dpi=60)
Saving this checkpoint, because we might be able to use it in the future.
learn.save("cpi-0.0_66-categories_resnet50-2")
Next Steps
So, we can do it. We can double the number of classes and, even though each class has fewer than 100 images, still get modestly successful results. Looking at the error rate above, though, it’s still… not good. As I mentioned in the last post, some of this is due to the taxonomic nature of the dataset: many of these species are in the same genus, and even a human would be likely to get them mixed up.
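A quick way to check that hunch is to list the class pairs the model confuses most often on the validation set; min_val=2 just filters out pairs that were only confused once.
# show (actual, predicted, count) for the most frequently confused class pairs
interp.most_confused(min_val=2)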
At this point, a good next step is obvious to me: use more than one label for each category. The model getting conifers mixed up with each other suggests it has already learned some of the structure of what a conifer broadly looks like; it just needs to be told that that’s a relevant category. From there, it might be able to offer best guesses, as in the sketch below.
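Here is a minimal sketch of that idea using fastai’s data block API, assuming the class folders are named genus_species (the hypothetical genus_and_species function below depends entirely on that naming convention, which isn’t something the code above guarantees):
from functools import partial

def genus_and_species(path):
    # hypothetical label function: derive both labels from a "genus_species" folder name
    name = path.parent.name
    return [name.split("_")[0], name]

data_multi = (ImageList.from_folder(IMG_PATH)
              .split_by_rand_pct(valid)
              .label_from_func(genus_and_species)  # a list of labels makes this multi-label
              .transform(get_transforms(), size=initial_dims)
              .databunch(bs=bs)
              .normalize(imagenet_stats))

# error_rate assumes one label per image, so swap in a thresholded accuracy
learn_multi = cnn_learner(data_multi, models.resnet50,
                          metrics=partial(accuracy_thresh, thresh=0.5))
Whether thresh=0.5 is the right cutoff would need experimenting, but the hope is that a genus label gives the model a coarser category to fall back on when the species-level call is wrong.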
With that target in mind, I will see you soon. Early Days!