Tree Identification – Getting Back in the Saddle

A couple of days ago, I used hws to download a very preliminary dataset. The true aim of this project is to identify some local trees by their bark, but today I’m going to use it as an illustration of the importance of dataset curation and of one limitation of automated tools: web scrapers are dumb. Even if an image is incorrectly tagged by a person, or ranked strangely by a search algorithm, it will still land in your raw dataset and will eventually need to be reviewed by a human.

I noticed early on that, when searching for tree bark, Google returned a lot of accurate results at first, but then very quickly moved on to images that were probably of the same kind of tree, but featuring leaves, berries, and picnicking families. A lot had already been downloaded, and I needed to get back into the swing of fastai anyway, so it made sense to see what would happen if I just threw all the data at a model to get a baseline.

To get a sense of the species included in this dataset, I found checklists of the local foliage on the Cleveland Metroparks site, then pared down to just the first page of trees that are common or occasional and don’t have numerous hybridizations. In all, this yielded about 2,500 images across 31 categories. That is to say, my first run captured fewer than 100 images per species, which might produce workable results if those images were all bark, but was almost certainly going to fail when learning so many different features.

To get some housekeeping out of the way, we use the standard imports.

In [1]:
%matplotlib inline
%reload_ext autoreload
%autoreload 2
In [2]:
from fastai.vision import *
from fastai.datasets import *
from fastai.widgets import *

from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
In [3]:
IMG_PATH = Path("data/images")

Because the dataset in question was just downloaded into folders, we’ll be getting an ImageDataBunch using from_folder. From experience, this will probably change to handle pandas DataFrames or CSV annotations as the dataset grows (see the sketch after the next cell). If you’re following along, keep an eye on size, bs, and num_workers. size comes in later because we want to squeeze as much as we can out of this data, and retraining a model on scaled-up images is a clever trick for that. bs and num_workers might have to be tuned down to fit your hardware.

In [4]:
data = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=0.2,
                                 ds_tfms=get_transforms(), bs=16, size=224, num_workers=4).normalize(imagenet_stats)
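
As a hedged sketch of that future direction: if the labels later move into a CSV (the labels.csv filename and its filename/label column layout are assumptions for illustration, not part of the current project), fastai v1 can build an equivalent bunch with from_csv:

# Hypothetical labels.csv in IMG_PATH with (filename, species) columns;
# same transforms and normalization as the from_folder call above.
data = ImageDataBunch.from_csv(IMG_PATH, csv_labels="labels.csv",
                               valid_pct=0.2, ds_tfms=get_transforms(),
                               bs=16, size=224, num_workers=4).normalize(imagenet_stats)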

This next line makes it a bit more apparent that the dataset needs to be curated. show_batch() takes a random sample of images; you might see a lot of bark, and you might not.

In [6]:
data.show_batch(rows=4, figsize=(10, 10))
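
As a quick sanity check (a sketch, not part of the original run), the bunch should have derived one class per species folder; we’re expecting 31:

# Confirm the folder-derived labels match the 31-species checklist.
print(data.c, len(data.classes))
print(data.classes[:5])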

Pressing ahead, we’re just using ResNet34.

In [7]:
learn = cnn_learner(data, models.resnet34, metrics=error_rate)

Here we see that the results leave a lot to be desired. With 31 roughly balanced classes, random guessing would sit near a 97% error rate, so errors on the order of 80% are only a modest improvement over chance.

In [8]:
learn.fit_one_cycle(4)
epoch  train_loss  valid_loss  error_rate  time
0      4.448251    3.510020    0.870201    00:20
1      3.843391    3.168392    0.811700    00:20
2      3.177452    2.997584    0.802559    00:20
3      2.746491    2.947996    0.804388    00:20

But! We’re going to save the model and try training on slightly larger images anyway, just for fun.

In [9]:
learn.save("cpi-0.0_ueg-1")
learn.unfreeze()
learn.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

Looking at this plot, the slope of the loss curve is so shallow that no choice of learning rate is going to get much more out of this model at this image size.

In [10]:
learn.recorder.plot()
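As a hedged aside, assuming a recent fastai v1 release where Recorder.plot accepts a suggestion argument, the library can mark the point of steepest descent for you:

# Annotate the LR finder plot with a suggested learning rate,
# then read it back off the recorder.
learn.recorder.plot(suggestion=True)
suggested_lr = learn.recorder.min_grad_lr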
In [20]:
data_336 = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=0.2,
                                          ds_tfms=get_transforms(), bs=16, size=336, num_workers=1).normalize(imagenet_stats)
In [21]:
learn_336 = cnn_learner(data_336, models.resnet34, metrics=error_rate)
In [ ]:
learn_336.load("cpi-0.0_ueg-1")
In [23]:
learn_336.unfreeze()
learn_336.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
In [24]:
learn_336.recorder.plot()

And here is why I wanted to highlight this technique: the error rate drops by roughly 20 percentage points just by retraining on the larger images.

In [25]:
learn_336.fit_one_cycle(4, max_lr=1e-5)
epoch  train_loss  valid_loss  error_rate  time
0      2.877044    2.116479    0.606947    01:18
1      2.843391    2.076565    0.595978    01:18
2      2.735655    2.080849    0.605119    01:19
3      2.642251    2.076464    0.597806    01:18

Because it is sometimes difficult to figure out what the results mean from the error rate alone, and it’s helpful to see exactly what is being miscategorized, let’s take a look at a confusion matrix.

In [26]:
interp = ClassificationInterpretation.from_learner(learn_336)

SURPRISE!

In [27]:
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
In [28]:
learn_336.save("cpi-0.0_ueg-2")

In spite of the small size of the dataset, the model could be a lot worse. We see a nice string of correct classifications along the diagonal. What about the deviations? Comparing some of the most prominent misclassifications, e.g. the model’s strong difficulty in telling pin cherry from sweet cherry, one can understand how the mistakes were made.

Alright, this is fun. There are a couple of other tools that will let us take a closer look at what is going wrong here. most_confused(2, 10) lists every pair of classes that was confused at least twice; the first argument is that minimum count.

In [37]:
interp.most_confused(2, 10)
Out[37]:
[('pin_cherry_bark', 'sweet_cherry_bark', 6),
 ('red_maple_bark', 'silver_maple_bark', 6),
 ('radford_pear_bark', 'canadian_serviceberry_bark', 5),
 ('black_cherry_bark', 'sweet_cherry_bark', 4),
 ('sweet_cherry_bark', 'black_cherry_bark', 4),
 ('allegheny_serviceberry_bark', 'canadian_serviceberry_bark', 3),
 ('black_maple_bark', 'english_field_maple_bark', 3),
 ('black_maple_bark', 'norway_maple_bark', 3),
 ('black_tupelo_bark', 'silver_maple_bark', 3),
 ('boxelder_bark', 'white_ash_bark', 3),
 ('canadian_serviceberry_bark', 'common_serviceberry_bark', 3),
 ('eastern_redbud_bark', 'american_crabapple_bark', 3),
 ('garden_plum_bark', 'honeylocust_bark', 3),
 ('norway_maple_bark', 'white_ash_bark', 3),
 ('pumpkin_ash_bark', 'white_ash_bark', 3),
 ('red_maple_bark', 'sugar_maple_bark', 3),
 ('silver_maple_bark', 'sugar_maple_bark', 3),
 ('sour_cherry_bark', 'black_cherry_bark', 3),
 ('sour_cherry_bark', 'sweet_cherry_bark', 3),
 ('sugar_maple_bark', 'silver_maple_bark', 3),
 ('white_ash_bark', 'boxelder_bark', 3),
 ('ailanthus_bark', 'black_maple_bark', 2),
 ('ailanthus_bark', 'white_ash_bark', 2),
 ('allegheny_serviceberry_bark', 'common_serviceberry_bark', 2),
 ('black_maple_bark', 'silver_maple_bark', 2),
 ('black_tupelo_bark', 'flowering_dogwood_bark', 2),
 ('black_tupelo_bark', 'horse_chestnut_bark', 2),
 ('black_tupelo_bark', 'sugar_maple_bark', 2),
 ('boxelder_bark', 'norway_maple_bark', 2),
 ('canadian_serviceberry_bark', 'allegheny_serviceberry_bark', 2),
 ('common_serviceberry_bark', 'american_crabapple_bark', 2),
 ('common_serviceberry_bark', 'canadian_serviceberry_bark', 2),
 ('common_serviceberry_bark', 'sour_cherry_bark', 2),
 ('eastern_redbud_bark', 'red_maple_bark', 2),
 ('english_field_maple_bark', 'ailanthus_bark', 2),
 ('english_field_maple_bark', 'eastern_redbud_bark', 2),
 ('english_field_maple_bark', 'garden_plum_bark', 2),
 ('flowering_dogwood_bark', 'radford_pear_bark', 2),
 ('garden_plum_bark', 'american_crabapple_bark', 2),
 ('garden_plum_bark', 'sweet_cherry_bark', 2),
 ('green_ash_bark', 'pumpkin_ash_bark', 2),
 ('green_ash_bark', 'white_ash_bark', 2),
 ('honeylocust_bark', 'black_locust_bark', 2),
 ('honeylocust_bark', 'pin_cherry_bark', 2),
 ('horse_chestnut_bark', 'black_cherry_bark', 2),
 ('northern_catalpa_bark', 'red_horsechestnut_bark', 2),
 ('norway_maple_bark', 'black_maple_bark', 2),
 ('norway_maple_bark', 'sugar_maple_bark', 2),
 ('ohio_buckeye_bark', 'sweet_cherry_bark', 2),
 ('pin_cherry_bark', 'canadian_serviceberry_bark', 2),
 ('pin_cherry_bark', 'sour_cherry_bark', 2),
 ('pumpkin_ash_bark', 'green_ash_bark', 2),
 ('red_horsechestnut_bark', 'horse_chestnut_bark', 2),
 ('red_maple_bark', 'norway_maple_bark', 2),
 ('sour_cherry_bark', 'american_crabapple_bark', 2),
 ('sugar_maple_bark', 'red_maple_bark', 2),
 ('white_ash_bark', 'black_tupelo_bark', 2),
 ('white_ash_bark', 'pumpkin_ash_bark', 2),
 ('yellow_buckeye_bark', 'northern_catalpa_bark', 2)]

Finally, we can use plot_top_losses() to look at the most extreme outliers in detail. Interestingly, the losses here are enormous; in each case, the probability the model assigned to the correct class is extremely low.

In [40]:
interp.plot_top_losses(9, figsize=(20, 20))
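
A related hedged sketch, assuming a fastai v1 version where plot_top_losses accepts a heatmap argument: overlaying a Grad-CAM-style heatmap shows which patch of each image drove the loss, which helps spot photos that contain no bark at all.

# Same top-loss grid, with an activation heatmap over each image.
interp.plot_top_losses(9, figsize=(20, 20), heatmap=True)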

Alright, so a lot of work needs to be done here, but I think we have the groundwork for an interesting, workable project. Some good possibilities for next steps might be:

  • Expand the dataset to include bark from all trees listed.
  • Run an experiment focusing on the trees the checklist says hybridize easily, and compare results.
  • Manually pare down the existing dataset; misclassification aside, we can see from the above that the scraper captured images that simply do not belong in the set (see the sketch below).
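
For that last item, fastai v1 already ships curation widgets (imported above via fastai.widgets). A minimal sketch, assuming a Jupyter environment; note that ImageCleaner records its decisions to a cleaned.csv rather than deleting files:

# Surface the images the model found hardest, then review them interactively.
ds, idxs = DatasetFormatter().from_toplosses(learn_336)
ImageCleaner(ds, idxs, IMG_PATH)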

Early Days!