Tree Identification – The Data Has Been Doubled!

Because the most obvious means of improving a model is to give it more data to learn from, and we have a tool that can do this easily enough, I want to see what kind of performance increase we might see from ~doubling the size of our set. Here, we expand the dataset to 75 categories with ~200 images each. The size of this raw set is ~15,000 images. Much better!
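As a quick sanity check on those numbers, counting files per class folder is enough. A minimal sketch (the path matches the cell further down):

from pathlib import Path

IMG_PATH = Path("data/bark_75_categories")
# One subfolder per class; count whatever the scraper dropped into each
counts = {d.name: len(list(d.glob("*"))) for d in IMG_PATH.iterdir() if d.is_dir()}
print(f"{len(counts)} classes, {sum(counts.values())} images")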

In [1]:
%matplotlib inline
%reload_ext autoreload
%autoreload 2
In [2]:
from fastai.vision import *
from fastai.datasets import *
from fastai.widgets import *

from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
In [3]:
bs = 16
initial_dims = 224
workers = 2
valid = 0.2
In [4]:
IMG_PATH = Path("data/bark_75_categories")
In [5]:
data = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=valid,
                                 ds_tfms=get_transforms(), bs=bs, size=initial_dims,
                                  num_workers=workers).normalize(imagenet_stats)
In [6]:
data.show_batch(rows=4, figsize=(10, 10))
In [7]:
learn = cnn_learner(data, models.resnet50, metrics=error_rate)
In [8]:
learn.fit_one_cycle(5)
epoch train_loss valid_loss error_rate time
0 4.529002 4.010970 0.875541 03:15
1 3.810512 3.475730 0.826479 03:14
2 3.470911 3.218662 0.792929 03:15
3 3.190804 3.085810 0.769841 03:17
4 2.996430 3.059711 0.763348 03:15
In [9]:
learn.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

Looking at this plot, I’m already unexcited about the prospect of seeing improvements here. The gradient is too flat; the model isn’t really able to tell a lot of these images apart. It’s like an eye exam where the doctor keeps asking which of two seemingly identical lenses is better.

In [10]:
learn.recorder.plot()

Here we can see numerically that doubling the data hasn’t done much to improve things. The losses and the error rate are still quite high.
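If you’d rather not eyeball the curve, fastai v1’s recorder can also mark a suggestion for you. A small sketch, not run here:

# suggestion=True marks the LR with the steepest (most negative) gradient
learn.recorder.plot(suggestion=True)
min_grad_lr = learn.recorder.min_grad_lr  # populated once the suggestion has been plotted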

In [11]:
learn.unfreeze()
learn.fit_one_cycle(5, max_lr=1e-4)
epoch train_loss valid_loss error_rate time
0 3.052122 3.101303 0.777417 03:14
1 3.245432 3.101461 0.765873 03:14
2 2.884305 2.891907 0.738456 03:15
3 2.567502 2.761083 0.701659 03:14
4 2.345342 2.728074 0.695887 03:15
In [12]:
learn.save("cpi-0.0_75-categories-1")
In [13]:
data_larger = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=valid,
                                 ds_tfms=get_transforms(), bs=bs, size=initial_dims*2,
                                  num_workers=workers).normalize(imagenet_stats)

Still, it won’t take long to do the progressive upscaling for the sake of comparison.

In [14]:
learn_larger = cnn_learner(data_larger, models.resnet50, metrics=error_rate)
In [ ]:
learn_larger.load("cpi-0.0_75-categories-1")
In [16]:
#learn_larger.fit_one_cycle(4, max_lr=1e-4)
learn_larger.fit_one_cycle(5)
epoch train_loss valid_loss error_rate time
0 2.869457 2.432699 0.651154 04:49
1 2.812652 2.453286 0.652237 04:44
2 2.622365 2.370909 0.632035 04:45
3 2.446855 2.301858 0.615079 04:49
4 2.271797 2.282915 0.616162 04:44
In [18]:
learn_larger.save("cpi-0.0_75-categories-1b")
In [19]:
learn_larger.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
In [20]:
learn_larger.recorder.plot()

The model is doing its best, but it is still having a lot of trouble beating a ~60% error rate.

In [21]:
learn_larger.fit_one_cycle(4, max_lr=slice(1e-5, 1e-4))
epoch train_loss valid_loss error_rate time
0 2.264575 2.272218 0.612554 04:44
1 2.260219 2.268570 0.615440 04:45
2 2.219465 2.270867 0.615801 04:45
3 2.199016 2.264873 0.609307 04:49
In [22]:
interp = ClassificationInterpretation.from_learner(learn_larger)

It is important to keep in perspective that we are dealing with a dataset with 75 classes. Random chance would yield an error rate of ~98.667%, so the model is picking up on some important patterns. Here, we can see the confusion matrix in all its toddling glory.
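For the record, that chance baseline is just one minus one over the number of classes:

n_classes = 75
print(f"chance error rate: {1 - 1 / n_classes:.3%}")  # -> 98.667%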

In [23]:
interp.plot_confusion_matrix(figsize=(24, 24), dpi=60)

A Higher Perspective

Doubling the number of images did not improve results in a meaningful way. There is a chance that I am drastically underestimating the number of images required here, but ~15,000 images is enough that I don’t want to get 5X-10X more without seeing what can be done with this existing set. Let’s try zooming out a bit.

Up until now, we’ve been looking at classification at the species level, and this has a lot of issues. A lot of species can hybridize, and even those that don’t are often similar enough to be confused in a casual inspection. Next step: we can bypass a lot of these issues by regrouping the data into taxonomic orders instead of species. There are a lot of explanations for a model that gets cherry tree species mixed up, but not quite as many for one that confuses cherry for pine trees. Let’s do this!

(Pardon the naming and numbering weirdness; I’m stitching a few notebooks together in editing)

Classifying by Taxonomic Order – What Does the Dataset Look Like?

The original dataset was ~15,000 images spread across 75 classes. That is a lot of images, but few enough classes that the regrouping can be done manually with a little patience. The Metroparks checklist provides just enough taxonomic information to go on, and the species I downloaded fall under 10 orders. The resultant classes are unfortunately unbalanced, and I’ll say more about that later.

In merging classes from the original set, I noticed that there were a large number of redundant images in some of the orders. Given that we’re getting these images from a somewhat blind search on Google, this was to be expected. In all, the new dataset features ~12,000 images across 10 categories, meaning we lost something like 2,000-3,000 images worth of redundant or mislabelled data.
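The regrouping itself is just file shuffling, something along the lines of the sketch below. The species-to-order mapping and the output path are hypothetical stand-ins, and a byte-level hash like this only catches exact duplicates:

import hashlib, shutil
from pathlib import Path

SRC = Path("data/bark_75_categories")
DST = Path("data/bark_10_orders")  # hypothetical output path

# Hypothetical species -> order mapping assembled from the Metroparks checklist
SPECIES_TO_ORDER = {
    "black_cherry_bark": "rosales",
    "red_maple_bark": "sapindales",
    # ... one entry per species folder
}

seen = set()
for species_dir in SRC.iterdir():
    order = SPECIES_TO_ORDER.get(species_dir.name)
    if order is None:
        continue
    out = DST / order
    out.mkdir(parents=True, exist_ok=True)
    for img in species_dir.glob("*"):
        digest = hashlib.md5(img.read_bytes()).hexdigest()
        if digest in seen:  # skip byte-identical images already copied
            continue
        seen.add(digest)
        shutil.copy(img, out / f"{species_dir.name}_{img.name}")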


In [6]:
data = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=valid,
                                 ds_tfms=get_transforms(), bs=bs, size=initial_dims,
                                  num_workers=workers).normalize(imagenet_stats)
In [7]:
data.show_batch(rows=4, figsize=(10, 10))
In [8]:
learn = cnn_learner(data, models.resnet50, metrics=error_rate)

Is this encouraging? It seems to be learning faster, but we are also dealing with a dataset that has far fewer categories.

In [9]:
learn.fit_one_cycle(5)
epoch train_loss valid_loss error_rate time
0 2.293406 1.954188 0.609133 02:48
1 1.983944 1.719169 0.573104 02:49
2 1.779403 1.547382 0.510683 02:47
3 1.643190 1.500176 0.497696 02:49
4 1.535627 1.479375 0.488060 02:49
In [10]:
learn.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

The similarly flat learning rate plot is giving me a weird feeling about this, too, but we’ll see where it goes.

In [11]:
learn.recorder.plot()
In [13]:
learn.unfreeze()
learn.fit_one_cycle(5, max_lr=slice(1e-5, 5e-5))
epoch train_loss valid_loss error_rate time
0 1.510465 1.453368 0.472979 02:49
1 1.498507 1.409679 0.457478 02:50
2 1.375820 1.350060 0.441558 02:50
3 1.291470 1.344369 0.440721 02:50
4 1.211504 1.334942 0.436531 02:49
In [14]:
learn.save("cpi-0.0-orders-1")
In [15]:
data_larger = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=valid,
                                 ds_tfms=get_transforms(), bs=bs, size=initial_dims*2,
                                  num_workers=workers).normalize(imagenet_stats)
In [16]:
learn_larger = cnn_learner(data_larger, models.resnet50, metrics=error_rate)
In [ ]:
learn_larger.load("cpi-0.0-orders-1")

We see the upscaled version starts off with an error rate similar to the above and doesn’t train as quickly.

In [18]:
learn_larger.unfreeze()
learn_larger.fit_one_cycle(5)
epoch train_loss valid_loss error_rate time
0 1.839270 1.841498 0.596146 04:18
1 2.060362 2.006438 0.692920 04:13
2 1.845123 1.747595 0.587348 04:15
3 1.692970 1.582104 0.516129 04:15
4 1.557250 1.513704 0.513615 04:18
In [19]:
learn_larger.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

But we are seeing some improvement here.

In [20]:
learn_larger.recorder.plot()
In [23]:
learn_larger.fit_one_cycle(5, max_lr=slice(2e-5, 1e-4))
epoch train_loss valid_loss error_rate time
0 1.526821 1.504949 0.497277 04:14
1 1.444442 1.469969 0.486804 04:12
2 1.433790 1.434834 0.477168 04:16
3 1.410849 1.431510 0.462505 04:14
4 1.379590 1.422813 0.459573 04:20

Uh… Let’s do that again.

In [27]:
learn_larger.fit_one_cycle(5, max_lr=slice(2e-5, 1e-4))
epoch train_loss valid_loss error_rate time
0 1.379468 1.441278 0.467114 04:13
1 1.403144 1.421605 0.467114 04:16
2 1.363362 1.403969 0.458735 04:19
3 1.260465 1.386433 0.454964 04:16
4 1.268007 1.363443 0.448680 04:22

Alright, I think we’re coming up on a plateau. Let’s do the honors.

In [28]:
interp = ClassificationInterpretation.from_learner(learn_larger)

I think this is a good illustration of where the baseline confusion matrix falls down. Our classes are imbalanced enough that the visual of a solid, evenly intense diagonal doesn’t tell the whole story. There are just two big classes, and they dominate the visual impression of the chart.

In [29]:
interp.plot_confusion_matrix(figsize=(24, 24), dpi=60)
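One mitigation worth noting: plot_confusion_matrix accepts a normalize flag, which rescales each row by the true-class count so the two big classes don’t drown out the rest. A one-line sketch:

# Row-normalized view: each cell shows a fraction of its true class,
# so small classes read on the same scale as the big ones
interp.plot_confusion_matrix(figsize=(24, 24), dpi=60, normalize=True)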

Results

Alright: on a dataset with 10 plant classes and ~12,000 images, a ResNet50 model like this gives us ~55% accuracy. Clearly, there is some room for improvement.

But! I did some nosing around and found “How we beat the FastAI leaderboard score by +19.77%…a synergy of new deep learning techniques for your consideration.” I was especially interested in its discussion of the ImageWoof dataset, which concerns the classification of dog breeds. It also has about 10 classes and ~12,000 images, and good performance on it was also on the order of 55% accuracy (or at least, it was before that article came out).

Additionally, dog breeds are kind of… “Meant to be distinguishable” is not the right term, but certainly a lot more work went into them being distinct than serviceberry trees!

Next Steps

If nothing else, trying to classify by order instead of species has given us a lot of information about the difficulty of the problem at hand. Immediate next steps will be to examine methods of dealing with class imbalance, but I want to do more thinking about what can be done at the species level. The error rate was much higher at that level of classification, but so was the specificity, and I think that’s worth exploring. Early Days!

Tree Identification with ResNet50 – A Quick Update

This is just a quick update following the same format as the last post, but with some minor modifications. In summary, the dataset has been expanded to include 66 of the tree species in the Cleveland Metroparks checklists, and the model is now ResNet50 instead of ResNet34.

In [1]:
%matplotlib inline
%reload_ext autoreload
%autoreload 2
In [2]:
from fastai.vision import *
from fastai.datasets import *
from fastai.widgets import *

from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

Having a handful of these parameters gathered here to tweak is useful, and it’s a format I’m probably going to stick to for small experiments in notebooks like this.

In [3]:
bs = 16
initial_dims = 224
workers = 2
valid = 0.2
In [4]:
IMG_PATH = Path("data/bark_66_categories")
In [5]:
data = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=valid,
                                 ds_tfms=get_transforms(), bs=bs, size=initial_dims,
                                  num_workers=workers).normalize(imagenet_stats)
In [6]:
data.show_batch(rows=4, figsize=(10, 10))
In [7]:
learn = cnn_learner(data, models.resnet50, metrics=error_rate)

Here, we can already see some modest improvements from using a model with larger capacity. We’ll train on the larger images immediately after this to see just how much more performance we can get out of this model.

In [8]:
learn.fit_one_cycle(4)
epoch train_loss valid_loss error_rate time
0 4.757617 4.026068 0.869828 01:18
1 4.145549 3.616853 0.832759 01:19
2 3.440784 3.219274 0.793103 01:19
3 3.002847 3.121537 0.772414 01:18
In [9]:
learn.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
In [10]:
learn.recorder.plot()
In [11]:
learn.unfreeze()
learn.fit_one_cycle(4, max_lr=1e-4)
epoch train_loss valid_loss error_rate time
0 3.026054 3.164094 0.773276 01:19
1 3.100843 3.155196 0.770690 01:20
2 2.633893 2.938268 0.725000 01:18
3 2.292624 2.934356 0.718966 01:18
In [12]:
learn.save("cpi-0.0_66-categories_resnet50-1")
In [13]:
data_larger = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=valid,
                                 ds_tfms=get_transforms(), bs=bs, size=initial_dims*2,
                                  num_workers=workers).normalize(imagenet_stats)
In [16]:
learn_larger = cnn_learner(data_larger, models.resnet50, metrics=error_rate)
In [18]:
learn_larger.load("cpi-0.0_66-categories_resnet50-1")
In [19]:
learn_larger.fit_one_cycle(4)
epoch train_loss valid_loss error_rate time
0 3.055164 2.551959 0.679310 01:54
1 2.890265 2.595395 0.670690 01:52
2 2.633454 2.443143 0.643103 01:53
3 2.223217 2.389971 0.634483 01:54
In [20]:
interp = ClassificationInterpretation.from_learner(learn_larger)
In [21]:
interp.plot_confusion_matrix(figsize=(24, 24), dpi=60)

Saving this, because we might be able to use it in the future.

In [22]:
learn.save("cpi-0.0_66-categories_resnet50-2")

Next Steps

So, we can do it. We can double the number of classes and, even though each has fewer than 100 images, still get modestly successful results. Looking at the error rate above, though, it’s still… not good. I already mentioned in the last post that some of this is due to the taxonomic nature of the dataset. That is, many of these species are in the same genus, and even a human would be likely to get them mixed up.

At this point, a good next step is obvious to me: use more than one label for each category. The model getting conifers mixed up with each other suggests it has already built up some structure for what a conifer broadly looks like; it just needs to be told that that’s a relevant category. From there, it might be able to offer best guesses.
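For reference, here is roughly what that would look like with fastai v1’s data block API, assuming a hypothetical labels.csv mapping each file to space-delimited tags (species, genus, “conifer,” and so on). A sketch, not something I have run yet:

# Hypothetical labels.csv, e.g.:
#   fname,tags
#   pines/img001.jpg,eastern_white_pine pinales conifer
src = (ImageList.from_csv(IMG_PATH, "labels.csv", folder=".")
       .split_by_rand_pct(valid)
       .label_from_df(label_delim=" "))  # space-delimited tags -> multi-label
data_multi = (src.transform(get_transforms(), size=initial_dims)
              .databunch(bs=bs, num_workers=workers)
              .normalize(imagenet_stats))
# Multi-label problems use a thresholded accuracy instead of plain error_rate
learn_multi = cnn_learner(data_multi, models.resnet50, metrics=accuracy_thresh)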

With that target in mind, I will see you soon. Early Days!

Tree Identification – Getting Back in the Saddle

A couple of days ago, I used hws to download a very preliminary dataset. The true aim of this project is to identify some local trees by their bark, but today I’m going to use it to illustrate the importance of dataset curation and one limitation of automated tools: web scrapers are dumb. Even if an image is incorrectly tagged by a person or ranked weirdly by a search algorithm, it will still be downloaded into your raw dataset and will need to be weeded out by a person.

I noticed early on that, when searching for tree bark, Google returned a lot of accurate results at first, but then very quickly moved on to images that were probably of the same kind of tree, but featuring leaves, berries, and picnicking families. A lot had already been downloaded, and I needed to get back into the swing of fastai anyway, so it made sense to see what would happen if I just threw all the data at a model to get a baseline.

To get a sense of the species included in this dataset, I found checklists of the local foliage on the Cleveland Metroparks site, then pared down to just the first page of trees that are common or occasional and don’t have numerous hybridizations. In all, this yielded about 2,500 images across 31 categories. That is to say, my first run captured fewer than 100 images per species, which might have produced workable results if every image had actually been of bark, but was almost certainly going to fail with so many different features to learn.
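Before any modeling, one low-effort cleanup pass is fastai’s verify_images, which deletes files that can’t be opened at all (it does nothing about mislabeled images, of course). A sketch, using the IMG_PATH defined below:

# Remove corrupt or unreadable downloads from each class folder
for folder in IMG_PATH.iterdir():
    if folder.is_dir():
        verify_images(folder, delete=True)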

To get some housekeeping out of the way, we use the standard imports.

In [1]:
%matplotlib inline
%reload_ext autoreload
%autoreload 2
In [2]:
from fastai.vision import *
from fastai.datasets import *
from fastai.widgets import *

from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
In [3]:
IMG_PATH = Path("data/images")

Because the dataset in question was just downloaded into folders, we’ll be getting an ImageDataBunch using from_folder. From experience, this will probably change to handle pandas DataFrames or CSV annotations as the dataset grows. If you’re following along, keep an eye on size, bs, and num_workers. size comes up again later because we want to squeeze as much as we can out of this data, and retraining a model on scaled-up images is a clever trick for that. bs and num_workers might have to be tuned down to fit hardware limitations.

In [4]:
data = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=0.2,
                                 ds_tfms=get_transforms(), bs=16, size=224, num_workers=4).normalize(imagenet_stats)
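For reference, the DataFrame variant alluded to above would look something like this; the labels.csv file and its column names are assumptions:

# Hypothetical annotations file: one row per image, columns "name" and "label"
df = pd.read_csv(IMG_PATH / "labels.csv")
data = ImageDataBunch.from_df(IMG_PATH, df, fn_col="name", label_col="label",
                              valid_pct=0.2, ds_tfms=get_transforms(),
                              bs=16, size=224, num_workers=4).normalize(imagenet_stats)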

This next line makes it a bit more apparent that the dataset needs to be curated. show_batch() takes a random sample of images; you might see a lot of bark, and you might not.

In [6]:
data.show_batch(rows=4, figsize=(10, 10))

Pressing ahead, we’re just using ResNet34.

In [7]:
learn = cnn_learner(data, models.resnet34, metrics=error_rate)

Here we see that the results leave a lot to be desired. An error rate of ~80% beats the ~97% you’d get guessing at random across 31 classes, but not by much.

In [8]:
learn.fit_one_cycle(4)
epoch train_loss valid_loss error_rate time
0 4.448251 3.510020 0.870201 00:20
1 3.843391 3.168392 0.811700 00:20
2 3.177452 2.997584 0.802559 00:20
3 2.746491 2.947996 0.804388 00:20

But! We’re going to save it and try training on slightly larger images, anyway, just for fun.

In [9]:
learn.save("cpi-0.0_ueg-1")
learn.unfreeze()
learn.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

Intuitively, I can see from this plot that the gradient is too shallow to get much more out of this model.

In [10]:
learn.recorder.plot()
In [20]:
data_336 = ImageDataBunch.from_folder(IMG_PATH, train=".", valid_pct=0.2,
                                          ds_tfms=get_transforms(), bs=16, size=336, num_workers=1).normalize(imagenet_stats)
In [21]:
learn_336 = cnn_learner(data_336, models.resnet34, metrics=error_rate)
In [ ]:
learn_336.load("cpi-0.0_ueg-1")
In [23]:
learn_336.unfreeze()
learn_336.lr_find()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
In [24]:
learn_336.recorder.plot()

And here is why I wanted to highlight this technique. The error rate is down roughly 20 percentage points (from ~80% to ~60%) just from using the larger images.

In [25]:
learn_336.fit_one_cycle(4, max_lr=1e-5)
epoch train_loss valid_loss error_rate time
0 2.877044 2.116479 0.606947 01:18
1 2.843391 2.076565 0.595978 01:18
2 2.735655 2.080849 0.605119 01:19
3 2.642251 2.076464 0.597806 01:18

Because it is sometimes difficult to figure out what the results mean from the error rate alone, and it’s helpful to see exactly what is being miscategorized, let’s take a look at a confusion matrix.

In [26]:
interp = ClassificationInterpretation.from_learner(learn_336)

SURPRISE!

In [27]:
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
In [28]:
learn_336.save("cpi-0.0_ueg-2")

In spite of the small size of the dataset, the model could actually stand to be a lot worse. We see a nice string of correct classifications along the diagonal. What about the deviations? Comparing some of the most prominent misclassifications, e.g. the persistent difficulty in telling pin cherry from sweet cherry, a person can understand how the mistakes were made.

Alright, this is fun. There are a couple of other tools that will let us take a closer look at what is going wrong here. most_confused() lets us list every pair of classes that were confused with each other more than once.

In [37]:
interp.most_confused(2, 10)
Out[37]:
[('pin_cherry_bark', 'sweet_cherry_bark', 6),
 ('red_maple_bark', 'silver_maple_bark', 6),
 ('radford_pear_bark', 'canadian_serviceberry_bark', 5),
 ('black_cherry_bark', 'sweet_cherry_bark', 4),
 ('sweet_cherry_bark', 'black_cherry_bark', 4),
 ('allegheny_serviceberry_bark', 'canadian_serviceberry_bark', 3),
 ('black_maple_bark', 'english_field_maple_bark', 3),
 ('black_maple_bark', 'norway_maple_bark', 3),
 ('black_tupelo_bark', 'silver_maple_bark', 3),
 ('boxelder_bark', 'white_ash_bark', 3),
 ('canadian_serviceberry_bark', 'common_serviceberry_bark', 3),
 ('eastern_redbud_bark', 'american_crabapple_bark', 3),
 ('garden_plum_bark', 'honeylocust_bark', 3),
 ('norway_maple_bark', 'white_ash_bark', 3),
 ('pumpkin_ash_bark', 'white_ash_bark', 3),
 ('red_maple_bark', 'sugar_maple_bark', 3),
 ('silver_maple_bark', 'sugar_maple_bark', 3),
 ('sour_cherry_bark', 'black_cherry_bark', 3),
 ('sour_cherry_bark', 'sweet_cherry_bark', 3),
 ('sugar_maple_bark', 'silver_maple_bark', 3),
 ('white_ash_bark', 'boxelder_bark', 3),
 ('ailanthus_bark', 'black_maple_bark', 2),
 ('ailanthus_bark', 'white_ash_bark', 2),
 ('allegheny_serviceberry_bark', 'common_serviceberry_bark', 2),
 ('black_maple_bark', 'silver_maple_bark', 2),
 ('black_tupelo_bark', 'flowering_dogwood_bark', 2),
 ('black_tupelo_bark', 'horse_chestnut_bark', 2),
 ('black_tupelo_bark', 'sugar_maple_bark', 2),
 ('boxelder_bark', 'norway_maple_bark', 2),
 ('canadian_serviceberry_bark', 'allegheny_serviceberry_bark', 2),
 ('common_serviceberry_bark', 'american_crabapple_bark', 2),
 ('common_serviceberry_bark', 'canadian_serviceberry_bark', 2),
 ('common_serviceberry_bark', 'sour_cherry_bark', 2),
 ('eastern_redbud_bark', 'red_maple_bark', 2),
 ('english_field_maple_bark', 'ailanthus_bark', 2),
 ('english_field_maple_bark', 'eastern_redbud_bark', 2),
 ('english_field_maple_bark', 'garden_plum_bark', 2),
 ('flowering_dogwood_bark', 'radford_pear_bark', 2),
 ('garden_plum_bark', 'american_crabapple_bark', 2),
 ('garden_plum_bark', 'sweet_cherry_bark', 2),
 ('green_ash_bark', 'pumpkin_ash_bark', 2),
 ('green_ash_bark', 'white_ash_bark', 2),
 ('honeylocust_bark', 'black_locust_bark', 2),
 ('honeylocust_bark', 'pin_cherry_bark', 2),
 ('horse_chestnut_bark', 'black_cherry_bark', 2),
 ('northern_catalpa_bark', 'red_horsechestnut_bark', 2),
 ('norway_maple_bark', 'black_maple_bark', 2),
 ('norway_maple_bark', 'sugar_maple_bark', 2),
 ('ohio_buckeye_bark', 'sweet_cherry_bark', 2),
 ('pin_cherry_bark', 'canadian_serviceberry_bark', 2),
 ('pin_cherry_bark', 'sour_cherry_bark', 2),
 ('pumpkin_ash_bark', 'green_ash_bark', 2),
 ('red_horsechestnut_bark', 'horse_chestnut_bark', 2),
 ('red_maple_bark', 'norway_maple_bark', 2),
 ('sour_cherry_bark', 'american_crabapple_bark', 2),
 ('sugar_maple_bark', 'red_maple_bark', 2),
 ('white_ash_bark', 'black_tupelo_bark', 2),
 ('white_ash_bark', 'pumpkin_ash_bark', 2),
 ('yellow_buckeye_bark', 'northern_catalpa_bark', 2)]

Finally, we can use plot_top_losses() to look at the most extreme outliers in detail. Interestingly, the losses here are enormous; the probability the model assigned to the correct class for each of these is extremely low.

In [40]:
interp.plot_top_losses(9, figsize=(20, 20))

Alright, so a lot of work needs to be done here, but I think we have the groundwork for an interesting, workable project. Some good possibilities for next steps might be:

  • Expanding the dataset to include bark from all of the trees listed.
  • Running an experiment focused on the trees that the checklist says hybridize easily, to compare results.
  • Manually paring down the existing dataset; misclassification aside, we can see from the above that the scraper captured images that simply do not belong in the set (see the sketch just after this list).
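On that last point, fastai ships a widget for exactly this kind of manual pass; it’s why fastai.widgets was imported up top. A sketch, assuming a trained learner like learn_336; note that ImageCleaner records delete/relabel decisions to a cleaned.csv in the given path rather than touching the files directly:

# Surface the highest-loss images for manual review in the notebook
ds, idxs = DatasetFormatter().from_toplosses(learn_336)
ImageCleaner(ds, idxs, IMG_PATH)  # writes decisions to IMG_PATH/'cleaned.csv'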

Early Days!