Making New Image Datasets – Quick

A lot has changed since the last time I made a solid run at machine learning. Most importantly, the tools that I used to scrape images from Google have all broken. Every one of them, even the ones that I didn’t want to use in the first place but had verified were working at the time: command line tools that can be installed with pip, Firefox extensions, little JS scripts that have to be pasted into a browser’s console by hand, you name it.

I recognize a need in this space, so it is with great reluctance that I am uploading a very preliminary shot at a GUI scraper for anyone who might need one. It uses Selenium for the web driver and the Tkinter library for the GUI, with a bit of threading in there as well.
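For anyone curious how those pieces tend to fit together, here is a minimal sketch of the general pattern, not the code from the repo. I am assuming the threads are there to keep the window responsive while the scrape runs. The Selenium step is swapped out for a hypothetical stand-in, fake_fetch_image_urls, that just returns made-up URLs, and start_scrape and main are illustrative names of my own.

```python
import queue
import threading


def fake_fetch_image_urls(query, limit):
    """Hypothetical stand-in for the Selenium step; returns made-up URLs."""
    return [f"https://example.invalid/{query}/{i}.jpg" for i in range(limit)]


def start_scrape(query, limit, results, fetch=fake_fetch_image_urls):
    """Run fetch() on a worker thread and push each URL onto the results queue."""
    def worker():
        for url in fetch(query, limit):
            results.put(url)
        results.put(None)  # sentinel: this scrape is finished

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def main():
    import tkinter as tk  # imported here so the helpers above also work headless

    root = tk.Tk()
    root.title("Scraper sketch")
    entry = tk.Entry(root)
    entry.pack()
    listbox = tk.Listbox(root, width=60)
    listbox.pack()
    results = queue.Queue()

    def poll():
        # Drain the queue on the GUI thread; widgets are only touched here.
        try:
            while True:
                url = results.get_nowait()
                if url is None:
                    return  # sentinel seen, stop polling
                listbox.insert(tk.END, url)
        except queue.Empty:
            root.after(100, poll)  # nothing yet, check again in 100 ms

    def on_click():
        start_scrape(entry.get(), 5, results)
        poll()

    tk.Button(root, text="Scrape", command=on_click).pack()
    root.mainloop()
```

Calling main() opens the window. The worker thread never touches a widget; it only puts results on a queue, and the GUI thread picks them up with root.after, which is the usual way to keep Tkinter happy alongside threads.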

No lie, I’m not super happy with it right now. It is a bit slow, and admittedly everything is more byzantine than I would like. But guess what? Perfect is the enemy of done. The plan is to let anyone who wants to use it as-is do so, and to do a tear-down of the code here when it is a bit less… fresh. Anyway, the repo can be found here. Happy Halloween, everybody. Early Days.