the_terry • 27d ago • Prolific Poster

That moment I realized I'd been training AI models wrong for 2 years

I was messing with an image classifier for a side project, trying to get it to tell dogs from cats. Kept feeding it thousands of photos from my own camera roll, thinking more data was always better. Then a buddy who works at a big tech company looked at my training set and laughed. He pointed out every single picture had my dog in the same spot on the couch with the same lighting. I had basically built a model that recognized my living room, not dogs. That was 6 months ago. Now I actually hunt for diverse angles and backgrounds, even if it means fewer total pictures. Has anyone else made a similar mistake with their dataset?

2 comments

uma_patel1926d ago

I see it a little differently, actually... I think the real lesson is how easy it is to trick yourself into thinking you're being thorough when you're really just repeating the same patterns. Two years is a long time to miss something that obvious, but I bet most of us have done some version of this. The couch lighting thing is funny because it shows how our brains fill in gaps for us, while the model just sees pixels. Good on you for catching it though, that's the kind of mistake that makes you a better builder in the long run.

grant_torres26d ago

Spent three months debugging a model that was just learning to detect the timestamp overlay in my training images instead of the actual objects.