That moment I realized I'd been training AI models wrong for 2 years
I was messing with an image classifier for a side project, trying to get it to tell dogs from cats. I kept feeding it thousands of photos from my own camera roll, thinking more data was always better. Then a buddy who works at a big tech company looked at my training set and laughed. He pointed out that every single picture had my dog in the same spot on the couch, with the same lighting. I had basically built a model that recognized my living room, not dogs. That was 6 months ago. Now I actually hunt for diverse angles and backgrounds, even if it means fewer total pictures. Has anyone else made a similar mistake with their dataset?
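In case it helps anyone check their own set, here's a rough sketch of the kind of sanity check that would have caught my problem. It assumes you tag each photo by hand with a "scene" (couch, park, and so on). The file names, the scene tags, and the helper names `label_scene_counts` and `split_by_scene` are all made up for illustration and don't come from any library. The idea is to count how labels line up with scenes, then hold out whole scenes for testing so the model can't score well just by memorizing a background.

```python
import random
from collections import Counter, defaultdict


def label_scene_counts(samples):
    """Count how often each (label, scene) pair occurs.

    If one label only ever shows up in one scene, the model may learn
    the scene instead of the subject.
    """
    return Counter((label, scene) for _, label, scene in samples)


def split_by_scene(samples, test_fraction=0.25, seed=0):
    """Split (path, label, scene) tuples so no scene is in both sets."""
    by_scene = defaultdict(list)
    for sample in samples:
        by_scene[sample[2]].append(sample)

    # Shuffle scenes (not individual photos) with a fixed seed.
    scenes = sorted(by_scene)
    random.Random(seed).shuffle(scenes)

    # Always hold out at least one scene. With only one scene in total,
    # the training set ends up empty, which is itself a warning sign.
    n_test = max(1, round(len(scenes) * test_fraction))
    test_scenes = set(scenes[:n_test])

    train = [s for sc in scenes if sc not in test_scenes for s in by_scene[sc]]
    test = [s for sc in scenes if sc in test_scenes for s in by_scene[sc]]
    return train, test


# Hypothetical, hand-tagged examples: (file name, label, scene).
samples = [
    ("img001.jpg", "dog", "couch"),
    ("img002.jpg", "dog", "couch"),
    ("img003.jpg", "dog", "park"),
    ("img004.jpg", "cat", "windowsill"),
    ("img005.jpg", "cat", "couch"),
    ("img006.jpg", "cat", "garden"),
]

for (label, scene), count in sorted(label_scene_counts(samples).items()):
    print(f"{label:>3} / {scene}: {count}")

train, test = split_by_scene(samples)
print("held-out scenes:", sorted({scene for _, _, scene in test}))
```

If accuracy drops a lot on the held-out scenes compared with a plain random split, that's a hint the model is leaning on the background rather than the animal.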