Just hit 10,000 lines of training data for my dumb chatbot project
I've been scraping old forum posts for months to train a simple AI to talk like a real person. I finally checked the count and it's over 10k lines. It's a huge mess of text files on my desktop. I thought I'd maybe have 2,000 by now. The weird part is the bot still sounds like a robot from 2010. Has anyone else put in a ton of work on a dataset and been shocked by how much you actually needed?
2 comments
faith_schmidt · 1mo ago · Top Commenter
Honestly, that's a tiny dataset. You need way more data.
ivan873 · 1mo ago
I get what you mean about needing more data, but in my experience sometimes a small, clean set can show you the pattern. Your mileage may vary, of course.