-1

Just hit 10,000 lines of training data for my dumb chatbot project

I've been scraping old forum posts for months to train a simple AI to talk like a real person. I finally checked the count and it's over 10k lines. It's a huge mess of text files on my desktop. I thought I'd maybe have 2,000 by now. The weird part is the bot still sounds like a robot from 2010. Has anyone else put in a ton of work on a dataset and been shocked by how much you actually needed?
2 comments

faith_schmidt · 1mo ago · Top Commenter
Honestly, that's a tiny dataset. You need way more data.
6
ivan873 · 1mo ago
I get what you mean about needing more data, but in my experience sometimes a small, clean set can show you the pattern. Your mileage may vary, of course.
1