Someone on here told me my old data cleaning scripts were a mess

They said I was basically just patching over the same problems every month instead of fixing the root cause. I switched to using a proper pipeline with PyTorch DataLoader and it cut my prep time in half. Anyone have a good method for handling missing time-series data in training sets?
2 comments

the_amy 1mo ago
Try a small generative model just to fill the gaps, not for the whole series.
7
jade639 1mo ago
My uncle ran a car shop and did the same thing with quick fixes instead of real repairs. It always cost more later. For missing time series data, I've had good luck with a simple forward fill, then adding a binary flag column to mark where the fill happened. The model learns to notice those spots. What's the time scale you're working with?
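In case it helps, here is a rough sketch of the forward fill plus flag idea using pandas. The column name `reading`, the daily index, and the helper `ffill_with_flag` are made up for illustration, so swap in whatever your data actually looks like.

```python
import numpy as np
import pandas as pd


def ffill_with_flag(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Forward-fill `col` and add a 0/1 column marking which rows were filled.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    missing = out[col].isna()          # remember the gaps before filling
    out[col] = out[col].ffill()        # carry the last observed value forward
    # Flag only rows that were missing AND actually got a value.
    # A gap at the very start has nothing to carry forward, so it stays NaN
    # and is not flagged.
    out[f"{col}_was_filled"] = (missing & out[col].notna()).astype("int8")
    return out


# Toy daily series with gaps (made-up values).
idx = pd.date_range("2024-01-01", periods=6, freq="D")
df = pd.DataFrame({"reading": [1.0, np.nan, np.nan, 4.0, np.nan, 6.0]}, index=idx)

filled = ffill_with_flag(df, "reading")
print(filled)
```

Both `reading` and `reading_was_filled` would then go in as input features. If the gaps can be long, `ffill(limit=...)` caps how far a stale value gets carried, and anything past the limit stays NaN for you to handle separately.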
4