Description: PyDeepFlow currently requires users to manually handle all data preprocessing, batching, and loading. This creates a poor user experience and limits the framework's usability. We need a unified data loading system similar to PyTorch's DataLoader that handles batching, shuffling, preprocessing, and data augmentation automatically.
Current Problems:
# Current: users must do everything manually
import numpy as np
X_train = np.array(...)  # Manual data loading
X_train = (X_train - mean) / std  # Manual normalization; mean and std are also computed by hand
# No batching, no shuffling, no augmentation support
model = Multi_Layer_ANN(X_train, y_train, ...)  # The entire dataset is passed at once
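For comparison, here is a minimal sketch of what a loader covering the batching, shuffling, and preprocessing described above might look like. The `DataLoader` class, its parameters (`batch_size`, `shuffle`, `transform`, `seed`), and the synthetic data are all assumptions made for illustration; none of this exists in PyDeepFlow today, and the real design may differ.

```python
import numpy as np


class DataLoader:
    """Hypothetical mini-batch loader; not part of PyDeepFlow's current API."""

    def __init__(self, X, y, batch_size=32, shuffle=True, transform=None, seed=None):
        if len(X) != len(y):
            raise ValueError("X and y must have the same number of samples")
        self.X = np.asarray(X)
        self.y = np.asarray(y)
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.transform = transform  # optional callable applied to each feature batch
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return -(-len(self.X) // self.batch_size)

    def __iter__(self):
        indices = np.arange(len(self.X))
        if self.shuffle:
            self.rng.shuffle(indices)  # a new sample order on every epoch
        for start in range(0, len(indices), self.batch_size):
            batch_idx = indices[start:start + self.batch_size]
            X_batch = self.X[batch_idx]
            if self.transform is not None:
                X_batch = self.transform(X_batch)
            yield X_batch, self.y[batch_idx]


# Usage with synthetic data: statistics are computed once on the training set.
X_train = np.random.default_rng(0).normal(5.0, 2.0, size=(100, 4))
y_train = np.arange(100) % 2
mean, std = X_train.mean(axis=0), X_train.std(axis=0)

loader = DataLoader(X_train, y_train, batch_size=32,
                    transform=lambda X: (X - mean) / std, seed=0)
batch_shapes = [X_batch.shape for X_batch, _ in loader]
```

Passing the normalization as a `transform` callable keeps preprocessing out of user code, and the same hook could later carry augmentation. How such a loader would plug into `Multi_Layer_ANN` is left open here.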
Why This Issue is Critical:
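- User Experience: Removes the manual loading, normalization, and batching code that every user currently has to write
- Performance: Enables proper mini-batch training and memory management
- Scalability: Allows datasets larger than memory to be processed by loading one batch at a time
- Standard Practice: Major ML frameworks ship an equivalent, such as PyTorch's DataLoader and TensorFlow's tf.data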
- Foundation: Required for data augmentation, preprocessing, and advanced training techniques
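As a rough illustration of the scalability point, the sketch below reads batches from a file on disk through NumPy's `np.memmap`, so only one batch is held in memory at a time. The function name `iter_batches_from_disk`, the raw binary file layout, and the tiny temporary file standing in for a large dataset are assumptions for this example, not a proposed PyDeepFlow API.

```python
import os
import tempfile

import numpy as np


def iter_batches_from_disk(path, n_samples, n_features, batch_size, dtype=np.float32):
    """Yield consecutive batches from a raw binary file without loading it whole."""
    data = np.memmap(path, dtype=dtype, mode="r", shape=(n_samples, n_features))
    for start in range(0, n_samples, batch_size):
        # np.array copies only this slice into memory.
        yield np.array(data[start:start + batch_size])


# Demo with a small temporary file standing in for a large dataset.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "features.dat")
full = np.arange(20, dtype=np.float32).reshape(10, 2)
full.tofile(path)

batches = list(iter_batches_from_disk(path, n_samples=10, n_features=2, batch_size=4))
```

This version reads batches in file order. Shuffling an on-disk dataset would need an extra step, for example iterating over a shuffled list of batch start offsets.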