Traceback (most recent call last):
File "G:\Python\pythonProject2\dog_vs_cat\train_catdo.py", line 57, in <module>
train_loader = DataLoader(tran_data, batch_size=BATCH_SIZE, shuffle=True, num_workers=4)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ProgramData\anaconda3\envs\202111612409\Lib\site-packages\torch\utils\data\dataloader.py", line 376, in __init__
sampler = RandomSampler(dataset, generator=generator) # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\ProgramData\anaconda3\envs\202111612409\Lib\site-packages\torch\utils\data\sampler.py", line 164, in __init__
raise ValueError(
ValueError: num_samples should be a positive integer value, but got num_samples=0
This error means that tran_data is empty. With shuffle=True, DataLoader builds a RandomSampler over the dataset, and the sampler raises this ValueError because len(tran_data) is 0. The DataLoader call itself is fine; the problem is in how the dataset is created or where it looks for files. Let's go through this step by step:
How are you creating the tran_data dataset?
Make sure your dataset directory structure is correct. For a dog vs. cat classification task, a typical layout looks like this:
data/
├── train/
│   ├── cat/
│   │   ├── cat1.jpg
│   │   ├── cat2.jpg
│   │   └── ...
│   └── dog/
│       ├── dog1.jpg
│       ├── dog2.jpg
│       └── ...
└── test/
    ├── cat/
    └── dog/
Add some print statements before creating the DataLoader to check the size of your dataset:
print(f"Number of samples in tran_data: {len(tran_data)}")
If tran_data is a custom Dataset class, make sure its __len__ method is implemented correctly.
Here's a basic structure of how your code might look:
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Define your transforms
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Load your dataset
tran_data = datasets.ImageFolder(root='path/to/your/train/data', transform=transform)

# Print the number of samples
print(f"Number of samples in tran_data: {len(tran_data)}")

# Create DataLoader
BATCH_SIZE = 32
train_loader = DataLoader(tran_data, batch_size=BATCH_SIZE, shuffle=True, num_workers=4)
Can you provide more details about how you're creating your dataset? Once we have that information, we can better diagnose and fix the issue.
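In the meantime, one quick way to check the directory layout described above is to count the image files in each class folder. The following is a minimal sketch using only the standard library. The helper name count_images_per_class, the IMAGE_EXTENSIONS set, and the placeholder path are assumptions for illustration; they are not part of your script or of PyTorch.

```python
from pathlib import Path

# Extensions treated as images here. This set is an assumption; adjust it to your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(root):
    """Return {class_folder_name: image_count} for each subfolder of root."""
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"Dataset root does not exist: {root_path.resolve()}")

    counts = {}
    # Each immediate subfolder (e.g. cat/, dog/) is treated as one class.
    for class_dir in sorted(p for p in root_path.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


if __name__ == "__main__":
    # Placeholder path, as in the example above; point it at your train folder.
    train_root = "path/to/your/train/data"
    try:
        counts = count_images_per_class(train_root)
    except FileNotFoundError as err:
        print(err)
    else:
        print(counts)
        if sum(counts.values()) == 0:
            print("No images found: a dataset built from this folder would be empty.")
```

If the path is wrong, the message shows the resolved absolute path, which makes it easy to spot a relative path being resolved from an unexpected working directory. If every count is 0, the folder exists but contains no files with the listed extensions, which would also leave the dataset empty.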