Random selection of images from a folder
I have a folder that contains 400 images. I want to split it into two folders: train_images and test_images.
train_images should contain 150 randomly selected images, all different from each other. test_images should also contain 150 randomly selected images that are different from each other and also different from every image selected for train_images.
I started by writing code that selects a random set of images from a Faces folder and puts them in the train_images folder. I need help achieving the behavior described above.
    clear all; close all; clc;
    Train_images = 'train_faces';
    mkdir(Train_images);
    ImageFiles = dir('Faces');
    totalNumberOfImages = length(ImageFiles)-1;
    scrambledList = randperm(totalNumberOfImages);
    numberIWantToUse = 150;
    loop_counter = 1;
    for index = scrambledList(1:numberIWantToUse)
        baseFileName = ImageFiles(index).name;
        str = fullfile('faces', baseFileName); % Better than STRCAT
        face = imread(str);
        imwrite(face, fullfile(Train_images, ['hello' num2str(index) '.jpg']));
        loop_counter = loop_counter + 1;
    end
Any help would be much appreciated.
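Your code looks good to me. For the test set, you could re-run scrambledList = randperm(totalNumberOfImages); and again take the first 150 elements of scrambledList, as you did for training. Note, however, that a fresh permutation is independent of the first one, so some test images could also end up in the training set.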
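You can also reuse the existing permutation and loop over the next 150 indices instead:

    for index = scrambledList(numberIWantToUse+1 : 2*numberIWantToUse)
        % ... same thing you wrote in your training loop
    end

With this approach, your test sample is guaranteed to be completely different from the training sample, because both are taken from non-overlapping parts of the same permutation.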
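To make the idea concrete, here is a minimal sketch of the whole split using one permutation and two non-overlapping slices. A few things in it are my own assumptions rather than part of the question: the output folder name test_faces, the use of copyfile instead of imread/imwrite (it keeps the original file names and avoids re-encoding the images), and the filtering of directory entries, since dir also returns '.' and '..'.

```matlab
% Sketch: split the images in 'Faces' into two disjoint random sets.
srcDir    = 'Faces';        % source folder from the question
trainDir  = 'train_faces';  % output folder from the question
testDir   = 'test_faces';   % assumed name for the test folder
numPerSet = 150;

listing   = dir(srcDir);
listing   = listing(~[listing.isdir]);  % drop '.', '..' and any subfolders
numImages = numel(listing);
assert(numImages >= 2*numPerSet, 'Not enough images for two disjoint sets.');

% One permutation, two non-overlapping slices of it.
perm     = randperm(numImages);
trainIdx = perm(1:numPerSet);
testIdx  = perm(numPerSet+1 : 2*numPerSet);
assert(isempty(intersect(trainIdx, testIdx)));  % sanity check: no shared index

if ~exist(trainDir, 'dir'), mkdir(trainDir); end
if ~exist(testDir,  'dir'), mkdir(testDir);  end

% Copy each selected file, keeping its original name.
for k = trainIdx
    copyfile(fullfile(srcDir, listing(k).name), fullfile(trainDir, listing(k).name));
end
for k = testIdx
    copyfile(fullfile(srcDir, listing(k).name), fullfile(testDir, listing(k).name));
end
```

If the folder can contain non-image files, you would probably want to filter on the extension as well, for example by calling dir with a pattern such as fullfile(srcDir, '*.jpg').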
If you have the Bioinformatics Toolbox, you can use crossvalind with the 'HoldOut' method.
Here is an example. train and test are logical arrays, so you can use find to get the actual indices:
    ImageFiles = dir('Faces');
    ImageFilesIndexes = ones(1, length(ImageFiles)); % Use a numeric array instead of the char array
    proportion = 150/400; % Testing set
    [train, test] = crossvalind('holdout', ImageFilesIndexes, proportion);
    training_files = ImageFiles(train); % 250 files: it is better to use more data to train
    testing_files = ImageFiles(test); % 150 files
    % Then do whatever you like with the files
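As a small illustration of the find step, the sketch below converts complementary logical masks into numeric indices. To keep it runnable without the toolbox, the masks are built by hand with randperm as a stand-in for what crossvalind returns; the sizes 400 and 150 are simply the numbers from the question.

```matlab
% Stand-in for the crossvalind output: two complementary logical masks.
numFiles = 400;   % assumed number of entries
numTest  = 150;
test  = false(1, numFiles);
test(randperm(numFiles, numTest)) = true;  % mark 150 random positions as test
train = ~test;                             % everything else is training

trainIdx = find(train);  % numeric positions of the training entries
testIdx  = find(test);   % numeric positions of the test entries

% The masks are complementary, so the index sets cannot overlap.
assert(isempty(intersect(trainIdx, testIdx)));
fprintf('%d training indices, %d test indices\n', numel(trainIdx), numel(testIdx));
```

In practice you often do not need find at all, since ImageFiles(train) and ImageFiles(test) work directly with the logical masks.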