In which file are the 11(+1) attribute scores recorded with image IDs? #13
Hi, Ely,
Thanks for your interest!
Please just use TestNewRegression... (NEW here) for testing. "New" means
that we excluded bad images, such as those containing adult content,
cartoons, non-photos, etc. For training, you can use TrainRegression and
ValidationRegression.
The averaged overall score is in the list file with the suffix "score", and
the other attributes are in the files with the attribute name as the suffix.
Hope this helps. Let me know if you have other concerns!
Regards,
Shu
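A minimal Python sketch of reading these list files, assuming each line has the form "<image name> <score>" separated by whitespace and that files are named imgList<Split>Regression_<suffix>.txt (both inferred from the file names and the script in this thread). The helper names list_file_name, read_score_list, and load_attribute are hypothetical, not part of the AADB code.

```python
from pathlib import Path


def list_file_name(split, attribute):
    """Build a list-file name such as imgListTrainRegression_DoF.txt."""
    return f"imgList{split}Regression_{attribute}.txt"


def read_score_list(path):
    """Parse lines of the form '<image name> <score>' into (name, score) pairs."""
    records = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: the score is the last whitespace-separated field.
        name, score = line.rsplit(maxsplit=1)
        records.append((name, float(score)))
    return records


def load_attribute(label_dir, attribute, splits=("Train", "Validation", "TestNew")):
    """Concatenate the records of one attribute across the given splits."""
    records = []
    for split in splits:
        path = Path(label_dir) / list_file_name(split, attribute)
        records.extend(read_score_list(path))
    return records
```

For the overall score, the same call would use the suffix "score", for example load_attribute(label_dir, "score"), where label_dir is wherever imgListFiles_label.zip was extracted.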
…On Wed, Aug 23, 2017 at 12:56 PM, Ely Spears ***@***.***> wrote:
Thank you so much for sharing your data and code! I have a simple question
if you can spare some time for helping me out.
I downloaded the full data set and the file imgListFiles_label.zip, but I
cannot determine which files contain the actual labeled scores per each
image.
There are files of name "TestRegression...", "TestNewRegression..",
"TrainRegression", and "ValidationRegression".
I am just looking for the (raw or averaged) individual scores for each
aesthetic component and the overall score. Not any post-processing outputs
that are specific to your model's input.
Assuming these represent splits of the data, should I concatenate all of
these files together if I want a list for all of the AADB images? Also,
what does "TestRegression" vs "TestNewRegression" mean? Do I need both, or
other "TestRegression"?
Thanks!
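Hi Shu, Thanks for following up. When I approach it with your suggestion, I am finding there is a difference of 127 image file names between the set of 9958 images inside the 'datasetImages' directory and the list of images you get by concatenating them from the 'TestNew', 'Train', and 'Validation' files. For example, if I pick a single attribute, like 'DoF', then all 9958 rows of images should be accounted for by looking at all rows of imgListTestNewRegression_DoF.txt, imgListTrainRegression_DoF.txt, and imgListValidationRegression_DoF.txt. However, if I separately make a set from all image file names in the 'datasetImages' folder, the two sets are not equal: 127 file names in the folder do not appear in the annotation lists.
The annotation set comes from a pandas.DataFrame, df, which I construct by concatenating the files for a single attribute, like DoF, so that its image_name column contains all 9958 image names. Is there a location with the 127 "new" image files that I should add to the 'datasetImages' folder? I want to be sure I am getting the file names correct so I can align one CSV file which will have 9958 rows and have separate columns for each attribute score.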
Hi, Ely,
Thanks for the careful check.
According to what you said, I wrote the MATLAB script below to check the
consistency. I concatenate the rows for the DoF attribute from Train,
Validation and TestNew. The output is
#total images: 9831
no_missing_file
This means that there are 9831 "good" images in total. If I replace
"TestNew" with "Test", then I get all 9958 images. This is consistent
with what you see. The reason is that "TestNew" excludes "bad" images, such
as those containing adult content, cartoons, non-photo pictures, etc. That's
why there are 127 fewer images in "TestNew" than in "Test". I believe this
explains why you saw 127 different images at your end.
Hope this helps.
Regards,
Shu
%%%%%%%%%%%%%%%%%%%%%%%%%%
clear;
clc;
attribute_name = 'DoF';
folder_to_images = './datasetImages';
namelist = {};
scorelist = [];
%% concatenating the records
set_list = {'Train','Validation','TestNew'}; % three sets
for i = 1:3
    filename = ['imgList' set_list{i} 'Regression_' attribute_name '.txt'];
    fid = fopen(filename, 'r');
    tline = fgets(fid);
    while ischar(tline)
        tline = strtrim(tline);
        C = strsplit(tline, ' ');
        namelist{end+1} = C{1};
        scorelist(end+1) = str2double(C{2});
        tline = fgets(fid);
    end
    fclose(fid);
end
%% double-check
fprintf('#total images: %d\n', length(unique(namelist)));
no_missing_file = true;
for i = 1:length(namelist)
    if ~exist(fullfile(folder_to_images,namelist{i}), 'file')
        fprintf('%s does not exist!\n', namelist{i});
        no_missing_file = false;
    end
end
if no_missing_file
    fprintf('no_missing_file\n');
else
    fprintf('There are files missing!\n');
end
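For readers working in Python, the same kind of check can be sketched as follows. This is only an illustration, not the authors' tooling: check_consistency is a hypothetical helper that takes the concatenated image names and the image folder. It reports the row count and the unique-name count separately, since the script above prints the number of unique names.

```python
import os


def check_consistency(listed_names, image_dir):
    """Compare the names found in the list files with the files on disk."""
    unique = set(listed_names)
    on_disk = set(os.listdir(image_dir))
    return {
        "total_rows": len(listed_names),      # rows across all list files
        "unique_images": len(unique),         # distinct image names
        "missing_on_disk": sorted(unique - on_disk),   # listed but not in folder
        "unlisted_on_disk": sorted(on_disk - unique),  # in folder but never listed
    }
```

A gap between total_rows and unique_images means the same image name appears in more than one list file, while unlisted_on_disk shows images in the folder that no list file mentions.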
…On Sun, Aug 27, 2017 at 1:24 PM, Ely Spears ***@***.***> wrote:
Hi Shu,
Thanks for following up. When I approach it with your suggestion, I am
finding there is a difference of 127 image file names between the set of
9958 images inside the 'datasetImages' directory and the list of images you
get by concatenating them from the 'TestNew', 'Train', and 'Validation' files.
For example, if I pick a single attribute, like 'DoF', then all 9958 rows
of images should be accounted for by looking at all rows of
imgListTestNewRegression_DoF.txt, imgListTrainRegression_DoF.txt, and
imgListValidationRegression_DoF.txt.
However, if I separately make a set from all image file names in the
'datasetImages' folder, the two sets are not equal.
In [10]: filename_set = set(os.listdir("../datasetImages"))
In [11]: filenames_from_annotations = set(df.image_name.values)
In [12]: filename_set.difference(filenames_from_annotations)
Out[12]:
{'farm1_257_19551457934_78009e3cdf_b.jpg',
'farm1_257_20081367568_9a5e46c52d_b.jpg',
'farm1_258_19977693740_69a64c722c_b.jpg',
'farm1_258_19998330681_7947141b9a_b.jpg',
'farm1_258_20153488196_8430392dfd_b.jpg',
'farm1_260_20011542710_18512a2e49_b.jpg',
'farm1_261_19650341253_739d80b488_b.jpg',
'farm1_261_19661237724_ca01339159_b.jpg',
'farm1_261_20025642938_efb5355efe_b.jpg',
'farm1_263_20112620711_1c82b95850_b.jpg',
'farm1_264_20112107155_d1f42b858a_b.jpg',
'farm1_264_20191045435_20ec509328_b.jpg',
'farm1_265_19537402954_f14313e970_b.jpg',
'farm1_265_20001813849_5306c57b05_b.jpg',
'farm1_265_20154984142_7b03f70b4b_b.jpg',
'farm1_265_20182300971_b73b9544bc_b.jpg',
'farm1_267_20002219440_4532b202c1_b.jpg',
'farm1_268_20188568811_697ba6146b_b.jpg',
'farm1_269_19987423131_934dde76f2_b.jpg',
'farm1_270_20055053966_49218d4b2b_b.jpg',
'farm1_272_20282861605_161f622564_b.jpg',
'farm1_274_20071314641_4f74bed682_b.jpg',
'farm1_275_19976062488_db63a20c8f_b.jpg',
'farm1_276_20010653219_3b2fdd31bb_b.jpg',
'farm1_277_19923603720_834f9acc23_b.jpg',
'farm1_277_20282094821_24bc8bbb50_b.jpg',
'farm1_278_20101808021_71ab53801c_b.jpg',
'farm1_278_20156668495_5384aed590_b.jpg',
'farm1_279_20082259019_0f83066639_b.jpg',
'farm1_281_20277374661_efc9d5cbb2_b.jpg',
'farm1_282_19639704434_344f38bc28_b.jpg',
'farm1_282_20073012902_4f496354f3_b.jpg',
'farm1_283_20145205466_d31c079919_b.jpg',
'farm1_285_19794630170_6ebcce2e3f_b.jpg',
'farm1_285_20181900336_ab89701c63_b.jpg',
'farm1_285_20210153511_b1ca18f614_b.jpg',
'farm1_287_20198450001_579a228325_b.jpg',
'farm1_287_20205812381_e88e4ed07c_b.jpg',
'farm1_288_20218571861_392e702708_b.jpg',
'farm1_289_19474429064_c6c95ca5f8_b.jpg',
'farm1_293_19561491214_56756bc9fa_b.jpg',
'farm1_293_20183043961_17b5521f09_b.jpg',
'farm1_301_20174123975_4660281e14_b.jpg',
'farm1_304_19537298213_42a785d534_b.jpg',
'farm1_307_20194160235_412611ca37_b.jpg',
'farm1_310_19440925454_e17979bfe9_b.jpg',
'farm1_310_19545127013_6f9ee1e594_b.jpg',
'farm1_310_20008499479_ec470a400c_b.jpg',
'farm1_310_20156587556_748d5c2a95_b.jpg',
'farm1_313_20276894925_cdaa0aeddf_b.jpg',
'farm1_314_20109140896_c842328513_b.jpg',
'farm1_315_19555734304_e2c3fe5045_b.jpg',
'farm1_316_19806425039_cd3d8d6481_b.jpg',
'farm1_321_19928964810_58fc6afff5_b.jpg',
'farm1_321_20268434772_a42fedb758_b.jpg',
'farm1_322_19509281824_a28e71ac42_b.jpg',
'farm1_322_20177766422_2a6a383a0b_b.jpg',
'farm1_323_19512878813_df9e5f80a9_b.jpg',
'farm1_323_19642688713_f540f8e28a_b.jpg',
'farm1_323_20131578256_0a0035c0a4_b.jpg',
'farm1_323_20188849836_e3d4f8c4c0_b.jpg',
'farm1_324_19903715209_da2412b794_b.jpg',
'farm1_325_19999445778_f9894b0ac2_b.jpg',
'farm1_327_19995572685_80e5e27670_b.jpg',
'farm1_328_19972035690_01ba0f3ac3_b.jpg',
'farm1_328_19988362780_e5b8e5dbac_b.jpg',
'farm1_329_20088721718_b4d794c353_b.jpg',
'farm1_330_20100146912_225be2471c_b.jpg',
'farm1_331_19969233476_f2f4eabf76_b.jpg',
'farm1_333_20108487141_98b794abc4_b.jpg',
'farm1_335_20016713269_0cf280cb2a_b.jpg',
'farm1_335_20019567978_b0c1954a64_b.jpg',
'farm1_336_19546620374_f489bf5820_b.jpg',
'farm1_339_19708725023_b585ea2968_b.jpg',
'farm1_340_20066952012_2b02827e07_b.jpg',
'farm1_341_20099301301_aaee7e576f_b.jpg',
'farm1_341_20170995106_f91c81e04a_b.jpg',
'farm1_341_20190091625_87e842697e_b.jpg',
'farm1_341_20204595795_fd5b05bbb5_b.jpg',
'farm1_342_20176919346_64b1afc730_b.jpg',
'farm1_343_19923860928_7817f0a2e4_b.jpg',
'farm1_346_20107559161_96e716418b_b.jpg',
'farm1_349_20155857456_423018d405_b.jpg',
'farm1_356_20021866788_37d48f6cc4_b.jpg',
'farm1_358_19552188594_ac3e5ddcb2_b.jpg',
'farm1_358_20120536361_26b988773d_b.jpg',
'farm1_362_19482800604_6b55aa36bf_b.jpg',
'farm1_362_19513797184_9e80bf33af_b.jpg',
'farm1_367_19574973194_0813255782_b.jpg',
'farm1_367_19579961904_2afe3d61f2_b.jpg',
'farm1_372_20182354826_288048e80f_b.jpg',
'farm1_373_20051791466_465c2f0cd7_b.jpg',
'farm1_378_19994962928_4d1e3b5273_b.jpg',
'farm1_386_20077151338_565eca8245_b.jpg',
'farm1_391_19798851399_148ffae7bb_b.jpg',
'farm1_392_20149304556_66089212d3_b.jpg',
'farm1_392_20238659696_ab64fea61c_b.jpg',
'farm1_397_20133335455_6ac76df185_b.jpg',
'farm1_399_19477069424_487c07b573_b.jpg',
'farm1_403_20178595996_91aa9c535f_b.jpg',
'farm1_411_20078230839_df5d0bc006_b.jpg',
'farm1_411_20147625392_b9af7bf64e_b.jpg',
'farm1_415_19918210110_b64ca0c8ba_b.jpg',
'farm1_422_19805700400_a9a92b5640_b.jpg',
'farm1_428_19999459841_fe1801bcb9_b.jpg',
'farm1_430_20284364465_0d8c21e1be_b.jpg',
'farm1_437_19967443660_7d78f0f45e_b.jpg',
'farm1_439_20270730081_41d0a0ef74_b.jpg',
'farm1_448_20254011462_26901f13df_b.jpg',
'farm1_451_19647580264_fa094e4e18_b.jpg',
'farm1_457_19998705569_768dbff33a_b.jpg',
'farm1_470_20167424041_68c6fa7226_b.jpg',
'farm1_477_20208306105_13f7718315_b.jpg',
'farm1_486_20002923240_770ed2f527_b.jpg',
'farm1_493_19470626354_474a3f8453_b.jpg',
'farm1_514_19888436879_c91d7a9bc9_b.jpg',
'farm1_530_20093590000_52d0b7495a_b.jpg',
'farm1_533_20009980668_db03e9fdf0_b.jpg',
'farm1_544_19980254768_938d15ac44_b.jpg',
'farm1_547_19983479270_9f267ff96c_b.jpg',
'farm1_567_19709308274_d7a0697d8a_b.jpg',
'farm4_3668_19478079044_8b1075a121_b.jpg',
'farm4_3675_19534648023_6b02413da6_b.jpg',
'farm4_3676_19516813253_a6796867bc_b.jpg',
'farm4_3750_20121023921_2493f86a5c_b.jpg',
'farm4_3828_20273296635_0f1039c8c1_b.jpg',
'farm4_3832_19557447513_3fc826aff4_b.jpg'}
In the above, df is a pandas.DataFrame which I construct by concatenating the
files for a single attribute, like DoF, so that the image_name column
contains all 9958 image names.
I want to be sure I am getting the file names correct so I can align one
CSV file which will have 9958 rows and have separate columns for each
attribute score.
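One way to build such a table with only the Python standard library is sketched below. It assumes the per-attribute records have already been read as (image name, score) pairs; build_wide_rows and write_wide_csv are hypothetical helper names. Rows are keyed by image name, so a name that occurs in more than one list file yields a single row, and the last score read for it wins.

```python
import csv


def build_wide_rows(per_attribute):
    """per_attribute maps an attribute name to an iterable of (image_name, score)."""
    rows = {}
    for attribute, records in per_attribute.items():
        for name, score in records:
            # One dict per image; each attribute becomes its own column.
            rows.setdefault(name, {"image_name": name})[attribute] = score
    return [rows[name] for name in sorted(rows)]


def write_wide_csv(rows, attributes, out_path):
    """Write one line per image with one column per attribute."""
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(
            f, fieldnames=["image_name", *attributes], restval=""
        )
        writer.writeheader()
        writer.writerows(rows)
```

With restval="" an image that lacks a score for some attribute gets an empty cell rather than raising an error, which makes gaps easy to spot afterwards.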
Thanks again for your help. I am sorry I am missing some detail preventing me from understanding. When I look at the number of images listed in any of the "TestNew" files, it is 1000, just like for the plain "Test" files.
So, if I use the TestNewRegression_... files, I still see 9958 images; it's just that there is a new set of 127 image labels which do not appear in the 'datasetImages' folder.
In summary, whether I use the "TestNew" files or the "Test" files, the total number of rows for a fixed attribute, across the {"TestNew" or "Test"}, "Train", and "Validation" files, is always 9958 (never 9831), and the number of images inside 'datasetImages' is also 9958. I guess my question is: why are there 1000 entries in the "TestNew" files? It sounds like you are saying only 873 of those entries should be valid, and the others are expected to be incorrect (missing) image file names that do not exist in 'datasetImages'.
Hi, Ely,
Do you mean that some of the images do not appear in the "datasetImages"
folder but are listed in the TestNew txt file?
At my end, I can find all the images in the "datasetImages_warp256" folder.
Another thing is that I copied 127 images from the validation set to the
TestNew set to bring TestNew to 1000 images in total. Admittedly, this means
some images are duplicated between the validation and test sets; but since the
validation set is not used in training, we believe this won't cause a
problem. The reason for keeping 1000 images in the test set is easier
management. Sorry for the confusion.
Besides, can you check whether you can find all the images in the
"datasetImages_warp256" folder but cannot find some of them in the
"datasetImages" folder? Thanks again for the careful check!
Regards,
Shu
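The effect of those copied images on the counts can be illustrated with a small Python sketch. The helper names are hypothetical and the image names are synthetic placeholders; the split sizes (8458, 500, 1000) and the 127 copies are the figures quoted in this thread, and the sketch assumes the 127 copied names are all distinct.

```python
def split_overlap(names_a, names_b):
    """Return the image names that appear in both lists, sorted."""
    return sorted(set(names_a) & set(names_b))


def count_rows_and_unique(*name_lists):
    """Return (total rows, number of distinct names) across several lists."""
    total_rows = sum(len(names) for names in name_lists)
    unique = set().union(*name_lists)
    return total_rows, len(unique)


# Synthetic stand-ins for the three list files (placeholder names only).
train = [f"train_{i}.jpg" for i in range(8458)]
validation = [f"val_{i}.jpg" for i in range(500)]
# TestNew: 873 names of its own plus 127 copied from validation.
test_new = [f"test_{i}.jpg" for i in range(873)] + validation[:127]

rows, unique = count_rows_and_unique(train, validation, test_new)
# rows == 9958 (every line counted), unique == 9831 (copies counted once)
shared = split_overlap(validation, test_new)
# len(shared) == 127
```

This is why counting lines gives 9958 for both "Test" and "TestNew", while counting distinct names with "TestNew" gives 9831.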
…On Mon, Aug 28, 2017 at 4:38 AM, Ely Spears ***@***.***> wrote:
Thanks again for your help. I am sorry I am missing some detail preventing
me from understanding. When I look at the number of images listed in any of
the "TestNew" files, it is 1000, just like for the plain "Test" files. For
example:
In [4]: with open("imgListTestRegression_BalacingElements.txt", 'r') as _f:
   ...:     from_test_regression = _f.read().splitlines()
   ...:
In [5]: with open("imgListTestNewRegression_BalacingElements.txt", 'r') as _f:
   ...:     from_test_new_regression = _f.read().splitlines()
   ...:
In [6]: len(from_test_regression)
Out[6]: 1000
In [7]: len(from_test_new_regression)
Out[7]: 1000
In [8]: with open("imgListTrainRegression_BalacingElements.txt", 'r') as _f:
   ...:     from_train_regression = _f.read().splitlines()
   ...:
In [9]: with open("imgListValidationRegression_BalacingElements.txt", 'r') as _f:
   ...:     from_validation_regression = _f.read().splitlines()
   ...:
In [10]: len(from_train_regression)
Out[10]: 8458
In [11]: len(from_validation_regression)
Out[11]: 500
In [12]: 1000 + 8458 + 500
Out[12]: 9958
So, if I use the TestNewRegression_... files, I still always see 9958 images;
it's just that there is a new set of 127 image labels which do not appear
in the folder datasetImages.