
Pipe-ALL AlexNet output formatting error when running entirely on little cluster at certain frequencies #8

Open
fragmential opened this issue Jan 23, 2024 · 2 comments

Comments

@fragmential

Demonstration:
[Screenshot 2024-01-23 at 13 55 46: the garbled output report]

Steps to reproduce:

  • Make sure LD_LIBRARY_PATH is set correctly.
  • Set the scaling governor to performance and cap policy0 at 1 GHz:
echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
echo performance > /sys/devices/system/cpu/cpufreq/policy2/scaling_governor
echo 1000000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
  • Run the pipeline:
./graph_alexnet_all_pipe_sync --threads=4 --threads2=2 --n=60 --total_cores=6 --partition_point=8 --partition_point2=8 --order="L-G-B"

The error occurs at frequencies of 1 GHz and higher. It once failed to appear at 1.2 GHz, but otherwise it occurs very consistently.

Consistently reproducible on our system.

The hardware is plugged directly into mains power using the supplied Anker PowerPort+ 1 power supply.

Significance of the issue:

  • It disrupts our output parser; we spent hours yesterday trying to "fix" a bug in our scripts that turned out to be caused by this.
@Ehsan-aghapour
Copy link
Owner

Sorry for the inconvenience.
It seems that the output reports of the stages are getting mixed. The reason is that each stage runs in a separate thread, and when the threads try to print simultaneously, their output interleaves. One possible solution is to use std::cerr instead of std::cout for the output reports. The source file is examples/graph_alexnet_all_pipe_sync.cpp; at the end of do_run_1, do_run_2, and do_run_3, std::cout could be replaced with std::cerr.
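If the reports need to stay on std::cout, another option is to serialize the prints with a shared mutex. Below is a minimal sketch under that assumption; print_stage_report is a hypothetical helper for illustration, not the actual code in examples/graph_alexnet_all_pipe_sync.cpp.

```cpp
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical helper, not part of the repository.
// Each stage thread builds its whole report first, then emits it in a single
// write while holding a shared mutex, so concurrent stages cannot interleave.
static std::mutex report_mutex;

void print_stage_report(const std::string& stage_name, double latency_ms) {
    std::ostringstream report;
    report << stage_name << " inference time: " << latency_ms << " ms\n";

    std::lock_guard<std::mutex> lock(report_mutex);
    std::cout << report.str() << std::flush;
}
```

The same idea (one lock, or one pre-built string written in a single statement) could wrap the existing std::cout calls at the end of do_run_1, do_run_2, and do_run_3.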

What was your solution to this problem?

@fragmential
Author

We didn't have a solution. We couldn't process the inference times for this data, and we couldn't average the other measurements with other runs because our data-processing scripts relied on the order of items being consistent, which it wasn't. We didn't have time to change the scripts, so we settled on excluding the inference times from those tests.
