Hashes by microservice #75

Open
wants to merge 52 commits into base: main

52 commits
1313bf5
implemented not tested
jessberg Dec 17, 2022
121879c
parallelized for easy testing
jessberg Dec 17, 2022
7afd94b
simpler version done and works on duration query
jessberg Dec 17, 2022
d2b6821
doesn't help sadge
jessberg Dec 17, 2022
773d8a9
forgot a slash
jessberg Jan 6, 2023
479bf10
better now
jessberg Jan 6, 2023
f296e38
parallelized
jessberg Jan 6, 2023
df676f2
better printing
jessberg Jan 6, 2023
363ebce
thread pool
jessberg Jan 6, 2023
3e579e4
better timing
jessberg Jan 6, 2023
948b996
more timing info
jessberg Jan 6, 2023
4184dd6
better print outs
jessberg Jan 6, 2023
a6ff691
now has pool
jessberg Jan 6, 2023
2cec2a0
better pool sizes
jessberg Jan 9, 2023
747c631
implemented not tested
jessberg Jan 18, 2023
c39d57b
it compiles
jessberg Jan 18, 2023
0319822
issue is parallelization
jessberg Jan 18, 2023
a9f9897
parallelization implemented, hopefully this is better
jessberg Jan 18, 2023
f3a8676
proper printing
jessberg Jan 18, 2023
eeed3c1
good now
jessberg Jan 18, 2023
4c67820
now should work for batched
jessberg Jan 19, 2023
e13b29d
Merge branch 'hashes_by_microservice' of github.com:dyn-tracing/trace…
jessberg Jan 19, 2023
322a8b1
works for batched
jessberg Jan 19, 2023
fd6462d
really good latencies this time - I think this is the move
jessberg Jan 19, 2023
910af30
preliminary plotting
jessberg Jan 19, 2023
2a9ef1d
bytes count
jessberg Jan 19, 2023
c2561a4
only a few runs
jessberg Jan 19, 2023
57453a5
new data
jessberg Jan 20, 2023
78dd479
new data
jessberg Jan 20, 2023
94342ae
decent graph plotted
jessberg Jan 20, 2023
8375a51
resultsgit add graph.png
jessberg Jan 20, 2023
2c52bba
thirty CSVs results
jessberg Jan 20, 2023
501b5c0
30 CSVs
jessberg Jan 20, 2023
5146499
proper processed
jessberg Jan 20, 2023
b6f4e54
proper count
jessberg Jan 20, 2023
9d74972
These are the 30 CSV numbers
jessberg Jan 20, 2023
0ad9c71
70 csvs
jessberg Jan 21, 2023
563d660
70 preprocessed data
jessberg Jan 21, 2023
7fee3df
now with more results
jessberg Jan 23, 2023
d1c45c3
70 csv complete
jessberg Jan 23, 2023
a16f25b
saves properly, in pdf
jessberg Jan 23, 2023
bd48686
prettier graph
jessberg Jan 23, 2023
0faff7d
did all of the data set
jessberg Jan 24, 2023
6bde49e
Merge branch 'hashes_by_microservice' of https://github.com/dyn-traci…
jessberg Jan 24, 2023
a084456
all data
jessberg Jan 24, 2023
90c9e66
done
jessberg Jan 24, 2023
3494cd2
structural query okay
jessberg Jan 25, 2023
7a89788
trace id works
jessberg Jan 25, 2023
79dc951
now actually uses bloom index
jessberg Jan 25, 2023
f190cd8
all ready for tomorrow
jessberg Jan 25, 2023
0668945
lint
jessberg Jan 25, 2023
00db953
folders index test
jessberg Jan 30, 2023
12 changes: 12 additions & 0 deletions analyze_results/analyze.sh
@@ -0,0 +1,12 @@

FOLDER='batched_index_all'
cd ../${FOLDER}
cp ../analyze_results/process_results.py .
python3 process_results.py
cp processed.csv ../analyze_results/
cd ../../microservices_env/send_alibaba_data_to_gcs
python3 count_traces.py
python3 count_bytes.py
cp traces_count.csv ../../trace_storage/analyze_results/
cp bytes_count.csv ../../trace_storage/analyze_results/

1 change: 1 addition & 0 deletions analyze_results/bytes_count.csv
@@ -0,0 +1 @@
0,424762836,872422956,1287220931,1703937206,2150374650,2575278768,2994730955,3450404225,3896093797,4338945409,4792142818,5184710229,5584101236,6007462718,6408867920,6813501448,7246283096,7662772393,8072592040,8504579434,8891513423,9338761598,9753384541,10154991202,10607195656,11045875795,11496293404,11962189860,12392922205,12859127325,13269405933,13688726186,14142884874,14575690436,14991536331,15424129949,15873455042,16324377147,16809764434,17255765969,17714570746,18201534050,18643440315,19097802920,19588621209,20070484705,20514262216,20976617658,21434954219,21892043266,22374432264,22793835547,23209039145,23645559062,24078799131,24502222525,24924563111,25325525182,25738362450,26174070417,26617501029,27055808240,27521594785,27936213689,28348723294,28801270737,29258216105,29730370912,30220822680,30613062423,31023970676,31458745786,31879499429,32303195371,32751830249,33144068367,33532734400,33955430961,34348224486,34747673150,35163549154,35573817181,35958131970,36368400277,36739945978,37114673151,37510964294,37902824347,38288473510,38706549874,39064956315,39433624776,39820906901,40208014620,40600423356,41021141799,41392088699,41766839148,42148065372,42518215725,42882412984,43275605916,43597645962,43925556018,44253966593,44547666812,44882623393,45244432742,45593007306,45903456823,46231455740,46522544855,46814116605,47117586298,47435451702,47758482911,48097078125,48400131954,48708322126,49037379368,49354531188,49666477752,50010214739,50354203116,50694194350,51054460433,51406300106,51779051699,52168969741,52581089986,52975595637,53405452938,53810691965,54219331112,54660979525,55123046634,55578472077,56065896196,56524937837,57003795130,57508027052,57508027758,57897288750
Binary file added analyze_results/graph.pdf
Binary file not shown.
Binary file added analyze_results/graph.png
78 changes: 78 additions & 0 deletions analyze_results/plot_graph.py
@@ -0,0 +1,78 @@
import matplotlib.pyplot as plt
import numpy as np
import csv

def import_csv(filename):
    with open(filename) as csvfile:
        spamreader = csv.reader(csvfile)
        nums = []

        for row in spamreader:
            nums.extend(row)

    to_return = []
    # The odd handling at the end exists because I sampled every 5th CSV,
    # but 144 isn't divisible by 5, so I measured once more when all data was in the
    # system. Then 22 and 30 were messed up, so I added those and measured again.
    # So now there are two points tacked on at the end that aren't multiples of 5.
    for i in range(len(nums)):
        if (i+1) % 5 == 0 or i == 141 or i == 143: # want to include the last one
            to_return.append(int(nums[i]))

    return to_return

def import_query_data(query):
    with open("processed.csv") as csvfile:
        spamreader = csv.reader(csvfile)
        latencies = []

        for row in spamreader:
            if query in row[1]:
                if row[0] == "last":
                    latencies.append((145, row[2]))
                else:
                    latencies.append((int(row[0]), row[2]))

    latencies.sort()
    to_return_lat = []
    for l in latencies:
        to_return_lat.append(float(l[1])/1000.0) # convert ms to seconds
    return to_return_lat

# We make two graphs - latency by num traces, latency by bytes.
bytes_count = import_csv("bytes_count.csv")
bytes_count = [x*0.000001 for x in bytes_count] # convert to MB

traces_count = import_csv("traces_count.csv")
traces_count = [x/1000.0 for x in traces_count] # convert to thousands


queries = ["duration", "fanout", "one_call", "height"]
latencies = []
for query in queries:
    latencies.append(import_query_data(query))
fig, axs = plt.subplots(2)

for query in range(len(queries)):
    axs[0].plot(bytes_count, latencies[query], label = queries[query])
    axs[1].plot(traces_count, latencies[query], label = queries[query])

#plt.plot(bytes_count, new_x_duration, label = "duration", linestyle='--', marker='o', color='b')
#plt.plot(bytes_count, new_x_fanout, label = "fanout", linestyle='--', marker='o', color='r')
#plt.plot(bytes_count, new_x_one_other_call, label = "one other call", linestyle='--', marker='o', color='g')
#plt.plot(bytes_count, new_x_height, label = "height", linestyle='--', marker='o', color='c')
axs[0].set(xlabel='Bytes of AliBaba Data (MB)', ylabel='Latency (s)')
axs[1].set(xlabel='Number of Traces (Thousands)', ylabel='Latency (s)')
#fig.tight_layout()
#plt.ylim(0, 60)

for ax in axs:
    box = ax.get_position()
    ax.set_position([box.x0, box.y0 + box.height * 0.15,
                     box.width, box.height * 0.85])

axs[0].set_title("Structural Query Latencies on AliBaba Data")
axs[1].legend(loc='upper center', ncol=4, bbox_to_anchor=(0.5, -0.3), fancybox=True, shadow=True)

plt.savefig('graph.pdf')
plt.show()
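
Review note: a minimal sketch of the sampling rule used by import_csv, assuming the input CSV holds 144 values as the comment in that function implies. `sampled_indices` is a hypothetical helper that is not part of this PR; it only makes the kept indices explicit so they can be checked against the 30 rows per query in processed.csv.

```python
def sampled_indices(n_values, step=5, extra=(141, 143)):
    """Return the indices import_csv keeps: every `step`-th value plus `extra`.

    `step` and `extra` mirror the constants hard-coded in plot_graph.py;
    the helper itself is an illustration, not code from the PR.
    """
    return [i for i in range(n_values)
            if (i + 1) % step == 0 or i in extra]


idx = sampled_indices(144)   # 144 values is an assumption taken from the comment
print(len(idx))              # number of x-axis points
print(idx[:3], idx[-3:])     # first and last kept indices
```

If the CSV ever holds fewer than 144 values, indices 141 and 143 silently drop out and the x and y series passed to `plot` no longer have the same length, so a length check before plotting may be worth adding.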
36 changes: 36 additions & 0 deletions analyze_results/process_results.py
@@ -0,0 +1,36 @@

csv_hdr = ["number_of_csvs", "query", "median_time(ms)"]

queries = ["duration", "fanout", "height", "one_call"]

all_files = []
for i in range(1, 145):
    if i%5 == 0:
        all_files.append(i)
all_files.append(144)
all_files.append("last")
#all_files = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

results = ",".join(csv_hdr) + "\n"

for file_num in all_files:
    for q in queries:
        txt_file = f"{file_num}{q}.txt"

        with open(txt_file) as f:
            data = f.readlines()[-1]
            print(data)
            if not data.startswith("Median: "):
                print("Something wrong!")
                print(txt_file)
                raise SystemExit(1)  # exit non-zero so callers can detect the failure

        median = float((data.split(" ")[1]).strip('\n'))

        line_to_insert = f"{file_num},{file_num}{q},{median}" + "\n"
        results += line_to_insert


with open("processed.csv", "w") as f:
    f.write(results)
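
Review note: the parsing step above could be isolated into a small function so it can be tested without the log files on disk. This is a sketch only; `parse_median` is a hypothetical name, and it assumes, as the script does, that the last line of each log has the form `Median: <number>`.

```python
def parse_median(lines):
    """Extract the median (in ms) from the final line of a query log.

    Raises ValueError instead of exiting, so the caller decides how to fail.
    """
    prefix = "Median: "
    last = lines[-1].strip()
    if not last.startswith(prefix):
        raise ValueError(f"expected a line starting with {prefix!r}, got {last!r}")
    return float(last[len(prefix):])


# Last lines of a log in the format shown in this PR.
print(parse_median(["Time Taken: 4258 ms\n", "\n", "Median: 4306\n"]))
```

Raising an exception rather than calling `exit` keeps the function reusable from both the script and a test.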

121 changes: 121 additions & 0 deletions analyze_results/processed.csv
@@ -0,0 +1,121 @@
number_of_csvs,query,median_time(ms)
5,5duration,3382.0
5,5fanout,4886.0
5,5height,2099.5
5,5one_call,10864.0
10,10duration,4453.0
10,10fanout,6035.5
10,10height,3148.5
10,10one_call,19144.5
15,15duration,4879.5
15,15fanout,7194.5
15,15height,4793.0
15,15one_call,26027.0
20,20duration,5866.5
20,20fanout,8219.5
20,20height,5972.0
20,20one_call,33578.0
25,25duration,5933.0
25,25fanout,8387.5
25,25height,6916.5
25,25one_call,38360.0
30,30duration,6716.0
30,30fanout,9827.0
30,30height,8210.5
30,30one_call,53658.5
35,35duration,7900.0
35,35fanout,10863.0
35,35height,9423.0
35,35one_call,56566.5
40,40duration,11249.0
40,40fanout,11289.5
40,40height,10411.5
40,40one_call,57602.0
45,45duration,10139.5
45,45fanout,12158.0
45,45height,10662.5
45,45one_call,74428.5
50,50duration,10289.5
50,50fanout,13796.5
50,50height,12997.0
50,50one_call,89174.5
55,55duration,11389.0
55,55fanout,14674.0
55,55height,13927.0
55,55one_call,96361.0
60,60duration,11631.5
60,60fanout,16155.0
60,60height,14419.5
60,60one_call,100974.0
65,65duration,12197.5
65,65fanout,16929.0
65,65height,15544.0
65,65one_call,102143.0
70,70duration,13267.5
70,70fanout,18101.5
70,70height,16145.0
70,70one_call,110492.0
75,75duration,14275.0
75,75fanout,18994.0
75,75height,17881.0
75,75one_call,118724.0
80,80duration,15597.5
80,80fanout,21176.0
80,80height,20503.0
80,80one_call,128909.0
85,85duration,18124.0
85,85fanout,22341.0
85,85height,21786.5
85,85one_call,134332.0
90,90duration,19282.0
90,90fanout,22825.5
90,90height,21957.0
90,90one_call,142225.0
95,95duration,18814.5
95,95fanout,23038.5
95,95height,21864.5
95,95one_call,142174.0
100,100duration,18315.0
100,100fanout,24182.0
100,100height,23604.0
100,100one_call,155973.0
105,105duration,21236.5
105,105fanout,28794.0
105,105height,23135.0
105,105one_call,157896.0
110,110duration,19981.5
110,110fanout,24523.0
110,110height,28392.5
110,110one_call,156226.0
115,115duration,22786.0
115,115fanout,28010.0
115,115height,31166.0
115,115one_call,166320.0
120,120duration,21909.0
120,120fanout,28918.5
120,120height,26546.5
120,120one_call,162682.0
125,125duration,22031.5
125,125fanout,27458.5
125,125height,28742.0
125,125one_call,168307.0
130,130duration,22953.5
130,130fanout,29120.0
130,130height,30328.5
130,130one_call,183506.0
135,135duration,23958.5
135,135fanout,28290.5
135,135height,29767.5
135,135one_call,179278.0
140,140duration,23880.0
140,140fanout,28575.5
140,140height,30586.0
140,140one_call,188568.0
144,144duration,21972.5
144,144fanout,28084.5
144,144height,28691.0
144,144one_call,180668.0
last,lastduration,21603.5
last,lastfanout,32567.5
last,lastheight,30821.0
last,lastone_call,182750.0
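
Review note: the `query` column fuses the CSV count with the query name (for example `5duration`), and plot_graph.py recovers the series with a substring match. A hedged alternative is sketched below: strip the count prefix and group rows by the exact query name. `latencies_by_query` is a hypothetical helper, and placing `last` at position 145 mirrors the constant in import_query_data. The sample rows are copied from the file above.

```python
import csv
import io
from collections import defaultdict

SAMPLE = """number_of_csvs,query,median_time(ms)
5,5duration,3382.0
5,5fanout,4886.0
10,10duration,4453.0
10,10fanout,6035.5
last,lastduration,21603.5
last,lastfanout,32567.5
"""


def latencies_by_query(csvfile, last_position=145):
    """Group median latencies (in seconds) by query name, ordered by CSV count."""
    series = defaultdict(list)
    for row in csv.DictReader(csvfile):
        label = row["number_of_csvs"]
        position = last_position if label == "last" else int(label)
        name = row["query"][len(label):]          # drop the fused count prefix
        seconds = float(row["median_time(ms)"]) / 1000.0
        series[name].append((position, seconds))
    return {name: [sec for _, sec in sorted(points)]
            for name, points in series.items()}


result = latencies_by_query(io.StringIO(SAMPLE))
print(result["duration"])
print(result["fanout"])
```

Matching on the exact name avoids accidental hits if a future query name happens to contain another one as a substring.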
1 change: 1 addition & 0 deletions analyze_results/traces_count.csv
@@ -0,0 +1 @@
0,105282,212114,318048,423719,530453,637381,744663,853516,964052,1077104,1190868,1303154,1413963,1526660,1638596,1749131,1862465,1974186,2084578,2196069,2304694,2421096,2539053,2659288,2783942,2911188,3038337,3166551,3288751,3412705,3531190,3653535,3780956,3911200,4027121,4143849,4263153,4383926,4507874,4631614,4756646,4884596,5010296,5137030,5265668,5397069,5516634,5635760,5755619,5875162,5995018,6107876,6219243,6330777,6445543,6557801,6668676,6782847,6897259,7012246,7130306,7248311,7367236,7484916,7604235,7724238,7846947,7971813,8099629,8221241,8345879,8472812,8606474,8739855,8874556,8999794,9125947,9256936,9388582,9523468,9659044,9794170,9911161,10028660,10143023,10256061,10370080,10489945,10615344,10742864,10862701,10981578,11091383,11211680,11339931,11472108,11605323,11741546,11863261,11983617,12107831,12230258,12331958,12434343,12537067,12632730,12735932,12843907,12940759,13032216,13123129,13210479,13298471,13386421,13479835,13571884,13667078,13762101,13860045,13962291,14067153,14172936,14285067,14402454,14519717,14638691,14756493,14880615,15011571,15151367,15285347,15425990,15566790,15710278,15862765,16031699,16211827,16405513,16589787,16777490,16981630,16981641,17091862
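
Review note: the values in traces_count.csv and bytes_count.csv start at 0 and only increase, which suggests they are running totals. That reading is an assumption, since count_traces.py and count_bytes.py are not part of this diff. Under that assumption, the per-step increments can be recovered as sketched below; `per_step_increments` is a hypothetical helper.

```python
def per_step_increments(cumulative):
    """Return the differences between consecutive running totals."""
    return [after - before for before, after in zip(cumulative, cumulative[1:])]


# First five values of traces_count.csv from this PR.
traces = [0, 105282, 212114, 318048, 423719]
print(per_step_increments(traces))
```

Very small increments near the end of the files (for example between 16981630 and 16981641 traces) would stand out in this view.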
55 changes: 55 additions & 0 deletions batched_index/10duration.txt
@@ -0,0 +1,55 @@
Loading:
Loading: 0 packages loaded
Analyzing: target //:graph_query (0 packages loaded, 0 targets configured)
INFO: Analyzed target //:graph_query (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[0 / 3] [Prepa] BazelWorkspaceStatusAction stable-status.txt
Target //:graph_query up-to-date:
bazel-bin/graph_query
INFO: Elapsed time: 0.256s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/graph_query 10 duration
INFO: Build completed successfully, 1 total action
Running duration_condition()
intersection size is 1
Total results: 1
Time Taken: 6714 ms

intersection size is 1
Total results: 1
Time Taken: 4268 ms

intersection size is 1
Total results: 1
Time Taken: 4113 ms

intersection size is 1
Total results: 1
Time Taken: 4100 ms

intersection size is 1
Total results: 1
Time Taken: 4305 ms

intersection size is 1
Total results: 1
Time Taken: 4089 ms

intersection size is 1
Total results: 1
Time Taken: 4091 ms

intersection size is 1
Total results: 1
Time Taken: 4046 ms

intersection size is 1
Total results: 1
Time Taken: 4045 ms

intersection size is 1
Total results: 1
Time Taken: 4035 ms

Median: 4095.5
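
Review note: the reported median can be cross-checked from the `Time Taken` lines. The sketch below assumes every run is logged as `Time Taken: <n> ms`, as in the log above; `median_from_log` is a hypothetical helper, and the timings are the ten values from this file.

```python
import re
import statistics

TIME_RE = re.compile(r"Time Taken: (\d+) ms")


def median_from_log(log_text):
    """Recompute the median latency (ms) from all 'Time Taken' lines in a log."""
    times = [int(m.group(1)) for m in TIME_RE.finditer(log_text)]
    return statistics.median(times)


# The ten timings from batched_index/10duration.txt.
sample_log = "\n".join(
    f"Time Taken: {t} ms"
    for t in (6714, 4268, 4113, 4100, 4305, 4089, 4091, 4046, 4045, 4035)
)
print(median_from_log(sample_log))  # 4095.5, matching the "Median:" line above
```

Because the median is used, the slower first run (6714 ms) does not shift the reported figure.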
55 changes: 55 additions & 0 deletions batched_index/10fanout.txt
@@ -0,0 +1,55 @@
Loading:
Loading: 0 packages loaded
Analyzing: target //:graph_query (0 packages loaded, 0 targets configured)
INFO: Analyzed target //:graph_query (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[0 / 4] [Prepa] BazelWorkspaceStatusAction stable-status.txt
Target //:graph_query up-to-date:
bazel-bin/graph_query
INFO: Elapsed time: 0.226s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/graph_query 10 fanout
INFO: Build completed successfully, 1 total action
Running four_fan_out()
intersection size is 1
Total results: 45
Time Taken: 9189 ms

intersection size is 1
Total results: 45
Time Taken: 8038 ms

intersection size is 1
Total results: 45
Time Taken: 4393 ms

intersection size is 1
Total results: 45
Time Taken: 4231 ms

intersection size is 1
Total results: 45
Time Taken: 4261 ms

intersection size is 1
Total results: 45
Time Taken: 4321 ms

intersection size is 1
Total results: 45
Time Taken: 4339 ms

intersection size is 1
Total results: 45
Time Taken: 4291 ms

intersection size is 1
Total results: 45
Time Taken: 4200 ms

intersection size is 1
Total results: 45
Time Taken: 4258 ms

Median: 4306