-
Notifications
You must be signed in to change notification settings - Fork 4
Conversation
49dbbef
to
212a924
Compare
Hey Anders - I went through this code and made a few notes and suggested tweaks as I went. Thought these comments and tweaks might be useful so I have dumped them on a branch here https://github.com/katieworton/squad-client-utils/tree/improvements-to-boottime-code Also noticed the black hasn't been running on the files in this repo that don't end in .py! I will create a PR to fix this when I get a spare minute! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few review comments! Would recommend looking at https://github.com/katieworton/squad-client-utils/tree/improvements-to-boottime-code first then looking over the comments :)
squad-track-duration
Outdated
|
||
def get_data(args, build_cache): | ||
from_datetime = args.from_datetime | ||
if "T" not in from_datetime: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to do painful string manipulation here! I've dumped some proposed changes on this branch https://github.com/katieworton/squad-client-utils/tree/improvements-to-boottime-code to make better use of the datetime library
squad-track-duration
Outdated
if tmp_to_date == end_date: | ||
to_time = f"T{to_datetime.split('T')[1]}" | ||
else: | ||
to_time = "T23:59:59" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think this actually creates a very small corner case where we drop any data between 23:59:59 and 00:00:00 - SQUAD stores time with a decimal point after the seconds. In my changes in https://github.com/katieworton/squad-client-utils/tree/improvements-to-boottime-code my tweaks should fix this minor issue.
|
||
tmp_from_date = date(from_year, from_month, from_day) | ||
end_date = date(to_year, to_month, to_day) | ||
delta = timedelta(days=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we go through the data a day at a time? I feel like this should be configurable - when I ran locally I increased this to 30 days at a time :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did set it to 1 day so we can get some output. =)
could be nice to have in the pipeline.
squad-track-duration
Outdated
else: | ||
logger.debug(f"no-cache: {build_id}") | ||
tmp_build_cache = [] | ||
testruns = build.testruns(count=-1, prefetch_metadata=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-1 should probably be "ALL" like in the rest of the code (have also made that as a change in https://github.com/katieworton/squad-client-utils/tree/improvements-to-boottime-code)
data = [] | ||
data, build_cache = get_data(args, build_cache) | ||
|
||
save_build_cache_to_artifactorial(build_cache) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big fan of this data caching - I was actually interested in doing something similar in another project. In that case, there is too much data for json files to be viable, so I was wondering if a proper database could be used to better handle the scale of the data (DuckDB seemed easiest from my investigation). Seems overkill in this case unless you start seeing issues with the jsons, of course :)
squad-track-duration
Outdated
def combine_plotly_figs_to_html(figs, html_fname, main_title, main_description, | ||
include_plotlyjs='cdn', | ||
separator=None, auto_open=False): | ||
with open(html_fname, 'w') as f: | ||
f.write(f"<h1>{main_title}</h1>") | ||
f.write(f"<div>{main_description}</div>") | ||
index = 0 | ||
f.write("<h2>Page content</h2>") | ||
f.write("<ul>") | ||
for fig in figs[1:]: | ||
index = index + 1 | ||
f.write(f'<li><a href="#fig{index}">{fig.title}</a></li>') | ||
f.write("</ul>") | ||
f.write(f'<h2><a id="fig0">{figs[0].title}</a></h2>') | ||
f.write(f"<div>{figs[0].description}</div>") | ||
f.write(figs[0].plotly_fig.to_html(include_plotlyjs=include_plotlyjs)) | ||
index = 0 | ||
for fig in figs[1:]: | ||
index = index + 1 | ||
if separator: | ||
f.write(separator) | ||
f.write(f'<h2><a id="fig{index}">{fig.title}</a></h2>') | ||
f.write(f"<div>{fig.description}</div>") | ||
f.write(fig.plotly_fig.to_html(full_html=False, include_plotlyjs=False)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this saving as html is cool. What do we generally think about the proposed ways of creating html pages from the plotly docs? https://plotly.com/python/interactive-html-export/ I think Dash might be a good way going forward if this is something we want to build on in future.
squad-track-duration
Outdated
|
||
save_build_cache_to_artifactorial(build_cache) | ||
for build in data: | ||
df.loc[len(df.index)] = {a: build[a] for a in ["build_name", "git_describe", "device", "boottime", "created_at"]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's super discouraged to assign new entries to a dataframe like this - better to convert the list of dicts to a df directly with pd.DataFrame. Found a nice Stack Overflow answer which shows the performance differences of building up a dataframe like this if you're interested https://stackoverflow.com/a/47979665
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree, way better. Thank you =)
squad-track-duration
Outdated
dft = df.groupby(["created_at", "git_describe", "device", "build_name"])["boottime"].mean() | ||
dft = dft.reset_index().sort_values(by="created_at") | ||
|
||
dft = dft[dft['build_name'].isin([args.build_name])] | ||
figure_colletion.append( | ||
metaFigure( | ||
px.line(dft, x="created_at", y="boottime", color="device", markers=True).update_xaxes(tickvals=dft['created_at'], ticktext=dft['git_describe']).update_layout(xaxis_title="Version", yaxis_title="Boot time"), | ||
f"Line graph, {args.build_name}", | ||
f"This line graph, is generated from build_name {args.build_name}.", | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth putting in a function since the code below also looks quite similar.
squad-track-duration
Outdated
parser.add_argument( | ||
"--from-datetime", | ||
required=True, | ||
help="Starting date time. Example: 2022-01-01 or 2022-01-01T00:00:00", | ||
) | ||
|
||
parser.add_argument( | ||
"--to-datetime", | ||
required=True, | ||
help="Ending date time. Example: 2022-12-31 or 2022-12-31T00:00:00", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throughout the code, would it be easier to understand if we used "start" instead of "from" and "end" instead of "to"? Personally, it takes less brain power for me to interpret "start date" and "end date" compared to "from date" and "to date".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.
eaddf2e
to
a99105c
Compare
Thank you @katieworton amazing, I reworked it. |
a99105c
to
bc68e0b
Compare
Awesome - thanks for reworking this and incorporating my changes! Looks good to me :) |
bc68e0b
to
5422b60
Compare
Today its hardcoded to view build_names gcc-13-lkftconfig or clang-17-lkftconfig, two line charts is presented, one for devices and the other with build-name+devices. Example: ./squad-track-duration --group lkft --project linux-next-master \ --start-datetime 2024-04-01 --end-datetime 2024-05-02 A file called builds.json functions as a database, it stores finished builds from SQUAD. Signed-off-by: Anders Roxell <[email protected]>
Make code easier to read and understand by switching out use of strings for datetimes, for making use of the datetime objects. Annotate the code to clarify what it does. Signed-off-by: Katie Worton <[email protected]> Signed-off-by: Anders Roxell <[email protected]>
5422b60
to
215a4e3
Compare
Today its hardcoded to view build_names gcc-13-lkftconfig or clang-17-lkftconfig, two line charts is presented, one for devices and the other with build-name+devices.
Example:
./squad-track-duration --group lkft --project linux-next-master \ --from-datetime 2024-04-01 --to-datetime 2024-05-02
A file called builds.json functions as a database, it stores finished builds from SQUAD.