This repository has been archived by the owner on Jul 30, 2024. It is now read-only.

add squad-track-duration #43

Merged 2 commits on May 8, 2024

Conversation

Collaborator

@roxell roxell commented May 7, 2024

Today it's hardcoded to view build_names gcc-13-lkftconfig or clang-17-lkftconfig. Two line charts are presented: one for devices and the other for build-name+devices.

Example:
./squad-track-duration --group lkft --project linux-next-master \
    --from-datetime 2024-04-01 --to-datetime 2024-05-02

A file called builds.json functions as a database; it stores finished builds from SQUAD.

@roxell force-pushed the squad-track-boottime branch 3 times, most recently from 49dbbef to 212a924 (May 8, 2024 06:11)
@katieworton
Member

Hey Anders - I went through this code and made a few notes and suggested tweaks as I went. Thought these comments and tweaks might be useful so I have dumped them on a branch here https://github.com/katieworton/squad-client-utils/tree/improvements-to-boottime-code
Feel free to take these changes if they are useful - I will also add a couple of review comments on this PR shortly :)

Also noticed that black hasn't been running on the files in this repo that don't end in .py! I will create a PR to fix this when I get a spare minute!

Member

@katieworton katieworton left a comment

Added a few review comments! Would recommend looking at https://github.com/katieworton/squad-client-utils/tree/improvements-to-boottime-code first then looking over the comments :)


def get_data(args, build_cache):
    from_datetime = args.from_datetime
    if "T" not in from_datetime:
Member

No need to do painful string manipulation here! I've dumped some proposed changes on this branch https://github.com/katieworton/squad-client-utils/tree/improvements-to-boottime-code to make better use of the datetime library
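
For illustration, a minimal sketch of what leaning on the datetime library could look like. This is not the code from the linked branch, and the helper name parse_cli_datetime is hypothetical. It assumes the two input forms shown in the --from-datetime help text (a bare date, or a date plus time):

```python
from datetime import datetime


def parse_cli_datetime(value: str) -> datetime:
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DDTHH:MM:SS' (hypothetical helper)."""
    # datetime.fromisoformat handles both forms; a bare date maps to midnight,
    # so there is no need to check for "T" or split the string by hand.
    return datetime.fromisoformat(value)


start = parse_cli_datetime("2024-04-01")
end = parse_cli_datetime("2024-05-02T12:30:00")

# Fields and comparisons come from the object rather than from string slicing.
print(start.year, start.month, start.day)
print(end.date() > start.date())
```

Working with datetime objects also means malformed input fails early with a ValueError instead of producing a bad query string later on.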

if tmp_to_date == end_date:
    to_time = f"T{to_datetime.split('T')[1]}"
else:
    to_time = "T23:59:59"
Member

@katieworton katieworton May 8, 2024

I think this creates a very small corner case where we drop any data timestamped between 23:59:59 and 00:00:00, since SQUAD stores times with a decimal point after the seconds. My changes in https://github.com/katieworton/squad-client-utils/tree/improvements-to-boottime-code should fix this minor issue.
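
To make the corner case concrete, here is a small standalone sketch in plain Python with no SQUAD calls; both helper names are hypothetical. It assumes a timestamp with fractional seconds, as described above, and compares an inclusive 23:59:59 upper bound with a half-open interval that ends at the next midnight. It is one possible way to avoid the gap, not necessarily the approach taken on the linked branch.

```python
from datetime import datetime, timedelta


def in_closed_day(ts: datetime, day_start: datetime) -> bool:
    """Inclusive upper bound at 23:59:59, mirroring the "T23:59:59" suffix."""
    day_end = day_start.replace(hour=23, minute=59, second=59)
    return day_start <= ts <= day_end


def in_half_open_day(ts: datetime, day_start: datetime) -> bool:
    """Half-open interval: [day_start, next midnight)."""
    return day_start <= ts < day_start + timedelta(days=1)


day = datetime(2024, 5, 1)
late = datetime(2024, 5, 1, 23, 59, 59, 500000)  # 23:59:59.5

print(in_closed_day(late, day))     # the inclusive bound misses this timestamp
print(in_half_open_day(late, day))  # the half-open bound keeps it
```

With half-open intervals, consecutive windows share a boundary without overlapping, so each timestamp belongs to exactly one window.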


tmp_from_date = date(from_year, from_month, from_day)
end_date = date(to_year, to_month, to_day)
delta = timedelta(days=1)
Member

Is there a reason we go through the data a day at a time? I feel like this should be configurable - when I ran locally I increased this to 30 days at a time :)

Collaborator Author

I set it to 1 day so we can get some output. =)
It could be nice to have in the pipeline.
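
A possible shape for a configurable step, sketched under assumptions: the generator name iter_windows and its days parameter are hypothetical and do not come from the script. It walks a date range in chunks of a chosen size and clamps the final chunk to the end of the range:

```python
from datetime import datetime, timedelta


def iter_windows(start: datetime, end: datetime, days: int = 1):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    if days < 1:
        raise ValueError("days must be at least 1")
    step = timedelta(days=days)
    cursor = start
    while cursor < end:
        # Clamp the last window so it never runs past the requested end.
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


for window_start, window_end in iter_windows(
    datetime(2024, 4, 1), datetime(2024, 5, 2), days=30
):
    print(window_start.isoformat(), "->", window_end.isoformat())
```

A small step gives frequent progress output, while a larger step means fewer round trips; exposing the value as an option would let each caller pick.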

else:
    logger.debug(f"no-cache: {build_id}")
    tmp_build_cache = []
    testruns = build.testruns(count=-1, prefetch_metadata=True)
Member

-1 should probably be "ALL" like in the rest of the code (have also made that as a change in https://github.com/katieworton/squad-client-utils/tree/improvements-to-boottime-code)

data = []
data, build_cache = get_data(args, build_cache)

save_build_cache_to_artifactorial(build_cache)
Member

Big fan of this data caching - I was actually interested in doing something similar in another project. In that case, there is too much data for json files to be viable, so I was wondering if a proper database could be used to better handle the scale of the data (DuckDB seemed easiest from my investigation). Seems overkill in this case unless you start seeing issues with the jsons, of course :)
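
A rough sketch of the JSON-file caching pattern under discussion. It assumes a cache shaped as a dict keyed by build id; the real layout of builds.json is not shown in this PR, and both function names and the sample record are hypothetical:

```python
import json
import tempfile
from pathlib import Path


def load_build_cache(path):
    """Return the cached builds, or an empty dict when no cache file exists."""
    cache_file = Path(path)
    if not cache_file.exists():
        return {}
    return json.loads(cache_file.read_text())


def save_build_cache(path, cache):
    """Write the whole cache back to disk as JSON."""
    Path(path).write_text(json.dumps(cache, indent=2))


with tempfile.TemporaryDirectory() as tmp:
    cache_path = Path(tmp) / "builds.json"
    cache = load_build_cache(cache_path)  # empty on the first run
    # JSON object keys are strings, so the build id is stored as a string.
    cache["12345"] = [{"device": "device-a", "boottime": 10.0}]
    save_build_cache(cache_path, cache)
    reloaded = load_build_cache(cache_path)
```

Because the whole file is read and rewritten on every run, this stays simple for small data sets, which is the trade-off against a real database mentioned above.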

Comment on lines 205 to 270
def combine_plotly_figs_to_html(figs, html_fname, main_title, main_description,
                                include_plotlyjs='cdn',
                                separator=None, auto_open=False):
    with open(html_fname, 'w') as f:
        f.write(f"<h1>{main_title}</h1>")
        f.write(f"<div>{main_description}</div>")
        index = 0
        f.write("<h2>Page content</h2>")
        f.write("<ul>")
        for fig in figs[1:]:
            index = index + 1
            f.write(f'<li><a href="#fig{index}">{fig.title}</a></li>')
        f.write("</ul>")
        f.write(f'<h2><a id="fig0">{figs[0].title}</a></h2>')
        f.write(f"<div>{figs[0].description}</div>")
        f.write(figs[0].plotly_fig.to_html(include_plotlyjs=include_plotlyjs))
        index = 0
        for fig in figs[1:]:
            index = index + 1
            if separator:
                f.write(separator)
            f.write(f'<h2><a id="fig{index}">{fig.title}</a></h2>')
            f.write(f"<div>{fig.description}</div>")
            f.write(fig.plotly_fig.to_html(full_html=False, include_plotlyjs=False))
Member

I think this saving as html is cool. What do we generally think about the proposed ways of creating html pages from the plotly docs? https://plotly.com/python/interactive-html-export/ I think Dash might be a good way going forward if this is something we want to build on in future.


save_build_cache_to_artifactorial(build_cache)
for build in data:
    df.loc[len(df.index)] = {a: build[a] for a in ["build_name", "git_describe", "device", "boottime", "created_at"]}
Member

I think it's super discouraged to assign new entries to a dataframe like this - better to convert the list of dicts to a df directly with pd.DataFrame. Found a nice Stack Overflow answer which shows the performance differences of building up a dataframe like this if you're interested https://stackoverflow.com/a/47979665
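
A minimal sketch of the direct conversion, using made-up sample records in place of the real `data` list (the device names and values are invented for illustration). Passing `columns` to the constructor selects and orders the wanted keys in one step:

```python
import pandas as pd

COLUMNS = ["build_name", "git_describe", "device", "boottime", "created_at"]

# Made-up records standing in for the list of dicts built by the script;
# the extra "build_id" key shows that keys not listed in COLUMNS are dropped.
data = [
    {"build_id": 1, "build_name": "gcc-13-lkftconfig", "git_describe": "next-20240401",
     "device": "device-a", "boottime": 10.0, "created_at": "2024-04-01T08:00:00"},
    {"build_id": 2, "build_name": "clang-17-lkftconfig", "git_describe": "next-20240402",
     "device": "device-b", "boottime": 12.0, "created_at": "2024-04-02T08:00:00"},
]

# One constructor call instead of growing the frame row by row with df.loc.
df = pd.DataFrame(data, columns=COLUMNS)
print(df)
```

Building the frame once avoids the repeated reallocation that row-by-row assignment incurs as the frame grows.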

Collaborator Author

agree, way better. Thank you =)

Comment on lines 264 to 337
dft = df.groupby(["created_at", "git_describe", "device", "build_name"])["boottime"].mean()
dft = dft.reset_index().sort_values(by="created_at")

dft = dft[dft['build_name'].isin([args.build_name])]
figure_colletion.append(
    metaFigure(
        px.line(dft, x="created_at", y="boottime", color="device", markers=True).update_xaxes(tickvals=dft['created_at'], ticktext=dft['git_describe']).update_layout(xaxis_title="Version", yaxis_title="Boot time"),
        f"Line graph, {args.build_name}",
        f"This line graph, is generated from build_name {args.build_name}.",
    )
)
Member

Maybe worth putting in a function since the code below also looks quite similar.

Comment on lines 62 to 92
parser.add_argument(
    "--from-datetime",
    required=True,
    help="Starting date time. Example: 2022-01-01 or 2022-01-01T00:00:00",
)

parser.add_argument(
    "--to-datetime",
    required=True,
    help="Ending date time. Example: 2022-12-31 or 2022-12-31T00:00:00",
)
Member

@katieworton katieworton May 8, 2024

Throughout the code, would it be easier to understand if we used "start" instead of "from" and "end" instead of "to"? Personally, it takes less brain power for me to interpret "start date" and "end date" compared to "from date" and "to date".

Collaborator Author

I agree.
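
One way the renamed options could be declared, as a sketch rather than the merged implementation. The flag names --start-datetime and --end-datetime match the merged commit message; using `type=datetime.fromisoformat` and the build_parser wrapper are assumptions made for this example:

```python
import argparse
from datetime import datetime


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing only the two date options."""
    parser = argparse.ArgumentParser(description="start/end option sketch")
    parser.add_argument(
        "--start-datetime",
        required=True,
        type=datetime.fromisoformat,  # argparse calls this on the raw string
        help="Starting date time. Example: 2022-01-01 or 2022-01-01T00:00:00",
    )
    parser.add_argument(
        "--end-datetime",
        required=True,
        type=datetime.fromisoformat,
        help="Ending date time. Example: 2022-12-31 or 2022-12-31T00:00:00",
    )
    return parser


# An explicit argument list keeps the example runnable without a command line.
args = build_parser().parse_args(
    ["--start-datetime", "2024-04-01", "--end-datetime", "2024-05-02T12:00:00"]
)
print(args.start_datetime, args.end_datetime)
```

With the conversion done by argparse, the rest of the code receives datetime objects named start and end and never sees the raw strings.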

@roxell force-pushed the squad-track-boottime branch 2 times, most recently from eaddf2e to a99105c (May 8, 2024 13:47)
@roxell
Collaborator Author

roxell commented May 8, 2024

Thank you @katieworton, amazing! I reworked it.

@roxell force-pushed the squad-track-boottime branch from a99105c to bc68e0b (May 8, 2024 13:59)
@katieworton
Member

Awesome - thanks for reworking this and incorporating my changes! Looks good to me :)

@roxell force-pushed the squad-track-boottime branch from bc68e0b to 5422b60 (May 8, 2024 14:35)
roxell and others added 2 commits May 8, 2024 16:46
Today it's hardcoded to view build_names gcc-13-lkftconfig or
clang-17-lkftconfig. Two line charts are presented: one for devices and
the other for build-name+devices.

Example:
./squad-track-duration --group lkft --project linux-next-master \
--start-datetime 2024-04-01 --end-datetime 2024-05-02

A file called builds.json functions as a database; it stores finished
builds from SQUAD.

Signed-off-by: Anders Roxell <[email protected]>
Make code easier to read and understand by replacing string
manipulation of dates with datetime objects.
Annotate the code to clarify what it does.

Signed-off-by: Katie Worton <[email protected]>
Signed-off-by: Anders Roxell <[email protected]>
@roxell force-pushed the squad-track-boottime branch from 5422b60 to 215a4e3 (May 8, 2024 14:46)
@roxell roxell merged commit b1f5147 into Linaro:master May 8, 2024
1 check passed