Simple for-loop optimizations in Assembly for AD #30036

Merged: 4 commits merged into idaholab:next from PR_opt_march on Mar 20, 2025


Contributor

@GiudGiud GiudGiud commented Mar 5, 2025

refs #10256
what is a good case to check if it's impactful or not?
is there a reference AD + displaced mesh case we can try?

@GiudGiud GiudGiud self-assigned this Mar 5, 2025
@GiudGiud GiudGiud marked this pull request as ready for review March 5, 2025 20:42
@moosebuild
Contributor

moosebuild commented Mar 5, 2025

Job Documentation, step Docs: sync website on 0871122 wanted to post the following:

View the site here

This comment will be updated on new commits.

@moosebuild
Contributor

moosebuild commented Mar 5, 2025

Job Coverage, step Generate coverage on 0871122 wanted to post the following:

Framework coverage

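|        | Total (3403d2) | Total (#30036 087112) | +/-    | New    |
|--------|----------------|-----------------------|--------|--------|
| Rate   | 85.29%         | 85.29%                | -0.00% | 57.69% |
| Hits   | 109051         | 109057                | +6     | 30     |
| Misses | 18808          | 18813                 | +5     | 22     |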

Diff coverage report

Full coverage report

Modules coverage

Coverage did not change

Full coverage reports

Reports

Warnings

  • framework new line coverage rate 57.69% is less than the suggested 90.0%
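
As a quick consistency check on the warning above, and assuming the rate is defined as hits divided by hits plus misses (the bot comment does not state the formula), the "New" column reproduces the reported figure:

$$
\text{rate}_{\text{new}} = \frac{\text{hits}}{\text{hits} + \text{misses}} = \frac{30}{30 + 22} = \frac{30}{52} \approx 57.69\%
$$

Under the same assumption, the totals give $109057 / (109057 + 18813) \approx 85.29\%$, which matches the "Rate" row.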

This comment will be updated on new commits.

Member

@lindsayad lindsayad left a comment


Have we figured out whether make_range vectorizes? If we haven't figured it out, can we? If we have figured it out and it vectorizes, let's switch to it
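
One way to investigate the question above is to compile the same loop body in indexed form and in range-based form and compare the compiler's vectorization reports. The sketch below is only an illustration: `DemoRange` and `demo_range` are hypothetical stand-ins written for this example, not the `make_range` implementation used in the PR, and the `axpy_*` functions are made-up test kernels on plain `double` data.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an integer-range helper in the spirit of make_range.
// It only provides what a range-based for loop needs to iterate over [begin, end).
class DemoRange
{
public:
  class iterator
  {
  public:
    explicit iterator(std::size_t i) : _i(i) {}
    std::size_t operator*() const { return _i; }
    iterator & operator++()
    {
      ++_i;
      return *this;
    }
    bool operator!=(const iterator & other) const { return _i != other._i; }

  private:
    std::size_t _i;
  };

  DemoRange(std::size_t begin, std::size_t end) : _begin(begin), _end(end) {}
  iterator begin() const { return iterator(_begin); }
  iterator end() const { return iterator(_end); }

private:
  std::size_t _begin;
  std::size_t _end;
};

// Convenience wrapper (hypothetical name): the range [0, n)
inline DemoRange demo_range(std::size_t n) { return DemoRange(0, n); }

// Indexed loop with the bound hoisted: out[i] = a * x[i] + y[i]
void axpy_indexed(double a,
                  const std::vector<double> & x,
                  const std::vector<double> & y,
                  std::vector<double> & out)
{
  const std::size_t n = x.size();
  for (std::size_t i = 0; i < n; ++i)
    out[i] = a * x[i] + y[i];
}

// Same computation, written with a range-based for loop over indices
void axpy_ranged(double a,
                 const std::vector<double> & x,
                 const std::vector<double> & y,
                 std::vector<double> & out)
{
  for (const auto i : demo_range(x.size()))
    out[i] = a * x[i] + y[i];
}
```

Building this file with `g++ -O3 -fopt-info-vec-optimized -fopt-info-vec-missed` or `clang++ -O3 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize` prints which loops were or were not vectorized. Results on toy `double` loops do not necessarily carry over to loops over AD types.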

@GiudGiud
Contributor Author

GiudGiud commented Mar 7, 2025

Nothing vectorizes at the moment, as far as I could tell when I tried.

My next hope for vectorizing is Logan switching MooseArrays out for std::vector for variables and a lot of other constructs.

@GiudGiud
Contributor Author

GiudGiud commented Mar 16, 2025

Changed to range loops for qps.
I tried it on a simple case (diffusion + r 3.5 + displacements) and saw no improvement in runtime.

Though we do take a 2.2x slowdown when turning on the displaced mesh.
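
For readers unfamiliar with the kind of quadrature-point loop under discussion, here is a minimal sketch of accumulating AD quantities over qps. Everything in it is an assumption made for illustration: `ToyDual` is a hypothetical forward-mode dual number, not MOOSE's AD type, and `weighted_sum` is not a function from the PR. The loop header is shown in plain indexed form; a range-based form would change only the header.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical toy dual number: a value plus N partial derivatives.
template <std::size_t N>
struct ToyDual
{
  double value = 0.0;
  std::array<double, N> deriv{};
};

// Scale a dual number by a plain double (value and every derivative entry)
template <std::size_t N>
ToyDual<N>
operator*(const ToyDual<N> & a, double s)
{
  ToyDual<N> r;
  r.value = a.value * s;
  for (std::size_t i = 0; i < N; ++i)
    r.deriv[i] = a.deriv[i] * s;
  return r;
}

// Accumulate one dual number into another
template <std::size_t N>
ToyDual<N> &
operator+=(ToyDual<N> & a, const ToyDual<N> & b)
{
  a.value += b.value;
  for (std::size_t i = 0; i < N; ++i)
    a.deriv[i] += b.deriv[i];
  return a;
}

// Sum over quadrature points of weights[qp] * ad_values[qp].
// Each iteration touches N derivative entries in addition to the value.
template <std::size_t N>
ToyDual<N>
weighted_sum(const std::vector<ToyDual<N>> & ad_values, const std::vector<double> & weights)
{
  ToyDual<N> total;
  const std::size_t n_qp = ad_values.size();
  for (std::size_t qp = 0; qp < n_qp; ++qp)
    total += ad_values[qp] * weights[qp];
  return total;
}
```

The sketch only shows that each qp iteration over an AD quantity does work proportional to the number of derivative entries. It makes no claim about where the slowdown measured above comes from.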

@lindsayad
Member

> Though we do take a x2.2 slowdown when turning on the displaced mesh.

Wait waaat? Using make_range gives you a 2.2 factor slow-down?

@GiudGiud
Contributor Author

> Wait waaat? Using make_range gives you a 2.2 factor slow-down?

no no no it's turning on the displaced mesh that does that

@GiudGiud
Contributor Author

This input shows it. You can comment out the [GlobalParams] block to turn the displaced mesh on or off.

[Mesh]
  type = GeneratedMesh
  dim = 3
  nx = 15
  ny = 15
  nz = 15

  uniform_refine = 3
[]

[GlobalParams]
  use_displaced_mesh = true
  displacements = 'disp_x disp_y disp_z'
[]

[AuxVariables]
  [disp_x]
  []
  [disp_y]
  []
  [disp_z]
  []
[]

[Variables]
  [./u]
  [../]
[]

[Kernels]
  [./diff]
    type = ADDiffusion
    variable = u
  [../]
[]

[BCs]
  [./left]
    type = DirichletBC
    variable = u
    boundary = left
    value = 0
  [../]
  [./right]
    type = DirichletBC
    variable = u
    boundary = right
    value = 1
  [../]
[]

[Preconditioning]
  [./smp]
    type = SMP
    full = true
  [../]
[]

[Executioner]
  type = Steady

  # Preconditioned JFNK (default)
  # solve_type = 'Newton'


  petsc_options_iname = '-pc_type -pc_hypre_type'
  petsc_options_value = 'hypre boomeramg'

  l_tol = 1e-10
  nl_rel_tol = 1e-4
  nl_max_its = 1
[]

[Outputs]
  exodus = true
[]

@GiudGiud
Contributor Author

The docs failure is unrelated.

@GiudGiud GiudGiud merged commit f2d4288 into idaholab:next Mar 20, 2025
50 of 51 checks passed
@GiudGiud GiudGiud deleted the PR_opt_march branch March 20, 2025 18:56