run-time performance #5

Open
6 of 21 tasks
KiaraGrouwstra opened this issue Jan 21, 2020 · 5 comments
Labels
generation (features needed for dataset generation), nice-to-have, synthesis (features needed for program synthesis)

Comments

@KiaraGrouwstra
Owner

KiaraGrouwstra commented Jan 21, 2020

I may need to do some profiling (see the readme for the commands) to locate the primary culprits.

  • performance for dataset generation might or might not become a bottleneck.
  • performance during synthesizer training will become crucial

synthesizer (crucial):

  • training memory
  • track wall-time:
    • until convergence
    • per epoch
  • batching:
    • over samples: my current implementation simply discards any i/o samples beyond the batch size.
    • over task functions: check how this is done for tree LSTM or ask NSPS author
  • speed up the encoder by trimming down MaxChar: go over the characters used in the dataset's i/o string representations, then map between these and 0..i. During dataset generation, find the maximum character index used in the input-output string representations (or ideally, if I map out what's what, the set of unique characters used, so that I know both how many there are and which ones to map).
  • consider eval frequency
  • enable concurrency using either:

both:

generator (optional):
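
For the MaxChar item in the synthesizer list above, here is a minimal sketch of the dense character mapping, assuming the i/o samples are available as plain strings. The names `buildCharMap` and `encode` are hypothetical and not taken from the codebase; the idea is only that the encoder would then need as many slots as there are distinct characters, rather than one per possible character index.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- | Hypothetical helper: map every character occurring in the given
-- input/output strings to a dense index in 0..n-1, where n is the
-- number of distinct characters (assigned in ascending character order).
buildCharMap :: [String] -> Map.Map Char Int
buildCharMap strs =
  Map.fromList (zip (Set.toAscList (Set.fromList (concat strs))) [0 ..])

-- | Encode a string as dense indices. Returns Nothing if the string
-- contains a character that was not seen when the map was built.
encode :: Map.Map Char Int -> String -> Maybe [Int]
encode charMap = traverse (\c -> Map.lookup c charMap)
```

Loaded in GHCi, `encode (buildCharMap ["[1,2]", "[2,1]"]) "[1]"` gives `Just [3,1,4]`, and `Map.size` of the map (5 here) would be the trimmed-down replacement for MaxChar. Returning `Maybe` makes unseen characters explicit instead of silently mapping them somewhere.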

@KiaraGrouwstra KiaraGrouwstra changed the title performance run-time performance Jan 21, 2020
@KiaraGrouwstra KiaraGrouwstra added the synthesis features needed for program synthesis label Jan 24, 2020
@KiaraGrouwstra KiaraGrouwstra added generation features needed for dataset generation and removed nice-to-have labels Feb 1, 2020
@KiaraGrouwstra
Owner Author

whelp, since generation now crashes my laptop I guess this should get bumped up in priority.

@KiaraGrouwstra
Owner Author

KiaraGrouwstra commented Feb 7, 2020

For profiling, stack runs into this, whereas on Arch using cabal is known to be challenging; following the instructions:
Here's how I tried getting a statically

sudo pacman -Rs stack cabal-install ghc ghc-libs ghc-static haskell-aeson haskell-aeson-pretty haskell-annotated-wl-pprint haskell-ansi-terminal haskell-ansi-wl-pprint haskell-appar haskell-asn1-encoding haskell-asn1-parse haskell-asn1-types haskell-async haskell-attoparsec haskell-attoparsec-iso8601 haskell-auto-update haskell-base-compat haskell-base-orphans haskell-base-prelude haskell-base16-bytestring haskell-base64-bytestring haskell-basement haskell-bifunctors haskell-bindings-uname haskell-bitarray haskell-blaze-builder haskell-blaze-html haskell-blaze-markup haskell-byteable haskell-byteorder haskell-case-insensitive haskell-cereal haskell-cheapskate haskell-clock haskell-cmark-gfm haskell-cmdargs haskell-code-page haskell-colour haskell-comonad haskell-conduit haskell-conduit-extra haskell-connection haskell-constraints haskell-contravariant haskell-cookie haskell-cpphs haskell-cryptohash-sha256 haskell-cryptonite haskell-cryptonite-conduit haskell-css-text haskell-data-default haskell-data-default-class haskell-data-default-instances-containers haskell-data-default-instances-dlist haskell-data-default-instances-old-locale haskell-digest haskell-distributive haskell-dlist haskell-doctemplates haskell-easy-file haskell-echo haskell-ed25519 haskell-edit-distance haskell-either haskell-enclosed-exceptions haskell-erf haskell-exceptions haskell-extra haskell-fast-logger haskell-file-embed haskell-filelock haskell-fingertree haskell-fsnotify haskell-generic-deriving haskell-githash haskell-glob haskell-gtk2hs-buildtools haskell-hackage-security haskell-haddock-library haskell-hashable haskell-hashtables haskell-hi-file-parser haskell-hinotify haskell-hourglass haskell-hpack haskell-hscolour haskell-hslua haskell-hslua-module-system haskell-hslua-module-text haskell-hsyaml haskell-http haskell-http-api-data haskell-http-client haskell-http-client-tls haskell-http-conduit haskell-http-download haskell-http-types haskell-http2 haskell-hxt 
haskell-hxt-charproperties haskell-hxt-regex-xmlschema haskell-hxt-unicode haskell-ide-engine haskell-ieee754 haskell-infer-license haskell-integer-logarithms haskell-iproute haskell-ipynb haskell-juicypixels haskell-libffi haskell-libyaml haskell-lifted-async haskell-lifted-base haskell-megaparsec haskell-memory haskell-microlens haskell-microlens-th haskell-mime-types haskell-mintty haskell-monad-control haskell-monad-logger haskell-monad-loops haskell-mono-traversable haskell-mustache haskell-neat-interpolation haskell-network haskell-network-byte-order haskell-network-uri haskell-old-locale haskell-old-time haskell-only haskell-open-browser haskell-optparse-applicative haskell-optparse-generic haskell-optparse-simple haskell-pandoc-types haskell-pantry haskell-parser-combinators haskell-path haskell-path-io haskell-path-pieces haskell-pem haskell-persistent haskell-persistent-sqlite haskell-persistent-template haskell-polyparse haskell-primitive haskell-profunctors haskell-project-template haskell-psqueues haskell-quickcheck haskell-random haskell-refact haskell-regex-applicative haskell-regex-applicative-text haskell-regex-base haskell-regex-pcre haskell-regex-tdfa haskell-resolv haskell-resource-pool haskell-resourcet haskell-retry haskell-rio haskell-rio-orphans haskell-rio-prettyprint haskell-safe haskell-safe-exceptions haskell-scientific haskell-semigroupoids haskell-sha haskell-shelly haskell-silently haskell-skylighting haskell-skylighting-core haskell-socks haskell-split haskell-splitmix haskell-src-exts haskell-src-exts-util haskell-statevar haskell-stm-chans haskell-streaming-commons haskell-syb haskell-system-fileio haskell-system-filepath haskell-tagged haskell-tagsoup haskell-tar haskell-tar-conduit haskell-temporary haskell-terminal-size haskell-texmath haskell-text-metrics haskell-th-abstraction haskell-th-expand-syns haskell-th-lift haskell-th-lift-instances haskell-th-orphans haskell-th-reify-many haskell-th-utilities haskell-time-compat 
haskell-time-manager haskell-tls haskell-transformers-base haskell-transformers-compat haskell-type-equality haskell-typed-process haskell-unicode-transforms haskell-uniplate haskell-unix-compat haskell-unix-time haskell-unliftio haskell-unliftio-core haskell-unordered-containers haskell-utf8-string haskell-uuid-types haskell-vault haskell-vector haskell-vector-algorithms haskell-vector-binary-instances haskell-void haskell-wai haskell-wai-extra haskell-wai-logger haskell-word8 haskell-x509 haskell-x509-store haskell-x509-system haskell-x509-validation haskell-xml haskell-xss-sanitize haskell-yaml haskell-zip-archive haskell-zlib ihaskell-git pandoc hlint happy

yay ghc
yay ghc-static
yay stack-bin

stack setup --system-ghc
stack install --system-ghc cabal-install

Even after doing this I still get the error: Perhaps you haven't installed the "p_dyn" libraries for package ‘base-4.12.0.0’?
For DAS-5 installing Haskell Platform thru ghcup seems to do it tho!

I can then transfer the resulting profile graph (synthesis.svg) back and convert it:

ssh vu
scp das:~/synthesis/synthesis.svg .
^D
scp vu:~/synthesis.svg .
convert synthesis.svg synthesis.png

@KiaraGrouwstra
Owner Author

my first profiling results are as follows:
[image: synthesis]

So most time is spent in GHC internals for type-checking and parsing expressions. This seems to match my intuition that the slowest function on my end was fnIoPairs, which computes the outputs of synthesized programs given inputs.
Maybe I can investigate this further as per #17, hopefully by building something like GHC ASTs upfront.
Alternatively, I could cut down on type-checks, though that could shave off at most about 35%, rather than help by an order of magnitude.
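
To put a number on that last point, here is a small sketch of the Amdahl's-law bound, assuming the 35% figure is the share of total run time attributable to type-checks (that reading is an assumption). The function names are hypothetical.

```haskell
-- | Upper bound on overall speedup if a fraction p of run time
-- (0 <= p < 1) is eliminated entirely (Amdahl's law in the limit).
maxSpeedup :: Double -> Double
maxSpeedup p = 1 / (1 - p)

-- | Fraction of run time that would have to be eliminated to reach
-- an overall speedup of s (s >= 1).
requiredFraction :: Double -> Double
requiredFraction s = 1 - 1 / s
```

Under that assumption `maxSpeedup 0.35` is about 1.54, while `requiredFraction 10` is 0.9: an order of magnitude would need roughly 90% of the run time to disappear, which removing type-checks alone could not deliver.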

@KiaraGrouwstra
Owner Author

KiaraGrouwstra commented Feb 8, 2020

  • try evaluating directly and just let it crash, rather than type-checking first
  • combine type instantiations for a function into a single interpret call for generating outputs
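
The second item can be sketched as plain string construction: build one expression that evaluates the function at every type instantiation, so that it would be compiled once rather than once per instantiation. This is only an illustration under assumptions; `combinedExpr` is a hypothetical name, the input format is invented here, and the sketch does not call any interpreter itself.

```haskell
import Data.List (intercalate)

-- | Hypothetical helper: given a function expression and, per type
-- instantiation, a type signature plus input expressions, build a single
-- expression of type [[String]] holding the shown outputs.
combinedExpr :: String -> [(String, [String])] -> String
combinedExpr fn insts =
  "[" ++ intercalate ", " (map perInst insts) ++ "]"
  where
    -- The function is parenthesised before annotating, so that a lambda
    -- body does not swallow the type annotation.
    perInst (ty, inputs) =
      "map (show . ((" ++ fn ++ ") :: " ++ ty ++ ")) ["
        ++ intercalate ", " inputs ++ "]"
```

For example, `combinedExpr "id" [("Int -> Int", ["1", "2"]), ("Bool -> Bool", ["True"])]` yields one expression string covering both instantiations. A trade-off to keep in mind: if any single instantiation fails to compile or crashes, the whole combined call fails with it.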

settings:

influencing the size of expressions:

  • listLengths
  • numInputs

influencing the number of compilations:

  • contributing to number of task functions:
    • number of blocks (exponential), thru number of programs
    • maxWildcardDepth
    • genMaxHoles
  • contributing to number of types sampled per program:
    • maxInstances (exponential given function signatures containing multiple type variables?)
    • nestLimit (exponential)

@KiaraGrouwstra
Copy link
Owner Author

I just got generation to run on my laptop within a couple minutes, so the main concern here seems perhaps resolved for the moment.
