Summary:
With flambda (`-O3`), compilation time is ~5x slower, but the backend is ~25% faster! To mitigate the atrocious compilation times, introduce a new `opt` build mode in the jbuilder files:

- build in "opt" mode by default from the toplevel (so that install scripts and external users get the fastest infer by default), and in "default" mode by default from infer/src (since the latter is only called directly by infer devs, for faster builds); see the flag-selection sketch below
- `make byte` is as fast as before in any mode
- `make test` will build "opt" by default, which is very slow; for testing (or building the models) locally, use `make BUILD_MODE=default test`
- you can even change the default locally with `export BUILD_MODE=default`

Take the benchmarks with a sizable pinch of salt: I ran them only once, and other things could have been running in the background. That said, the perf win is consistent across all projects, with a 15-20% win in wallclock time, around 25% in total CPU time, ~9% in sys time, ~25% fewer minor allocations, and ~5-10% fewer overall allocations. This is only for the backend; the capture is by and large unaffected (either the same or a tad faster, within noise range).

Here are the results of running on OpenSSL 1.0.2d on osx (12 cores, 32G RAM).

=== base

infer binary: 26193088 bytes
compile time: 40s

capture:
```lang=text
real    1m7.513s
user    3m11.437s
sys     0m55.236s
```

analysis:
```lang=text
real    5m41.580s
user    61m37.855s
sys     1m12.870s
```

Memory profile:
```lang=json
{
  ...
  "minor_gb": 0.1534719169139862,
  "promoted_gb": 0.0038930922746658325,
  "major_gb": 0.4546157643198967,
  "allocated_gb": 0.6041945889592171,
  "minor_collections": 78,
  "major_collections": 23,
  "compactions": 7,
  "top_heap_gb": 0.07388687133789062,
  "stack_kb": 0.3984375,
  "minor_heap_kb": 8192.0,
  ...
}
```

=== flambda with stock options (no `-Oclassic`, just the same flags as base)

Exactly the same as base.

=== flambda `-O3`

infer binary: 56870376 bytes (2.17x bigger)
compile time: 191s (4.78x slower)

capture is the same as base:
```lang=text
real    1m9.203s
user    3m12.242s
sys     0m58.905s
```

analysis is ~20% faster in wallclock time and ~25% faster in CPU time:
```lang=text
real    4m32.656s
user    46m43.987s
sys     1m2.424s
```

memory usage is a bit lower too:
```lang=json
{
  ...
  "minor_gb": 0.11583046615123749,    // 75% of previous
  "promoted_gb": 0.00363825261592865, // 93% of previous
  "major_gb": 0.45415670424699783,    // about the same
  "allocated_gb": 0.5663489177823067, // 94% of previous
  "minor_collections": 73,
  "major_collections": 22,
  "compactions": 7,
  "top_heap_gb": 0.07165145874023438,
  "stack_kb": 0.3359375,
  "minor_heap_kb": 8192.0,
  ...
}
```

=== flambda `-O2`

Not nearly as exciting as `-O3`, but the compilation cost is still quite high:

infer binary: 37826856 bytes
compile time: 100s

Capture and analysis timings are mostly the same as base.

Reviewed By: jberdine

Differential Revision: D4867979

fbshipit-source-id: 99230b7
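The `jbuild.common` prelude that defines `common_optflags` is not part of this commit, so here is a minimal, hypothetical sketch of how an `opt` build mode could select flambda's `-O3`; everything beyond the names `BUILD_MODE` and `common_optflags` is invented for illustration:

```lang=ocaml
(* Hypothetical sketch, not the actual jbuild.common from this commit. *)
let build_mode =
  (* picked up from the environment, as in `export BUILD_MODE=default` *)
  try Sys.getenv "BUILD_MODE" with Not_found -> "opt"

let common_optflags =
  match build_mode with
  | "opt" -> ["-O3"] (* flambda: ~5x slower compiles, ~25% faster backend *)
  | _ -> []          (* "default": fast builds, stock code generation *)
```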
parent 16dcae58fa, commit f8d7c81045
```lang=diff
@@ -1,12 +1,14 @@
-(* -*- tuareg -*- *)
-(* NOTE: prepend jbuild.common to this file! *)
-
-;; Format.sprintf {|
+;; (* -*- tuareg -*- *)
+(* NOTE: prepend jbuild.common to this file! *)
+Format.sprintf
+  {|
 (library
  ((name InferStdlib)
   (flags (%s))
+  (ocamlopt_flags (%s))
   (libraries (%s))
  ))
 |}
-  (String.concat " " common_cflags) (String.concat " " common_libraries)
+  (String.concat " " common_cflags) (String.concat " " common_optflags)
+  (String.concat " " common_libraries)
 |> Jbuild_plugin.V1.send
```
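For context, these tuareg jbuild files are small OCaml programs whose printed output is the actual jbuild stanza. Below is a self-contained approximation of the new template, with placeholder values for the three lists (the real ones come from the prepended `jbuild.common`, which is not shown in this diff) and `print_string` standing in for `Jbuild_plugin.V1.send`:

```lang=ocaml
(* Self-contained approximation of the template above. The three lists are
   placeholders; the real values come from the prepended jbuild.common. *)
let common_cflags = ["-g"; "-safe-string"] (* placeholder *)
let common_optflags = ["-O3"]              (* the new opt-mode flags *)
let common_libraries = ["str"]             (* placeholder *)

let () =
  Format.sprintf
    {|
(library
 ((name InferStdlib)
  (flags (%s))
  (ocamlopt_flags (%s))
  (libraries (%s))
 ))
|}
    (String.concat " " common_cflags) (String.concat " " common_optflags)
    (String.concat " " common_libraries)
  |> print_string (* the real file pipes this to Jbuild_plugin.V1.send *)
```

Running it prints a `(library ...)` stanza with the flags spliced in; in "default" mode `common_optflags` would presumably be empty, leaving ocamlopt's code generation unchanged.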