Summary:
In looking at summaries that Quandary took a long time to compute, one thing I notice frequently is redundancy in the footprint sources (e.g., I might see `Footprint(x), Footprint(x.f), Footprint(x*)`).
`sudo perf top` indicates that joining big sets of sources is a major performance bottleneck, and a large number of footprint sources is surely a big part of this (since we expect the number of non-footprint sources to be small).
This diff addresses the redundancy issue by using a more complex representation for a set of sources. The "known" sources are still in a set, but the footprint sources are now represented as a set of access paths (via an access trie).
The access path trie is a minimal representation of a set of access paths, so it would represent the example above as a simple `x*`.
This should make join/widen/<= faster and improve performance
Reviewed By: jberdine
Differential Revision: D5663980
fbshipit-source-id: 9fb66f8
master
Sam Blackshear7 years agocommitted byFacebook Github Bot
codetoanalyze/cpp/quandary/execs.cpp, execs::callExecBad, 31, QUANDARY_TAINT_ERROR, [Return from getenv,Call to execve]
codetoanalyze/cpp/quandary/execs.cpp, execs::callExecBad, 33, QUANDARY_TAINT_ERROR, [Return from getenv,Call to system]
codetoanalyze/cpp/quandary/execs.cpp, execs::callExecBad, 35, QUANDARY_TAINT_ERROR, [Return from getenv,Call to popen]
codetoanalyze/cpp/quandary/execs.cpp, execs::exec_flag_bad, 0, QUANDARY_TAINT_ERROR, [Return from execs::exec_flag_bad,Call to execl]
codetoanalyze/cpp/quandary/execs.cpp, execs::sql_on_env_var_bad, 2, QUANDARY_TAINT_ERROR, [Return from getenv,Call to __infer_sql_sink]
codetoanalyze/cpp/quandary/files.cpp, files::read_file_call_exec_bad1, 5, QUANDARY_TAINT_ERROR, [Return from std::basic_istream<char,std::char_traits<char>>_read,Call to execle]
codetoanalyze/cpp/quandary/files.cpp, files::read_file_call_exec_bad2, 5, QUANDARY_TAINT_ERROR, [Return from std::basic_istream<char,std::char_traits<char>>_readsome,Call to execle]