How to use static analysis to improve performance

Jesper Juhl

07.09.2015

5:06 pm

Since you are asking for ideas for performance related checkers I thought I’d post a few small ideas.

1. Check for comparisons of .size() against 0 instead of calling .empty()/.isEmpty()

2. Check for use of post-increment where pre-increment could have been used just as well.

3. Check for loops that open-code std::fill() rather than calling the algorithm.

4. Check for loops that iterate column-major rather than row-major, since the latter usually has better cache performance characteristics.
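To make items 3 and 4 concrete, here is a minimal sketch of the patterns such checkers might flag next to the form they could suggest. The function names and the flat row-major matrix layout are assumptions made for illustration; nothing here is taken from an existing checker.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Item 3: the open-coded loop a checker might flag ...
void clearOpenCoded(std::vector<int> &v) {
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] = 0;
}

// ... and the algorithm call it could suggest instead.
void clearWithFill(std::vector<int> &v) {
    std::fill(v.begin(), v.end(), 0);
}

// Item 4: a rows x cols matrix stored row-major in one flat vector
// (an assumed layout). With the column index in the inner loop,
// memory is read sequentially.
long sumRowMajor(const std::vector<int> &m, std::size_t rows, std::size_t cols) {
    long sum = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            sum += m[r * cols + c];
    return sum;
}

// Same result, but the inner loop jumps `cols` elements at a time,
// which is the access pattern item 4 would warn about.
long sumColumnMajor(const std::vector<int> &m, std::size_t rows, std::size_t cols) {
    long sum = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            sum += m[r * cols + c];
    return sum;
}
```

Both sum functions return the same value; only the traversal order differs, so any speed difference depends on matrix size and cache behaviour and would need to be measured.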

Clare Macrae

07.09.2015

9:03 pm

It would be great to have a check for passing shared_ptr parameters by value – and better still to have a tool that converted such calls to passing by const reference!
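As a small illustration of why this matters: passing a shared_ptr by value copies it, which bumps the (typically atomic) reference count on entry and drops it again on exit. The two function names below are hypothetical and exist only to make the difference observable.

```cpp
#include <memory>

// By value: the parameter is a copy, so the reference count is
// incremented for the duration of the call.
long useCountByValue(std::shared_ptr<int> p) {
    return p.use_count();
}

// By const reference: no copy is made and the count is untouched.
long useCountByConstRef(const std::shared_ptr<int> &p) {
    return p.use_count();
}
```

Passing by value is still the right choice when the callee keeps its own owning copy; the check would only make sense for functions that merely read through the pointer.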

mlim 08.09.2015 1:29 pm

@Clare,

You might be interested in this CppCon talk on automated refactoring for C++, which covers ideas like automatically fixing common anti-patterns:

Jean-Michaël Celerier

07.09.2015

10:01 pm

– Heap variables that could be put on the stack (like a new at the beginning of a block and a delete at the end).
– Passing types by const& when it would be faster to pass them by value.
– Use of Qt containers in C++11 range-based for loops which implies a deep copy.
– Downcasts with static_cast instead of dynamic_ / qobject_
– Functions taking pointers in argument if they don’t take ownership somewhere (they could take references instead).
– Functions taking non-const elements which they do not modify (a bit tricky because, for instance, the function could be virtual and the variable could be modified by an override of the same function in a child class)
– Elements initialized in the ctor instead of the member init-list
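The last item in this list can be sketched as follows. `Probe` is a hypothetical type that merely counts default constructions and assignments; it shows that assigning in the constructor body first default-constructs the member and then assigns to it, while the member init-list constructs it once.

```cpp
// Hypothetical probe type that counts how it is constructed and assigned.
struct Probe {
    static int defaultCtors;
    static int assignments;
    Probe() { ++defaultCtors; }
    explicit Probe(int) {}
    Probe &operator=(const Probe &) { ++assignments; return *this; }
};
int Probe::defaultCtors = 0;
int Probe::assignments = 0;

// Pattern a checker might flag: `p` is default-constructed before the
// body runs, then overwritten by assignment.
struct AssignsInBody {
    Probe p;
    explicit AssignsInBody(int v) { p = Probe(v); }
};

// Suggested form: `p` is constructed directly from `v`, exactly once.
struct UsesInitList {
    Probe p;
    explicit UsesInitList(int v) : p(v) {}
};
```

For members of built-in type the compiler will most likely generate identical code either way, so the check is mainly interesting for members with non-trivial constructors.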

And more broadly-scoped ones :
– Long types used in more than one place (e.g. QMap<QPair, std::tuple>) which should be given a name instead
– Code that would benefit from inlining (dunno if it’s doable / computable ? maybe if it’s very few LLVM instructions ?)

Jesper Juhl

08.09.2015

7:15 am

– Check for use of ‘new’ to initialize unique_ptr/shared_ptr rather than using make_unique/make_shared.

– Suggest using ‘extern template’ if the same template is instantiated in multiple compilation units (to improve build performance, not runtime performance).

– Suggest using emplace()/emplace_back() rather than insert()/push_back() if the object in question is an rvalue.

– Suggest removing virtual if a class has virtual functions but no other classes inherit from it and the virtual functions also do not override any from base classes.

– If a class has a move constructor that is not ‘noexcept’, suggest making it so, so that it can be used by functions that rely on std::move_if_noexcept to decide whether to move or copy objects (for example std::vector::resize()).
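The noexcept point can be observed with a small experiment. This is a sketch with hypothetical types; it uses reserve() rather than resize() to force a reallocation, on the assumption that the standard library relocates elements the same way in both cases.

```cpp
#include <type_traits>
#include <vector>

// Move constructor that may throw: vector reallocation falls back to copying.
struct ThrowingMove {
    static int copies;
    ThrowingMove() = default;
    ThrowingMove(const ThrowingMove &) { ++copies; }
    ThrowingMove(ThrowingMove &&) {}  // not noexcept
};
int ThrowingMove::copies = 0;

// Same type, but the move constructor promises not to throw.
struct NoexceptMove {
    static int copies;
    NoexceptMove() = default;
    NoexceptMove(const NoexceptMove &) { ++copies; }
    NoexceptMove(NoexceptMove &&) noexcept {}
};
int NoexceptMove::copies = 0;

static_assert(!std::is_nothrow_move_constructible<ThrowingMove>::value,
              "move may throw, so the vector prefers the copy constructor");
static_assert(std::is_nothrow_move_constructible<NoexceptMove>::value,
              "move cannot throw, so the vector can move elements");

// Counts copy constructions caused by one forced reallocation.
template <typename T>
int copiesDuringGrowth() {
    std::vector<T> v(4);          // four default-constructed elements
    T::copies = 0;
    v.reserve(v.capacity() + 1);  // forces a reallocation
    return T::copies;
}
```

With the throwing move constructor the existing elements are copied into the new storage; with the noexcept one they are moved and no copies are counted.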

trackvegeta

08.09.2015

9:48 am

It seems that your plug-in is made for checking the use of Qt.

It would be great to separate the different kinds of checks you do.

You could make a plug-in with several kinds of checks:

1) Checks for pre-C++11 code
2) Checks for C++11 code
3) Checks for specific libraries (like Qt …)

It would be great to split the different kinds of checks into several plug-ins.

mlim

08.09.2015

1:27 pm

Check if a class is following the “rule of five”: if it defines any of the 5 functions the compiler can otherwise generate (dtor, copy ctor and assignment, move ctor and assignment), it should declare all of them.

Even continuing to follow the “rule of three” under C++11 can leave some performance on the table, because once the destructor or one of the copy operations is user-declared, the compiler won’t generate the move operations implicitly.
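A short sketch of that effect, using hypothetical types: `Tracker` counts copies and moves, and the two wrapper classes differ only in whether a destructor is declared.

```cpp
#include <utility>

// Hypothetical member type that counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker &) { ++copies; }
    Tracker(Tracker &&) noexcept { ++moves; }
    Tracker &operator=(const Tracker &) = default;
    Tracker &operator=(Tracker &&) = default;
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Declares only a destructor: the implicit move constructor is not
// generated, so std::move() silently falls back to the copy constructor.
struct DtorOnly {
    Tracker t;
    ~DtorOnly() {}
};

// Declares none of the five: the compiler generates copy and move.
struct DeclaresNothing {
    Tracker t;
};

// Move-constructs one T from another so the counters can be inspected.
template <typename T>
void moveConstructOnce() {
    T a;
    T b(std::move(a));
    (void)b;
}
```

After `moveConstructOnce<DtorOnly>()` the copy counter has gone up, while `moveConstructOnce<DeclaresNothing>()` increments the move counter instead. Declaring all five functions (for example with `= default`) restores the move for a class that really needs its own destructor.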

Paul Barbot

09.09.2015

8:06 pm

Very nice. I was considering using libclang to compute the “sizeof” of all the classes/structs defined in a project. Your checks take object usage into account, which is even better.

Maybe integrate your checks with clang-tidy (http://clang.llvm.org/extra/clang-tidy/index.html). This tool is handy if you just want the analysis without the compilation/code-emission.

Jesper Juhl

11.09.2015

5:29 pm

A few more ideas:

– Check for uses of statics in header files. They will get a copy in multiple compilation units, and if the linker is not smart (enough) about removing them they can bloat your libraries and executables (which can have a performance impact).

– Check for classes/structs where shuffling the order of member variables can significantly reduce the amount of padding the compiler has to add. Smaller objects are always nice, but when reordering members can bring a class size down from occupying more than 1 cache line to less than one then there can really be significant performance gains.

– Look for code that uses “out parameters” that should just use a normal return value. The RVO and NRVO optimizations can kick in for normal returns – not so much for out parameters.

– Check for empty user-defined implementations of destructors and default constructors and suggest that they either be removed or defaulted (=default) instead. When user-defined they are by definition never trivial, but when they are trivial they can often enable compiler optimizations that are otherwise not available (especially if making those functions trivial makes the entire type POD).
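Two of these ideas (member reordering and empty destructors) can be checked with nothing more than sizeof and type traits. The struct names below are made up for the example, and the exact sizes depend on the target ABI, so the code only compares them.

```cpp
#include <type_traits>

// Padding: the same three members in two different orders.
struct Padded    { char a; double b; char c; };  // padding after a and after c
struct Reordered { double b; char a; char c; };  // padding only at the end

static_assert(sizeof(Reordered) <= sizeof(Padded),
              "grouping the small members should never make the struct larger");

// Triviality: an empty user-provided destructor makes the type
// non-trivially destructible; "= default" keeps it trivial.
struct EmptyDtor     { int x; ~EmptyDtor() {} };
struct DefaultedDtor { int x; ~DefaultedDtor() = default; };

static_assert(!std::is_trivially_destructible<EmptyDtor>::value,
              "user-provided destructor is never trivial");
static_assert(std::is_trivially_destructible<DefaultedDtor>::value,
              "defaulted destructor stays trivial");
```

On a typical 64-bit target `Padded` occupies 24 bytes and `Reordered` 16, but a checker would have to query the actual record layout rather than assume those numbers.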

Jesper Juhl

11.09.2015

5:52 pm

By the way. Would you appreciate outside contributions on the code or would you rather just get ideas and input and implement them yourself?
I think a tool like this is a great idea and I’d be happy to contribute (with what little time I have (which may be zero)). But I won’t bother if you’d rather just have suggestions and evolve it on your own…?

Sérgio Martins 11.09.2015 6:09 pm

I would absolutely accept contributions, I’m even moving the repo to KDE playground where everyone can participate.

I wonder however which types of new checks we want. For pure C++ checks we already have clang static analyser and clang-tidy, so maybe we should be collaborating with them.

For Qt specific checks I think clazy is the right place for it.

1. Jesper Juhl 11.09.2015 6:47 pm
  
  Clang tidy and analyzer are focused on finding bugs. As I see it, clazy fills a different niche; finding performance issues.
  So I’d say both Qt and plain C++ checks could go in, as long as they are performance related. For bug checks, submitting those to clang analyzer or tidy would make sense.
  Just my opinion 🙂
  
Ram

02.10.2015

12:16 pm

Hi,

need a bit of help please. I’ve downloaded clazy from https://quickgit.kde.org/?p=scratch%2Fsmartins%2Fclazy.git and clang from http://llvm.org/releases/download.html. Compilation failing here.

Downloads/scratch-smartins-clazy/Utils.cpp:859:48: error: no matching function for call to ‘std::vector::erase(__gnu_cxx::__normal_iterator<clang::DeclContext* const*, std::vector >&, __gnu_cxx::__normal_iterator<clang::DeclContext* const*, std::vector >)’
methodContexts.erase(it, it + 1);
^
Downloads/scratch-smartins-clazy/Utils.cpp:859:48: note: candidates are:
In file included from /usr/include/c++/4.8/vector:69:0,
from /usr/include/c++/4.8/bits/random.h:34,
from /usr/include/c++/4.8/random:50,
from /usr/include/c++/4.8/bits/stl_algo.h:65,
from /usr/include/c++/4.8/algorithm:62,
from /usr/include/llvm/ADT/SmallVector.h:22,
from /usr/include/llvm/Support/Allocator.h:24,
from /usr/include/clang/AST/ASTVector.h:23,
from /usr/include/clang/AST/ASTUnresolvedSet.h:18,
from /usr/include/clang/AST/DeclCXX.h:19,
from Downloads/scratch-smartins-clazy/Utils.h:31,
from Downloads/scratch-smartins-clazy/Utils.cpp:28:
/usr/include/c++/4.8/bits/vector.tcc:134:5: note: std::vector::iterator std::vector::erase(std::vector::iterator) [with _Tp = clang::DeclContext*; _Alloc = std::allocator; std::vector::iterator = __gnu_cxx::__normal_iterator<clang::DeclContext**, std::vector >; typename std::_Vector_base::pointer = clang::DeclContext**]
vector::
^
/usr/include/c++/4.8/bits/vector.tcc:134:5: note: candidate expects 1 argument, 2 provided
/usr/include/c++/4.8/bits/vector.tcc:146:5: note: std::vector::iterator std::vector::erase(std::vector::iterator, std::vector::iterator) [with _Tp = clang::DeclContext*; _Alloc = std::allocator; std::vector::iterator = __gnu_cxx::__normal_iterator<clang::DeclContext**, std::vector >; typename std::_Vector_base::pointer = clang::DeclContext**]
vector::
^
/usr/include/c++/4.8/bits/vector.tcc:146:5: note: no known conversion for argument 1 from ‘__gnu_cxx::__normal_iterator<clang::DeclContext* const*, std::vector >’ to ‘std::vector::iterator {aka __gnu_cxx::__normal_iterator<clang::DeclContext**, std::vector >}’
make[2]: *** [CMakeFiles/ClangLazy.dir/Utils.cpp.o] Error 1
make[1]: *** [CMakeFiles/ClangLazy.dir/all] Error 2
make: *** [all] Error 2

Thanks in advance.

Sérgio Martins 02.10.2015 12:20 pm

Which compiler are you using to compile clazy itself ?
While it should also compile fine with gcc, please try clang first (export CXX=clang++)

Nyall Dawson

06.10.2015

10:37 pm

Fantastic stuff – I’m loving it!

One small issue: with the function-args-by-ref check, I see warnings generated whenever the Q_DECLARE_OPERATORS_FOR_FLAGS macro is used. Could this macro be ignored for this test?

Sérgio Martins 07.10.2015 10:00 am

will have a look, thanks

Sérgio Martins 07.10.2015 9:48 pm

I can’t reproduce this problem.
Please send me a minimal testcase (a compilable test.cpp).

1. Nyall Dawson 08.10.2015 10:51 pm
  
  I suspect it may be because I’m building my project using Qt 4.8. I’ll investigate further.
  
  In the meantime, would you consider adding this check?
  https://github.com/nyalldawson/clazy/commit/e1d08e988ee66bc16abed6655fb86024dc7fd75f
  
  It’s based heavily off the existing detach temporary check, but is designed to warn when iterator functions like begin() and end() are called on temporary containers. I got hit by this bug yesterday and thought it would make a good clazy check! The code may need some cleaning, but it works well for me.
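For readers who have not run into this bug, here is a minimal sketch of the pattern, using std::vector and a hypothetical `makeValues()` function instead of a Qt container. It is not the code from the linked commit.

```cpp
#include <vector>

// Hypothetical function that returns a container by value.
std::vector<int> makeValues() {
    return {1, 2, 3};
}

// The pattern such a check would flag is left as a comment because it is
// undefined behaviour: begin() and end() come from two different
// temporaries, and each temporary is destroyed at the end of its
// full expression.
//
//   for (auto it = makeValues().begin(); it != makeValues().end(); ++it)
//       use(*it);

// Safe form: keep the container alive in a named variable first.
int sumValues() {
    const std::vector<int> values = makeValues();
    int sum = 0;
    for (auto it = values.begin(); it != values.end(); ++it)
        sum += *it;
    return sum;
}
```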
  
  1. Sérgio Martins 09.10.2015 7:53 pm
    
    Thanks, seems very useful!
    
    Filed as https://bugs.kde.org/show_bug.cgi?id=353732
    
Alexey Ivanov

26.10.2015

2:45 am

[ 1%] Building CXX object src/thirdparty/qtlockedfile/CMakeFiles/qtlockedfile.dir/qtlockedfile.cpp.o
terminate called after throwing an instance of ‘std::bad_alloc’
what(): std::bad_alloc
0 clang-3.7 0x00000000013e06b8 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 40
1 clang-3.7 0x00000000013e1a1b
2 libpthread.so.0 0x00007ff6acdf1d10
3 libc.so.6 0x00007ff6ac1ab267 gsignal + 55
4 libc.so.6 0x00007ff6ac1aceca abort + 362
5 libstdc++.so.6 0x00007ff6ac7e6b7d __gnu_cxx::__verbose_terminate_handler() + 365
6 libstdc++.so.6 0x00007ff6ac7e49c6
7 libstdc++.so.6 0x00007ff6ac7e4a11
8 libstdc++.so.6 0x00007ff6ac7e4c29
9 libstdc++.so.6 0x00007ff6ac7e51cc
10 ClangLazy.so 0x00007ff6ab3a3553
11 ClangLazy.so 0x00007ff6ab39fc4b OldStyleConnect::OldStyleConnect(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 107
12 ClangLazy.so 0x00007ff6ab3a3233
13 ClangLazy.so 0x00007ff6ab39154f CheckManager::createCheck(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 319
14 ClangLazy.so 0x00007ff6ab3924d4 CheckManager::createChecks(std::vector<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > >) + 516
15 ClangLazy.so 0x00007ff6ab3c9ce1
16 clang-3.7 0x000000000173b4ea clang::FrontendAction::CreateWrappedASTConsumer(clang::CompilerInstance&, llvm::StringRef) + 330
17 clang-3.7 0x000000000173c092 clang::FrontendAction::BeginSourceFile(clang::CompilerInstance&, clang::FrontendInputFile const&) + 2578
18 clang-3.7 0x000000000170d4c7 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 695
19 clang-3.7 0x00000000017a9de3 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 3123
20 clang-3.7 0x00000000006f6634 cc1_main(llvm::ArrayRef, char const*, void*) + 1156
21 clang-3.7 0x00000000006f586f main + 11983
22 libc.so.6 0x00007ff6ac196a40 __libc_start_main + 240
23 clang-3.7 0x00000000006f2893

Using the prebuilt clang from the LLVM website, on Ubuntu 15.10, with this cmake command:

cmake .. -DCMAKE_INSTALL_PREFIX=/opt/dev_qt5 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS_DEBUG="-Xclang -load -Xclang /opt/clang37/lib/ClangLazy.so -Xclang -add-plugin -Xclang clang-lazy" -DCMAKE_PREFIX_PATH=/opt/clang37

Sérgio Martins 26.10.2015 11:37 am

Be sure not to mix different ABIs. gcc 5 now uses the new ABI (where std::string isn’t COW), and the pre-built binaries of clang might still be using the old ABI.
So try clang from the ubuntu repos:
apt-get install g++ cmake clang llvm git-core libclang-3.6-dev
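One way to see which std::string ABI a given compiler setup selects is libstdc++'s `_GLIBCXX_USE_CXX11_ABI` macro. The helper below is only a sketch with a made-up function name; it reports how the translation unit containing it was compiled and says nothing about how a prebuilt clang binary was built.

```cpp
#include <string>  // any libstdc++ header defines the macro if it is supported

// Reports which std::string ABI this translation unit was compiled against.
const char *stdStringAbi() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
#  if _GLIBCXX_USE_CXX11_ABI
    return "libstdc++, new ABI (std::string is not copy-on-write)";
#  else
    return "libstdc++, old ABI (copy-on-write std::string)";
#  endif
#else
    return "not libstdc++, or a libstdc++ that predates the dual ABI";
#endif
}
```

Compiling this once with the compiler used for the plugin and once with the one used for the project gives a quick hint whether the two disagree.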

1. Alexey Ivanov 26.10.2015 12:49 pm
  
  problem with clang from official repos:
  
  /usr/lib/llvm-3.6/bin/clang: symbol lookup error: /usr/lib/ClangLazy.so: undefined symbol: _ZNK5clang15DeclarationName11getAsStringEv
  
  1. Alexey Ivanov 26.10.2015 4:04 pm
    
    Ok. it works now. Requires libc++-dev libc++abi-dev packages. And -stdlib=libc++ flag.
    
    Clazy built with command: cmake .. -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS_RELEASE="-std=c++11 -stdlib=libc++"
    
    And program with command: qmake ../vacuum.pro -r INSTALL_PREFIX=/dev/qt5 CONFIG-=release CONFIG+=debug -spec linux-clang-libc++ QMAKE_CXXFLAGS+="-std=c++11 -stdlib=libc++ -Xclang -load -Xclang ClangLazy.so -Xclang -add-plugin -Xclang clang-lazy"
    
  2. Alexey Ivanov 26.10.2015 4:07 pm
    
    Oops, nvm, works without CLAZY_CHECKS env.
    
    And “symbol lookup error” with export CLAZY_CHECKS=all_checks
    
    1. Alexey Ivanov 26.10.2015 4:13 pm
      
      Looks like a problem with the inefficient-qlist check.

Eric Lemanissier

01.11.2015

1:19 pm

Thanks for this great tool !
I am getting a lot of “c++11 range-loop might detach Qt container” warnings, especially when iterating over member variables in non-const member functions. What would be the best way to make sure it does not detach? The only way I can come up with is “for(const auto &e: static_cast(this)->member)”, but it feels ugly.
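To show what the warning is about without depending on Qt, here is a sketch using a tiny copy-on-write class. `CowInts` and `asConst` are hypothetical stand-ins written for this example: `CowInts` imitates an implicitly shared container, and `asConst` does what std::as_const (C++17) or Qt's later qAsConst do.

```cpp
#include <initializer_list>
#include <memory>
#include <vector>

// Hypothetical stand-in for an implicitly shared container; not Qt code.
class CowInts {
public:
    CowInts(std::initializer_list<int> v)
        : d(std::make_shared<std::vector<int>>(v)) {}

    // Non-const iteration must detach, because the caller might write.
    int *begin() { detach(); return d->data(); }
    int *end() { detach(); return d->data() + d->size(); }

    // Const iteration can read the shared data directly.
    const int *begin() const { return d->data(); }
    const int *end() const { return d->data() + d->size(); }

    static int deepCopies;

private:
    void detach() {
        if (d.use_count() > 1) {  // shared with another instance: deep copy
            d = std::make_shared<std::vector<int>>(*d);
            ++deepCopies;
        }
    }
    std::shared_ptr<std::vector<int>> d;
};
int CowInts::deepCopies = 0;

// Returns a const view of an lvalue, so the const begin()/end() are chosen.
template <typename T>
const T &asConst(T &t) { return t; }

// Range-for on a non-const container calls the detaching begin()/end().
int sumDetaching(CowInts &c) {
    int sum = 0;
    for (int x : c) sum += x;
    return sum;
}

// Range-for on a const view never detaches.
int sumConstView(CowInts &c) {
    int sum = 0;
    for (int x : asConst(c)) sum += x;
    return sum;
}
```

If a second `CowInts` shares the data, `sumDetaching` triggers one deep copy while `sumConstView` triggers none, and both return the same sum.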

Sérgio Martins 01.11.2015 3:10 pm

Use foreach instead:
foreach (const auto &e, member) {}

1. ericLemanissier 02.11.2015 2:10 pm
  
  Really ? I thought foreach was slower than range-based for loop (if there is not detach) : https://bugreports.qt.io/browse/QTBUG-41636
  
  1. Sérgio Martins 02.11.2015 3:15 pm
    
    Note the difference in orders of magnitude:
    
    range-for with detachment:
    RESULT : Rangefor_Test::rangeFor_Vector():
    7.1 msecs per iteration (total: 57, iterations: 8)
    
    range-for without detachment:
    RESULT : Rangefor_Test::rangeFor_ConstVector():
    0.000023 msecs per iteration (total: 99, iterations: 4194304)
    
    foreach:
    RESULT : Rangefor_Test::forEach_Vector():
    0.000046 msecs per iteration (total: 97, iterations: 2097152)
    
    Yes the foreach takes 2x as long as a range-loop that carefully doesn’t detach.
    But a range-loop that detaches takes roughly 150000x as long as foreach.
    
    The clazy foreach/range-loop check was made to tackle the latter.
    
    1. ericLemanissier 02.11.2015 4:20 pm
      
      These numbers actually do not have any meaning, because the loops were totally optimized away. Please refer to Torgeir Lilleskog’s comment : https://bugreports.qt.io/browse/QTBUG-41636?focusedCommentId=258507&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-258507
      
      According to him, switching from a detaching range-for to foreach is a 4x speedup, but switching from foreach to a non-detaching range-for is a 2x speedup, which is quite significant.

Eugene Zelenko

17.11.2015

2:10 am

A lot of suggested checks are already implemented in Clang-tidy (http://clang.llvm.org/extra/clang-tidy/). C++11 migration checks were moved in there recently.

By the way, it may make sense to port Clazy checks to Clang-tidy.
