On the Removal of toSet(), toList() and Others

17 thoughts on “On the Removal of toSet(), toList() and Others”

David Nolden 23.06.2021 8:21 pm

Hi, while some of the code snippets look interesting, all of this is still much more complicated to read and maintain than code using the simple “toSet()” function. For most code written in Qt, it is more important that the code is CORRECT than it is that the code is PERFORMANT. Forcing users to think about algorithms in code paths that are not performance critical IMO means premature optimization, which, as some people claim, is the root of all evil.

1. Giuseppe D'Angelo 24.06.2021 1:15 pm
  Hi,
  
  let me try to counter-argue a bit, sorry for nitpicking on your answer. 🙂
  
  Hi, while some of the code snippets look interesting, all of this is still much more complicated to read and maintain than code using the simple “toSet()” function.
  
  I’d say that’s partially true. However,
  
  a solution like ranges::to or kdToContainer isn’t that much more complicated or verbose to use. (And please bear with me: if the documentation and Google hits hinted at those instead of toSet, you’d just be using those without wondering “why isn’t there a toSet() function?” I mean, is anyone asking why there isn’t a “toStdDeque()”? And, very ironically, there is a QList::toStdList()…)
  
  If this is truly supposed to be a one-off in the code, then one can accept a tiny bit of extra verbosity.
  
  If usages of toSet() aren’t a one-off in the code, then this is no longer a matter of premature optimization; this is really optimization.
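  
  To give an idea of how little machinery a pipe-style converter needs, here is a minimal sketch. It uses standard containers so that it compiles without Qt, and the names `ToContainer` and `toContainer` are hypothetical; this is not the actual KDToolBox or range-v3 implementation, only an illustration of the idea.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical tag type: remembers which container template the caller asked for.
template <template <typename...> class Container>
struct ToContainer {};

// Hypothetical factory, so call sites read `range | toContainer<std::unordered_set>()`.
template <template <typename...> class Container>
ToContainer<Container> toContainer() { return {}; }

// The pipe operator takes the element type from the source range and builds
// the destination through its iterator-range constructor.
template <typename Range, template <typename...> class Container>
auto operator|(const Range &range, ToContainer<Container>)
{
    using Value = typename Range::value_type;
    return Container<Value>(range.begin(), range.end());
}
```

  A call site would then look like `auto set = values | toContainer<std::unordered_set>();`, with the element type deduced from `values`.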
  
  For most code written in Qt, it is more important that the code is CORRECT than it is that the code is PERFORMANT. Forcing users to think about algorithms in code paths that are not performance critical IMO means premature optimization
  
  As I’ve tried to argue, this is a double-edged sword: there is a danger in offering an easy-to-misuse API.
  
  First and foremost, offering the convenience is, well, convenient; the downside is that such convenience gets used even in code that is supposed to be performant.
  
  Second, this risks bringing in a “death by a thousand cuts”. There isn’t just premature optimization; there’s also premature pessimization, caused by making a poor choice where a better one would’ve cost maybe 5 more seconds of typing. The danger of premature pessimization is that if one has slightly inefficient code all over the place, there is absolutely no easy way for someone to improve the overall performance of an application; a profiler will not be able to pinpoint a few easy problems to fix. That’s because the inefficiency, eventually, becomes so thinly spread that finding the causes and fixing them becomes a super-massive investment (e.g. go around and change all of your container classes, string classes, and so on). I’d rather take the extra second before making a “worse” decision every time than have to revisit the code N years down the line in a desperate search for how we can make the application boot in 1 second instead of 5.
  
  In a nutshell: the ultimate goal would be that the default easy-to-go choice should also be the most efficient one; that’s very hard to do with a “blatant”, possibly non-efficient choice available immediately.
  
  (Eventually, if one goes down this path, this leads me to question whether Qt users would be better served by using Python or other languages where these conversion facilities are “first-class”, rather than C++? It’s an important question to ask, for Qt’s own survival.)
  
  Thanks for the comment!
  1. David Nolden 24.06.2021 2:31 pm
    
    Thanks for the reply.
    
    It’s interesting that you bring in Python.
    
    IMO, one of the great advantages of Qt (especially its container classes) is/was that they are much easier to use than their STL counterparts (at the price of less flexibility), and that they can be programmed with a convenience similar to programming in Python. We’re operating on a continuum here, from best to worst performance, and worst to best convenience:
    * A custom container class optimized with custom algorithms for your specific use case (most performant, least convenient)
    * STL containers using STL algorithms and all the template specialization possibilities (performant, more convenient)
    * Qt containers (less performant, even more convenient)
    * Python (worst performance, highest convenience, bad long-term maintainability)
    
    It should be everybody’s own choice which level of performance and convenience they want to use, but an advantage of CONVENIENT Qt containers is that you can get the best of both worlds: the convenience of Python, but the performance of C++, if you can find the performance critical paths and optimize them at a later point.
    
    1. Giuseppe D'Angelo 25.06.2021 3:22 pm
      
      Hi,
      
      Well, in the blog post itself I specifically argue that yes, Qt containers do offer some extra convenience, and I’m completely fine with it. What I’m less fine with is when this extra convenience starts falling on the verge of “easy to use, easy to misuse”. Something like indexOf() doesn’t fall there, something like toSet() does (IMNSHO). The very fact that there’s a lot of reactions to its deprecation is also somehow a warning sign to me: does it mean that, in most codebases, it wasn’t a one-off usage that can be quickly fixed — and move on? Does it mean that it was extensively used? Doesn’t that warrant that the code should be at least checked for its algorithmic characteristics?
      
      Aside: the “bad long-term maintainability” of Python is a case of [citation needed]. And, “if you can find the performance critical paths and optimize them at a later point” is a tremendous effort. As I said in my comment before: if most of your code is ever so slightly “slow”, you won’t find them easily. This has historically led to some quite drastic decisions, like rewriting everything in another framework or another programming language. Do you want a data point? Qt for MCU, in order to cope with the small CPU/memory requirements, has a complete reimplementation of the Qt string/container classes, to be more efficient. That’s because it’s not about ONE specific container instance or string instance, it’s about the fact that ALL of them are slightly slow (because their implementation in “mainstream” Qt is). I fail to see why non-MCU users should not get the same benefits. (Ok, now the discussion is completely derailed, apologies. :-))
      
      Thanks again,
      
David Nolden 23.06.2021 9:59 pm

Some say that “premature optimization is the root of all evil”, and by forcing people to think about algorithms and use more complex code, you’re forcing them to do exactly that.

1. Ilya 30.06.2021 10:59 am
  
  Why not add a list.unique() to ease the migration from all the toSet().toList() calls? That would be very convenient, as it allows for bulk search & replace.
  
2. André Somers 30.06.2021 11:18 am
  
  Others say that algorithms express intent much better than all the ways people invent to avoid them. More expressive code is easier to read and easier to maintain, contains fewer bugs and is generally faster.
  
Tired Today 27.07.2021 8:07 am

Giuseppe, you said that the Qt policy is to have nice-to-use APIs, and yet you are here defending that people should use a non-discoverable (no auto-completion), much larger and more error-prone API. See the huge contradiction?

The only way in which this would not be contradictory is because, as you said, you are a C++ teacher. You know what? Many people are not with Qt because of C++, but in spite of C++.

You will not drive people from C++ to Qt, but you will drive Qt people away from Qt with these kinds of changes.

1. Giuseppe D'Angelo 27.07.2021 10:57 pm
  
  Hi!
  
  Giuseppe, you said that the Qt policy is to have nice-to-use APIs, and yet you are here defending that people should use a non-discoverable (no auto-completion), much larger and more error-prone API. See the huge contradiction?
  
  Well, I also said that nice-to-use shouldn’t clash with easy-to-misuse. I do see the struggle, and find it hard to draw the line. And I don’t claim I have all the answers. Thus, this blog post. 🙂
  
  The only way in which this would not be contradictory is because, as you said, you are a C++ teacher. You know what? Many people are not with Qt because of C++, but in spite of C++.
  
  Well, this opens an interesting conversation. For instance, would users prefer to use Qt combined with another language (Python)?
  
round earther 02.08.2021 3:59 pm

Have I read this correctly? Are you suggesting I replace toSet with

std::sort(container.begin(), container.end());
container.erase(std::unique(container.begin(), container.end()), container.end());

1. I don’t want C++ garbage in my beautiful Qt code

2. You can’t be serious

1. Giuseppe D'Angelo 05.08.2021 6:19 pm
  
  Hi!
  
  This sounds like an over-simplification of my argument, given the code you pasted doesn’t even do the same thing; so I’m not sure what you meant there…?
  
  PS: I am serious, and don’t call me Shirley.
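  
  To make the difference between the two operations concrete, here is a small sketch with standard containers, so that it compiles without Qt. The function names are hypothetical. Sorting plus `std::unique` modifies the sequence in place and reorders it, while a set conversion leaves the source untouched and produces a different container type.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Removes duplicates in place. Side effect: the container ends up sorted,
// so the original element order is lost.
void removeDuplicatesInPlace(std::vector<int> &container)
{
    std::sort(container.begin(), container.end());
    container.erase(std::unique(container.begin(), container.end()),
                    container.end());
}

// Builds a separate set; the source sequence is left untouched.
std::unordered_set<int> asSet(const std::vector<int> &container)
{
    return std::unordered_set<int>(container.begin(), container.end());
}
```

  Which of the two is appropriate depends on whether the caller needs a deduplicated sequence or an actual set with fast lookups.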
  
Tristan Lewis 21.08.2021 5:13 am

Thanks for making these kinds of performance improving choices! We’re working with an older code base that needs performance improvements and is suffering from “death by a thousand cuts” issues.

These kinds of changes, along with articles like this, help give us a better idea of where to head!

1. Giuseppe D'Angelo 22.08.2021 8:48 am
  
  Thanks, glad you liked it.
  
Doug Rogers 20.12.2021 5:46 pm
You could use these functions to replace the Qt functions
```
template<typename T> QSet<T> fromList(QList<T> &list)
{
    QSet<T> set(list.begin(), list.end());
    return set;
}

template<typename T> QSet<T> toSet(QList<T> &list)
{
    QSet<T> set(list.begin(), list.end());
    return set;
}
```
1. Giuseppe D'Angelo 20.12.2021 7:55 pm
  Hi. Took the liberty of fixing formatting in the above, as the blog ate some code parts.
  
  Sure, the above “works”. But:
  
  1) fromList is such a generic term that it shouldn’t be “reserved” by something that makes QSets out of lists. Nowhere in its name does it say that it’s building QSets.
  
  2) Again, this functionality is packaged in a *much* more generic and convenient form by range-v3, kdToContainer, and so on.
  
  QList<int> myList = ~~~; auto set1 = myList | kdToContainer<QSet>();
  
  3) I still struggle to understand why this specific functionality is so much wanted. What is it about conversions from/to sets that deserves dedicated functions? Why not generic conversions?
  
  4) The signatures above are actually wrong — so don’t use that snippet as-is.
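  
  The comment does not spell out what is wrong with the signatures. One plausible reading is that taking the list by non-const reference rejects const lists and temporaries. Under that assumption, a generic helper could look like the sketch below. The name `toSetOf` is hypothetical, and standard containers are used so that the example compiles without Qt; a container with an iterator-range constructor, as QSet is used in the snippet above, should work the same way.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper: the destination type is spelled out at the call site,
// and the source is taken by const reference so that const containers and
// temporaries are accepted as well.
template <typename Set, typename Container>
Set toSetOf(const Container &source)
{
    return Set(source.begin(), source.end());
}
```

  For example, `toSetOf<std::unordered_set<int>>(values)` works whether `values` is const, non-const, or a temporary.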
Pavel Celba 13.10.2022 3:06 pm

I disagree that std::sort() + std::unique() is the best way for any large data set, e.g. 100M entries. I’d say the most performant way to make the list unique will still be myList.toSet().toList(), as it’s an O(N) algorithm, compared to the O(N * log N) that you need for sorting. Memory-wise it will not be the best, but it is the better choice if you are interested in performance.

Or do you have hard data that it’s the other way around???

1. Giuseppe D'Angelo 13.10.2022 4:16 pm
  
  Hi,
  
  Please be very aware of the hidden constants in the big-O notation, as well as the hidden amortized costs. Yes, creating a set from a sequence costs O(N) (but nitpicking: only in the optimal case where all insertions are O(1). If there are a lot of duplicates, insertions start costing a linear amount of time, possibly up to O(N²) overall). Hidden in that O(1) insertion cost there is a huge constant: the cost of allocating a piece of memory *per element* (at least that’s how QSet works in Qt 5, similarly to std::unordered_set). I’m not talking about the memory usage, I’m talking about CPU time. This cost (CPU time for O(N) allocations, O(N) deallocations) quickly trumps the sort+unique approach. For instance: https://quick-bench.com/q/Dk0bWszN-DVApEdcm4hZhS8-I2o
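  
  For readers who want to measure this on their own data, here is a rough sketch of the two approaches plus a tiny timing helper. It is not the linked benchmark: the function names are hypothetical, it uses `std::vector` and `std::unordered_set` rather than Qt containers, and results will vary with compiler, standard library and data distribution, so no figures are claimed here.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Counts distinct values by sorting a copy and dropping adjacent duplicates.
std::size_t countUniqueSorted(std::vector<int> values)
{
    std::sort(values.begin(), values.end());
    return static_cast<std::size_t>(
        std::unique(values.begin(), values.end()) - values.begin());
}

// Counts distinct values by inserting them into a hash set, which in common
// implementations allocates one node per stored element.
std::size_t countUniqueHashed(const std::vector<int> &values)
{
    return std::unordered_set<int>(values.begin(), values.end()).size();
}

// Returns the wall-clock time of one call to f, in milliseconds.
template <typename F>
double elapsedMs(F &&f)
{
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

  Calling `elapsedMs([&] { countUniqueSorted(data); })` and `elapsedMs([&] { countUniqueHashed(data); })` on the same input, in an optimized build and over several runs, gives a first impression; a proper benchmarking harness is still preferable for real decisions.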
  

On the Removal of toSet(), toList() and Others or "How Do I Convert a QList to QSet in Qt 6?"

This has made a lot of people very angry and been widely regarded as a bad move.

That’s why we…actually have nice things!

The struggle between “easy to use,” and “hard to misuse”

Summary (or: TL;DR: give me the solutions!)

“I need to eliminate duplicates from a linear, sequential container (`QList`, `QVector`, `std::vector`, etc.).”

“I need to process unique elements, skipping duplicates.”

“I really, really need to convert a QList (or another container) to a QSet.”

“It’s still me. I really, really need to convert a QList (or another container) to a QSet. AND I HATE THIS SYNTAX!”

“It’s still me. I really, really need to convert a QList (or another container) to a QSet. range-v3 sounds sweet, but I can’t use it yet — slows down my compilation times too much, etc.”

FAQs

Could Qt have left these methods in Qt 6, although deprecated?

Why not add some generic `Container::to<OtherContainer>()` functions?
