Saturday, 18 June 2016

gcc - Is it intended by the C++ standards committee that in C++11 unordered_map destroys what it inserts?




I've just lost three days of my life tracking down a very strange bug where unordered_map::insert() destroys the variable you insert. This highly non-obvious behaviour occurs in very recent compilers only: I found that clang 3.2-3.4 and GCC 4.8 are the only compilers to demonstrate this "feature".



Here's some reduced code from my main code base which demonstrates the issue:



#include 
#include
#include

int main(void)
{

std::unordered_map> map;
auto a(std::make_pair(5, std::make_shared(5)));
std::cout << "a.second is " << a.second.get() << std::endl;
map.insert(a); // Note we are NOT doing insert(std::move(a))
std::cout << "a.second is now " << a.second.get() << std::endl;
return 0;
}


I, like probably most C++ programmers, would expect output to look something like this:




a.second is 0x8c14048
a.second is now 0x8c14048


But with clang 3.2-3.4 and GCC 4.8 I get this instead:



a.second is 0xe03088
a.second is now 0



Which might make no sense, until you examine closely the docs for unordered_map::insert() at http://www.cplusplus.com/reference/unordered_map/unordered_map/insert/ where overload no 2 is:



template  pair insert ( P&& val );


Which is a greedy universal reference move overload, consuming anything not matching any of the other overloads, and move constructing it into a value_type. So why did our code above choose this overload, and not the unordered_map::value_type overload as probably most would expect?



The answer stares you in the face: unordered_map::value_type is a pair<const int, std::shared_ptr> and the compiler would correctly think that a pair<int, std::shared_ptr> isn't convertible. Therefore the compiler chooses the move universal reference overload, and that destroys the original, despite the programmer not using std::move() which is the typical convention for indicating you are okay with a variable getting destroyed. Therefore the insert destroying behaviour is in fact correct as per the C++11 standard, and older compilers were incorrect.




You can probably see now why I took three days to diagnose this bug. It was not at all obvious in a large code base where the type being inserted into unordered_map was a typedef defined far away in source code terms, and it never occurred to anyone to check if the typedef was identical to value_type.



So my questions to :




  1. Why do older compilers not destroy variables inserted like newer compilers? I mean, even GCC 4.7 doesn't do this, and it's pretty standards conforming.


  2. Is this problem widely known, because surely upgrading compilers will cause code which used to work to suddenly stop working?


  3. Did the C++ standards committee intend this behaviour?


  4. How would you suggest that unordered_map::insert() be modified to give better behaviour? I ask this because if there is support here, I intend to submit this behaviour as a N note to WG21 and ask them to implement a better behaviour.




Answer



As others have pointed out in the comments, the "universal" constructor is not, in fact, supposed to always move from its argument. It's supposed to move if the argument is really an rvalue, and copy if it's an lvalue.



The behaviour, you observe, which always moves, is a bug in libstdc++, which is now fixed according to a comment on the question. For those curious, I took a look at the g++-4.8 headers.



bits/stl_map.h, lines 598-603



  template           std::enable_if                                                _Pair&&>::value>::type>

std::pair
insert(_Pair&& __x)
{ return _M_t._M_insert_unique(std::forward<_Pair>(__x)); }


bits/unordered_map.h, lines 365-370



  template           std::enable_if                                                _Pair&&>::value>::type>

std::pair
insert(_Pair&& __x)
{ return _M_h.insert(std::move(__x)); }


The latter is incorrectly using std::move where it should be using std::forward.


No comments:

Post a Comment

c++ - Does curly brackets matter for empty constructor?

Those brackets declare an empty, inline constructor. In that case, with them, the constructor does exist, it merely does nothing more than t...