Safely passing std::strings and std::string_view
Many of you will agree that C++ is a language that comes with sharp edges.
One example is `std::string_view`: it was introduced to avoid unnecessary std::string copies, but it brings a new footgun, namely passing a temporary string into it:
#include <iostream>
#include <string>
#include <string_view>
auto foo(std::string_view s) {
return s;
}
int main()
{
std::string h("hello ");
auto x = foo(h + "world\n"); // passing in a temporary std::string
std::cout << x << '\n'; // BOOM! (the string_view x is dangling)
}
Let's introduce a safer wrapper type to solve this problem. The wrapper type has some overhead, though. On my platform (Linux), sizeof(std::string_view) is 16 bytes, whereas the following class occupies 40 bytes: roughly the size of a std::string plus the variant's discriminator.
class safe_string
{
std::variant<std::string, std::string_view> s_;
public:
safe_string(std::string_view s)
: s_{s} {}
safe_string(std::string const& s)
: s_{std::string_view{s}} {}
safe_string(std::string&& s)
: s_{std::move(s)} {}
safe_string(const char* s)
: s_{std::string_view{s}} {
}
std::string_view string_view() const [[clang::lifetimebound]]
{
if (std::string const* sptr = std::get_if<std::string>(&s_)){
return std::string_view{*sptr};
} else {
return std::get<std::string_view>(s_);
}
}
};
The idea is that we usually store a std::string_view, except when we pass in a temporary string; in that case we move the temporary into the class and store it as a std::string. We use C++'s sum type, std::variant, to store one of the two types in the same storage.
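To make the storage choice visible, here is a minimal sketch of the same class with one extra member, owns_string(). That member and the demo_storage() function are hypothetical additions for illustration only, and the lifetimebound attribute is left out to keep the sketch compiler-neutral.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <variant>

// Same shape as the safe_string class shown earlier.
class safe_string
{
    std::variant<std::string, std::string_view> s_;
public:
    safe_string(std::string_view s) : s_{s} {}
    safe_string(std::string const& s) : s_{std::string_view{s}} {}
    safe_string(std::string&& s) : s_{std::move(s)} {}
    safe_string(const char* s) : s_{std::string_view{s}} {}

    // Hypothetical helper: true when a temporary std::string was moved in.
    bool owns_string() const
    {
        return std::holds_alternative<std::string>(s_);
    }

    std::string_view string_view() const
    {
        if (auto const* sptr = std::get_if<std::string>(&s_)) {
            return std::string_view{*sptr};
        }
        return std::get<std::string_view>(s_);
    }
};

// Hypothetical usage: which alternative each kind of argument selects.
inline void demo_storage()
{
    std::string h("hello ");
    safe_string viewed(h);           // lvalue: only a view into h is stored
    safe_string owned(h + "world");  // temporary: moved into the variant
    assert(!viewed.owns_string());
    assert(owned.owns_string());
    assert(owned.string_view() == "hello world");
}
```

Overload resolution does the work here: an rvalue std::string binds to the std::string&& constructor, while an lvalue binds to the const& constructor and is only viewed.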
Note the use of the [[clang::lifetimebound]] attribute in the .string_view() member function; it generates a warning in case we use our class in the wrong way:
auto foo(safe_string h) {
return h;
}
int main()
{
std::string s("hi ");
auto okay = foo(s + "there\n");
std::cout << okay.string_view();
auto dangling = foo(s + "there").string_view(); // wrong use
std::cout << dangling;
}
The Clang compiler nicely warns us about this:
<source>:71:21: error: temporary whose address is used as value of local variable 'dangling' will be destroyed at the end of the full-expression [-Werror,-Wdangling]
71 | auto dangling = foo(s + "there").string_view();
| ^~~~~~~~~~~~~~~~
1 error generated.
See for yourself in Compiler Explorer:
It's not that safe. Dangling issues can still happen without a warning, for example:
std::optional<safe_string> opt;
{
std::string s("hi ");
opt = foo(s);
}
std::cout << opt.value().string_view();
Also, the fourth constructor is not safe if the null terminator is missing.
In these cases you have to rely on the address sanitizer.
Passing a null pointer will also cause runtime issues, although this can easily be avoided by adding a deleted constructor overload for std::nullptr_t.
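The deleted overload mentioned above could look roughly like the following sketch. The class is a trimmed-down copy of safe_string, and demo_nullptr_overload() is a hypothetical name used only for this illustration. Note that the deleted overload rejects only a literal nullptr at compile time; a const char* variable that happens to be null at run time still reaches the const char* constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>
#include <variant>

class safe_string
{
    std::variant<std::string, std::string_view> s_;
public:
    safe_string(std::string_view s) : s_{s} {}
    safe_string(std::string const& s) : s_{std::string_view{s}} {}
    safe_string(std::string&& s) : s_{std::move(s)} {}
    safe_string(const char* s) : s_{std::string_view{s}} {}

    // Exact match for a literal nullptr, so it wins overload resolution
    // and turns the call into a compile-time error.
    safe_string(std::nullptr_t) = delete;

    std::string_view string_view() const
    {
        if (auto const* sptr = std::get_if<std::string>(&s_)) {
            return std::string_view{*sptr};
        }
        return std::get<std::string_view>(s_);
    }
};

// Constructing from a string literal still works, constructing from nullptr does not.
static_assert(std::is_constructible_v<safe_string, const char*>);
static_assert(!std::is_constructible_v<safe_string, std::nullptr_t>);

// Hypothetical usage sketch.
inline void demo_nullptr_overload()
{
    safe_string ok("hi");
    assert(ok.string_view() == "hi");
    // safe_string bad(nullptr);  // would not compile: deleted overload
}
```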
Likewise, returning a safe_string is also not safe and doesn't trigger a warning.
safe_string bar() {
std::string s("hi ");
return foo(s);
}
int main() {
std::cout << bar().string_view();
}