Tuesday, July 8, 2025

Safely passing std::strings and std::string_view

Many of you will agree that C++ is a language that comes with sharp edges.
One example is `std::string_view`: it was introduced to avoid unnecessary std::string copies, but it brings a footgun of its own, namely when a temporary string is passed into it:

#include <iostream>
#include <string>
#include <string_view>

auto foo(std::string_view s) {
    return s;
}

int main()
{
    std::string h("hello ");
    auto x = foo(h + "world\n"); // passing in a temporary std::string
    std::cout << x << '\n'; // BOOM! (the string_view x is dangling)
}
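For comparison, the conventional fix is to give the temporary a name so that it outlives the view. A minimal sketch of that, reusing `foo` from above; the function name `named_string_outlives_view` is made up for this illustration:

```cpp
#include <cassert>
#include <string>
#include <string_view>

inline std::string_view foo(std::string_view s) { return s; }

// Hypothetical demo function: the concatenated string is a named object,
// so it stays alive for as long as the view x is used.
inline void named_string_outlives_view()
{
    std::string h("hello ");
    std::string hw = h + "world";  // named std::string, not a temporary
    std::string_view x = foo(hw);  // x refers to hw's buffer
    assert(x == "hello world");
    assert(x.data() == hw.data());
}
```

This works, but it relies on every caller remembering to do it, which is what the wrapper type below tries to avoid.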

Let's introduce a safer wrapper type to solve this problem. The wrapper type has some overhead, though. On my platform (Linux), sizeof(std::string_view) equals 16 bytes, whereas an object of the following class takes 40 bytes, because it has to hold a complete std::string (32 bytes with libstdc++ on 64-bit platforms) plus the discriminator of the variant.

#include <string>
#include <string_view>
#include <utility>
#include <variant>

class safe_string
{
    // Holds either an owned string or a non-owning view
    std::variant<std::string, std::string_view> s_;

public:
    safe_string(std::string_view s)
        : s_{s} {}

    // lvalue string: the caller keeps it alive, so a view suffices
    safe_string(std::string const& s)
        : s_{std::string_view{s}} {}

    // temporary string: take ownership so nothing dangles
    safe_string(std::string&& s)
        : s_{std::move(s)} {}

    safe_string(const char* s)
        : s_{std::string_view{s}} {
    }

    //! View its contents
    std::string_view string_view() const [[clang::lifetimebound]]
    {
        if (std::string const* sptr = std::get_if<std::string>(&s_)){
            return std::string_view{*sptr};
        } else {
            return std::get<std::string_view>(s_);
        }
    }
};

The idea is that we usually store a std::string_view, except when a temporary string is passed in; in that case we move the temporary into the class and store it as a std::string. We use C++'s sum type, std::variant, to hold one of the two types in the same storage.
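To make the two storage modes visible, here is a minimal sketch that repeats the class under the name `safe_string_sketch` and adds one extra member, `owns_copy()`, which reports which alternative the variant currently holds. Both names are made up for this illustration and are not part of the class above; the Clang attribute is left out so that the sketch compiles cleanly on other compilers too.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <variant>

// Reduced copy of safe_string for illustration only.
class safe_string_sketch
{
    std::variant<std::string, std::string_view> s_;

public:
    safe_string_sketch(std::string_view s) : s_{s} {}
    safe_string_sketch(std::string const& s) : s_{std::string_view{s}} {}
    safe_string_sketch(std::string&& s) : s_{std::move(s)} {}
    safe_string_sketch(const char* s) : s_{std::string_view{s}} {}

    // Hypothetical helper: true if a std::string is stored (built from a temporary).
    bool owns_copy() const { return std::holds_alternative<std::string>(s_); }

    std::string_view string_view() const
    {
        if (auto const* sptr = std::get_if<std::string>(&s_)) {
            return std::string_view{*sptr};
        }
        return std::get<std::string_view>(s_);
    }
};

// The wrapper can never be smaller than the std::string it may have to hold.
static_assert(sizeof(safe_string_sketch) >= sizeof(std::string));

// Hypothetical demo function showing which constructor leads to which storage mode.
inline void storage_modes_demo()
{
    std::string s("hi ");

    safe_string_sketch from_lvalue(s);          // picks std::string const&
    assert(!from_lvalue.owns_copy());           // only a view into s
    assert(from_lvalue.string_view().data() == s.data());

    safe_string_sketch from_temporary(s + "there");  // picks std::string&&
    assert(from_temporary.owns_copy());              // the temporary was moved in
    assert(from_temporary.string_view() == "hi there");
}
```

Overload resolution does the work here: an lvalue binds to the `std::string const&` constructor, a temporary prefers `std::string&&`, so ownership is only taken when it is actually needed.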

Note the [[clang::lifetimebound]] attribute on the .string_view() member function; it makes Clang generate a warning when we use our class in the wrong way:

auto foo(safe_string h) {
    return h;
}

int main()
{
    std::string s("hi ");

    auto okay = foo(s + "there\n");
    std::cout << okay.string_view();

    auto dangling = foo(s + "there").string_view(); // wrong use
    std::cout << dangling;
}

The Clang compiler nicely warns us about this:

<source>:71:21: error: temporary whose address is used as value of local variable 'dangling' will be destroyed at the end of the full-expression [-Werror,-Wdangling]
   71 |     auto dangling = foo(s + "there").string_view();
      |                     ^~~~~~~~~~~~~~~~
1 error generated.
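Since [[clang::lifetimebound]] is a Clang extension, other compilers may complain about an unknown attribute. One common way around that is to hide the attribute behind a macro. The sketch below assumes a macro named `SAFE_LIFETIMEBOUND` and a small holder type `name_holder`; both are hypothetical and only serve to show where the attribute goes.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical macro: expands to the attribute on Clang, to nothing elsewhere.
#if defined(__clang__)
#  define SAFE_LIFETIMEBOUND [[clang::lifetimebound]]
#else
#  define SAFE_LIFETIMEBOUND
#endif

// Hypothetical owning type, used only to demonstrate the attribute placement.
class name_holder
{
    std::string name_;

public:
    explicit name_holder(std::string name) : name_{std::move(name)} {}

    // Written after the qualifiers, the attribute applies to the implicit
    // `this` parameter: the returned view must not outlive *this.
    std::string_view view() const SAFE_LIFETIMEBOUND { return name_; }
};

// Correct use: the holder is a named object that outlives the view.
inline void holder_outlives_view()
{
    name_holder holder{"hi there"};
    std::string_view v = holder.view();
    assert(v == "hi there");
}
```

On compilers where the macro expands to nothing, the code still builds, but the diagnostic shown above is of course no longer produced.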

See for yourself in Compiler Explorer:


2 comments:

  1. It's not that safe. Dangling issues can still happen without a warning, for example:

    std::optional<safe_string> opt;
    {
        std::string s("hi ");
        opt = foo(s);
    }
    std::cout << opt.value().string_view();

    Also, the fourth constructor is not safe if the null terminator is missing.

    In these cases you have to rely on the address sanitizer.

    Passing a null pointer will also cause runtime issues, although this can easily be avoided by adding a deleted constructor overload for std::nullptr_t.

  2. Likewise, returning a safe_string is also not safe and doesn't trigger a warning.

    safe_string bar() {
        std::string s("hi ");
        return foo(s);
    }

    int main() {
        std::cout << bar().string_view();
    }

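The deleted overload suggested in the first comment could look roughly like the sketch below. It uses a reduced, hypothetical type `checked_string` that only stores a view, so that the constructor set is the only thing on display; it is not the class from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <type_traits>

// Hypothetical reduced type: only the const char* path of the wrapper.
class checked_string
{
    std::string_view s_;

public:
    // Requires a valid, null-terminated string.
    checked_string(const char* s) : s_{s} {}

    // A literal nullptr is an exact match for this overload, and since it
    // is deleted, the call is rejected at compile time.
    checked_string(std::nullptr_t) = delete;

    std::string_view string_view() const { return s_; }
};

static_assert(std::is_constructible_v<checked_string, const char*>);
static_assert(!std::is_constructible_v<checked_string, std::nullptr_t>);

// Hypothetical demo function: the ordinary string-literal path is unaffected.
inline void literal_still_works()
{
    checked_string c("abc");
    assert(c.string_view() == "abc");
}
```

This only catches a literal nullptr. A `const char*` variable that happens to be null at run time still reaches the first constructor, so that case remains a job for a sanitizer or an explicit check.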
