Jonathan Mee
I can tokenize by writing my own function:

    std::vector<std::string> Foo(const std::string& input) {
        auto start = std::find(std::cbegin(input), std::cend(input), ' ');
        std::vector<std::string> output { std::string(std::cbegin(input), start) };

        while (start != std::cend(input)) {
            const auto finish = std::find(++start, std::cend(input), ' ');

            output.push_back(std::string(start, finish));
            start = finish;
        }
        return output;
    }

This has several issues. Most importantly, doesn't C++ provide something to do this for me? But also:

 1. `Foo` includes spaces in the tokens
 1. `Foo` makes a token for each space, even repeated spaces
 1. `Foo` only delimits based on spaces, not other white space
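
To make issues 2 and 3 concrete, here is a small sketch that runs `Foo` on two inputs. `DemoFoo` is a hypothetical helper name introduced only for illustration, and the test strings are arbitrary picks:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Foo exactly as written above.
std::vector<std::string> Foo(const std::string& input) {
    auto start = std::find(std::cbegin(input), std::cend(input), ' ');
    std::vector<std::string> output { std::string(std::cbegin(input), start) };

    while (start != std::cend(input)) {
        const auto finish = std::find(++start, std::cend(input), ' ');

        output.push_back(std::string(start, finish));
        start = finish;
    }
    return output;
}

// Hypothetical helper that checks the behaviour described in the list.
void DemoFoo() {
    // Issue 2: two consecutive spaces produce an empty token between them.
    assert((Foo("a  b") == std::vector<std::string>{ "a", "", "b" }));
    // Issue 3: a tab is not treated as a delimiter, so the input stays one token.
    assert((Foo("a\tb") == std::vector<std::string>{ "a\tb" }));
}
```

Calling `DemoFoo()` from `main` passes silently, which confirms both behaviours.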

Is there something better available to me?
Top Answer
Jonathan Mee
There are 4 solutions which C++ provides, listed from least to most expensive at run time:

 1. [`std::strtok`](https://en.cppreference.com/w/cpp/string/byte/strtok)
 1. [`std::split_view`](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0789r1.pdf)
 1. [`std::istream_iterator`](https://en.cppreference.com/w/cpp/iterator/istream_iterator)
 1. [`std::regex_token_iterator`](https://en.cppreference.com/w/cpp/regex/regex_token_iterator)

They are discussed in detail below:

# `std::strtok`

`std::strtok` is a destructive tokenizer, meaning:

 1. `std::strtok` modifies the string being tokenized, so it cannot operate on `const std::string`s or `const char*`s. If the string to be tokenized needs to be preserved, `std::strtok` must be run on a copy
 1. Because `std::strtok` keeps its position between calls in internal static state, the tokenization of multiple strings cannot be interleaved. Some implementations provide variants that do support this, such as: [`strtok_s`](https://msdn.microsoft.com/en-us/library/ftsafwz3.aspx/)
 1. Additionally, the standard does not require `std::strtok` to be thread safe, though some implementations are: https://msdn.microsoft.com/en-us/library/ftsafwz3.aspx/
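
The destructive behaviour in point 1 can be observed directly. The following is a minimal sketch, not part of the original answer; `DemoStrtokIsDestructive` and the sample string are hypothetical:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical demo: shows that std::strtok writes into the buffer it tokenizes.
void DemoStrtokIsDestructive() {
    const std::string original = "one two";
    std::string copy = original;          // tokenize a copy so `original` survives

    const char* first = std::strtok(&copy[0], " ");
    assert(first != nullptr && std::string(first) == "one");

    // The space that ended the first token has been overwritten with '\0'.
    assert(copy[3] == '\0');
    assert(copy != original);
    assert(original == "one two");
}
```

This is why the caller has to choose between handing over a mutable buffer and paying for a copy.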

You could rewrite `Foo` with `std::strtok` as follows:

    std::vector<std::string> Foo(std::string input) {
        std::vector<std::string> output;

        // input is taken by value because std::strtok overwrites delimiters in place
        for (auto i = std::strtok(std::data(input), " "); i != nullptr; i = std::strtok(nullptr, " ")) {
            output.push_back(i);
        }
        return output;
    }

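With `" "` as the only delimiter this still suffers from issue **3** as listed in your question. `std::strtok` does skip runs of delimiters, so it avoids issues **1** and **2**, but beyond that it really only adds the use of a C++ function for doing the tokenizing.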
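
Issue 3 can be addressed within `std::strtok` itself, because its second argument is a set of delimiter characters rather than a single one. A minimal sketch, assuming the six white space characters of the default "C" locale are the ones you care about; `FooAnyWhitespace` is a hypothetical name:

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical variant of Foo; the delimiter set below is an assumption.
std::vector<std::string> FooAnyWhitespace(std::string input) {
    // Space, tab, newline, vertical tab, form feed, carriage return.
    static const char* const delimiters = " \t\n\v\f\r";
    std::vector<std::string> output;

    // input is a by-value copy, so overwriting its delimiters is harmless.
    for (char* token = std::strtok(&input[0], delimiters); token != nullptr;
         token = std::strtok(nullptr, delimiters)) {
        output.emplace_back(token);
    }
    return output;
}
```

For example, `FooAnyWhitespace("a\tb\nc")` yields the three tokens `a`, `b`, and `c`.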

# `std::split_view`

C++20 gives us `std::split_view`. The exact implementation is not yet official, but the examples we've been given suggest that `Foo` should be written like:

    std::vector<std::string> Foo(const std::string& input) {
        std::vector<std::string> output;

        for(const auto& i : input | std::ranges::views::split(' ')) {
            output.emplace_back(std::cbegin(i), std::cend(i));
        }
        return output;
    }

This suffers from issues **1**, **2**, and **3** as listed in your question, but improves over `std::strtok` by tokenizing without destroying `input`. Note that the C++20 standard hasn't been finalized at the time of writing; I've used [this resource](https://ezoeryou.github.io/blog/article/2019-01-10-range-view.html) in prototyping `std::split_view` code.

# `std::istream_iterator`

`std::istream_iterator` requires a `std::istringstream` to be created, but makes `Foo` very easy to write:

    std::vector<std::string> Foo(const std::string& input) {
        std::istringstream output(input);

        return { std::istream_iterator<std::string>(output), std::istream_iterator<std::string>() };
    }

This solves all issues listed in your question, but adds the cost of constructing a `std::istringstream`.
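
As a quick check of that claim, here is a sketch that feeds the same technique some awkward input. `FooIstream` is a hypothetical copy of the function above (written with parentheses instead of braces), and `DemoIstreamIterator` is likewise an illustrative name:

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical copy of the istream_iterator version of Foo.
std::vector<std::string> FooIstream(const std::string& input) {
    using Iterator = std::istream_iterator<std::string>;
    std::istringstream stream(input);

    // operator>> skips any run of white space before reading each token.
    return std::vector<std::string>(Iterator(stream), Iterator());
}

void DemoIstreamIterator() {
    // Leading, trailing, and repeated white space of mixed kinds is ignored.
    assert((FooIstream("  a \t b\n\nc ") == std::vector<std::string>{ "a", "b", "c" }));
    // Input that is only white space yields no tokens at all.
    assert(FooIstream("   ").empty());
}
```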

# `std::regex_token_iterator`

`std::regex_token_iterator` requires a regex which captures tokens. This provides greater flexibility because the delimiters need not be whitespace, but requires a regex to be run on the string to be tokenized. If `Foo` were to be rewritten with a `std::regex_token_iterator` it would look something like:

    std::vector<std::string> Foo(std::string input) {
        // capture group 1 holds each run of non-white-space characters
        std::regex output(R"((?:^|\s*)(\S+))");

        return { std::sregex_token_iterator(std::cbegin(input), std::cend(input), output, 1), std::sregex_token_iterator() };
    }

This solves all the issues listed in your question, but adds the cost of running a regex on the string to be tokenized.
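
The flexibility mentioned above also works the other way round: instead of a regex that captures tokens, you can pass a regex that matches delimiters and ask for submatch index `-1`, which selects the text between matches. A minimal sketch, where `SplitOnPattern` is a hypothetical helper and the comma pattern in the usage note is only an example:

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits input wherever the delimiter pattern matches.
std::vector<std::string> SplitOnPattern(const std::string& input, const std::regex& delimiter) {
    // Submatch index -1 makes the iterator yield the pieces *between* matches.
    return std::vector<std::string>(
        std::sregex_token_iterator(input.cbegin(), input.cend(), delimiter, -1),
        std::sregex_token_iterator());
}
```

For example, `SplitOnPattern("a, b ,c", std::regex(R"(\s*,\s*)"))` yields `a`, `b`, and `c`, with the white space around each comma consumed by the delimiter pattern.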
What's the Best way to Tokenize a string?
GreenDragon
Yes, but `std::getline` can split on delimiters other than a space; I think it's worth mentioning.
Jonathan Mee
Yup that's the `std::istream_iterator` method.  As mentioned it can be in the more expensive half, but it's also really easy to use. If you're dealing with user input, this cost increase will almost always be absorbed into the cost of reading the input.
GreenDragon
I always prefer to have one complete answer, so it would be better to supplement the existing one.
Jack Douglas replying to GreenDragon
Would you post that as an answer @GreenDragon? 
GreenDragon
@Jonathan   
There is one more way, a very simple one that I personally use:  

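    std::istringstream stream(str);   // str is the string to tokenize
    std::string token;
    while (std::getline(stream, token, delim))   // delim is a single char
    {
        vec.push_back(token);   // vec is a std::vector<std::string>
    }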
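
Wrapped into a function, the `std::getline` approach might look like the sketch below. The name `Split` and its signature are hypothetical, chosen only to make the snippet self-contained:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical wrapper around the std::getline loop.
std::vector<std::string> Split(const std::string& str, char delim) {
    std::vector<std::string> vec;
    std::istringstream stream(str);
    std::string token;

    // Each call reads up to (and discards) the next occurrence of delim.
    while (std::getline(stream, token, delim)) {
        vec.push_back(token);
    }
    return vec;
}
```

Unlike the `std::istream_iterator` version, this splits on exactly one character and keeps empty tokens, so `Split("a,b,,c", ',')` yields `a`, `b`, an empty string, and `c`.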