Jonathan Mee
I've written a toy example to try to help me understand this. Given the following string:

    const char input[] = "if (KnR)\n"
                         "\tfoo();\n"
                         "if (spaces) {\n"
                         "    foo();\n"
                         "}\n"
                         "if (allman)\n"
                         "{\n"
                         "\tfoo();\n"
                         "}\n"
                         "if (horstmann)\n"
                         "{\tfoo();\n"
                         "}\n"
                         "if (pico)\n"
                         "{\tfoo(); }\n"
                         "if (whitesmiths)\n"
                         "\t{\n"
                         "\tfoo();\n"
                         "\t}\n";

If I'm using the following regex: `const regex r("(.+?)\\s*\\{?\\s*(.+?;)\\s*\\}?\\s*")`, how can I find the begin and end positions in `input` of every occurrence of the first capture, and likewise of every occurrence of the second capture?

So for example I expect capture 1 to have the following ranges:

 1. 0 to 8
 1. 17 to 28
 1. 44 to 55
 1. 68 to 82
 1. 94 to 103
 1. 115 to 131

And I expect capture 2 to have the following ranges:

 1. 10 to 16
 1. 35 to 41
 1. 59 to 65
 1. 85 to 91
 1. 106 to 112
 1. 136 to 142
Top Answer
Jonathan Mee
We should start by talking about [`std::match_results`](https://en.cppreference.com/w/cpp/regex/match_results), which is where `std::regex_match` and `std::regex_search` store their results. Provided the regex succeeded, it will contain [`std::match_results::size`](https://en.cppreference.com/w/cpp/regex/match_results/size) [`std::sub_match`](https://en.cppreference.com/w/cpp/regex/sub_match)s: one for the entire match, followed by one for each capture in the regex, in order. When indexing a `std::match_results`' `std::sub_match`s:

 * The `std::sub_match` at index 0 contains the portion of the string matched by the entire regex
 * The `std::sub_match`s at indices greater than 0 and less than `std::match_results::size` contain the portion of the string matched by the regex's corresponding 1-based capture
 * Indices greater than or equal to `std::match_results::size` yield an empty, unmatched `std::sub_match`; the index type is unsigned, so there are no negative indices
 * The portion of the string which precedes the first matched character of the entire regex is available from `std::match_results::prefix`, and the portion which follows the last matched character is available from `std::match_results::suffix`
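To make the indexing concrete, here is a minimal sketch. The `MatchSummary` struct, the `summarize` function, and the `(\w+)=(\d+)` pattern are all made up for this illustration and are not part of the question. It records what `size`, `operator[]`, `prefix`, and `suffix` report for a single `std::regex_search`:

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical summary of one search result, used only for this illustration.
struct MatchSummary {
    std::size_t count = 0;          // match_results::size(): whole match + captures
    std::string whole;              // m[0]: text matched by the entire regex
    std::string key;                // m[1]: first capture
    std::string value;              // m[2]: second capture
    bool past_end_matched = false;  // m[size()].matched: no such capture exists
    std::string before;             // m.prefix(): text preceding the match
    std::string after;              // m.suffix(): text following the match
};

// Hypothetical helper: searches `text` for the first "name=digits" pair.
MatchSummary summarize(const std::string& text) {
    static const std::regex r("(\\w+)=(\\d+)");
    MatchSummary s;
    std::smatch m;
    if (std::regex_search(text, m, r)) {
        s.count = m.size();
        s.whole = m[0].str();
        s.key = m[1].str();
        s.value = m[2].str();
        s.past_end_matched = m[m.size()].matched;  // out-of-range index: unmatched
        s.before = m.prefix().str();
        s.after = m.suffix().str();
    }
    return s;
}
```

Calling `summarize("set x=42; done")` should report a `count` of 3, with `"x=42"` as the whole match, `"x"` and `"42"` as the captures, and `"set "` and `"; done"` as the prefix and suffix.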

To visit every match in `input` we can use a [`std::regex_iterator`](https://en.cppreference.com/w/cpp/regex/regex_iterator). Each match it yields is a `std::match_results` holding both captures, so to collect all of them we could do:

     const std::vector<std::cmatch> output = { std::cregex_iterator(std::cbegin(input), std::cend(input), r), std::cregex_iterator() };

To obtain the matched range from these `std::cmatch`s you can use the `position` method to find the offset of a capture from the start of `input` and the `length` method to find its size. Pass each method the index of the desired capture.

So for example the 1^st^ capture's offset in the 1^st^ match could be found by doing: `output.front().position(1)`

The length of this capture could be found by doing: `output.front().length(1)`

These could be added together to find the (exclusive) end of the range.
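Putting `position` and `length` together, the ranges for one capture across all matches could be gathered roughly like this. This is only a sketch: `Range` and `capture_ranges` are hypothetical names, and it assumes the text is held in a `std::string` (hence `std::sregex_iterator`) rather than the `char` array from the question:

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Half-open [begin, end) offsets into the searched string.
using Range = std::pair<std::ptrdiff_t, std::ptrdiff_t>;

// Hypothetical helper: for every match of `r` in `text`, records where
// capture number `group` begins and ends.
std::vector<Range> capture_ranges(const std::string& text,
                                  const std::regex& r,
                                  std::size_t group) {
    std::vector<Range> ranges;
    for (std::sregex_iterator it(text.begin(), text.end(), r), last;
         it != last; ++it) {
        const std::smatch& m = *it;
        // Skip captures that did not participate in this match; this also
        // covers a `group` that does not exist in the pattern.
        if (!m[group].matched) continue;
        const std::ptrdiff_t begin = m.position(group);
        ranges.emplace_back(begin, begin + m.length(group));
    }
    return ranges;
}
```

With the `input` and `r` from the question, `capture_ranges(input, r, 1)` and `capture_ranges(input, r, 2)` would be expected to reproduce the two lists of ranges given there, starting with 0 to 8 and 10 to 16 respectively.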

[**Live Example**](https://ideone.com/x0Ekkt)
