joulev
## Problem

The `parser` module parses token by token and does not respect groups. In other words, if `parser` parses `Foo {bar} baz`, I believe it proceeds like this (correct me if I am wrong):

```none
F-o-o-␣-{-b-a-r-}-␣-b-a-z
```

I want it to parse like this (it is guaranteed that there is no nesting like `{{ba}r}`).

```none
F-o-o-␣-{bar}-␣-b-a-z
```

Note that `{` can be preceded by any character (not just the space character; most frequently it is preceded by a control sequence), so things like `\pgfparserdef{foo}{initial}{blank space}[m]` unfortunately won't help.

## Approach

I think I will define an action when `{` is parsed. It will then execute another parser that does nothing and ends at `}`.

So although the problem does not explicitly ask for a special action at `{`, all of the following attempts aim to do so.

## Attempt 1

Using the character directly obviously doesn't work:

```tex
% arara: pdflatex
\documentclass{article}
\usepackage{pgf}
\usepgfmodule{parser}
\pgfparserdef{foo}{initial}.{\pgfparserswitch{final}}
\pgfparserdef{foo}{initial}{{\typeout{Got a \char`\{}}
\pgfparserdeffinal{foo}{}
\pgfparserset{silent=true}
\begin{document}
\pgfparserparse{foo}This is an \emph{emphasized} word.
\end{document}

% Error: Runaway argument?
```

## Attempt 2

`\meaning{` gives me `begin-group character {`. But using that phrase also doesn't work.

```tex
% arara: pdflatex
\documentclass{article}
\usepackage{pgf}
\usepgfmodule{parser}
\pgfparserdef{foo}{initial}.{\pgfparserswitch{final}}
\pgfparserdef{foo}{initial}{begin-group character {}{\typeout{Got a \char`\{}}
\pgfparserdeffinal{foo}{}
\pgfparserset{silent=true}
\begin{document}
\pgfparserparse{foo}This is an \emph{emphasized} word.
\end{document}

% Error: Runaway argument?
```

### Attempt 2.1

Using `begin-group character` also doesn't help.

```tex
% arara: pdflatex
\documentclass{article}
\usepackage{pgf}
\usepgfmodule{parser}
\pgfparserdef{foo}{initial}.{\pgfparserswitch{final}}
\pgfparserdef{foo}{initial}{begin-group character}{\typeout{Got a \char`\{}}
\pgfparserdeffinal{foo}{}
\pgfparserset{silent=true}
\begin{document}
\pgfparserparse{foo}This is an \emph{emphasized} word.
\end{document}
```

No errors, but because `begin-group character` is not a meaning of any character, `\typeout` is not executed.

## Attempt 3

Changing category codes is my last resort. It does not raise an error, but once again, `\typeout` is not executed.

```tex
% arara: pdflatex
\documentclass{article}
\usepackage{pgf}
\usepgfmodule{parser}
\pgfparserdef{foo}{initial}.{\pgfparserswitch{final}}
\begingroup
  \catcode`\{12\relax
  \catcode`\}12\relax
  \catcode`\(1\relax
  \catcode`\)2\relax
  \pgfparserdef(foo)(initial){(\typeout(Got a \char`\{))
\endgroup
\pgfparserdeffinal{foo}{}
\pgfparserset{silent=true}
\begin{document}
\pgfparserparse{foo}This is an \emph{emphasized} word.
\end{document}
```

## Question

So how can I parse `{` using `parser`? Or is there a better approach to solve the 'root' problem?

Note that while answers using other tools, e.g. `expl3`, are welcome, I am afraid they will hardly be useful to me, as my colleagues won't understand it :)
Top Answer
Skillmon
# Gobbling the Tokens using `parser`

The easiest way to define a rule for `{` or `}` is to use the `\meaning` of `\bgroup` and `\egroup`, since those are let to `{` and `}`. (Obviously) you can't use `\pgfparserdef{foo}{initial}\bgroup{stuff}`, as the code couldn't distinguish this from an actual opening brace (it uses `\futurelet`, roughly the equivalent of `\@ifnextchar`, to look for the opening brace).

Also, while silencing all parsers globally is of course possible, I'd advise you to silence parsers individually so that you don't accidentally break other code using the module.

Your code should look like this:

```tex
\documentclass{article}
\usepackage{pgf}
\usepgfmodule{parser}
\pgfparserdef{foo}{initial}.{\pgfparserswitch{final}}
\pgfparserdef{foo}{initial}{\meaning\bgroup}{\typeout{Got a \char`\{}}
\pgfparserdef{foo}{initial}\egroup{\typeout{Got a \char`\}}}
\pgfparserset{foo/silent=true}
\begin{document}
\pgfparserparse{foo}This is an \emph{emphasized} word.
\end{document}
```
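
If you only want to skip the contents of the group, as outlined in the question's approach, a second state can do the gobbling. The following is a minimal, untested sketch: the state name `skip` is made up for this example, and it assumes that `silent` makes the parser ignore every token without a rule in whatever state is currently active.

```tex
\documentclass{article}
\usepackage{pgf}
\usepgfmodule{parser}
% a period ends parsing, but only while in the initial state
\pgfparserdef{foo}{initial}.{\pgfparserswitch{final}}
% on an opening brace, switch to the (hypothetical) state `skip'
\pgfparserdef{foo}{initial}{\meaning\bgroup}{\pgfparserswitch{skip}}
% on a closing brace, return to the initial state
\pgfparserdef{foo}{skip}\egroup{\pgfparserswitch{initial}}
% tokens without a rule are ignored (assumed to apply in every state)
\pgfparserset{foo/silent=true}
\begin{document}
\pgfparserparse{foo}This is an \emph{emphasized} word.
\end{document}
```

Since `.` has no rule in `skip`, a period inside the braces would not end the parser there. The sketch also relies on the guarantee from the question that groups are never nested.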

----

# Grabbing the tokens as an argument

You could actually also grab the argument in braces instead of trashing it with `parser`'s ability to ignore everything, but this requires a bit of extra code, a small trick to reinsert an unbalanced opening brace, and an undocumented `parser` internal:

```tex
\documentclass{article}

\usepackage{pgf}
\usepgfmodule{parser}

\makeatletter
\pgfparserdef{foo}{initial}.{\pgfparserswitch{final}}
\pgfparserdef{foo}{initial}{\meaning\bgroup}{\foogroupremover}
\newcommand*\foogroupremover[1]
  {%
    \expandafter\foogroupremoverAUX\expandafter{\iffalse}\fi
  }
\newcommand\foogroupremoverAUX[1]
  {%
    \typeout{There was a group containing `#1'}%
    \pgfparser@getnexttoken
  }
\pgfparserset{foo/silent=true}
\makeatother

\begin{document}
\pgfparserparse{foo}This is an \emph{emphasized} word.
\end{document}
```

The above uses two steps to grab the braced content. The first step removes a `parser` internal which would parse the next token (that's the argument grabbed by `\foogroupremover`) and inserts an unbalanced opening brace. The next step (`\foogroupremoverAUX`) grabs the braced contents and reinserts the `parser` internal to give control back to `parser`.
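
The `{\iffalse}\fi` trick can be looked at in isolation, without `parser`. In this small sketch the macro names `\reopen` and `\showgroup` are made up for illustration: `\reopen` stands for the point at which the opening brace has already been consumed, and `\showgroup` plays the role of `\foogroupremoverAUX`.

```tex
\documentclass{article}
% grabs one braced argument and reports it in the log
\newcommand\showgroup[1]{\typeout{Grabbed `#1'}}
% The braces in this definition are balanced, but \iffalse ... \fi
% skips the closing one on expansion, so \reopen leaves behind
% \showgroup followed by an unmatched opening brace.
\newcommand\reopen{\expandafter\showgroup\expandafter{\iffalse}\fi}
\begin{document}
% the closing brace in the input below now ends the argument of \showgroup
\reopen abc}
\end{document}
```

The second `\expandafter` expands `\iffalse` before `\showgroup` looks for its argument, so the opening brace is already in place when the argument is grabbed.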
