Each row in a table has a [system column](http://www.postgresql.org/docs/9.3/static/ddl-system-columns.html) `ctid` of type `tid` that represents the physical location of the row:
<>http://dbfiddle.uk/?rdbms=postgres_9.4&fiddle=98abf5bffd03364076bd4cd4b30feb6e
What's the best way of getting just the page number as from the `ctid` in the most appropriate type (eg `integer`, `bigint` or `numeric(1000,0)`)?
The [only way I can think of](http://dbfiddle.uk/?rdbms=postgres_9.4&fiddle=1dc8740cf72ca8a8c7c1a26113e6d442) is very ugly.
Top Answer
Erwin Brandstetter (imported from SE)
SELECT (ctid::text::point)[0]::bigint AS page_number FROM t;
[Your **fiddle** with my solution.][1]
@bma already hinted something similar in a comment. Here is a ...
### Rationale for the type
`ctid` is of type `tid` (tuple identifier), called `ItemPointer` in the C code. [Per documentation:][2]
> This is the data type of the system column `ctid`. A tuple ID is a
> pair (**block number**, **tuple index within block**) that identifies the
> physical location of the row within its table.
Bold emphasis mine. [And:][3]
> (`ItemPointer`, also known as `CTID`)
A block is **8 KB** in standard installations. [Maximum Table Size is **32 TB**.][4] It follows logically that block numbers must accommodate *at least* a maximum of (calculation fixed according to comment by @Daniel):
SELECT (2^45 / 2^13)::int -- = 2^32 = 4294967294
Which would fit into an unsigned `integer`. On further investigation I found [**in the source code**][5] that ...
> blocks are numbered sequentially, **0 to 0xFFFFFFFE**.
Bold emphasis mine. Which confirms the first calculation:
SELECT 'xFFFFFFFE'::bit(32)::int8 -- max page number: 4294967294
Postgres uses signed integer and is therefore one bit short. I couldn't pin down, yet, whether the text representation is shifted to accommodate signed integer. Until somebody can clear this up, I would fall back to **`bigint`**, which works in any case.
### Cast
There is [no cast registered][6] for the `tid` type in Postgres 9.3:
SELECT *
FROM pg_cast
WHERE castsource = 'tid'::regtype
OR casttarget = 'tid'::regtype;
castsource | casttarget | castfunc | castcontext | castmethod
------------+------------+----------+-------------+------------
(0 rows)
You can still cast to `text`. There is a [text representation for everything in Postgres][7]:
> Another important exception is that "automatic I/O conversion casts",
> those performed using a data type's own I/O functions to convert to or
> from text or other string types, are not explicitly represented in
> `pg_cast`.
The text representation matches that of a point, which consists of two `float8` numbers, that cast is lossless.
You can access the first number of a point with index 0. Cast to `bigint`. Voilá.
## Performance
I ran a quick test on a table with 30k rows (best of 5) on a couple of alternative expressions that came to mind, including your original:
SELECT (ctid::text::point)[0]::int -- 25 ms
,right(split_part(ctid::text, ',', 1), -1)::int -- 28 ms
,ltrim(split_part(ctid::text, ',', 1), '(')::int -- 29 ms
,(ctid::text::t_tid).page_number -- 31 ms
,(translate(ctid::text,'()', '{}')::int[])[1] -- 45 ms
,(replace(replace(ctid::text,'(','{'),')','}')::int[])[1] -- 51 ms
,substring(right(ctid::text, -1), '^\d+')::int -- 52 ms
,substring(ctid::text, '^\((\d+),')::int -- 143 ms
FROM tbl;
`int` instead of `bigint` here, mostly irrelevant for the purpose of the test. I didn't repeat for `bigint`.
The cast to `t_tid` builds on a user-defined composite type, like @Jake commented.
The gist of it: Casting tends to be faster than string manipulation. Regular expressions are expensive. The above solution is shortest and fastest.
[1]: http://sqlfiddle.com/#!15/9d249/4
[2]: http://www.postgresql.org/docs/current/interactive/datatype-oid.html
[3]: http://www.postgresql.org/docs/current/interactive/storage-page-layout.html
[4]: http://www.postgresql.org/about/
[5]: https://github.com/postgres/postgres/blob/master/src/include/storage/block.h
[6]: http://www.postgresql.org/docs/current/interactive/catalog-pg-cast.html
[7]: http://www.postgresql.org/docs/current/interactive/catalog-pg-cast.html