KDE PIM/KItinerary/Trenitalia Barcode
General Observations
- always 67 bytes
- each barcode covers exactly one passenger and one leg
- does not seem to contain signatures, checksums or compression, judging by how little the bit pattern changes between adjacent tickets
- far fewer null bytes when a PNR/seat reservation is present (i.e. high-speed train tickets?)
- there is a unique, globally sequential ticket number: in the 600M range in 2017 and in the 1B range in 2019, which suggests that 32 bits might become too short for it
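As a rough illustration of the last point, the following sketch estimates how long a counter of a given width would last. The starting value and the yearly growth are only the approximate figures quoted above (about 1B in 2019, about 200M new tickets per year), and constant growth is an assumption; `years_until_overflow` is a hypothetical helper name.

```python
def years_until_overflow(bits: int,
                         current: int = 1_000_000_000,
                         per_year: int = 200_000_000) -> float:
    """Years until an unsigned counter of `bits` bits overflows.

    Assumes (hypothetically) linear growth of `per_year` tickets,
    starting from `current`. Both defaults are rough estimates.
    """
    max_value = 2**bits - 1
    return (max_value - current) / per_year


if __name__ == "__main__":
    for width in (31, 32):
        print(width, "bits:", round(years_until_overflow(width), 1), "years after 2019")
```

Under these assumptions an unsigned 32 bit field would last somewhat more than a decade beyond 2019, and noticeably less if the top bit is not available.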
Bit Layout
Byte:Bit (MSB) | Content | Meaning | Notes |
---|---|---|---|
0:0 - 4:7 | 0x20 0x14 0xC2 0x08 0x10 | header? | fixed value in all samples |
5:0 - 7:7 | date? | varies between samples | |
8:0 - 13:4 | null | | |
13:5 - 14:3 | 100 xxxx | 0000 for Italian station codes (UIC 83.....), 0111 for international destination? | |
14:4 - 17:3 | 24 bit uint | UIC station code of departure | only for Italian destinations apparently |
17:4 - 18:2 | 100 xxxx | see above | |
18:3 - 21:2 | 24 bit uint | UIC station code of arrival | see above |
21:3 - 22:1 | unknown | 22:1 seems 0 in all samples, the rest varies | |
22:2 - 24:1 | 16 bit uint | train number | could be as little as 14 bits too, no train number > 16k has been observed |
24:3 - 29:7 | null | | |
30:0 - 31:1 | null if no PNR present, content unknown otherwise | | |
31:2 - 32:0 | 7 bit uint | seat number | seat row for trains with an airplane-like numbering scheme |
32:1 - 32:2 | null | | |
32:3 - 32:6 | 4 bit uint | seat column as hex number | 0 for trains without an airplane-like numbering scheme |
32:7 - 33:6 | null? | | |
33:7 - 38:3 | 6x6 bit as listed below | PNR | all null if ticket has no PNR |
38:4 - 43:3 | null | | |
43:4 - 44:2 | 1010 011 or null | Issuer UIC code | only set if PNR is present, "83" for Trenitalia tickets otherwise, a few preceding bits are likely part of this field as well |
44:3 - 45:7 | null | | |
46:0 - 48:7 | unknown | null if no PNR present, unknown content otherwise | |
49:0 - 49:7 | null | | |
50:0 - 50:7 | unknown | null if no PNR present, unknown content otherwise | |
51:0 - 57:7 | 0x0B 0x65 0x23 0x18 0x40 0xE6 0xC0 | fixed in all samples? | |
58:0 - 58:3 | null | might be part of the ticket number? | |
58:4 - 62:3 | 32 bit uint | ticket number | |
63:4 - 65:7 | null | | |
66:0 - 66:7 | varies between samples, same as 6:4 - 7:3? | | |
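The Byte:Bit addressing used in the table (bit 0 being the most significant bit of a byte) can be turned into a small extraction helper. The following is a minimal sketch, not the KItinerary implementation: `read_bits` and `parse_ticket` are hypothetical names, and only the fields whose meaning the table considers reasonably established are read. All offsets and widths are copied from the table and inherit its uncertainties.

```python
HEADER = bytes([0x20, 0x14, 0xC2, 0x08, 0x10])  # fixed value in all samples
PAYLOAD_SIZE = 67  # all observed barcodes are 67 bytes long


def read_bits(data: bytes, byte: int, bit: int, length: int) -> int:
    """Read `length` bits as an unsigned integer, starting at byte:bit.

    Bit 0 is the most significant bit of a byte, as in the table.
    """
    offset = byte * 8 + bit
    total = len(data) * 8
    if offset + length > total:
        raise ValueError("bit range exceeds payload size")
    value = int.from_bytes(data, "big")
    # Shift the requested range down to the lowest bits, then mask it.
    return (value >> (total - offset - length)) & ((1 << length) - 1)


def parse_ticket(data: bytes) -> dict:
    """Extract the fields whose position is given in the table above."""
    if len(data) != PAYLOAD_SIZE:
        raise ValueError("expected a 67 byte payload")
    return {
        "header_ok": data[:5] == HEADER,
        "departure_uic": read_bits(data, 14, 4, 24),
        "arrival_uic": read_bits(data, 18, 3, 24),
        "train_number": read_bits(data, 22, 2, 16),
        "seat_number": read_bits(data, 31, 2, 7),
        "seat_column": read_bits(data, 32, 3, 4),  # hex digit, 0 if unused
        "ticket_number": read_bits(data, 58, 4, 32),
    }
```

Converting the whole payload to one big integer keeps the helper short; for a 67 byte payload the cost is negligible. The fields marked as unknown in the table are deliberately left out.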
PNR Decoding Table
Based on all available samples, PNR strings map to the PNR binary encoding as listed in the table below:
Code | Symbol |
---|---|
0 | W |
4 | 2 |
6 | 3, Z |
8 | 4 |
10 | 5 |
12 | 6 |
14 | 7 |
18 | 9 |
22 | B |
24 | C |
26 | D |
28 | E |
30, 31 | F |
32 | G |
34 | H |
38, 39 | J |
40 | K |
42, 43 | L |
44 | M |
46 | N |
50 | P |
52 | Q |
55 | R |
56 | S |
59 | T |
60 | U |
62 | V |
This is obviously incomplete due to too few samples. Observations/speculations:
- the lowest bit seems irrelevant for which symbol a code maps to, the meaning of that bit is unknown
- there are only 32 rather than 36 symbols used, probably omitting those that are hard to differentiate (1/I or O/0).
- the gaps in the table are suspected to be 2 -> X, 16 -> 8, 20 -> A, 36 -> I, 48 -> O
- the only sample conflicting with that theory is the clash on 6 -> 3/Z.
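If the speculation above holds, the table reduces to a 32 symbol alphabet indexed by the upper five bits of each 6 bit code. The sketch below encodes that hypothesis; it is an assumption derived from the samples, not a confirmed specification. `PNR_ALPHABET`, `split_pnr_field` and `decode_pnr` are hypothetical names, and the observed 6 -> Z sample is not explained by this scheme.

```python
# Hypothetical alphabet: index = code >> 1.
# Index 0 and 1 are W and X, followed by the digits 2-9 and the letters A-V.
PNR_ALPHABET = "WX23456789ABCDEFGHIJKLMNOPQRSTUV"


def split_pnr_field(field: int, count: int = 6) -> list[int]:
    """Split an integer holding `count` 6 bit codes (first code in the
    most significant bits) into the individual codes."""
    return [(field >> (6 * (count - 1 - i))) & 0x3F for i in range(count)]


def decode_pnr(codes: list[int]) -> str:
    """Map 6 bit codes to symbols, ignoring the lowest bit of each code."""
    return "".join(PNR_ALPHABET[code >> 1] for code in codes)


if __name__ == "__main__":
    # Codes taken from the table above, not from a real ticket.
    print(decode_pnr([0, 4, 22, 31, 55, 62]))
```

Every entry of the decoding table except 6 -> Z is reproduced by this mapping, as are the five suspected gap values.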
Open Questions
- there is one sample where the departure and arrival UIC station codes are equal, contrary to what's in the corresponding PDF
- for international destinations the station code does not actually seem to be a valid UIC station code, but there is only one sample to back this up so far
- It seems plausible that the coach number is also encoded, given that the seat number is; 30:0 - 31:1 would seem like the obvious range. However, no obvious encoding of the coach number shows up there in the sample data.
- Is the train type derived from the train number, or is that also encoded? 21:3 - 22:1 would seem to be the obvious range for that, but no correlation has been found in the current samples so far.
- Is the class encoded somewhere?
- Date/time encoding is still a mystery.
- Is the PNR encoded? 46-50 seems like a plausible range for that. 36^5 possible values, i.e. needs >= 26 bits.
- What is the "CP Code" in the PDF, and is that encoded somewhere? -> unlikely based on comparing adjacent codes
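The bit estimate in the question about bytes 46-50 can be checked with integer arithmetic. This is a minimal sketch; `bits_needed` is a hypothetical helper, and the alphabet sizes and lengths passed to it are only the ones discussed on this page.

```python
def bits_needed(alphabet_size: int, length: int) -> int:
    """Minimum number of bits to store any string of `length` symbols
    drawn from an alphabet of `alphabet_size` symbols."""
    combinations = alphabet_size ** length
    # The largest value to represent is combinations - 1.
    return (combinations - 1).bit_length()


if __name__ == "__main__":
    print(bits_needed(36, 5))  # five symbols from a 36 symbol alphabet
    print(5 * 8)               # bits available in bytes 46 to 50
```

Bytes 46 to 50 offer 40 bits in total, so a densely packed value of that size would fit with room to spare.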