The below is outdated and doesn't reflect the latest understanding of MÁV barcodes anymore!
Current Version (>= 2020?)
General Observation
Uses PDF417 or Aztec barcode format, same content on PDF and in the official app.
Variable length.
For domestic tickets only.
No similarities with a known ERA format.
Outer Structure
First byte is a u8 version number.
Second byte is a u8 signing key id.
Gzip-compressed payload using deflate compression, starting with the standard Gzip header 0x1f8b0800000000000000.
256 remaining bytes, high entropy and length suggest a cryptographic signature.
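The outer structure above could be split apart roughly as in the following Python sketch. The function name `split_outer` and the dictionary keys are illustrative only. The sketch assumes that everything after the end of the Gzip stream is the suspected signature, and uses `zlib.decompressobj` with `wbits=31` so that those trailing bytes show up in `unused_data`.

```python
import zlib

def split_outer(data: bytes) -> dict:
    """Split a raw barcode into version, key id, payload and trailer (hypothetical helper)."""
    if len(data) < 4 or data[2:4] != b"\x1f\x8b":
        raise ValueError("no Gzip header at offset 2")
    # wbits=31 selects the Gzip container; bytes following the stream land in unused_data.
    inflater = zlib.decompressobj(wbits=31)
    payload = inflater.decompress(data[2:])
    if not inflater.eof:
        raise ValueError("truncated Gzip stream")
    return {
        "version": data[0],
        "key_id": data[1],
        "payload": payload,
        "signature": inflater.unused_data,  # assumption: the 256 trailing bytes
    }
```

Checking `len(result["signature"]) == 256` on real samples would be a cheap sanity test for this reading of the format.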
Payload Structure (v4)
Seems byte- rather than bit-aligned.
String encoding is UTF-8.
Number encoding seems big endian.
Content contains a large number of null bytes.
Date/time values are encoded as seconds since 2017-01-01 00:00:00 CET (or 2016-12-31 23:00:00 UTC). Exception: traveler birth date.
Consists of a variable set of blocks.
Block layout is defined by information in the header block; there does not seem to be a TLV-like structure.
Observed block layouts:
Header block.
Either passenger or bike addon block.
Trip block.
0-2 reservation/surcharge blocks.
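As a rough illustration of how the block layout follows from the header block, the Python sketch below derives block offsets from header bytes 28 and 30, using the block sizes and flag bits given in the block descriptions of this document. The function name `block_layout` and the block labels are made up for this example, and the interpretation of the flag byte is as tentative as the rest of these notes.

```python
HEADER_SIZE = 39
PASSENGER_SIZE = 68
BIKE_ADDON_SIZE = 4
TRIP_SIZE = 110
RESERVATION_SIZE = 57

def block_layout(payload: bytes) -> list[tuple[str, int, int]]:
    """Return (name, offset, size) for each block of a decompressed v4 payload."""
    if len(payload) < HEADER_SIZE:
        raise ValueError("payload shorter than the header block")
    flags = payload[28]              # suspected ticket type / presence flags
    reservation_count = payload[30]  # number of reservation/surcharge blocks
    layout = [("header", 0, HEADER_SIZE)]
    offset = HEADER_SIZE
    if flags & 0x80:
        layout.append(("passenger", offset, PASSENGER_SIZE))
        offset += PASSENGER_SIZE
    elif flags == 0x01:
        layout.append(("bike_addon", offset, BIKE_ADDON_SIZE))
        offset += BIKE_ADDON_SIZE
    if flags & 0x01:
        layout.append(("trip", offset, TRIP_SIZE))
        offset += TRIP_SIZE
    for _ in range(reservation_count):
        layout.append(("reservation", offset, RESERVATION_SIZE))
        offset += RESERVATION_SIZE
    if offset > len(payload):
        raise ValueError("payload too short for the layout implied by the header")
    return layout
```

For a payload with flag byte 0x81 and one reservation block this yields the offsets 0, 39, 107 and 217.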
Header Block
Always present.
Always at offset 0.
39 bytes long.
Offset | Size | Data Type | Meaning                                | Notes
-------|------|-----------|----------------------------------------|------
0      | 17   | string    | ticket number                          | printed as "CIV" in the PDF
17     | 1    | null      | ?                                      | null in all samples
18     | 2    | uint16    | UIC company code                       | issuer?, 0x0483 (1155) for MÁV-Start
20     | 4    | time      | issuing time                           |
24     | 4    | float32   | price                                  | in HUF
28     | 1    | ?         | ticket type??                          | Bit 0x80 indicates presence of the passenger block, bit 0x01 presence of the trip block. Can be null for e.g. bike place reservations.
29     | 1    | null      | ?                                      | null in all samples
30     | 1    | uint8     | number of reservation/surcharge blocks |
31     | 4    | null      | ?                                      | null in all samples
35     | 4    | ?         | ?                                      | only two distinct values observed in all samples
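The header block layout maps fairly directly onto a `struct` format string. The following Python sketch is one possible reading of it. The names `parse_header`, `HEADER_FORMAT` and `MAV_EPOCH` are invented for this example, and it assumes that the float32 price is big endian like the integer fields, which these notes only suspect.

```python
import struct
from datetime import datetime, timedelta, timezone

# 2017-01-01 00:00:00 CET, expressed in UTC
MAV_EPOCH = datetime(2016, 12, 31, 23, 0, 0, tzinfo=timezone.utc)

# ticket number, 1 pad byte, uint16 company, uint32 time, float32 price,
# flag byte, 1 pad byte, block count, 4 pad bytes, 4 unknown bytes = 39 bytes
HEADER_FORMAT = ">17sxHIfBxB4x4s"

def parse_header(block: bytes) -> dict:
    """Decode the 39-byte header block (sketch, big endian assumed throughout)."""
    if len(block) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("header block is 39 bytes long")
    ticket, company, issued, price, flags, count, unknown = struct.unpack_from(
        HEADER_FORMAT, block
    )
    return {
        "ticket_number": ticket.rstrip(b"\x00").decode("utf-8"),
        "company_code": company,
        "issued_at": MAV_EPOCH + timedelta(seconds=issued),
        "price_huf": price,
        "flags": flags,
        "reservation_count": count,
        "unknown": unknown,
    }
```

Keeping the unknown trailing four bytes in the result makes it easier to compare them across samples.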
Passenger Block
Present when header block byte 28 has bit 0x80 set.
When present, located right after the header block (offset 39).
68 bytes long.
Offset | Size | Data Type | Meaning              | Notes
-------|------|-----------|----------------------|------
0      | 45   | string    | passenger name       | null-terminated
45     | 4    | uint32    | passenger birth date | year * 10000 + month * 100 + day
49     | 15   | null      | ?                    | null in all samples
64     | 4    | ?         | ?                    |
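The birth date is the one date field that does not use the epoch-based encoding, so a short decoding example may help. The Python sketch below unpacks the name and the packed decimal date. The function name `parse_passenger` is hypothetical, and treating a zero value as "no birth date" is an assumption of this sketch, not something observed in samples.

```python
import struct
from datetime import date

def parse_passenger(block: bytes) -> dict:
    """Decode name and birth date from the 68-byte passenger block (sketch)."""
    if len(block) < 68:
        raise ValueError("passenger block is 68 bytes long")
    raw_name, packed = struct.unpack_from(">45sI", block)
    # The name is null-terminated inside its fixed 45-byte field.
    name = raw_name.split(b"\x00", 1)[0].decode("utf-8")
    birth_date = None
    if packed:  # assumption: 0 means the field is unset
        year, rest = divmod(packed, 10000)
        month, day = divmod(rest, 100)
        birth_date = date(year, month, day)
    return {"name": name, "birth_date": birth_date}
```

For example, the stored value 19850723 decodes to 1985-07-23.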
Bike Addon Block
Present when header block byte 28 is 0x01.
When present, located right after the header block.
4 bytes long.
Offset | Size | Data Type | Meaning | Notes
-------|------|-----------|---------|------
0      | 4    | ?         | ?       | values seem fixed in all samples?
Trip Block
Present when header block byte 28 has bit 0x01 set.
Follows the passenger or bike addon block (offset 107 or 43).
110 bytes long.
Offset | Size | Data Type   | Meaning                    | Notes
-------|------|-------------|----------------------------|------
0      | 3    | uint24      | UIC departure station code | including the national prefix ('55' for HU)
3      | 3    | uint24      | UIC arrival station code   |
6      | 90   | 10 * uint24 | UIC station codes          | list of vias, null if not set
96     | 1    | text        | class                      | "1" or "2"
97     | 1    | ?           | ?                          | 0x01 in all samples
98     | 4    | time        | time of validity/travel    |
102    | 3    | uint24      | validity length            | in minutes
105    | 1    | ?           | ?                          |
106    | 4    | ?           | ?                          | varying between samples
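Since `struct` has no 3-byte integer type, the uint24 fields of the trip block need a small helper. The Python sketch below shows one way to read them, again assuming big endian byte order. The names `read_uint24` and `parse_trip` are illustrative. The 90-byte via area is returned uninterpreted, because its size does not match a plain list of ten 3-byte codes and the layout of its entries is not settled here.

```python
import struct
from datetime import datetime, timedelta, timezone

# 2017-01-01 00:00:00 CET, expressed in UTC
MAV_EPOCH = datetime(2016, 12, 31, 23, 0, 0, tzinfo=timezone.utc)

def read_uint24(buf: bytes, offset: int) -> int:
    """Read a 3-byte big endian unsigned integer."""
    return int.from_bytes(buf[offset:offset + 3], "big")

def parse_trip(block: bytes) -> dict:
    """Decode the known fields of the 110-byte trip block (sketch)."""
    if len(block) < 110:
        raise ValueError("trip block is 110 bytes long")
    (start,) = struct.unpack_from(">I", block, 98)
    return {
        "departure": read_uint24(block, 0),  # includes the national prefix
        "arrival": read_uint24(block, 3),
        "via_raw": block[6:96],              # left undecoded on purpose
        "class": block[96:97].decode("ascii"),
        "valid_from": MAV_EPOCH + timedelta(seconds=start),
        "validity": timedelta(minutes=read_uint24(block, 102)),
    }
```

A seven-digit station code such as 5510017 fits comfortably into 24 bits, which is consistent with the uint24 reading.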
Seat Reservation Block
57 bytes long.
Also used for surcharge ("Pótjegy") blocks, in which case the fields for coach and seat numbers are all null.
Order of surcharge and reservation blocks seems undefined if both are present.
Follows all other blocks; the number of blocks is given by header block byte 30.
Offset | Size | Data Type | Meaning                    | Notes
-------|------|-----------|----------------------------|------
0      | 3    | uint24    | UIC departure station code |
3      | 3    | uint24    | UIC arrival station code   |
6      | 4    | ?         | ?                          |
10     | 4    | time      | time of validity/travel    |
14     | 2    | uint16    | UIC company code           | operator?, 0x0483 (1155) for MÁV-Start
16     | 5    | string    | train number               | null-terminated
21     | 1    | ?         | ?                          | 0x01 in all samples
22     | 3    | string    | coach number               | null-terminated
25     | 2    | uint16    | seat number                |
27     | 2    | uint16    | seat number                | repeated from byte 25/26?
29     | 28   | null      | ?                          | null bytes in all samples
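A reservation or surcharge block could be decoded along the following lines. This Python sketch uses invented names (`parse_reservation`, `c_string`) and assumes big endian integers. It reports a block whose coach and seat fields are all null as a surcharge, following the observation above.

```python
import struct
from datetime import datetime, timedelta, timezone

# 2017-01-01 00:00:00 CET, expressed in UTC
MAV_EPOCH = datetime(2016, 12, 31, 23, 0, 0, tzinfo=timezone.utc)

def c_string(raw: bytes) -> str:
    """Decode a null-terminated string stored in a fixed-size field."""
    return raw.split(b"\x00", 1)[0].decode("utf-8")

def parse_reservation(block: bytes) -> dict:
    """Decode the known fields of a 57-byte reservation/surcharge block (sketch)."""
    if len(block) < 57:
        raise ValueError("reservation block is 57 bytes long")
    travel_time, company = struct.unpack_from(">IH", block, 10)
    seat, seat_repeat = struct.unpack_from(">HH", block, 25)
    coach = c_string(block[22:25])
    return {
        "departure": int.from_bytes(block[0:3], "big"),
        "arrival": int.from_bytes(block[3:6], "big"),
        "travel_time": MAV_EPOCH + timedelta(seconds=travel_time),
        "company_code": company,
        "train_number": c_string(block[16:21]),
        "coach": coach,
        "seat": seat,
        "seat_repeat": seat_repeat,
        # coach and seat fields are all null for surcharge blocks
        "is_surcharge": coach == "" and seat == 0 and seat_repeat == 0,
    }
```

Comparing `seat` and `seat_repeat` across more samples would show whether the second field really is a plain repetition.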
Missing/Suspected Information
Station names are not included, but station codes might be. UIC station numbers (possibly without the country prefix) would be the obvious suspect, given the MÁV website uses those as well.
If the train number is included, one would expect at least the day of travel to be included as well.
Class: several candidate locations exist, but given its small footprint we need a lot more samples to confirm one of those.
Version 5/6 Format
Outer Structure
First byte is a u8 version number
Second byte is a u8 signing key id
17-digit ticket number, ASCII
1 null byte
4-digit issuer UIC code (1155 for MÁV), ASCII
Gzip header and Gzip-compressed data
Inner Structure
The compressed data matches the payload format described above, with the following exceptions:
The header block lacks its first 20 bytes, i.e. the information that already occurs before the compressed data.
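The version 5/6 layout could be read as in the Python sketch below. The name `split_outer_v5` and the dictionary keys are invented for this example. Whatever follows the Gzip stream is returned as an uninterpreted trailer, since these notes do not say what it contains for this version. Header fields inside the payload are addressed by subtracting the 20 moved bytes from their v4 offsets.

```python
import zlib

MOVED_BYTES = 20  # leading bytes of the v4 header block now stored outside the payload

def split_outer_v5(data: bytes) -> dict:
    """Split a version 5/6 barcode into its plain-text prefix and payload (sketch)."""
    # 2 bytes version/key id + 17 digits + 1 null byte + 4 digits = 24 bytes
    if len(data) < 26 or data[24:26] != b"\x1f\x8b":
        raise ValueError("no Gzip header at offset 24")
    inflater = zlib.decompressobj(wbits=31)  # 31 selects the Gzip container
    payload = inflater.decompress(data[24:])
    if len(payload) <= 30 - MOVED_BYTES:
        raise ValueError("payload too short for the shortened header block")
    return {
        "version": data[0],
        "key_id": data[1],
        "ticket_number": data[2:19].decode("ascii"),
        "issuer": int(data[20:24].decode("ascii")),
        "payload": payload,
        "trailer": inflater.unused_data,  # not interpreted here
        "flags": payload[28 - MOVED_BYTES],
        "reservation_count": payload[30 - MOVED_BYTES],
    }
```

With this shift the shortened header block is 19 bytes long, and subsequent blocks start at offset 19 of the payload.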