Latest revision as of 14:47, 1 April 2020

General Observations

  • always exactly 107 bytes long
  • entirely binary; there are no recognizable ASCII strings in it
  • the last ~64 bytes have very high entropy, suggesting a signature or a compressed section
  • there's a base64-encoded sequential number printed below the Aztec code on the ticket, containing 24 bits
  • there's a 13-digit order number and optionally a 15-digit reference number on each ticket, sequential and covering multiple sections/travelers
  • each barcode seems to cover exactly one journey section
  • all available samples are for a single adult traveler in second class
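
Four base64 characters carry exactly 24 bits, which matches the printed sequential number. A minimal sketch of turning that text into an integer is shown below; the standard base64 alphabet and big-endian byte order are assumptions, and `decode_sequence_number` is a hypothetical helper name, not something taken from the samples.

```python
import base64


def decode_sequence_number(text: str) -> int:
    """Decode a 4-character base64 string into a 24-bit integer.

    Assumptions: standard base64 alphabet, most significant byte first.
    """
    raw = base64.b64decode(text, validate=True)
    if len(raw) != 3:
        raise ValueError("expected 4 base64 characters encoding 24 bits")
    return int.from_bytes(raw, "big")
```

If the ticket turns out to use the URL-safe alphabet or a different byte order, only the decoding call and the `"big"` argument would need to change.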

Bit Layout

{| class="wikitable"
! Byte[:Bit] (MSB) !! Content !! Meaning !! Notes
|-
| 0 - 1 || 0b0001 0000 0b0000 0010 || unknown || fixed in all samples
|-
| 2 || 0b1010 0000 or 0b1000 0000 || unknown ||
|-
| 3 || 0b0000 0001 || unknown || fixed in all samples
|-
| 4:0 - 4:6 || null || ||
|-
| 4:7 - 5:7 || 9 bit uint || day of travel || counted from Jan 1
|-
| 6 - 13:1 || 0x00 0x49 0x24 0x92 0x49 0x24 0x92 0b00 || || fixed in all samples, pattern repeats below
|-
| 13:2 - 16:7 || 5 x 6bit || departure station || see below
|-
| 17:0 || null || ||
|-
| 17:1 - 20:6 || 5 x 6bit || arrival station || see below
|-
| 20:7 - 21:7 || todo || todo ||
|-
| 22:0 - 23:5 || 14 bit uint || train number || might be including too many/few leading 0 bits, 0 for buses
|-
| 23:6 - 25:3 || null || ||
|-
| 25:4 - 28:7 || todo || todo ||
|-
| 29 || 0b0010 0000 || || fixed in all samples
|-
| 30:0 - 30:5 || 6 bit uint || coach number || leading 0 bits uncertain
|-
| 30:6 - 31:4 || 7 bit uint || seat number || 0 if no seat reserved
|-
| 31:5 - 31:7 || 0b100 || || fixed in all samples
|-
| 32 - 40 || 0x89 0x24 0x92 0x49 0x24 0x90 0x60 0x00 0x01 || || fixed in all samples, repeats a pattern from 6-12
|-
| 41:0 - 42:0 || todo || todo || varies slightly between samples
|-
| 42:1 - 106:0 || 64 bytes || varying || high entropy in all samples, signature?
|-
| 106:1 - 106:7 || 0x00 or 0x80 || ||
|}
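
The offsets in the table above count bits from the most significant bit of each byte (bit 0 is the MSB). A minimal sketch of reading the integer fields under that convention follows; `read_bits` and `decode_fields` are hypothetical helper names, and the field boundaries are simply the tentative ones listed in the table.

```python
def read_bits(data: bytes, byte: int, bit: int, length: int) -> int:
    """Read `length` bits starting at data[byte], bit `bit` (bit 0 = MSB)."""
    start = byte * 8 + bit
    total_bits = len(data) * 8
    if start + length > total_bits:
        raise ValueError("field extends past the end of the data")
    # Treat the whole buffer as one big-endian integer and shift the
    # requested field down to the least significant position.
    value = int.from_bytes(data, "big")
    shift = total_bits - start - length
    return (value >> shift) & ((1 << length) - 1)


def decode_fields(data: bytes) -> dict:
    """Extract the integer fields whose meaning is listed in the table."""
    if len(data) != 107:
        raise ValueError("expected exactly 107 bytes")
    return {
        "day_of_travel": read_bits(data, 4, 7, 9),    # 4:7 - 5:7
        "train_number": read_bits(data, 22, 0, 14),   # 22:0 - 23:5
        "coach_number": read_bits(data, 30, 0, 6),    # 30:0 - 30:5
        "seat_number": read_bits(data, 30, 6, 7),     # 30:6 - 31:4
    }
```

Converting the whole buffer to one integer keeps the helper short; for a 107-byte barcode the cost is negligible. The same helper can be used to pull out the two 30-bit station fields at 13:2 and 17:1.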

Station Codes

  • Stations are identified by the 2-4 character alphabetical codes from https://rata.digitraffic.fi/api/v1/metadata/stations; Wikidata has those codes as untyped P296 properties.
  • Each character is encoded as a six-bit number. Adding 55 to that number yields the corresponding ASCII code (so 10-35 map to 'A'-'Z'), which suggests that values 0-9 are reserved for digits. The value 36 ('[' by that rule) marks an unset character (for codes shorter than 5 characters).
  • The official table also uses the 'Ä' and 'Ö' characters; it is unclear how those would be encoded here.
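
The character rule above can be sketched as follows. This is only an illustration under assumptions: that the first character sits in the most significant six bits of the 30-bit field, that unset slots can simply be skipped, and that only the letter range 10-35 is handled (digits, 'Ä' and 'Ö' are left out because their encoding is not known). `split_sixbit` and `decode_station` are hypothetical names.

```python
UNSET = 36  # marks an unused character slot, per the notes above


def split_sixbit(field: int, count: int = 5) -> list:
    """Split a (6 * count)-bit integer into six-bit values, first character first."""
    return [(field >> (6 * (count - 1 - i))) & 0x3F for i in range(count)]


def decode_station(values) -> str:
    """Turn six-bit values into a station code, skipping unset slots."""
    chars = []
    for v in values:
        if v == UNSET:
            continue
        if not 10 <= v <= 35:
            # Digits and non-ASCII letters are not understood yet.
            raise ValueError(f"unhandled six-bit value: {v}")
        chars.append(chr(v + 55))  # 10 -> 'A', 35 -> 'Z'
    return "".join(chars)


# Example with made-up input: 17, 20, 18 are 'H', 'K', 'I'.
print(decode_station([17, 20, 18, 36, 36]))  # prints "HKI"
```

Raising on values outside 10-35 is a deliberate choice for a reverse-engineering aid: it makes any sample that uses digits or 'Ä'/'Ö' stand out instead of silently producing a wrong code.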

This page was last edited on 1 April 2020, at 14:47. Content is available under Creative Commons License SA 4.0 unless otherwise noted.