Revision as of 17:10, 2 March 2020

General Observations

  • QR code; the content has a variable length of around 330 bytes.
  • Contains readable strings as well as binary parts.
  • Seems to be a sequence of variable-length records rather than a fixed binary layout. This could be some form of TLV encoding (there are some similarities to ASN.1 BER/DER, for example).
  • Records consist of a 1-byte type field, a length field and N content bytes.
  • The type field does not seem to describe semantics, as the same type values repeat for different fields (could it be a data type of some form instead?). This would suggest that the record order defines their semantics. The presence of null records suggests that too, as does the identical record sequence in all samples.

Record Structure

  • 1-byte type, although it is unclear what it indicates. Largely doesn't match the BER/DER type byte.
  • Length:
    • for values <= 127 bytes this is just one byte
    • for values larger than 127 this is encoded in two bytes. This is quite different from the BER/DER multi-byte length encoding, however. It looks like a little-endian layout, with the most significant bit of the first length byte being removed and the second byte being shifted by one bit to fill that space.
  • N bytes of content, with varying data types.
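
The record layout described above can be sketched roughly as follows. This is only one reading of the observations, not a confirmed specification: it assumes that a set most significant bit in the first length byte marks the two-byte form, and that the length is then the lower 7 bits of the first byte plus the second byte moved up by 7 bits. The function names `decode_length` and `parse_record` are made up for this illustration.

```python
def decode_length(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a length field starting at buf[pos].

    Returns (length, position of the first content byte).
    Assumption: MSB set in the first byte means a second length byte follows.
    """
    first = buf[pos]
    if first < 0x80:
        # Lengths up to 127 fit in a single byte.
        return first, pos + 1
    # Two-byte form: drop the MSB of the first byte, and let the second
    # byte fill the freed bit position and everything above it.
    second = buf[pos + 1]
    return (first & 0x7F) | (second << 7), pos + 2


def parse_record(buf: bytes, pos: int = 0) -> tuple[int, bytes, int]:
    """Read one record: 1 type byte, a length field, then N content bytes.

    Returns (type_id, content, position of the next record).
    """
    type_id = buf[pos]
    length, content_start = decode_length(buf, pos + 1)
    content_end = content_start + length
    return type_id, buf[content_start:content_end], content_end


# Synthetic example, not taken from a real ticket.
type_id, content, next_pos = parse_record(bytes([0x12, 0x03]) + b"CHF")
print(hex(type_id), content, next_pos)
```

Under this reading, the two length bytes 0xAC 0x02 would decode to 300. For the two-byte case this coincides with a little-endian base-128 encoding; whether lengths needing more than two bytes occur at all is not known from the samples.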

Data Types

Strings

  • Strings seem to be UTF-8 encoded.
  • The length is the number of bytes needed to represent the UTF-8 string, not the number of characters.
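
A small illustration of the byte count versus character count difference, using a made-up station name with a non-ASCII character. The type id 0x12 is only borrowed from the record sequence table for the sake of the example.

```python
# Hypothetical station name; "ü" takes two bytes in UTF-8.
station = "Zürich HB"
encoded = station.encode("utf-8")

# A string record as it would look under the observed layout:
# type byte, length in bytes (not characters), UTF-8 content.
record = bytes([0x12, len(encoded)]) + encoded

# Reading it back: the length byte tells us how many bytes to decode.
length = record[1]
decoded = record[2:2 + length].decode("utf-8")

print(len(station), len(encoded), decoded)
```

Using the character count as the length here would cut off the last byte of the string.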

Date/Time

This is largely speculation at this point!

There are five 7-byte sequences included that could be date/time values. These could be the date of purchase/issue, begin of validity, end of validity and traveler birth date. All of those are printed on the ticket.

  • the first byte is 0x08 in all samples
  • there's a surprising amount of entropy in those values, especially when looking at the differences between close-by dates.
  • a differential view shows no obvious correlation across multiple tickets, but close-by values do "look" similar nevertheless.
  • the suspected birthday field is the same in all samples for the same traveler
  • this does not seem to use a UNIX timestamp
  • this does not seem to use BCD encoding
  • this does not seem to use sub-byte per-component encoding
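
For anyone wanting to repeat the differential view mentioned above, a minimal helper could look like this. The two 7-byte values are invented placeholders (only the leading 0x08 is taken from the observations), and the helper names `bits` and `xor_diff` are hypothetical.

```python
def bits(data: bytes) -> str:
    """Render bytes as space-separated groups of 8 bits."""
    return " ".join(f"{b:08b}" for b in data)


def xor_diff(a: bytes, b: bytes) -> bytes:
    """XOR two equally long byte sequences; set bits mark the differences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Placeholder values, NOT from real tickets.
first = bytes.fromhex("08a1b2c3d4e5f6")
second = bytes.fromhex("08a1b2c3d4e7f6")

print(bits(first))
print(bits(second))
print(bits(xor_diff(first, second)))
```

Comparing the XOR output for tickets with close-by dates makes it easier to see which bit positions change at all, which is a cheap way to test hypotheses about component boundaries.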

Record Sequence

Nesting Depth | Type Id | Content Type | Meaning | Notes
--- | --- | --- | --- | ---
0 | 0x0A | date/time, nested record | ? | speculative, given the 7 byte field there matches suspected date/time values below
1 | 0x12 | nested record | |
2 | 0x0A | 2 bytes followed by nested records | ? | 0x08 04
3 | 0x12 | string | ticket type? | "Point-to-point Ticket", "Supersaver Ticket"
2 | 0x12 | string | Departure station |
2 | 0x1A | string | Arrival station |
2 | 0x20 | 2 byte | ? | 0x2801 in all samples
2 | 0x32 | string | Via |
2 | 0x38 | 1 byte | ? | 0x42 in all samples
2 | 0x42 | 7 byte | ? | date/time?
2 | 0x4A | 7 byte | ? | date/time?
2 | 0x60 | variable, null terminated | ? | 2-3 bytes in all samples
2 | 0x7A | string | ticket type or tariff | parenthesis enclosed abbreviations, string is also printed on the ticket
2 | 0x88 | 1 byte | ? | null
2 | 0x98 | 1 byte | ? | 0x02
2 | 0xA0 | 1 byte | ? | null
1 | 0x1A | nested record | traveler information | could also be loyalty program info?
2 | 0x0A | string | ? | 8 digit number - always the same for the same traveler
2 | 0x12 | string | ? | 36 char uuid - always the same for the same traveler
2 | 0x1A | string | family name |
2 | 0x22 | string | given name |
2 | 0x2A | 7 byte | ? | date/time? - possibly traveler birth date
2 | 0x3A | string | tariff information? | "HALBTAX"
1 | 0x2A | nested record | ? |
2 | 0x0A | 7 byte | ? | date/time?
2 | 0x10 | 4 byte | ? | followed by 0x200B outside of TLV structures?
1 | 0x32 | nested record | price information? |
2 | 0x0A | string | ? | "PCD"
2 | 0x12 | string | currency | "CHF"
2 | 0x1A | string | ticket price |
1 | 0x3A | null | ? |
1 | 0x42 | nested record | train information | not present in unbound tickets
2 | 0x5A | string | train number |
2 | 0x70 | null | |
2 | 0x78 | null | |
1 | 0x48 | 5 byte, no length byte!?! | ? | breaks the overall TLV structure?? almost fixed value in all samples
0 | 0x22 | nested record | ? |
1 | 0x0A | string | ? | 4 digit number (3342 in all samples)
1 | 0x12 | string | ? | 5 digit number (00001 in all samples)
1 | 0x2A | nested record | |
2 | 0x30 | nested record | | see KDE_PIM/KItinerary/Thalys_Barcode, same structure there
3 | 0x02 | 20-21 byte | ? | signature?
3 | 0x02 | 20-21 byte | ? |
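
Since the type byte alone does not say whether a record is nested, walking the structure needs outside knowledge of which positions contain nested records. The following is a rough sketch of such a walker, assuming the length encoding described under Record Structure. The names `read_length` and `walk` and the `nested_paths` parameter are inventions for this example, and the input is synthetic. It deliberately does not handle the irregular records (the 2 extra bytes before the nested records at depth 2, or the 0x48 record without a length byte).

```python
def read_length(buf: bytes, pos: int) -> tuple[int, int]:
    """One- or two-byte length; MSB of the first byte marks the long form (assumption)."""
    first = buf[pos]
    if first < 0x80:
        return first, pos + 1
    return (first & 0x7F) | (buf[pos + 1] << 7), pos + 2


def walk(buf: bytes, nested_paths: set, path: tuple = ()):
    """Yield (path, content) for each leaf record.

    path is the tuple of type ids from the outermost record down to the leaf.
    Records whose path is listed in nested_paths are descended into.
    """
    pos = 0
    while pos < len(buf):
        type_id = buf[pos]
        length, pos = read_length(buf, pos + 1)
        content = buf[pos:pos + length]
        pos += length
        here = path + (type_id,)
        if here in nested_paths:
            yield from walk(content, nested_paths, here)
        else:
            yield here, content


# Synthetic stand-in for the suspected price information record (0x32).
price = bytes([0x0A, 3]) + b"PCD" + bytes([0x12, 3]) + b"CHF"
sample = bytes([0x32, len(price)]) + price

leaves = list(walk(sample, nested_paths={(0x32,)}))
for leaf_path, content in leaves:
    print([hex(t) for t in leaf_path], content)
```

Keying on the full path rather than the type id avoids the ambiguity noted in the general observations, where the same type value shows up with different meanings at different positions.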
