KDE Core/OpenCodes

From KDE Community Wiki

The OpenCodes project is part of the KDE Open Data initiative. It seeks to develop a standard JSON file format for ISO codes and to provide a set of data files derived from Wikidata.

Architecture

OpenCodes will be a single git repository containing the ISO codes data files and a set of scripts to maintain them. The repo will not contain any code APIs for using the data. This keeps the project completely standalone so that as many other projects as possible can use it.

The data files will be in JSON format, with a schema defined using the JSON Schema standard, which will allow for automated verification and consumption.
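As an illustration only, a country record and its schema might look like the sketch below. The field names (alpha2, alpha3, numeric, name) are assumptions, since the OpenCodes format is not yet defined. The checker handles only the required, type and pattern keywords for flat string records; a real setup would use a full JSON Schema validator instead.

```python
import re

# Hypothetical schema for one country record; the field names are assumptions.
COUNTRY_SCHEMA = {
    "type": "object",
    "required": ["alpha2", "alpha3", "numeric"],
    "properties": {
        "alpha2": {"type": "string", "pattern": "^[A-Z]{2}$"},
        "alpha3": {"type": "string", "pattern": "^[A-Z]{3}$"},
        "numeric": {"type": "string", "pattern": "^[0-9]{3}$"},
        "name": {"type": "string"},
    },
}


def check_record(record, schema=COUNTRY_SCHEMA):
    """Return a list of error strings; an empty list means the record passed.

    Only a small subset of JSON Schema is handled here.
    """
    if not isinstance(record, dict):
        return ["record is not an object"]
    errors = []
    # Every required field must be present.
    for key in schema.get("required", []):
        if key not in record:
            errors.append("missing required field: " + key)
    # Fields that are present must match their declared type and pattern.
    for key, rule in schema.get("properties", {}).items():
        if key not in record:
            continue
        value = record[key]
        if rule.get("type") == "string" and not isinstance(value, str):
            errors.append(key + " is not a string")
        elif "pattern" in rule and not re.search(rule["pattern"], value):
            errors.append(key + " does not match " + rule["pattern"])
    return errors
```

Calling check_record on each record before committing would catch malformed codes early, for example a lower-case alpha-2 value.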

A Python script will use the Wikidata Query API once available (or other tools such as Autolists in the interim) to list all Items for the ISO code Property and then obtain all the required Properties for each Item instance. This data will then be merged with any extra fields OpenCodes requires and written to the base set of JSON files, which will be committed to the repo.
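The merge and write steps could be sketched roughly as follows. The function names are hypothetical, and the rule that OpenCodes extra fields win over Wikidata values on a key clash is an assumption rather than a decided policy.

```python
import json
import os


def merge_record(wikidata_fields, extra_fields):
    """Combine fields from Wikidata with OpenCodes-specific extras.

    Assumption: extra fields take precedence when both define a key.
    """
    merged = dict(wikidata_fields)
    merged.update(extra_fields)
    return merged


def write_base_file(records, path):
    """Write records as JSON with sorted keys and fixed indentation.

    Stable ordering keeps the diffs small when the file is committed to git.
    """
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, ensure_ascii=False, indent=2, sort_keys=True)
        handle.write("\n")
```

Writing with sorted keys means that re-running the script on unchanged data produces an identical file, so only genuine changes show up in the repo history.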

A second Python script will generate the payload files from the base files in a choice of formats:

  • Data as separate files for each ISO code instance or a single file containing all instances
  • Translations as .po files or JSON translation files (node.js format, and any others required) or inline in data files
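A minimal sketch of these two choices is shown below. The file names, the alpha2 key and the function names are assumptions made for illustration. The .po output covers only the message entries; a real .po file also needs a header entry declaring the charset.

```python
import json
import os


def write_payload(records, out_dir, split=False, key="alpha2"):
    """Write records as one combined file, or one file per code if split is True.

    Returns the list of paths written, sorted by file name.
    """
    os.makedirs(out_dir, exist_ok=True)
    if split:
        targets = {record[key] + ".json": record for record in records}
    else:
        targets = {"countries.json": records}
    paths = []
    for name, content in sorted(targets.items()):
        path = os.path.join(out_dir, name)
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(content, handle, ensure_ascii=False, indent=2, sort_keys=True)
        paths.append(path)
    return paths


def po_escape(text):
    """Escape backslashes, double quotes and newlines for a .po string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_po(pairs):
    """Render (msgid, msgstr) pairs as .po entries (header entry omitted)."""
    lines = []
    for msgid, msgstr in pairs:
        lines.append('msgid "%s"' % po_escape(msgid))
        lines.append('msgstr "%s"' % po_escape(msgstr))
        lines.append("")
    return "\n".join(lines)
```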


For Linux installs performed with 'make install', the base files will be installed to /usr/share/opencodes and the .po translation files to /usr/share/locale/.

Translations will be sourced from both Wikidata and KDE. Wikidata is expected to support more languages than KDE, so it will be the preferred source, but KDE may have some languages that Wikidata lacks, so we need to cater for this. It is hoped KDE translators will submit translations directly to Wikidata, but we cannot automate this as Wikidata uses CC-0 licensing.
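The source preference could be expressed as a simple merge, sketched here. The function name is hypothetical, and treating an empty Wikidata label as missing is an assumption.

```python
def merge_translations(wikidata, kde):
    """Merge per-language names, preferring Wikidata and falling back to KDE.

    Both arguments map a language code to a translated name.
    """
    merged = dict(kde)
    # Only non-empty Wikidata values override, so a blank label
    # does not hide an existing KDE translation.
    merged.update({lang: text for lang, text in wikidata.items() if text})
    return merged
```

With this rule, a language present only in KDE still reaches the output, while any language covered by both sources uses the Wikidata text.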

Country Code

JSON File Format

Wikidata Feed

The following Items are defined in Wikidata but are not connected to the Properties and have poor translations and definitions, probably due to import from Freebase:

  • ISO 3166 (Q106487)
    • ISO 3166-1 (Q25275)
      • ISO 3166-1 alpha-2 (Q1140221)
      • ISO 3166-1 alpha-3 (Q1341492)
      • ISO 3166-1 numeric (Q2725758)
    • ISO 3166-2 (Q133153)
    • ISO 3166-3 (Q877561)


We should propose changes to Wikidata to link these to the actual properties used for the countries. Because they are not currently connected we cannot start the query from the Item.

The following Properties are defined on the Item for each Country:

  • ISO 3166-1 alpha-2 (P297)
  • ISO 3166-1 alpha-3 (P298)
  • ISO 3166-1 numeric (P299)
  • country calling code (P474)
  • FIPS 10-4 (countries and regions) (P901)
  • IOC country code (P984)
  • continent (P30)
  • top-level domain (P78)
  • flag (P163)
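As a rough sketch, the string-valued Properties above could be read from an Item in Wikidata's entity JSON layout as shown below. The Property IDs come from the list; the output field names are assumptions. The continent, top-level domain and flag Properties point at other Items rather than strings, so they are left out here and would need to be resolved separately.

```python
# Property IDs come from the list above; the JSON field names are assumptions.
STRING_PROPERTIES = {
    "P297": "alpha2",
    "P298": "alpha3",
    "P299": "numeric",
    "P474": "calling_code",
    "P901": "fips",
    "P984": "ioc",
}


def extract_codes(entity):
    """Pull the first string value of each known Property from an entity dict.

    Expects the layout of Wikidata's entity JSON:
    entity["claims"][property_id][n]["mainsnak"]["datavalue"]["value"].
    """
    record = {}
    claims = entity.get("claims", {})
    for prop, field in STRING_PROPERTIES.items():
        for statement in claims.get(prop, []):
            # Statements marked "no value" or "unknown value" carry no datavalue.
            datavalue = statement.get("mainsnak", {}).get("datavalue")
            if datavalue and isinstance(datavalue.get("value"), str):
                record[field] = datavalue["value"]
                break
    return record
```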


To obtain all country Items that have an ISO 3166-1 alpha-2 code, run the following query: http://wdq.wmflabs.org/api?q=claim[297]
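A script could build the same query for any of the Properties with a small helper such as the one below. The helper name is hypothetical. Note that urlencode percent-encodes the square brackets, which is equivalent to the literal form shown above.

```python
from urllib.parse import urlencode

# Endpoint taken from the query URL above.
WDQ_API = "http://wdq.wmflabs.org/api"


def claim_query_url(property_id):
    """Build the WDQ URL listing every Item that has a claim for the Property.

    property_id is the number without the leading "P", for example 297.
    """
    return WDQ_API + "?" + urlencode({"q": "claim[%d]" % property_id})
```

For example, claim_query_url(298) would give the equivalent query for the alpha-3 code.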