KDE Core/ISO Codes

From KDE Community Wiki

ISO Codes in KDE

KDE uses ISO standard codes in a number of places, primarily the Country Code, Language Code and Currency Code in KLocale. KDE currently maintains its own data files and translations for these codes, which imposes a maintenance burden to keep both up to date.

The Debian iso-codes project maintains a package that includes XML files for various ISO codes, with translations for them in .po files. The project is well maintained, regularly updated, and used by many projects and distributions. It would make sense to adopt iso-codes as the source for our codes and translations.

TODO: check if part of MeeGo architecture

The iso-codes package is 1.1 MB for the data files and 10.3 MB for the translation files; however, these are often installed anyway. This compares to KDE requiring approximately 2 MB for Country and Currency data and translations.

If the problems involved in migrating to iso-codes cannot be resolved, then we could migrate to our own xml file format as a new kdesupport project which could attract external use and help in maintenance.

Translation Problems

We cannot switch until we are sure that translations will not regress. We need to ensure all shipping or near-shipping KDE languages are fully supported by iso-codes to our high standards. This will likely require the KDE translators to donate translations to iso-codes where necessary and possibly agree to maintain the translations where iso-codes does not currently have a team.

ISO 3166 Country Code

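The iso-codes ISO 3166 XML file contains two sections of the ISO Country Code standard: the current country codes of ISO 3166-1 (the iso_3166_entry elements) and the formerly used country codes of ISO 3166-3 (the iso_3166_3_entry elements).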

The iso-codes xml format provides for the three different code types (alpha2, alpha3 and numeric) and both official and unofficial/common names of a country.

The iso-codes project xml format is as follows:

<!DOCTYPE iso_3166_entries [
        <!ELEMENT iso_3166_entries (iso_3166_entry+, iso_3166_3_entry*)>
        <!ELEMENT iso_3166_entry EMPTY>
        <!ATTLIST iso_3166_entry
                alpha_2_code            CDATA   #REQUIRED
                alpha_3_code            CDATA   #REQUIRED
                numeric_code            CDATA   #REQUIRED
                common_name             CDATA   #IMPLIED
                name                    CDATA   #REQUIRED
                official_name           CDATA   #IMPLIED
        >
        <!ELEMENT iso_3166_3_entry EMPTY>
        <!ATTLIST iso_3166_3_entry
                alpha_4_code            CDATA   #REQUIRED
                alpha_3_code            CDATA   #REQUIRED
                numeric_code            CDATA   #IMPLIED
                date_withdrawn          CDATA   #IMPLIED
                names                   CDATA   #REQUIRED
                comment                 CDATA   #IMPLIED
        >
]>

Some example entries are:

        <iso_3166_entry
                alpha_2_code="AF"
                alpha_3_code="AFG"
                numeric_code="004"
                name="Afghanistan"
                official_name="Islamic Republic of Afghanistan" />
        <iso_3166_3_entry
                alpha_4_code="YUCS"
                alpha_3_code="YUG"
                numeric_code="891"
                date_withdrawn="1993-07-28"
                names="Yugoslavia, Socialist Federal Republic of" />
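
As a rough illustration of how little is needed to consume this format, the following Python sketch parses the two example entries above with the standard library. The function name load_countries and the choice of dictionary keys are assumptions made for this example only; they are not part of iso-codes or KDE.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the example entries above. A real consumer would
# read the installed iso-codes file instead (its path varies by distribution).
SAMPLE = """<iso_3166_entries>
    <iso_3166_entry
        alpha_2_code="AF"
        alpha_3_code="AFG"
        numeric_code="004"
        name="Afghanistan"
        official_name="Islamic Republic of Afghanistan" />
    <iso_3166_3_entry
        alpha_4_code="YUCS"
        alpha_3_code="YUG"
        numeric_code="891"
        date_withdrawn="1993-07-28"
        names="Yugoslavia, Socialist Federal Republic of" />
</iso_3166_entries>"""


def load_countries(xml_text):
    """Return (current, withdrawn) dicts of attribute dicts.

    Current entries are keyed by alpha-2 code, withdrawn entries by their
    alpha-4 code. Optional (#IMPLIED) attributes are simply absent.
    """
    root = ET.fromstring(xml_text)
    current = {e.get("alpha_2_code"): dict(e.attrib)
               for e in root.iter("iso_3166_entry")}
    withdrawn = {e.get("alpha_4_code"): dict(e.attrib)
                 for e in root.iter("iso_3166_3_entry")}
    return current, withdrawn


current, withdrawn = load_countries(SAMPLE)
print(current["AF"]["name"])
print(withdrawn["YUCS"]["date_withdrawn"])
```

Because common_name and official_name are optional, callers should use a lookup with a fallback rather than assume either attribute is present.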

Translation Status

The base xml file is in standard US English.

http://translationproject.org/domain/iso_3166.html

Version 3.25 of iso-codes ships with 89 translation files for ISO 3166: 61 languages are translated via The Translation Project, 8 are externally hosted and 20 are unmaintained.

At least 50% of the languages are 97% to 100% complete, with a further 15% at least 80% complete.

It is a little hard to compare translation statistics directly, as iso-codes includes official, unofficial and former names, whereas KDE has only unofficial names, which are mixed into one file with other translations.

  • KDE 4.6 shipped 53 languages, 8 of which are not shipped by iso-codes 3166; 4 are 88%-89% complete and the rest are 98%-100% complete
  • Previous KDE4 versions shipped 17 other languages, 9 of which are not in iso-codes 3166
  • KDE has 19 other languages that have not yet shipped, 12 of which are not in iso-codes 3166
  • In total KDE4 has 89 languages, 29 of which are not in iso-codes 3166; only 4 of these are less than 75% complete, so all could possibly be useful to iso-codes
  • iso-codes has 9 languages that KDE does not have

Interestingly, many of the unmaintained translations appear to have been copied from KDE3.

iso-codes Change Required

We may need to review the unofficial names in iso-codes to see whether they match ours closely enough.

KDE Changes Required

  • Add KLocale::countryCodes(), which returns a QList<QString> of all country codes loaded from the iso-codes XML file, in the correct uppercase format.
  • Add KLocale::countryName(), which takes a country code, the name type to return (official or unofficial) and a language code to translate into. The default values return the country name of the current locale, in informal format, in the current language. Name translations are loaded from the iso-codes .po files.
  • Add KLocale::countryNames(), which returns a QList<QPair<QString,QString>> of all country codes and their names in the requested format and language.
  • Modify KLocale::allCountriesList() to call countryCodes() and return the codes in lowercase. Add the C value. Mark as deprecated.
  • Modify KLocale::countryCodeToName() to call countryName(). Mark as deprecated.
  • Modify the kde-runtime/l10n/ *.desktop files to remove the Name field and its translations, and probably rename them from .desktop to .locale or similar, if that doesn't break some implied API guarantee.
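
To make the intended behaviour of these calls easier to discuss, here is a small Python model of the lookup logic. It is only a sketch and not the proposed C++ implementation: the in-memory table, the snake_case function names, the fallback from a missing name variant to the plain name, and the translate callable (standing in for a lookup in the iso-codes .po files) are all assumptions of this example.

```python
# Hypothetical in-memory table. In the proposal this data would be loaded
# from the iso-codes XML file. The AF entry matches the example above; the
# other entries are illustrative values.
COUNTRIES = {
    "AF": {"name": "Afghanistan",
           "official_name": "Islamic Republic of Afghanistan"},
    "BO": {"name": "Bolivia, Plurinational State of",
           "common_name": "Bolivia",
           "official_name": "Plurinational State of Bolivia"},
    "NZ": {"name": "New Zealand"},
}


def country_codes():
    """Model of countryCodes(): all codes, in uppercase."""
    return sorted(COUNTRIES)


def country_name(code, name_type="common", translate=lambda text: text):
    """Model of countryName().

    Assumption: when the requested variant is missing, fall back to the
    mandatory 'name' attribute. 'translate' stands in for a .po lookup.
    """
    entry = COUNTRIES[code.upper()]
    key = "official_name" if name_type == "official" else "common_name"
    return translate(entry.get(key) or entry["name"])


def country_names(name_type="common", translate=lambda text: text):
    """Model of countryNames(): (code, name) pairs for every country."""
    return [(code, country_name(code, name_type, translate))
            for code in country_codes()]


def all_countries_list():
    """Model of the deprecated allCountriesList(): lowercase codes plus C."""
    return [code.lower() for code in country_codes()] + ["C"]


print(country_name("BO"))
print(country_name("af", "official"))
print(all_countries_list())
```

Keeping the deprecated call as a thin wrapper over the new one means only one code path has to read the iso-codes data.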

Country Code Format Conversion

A number of apps in KDE may need to convert between the different code formats, e.g. EXIV2 stores the country code using the Alpha3 code. As the iso-codes file provides all the code formats, we can provide conversion tools. We could add an extra parameter to all the country code API calls to allow any code format to be used, but I think this would just confuse matters. We should instead stick with a single format as the standard and provide a single API call to convert the codes.
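
The single conversion call could look roughly like the following Python sketch. The function name convert_country_code, the format labels and the tiny hard-coded table are hypothetical and chosen for illustration; a real implementation would build the table from the iso-codes XML file.

```python
# Each row is (alpha2, alpha3, numeric). Two rows are hard-coded here for
# illustration; the real table would come from the iso-codes data.
_CODES = [
    ("AF", "AFG", "004"),
    ("NZ", "NZL", "554"),
]

# Position of each code format within a row.
FORMATS = ("alpha2", "alpha3", "numeric")


def convert_country_code(code, source, target):
    """Convert 'code' from the 'source' format to the 'target' format.

    Returns None when the code is unknown. An unknown format label raises
    ValueError, since that is a programming error rather than bad data.
    """
    src = FORMATS.index(source)
    dst = FORMATS.index(target)
    wanted = code.upper()
    for row in _CODES:
        if row[src] == wanted:
            return row[dst]
    return None


print(convert_country_code("AFG", "alpha3", "alpha2"))
print(convert_country_code("nz", "alpha2", "numeric"))
```

With this shape, every other country API call keeps accepting a single standard format, and callers that hold a different format convert once at the boundary.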

ISO 3166-2 Country Subdivision Code

The iso-codes file for ISO 3166-2 contains one section of the ISO Country Code standard:

ISO xxx Language Codes

ISO 4217 Currency Codes

The iso-codes ISO 4217 xml file contains the ISO Currency Code standard.

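The iso-codes XML format provides for the letter code, the numeric code and the name of each currency, plus the withdrawal date for historic currencies.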

The iso-codes project xml format is as follows:

<!DOCTYPE iso_4217_entries [
        <!ELEMENT iso_4217_entries (iso_4217_entry+, historic_iso_4217_entry*)>
        <!ELEMENT iso_4217_entry EMPTY>
        <!ATTLIST iso_4217_entry
                letter_code             CDATA   #REQUIRED
                numeric_code            CDATA   #IMPLIED
                currency_name           CDATA   #REQUIRED
        >
        <!ELEMENT historic_iso_4217_entry EMPTY>
        <!ATTLIST historic_iso_4217_entry
                letter_code             CDATA   #REQUIRED
                numeric_code            CDATA   #IMPLIED
                currency_name           CDATA   #REQUIRED
                date_withdrawn          CDATA   #REQUIRED
        >
]>

Some example entries are:

        <iso_4217_entry
                letter_code="NZD"
                numeric_code="554"
                currency_name="New Zealand Dollar" />
        <historic_iso_4217_entry
                letter_code="YUN"
                numeric_code="890"
                currency_name="Yugoslavian Dinar"
                date_withdrawn="1995-11" />
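
For comparison with the country data, this Python sketch reads the two example currency entries above and records whether each one is current or historic. The function names and the shape of the returned dictionary are assumptions of this example, not an existing API.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the example entries above.
SAMPLE = """<iso_4217_entries>
    <iso_4217_entry
        letter_code="NZD"
        numeric_code="554"
        currency_name="New Zealand Dollar" />
    <historic_iso_4217_entry
        letter_code="YUN"
        numeric_code="890"
        currency_name="Yugoslavian Dinar"
        date_withdrawn="1995-11" />
</iso_4217_entries>"""

ENTRY_TAGS = ("iso_4217_entry", "historic_iso_4217_entry")


def load_currencies(xml_text):
    """Return a dict keyed by letter code.

    'numeric' is None when the optional attribute is absent, and
    'withdrawn' is None for current currencies. If a letter code appeared
    more than once, the last entry read would win in this simple sketch.
    """
    currencies = {}
    for entry in ET.fromstring(xml_text):
        if entry.tag not in ENTRY_TAGS:
            continue
        currencies[entry.get("letter_code")] = {
            "name": entry.get("currency_name"),
            "numeric": entry.get("numeric_code"),
            "withdrawn": entry.get("date_withdrawn"),
        }
    return currencies


def is_current(currencies, code):
    """True when the currency has no withdrawal date."""
    return currencies[code]["withdrawn"] is None


currencies = load_currencies(SAMPLE)
print(currencies["NZD"]["name"], is_current(currencies, "NZD"))
print(currencies["YUN"]["withdrawn"], is_current(currencies, "YUN"))
```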

Translation Status

The base xml file is in standard US English.

http://translationproject.org/domain/iso_4217.html

Version 3.25 of iso-codes ships with 39 translation files for ISO 4217: 35 languages are translated via The Translation Project and 4 are externally hosted.

At least x% of the languages are 90% to 100% complete, with a further x% at least 80% complete.

It is a little hard to compare translation statistics directly, as iso-codes uses official names, whereas KDE uses adjectival-form names, which are mixed into one file with other translations.

  • KDE 4.6 shipped 53 languages, x of which are not shipped by iso-codes 4217, and x of which are 80%-89% the rest being 90%-100%
  • Previous KDE4 versions shipped 17 other languages, x of which are not in iso-codes 4217
  • KDE has 19 other languages that have not yet shipped, x of which are not in iso-codes 4217
  • In total KDE4 has 89 languages, x of which are not in iso-codes 4217, and only x of which are less than 75% so all could possibly be useful to iso-codes
  • iso-codes 4217 has x languages that KDE does not have

iso-codes Change Required

KDE Changes Required