Infrastructure/Project Metadata: Difference between revisions

From KDE Community Wiki



Revision as of 18:25, 25 July 2013

KDE Project Metadata

Metadata describing the Git repositories that make up KDE software, and the relationships between those repositories, are contained in two different sources.

  1. A KDE Projects Management website, where various data about individual repos can be altered by git repository maintainers, including which branches are considered 'stable' and 'development' branches for i18n purposes.
  2. A separate git repository, kde-build-metadata, which holds metadata about the relationships between individual repositories.

kde-build-metadata

The kde-build-metadata repository contains several files which can be used by scripts and automated programs to properly handle the KDE git repositories. As of this writing, these files include:

  • build-script-ignore: This file contains a list of git repositories that should be ignored by scripts used to build the KDE source repositories. Empty lines and comments (prefixed by a #) are permitted. Each other line should be the full kde-project path of a module to ignore. Most examples are for modules that simply have nothing to build and install, but other uses include convenience modules that duplicate functionality handled in other source code modules.
  • dependency-data: This file contains a list of dependencies between KDE git repositories. It is used by the kdesrc-build build script, and the continuous integration infrastructure.
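The build-script-ignore format described above is simple enough to read in a few lines. The following is a minimal sketch, not the code used by any existing script; it assumes comments occupy whole lines, and the function name and the sample module paths are hypothetical.

```python
def parse_ignore_list(text):
    """Return the module paths listed in build-script-ignore-style text."""
    modules = []
    for line in text.splitlines():
        line = line.strip()
        # Empty lines and comment lines (prefixed by '#') are permitted
        # and carry no module path.
        if not line or line.startswith("#"):
            continue
        # Every other line is the full kde-project path of a module to ignore.
        modules.append(line)
    return modules


# Hypothetical file contents, for illustration only.
sample = """\
# Nothing to build or install here
kde/example/docs-only

kde/example/convenience-module
"""
ignored = parse_ignore_list(sample)
print(ignored)
```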

Logical module grouping

Note

This section documents a proposed addition. Nothing actually uses this at this point, although it has been reviewed by some of the sysadmins.


In order to make it easy for the various groups building KDE software to get the version they wish, there is a proposal to add the concept of logical module groups so that scripts may automatically select the most appropriate branch for a given individual git repository.

The current proposal is to use a JSON file, with the following structure:

A top-level object, with the following key-value pairs:

version
Will be set to the version supported by conforming scripts. This documentation documents version 0 (the number, not a string). It is intended that the version is only increased for changes that cannot be made in a backward-compatible fashion. Scripts should check that the version is set to a supported version and fail if not.
layers
Will be set to an array of the logical module groupings that are available for use. Currently this is stable-qt4, latest-qt4, kf5-qt5, but this can change as needed. Scripts should allow groupings only from this array.
groups
This is set to an object describing the group layout of the layers described above. See Grouping syntax for more details.
dependencies
This contains and supersedes the information defined in dependency-data. See Specifying dependencies for more details.
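The checks asked of scripts above (fail on an unsupported version, allow only groupings from the layers array) might look like the following sketch. The function names are hypothetical, and rejecting every non-integer version value is an assumption based on the remark that the version is a number, not a string.

```python
import json

# Versions this hypothetical script understands; the text documents version 0.
SUPPORTED_VERSIONS = {0}


def load_metadata(text):
    """Parse the metadata JSON and fail early on unsupported input."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    version = data.get("version")
    # The version must be a number, not a string. bool is excluded because
    # it is a subclass of int in Python.
    if isinstance(version, bool) or not isinstance(version, int):
        raise ValueError(f"version must be an integer, got {version!r}")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported metadata version {version}")
    if not isinstance(data.get("layers"), list):
        raise ValueError("layers must be an array")
    return data


def require_layer(data, layer):
    """Reject logical groupings that are not listed in the layers array."""
    if layer not in data["layers"]:
        raise ValueError(f"unknown layer {layer!r}")
    return layer


metadata = load_metadata(
    '{"version": 0, "layers": ["stable-qt4", "latest-qt4", "kf5-qt5"]}'
)
print(require_layer(metadata, "kf5-qt5"))
```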

Grouping syntax

As described above, the groups key has an object as its value, which itself contains key/value pairs, where each key describes the kde-project module path to operate on, with wildcards being acceptable. The value for this path is another object, mapping each layer to the branch name to use for repositories included in the given kde-project module path.

Note

The kde-project module path used as the key should be the full module path in this case. The same is true for all usages of a module path within this file.


For example:

{
/* .... */
        "kde/*": {
            "stable-qt4": "KDE/4.11",
            "latest-qt4": "master",
            "kf5-qt5": "frameworks"
        },
        "kde/kde-workspace" : {
            "stable-qt4": "KDE/4.11",
            "latest-qt4": "KDE/4.11",
            "kf5-qt5": "master"
        }
/* .... */
}

This shows that for kde/kde-workspace (and only for this module, since there are no wildcards), the stable-qt4 and latest-qt4 logical groupings both pull from the remote git branch called KDE/4.11, while kf5-qt5 developers / scripts should use the remote git branch called master.

Selecting a logical group

A logical group may apply to multiple independent git repositories. However, each individual git repository will only match a single logical group (or none at all). In other words, there is no "cascading", so if a layer is not defined within that logical group there is no fallback to a more-generic logical group.

When deciding how to select a logical group, the rule is that the most specific possible match is selected. A * wildcard may be used to stand in for a path component, and all remaining components on the module path. In other words, kde/* would find all kde-project modules that begin with kde/, even those with many intermediate path components, e.g. kde/kdelibs/nepomuk-core.

Note

Because the wildcard stands in for a path component, it only makes sense directly after a path separator, and it should always be the last part of the specifier. 'kde/*' makes sense, but 'kde*/libs' does not!


An easy implementation of the wildcard logic (once we've failed to find an exact match) is to take the list of logical group specifiers, and sort them all by length in descending order. Search each logical group specifier in that order, seeing if the module path being considered starts with the full specifier (except the wildcard). The first specifier where this is true is chosen. Do not forget to handle the case of a specifier consisting of nothing but '*', which will match everything (though, this is probably a bad idea to include in the file itself!)
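The matching procedure just described can be sketched as follows. This is an illustration of the described steps rather than code from any existing script; the function name is hypothetical, and it assumes that any user/implementation comment keys have already been removed from the specifiers.

```python
def match_group(module_path, specifiers):
    """Return the most specific specifier matching module_path, or None.

    specifiers may be any collection of strings, including the groups
    object itself (a dict), in which case its keys are used.
    """
    # An exact match always wins.
    if module_path in specifiers:
        return module_path
    # Only specifiers ending in the wildcard take part in prefix matching.
    wildcards = [spec for spec in specifiers if spec.endswith("*")]
    # Longest specifier first, so the most specific prefix is tried first.
    for spec in sorted(wildcards, key=len, reverse=True):
        prefix = spec[:-1]  # drop the trailing '*'; a bare '*' leaves ''
        if module_path.startswith(prefix):
            return spec
    return None


specifiers = ["kde/*", "kde/kde-workspace"]
for path in ("kde/kde-workspace", "kde/kdeutils/kcalc", "kdesupport/phonon/phonon"):
    print(path, "->", match_group(path, specifiers))
```

Because every string starts with the empty string, a bare '*' specifier needs no special case here: it sorts last and matches whatever is left.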

If a logical group is matched but does not define a branch name to use for a layer, the script should assume a branch name to use by default. This normally means master, but the idea is that a failure to find a module name in a logical group will not completely stop a build, since many kde-project repositories will not differentiate between different development layers in this fashion.

Using the example given above, the kde/kde-workspace repository would match its own group, kde/kdeutils/kcalc would match the kde/* group, while kdesupport/phonon/phonon would not match any group at all (and therefore would get default branch names, whatever 'default' means for that script).
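The default-branch behaviour can be sketched as below. The function name is hypothetical, and using master as the default is only the "normal" choice mentioned above; a script is free to define its own default.

```python
def resolve_branch(group, layer, default="master"):
    """Pick the branch for a layer from an already-matched group object.

    group is the layer -> branch object of the matched logical group,
    or None if the module matched no group at all.
    """
    if group is None:
        return default
    # No cascading: a layer missing from the matched group falls back to
    # the default, not to a more generic group such as "kde/*".
    return group.get(layer, default)


kde_workspace = {
    "stable-qt4": "KDE/4.11",
    "latest-qt4": "KDE/4.11",
    "kf5-qt5": "master",
}
print(resolve_branch(kde_workspace, "stable-qt4"))
print(resolve_branch(None, "stable-qt4"))
```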

Specifying dependencies

The last major object in the JSON container is dependencies, which describes how the various KDE git modules relate (and don't relate) to each other. This is essential for the continuous integration system, but is also useful for automated build software.

Warning

This dependencies object is not actually in use yet -- see dependency-data for the currently-used variant.


This container is rather hard to describe, because it has a hard problem to solve: It is not enough to specify that a repository relates to another repository, since the dependency concept stretches all the way to individual git branches within a repository.

So what we do instead is to allow branches to depend on branches, allow a repository to depend on a branch with exceptions, or break up dependencies from a branch to a whole repository.

Example dependency specification

As before, we will use an example to illustrate. Although it looks complex, that is mostly because it demonstrates every possible specification.

{
/* .... */
    "dependencies": {
        "kde/kdelibs": {
            /* Essentially just here for self-documentation, should be the only object */
            "branch_dependencies": {
                "frameworks": {
                    "dependent_on": {
                        "Qt5": "stable",
                        "kdesupport/phonon/phonon": "phonon4qt5"
                    },
                    "can_use": {
                    },
                    "not_dependent_on": {
                        "Qt": "*",
                        "kdesupport/automoc": "master",
                        "kdesupport/phonon/phonon": "master",
                        "kdesupport/polkit-qt-1": "master",
                        "kdesupport/soprano": "master"
                    }
                },
                "*": {
                    "dependent_on": {
                        "kdesupport/attica": "master",
                        "kdesupport/phonon/phonon": "master",
                        "kdesupport/strigi/libstreams": "master",
                        "kdesupport/strigi/libstreamanalyzer": "master",
                        "kdesupport/strigi/strigiclient": "master",
                        "kdesupport/polkit-qt-1": "master",
                        "kdesupport/soprano": "master"
                    },
                    "can_use": {
                    },
                    "not_dependent_on": {
                    }
                }
            }
        }
    }
/* .... */
}

With the example displayed in mind we can specify the components. The dependencies object consists of key/value pairs, where each key is a wildcard module specifier, with wildcard semantics as described in Selecting a logical group, and each value is a...

Module dependency object

The module dependency object should have a single child, a key/value pair with a key of branch_dependencies, with a value of another object. Because JSON does not permit comments of any sort, this is done to allow for some semblance of self-documentation within the file. Additionally it allows for future expansion without having to bump the version.

The value of this single key is the aptly named....

Branch dependency objects

As mentioned before, this is the lowest level at which dependency relationships make sense. Accordingly, this object is where you first start seeing dependency entries being made.

This object consists of a list of key/value pairs. Each key is either a specific git repository branch name (as given on the git.kde.org side of the repository), or *, which means that the dependencies will apply to all other branches.

The value for each of these keys is yet another object (though we're getting close to the end). This object should contain three key/value pairs. The three keys are:

dependent_on
The value object for this key indicates module/branch pairs that must be built before the current module/branch.
can_use
The value object for this key indicates module/branch pairs that provide functionality which the current module/branch could use, but does not require. (This is the kind of thing which might be instructive to evaluate with the continuous integration system as well, to ensure the current module/branch builds and passes tests with and without the optional module/branch.)
not_dependent_on
The value object for this key indicates module/branch pairs that the current module/branch specifically does not depend upon. This may be needed in situations where the mentioned module/branch might otherwise be innocently included in the build (e.g. simply from being present on the C.I. system) but would conflict with a module/branch from dependent_on or can_use.

Dependency specifier objects

The object value for each of the 3 listed types of dependency specifiers consists of key/value pairs, with the key and value both being strings.

Key
The key is the full kde-projects module path describing the git repository to add/remove a dependency on, or one of the values Qt (meaning Qt4) or Qt5. The latter special specifiers are used by the C.I. system to ensure a correct build environment is prepared.
Value
The value is a string listing the branch to add/remove a dependency on. The branch name should be a branch available for the given module on the git.kde.org server, or may be * to indicate the branch doesn't matter (i.e. that this dependency applies to the entire module).
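Putting the pieces of this section together, looking up the dependency entry for one branch of a module might be sketched as follows. The function name is hypothetical. The sketch assumes that an entry for a specific branch replaces the * entry rather than being merged with it, which is one reading of "all other branches".

```python
def branch_entry(module_entry, branch):
    """Return the branch dependency object that applies to branch."""
    branches = module_entry.get("branch_dependencies", {})
    # A key naming the branch itself takes precedence.
    if branch in branches:
        return branches[branch]
    # Otherwise "*" covers all other branches, if it is present.
    return branches.get("*", {})


# Trimmed-down version of the kde/kdelibs example from this page.
kdelibs = {
    "branch_dependencies": {
        "frameworks": {
            "dependent_on": {"Qt5": "stable"},
            "can_use": {},
            "not_dependent_on": {"Qt": "*"},
        },
        "*": {
            "dependent_on": {"kdesupport/soprano": "master"},
            "can_use": {},
            "not_dependent_on": {},
        },
    }
}
print(branch_entry(kdelibs, "frameworks")["dependent_on"])
print(branch_entry(kdelibs, "KDE/4.11")["dependent_on"])
```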

Implementation notes

Some important notes for implementors (and eventual editors of this file):

With regard to dependency data

  • It is required to use the { } key/value pair syntax for each module/branch dependency specifier, even if all the dependencies are branch-independent. Implementations may assume this.
  • Implementations must not assume that the dependency data for a given module/branch pair includes all 3 required keys from Branch dependency objects. Any of the 3 that is not included should be treated as if it had been specified with an empty list of module/branch dependency pairs.
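One way to honour the second note is to normalize each branch dependency object before using it. This is a minimal sketch with hypothetical names, not a required implementation.

```python
DEPENDENCY_KINDS = ("dependent_on", "can_use", "not_dependent_on")


def normalize_branch_entry(entry):
    """Return a copy of entry with all three dependency keys present.

    Keys missing from entry are filled in with an empty object. Any other
    keys (for example comment keys) are left out of the result.
    """
    return {kind: dict(entry.get(kind, {})) for kind in DEPENDENCY_KINDS}


partial = {"dependent_on": {"Qt5": "stable"}}
print(normalize_branch_entry(partial))
```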

In general

  • JSON does not support comments.
  • JSON DOES NOT SUPPORT COMMENTS.
  • With that in mind, key names starting with "__" (2 underscores) should be reserved to allow for user comments. No special handling is needed on the part of implementations, though they may treat any such key/value pairs as comments.
  • Key names starting with "_[a-zA-Z]" (1 underscore and an ASCII charset letter) are reserved for implementation-specific data. The part of the key name after the leading underscore should be the implementation name (it is required only that the key be namespaced, the key name can contain additional data beyond the implementation name). Nothing shall be assumed regarding the type or contents of the value. Implementors are reminded to consider using other top-level files within kde-build-metadata instead of storing data within this JSON file.
  • The user/implementation keys may appear in any JSON object, not just the top-level object.
    • Note: This means that the user/implementation keys may not appear in JSON arrays, such as the top-level layers array.
    • It is possible that these "comment objects" may need to be restricted to specific JSON objects only, as implementations will otherwise have to filter them out of things like dependencies or groups. However, as this file is meant to be human-editable, it seems wise to assume that this type of comment will eventually be left there, and to program defensively accordingly.
    • Additionally some implementation-specific data might actually be needed within those objects and not at the top-level.
    • Therefore implementations should be sure to first filter out all user/implementation comments from a JSON object, if they will use all keys within that object (e.g. groups or dependencies).
  • All other key names are reserved for this specification.
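The defensive filtering recommended above can be sketched as a small recursive pass over the parsed JSON. The function name is hypothetical, and a real implementation would presumably read its own "_name" keys before discarding them.

```python
import re

# "__..." is reserved for user comments and "_[a-zA-Z]..." for
# implementation-specific data; both are filtered out here.
RESERVED_KEY = re.compile(r"_(_|[a-zA-Z])")


def strip_reserved_keys(value):
    """Return a copy of value without user/implementation keys."""
    if isinstance(value, dict):
        return {
            key: strip_reserved_keys(child)
            for key, child in value.items()
            if not RESERVED_KEY.match(key)
        }
    if isinstance(value, list):
        # The reserved keys cannot appear in arrays themselves, but objects
        # nested inside an array are still cleaned.
        return [strip_reserved_keys(child) for child in value]
    return value


raw = {
    "version": 0,
    "__README": "see the wiki page",
    "layers": ["stable-qt4"],
    "groups": {
        "__comment": "workspace lags behind",
        "_mytool-cache": {"anything": 1},
        "kde/kde-workspace": {"stable-qt4": "KDE/4.11"},
    },
}
clean = strip_reserved_keys(raw)
print(sorted(clean["groups"]))
```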

Example

This is an example of a full shell JSON file.

{
    "version" : 0,

    "__README": "http://community.kde.org/Infrastructure/Project_Metadata#kde-build-metadata",

    "layers" : [
        "stable-qt4",
        "latest-qt4",
        "kf5-qt5"
    ],

    "groups" : {
        "kde/kde-workspace" : {
            "stable-qt4": "KDE/4.11",
            "latest-qt4": "KDE/4.11",
            "kf5-qt5": "master"
        }
    },

    "dependencies": {
        "kde/kdelibs": {
            "branch_dependencies": {
                "*": {
                    "dependent_on": {
                        "kdesupport/soprano": "master"
                    },
                    "can_use": {
                    },
                    "not_dependent_on": {
                    }
                },
                "frameworks": {
                    "dependent_on": {
                        "Qt5": "stable"
                    },
                    "can_use": {
                    },
                    "not_dependent_on": {
                        "Qt": "*"
                    }
                }
            }
        }
    }
}