
Infrastructure/Project Metadata: Difference between revisions

From KDE Community Wiki
Mpyne (talk | contribs)
Describe logical module group syntax and semantics.
Ashark (talk | contribs)
m Example: fix https link and fix highlighting
 
Latest revision as of 23:09, 13 June 2024

Git Repository Metadata

Metadata describing the Git repositories that make up KDE software, and the relationships between those repositories, are contained in two different sources.

  1. KDE GitLab instance, where various data about individual repos can be altered by git repository maintainers, including which branches are considered 'stable' and 'development' branches for i18n purposes.
  2. Metadata about the relationships between individual repositories are kept in a separate git repository, sysadmin/repo-metadata (Git Repository Metadata).

Dependencies

The dependencies subfolder of that repository contains several files which scripts and automated programs can use to handle the KDE git repositories properly. The directory contains the following items:

  • build-script-ignore: This file contains a list of git repositories that should be ignored by scripts used to build the KDE source repositories. Empty lines and comments (prefixed by a #) are permitted. Each other line should be the full kde-project path of a module to ignore. Most examples are for modules that simply have nothing to build and install, but other uses include convenience modules that duplicate functionality handled in other source code modules.
  • dependency-data-*: These files contain a list of dependencies between KDE git repositories. They are used by the kdesrc-build script, and the continuous integration infrastructure.
  • logical-module-structure.json: This is a file containing JSON data that describes the proper git branch to use for various high-level groupings of KDE software, such as "KDE Frameworks 5", etc. It is used by the kdesrc-build script, and the continuous integration infrastructure. See the "Logical module grouping" section for details.
  • tools/ directory: Contains scripts which can be used to more easily examine the effects of this metadata, without using kdesrc-build or having to wait for CI.
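The build-script-ignore format described above is simple enough to read in a few lines. The following is a minimal sketch, assuming that comments always occupy a whole line; the function name and the module paths in the sample are hypothetical and only serve as illustration.

```python
def parse_ignore_list(text):
    """Return the module paths listed in build-script-ignore style text."""
    modules = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Empty lines and comment lines (prefixed by '#') are permitted and skipped.
        if not line or line.startswith("#"):
            continue
        # Every other line is taken to be a full kde-project module path.
        modules.append(line)
    return modules


# Hypothetical sample content, not taken from the real file.
sample = """\
# Nothing to build or install here
kde/example/nothing-to-build

playground/example/duplicate
"""
print(parse_ignore_list(sample))
```

A build script could then skip any repository whose full module path appears in the returned list.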

Logical module grouping

In order to make it easy for the various groups building KDE software to get the version they wish, there is a JSON file, "logical-module-structure", that describes logical module groups, so that scripts may automatically select the most appropriate branch for an individual git repository.

The JSON structure is as follows:

A top-level object, with the following key-value pairs:

version
Will be set to the version supported by conforming scripts. This documentation documents version 0 (the number, not a string). It is intended that the version is only increased for changes that cannot be made in a backward-compatible fashion. Scripts should check that the version is set to a supported version and fail if not.
layers
Will be set to an array of the logical module groupings that are available for use. Currently these are kf5-qt5, kf6-qt6 and stable-kf5-qt5, but this can change as needed. Scripts should allow groupings only from this array.
groups
This is set to an object describing the group layout of the layers described above. See "Grouping syntax" for more details.
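The version and layers checks described above could be sketched as follows. This is an illustration under assumptions, not the code of any existing script: the function names and the SUPPORTED_VERSIONS constant are hypothetical.

```python
import json

# Hypothetical: the set of format versions this script understands.
SUPPORTED_VERSIONS = {0}


def load_structure(text):
    """Parse the JSON text and fail if the version is not supported."""
    data = json.loads(text)
    version = data.get("version")
    # The version must be a number, not a string. bool is excluded explicitly
    # because Python treats False as equal to 0.
    if (isinstance(version, bool) or not isinstance(version, int)
            or version not in SUPPORTED_VERSIONS):
        raise ValueError(f"unsupported version: {version!r}")
    return data


def check_layer(data, layer):
    """Fail if the requested layer is not listed in the top-level 'layers' array."""
    if layer not in data.get("layers", []):
        raise ValueError(f"unknown layer: {layer!r}")


data = load_structure('{"version": 0, "layers": ["kf5-qt5", "kf6-qt6"], "groups": {}}')
check_layer(data, "kf6-qt6")  # passes silently
```

Failing early on an unknown version or layer keeps a script from silently picking branches based on a format it does not understand.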

Grouping syntax

As described above, the groups key has an object as its value, which itself contains key/value pairs, where each key describes the kde-project module path to operate on, with wildcards being acceptable. The value for this path is another object, describing the layer_name: branch_name mappings for repositories included in the given kde-project module path.

Note

The kde-project module path used as the key should be the full module path in this case. The same is true for all usages of a module path within this file.


For example:

{
/* .... */
        "kde/*" : {
            "kf6-qt6": "master",
            "kf5-qt5": "master",
            "stable-kf5-qt5": "release/23.08"
        },
        "kde/kdeutils/kcalc" : {
            "kf6-qt6": "master",
            "kf5-qt5": "release/23.08",
            "stable-kf5-qt5": "release/23.08"
        },
/* .... */
}

This shows that for kde/kdeutils/kcalc (and only for this module, since its specifier contains no wildcard), the kf5-qt5 and stable-kf5-qt5 logical groupings both pull from the remote git branch called release/23.08, while kf6-qt6 developers / scripts should use the remote git branch called master.

Selecting a logical group

A logical group may apply to multiple independent git repositories. However, each individual git repository will only match a single logical group (or none at all). In other words, there is no "cascading", so if a layer is not defined within that logical group there is no fallback to a more-generic logical group.

When deciding how to select a logical group, the rule is that the most specific possible match is selected. A * (wildcard) may be used to stand in for a path component, and all remaining components on the module path. In other words, kde/* would find all kde-project modules that begin with kde/, even those with many intermediate path components, e.g. kde/kdeutils/kcalc.

Note

Because the wildcard stands in for a whole path component, it only makes sense directly after a path separator, and it should always be the last part of the specifier. 'extragear/*' makes sense, but 'extragear*/libs' does not!


An easy implementation of the wildcard logic (once we've failed to find an exact match) is to take the list of logical group specifiers, and sort them all by length in descending order. Check each logical group specifier in that order, seeing if the module path being considered starts with the full specifier (except the wildcard). The first specifier where this is true is chosen. Do not forget to handle the case of a specifier consisting of nothing but '*', which will match everything (though it is probably a bad idea to include this in the file itself).
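The procedure just described can be sketched in Python as follows. This is a minimal illustration, assuming that groups is the already-parsed groups object; the function name select_group is hypothetical.

```python
def select_group(module_path, groups):
    """Return the most specific matching group specifier, or None."""
    # An exact match always wins.
    if module_path in groups:
        return module_path
    # Only specifiers that are a lone '*' or end in '/*' are treated as
    # wildcards, since the wildcard must follow a path separator and come last.
    wildcards = [s for s in groups if s == "*" or s.endswith("/*")]
    # Longest specifier first, so the most specific prefix is tried first.
    for spec in sorted(wildcards, key=len, reverse=True):
        prefix = spec[:-1]  # "kde/*" -> "kde/", "*" -> "" (matches everything)
        if module_path.startswith(prefix):
            return spec
    return None


groups = {
    "kde/*": {"kf6-qt6": "master", "kf5-qt5": "master",
              "stable-kf5-qt5": "release/23.08"},
    "kde/kdeutils/kcalc": {"kf6-qt6": "master", "kf5-qt5": "release/23.08",
                           "stable-kf5-qt5": "release/23.08"},
}
print(select_group("kde/kdeutils/kcalc", groups))        # kde/kdeutils/kcalc
print(select_group("kde/workspace", groups))             # kde/*
print(select_group("kdesupport/phonon/phonon", groups))  # None
```

Keeping the separator in the prefix ("kde/" rather than "kde") prevents kde/* from accidentally matching a path such as kdesupport/phonon/phonon.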

If a logical group is matched but does not define a branch name to use for a layer, the script should assume a branch name to use by default. This normally means master, but the idea is that a failure to find a module name in a logical group will not completely stop a build, since many kde-project repositories will not differentiate between different development layers in this fashion.

Using the JSON example given above, the kde/kdeutils/kcalc repository would match its own group, kde/workspace would match the kde/* group, while kdesupport/phonon/phonon would not match any group at all (and therefore would get default branch names, whatever 'default' means for that script).
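The default-branch rule and the absence of cascading can be illustrated with a small helper. This is a sketch under assumptions: the name branch_for and the choice of master as the default are illustrative, and the "partial" group is hypothetical rather than taken from the real file.

```python
def branch_for(group_mapping, layer, default_branch="master"):
    """Return the branch for a layer, given the mapping of the matched group.

    group_mapping is the value object of the single group that matched the
    repository, or None if no group matched at all.
    """
    if group_mapping is None:
        # No group matched: the script falls back to its own default.
        return default_branch
    # No cascading: if the matched group lacks the layer, use the default
    # rather than consulting a more generic group such as "kde/*".
    return group_mapping.get(layer, default_branch)


kcalc = {"kf6-qt6": "master", "kf5-qt5": "release/23.08",
         "stable-kf5-qt5": "release/23.08"}
partial = {"kf6-qt6": "master"}  # hypothetical group that defines one layer only

print(branch_for(kcalc, "kf5-qt5"))           # release/23.08
print(branch_for(partial, "stable-kf5-qt5"))  # master
print(branch_for(None, "kf6-qt6"))            # master
```

Note that the partially defined group yields the default for stable-kf5-qt5 even if a broader wildcard group defines that layer, which is exactly the "no cascading" behaviour.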

Implementation notes

Some important notes for implementors (and eventual editors of this file):

  • JSON does not support comments.
  • With that in mind, key names starting with "__" (2 underscores) should be reserved to allow for user comments. No special handling is needed on the part of implementations, though they may treat any such key/value pairs as comments.
  • Key names starting with "_[a-zA-Z]" (1 underscore and an ASCII charset letter) are reserved for implementation-specific data. The part of the key name after the leading underscore should be the implementation name (it is required only that the key be namespaced, the key name can contain additional data beyond the implementation name). Nothing shall be assumed regarding the type or contents of the value. Implementors are reminded to consider using other top-level files within sysadmin/repo-metadata instead of storing data within this JSON file.
  • The user/implementation keys may appear in any JSON object, not just the top-level object.
    • Note: This means that the user/implementation keys may not appear in JSON arrays, such as the top-level layers array.
    • It is possible that these "comment objects" may need to be reserved to only specific JSON objects, as implementations will otherwise have to filter them out of things like layers or groups. However as this is meant to be human-editable it seems wise to assume that this type of comment will eventually be left, and accordingly to program defensively.
    • Additionally some implementation-specific data might actually be needed within those objects and not at the top-level.
    • Therefore implementations should be sure to first filter out all user/implementation keys from a JSON object if they will use all keys within that object (e.g. groups or layers).
  • All other key names are reserved for this specification.
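The defensive filtering recommended above might look like the following sketch. The helper names and the _mytool_cache key are hypothetical; only the two key-name rules ("__" for user comments, "_[a-zA-Z]" for implementation data) come from the notes above.

```python
import re

# Matches "__..." (user comments) and "_[a-zA-Z]..." (implementation data).
_RESERVED_KEY = re.compile(r"^_[_A-Za-z]")


def is_reserved_key(key):
    """True if the key is a user comment or implementation-specific key."""
    return bool(_RESERVED_KEY.match(key))


def strip_reserved_keys(obj):
    """Return a shallow copy of a JSON object without comment/implementation keys."""
    return {k: v for k, v in obj.items() if not is_reserved_key(k)}


groups = {
    "__comment": "kcalc stays on the release branch",  # user comment
    "_mytool_cache": {"checked": True},                # hypothetical implementation key
    "kde/*": {"kf6-qt6": "master"},
}
print(strip_reserved_keys(groups))  # {'kde/*': {'kf6-qt6': 'master'}}
```

Applying the filter to groups (and to each group's layer mapping) before iterating over the keys keeps comments from being mistaken for module paths or layer names.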

Example

This is an example of a complete skeleton JSON file.

{
    "version" : 0,

    "__README": "https://community.kde.org/Infrastructure/Project_Metadata",

    "layers" : [
        "kf5-qt5",
        "kf6-qt6",
        "stable-kf5-qt5"
    ],

    "groups" : {
        "kde/*" : {
            "kf6-qt6": "master",
            "kf5-qt5": "master",
            "stable-kf5-qt5": "release/23.08"
        },
        "kde/kdeutils/kcalc" : {
            "kf6-qt6": "master",
            "kf5-qt5": "release/23.08",
            "stable-kf5-qt5": "release/23.08"
        }
    }
}