KDE PIM/Akonadi/Architecture

From KDE Community Wiki
Revision as of 16:20, 17 March 2018 by Dvratil

Akonadi Concepts and Architecture

This document describes and explains the core elements within Akonadi (like Items, Collections, etc.) as well as the architecture of the entire solution (clients, agents, server, etc.) and how they interact with each other. The reason this is all explained in a single document is so that it's easier to see how all the dots connect.

Eventually, this should be moved or copied into Akonadi docs.


Basic Entities

The term Entity is used as a common term for all the elements described below.

Attributes

Attributes are additional metadata that can be attached to other Entities (except for other Attributes). An Attribute has a type and a value. Client applications and Agents can define their own Attributes, but there are also some pre-defined Attributes, like the "EntityDisplay" Attribute, which allows customizing how an Entity is presented to the user in clients (by setting a custom display name, icon, background color etc.).

Items

An Item is an abstract representation of data. Items have metadata (ID, size, mimetype, etc.), payload parts (the actual data, e.g. email envelope, email head and email body) and attributes. An Item can represent an email, a contact, a calendar event etc. Each Item has exactly one parent Collection.

Collections

A Collection, as the name suggests, is a collection of Items. A Collection can also have child Collections, thus forming a Collection tree. Collections can also have attributes.
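To make the Item and Collection relationship concrete, here is a minimal sketch in Python. It is a toy model, not the Akonadi API: the class names, field names and the `path()` helper are all hypothetical and only mirror what the text above says (one parent Collection per Item, Collections forming a tree, attributes on both).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Collection:
    """Hypothetical stand-in for a Collection (not the real Akonadi class)."""
    name: str
    parent: Optional["Collection"] = None
    children: list = field(default_factory=list)
    items: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)

    def add_child(self, name: str) -> "Collection":
        child = Collection(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        # Walk up the tree to build a readable path.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


@dataclass(eq=False)
class Item:
    """Hypothetical stand-in for an Item."""
    mimetype: str
    parent: Collection  # exactly one parent Collection
    parts: dict = field(default_factory=dict)       # payload parts
    attributes: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Register the Item with its single parent.
        self.parent.items.append(self)


root = Collection("imap-account")
inbox = root.add_child("INBOX")
mail = Item("message/rfc822", parent=inbox,
            parts={"ENVELOPE": b"...", "HEAD": b"...", "BODY": b"..."})
```

In this sketch `inbox.path()` yields the position of the Collection in the tree, and the email Item is reachable only through its one parent.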

Virtual Collections

A Virtual Collection is represented as a regular Collection, but it has a special property: it cannot own any Items, nor can it have any subcollections unless they are virtual as well. Instead of being owned by a Virtual Collection, Items are linked into it. One Item can be linked into multiple Virtual Collections.

Virtual Collections are typically used to hold search results; that is, a Virtual Collection represents a search query and all Items linked to it are those that match the query.
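The difference between owning and linking can be sketched as follows. This is an illustrative Python model only; `Collection`, `Item`, `link()` and `search()` are hypothetical names, and the rule that only Virtual Collections accept links is an assumption of the sketch.

```python
class Collection:
    """Toy Collection; `virtual` marks a Virtual Collection."""

    def __init__(self, name, virtual=False):
        self.name = name
        self.virtual = virtual
        self.owned = []   # Items whose parent is this Collection
        self.linked = []  # Items linked into this Virtual Collection

    def link(self, item):
        if not self.virtual:
            raise ValueError("this sketch only links Items into Virtual Collections")
        if item not in self.linked:
            self.linked.append(item)


class Item:
    def __init__(self, subject, parent):
        if parent.virtual:
            raise ValueError("a Virtual Collection cannot own Items")
        self.subject = subject
        self.parent = parent
        parent.owned.append(self)


def search(name, collections, predicate):
    """Build a Virtual Collection linking every Item that matches."""
    result = Collection(name, virtual=True)
    for collection in collections:
        for item in collection.owned:
            if predicate(item):
                result.link(item)
    return result


inbox = Collection("INBOX")
lists = Collection("Mailing lists")
a = Item("KDE release plan", inbox)
b = Item("Lunch?", inbox)
c = Item("KDE PIM sprint", lists)

kde = search("Search: KDE", [inbox, lists], lambda item: "KDE" in item.subject)
todo = Collection("Search: to do", virtual=True)
todo.link(a)  # the same Item is now linked into two Virtual Collections
```

Note that linking never changes `item.parent`: the Item stays owned by its regular Collection.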

Tags

A Tag describes a common abstract relation between multiple Items. For example, a "Work" tag can be assigned to many emails, tasks and events (or rather Items representing those) that are somehow related to the user's work. A single Item can have multiple Tags and a single Tag can be assigned to multiple Items.

Relation

A Relation describes a specific relation between exactly two Items - for example, there can be an "INVITATION" Relation between an Item that represents an email with a meeting invitation and an Item that represents the calendar event created from that email. A single Item can be in multiple Relations, even in multiple Relations of the same type, but there are always exactly two Items in each Relation.
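The contrast between Tags (many-to-many) and Relations (exactly two Items) can be illustrated with a small Python sketch. The `Relation` class, the `tags` dictionary and the helper functions are hypothetical and use plain integer Item IDs; nothing here is the real Akonadi API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A typed link between exactly two Items (identified by ID)."""
    type: str
    left: int
    right: int


tags = {}  # tag name -> set of Item IDs carrying that Tag


def tag(item_id, name):
    tags.setdefault(name, set()).add(item_id)


def tags_of(item_id):
    return sorted(name for name, ids in tags.items() if item_id in ids)


def relations_of(item_id, relations):
    return [r for r in relations if item_id in (r.left, r.right)]


# Item 1 is an invitation email, Items 2 and 3 are calendar events.
tag(1, "Work")
tag(2, "Work")
tag(1, "Urgent")

relations = [
    Relation("INVITATION", 1, 2),
    Relation("INVITATION", 1, 3),  # same type, same Item, different partner
]
```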


Basic components

Server

The Server is the process that other components talk to via the Akonadi Protocol. It manages the cached Entities and persists them in a database. The database is considered an implementation detail of the Server: no one else knows about it or interacts with it.

Agents

Agents are single-purpose processes that get notified when an Entity is created, modified or removed on the Server. An example is the MailFilterAgent, which is notified whenever a new Item is created; if the Item holds an email, it applies the local mail filters to it and stores the change back in Akonadi.

Resources

Resources are special cases of Agents that synchronize data between the Akonadi Server and a remote server - for example, the IMAP resource synchronizes data between the Akonadi Server and a chosen IMAP server. To have multiple IMAP accounts, multiple instances of the IMAP resource are created. When talking about Resources and Agents, we can talk either about an Agent (or Resource) Type or an Agent (or Resource) Instance. The Type is the implementation of the Resource and an Instance is a running instance of the Type. Types are unique (e.g. there can only be a single Resource Type called IMAPResource), but there can be multiple Instances of a Type, i.e. multiple IMAPResource Instances running and providing connections to different IMAP servers or accounts.
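The Type versus Instance distinction can be sketched in Python as a small registry. Everything here is hypothetical: `ToyAgentRegistry`, its methods and the `<type>_<n>` instance naming are assumptions made for illustration, not the way Akonadi actually manages agents.

```python
class ToyAgentRegistry:
    """Toy registry: Types are unique, Instances are created per account."""

    def __init__(self):
        self._types = {}      # type name -> True if the Type is a Resource
        self._counters = {}   # type name -> next instance number
        self._instances = {}  # instance id -> (type name, settings)

    def register_type(self, name, is_resource=False):
        if name in self._types:
            raise ValueError(f"Type {name!r} is already registered")
        self._types[name] = is_resource
        self._counters[name] = 0

    def create_instance(self, type_name, **settings):
        if type_name not in self._types:
            raise KeyError(f"unknown Type {type_name!r}")
        number = self._counters[type_name]
        self._counters[type_name] += 1
        instance_id = f"{type_name}_{number}"  # hypothetical naming scheme
        self._instances[instance_id] = (type_name, settings)
        return instance_id

    def instances_of(self, type_name):
        return sorted(i for i, (t, _) in self._instances.items() if t == type_name)


registry = ToyAgentRegistry()
registry.register_type("IMAPResource", is_resource=True)
registry.register_type("MailFilterAgent")

# Two IMAP accounts mean two Instances of the one IMAPResource Type.
work = registry.create_instance("IMAPResource", server="imap.work.example")
home = registry.create_instance("IMAPResource", server="imap.home.example")
```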

Clients

Clients are user-facing applications like KMail or KOrganizer that present data from Akonadi to users and allow them to interact with the data.


DB Tables

This is a brief description of tables in the database that the Server stores all the data in and how they relate to the Entities and components described above.

SchemaVersion

A standalone table that holds information about the current version of the schema. Nothing to get excited about.

ResourceTable

Holds the list of active Agent and Resource Instances.

PimItemTable

Holds metadata about Items - ID, parent Collection, size etc. This is a very big table - one row for every email, contact, event etc.

PartTable

PartTable holds the actual payload parts and attributes for Items. This is the largest table in Akonadi, as it contains on average 3 rows for each row in PimItemTable.

PartTypeTable

Contains names of parts and attributes from PartTable (like PLD:ENVELOPE, PLD:HEAD, ATR:noselect, etc.) - this is a very small table (around 10 rows normally) and its purpose is purely to de-duplicate the often-repeated strings from the already-big PartTable.
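The de-duplication that PartTypeTable provides can be sketched with an in-memory SQLite database. Only the table names, the `data` column and the `PLD:ENVELOPE` and `PLD:HEAD` names come from this document; the remaining column names, the `PLD:BODY` part name and the helper functions are assumptions, and the real schema may look different.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE PartTypeTable (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL          -- e.g. 'PLD:HEAD'
);
CREATE TABLE PartTable (
    id         INTEGER PRIMARY KEY,
    pimItemId  INTEGER NOT NULL,       -- hypothetical column name
    partTypeId INTEGER NOT NULL REFERENCES PartTypeTable(id),
    data       BLOB
);
""")


def part_type_id(name):
    """Return the id for a part name, inserting it only if it is new."""
    db.execute("INSERT OR IGNORE INTO PartTypeTable (name) VALUES (?)", (name,))
    row = db.execute("SELECT id FROM PartTypeTable WHERE name = ?", (name,))
    return row.fetchone()[0]


def add_part(item_id, name, data):
    db.execute(
        "INSERT INTO PartTable (pimItemId, partTypeId, data) VALUES (?, ?, ?)",
        (item_id, part_type_id(name), data),
    )


# Three emails with three parts each; 'PLD:BODY' is a made-up part name.
for item_id in (1, 2, 3):
    for name in ("PLD:ENVELOPE", "PLD:HEAD", "PLD:BODY"):
        add_part(item_id, name, b"...")

part_rows = db.execute("SELECT COUNT(*) FROM PartTable").fetchone()[0]
type_rows = db.execute("SELECT COUNT(*) FROM PartTypeTable").fetchone()[0]
```

PartTable grows with every Item, while PartTypeTable stays at one row per distinct name; each PartTable row stores a small integer instead of repeating the string.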

MimeTypeTable

MimeTypeTable holds the list of mimetypes. This is a very small table and, like PartTypeTable, it is used simply to de-duplicate repeated strings from the PimItemTable and to allow a many-to-many relation between Collections and mimetypes.

FlagTable

FlagTable holds Item flags, like "seen", "spam", "hasattachment" etc. The table only holds simple strings and is fairly small (we have around 20 flags).

PimItemFlagRelation

A single Item can have 0-N flags and this table describes the relation. This is a fairly big table, as there is usually more than one flag for each PimItemTable row.

CollectionTable

The CollectionTable holds Collections - their ID, parent Collection, cache policy etc. This is normally a small-ish table - one row per mail folder, calendar, addressbook etc. Each Collection is owned by a Resource.

CollectionMimeTypeRelation

A single Collection can have multiple mimetypes (those are actually the mimetypes of Items that are permitted within this Collection), and this table describes the relation between CollectionTable and MimeTypeTable.

CollectionAttributeTable

This table holds additional attributes for Collections. One Collection can have multiple Attributes, but an attribute belongs to exactly one Collection.

CollectionPimItemRelation

This table describes the relation between Items and Virtual Collections. It does not describe the parent-child relationship; that is stored in PimItemTable.collectionId. The size of this table varies depending on how much you use the "Search" feature in KMail.
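A short SQLite sketch may help separate the two kinds of membership. `PimItemTable.collectionId` is named in this document; the other columns (`isVirtual`, `pimItemId`, `name`) and both query helpers are assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE CollectionTable (
    id        INTEGER PRIMARY KEY,
    name      TEXT,
    isVirtual INTEGER DEFAULT 0        -- hypothetical column
);
CREATE TABLE PimItemTable (
    id           INTEGER PRIMARY KEY,
    collectionId INTEGER REFERENCES CollectionTable(id)  -- the one parent
);
CREATE TABLE CollectionPimItemRelation (
    collectionId INTEGER REFERENCES CollectionTable(id),
    pimItemId    INTEGER REFERENCES PimItemTable(id),
    PRIMARY KEY (collectionId, pimItemId)
);
INSERT INTO CollectionTable VALUES (1, 'INBOX', 0), (2, 'Work', 0),
                                   (3, 'Search: KDE', 1);
INSERT INTO PimItemTable VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO CollectionPimItemRelation VALUES (3, 10), (3, 12);
""")


def owned_items(collection_id):
    """Items whose parent is the given Collection."""
    rows = db.execute(
        "SELECT id FROM PimItemTable WHERE collectionId = ? ORDER BY id",
        (collection_id,),
    )
    return [row[0] for row in rows]


def linked_items(collection_id):
    """Items linked into the given Virtual Collection."""
    rows = db.execute(
        "SELECT pimItemId FROM CollectionPimItemRelation "
        "WHERE collectionId = ? ORDER BY pimItemId",
        (collection_id,),
    )
    return [row[0] for row in rows]
```

Here the search folder owns nothing but links Items 10 and 12, each of which still has its regular parent in PimItemTable.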

TagTable

TagTable holds Tags. This is usually a small table with one row per Tag; people generally don't have more than a few dozen Tags (most people don't use this feature at all).

TagTypeTable

This table holds tag types - this is purely to de-duplicate common strings from TagTable.

TagAttributeTable

A table equivalent to CollectionAttributeTable, but for Tags.

TagRemoteIdResourceRelation

A single Tag can exist in various backends - for example, an IMAP account can have a tag called "KDE" that the user applies to all KDE-related emails. A calendar account can also have a "KDE" tag that the user applies to KDE-related events. We want to present these two tags to the user as a single Tag, so that they can see everything tagged "KDE" regardless of whether it's an email or an event. However, each backend identifies the Tag differently - the IMAP resource will identify the Tag as "$KDE" while the CalDAV resource will identify it by some random UUID like "{abcde-ef012-3456}". This table holds a RemoteID for each Tag as seen by each Resource that has the Tag.
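The mapping this table provides can be sketched with a plain Python dictionary keyed by (Tag, Resource). The tag name and the two RemoteIDs are the ones used in the example above; the resource identifiers `imap_0` and `caldav_0` and both helper functions are hypothetical.

```python
# One Tag as the user sees it.
tags = {1: "KDE"}

# (tag id, resource instance) -> RemoteID as that backend knows the Tag.
tag_remote_ids = {
    (1, "imap_0"): "$KDE",
    (1, "caldav_0"): "{abcde-ef012-3456}",
}


def remote_id_for(tag_id, resource):
    """RemoteID a Resource should use when talking to its backend."""
    return tag_remote_ids.get((tag_id, resource))


def tag_for_remote_id(resource, remote_id):
    """Map a backend-specific identifier back to the single shared Tag."""
    for (tag_id, res), rid in tag_remote_ids.items():
        if res == resource and rid == remote_id:
            return tags[tag_id]
    return None
```

Both backends resolve to the same Tag, while a Resource that has never seen the Tag simply has no entry.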

PimItemTagRelation

A single Item can have multiple Tags and this table describes the relation.

RelationTable

Holds Relations between two Items.

RelationTypeTable

An equivalent to TagTypeTable, but for Relations.


Some more concepts

ID

The Item ID, Collection ID and Tag ID are database primary keys, but they are exposed to clients to uniquely identify each Entity.

Remote ID

RemoteID (RID) is a string-based identifier that is used by the backend (IMAP server, CalDAV server etc.) to identify the Entity. On an IMAP server this can be an IMAP UID for an Item or a mailbox name for a Collection; for maildir it can be the filename of the email, etc. The RID is only exposed to Resources, since those are the only ones that actually understand what it means.

GID

GID is a string-based identifier extracted from the payload (Message-ID header in emails, UID in iCal events etc.) and is exposed to clients.
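The three identifiers and who gets to see them can be summarised in a small Python sketch. `ItemRecord` and `view_for()` are hypothetical names and the sample values are made up; the sketch only encodes the visibility rules stated above.

```python
from dataclasses import dataclass, asdict


@dataclass
class ItemRecord:
    id: int          # database primary key, exposed to clients
    remote_id: str   # backend identifier, meaningful only to Resources
    gid: str         # extracted from the payload, exposed to clients


def view_for(record, is_resource):
    """Return the identifiers visible to a Resource or to a regular client."""
    data = asdict(record)
    if not is_resource:
        del data["remote_id"]  # clients never see the RID
    return data


# Made-up example: an email with IMAP UID 1337 and a Message-ID header.
record = ItemRecord(id=42, remote_id="1337", gid="<msg-1@example.org>")
```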

Payload Type

As described above, the actual Item data (e.g. body of an email) is stored in PartTable as a BLOB in the data column. This is called the Internal Payload. To avoid storing massive BLOBs in the database, payloads larger than a certain threshold (4 kB by default) are stored as files on the filesystem and PartTable only holds the filename. Those are called External Payloads. There are also Foreign Payloads, but right now they are not actually used by anyone.
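The Internal versus External decision can be sketched as follows. This is a simplified Python model: the functions, the dictionary standing in for a PartTable row, the `part-<id>.bin` file naming and the reading of "4 kB" as 4096 bytes are all assumptions of the sketch, not the Server's actual behaviour.

```python
import tempfile
from pathlib import Path

THRESHOLD = 4096  # assumed interpretation of the 4 kB default


def store_part(part_id, payload, data_dir, threshold=THRESHOLD):
    """Return a toy PartTable row, writing large payloads to a file."""
    if len(payload) > threshold:
        filename = f"part-{part_id}.bin"  # hypothetical naming scheme
        (data_dir / filename).write_bytes(payload)
        # External Payload: the data column only refers to the file.
        return {"id": part_id, "data": filename.encode(), "external": True}
    # Internal Payload: the bytes live in the data column itself.
    return {"id": part_id, "data": payload, "external": False}


def load_part(row, data_dir):
    """Return the payload bytes regardless of where they are stored."""
    if row["external"]:
        return (data_dir / row["data"].decode()).read_bytes()
    return row["data"]


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    small = store_part(1, b"x" * 100, data_dir)      # stays in the "database"
    large = store_part(2, b"y" * 10_000, data_dir)   # goes to the filesystem
    restored = load_part(large, data_dir)
```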

Cache

Akonadi is a cache, not a storage. New Items are regularly downloaded from the backend services (IMAP server, CalDAV server, maildir, ...) by Resources and uploaded into Akonadi. Any changes made to Entities by clients (marking an email as read, creating a new event, deleting a contact etc.) are sent to the Resource that owns the Entity in question, and the Resource replays the change to the remote service. If the remote service is not available (say the user is offline but marks a bunch of emails as read or moves them to another folder), the changes are recorded by the Resource and replayed once the network is available.
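The record-and-replay behaviour described above can be sketched with a simple queue. `ToyResource`, its methods and the dictionary standing in for the remote service are hypothetical; a real Resource would also have to deal with conflicts and failures, which this sketch ignores.

```python
from collections import deque


class ToyResource:
    """Toy Resource that replays client changes to a remote service."""

    def __init__(self, backend):
        self.backend = backend  # dict standing in for the remote service
        self.online = True
        self.pending = deque()  # changes recorded but not yet replayed

    def item_changed(self, remote_id, change):
        # Always record first, then try to replay.
        self.pending.append((remote_id, change))
        self.replay()

    def replay(self):
        # Replay in the order the changes were made, only while online.
        while self.online and self.pending:
            remote_id, change = self.pending.popleft()
            self.backend.setdefault(remote_id, {}).update(change)

    def set_online(self, online):
        self.online = online
        self.replay()


server = {"1337": {"seen": False, "folder": "INBOX"}}
resource = ToyResource(server)

resource.set_online(False)
resource.item_changed("1337", {"seen": True})
resource.item_changed("1337", {"folder": "Archive"})
queued_while_offline = len(resource.pending)
state_while_offline = dict(server["1337"])

resource.set_online(True)  # the network is back, changes are replayed
```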


How the whole thing works together

Resources

Clients

Notifications