Dumping ground for use cases with sketched solutions (Properties, Actions). Nothing official; used for brainstorming for now.
Some players (e.g. YouTube) do not have a Stop action. Plasma-Browser-Integration does not have one either (because the browser API does not expose it?).
Some players have elaborate media picker dialogs that go beyond what can be implemented using the MPRIS properties SupportedUriSchemes and SupportedMimeTypes. It might be nice to be able to trigger the native media picker dialog as an alternative.
There are players for media without sound, and players that do not expose control over the sound volume (see https://lists.freedesktop.org/archives/mpris/2018q1/000070.html).
Some players provide a thumbnail for the whole track, and others even individual thumbnails for parts of the track.
The mimetype of the current track could be exposed in the Metadata property (xesam:mimeType).
Stream players can have the option to buffer the stream while instant playback is on hold. The buffering could be done on the data-provider/source side or on the player side. The length of the buffered track can depend on policies (fixed) or on available storage (variable). Players would allow seeking in the buffer and playing the stream with an offset.
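To make the buffering case a bit more tangible, here is a minimal sketch of a player-side buffer with a fixed-length policy. Everything in it (the class name TimeshiftBuffer, the chunk layout, seconds as the time unit) is made up for illustration and is not part of MPRIS or of any existing player.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TimeshiftBuffer:
    """Hypothetical player-side buffer of a live stream, capped at a fixed duration."""

    max_seconds: float  # policy: how much of the past is kept
    # Each entry is (start_time, duration, payload), oldest first.
    _chunks: deque = field(default_factory=deque, init=False)

    def push(self, start: float, duration: float, payload: bytes) -> None:
        """Append a newly received chunk and drop chunks that left the window."""
        self._chunks.append((start, duration, payload))
        oldest_allowed = start + duration - self.max_seconds
        while self._chunks[0][0] + self._chunks[0][1] <= oldest_allowed:
            self._chunks.popleft()

    @property
    def live_edge(self) -> float:
        """End time of the newest chunk; raises IndexError on an empty buffer."""
        start, duration, _ = self._chunks[-1]
        return start + duration

    @property
    def earliest(self) -> float:
        """Start time of the oldest chunk still buffered."""
        return self._chunks[0][0]

    def seek(self, offset_from_live: float) -> float:
        """Return the playback position for an offset behind live, clamped to the buffer."""
        target = self.live_edge - offset_from_live
        return min(max(target, self.earliest), self.live_edge)


buffer = TimeshiftBuffer(max_seconds=10.0)
for start in range(0, 20, 2):  # ten chunks of two seconds each
    buffer.push(float(start), 2.0, b"")
```

A variable policy (available storage) would only change the condition under which old chunks are dropped; seeking with an offset stays the same.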
Players for commercial media often embed advertisements, e.g. as embedded clips or as overlay information. Some allow skipping or hiding the advertisement, some only after a timeout.
Players support showing a pointer.
One can consider a "track" a media object which can have multiple parallel subtracks, typically of types like sound and image frames (not widespread, but possible, would be physical object control, like puppets moving, fountains shooting, pipes of a street organ blowing, light spots glowing, or, heh, odor spraying ;) ). The player then goes and "renders" the data from the tracks. The data themselves would either allow random access, because they come as a fixed object from some storage like the filesystem or a full database, or from some deterministic data generator. Or the data would not allow random access, because they are generated non-deterministically, e.g. from sensors on the physical world (like microphone, camera), without any buffering.
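The track model above could be sketched like this. It is only a thought experiment: the names Track, Subtrack and Access are invented here, and the only link to existing MPRIS is that can_seek() would loosely correspond to what the CanSeek property reports today.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    RANDOM = "random"          # fixed object from storage, or deterministic generator
    SEQUENTIAL = "sequential"  # non-deterministic source (sensor) without buffering


@dataclass
class Subtrack:
    kind: str  # free-form label, e.g. "sound", "image-frames", "light"
    access: Access


@dataclass
class Track:
    title: str
    subtracks: list[Subtrack] = field(default_factory=list)

    def can_seek(self) -> bool:
        """Seeking only makes sense if every parallel subtrack allows random access."""
        return bool(self.subtracks) and all(
            subtrack.access is Access.RANDOM for subtrack in self.subtracks
        )


movie = Track("Movie file", [
    Subtrack("sound", Access.RANDOM),
    Subtrack("image-frames", Access.RANDOM),
])
webcam = Track("Webcam", [
    Subtrack("sound", Access.SEQUENTIAL),
    Subtrack("image-frames", Access.SEQUENTIAL),
])
```

With a buffer in front of a sensor source, the subtracks could be treated as random access within the buffered range, which is where this overlaps with the stream buffering case.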
So with that abstract thinking, a simple static slide with some timeout is the same as a short video showing only one and the same image. And a simple static slide with no timeout is the same as a video livestream showing only one and the same image. Thus "Stop" and "Pause", with their different concepts, should at least by design be applied in the same way.
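As a rough sketch of that slide-as-track idea: a slide with a timeout gets an mpris:length (in microseconds, as in the existing Metadata property), while a slide without a timeout leaves the length out, like a livestream would. The function name slide_to_metadata and the object path prefix are hypothetical.

```python
from typing import Optional


def slide_to_metadata(index: int, title: str, timeout_seconds: Optional[float]) -> dict:
    """Build an MPRIS-style Metadata map for one slide (illustrative only)."""
    metadata = {
        # mpris:trackid has to be a D-Bus object path; this prefix is made up.
        "mpris:trackid": f"/org/example/slideshow/slide/{index}",
        "xesam:title": title,
    }
    if timeout_seconds is not None:
        # A timed slide behaves like a short video of known length.
        metadata["mpris:length"] = int(timeout_seconds * 1_000_000)
    # Without a timeout the length is omitted, as for a livestream.
    return metadata


timed = slide_to_metadata(3, "Agenda", 90.0)
untimed = slide_to_metadata(4, "Questions", None)
```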
Thinking further, slides in a presentation can also have a let's-call-it sub-slideshow, where a slide can reach several states by items appearing, changing or disappearing (and that is just thinking about linearly organized shows :) ). Once we get there and try to create model concepts for the needed new MPRIS interfaces, perhaps the currently proposed mapping has to be rethought indeed. But for what I drafted some time ago, for now mapping a single (main) slide, which is usually shown for some minutes, to a track, which usually is some 3-minute piece of pop music, should work out with what exists in MPRIS.
Multi-hierarchy track notation might also be interesting for tracks that are not 3-minute pop music. Think of movies separated into story chapters, or classical Western music (operas, symphonies) being composed of units of units. So the same structuring as known e.g. from books might be useful to have, to allow navigation using the same interaction patterns where sane.
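One possible way to picture such a multi-hierarchy notation, purely as a sketch with invented names (Unit, leaves, next_at_depth): the playable units are the leaves of a tree, and "next" can be asked for at any level, e.g. next scene versus next act.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Unit:
    title: str
    children: list[Unit] = field(default_factory=list)


def leaves(unit: Unit, path: tuple = ()) -> Iterator[tuple]:
    """Yield the title path of every leaf unit in playback order."""
    here = path + (unit.title,)
    if not unit.children:
        yield here
    for child in unit.children:
        yield from leaves(child, here)


def next_at_depth(root: Unit, current: tuple, depth: int) -> Optional[tuple]:
    """Return the first leaf of the next unit at the given depth, or None at the end.

    Assumes sibling titles are unique, since paths are compared by title.
    """
    order = list(leaves(root))
    position = order.index(current)
    for candidate in order[position + 1:]:
        if candidate[:depth + 1] != current[:depth + 1]:
            return candidate
    return None


opera = Unit("Opera", [
    Unit("Act 1", [Unit("Scene 1"), Unit("Scene 2")]),
    Unit("Act 2", [Unit("Scene 1")]),
])
start = ("Opera", "Act 1", "Scene 1")
```

Here next_at_depth(opera, start, 2) moves to the next scene and next_at_depth(opera, start, 1) jumps to the first scene of the next act, which is the book-like "next section" versus "next chapter" pattern.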