summaryrefslogtreecommitdiff
path: root/docs/design/draft-metadata.txt
diff options
context:
space:
mode:
authorStefan Kost <ensonic@users.sf.net>2010-02-17 09:55:52 +0200
committerStefan Kost <ensonic@users.sf.net>2010-02-17 09:55:52 +0200
commita12fc2646141eb9832631be485be4b95306beb7c (patch)
tree82f78b63ed2b29100c6d36e2ecd1e758a35ddfac /docs/design/draft-metadata.txt
parentb45abd271058a092d088c990ae193d009e8b4789 (diff)
design: write about the current state of tag-handling
Document the taghandling in gstreamer. List gaps and propose new mechanisms to deal with them.
Diffstat (limited to 'docs/design/draft-metadata.txt')
-rw-r--r--docs/design/draft-metadata.txt199
1 files changed, 199 insertions, 0 deletions
diff --git a/docs/design/draft-metadata.txt b/docs/design/draft-metadata.txt
new file mode 100644
index 0000000000..68d28142b3
--- /dev/null
+++ b/docs/design/draft-metadata.txt
@@ -0,0 +1,199 @@
+Metadata
+--------
+
+This draft recaps the current metadata handling in GStreamer and proposes some
+additions.
+
+
+Supported Metadata standards
+----------------------------
+
+The paragraphs below list supported native metadata standards sorted by type and
+then in alphabetical order. Some standards have been extended to support
+additional metadata. GStreamer already supports all of those to some extend.
+This is showns in the table below as either [--], [r-], [-w] or [rw] depending on
+read/write support (08.Feb.2010).
+
+Audio
+- mp3
+ ID3v2: [rw]
+ http://www.id3.org/Developer_Information
+ ID3v1: [rw]
+ http://www.id3.org/ID3v1
+ XMP: [--] (inside ID3v2 PRIV tag of owner XMP)
+ http://www.adobe.com/devnet/xmp/
+- ogg/vorbis
+ vorbiscomment: [rw]
+ http://www.xiph.org/vorbis/doc/v-comment.html
+ http://wiki.xiph.org/VorbisComment
+- wav
+ LIST/INFO chunk: [rw]
+ http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info
+ http://www.kk.iij4u.or.jp/~kondo/wave/mpidata.txt
+ XMP: [--]
+ http://www.adobe.com/devnet/xmp/
+
+Video
+- 3gp
+ {moov,trak}.udta: [rw]
+ http://www.3gpp.org/ftp/Specs/html-info/26244.htm
+ ID3V2: [--]
+ http://www.3gpp.org/ftp/Specs/html-info/26244.htm
+ http://www.mp4ra.org/specs.html#id3v2
+- avi
+ LIST/INFO chunk: [rw]
+ http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Info
+ http://www.kk.iij4u.or.jp/~kondo/wave/mpidata.txt
+ XMP: [--] (inside "_PMX" chunk)
+ http://www.adobe.com/devnet/xmp/
+- asf
+ ??:
+ XMP: [--]
+ http://www.adobe.com/devnet/xmp/
+- flv [--]
+ XMP: (inside onXMPData script data tag)
+ http://www.adobe.com/devnet/xmp/
+- mkv
+ tags: [rw]
+ http://www.matroska.org/technical/specs/tagging/index.html
+- mov
+ XMP: [--] (inside moov.udta.XMP_ box)
+ http://www.adobe.com/devnet/xmp/
+- mp4
+ {moov,trak}.udta: [rw]
+ http://standards.iso.org/ittf/PubliclyAvailableStandards/c051533_ISO_IEC_14496-12_2008.zip
+ moov.udta.meta.ilst: [rw]
+ http://atomicparsley.sourceforge.net/
+ http://atomicparsley.sourceforge.net/mpeg-4files.html
+ ID3v2: [--]
+ http://www.mp4ra.org/specs.html#id3v2
+ XMP: [--] (inside UUID box)
+ http://www.adobe.com/devnet/xmp/
+- mxf
+ ??
+
+Images
+- gif
+ XMP: [--]
+ http://www.adobe.com/devnet/xmp/
+- jpg
+ jif: [rw] (only comments)
+ EXIF: [rw] (via metadata plugin)
+ http://www.exif.org/specifications.html
+ IPTC: [rw] (via metadata plugin)
+ http://www.iptc.org/IPTC4XMP/
+ XMP: [rw] (via metadata plugin)
+ http://www.adobe.com/devnet/xmp/
+- png
+ XMP: [--]
+ http://www.adobe.com/devnet/xmp/
+
+further Links:
+http://age.hobba.nl/audio/tag_frame_reference.html
+http://wiki.creativecommons.org/Tracker_CC_Indexing
+
+
+Current Metadata handling
+-------------------------
+
+When reading files, demuxers or parsers extract the metadata. It will be sent as
+a GST_MESSAGE_TAG to the GstBus and GST_EVENT_TAG to downstream elements.
+
+Elements receiving GST_EVENT_TAG events can mangle them, mux them into the
+buffers they send or just pass them through. Usually is muxers that will format
+the tag data into the form required by the format they mux. Such elements would
+also implement the GstTagSetter interface to receive tags from the application.
+
+ GstMessage(tag) over GstBus to application
+ ^
+ |
+ +----------+
+ | demux |
+ sink src --> GstEvent(tag) over GstPad to downstream element
+ +----------+
+
+ method call over GstTagSetter interface from application
+ |
+ v
+ +----------+
+ | mux |
+GstEvent(tag) over GstPad from upstream element --> sink src
+ +----------+
+
+The data used in all those interfaces is GstTagList. It is based on a
+GstStructure which is like a hash table with differently typed entries. The key
+is always a string/GQuark. Many keys are predefined in GStreamer core. More keys
+are defined in gst-plugins-base/gst-libs/gst/tag/tag.h.
+If elements and applications use predefined types, it is possible to transcode a
+file from one format into another while preserving all known and mapped
+metadata.
+
+
+Issues
+------
+
+Unknown/Unmapped metadata
+- - - - - - - - - - - - -
+Right now GStreamer can lose metadata when transcoding, remuxing content. This
+can happend as we don't map all metadata fields to generic ones.
+
+We should probably also add the whole metadata blob to the GstTagList. We would
+need a GST_TAG_SYSTEM_xxx define (e.g. GST_TAG_SYSTEM_ID3V2) for each standard.
+The content is not printable and should be treated as binary if not known. The
+tag is not mergeable - call gst_tag_register() with GstTagMergeFunc=NULL. Also
+the tag data is only useful for upstream elements, not for the application.
+
+A muxer would first scan a taglist for known system tags. Unknown tags are
+ignored as already. It would first populate its own metadata store with the
+entries from the system tag and the update the entries with the data in normal
+tags.
+
+Below is an initial list of tag systems:
+ID3V1 - GST_TAG_SYSTEM_ID3V1
+ID3V2 - GST_TAG_SYSTEM_ID3V2
+RIFF_INFO - GST_TAG_SYSTEM_RIFF_INFO
+XMP - GST_TAG_SYSTEM_XMP
+
+We would basically need this for each container format.
+
+See also https://bugzilla.gnome.org/show_bug.cgi?id=345352
+
+Lost metadata
+- - - - - - -
+A case slighly different from the previous is that when an application sets a
+GstTagList on a pipeline. Right elements consuming tags do not report which tags
+have been consumed. Especially when using elements that make metadata
+persistent, we have no means of knowing which of the tags made it into the
+target stream and which were not serialized. Ideally the application would like
+to know which kind of metadata is accepted by a pipleine to reflect that in the
+UI.
+
+Although it is in practise so that elements implementing GstTagSetter are the
+ones that serialize, this does not have to be so. Otherwise we could add a
+means to that interface, where elements add the tags they have serialized. The
+application could build one list from all the tag messages and then query all
+the serialized tags from tag-setters. The delta tells what has not been
+serialized.
+
+A different approach would be to query the list of supported tags in advance.
+This could be a query (GST_QUERY_TAG_SUPPORT). The query result could be a list
+of elements and their tags. As a convenience we could flatten the list of tags
+for the top-level element (if the query was sent to a bin) and add that.
+
+Tags are per Element
+- - - - - - - - - -
+In many cases we want tags per stream. Even metadata standards like mp4/3gp
+metadata supports that. Right now GST_MESSAGE_SRC(tags) is the element. We tried
+changing that to the pad, but that broke applications.
+Also we miss the symmetric functionality in GstTagSetter. This interface is
+usually implemented by elements.
+
+Open bugs
+- - - - -
+
+https://bugzilla.gnome.org/buglist.cgi?query_format=advanced;short_desc=tag;bug_status=UNCONFIRMED;bug_status=NEW;bug_status=ASSIGNED;bug_status=REOPENED;bug_status=NEEDINFO;short_desc_type=allwordssubstr;product=GStreamer
+
+Add GST_TAG_MERGE_REMOVE
+https://bugzilla.gnome.org/show_bug.cgi?id=560302
+
+