Overview
--------

 This part gives an overview of the design of GStreamer with references to
 the more detailed explanations of the different topics.

 This document is intended for people who want to have a global overview of
 the inner workings of GStreamer.


Introduction
------------

 GStreamer is a set of libraries and plugins that can be used to implement
 various multimedia applications such as desktop players, audio/video
 recorders, multimedia servers and transcoders.

 Applications are built by constructing a pipeline composed of elements. An element
 is an object that performs some action on a multimedia stream such as:

  - read a file
  - encode or decode between formats
  - capture from a hardware device
  - render to a hardware device
  - mix or multiplex multiple streams

 Elements have input and output pads called sink and source pads in GStreamer. An
 application links elements together on pads to construct a pipeline. Below is
 an example of an ogg/vorbis playback pipeline.

   +-----------------------------------------------------------+
   | pipeline                                                  |
   | +---------+   +----------+   +-----------+   +----------+ |
   | | filesrc |   | oggdemux |   | vorbisdec |   | alsasink | |
   | |        src-sink       src-sink        src-sink        | |
   | +---------+   +----------+   +-----------+   +----------+ |
   +-----------------------------------------------------------+

 The filesrc element reads data from a file on disk. The oggdemux element parses
 the data and sends the compressed audio data to the vorbisdec element. The
 vorbisdec element decodes the compressed data and sends it to the alsasink
 element. The alsasink element sends the samples to the audio card for playback.

 Downstream and upstream are the terms used to describe the direction of the
 dataflow in the pipeline. From source to sink is called "downstream" and from
 sink to source is called "upstream".

 The task of the application is to construct a pipeline as above using existing
 elements. This is further explained in the pipeline building topic.

 The application does not have to manage any of the complexities of the
 actual dataflow/decoding/conversions/synchronisation etc. but only calls high
 level functions on the pipeline object such as PLAY/PAUSE/STOP.
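
 As an illustration, here is a minimal sketch of this high level control
 (assuming gst_init() was called and a pipeline object was obtained as
 described in the pipeline building topic):

   /* start playback */
   gst_element_set_state (pipeline, GST_STATE_PLAYING);

   /* pause playback */
   gst_element_set_state (pipeline, GST_STATE_PAUSED);

   /* stop playback and release all resources */
   gst_element_set_state (pipeline, GST_STATE_NULL);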

 The application also receives messages and notifications from the pipeline such
 as metadata, warning or error messages.
 
 If the application needs more control over the graph it is possible to directly
 access the elements and pads in the pipeline.


Design overview
---------------

 GStreamer design goals include:

  - Process large amounts of data quickly
  - Allow fully multithreaded processing
  - Ability to deal with multiple formats
  - Synchronize different dataflows 
  - Ability to deal with multiple devices

 The capabilities presented to the application depend on the number of elements
 installed on the system and their functionality.
 
 The GStreamer core is designed to be media agnostic but provides many features
 to elements to describe media formats.
 

Elements
--------

 The smallest building blocks in a pipeline are elements. An element provides a
 number of pads which can be source or sinkpads. Sourcepads provide data and
 sinkpads consume data. Below is an example of an ogg demuxer element that has
 one pad that takes (sinks) data and two source pads that produce data.

    +-----------+
    | oggdemux  |
    |          src0
   sink        src1
    +-----------+

 An element can be in four different states: NULL, READY, PAUSED, PLAYING. In the
 NULL and READY state, the element is not processing any data. In the PLAYING state
 it is processing data. The intermediate PAUSED state is used to preroll data in
 the pipeline. A state change can be performed with gst_element_set_state().

 An element always goes through all the intermediate state changes. This means that
 when an element is in the READY state and is put to PLAYING, it will first go
 through the intermediate PAUSED state.

 An element state change to PAUSED will activate the pads of the element. First the
 source pads are activated, then the sinkpads. When the pads are activated, the
 pad activate function is called. Some pads will start a thread or some other
 mechanism to start producing or consuming data.

 The PAUSED state is special as it is used to preroll data in the pipeline. The purpose
 is to fill all connected elements in the pipeline with data so that the subsequent
 PLAYING state change happens very quickly. Some elements will therefore not complete
 the state change to PAUSED before they have received enough data. Sink elements are
 required to only complete the state change to PAUSED after receiving the first data.

 Normally the state changes of elements are coordinated by the pipeline as explained
 in [part-states.txt].

 Different categories of elements exist:

  - source elements, these are elements that do not consume data but only provide data
      for the pipeline.
  - sink elements, these are elements that do not produce data but render data to
      an output device.
  - transform elements, these elements transform an input stream in a certain format 
     into a stream of another format. Encoder/decoder/converters are examples.
  - demuxer elements, these elements parse a stream and produce several output streams.
  - mixer/muxer elements, combine several input streams into one output stream.
 
 Other categories of elements can be constructed.


Bins
----

 A bin is an element subclass and acts as a container for other elements so that multiple 
 elements can be combined into one element.

 A bin coordinates its children's state changes as explained later. It also distributes
 events and various other functionality to elements.

 A bin can have its own source and sinkpads by ghostpadding one or more of its children's
 pads to itself.

 Below is a picture of a bin with two elements. The sinkpad of one element is ghostpadded
 to the bin.

    +---------------------------+    
    | bin                       |
    |    +--------+   +-------+ |
    |    |        |   |       | |
    |  /sink     src-sink     | |
   sink  +--------+   +-------+ |
    +---------------------------+


Pipeline
--------

 A pipeline is a special bin subclass that provides the following features to its 
 children:

   - Select and manage a clock
   - Provide means for elements to communicate with the application through the bus.
   - Manage the global state of the elements such as errors and end-of-stream.
 
 Normally the application creates one pipeline that will manage all the elements
 in the application.
 

Dataflow and buffers
--------------------

 GStreamer supports two possible types of dataflow, the push and pull model. In the
 push model, an upstream element sends data to a downstream element by calling a
 method on a sinkpad. In the pull model, a downstream element requests data from
 an upstream element by calling a method on a source pad.

 The most common dataflow is the push model. The pull model can be used in specific
 circumstances by demuxer elements. The pull model can also be used by low latency
 audio applications.

 The data passed between pads is encapsulated in Buffers. The buffer contains a
 pointer to the actual data and also metadata describing the data. This metadata
 includes:
   
    - timestamp of the data, this is the time instant at which the data was captured
        or the time at which the data should be played back.
    - offset of the data: a media specific offset, this could be samples for audio or
        frames for video.
    - the duration of the data in time.
    - the media type of the data described with caps, these are key/value pairs that
        describe the media type in a unique way.

 When an element wishes to send a buffer to another element, it does this using one
 of the pads that is linked to a pad of the other element. In the push model, a
 buffer is pushed to the peer pad with gst_pad_push(). In the pull model, a buffer
 is pulled from the peer with the gst_pad_pull_range() function.

 Before an element pushes out a buffer, it should make sure that the peer element
 can understand the buffer contents.  It does this by querying the peer element
 for the supported formats and by selecting a suitable common format. The selected
 format is then attached to the buffer with gst_buffer_set_caps() before pushing
 out the buffer.
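
 As a sketch of what this looks like for an element working in the push model
 (using the 0.10-style API referred to in this document; the buffer size and
 the srcpad, caps, timestamp and duration variables are only assumptions made
 for the example):

   GstBuffer *buf;
   GstFlowReturn ret;

   /* create an output buffer, the size here is only an example */
   buf = gst_buffer_new_and_alloc (4096);
   GST_BUFFER_TIMESTAMP (buf) = timestamp;
   GST_BUFFER_DURATION (buf) = duration;

   /* attach the negotiated fixed caps describing the media type */
   gst_buffer_set_caps (buf, caps);

   /* push the buffer out on the source pad to the peer sinkpad */
   ret = gst_pad_push (srcpad, buf);
   if (ret != GST_FLOW_OK) {
     /* the peer cannot accept more data, stop sending (see below) */
   }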

 When an element pad receives a buffer, it has to check if it understands the media
 type of the buffer before processing it. The GStreamer core does this
 automatically and will call the gst_pad_set_caps() function of the element before
 sending the buffer to the element.

 Both gst_pad_push() and gst_pad_pull_range() have a return value indicating whether
 the operation succeeded. An error code means that no more data should be sent
 to that pad. A source element that initiates the data flow in a thread typically
 pauses the producing thread when this happens.
 
 A buffer can be created with gst_buffer_new() or by requesting a usable buffer
 from the peer pad using gst_pad_alloc_buffer(). Using the second method, it is
 possible for the peer element to suggest that the element produce data in another
 format by attaching another media type caps to the buffer.

 The process of selecting a media type and attaching it to the buffers is called
 caps negotiation.


Caps
----
 
 A media type (Caps) is described using a generic list of key/value pairs. The key is
 a string and the value can be a single/list/range of int/float/string. 

 Caps that have no ranges/lists or other variable parts are said to be fixed and
 can be put on a buffer.

 Caps with variables in them are used to describe possible media types that can be
 handled by a pad.
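
 For example (the media type and field names here are only illustrations), a
 fixed caps and a caps describing a range of possible formats could be
 constructed as:

   /* fixed caps: every field has a single value, can be put on a buffer */
   GstCaps *fixed = gst_caps_new_simple ("audio/x-raw-int",
       "rate", G_TYPE_INT, 44100,
       "channels", G_TYPE_INT, 2,
       NULL);

   /* caps with ranges: describe what a pad could handle, not fixed */
   GstCaps *possible = gst_caps_new_simple ("audio/x-raw-int",
       "rate", GST_TYPE_INT_RANGE, 8000, 48000,
       "channels", GST_TYPE_INT_RANGE, 1, 2,
       NULL);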


Dataflow and events
-------------------

 Parallel to the dataflow there is a flow of events. Unlike the buffers, events can
 pass both upstream and downstream. Some events only travel upstream, others only
 downstream.

 The events are used to denote special conditions in the dataflow such as EOS or
 to inform plugins of special events such as flushing or seeking.


Pipeline construction
---------------------

 The application starts by creating a Pipeline element using gst_pipeline_new ().
 Elements are added to and removed from the pipeline with gst_bin_add() and
 gst_bin_remove().

 After adding the elements, the pads of an element can be retrieved with
 gst_element_get_pad(). Pads can then be linked together with gst_pad_link().

 Some elements create new pads when actual dataflow is happening in the pipeline.
 With g_signal_connect() one can receive a notification when an element has created
 a pad. These new pads can then be linked to other unlinked pads.
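
 A sketch of how this could look, using the "pad-added" signal (the demuxer
 element and the sinkpad to link to are just examples):

   static void
   on_pad_added (GstElement * element, GstPad * newpad, gpointer user_data)
   {
     GstPad *sinkpad = GST_PAD (user_data);

     /* link the newly created source pad to a known unlinked sinkpad */
     if (!gst_pad_is_linked (sinkpad))
       gst_pad_link (newpad, sinkpad);
   }

   ...

   /* ask to be notified whenever the demuxer creates a new pad */
   g_signal_connect (demux, "pad-added", G_CALLBACK (on_pad_added), sinkpad);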

 Some elements cannot be linked together because they operate on different 
 incompatible data types. The possible datatypes a pad can provide or consume can
 be retrieved with gst_pad_get_caps().

 Below is a simple mp3 playback pipeline that we constructed. We will use this
 pipeline in further examples.
 
   +-------------------------------------------+
   | pipeline                                  |
   | +---------+   +----------+   +----------+ |
   | | filesrc |   | mp3dec   |   | alsasink | |
   | |        src-sink       src-sink        | |
   | +---------+   +----------+   +----------+ |
   +-------------------------------------------+
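
 A sketch of how an application could construct this pipeline ("mp3dec" is the
 placeholder element name used in the diagram, a real application would use an
 installed mp3 decoder element; the file name is only an example and gst_init()
 is assumed to have been called):

   GstElement *pipeline, *filesrc, *mp3dec, *alsasink;
   GstPad *srcpad, *sinkpad;

   pipeline = gst_pipeline_new ("pipeline");

   filesrc  = gst_element_factory_make ("filesrc", "filesrc");
   mp3dec   = gst_element_factory_make ("mp3dec", "mp3dec");
   alsasink = gst_element_factory_make ("alsasink", "alsasink");

   g_object_set (filesrc, "location", "music.mp3", NULL);

   gst_bin_add_many (GST_BIN (pipeline), filesrc, mp3dec, alsasink, NULL);

   /* link filesrc.src to mp3dec.sink */
   srcpad = gst_element_get_pad (filesrc, "src");
   sinkpad = gst_element_get_pad (mp3dec, "sink");
   gst_pad_link (srcpad, sinkpad);
   gst_object_unref (srcpad);
   gst_object_unref (sinkpad);

   /* link mp3dec.src to alsasink.sink */
   srcpad = gst_element_get_pad (mp3dec, "src");
   sinkpad = gst_element_get_pad (alsasink, "sink");
   gst_pad_link (srcpad, sinkpad);
   gst_object_unref (srcpad);
   gst_object_unref (sinkpad);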


Pipeline clock
--------------

 One of the important functions of the pipeline is to select a global clock
 for all the elements in the pipeline.

 The purpose of the clock is to provide a strictly increasing value at the rate
 of one GST_SECOND per second. Clock values are expressed in nanoseconds.
 Elements use the clock time to synchronize the playback of data.

 Before the pipeline is set to PAUSED, the pipeline asks each element if it can
 provide a clock. The clock is selected in the following order:

  - If the application selected a clock, use that one.
  - If a source element provides a clock, use that clock.
  - Select a clock from any other element that provides a clock, start with the
    sinks.
  - If no element provides a clock, a default system clock is used for the pipeline.

 In a typical playback pipeline this will select the clock provided by a sink element
 such as an audio sink.

 In capture pipelines, this will typically select the clock of the data producer, which
 can in most cases not control the rate at which it delivers data.
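
 If the application wants to override the automatic selection, it can force a
 specific clock on the pipeline before starting it; a minimal sketch using the
 system clock:

   GstClock *clock;

   /* use the system clock instead of any element-provided clock */
   clock = gst_system_clock_obtain ();
   gst_pipeline_use_clock (GST_PIPELINE (pipeline), clock);
   gst_object_unref (clock);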


Pipeline states
---------------

 When all the pads are linked or signals have been connected, the pipeline can
 be put in the PAUSED state to start dataflow. 

 When a bin (and hence a pipeline) performs a state change, it will change the state
 of all its children. The pipeline will change the state of its children from the
 sink elements to the source elements; this is to make sure that no upstream element
 produces data for an element that is not yet ready to accept it.

 In the mp3 playback pipeline, the state of the elements is changed in the order
 alsasink, mp3dec, filesrc.

 All intermediate states are traversed for each element resulting in the following
 chain of state changes:

   alsasink to READY:  the audio device is opened
   mp3dec to READY:    the decoding library is initialized
   filesrc to READY:   the file is opened
   alsasink to PAUSED: alsasink is a sink and returns ASYNC because it did not receive
                       data yet.
   mp3dec to PAUSED:   nothing happens
   filesrc to PAUSED:  a thread is started to push data to mp3dec

 At this point data flows from filesrc to mp3dec and alsasink. Since mp3dec is PAUSED,
 it accepts the data from filesrc on the sinkpad and starts decoding the compressed
 data to raw audio samples.

 The mp3 decoder figures out the samplerate, the number of channels and other audio
 properties of the raw audio samples, puts the decoded samples into a Buffer,  
 attaches the media type caps to the buffer and pushes this buffer to the next
 element.

 Alsasink then receives the buffer, inspects the caps and reconfigures itself to process
 the buffer. Since it received the first buffer of samples, it completes the state change
 to the PAUSED state. At this point the pipeline is prerolled and all elements have
 samples.

 Since alsasink is now in the PAUSED state it blocks while receiving the first buffer. This
 effectively blocks both mp3dec and filesrc in their gst_pad_push().

 Since all elements now return SUCCESS from the gst_element_get_state() function,
 the pipeline can be put in the PLAYING state.
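
 A sketch of how an application could wait for the preroll to complete before
 going to PLAYING (using the 0.10-style gst_element_get_state() signature; the
 5 second timeout is only an example):

   GstStateChangeReturn ret;
   GstState state;

   gst_element_set_state (pipeline, GST_STATE_PAUSED);

   /* block until the ASYNC state change completes or the timeout expires */
   ret = gst_element_get_state (pipeline, &state, NULL, 5 * GST_SECOND);
   if (ret == GST_STATE_CHANGE_SUCCESS && state == GST_STATE_PAUSED) {
     /* all sinks are prerolled, going to PLAYING will be fast */
     gst_element_set_state (pipeline, GST_STATE_PLAYING);
   }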

 Before going to PLAYING, the pipeline samples the current time of the clock. This is
 the base time. It then distributes this time to all elements. Elements can then
 synchronize against the clock using the buffer timestamp+base time. 

 The following chain of state changes then takes place:

   alsasink to PLAYING:  the samples are played to the audio device
   mp3dec to PLAYING:    nothing happens
   filesrc to PLAYING:   nothing happens

 
Pipeline status
---------------

 The pipeline informs the application of any special events that occur in the
 pipeline with the bus. The bus is an object that the pipeline provides and that
 can be retrieved with gst_pipeline_get_bus().

 The bus can be polled or added to the GLib mainloop.

 The bus is distributed to all elements added to the pipeline. The elements use the bus
 to post messages on. Various message types exist such as ERRORS, WARNINGS, EOS, 
 STATE_CHANGED, etc.
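
 A sketch of a bus handler attached to the GLib mainloop (the actions taken in
 the handler are only examples):

   static gboolean
   bus_handler (GstBus * bus, GstMessage * message, gpointer user_data)
   {
     switch (GST_MESSAGE_TYPE (message)) {
       case GST_MESSAGE_ERROR:{
         GError *error = NULL;
         gchar *debug = NULL;

         gst_message_parse_error (message, &error, &debug);
         g_printerr ("error: %s\n", error->message);
         g_error_free (error);
         g_free (debug);
         break;
       }
       case GST_MESSAGE_EOS:
         g_print ("end of stream\n");
         break;
       default:
         break;
     }
     return TRUE;   /* keep watching the bus */
   }

   ...

   GstBus *bus = gst_pipeline_get_bus (GST_PIPELINE (pipeline));
   gst_bus_add_watch (bus, bus_handler, NULL);
   gst_object_unref (bus);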

 The pipeline handles EOS messages received from elements in a special way. It will
 only forward the message to the application when all sink elements have posted an
 EOS message.

 Other methods for obtaining the pipeline status include the Query functionality that
 can be performed with gst_element_query() on the pipeline. This type of query
 is useful for obtaining information about the current position and total time of
 the pipeline. It can also be used to query for the supported seeking formats and
 ranges.
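
 For example, querying the current position and total duration in time format
 (with the 0.10-style query functions, where the format is passed by reference):

   GstFormat format = GST_FORMAT_TIME;
   gint64 position, duration;

   if (gst_element_query_position (pipeline, &format, &position) &&
       gst_element_query_duration (pipeline, &format, &duration)) {
     g_print ("position %" GST_TIME_FORMAT " of %" GST_TIME_FORMAT "\n",
         GST_TIME_ARGS (position), GST_TIME_ARGS (duration));
   }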


Pipeline EOS
------------

 When the source element encounters the end of the stream, it sends an EOS event to
 the peer element. This event will then travel downstream to all of the connected
 elements to inform them of the EOS. The element is not supposed to accept any more
 data after receiving an EOS event on a sinkpad.

 The element providing the streaming thread stops sending data after sending the 
 EOS message.

 The EOS event will eventually arrive in the sink element. The sink will then post
 an EOS message on the bus to inform the pipeline that a particular stream has
 finished. When all sinks have reported EOS, the pipeline forwards the EOS message
 to the application. 

 When in EOS, the pipeline remains in the PLAYING state; it is the application's
 responsibility to PAUSE or READY the pipeline.


Pipeline READY
--------------

 When a running pipeline is set from the PLAYING to the READY state, the following
 actions occur in the pipeline:
  
   alsasink to PAUSED:  alsasink blocks and completes the state change on the
                        next sample. If the element was EOS, it does not wait for
                        a sample to complete the state change.
   mp3dec to PAUSED:    nothing
   filesrc to PAUSED:   nothing

 Going to the intermediate PAUSED state will block all elements in the _push()
 functions. This happens because the sink element blocks on the first buffer
 it receives.

 Some elements might be performing blocking operations in the PLAYING state that
 must be unblocked when they go into the PAUSED state. This makes sure that the
 state change happens very fast.

 In the next PAUSED to READY state change the pipeline has to shut down and all
 streaming threads must stop sending data. This happens in the following sequence:

   alsasink to READY:   alsasink unblocks from the _chain() function and returns a
                        WRONG_STATE return value to the peer element. The sinkpad is
                        deactivated and becomes unusable for sending more data.
   mp3dec to READY:     the pads are deactivated and the state change completes when
                        mp3dec leaves its _chain() function.
   filesrc to READY:    the pads are deactivated and the thread is paused.

 The upstream elements finish their chain() function because the downstream element
 returned an error code from the _push() functions. These error codes are eventually
 returned to the element that started the streaming thread (filesrc), which pauses
 the thread and completes the state change.
 
 This sequence of events ensures that all elements are unblocked and all streaming
 threads are stopped.

 
Pipeline seeking
----------------

 Seeking in the pipeline requires a very specific order of operations to make
 sure that the elements remain synchronized and that the seek is performed with
 a minimal amount of latency.

 An application issues a seek event on the pipeline using gst_element_send_event()
 on the pipeline element. The event can be a seek event in any of the formats
 supported by the elements.
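
 A sketch of such a seek, to one minute into the stream, in time format and
 with flushing enabled:

   GstEvent *event;

   event = gst_event_new_seek (1.0, GST_FORMAT_TIME,
       GST_SEEK_FLAG_FLUSH,
       GST_SEEK_TYPE_SET, 60 * GST_SECOND,
       GST_SEEK_TYPE_NONE, -1);

   gst_element_send_event (pipeline, event);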

 The pipeline first pauses itself to speed up the seek operation.
 
 The pipeline then issues the seek event to all sink elements. The sink then forwards 
 the seek event upstream until some element can perform the seek operation, which is
 typically the source or demuxer element. All intermediate elements can transform the
 requested seek offset to another format; this way a decoder element can, for example,
 transform a seek to a frame number into a seek to a timestamp.
 
 When the seek event reaches an element that will perform the seek operation, that 
 element performs the following steps.

  1) send a FLUSH_START event to all downstream and upstream peer elements.
  2) make sure the streaming thread is not running. The streaming thread will
     always stop because of step 1).
  3) perform the seek operation
  4) send a FLUSH_STOP event to all downstream and upstream peer elements.
  5) send NEWSEGMENT event to inform all elements of the new position and to complete
     the seek.

 In step 1) all downstream elements have to return from any blocking operations
 and have to refuse any further buffers or events different from a FLUSH_STOP.

 The first step ensures that the streaming thread eventually unblocks and that
 step 2) can be performed. At this point, dataflow is completely stopped in the
 pipeline. 

 In step 3) the element performs the seek to the requested position. 

 In step 4) all peer elements are allowed to accept data again and streaming
 can continue from the new position. A FLUSH_STOP event is sent to all the peer
 elements so that they accept new data again and restart their streaming threads.

 Step 5) informs all elements of the new position in the stream. After that the
 event function returns to the application and the streaming threads start
 to produce new data.

 Since the pipeline is still PAUSED, this will preroll the next media sample in the
 sinks. 

 The last step in the seek operation is then to adjust the media time of the pipeline
 to the new position and to set the pipeline back to PLAYING.

 The sequence of events in our mp3 playback example:

                                      | a) seek on pipeline
                                      | b) PAUSE pipeline
   +----------------------------------V--------+
   | pipeline                         | c) seek on sink
   | +---------+   +----------+   +---V------+ |
   | | filesrc |   | mp3dec   |   | alsasink | |
   | |        src-sink       src-sink        | |
   | +---------+   +----------+   +----|-----+ |
   +-----------------------------------|-------+
              <------------------------+
                    d) seek travels upstream

              --------------------------> 1) FLUSH_START event
              | 2) stop streaming
              | 3) perform seek
              --------------------------> 4) FLUSH_STOP event
              --------------------------> 5) NEWSEGMENT event
  
                                      | e) update stream time
                                      | f) PLAY pipeline