- . . . (maybe more)
EE is driven by some sort of protocol chosen by the development team, which has complete freedom to choose one protocol over another, restricted only by project requirements. In the case of a media server it is not that easy. Over time, different "parties" created different control protocols for media servers (media gateways).
For instance, with Cisco equipment the media server is controlled with MGCP, in an IMS environment it is controlled with H.248, on the WWW it can be controlled with RTSP, and so on.
In Mobicents Media Server 1.x.y we support only MGCP (aside from the local MSC API); however, we felt that this is not enough to meet demand from partners and the community (more on this here).
So for MMS 2.x we have more planned:
Here we go.
Media Server Operation Model
As mentioned above, media servers follow a certain model of operation. See image below:
The Media Server establishes only the RTP stream exchanged between UAs. To set up streaming, the application makes use of a dedicated control protocol. A dedicated controller on the Media Server side converts control protocol requests into the physical setup of underlying resources (generators, detectors, codecs, etc.). This happens regardless of the signaling protocol used (in the image it is SIP, but in reality it could be any protocol suitable for the application).
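One way to picture the conversion step is a thin adapter per control protocol sitting in front of a single shared resource layer. The sketch below only illustrates that idea: `ResourceLayer`, `MgcpLikeController`, `MsmlLikeController` and the dictionary request shapes are hypothetical names made up for this post, not Mobicents APIs or real wire formats.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ResourceLayer:
    """Stand-in for the physical resources (generators, detectors, ...)."""
    log: List[Tuple[str, ...]] = field(default_factory=list)

    def start_player(self, connection: str, uri: str) -> None:
        self.log.append(("player", connection, uri))

    def start_dtmf_detector(self, connection: str) -> None:
        self.log.append(("dtmf-detector", connection))


class MgcpLikeController:
    """Hypothetical adapter for a one-command-per-request protocol."""

    def __init__(self, resources: ResourceLayer) -> None:
        self.resources = resources

    def handle(self, request: dict) -> None:
        # Assumed shape: {"endpoint": str, "verb": str, "uri": str (optional)}
        if request["verb"] == "play":
            self.resources.start_player(request["endpoint"], request["uri"])
        elif request["verb"] == "detect-dtmf":
            self.resources.start_dtmf_detector(request["endpoint"])
        else:
            raise ValueError(f"unsupported verb: {request['verb']}")


class MsmlLikeController:
    """Hypothetical adapter for a protocol that batches several actions."""

    def __init__(self, resources: ResourceLayer) -> None:
        self.resources = resources

    def handle(self, request: dict) -> None:
        # Assumed shape: {"target": str, "actions": [(name, argument), ...]}
        for name, argument in request["actions"]:
            if name == "play":
                self.resources.start_player(request["target"], argument)
            elif name == "collect":
                self.resources.start_dtmf_detector(request["target"])
            else:
                raise ValueError(f"unsupported action: {name}")
```

Both adapters end up calling the same resource methods, which is the point: the signaling and control protocols can vary while the resource setup underneath stays the same.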
MSML: Media Server Markup Language
- XML based
- It is independent of transport protocol (SIP, HTTP, IP, ...)
- Follows simple request/response model
- Supports transactions
- Supports semi-recovery from errors
- Requires resource reservation
- RTP service oriented
MSML is also RTP service oriented. The protocol does not define endpoint types as MGCP does, for instance. It only defines the actions that need to be undertaken to produce a stream or to interact with a user stream.
As a protocol, MSML is designed to be extensible, just as Diameter is. Functional elements are grouped into packages. Each package defines a set of primitives which affect the Media Server. Primitives are tags and their parameters, which are interpreted by the Media Server controller.
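The package idea can be sketched as a registry that maps tag names to handlers, with the controller only able to interpret tags from packages it has loaded. This is a minimal illustration, not how any particular media server implements it: `Package`, `Interpreter`, the package name `dialog-base` and the handler below are all assumptions made for the example.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


class Package:
    """A named group of primitives (tag name -> handler)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.primitives: Dict[str, Handler] = {}

    def primitive(self, tag: str) -> Callable[[Handler], Handler]:
        """Decorator that registers a handler for the given tag."""
        def register(handler: Handler) -> Handler:
            self.primitives[tag] = handler
            return handler
        return register


class Interpreter:
    """Controller-side dispatcher: only knows tags from loaded packages."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def load(self, package: Package) -> None:
        # Loading a later package may add new tags or override existing ones.
        self._handlers.update(package.primitives)

    def execute(self, tag: str, params: dict) -> str:
        try:
            handler = self._handlers[tag]
        except KeyError:
            raise ValueError(f"no loaded package defines <{tag}>") from None
        return handler(params)


dialog_base = Package("dialog-base")  # illustrative package name


@dialog_base.primitive("play")
def play(params: dict) -> str:
    return f"playing {params['uri']}"
```

Extending the language then amounts to loading another `Package` that registers further tags, without touching the interpreter.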
MSML allows new packages to be defined, along with the ability to extend already defined ones. This is how the whole language structure is defined. Definitions from the RFC are grouped as follows:
MSML: Base abstraction
The base MSML abstraction defines functional elements which are arranged as in the image below:
- conn - connection is a logical unit which terminates all RTP sessions from one UA
- operator - logical unit representing DSP
- dialog - set of automated actions like: play, DTMF generator/detector, etc...
- conf - logical unit which mixes streams
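Applications refer to these objects by identifier. As I read the MSML RFC, an identifier is built from `class:instance` terms joined with `/`, for example `conf:room1/dialog:menu` for a dialog attached to a conference; treat that syntax as an assumption to verify against the RFC. The small parser below is a sketch under that assumption. It accepts only the `conn`, `conf` and `dialog` classes, and leaves operators out because I am not certain they are addressed the same way.

```python
from typing import List, Tuple

# Object classes taken from the list above; "operator" is deliberately omitted.
KNOWN_CLASSES = {"conn", "conf", "dialog"}


def parse_identifier(identifier: str) -> List[Tuple[str, str]]:
    """Split an identifier into (object_class, instance) terms.

    Assumes terms look like "class:instance" and are joined with "/".
    """
    terms = []
    for term in identifier.split("/"):
        object_class, separator, instance = term.partition(":")
        if not separator or not instance or object_class not in KNOWN_CLASSES:
            raise ValueError(f"malformed identifier term: {term!r}")
        terms.append((object_class, instance))
    return terms
```

For instance, `parse_identifier("conn:abc")` yields a single `("conn", "abc")` term, while a nested identifier yields one term per level.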
MSML: Protocol use example
Below is a use case of MSML encapsulated in the SIP protocol.
MSML used inside SIP usually requires a dialog to be established between the application server and the media server. MSML requests and answers are embedded in SIP INFO and INFO OK messages.
In the image above there is only one UA, which waits for video to be played. For a different number of UAs the scheme is similar; the only difference is the time at which the dialog between the app server and the media server is established.
In the example scenario the app server asks the Media Server to play an announcement such as "Press one to play movie XXXXX, Press two to play movie xXXYYx" and to detect DTMFs (this is an MSML Dialog). Once the UA sends a DTMF, the Media Server sends the DTMF embedded in a SIP INFO back to the application server, together with an indication that the MSML Dialog has ended. The application server decodes the INFO, picks the file to be streamed back to the UA and starts a new MSML Dialog: "Play Movie Dialog".
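To make the first step of this scenario more concrete, here is a rough sketch of how an application server might assemble the menu dialog body and then pick a movie from the reported digit. The element and attribute names (`msml`, `dialogstart`, `play`, `audio`, `collect`, `pattern`, `send`) follow my reading of the MSML RFC and should be checked against it before use. The function names, the prompt URI and the `MOVIES` table are purely hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the pressed digit to a media file.
MOVIES = {
    "1": "file:///movies/first.mp4",
    "2": "file:///movies/second.mp4",
}


def build_menu_dialog(connection_id: str, prompt_uri: str) -> str:
    """Return an MSML-style body: play a prompt, then collect one digit."""
    msml = ET.Element("msml", version="1.1")
    dialog = ET.SubElement(
        msml,
        "dialogstart",
        target=f"conn:{connection_id}",
        type="application/moml+xml",
        name="menu",
    )
    play = ET.SubElement(dialog, "play")
    ET.SubElement(play, "audio", uri=prompt_uri)
    collect = ET.SubElement(dialog, "collect")
    # "x" is assumed here to mean "any single digit".
    pattern = ET.SubElement(collect, "pattern", digits="x")
    # Ask the media server to report the collected digit to the requester.
    ET.SubElement(pattern, "send", target="source", event="done",
                  namelist="dtmf.digits")
    return ET.tostring(msml, encoding="unicode")


def choose_movie(digit: str) -> str:
    """Pick the file to stream for the digit reported by the media server."""
    try:
        return MOVIES[digit]
    except KeyError:
        raise ValueError(f"no movie assigned to digit {digit!r}") from None
```

The string returned by `build_menu_dialog` would travel as the body of the SIP INFO described above; once the digit comes back, `choose_movie` selects the file for the follow-up "Play Movie Dialog".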
Other protocols will follow in subsequent posts.