In Mobicents projects we have several different EEs:
- SLEE
- MSS
- . . . (maybe more)
In addition we also have the Media Server, which is responsible only for the media part of applications. In short, regardless of the application logic or the signaling protocol the application uses, the media server's role is always the same: to process media.
An EE is driven by some protocol chosen by the development team, which has complete freedom to choose one protocol over another, restricted only by project requirements. For a media server it is not that easy. Over time, different "parties" have created different control protocols for media servers (media gateways).
For instance, with Cisco equipment the media server is controlled with MGCP, in an IMS environment it is controlled with H.248, on the WWW it can be controlled with RTSP, and so on.
In Mobicents Media Server 1.x.y we support only MGCP (aside from the local MSC API); however, we felt that this is not enough to meet demand from partners and the community (more on this here).
So for MMS 2.x we have more planned:
In Brno we were supposed to go over this short list and highlight use cases, differences and basic concepts. However, we did not have enough time, so to compensate I decided to post it here.
Here we go.
Media Server Operation Model
As mentioned above, media servers follow a certain model of operation. See the image below:
The Media Server establishes only the RTP stream exchanged between UAs. To set up streaming, the application makes use of a dedicated control protocol. A dedicated controller on the Media Server side converts control protocol requests into the physical setup of underlying resources (generators, detectors, codecs, etc.). This happens regardless of the signaling protocol used (in the image it is SIP, but in reality it could be any protocol suitable for the application).
MSML: Media Server Markup Language
Basic information:
- XML Based
- It is independent of the transport protocol (SIP, HTTP, IP, ...)
- Follows simple request/response model
- Supports transactions
- Supports semi-recovery from errors
- Requires resource reservation
- RTP service oriented
MSML supports something similar to transactions on the Media Server side. It is not strictly defined as a transaction, but it has very similar semantics and behaviour. Each MSML request can be seen as a set of actions. Each action has an arbitrary ID, set by the application developer. If a particular action fails, the error response will contain this ID to make the application aware of the failure point; this is the semi-recovery from an error condition. MSML pinpoints the exact place of failure, allowing the application to free resources (yes, the responsibility is pushed to the application). This is required since the Media Server does not reclaim them (according to the MSML specs).
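The idea of action IDs and semi-recovery can be illustrated with a small toy model. This is only a sketch and not MSML syntax: the `Action` and `Response` classes, the function names and the action IDs are all hypothetical, and it assumes that actions run in order and that processing stops at the first failure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    action_id: str           # arbitrary ID chosen by the application developer
    run: Callable[[], bool]  # returns True on success, False on failure


@dataclass
class Response:
    ok: bool
    failed_id: Optional[str] = None  # ID of the action that failed, if any


def execute_request(actions: List[Action]) -> Response:
    """Toy media server side: run actions in order, stop at the first failure."""
    for action in actions:
        if not action.run():
            return Response(ok=False, failed_id=action.action_id)
    return Response(ok=True)


def actions_to_release(actions: List[Action], response: Response) -> List[str]:
    """Application side: IDs of the actions that completed before the failure."""
    if response.ok:
        return []
    ids = [a.action_id for a in actions]
    return ids[: ids.index(response.failed_id)]


# Hypothetical request made of four actions; the third one is forced to fail.
request = [
    Action("create-conf", lambda: True),
    Action("join-caller", lambda: True),
    Action("play-prompt", lambda: False),
    Action("collect-dtmf", lambda: True),
]
response = execute_request(request)
to_release = actions_to_release(request, response)
print(response.failed_id, to_release)
```

Because the server in this model does not undo the actions that already succeeded, the application uses the reported ID to work out what it has to release itself.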
MSML is also RTP service oriented. The protocol does not define endpoint types as MGCP does, for instance. It only defines the actions that need to be undertaken to produce a stream or to interact with a user stream.
As a protocol, MSML is designed to be extensible, just as Diameter is. Functional elements are grouped into packages. Each package defines a set of primitives which affect the Media Server. Primitives are tags and their parameters, which are interpreted by the Media Server controller.
MSML allows new packages to be defined, along with the ability to extend those already defined. This is how the whole language structure is defined. The definitions from the RFC are grouped as follows:
- Core
- Dialog-Core
- Conf-Core
- Audit-Core
Dialog-Core defines primitives to control automated actions of the Media Server; it also defines subpackages that extend those capabilities:
- Dialog-Base
- Dialog-Transform
- Dialog-Speech
- ...
MSML: Base abstraction
The base MSML abstraction defines functional elements, which are arranged as in the image below:
- conn - connection is a logical unit which terminates all RTP sessions from one UA
- operator - logical unit representing DSP
- dialog - set of automated actions like: play, DTMF generator/detector, etc...
- conf - logical unit which mixes streams
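To make the relationship between these elements more concrete, here is a minimal data-model sketch. The class names, the `join` and `start_dialog` helpers and the identifier strings (`conn:alice`, `conf:room1`, `conn:alice/dialog:menu`) are illustrative assumptions, not an API of Mobicents or of any MSML library; the operator element is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Conn:
    """Logical unit that terminates all RTP sessions from one UA."""
    conn_id: str
    dialogs: List[str] = field(default_factory=list)


@dataclass
class Conf:
    """Logical unit that mixes the streams of the connections joined to it."""
    conf_id: str
    members: List[Conn] = field(default_factory=list)

    def join(self, conn: Conn) -> None:
        # Joining the same connection twice has no effect in this sketch.
        if conn not in self.members:
            self.members.append(conn)


def start_dialog(conn: Conn, name: str) -> str:
    """Attach a named set of automated actions (play, DTMF detection, ...) to a connection."""
    conn.dialogs.append(name)
    return f"{conn.conn_id}/dialog:{name}"


# Two UAs mixed in one conference, with a menu dialog running on one of them.
alice = Conn("conn:alice")
bob = Conn("conn:bob")
room = Conf("conf:room1")
room.join(alice)
room.join(bob)
dialog_id = start_dialog(alice, "menu")
print(dialog_id, [c.conn_id for c in room.members])
```

The point of the sketch is only that a dialog is scoped to a connection, while a conference groups several connections whose streams are mixed.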
MSML: Protocol use example
Below is a use case of MSML encapsulated in the SIP protocol.
MSML used inside SIP usually requires a dialog to be established between the application server and the media server. MSML requests and answers are embedded in SIP INFO requests and the corresponding OK responses.
In the image above there is only one UA, which waits for a video to be played. For a different number of UAs the scheme is similar; the only difference is the time at which the dialog is established between the app server and the media server.
In the example scenario the app server asks the Media Server to play an announcement such as (this is an MSML Dialog): "Press one to play movie XXXXX, Press two to play movie xXXYYx" and to detect DTMFs. Once the UA sends a DTMF, the Media Server sends the DTMF, embedded in a SIP INFO, back to the application server, together with an indication that the MSML Dialog has ended. The application server decodes the INFO, picks the file to be streamed back to the UA and starts a new MSML Dialog: "Play Movie Dialog".
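The application server side of this scenario could look roughly like the sketch below. It is an illustration only: the XML element and attribute names are modelled on my reading of RFC 5707 and should be checked against the specification, while the `MOVIES` mapping, the file URIs, the connection ID `conn:ua1` and both helper functions are hypothetical. SIP handling itself is left out; the code only deals with the INFO body.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the pressed digit to the movie to stream.
MOVIES = {
    "1": "file:///movies/first.mp4",
    "2": "file:///movies/second.mp4",
}

# Hypothetical body of the SIP INFO sent by the media server when the
# menu dialog ends; names are assumptions modelled on RFC 5707.
INFO_BODY = """<msml version="1.1">
  <event name="msml.dialog.exit" id="conn:ua1/dialog:menu">
    <name>dtmf.digits</name>
    <value>2</value>
  </event>
</msml>"""


def extract_digits(info_body: str) -> str:
    """Return the digits reported in the event, or an empty string."""
    root = ET.fromstring(info_body)
    event = root.find("event")
    if event is None:
        raise ValueError("no <event> element in INFO body")
    names = [e.text for e in event.findall("name")]
    values = [e.text for e in event.findall("value")]
    reported = dict(zip(names, values))  # pair each <name> with its <value>
    return reported.get("dtmf.digits", "")


def build_play_dialog(target: str, uri: str) -> str:
    """Build the request that starts the "Play Movie Dialog" on a connection."""
    msml = ET.Element("msml", version="1.1")
    start = ET.SubElement(msml, "dialogstart", target=target, name="play-movie")
    play = ET.SubElement(start, "play")
    ET.SubElement(play, "video", uri=uri)
    return ET.tostring(msml, encoding="unicode")


digit = extract_digits(INFO_BODY)
next_request = build_play_dialog("conn:ua1", MOVIES[digit])
print(digit)
print(next_request)
```

In a real deployment the string returned by `build_play_dialog` would travel as the body of a new SIP INFO towards the media server, and an unknown digit would need its own handling (for example replaying the menu).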
Other protocols will follow in subsequent posts.