"To author a karaoke presentation, the author has to describe the content structure of an audio medium and a text medium, and then manually synchronize each text fragment with its audio segment, relying on a good description of both the text's content and the audio's content. Our work submitted to ACM MM2002 provides such an authoring environment. However, in that work the content of each medium is structured independently, so the synchronizations between them must afterwards be specified manually. Although the authoring environment provides powerful visual editing tools, it is still very hard work if authors have to manually synchronize, for instance, a three-hour video with a long textual document annotating it. A more semantic description of media segments or media content, together with interoperability between semantic description models, could give us automatic solutions for composing multimedia presentations. We are experimenting with this solution to assess its feasibility and effectiveness."
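
To make the idea of automatic synchronization concrete, the following is a minimal sketch and not the authors' implementation. It assumes that both the text fragments and the audio segments already carry a shared semantic label (for example "verse-1"); the names `AudioSegment`, `TextFragment`, `synchronize`, and the label scheme are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class AudioSegment:
    label: str    # hypothetical semantic label, e.g. "verse-1"
    begin: float  # start time in seconds
    end: float    # end time in seconds


@dataclass(frozen=True)
class TextFragment:
    label: str    # same hypothetical label vocabulary as the audio side
    text: str


def synchronize(
    fragments: List[TextFragment],
    segments: List[AudioSegment],
) -> List[Tuple[TextFragment, Optional[AudioSegment]]]:
    """Pair each text fragment with the audio segment sharing its label.

    Fragments without a matching segment are paired with None, so the
    author can still synchronize those by hand.
    """
    by_label: Dict[str, AudioSegment] = {}
    for seg in segments:
        # Keep the first segment seen for each label.
        by_label.setdefault(seg.label, seg)
    return [(frag, by_label.get(frag.label)) for frag in fragments]
```

In this sketch, any fragment that comes back paired with `None` still needs manual synchronization, so the visual editing tools remain useful for the cases the semantic descriptions do not cover.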