Problem
- Multimedia content on the Web needs to be described
- Semantic Web languages provide a simple triple model to annotate resources on the Web
- subject → predicate → object
- resources are identified by URI
- How can a part of a media resource be identified by a URI?
- The region of an image?
- The sequence of a video?
- A region moving over time?
- Let's look at W3C Recommendations, ISO standards and RFCs ...
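As a minimal illustration of the triple model, the sketch below represents one annotation as a (subject, predicate, object) tuple. All URIs and the `Triple` type are hypothetical placeholders; the point is only that the subject must be a URI, which is easy for a whole image and not obvious for a part of it.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of the property
    obj: str        # URI or literal value


# Hypothetical URIs, for illustration only.
annotation = Triple(
    subject="http://example.org/yalta.jpg",
    predicate="http://example.org/vocab#depicts",
    obj="http://example.org/events#YaltaConference",
)

# The whole image has a URI, so it can be used as a subject.
print(annotation.subject)
```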
Identify Spatial Fragments
Identify Spatial Fragments: SVG Approach
Scalable Vector Graphics (SVG 1.1), W3C Recommendation in 2003
<svg xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="layer1">
<image id="image_yalta" x="-0.34" y="0.20" width="400" height="167"
xlink:href="&yalta_URI;"/>
<rect id="SR1" x="14.64" y="15.73" width="146.98" height="147.48"
style="opacity:1;stroke:#ff0000;stroke-opacity:1"/>
</g>
</svg>
Identify Spatial Fragments: MPEG-7 Approach
MPEG-7, ISO Standard in 2001
<Image id="image_yalta"> <!-- whole image -->
<MediaLocator>
<MediaUri>&yalta_URI;</MediaUri>
</MediaLocator>
[...]
<SpatialDecomposition>
<StillRegion id="SR1"> <!-- still region -->
<SpatialMask>
<SubRegion>
<Box>14.64 15.73 161.62 163.21</Box>
</SubRegion>
</SpatialMask>
</StillRegion>
</SpatialDecomposition>
</Image>
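The MPEG-7 Box and the SVG rect appear to describe the same region in two notations. The sketch below assumes, from the numbers on the slides, that the Box lists two opposite corners (x1 y1 x2 y2), and converts them to the x, y, width, height form used by SVG. The function name is hypothetical.

```python
def box_to_rect(box_text: str) -> tuple:
    """Convert 'x1 y1 x2 y2' corner coordinates to (x, y, width, height)."""
    x1, y1, x2, y2 = (float(v) for v in box_text.split())
    # Round to 2 decimals to hide binary floating-point noise.
    return (x1, y1, round(x2 - x1, 2), round(y2 - y1, 2))


print(box_to_rect("14.64 15.73 161.62 163.21"))
# (14.64, 15.73, 146.98, 147.48)
```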
Identify Spatial Fragments: Summary
- Both SVG and MPEG-7 approaches require an indirection!
- MPEG-7 and SVG descriptions are XML documents identified by a URL
- The RDF annotation will be about a fragment of this XML document
- The XML code needs to be processed to dereference the region
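The indirection can be sketched as follows: given a fragment identifier such as `SR1`, a client has to parse the SVG description before it knows which pixels are meant. This is only a sketch using Python's standard `xml.etree.ElementTree`; the image URL is a hypothetical stand-in for the `&yalta_URI;` entity, and `dereference_region` is an illustrative name.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Trimmed copy of the SVG description; the image URL is a hypothetical
# stand-in for the &yalta_URI; entity.
SVG_DOC = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <g id="layer1">
    <image id="image_yalta" x="-0.34" y="0.20" width="400" height="167"
           xlink:href="http://example.org/yalta.jpg"/>
    <rect id="SR1" x="14.64" y="15.73" width="146.98" height="147.48"/>
  </g>
</svg>"""


def dereference_region(svg_text, fragment_id):
    """Return the (x, y, width, height) strings of the rect with this id."""
    root = ET.fromstring(svg_text)
    for rect in root.iter(SVG_NS + "rect"):
        if rect.get("id") == fragment_id:
            return tuple(rect.get(k) for k in ("x", "y", "width", "height"))
    return None  # no such region in the description


print(dereference_region(SVG_DOC, "SR1"))
# ('14.64', '15.73', '146.98', '147.48')
```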
Identify Temporal Fragments
Identify Temporal Fragments: SMIL Approach
Synchronized Multimedia Integration Language (SMIL 2.1), W3C Recommendation in 2005
<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
<head>
<layout>
<root-layout width="640" height="480"/>
<region id="video_window"/>
</layout>
</head>
<body>
<seq>
<video id="seq_1" src="&video_URI;" region="video_window" clipBegin="3" clipEnd="9"/>
[...]
</seq>
</body>
</smil>
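To recover the temporal bounds of `seq_1`, the SMIL document has to be parsed in the same way. The sketch below handles only plain numeric `clipBegin` and `clipEnd` values, read as seconds, as on the slide; the video URL is a hypothetical stand-in for `&video_URI;` and `clip_bounds` is an illustrative name.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "{http://www.w3.org/2001/SMIL20/Language}"

# Trimmed copy of the SMIL description with a hypothetical video URL.
SMIL_DOC = """<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <body>
    <seq>
      <video id="seq_1" src="http://example.org/video.ogv"
             clipBegin="3" clipEnd="9"/>
    </seq>
  </body>
</smil>"""


def clip_bounds(smil_text, clip_id):
    """Return (src, begin_seconds, end_seconds) for the video with this id."""
    root = ET.fromstring(smil_text)
    for video in root.iter(SMIL_NS + "video"):
        if video.get("id") == clip_id:
            return (video.get("src"),
                    float(video.get("clipBegin")),
                    float(video.get("clipEnd")))
    return None  # no such clip in the description


print(clip_bounds(SMIL_DOC, "seq_1"))
# ('http://example.org/video.ogv', 3.0, 9.0)
```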
Identify Temporal Fragments: MPEG-7 Approach
MPEG-7, ISO Standard in 2001
<VideoSegment id="video_G8"> <!-- whole video -->
<MediaLocator>
<MediaUri>&video_URI;</MediaUri>
</MediaLocator>
[...]
<TemporalDecomposition gap="true" overlap="false">
<VideoSegment id="seq_1"> <!-- sequence 1 -->
<MediaTime>
<MediaTimePoint>T00:00:03:0F30000</MediaTimePoint>
<MediaDuration>PT00H00M06S26116N30000F</MediaDuration>
</MediaTime>
</VideoSegment>
[...]
</TemporalDecomposition>
</VideoSegment>
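The MPEG-7 time strings are compact but need decoding. The sketch below assumes a reading inferred from the values on the slide: in the time point, `nFN` means n fractions of 1/N second; in the duration, `nN` followed by `NF` means the same. The patterns cover only the two forms shown here, not the full MPEG-7 time syntax, and the function names are hypothetical.

```python
import re
from fractions import Fraction

# Simplified patterns covering only the forms shown on the slide.
TIME_POINT = re.compile(r"T(\d+):(\d+):(\d+):(\d+)F(\d+)")
DURATION = re.compile(r"PT(\d+)H(\d+)M(\d+)S(\d+)N(\d+)F")


def _to_seconds(h, m, s, n, per_second):
    """Combine whole seconds with n fractions of 1/per_second second."""
    whole = int(h) * 3600 + int(m) * 60 + int(s)
    return Fraction(whole) + Fraction(int(n), int(per_second))


def parse_time_point(text):
    match = TIME_POINT.fullmatch(text)
    if match is None:
        raise ValueError("unsupported time point: " + text)
    return _to_seconds(*match.groups())


def parse_duration(text):
    match = DURATION.fullmatch(text)
    if match is None:
        raise ValueError("unsupported duration: " + text)
    return _to_seconds(*match.groups())


start = parse_time_point("T00:00:03:0F30000")
length = parse_duration("PT00H00M06S26116N30000F")
print(float(start), float(length))
```

Exact fractions are used instead of floats so that frame-level offsets are not lost to rounding.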
Identify Temporal Fragments: Direct URI Approaches
- Temporal URI, IETF Internet-Draft in 2005
- Syntax: &video_URI;#npt:0:00:03-0:00:09
- MPEG-21, Part 17: Fragment Identification of MPEG Resources, ISO Standard in 2006
- Syntax: &video_URI;#ffp(item_ID=_seq1-video)*mp(/~time('npt','0:00:03','0:00:09'))
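With a direct URI approach, the time range can be read from the URI itself, with no second document to fetch. The sketch below parses only the `npt:` fragment form shown above; the video URL is a hypothetical stand-in for `&video_URI;` and the function names are illustrative, not part of any specification.

```python
from urllib.parse import urldefrag


def npt_to_seconds(clock):
    """Convert 'h:mm:ss' (seconds may be fractional) to seconds."""
    hours, minutes, seconds = clock.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_temporal_fragment(uri):
    """Return (base_uri, start_seconds, end_seconds) for an npt fragment."""
    base, fragment = urldefrag(uri)
    if not fragment.startswith("npt:"):
        raise ValueError("not an npt fragment: " + fragment)
    start, end = fragment[len("npt:"):].split("-")
    return base, npt_to_seconds(start), npt_to_seconds(end)


print(parse_temporal_fragment(
    "http://example.org/video.ogv#npt:0:00:03-0:00:09"))
# ('http://example.org/video.ogv', 3.0, 9.0)
```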
Identify Temporal Fragments: Summary
- Both SMIL and MPEG-7 approaches require an indirection!
- SMIL and MPEG-7 descriptions are XML documents identified by a URL
- The RDF annotation will be about a fragment of this XML document
- The XML code needs to be processed to dereference the temporal sequence
- Both Temporal URI and MPEG-21 approaches extend the URI syntax: NO indirection!
- ... but MPEG-21 works only for MPEG-encoded video
- ... but Temporal URI works only for Ogg-encoded video
Conclusion
- A standard way to localize spatial and temporal sub-parts of any non-textual media content is urgently needed to make video a first-class citizen on the Web
- Simple localisation:
- Region of an image, temporal sequence of a video
- There is enough experience out there ... the Web community should produce a recommendation
- Complex localisation:
- Region moving over time, segments that are not spatially or temporally connected, etc.
- Most likely requires indirection: further investigation is needed in the SW activity into the possible consequences for the RDF model