<?xml version='1.0' encoding='utf-8' standalone="yes" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xml:lang="en-US" xmlns="http://www.w3.org/1999/xhtml" lang="en-US">
<head>
<title>Multimedia Annotation Interoperability Framework</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="Home" href="http://www.w3.org/2005/Incubator/mmsem/Overview.html" />
<style type="text/css">
.new { color: #FF0000 }
.example {font-family: monospace; }
.figure {
font-weight: bold;
text-align: center; }
div.example {
padding: 1em;
margin: 0.1em 3.5em 0.1em 0.1em;
background-color: #efeff5;
border: 1px solid #cfcfcf; }
div.exampleOuter {
margin: 0em;
padding: 0em; }
div.exampleInner {
color: black;
background-color: #efeff5;
border-top-style: double;
border-top-color: #d3d3d3;
border-bottom-width: 1px;
border-bottom-style: double;
border-bottom-color: #d3d3d3;
padding: 4px;
margin: 0em; }
div.exampleInner pre {
margin-left: 0em;
margin-top: 0em;
margin-bottom: 0em;
font-family: monospace; }
</style>
<link rel="stylesheet" type="text/css" href="http://www.w3.org/StyleSheets/TR/W3C-XGR" />
</head>
<body>
<div id="headings" class="head">
<p>
<a href="http://www.w3.org/"><img height="48" width="72" alt="W3C" src="http://www.w3.org/Icons/w3c_home" />
</a><a href="http://www.w3.org/2005/Incubator/XGR/"><img height="48" width="160" alt="W3C Incubator Report" src="http://www.w3.org/2005/Incubator/images/XGR" />
</a>
</p>
<h1>Multimedia Annotation Interoperability Framework</h1>
<h2><a id="w3c-doctype" name="w3c-doctype" />W3C Incubator Group Editor's Draft
14 August 2007</h2>
<dl>
<dt>This version:</dt>
<dd>
<a href="http://www.w3.org/2005/Incubator/mmsem/XGR-interoperability-20070814/">http://www.w3.org/2005/Incubator/mmsem/XGR-interoperability-20070814/</a></dd>
<dt>Latest version:</dt><dd><a href="http://www.w3.org/2005/Incubator/mmsem/XGR-interoperability/">http://www.w3.org/2005/Incubator/mmsem/XGR-interoperability/</a></dd>
<dt>Previous version:</dt>
<dd>
This is the first public version.
</dd>
<dt>Editor:</dt>
<dd>
<a href="http://www.image.ece.ntua.gr/~tzouvaras/">Vassilis Tzouvaras</a>,
IVML, National Technical University of Athens</dd>
<dd>
<a href="http://www.cwi.nl/~troncy/">Raphaël Troncy</a>, Center for
Mathematics and Computer Science (CWI Amsterdam)</dd>
<dd>
<a href="http://www.csd.abdn.ac.uk/~jpan/">Jeff Z. Pan</a>, University of
Aberdeen</dd>
<dt>  </dt>
<dd>
Also see <a href="#acknowledgments">Acknowledgements</a>.</dd>
</dl>
<p class="copyright">
<a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a>
© 2007 <a href="http://www.w3.org/">
<acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>,
<a href="http://www.ercim.org/">
<acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>,
<a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">
liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">
trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">
document use</a> rules apply.
</p>
</div>
<hr />
<h2>
<a id="abstract" name="abstract">Abstract </a>
</h2>
<p>
Multimedia systems typically contain digital documents of mixed media types,
which are indexed on the basis of strongly divergent metadata standards. This
severely hampers the inter-operation of such systems. Therefore, machine
understanding of metadata coming from different applications is a basic
requirement for the inter-operation of distributed multimedia systems. In this
document, we present how interoperability among metadata,
vocabularies/ontologies and services can be enhanced using Semantic Web
technologies. In addition, the document provides guidelines for semantic
interoperability, illustrated by use cases. Finally, it presents an overview of
the most commonly used metadata standards and tools, and outlines the general
research direction for semantic interoperability using Semantic Web
technologies.
</p>
<h2>
<a id="status" name="status">Status of This Document</a>
</h2>
<p>
<em>This section describes the status of this document at the time of its
publication. Other documents may supersede this document. A list of <a href="http://www.w3.org/2005/Incubator/XGR/">
Final Incubator Group Reports</a> is available. See also the <a href="http://www.w3.org/TR/">
W3C technical reports index</a> at http://www.w3.org/TR/. </em>
</p>
<p>
This document was developed by the W3C <a href="http://www.w3.org/2005/Incubator/mmsem/">
Multimedia Semantics Incubator Group</a>, part of the <a href="http://www.w3.org/2005/Incubator/">
W3C Incubator Activity</a>.
</p>
<p>
Publication of this document by W3C as part of the <a href="http://www.w3.org/2005/Incubator/">
W3C Incubator Activity</a> indicates no endorsement of its content by W3C, nor
that W3C has, is, or will be allocating any resources to the issues addressed
by it. Participation in Incubator Groups and publication of Incubator Group
Reports at the W3C site are benefits of <a href="http://www.w3.org/Consortium/join">
W3C Membership</a>.
</p>
<p>Incubator Groups have as a <a href="http://www.w3.org/2005/Incubator/procedures.html#Patent">
goal</a> to produce work that can be implemented on a Royalty Free basis, as
defined in the W3C Patent Policy. Participants in this Incubator Group have
made no statements about whether they will offer licenses according to the <a href="http://www.w3.org/Consortium/Patent-Policy-20030520.html#sec-Requirements">
licensing requirements of the W3C Patent Policy</a> for portions of this
Incubator Group Report that are subsequently incorporated in a W3C
Recommendation.
</p>
<h2>
<a id="scope" name="scope">Scope </a>
</h2>
<p>
This document is targeted at people with an interest in semantic interoperability,
ranging from non-professional end-users who manually annotate their personal
digital photos to professionals working with digital pictures in image and video
banks, audiovisual archives, museums, libraries, media production and the
broadcast industry, etc.
</p>
<p>
Discussion of this document is invited on the public mailing list <a href="mailto:public-xg-mmsem@w3.org">
public-xg-mmsem@w3.org</a> (<a href="http://lists.w3.org/Archives/Public/public-xg-mmsem/">public
archives</a>). Public comments should include "[MMSEM-Interoperability]" as a
subject prefix.
</p>
<hr />
<!-- ======================================================================== -->
<div class="toc">
<h2 class="notoc">
<a id="contents" name="contents">Table of Contents</a>
</h2>
<ul id="toc" class="toc">
<li class="tocline">
<a href="#introduction"><b>1. Introduction in Semantic Interoperability in
Multimedia Applications</b></a>
</li>
<li class="tocline">
<a href="#useCases"><b>2. Use Cases and Possible Solutions</b></a>
<ul>
<li class="tocline">
<a href="#photo">2.1 Use Case: Photo</a></li>
<li class="tocline">
<a href="#music">2.2 Use Case: Music</a></li>
<li class="tocline">
<a href="#news">2.3 Use Case: News</a></li>
<li class="tocline">
<a href="#tagging">2.4 Use Case: Tagging</a></li>
<li class="tocline">
<a href="#semanticRetrieval">2.5 Use Case: Semantic Media Analysis for
Intelligent Retrieval</a></li>
<li class="tocline">
<a href="#algortihm">2.6 Use Case: Algorithm Representation</a></li>
</ul>
</li>
<li class="tocline">
<a href="#openIssues"><b>3. Open Issues</b></a>
<ul>
<li class="tocline">
<a href="#authoring">3.1 Semantics From Multimedia Authoring </a>
</li>
<li class="tocline">
<a href="#multimedial">3.2 Building Multimedial Semantic Web Applications </a>
</li>
</ul>
</li>
<li class="tocline">
<a href="#framework"><b>4. Common Framework</b></a>
<ul>
<li class="tocline">
<a href="#syntactic">4.1 Syntactic Interoperability: RDF</a></li>
<li class="tocline">
<a href="#layers">4.2 Layers of Interoperability</a></li>
<li class="tocline">
<a href="#common">4.3 Common Ontology/Schema</a></li>
<li class="tocline">
<a href="#ontology">4.4 Ontology/Schema Integration Harmonisation and Extensions</a></li>
<li class="tocline">
<a href="#guidelines">4.5 Guidelines</a></li>
</ul>
</li>
<li class="tocline">
<a href="#conclusion"><b>5. Conclusion</b></a></li>
<li class="tocline">
<a href="#references"><b>6. References</b></a></li>
<li class="tocline">
<a href="#acknowledgments"><b>Acknowledgments</b></a></li>
</ul>
</div>
<!-- ======================================================================== -->
<h2>
<a name="introduction">1. Introduction in Semantic Interoperability in
Multimedia Applications</a>
</h2>
<p>
This document uses a bottom-up approach to provide a simple extensible framework to improve interoperability of
applications related to some key <a href="#useCases">use cases</a> discussed in the XG.
</p>
<div>
<center>
<img src="interoperability_use_case.png" alt="Use Cases Overview"/>
</center>
</div>
<!-- ======================================================================== -->
<h2>
<a name="useCases">2. Use Cases and Possible Solutions</a>
</h2>
<p>
In this section, we present several use cases showing interoperability problems
with multimedia metadata formats. Each use case starts with an example
illustrating the main problem, followed by a possible solution using
Semantic Web technologies.
</p>
<!-- ======================================================================== -->
<h3>
<a name="photo">2.1 Use Case: Photo</a>
</h3>
<h4 id="photo-introduction">Introduction</h4>
<p>
Currently, we are facing a market in which, for example, more than 20 billion digital photos
are taken per year in Europe [<a href="#GFK2006">GFK</a>].
The number of tools, either for desktop machines or web-based, that perform
automatic as well as manual annotation of the content is
increasing. For example, a large number of personal photo management tools extract
information from the so-called EXIF [<a href="#Exif">EXIF</a>] header and add
this information to the photo description. These tools typically allow users to tag and
describe single photos. There are also many web-based tools that allow users to upload
photos in order to share, organize and annotate them. Web sites such as [<a href="#Flickr">Flickr</a>]
enable tagging on a large scale. Sites like [<a href="#Riya">Riya</a>]
provide specific services such as face detection and face
recognition for personal photo collections. Photo community sites such as [<a href="#fotocommunity">Foto
Community</a>] allow photos to be organized in categories, rated and
commented on. Even though more and more tools exist to manage and share our
photos today, these tools come with different
capabilities. What remains difficult is finding, sharing and reusing photo
collections across the borders of tools and sites. Not only do the ways in which
photos are automatically and manually annotated differ, but the resulting
metadata is also described and represented according to many different
standards. The management of personal photo collections begins with a
semantic understanding of the photos.
</p>
<h4 id="photo-scenario">Motivating Example</h4>
<p>
From the perspective of an end user, let us consider the following scenario to
describe what is missing and needed for next-generation digital photo services.
Ellen Scott and her family were on a nice two-week vacation in Tuscany.
They enjoyed the sun at the beaches of the Mediterranean, appreciated the
great culture in Florence, Siena and Pisa, and travelled in the traces of the
Etruscans through the small villages of the Maremma. During their marvelous
trip, the family took pictures of the sightseeing spots, the landscapes
and, of course, of the family members. The digital camera they use is already
equipped with a GPS receiver, so every photo is stamped not only with the time
when, but also with the geo-location where, it was taken.
</p>
<h5 id="photo-selection">Photo annotation and selection</h5>
<p>
Back home, the family uploads about 1000 pictures from the camera to the
computer and wants to create an album for granddad. On this computer, the
family uses a photo management tool which extracts some basic
features such as the EXIF header and also allows tags and personal
descriptions to be entered. Still filled with the memories of the nice trip, the mother of
the family labels most of the photos. With a second tool, the track of the GPS
receiver and the photos are merged using the time stamps. As a result, each of
the photos is geo-referenced with the GPS position stored in the EXIF header.
However, showing all the photos would take an entire weekend. So Ellen starts
to create an excerpt of their trip with the highlights. Her
photo album software takes in the 1000 pictures and makes suggestions for the
selection and the arrangement of the pictures in a photo album. For example,
the album software shows her a map of Tuscany, visualises where she has
taken which photos, and groups them together, suggesting which photos
would best represent each part of the vacation. For places which the
software detects as highlights, the system offers to add information about the
place to the album, stating, for instance, that on the Piazza in front of the Palazzo Vecchio
there is a copy of Michelangelo's famous David statue. Depending on the
selected style, the software creates a layout and distributes all
images over the pages of the album, taking into account color, spatial and
temporal clusters as well as template preferences. So, in about 20 minutes Ellen has
finished the album and orders a paper version as well as an online version. The
paper album is delivered to her by mail three days later. It looks great, and
the explanatory texts that her software has almost automatically added to the
pictures are informative and help her remember the great vacation. They show
the album to grandpa and he can take his time to study their vacation and the
wonderful Tuscany.
</p>
<h5 id="photo-sharing">Exchanging and sharing photos</h5>
<p>
Selecting the most impressive photos, the son of the family uploads a nice set
of photos to <a href="#Flickr">Flickr</a>, to give his friends an impression of the great vacation.
Unfortunately, all the descriptions and annotations from the personal
photo management system are lost after the Web upload. Therefore, he adds a few
tags of his own to the Flickr photos to describe the places, events and persons of the
trip. Even the GPS track is lost, and he places the photos again on the Flickr
map application to geo-reference them. One friend finds a cool picture of the
Spanish Steps in Rome by night and would like to get the photo and its
location from Flickr. This is difficult again, as a pure download of the photo
does not retain the geo-location. When aunt Mary visits the Web album and
starts looking at the photos, she tries to download a few onto her laptop to
integrate them into her own photo management software. Now aunt Mary would like
to incorporate some of the pictures of her nieces and nephews into her photo
management system. And again, the system imports the photos, but the precious
metadata that mother and son of the family have already annotated twice is
gone.
</p>
<h4 id="photo-problem">The fundamental problem: semantic content understanding</h4>
<div style="float: right; width: 45%; border: 1px solid gray; padding: 1%; margin: 1%">
<img src="photo-indoor-outdoor.png" alt="Indoor/Outdoor detection with signal analysis and context analysis"/>
<br/>
Indoor/Outdoor detection with signal analysis and context analysis.<br/>
Image courtesy of <a href="http://mmit.informatik.uni-oldenburg.de/en/">Susanne Boll</a>, used with permission.
</div>
<p>
What is needed is a better and more effective automatic annotation of digital
photos that better reflects one's personal memory of the events captured by the
photos and allows different applications to create value-added services on top
of them, such as the creation of a personal photo album book. For understanding
personal photos and overcoming the semantic gap, consider that digital cameras leave us
with files like <tt>dsc5881.jpg</tt>, a very poor reflection of the actual event. It is
a 2D visual snapshot of a multi-sensory personal experience. The quality of the
photos is often very limited (snapshots, overexposed, blurred, ...). On the
other hand, digital photos come with a large potential for semantic
understanding of the photos. Photographs are always taken in context. In
contrast to analog photography, digital photos provide us with explicit
contextual information (time, flash, aperture, ...), and a "unique id" such as the
timestamp allows contextual information to be merged later with the pure image
content.
</p>
<p>
However, what we want to remember along with the photo is where it was taken, who was
there with us, what can be seen on the photo, what the weather was like, whether we liked
the event, and so on. In recent years, it has become clear that signal analysis
alone will not be the solution. In combination with the context of the photo,
such as the GPS position or the time stamp, some hard signal processing problems can be
solved better. So context analysis has gained much attention and has become
important for photos and very helpful for photo understanding.
In the adjacent figure, a simple example is given of how to
combine signal analysis and context analysis to achieve a better indoor/outdoor
detection of photos. Moreover, not least with the advent of Web 2.0, the actual
user has come into focus. Both the manual annotations of individual users and
collaborative effects are considered important for semantic photo
understanding.
</p>
<p>
The role of metadata for this usage of photo collections is manifold:
</p>
<ul>
<li>
Save the experience: The central goal is to overcome the semantic gap and
represent as much as possible of the human impression of the moment when the photo was
taken.</li>
<li>
Browse and find previously taken photos: Allow searching for events and
persons, places, moments in time, etc.</li>
<li>
Share photos with the metadata with others: Give your annotated photos from
Flickr or from Foto Community to your friends' applications.</li>
<li>
Use comprehensive metadata for value-added services of the photos: Create an
automatic photo collage or send a flash presentation to your aunt’s TV, notify
all friends that are interested in photos from certain locations, events, or
persons, etc.</li>
</ul>
<div style="float: right; width: 45%; border: 1px solid gray; padding: 1%; margin: 1%">
<img src="photo-usage.png" alt="Photos usage"/>
<br/>
Photos usage.<br/>
Image courtesy of <a href="http://mmit.informatik.uni-oldenburg.de/en/">Susanne Boll</a>, used with permission.
</div>
<p>
The adjacent figure illustrates the use of photos today and what we do with
our photos at home as well as on the Web.
</p>
<p>
So the social life of personal photos can be summarized as:
</p>
<ul>
<li>
Capturing: one or more persons capture an event, with one or several cameras
with different capabilities and characteristics</li>
<li>
Storing: one or more persons store the photos with different tools on different
systems</li>
<li>
Processing: post-editing with different tools that change the quality and maybe
the metadata</li>
<li>
Uploading: some persons make their photos available on Web (2.0) sites
(Flickr); different sites offer different kinds of value-added services to the
photos (Riya)</li>
<li>
Sharing: photos are given away or made accessible via email, Web sites,
print, ...</li>
<li>
Receiving: photos from others are received via MMS, email, download, ...
</li>
<li>
Combining: photos from one's own and other sources are selected and reused for
services like T-shirts, mugs, mouse pads, photo albums, collages, ...
</li>
</ul>
<p style="clear: both">
For this, metadata plays a central role at all times and places of the social
life of our photos.
</p>
<h4 id="photo-interoperability">The multimedia semantics interoperability problem</h4>
<h5 id="photo-metadata-level">Different levels and types of metadata for photos</h5>
<p>
The problem we have here is that metadata is created and
enhanced by different tools and systems and follows different standards and
representations. Even though there are many tools and standards that aim
to capture and maintain this metadata, they are not necessarily interoperable.
So on a technical level, we have the problem of a common representation of
metadata that is helpful and relevant for photo management, sharing and reuse.
The end user typically gets in touch with descriptive metadata that
stems from the context of the photo. At the same time, over more than a decade,
multimedia analysis research has achieved many results in extracting
different valuable features from multimedia content. For photos, for example,
this includes color histograms, edge detection, brightness, texture and so on.
With <a href="#MPEG-7">MPEG-7</a>, a very large standard has been developed that allows these
features to be described in a standardized way. However, both the size of the standard and its many
optional attributes have led to a situation in which MPEG-7 is
used only in very specific applications and has not become a widely
accepted standard for adding (some) metadata to a media item. Especially
in the area of personal media, in the same fashion as in the tagging scenario,
a small but comprehensive, shareable and exchangeable description scheme for
personal media is missing.
</p>
<h5 id="photo-metadata-standard">Different standards for photo metadata and annotations</h5>
<p>
What is needed is a machine-readable description that comes with each photo
and allows a site to offer valuable search and selection functionality on the
uploaded photos. Even though approaches for photo annotation have been proposed,
they still do not address the wide range of metadata and annotations that could
and should be stored with an image in a standardized fashion.
</p>
<ul>
<li>
EXIF [<a href="#Exif">EXIF</a>] is a standard that comprises much photographic
and capture-relevant metadata. Even though the end user might use only a few of
the key/value pairs, they are relevant at least for photo editing and archiving
tools, which read this kind of metadata and visualize it. So EXIF is a necessary
set of metadata for photos.
</li>
<li>
Tags from Flickr and other photo web sites and tools are metadata of low
structure but high relevance for the user and the use of the photos. Manually
added, they reflect the user's knowledge and understanding of the content, which
cannot be replaced by any automatic semantic extraction. Therefore, a
representation of these tags is needed. Depending on the source of the tags, it might be
of interest to relate the tags to their origin, such as "taken from an
existing vocabulary", "from a suggested set of other tags" or
just "free tags". XMP seems to be a very promising standard, as it
allows RDF-based metadata to be defined for photos. However, the description of
the standard clearly states that it leaves the application-dependent schema
/vocabulary definition to the application and only makes suggestions for a set
of "generic" schemas such as EXIF and Dublin Core. So the standard could be
a good "host" for a defined photo metadata description scheme in RDF
but does not define it. A sketch of such a combined RDF description is given
after this list.</li>
<li>
PhotoRDF [<a href="#PhotoRDF">PhotoRDF</a>] "describes a project for
describing & retrieving (digitized) photos with (RDF) metadata. It
describes the RDF schemas, a data-entry program for quickly entering metadata
for large numbers of photos, a way to serve the photos and the metadata over
HTTP, and some suggestions for search methods to retrieve photos based on their
descriptions." So far so good, but the standard is separated into three
different schemas: Dublin Core [<a href="#DublinCore">Dublin Core</a>], a Technical Schema which
comprises more or less entries about author, camera and a short description, and
a Content Schema which provides a set of 10 keywords. With PhotoRDF, the type
and number of attributes is limited; it does not even comprise the full EXIF
schema and is also limited with regard to the content description of a photo.</li>
<li>
The Extensible Metadata Platform or XMP [<a href="#XMP">XMP</a>] and the
IPTC IIM standard [<a href="#IIM">IIM</a>] have been introduced to define how
metadata (not only) of a photo can be stored with the media element itself.
However, these standards come with their own sets of attributes to describe the
photo, or allow individual metadata templates to be defined. This is the killer for
any sharing and semantic Web search! What is still missing is a standardized
vocabulary defining which information about a photo is important and relevant
to a large set of next-generation digital photo services.</li>
<li>
The Image Annotation on the Semantic Web document [<a href="#MMSEM-Image">MMSEM Image</a>]
provides an overview of existing standards, such as those mentioned
above. At the same time, it shows how diverse the world of annotation is. The
use case for photo annotation chooses the RDF/XML syntax of RDF in order to gain
interoperability. It refers to a large set of different standards and
approaches that can be used for image annotation, but there is no unified view on
image annotation and the metadata relevant for photos. The attempt here is to
integrate existing standards. If those, however, are too many, too
comprehensive, and perhaps even have overlapping attributes, the result might
not be adopted as the common photo annotation scheme on the Web. For the
low-level features, for example, there is only a link to MPEG-7.</li>
<li>
The DIG35 Initiative Group of the International Imaging Industry Association
aims to "provide a standardized mechanism which allows end-users to see
digital image use as being equally as easy, as convenient and as flexible as
the traditional photographic methods while enabling additional benefits that
are possible only with a digital format." [<a href="#DIG35">DIG35</a>].
The DIG35 standard aims to define a standard set of metadata for digital
images that can be widely implemented across multiple image file formats. Of
all the photo standards this is the broadest one with respect to typical photo
metadata, and it is already defined as an XML Schema.</li>
<li>
MPEG-7 is far too big, even though the standard comprises metadata elements that
are relevant also for a Web-wide usage of media content. The advantage of
MPEG-7 is that one can define one's own description scheme and with it collect a
subset of relevant feature-related metadata with a photo. However, there is no
practical way to include an entire XML-based MPEG-7 description of a photo
in the raw content. For the description of the content, the use case refers to
three domain-specific ontologies: personal history event, location and
landscape.</li>
</ul>
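<p>
As a sketch of what such a combined, machine-readable photo description could
look like, the following RDF/XML fragment attaches Dublin Core fields, EXIF
capture data and free tags to a single photo. The fragment is illustrative
only: the <tt>photo:</tt> namespace is invented for this example, and the
exact choice of vocabularies is precisely what a common scheme would have to
standardize.
</p>
<div class="exampleInner">
<pre><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:exif="http://www.w3.org/2003/12/exif/ns#"
         xmlns:photo="http://example.org/photo#">
  <rdf:Description rdf:about="http://example.org/photos/dsc5881.jpg">
    <!-- descriptive metadata (Dublin Core) -->
    <dc:creator>Ellen Scott</dc:creator>
    <dc:date>2007-07-21</dc:date>
    <!-- capture metadata taken from the EXIF header -->
    <exif:exposureTime>1/250</exif:exposureTime>
    <exif:gpsLatitude>43.769</exif:gpsLatitude>
    <exif:gpsLongitude>11.256</exif:gpsLongitude>
    <!-- free tags entered by the user -->
    <photo:tag>Florence</photo:tag>
    <photo:tag>vacation</photo:tag>
  </rdf:Description>
</rdf:RDF>
</pre>
</div>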
<h4 id="photo-solution">Towards a solution</h4>
<div style="float: right; width: 45%; border: 1px solid gray; padding: 1%; margin: 1%">
<img src="photo-interoperability.png" alt="Toward a solution for photo metadata interoperability"/>
<br/>
Toward a solution for photo metadata interoperability.<br/>
Image courtesy of <a href="http://mmit.informatik.uni-oldenburg.de/en/">Susanne Boll</a>, used with permission.
</div>
<p>
The conclusion is clear: there is not one standardized representation and
vocabulary for adding metadata to photos. Even though the different Semantic
Web applications and developments should be embraced, a photo annotation
standard that is a patchwork of too many different specifications is not helpful.
The adjacent figure illustrates some of the different activities, as
described above in the scenario, that people carry out with their photos and the
different standalone or web-based tools they use for this.
</p>
<p>
What is missing, however, for content management, search, retrieval, sharing
and innovative semantic (Web 2.0) applications is a limited and simple, but at
the same time comprehensive, vocabulary in a machine-readable, exchangeable,
yet not overly complicated representation. The single standards
described above only solve part of the problem. For example, a standardization of
tags is very helpful for a semantic search on photos on the Web. However, today
the low(er)-level features are then lost. Even though the semantic search is
fine on a search level, for later use and exploitation of a set of photos,
previously extracted and annotated lower-level features might be interesting as
well. Maybe a Web site would like to offer a grouping of photos along the color
distribution. Then either the site needs to do the extraction of a color
histogram itself, or the photo already brings this information in its
standardized header information. A face detection software might have found the
bounding boxes on the photo where a face has been detected and also provide a
face count. Then the Web site might allow searching for photos with two or more
persons in them. And so on. Even though low-level features do not seem relevant
at first sight, for detailed search, visualization and also later processing,
the previously extracted metadata should be stored and available with the
photo.
</p>
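<p>
To make the face detection example concrete: a photo could carry its analysis
results as RDF alongside the descriptive metadata. The sketch below uses a
hypothetical <tt>analysis:</tt> vocabulary (invented for illustration, not an
existing standard); a site could then select photos with two or more detected
faces by a simple query on <tt>analysis:faceCount</tt>.
</p>
<div class="exampleInner">
<pre><rdf:Description rdf:about="http://example.org/photos/dsc5881.jpg">
  <!-- results of a face detection run (bounding boxes as x,y,width,height) -->
  <analysis:faceCount>2</analysis:faceCount>
  <analysis:faceRegion>120,80,64,64</analysis:faceRegion>
  <analysis:faceRegion>310,95,60,60</analysis:faceRegion>
  <!-- pointer to a previously extracted MPEG-7 color histogram -->
  <analysis:colorHistogram rdf:resource="http://example.org/photos/dsc5881-mpeg7.xml"/>
</rdf:Description>
</pre>
</div>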
<!-- TO ADD:
* OWL/RDFS Schema for DIG35 and MPEG-7
-->
<!-- ======================================================================== -->
<h3>
<a name="music">2.2 Use Case: Music</a>
</h3>
<h4 id="music-introduction">Introduction</h4>
<p>
In recent years, typical music consumption behaviour has changed
dramatically. Personal music collections have grown, favoured by technological
improvements in networks, storage, portability of devices and Internet
services. The amount and availability of songs has de-emphasized their value: it
is often the case that users own many digital music files that they have
listened to only once, or even never. It seems reasonable to think that by providing
listeners with efficient ways to create a personalized order in their
collections, and by providing ways to explore hidden "treasures" inside them,
the value of their collections will drastically increase.
</p>
<p>Also, although the digital revolution has had many advantages, we can
point out some negative effects. Users own huge music collections that need
proper storage and labelling. Searching inside digital collections calls for new
methods of accessing and retrieving data. But sometimes there is no metadata
(or only the file name) informing about the content of the audio, and that
is not enough for effective utilization and navigation of the music
collection.
</p>
<p>Thus, users can get lost searching the digital pile of their music
collections. Moreover, nowadays the Web is increasingly becoming the primary source
of music titles in digital form. With millions of tracks available from
thousands of websites, finding the right songs and being informed of new
music releases is becoming a problematic task. Thus, web page filtering has
become necessary for most web users.
</p>
<p>Besides, on the digital music distribution front, there is a need to find ways
of improving music retrieval effectiveness. Artist, title and genre keywords
might not be the only criteria to help music consumers find music they like.
This is currently mainly achieved using cultural or editorial metadata ("artist
A is somehow related to artist B") or exploiting existing purchasing
behaviour data ("since you bought this artist, you might also want to buy this
one"). A largely unexplored (and potentially interesting) complement is using
semantic descriptors automatically extracted from the music audio files. These
descriptors can be applied, for example, to recommend new music or to generate
personalized playlists.
</p>
<h4 id="music-description">A complete description of a popular song</h4>
<p>In <a href="#Pachet">[Pachet]</a>, Pachet classifies the music
knowledge management. This classification allows to create meaningful
descriptions of music, and to exploit these descriptions to build music related
systems. The three categories that Pachet defines are: editorial (EM), cultural
(CM) and acoustic metadata (AM).
</p>
<p>Editorial metadata includes simple creation and production information (e.g.
the song "C'mon Billy", written by P.J. Harvey in 1995, was produced by John
Parish and Flood, and appears as track number 4 on the album "To
Bring You My Love"). EM includes, in addition, artist biography, album reviews,
genre information, relationships among artists, etc. As can be seen,
editorial information is not necessarily objective. It is often the case that
different experts cannot agree on assigning a concrete genre to a song or to an
artist. Even more difficult is a common consensus on a taxonomy of musical
genres.
</p>
<p>Cultural metadata is defined as the information that is implicitly present in
huge amounts of data. This data is gathered from weblogs, forums, music radio
programs, or even from web search engines' results. This information has a
clear subjective component as it is based on personal opinions.
</p>
<p>The last category of music information is acoustic metadata. In this context,
acoustic metadata describes the content analysis of an audio file. It is
intended to be objective information. Most of the current music content
processing systems operating on complex audio signals are mainly based on
computing low-level signal features. These features are good at characterising
the acoustic properties of the signal, returning a description that can be
associated with texture or, at best, with the rhythmical attributes of the signal.
Alternatively, a more general approach proposes that music content can be
successfully characterized according to several "musical facets" (i.e. rhythm,
harmony, melody, timbre, structure) by incorporating higher-level semantic
descriptors to a given feature set. Semantic descriptors are predicates that
can be computed directly from the audio signal, by means of the combination of
signal processing, machine learning techniques, and musical knowledge.
</p>
<p>Semantic Web languages make it possible to describe all this metadata, as well as
to integrate it from different music repositories.
</p>
<p>The following example shows an RDF description of an artist, and a song by
the artist:</p>
<div class="exampleInner" style="clear: both">
<pre><rdf:Description rdf:about="http://www.garageband.com/artist/randycoleman">
  <rdf:type rdf:resource="&music;Artist"/>
  <music:name>Randy Coleman</music:name>
  <music:decade>1990</music:decade>
  <music:decade>2000</music:decade>
  <music:genre>Pop</music:genre>
  <music:city>Los Angeles</music:city>
  <music:nationality>US</music:nationality>
  <geo:Point>
    <geo:lat>34.052</geo:lat>
    <geo:long>-118.243</geo:long>
  </geo:Point>
  <music:influencedBy rdf:resource="http://www.coldplay.com"/>
  <music:influencedBy rdf:resource="http://www.jeffbuckley.com"/>
  <music:influencedBy rdf:resource="http://www.radiohead.com"/>
</rdf:Description>

<rdf:Description rdf:about="http://www.garageband.com/song?|pe1|S8LTM0LdsaSkaFeyYG0">
  <rdf:type rdf:resource="&music;Track"/>
  <music:title>Last Salutation</music:title>
  <music:playedBy rdf:resource="http://www.garageband.com/artist/randycoleman"/>
  <music:duration>T00:04:27</music:duration>
  <music:key>D</music:key>
  <music:keyMode>Major</music:keyMode>
  <music:tonalness>0.84</music:tonalness>
  <music:tempo>72</music:tempo>
</rdf:Description>
</pre>
</div>
<h5 id="music-lyrics">Lyrics as metadata</h5>
<p>For a complete description of a song, lyrics must be considered as well.
While lyrics could in a sense be regarded as "acoustic metadata", they are per
se actual information entities which have annotation needs of their own. Lyrics
share many similarities with metadata, e.g. they usually refer directly to a well-specified
song, but exceptions exist, as different artists might sing the same
lyrics, sometimes even with different musical bases and styles. Most notably,
lyrics often have different authors than the music and the voice that interprets
them, and might be composed at a different time. Lyrics are not a simple text;
they often have a structure which is similar to that of the song (e.g. a
chorus), so they justify the use of a markup language with a well-specified
semantics. Unlike the previous types of metadata, however, they are not well
suited to being expressed using the W3C Semantic Web languages, e.g. in
RDF. While RDF has been suggested instead of XML for representing texts in
situations where advanced and multilayered markup is wanted [Ref RDFTEI], music
lyrics markup needs are usually limited to indicating particular sections
of the songs (e.g. intro, outro, chorus) and possibly the performing character
(e.g. in duets). While there is no widespread standard for machine-encoded
lyrics, some have been proposed [LML][4ML] which in general fit the need for
formatting and differentiating the main parts. An encoding of lyrics in RDF would
be of limited use, but still possible, with RDF-based queries possible just
thanks to text search operators in the query language (and therefore likely to be
limited to "lyrics that contain word X"). More complex queries would become
possible if several characters performing in the lyrics were each denoted by an
RDF entity which has other metadata attached to it (e.g. the metadata described
in the examples above).
</p>
<p>It should be noted, however, that an RDF encoding would have the disadvantage
of complexity. In general it would require supporting software (for example <a href="http://rdftef.sourceforge.net/">
http://rdftef.sourceforge.net/</a>), as RDF/XML can
hardly be written by hand. Also, contrary to an XML-based encoding, it could
not be easily visualized in a human-readable way by, e.g., a simple XSLT
transformation.
</p>
<p>Both in the case of RDF and of XML encoding, interesting processing and queries
(e.g. conceptual similarities between texts, moods, etc.) would necessitate
advanced textual analysis algorithms well outside the scope of XML or RDF
languages. Interestingly, however, it might be possible to use RDF descriptions
to encode the results of such advanced processing. Keyword extraction
algorithms (usually a combination of statistical analysis, stemming and
linguistic processing, e.g. using WordNet) can be successfully employed on
lyrics. The resulting representative "terms" can be encoded as metadata on the
lyrics or on the related song itself.
</p>
<h5 id="music-low-level">Lower Level Acoustic metadata</h5>
<p>"Acoustic metadata" is a broad term which can encompass both features which
have an immediate use in higher level use cases (e.g. those presented in the
above examples such as tempo, key, keyMode etc ) and those that can only be
interpreted by data analisys (e.g. a full or simplified representation of the
spectrum or the average power sliced every 10 ms). As we have seen, semantic
technologies are suitable for reppresenting the higher level acoustic metadata.
These are in fact both concise and can be used directly in semantic queries
using, e.g., SparQL. Lower level metadata however, e.g. the MPEG7 features
extracted by extractors like [Ref MPEG7AUDIODB] is very ill suited to be
represented in RDF and is better kept in mpeg-7/xml format for serialization
and interchange.
</p>
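<p>
As an illustration of such a query, the following SPARQL sketch (reusing the
illustrative <tt>music:</tt> vocabulary of the example above) selects tracks
directly on higher-level acoustic metadata, here key and tempo:
</p>
<div class="exampleInner">
<pre># find tracks in the key of D with a moderate tempo
# (assumes music:tempo carries a numeric value)
PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX music: <http://example.org/music#>
SELECT ?track ?title
WHERE {
  ?track rdf:type    music:Track ;
         music:title ?title ;
         music:key   "D" ;
         music:tempo ?tempo .
  FILTER (?tempo >= 60 && ?tempo <= 80)
}
</pre>
</div>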
<p>Semantic technologies could be of use in describing such "chunks" of low-level
metadata, e.g. describing what the content is in terms of which
features are contained and at which quality. While this would be a
duplication of the information encoded in the MPEG-7/XML, it might be of use in
semantic queries which select tracks also based on the availability of rich low-level
metadata.</p>
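<p>
A minimal sketch of such a description follows; the <tt>features:</tt>
vocabulary is invented for illustration. It points from a track to its
MPEG-7/XML feature file and states which descriptors that file contains:
</p>
<div class="exampleInner">
<pre><rdf:Description rdf:about="http://www.garageband.com/song?|pe1|S8LTM0LdsaSkaFeyYG0">
  <!-- the bulk low-level data stays in MPEG-7/XML; RDF only describes it -->
  <features:lowLevelData rdf:resource="http://example.org/features/lastsalutation-mpeg7.xml"/>
  <features:contains>AudioSpectrumEnvelope</features:contains>
  <features:contains>AudioPower</features:contains>
  <features:sliceLength>10ms</features:sliceLength>
</rdf:Description>
</pre>
</div>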
<h4 id="music-scenario">Motivating Example</h4>
<div style="float: right; width: 45%; border: 1px solid gray; padding: 1%; margin: 1%">
<img src="music-nextgig.png" alt="The next gig"/>
<br/>
The next gig.<br/>
Image courtesy of <a href="http://www.iua.upf.es/~ocelma/">Oscar Celma</a>, used with permission.
</div>
<p>Commuting is a big issue in any modern society. Semantically personalized
playlists might provide both relief and actual benefit in time that cannot be
devoted to actively productive activities. Filippo commutes every morning for an
average of 50±10 minutes. Before leaving, he connects his USB stick/MP3 player
to have it "filled" with his morning playlist. The process is completed in 10
seconds; after all, it is just 50 MB he is downloading. During the time of his
commute, Filippo will be offered a smooth flow of news, personal daily items,
entertainment, and cultural snippets from audiobooks and classes.
</p>
<p>Musical content comes from Filippo's personal music collection or via a content
provider (e.g. at low cost thanks to a one-time-payment license). Further audio
content comes from podcasts, but also from text-to-speech reading of blog posts,
emails, calendar items, etc.
</p>
<p>Behind the scenes, the system works by a combination of semantic queries and
ad-hoc algorithms. Semantic queries operate on an RDF database collecting the
semantic representation of music metadata (as explained above), as well
as annotations on podcasts, news items, audiobooks, and "semantic desktop
items", that is, Filippo's personal desktop information such as
emails and calendar entries.
</p>
<p>Ad-hoc algorithms operate on low-level metadata to provide smooth transitions
among tracks. Algorithms for text analysis provide further links among songs,
and links between songs, pieces of news, emails, etc.
</p>
<p>At a higher level, a global optimization algorithm takes care of the final
playlist creation. This is done by balancing the need to have high priority
items played first (e.g. emails from addresses considered important) with the
overall goal of providing a smooth and entertaining experience (e.g.
interleaving news with music).
</p>
<p>Semantics can help in providing "related information or content" which can be
placed adjacent to the actual core content. This can be done in relative freedom,
since the content can be skipped by the user at any time simply by pressing the
forward button.</p>
<h5 id="music-upcoming-concerts">Upcoming concerts</h5>
<p>John has been listening to the band "Snow Patrol" for a while. He discovered
the band while listening to one of his favorite podcasts about alternative
music. He has to travel to San Diego next week and would like to find upcoming
concerts there that he would enjoy, so he asks his personalized Semantic Web
music service to provide him with some recommendations of upcoming gigs in the
area, and decent bars to have a beer.
</p>
<div class="exampleInner">
<pre>
<!-- San Diego geolocation -->
<foaf:based_near geo:lat='32.715' geo:long='-117.156'/>
</pre>
</div>
<p>The system is tracking the user's listening habits, so it detects that one song
from the band "The Killers" (scraped from their website) sounds similar to the
last song John has listened to from "Snow Patrol". Moreover, both bands have
similar styles, and there are some podcasts that contain songs from both bands
in the same session. Interestingly enough, the system knows that The Killers
are playing close to San Diego next weekend, so it recommends that John
attend that gig.</p>
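<p>A semantic query behind such a recommendation might be sketched as follows,
reusing the foaf:based_near/geo coordinates shown above; mo:Performance and
mo:performer come from the Music Ontology, while the bounding-box values, the
attachment of foaf:based_near to a venue, and the underlying gig data are
assumptions for illustration.
</p>
<div class="exampleInner" style="clear: both">
<pre># upcoming gigs with a venue within roughly half a degree of San Diego
PREFIX mo:   <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX geo:  <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?gig ?artist
WHERE {
  ?gig a mo:Performance ;
       mo:performer ?artist ;
       foaf:based_near ?venue .
  ?venue geo:lat ?lat ; geo:long ?long .
  FILTER (?lat > 32.2 && ?lat < 33.2 &&
          ?long > -117.7 && ?long < -116.7)
}</pre>
</div>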
<h5 id="music-facet">Facet browsing of Music Collections</h5>
<p>Michael has a brand new (latest generation, posh) iPod. He is looking for some
music using the classic hierarchical navigation
(Genre->Artist->Album->Songs). The main problem is that he is not
able to find a decent list of songs (from his 100K music collection) to move
onto his iPod. Facet browsing, on the other hand, has recently become popular as
a user-friendly interface to data repositories.
</p>
<p>The /facet system <a href="#Hildebrand">[Hildebrand]</a> presents a new and intuitive way to navigate large
collections of multimedia assets, using several facets or aspects. /facet
extends browsing of Semantic Web data in four ways. First, users are able to
select and navigate through facets of resources of any type and to make
selections based on properties of other, semantically related, types. Second,
it addresses a disadvantage of hierarchy-based navigation by adding a keyword
search interface that dynamically makes semantically relevant suggestions.
Third, the /facet interface allows the inclusion of facet-specific display
options that go beyond the hierarchical navigation that characterizes current
facet browsing. Fourth, the browser works on any RDF dataset without any
additional configuration.
</p>
<p>Thus, based on an RDF description of music titles, the user
can navigate through music facets such as Rhythm (beats per minute), Tonality
(key and mode), and Intensity of the piece (moderate, energetic, etc.).
</p>
<p>A fully functional example can be seen at <a href="http://slashfacet.semanticweb.org/music/mazzle">
http://slashfacet.semanticweb.org/music/mazzle</a>
</p>
<div style="border: 1px solid gray; padding: 1%; margin: 1%">
<center>
<img src="music-mazzle.png" alt="The Mazzle Interface"/>
<br/>
The Mazzle Interface.<br/>
Image courtesy of <a href="http://www.cwi.nl/~hildebra/">Michiel Hildebrand</a>, used with permission.
</center>
</div>
<h4 id="music-metadata">Music Metadata on the Semantic Web</h4>
<p>Nowadays, in the context of the World Wide Web, the increasing amount of
available music makes it very difficult for users to find music they would
like to listen to. To overcome this problem, there are some audio search
engines that can fit the user's needs (for example: <a href="http://search.singingfish.com/">
http://search.singingfish.com/</a>, <a href="http://audio.search.yahoo.com/">http://audio.search.yahoo.com/</a>,
<a href="http://www.audiocrawler.com/">http://www.audiocrawler.com/</a>, <a href="http://www.alltheweb.com/?cat=mp3">
http://www.alltheweb.com/?cat=mp3</a>, <a href="http://www.searchsounds.net">http://www.searchsounds.net</a>
and <a href="http://www.altavista.com/audio/">http://www.altavista.com/audio/</a>).
</p>
<p>Some of the existing search engines are nevertheless not fully
exploited, because their companies would have to deal with copyright-infringing
material. Music search engines have a crucial component: an audio crawler that
scans the web and gathers related information about audio files.
</p>
<p>Moreover, describing music is not an easy task. As presented in section 1,
music metadata spans several categories (editorial, acoustic, and
cultural). Yet none of the audio metadata formats used in practice (e.g. ID3, OGG
Vorbis, etc.) can fully describe all these facets. Actually, metadata for
describing music mostly consists of tags in the key-value form
[TAG]=[VALUE], for instance "ARTIST=The Killers".
</p>
<p>The following section therefore introduces mappings between current audio
vocabularies using Semantic Web technologies. This allows the description of a
piece of music to be extended, as well as adding explicit semantics.
</p>
<h4 id="music-integration">Integrating Various Vocabularies Using RDF</h4>
<p>In this section we present a way to integrate several audio vocabularies into
a single one, based on RDF. For more details about the audio vocabularies, the
reader is referred to <a href="http://www.w3.org/2005/Incubator/mmsem/wiki/Vocabularies#head-91ffc7bd57a4631807ae03b31721b099db56937a">
Vocabularies - Audio Content Section</a>, and <a href="http://www.w3.org/2005/Incubator/mmsem/wiki/Vocabularies#head-7d4cf55c8883fbcbfdbbe8b1eb1b1512c2a5b328">
Vocabularies - Audio Ontologies Section</a>.
</p>
<p>This section will focus on the ID3 and OGG Vorbis metadata initiatives, as
they are the most widely used. Note, though, that both vocabularies cover only
editorial data. Moreover, a first mapping to the <a href="http://www.w3.org/2005/Incubator/mmsem/wiki/Vocabularies#head-7d4cf55c8883fbcbfdbbe8b1eb1b1512c2a5b328">
Music Ontology</a> is presented, too.
</p>
<p><a href="http://www.id3.org">ID3</a> is a metadata container most often
used in conjunction with the MP3 audio file format. It allows information such
as the title, artist, album, track number, or other information about the file
to be stored in the file itself (from Wikipedia).
</p>
<p>The most important metadata descriptors are:</p>
<ul>
<li>
Artist name <=> <tt><foaf:name></foaf:name></tt>
</li>
<li>
Album name <=> <tt><mo:Record><dc:title>Album
name</dc:title></mo:Record></tt>
</li>
<li>
Song title <=> <tt><mo:Track><dc:title>Song
title</dc:title></mo:Track></tt>
</li>
<li>
Year
</li>
<li>
Track number <=> <tt><mo:trackNum>Track
number</mo:trackNum></tt>
</li>
<li>
Genre (from a predefined list of more than 100 genres) <=> <tt><mo:Genre>Genre
name</mo:Genre></tt>
</li>
</ul>
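<p>Putting these mappings together, the ID3 tags of a single file could be
rendered roughly as follows. The nesting of record, track and artist is one
possible arrangement (a sketch, not a normative serialization); mo:track is the
Music Ontology property linking a record to its tracks, and the values are
illustrative.
</p>
<div class="exampleInner" style="clear: both">
<pre><mo:Record>
  <dc:title>Album name</dc:title>
  <mo:track>
    <mo:Track>
      <dc:title>Song title</dc:title>
      <mo:trackNum>2</mo:trackNum>
      <dc:creator>
        <mo:MusicGroup>
          <foaf:name>The Killers</foaf:name>
        </mo:MusicGroup>
      </dc:creator>
    </mo:Track>
  </mo:track>
</mo:Record></pre>
</div>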
<p><a href="http://www.vorbis.com/">OGG Vorbis</a> metadata, called comments,
supports metadata 'tags' similar to those implemented in ID3. The metadata
is stored in a vector of strings, encoded in UTF-8.
</p>
<ul>
<li>
TITLE <=> <tt><mo:Track><dc:title>Track
title</dc:title></mo:Track></tt>
</li>
<li>
VERSION: The version field may be used to differentiate multiple versions of
the same track title
</li>
<li>
ALBUM <=> <tt><mo:Record><dc:title>Album
name</dc:title></mo:Record></tt>
</li>
<li>
TRACKNUMBER <=> <tt><mo:trackNum>Track
number</mo:trackNum></tt>
</li>
<li>
ARTIST <=> <tt><foaf:name></foaf:name></tt>
</li>
<li>
PERFORMER <=> <tt><foaf:name></foaf:name></tt> (mapping still to be clarified)
</li>
<li>
COPYRIGHT: Copyright attribution
</li>
<li>
LICENSE: License information, e.g. 'All Rights Reserved', 'Any Use Permitted', or a
URL to a license such as a Creative Commons license
</li>
<li>
ORGANIZATION: Name of the organization producing the track (i.e. a record
label)
</li>
<li>
DESCRIPTION: A short text description of the contents
</li>
<li>
GENRE <=> <tt><mo:Genre>Genre name</mo:Genre></tt>
</li>
<li>
DATE
</li>
<li>
LOCATION: Location where the track was recorded</li>
</ul>
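<p>For the fields above that have no mapping listed, generic Dublin Core
properties suggest themselves; the following fragment is one possible (not
prescribed) rendering of such a mapping.
</p>
<div class="exampleInner" style="clear: both">
<pre><mo:Track>
  <dc:title>Track title</dc:title>
  <!-- DESCRIPTION maps naturally to dc:description -->
  <dc:description>A short text description of the contents</dc:description>
  <!-- COPYRIGHT/LICENSE could map to dc:rights -->
  <dc:rights>All Rights Reserved</dc:rights>
  <!-- ORGANIZATION could map to dc:publisher -->
  <dc:publisher>Record label name</dc:publisher>
  <!-- DATE could map to dc:date -->
  <dc:date>2006</dc:date>
</mo:Track></pre>
</div>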
<h4 id="music-rdfizing">RDFizing songs</h4>
<p>We present a way to RDFize tracks based on the
<a href="http://www.w3.org/2005/Incubator/mmsem/wiki/Vocabularies#head-7d4cf55c8883fbcbfdbbe8b1eb1b1512c2a5b328">
Music Ontology</a>.
</p>
<p>Example: search for a song in <a href="http://www.musicbrainz.org">MusicBrainz</a>
and RDFize the results. This first example shows how to query the MusicBrainz music repository and RDFize the results based on the Music
Ontology. Try a complete example at <a href="http://foafing-the-music.iua.upf.edu/RDFize/track?artist=U2&title=The+fly">
http://foafing-the-music.iua.upf.edu/RDFize/track?artist=U2&title=The+fly</a>.
The parameters are the song title (The Fly) and the artist name (U2).
</p>
<div class="exampleInner" style="clear: both">
<pre><mo:Track rdf:about='http://musicbrainz.org/track/dddb2236-823d-4c13-a560-bfe0ffbb19fc'>
<mo:puid rdf:resource='2285a2f8-858d-0d06-f982-3796d62284d4'/>
<mo:puid rdf:resource='2b04db54-0416-d154-4e27-074e8dcea57c'/>
<dc:title>The Fly</dc:title>
<dc:creator>
<mo:MusicGroup rdf:about='http://musicbrainz.org/artist/a3cb23fc-acd3-4ce0-8f36-1e5aa6a18432'>
<foaf:img rdf:resource='http://ec1.images-amazon.com/images/P/B000001FS3.01._SCMZZZZZZZ_.jpg'/>
<mo:musicmoz rdf:resource='http://musicmoz.org/Bands_and_Artists/U/U2/'/>
<foaf:name>U2</foaf:name>
<mo:discogs rdf:resource='http://www.discogs.com/artist/U2'/>
<foaf:homepage rdf:resource='http://www.u2.com/'/>
<foaf:member rdf:resource='http://musicbrainz.org/artist/0ce1a4c2-ad1e-40d0-80da-d3396bc6518a'/>
<foaf:member rdf:resource='http://musicbrainz.org/artist/1f52af22-0207-40ac-9a15-e5052bb670c2'/>
<foaf:member rdf:resource='http://musicbrainz.org/artist/a94e530f-4e9f-40e6-b44b-ebec06f7900e'/>
<foaf:member rdf:resource='http://musicbrainz.org/artist/7f347782-eb14-40c3-98e2-17b6e1bfe56c'/>
<mo:wikipedia rdf:resource='http://en.wikipedia.org/wiki/U2_%28band%29'/>
</mo:MusicGroup>
</dc:creator>
</mo:Track></pre>
</div>
<p>Example: the parameter is a URL that points to an MP3 file. In this case the service
reads the ID3 tags from the MP3 file. See an output example at
<a href="http://foafing-the-music.iua.upf.edu/RDFize/track?url=http://www.archive.org/download/bt2002-11-21.shnf/bt2002-11-21d1.shnf/bt2002-11-21d1t01_64kb.mp3">
http://foafing-the-music.iua.upf.edu/RDFize/track?url=http://www.archive.org/download/bt2002-11-21.shnf/bt2002-11-21d1.shnf/bt2002-11-21d1t01_64kb.mp3</a>
(it might take a little while).
</p>
<div class="exampleInner" style="clear: both">
<pre>
<mo:Track rdf:about='http://musicbrainz.org/track/7201c2ab-e368-4bd3-934f-5d936efffcdc'>
<dc:creator>
<mo:MusicGroup rdf:about='http://musicbrainz.org/artist/6b28ecf0-94e6-48bb-aa2a-5ede325b675b'>
<foaf:name>Blues Traveler</foaf:name>
<mo:discogs rdf:resource='http://www.discogs.com/artist/Blues+Traveler'/>
<foaf:homepage rdf:resource='http://www.bluestraveler.com/'/>
<foaf:member rdf:resource='http://musicbrainz.org/artist/d73c9a5d-5d7d-47ec-b15a-a924a1a271c4'/>
<mo:wikipedia rdf:resource='http://en.wikipedia.org/wiki/Blues_Traveler'/>
<foaf:img rdf:resource='http://ec1.images-amazon.com/images/P/B000078JKC.01._SCMZZZZZZZ_.jpg'/>
</mo:MusicGroup>
</dc:creator>
<dc:title>Back in the Day</dc:title>
<mo:puid rdf:resource='0a57a829-9d3c-eb35-37a8-d0364d1eae3a'/>
<mo:puid rdf:resource='02039e1b-64bd-6862-2d27-3507726a8268'/>
</mo:Track></pre>
</div>
<p>Example: Once the songs have been RDFized, we can ask <a href="http://last.fm">last.fm</a>
for the latest tracks a user has been listening to, and then RDFize them.
<a href="http://foafing-the-music.iua.upf.edu/draft/RDFize/examples/lastfm_tracks.rdf">
http://foafing-the-music.iua.upf.edu/draft/RDFize/examples/lastfm_tracks.rdf</a> is an example that
shows the latest tracks a user (RJ) has been listening to.
You can try it at <a href="http://foafing-the-music.iua.upf.edu/RDFize/lastfm_tracks?username=RJ">
http://foafing-the-music.iua.upf.edu/RDFize/lastfm_tracks?username=RJ</a>
</p>
<!-- ======================================================================== -->
<h3>
<a name="news">2.3 Use Case: News</a>
</h3>
<h4 id="news-introduction">Introduction</h4>
<p>More and more news is produced and consumed each day. News generally consists
mainly of textual stories, which are more and more often illustrated with
graphics, images and videos. News can be further processed by professionals
(newspapers), made directly accessible to web users through news agencies, or
automatically aggregated on the web, generally by search engine portals and not
without copyright problems.
</p>
<p>For easing the exchange of news, the <a href="http://www.iptc.org/pages/index.php">
International Press Telecommunication Council (IPTC)</a> is currently
developing the NewsML G2 Architecture (NAR), whose goal is <em>to provide a
single generic model for exchanging all kinds of newsworthy information, thus
providing a framework for a future family of IPTC news exchange standards</em>
[<a href="#NewsML-G2">NewsML-G2</a>]. This family includes NewsML, SportsML, EventsML, ProgramGuideML and a
future WeatherML. All are XML-based languages used for describing not only the
news content (traditional metadata), but also its management and packaging,
as well as aspects related to the exchange itself (transportation, routing).
</p>
<p>However, despite this general framework, interoperability problems can occur.
News is about the world, so its metadata might use specific controlled
vocabularies. For example, IPTC itself is developing the IPTC News Codes [<a href="#NewsCodes">NewsCodes</a>]
that currently contain 28 sets of controlled terms. These terms will be the
values of the metadata in the NewsML G2 Architecture. The news descriptions
often refer to other thesauri and controlled vocabularies, which might come
from industry (for example, XBRL [<a href="#XBRL">XBRL</a>] in the financial domain), and all are
represented using different formats. From the media point of view, the pictures
taken by the journalist come with their EXIF metadata [<a href="#Exif">EXIF</a>]. Some videos might be
described using the EBU format [<a href="#EBU">EBU</a>] or even with MPEG-7 [<a href="#MPEG-7">MPEG-7</a>].
</p>
<p>We illustrate these interoperability issues between domain vocabularies and
other multimedia standards in the financial news domain. For example, the
<a href="http://www.reuters.com/">Reuters Newswires</a> and the
<a href="http://www.djnewswires.com/">Dow Jones Newswires</a> provide categorical
metadata associated with news feeds. The particular vocabularies of category
codes, however, have been developed independently, leading to clear
interoperability issues. The general goal is to improve the search and the
presentation of news content in such an heterogeneous environment. We provide a
motivating example that highlight the issues discussed above and we present a
potential solution to this problem, which leverages Semantic Web technologies.
</p>
<h4 id="news-scenario">Motivating Example</h4>
<p>XBRL (eXtensible Business Reporting Language) [<a href="#XBRL">XBRL</a>] is a standardized way of
encoding financial information of companies, as well as information about the
management structure, location, number of employees, etc. of such entities. XBRL is
basically about "quantitative" information in the financial domain, and is
based on the periodic reports generated by the companies. But for many Business
Intelligence applications, there is also a need to consider "qualitative"
information, which is mostly delivered by news articles. The problem is
therefore how to optimally integrate information from the periodic reports and
the day-to-day information provided by specialized news agencies. Our goal is
to provide a platform that allows more semantics in the automated ranking of
the creditworthiness of companies. Financial news plays an important role here,
since it provides "qualitative" information on companies, branches, trends,
countries, regions, etc.
</p>
<p>There are quite a few news feed services within the financial domain,
including the Dow Jones Newswire and Reuters. Both Reuters and Dow Jones
provide an XML-based representation and associate with each article
metadata such as date, time, headline, full story, company ticker symbol, and
category codes.
</p>
<h5 id="news-example1">Example 1: NewsML 1 Format</h5>
<p>We consider news feeds similar to those published by <a href="http://www.reuters.com/">Reuters</a>, where,
along with the text of the article, there is associated metadata in the form of
XML tags. The terms in these tags are associated with a controlled vocabulary
developed by Reuters and other industry bodies. Below is a sample news article
formatted in NewsML 1, which is similar to the structural format used by
Reuters. For exposition, the metadata tags associated with the article are
aligned with those used by Reuters.
</p>
<div class="exampleInner" style="clear: both"><pre>
<?xml version="1.0" encoding="UTF-8"?>
<NewsML Duid="MTFH93022_2006-12-14_23-16-17_NewsML">
<Catalog Href="..."/>
<NewsEnvelope>
<DateAndTime>20061214T231617+0000</DateAndTime>
<NewsService FormalName="..."/>
<NewsProduct FormalName="TXT"/>
<Priority FormalName="3"/>
</NewsEnvelope>
<NewsItem Duid="MTFH93022_2006-12-14_23-16-17_NEWSITEM">
<Identification>
<NewsIdentifier>
<ProviderId>...</ProviderId>
<DateId>20061214</DateId>
<NewsItemId>MTFH93022_2006-12-14_23-16-17</NewsItemId>
<RevisionId Update="N" PreviousRevision="0">1</RevisionId>
<PublicIdentifier>...</PublicIdentifier>
</NewsIdentifier>
<DateLabel>2006-12-14 23:16:17 GMT</DateLabel>
</Identification>
<NewsManagement>
<NewsItemType FormalName="News"/>
<FirstCreated>...</FirstCreated>
<ThisRevisionCreated>...</ThisRevisionCreated>
<Status FormalName="Usable"/>
<Urgency FormalName="3"/>
</NewsManagement>
<NewsComponent EquivalentsList="no" Essential="no" Duid="MTFH92062_2002-09-23_09-29-03_T88093_MAIN_NC" xml:lang="en">
<TopicSet FormalName="HighImportance">
<Topic Duid="t1">
<TopicType FormalName="CategoryCode"/>
<FormalName Scheme="MediaCategory">OEC</FormalName>
<Description xml:lang="en">Economic news, EC, business/financial pages</Description>
</Topic>
<Topic Duid="t2">
<TopicType FormalName="Geography"/>
<FormalName Scheme="N2000">DE</FormalName>
<Description xml:lang="en">Germany</Description>
</Topic>
</TopicSet>
<Role FormalName="Main"/>
<AdministrativeMetadata>
<FileName>MTFH93022_2006-12-14_23-16-17.XML</FileName>
<Provider>
<Party FormalName="..."/>
</Provider>
<Source>
<Party FormalName="..."/>
</Source>
<Property FormalName="SourceFeed" Value="IDS"/>
<Property FormalName="IDSPublisher" Value="..."/>
</AdministrativeMetadata>
<NewsComponent EquivalentsList="no" Essential="no" Duid="MTFH93022_2006-12-14_23-16-17" xml:lang="en">
<Role FormalName="Main Text"/>
<NewsLines>
<HeadLine>Insurances get support</HeadLine>
<ByLine/>
<DateLine>December 14, 2006</DateLine>
<CreditLine>...</CreditLine>
<CopyrightLine>...</CopyrightLine>
<SlugLine>...</SlugLine>
<NewsLine>
<NewsLineType FormalName="Caption"/>
<NewsLineText>Insurances get support</NewsLineText>
</NewsLine>
</NewsLines>
<DescriptiveMetadata>
<Language FormalName="en"/>
<TopicOccurrence Importance="High" Topic="#t1"/>
<TopicOccurrence Importance="High" Topic="#t2"/>
</DescriptiveMetadata>
<ContentItem Duid="MTFH93022_2006-12-14_23-16-17">
<MediaType FormalName="Text"/>
<Format FormalName="XHTML"/>
<Characteristics>
<Property FormalName="ContentID" Value="urn:...20061214:MTFH93022_2006-12-14_23-16-17_T88093_TXT:1"/>
...
</Characteristics>
<DataContent>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Insurances get support</title>
</head>
<body>
<h1>The Senate of Germany wants to constraint the participation of clients to the hidden reserves</h1>
<p>
DÜSSELDORF The German Senate supports the point of view of insurance companies in a central point of the new law
defining insurance contracts, foreseen for 2008. In a statement, the Senators show disagreements with the proposal
of the Federal Government, who was in favor of including investment bonds in the hidden reserves, which in the
next future should be accessible to the clients of the insurance companies.
...
</p>
</body>
</html>
</DataContent>
</ContentItem>
</NewsComponent>
</NewsComponent>
</NewsItem>
</NewsML></pre>
</div>
<h5 id="news-example2">Example 2: NewsML G2 Format</h5>
<p>If we consider the same data, but expressed in NewsML G2:
</p>
<div class="exampleInner" style="clear: both">
<pre><?xml version="1.0" encoding="UTF-8"?>
<newsMessage xmlns="http://iptc.org/std/newsml/2006-05-01/" xmlns:xhtml="http://www.w3.org/1999/xhtml">
<header>
<date>2006-12-14T23:16:17Z</date>
<transmitId>696</transmitId>
<priority>3</priority>
<channel>ANA</channel>
</header>
<itemSet>
<newsItem guid="urn:newsml:afp.com:20060720:TX-SGE-SNK66" schema="0.7" version="1">
<catalogRef href="http://www.afp.com/newsml2/catalog-2006-01-01.xml"/>
<itemMeta>
<contentClass code="ccls:text"/>
<provider literal="Handelsblatt"/>
<itemCreated>2006-07-20T23:16:17Z</itemCreated>
<pubStatus code="stat:usable"/>
<service code="srv:Archives"/>
</itemMeta>
<contentMeta>
<contentCreated>2006-07-20T23:16:17Z</contentCreated>
<creator/>
<language literal="en"/>
<subject code="cat:04006002" type="ctyp:category"/> #cat:04006002= banking
<subject code="cat:04006006" type="ctyp:category"/> #cat:04006006= insurance
<slugline separator="-">Insurances get support</slugline>
<headline>The Senate of Germany wants to constraint the participation of clients to the hidden reserves</headline>
</contentMeta>
<contentSet>
<inlineXML type="text/plain">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Insurances get support</title>
</head>
<body>
<h1>The Senate of Germany wants to constraint the participation of clients to the hidden reserves</h1>
<p>
DÜSSELDORF The German Senate supports the point of view of insurance companies in a central point of the new law defining
insurance contracts, foreseen for 2008. In a statement, the Senators show disagreements with the proposal of the Federal
Government, who was in favor of including investment bonds in the hidden reserves, which in the next future should be accessible
to the clients of the insurance companies.
...
</p>
</body>
</html>
</inlineXML>
</contentSet>
</newsItem>
</itemSet>
</newsMessage></pre>
</div>
<h5 id="news-example3">Example 3: German Broadcaster Format</h5>
<p>The terms in the tags displayed just above are associated with a controlled
vocabulary developed by Reuters. If we consider the internal XML encoding that
has been proposed provisionally by an ongoing European project (the <a href="http://www.musing.eu">MUSING
project</a>) for the encoding of similar articles in German
newspapers (mapping the HTML tags of the online articles into XML and adding
others), we have the following:
</p>
<div class="exampleInner" style="clear: both"><pre>
<ID>1091484</ID> # Internal encoding
<SOURCE>Handelsblatt</SOURCE> # Name of the newspaper we get the information from
<DATE>14.12.2006</DATE> # Date of publication
<NUMBER>242</NUMBER> # Numbering of the publication
<PAGE>27</PAGE> # Page number in the publication
<LENGTH>111</LENGTH> # The number of lines in the main article
<ACTIVITY_FIELD>Banking_Insurance</ACTIVITY_FIELD> # corresponding to the financial domain reported in the article
<TITLE>Insurances get support</TITLE>
<SUBTITLE>The Senate of Germany wants to constraint the participation of clients to the hidden reserves</SUBTITLE>
<ABSTRACT></ABSTRACT>
<AUTHORS>Lansch, Rita</AUTHORS>
<LOCATION>Federal Republic of Germany</LOCATION>
<KEYWORDS>Bank supervision, Money and Stock exchange, Bank</KEYWORDS>
<PROPERNAMES>Meister, Edgar Remsperger, Hermann Reckers, Hans Fabritius, Hans Georg Zeitler, Franz-Christoph</PROPERNAMES>
<ORGANISATIONS>Bundesanstalt für Finanzdienstleistungsaufsicht BAFin</ORGANISATIONS>
<TEXT>DÜSSELDORF The German Senate supports the point of view of insurance companies in a central point of the new law
defining insurance contracts, foreseen for 2008. In a statement, the Senators show disagreements with the proposal of the
Federal Government, who was in favor of including investment bonds in the hidden reserves, which in the next future should
be accessible to the clients of the insurance companies....</TEXT>
</pre>
</div>
<h5 id="news-example4">Example 4: XBRL Format</h5>
<p>Structured data and documents such as Profit & Loss tables can finally be
mapped onto existing taxonomies, like XBRL, which is an emerging standard for
Business Reporting.
</p>
<p>XBRL definition in Wikipedia: "XBRL is an emerging XML-based standard to
define and exchange business and financial performance information. The
standard is governed by a not-for-profit international consortium
<a href="http://www.xbrl.org">XBRL International Incorporated</a>
of approximately 450 organizations, including regulators, government agencies,
infomediaries and software vendors. XBRL is a standard way to communicate
business and financial performance data. These communications are defined by
metadata set in taxonomies. Taxonomies capture the definition of individual
reporting elements as well as the relationships between elements within a
taxonomy and in other taxonomies."
</p>
<p>The relations between elements supported for the time being (at least for
the German Accounting Principles expressed in the corresponding XBRL taxonomy;
see <a href="http://www.xbrl.de/">http://www.xbrl.de/</a>) are:
</p>
<ul>
<li>
child-parent
</li>
<li>
parent-child
</li>
<li>
dimension-element
</li>
<li>
element-dimension
</li>
</ul>
<p>In fact, the child-parent/parent-child relations have to be understood as
part-of relations within financial reporting documents rather than as sub-class
relations, as we noticed in an attempt to formalize XBRL in OWL in the context
of the European MUSING R&D project (<a href="http://www.musing.eu/">http://www.musing.eu/</a>).
</p>
<p>The table below shows what a balance sheet looks like:
</p>
<div>
<table border="2" id="Table1">
<tbody>
<tr>
<td><strong>structured P&L</strong>
</td>
<td><strong>2002 EUR</strong>
</td>
<td><strong>2001 EUR</strong>
</td>
<td><strong>2000 EUR</strong>
</td>
</tr>
<tr>
<td>
Sales
</td>
<td>850.000,00
</td>
<td>800.000,00
</td>
<td>300.000,00
</td>
</tr>
<tr>
<td>
Changes in stock
</td>
<td>171.000,00
</td>
<td>104.000,00
</td>
<td>83.000,00
</td>
</tr>
<tr>
<td>
Own work capitalized
</td>
<td>0,00
</td>
<td>0,00
</td>
<td>0,00
</td>
</tr>
<tr>
<td>
Total output
</td>
<td>1.021.000,00
</td>
<td>904.000,00
</td>
<td>383.000,00
</td>
</tr>
<tr>
<td>
...
</td>
<td>
</td>
<td>
</td>
<td>
</td>
</tr>
<tr>
<td>
Net income/net loss for the year
</td>
<td>139.000,00
</td>
<td>180.000,00
</td>
<td>-154.000,00
</td>
</tr>
<tr>
<td>
</td>
<td><strong>2002</strong>
</td>
<td><strong>2001</strong>
</td>
<td><strong>2000</strong>
</td>
</tr>
<tr>
<td>
Number of Employees
</td>
<td>27
</td>
<td>25
</td>
<td>23
</td>
</tr>
<tr>
<td>
....
</td>
<td>
</td>
<td>
</td>
<td>
</td>
</tr>
</tbody>
</table>
</div>
<p>There is a lot of variation both in the way the information can be displayed
(number of columns, use of fonts, etc.) and in the terminology used: the
financial terms in the leftmost column are not normalized at all. The
figures are not normalized either (clearly, the company has more than just "27"
employees, but the table does not indicate whether we are dealing with 27,000
employees). This makes this kind of information unusable by semantic
applications. XBRL is a very important step in the normalization of such data,
as can be seen in the following example displaying the XBRL encoding of the
kind of data that was presented just above in the table:
</p>
<div class="exampleInner" style="clear: both">
<pre><group xsi:schemaLocation="http://www.xbrl.org/german/ap/ci/2002-02-15 german_ap.xsd">
<numericContext id="c0" precision="8" cwa="false">
<entity>
<identifier scheme="urn:datev:www.datev.de/zmsd">11115,129472/12346</identifier>
</entity>
<period>
<startDate>2002-01-01</startDate>
<endDate>2002-12-31</endDate>
</period>
<unit>
<measure>ISO4217:EUR</measure>
</unit>
</numericContext>
<numericContext id="c1" precision="8" cwa="false">
<entity>
<identifier scheme="urn:datev:www.datev.de/zmsd">11115,129472/12346</identifier>
</entity>
<period>
<startDate>2001-01-01</startDate>
<endDate>2001-12-31</endDate>
</period>
<unit>
<measure>ISO4217:EUR</measure>
</unit>
</numericContext>
<numericContext id="c2" precision="8" cwa="false">
<entity>
<identifier scheme="urn:datev:www.datev.de/zmsd">11115,129472/12346</identifier>
</entity>
<period>
<startDate>2000-01-01</startDate>
<endDate>2000-12-31</endDate>
</period>
<unit>
<measure>ISO4217:EUR</measure>
</unit>
</numericContext>
<t:bs.ass numericContext="c2">1954000</t:bs.ass>
<t:bs.ass.accountingConvenience numericContext="c0">40000</t:bs.ass.accountingConvenience>
<t:bs.ass.accountingConvenience numericContext="c1">70000</t:bs.ass.accountingConvenience>
<t:bs.ass.accountingConvenience numericContext="c2">0</t:bs.ass.accountingConvenience>
<t:bs.ass.accountingConvenience.changeDem2Eur numericContext="c0">0</t:bs.ass.accountingConvenience.changeDem2Eur>
<t:bs.ass.accountingConvenience.changeDem2Eur numericContext="c1">20000</t:bs.ass.accountingConvenience.changeDem2Eur>
<t:bs.ass.accountingConvenience.changeDem2Eur numericContext="c2">0</t:bs.ass.accountingConvenience.changeDem2Eur>
<t:bs.ass.accountingConvenience.startUpCost numericContext="c0">40000</t:bs.ass.accountingConvenience.startUpCost>
<t:bs.ass.accountingConvenience.startUpCost numericContext="c1">50000</t:bs.ass.accountingConvenience.startUpCost>
<t:bs.ass.accountingConvenience.startUpCost numericContext="c2">0</t:bs.ass.accountingConvenience.startUpCost>
<t:bs.ass.currAss numericContext="c0">571500</t:bs.ass.currAss>
<t:bs.ass.currAss numericContext="c1">558000</t:bs.ass.currAss>
<t:bs.ass.currAss numericContext="c2">394000</t:bs.ass.currAss>
</group></pre>
</div>
<p>In the XBRL example shown just above, one can see the normalization of the
periods for which the reporting is valid, and of the currency used in the
report. The annotation of the values of the financial items is then
proposed on the basis of an XBRL tag (language independent), in the context of the
uniquely identified period ("c0", "c1", etc.), and with the encoded currency.
</p>
<p>The XBRL representation marks real progress compared to the
"classical" way of displaying financial information. As such, XBRL allows
for some semantics, describing for example various types of relations. The need
for more semantics is mainly driven by applications requiring the merging of the
quantitative information encoded in XBRL with other kinds of information, which
is crucial in Business Intelligence scenarios, for example merging balance
sheet information with information coming from newswires or with information
from related domains, like politics. Therefore some initiatives have started
looking at representing information encoded in XBRL in OWL, the basic ontology
representation language of the Semantic Web community [<a href="#Declerck">Declerck</a>], [<a href="#Lara">Lara</a>].
</p>
<h4 id="news-solution">Potential Solution: Converting Various Vocabularies into RDF</h4>
<p>In this section, we discuss a potential solution to the problems highlighted
in this document. We propose utilizing Semantic Web technologies for the
purpose of aligning these standards and controlled vocabularies. Specifically,
we discuss adding an RDF/OWL layer on top of these standards and vocabularies
for the purpose of data integration and reuse. The following sections discuss
this approach in more detail.
</p>
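<p>As a first intuition, the NewsML G2 item of Example 2 could be surfaced in RDF
roughly as sketched below. The guid and the category labels (banking,
insurance) are taken from that example; the ex:category property and the
shared, provider-independent category URIs are hypothetical placeholders for
terms that the aligned vocabulary would define.
</p>
<div class="exampleInner" style="clear: both">
<pre><rdf:Description rdf:about="urn:newsml:afp.com:20060720:TX-SGE-SNK66">
  <dc:title>Insurances get support</dc:title>
  <dc:date>2006-12-14</dc:date>
  <!-- hypothetical properties pointing to shared category resources -->
  <ex:category rdf:resource="http://example.org/categories/banking"/>
  <ex:category rdf:resource="http://example.org/categories/insurance"/>
</rdf:Description></pre>
</div>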
<h5 id="news-XBRL">XBRL in the Semantic Web</h5>
<p>We sketch how we convert XBRL to OWL. The XBRL OWL base taxonomy was manually
developed using the OWL plugin of the Protege knowledge base editor [<a href="#Knublauch">Knublauch</a>]. The
version of XBRL we used, together with the German Accounting Principles,
consists of 2,414 concepts, 34 properties, and 4,780 instances. Overall, this
translates into 24,395 unique RDF triples. The basic idea during our export was
that even though we are developing an XBRL taxonomy in OWL using Protege, the
information that is stored on disk is still RDF at the syntactic level. We were
thus interested in RDF database systems which make sense of the semantics of
OWL and RDFS constructs such as rdfs:subClassOf or owl:equivalentClass. We have
been experimenting with the Sesame open-source middleware framework for storing
and retrieving RDF data [<a href="#Broekstra">Broekstra</a>].
</p>
<p>Sesame partially supports the semantics of RDFS and OWL constructs via
entailment rules that compute "missing" RDF triples (the deductive closure) in
a forward-chaining style at compile time. Since sets of RDF statements
represent RDF graphs, querying information in an RDF framework means specifying
path expressions. Sesame comes with a very powerful query language, SeRQL,
which includes (i) generalised path expressions, (ii) a restricted form of
disjunction through optional matching, (iii) existential quantification over
predicates, and (iv) Boolean constraints. From an RDF point of view, an additional
62,598 triples were generated through Sesame's (incomplete) forward-chaining
inference mechanism.
</p>
<p>For proof of concept, we looked at the freely available financial reporting
taxonomies (<a href="http://www.xbrl.org/FRTaxonomies/">http://www.xbrl.org/FRTaxonomies/</a>)
and took the final German AP Commercial and Industrial (German Accounting
Principles) taxonomy (February 15, 2002; <a href="http://www.xbrl-deutschland.de/xe">
http://www.xbrl-deutschland.de/xe</a> news2.htm), acknowledged by XBRL
International. The taxonomy can be obtained as a packed zip file from <a href="http://www.xbrl-deutschland.de/germanap.zip">
http://www.xbrl-deutschland.de/germanap.zip</a>.
</p>
<p>xbrl-instance.xsd specifies the XBRL base taxonomy using XML Schema. The file
makes use of XML schema datatypes, such as xsd:string or xsd:date, but also
defines simple types (simpleType), complex types (complexType), elements
(element), and attributes (attribute). Element and attribute declarations are
used to restrict the usage of elements and attributes in XBRL XML documents.
Since OWL only knows the distinction between classes and properties, the
correspondence between XBRL and OWL description primitives is not a one-to-one
mapping:
</p>
<p>However, OWL allows properties to be characterized more precisely than just
having a domain and a range. We can mark a property as functional (instead
of relational, the default case), meaning that it takes at most one
value. This does not mean that a property must have a value for each
instance of a class on which it is defined; a functional property is in
fact a partial (and not necessarily total) function. Exactly this
functional vs. relational distinction is captured by the attribute vs.
element distinction, since multiple elements are allowed within a surrounding
context, whereas at most one attribute-value combination for each attribute
name is allowed within an element:
</p>
<div>
<table border="2" ID="Table2">
<tbody>
<tr>
<td><strong>XBRL</strong></td>
<td><strong>OWL</strong></td>
</tr>
<tr>
<td>simple type</td>
<td>class</td>
</tr>
<tr>
<td>complex type</td>
<td>class</td>
</tr>
<tr>
<td>attribute</td>
<td>functional property</td>
</tr>
<tr>
<td>element</td>
<td>relational property</td>
</tr>
</tbody>
</table>
</div>
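<p>Concretely, an XBRL attribute declaration can thus be rendered as an OWL
functional property, while an element declaration becomes an ordinary
(relational) property. The fragment below is a minimal sketch using the
precision attribute and the period element from the instance example above; it
is illustrative and not taken from the actual converted taxonomy.
</p>
<div class="exampleInner" style="clear: both">
<pre><!-- an XBRL attribute: at most one value per element -->
<owl:DatatypeProperty rdf:ID="precision">
  <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
</owl:DatatypeProperty>
<!-- an XBRL element: multiple occurrences allowed -->
<owl:ObjectProperty rdf:ID="period"/></pre>
</div>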
<p>Simple and complex types differ from one another in that simple types are
essentially defined as extensions of the basic XML Schema datatypes, whereas
complex types are XBRL specifications that do not build upon XSD types, but
instead introduce their own element and attribute descriptions. Simple type
specifications are found in the base terminology of XBRL, located in the file
xbrl-instance.xsd.
</p>
<p>Since OWL only claims that "As a minimum, tools must support datatype
reasoning for the XML Schema datatypes xsd:string and xsd:integer." [<a href="#OWL">OWL</a>, p. 30]
and because "It is not illegal, although not recommended, for applications to
define their own datatypes ..." [<a href="#OWL">OWL</a>, p. 29], we have decided to implement a
workaround that represents all the necessary XML Schema datatypes used in XBRL.
This was done by having a wrapper type for each simple XML Schema type. For
instance, "monetary" is a simple subtype of the wrapper type "decimal":
<tt><restriction base="decimal"/></tt>. Below we show the first lines of the
actual OWL version of XBRL we have implemented:
</p>
<div class="exampleInner" style="clear: both">
<pre>
<?xml version="1.0"?>
<rdf:RDF xmlns="http://xbrl.dfki.de/main.owl#"
xmlns:protege="http://protege.stanford.edu/plugins/owl/protege#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xml:base="http://xbrl.dfki.de/main.owl">
<owl:Ontology rdf:about=""/>
<owl:Class rdf:ID="bs.ass.fixAss.tan.machinery.installations">
<rdfs:subClassOf>
<owl:Class rdf:ID="Locator"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="nt.ass.fixAss.fin.loansToParticip.net.addition">
<rdfs:subClassOf>
<owl:Class rdf:about="#Locator"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="nt.ass.fixAss.fin.loansToSharehold.net.beginOfPeriod.endOfPrevPeriod">
<rdfs:subClassOf>
<owl:Class rdf:about="#Locator"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="nt.ass.fixAss.fin.gross.revaluation.comment">
<rdfs:subClassOf>
<owl:Class rdf:about="#Locator"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="nt.ass.fixAss.fin.securities.gross.beginOfPeriod.otherDiff">
<rdfs:subClassOf>
<owl:Class rdf:about="#Locator"/>
</rdfs:subClassOf>
</owl:Class>
...
</owl:Ontology>
</rdf:RDF></pre>
</div>
<p>The German Accounting Principles taxonomy consists of 2,387 concepts, plus 27
concepts from the base taxonomy for XBRL. 34 properties were defined, and 4,780
instances were finally generated.
</p>
<p>Besides the ontologization of XBRL, we would propose to build an ontology on
top of the taxonomic organization of the NACE codes. We then need a clear
ontological representation of the time units/information relevant to the
domain. And last but not least, we would also use all the
classification/categorization information of NewsML/IPTC to provide more accurate
semantic metadata for the encoding of (financial) news articles.
</p>
<h5 id="news-exif">EXIF in the Semantic Web</h5>
<p>One of today's commonly used image formats and metadata standards is the
Exchangeable Image File Format [<a href="#Exif">EXIF</a>]. This file format provides a standard
specification for storing metadata about images. Metadata elements
pertaining to the image are stored in the image file header and are marked with
unique tags, which serve as element identifiers.
</p>
<p>As we note in this document, one potential way to integrate EXIF metadata
with additional news/multimedia metadata formats is to add an RDF layer on top
of the metadata standards. Recently there have been efforts to encode EXIF
metadata in such Semantic Web standards, which we briefly detail below. We note
that the two ontologies below are semantically very similar, so this issue is
not addressed here. Essentially, both are straightforward encodings of the
EXIF metadata tags for images. There are some syntactic differences,
but again they are quite similar; they primarily differ in the naming
conventions utilized.
</p>
<p>The <a href="http://www.kanzaki.com/test/exif2rdf">Kanzaki EXIF RDF Schema</a> provides an encoding of the basic EXIF
metadata tags in RDFS. Essentially, these are the tags defined from Section 4.6
of [<a href="#Exif">EXIF</a>]. We also note here that relevant domains and ranges are utilized as
well. It additionally provides an EXIF conversion service, EXIF-to-RDF,
which extracts EXIF metadata from images and automatically maps it to
the RDF encoding. In particular the service takes a URL to an EXIF image and
extracts the embedded EXIF metadata. The service then converts this metadata to
the RDF schema and returns this to the user.</p>
<p>The <a href="http://www.nwalsh.com/java/jpegrdf/">Norm Walsh EXIF RDF Schema</a> provides another encoding of the basic
EXIF metadata tags in RDFS. Again, these are the tags defined from Section 4.6
of [<a href="#Exif">EXIF</a>]. It additionally provides JPEGRDF, which is a Java application that
provides an API to read and manipulate EXIF meatadata stored in JPEG images.
Currently, JPEGRDF can can extract, query, and augment the EXIF/RDF data stored
in the file headers. In particular, we note that the API can be used to convert
existing EXIF metadata in file headers to the schema. The
resulting RDF can then be stored in the image file header, etc. (Note here that
the API's functionality greatly extends that which was briefly presented here).
</p>
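<p>To give the flavour of such encodings, the following sketch shows a few
typical EXIF fields of a photo expressed in RDF. The property names mirror the
EXIF tag names in the style of the schemas above, but the exact namespace and
spellings should be taken from the respective schema; the photo URI and values
are illustrative.
</p>
<div class="exampleInner" style="clear: both">
<pre><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:exif="http://www.kanzaki.com/ns/exif#">
  <rdf:Description rdf:about="http://example.org/photos/p1.jpg">
    <exif:model>Example Camera 100</exif:model>
    <exif:dateTime>2006:12:14 10:15:00</exif:dateTime>
    <exif:fNumber>5.6</exif:fNumber>
    <exif:exposureTime>1/125</exif:exposureTime>
  </rdf:Description>
</rdf:RDF></pre>
</div>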
<h5 id="news-conclusion">Putting All That Together</h5>
<p>Some text showing how this qualitative and quantitative information benefits
from interoperation ...</p>
<!-- ======================================================================== -->
<h3>
<a name="tagging">2.4 Use Case: Tagging</a>
</h3>
<h4 id="tagging-introduction">Introduction</h4>
<p>
Tags are what may be the simplest form of annotation: simple user-provided
keywords that are assigned to resources, in order to support subsequent
retrieval. In itself, this idea is not particularly new or revolutionary:
keyword-based retrieval has been around for a while. In contrast to the formal
semantics provided by the Semantic Web standards, tags have no semantic
relations whatsoever, including a lack of hierarchy; tags are just flat
collections of keywords.<br/>
There are however new dimensions that have boosted the popularity of this
approach and given a new perspective on an old theme: low-cost applicability
and collaborative tagging.
</p>
<p>
Tagging lowers the barrier to metadata annotation, since it requires minimal
effort on behalf of annotators: there are no special tools or complex interfaces
that the user needs to get familiar with, and no deep understanding of logic
principles or formal semantics is required, just some standard technical
expertise. Tagging seems to work in a way that is intuitive to most people, as
demonstrated by its widespread adoption, as well as by certain studies
conducted in the field [<a href="#Trant">Trant</a>]. Thus, it helps bridge the 'semantic gap' between
content creators and content consumers, by offering 'alternative points of
access' to document collections.
</p>
<p>
The main idea behind collaborative tagging is simple: collaborative tagging
platforms (or, alternatively, distributed classification systems, DCSs [<a href="#Mejias">Mejias</a>])
provide the technical means, usually via some sort of web-based interface, that
support users in tagging resources. The important aspect is that they aggregate
the collection of tags that an individual uses, his or her tag
vocabulary, called a personomy [<a href="#Hotho">Hotho</a>], into what has been termed a folksonomy: a
collection of all personomies [<a href="#Mathes">Mathes</a>, <a href="#Smith">Smith</a>].
</p>
<p>Some of the most popular collaborative tagging systems are Delicious
(bookmarks), Flickr (images), Last.fm (music), YouTube (video), Connotea
(bibliographic information), steve.museum (museum items) and Technorati
(blogging). Using these platforms is free, although in some cases users can opt
for more advanced features by getting an upgraded account, for which they have
to pay. The most prominent among them are Delicious and Flickr, for which some
quantitative user studies are available [<a href="#HitWise">HitWise</a>,
<a href="#NetRatings">NetRatings</a>]. These user studies document a
phenomenal growth, indicating that in real life tagging is a very viable
solution for annotating any type of resource.
</p>
<h4 id="tagging-scenario">Motivating Scenario</h4>
<p>
Let us review some of the current limitations of tag-based annotation by
examining a motivating example:
</p>
<p>
Let us suppose that user Mary has an account on platform S1, which specializes in
images. Mary has been using S1 for a while, so she has progressively built a
large image collection, as well as a rich vocabulary of tags (personomy).
</p>
<p>
Another user, Sylvia, who is Mary's friend, uses a different platform, S2,
to annotate her images. At some point, Mary and Sylvia attended the same event,
and each took some pictures with her own camera. As each user has her
reasons for choosing a preferred platform, neither of them would like to change.
They would, however, like to be able to link to each other's annotated pictures
where applicable: it can be expected that, since the pictures were taken at the
same time and place, some of them may be annotated in a similar way (same tags),
even by different annotators. So they may (within the boundaries of word
ambiguity) be about the same topic.
</p>
<p>In the course of time Mary also becomes interested in video and starts
shooting some of her own. As her personal video collection begins to grow, she
decides to start using another collaborative tagging system, S3, which
specializes in video, in order to better organise it. Since she already has a
rich personomy built in S1, she would naturally like to reuse it in S3, to the
extent possible: while some of the tags may not be appropriate, as they may
represent one-off ('29-08-06') or photography-specific ('CameraXYZ') use,
others might well be reused across modalities/domains, in case they
represent high-level concepts ('holidays'). So if Mary has both video and
photographic material of some event, and since she has already created a
personomy on S1, she would naturally like to be able to reuse it (partially,
perhaps) on S3 as well.
</p>
<h4 id="tagging-issues">Issues</h4>
<p>
The above scenario demonstrates limitations of tag-based systems with respect
to personomy reuse:
</p>
<ul>
<li>
A personomy maintained at one platform cannot easily be reused for a tag-based
retrieval on another tagging platform.
</li>
<li>
A personomy maintained at one platform cannot easily be reused to organize
further media or resources on another tagging platform.
</li>
</ul>
<p>
As media resides not only on Internet platforms but is most likely maintained
on a local computer at first, local organizational structures also cannot
easily be transferred to a tagging platform. The opposite holds as well: a
personomy maintained on a tagging platform cannot easily be reused on a desktop
computer.
</p>
<p>
Personomy reuse is currently not easily possible as each platform uses ad-hoc
solutions and only provides tag navigation within its own boundaries: there is
no standardization that regulates how tags and relations between tags, users,
and resources are represented. Due to that lack of standardization there are
further technical issues that become visible through the application
programming interfaces provided by some tagging platforms:
</p>
<ul>
<li>
Some platforms prohibit tags containing space characters, while others allow such
tags
</li>
<li>
Different platforms provide different functionality for organizing the tags
themselves, e.g. some platforms allow tags to be summarized in tag bundles
</li>
</ul>
<h4 id="tagging-solution">Possible Solutions</h4>
<p>
When it comes to interoperability, standards-based solutions have repeatedly
proven successful in bridging different systems. This could also be
the case here, as a standard for expressing personomies and folksonomies would
enable interoperability across platforms. On the other hand, use of a standard
should not force changes in the way tags are handled internally by each
system; it simply aims to function as a bridge between different systems. The
question, then, is: which standard?
</p>
<p>
We may be able to answer this question if we consider a personomy as a concept
scheme: tags used by an individual express his or her expertise, interests and
vocabulary, thus constituting the individual's own concept scheme. A recent W3C
standard that has been designed specifically to express the basic structure and
content of concept schemes is SKOS Core [<a href="#SKOS">SKOS</a>]. The SKOS Core Vocabulary is an
application of the Resource Description Framework (RDF), which can be used to
express a concept scheme as an RDF graph. Using RDF allows data to be linked to
and/or merged with other RDF data by semantic web applications.
</p>
<p>
Expressing personomies and folksonomies using SKOS is a good match for
promoting a standard representation for tags, as well as integrating tag
representation with Semantic Web standards: not only does it enable expression
of personomies in a standard format that fits semantically, but it also allows
mixing personomies with existing Semantic Web ontologies. There is already a
publicly available SKOS-based tagging ontology that can be used to build on
[<a href="#Newman">Newman</a>], as well as some existing efforts to induce an ontology from collaborative
tagging platforms [<a href="#Schmitz">Schmitz</a>].
</p>
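<p>A minimal sketch of a personomy expressed in SKOS Core is shown below, using
tags from the scenario above. The URIs are illustrative, and no relations
beyond scheme membership are asserted, reflecting the flat nature of tags.
</p>
<div class="exampleInner" style="clear: both">
<pre><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <!-- Mary's personomy as a concept scheme; each tag is a concept -->
  <skos:ConceptScheme rdf:about="http://example.org/mary/personomy"/>
  <skos:Concept rdf:about="http://example.org/mary/tags/holidays">
    <skos:prefLabel>holidays</skos:prefLabel>
    <skos:inScheme rdf:resource="http://example.org/mary/personomy"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/mary/tags/CameraXYZ">
    <skos:prefLabel>CameraXYZ</skos:prefLabel>
    <skos:inScheme rdf:resource="http://example.org/mary/personomy"/>
  </skos:Concept>
</rdf:RDF></pre>
</div>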
<p>Ideally, we would expect existing collaborative tagging platforms to build on
a standard representation for tags in order to enable interoperability, and to
offer this as a service to their users. In practice, however, even if such a
representation were eventually adopted as a standard, our expectation is that
there would be both technical and political reasons that could hinder
its adoption. A different strategy that may be able to deal with this issue
would then be to implement this as a separate service that integrates
disparate collaborative tagging platforms based on such an emerging standard
for tag representation, in the spirit of Web 2.0 mashups. This service could
either be provided by a third party, or even be self-hosted by individual users,
in the spirit of [<a href="#Koivunen">Koivunen</a>, <a href="#Segawa">Segawa</a>].</p>
<!-- ======================================================================== -->
<h3>
<a name="semanticRetrieval">2.5 Use Case : Semantic Media Analysis for Intelligent Retrieval</a>
</h3>
<h4 id="retrieval-introduction">Introduction</h4>
<p>Semantic Media Analysis, seen from a multimedia retrieval perspective, is
equivalent to the automatic creation of semantic indices and annotations, based
on multimedia and domain ontologies, to enable intelligent, human-like multimedia
retrieval. An efficient multimedia retrieval system [<a href="#Naphade">Naphade</a>] must:
</p>
<ol type="i">
<li>
Be able to handle the semantics of the query,
</li>
<li>
Unify multiple modalities in a homogeneous framework, and
</li>
<li>
Abstract the relationship between low level media features and high level
semantic concepts to allow the user to query in terms of these concepts rather
than in terms of examples, i.e. introducing the notion of ontologies.
</li>
</ol>
<p>
This Use Case aims to pinpoint problems that arise in the effort to
automatically create semantic indices and annotations, in an attempt to bridge
the multimedia semantic gap, and to provide corresponding solutions using
Semantic Web technologies.
</p>
<p>
For multimedia data retrieval based only on low-level features, as in the case
of "querying by example" and of content-based retrieval paradigms and systems,
one gets, on the one hand, the advantage of an automatic computation of the
required low-level features; on the other hand, such a methodology lacks the
ability to respond to high-level, semantic-based queries, since it loses the
relation between low-level multimedia features (such as pitch or zero-crossing
rate in audio, color and shape in image and video, or the frequency of words in
text) and the high-level domain concepts that essentially characterize the
underlying knowledge in the data, which a human is capable of quickly grasping
but a machine is not. For this reason, an abstraction of high level multimedia
content descriptions and semantics is required, based on what can actually be
generated automatically, such as low-level features after low-level processing,
and on methods, tools and languages to represent the domain ontology and attain
the mapping between the two. The latter is needed so that semantic indices are
extracted as automatically as possible, rather than being produced manually,
which is a time-consuming and not always reliable task (it yields a lot of
subjective annotations). To avoid these limitations of manual semantic
annotation of multimedia data, metadata standards and ontologies (upper,
domain, etc.) have to be used and to interoperate. Thus, a requirement emerges
for multimedia semantics interoperability, to further enable the interoperation
of efficient solutions, considering the distributed nature of the Web and the
enormous amounts of multimedia data published there.
</p>
<p>
An example solution to the interoperability problem stated above is the MPEG-7
standard. MPEG-7, composed of various parts, defines both metadata descriptors
for structural and low-level aspects of multimedia documents, and high level
description schemes (Multimedia Description Schemes) for higher-level
descriptions, including the semantics of multimedia data. However, it does not
determine the mapping of the former to the latter based on the addressed
application domain. A number of publications have appeared that define an
MPEG-7 core ontology to address such issues. What is important is that MPEG-7
provides standardised descriptors, both low-level and high-level. The value
sets of those descriptions, along with a richer set of relationship
definitions, could form the necessary missing piece, together with the
knowledge discovery algorithms which would use these to extract semantic
descriptions and indices in an almost automatic way out of multimedia data. The
bottom line, thus, is that MPEG-7 metadata descriptions need to be properly
linked to domain-specific ontologies that model high-level semantics.
</p>
<p>
Furthermore, one should usually consider the multimodal nature of multimedia
data and content on the Web. The same concept may there be described by
different means, e.g. by news in text as well as by an image showing a snapshot
of what the news is reporting. Thus, since the provision of cross-linking
between different media types or corresponding modalities supports a rich scope
for inferring a semantic interpretation, interoperability between different
single-media schemes (audio ontology, text ontology, image ontology, video
ontology, etc.) is an important issue. This emerges from the need to homogenise
different single modalities, which:
</p>
<ol type="i">
<li>
Can infer particular high level semantics with different degrees of confidence
(e.g. relying more on audio than on text for inferring certain concepts),
</li>
<li>
Can be supported by world modelling (or ontologies) where different
relationships exist, e.g. in an image one can attribute spatial relationships
while in a video sequence spatio-temporal relationships can be attained, and
</li>
<li>
Can play different roles in a cross-modality fashion (which modality triggers
the other), e.g. to identify that a particular photo in a Web page depicts
person X, we first extract information from text on the person's identity and
thereafter cross-validate it with the corresponding information extracted from
the image.
</li>
</ol>
<p>Both of the above concerns, whether the single modality is tackled first or the
cross-modality (which essentially encapsulates the single modality), require
semantic interoperability that supports a knowledge representation of the
domain concepts and relationships, of the multimedia descriptors, and of the
cross-linking of both, as well as a multimedia analysis part combined with
modeling, inferencing and mining algorithms that can be directed towards
automatic semantics extraction from multimedia, to further enable efficient
semantic-based indexing and intelligent multimedia retrieval.
</p>
<h4 id="retrieval-scenario">Motivating Examples</h4>
<p>
In the following, current pitfalls with respect to the desired semantic
interoperability are illustrated via examples. The discussed pitfalls are not
the only ones; further discussion is therefore needed to cover the broad scope
of semantic multimedia analysis and retrieval.
</p>
<h5 id="retrieval-example1">Example 1: Single modality case: Lack of semantics in low-level descriptors</h5>
<p>
The linking of low-level features to high-level semantics can be achieved
following two main trends:
</p>
<ol type="i">
<li>
Using machine learning and mining techniques to infer the required mapping,
based on a basic knowledge representation of the concepts of the addressed
domain (usually low-to-medium level inferencing) and
</li>
<li>
Using ontology-driven approaches to both guide the semantic analysis and infer
high-level concepts using reasoning and logics. This trend can subsume the
first one, being further driven from medium-level semantics towards more
abstract domain concepts and relationships.
</li>
</ol>
<p>
In both trends, it is appropriate for granularity purposes to produce
concept/event detectors, which usually incorporate a training phase applied to
training feature sets for which ground truth is available (a priori knowledge
of the addressed concepts or events). This phase enables optimization of the
underlying artificial intelligence algorithms. Semantic interoperability cannot
be achieved by only exchanging low-level features, wrapped in standardised
metadata descriptors, between different users or applications, since formal
semantics are lacking. In particular, a set of low-level descriptors (e.g.
MPEG-7 audio descriptors) cannot be semantically meaningful, since it lacks an
intuitive interpretation at higher levels of knowledge; such descriptors have,
however, been used extensively in content-based retrieval that relies on
similarity measures. Low-level descriptors are represented as vectors of
numerical values and are thus useful for content-based multimedia retrieval
rather than for a semantic multimedia retrieval process.
</p>
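<p>
As a hedged illustration of such a concept detector, the following minimal
sketch (in Python, assuming the scikit-learn library and purely invented
descriptor values) shows the typical training phase: low-level descriptor
vectors with a priori ground-truth labels are used to optimise a supervised
classifier, which can then score unseen content.
</p>
<pre>
# Minimal concept-detector sketch; the descriptor values and labels are
# invented for illustration, and scikit-learn is an assumed dependency.
import numpy as np
from sklearn.svm import SVC

# Low-level descriptor vectors (e.g. colour/texture values) of training items.
X_train = np.array([[0.12, 0.80, 0.43],
                    [0.15, 0.75, 0.40],
                    [0.90, 0.10, 0.22],
                    [0.88, 0.05, 0.30]])
# A priori ground truth: 1 = concept present (e.g. "marathon"), 0 = absent.
y_train = np.array([1, 1, 0, 0])

# Training phase: optimise the underlying classifier on the labelled set.
detector = SVC(probability=True).fit(X_train, y_train)

# Apply the trained detector to a new, unlabelled descriptor vector.
x_new = np.array([[0.14, 0.78, 0.41]])
print(detector.predict(x_new), detector.predict_proba(x_new))
</pre>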
<p>
Furthermore, since an optimal set of low-level descriptors per target
application (be it music genre recognition or speaker indexing) can be
conceived only by multimedia analysis experts, this set has to remain
transparent to any other user. For example, although a non-expert user can
understand the color and shape of a particular object, he is unable to give
this object a suitable representation by selecting appropriate low-level
descriptors. It is obvious that low-level descriptors not only lack semantics
but also limit their direct use to people who have gained particular expertise
in multimedia analysis and multimedia characteristics.
</p>
<p>The problem raised by this example, which needs to be solved, is <em>how
low-level descriptors can be efficiently and automatically linked to, and
turned into, an exchangeable bag of semantics</em>.
</p>
<h5 id="retrieval-example2">Example 2: Multi-modality case: Fusion and interchange of semantics among media</h5>
<p>
In multimedia data and web content, cross-modality aspects are dominant, a
characteristic that can be efficiently exploited by semantic multimedia
analysis and retrieval when all modalities can be used to infer the same or
related concepts or events. One aspect, again motivated by the analysis part,
concerns the capture of particular concepts and relationships that require
prioritising the processing of modalities during their automatic extraction.
For example, to enhance recognition of the face of a particular person in an
image appearing in a Web page, which is actually a very difficult task, it
seems more natural and efficient to base inferencing initially on the textual
content, to locate the identity (name) of the person; thereafter, the results
can be validated or enhanced by related results from image analysis. Similar
multimodal media analysis benefits can be obtained by analysing synchronised
audio-visual content in order to semantically annotate it. The trends there
are:
</p>
<ol type="i">
<li>
To construct combined feature vectors from audio and visual features and feed
them to machine learning algorithms in order to extract combined semantics, and
</li>
<li>
To analyse each single modality separately, recognizing medium-level semantics
or the same concepts, and then fuse the analysis results (decision fusion),
usually in a weighted or ordered manner depending on how the single modalities
cross-relate to the same topic, either to improve the accuracy of the extracted
semantics or to enrich them towards higher-level semantics (a decision-fusion
sketch is given after this list).
</li>
</ol>
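<p>
The following minimal sketch illustrates the second trend (decision fusion)
under simplifying assumptions: each single-modality detector is taken to have
already produced a confidence score for the same concept, and the scores are
fused with weights reflecting how strongly each modality relates to the topic.
All names, scores and weights are hypothetical.
</p>
<pre>
# Hedged sketch of weighted decision fusion across modalities (Python).
# The per-modality confidences and weights below are purely illustrative.

def fuse_decisions(confidences, weights):
    """Weighted average of per-modality confidence scores in [0, 1]."""
    total = sum(weights[m] for m in confidences)
    return sum(confidences[m] * weights[m] for m in confidences) / total

# Confidence that a given concept is present, per modality.
confidences = {"audio": 0.9, "visual": 0.6, "text": 0.3}
# Relative importance of each modality for this concept (domain-dependent).
weights = {"audio": 0.5, "visual": 0.3, "text": 0.2}

fused = fuse_decisions(confidences, weights)
print("fused confidence:", fused)  # accept the concept above some threshold
</pre>
<p>
A weighted average is only one possible fusion rule; ordered (sequential)
fusion, in which one modality triggers the other, is equally plausible.
</p>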
<p>
For the sake of clarity, an example scenario is described in the following,
taken from the ‘sports’ domain and more specifically from ‘athletics’.
</p>
<p>
Let's assume that we need to semantically index and annotate, in as automatic
a way as possible, the web page shown in Figure 1, which is taken from the site
of the <a href="http://www.iaaf.org/">International Association of Athletics Federations</a>. The subject
of this page is "the victory of the athlete Reiko Sosa at the Tokyo marathon".
Let's try to answer the question: what analysis steps are required if we would
like to enable semantic retrieval results for the query "show me images with
the athlete Reiko Sosa"?
</p>
<p>
One might notice that for each image in this web page there is a caption which
includes very useful information about the content of the image, in particular
the persons appearing in it, i.e. structural (spatial) relations of the
media-rich web page contents. Therefore, it is important to identify the areas
of an image and the areas of a caption. Let's assume that we can detect those
areas (the details of how are not important here). We then proceed to extract
semantics from the textual content of the caption, which identifies:
</p>
<ul>
<li>
Person Names = {Naoko Takahashi, Reiko Sosa},
</li>
<li>
Places = {Tokyo},
</li>
<li>
Athletics type = {Women’s Marathon}, and
</li>
<li>
Activity = {runs away from} (see Figure 1, in yellow and blue color).
</li>
</ul>
<p>
In the case of the semantics extraction from images, we can identify the
following concepts and relationships:
</p>
<ul>
<li>
In the image at the upper part of the web page, we can detect the athletes'
faces, and from the spatial relationship of those faces we can identify which
face (athlete) takes the lead over the other. Using only the image, we cannot
conclude who each athlete is.
</li>
<li>
In the image at the lower part of the web page, we can identify after face
detection that a person is present, but we still cannot determine to whom this
face belongs.
</li>
</ul>
<p>
If we combine the semantics from the textual information in the captions with
the semantics from the images, we can give strong support to reasoning
mechanisms to reach the conclusion that "we have images with the athlete Reiko
Sosa". Moreover, in the case where several athletes appear, as in the upper
image of the web page, reasoning over the identified spatial relationship can
spot which of the two athletes is Reiko Sosa.
</p>
<div style="border: 1px solid gray; padding: 1%; margin: 1%">
<center>
<img src="retrieval-athletics.jpg" alt="Example of a web page about athletics"/>
<br/>
Figure 1: Example of a web page about athletics.
</center>
</div>
<p>
Another scenario involves multimodal analysis of audio-visual data, distributed
on the web or accessed through it from video archives, and concerns automatic
semantics extraction and annotation of video scenes related to violence, for
the further purposes of content filtering and parental control
[<a href="#Perperis">Perperis</a>]. The goal in this scenario is thus the
automatic identification and semantic classification of violent content, using
features extracted from the visual, auditory and textual modalities of
multimedia data.
</p>
<p>
Let's consider that we are trying to automatically identify violent scenes in
which a fight between two persons takes place with no weapons involved. The
low-level analysis will lead to different low-level descriptors for each
modality separately. For example, for the visual modality the analysis will
involve:
</p>
<ul>
<li>
Shot cut detection and video segmentation.
</li>
<li>
Human body recognition and motion analysis.
</li>
<li>
Human body parts recognition (arms, legs).
</li>
<li>
Human body parts movement and tracking (e.g. "fast horizontal hand movement").
</li>
<li>
Interpretation of simple "visual" events/concepts based on spatial and temporal
relations of identified objects (medium-level semantics).
</li>
</ul>
<p>
On the other hand, the analysis of the auditory modality will involve:
</p>
<ul>
<li>
Audio signal segmentation.
</li>
<li>
Segment classification in sound categories, including speech, silence, music,
scream, etc. which may relate to violence events or not (medium-level
semantics).
</li>
</ul>
<p>
Now, by fusing the medium-level semantics and results of the single-modality
analyses, taking into consideration spatio-temporal relations and behaviour
patterns, we can automatically extract (infer) higher-level semantics. For
example, the "punch" concept can be automatically extracted based on the
initial analysis results and on the sequence or synchronicity of detected audio
and visual events, such as two persons in the visual data, one moving towards
the other, while a punch sound and a scream of pain are detected in the audio
data.
</p>
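<p>
As a hedged illustration of such fusion, the sketch below encodes the "punch"
rule just described: medium-level events from each modality, stamped with time
intervals, are combined by a simple synchronicity test. The event names,
intervals and tolerance are assumptions made for the example, not part of any
standard.
</p>
<pre>
# Illustrative cross-modal rule for the "punch" concept (Python).

def overlaps(a, b, tolerance=0.5):
    """True if intervals a and b (start, end in seconds) roughly overlap."""
    latest_start = max(a[0], b[0])
    earliest_end = min(a[1], b[1])
    return earliest_end - latest_start >= -tolerance

# Medium-level events produced by the single-modality analyses (invented).
visual = {"person_approaches_person": (12.0, 13.5)}
audio = {"punch_sound": (13.2, 13.4), "scream": (13.4, 14.0)}

# Rule: an approach in the video synchronised with a punch sound and a scream.
if ("person_approaches_person" in visual
        and "punch_sound" in audio and "scream" in audio
        and overlaps(visual["person_approaches_person"], audio["punch_sound"])
        and overlaps(visual["person_approaches_person"], audio["scream"])):
    print("inferred high-level concept: punch")
</pre>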
<p>To fulfil scenarios such as those presented above, we must solve the problem
of <em>how to fuse and interchange semantics from different modalities</em>.
</p>
<h4 id="retrieval-solution">Possible Solutions</h4>
<h5 id="retrieval-solution1">Example 1</h5>
<p>
As mentioned in Example 1, semantics extraction can be achieved via concept
detectors after a training phase based upon feature sets. Towards this goal,
[<a href="#Asbach">Asbach</a>] recently suggested going from a low-level
description to a more semantic description by extending MPEG-7 to facilitate
the sharing of classifier parameters and class models. This is to be done by
representing the classification process in a standardised form. A classifier
description must specify on what kind of data it operates, and contain a
description of the feature extraction process, the transformation that
generates feature vectors, and a model that associates specific feature vector
values with an object class. For this, an upper ontology could be created,
called a classifier ontology, which could be linked to a multimedia core
ontology (e.g. the CIDOC CRM ontology), a visual descriptor ontology
[<a href="#VDO">VDO</a>] as well as a domain ontology. A similar approach is
followed by the method presented in [<a href="#Tsekeridou">Tsekeridou</a>],
where classifiers are used to recognize and model music genres for efficient
music retrieval, and description extensions are introduced to account for such
extended functionalities.
</p>
<p>
In these respects, the current use case relates to some extent to the
Algorithm Representation use case. However, the latter refers mainly to
general-purpose processing and analysis, and not to analysis and semantics
extraction based on classification and machine learning algorithms to enable
intelligent retrieval.
</p>
<p>In the proposed solution, the visual descriptor ontology consists of a
superset of the MPEG-7 descriptors, since the existing MPEG-7 descriptors
cannot always support an optimal feature set for a particular class.</p>
<p>A scenario that exemplifies the use of the above proposal is given in the
following. Maria is an architect who wishes to retrieve available multimedia
material of a particular architectural style, such as ‘Art Nouveau’, ‘Art
Deco’ or ‘Modern’, among the bulk of data that she has already stored using
her multimedia management software. Due to her particular interest, she plugs
in the ‘Art Nouveau classifier kit’, which enables the retrieval of all images
or videos that correspond to this particular style in visual form, non-visual
form or a combination of the two (e.g. a video exploring the House of V.
Horta, a major representative of the Art Nouveau style in Brussels, which
includes visual instances of the style as well as a narration about Art
Nouveau history).
</p>
<p>
Necessary attributes for the classifier ontology are estimated to be:
</p>
<ul>
<li>
The name and category of the Classifier
</li>
<li>
The list and types of input parameters
</li>
<li>
The output type
</li>
<li>
Limitations on data set, on value ranges for parameters, on processing time and
memory requirements
</li>
<li>
Performance metrics
</li>
<li>
Guidelines of use
</li>
<li>
Links to class models per domain/application and feature sets
</li>
</ul>
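<p>
A minimal sketch of what such a classifier description might look like is given
below. It encodes the attributes listed above as a plain data record; all
property names and values are purely illustrative, since no standardised
classifier ontology vocabulary exists yet.
</p>
<pre>
# Hedged sketch (Python): a classifier description carrying the attributes
# listed above. Names and values are illustrative, not a standard vocabulary.
art_nouveau_classifier = {
    "name": "ArtNouveauStyleClassifier",
    "category": "visual style classification",
    "inputs": [("image", "JPEG or PNG, RGB"),
               ("feature_set", "colour and edge descriptors")],
    "output": "confidence score in [0, 1] per style class",
    "limitations": {"min_image_size": "256x256 pixels",
                    "max_processing_time": "2 s per image",
                    "memory": "512 MB"},
    "performance": {"precision": 0.81, "recall": 0.74},  # held-out test set
    "guidelines": "Apply to architectural photographs; retrain otherwise.",
    "class_models": {"ArtNouveau": "http://example.org/models/art-nouveau-v1"},
}
</pre>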
<p>
In the above examples, the exchangeable bag of semantics is directly linked to
an exchangeable bag of supervised classifiers.
</p>
<h5 id="retrieval-solution1">Example 2</h5>
<p>
In this example, to support reasoning mechanisms, apart from the ontological
descriptions for each modality, there is a need for a cross-modality
ontological description which interconnects all possible relations from each
modality and constructs rules that are cross-modality specific. It is not clear
whether this can be achieved by an upper multimedia ontology or by a new
cross-modality ontology that strives towards a knowledge representation of all
possible media combinations. It is evident, though, that the cross-modality
ontology, along with the single-modality ones, greatly relates to the domain
ontology, i.e. to the application at hand.
</p>
<p>
Furthermore, in this new cross-modality ontology, special attention should be
paid to the representation of the priorities/ordering among modalities for any
multimodal concept (e.g. get textual semantics first in order to attach
semantics to an image). This translates into the construction of sequential
rules. However, there are cases where simultaneous semantic instances in
different modalities may lead to a higher level of semantics, so synchronicity
is also a relationship to be accounted for. Apart from the spatial, temporal or
spatio-temporal relationships that need to be accounted for, there is also the
issue of the importance of each modality for identifying a concept or semantic
event. This may be represented by means of weights.
</p>
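<p>
The sketch below illustrates how such a cross-modality description might
record, for each multimodal concept, the processing order of the modalities,
their weights and whether synchronicity between them is required. All names
and numbers are invented for illustration.
</p>
<pre>
# Hedged sketch (Python) of cross-modality knowledge for two concepts.
# "order" encodes sequential rules ("text first, then image"); "weights"
# encodes modality importance; "synchronous" flags synchronicity rules.
cross_modality = {
    "person_in_photo": {
        "order": ["text", "image"],     # extract the identity from text first
        "weights": {"text": 0.6, "image": 0.4},
        "synchronous": False,
    },
    "punch": {
        "order": ["visual", "audio"],
        "weights": {"visual": 0.5, "audio": 0.5},
        "synchronous": True,            # events must co-occur in time
    },
}
</pre>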
<p>The solution thus also involves relating the visual, audio and textual
descriptor ontologies with a cross-modality ontology capturing their
inter-relations, as well as with a domain ontology representing the concepts
and relations of the application at hand.</p>
<!-- ======================================================================== -->
<h3>
<a name="algorithm">2.6 Use Case: Algorithm Representation</a>
</h3>
<h4 id="algorithm-introduction">Introduction</h4>
<p>The problem is that algorithms for image analysis are difficult to manage,
understand and apply, particularly for non-expert users. For instance, a
researcher needs to reduce the noise and improve the contrast in a radiology
image prior to analysis and interpretation but is unfamiliar with the specific
algorithms that could apply in this instance. In addition, many applications
require the processes applied to media to be concisely recorded for re-use,
re-evaluation or integration with other analysis data. Quantifying and
integrating knowledge about algorithms for media, particularly their visual
outcomes, is a challenging problem.
</p>
<h4 id="algorithm-solution">Solution</h4>
<p>Our proposed solution is to use an algorithm ontology to record and describe
available algorithms for application to image analysis. This ontology can then
be used to interactively build sequences of algorithms to achieve particular
outcomes. In addition, the record of processes applied to the source image can
be used to define the history and provenance of data.
</p>
<p>The algorithm ontology should consist of information such as:
</p>
<ul>
<li>
name
</li>
<li>
informal natural language description
</li>
<li>
formal description
</li>
<li>
input format
</li>
<li>
output format
</li>
<li>
example media prior to application
</li>
<li>
example media after application
</li>
<li>
goal of the algorithm
</li>
</ul>
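<p>
As a hedged illustration, the sketch below describes one algorithm with the
fields listed above as RDF triples, using the Python rdflib library. The
"algo" vocabulary and the Gaussian-filter values are assumptions made for the
example, not an existing ontology.
</p>
<pre>
# Hedged sketch: one algorithm record as RDF (rdflib assumed available).
from rdflib import Graph, Literal, Namespace, RDF

ALGO = Namespace("http://example.org/algorithm-ontology#")  # hypothetical
g = Graph()

gaussian = ALGO.GaussianFilter
g.add((gaussian, RDF.type, ALGO.FilterAlgorithm))
g.add((gaussian, ALGO.name, Literal("Gaussian smoothing filter")))
g.add((gaussian, ALGO.informalDescription,
       Literal("Reduces noise by convolving the image with a Gaussian kernel.")))
g.add((gaussian, ALGO.inputFormat, Literal("greyscale raster image")))
g.add((gaussian, ALGO.outputFormat, Literal("greyscale raster image")))
g.add((gaussian, ALGO.exampleBefore, Literal("noisy-xray.png")))
g.add((gaussian, ALGO.exampleAfter, Literal("smoothed-xray.png")))
g.add((gaussian, ALGO.goal, Literal("improve signal-to-noise ratio")))

print(g.serialize(format="turtle"))
</pre>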
<p>To achieve this solution we need:
</p>
<ul>
<li>
a sufficiently detailed and well-constructed algorithm ontology;
</li>
<li>
a core multimedia ontology;
</li>
<li>
domain ontologies; and
</li>
<li>
the underlying interchange framework supplied by semantic web technologies such
as XML and RDF.
</li>
</ul>
<p>The benefit of this approach is modularity: the use of independent
ontologies ensures usability and flexibility.
</p>
<h4 id="algorithm-soa">State of the Art and Challenges</h4>
<p>A taxonomy/thesaurus for image analysis algorithms that we are working on
already exists [<a href="#Asirelli">Asirelli</a>], but it is insufficient to support the required
functionality. We are collaborating on expanding this taxonomy and converting
it into an OWL ontology.
</p>
<p>The challenges are:
</p>
<ul>
<li>
to articulate and quantify the ‘visual’ result of applying algorithms;
</li>
<li>
to associate practical example media with the algorithms specified;
</li>
<li>
to integrate and harmonise the ontologies;
</li>
<li>
to reason with and apply the knowledge in the algorithm ontology (e.g. using
input and output formats to align processes).
</li>
</ul>
<h4 id="algorithm-applications">Possible Applications</h4>
<p>The formal representation of the semantics of algorithms enables recording of
provenance, provides reasoning capabilities, facilitates application and
supports interoperability of data. This is important in fields such as:
</p>
<ol type="1">
<li>
Smart assistance to support quality control and defect detection of complex,
composite, manufactured objects;
</li>
<li>
Biometrics (face recognition, human behaviour, etc.);
</li>
<li>
The composition of web services to automatically analyse media based on user
goals and preferences;
</li>
<li>
The formal definition of protocols and procedures in fields that are heavily
dependent upon media analysis, such as scientific or medical research.
</li>
</ol>
<p>These are applications that utilise media analysis and need to integrate
information from a range of sources. Often, recording the provenance of
conclusions and being able to duplicate and defend results are critical.
</p>
<p>For example, in the field of aeronautical engineering, aeroplanes are
constructed from components that are manufactured in many different locations.
Quality control and defect detection requires data from many disparate sources.
An inspector should understand the integrity of a component by acquiring local
data (images and others) and combining it with information from one or more
databases and possibly interaction with an expert.</p>
<h4 id="algorithm-example">Example</h4>
<div style="float: right; width: 45%; border: 1px solid gray; padding: 1%; margin: 1%">
<img src="algorithm.jpg" alt="Excerpt of an Algorithm Ontology"/>
<br/>
Excerpt of an Algorithm Ontology.
</div>
<p>Problem:
</p>
<ul>
<li style="list-style-type: none;">
Suggest possible clinical descriptors (pneumothorax) given a chest x-ray.
</li>
</ul>
<p>Hypothesis of solution:
</p>
<ul>
<li style="list-style-type: none;">
1) Get a digital chest x-ray of patient P (image A).</li>
<li style="list-style-type: none;">
2) Apply to image A a digital filter to improve the signal-to-noise ratio
(image B).</li>
<li style="list-style-type: none;">
3) Apply to image B a region detection algorithm, which segments image B into
a partition of homogeneous regions (image C).</li>
<li style="list-style-type: none;">
4) Apply to image C an algorithm that 'sorts' the regions, according to a
given criterion, by their geometrical and densitometric properties (from
largest to smallest, from darkest to lightest, etc.) (array D).</li>
<li style="list-style-type: none;">
5) Apply to array D an algorithm that searches a database of clinical
descriptors and detects the one that best fits the similarity criterion
(result E).</li>
</ul>
<p>However, we should consider the following aspects:
</p>
<ul>
<li style="list-style-type: none;">
step 2) Which digital filter should be applied to image A? We can consider
different kinds of filters (Fourier, Wiener, smoothing, etc.), each having
different input-output formats and giving slightly different results.</li>
<li style="list-style-type: none;">
step 3)
Which segmentation algorithm should be used? We can consider different
algorithms (clustering, histogram, homogeneity criterion, etc.).</li>
<li style="list-style-type: none;">
step 4) How can we define the geometrical and densitometric properties of the
regions? There are several possibilities, depending on the mathematical models
considered for describing closed curves (regions) and the grey-level
distribution inside each region (histogram, Gaussian-like, etc.).</li>
<li style="list-style-type: none;">
step 5) How can we define similarity
between patterns? There are multiple approaches that can be applied (vector
distance, probability, etc.).</li>
</ul>
<p>Each step could be influenced by the previous ones.
</p>
<p>Goal: to segment the chest x-ray image (task 3)
</p>
<p>A segmentation algorithm is selected. To be most effective this segmentation
algorithm requires a particular level of signal-to-noise ratio. This is defined
as the precondition (Algorithm.hasPrecondition) of the segmentation algorithm
(instanceOf.segmentationAlgoritm). To achieve this result a filter algorithm is
found (Gaussian.instanceOf.filterAlgorithm) which has the effect
(Algorithm.hasEffect) of improving the signal-to-noise ratio for images of the
same type as the chest x-ray image (Algorithm.hasInput). By comparing the
values of the precondition of the segmentation algorithm with the effect of the
filter algorithm we are able to decide on the best algorithms to achieve our
goal.
</p>
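<p>
A minimal sketch of this precondition/effect comparison is given below. The
algorithm records and the signal-to-noise values are invented for
illustration, but the selection logic follows the hasPrecondition/hasEffect
pattern just described.
</p>
<pre>
# Illustrative precondition/effect matching over hypothetical records (Python).
algorithms = [
    {"name": "GaussianFilter", "type": "filter",
     "hasInput": "chest x-ray", "hasEffect": {"snr_db": 30}},
    {"name": "MedianFilter", "type": "filter",
     "hasInput": "chest x-ray", "hasEffect": {"snr_db": 24}},
]
segmentation = {"name": "RegionSegmentation", "type": "segmentation",
                "hasPrecondition": {"snr_db": 28}}  # minimum SNR required

def satisfies(effect, precondition):
    """True if the effect meets every value the precondition demands."""
    return all(effect.get(k, 0) >= v for k, v in precondition.items())

# Select the filters whose effect satisfies the segmentation precondition.
candidates = [a for a in algorithms
              if a["type"] == "filter" and a["hasInput"] == "chest x-ray"
              and satisfies(a["hasEffect"], segmentation["hasPrecondition"])]
print("applicable filters:", [a["name"] for a in candidates])
</pre>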
<h4 id="algorithm-interoperability">Interoperability aspects</h4>
<p>Two types or levels of interoperability are to be considered:
</p>
<ul>
<li style="list-style-type: none;">
1) low-level interoperability, concerning data formats and algorithms, their
transition or selection aspects among the different steps and consequently the
possible related ontologies (algorithm ontology, media ontology);</li>
<li style="list-style-type: none;">
2) high-level interoperability, concerning the semantics underlying the domain
problem, that is, how similar problems (segment this image; improve image
quality) can be faced or even solved using codified 'experience' extracted
from well-known case studies.</li>
</ul>
<p>In our present use case proposal we focused our attention mainly on the
latter.
</p>
<p>Considering, for instance, the pneumothorax example, this can be studied
starting from a specific pre-analysed case in order to define a general
reference procedure: what happens if we have to study a pneumothorax case
starting from an arbitrary image of an actual patient? Simply applying the
general procedure will not, in general, give the right solution, because each
image (i.e. each patient) has its own specificity and the algorithms have to
be bound to the image type. Thus, the general procedure does not fit every
case, because the results depend on the image to be processed. Even in the
best case, the result would have to be supervised, and it would be necessary
to apply another algorithm to improve the result itself. High-level
interoperability would also involve a procedure able to keep track of a
specific result and of how it was obtained from a particular input.
</p>
<p>The open research questions that we are currently investigating relate to the
formal description of the values of effect and precondition and how these can
be compared and related. The interoperability of the media descriptions and
ability to describe visual features in a sufficiently abstract manner are key
requirements.</p>
<!-- ======================================================================== -->
<h2>
<a name="openIssues">3. Open Issues</a>
</h2>
<h3>
<a name="authoring">3.1 Semantics From Multimedia Authoring</a>
</h3>
<h4 id="authoring-introduction">Introduction</h4>
<p>Authoring of personalized multimedia content can be considered as a process
consisting of selecting, composing, and assembling media elements into coherent
multimedia presentations that meet the user’s or user group’s preferences,
interests, current situation, and environment. In the approaches we find today,
media items and semantically rich metadata information are used for the
selection and composition task.
</p>
<p>For example, Mary authors a multimedia birthday book for her daughter's 18th
birthday with some nice multimedia authoring tool. For this she selects images,
videos and audio from her personal media store, but also content from the Web
that is free or that she owns. The selection is based on the different metadata
and descriptions that come with the media, such as tags, descriptions, the time
stamp, the size, the location of the media item and so on. Mary then arranges
the selected media elements in a spatio-temporal presentation: a welcome title
first, and then, along "multimedia chapters", sequences and groups of images
interleaved with small videos. Music underlies the presentation. Mary arranges
and groups, adds comments and titles, resizes media elements, brings some media
to the front, moves others to the back. And then, finally, there is this great
birthday presentation that shows the years of her daughter's life. She presses
a button, creates a Flash presentation, and all the authoring semantics are
gone.
</p>
<h4 id="authoring-semantics">Lost multimedia semantics</h4>
<p>Metadata and semantics today are mainly seen at the monomedia level. Single
media elements such as images, videos and text are annotated and enriched with
metadata by different means, ranging from automatic annotation to manual
tagging. In a multimedia document, a set of media items typically comes
together and is arranged into a coherent story, with a spatial and temporal
layout of the time-continuous presentation, which often also allows user
interaction. The authored document is more than "just" the sum of its media
elements: it becomes a new document with its own semantics. However, in the way
we pursue multimedia authoring today, we do not care about, and hence lose, the
semantics that emerge from multimedia authoring.
</p>
<h5 id="authoring-composition">Multimedia authoring semantics do not "survive" the composition</h5>
<p>Thus, the most valuable semantics for the media elements and the resulting
multimedia content, which emerge with and in the authoring process, are not
considered any further. This means that the effort of semantically enriching
media content comes to a sudden halt in the created multimedia document, which
is very unfortunate. For example, for a multimedia presentation it could be
very helpful if an integrated annotation said something about the structure of
the presentation, the media items and formats used, the length of the
presentation, its degree of interactivity, the table of contents or index of
the presentation, a textual summary of the content, the targeted user group
and so on. Current authoring tools just use metadata to select media elements
and compose them into a multimedia presentation. They do not extract and
summarise the semantics that emerge from the authoring and add them to the
created document for later search, retrieval and presentation support.
</p>
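<p>
As a hedged illustration, the sketch below builds the kind of
presentation-level description listed above as RDF, using the Python rdflib
library. The "pres" vocabulary is hypothetical, since no standard scheme for
authoring semantics exists; Dublin Core terms are reused where they fit.
</p>
<pre>
# Hedged sketch: presentation-level annotation produced at authoring time.
# The "pres" vocabulary is invented; Dublin Core is reused where it fits.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DC

PRES = Namespace("http://example.org/presentation#")  # hypothetical
g = Graph()

doc = URIRef("http://example.org/birthday-book")
g.add((doc, DC.title, Literal("18th Birthday Book")))
g.add((doc, DC.description, Literal("Multimedia chapters on 18 years of a life.")))
g.add((doc, PRES.durationSeconds, Literal(540)))
g.add((doc, PRES.mediaFormats, Literal("JPEG images, MPEG-4 video, MP3 audio")))
g.add((doc, PRES.degreeOfInteractivity, Literal("low")))
g.add((doc, PRES.targetUserGroup, Literal("family and friends")))

print(g.serialize(format="turtle"))
</pre>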
<h5 id="authoring-usage">Multimedia content can learn from composition and media usage</h5>
<p>For example, Mary's media store could "learn" that some of the media items
seem to be more relevant than others. Additional comments on parts of the
presentation could also become new metadata entries for the media items.
Moreover, the metadata of the single media items, as well as of the
presentation itself, is not added to the presentation so that it could
afterwards be shared, searched and managed more easily.
</p>
<h4 id="authoring-interoperability">Interoperability problems</h4>
<p>Currently, multimedia documents do not come with a single annotation scheme.
SMIL offers the most advanced modelling of annotation: based on RDF, the head
of a SMIL document allows the addition of an RDF description of the
presentation to the structured multimedia document, and gives the author or
authoring tool a place to put the presentation's semantics. In specific domains
we find annotation schemes such as
<a href="http://ltsc.ieee.org/wg12/files/LOM_1484_12_1_v1_Final_Draft.pdf">LOM</a>
that provide the vocabulary for annotating Learning Objects, which are often
PowerPoint presentations or PDF documents but might well be multimedia
presentations.
<a href="http://www.dcs.shef.ac.uk/~ajay/html/cresearch.html">AKtive Media</a> is an ontology-based
multimedia annotation system (for images and text) which provides an interface
for adding ontology-based, free-text and relational annotations within
multimedia documents. Even though community effort will contribute to a more
or less unified set of tags, this does not ensure interoperability, search,
and exchange.
</p>
<h4 id="authoring-needs">What is needed</h4>
<p>A semantic description of a multimedia presentation should reveal the
semantics of its content as well as of its composition, such that a user can
search, reuse and integrate multimedia presentations on the Web into his or
her own system. A unified Semantic Web annotation scheme could then describe
the thousands of Flash presentations as well as PowerPoint presentations, but
also SMIL and SVG presentations. For existing presentations, this would give
authors a chance to annotate their presentations. For authoring tool creators,
it would give the chance to publish a standardised semantic presentation
description along with the presentation.
</p>
<!-- ======================================================================== -->
<h3>
<a name="multimedial">3.2 Building Multimedial Semantic Web Applications</a>
</h3>
<h4 id="multimedial-introduction">Introduction</h4>
<p>This use case is all about supporting the building of real, distributed
Semantic Web applications in the domain of multimedial content. It discusses
scalability and interoperability issues and tries to propose solutions to
lower the barrier to implementing such multimedial Semantic Web applications.
</p>
<h4 id="multimedial-motivation">Motivation</h4>
<p>Shirin is an IT manager at an NGO called FWW (Foundation for Wildlife in the
World) and wants to offer some new multimedial services to inform, alert, etc.
its members, e.g.:
</p>
<dl>
<dt>Track your animal godchild (TyAG)</dt>
<dd>
<p>A service that would allow a member to audio-visually track his godchild
(using geo-spatial services, camera, satellite, RFID :). DONald ATOR, a
contributor to FWW, is the godfather of a whale. Using the TyAG service he is
able to observe the route that his favorite whale takes (via <a href="http://www.geonames.org/">
Geonames</a>) and in case that the godchild is near a FWW-observing point,
Donald might also see some video footage. Currently the whales are somewhere
around <a href="http://www.geonames.org/maps/geonameId=3426256">Thule
Island</a>. TyAG allows Donald to ask questions like: <em>When</em> will the
whales be <em>in my region</em>? etc.</p>
</dd>
<dt>Video-news (vNews)</dt>
<dd>
<p>As Donald has gathered some good experiences with TyAG, he wants to be
informed about news, upcoming events, etc. w.r.t. <em>whales</em>. The backbone
of the vNews system is smart enough to understand that <em>whales</em> are a
kind of <em>animals that live in the water</em>. Any time a FWW member puts
some footage on the FWW-net that has some <em>water animals</em> in it, vNews -
using some automated feature extraction utils - offers it to Donald as well to
view it. <strong>Note:</strong> There might be a potential use of the outcome
of the <a href="http://www.w3.org/2005/Incubator/mmsem/wiki/News_Use_Case">News
Use Case</a> here.
</p>
</dd>
<dt>Interactive Annotation</dt>
<dd>
<p>A kind of video blogging [<a href="#Parker">Parker</a>] using vNews. Enables members to share thoughts about
endangered species etc. or to find out more information about a specific entity
in a (broadcasted) videostream. Therefore, vNews is able to automatically
segment its video-content and set up a list of <em>objects</em>, etc. For each
of the <em>objects</em> in a video, a user can get further information (by
linking it to Wikipedia, etc.) and share her thoughts about it with other
members of the vNews network.
</p>
</dd>
</dl>
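<p>
A minimal sketch of the subsumption step behind vNews is shown below. The tiny
class hierarchy is invented for illustration; in practice the same inference
would be drawn by an RDFS/OWL reasoner over the domain ontology.
</p>
<pre>
# Illustrative subsumption check (Python): footage tagged "whale" should
# match a subscription to "water animal". The hierarchy stands in for the
# domain ontology; a real system would use an RDFS/OWL reasoner instead.
SUBCLASS_OF = {
    "whale": "water animal",
    "dolphin": "water animal",
    "water animal": "animal",
}

def is_a(concept, target):
    """Walk the subclass chain to test whether concept is subsumed by target."""
    while concept is not None:
        if concept == target:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False

print(is_a("whale", "water animal"))  # True: offer the footage to Donald
</pre>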
<h4 id="multimedial-solution">Possible Solutions</h4>
<p>Common to all services listed above is an ample infrastructure that has to
deal with the following challenges:
</p>
<ul>
<li>
Using many different (multimedial) metadata (EXIF, GPS-data, etc.) as input, a
common internal representation has to be found (e.g. MPEG-7) -
INTEROPERABILITY.
</li>
<li>
For the domain (animals) a formal description needs to be defined - ONTOLOGY
ENGINEERING (also visual to entity mapping).
</li>
<li>
<p>Due to the vast amount of metadata, a scalable approach has to be taken that
can handle both <em>low-level</em> features (in MPEG-7) and <em>high-level</em>
features (in RDF/OWL) - SCALABILITY.
</p>
</li>
</ul>
<p>We now try to give possible answers to the questions listed above, to enable
Shirin to implement the services, in terms of:
</p>
<ul>
<li>
Giving hints, based on well-known ontology engineering methods, on how to
model the domain.
</li>
<li>
Giving support in selecting a representation that addresses both low-level and
high-level features.
</li>
<li>
Supplying a Semantic Web architect with an evaluation of RDF stores w.r.t.
multimedial metadata.</li>
</ul>
<!-- ======================================================================== -->
<h2>
<a name="framework">4. Common Framework</a>
</h2>
<p>In this section, we propose a common framework that seeks to provide both
syntactic (via RDF) and semantic interoperability. During FTF2, we identified
several layers of interoperability. Our methodology is simple: each use case
identifies a common ontology/schema to facilitate interoperability in its own
domain, and we then provide a simple framework to integrate and harmonise
these common ontologies/schemas from different domains. Furthermore, a simple
extensible mechanism is provided to accommodate other ontologies/schemas
related to the use cases we considered. Last but not least, the framework
includes some guidelines on which standard to use for specific tasks related
to the use cases.
</p>
<h3>
<a name="syntactic">4.1. Syntactic Interoperability: RDF</a>
</h3>
<p>
The Resource Description Framework (RDF) is a W3C Recommendation that provides
a standard for creating, exchanging and using annotations on the Semantic Web.
An RDF statement is of the form [subject property object .]. This simple and
general syntax makes RDF a good candidate for providing (at least) syntactic
interoperability.
</p>
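<p>
For instance, the following minimal sketch creates and serialises one such
statement with the Python rdflib library; the subject and property URIs are
illustrative only.
</p>
<pre>
# One RDF statement of the form [subject property object .]; URIs illustrative.
from rdflib import Graph, Literal, URIRef

g = Graph()
g.add((URIRef("http://example.org/image42"),                  # subject
       URIRef("http://purl.org/dc/elements/1.1/title"),       # property
       Literal("Tokyo Women's Marathon")))                    # object

print(g.serialize(format="nt"))
</pre>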
<h3>
<a name="layers">4.2. Layers of Interoperability</a>
</h3>
<p>
[Based on discussions in FTF2]
</p>
<h3>
<a name="common">4.3. Common Ontology/Schema</a>
</h3>
<p>
Each individual use case provides a common ontology/schema for its domain.
</p>
<h3>
<a name="ontology">4.4. Ontology/Schema Integration, Harmonisation and Extension</a>
</h3>
<p>
[Integrate and harmonise the common ontologies/schema presented in the
previous sub-section. Based on this, to provide a simple extensible mechanism.]
</p>
<h3>
<a name="guidelines">4.5. Guidelines</a>
</h3>
<p>
Each individual use case provides guidelines on which standard to use for
specific tasks related to that use case.
</p>
<!-- ======================================================================== -->
<h2>
<a name="conclusion">5. Conclusion</a>
</h2>
<!-- ======================================================================== -->
<h2>
<a name="references">6. References</a>
</h2>
<dl>
<dt>
<a id="Asbach" name="Asbach">[Asbach]</a>
</dt>
<dd><span class="title">Object detection and classification based on MPEG-7 descriptions – Technical study, use cases and business models</span>.
<span class="author">M. Asbach and J-R Ohm</span>.
ISO/IEC JTC1/SC29/WG11/MPEG2006/M13207, April 2006, Montreux, CH.
</dd>
<dt>
<a id="Asirelli" name="Asirelli">[Asirelli]</a>
</dt>
<dd><span class="title">An Infrastructure for MultiMedia Metadata Management</span>.
<span class="author">Patrizia Asirelli, Massimo Martinelli, Ovidio Salvetti</span>.
<i>In:</i> Proceedings of International SWAMM Workshop, 2006.
</dd>
<dt>
<a id="Broekstra" name="Broekstra">[Broekstra]</a>
</dt>
<dd><span class="title">Sesame: A generic archistecture for storing and querying RDF and RDF schema</span>.
<span class="author">J. Broekstra, A. Kampman and F. van Harmelen</span>.
<i>In:</i> Proceedings of <a href="http://iswc2002.semanticweb.org/">The International Semantic Web Conference 2002</a>
(pages 54-68), 2002, Sardinia
</dd>
<dt>
<a id="Declerck" name="Declerck">[Declerck]</a>
</dt>
<dd><span class="title">Translating XBRL Into Description Logic. An Approach Using Protege, Sesame & OWL</span>.
<span class="author">T. Declerck and H.-U Krieger</span>.
<i>In:</i> Proceedings of the 9th International Conference on Business Information Systems, 2006
</dd>
<dt>
<a id="DIG35" name="DIG35">[DIG35]</a>
</dt>
<dd>
Digital Imaging Group (DIG),
<a href="http://xml.coverpages.org/FU-Berlin-DIG35-v10-Sept00.pdf">DIG35 Specification - Metadata for Digital Images - Version 1.0 August 30, 2000</a>
</dd>
<dt>
<a id="DublinCore" name="DublinCore">[Dublin Core]</a>
</dt>
<dd>
The Dublin Core Metadata Initiative,
<a href="http://dublincore.org/documents/dces/">Dublin Core Metadata Element Set</a>, Version 1.1: Reference Description
</dd>
<dt>
<a id="EBU" name="EBU">[EBU]</a>
</dt>
<dd>
European Broadcasting Union,
<a href="http://www.ebu.ch/">http://www.ebu.ch/</a>
</dd>
<dt>
<a id="Exif" name="Exif">[Exif]</a>
</dt>
<dd>
Standard of Japan Electronics and Information Technology Industries Association,
<a href="http://www.digicamsoft.com/exif22/exif22/html/exif22_1.htm">Exchangeable image file format for digital still cameras: Exif Version 2.2</a>
</dd>
<dt>
<a id="Flickr" name="Flickr">[Flickr]</a>
</dt>
<dd>
<a href="http://www.flickr.com/">Flickr</a> online photo management and sharing application,
<a href="http://www.flickr.com/">http://www.flickr.com/</a>, Yahoo! Inc, USA
</dd>
<dt>
<a id="fotocommunity" name="fotocommunity">[Foto Community]</a>
</dt>
<dd>
Foto Community,
<a href="http://www.fotocommunity.com/">http://www.fotocommunity.com/</a>
</dd>
<dt>
<a id="GFK2006" name="GFK2006">[GFK]</a>
</dt>
<dd><span class="title">Usage behavior digital photography</span>.
GfK Group for CeWe Color, 2006
</dd>
<dt>
<a id="Hildebrand" name="Hildebrand">[Hildebrand]</a>
</dt>
<dd><span class="title"><a href="http://dx.doi.org/10.1007/11926078_20">/facet: A browser for heterogeneous semantic web repositories</a></span>.
<span class="author">Michiel Hildebrand, Jacco van Ossenbruggen, and Lynda Hardman</span>.
<i>In:</i> <a href="http://iswc2006.semanticweb.org/">The Semantic Web - ISWC 2006</a> (pages 272-285), November 2006, Athens, USA.
</dd>
<dt>
<a id="HitWise" name="HitWise">[HitWise]</a>
</dt>
<dd>
HitWise Intelligence,
<a href="http://weblogs.hitwise.com/leeann-prescott/2006/08/delicious_traffic_more_than_do.html">
Del.icio.us Traffic More Than Doubled Since January</a>
</dd>
<dt>
<a id="Hotho" name="Hotho">[Hotho]</a>
</dt>
<dd><span class="title">Information Retrieval in Folksonomies: Search and Ranking</span>.
<span class="author">A. Hotho, R. Jaschke, C. Schmitz and G. Stumme</span>.
<i>In:</i> The 3rd European Semantic Web Conference (ESWC), 2006 Budva, Montenegro.
</dd>
<dt>
<a id="IIM" name="IIM">[IIM]</a>
</dt>
<dd><span class="title">Information Interchange Model</span>,
<a href="http://www.iptc.org/IIM/">http://www.iptc.org/IIM/</a>,
International Press Telecommunication Council (IPTC)
</dd>
<dt>
<a id="Koivunen" name="Koivunen">[Koivunen]</a>
</dt>
<dd><span class="title"><a href="http://www.annotea.org/eswc2005/01_koivunen_final.pdf">
Annotea and Semantic Web Supported Collaboration</a></span>.
<span class="author">M. Koivunen</span>.
<i>In:</i> Proceedings of the European Semantic Web Conference (ESWC), Crete, 2005
</dd>
<dt>
<a id="Knublauch" name="Knublauch">[Knublauch]</a>
</dt>
<dd><span class="title">Editing description logic ontologies with the Protege OWL plugin</span>.
<span class="author">H. Knublauch, M.A. Musen and A.L. Rector</span>.
<i>In:</i> Proceedings of the International Workshop on Description Logics (DL), 2004
</dd>
<dt>
<a id="Lara" name="Lara">[Lara]</a>
</dt>
<dd><span class="title">XBRL Taxonomies and OWL Ontologies for Investment Funds</span>.
<span class="author">R. Lara, I. Cantador and P. Castells</span>,
<i>In:</i> ER (Workshops), (pages 271-280), 2006
</dd>
<dt>
<a id="Mathes" name="Mathes">[Mathes]</a>
</dt>
<dd><span class="title"><a href="http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html">
Folksonomies - Cooperative Classification and Communication Through Shared Metadata</a></span>.
<span class="author">A. Mathes</span>,
Computer Mediated Communication - LIS590CMC, Graduate School of Library and Information Science,
University of Illinois Urbana-Champaign, 2004
</dd>
<dt>
<a id="Mejias" name="Mejias">[Mejias]</a>
</dt>
<dd><span class="title">Tag literacy</span>.
<span class="author">Ulises Ali Mejias</span>,
<a href="http://ideant.typepad.com/ideant/2005/04/tag_literacy.html">http://ideant.typepad.com/ideant/2005/04/tag_literacy.html</a>,
2005
</dd>
<dt>
<a id="MPEG-7" name="MPEG-7">[MPEG-7]</a>
</dt>
<dd>
Information Technology - Multimedia Content Description Interface (MPEG-7).
Standard No. ISO/IEC 15938:2001, International Organization for Standardization(ISO), 2001
</dd>
<dt><a id="MMSEM-Image" name="MMSEM-Image"></a>[MMSEM Image]</dt>
<dd>
<cite>
<a href="http://www.w3.org/2005/Incubator/mmsem/XGR-image-annotation-20070814/">
Image Annotation on the Semantic Web</a>
</cite>, Raphaël Troncy, Jacco van Ossenbruggen, Jeff Z. Pan and Giorgos Stamou,
Multimedia Semantics Incubator Group Report (XGR), 14 August 2007,
<a href="http://www.w3.org/2005/Incubator/mmsem/XGR-image-annotation/">http://www.w3.org/2005/Incubator/mmsem/XGR-image-annotation/</a>
</dd>
<dt>
<a id="Naphade" name="Naphade">[Naphade]</a>
</dt>
<dd>
Extracting semantics from audiovisual content: The final frontier in multimedia retrieval,
<span class="author">N. Naphade and T. Huang</span>.
<i>In:</i> IEEE Transactions on Neural Networks, vol. 13, No. 4, 2002.
</dd>
<dt>
<a id="NetRatings" name="NetRatings">[NetRatings]</a>
</dt>
<dd>
Nielsen/NetRatings,
<a href="http://www.nielsen-netratings.com/pr/PR_060810.PDF">
User-generated content drives half of US Top 10 fastest growing web brands</a>
</dd>
<dt>
<a id="Newman" name="Newman">[Newman]</a>
</dt>
<dd>
Richard Newman, Danny Ayers and Seth Russell.
Tag Ontology, <a href="http://www.holygoat.co.uk/owl/redwood/0.1/tags/">http://www.holygoat.co.uk/owl/redwood/0.1/tags/</a>
</dd>
<dt>
<a id="NewsML" name="NewsML">[NewsML-G2]</a>
</dt>
<dd>
IPTC,
<a href="http://www.iptc.org/NAR/">News Architecture (NAR) for G2-Standards Specifications (released 30th May, 2007)</a>
</dd>
<dt>
<a id="NewsCodes" name="NewsCodes">[NewsCodes]</a>
</dt>
<dd>
NewsCodes - Metadata taxonomies for the news industry,
<a href="http://www.iptc.org/NewsCodes/">http://www.iptc.org/NewsCodes/</a>
</dd>
<dt>
<a name="OWL" id="OWL">[OWL]</a>
</dt>
<dd>
<cite>
<a href="http://www.w3.org/TR/2004/REC-owl-ref-20040210/">
OWL Web Ontology Language Reference</a></cite>, S. Bechhofer, F. van Harmelen, J. Hendler, I. Horrocks,
D.L. McGuinness, P.F. Patel-Schneider and L.A. Stein, Editors, W3C
Recommendation, 10 February 2004,
<a href="http://www.w3.org/TR/owl-ref/">http://www.w3.org/TR/owl-guide/</a>
</dd>
<dt>
<a id="Pachet" name="Pachet">[Pachet]</a>
</dt>
<dd>
Knowledge Management and Musical Metadata.
F. Pachet, Encyclopedia of Knowledge Management, Schwartz, D. Ed. Idea Group, 2005
</dd>
<dt>
<a id="Parker" name="Parker">[Parker]</a>
</dt>
<dd>
<a href="http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1423925">Video blogging: Content to the max</a>,
<span class="author">C. Parker and S. Pfeiffer</span>.
IEEE MultiMedia, vol. 12, no. 2, pp. 4-8, 2005
</dd>
<dt>
<a id="Perperis" name="Perperis">[Perperis]</a>
</dt>
<dd>
Automatic Identification in Video Data of Dangerous to Vulnerable Groups of Users Content,
<span class="author">T. Perperis and S. Tsekeridou</span>.
Presentation at SSMS2006, Halkidiki, Greece, 2006
</dd>
<dt>
<a id="PhotoRDF" name="PhotoRDF">[PhotoRDF]</a>
</dt>
<dd>
W3C Note 19 April 2002,
<a href="http://www.w3.org/TR/2002/NOTE-photo-rdf-20020419">Describing and retrieving photos using RDF and HTTP</a>
</dd>
<dt>
<a id="Riya" name="Riya">[Riya]</a>
</dt>
<dd>
Riya Foto Search,
<a href="http://www.riya.com/">http://www.riya.com/</a>
</dd>
<dt>
<a id="Segawa" name="Segawa">[Segawa]</a>
</dt>
<dd>
<a href="http://doi.acm.org/10.1145/1135777.1135910">Web annotation sharing using P2P</a>,
<span class="author">O. Segawa</span>.
<i>In:</i> Proceedings of the 15th International Conference on World Wide Web, pages 851-852, Edinburgh, Scotland, 2006.
</dd>
<dt>
<a id="Smith" name="Smith">[Smith]</a>
</dt>
<dd>
<a href="http://atomiq.org/archives/2004/08/folksonomy_social_classification.html">
Atomiq: Folksonomy: social classification</a>,
<span class="author">G. Smith</span>, August 2004.
</dd>
<dt>
<a id="Schmitz" name="Schmitz">[Schmitz]</a>
</dt>
<dd>
<a href="http://www.rawsugar.com/www2006/22.pdf">Inducing Ontology from Flickr Tags</a>,
<span class="author">P. Schmitz</span>.
<i>In:</i> Collaborative Web Tagging Workshop at WWW2006, Edinburgh, Scotland, 2006.
</dd>
<dt>
<a name="SKOS" id="SKOS">[SKOS]</a>
</dt>
<dd>
SKOS Core, <a href="http://www.w3.org/2004/02/skos/core/">http://www.w3.org/2004/02/skos/core/</a>
</dd>
<dt>
<a id="Trant" name="Trant">[Trant]</a>
</dt>
<dd>
Exploring the potential for social tagging and folksonomy in art museums: proof of concept,
<span class="author">J. Trant</span>.
<i>In:</i> New Review of Hypermedia and Multimedia, 2006
</dd>
<dt>
<a id="Tsekeridou" name="Tsekeridou">[Tsekeridou]</a>
</dt>
<dd>
MPEG-7 based Music Metadata Extensions for Traditional Greek Music Retrieval,
<span class="author">S. Tsekeridou, A. Kokonozi, K. Stavroglou and C. Chamzas</span>.
<i>In:</i> IAPR Workshop on Multimedia Content Representation, Classification and Security, Istanbul, Turkey, September 2006
</dd>
<dt>
<a id="VDO" name="VDO">[VDO]</a>
</dt>
<dd>
aceMedia Visual Descriptor Ontology, <a
href="http://www.acemedia.org/aceMedia/reference/resource/index.html">
http://www.acemedia.org/aceMedia/reference/resource/index.html</a>
</dd>
<dt>
<a id="XBRL" name="XBRL">[XBRL]</a>
</dt>
<dd>
XBRL - eXtensible Business Reporting Language,
<a href="http://www.xbrl.org/Home/">http://www.xbrl.org/Home/</a>, see also
<a href="http://www.tbray.org/ongoing/When/200x/2006/10/04/XBRL-RSS">Tim Bray's blog</a>
</dd>
<dt>
<a id="XMP" name="XMP">[XMP]</a>
</dt>
<dd>
Adobe,
<a href="http://partners.adobe.com/public/developer/en/xmp/sdk/XMPspecification.pdf">XMP Specification</a>
</dd>
</dl>
<!-- ======================================================================== -->
<h2>
<a id="acknowledgments" name="acknowledgments">Acknowledgments</a>
</h2>
<p>
The editors would like to thank all the contributors for the authoring of the
use cases (Melliyal Annamalai, George Anadiotis, Patrizia Asirelli, Susanne Boll, Oscar Celma,
Thierry Declerck, Thomas Franz, Christian Halaschek-Wiener, Michael Hausenblas, Michiel Hildebrand,
Suzanne Little, Erik Mannens, Massimo Martinelli, Ioannis Pratikakis, Ovidio
Salvetti, Sofia Tsekeridou, Giovanni Tummarello) and the XG members for their
feedback on earlier versions of this document.
</p>
<hr />
<p>$Id: Overview.html,v 1.8 2007/08/14 23:54:38 rtroncy Exp $</p>
<!-- ======================================================================== -->
</body>
</html>