<?xml version="1.0" encoding="UTF-8"?><!--*- nxml -*-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Gleaning Resource Descriptions from Dialects of Languages
  (GRDDL)</title>
  <meta name="RCS-Id" content="$Id: spec.html,v 1.292 2008/09/08 13:42:19 connolly Exp $"/>
  <style type="text/css">
.issue {
  background-color:#dfd;
  border: thin solid black;
  color:black;
}

.assertion {
  background-color:#dfd;
  color:black;
}

.ed {
  background-color:#fdf;
  border: thin solid black;
  color:black;
}

.postponed {
  background-color:#fee;
  border: thin dotted black;
  color:black;
}

.tech {
  background-color:#fdd;
  border: thin solid black;
  color:black;
  font-size: 80%
}

.designSketch {
  background-color:#fdf;
  border: thin solid black; 
  color:black;
}

.illustration {
 margin-left:auto;
 margin-right:auto;
 text-align:center; 
}

.example {
 margin-left:auto;
 margin-right:auto;
 padding-top:0.5em;
 padding-bottom:0.5em;
 width:85%;
 border-top:thin dashed black;
 border-bottom:thin dashed black;
}

td pre { font-size: smaller }

dfn { font-weight: bold }

/* try to get coherence between the rule boxes */
table tr td.assertion { width: 500px }

</style>
<link rel="stylesheet" type="text/css" href="http://www.w3.org/StyleSheets/TR/base" />
<!-- @@PUBFIX
<link rel="stylesheet" type="text/css" href="http://www.w3.org/StyleSheets/TR/W3C-PR" />
-->
</head>

<body xml:lang="en" lang="en">

<div class="head">
<a href="http://www.w3.org/"><img alt="W3C"
src="http://www.w3.org/Icons/w3c_home" height="48" width="72" /></a>

<h1>Gleaning Resource Descriptions from Dialects of Languages (GRDDL)</h1>

<h2><strong>Editor's draft, obsoleted by
<a href="http://www.w3.org/TR/grddl/">the official W3C GRDDL specification</a>
and <a href="http://www.w3.org/2003/g/data-view">the GRDDL
homepage and namespace document</a>
</strong>
</h2>
<dl>
  <dt>This Version:</dt>
    <dd>$Revision: 1.292 $ of $Date: 2008/09/08 13:42:19 $</dd>
<!--  <dt>Previous Version:</dt>
    <dd><a href="http://www.w3.org/TR/2007/CR-grddl-20070502/">http://www.w3.org/TR/2007/CR-grddl-20070502/</a>
</dd>-->
  <dt>Editor:</dt>
    <dd><a
      href="/People/Connolly/">Dan Connolly</a></dd>
  <dt>Authors:</dt>
    <dd>see <a href="#changes">Acknowledgments</a></dd>
</dl>

<p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> &#169; 2006-2007 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>&#174;</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p>

</div>
<hr />

<div><h2>Abstract</h2>

<p>GRDDL is a mechanism for <b>G</b>leaning <b>R</b>esource
<b>D</b>escriptions from <b>D</b>ialects of <b>L</b>anguages. This
GRDDL specification introduces markup based on existing standards for
declaring that an XML document includes data compatible with the
Resource Description Framework (RDF) and for linking to algorithms
(typically represented in XSLT) for extracting this data from the
document.</p>

<p>The markup includes a namespace-qualified attribute for use
in general-purpose XML documents and a profile-qualified
link relationship for use in valid XHTML documents. The GRDDL
mechanism also allows an XML namespace document
(or XHTML profile document) to declare that every document associated
with that namespace (or profile) includes gleanable data, and to
link to an algorithm for gleaning the data.</p>

<p>A corresponding <a href="#usecases">GRDDL Use Case Working
Draft</a> provides motivating examples.  A <a href="#primer">GRDDL
Primer</a> demonstrates the mechanism on XHTML documents which include
widely-deployed dialects known as microformats. A
<a href="#GRDDL-TESTS">GRDDL Test Cases</a> document illustrates
specific issues in this design and provides materials to
aid in test-driven development of GRDDL-aware agents.
</p>

</div>


<div>
<h2 id="status">Status of This Document</h2>

<p>This was an editor's draft.
  <span
class="ed">Any editorial notes and TODOs are styled this way.</span>
</p>
<!--
The following status is for an upcoming publication:</p>

<blockquote>
<p><em>This section describes the status of this document at the time
of its publication. Other documents may supersede this document. A
list of current W3C publications and the latest revision of this
technical report can be found in the <a
href="http://www.w3.org/TR/">W3C technical reports index</a> at
http://www.w3.org/TR/.</em></p>

<p>This is a <a
href="http://www.w3.org/2005/10/Process-20051014/tr.html#RecsPR">Proposed
Recommendation</a> of the GRDDL specification. The W3C Membership and
other interested parties are invited to review the document through 24
August 2007.</p>

<p>This document was produced by <a
href="http://www.w3.org/2001/sw/grddl-wg/">GRDDL Working Group</a>,
which is part of the <a href="http://www.w3.org/2001/sw/Activity">W3C
Semantic Web Activity</a>.  The first release of this document as a
Working Draft was 24 Oct 2006 and the Working Group has made its best
effort to address <a href=
"http://lists.w3.org/Archives/Public/public-grddl-comments/">comments
received</a> since then and has resolved a number of <a
href="http://www.w3.org/2001/sw/grddl-wg/issues">issues</a> meanwhile.
<span class="assertion" id="sotd_ex">Normative assertions are marked
up in this way.</span> A Last Call period for substantive technical
comments ended 31 May 2007.  A <a
href="#changes">change log</a> is appended,
detailing editorial changes since then.
</p>

<p id="implExp">The Working Group's <a
href="http://www.w3.org/2001/sw/grddl-wg/td/test_results">implementation
report</a> demonstrates that the goals for interoperable
implementations, set in the <a
href="http://www.w3.org/TR/2007/CR-grddl-20070502/">May 2007 Candidate
Recommendation draft of this document</a>, were achieved.</p>

  <p>GRDDL is intended to contribute to addressing Web Architecture
  issues such as <a href=
  "http://www.w3.org/2001/tag/issues.html?type=1#RDFinXHTML-35"
  >RDFinXHTML-35</a>, <a href=
  "http://www.w3.org/2001/tag/issues.html?type=1#namespaceDocument-8"
  >namespaceDocument-8</a>, and
<a href=
  "http://www.w3.org/2001/tag/issues.html?type=1#xmlFunctions-34"
  >xmlFunctions-34</a> as well as issues postponed by the RDF Core
  working group such as <a href=
  "http://www.w3.org/2000/03/rdf-tracking/#rdfms-validating-embedded-rdf"
  >rdfms-validating-embedded-rdf</a> and <a href=
  "http://www.w3.org/2000/03/rdf-tracking/#faq-html-compliance"
  >faq-html-compliance</a>.

  <span class="postponed">In particular, the GRDDL Working Group has
  postponed <a
  href="http://www.w3.org/2001/sw/grddl-wg/issues#issue-faithful-infoset">issue-faithful-infoset</a>,
  and anticipates that the resolution of TAG issue <a
  href="http://www.w3.org/2001/tag/issues.html?type=1#xmlFunctions-34"
  >xmlFunctions-34</a> will provide further clarification and
  guidance.</span>
  </p>


  <p>Aside from <a
  href="http://www.w3.org/2002/09/wbs/myQuestionnaires">formal
  membership reviews</a>, comments on this document should be sent to
  <a
  href="mailto:public-grddl-comments@w3.org">public-grddl-comments@w3.org</a>,
  a mailing list with a <a href=
  "http://lists.w3.org/Archives/Public/public-grddl-comments">public
  archive</a>.</p>


<p>Publication as a Proposed Recommendation does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.</p>

<p> This document was produced by a group operating 
under the <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5 February 2004 W3C Patent Policy</a>. 
W3C maintains a 
<a rel="disclosure" 
href="http://www.w3.org/2004/01/pp-impl/39407/status">
public list of any patent disclosures</a> made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential">Essential Claim(s)</a> must disclose the information in accordance with <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section 6 of the W3C Patent Policy</a>. </p>

</blockquote>
-->



<p>The <span id="issues">issues appendix</span> that used to
be part of this draft has been moved to a <a href="http://www.w3.org/2001/sw/grddl-wg/issues">Working
Group issues list</a>; specifically:

<a id="issue-whichlangs" href=
"http://www.w3.org/2001/sw/grddl-wg/issues#issue-whichlangs"
>issue-whichlangs</a>,
<a id="issue-output-formats" href=
"http://www.w3.org/2001/sw/grddl-wg/issues#issue-output-formats"
>issue-output-formats</a>,
<a id="issue-base-param" href=
"http://www.w3.org/2001/sw/grddl-wg/issues#issue-base-param"
>issue-base-param</a>,
<a id="issue-tx-element" href=
"http://www.w3.org/2001/sw/grddl-wg/issues#issue-tx-element"
>issue-tx-element</a>,
<a id="issue-html-nsdoc" href=
"http://www.w3.org/2001/sw/grddl-wg/issues#issue-html-nsdoc"
>issue-html-nsdoc</a>,
<a id="issue-faithful-infoset" href=
"http://www.w3.org/2001/sw/grddl-wg/issues#issue-faithful-infoset"
>issue-faithful-infoset</a>,
<a id="issue-mt-ns" href=
"http://www.w3.org/2001/sw/grddl-wg/issues#issue-mt-ns"
>issue-mt-ns</a>,
<a id="issue-conformance-labels" href=
"http://www.w3.org/2001/sw/grddl-wg/issues#issue-conformance-labels"
>issue-conformance-labels</a>,
<a id="issue-http-header-links" href=
"http://www.w3.org/2001/sw/grddl-wg/issues#issue-http-header-links"
>issue-http-header-links</a>
</p>

<!-- 
<p>
<span class="ed">Each assertion bears an ID. An index of rules would
be nice to have; in the interest of stability, the editor is not
adding it just yet.  An <a href="spec_lean">extract of normative
material only</a> has been put on hold indefinitely.</span>
</p>
-->

</div>

<div>
<h2 id="toc">Table of Contents</h2>
<ol>
  <li><a href="#intro">Introduction</a></li>
  <li><a href="#grddl-xml">Adding GRDDL to well-formed XML</a></li>
  <li><a href="#ns-bind">GRDDL for XML Namespaces</a></li>
  <li><a href="#grddl-xhtml">Using GRDDL with valid XHTML</a></li>
  <li><a href="#profile-bind">GRDDL for HTML Profiles</a></li>
  <li><a href="#txforms">GRDDL Transformations</a></li>
  <li><a href="#sec_agt">GRDDL-Aware Agents</a></li>
  <li><a href="#sec">Security Considerations</a></li>
  <li><a href="#grddlvocab">The GRDDL Vocabulary</a></li>
  <li><a href="#bib">References</a></li>
</ol>
<ul>
  <li>Appendix: <a href="#stylepi">Transformations for Styling versus
  data extraction</a></li>
  <li>Appendix: <a href="#base_misc">Base IRI considerations</a></li>
  <li>Appendix: <a href="#changes">Acknowledgements and Change History</a></li>
</ul>

<div>Linked documents:</div>
<ul>
  <li>Appendix: <a id="mechspec" href="spec_rules"
  >About the Mechanical Rules</a></li>
</ul>

</div>

<div>
<h2 id="intro"><span class="gen">1. </span>Introduction: Data and Documents</h2>

<p>There are many domain-specific languages ("dialects") used in
practice among the many XML documents on the web.  There are dialects
of XHTML, XML and RDF that are used to represent everything from
poetry to prose, purchase orders to invoices, spreadsheets to
databases, schemas to scripts, and linked lists to ontologies.</p>

<p>While this breadth of expression is quite liberating, inspiring new
dialects to represent information, it can 
be a barrier to understanding across different domains or
fields. How, for example, does software discover the author of a poem,
a spreadsheet and an ontology? And how can software determine whether
the authors of each are in fact the same?</p>

<p>The following are examples of how the same musical work might be
described in different XML dialects:</p>

<dl>
<dt>iTunes Music Library</dt>
<dd>
<pre>
&lt;key>Artist&lt;/key>
  &lt;string>The Jimi Hendrix Experience&lt;/string>
&lt;key>Album&lt;/key>
  &lt;string>Are You Experienced?&lt;/string>
</pre>
</dd>

<dt>Audioscrobbler</dt>
<dd>
<pre>
&lt;album>
    &lt;artist mbid="">The Jimi Hendrix Experience&lt;/artist>
    &lt;name>Are You Experienced?&lt;/name>
...
&lt;/album>
</pre>
</dd>

<dt>Atom</dt>
<dd>
<pre>
&lt;entry ... &gt;
&lt;title&gt;Are You Experienced?&lt;/title&gt;
&lt;author&gt;
&lt;name&gt;The Jimi Hendrix Experience&lt;/name&gt;
&lt;/author&gt;
...
&lt;/entry&gt;
</pre>
</dd>

<dt>Open Office</dt>
<dd><pre>
&lt;office:document-meta ... &gt;
&lt;office:meta&gt;
&lt;dc:title&gt;Are You Experienced?&lt;/dc:title&gt;
  &lt;meta:initial-creator&gt;
  The Jimi Hendrix Experience
  &lt;/meta:initial-creator&gt;
&lt;dc:creator>The Jimi Hendrix Experience&lt;/dc:creator&gt;
&lt;/office:meta&gt;
&lt;/office:document-meta&gt;
</pre>
</dd>
</dl>

<p>Although a human reader can see that the examples above encode the
same information, there is no clear mechanism by which computer
software can determine this connection.</p>

<h3 id="intro_rdf">Resource Descriptions</h3>

<p>The Resource Description Framework<a href="#RDFC04">[RDFC04]</a>
provides a standard for making statements about resources in the form
of a subject-predicate-object expression. One way to represent the
fact "<cite>Are You Experienced?</cite>'s artist is The Jimi Hendrix
Experience" in RDF would be as a triple whose subject is <cite>Are You
Experienced?</cite>, whose predicate is "has artist," and whose object
is The Jimi Hendrix Experience. The predicate, "has artist," expresses
a relationship between the subject (Are You Experienced?) and the
object (The Jimi Hendrix Experience).  Using URIs to uniquely identify
the album, the artist and even the relationship would facilitate
software design because not everyone knows The Jimi Hendrix Experience
or even spells its name consistently.</p>

<p>Here's the information contained in the XML fragments above, this
time expressed as RDF:</p>

<pre class="example">
&lt;rdf:RDF
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">

  &lt;rdf:Description rdf:about=
"http://musicbrainz.org/mm-2.1/album/6b050dcf-7ab1-456d-9e1b-c3c41c18eed2">
    &lt;dc:title>Are You Experienced?&lt;/dc:title>
    &lt;foaf:maker>
      &lt;foaf:Agent rdf:about=
  "http://musicbrainz.org/mm-2.1/artist/33b3c323-77c2-417c-a5b4-af7e6a111cc9">
        &lt;foaf:name>The Jimi Hendrix Experience&lt;/foaf:name>
      &lt;/foaf:Agent>
    &lt;/foaf:maker>

  &lt;/rdf:Description>
&lt;/rdf:RDF>
</pre>

<p>Both the entities (subject and object resources) and relationships
(predicates) are identified using unambiguous URIs.</p>
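
<p>As an informal illustration, the RDF/XML above can be read as four
subject-predicate-object triples. The following Python sketch lists
them. The names <code>Triple</code> and <code>Literal</code> are
illustrative assumptions and are not part of GRDDL or of any RDF
library. The <code>rdf:type</code> triple follows from the use of
<code>foaf:Agent</code> as a typed node element.</p>

<pre class="example">
from collections import namedtuple

# Illustrative containers only; a real application would use an RDF library.
Triple = namedtuple("Triple", "subject predicate object")
Literal = namedtuple("Literal", "value")  # a plain literal, as opposed to an IRI

DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

album = "http://musicbrainz.org/mm-2.1/album/6b050dcf-7ab1-456d-9e1b-c3c41c18eed2"
artist = "http://musicbrainz.org/mm-2.1/artist/33b3c323-77c2-417c-a5b4-af7e6a111cc9"

triples = [
    Triple(album, DC + "title", Literal("Are You Experienced?")),
    Triple(album, FOAF + "maker", artist),
    Triple(artist, RDF + "type", FOAF + "Agent"),
    Triple(artist, FOAF + "name", Literal("The Jimi Hendrix Experience")),
]

for t in triples:
    print(t.subject, t.predicate, t.object)
</pre>

<p>Because the album and the artist are named by IRIs rather than by
strings, two descriptions that use the same IRIs can be recognized as
describing the same resources, however the names are spelled.</p>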

<p><em>Note that GRDDL follows HTML 4, RDF, and XML Schema in using
<em>Internationalized Resource Identifiers</em>, i.e. IRIs<a
class="norm" href="#rfc3987">[RFC3987]</a>. While in informal usage,
this specification uses the more familiar term <q>URI</q>
interchangeably with the recently standardized term <q>IRI</q>, the
formal rules use the relevant terms precisely.</em>
</p>

<p>The publishers of the XML above could also provide the same data in
RDF using RDF/XML or one of the other RDF syntaxes.
GRDDL provides a relatively inexpensive mechanism for bootstrapping
RDF content from uniform XML dialects, shifting the burden from
formulating RDF to creating transformation algorithms specifically for
each dialect.
</p>

<p>GRDDL works by associating transformations for an
individual document, either through direct inclusion of references or
indirectly through profile and namespace documents. Content authors
can nominate the transformations for producing RDF from their content
and use GRDDL to refer to them. </p>


<div><h3 id="sec_rend">Faithful Renditions</h3>

<p>By specifying a GRDDL transformation, the author of a document
states that the transformation will provide a faithful rendition in
RDF of information (or some portion of the information) expressed
through the XML dialect used in the source document.</p>

<p>Likewise, by specifying a GRDDL namespace transformation or profile
transformation, the creator of that namespace or profile states that
the transformation will provide a faithful RDF rendition of a class of
source documents which relate to that namespace or profile. A
namespace document or a profile document also provides a means for
its author to explain in prose the purpose of the transformation or
any policy statements.</p>

</div>

<div><h3 id="intro_spec">Preface and Companion Documents</h3>

<p>This GRDDL specification is a concise technical specification of
the GRDDL mechanism and its XML syntax. It specifies the GRDDL syntax
to use in valid XHTML and well-formed XML documents, as well as how to
encode GRDDL into namespaces and HTML profiles. It also covers the
GRDDL transformation link and security
issues. Appendices provide links to extended examples and existing
software and services that employ GRDDL.</p>

<h4 id="intro_primer">GRDDL Primer</h4>

<p>The GRDDL Primer<a href="#primer">[primer]</a> is a step-by-step tutorial on
the GRDDL mechanism.  It develops a number of examples from the
GRDDL Use Cases document to illustrate GRDDL techniques for
associating documents with transformations for extracting RDF.</p>

<h4 id="intro_uc">GRDDL Use Cases</h4>

<p>The use cases document<a href="#usecases">[usecases]</a> collects a
number of use cases with their goals and requirements for
GRDDL.
These use cases also illustrate how XML and XHTML documents can be
decorated with microformats, Embedded RDF or RDFa statements so that
GRDDL transformations can extract data, which can
then be used to automate a variety of tasks.</p>

<h4 id="intro_testcases">GRDDL Test Cases</h4>

<p>The GRDDL Test Cases<a class="inform" href="#GRDDL-TESTS">[GRDDL-TESTS]</a>
provides a collection of tests illustrating this specification.
Some of the tests may help clarify the intended
reading of the normative text.</p>
</div>
</div>

<div><h2 id="grddl-xml"><span class="gen">2. </span>Adding GRDDL to well-formed XML</h2>

<p>The general way to associate a GRDDL transformation link with a
well-formed XML document is to add to the root element a
<code>grddl</code> namespace declaration and a
<code>grddl:transformation</code> attribute whose value is an IRI
reference, or a list of IRI references, referring to executable scripts
or programs which are expected to transform the source document into
RDF.  This method is suitable for use with any XML dialect that can
accommodate an extra namespace-qualified attribute on the root
element.</p>

<p>For example, this XML document,
located at
<tt>http://www.w3.org/2001/sw/grddl-wg/td/titleauthor.html</tt>,
is linked to two GRDDL transformations:</p>

<pre class="example">
&lt;html xmlns="http://www.w3.org/1999/xhtml"
      <b>xmlns:grddl='http://www.w3.org/2003/g/data-view#'</b>
      <b>grddl:transformation="glean_title.xsl
			http://www.w3.org/2001/sw/grddl-wg/td/getAuthor.xsl"</b>
 >
&lt;head>
&lt;title>Are You Experienced?&lt;/title>
<em>[...]</em>
&lt;/html>
</pre>

<ol>
<li>It is linked to the transformation identified by
<tt>http://www.w3.org/2001/sw/grddl-wg/td/getAuthor.xsl</tt>.</li>
<li>To resolve the relative URI reference <tt>glean_title.xsl</tt>
to absolute form, we use the base URI of this XML element,
<tt>http://www.w3.org/2001/sw/grddl-wg/td/titleauthor.html</tt>.
Then this document is also linked to the GRDDL transformation
identified by the absolute form,
<tt>http://www.w3.org/2001/sw/grddl-wg/td/glean_title.xsl</tt>.</li>
</ol>

<div class="illustration">
<img src="figTitleAuthor.png" alt="diagram: link to multiple transformations" />
<p>extracting title and author information</p>
<small>(<a href="figTitleAuthor.svg">svg</a>)</small>
</div>

<p>As you will see in later sections, there are other ways to add GRDDL 
to HTML documents, especially designed to leverage HTML's existing capabilities 
and thereby overcome constraints imposed by the XML DTDs for some dialects of HTML.
See <a href="#grddl-xhtml">Using GRDDL with valid XHTML</a> and 
<a href="#profile-bind">GRDDL for HTML Profiles</a>.
</p>


<p>The formal specification of this markup is given below. <em>An
informative mechanical version of each rule is given with the premise
and the conclusion written as SPARQL graph patterns<a
href="#SPARQL">[SPARQL]</a>. See the <a href="spec_rules">Mechanical
Rules</a> appendix for namespace prefix bindings and further
explanation.
These are included for those readers who find them helpful.
Other readers are encouraged to ignore them.
</em></p>

<table border="1">
<tr>
  <th>Normative Statement</th><th>Mechanical Rule<br />(Informative)</th>
</tr>
<tr>
<td class="assertion" id="rule_GRDDL_transformation">
Given an XPath<a href="#XPATH">[XPATH]</a> root
node <var>N</var> with root element <var>E</var>,
if the expression

<pre>/*/@*[local-name()="transformation"
  and namespace-uri()=
    "http://www.w3.org/2003/g/data-view#"]</pre>

matches an attribute of 
an element
<var>E</var>, then for each <a href="#stok">space-separated
token</a> <var>REF</var> in the value of that attribute, the resource
identified<a class="norm" href="#WEBARCH">[WEBARCH]</a> by the
absolute form (see section 5.2 Relative Resolution in <a class="norm"
href="#rfc3986">[RFC3986]</a>) of <var>REF</var> with respect to the
base IRI<a class="norm" href="#rfc3987">[RFC3987]</a>,<a class="norm" href="#XMLBASE">[XMLBASE]</a>
of <var>E</var> is a <dfn>GRDDL transformation</dfn> of
<var>N</var>.

<p id="stok">
<dfn>Space-separated tokens</dfn> are the maximal non-empty
subsequences not containing the whitespace characters #x9, #xA, #xD or
#x20.
</p>
</td>
<td>

<table class="rule">
<tr><td>
<pre>
(?N "/*") gspec:xpath ?E.
(?N """/*/@*[local-name()="transformation" and
    namespace-uri()=
    "http://www.w3.org/2003/g/data-view#"]""")
   gspec:xpath [ fn:string ?V].
?V fn:normalize-space ?Vnorm.
(?Vnorm "[ \t\r\n]+") fn:tokenize [
  list:member ?REF ].
?E fn:base-uri ?BASE.
(?REF ?BASE) fn:resolve-uri ?TXURI.
?TX log:uri ?TXURI.
</pre>

</td></tr>
<tr><td><hr /></td></tr>
<tr><td>
<pre>?N grddl:transformation ?TX.</pre>
</td></tr>
</table>
</td>
</tr>
</table>

<p>The <tt>glean_title.xsl</tt> transformation computes
the following RDF/XML document, given the XML document
above as input:</p>

<pre class="example">
&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:dc="http://purl.org/dc/elements/1.1/">
  &lt;rdf:Description rdf:about="">
    &lt;dc:title>Are You Experienced?&lt;/dc:title>
  &lt;/rdf:Description>
&lt;/rdf:RDF>
</pre>

<p>The graph serialized by that document is a <b>GRDDL result</b> of
the resource identified by
<tt>http://www.w3.org/2001/sw/grddl-wg/td/titleauthor.html</tt>.  Note
that this serialization of the graph contains a relative URI reference
(in the value of the <tt>rdf:about</tt> attribute).  The base IRI for
interpreting relative IRI references in a serialization of a
graph produced by a GRDDL transformation is the base IRI of the source
document.</p>

<p>The <tt>glean_title.xsl</tt> resource specifies a function from
XPath document nodes to RDF/XML documents, and hence to RDF graphs;
this function is called the <b>transformation property</b> of the XSLT
document. See the <a href="#txforms">GRDDL Transformations
section</a> for more details.</p>
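
<p>As a rough illustration only, a transformation of this kind might
look like the following XSLT 1.0 stylesheet. This is a minimal sketch
written for this example, not the actual content of
<tt>glean_title.xsl</tt>; it assumes that the source document is XHTML
with a <tt>title</tt> element inside its <tt>head</tt>.</p>

<pre class="example">
&lt;!-- Hypothetical sketch; not the published glean_title.xsl. --&gt;
&lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    exclude-result-prefixes="h"&gt;

  &lt;xsl:output method="xml" indent="yes"/&gt;

  &lt;!-- Match the document node and emit one dc:title statement.
       The empty rdf:about is a relative reference to the source document. --&gt;
  &lt;xsl:template match="/"&gt;
    &lt;rdf:RDF&gt;
      &lt;rdf:Description rdf:about=""&gt;
        &lt;dc:title&gt;&lt;xsl:value-of select="/h:html/h:head/h:title"/&gt;&lt;/dc:title&gt;
      &lt;/rdf:Description&gt;
    &lt;/rdf:RDF&gt;
  &lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
</pre>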

<p>The general rule for using GRDDL with well-formed XML is:</p>

<table border="1">
<tr>
<td class="assertion" id="rule_result">
If an information resource (<a class="norm" href="#WEBARCH">[WEBARCH]</a>,
section 2.2) <var>IR</var>
is represented by an XML document with
an XPath root node <var>R</var>,
and <var>R</var> has a GRDDL transformation
with a <dfn>transformation property</dfn> <var>TP</var>, 
and <var>TP</var> applied to <var>R</var> gives an
RDF Graph<a class="norm" href="#RDFC04">[RDFC04]</a>
<var>G</var>, then <var>G</var>
is a <dfn>GRDDL result</dfn> of <var>IR</var>.
</td>
<td>

<table class="rule">
<tr><td>
<pre>
?IR log:uri [ fn:doc ?R ].
?R grddl:transformation [ grddl:transformationProperty ?TP ].
?R ?TP ?G.
</pre>
</td>
</tr>
<tr><td><hr /></td></tr>
<tr>
<td>
<pre>
?IR grddl:result ?G .
</pre>
</td>
</tr>
</table>

</td>
</tr>
</table>

<p>The <tt>titleauthor.html</tt> resource has another GRDDL
result via the <tt>getAuthor.xsl</tt> transformation. These
results can be merged together into another result, by
this rule:</p>

<table border="1">
<tr>
<td class="assertion" id="rule_merge">
If <var>F</var> and <var>G</var> are <b>GRDDL results</b> of <var>IR</var>,
then the 
<a class="norm"
href="http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#defmerge">merge</a>
<a class="norm" href="#RDF-MT">[RDF-MT]</a>
of <var>F</var> and <var>G</var> is also a <b>GRDDL result</b> of <var>IR</var>.
</td>
<td>
<table class="rule">
<tr>
<td>
<pre>
?IR grddl:result ?F, ?G.
(?F ?G) log:conjunction ?H.</pre>
</td>
</tr>
<tr><td><hr /></td></tr>
<tr><td>
<pre>
?IR grddl:result ?H.</pre>
</td>
</tr>
</table>
</td>
</tr>
</table>
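
<p>To illustrate the merge rule, suppose (as an assumption made only
for this example) that the <tt>getAuthor.xsl</tt> transformation yields
a single <tt>dc:creator</tt> statement about the source document. The
output of that transformation is not shown in this section, so the
literal below is a placeholder. Since neither graph contains blank
nodes, the merge is simply the union of the two sets of triples, which
could be serialized as:</p>

<pre class="example">
&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:dc="http://purl.org/dc/elements/1.1/"&gt;
  &lt;rdf:Description rdf:about=""&gt;
    &lt;!-- from the glean_title.xsl result shown earlier --&gt;
    &lt;dc:title&gt;Are You Experienced?&lt;/dc:title&gt;
    &lt;!-- hypothetical statement standing in for the getAuthor.xsl result --&gt;
    &lt;dc:creator&gt;(placeholder author name)&lt;/dc:creator&gt;
  &lt;/rdf:Description&gt;
&lt;/rdf:RDF&gt;
</pre>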

</div>


<div><h2 id="ns-bind"><span class="gen">3. </span>Using GRDDL with XML Namespace Documents</h2>

<p>Transformations can be associated not only with individual
documents but also with whole dialects that share an XML namespace.
Any resource available for retrieval from a namespace URI is a
<dfn>namespace document</dfn> 
(cf. section <a class="norm"
href="http://www.w3.org/TR/2004/REC-webarch-20041215/#namespace-document">4.5.4. Namespace
documents</a> in <a class="norm" href="#WEBARCH">[WEBARCH]</a>).  For example, a
namespace document may have an XML Schema representation or an RDF
Schema representation, or perhaps both, using <a class="norm"
href="http://www.w3.org/TR/webarch/#def-coneg">content
negotiation</a>.</p>
<!-- er... the conneg link isn't really normative,
but the fixrefs.xsl script doesn't grok citing the
same document both normatively and informatively. -->

<p>To associate a GRDDL transformation with a whole dialect, include
a <code>grddl:namespaceTransformation</code> property in a GRDDL
result of the namespace document.</p>

<p id="sec_rdf_nsdoc">For example, consider this privacy policy written in P3Q, a
contrived analog to P3P<a href="#P3P">[P3P]</a>:</p>

<div class="example">
<pre>&lt;POLICIES xmlns="http://www.w3.org/2004/01/rdxh/p3q-ns-example"&gt;
	&lt;EXPIRY max-age="604800"/&gt;
<em>...</em>
</pre></div>

<p>The namespace document for P3Q relates the <tt>grokP3Q.xsl</tt>
transformation to all P3Q documents:</p>

<div class="example">
<pre>&lt;rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dataview="http://www.w3.org/2003/g/data-view#"&gt;
 &lt;rdf:Description rdf:about="http://www.w3.org/2004/01/rdxh/p3q-ns-example"&gt;
   &lt;dataview:namespaceTransformation
       rdf:resource="http://www.w3.org/2004/01/rdxh/grokP3Q.xsl"/&gt;
 &lt;/rdf:Description&gt;
&lt;/rdf:RDF&gt;
</pre></div>

<p>That is: every document whose root namespace name
is <tt>...p3q-ns-example</tt> has <tt>grokP3Q.xsl</tt>
as a <b>GRDDL transformation</b> implicitly, as illustrated
in this figure:</p>

<div class="illustration">
<img src="figGleanNsDoc.png" alt="diagram: glean via namespace" />
<br />transformation applied to namespace<br />
<small>(<a href="figGleanNsDoc.svg">svg</a>)</small></div>

<p>Some namespace documents, such as the XHTML namespace document
<tt>http://www.w3.org/1999/xhtml</tt>, have very many references to
them.  If GRDDL-aware agents were to retrieve these documents every
time they processed a document referring to them, the origin servers
of those documents could become overloaded.  GRDDL-aware agents
therefore should not retrieve such documents on every reference and
should retain some cache or local memory of the transformations those
documents indicate should be applied. To avoid misrepresentation of
published information, GRDDL-aware agents should ensure that this
local memory is up to date and should support user options to
configure or disable the cache. See also section <a class="norm"
href="http://www.w3.org/TR/webarch/#dereference-uri">3.1. Using a URI
to Access a Resource</a> of <a class="norm"
href="#WEBARCH">[WEBARCH]</a>.</p>

<p>The general case of namespace transformations is:</p>

<table border="1">
<tr>
  <th>Normative Statement</th><th>Mechanical Rule<br />(Informative)</th>
</tr>

<tr>
<td class="assertion" id="rule_nstx">
   If
<ul>
<li>an information resource <var>NSDOC</var>, identified by an IRI
<var>NS</var> has a <b>GRDDL result</b> that includes a triple
whose
<ul>
<li> subject is <var>NSDOC</var>, whose</li>
<li>predicate is the property
   <tt>&lt;http://www.w3.org/2003/g/data-view#namespaceTransformation&gt;</tt>,
   and whose</li>
<li>object is <var>TX</var>,</li>
</ul>
</li>
<li>and an information resource
   <var>IR</var> has an XML representation with
root node <var>NODE</var> and with a root element
with a namespace name <var>NS</var>,</li>
</ul> then <var>TX</var> is a <b>GRDDL
   transformation</b> of <var>NODE</var>.
</td>
<td>
<table class="rule">
<tr><td>
<pre>
?NSDOC log:uri ?NS;
   grddl:result [
     log:includes [
       rdf:subject ?NSDOC;
       rdf:predicate grddl:namespaceTransformation;
       rdf:object ?TX]].
?IR log:uri [ fn:doc ?NODE].
(?NODE "/*") gspec:xpath ?E.
?E fn:namespace-uri ?NS.
</pre>
</td></tr>
<tr><td><hr /></td></tr>
<tr><td>
<pre>
?NODE grddl:transformation ?TX.
</pre>
</td></tr>
</table>

</td>
</tr>
</table>


<p>Note that as a base case, the result of parsing an RDF/XML
document is a GRDDL result of that document:</p>

<table border="1">
<tr>
  <th>Normative Statement</th><th>Mechanical Rule<br />(Informative)</th>
</tr>

<tr>
<td class="assertion" id="rule_rdfxbase">
If an information resource <var>IR</var> is represented
by a 
<a class="norm" href="http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/#dfn-conforming-rdf-xml-document">conforming RDF/XML document</a><a href="#RDFX">[RDFX]</a>,
then the RDF graph represented by that document
is a <dfn>GRDDL result</dfn> of <var>IR</var>.
</td>
<td>
<table class="rule">
<tr><td>
<pre>
?IR log:uri [ fn:doc [ gspec:rdfParse ?G ] ].
</pre>
</td></tr>
<tr><td><hr /></td></tr>
<tr><td>
<pre>
?IR grddl:result ?G.
</pre>
</td></tr>
</table>

</td>
</tr>
</table>

<p>Note that while an <tt>application/rdf+xml</tt> media type is one
indication that a document is RDF/XML, section <a href=
"http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/#start"
>7.2.1 Grammar start</a> of <a href="#RDFX">[RDFX]</a> leaves open
"other means" by which an RDF/XML document may be identified.  For the
purposes of the rule above, a root element whose local name is
<code>RDF</code> and whose namespace URI is
<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#</code> is such a
means. For a case in point, see the <a href=
"http://www.w3.org/2001/sw/grddl-wg/td/grddl-tests#grddlonrdf-xmlmediatype"
>grddlonrdf-xmlmediatype</a> test case.</p>

<div><h3 id="sec_xsd_nsdoc">Example: Using GRDDL with an XML Schema
namespace document</h3>

<p>A namespace transformation link may be discoverable by transforming
the namespace document itself. Note that this means that namespace
documents need not be written in RDF/XML directly.</p>

<p>Consider a purchase order that has a namespace document 
represented in XML Schema, where the XML Schema bears
a <tt>data-view:transformation</tt>
attribute licensing extraction of statements that include
<tt>namespaceTransformation</tt> statements:</p>

<div class="example">
<pre>
&lt;xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http:.../Order-1.0"
            targetNamespace="http:.../Order-1.0"
            version="1.0"
            ...
            xmlns:data-view="http://www.w3.org/2003/g/data-view#"
            data-view:transformation="http://www.w3.org/2003/g/embeddedRDF.xsl" &gt;
    &lt;xsd:element name="Order" type="OrderType"&gt;
    &lt;xsd:annotation&gt;
      &lt;xsd:documentation&gt;This element is the root element.&lt;/xsd:documentation&gt;
    &lt;/xsd:annotation&gt;
                 ...
  &lt;xsd:annotation>
    &lt;xsd:appinfo>
      &lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	&lt;rdf:Description rdf:about="http://www.w3.org/2003/g/po-ex">
	  &lt;data-view:namespaceTransformation
	      rdf:resource="grokPO.xsl" />
	&lt;/rdf:Description>
      &lt;/rdf:RDF>
    &lt;/xsd:appinfo>
  &lt;/xsd:annotation>
<em>...</em>
</pre></div>

<p>Every purchase order using that schema as a namespace document
is linked to the <code>grokPO.xsl</code> transformation, as
illustrated below:</p>

<div class="illustration">
<img src="figGleanPO.png" alt="diagram: glean via namespace" />
<p>using GRDDL with an XML Schema</p>
<small>(<a href="figGleanPO.svg">svg</a>)</small></div>

</div>

</div>


<div><h2 id="grddl-xhtml"><span class="gen">4. </span>Using GRDDL with valid XHTML</h2>

<p>To accommodate the DTD-based syntax of XHTML<a
href="#XHTML">[XHTML]</a>, which precludes using attributes from
foreign namespaces, we use <code><a rel="ns-claim"
href="http://www.w3.org/2003/g/data-view">http://www.w3.org/2003/g/data-view</a></code>
as a metadata profile (cf. section <a class="norm"
href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/global.html#h-7.4.4.3">7.4.4.3
Meta data profiles</a> of <a href="#HTML4">[HTML4]</a>).</p>

<p>The general form of adding a GRDDL assertion to a valid XHTML
document is by specifying the GRDDL profile in the
<code>profile</code> attribute of the <code>head</code> element, and
<code>transformation</code> as the value of the <code>rel</code>
attribute of a <code>link</code> or <code>a</code> element whose
<code>href</code> attribute value is an IRI reference that refers to an
executable script or program which is expected to transform the source
document into RDF.  This method is suitable for use
with valid XHTML documents which are constrained by an XML DTD.
</p>

<div><h3 id="sec_dubc_ex">An example Dublin Core META transformation</h3>

<p>For example, this document follows the conventions of
<a href="#RFC2731">[RFC2731]</a>, and it explicitly uses the GRDDL
profile and links to an XSLT transformation to 
RDF/XML to signal that the transformation is a faithful
rendition:</p>

<pre class="example">&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head <b>profile="http://www.w3.org/2003/g/data-view"</b>&gt;
    &lt;title&gt;Some Document&lt;/title&gt;

    &lt;link <b>rel="transformation"</b>
       href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" /&gt;
    &lt;meta name="DC.Subject"
       content="ADAM; Simple Search; Index+; prototype" /&gt;
    ...
  &lt;/head&gt;
  ...
&lt;/html&gt;</pre>


<p>The figure below shows the source document, the
<tt>dc-extract.xsl</tt> transformation, and the GRDDL result:</p>

<div class="illustration">
<img src="figGlean.png" alt="diagram: link to transformation" />
<p>Decoding HTML meta-data to RDF</p>
<small>(<a href="figGlean.svg">svg</a>)</small></div>


<p>This is what the data looks like in RDF/XML:</p>
<pre class="example">&lt;rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"&gt;
  &lt;rdf:Description rdf:about=""&gt;
    &lt;dc:subject&gt;ADAM; Simple Search; Index+; prototype&lt;/dc:subject&gt;
  &lt;/rdf:Description&gt;
&lt;/rdf:RDF&gt;</pre>

</div>

<div><h3 id="sec_multi">Multiple transformations in XHTML</h3>
<p>An XHTML document may conform to a number of dialects
simultaneously and link to more than one GRDDL transformation.  However,
since the <code>href</code> attribute of the <code>link</code> and
<code>a</code> elements accepts only a single IRI reference, multiple
instances of these elements must be used to assert multiple links:</p>

<div class="example">
<pre>&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
&lt;head profile="http://www.w3.org/2003/g/data-view"&gt;
  &lt;title&gt;Joe Lambda's Home page [an example of RDF in XHTML]&lt;/title&gt;

  &lt;link rel="transformation" href="http://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokFOAF.xsl" /&gt;
  &lt;link rel="transformation" href="http://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokCC.xsl" /&gt;
  &lt;link rel="transformation" href="http://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokGeoURL.xsl" /&gt;
...
</pre></div>

<div class="illustration">
<img src="figMultiTxform.png" alt="diagram: link to multiple transformations" />
<p>multiple transformations</p>
<small>(<a href="figMultiTxform.svg">svg</a>)</small>
</div>

</div>

<div><h3 id="prof_rules">Rules for GRDDL with valid XHTML</h3>

<p>The general rule is:</p>
<table border="1">
<tr><td class="assertion" id="rule_tlrel">
Given XPath root node <var>N</var>, if
<var>N</var> has <a href="#rule_metadata_profile_name">metadata profile name</a>
<tt>http://www.w3.org/2003/g/data-view</tt>, then

for each <tt>a</tt> and <tt>link</tt> descendant element <var>E</var>
whose <a href=
"http://www.w3.org/TR/1999/REC-html401-19991224/struct/links.html#adef-rel">
<tt>rel</tt>
attribute</a><a class="norm" href="#HTML4">[HTML4]</a> has
<tt>transformation</tt> as one of its <a href="#stok">space-separated
values</a>,
the resource identified by the absolute form of the
<tt>href</tt> attribute with respect to the base IRI of <var>E</var>
is a <dfn>GRDDL transformation</dfn> of <var>N</var>.
</td>
<td>
<table class="rule">
<tr><td>
<pre>
?N gspec:profileName "http://www.w3.org/2003/g/data-view".
(?N
""".//*[namespace-uri()="http://www.w3.org/1999/xhtml" and
        (local-name() = "a"
         or local-name() = "link")]"""
) gspec:xpath ?E.
(?E "@rel") gspec:xpath [ fn:string [
   fn:normalize-space ?E_REL ]].
(?E_REL "[ \t\r\n]+") fn:tokenize [
 list:member "transformation" ].
(?E "@href") gspec:xpath [ fn:string ?T_REF ].
?E gspec:htmlBase ?BASE.
(?T_REF ?BASE) fn:resolve-uri ?TURI.
?T log:uri ?TURI.
</pre>
</td>
</tr>
<tr><td><hr /></td></tr>
<tr>
<td>
<pre>
?N grddl:transformation ?T.
</pre>
</td>
</tr>
</table>

</td>
</tr>
</table>
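
<p>The examples earlier in this section use only <tt>link</tt>
elements, but the rule above applies equally to <tt>a</tt> elements,
and to <tt>rel</tt> attributes that carry more than one value. The
following sketch shows both; the <tt>example.org</tt> address and the
stylesheet name are hypothetical and chosen only for illustration.</p>

<pre class="example">
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head profile="http://www.w3.org/2003/g/data-view"&gt;
    &lt;title&gt;Some Document&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;!-- "transformation" is one of two space-separated rel values;
         the href below is a made-up address --&gt;
    &lt;p&gt;The data in this page can be extracted with
      &lt;a rel="help transformation"
         href="http://example.org/extract.xsl"&gt;this stylesheet&lt;/a&gt;.&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;
</pre>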

<p>Note that the base IRI of an element node in an XHTML document may
be influenced by factors such as a <a
href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/links.html#edef-BASE"><tt>base</tt>
element</a><a class="norm" href="#HTML4">[HTML4]</a> or the <a
href="http://gbiv.com/protocols/uri/rev-2002/rfc2396bis.html#base-retrieval">Retrieval
URI</a><a href="#rfc3986">[RFC3986]</a>. See the <a href="#base_misc">Base IRI considerations</a> appendix and test cases such as <a
href="http://www.w3.org/2001/sw/grddl-wg/td/grddl-tests#htmlbase1">htmlbase1</a>
for further clarification.</p>
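
<p>As a sketch of the first of these factors, consider a document
with a <tt>base</tt> element and a relative transformation link. The
<tt>example.org</tt> address is hypothetical. Assuming the agent honors
the <tt>base</tt> element, the reference <tt>extract.xsl</tt> would be
resolved against <tt>http://example.org/xslt/</tt>, giving
<tt>http://example.org/xslt/extract.xsl</tt>, rather than against the
address from which the document was retrieved.</p>

<pre class="example">
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head profile="http://www.w3.org/2003/g/data-view"&gt;
    &lt;title&gt;Some Document&lt;/title&gt;
    &lt;!-- hypothetical base address for relative references --&gt;
    &lt;base href="http://example.org/xslt/" /&gt;
    &lt;link rel="transformation" href="extract.xsl" /&gt;
  &lt;/head&gt;
  ...
&lt;/html&gt;
</pre>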

<p>The rule above depends on the following formalization of 
metadata profiles in XHTML:</p>

<table border="1">
<tr><td class="assertion" id="rule_metadata_profile_name">
Given an XPath root node <var>N</var> of an XHTML document
(that is, an XML document whose root element has
a local name of <tt>html</tt> and a namespace name of
<tt>http://www.w3.org/1999/xhtml</tt>),
for each <a href="#stok">space-separated
token</a> <var>REF</var> in the value of the <tt>profile</tt>
attribute<a class="norm" href="#HTML4">[HTML4]</a>
of the <tt>head</tt> element <var>E</var>,
the absolute form of <var>REF</var> with respect to the
base IRI of <var>E</var> is a <dfn>metadata profile name</dfn> of
<var>N</var>.
</td>
<td>
<table class="rule">
<tr><td>
<pre>
(?N
 """
*[local-name()="html" and
  namespace-uri()="http://www.w3.org/1999/xhtml"] /
 *[local-name()="head" and
   namespace-uri()="http://www.w3.org/1999/xhtml"]""")
 gspec:xpath ?E.
(?E "@profile") gspec:xpath [ fn:string ?V ].
?E fn:base-uri ?BASE.
?V fn:normalize-space ?Vnorm.
(?Vnorm "[ \t\r\n]+") fn:tokenize [  list:member ?P_REF ].
(?P_REF ?BASE) fn:resolve-uri ?PROFID.
</pre>
</td>
</tr>
<tr><td><hr /></td></tr>
<tr>
<td>
<pre>
?N gspec:profileName ?PROFID.
</pre>
</td>
</tr>
</table>
</td></tr>
</table>

</div>

</div>

<div><h2 id="profile-bind"><span class="gen">5. </span>GRDDL for HTML Profiles</h2>

<p>XHTML provides the profile mechanism to link to the meaning of properties 
and the set of legal values for those properties. As with namespace documents,
a profile document can effectively be written using XHTML with embedded RDF statements 
and a GRDDL transformation to extract the definition of terms that are applicable.
Those terms can then be used in an XHTML document to convey profile-dependent meaning.
As discussed in 
<a href="#grddl-xhtml">Using GRDDL with valid XHTML</a>, the GRDDL profile can be used 
with XHTML documents to apply GRDDL semantics to <code>link</code> and <code>a</code> elements whose 
<code>rel</code> attribute includes the value <code>transformation</code>.
This very powerful and flexible mechanism integrates well with 
<a class="inform" href="http://microformats.org/wiki/faqs-for-rdf#Are_there_Schemas_for_Microformats.3F">microformat profiles</a><a class="inform" href="#MF-RDF-FAQ">[MF-RDF-FAQ]</a> which overlay the normally semantically-poor HTML markup.</p>

<p>The following diagram illustrates an XFN document<a class="inform"
href="#XFN">[XFN]</a>, <tt>friends.html</tt>, associated with the
<tt>grokXFN.xsl</tt> transformation indirectly via an XFN profile.
</p>

<div class="illustration">
<img src="figGleanProfile.png" alt="diagram: transformation linked indirectly via profile" />
<p>indirection via profile</p>
<small>(<a href="figGleanProfile.svg">svg</a>)</small>
</div>

<p>Adding a GRDDL <code>profileTransformation</code> assertion to a
profile document is much like <a href="#ns-bind">adding a
<code>namespaceTransformation</code> assertion to a namespace
document</a>. For a dialect defined by a valid XHTML profile
document, add
<code>profile="http://www.w3.org/2003/g/data-view"</code> to the
<code>head</code> element and make a link of type
<code>profileTransformation</code> to the transformation of the
dialect.</p>
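
<p>A profile document following this pattern might look like the
sketch below. The profile title, the <tt>example.org</tt> address and
the stylesheet name <tt>grokExample.xsl</tt> are hypothetical and used
only for illustration; they do not refer to any published profile.</p>

<pre class="example">
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head profile="http://www.w3.org/2003/g/data-view"&gt;
    &lt;title&gt;An example profile&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;!-- link of type profileTransformation; the href is a made-up address --&gt;
    &lt;p&gt;Documents that use this profile can be transformed with
      &lt;a rel="profileTransformation"
         href="http://example.org/grokExample.xsl"&gt;grokExample.xsl&lt;/a&gt;.&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;
</pre>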

<p>The general rule is:</p>

<table border="1">
<tr>
<td class="assertion" id="rule_profiletrans">
If
<ul>
<li>an information resource <var>PDOC</var>, identified by an IRI
<var>PNAME</var> has a <b>GRDDL result</b> that includes a triple
whose
<ul>
<li> subject is <var>PDOC</var>, whose</li>
<li>predicate is the property
   <tt>&lt;http://www.w3.org/2003/g/data-view#profileTransformation&gt;</tt>,
   and whose</li>
<li>object is <var>TX</var>,</li>
</ul>
</li>
<li>and an information resource
<var>IR</var> has an XML representation with
XPath root node <var>NODE</var> that has a
<a href="#rule_metadata_profile_name">metadata profile name</a>
<var>PNAME</var>,</li>
</ul> then <var>TX</var> is a <b>GRDDL
   transformation</b> of <var>NODE</var>.
</td>
<td>
<table class="rule">
<tr><td>
<pre>
?PDOC log:uri ?PNAME;
   grddl:result [
     log:includes [
       rdf:subject ?PDOC;
       rdf:predicate grddl:profileTransformation;
       rdf:object ?TX]].
?IR log:uri [ fn:doc ?NODE].
?NODE gspec:profileName ?PNAME.
</pre>
</td></tr>
<tr><td><hr /></td></tr>
<tr><td>
<pre>
?NODE grddl:transformation ?TX.
</pre>
</td></tr>
</table>
</td>
</tr>
</table>

</div>


<div><h2 id="txforms"><span class="gen">6. </span>GRDDL Transformations</h2>

<p>As noted above, each GRDDL transformation specifies a
<b>transformation property</b>, a function from XPath document nodes
to RDF graphs.  This function need not
be total; it may have a domain smaller than all XML document
nodes. For example, <tt>xsl:message</tt> with
<tt>terminate="yes"</tt> may be used to signal that the input is
outside the domain of the transformation.
</p>
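
<p>The following XSLT 1.0 stylesheet is a minimal sketch of that
technique, written for this example and not taken from any published
transformation. It assumes a transformation whose domain is XHTML
documents that have a <tt>title</tt>; for any other input it stops
with a message instead of producing RDF/XML.</p>

<pre class="example">
&lt;!-- Hypothetical sketch of a partial transformation. --&gt;
&lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    exclude-result-prefixes="h"&gt;

  &lt;xsl:template match="/"&gt;
    &lt;xsl:choose&gt;
      &lt;!-- Input is inside the domain: an XHTML document with a title. --&gt;
      &lt;xsl:when test="h:html/h:head/h:title"&gt;
        &lt;rdf:RDF&gt;
          &lt;rdf:Description rdf:about=""&gt;
            &lt;dc:title&gt;&lt;xsl:value-of select="h:html/h:head/h:title"/&gt;&lt;/dc:title&gt;
          &lt;/rdf:Description&gt;
        &lt;/rdf:RDF&gt;
      &lt;/xsl:when&gt;
      &lt;!-- Anything else is outside the domain: stop processing. --&gt;
      &lt;xsl:otherwise&gt;
        &lt;xsl:message terminate="yes"&gt;Input is outside the domain of this transformation.&lt;/xsl:message&gt;
      &lt;/xsl:otherwise&gt;
    &lt;/xsl:choose&gt;
  &lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
</pre>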

<p>Developers of transformations should make available representations
in widely-supported formats.  XSLT version 1<a class="inform"
href="#XSLT1">[XSLT1]</a> is the format most widely supported by GRDDL-aware
agents as of this writing, though XSLT2<a
href="#XSLT2">[XSLT2]</a> deployment is increasing.
While technically JavaScript, C, or virtually any other programming
language may be used to express transformations for GRDDL, XSLT is
specifically designed to express XML to XML transformations and has
some good safety characteristics; XQuery has similar characteristics
to XSLT, though use of XQuery in GRDDL implementations is
less widely deployed at the time of this writing.
</p>

<table border="1">
<tr>
<td class="assertion" id="rule_txprop">
If
<ul>
<li><var>RDFXML</var> is the root XPath node of a
<a class="norm" href="http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/#dfn-conforming-rdf-xml-document">conforming RDF/XML document</a><a href="#RDFX">[RDFX]</a>
that represents an RDF Graph <var>G</var>, and</li>
<li><var>R</var> is the root node of some XML document
and <var>TXNODE</var> is the root node of
an XSLT transformation<a class="inform"
href="#XSLT1">[XSLT1]</a>, and</li>
<li><var>RDFXML</var> is the root node of the
XSLT result tree when <var>TXNODE</var>
is applied to <var>R</var>, and</li>
<li><var>TXDOC</var> is an information
resource
with <em>transformation property</em>
<var>TP</var>
represented by an XML document
with root node <var>TXNODE</var>
</li>
</ul>
then <var>TP</var> relates <var>R</var> to <var>G</var>.
</td>
<td>
<table class="rule">
<tr><td>
<pre>
?RDFXML gspec:rdfParse ?G.
(?TXNODE ?R) gspec:resultTree ?RDFXML.
?TXDOC grddl:transformationProperty ?TP;
  log:uri [fn:doc ?TXNODE].
</pre>
</td>
</tr>
<tr><td><hr /></td></tr>
<tr>
<td>
<pre>
?R ?TP ?G
</pre>
</td>
</tr>
</table>
</td>
</tr>

</table>

<p>The rule above covers the case of a <em>transformation
property</em> that relates an XPath document node to an RDF graph via
an RDF/XML document.  Transformations may use other, unspecified,
mechanisms.  For example, see <a
href="http://www.w3.org/2001/sw/grddl-wg/td/grddl-tests#atomttl1">test
<tt>#atomttl1</tt></a>, in which the <tt>media-type</tt> attribute
of the <tt>xsl:output</tt> element bears a "text/rdf+n3" value to
indicate a media type other than "application/rdf+xml". GRDDL agents
that can process such a media type can then produce an RDF graph in
accordance with the media type. Non-XSLT transforms may indicate the
RDF graph in some other, unspecified, fashion.
</p>
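
<p>A stylesheet that declares such a media type might look like the
following sketch. It is written for this example and is not the
stylesheet used by the <tt>#atomttl1</tt> test; it assumes an XHTML
source document, and it does not escape quotation marks or backslashes
in the title, which a real transformation would need to handle.</p>

<pre class="example">
&lt;!-- Hypothetical sketch: emits N3 text instead of RDF/XML. --&gt;
&lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml"&gt;

  &lt;!-- The media-type attribute tells the agent how to parse the output. --&gt;
  &lt;xsl:output method="text" media-type="text/rdf+n3"/&gt;

  &lt;xsl:template match="/"&gt;
    &lt;xsl:text&gt;@prefix dc: &amp;lt;http://purl.org/dc/elements/1.1/&amp;gt; .&amp;#10;&lt;/xsl:text&gt;
    &lt;!-- The empty reference denotes the source document. --&gt;
    &lt;xsl:text&gt;&amp;lt;&amp;gt; dc:title "&lt;/xsl:text&gt;
    &lt;xsl:value-of select="/h:html/h:head/h:title"/&gt;
    &lt;xsl:text&gt;" .&amp;#10;&lt;/xsl:text&gt;
  &lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
</pre>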


<div class="postponed">
<p>At present, when an information resource
is represented by an XML document, the
corresponding XPath data model may not be fully determined, depending
on, for example, whether an agent elaborates inclusions, parameter
entities, fixed and default attributes, or checks digital signatures.
Put another way, if an author takes responsibility for the information
in an XML document, for what information exactly is the author taking
responsibility? And how can the author ensure that a GRDDL
transformation is able to meet GRDDL's <a href="#sec_rend">Faithful
Rendition assurance</a>?
</p>

<p>This specification is silent on the question of which XML
processors are employed by or for GRDDL-aware agents. Whether or not
processing of XInclude, XML Validity, XML Schema Validity, XML
Signatures or XML Decryption takes place
is currently unspecified. However, this specification anticipates that
the resolution of TAG issue
<a href="http://www.w3.org/2001/tag/issues.html?type=1#xmlFunctions-34">
xmlFunctions-34
</a>
and the definition, by the 
<a href="http://www.w3.org/XML/Processing/">XML Processing Model Working
Group</a>, of a default processing model will provide further
clarification and guidance, and GRDDL-aware agents are expected to
comply with such guidance if it is issued.

There is no universal expectation that an XSLT
processor will call on such processing before executing a GRDDL
transformation.  Therefore, it is suggested that GRDDL transformations
be written so that they perform all expected pre-processing, including
processing of related DTDs, Schemas and namespaces.  Such measures can
be avoided for documents which do not require such pre-processing to
yield an infoset that is faithful, that is, for documents which do not
reference XInclude, DTDs, XML Schemas and so on.</p>

<p>
Document authors, particularly XHTML document authors,
who wish their documents to be unambiguous when used with GRDDL 
should avoid dependencies on an external <a 
href="http://www.w3.org/TR/2006/REC-xml-20060816/#dt-doctype"
 >DTD subset</a>;
specifically:
</p>
<ul>
<li>
Explicitly include the XHTML namespace declaration in an XHTML document,
or an appropriate namespace in an XML document.
</li>
<li>
Avoid use of entity references, except those listed
in <a href=
"http://www.w3.org/TR/2006/REC-xml-20060816/#sec-predefined-ent">
section 4.6</a> of [<a class="norm" href="#XML">XML</a>]
</li>
<li>
And, more generally, 
follow the rules
listed for <a
href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-check-rmd">
the standalone document</a> validity constraint.
</li>
</ul>
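
<p>The following sketch shows a document written along these lines.
Its content and the <tt>example.org</tt> address are hypothetical. The
XHTML namespace is declared explicitly, the only entity reference used
is the predefined <tt>&amp;amp;</tt>, and other special characters are
written as numeric character references, so the document is intended
to read the same whether or not an agent fetches the external DTD
subset.</p>

<pre class="example">
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
&lt;!-- namespace declared explicitly rather than left to the DTD default --&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head profile="http://www.w3.org/2003/g/data-view"&gt;
    &lt;!-- &amp;#233; is used in place of the DTD-defined entity &amp;eacute; --&gt;
    &lt;title&gt;Caf&amp;#233; menu&lt;/title&gt;
    &lt;link rel="transformation" href="http://example.org/extract.xsl" /&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;!-- &amp;amp; is predefined in XML; &amp;#169; replaces &amp;copy; --&gt;
    &lt;p&gt;Prices &amp;amp; opening hours &amp;#169; Example Caf&amp;#233;&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;
</pre>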

<p><cite>XProc: An XML Pipeline Language</cite><a class="inform"
href="#XPROC">[XPROC]</a>, <em>a language for describing operations to
be performed on XML documents,</em> has recently been published as a
W3C Working Draft.  It merits consideration for expressing more
complex or sophisticated transformations which require control over
the flow of processing through a variety of XML processing tools.
Using XProc, one could apply a sequence of operations such as XInclude,
validation, and transformation to a document, aborting if the result
of an intermediate stage is not valid, for example.</p>
</div>

</div>

<div><h2 id="sec_agt"><span class="gen">7. </span>GRDDL-Aware Agents</h2>

<p class="assertion" id="GRDDL_aware_agent">A <dfn>GRDDL-aware
agent</dfn> is a software module that computes <b>GRDDL results</b> of
information resources.</p>

<p>For example, a SPARQL query service might use a GRDDL-aware agent
for collecting RDF data. Or a Web browser might serve as a GRDDL-aware
agent for the purpose of collecting calendar and contact data. The
appropriate policy, for which results to compute and when, is likely to
involve waiting for a signal from the user more in the Web browser case
than in the query service case.
</p>

<div class="assertion" id="agt_obl">

<p>Subject to <a href="#sec">security considerations</a> below and 
local policy as expressed in its configuration,
given an information resource <var>IR</var>, and
an XPath node <var>N</var> for a representation of <var>IR</var>,
a GRDDL-aware agent <b>should</b>: 
</p>

<ol>
  <li>Find each transformation associated with
  <var>N</var>, i.e.

  <ol>
    <li>each transformation associated with <var>N</var> via the
    <tt>grddl:transformation</tt> attribute as in the <a
    href="#grddl-xml">Adding GRDDL to well-formed XML</a> section.
    </li>

    <li>each transformation associated with <var>N</var> via HTML
    links of type <tt>transformation</tt>, provided the document bears
    the <tt>http://www.w3.org/2003/g/data-view</tt> profile, as in
    the <a href="#grddl-xhtml">Using GRDDL with valid XHTML</a>
    section.
    </li>

    <li>each transformation indicated by any available namespace
    document, as in the <a href="#ns-bind">Using GRDDL with XML
    Namespace Documents</a> section.</li>

    <li>each transformation indicated by any XHTML profiles,
     as in the <a href="#profile-bind">GRDDL for HTML Profiles</a>
     section.
    </li>
  </ol>
  </li>
  <li>Selectively apply any or all discovered transformations to
  obtain GRDDL results.  Note that selection may be guided by the agent's
  capabilities, local security policies, and possibly user or client
  intervention.

</li>
  <li>Merge those GRDDL results.</li>
</ol>

</div>
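
<p>The three steps above can be sketched in Python as follows. This is
only an illustration under simplifying assumptions, not a normative
algorithm: the names <tt>grddl_results</tt>, <tt>discover</tt> and
<tt>is_permitted</tt> are hypothetical, a transformation is modelled as
a callable from a node to a set of triples, and the merge is a plain set
union, which is adequate only when the graphs share no blank node
identifiers.</p>

<pre>
```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

# Assumption: a graph is a frozen set of (subject, predicate, object) strings.
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]
# Assumption: a transformation is a callable from a source node to a graph.
Transformation = Callable[[object], Graph]


def grddl_results(
    node: object,
    discover: Iterable[Callable[[object], List[Transformation]]],
    is_permitted: Callable[[Transformation], bool],
) -> Graph:
    """Find, selectively apply, and merge, mirroring steps 1-3 above."""
    # Step 1: gather transformations from every discovery mechanism
    # (attribute, HTML link, namespace document, profile document).
    found: List[Transformation] = []
    for mechanism in discover:
        found.extend(mechanism(node))

    # Step 2: apply only the transformations that local policy permits.
    results = [transform(node) for transform in found if is_permitted(transform)]

    # Step 3: merge the individual results (plain union in this sketch).
    merged = set()
    for graph in results:
        merged |= graph
    return frozenset(merged)
```
</pre>

<p>Passing the policy in as a predicate keeps the selection step
separate from discovery, so that a browser-based agent and a query
service could share the same discovery code while applying different
policies.</p>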

<p>Note that discovery by namespace or profile document is recursive;
loops in the profile/namespace structure should be detected in order to avoid
infinite recursion.</p>
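
<p>One common way to detect such loops is to remember which documents
have already been visited. The following Python sketch assumes two
hypothetical callbacks that are not defined by this specification:
<tt>links(uri)</tt>, returning the namespace and profile documents
referenced by the document at <tt>uri</tt>, and
<tt>transformations(uri)</tt>, returning the transformations that
document declares.</p>

<pre>
```python
from typing import Callable, List, Optional, Set


def collect_transformations(
    uri: str,
    links: Callable[[str], List[str]],
    transformations: Callable[[str], List[str]],
    seen: Optional[Set[str]] = None,
) -> List[str]:
    """Walk namespace/profile documents recursively, visiting each once."""
    if seen is None:
        seen = set()
    if uri in seen:
        # Already visited: this is a loop, so stop instead of recursing.
        return []
    seen.add(uri)

    found = list(transformations(uri))
    for linked in links(uri):
        found.extend(collect_transformations(linked, links, transformations, seen))
    return found
```
</pre>

<p>Because the <tt>seen</tt> set is shared across the whole traversal,
a document reachable by several paths is also processed only once.</p>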

<div><h3 id="extrace">Example: A GRDDL-aware Agent protocol trace</h3>

<p>While this declarative specification of GRDDL allows a variety of
implementation strategies, in this example we trace the behavior
common to a number of typical implementations.</p>

<p>Consider a GRDDL-aware agent that is asked for results from
<tt>http://www.w3.org/2003/g/po-doc.xml</tt>. It starts by
dereferencing that URI, noting that RDF/XML, HTML, and XML are
acceptable representations:</p>


<pre>
[00:00.000 - client connection from 127.0.0.1:39645]
GET <b>http://www.w3.org/2003/g/po-doc.xml</b> HTTP/1.1
Host: www.w3.org
Accept: <b>application/rdf+xml,application/xml,text/xml,application/xhtml+xml,text/html</b>

[00:00.055 - server connected]
HTTP/1.1 200 OK
Last-Modified: Tue, 07 Dec 2004 22:59:02 GMT
Content-Length: 1302
Content-Type: application/xml; qs=0.9

&lt;purchaseOrder orderDate="1999-10-20"
   <b>xmlns="http://www.w3.org/2003/g/po-ex"</b>>
   &lt;shipTo country="US">
      &lt;name>Alice Smith&lt;/name>
      &lt;street>123 Maple Street&lt;/street>
<em>...</em>
</pre>

<p>The XML document that comes back has no explicit transformation markup,
but the rules in <a href="#ns-bind">the XML Namespaces section</a> suggest
looking up results from the namespace document:</p>

<pre>
[00:00.000 - client connection from 127.0.0.1:39647]
GET <b>http://www.w3.org/2003/g/po-ex</b> HTTP/1.1
Host: www.w3.org
Accept: application/rdf+xml,application/xml,text/xml,application/xhtml+xml,text/html

[00:00.051 - server connected]
HTTP/1.1 200 OK
Content-Location: po-ex.xsd
Last-Modified: Tue, 07 Dec 2004 23:18:25 GMT
Content-Length: 2624
Content-Type: application/xml; qs=0.9

&lt;xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:po="http://www.w3.org/2003/g/po-ex"
        targetNamespace="http://www.w3.org/2003/g/po-ex"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified"

   xmlns:data-view="http://www.w3.org/2003/g/data-view#" 
   data-view:transformation="http://www.w3.org/2003/g/embeddedRDF.xsl"
  >

  &lt;xs:annotation>
    &lt;xs:appinfo>
      &lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
        &lt;rdf:Description rdf:about="http://www.w3.org/2003/g/po-ex">
          &lt;data-view:namespaceTransformation
              rdf:resource="grokPO.xsl" />
        &lt;/rdf:Description>
      &lt;/rdf:RDF>
    &lt;/xs:appinfo>
  &lt;/xs:annotation>
<em>...</em>
</pre>

<p>We don't yet have a result in the form of an RDF/XML document,
but this time we find an explicit <tt>transformation</tt>
attribute in the GRDDL namespace, so we follow that link,
noting that we accept XML representations:</p>

<pre>
[00:00.000 - client connection from 127.0.0.1:39649]
GET <b>http://www.w3.org/2003/g/embeddedRDF.xsl</b> HTTP/1.1
Host: www.w3.org
Accept: <b>application/xml</b>

[00:00.054 - server connected]
HTTP/1.1 200 OK
Last-Modified: Wed, 23 Mar 2005 18:49:12 GMT
Content-Length: 797
Content-Type: application/xml; qs=0.9

&lt;xsl:transform
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
<em>...</em>
</pre>

<p>Applying that transformation yields...</p>

<pre>
&lt;rdf:RDF
   xmlns:data-view="http://www.w3.org/2003/g/data-view#"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
  &lt;rdf:Description rdf:about="http://www.w3.org/2003/g/po-ex">
    &lt;data-view:namespaceTransformation rdf:resource="http://www.w3.org/2003/g/grokPO.xsl"/>
  &lt;/rdf:Description>
&lt;/rdf:RDF>
</pre>

<p>... which tells us that <tt>.../grokPO.xsl</tt> is a transformation for
all documents in the <tt>.../po-ex</tt> namespace.</p>
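
<p>The relative reference <tt>grokPO.xsl</tt> in the schema annotation
becomes the absolute URI shown in the result because it is resolved
against the base URI of the namespace document. A minimal Python sketch
of that resolution step follows; the helper name
<tt>resolve_transformation</tt> is hypothetical.</p>

<pre>
```python
from urllib.parse import urljoin


def resolve_transformation(base_uri: str, reference: str) -> str:
    """Resolve a possibly relative transformation reference against a base URI."""
    return urljoin(base_uri, reference)


# The namespace URI from the trace above serves as the base in this sketch.
resolved = resolve_transformation("http://www.w3.org/2003/g/po-ex", "grokPO.xsl")
```
</pre>

<p>Here <tt>resolved</tt> is
<tt>http://www.w3.org/2003/g/grokPO.xsl</tt>. Using
<tt>http://www.w3.org/2003/g/po-ex.xsd</tt> as the base gives the same
answer, since both URIs share the same parent path.</p>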


<p>Continuing recursively, we examine the namespace document for the
root element of <tt>po-ex.xsd</tt>, i.e. the XML Schema namespace
document. As this is a well-known namespace document,
following the <a href="#sec">Security considerations section</a>,
we note the last modified date of our cached copy in the request,
and the origin server lets us know that our copy is current:
</p>

<pre>
[00:00.000 - client connection from 127.0.0.1:39651]
GET http://www.w3.org/2001/XMLSchema HTTP/1.1
Host: www.w3.org
Accept: application/rdf+xml,application/xml,text/xml,application/xhtml+xml,text/html
<b>If-modified-since: Fri, 16 Dec 2005 14:19:38 GMT</b>

[00:00.047 - server connected]
HTTP/1.1 304 Not Modified
Content-Location: XMLSchema.html
Expires: Wed, 07 Feb 2007 15:09:29 GMT
Cache-Control: max-age=21600
Vary: negotiate, accept, accept-charset
</pre>
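
<p>A conditional request like the one in this trace could be prepared
as in the following Python sketch. It only builds the request and does
not send it; the function name <tt>conditional_get</tt> is hypothetical,
and the <tt>Accept</tt> value is copied from the trace.</p>

<pre>
```python
from typing import Optional
from urllib.request import Request

# Accept header as used by the agent in the trace above.
ACCEPT = "application/rdf+xml,application/xml,text/xml,application/xhtml+xml,text/html"


def conditional_get(url: str, last_modified: Optional[str] = None) -> Request:
    """Build a GET request, made conditional when a cached date is known."""
    headers = {"Accept": ACCEPT}
    if last_modified is not None:
        # Ask the origin server to reply 304 if our cached copy is current.
        headers["If-Modified-Since"] = last_modified
    return Request(url, headers=headers, method="GET")


request = conditional_get(
    "http://www.w3.org/2001/XMLSchema",
    "Fri, 16 Dec 2005 14:19:38 GMT",
)
```
</pre>

<p>When such a request is sent with <tt>urllib.request.urlopen</tt>, a
<tt>304 Not Modified</tt> reply surfaces as an <tt>HTTPError</tt> whose
<tt>code</tt> is 304, so an agent following this sketch would catch
that case and fall back to its cached copy.</p>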

<p>Since our cached copy of the XML Schema namespace document
shows no associated GRDDL transformation, we return
to the namespace transformation from <tt>po-ex</tt>,
i.e. <tt>grokPO.xsl</tt>:</p>

<pre>
[00:00.000 - client connection from 127.0.0.1:39653]
GET http://www.w3.org/2003/g/grokPO.xsl HTTP/1.1
Host: www.w3.org
Accept: application/xml

[00:00.048 - server connected]
HTTP/1.1 200 OK
Last-Modified: Tue, 07 Dec 2004 23:33:28 GMT
Content-Length: 1739
Content-Type: application/xml; qs=0.9

&lt;xsl:transform
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:po="http://www.w3.org/2003/g/po-ex"
    xmlns:poF="http://www.w3.org/2003/g/po-ex#"
    >

&lt;xsl:output method="xml" indent="yes" />

&lt;div xmlns="http://www.w3.org/1999/xhtml">
&lt;h1>grokPO.xsl -- interpret purchase order format as RDF&lt;/h1>
<em>...</em>
</pre>

<p>Applying this transformation to <tt>po-doc.xml</tt> yields RDF/XML;
we parse this to an RDF graph (using the URI of the source document,
<tt>http://www.w3.org/2003/g/po-doc.xml</tt>, as the base URI) and
return the graph as a GRDDL result of <tt>po-doc.xml</tt>:</p>

<pre>
&lt;rdf:RDF
   xmlns:poF="http://www.w3.org/2003/g/po-ex#"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
  &lt;rdf:Description rdf:nodeID="hOhqYGhx9">
    &lt;poF:city>Mill Valley&lt;/poF:city>
    &lt;poF:state>CA&lt;/poF:state>
    &lt;poF:zip>90952&lt;/poF:zip>
    &lt;poF:street>123 Maple Street&lt;/poF:street>
    &lt;poF:name>Alice Smith&lt;/poF:name>
  &lt;/rdf:Description>
<em>...</em>
</pre>

<p>HTTP trace data was collected via <a
href="http://hathawaymix.org/Software/TCPWatch">TCPWatch</a> by Shane
Hathaway. For more details, see <a
href="http://www.w3.org/2001/sw/grddl-wg/td/testlist1#http_tracing">HTTP
tracing in the GRDDL test materials</a>.</p>


</div>
</div>

<div>
<h2 id="sec"><span class="gen">8. </span>Security considerations</h2>

<p>The execution of general-purpose programming languages as
interpreters for transformations exposes serious security risks.
Designers of GRDDL-aware agents are advised to guard against simply
sending GRDDL transformations to "off-the-shelf" interpreters.  While
it is usually safe to pass documents from trusted sources through a
GRDDL transformation, implementors should consider all of the
following before adding the ability to execute arbitrary GRDDL
transformations linked from arbitrary Web documents.</p>

<p>GRDDL, like many Web technologies, fundamentally relies on the dereferencing of URIs.
Writers of GRDDL transformations are advised against employing URL operations
which are potentially dangerous, because these operations are more likely to be
unavailable in secure GRDDL implementations. Software executing GRDDL transformations
is advised either to disable all potentially dangerous URL operations completely or
to take special care not to delegate any special authority to their operation. In particular,
operations to read or write URLs are more safely executed with the privileges associated
with an untrusted party, rather than the current user. Such disabling and/or checking
should be done completely outside of the reach of the transformation language itself;
care should be taken to ensure that no method exists for re-enabling full-function versions
of these operators.</p>
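
<p>As an illustration of checking URL operations outside the
transformation language, a host application might consult an allow-list
of schemes before dereferencing anything on behalf of a transformation.
The sketch below is one possible policy, not a recommendation of this
specification; the names <tt>ALLOWED_SCHEMES</tt> and
<tt>may_dereference</tt> are hypothetical.</p>

<pre>
```python
from urllib.parse import urlsplit

# Assumption: only plain Web schemes are acceptable to this agent.
ALLOWED_SCHEMES = frozenset({"http", "https"})


def may_dereference(url: str, allowed_schemes=ALLOWED_SCHEMES) -> bool:
    """Return True only for absolute URLs whose scheme is on the allow-list."""
    # Relative references have an empty scheme and are rejected here;
    # they should be resolved to absolute URLs before this check.
    scheme = urlsplit(url).scheme.lower()
    return scheme in allowed_schemes
```
</pre>

<p>A scheme check of this kind does not by itself prevent access to
hosts inside a private network, so it would normally be combined with
other restrictions.</p>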

<p>The remainder of this section outlines some, though probably not
all, of the possible problems with the execution of GRDDL transformations,
with particular reference to transformations in XSLT.</p>

<ol>

<li>With unconstrained use of GRDDL, untrusted
transformations may access URLs for which the end user has read or write
permission, while the author of the transformation does not. This is
particularly pertinent for URLs from the file: scheme, but many other
schemes are also affected.  The untrusted code may, having read
documents which the author did not have permission to access, transmit
the content of those documents to arbitrary Web servers by encoding the
content within a URL that is then passed to the server.
</li>
</li>

<li>Dangerous operations in the XSLT language include, but may not be
limited to, the operations involving getting a URL:
<tt>document()</tt>, <tt>doc()</tt>, <tt>unparsed-text()</tt> and
<tt>unparsed-text-available()</tt>, and <tt>xsl:result-document</tt>
which involves writing to a URL.  <tt>xsl:include</tt> and
<tt>xsl:import</tt> present fewer risks if they are processed before
execution of the transformation, rather than during it.


</li>

<li>Some transformation language implementations may provide facilities for loading
and executing other programming language code. For example,
an XSLT implementation may provide a method for executing Java code.
Such facilities are obviously open to abuse.
Designers of GRDDL transformations are advised against making use of
such features. Besides being implementation-specific, they are more likely to be
unavailable in secure implementations of the transformation language.
Software executing GRDDL transformations should protect against
such operators in case they are encountered.</li>

<li>XSLT implementations often provide their own extensions.
Designers of GRDDL transformations are advised not to make use of extensions
because they are not guaranteed to be present in all implementations.
Software executing GRDDL transformations should make sure that extensions
are secure and do not present any kind of threat.
</li>

<li>It is possible to write transformations that inordinately consume system resources
or that loop indefinitely. Both types of transformations have the potential to cause damage
if sent to unsuspecting recipients. Designers of GRDDL transformations are advised
to avoid the construction and dissemination of such transformations.
Software executing GRDDL transformations should provide appropriate mechanisms
to abort processing after a reasonable amount of time has elapsed. In addition,
GRDDL software should be limited to the consumption of only a reasonable amount
of any given system resource.</li>

<li>Finally, bugs may exist in some interpreters of a transformation language which
might be exploited to gain unauthorized access to a recipient's system.
Apart from noting this possibility, no specific action is advised
other than timely correction of such bugs as they are discovered.
</li>
</ol>
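
<p>The XSLT operations named in the list above can be looked for before
a stylesheet is executed. The Python sketch below is a coarse pre-flight
filter only: it is an assumption of this example, not part of GRDDL, it
may report false positives, and it does not replace disabling the
operations in the XSLT processor itself. The name
<tt>find_risky_operations</tt> is hypothetical, and the stylesheet is
assumed to have been fetched already and to be parsed with a parser
configured safely for untrusted input.</p>

<pre>
```python
import re
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Function calls that get a URL, as listed above; longer names come first.
RISKY_FUNCTIONS = re.compile(
    r"(?:^|[^\w.-])(unparsed-text-available|unparsed-text|document|doc)\s*\("
)


def find_risky_operations(stylesheet_text: str) -> list:
    """Return the sorted names of URL-related operations found in a stylesheet."""
    root = ET.fromstring(stylesheet_text)
    findings = set()
    for element in root.iter():
        # xsl:result-document involves writing to a URL.
        if element.tag == "{%s}result-document" % XSL_NS:
            findings.add("xsl:result-document")
        # XPath expressions live in attribute values (select, test, ...).
        for value in element.attrib.values():
            for match in RISKY_FUNCTIONS.finditer(value):
                findings.add(match.group(1) + "()")
    return sorted(findings)
```
</pre>

<p>An agent might refuse to run any stylesheet for which this function
returns a non-empty list, unless the stylesheet comes from a trusted
source.</p>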

</div>


<div><h2 id="grddlvocab"><span class="gen">9.</span> The GRDDL Vocabulary</h2>

<p>The following is excerpted from the GRDDL profile/namespace
document:</p>

<blockquote>
<p>This document, <a rel="ns-claim" href="http://www.w3.org/2003/g/data-view">http://www.w3.org/2003/g/data-view</a>,
is a metadata profile in the sense of the HTML specification, in section 
<a href="/TR/1999/REC-html401-19991224/struct/global.html#h-7.4.4.3">7.4.4.3 Meta data profiles</a>.</p>

<p>The following term is introduced here as an XHTML link relationship
name and RDF property name:</p>

<ul>
  <li id="transformation" class="-rdf-Property">
    <tt class="rdfs-label">transformation</tt>: <span
    class="rdfs-comment">relates a source document to a
    transformation, usually represented in <a
    href="/TR/xslt">XSLT</a>, that relates the source document syntax
    to the RDF graph syntax</span>.  domain: <a rel="rdfs-domain"
    href="#RootNode">RootNode</a>; range: <a
    rel="rdfs-range" href="#Transformation">Transformation</a>
  </li>

</ul>

<p>The following terms are introduced here as RDF properties:</p>

<ul>
  <li id="namespaceTransformation" class="-rdf-Property">
    <tt class="rdfs-label">namespaceTransformation</tt>: <span
    class="rdfs-comment">relates a namespace to a transformation for
    all documents in that namespace</span>.  range: <a
    rel="rdfs-range" href="#Transformation">Transformation</a>
  </li>

  <li id="profileTransformation" class="-rdf-Property">
    <tt class="rdfs-label">profileTransformation</tt>: <span
    class="rdfs-comment">relates a profile document to a
    transformation for all documents bearing that profile</span>.
    range: <a rel="rdfs-range"
    href="#Transformation">Transformation</a>
  </li>

  <li id="result" class="-rdf-Property">
    <tt class="rdfs-label">result</tt>: <span class="rdfs-comment">an
    RDF graph obtained from an information resource by directly
    parsing a representation in the standard RDF/XML syntax or
    indirectly by parsing some other dialect using a transformation
    nominated by the document</span>. domain: <a rel="rdfs-domain"
    href="#InformationResource">InformationResource</a>; range: <a
    rel="rdfs-range" href="#RDFGraph">RDFGraph</a>
  </li>

  <li id="transformationProperty" class="-owl-FunctionalProperty">
    <tt class="rdfs-label">transformationProperty</tt> <span
    class="rdfs-comment">relates a transformation to the algorithm
    specified by the property that computes an RDF graph from an XML
    document node</span> domain: <a rel="rdfs-domain"
    href="#Transformation">Transformation</a> range: <a
    rel="rdfs-range"
    href="#TransformationProperty">TransformationProperty</a>
  </li>
  <li id="Transformation" class="-rdfs-Class">
    <tt class="rdfs-label">Transformation</tt> <span
    class="rdfs-comment">an <a rel="rdfs-subClassOf"
    href="#InformationResource">InformationResource</a> that specifies
    a transformation from a set of XML documents to RDF graphs</span>
    Each Transformation has at least one <a rel="owl-onProperty"
    href="#transformationProperty">transformationProperty</a> that is
    a <a rel="owl-someValuesFrom"
    href="#TransformationProperty">TransformationProperty</a>.
  </li>

  <li id="TransformationProperty" class="-rdfs-Class">
    <tt class="rdfs-label">TransformationProperty</tt>
    <span class="rdfs-comment">a <a rel="rdfs-subClassOf"
    href="http://www.w3.org/2002/07/owl#FunctionalProperty"
    >FunctionalProperty</a> that relates
    <a href="#RootNode">XML document root nodes</a> to
    <a href="#RDFGraph">RDF graphs</a></span>
  </li>

</ul>

<p>The following terms are bound to concepts from existing standards:</p>

<ul>
  <li id="RootNode" class="-rdfs-Class">
    <tt class="rdfs-label">RootNode</tt> <span
    class="rdfs-comment">the root of the tree in the XPath data
    model</span>, per <a rel="rdfs-isDefinedBy"
    href="http://www.w3.org/TR/1999/REC-xpath-19991116#root-node">section
    5.1 Root Node in <cite>XML Path Language (XPath) Version
    1.0</cite></a>
  </li>

  <li id="RDFGraph" class="-rdfs-Class">
    <tt class="rdfs-label">RDFGraph</tt> <span class="rdfs-comment">a
    set of RDF triples</span>, per <a rel="rdfs-isDefinedBy"
    href="http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-rdf-graph">definition
    in <cite>Resource Description Framework (RDF): Concepts and
    Abstract Syntax</cite></a>

  </li>

  <li id="InformationResource" class="-rdfs-Class">
    <tt class="rdfs-label">InformationResource</tt>
    <span class="rdfs-comment">A resource which has the property that all of its essential characteristics can be conveyed in a message</span>, per <a rel="rdfs-isDefinedBy" href="http://www.w3.org/TR/2004/REC-webarch-20041215/#def-information-resource">definition in <cite>Architecture of the World Wide Web, Volume One</cite></a>
  </li>

</ul>
</blockquote>

<p>The namespace document includes RDF data about the terms in the
GRDDL Vocabulary, but these RDF data do not include any triples whose
predicate is <tt>grddl:profileTransformation</tt>.</p>

<p>In the section on <a href="#ns-bind">Using GRDDL with XML Namespace
Documents</a>, only explicit <tt>grddl:namespaceTransformation</tt>
triples satisfy the premise of the rule.  Likewise,
<tt>grddl:profileTransformation</tt> triples must be explicit in the
GRDDL result of a profile document in order to satisfy the premise of
the rule in the section on <a href="#profile-bind">GRDDL for
HTML Profiles</a>.  Authors of GRDDL source documents are advised
against using RDFS or OWL expressions which imply such triples but do
not explicitly state them.
</p>
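
<p>The practical consequence is that an agent can look for these
triples with a simple match on the predicate, with no inference step.
The Python sketch below illustrates this for namespace documents; the
graph is modelled as a collection of string triples, and the function
name <tt>explicit_namespace_transformations</tt> is hypothetical.</p>

<pre>
```python
# Full URI of the grddl:namespaceTransformation property.
NS_TRANSFORMATION = "http://www.w3.org/2003/g/data-view#namespaceTransformation"


def explicit_namespace_transformations(graph, namespace_uri):
    """Return transformations stated explicitly for a namespace; no inference."""
    return sorted(
        obj
        for (subject, predicate, obj) in graph
        if subject == namespace_uri and predicate == NS_TRANSFORMATION
    )
```
</pre>

<p>A triple that uses some sub-property of
<tt>grddl:namespaceTransformation</tt> would not be returned by this
function, which mirrors the advice above to state such triples
explicitly.</p>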

</div>


<div><h2 id="bib"><span class="gen">10. </span>References</h2>

<h3 id="normativeRefs">Normative References</h3>

<dl class="bib">
<dt id="rfc3987">RFC3987</dt>
<dd><cite><a href="http://www.ietf.org/rfc/rfc3987.txt">Internationalized Resource Identifiers (IRIs)</a></cite> Internet RFC 3987 January 2005. Duerst, Suignard
</dd>


<dt id="rfc3986">RFC3986</dt>
<dd><cite><a href="http://www.apps.ietf.org/rfc/rfc3986.html">Uniform Resource Identifier (URI): Generic Syntax</a></cite> Internet RFC3986 January 2005. Berners-Lee, Fielding, Masinter
</dd>

<dt>
<a name="WEBARCH" id="WEBARCH">WEBARCH</a>
</dt>
<dd>
<cite>
<a href="http://www.w3.org/TR/2004/REC-webarch-20041215/">Architecture of the World Wide Web, Volume One</a>
</cite>, N. Walsh, I. Jacobs,  Editors, W3C Recommendation, 15 December 2004, http://www.w3.org/TR/2004/REC-webarch-20041215/ . <a href="http://www.w3.org/TR/webarch/">Latest version</a> available at http://www.w3.org/TR/webarch/ .</dd>

<dt>
<a name="RDFC04" id="RDFC04">RDFC04</a>

</dt>
<dd>
<cite>
<a href="http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/">Resource Description Framework (RDF): Concepts and Abstract Syntax</a>
</cite>, G. Klyne, J. J. Carroll,  Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/ . <a href="http://www.w3.org/TR/rdf-concepts/">Latest version</a> available at http://www.w3.org/TR/rdf-concepts/ .</dd>

        <dt>
            <a name="RDF-MT" id="RDF-MT">RDF-MT</a>

         </dt>
         <dd>
            <cite>
               <a href="http://www.w3.org/TR/2004/REC-rdf-mt-20040210/">RDF Semantics</a>
            </cite>, P. Hayes,  Editor, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-mt-20040210/ . <a href="http://www.w3.org/TR/rdf-mt/" title="Latest version of RDF Semantics">Latest version</a> available at http://www.w3.org/TR/rdf-mt/ .</dd>

<dt id="RDFX">RDFX</dt>
<dd>
  <cite>
    <a href=
    "http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/">RDF/XML
    Syntax Specification (Revised)</a></cite>, D. Beckett, Editor, W3C
    Recommendation, 10 February 2004,
    http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/ . <a
    href="http://www.w3.org/TR/rdf-syntax-grammar" title="Latest
    version of RDF/XML Syntax Specification (Revised)">Latest
    version</a> available at http://www.w3.org/TR/rdf-syntax-grammar .
</dd>

<dt>
  <a name="XMLBASE" id="XMLBASE">XMLBASE</a>
  
</dt>
<dd>
  <cite>
    <a href="http://www.w3.org/TR/2001/REC-xmlbase-20010627/">XML Base</a>
</cite>, J. Marsh,  Editor, W3C Recommendation, 27 June 2001, http://www.w3.org/TR/2001/REC-xmlbase-20010627/ . <a href="http://www.w3.org/TR/xmlbase/" title="Latest version of XML Base">Latest version</a> available at http://www.w3.org/TR/xmlbase/ .</dd>


<dt>
<a name="XHTML" id="XHTML">XHTML</a>
</dt>
<dd>
<cite>
<a href="http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/">Modularization of XHTML&#x2122;</a>
</cite>, S. Schnitzenbaumer, F. Boumphrey, T. Wugofski, S. McCarron, M. Altheim, S. Dooley,  Editors, W3C Recommendation, 10 April 2001, http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/ . <a href="http://www.w3.org/TR/xhtml-modularization/">Latest version</a> available at http://www.w3.org/TR/xhtml-modularization/ .</dd>

<dt>
<a name="HTML4" id="HTML4">HTML4</a>
</dt>
<dd>
<cite>
<a href="http://www.w3.org/TR/1999/REC-html401-19991224">HTML 4.01 Specification</a>
</cite>, D. Raggett, A. Le Hors, I. Jacobs,  Editors, W3C Recommendation, 24 December 1999, http://www.w3.org/TR/1999/REC-html401-19991224 . <a href="http://www.w3.org/TR/html401">Latest version</a> available at http://www.w3.org/TR/html401 .</dd>

<dt id="XPATH">XPATH</dt>
<dd>
  <cite><a href="http://www.w3.org/TR/1999/REC-xpath-19991116">XML
  Path Language (XPath) Version 1.0</a> </cite>, J. Clark,
  S. J. DeRose, Editors, W3C Recommendation, 16 November 1999,
  http://www.w3.org/TR/1999/REC-xpath-19991116 . <a
  href="http://www.w3.org/TR/xpath" title="Latest version of XML Path
  Language (XPath) Version 1.0">Latest version</a> available at
  http://www.w3.org/TR/xpath .
</dd>

<dt>
<a name="XSLT1" id="XSLT1">XSLT1</a>
</dt>
<dd>
<cite>
<a href="http://www.w3.org/TR/1999/REC-xslt-19991116">XSL Transformations (XSLT) Version 1.0</a>

</cite>, J. Clark,  Editor, W3C Recommendation, 16 November 1999, http://www.w3.org/TR/1999/REC-xslt-19991116 . <a href="http://www.w3.org/TR/xslt">Latest version</a> available at http://www.w3.org/TR/xslt .</dd>

</dl>

<h3 id="informativeRefs">Informative references</h3>

<p>The following documents provide additional background but are not
part of this specification.</p>

<dl class="bib">
  <dt>
    <a name="primer" id="primer">primer</a>
    
  </dt>
  <dd>
    <cite>
      <a
      href="http://www.w3.org/TR/2006/WD-grddl-primer-20061002/">GRDDL
      Primer</a>
      </cite>, I. Davis,  Editor, W3C Working Draft (work in progress), 2 October 2006, http://www.w3.org/TR/2006/WD-grddl-primer-20061002/ . <a href="http://www.w3.org/TR/grddl-primer/"
      title="Latest version of GRDDL Primer">Latest version</a> available at http://www.w3.org/TR/grddl-primer/ .
  </dd>

  <dt>
    <a name="usecases" id="usecases">usecases</a>
  </dt>
  <dd>
    <cite>
      <a
      href="http://www.w3.org/TR/2007/NOTE-grddl-scenarios-20070406/">GRDDL
      Use Cases: Scenarios of extracting RDF data from XML
      documents</a> </cite>, F. Gandon, Editor, W3C Working Group
      Note, 6 April 2007,
      http://www.w3.org/TR/2007/NOTE-grddl-scenarios-20070406/ . <a
      href="http://www.w3.org/TR/grddl-scenarios/" title="Latest
      version of GRDDL Use Cases: Scenarios of extracting RDF data
      from XML documents">Latest version</a> available at
      http://www.w3.org/TR/grddl-scenarios/ .
  </dd>
      
  <dt>
    <a name="GRDDL-TESTS" id="GRDDL-TESTS">GRDDL-TESTS</a>
  </dt>
  <dd>
    <cite>
      <a href="http://www.w3.org/TR/2007/WD-grddl-tests-20070328/">GRDDL Test Cases</a>
      </cite>, C. Ogbuji,  Editor, W3C Working Draft (work in progress), 28 March 2007, http://www.w3.org/TR/2007/WD-grddl-tests-20070328/ . <a href="http://www.w3.org/TR/grddl-tests/"
      title="Latest version of GRDDL Test Cases">Latest version</a> available at http://www.w3.org/TR/grddl-tests/ .</dd>
      

  <dt id="SPARQL">SPARQL</dt>
         <dd>
            <cite>
               <a href="http://www.w3.org/TR/2007/WD-rdf-sparql-query-20070326/">SPARQL Query Language for RDF</a>
            </cite>, E. Prud'hommeaux, A. Seaborne,  Editors, W3C Working Draft (work in progress), 26 March 2007, http://www.w3.org/TR/2007/WD-rdf-sparql-query-20070326/ . <a href="http://www.w3.org/TR/rdf-sparql-query/"
               title="Latest version of SPARQL Query Language for RDF">Latest version</a> available at http://www.w3.org/TR/rdf-sparql-query/ .</dd>

 
<dt>
            <a name="XSLT2" id="XSLT2">XSLT2</a>

         </dt>
         <dd>
            <cite>
               <a href="http://www.w3.org/TR/2007/REC-xslt20-20070123/">XSL Transformations (XSLT) Version 2.0</a>
            </cite>, M. Kay,  Editor, W3C Recommendation, 23 January 2007, http://www.w3.org/TR/2007/REC-xslt20-20070123/ . <a href="http://www.w3.org/TR/xslt20"
               title="Latest version of XSL Transformations (XSLT) Version 2.0">Latest version</a> available at http://www.w3.org/TR/xslt20 .</dd>


<dt id="RFC2731">RFC2731</dt>
<dd>J. Kunze <cite><a
href="http://www.ietf.org/rfc/rfc2731.txt">Encoding Dublin Core
Metadata in HTML</a></cite>  in 1999</dd>

<dt id="XFN">XFN</dt>

<dd>
<cite><a href="http://gmpg.org/xfn/intro">XFN: Introduction and
Examples</a>
</cite>
copyright GMPG 2003-2007. Eric, Tantek, and Matt
</dd>

<dt id="DCRDF">DCRDF</dt>
<dd><cite><a href="http://dublincore.org/documents/2002/07/31/dcmes-xml/">Expressing Simple Dublin Core in RDF/XML</a></cite>
Beckett, Miller, Brickley 2002-07-31</dd>

<dt>
<a name="P3P" id="P3P">P3P</a>
</dt>
<dd>
<cite>
<a href="http://www.w3.org/TR/2002/REC-P3P-20020416/">The Platform for Privacy Preferences 1.0 (P3P1.0)
Specification</a>
</cite>, M. Marchiori,  Editor, W3C Recommendation, 16 April 2002, http://www.w3.org/TR/2002/REC-P3P-20020416/ . <a href="http://www.w3.org/TR/P3P/">Latest version</a> available at http://www.w3.org/TR/P3P/ .</dd>
         <dt>
            <a name="STYPI" id="STYPI">STYPI</a>
         </dt>
         <dd>
            <cite>
               <a href="http://www.w3.org/1999/06/REC-xml-stylesheet-19990629">Associating Style Sheets with XML documents</a>
            </cite>, J. Clark,  Editor, W3C Recommendation, 29 June 1999, http://www.w3.org/1999/06/REC-xml-stylesheet-19990629 . <a href="http://www.w3.org/TR/xml-stylesheet"
               title="Latest version of Associating Style Sheets with XML documents">Latest version</a> available at http://www.w3.org/TR/xml-stylesheet .</dd>
         <dt>
            <a name="XPROC" id="XPROC">XPROC</a>

         </dt>
         <dd>
            <cite>
               <a href="http://www.w3.org/TR/2006/WD-xproc-20060928/">XProc: An XML Pipeline Language</a>
            </cite>, N. Walsh,  Editor, W3C Working Draft (work in progress), 28 September 2006, http://www.w3.org/TR/2006/WD-xproc-20060928/ . <a href="http://www.w3.org/TR/xproc/"
               title="Latest version of XProc: An XML Pipeline Language">Latest version</a> available at http://www.w3.org/TR/xproc/ .</dd>

<dt id="MF-RDF-FAQ">MF-RDF-FAQ</dt>
<dd><cite><a href="http://microformats.org/wiki/faqs-for-rdf"> Microformat FAQs for RDF Fans</a></cite>, last modified 17:57, 30 May 2006</dd>
</dl>

</div>

<div><h2 id="stylepi">Appendix: Transformations for Styling versus data extraction (Informative)</h2>

<p>The xml-stylesheet processing instruction<a class="inform"
href="#STYPI">[STYPI]</a> is generally deployed for automated
presentation processing. This type of link is different from links to
GRDDL transformation algorithms, which are intended to facilitate
extracting data. Also, parsing the content of processing instructions
is not supported by XML tools such as XSLT processors, and grounding
processing instructions in URI space is not as straightforward as
using namespaces with attributes.
</p>
</div>

<div>
<h2 id="base_misc">Appendix: Base IRI considerations</h2>

<p>
In the <a href="#grddl-xml">Adding GRDDL to well-formed XML</a> section,
we have:
</p>
<blockquote>
<p>
The base IRI for interpreting relative IRI references in a 
serialization of a graph produced by a GRDDL transformation 
is the base IRI of the source document.
</p>
</blockquote>
<p>
This corresponds to RFC 3986, particularly 
<a href="http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1">section 5.1</a>,
which illustrates the identification of a base URI, with the following picture:
</p>
<pre>
         .----------------------------------------------------------.
         |  .----------------------------------------------------.  |
         |  |  .----------------------------------------------.  |  |
         |  |  |  .----------------------------------------.  |  |  |
         |  |  |  |  .----------------------------------.  |  |  |  |
         |  |  |  |  |       &lt;relative-reference&gt;       |  |  |  |  |
         |  |  |  |  `----------------------------------'  |  |  |  |
         |  |  |  | (<a href="http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1.1">5.1.1</a>) Base URI embedded in content   |  |  |  |
         |  |  |  `----------------------------------------'  |  |  |
         |  |  | (<a href="http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1.2">5.1.2</a>) Base URI of the encapsulating entity |  |  |
         |  |  |         (message, representation, or none)   |  |  |
         |  |  `----------------------------------------------'  |  |
         |  | (<a href="http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1.3">5.1.3</a>) URI used to retrieve the entity            |  |
         |  `----------------------------------------------------'  |
         | (<a href="http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1.4">5.1.4</a>) Default Base URI (application-dependent)         |
         `----------------------------------------------------------'
</pre>
<p>
During typical GRDDL processing, an intermediate RDF/XML serialization is
produced as the output of a transform.
To convert this serialization into an RDF graph, any relative references
in the serialization are resolved to IRIs.  To identify
the appropriate base IRI for resolving a given relative reference,
first
check for a base URI embedded within this RDF/XML,
following XML Base, as permitted by RDF Syntax. 

If there is no base URI embedded within this RDF/XML, then section
5.1.2 of RFC 3986 may apply, because the <em>encapsulating entity</em>
of this serialization is the root element of the input document.  If
this element does not define a base URI, then its encapsulating
entity, the input document, may define a base IRI.
</p>

<p>
The original document may be an XHTML family document, or
it may be some other XML document.
</p>

<h3 id="base_xhtml">The Base IRI of an XHTML Family document</h3>

<p>For an XHTML family document,
the base IRI of the input document may be specified as the value
of the <code>href</code> attribute of the <code>&lt;base&gt;</code>
element (if any).
This is in accordance with section 5.1.1 of RFC 3986.
</p>

<p>
In many other cases, section 5.1.2 does not apply, and section 5.1.3
does apply.
Section 5.1.3 specifies the use of
the retrieval IRI as the base IRI.
Furthermore,
<a href="http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1.3">section 5.1.3</a>
of RFC 3986 specifies that:
</p>
<blockquote>
<p>
if the retrieval was the result of a redirected request, 
the last URI used (i.e., the URI that 
resulted in the actual retrieval of the representation) 
is the base URI.
</p>
</blockquote>

<p>
The resulting IRI is used as the base IRI parameter for processing
the intermediate RDF/XML serialization.
</p>
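
<p>For illustration, the choice described above for XHTML family
documents might be sketched in Python as follows. The function name
<code>xhtml_base_iri</code> is hypothetical, the document is assumed to
be namespace well-formed XHTML without entity references that require a
DTD, and <code>retrieval_uris</code> is assumed to list the URIs used
during retrieval in order, ending with the one that actually returned
the representation.</p>

<pre>
```python
import xml.etree.ElementTree as ET
from typing import List

XHTML_NS = "http://www.w3.org/1999/xhtml"


def xhtml_base_iri(document_text: str, retrieval_uris: List[str]) -> str:
    """Pick the base IRI of an XHTML document per RFC 3986 section 5.1."""
    root = ET.fromstring(document_text)
    # Section 5.1.1: a base URI embedded in content (the base element).
    base = root.find("{%s}head/{%s}base" % (XHTML_NS, XHTML_NS))
    if base is not None and base.get("href"):
        return base.get("href")
    # Section 5.1.3: otherwise the last URI used after any redirects.
    return retrieval_uris[-1]
```
</pre>

<p>Keeping the whole redirect chain as input, rather than only the
requested URI, makes the redirected case explicit in the caller.</p>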

<h3 id="base_other_xml">The base IRI of other XML documents</h3>

<p>Other XML documents may use XML Base.
This is only recommended when the specific document format
permits the use of XML Base.
</p>

<p>When an <code>xml:base</code> attribute is present
on the root element of an XML document, this
specifies the base IRI for that document,
following section  5.1.1 of RFC 3986.
</p>

<p>When there is no <code>xml:base</code> attribute
on the root element, even if there is such an attribute on
a descendant element, then section 5.1.1 of RFC 3986 does not apply.
</p>

<p>
As in the XHTML case, we then have to consider sections
5.1.2, 5.1.3 and 5.1.4 of RFC 3986.
</p>

<p>
Of these, section 5.1.3 is the most common case,
and the note about redirected retrieval also applies.
</p>
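
<p>A comparable sketch for other XML documents is given below. It looks
only at the root element, as described above, and falls back to the
retrieval IRI. The function name <code>xml_document_base_iri</code> is
hypothetical, and resolving a relative <code>xml:base</code> value
against the retrieval IRI is an assumption drawn from XML Base rather
than from this section.</p>

<pre>
```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def xml_document_base_iri(document_text: str, retrieval_uri: str) -> str:
    """Pick the base IRI of a generic XML document."""
    root = ET.fromstring(document_text)
    declared = root.get(XML_BASE)
    if declared is not None:
        # Section 5.1.1: xml:base on the root element sets the base IRI.
        return urljoin(retrieval_uri, declared)
    # xml:base on descendant elements is ignored here; use section 5.1.3.
    return retrieval_uri
```
</pre>

<p>As in the XHTML case, <code>retrieval_uri</code> should be the last
URI used if the request was redirected.</p>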

<h3 id="pipeline">The base IRI in a processing pipeline</h3>
<p>
A GRDDL-aware agent computes GRDDL results when
</p>
<blockquote>
<p>
given a URI <var>I</var> of an information resource <var>IR</var>, and
an XPath node <var>N</var> for a representation of <var>IR</var>
</p>
</blockquote>
<p>
To use a GRDDL aware agent in a processing pipeline,
as well as the XPath node <var>N</var>, it is also necessary
to specify a corresponding IRI  <var>I</var>.
This is used as the base IRI when the other mechanisms
do not apply.
This corresponds to section 5.1.4 of RFC 3986.
It is even possible for the default IRI used to bear
no relationship with the XPath node <var>N</var>,
but in such a case, we 
<a href="http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1.4">read</a>:
</p>
<blockquote>
<p>
As this definition is necessarily application-dependent, failing to define a base URI by using one of the other methods may result in the same content being interpreted differently by different types of applications.
</p>
<p>
A sender of a representation containing relative references is responsible for ensuring that a base URI for those references can be established.
</p>
</blockquote>

<h3 id="rcpbase">Responsibilities for correct processing of base IRIs</h3>

<h4 id="bdoc_auth">Document authors, including profile and namespace documents</h4>

<p>Document authors should, in general, include a base URI
if the document is retrievable from some other URI.</p>

<p>For an XHTML family document<a href="#XHTML">[XHTML]</a>, this is done using the <code>base</code> element.</p>

<p>For other XML documents, if the format supports <code>xml:base</code> 
then this should be used. In general, experience suggests that there is
least confusion when this is done on the root element.
Document authors may also use <code>xml:base</code> attributes
elsewhere in their documents, as permitted by the document format,
with semantics as defined by XML Base<a href="#XMLBASE">[XMLBASE]</a>.
</p>

<p>For XML documents in formats that do not support <code>xml:base</code>,
and are not XHTML family documents, there is no support in GRDDL for
specifying an in-line base URI.</p>

<p>When a profile or namespace document can be accessed via multiple URIs,
for instance by a redirect, document authors should, in general, 
provide a GRDDL result that specifies profile transformations or 
namespace transformations for each of these URIs.
</p>


<h4 id="base_agt">GRDDL aware agents</h4>

<p>
When a GRDDL result is represented in RDF/XML
using the <a href="#rule_rdfxbase">rule for RDF/XML</a>,
a base URI may be needed for this representation, in order to convert it
into an RDF graph, following the rules in the RDF/XML Syntax Specification<a href="#RDFX">[RDFX]</a>.
</p>
<p>
GRDDL results represented in other ways may also need a base URI.
</p>
<p>
Following the analysis above, a base URI for resolving a relative
reference is defined by following section 5.1 of RFC 3986.</p>
<p>
In many applications, it is highly undesirable
for GRDDL results to depend on an application default URI,
as described in section 5.1.4 of RFC 3986; some GRDDL
aware agents may treat this possibility as an error.
</p>
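
<p>
One way an agent might apply this order of precedence, and optionally
refuse an application default, is sketched below in Python. The names
<code>select_base_iri</code>, <code>NoBaseIRI</code> and the
<code>strict</code> flag are hypothetical. The sketch assumes that any
embedded base is already absolute, and it leaves out section 5.1.2
(encapsulating entity).
</p>
<pre>
```python
class NoBaseIRI(Exception):
    """Raised when only an application default base would be available."""


def select_base_iri(embedded=None, retrieval=None, default=None, strict=True):
    """Return (base, section) following the order of RFC 3986 section 5.1.

    embedded: base found in the content (assumed absolute), or None.
    retrieval: last URI used to retrieve the representation, or None.
    default: application default, such as the IRI given to a pipeline.
    strict: when true, relying on the default is treated as an error.
    """
    if embedded is not None:
        return embedded, "5.1.1"
    if retrieval is not None:
        return retrieval, "5.1.3"
    if default is not None and not strict:
        return default, "5.1.4"
    raise NoBaseIRI("no base IRI from the content or the retrieval")


print(select_base_iri(retrieval="http://example.org/doc"))
print(select_base_iri(default="http://example.org/pipeline", strict=False))
try:
    select_base_iri(default="http://example.org/pipeline")
except NoBaseIRI as error:
    print("rejected:", error)
```
</pre>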

<h4 id="base_auth">GRDDL transformation authors</h4>

<p>
In general, when writing a GRDDL transformation for
an XHTML family document to RDF/XML, the best advice is to ignore
issues to do with the base URI.
The easiest approach is to produce relative URIs in the output,
corresponding to any relative URIs in the input,
and absolute URIs corresponding to any concepts built into
the transform.
Such relative URIs will be resolved, during the processing
performed by a GRDDL aware agent, against the correct base URI.
</p>

<p>
When writing a GRDDL transformation for an XML document format
that does not support xml:base, and has no means to represent
an in-line base URI, there is little choice but to ignore issues
of the correct base.
</p>

<p>
When writing a GRDDL transformation for an XML document format,
other than an XHTML family document,
that does not support xml:base, but has some other means to represent
an in-line base URI, then a GRDDL aware agent will be ignorant
of this means, and a well-written GRDDL transformation will attempt
to correct for this. When a base URI is specified in such a way,
one approach is to insert the base URI into the RDF/XML output as
the value of an <code>xml:base</code> attribute, so that
the RDF/XML parser will resolve relative URIs against that base,
and ignore the base URI passed by the GRDDL aware agent, which
will have been computed ignoring the conventions specific to this format.
</p>
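
<p>
The approach of writing the format-specific base into the output can be
sketched as follows. A real GRDDL transformation would more usually be
written in XSLT; Python is used here only for illustration, and the
helper name <code>rdf_root_with_base</code> and the example base are
hypothetical.
</p>
<pre>
```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Serialize the RDF namespace with the conventional rdf prefix.
ET.register_namespace("rdf", RDF_NS)


def rdf_root_with_base(format_base):
    """Create an rdf:RDF element carrying the base found in the input.

    format_base: the in-line base URI read from the source document by
    format-specific means, or None when the document declares none.
    """
    root = ET.Element("{%s}RDF" % RDF_NS)
    if format_base is not None:
        # The RDF/XML parser will resolve relative URIs against this base.
        root.set(XML_BASE, format_base)
    return root


root = rdf_root_with_base("http://example.org/data/")
description = ET.SubElement(root, "{%s}Description" % RDF_NS)
description.set("{%s}about" % RDF_NS, "item1")  # relative URI kept as is
print(ET.tostring(root, encoding="unicode"))
```
</pre>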

<p>
When writing a GRDDL transformation for an XML document format
that does support xml:base, then it must be remembered that 
a GRDDL aware agent 
has responsibility to handle an xml:base on the root element.
If there is such an xml:base attribute, then the simplest
behaviour for a GRDDL transformation is to ignore it.
</p>

<p>
However, other xml:base attributes, not on the root element,
are the responsibility of the transform, since the GRDDL aware
agent ignores these.
Thus, these lower level xml:base attributes should be honored,
most easily by copying them into the output graph
in the appropriate place.
However, in general, xml:base attributes on ancestor nodes
also have to be taken into account, unless there is an intervening
xml:base attribute with an absolute URI as its value.
This is clearly non-trivial to get right: to assist,
the GRDDL library provides a module to be imported into your stylesheet,
see below.
</p> 
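
<p>
To make the interaction between nested <code>xml:base</code> attributes
concrete, the following Python sketch computes the in-scope base of every
element by walking down from the root. It is not the library module
mentioned above; the function name <code>effective_bases</code> and the
example documents are hypothetical.
</p>
<pre>
```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def effective_bases(element, inherited_base, found=None):
    """Map each element to its in-scope base IRI, walking from the root.

    A relative xml:base is resolved against the base inherited from the
    ancestors; an absolute one replaces the inherited base entirely.
    """
    if found is None:
        found = {}
    own = element.get(XML_BASE)
    base = inherited_base if own is None else urljoin(inherited_base, own)
    found[element] = base
    for child in element:
        effective_bases(child, base, found)
    return found


root = ET.Element("doc", {XML_BASE: "http://example.org/a/"})
section = ET.SubElement(root, "section", {XML_BASE: "b/"})
item = ET.SubElement(section, "item", {XML_BASE: "http://example.net/x/"})
leaf = ET.SubElement(item, "leaf")

bases = effective_bases(root, "http://example.com/retrieved.xml")
for element in (root, section, item, leaf):
    print(element.tag, bases[element])
```
</pre>
<p>
Here the relative value on <code>section</code> depends on the root,
while the absolute value on <code>item</code> shields <code>leaf</code>
from every base declared above it.
</p>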


<p>
In all cases, 
while often unnecessary,
if a transform is aware of an absolute
base URI, specified in its input, for the whole document,
it is never incorrect to use this base URI as the base URI for
the output, for example, by adding an appropriate <code>xml:base</code>
attribute to the <code>rdf:RDF</code> element. 
</p>

<p>
Transforms that do this need to guard against the possibly incorrect
similar treatment of relative base URIs. For example, an
<code>xml:base=".."</code> on the root element might, in the
interaction between a correct GRDDL aware agent and a poorly written
transform, be applied twice, resulting in relative references being
resolved at the wrong level in the directory hierarchy.
</p>
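
<p>
The double application can be reproduced with a few lines of Python. The
retrieval address is hypothetical, and <code>output_base</code> is one
possible guard, assumed here for illustration: it copies a base into the
output only when that base is absolute.
</p>
<pre>
```python
from urllib.parse import urljoin, urlsplit


def output_base(root_base):
    """Return a base that is safe to copy into the output, or None.

    Only an absolute base (one with a scheme) is copied; a relative one
    has already been applied by the GRDDL aware agent.
    """
    if root_base is not None and urlsplit(root_base).scheme:
        return root_base
    return None


retrieval = "http://example.org/a/b/c/doc.xml"  # hypothetical
root_base = ".."

agent_base = urljoin(retrieval, root_base)      # applied once, by the agent
applied_twice = urljoin(agent_base, root_base)  # effect of a careless copy

print(urljoin(agent_base, "item"))
print(urljoin(applied_twice, "item"))
print(output_base(root_base), output_base("http://example.org/a/"))
```
</pre>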

</div>


<div class="changes">
<h2 id="changes">Acknowledgements and Change History</h2>

<p>A companion <cite><a
href="http://www.w3.org/2004/01/rdxh/specbg.html">GRDDL design history
and rationale</a></cite> discusses this design in the context of HTML,
PICS, and RDF since about 1997. The editor gratefully acknowledges the
many contributions of community members in the development of
GRDDL:</p>

<ul>
  <li>In Dec 2000, Ann Navarro raised the <a
  href="http://www.w3.org/2000/03/rdf-tracking/#faq-html-compliance">faq-html-compliance</a>
  issue: <q>The suggested way of including RDF meta data in HTML is
  not compliant with HTML 4.01 or XHTML</q>; in Apr 2001, Lee Jonas
  raised issue <a
  href="http://www.w3.org/2000/03/rdf-tracking/#rdfms-validating-embedded-rdf">rdfms-validating-embedded-rdf</a>:
  <q>RDF embedded in XHTML and other XML documents is hard to
  validate</q>.</li>

  <li>In May 2003, Joseph Reagle convened a task force with a <a
  href="http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2003May/0001.html">Kickoff
  of public-rdf-in-xhtml-tf@w3.org</a> message. Dan Connolly
  sent a <a
  href="http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2003May/0002.html">relational data views of XHTML via XSLT</a> design sketch.
  </li>

  <li>In Nov 2003, <a href="/People/Dom/">Dominique
  Haza&#235;l-Massieux</a> wrote <cite><a
  href="/2003/11/rdf-in-xhtml-proposal">An RDF-in-XHTML Proposal</a></cite>,
  a predecessor of this spec.
  </li>

  <li>In Jan 2004, Dan Connolly integrated that draft into this one
  and sent <a
  href="http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2004Jan/0011.html">a
  message calling for review</a>. Discussion with Tim Berners-Lee
  led to generalizing from XHTML to all of XML and to
  indirection via namespace/profile document.</li>

  <li>In February 2004, the RDF Core specifications became W3C
  Recommendations; the issues <a
  href="http://www.w3.org/2000/03/rdf-tracking/#rdfms-validating-embedded-rdf">rdfms-validating-embedded-rdf</a>
  and <a
  href="http://www.w3.org/2000/03/rdf-tracking/#faq-html-compliance">faq-html-compliance</a>
  were postponed.</li>

  <li>A <a href="/TR/2004/NOTE-grddl-20040413/">13 April 2004 snapshot</a>
    was published as a W3C Coordination Group Note to facilitate
    exchange between the Semantic Web
    Best Practices and Deployment Working
    Group and the HTML Working Group.
  </li>

  <li>Also in February 2004, Connolly presented to the TAG a <a
  href="http://www.w3.org/2004/01/rdxh/specbg.html">GRDDL design
  history and rationale</a> which discusses contribution of this
  design to Web Architecture issues such as <a
  href="http://www.w3.org/2001/tag/issues.html?type=1#RDFinXHTML-35">RDFinXHTML-35</a>
  and <a
  href="http://www.w3.org/2001/tag/issues.html?type=1#namespaceDocument-8">namespaceDocument-8</a>.
  Feedback from Norm Walsh has been valuable, and Noah Mendelsohn
  noted a connection to the <cite>Cambridge Communiqué</cite> in a <a
  href="http://lists.w3.org/Archives/Public/www-tag/2005Mar/0090.html">message
  of 22 March</a>.
  </li>

  <li>Ben Adida started contributing use cases from Creative Commons
  in a <a href="http://www.w3.org/2004/03/04-SWBPD">March 2004 meeting
  of the Semantic Web Best Practices &amp; Deployment Working
  Group</a>.</li>

  <li>A <a
  href="http://www.w3.org/TeamSubmission/2005/SUBM-grddl-20050516/">16
  May 2005 snapshot</a> was published as a W3C Team Submission by Dom
  and Dan.</li>

  <li>In a <a
  href="http://esw.w3.org/topic/SwigAtTp2006">March 2006 Semantic
  Web Interest Group meeting</a>, Murray Maloney took an
  interest in the connection with XML Schemas and the readability of
  the specification, Brian McBride demonstrated some related
  implementation experience with transforming documents to RDF,
  and Ian Davis contributed the eRDF use case and profile.
  </li>

</ul>

<p>The GRDDL Working Group convened in August 2006 with Harry Halpin as
chair and several of the contributors and implementors above
participating, plus
Brian Suda,
Chimezie Ogbuji,
David Booth,
Fabien Gandon,
Ian Davis,
Rachel Yager,
Ronald P. Reck,
John Clark,
Danny Ayers,
and Simone Onofri.
</p>

<p>Jeremy Carroll provided detailed security considerations based on
<a class="inform" href="http://www.faqs.org/rfcs/rfc2046.html">RFC
2046</a> and implemented the HTTP header linking as proposed by Ian
Davis.</p>

<p>The Working Group published a <a
href="http://www.w3.org/TR/2006/WD-grddl-20061024/">24 October 2006
draft</a>. The <a
href="http://www.w3.org/2001/sw/grddl-wg/issues">issues list</a> shows
the major design decisions since then.</p>

<p>Changes since the 2 May 2007 release are as follows:</p>

<pre><!-- next line -->
$Log: spec.html,v $
Revision 1.292  2008/09/08 13:42:19  connolly
add missing WG members to acks section

Revision 1.291  2007/11/27 23:17:15  connolly
cite namespace doc in obsolete notice too

Revision 1.290  2007/11/27 23:14:21  connolly
note obsolete in favor of /TR/grddl/

Revision 1.289  2007/08/22 15:19:39  connolly
s/to in/to/ in #sec_dubc_ex (nice catch MM)

Revision 1.288  2007/07/19 21:21:00  connolly
note XQuery along with other languages

Revision 1.287  2007/07/16 21:28:03  connolly
fix case of rfc3987 cite

Revision 1.286  2007/07/16 21:27:26  connolly
XML ref goes nowhere; drop it

Revision 1.285  2007/07/16 15:15:14  connolly
update to 16 July release

Revision 1.284  2007/06/28 22:39:25  connolly
fix changelog link; remove some "dead code" in a comment

Revision 1.283  2007/06/28 22:36:46  connolly
pubrules: take PR style out now that checking seems to work

Revision 1.282  2007/06/28 22:29:56  connolly
pubrules checker forces us to lie about status :-/

Revision 1.281  2007/06/28 22:27:00  connolly
pubrules: add W3C to status heading; em markup for boilerplate

Revision 1.280  2007/06/28 22:23:32  connolly
- pubrules prep: style, this/previous version links

- strict XHTML markup in the base appendix

Revision 1.279  2007/06/28 22:06:57  connolly
status section:
 - estimate PR pub date, review end date
 - reduce implementation status stuff to a link to the implementation report
 - replace CR exit criteria by claim that they're met
 - link to WBS for formal reviews
 - update some issues list links
 - move bit about notes/TODOs outside official status
truncate change log at 2 May, date of last release

Revision 1.278  2007/06/28 21:23:37  connolly
XProc "recently" stuff is postponed too

Revision 1.277  2007/06/28 21:17:01  connolly
explicitly note open TAG issue re faithful infoset
using class="postponed" style

Revision 1.276  2007/06/27 15:02:25  connolly
remove bogus domain stuff from vocab

Revision 1.275  2007/06/25 22:52:32  connolly
tweak markup of #rule_profiletrans
for consistency
per Booth 21 Jun

Revision 1.274  2007/06/25 22:49:42  connolly
clarify base URI in titleauthor example
ack Booth 21 Jun 2007 10:53:55 -0400
(I think he asked for it earlier too)

Revision 1.273  2007/06/25 22:06:52  connolly
strike out-of-scope advice on xml:base in XHTML family documents,
per chime 25 Jun 2007 17:38:36 -0400

Revision 1.272  2007/06/20 16:12:36  connolly
markup fix

Revision 1.271  2007/06/20 16:10:40  connolly
note xmlFunctions-34 and issue-faithful-infoset into the status section

Revision 1.270  2007/06/20 14:40:15  connolly
remove unfinished library-related base stuff

Revision 1.269  2007/06/20 14:28:45  connolly
removed issues appendix from TOC

Revision 1.268  2007/06/20 14:27:49  connolly
moved issues list out of this editor's draft to the WG area

Revision 1.267  2007/06/19 21:13:23  connolly
refine base IRI appendix based on exchange between John and JJC.
supply citations

Revision 1.266  2007/06/19 20:37:09  connolly
paste in base text (2007/06/12 16:58:27 1.7 from jjc)

Revision 1.265  2007/06/19 15:44:11  connolly
note TransformationProperty domain/range stuff is broken

Revision 1.264  2007/06/19 09:26:30  connolly
remove unused "URI I" in Agents section,
per comment from Booth 10 May

Revision 1.263  2007/06/13 14:56:26  connolly
to reflect the postponed status of #issue-faithful-infoset,
take "purposely" out and change implementation-defined
to "unspecified"

Revision 1.262  2007/06/12 14:35:03  connolly
note faithful-infoset reconsidered and postponed

Revision 1.261  2007/06/04 20:23:32  connolly
todo += think about unused var I


</pre>
</div>
</body>
</html>