GoodRelations is a standardized vocabulary for product, price, and company data that can (1) be embedded into existing static and dynamic Web pages and that (2) can be processed by other computers. This increases the visibility of your products and services in the latest generation of search engines, recommender systems, and other novel applications.
Martin Hepp
martin.hepp at ebusiness-unibw.org
Tue Aug 24 22:18:27 CEST 2010
Hi Ed,

> Is it better to have the sameAs reference a parallel canonical URI -
> like a SPARQL endpoint - unrelated to the website topography (is this
> option #2?).

In general, I recommend making hash URIs based on the URIs of the human-readable HTML pages the authoritative ones, as long as the pages contain RDFa markup. That way, anyone who tries to dereference a data entity (e.g. an offer or a product) is always referred to a resource that can serve both human-readable and machine-readable content.

Even if you provide data dumps (e.g. in RDF/XML) or SPARQL endpoints, I usually suggest using the URI references derived from the authoritative Web pages. It is essential to understand that this design choice influences the URI that will be tried for fetching information about the data object. So if you serve both a data dump in RDF/XML and RDFa in shop pages, I suggest using the URIs that result from parsing the RDFa for the RDF/XML data as well.

The HTML+RDFa and RDF/XML templates at http://code.google.com/p/templates4goodrelations/ also follow that approach. If you are interested in the details, please compare the output of the data dump file with the individual page markup.

> Or is there a way to publish all product URI with sameAs
> patterns such that the first discovered is taken as the canonical URI
> for any following duplicate URI? The objective would be to remove the
> web of maintenance from using the sameAs pattern.

As I tried to express, the sameAs approach is the weakest, because it still requires crawling the same content from multiple URIs in order to consolidate all URI variants. Still, it is much better than nothing, because otherwise a dataspace operator would have to rely on imperfect heuristics for entity consolidation.

> Thanks for all the GoodRelations - Ed

Thanks!

Best
Martin

On 24.08.2010, at 20:12, Ed - 0x1b, Inc.
wrote:

> On Tue, Aug 24, 2010 at 6:04 AM, Martin Hepp
> <martin.hepp at ebusiness-unibw.org> wrote:
>> Dear all:
>>
>> Some shop applications, unfortunately, display the very same item at
>> multiple URIs. This is problematic for Search Engine Optimization
>> (SEO) and the Web of Data alike.
>>
>
> regarding:
>
>> 3. The last option (but the least powerful, yet still much better
>> than doing nothing) is to add owl:sameAs statements from the RDFa
>> pattern to the canonical URI.
>> The canonical URI should also be used for the foaf:page property,
>> which is the crucial link from the data to the page from where the
>> product can be ordered.
>>
>> Example:
>>
>> <div about="#offering" typeof="gr:Offering">
>> <div rel="owl:sameAs" resource="http://www.myshop.com/staplers/green_pocket_stapler123#offering"></div>
>> <div property="rdfs:label" content="Cool green stapler for $8.99" xml:lang="en"></div>
>> ...
>> <div rel="foaf:page" resource="http://www.myshop.com/staplers/green_pocket_stapler123"></div>
>> </div>
>> </div>
>>
>
> Is it better to have the sameAs reference a parallel canonical URI -
> like a SPARQL endpoint - unrelated to the website topography (is this
> option #2?). Or is there a way to publish all product URI with sameAs
> patterns such that the first discovered is taken as the canonical URI
> for any following duplicate URI? The objective would be to remove the
> web of maintenance from using the sameAs pattern.
>
> Thanks for all the GoodRelations - Ed
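
To make the two ideas in this thread concrete, here is a minimal Python sketch under stated assumptions. It is not taken from GoodRelations or from the templates mentioned above. The helper names `entity_uri` and `consolidate` are hypothetical, and the `index.php?item=123` variant URL is invented for illustration. The first function derives a hash URI from the canonical page URL, which is what an RDFa parser would produce for about="#offering" on that page. The second shows roughly what a data consumer has to do after crawling owl:sameAs statements: map every URI variant to its canonical URI.

```python
from __future__ import annotations

from urllib.parse import urldefrag


def entity_uri(page_url: str, fragment: str = "offering") -> str:
    """Build a hash URI for a data entity from the URL of its HTML page."""
    base, _ = urldefrag(page_url)  # drop any fragment already present
    return f"{base}#{fragment}"


def consolidate(same_as: list[tuple[str, str]]) -> dict[str, str]:
    """Map each URI to the canonical URI reached by following owl:sameAs.

    Each pair is (variant_uri, canonical_uri), i.e. the subject and object
    of one owl:sameAs statement collected while crawling.
    """
    canonical: dict[str, str] = {}
    for variant, target in same_as:
        canonical[variant] = target
        canonical.setdefault(target, target)  # a target points to itself by default

    def resolve(uri: str) -> str:
        seen: set[str] = set()
        # Follow links until a URI maps to itself; `seen` guards against cycles.
        while canonical.get(uri, uri) != uri and uri not in seen:
            seen.add(uri)
            uri = canonical[uri]
        return uri

    return {uri: resolve(uri) for uri in canonical}


if __name__ == "__main__":
    page = "http://www.myshop.com/staplers/green_pocket_stapler123"
    canonical_offer = entity_uri(page)
    # Hypothetical duplicate URL under which the shop shows the same item.
    duplicate_offer = entity_uri("http://www.myshop.com/index.php?item=123")
    print(consolidate([(duplicate_offer, canonical_offer)]))
```

Note that `consolidate` only works once every variant page has been fetched, which is the cost of the sameAs approach described above. Using the canonical hash URI everywhere avoids that step.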