Web Data Management in RDF Age
This was the keynote on ICDCS'17 Day 2, by Tamer Ozsu. Below are my notes from his talk. The slides for his presentation are available here.
Querying web data presents challenges due to the lack of a schema, its volatility, and its sheer scale. There have been several recent approaches to querying web data, including XML, JSON, and fusion tables. This talk is about another approach to maintaining and querying web data: RDF and SPARQL. This last approach is the one recommended by the W3C (World Wide Web Consortium) and is a building block for the semantic web and Linked Open Data (LOD). Here is a diagram denoting the LOD datasets and the links between them as of 2014.
Resource Description Framework (RDF)
In RDF, everything is a uniquely named resource (URI). The ID for Jack Nicholson's resource is JN29704. Resources have defined attributes: y:JN29704 hasName = "Jack Nicholson", y:JN29704 BornOnDate = "1937-04-22". The relationships with other resources can be defined via predicates. (This is "predicate" in the grammar sense, not the logic sense.) Predicates form triples of the form "Subject Predicate Object". The subjects are always URIs, and the objects are "literals" or URIs.
It turns out biologists are heavy users of RDF with the UniProt dataset.
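To make the "Subject Predicate Object" structure concrete, here is a minimal Python sketch of the triples above. The `URI`, `Literal`, and `Triple` classes and the `attributes_of` helper are illustrative names of my own, not part of any RDF library, and the `actedIn` predicate and `y:M1` resource are made up for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: URI                # subjects are always URIs
    predicate: str
    obj: Union[URI, Literal]    # objects are literals or URIs


triples = [
    Triple(URI("y:JN29704"), "hasName", Literal("Jack Nicholson")),
    Triple(URI("y:JN29704"), "BornOnDate", Literal("1937-04-22")),
    # Hypothetical predicate and resource, showing an object that is a URI.
    Triple(URI("y:JN29704"), "actedIn", URI("y:M1")),
]


def attributes_of(subject, data):
    """Return {predicate: literal value} for the literal-valued triples of one subject."""
    return {
        t.predicate: t.obj.value
        for t in data
        if t.subject == subject and isinstance(t.obj, Literal)
    }
```

Calling `attributes_of(URI("y:JN29704"), triples)` returns only the two literal attributes; the `actedIn` triple is a relationship to another resource, so it is left out.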
RDF query model: SPARQL
SPARQL provides a flavor of SQL. It operates on RDF triples, and now _variables_ can be used in place of the subject, predicate, and object as well. Thus it is also possible to represent SPARQL queries as a graph. Once you represent the SPARQL query as a graph, answering it reduces to a subgraph matching problem between the RDF data graph and the query graph.
As in querying with SQL, too many joins are bad for performance, and a lot of work goes into optimizing this.
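The matching idea can be sketched in a few lines of Python. This is a naive illustration and not how a real SPARQL engine works: triples are plain tuples, any string starting with `?` is treated as a variable, and the `actedIn` and `hasTitle` predicates and the `y:M1` resource are made up. Each extra pattern in the query acts as a join against the bindings found so far, which is where the join cost comes from.

```python
def is_var(term):
    # Simplification: any term that starts with "?" is a query variable.
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None if impossible."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable must keep the same value everywhere it appears.
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def evaluate(patterns, data):
    """Match a list of triple patterns (the query graph) against the data graph."""
    bindings = [{}]
    for pattern in patterns:          # each pattern is one join step
        next_bindings = []
        for binding in bindings:
            for triple in data:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


data = [
    ("y:JN29704", "hasName", "Jack Nicholson"),
    ("y:JN29704", "BornOnDate", "1937-04-22"),
    ("y:JN29704", "actedIn", "y:M1"),        # hypothetical
    ("y:M1", "hasTitle", "Example Movie"),   # hypothetical
]

query = [
    ("?actor", "hasName", "Jack Nicholson"),
    ("?actor", "actedIn", "?movie"),
    ("?movie", "hasTitle", "?title"),
]

results = evaluate(query, data)
```

Here `results` holds a single binding that maps `?actor`, `?movie`, and `?title` to the matching nodes of the data graph.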
Distributed SPARQL processing
If it is possible to gather all the RDF data needed in one place, then querying can be done easily using common cloud computing tools. You can partition and store the RDF data on HDFS, and then run SPARQL queries on it as MapReduce jobs (say, using Spark).
If it is not possible to gather all RDF data, you need to do distributed querying, and for this, methods similar to those of distributed RDBMSs can be used.
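As a rough illustration of the partition-then-query idea, the sketch below simulates it in plain Python rather than on HDFS or Spark. It assumes hash partitioning by subject, which is one possible scheme and not necessarily the one discussed in the talk; with it, a "star" query (all conditions on the same subject) can be answered inside each partition and the partial results merged. The function names and the second resource `y:P2` are made up.

```python
import zlib
from collections import defaultdict


def partition_by_subject(triples, num_partitions):
    """Place all triples that share a subject in the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for s, p, o in triples:
        index = zlib.crc32(s.encode("utf-8")) % num_partitions
        partitions[index].append((s, p, o))
    return partitions


def map_star_query(partition, required):
    """Map step: subjects in one partition that satisfy every required predicate.

    `required` maps predicate -> object value, or None to accept any object.
    """
    seen = defaultdict(dict)
    for s, p, o in partition:
        seen[s].setdefault(p, set()).add(o)
    return [
        s for s, preds in seen.items()
        if all(p in preds and (o is None or o in preds[p])
               for p, o in required.items())
    ]


def reduce_results(per_partition_hits):
    """Reduce step: merge the partial answers from all partitions."""
    return sorted({s for hits in per_partition_hits for s in hits})


triples = [
    ("y:JN29704", "hasName", "Jack Nicholson"),
    ("y:JN29704", "BornOnDate", "1937-04-22"),
    ("y:P2", "hasName", "Someone Else"),      # hypothetical
    ("y:P2", "BornOnDate", "1950-01-01"),     # hypothetical
]

partitions = partition_by_subject(triples, num_partitions=3)
required = {"hasName": None, "BornOnDate": "1937-04-22"}
answer = reduce_results(map_star_query(part, required) for part in partitions)
```

Queries whose patterns span several subjects would need data shipped between partitions, which is the expensive case that distributed query optimization tries to limit.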
To complicate this further, not all of the RDF data sites can process SPARQL queries. So several approaches have been formulated to deal with the problem this poses for distributed SPARQL processing, such as writing wrappers to execute SPARQL queries on these sites.
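A wrapper of this kind might look roughly like the following sketch. Everything here is hypothetical: `RecordSite` stands in for a site that can only return whole records by ID, and `TriplePatternWrapper` translates a single triple pattern into those lookups so that a query processor could treat the site as a source of triples. A real wrapper would depend on the site's actual interface.

```python
class RecordSite:
    """Hypothetical data site that serves records by ID and knows nothing about SPARQL."""

    def __init__(self, records):
        self._records = records

    def ids(self):
        return list(self._records)

    def fetch(self, record_id):
        return dict(self._records[record_id])


class TriplePatternWrapper:
    """Answers single triple patterns by calling the site's native interface."""

    def __init__(self, site):
        self.site = site

    def match(self, subject=None, predicate=None, obj=None):
        """Return matching (subject, predicate, object) triples; None is a wildcard."""
        known = self.site.ids()
        candidates = known if subject is None else [s for s in known if s == subject]
        results = []
        for record_id in candidates:
            for field, value in self.site.fetch(record_id).items():
                if predicate is not None and field != predicate:
                    continue
                if obj is not None and value != obj:
                    continue
                results.append((record_id, field, value))
        return results


site = RecordSite({
    "y:JN29704": {"hasName": "Jack Nicholson", "BornOnDate": "1937-04-22"},
})
wrapper = TriplePatternWrapper(site)
names = wrapper.match(predicate="hasName")
```

A distributed query processor could then send each triple pattern of a query to such wrappers and join the returned triples itself.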
Finally, live querying of the web of RDF linked data is also possible by sending bots to traverse/browse this graph at runtime.
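The traversal idea can be sketched as a breadth-first walk over linked documents. This is a toy simulation: the `WEB` dictionary stands in for dereferencing URIs over HTTP, the example.org URIs and the `actedIn` and `hasTitle` predicates are made up, and a real bot would also need politeness limits, error handling, and a smarter stopping rule.

```python
from collections import deque

# Simulated web: each URI maps to the triples published in its document.
WEB = {
    "http://example.org/JN29704": [
        ("http://example.org/JN29704", "hasName", "Jack Nicholson"),
        ("http://example.org/JN29704", "actedIn", "http://example.org/M1"),
    ],
    "http://example.org/M1": [
        ("http://example.org/M1", "hasTitle", "Example Movie"),
    ],
}


def fetch(uri):
    """Stand-in for an HTTP lookup; unknown URIs yield no triples."""
    return WEB.get(uri, [])


def is_uri(term):
    # Simplification: treat anything starting with "http://" as a link to follow.
    return isinstance(term, str) and term.startswith("http://")


def traverse(seed, max_documents=10):
    """Collect triples by following links breadth-first from a seed URI."""
    seen = {seed}
    queue = deque([seed])
    collected = []
    fetched = 0
    while queue and fetched < max_documents:
        uri = queue.popleft()
        fetched += 1
        for triple in fetch(uri):
            collected.append(triple)
            for term in (triple[0], triple[2]):
                if is_uri(term) and term not in seen:
                    seen.add(term)
                    queue.append(term)
    return collected


collected = traverse("http://example.org/JN29704")
titles = [o for _, p, o in collected if p == "hasTitle"]
```

The query is answered from whatever triples the traversal has discovered, so the result depends on the seed and on how far the bot is allowed to go.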