Blended Web

Semantic Web

Taming the Wild Web - An Introduction to the Semantic Web

Dr. Jeff Heflin, Computer Science and Engineering
Lehigh University

  • More than 21% of humanity has used the web
  • Web can be used for more than you think
  • By tracking flu-related queries, Google can detect flu outbreaks faster than the CDC reports them.
  • What if we had richer data?

Definition

  • The semantic web is not a separate web, but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation (Berners-Lee et al., 2001).

Ontology

  • a key component of the Semantic Web
  • ontologies define the semantics of the terms used in semi-structured web pages
    • identify context, provide shared definitions
    • have a formal syntax and unambiguous semantics
    • usually includes a taxonomy but typically much more
  • inference algorithms can compute what logically follows

W3C Recommendations

  • RDF(S) (1999, revised 2004)
    • directed graphs labeled with URIs
    • XML serialization syntax
  • OWL (2004)
    • extends RDF with more semantic primitives
    • based on description logics (DLs)
    • has a model theoretic semantics
<owl:Class rdf:ID="Band">
 <rdfs:subClassOf>
  <owl:Restriction>
   <owl:onProperty rdf:resource="#hasMember" />
   <owl:allValuesFrom rdf:resource="#Musician" />
  </owl:Restriction>
 </rdfs:subClassOf>
</owl:Class> 

A Band is a subclass of the groups that have only Musicians as members.
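
One consequence of this restriction can be sketched in Python: if something is known to be a Band, then anything it hasMember can be inferred to be a Musician. The triples and the ex: names below are hypothetical, and this is only an illustration of the entailment, not how a description logic reasoner is implemented.

```python
# Hypothetical data: triples are (subject, predicate, object) tuples.
RDF_TYPE = "rdf:type"

triples = {
    ("ex:Beatles", RDF_TYPE, "ex:Band"),
    ("ex:Beatles", "ex:hasMember", "ex:John"),
    ("ex:Beatles", "ex:hasMember", "ex:Paul"),
    ("ex:Lehigh", "ex:hasMember", "ex:Jeff"),  # not known to be a Band
}


def apply_all_values_from(triples, cls, prop, filler):
    """Return the new rdf:type triples entailed by the axiom
    'cls subClassOf (prop allValuesFrom filler)'."""
    # Individuals explicitly typed as members of cls.
    instances = {s for (s, p, o) in triples if p == RDF_TYPE and o == cls}
    # Every value of prop on such an individual must belong to filler.
    inferred = {
        (o, RDF_TYPE, filler)
        for (s, p, o) in triples
        if p == prop and s in instances
    }
    return inferred - triples


new_facts = apply_all_values_from(
    triples, "ex:Band", "ex:hasMember", "ex:Musician"
)
for fact in sorted(new_facts):
    print(fact)
```

Nothing is inferred about ex:Jeff, because ex:Lehigh is not asserted to be a Band.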

URI (Uniform Resource Identifier)

  • Includes URLs
  • also anything that you can design an identification scheme for
  • helps to prevent collision of names
  • all the "symbols" in RDF are either URIs or Literals
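
The last point can be sketched as a small data model. The class names and example URIs here are hypothetical, and the sketch is simplified: it omits blank nodes, which RDF also allows. Literals are permitted only in the object position.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]

    def __post_init__(self):
        # Subjects and predicates must be URIs in this simplified model.
        if not isinstance(self.subject, URI) or not isinstance(self.predicate, URI):
            raise TypeError("subject and predicate must be URIs")
        # Objects may be either a URI or a literal value.
        if not isinstance(self.obj, (URI, Literal)):
            raise TypeError("object must be a URI or a Literal")


name_triple = Triple(
    URI("http://example.org/music#Beatles"),
    URI("http://example.org/music#name"),
    Literal("The Beatles"),
)
print(name_triple)
```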

Namespace

  • a mechanism for abbreviating URIs
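
A minimal sketch of how such abbreviation works: a prefix table maps short names to namespace URIs, and a name like owl:Class expands to the full URI. The rdf, rdfs and owl namespace URIs are the standard ones; the ex prefix and the expand helper are made up for illustration.

```python
# Prefix-to-namespace table. The "ex" entry is a hypothetical example.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "ex": "http://example.org/music#",
}


def expand(qname, prefixes=PREFIXES):
    """Expand an abbreviated name such as 'owl:Class' into a full URI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {qname!r}")
    return prefixes[prefix] + local


print(expand("owl:Class"))
print(expand("ex:Band"))
```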

Description Logic

  • Form of knowledge representation
    • Useful for formally defining classes
    • Studied extensively in the 1990s
    • mature reasoning software
      • e.g., FaCT, RACER, Pellet
  • benefits
    • optimized computation of subsumption
      • calculate implicit subClassOf relations
    • ontology integration
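
The simplest case of computing implicit subClassOf relations is transitivity over asserted links, sketched below. The class names are hypothetical. Reasoners such as FaCT, RACER and Pellet do considerably more than this, deriving subsumption from class definitions as well.

```python
def implicit_subclass_pairs(asserted):
    """Given asserted (sub, super) pairs, return the pairs implied by
    transitivity that were not asserted directly."""
    supers = {}
    for sub, sup in asserted:
        supers.setdefault(sub, set()).add(sup)

    def ancestors(cls, seen):
        # Depth-first walk up the hierarchy; 'seen' guards against cycles.
        for parent in supers.get(cls, ()):
            if parent not in seen:
                seen.add(parent)
                ancestors(parent, seen)
        return seen

    closure = {(c, a) for c in supers for a in ancestors(c, set())}
    return closure - set(asserted)


asserted = {
    ("ex:Guitarist", "ex:Musician"),
    ("ex:Musician", "ex:Artist"),
    ("ex:Artist", "ex:Person"),
}
implied = implicit_subclass_pairs(asserted)
for pair in sorted(implied):
    print(pair)
```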

Level of Adoption?

  • Open source Semantic Web tools
  • Commercial software vendors (Oracle, Adobe)
  • ~65 million Semantic Web documents (as of October 2009)
    • Yahoo SearchMonkey uses RDF to present richer search results
    • Google now indexes RDFa
  • Semantic Web Enabled Sites
    • BBC Music
    • Harper's Magazine
    • DBPedia
    • LiveJournal uses FOAF

Linking Open Data Project

An Application: Hawkeye

  • Requested 1.7 million real Semantic Web documents identified by Swoogle (swoogle.umbc.edu)
  • Loaded 760,000 documents, 16,280 ontologies and 166 million triples
  • conversion of Citeseer, DBLP, NSF award data and various e-Gov sources
  • Developed ontologies to map different schemas
  • Developed sources that equate individuals from different sources
  • Use OWL as mapping language
  • Mapping Ontologies
  • Individual Equivalence Statements
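
As a rough illustration of how individual equivalence statements support integration, the sketch below groups identifiers that are declared equivalent, assuming the statements arrive as owl:sameAs-style pairs. The identifiers are invented, and this is not a description of how Hawkeye itself is implemented.

```python
def equivalence_classes(same_as_pairs):
    """Group identifiers connected by sameAs-style statements
    using a simple union-find structure."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical identifiers for the same people in different sources.
same_as = [
    ("citeseer:p1", "dblp:p9"),
    ("dblp:p9", "nsf:pi42"),
    ("citeseer:p7", "dblp:p3"),
]
groups = equivalence_classes(same_as)
for group in groups:
    print(group)
```

Because equivalence is transitive, citeseer:p1 and nsf:pi42 end up in the same group even though no statement links them directly.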

Conclusion

  • Web is a powerful tool
  • Semantic Web is approaching critical mass
  • We have demonstrated the feasibility of large-scale integration using OWL
  • Integration can emerge via social web processes

Future work

  • User-friendly interface
  • improved performance
  • support more complex reasoning
  • support for updates