Semantic Web and Information Extraction Technologies

WebSem
Abstract

The Semantic Web is an evolving extension of the World Wide Web in which the semantics of information and services on the web are explicitly defined. It derives from W3C director Sir Tim Berners-Lee's vision of the Web as a universal medium for data, information, and knowledge exchange. This course is a guided tour of several W3C recommendations for representing (RDF/S, SKOS, OWL) and querying (SPARQL) knowledge on the web, together with the logical formalisms underlying these languages, their syntax, and their semantics. We will present the problems of modeling ontologies and reconciling data on the web. Finally, we will explain how to extract knowledge from textual documents using natural language processing and information extraction technologies.

Teaching and Learning Methods: Lectures and lab sessions (groups of at most 2 students)

Course Policies: Attendance at the lab sessions is mandatory.

Requirements

Basic knowledge of web technologies (HTML, CSS, JavaScript) or databases is a plus.

Description

This course is a guided tour of several W3C recommendations for representing (RDF/S, SKOS, OWL) and querying (SPARQL) knowledge on the web, together with the logical formalisms underlying these languages, their syntax, and their semantics. We will present the problems of modeling ontologies and reconciling data on the web. Finally, we will explain how to extract knowledge from textual documents using natural language processing and information extraction technologies.
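To give a first feel for what "representing and querying knowledge" means, here is a minimal sketch in plain Python. It models an RDF graph as a set of (subject, predicate, object) triples and answers a single triple pattern, which is the basic building block of a SPARQL query. The `http://example.org/` namespace, the sample data, and the `match` helper are assumptions made for illustration only; they are not part of the course material, and a real application would rely on an RDF library and a SPARQL engine.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

# Hypothetical namespace for the example resources.
EX = "http://example.org/"
# Well-known IRIs from the RDF and FOAF vocabularies.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# A tiny RDF-like graph: every statement is one triple.
GRAPH: set[Triple] = {
    (EX + "alice", RDF_TYPE, FOAF_PERSON),
    (EX + "bob", RDF_TYPE, FOAF_PERSON),
    (EX + "alice", FOAF_KNOWS, EX + "bob"),
}


def match(
    graph: set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield the triples matching a pattern; None acts as a variable."""
    for triple in graph:
        if all(q is None or q == v for q, v in zip((s, p, o), triple)):
            yield triple


# Roughly: SELECT ?s WHERE { ?s rdf:type foaf:Person }
people = sorted(s for s, _, _ in match(GRAPH, p=RDF_TYPE, o=FOAF_PERSON))

# Roughly: SELECT ?o WHERE { ex:alice foaf:knows ?o }
known_by_alice = [o for _, _, o in match(GRAPH, s=EX + "alice", p=FOAF_KNOWS)]
```

A full SPARQL query joins several such patterns on shared variables; this sketch only covers the single-pattern case.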

Learning outcomes:

  • Mastering the Semantic Web stack

    • RDF: represent knowledge on the web
    • RDFS, SKOS, OWL: build your own vocabulary
    • SPARQL: query the web of data (federated queries)
  • Information Extraction 101

    • Named Entity Recognition and Disambiguation
    • Sentiment analysis
  • Developing semantic web applications

    • The Linked Data principles: RAW data now!
    • Reconcile web data at scale using machine learning techniques
    • Interact with the Web of Data: RDFa, microdata, JSON-LD
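As an illustration of the last outcome, the sketch below builds a small JSON-LD document with Python's standard `json` module and turns it into triples by looking up each term in the `@context`. This is a deliberately simplified reading of JSON-LD, not the W3C expansion algorithm: it assumes a flat document, a context that maps terms directly to IRIs, and plain literal values. The `to_triples` helper and the `http://example.org/alice` identifier are hypothetical.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A flat JSON-LD document: the @context maps short terms to IRIs.
doc = {
    "@context": {
        "name": "http://schema.org/name",
        "Person": "http://schema.org/Person",
    },
    "@id": "http://example.org/alice",  # hypothetical resource identifier
    "@type": "Person",
    "name": "Alice",
}


def to_triples(document: dict) -> list[tuple[str, str, str]]:
    """Simplified conversion of a flat JSON-LD document to triples."""
    context = document.get("@context", {})
    subject = document["@id"]
    triples = []
    if "@type" in document:
        # Resolve the type term through the context, falling back to the raw value.
        type_iri = context.get(document["@type"], document["@type"])
        triples.append((subject, RDF_TYPE, type_iri))
    for key, value in document.items():
        if key.startswith("@"):
            continue  # keywords such as @context and @id are not properties
        triples.append((subject, context.get(key, key), value))
    return triples


# Round-trip through JSON text, as the document would travel on the web.
serialized = json.dumps(doc, indent=2)
triples = to_triples(json.loads(serialized))
```

The same data could be embedded in an HTML page with RDFa or microdata; JSON-LD is used here only because it maps directly onto Python dictionaries.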

Number of hours: 21

Evaluation: 

  • Lab reports 1+2+3 (40% of the final grade)
  • Final Exam (60% of the final grade)