Connected Data for Machine Learning

About the talk

Knowledge graphs and connected data standards have transformed how data is managed and made available in many organizations. Likewise, machine learning is at the core of many novel applications. In this talk, Paul looks at applications of these fundamental computational layers within Elsevier. He will focus on how they buttress each other but, more importantly, talk about the challenges Elsevier sees in making connected data specifically for machine learning, especially when relying on linked data standards.


Elsevier Labs

Elsevier Labs is an advanced technology group within Elsevier whose mission is to measurably improve the way knowledge is conveyed and used. They research and create new technologies for that mission, help implement proofs of concept, educate staff and management, and represent the company in technical discussions.


About the speaker

Paul Groth is Disruptive Technology Director at Elsevier Labs. He holds a Ph.D. in Computer Science from the University of Southampton (2007) and has done research at the University of Southern California and the Vrije Universiteit Amsterdam. His research focuses on dealing with large amounts of diverse contextualized knowledge with a particular focus on web and science applications. This includes research in data provenance, data science, data integration and knowledge sharing.

Previously, he led architecture development for the Open PHACTS drug discovery data integration platform. Paul was co-chair of the W3C Provenance Working Group that created a standard for provenance interchange. At Elsevier, Paul continues his research line and helps the company understand new technologies and their applicability to building better infrastructure for scholarship.

Paul is co-author of “Provenance: An Introduction to PROV” and “A Semantic Web Primer” (3rd edition), as well as numerous academic articles. He also blogs, and you can find him on Twitter: @pgroth.