CSCI-B 659 (Class # 12560) Topics in Artificial Intelligence

LING-L 715 Seminar on Knowledge Graphs, Large Language Models, and Graph-based Reasoning

This is the course page for Topics in Artificial Intelligence / Seminar on Knowledge Graphs, Large Language Models, and Graph-based Reasoning by Damir Cavar.

Introduction

Meeting time: MW, 1:15-2:30 PM

Classroom: Public Health (PH) 10

Course website: Assignments, slides, and other material will be posted on Canvas.

Credits: 3

Instructor: Dr. Damir Cavar

Office: Ballantine Hall (BH) 511

Phone: (812) 856-5094

Office hours (BH 516): Thursdays 4:15-5:15 PM and by appointment

Seminar description

Natural Language Processing (NLP) technologies have enjoyed significant hype due to the stunning abilities of Large Language Models (LLMs) demonstrated by ChatGPT and GPT-4. One problematic issue with such models is hallucination: LLMs tend to generate plausible, well-formulated text, references, and data that do not correspond to factual knowledge. Various proposals discuss how approved and valid knowledge can be added to LLMs to minimize hallucinations. On the other hand, LLMs provide superior capabilities for processing and generating natural language and can, to a limited extent, reason over utterances and claims, including limited reasoning over temporal and event logic. These capabilities are what we want to combine with formal knowledge representations.

Knowledge Graphs, Ontologies, and related representations of semantic properties and concept relations facilitate efficient graph-based storage and processing of meaning via entailment and concept hierarchies. These technologies enable the specification of semantic relations and constraints for the precise representation of core aspects of natural language semantics, including some capacity to reason over data and relations. Factual knowledge stored in Knowledge Graphs supports processing descriptions of the world or of a specific knowledge domain. Knowledge Graphs are, however, limited in their ability to represent complex events and changes of entities and situations over time.

This seminar consists of a series of hands-on experiments.

We will look at implementations of LLMs and experiment with integrating Knowledge Graphs into such LLMs. In addition, we will experiment with approaches for generating knowledge representations from structured and unstructured sources, and with providing access to such models via LLMs.

We will discuss, implement, and experiment with general techniques for mapping knowledge from unstructured sources (text, speech, image, sensory data) to graph representations.

We use graphs as symbolic knowledge representations (Knowledge Graphs) with RDF, JSON-LD, and OWL backends, as well as probabilistic and dynamic networks in hybrid (symbolic and neural) models. The complexity of knowledge extraction increases considerably when implicatures and presuppositions must be processed and represented in graph models.
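To make the idea of graph-based storage and entailment over concept hierarchies concrete, here is a minimal sketch of a triple store with subclass entailment. All identifiers (ex:Dog, ex:Fido, etc.) are illustrative placeholders, not material from the course; a real implementation would use an RDF library and SPARQL rather than this hand-rolled traversal.

```python
# A minimal in-memory triple store with rdfs:subClassOf entailment,
# sketching how reasoning over a concept hierarchy works in a Knowledge Graph.
triples = {
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:Fido", "rdf:type", "ex:Dog"),
}

def superclasses(cls, kb):
    """All classes reachable from cls via rdfs:subClassOf (transitive closure)."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in kb:
            if s == current and p == "rdfs:subClassOf" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen

def instances_of(cls, kb):
    """Entities whose asserted type is cls or any subclass of cls."""
    return {s for s, p, o in kb
            if p == "rdf:type" and (o == cls or cls in superclasses(o, kb))}

# ex:Fido is only asserted to be an ex:Dog, but entailment over the
# subclass hierarchy derives that it is also an ex:Animal.
print(instances_of("ex:Animal", triples))  # {'ex:Fido'}
```

The same query pattern is what an OWL/RDF reasoner performs at scale; the sketch only shows why storing knowledge as a graph makes this kind of inference a simple traversal.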

Our goals are to (a) gain a deep understanding of the mapping from unstructured information (e.g., language, vision) to high-precision graph-based knowledge representations, (b) generate implicatures and presuppositions from both in order to extend logical reasoning capabilities, and (c) explore the limits of hybrid AI and Machine Learning methods on symbolic and probabilistic/dynamic Knowledge Graphs, using various approaches to Graph Embeddings and different graph and Graph Neural Network algorithms. Integrating sophisticated knowledge representations into an LLM environment can enhance AI systems significantly and provide new, reliable reasoning capabilities for data and information in various domains, e.g., medicine, cybersecurity, or scientific writing.
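As one example of the Graph Embedding approaches mentioned above, translational models such as TransE score a triple (h, r, t) as plausible when vector(h) + vector(r) is close to vector(t). The sketch below uses tiny hand-picked vectors purely for illustration; trained embeddings would be learned from a Knowledge Graph, and the entity names are hypothetical.

```python
import math

# Hand-picked 2-D embeddings for illustration only (not trained).
emb = {
    "Paris":      [0.9, 0.1],
    "Berlin":     [0.2, 0.1],
    "France":     [1.0, 1.0],
    "capital_of": [0.1, 0.9],
}

def score(h, r, t):
    """TransE-style score: negative Euclidean distance ||h + r - t||.
    Higher (closer to 0) means the triple is more plausible."""
    diff = [emb[h][i] + emb[r][i] - emb[t][i] for i in range(2)]
    return -math.sqrt(sum(d * d for d in diff))

# The correct triple should outscore a corrupted one:
print(score("Paris", "capital_of", "France"))   # 0.0 (exact translation)
print(score("Berlin", "capital_of", "France"))  # lower score
```

This is the basic idea behind link prediction in Knowledge Graph embeddings: ranking candidate triples by such a score to propose missing facts.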

Literature

Knowledge Graph and Ontology

Large Language Model

Natural Language Processing

Disclaimer

This syllabus is subject to change and likely will change. All important changes will be made in writing, with ample time for adjustment.


(C) 2024 by Damir Cavar