Issues and Concerns
Overview
In this part, we look into issues and concerns related to large language models. We start by discussing the ownership of training data and of large language model outputs, then highlight problems in those outputs, including hallucination. We further note that large language models can leak sensitive data and that they can be used for malicious purposes. Finally, we discuss concerns related to learning, or the lack of it, when large language models are used uncritically.
The chapters in this part are as follows.
- Ownership of Data and Outputs discusses who owns the training data of large language models and the outputs they produce.
- Hallucination and Bias highlights the possibility of large language models producing faulty or misleading outputs, as well as possible biases in the training data and, consequently, in the outputs.
- Privacy Concerns discusses how private information can be extracted from the outputs of large language models, and notes that correcting those outputs is not always possible.
- Security Concerns discusses the most common vulnerabilities in large language models and demonstrates how they could be used for malicious purposes through prompt injection.
- Learning Concerns discusses concerns related to a lack of learning when overly relying on large language models.