Neural Networks for Language
Overview
In this part, we outline the neural network-related work leading to contemporary language models. We start by introducing neural networks and machine-understandable representations of data and text, and then discuss the problems that arise when using neural networks with sequential data. Finally, the part introduces self-attention and transformers, which form the basis for many of the current large language models.
The chapters of this part are as follows.
- Basics of Neural Networks introduces the basic idea of neural networks.
- Training Neural Networks discusses how neural networks are trained.
- Embeddings and Word Representations introduces embeddings, in particular word embeddings, which are used to represent words in a machine-understandable way.
- Sequential Models (RNNs, LSTMs) discusses neural networks for processing sequential data and the challenges this involves.
- Attention Mechanisms describes the core idea of self-attention, which allows words to be processed in context.
- Transformers discusses the transformer architecture, which forms the basis for many of the current large language models.
- Summary summarizes the key takeaways from this part.