[CaCL] CaCL 2/3: A Mathematical Framework for Transformer Circuits

Oh, Byung-Doh oh.531 at buckeyemail.osu.edu
Thu Jan 27 14:15:37 EST 2022


Hello everyone,

Next week we'll be discussing the following HTML-format paper. I had thought it was from OpenAI, but it's actually from a group called Anthropic<https://www.anthropic.com>.

A Mathematical Framework for Transformer Circuits (Elhage et al., 2021): https://transformer-circuits.pub/2021/framework/index.html

Here is the paper's own summary of its results:

REVERSE ENGINEERING RESULTS

To explore the challenge of reverse engineering transformers, we reverse engineer several toy, attention-only models. In doing so we find:

  *   Zero layer transformers model bigram statistics. The bigram table can be accessed directly from the weights (first sketch below).
  *   One layer attention-only transformers are an ensemble of bigram and “skip-trigram” (sequences of the form "A… B C") models. The bigram and skip-trigram tables can be accessed directly from the weights, without running the model (second sketch below). These skip-trigrams can be surprisingly expressive, and they implement a kind of very simple in-context learning.
  *   Two layer attention-only transformers can implement much more complex algorithms using compositions of attention heads. These compositional algorithms can also be detected directly from the weights. Notably, two layer models use attention head composition to create “induction heads”, a very general in-context learning algorithm (third sketch below).
  *   One layer and two layer attention-only transformers use very different algorithms to perform in-context learning. Two layer models use qualitatively more sophisticated inference-time algorithms, in particular a special type of attention head we call an induction head, forming an important transition point that will be relevant for larger models.
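
To make the first bullet concrete, here's a rough numpy sketch of the zero-layer case (my own toy code, not from the paper; the sizes and random stand-in weights are purely illustrative, and positional embeddings and layer norm are ignored as in the paper's simplified setting). A zero-layer model's logits are W_U @ W_E @ one_hot(token), so the product W_U @ W_E is the bigram table itself:

import numpy as np

# Illustrative sizes; a real model's vocab and d_model are much larger.
vocab_size, d_model = 1000, 64
rng = np.random.default_rng(0)

# Stand-ins for trained weights: embedding W_E and unembedding W_U.
W_E = rng.normal(size=(d_model, vocab_size))   # token -> residual stream
W_U = rng.normal(size=(vocab_size, d_model))   # residual stream -> logits

# A zero-layer transformer computes logits = W_U @ W_E @ one_hot(token),
# so the full bigram table is just the product of the two weight matrices:
bigram_logits = W_U @ W_E          # [next_token, current_token]

# "Run" the model on token i without ever doing a forward pass:
i = 42
next_token_logits = bigram_logits[:, i]
predicted_next = next_token_logits.argmax()
print(f"After token {i}, the model predicts token {predicted_next}")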
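
For the one-layer case, the paper factors each attention head into a QK circuit (which source token a destination token attends to) and an OV circuit (how the attended token moves the output logits), and both are again just products of weight matrices. Here's a sketch in the same spirit, with random stand-ins for trained weights and the paper's W_Q/W_K/W_V/W_O names:

import numpy as np

vocab_size, d_model, d_head = 1000, 64, 16
rng = np.random.default_rng(1)

# Stand-ins for one attention head's trained weights (paper notation).
W_E = rng.normal(size=(d_model, vocab_size))
W_U = rng.normal(size=(vocab_size, d_model))
W_Q = rng.normal(size=(d_head, d_model))
W_K = rng.normal(size=(d_head, d_model))
W_V = rng.normal(size=(d_head, d_model))
W_O = rng.normal(size=(d_model, d_head))

# QK circuit: entry [B, A] scores how strongly a destination token B
# attends to a source token A (before the softmax).
qk_circuit = W_E.T @ W_Q.T @ W_K @ W_E      # [vocab, vocab]

# OV circuit: entry [C, A] says how attending to source token A
# changes the output logit of token C.
ov_circuit = W_U @ W_O @ W_V @ W_E          # [vocab, vocab]

# Skip-trigram "A ... B -> C": B attends back to A (QK circuit) and
# copies information that raises the logit of C (OV circuit). Both
# tables come straight from the weights; no forward pass is needed.
A, B = 7, 99
attention_score = qk_circuit[B, A]
C = ov_circuit[:, A].argmax()   # the token whose logit A most increases
print(f"If token {B} attends to token {A} (score {attention_score:.2f}), "
      f"this head most increases the logit of token {C}")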
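
And for intuition about induction heads, here's a hand-built toy (again mine, not the paper's code) that wires up the two-layer mechanism by hand: a layer-1 "previous token head" copies each token's predecessor into a spare subspace of the residual stream, and a layer-2 head uses that subspace as its keys, so it attends to positions whose previous token matches the current one and copies whatever came next. This is the [A][B] ... [A] -> [B] pattern the paper describes, working with no training at all:

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

vocab = 10
# One-hot "embeddings"; the residual stream has two subspaces:
# dims [0, vocab) hold the current token, dims [vocab, 2*vocab) will
# hold a copy of the PREVIOUS token, written by the layer-1 head.

def run_induction_toy(tokens):
    T = len(tokens)
    resid = np.zeros((T, 2 * vocab))
    resid[np.arange(T), tokens] = 1.0          # embed current token

    # Layer 1: a hand-built "previous token head". Its attention is
    # fixed to position t-1; it copies the previous token's identity
    # into the second subspace (what K-composition later reads).
    for t in range(1, T):
        resid[t, vocab:] = resid[t - 1, :vocab]

    # Layer 2: the induction head. Query = current token; keys read
    # the previous-token subspace, so position t attends to positions
    # whose PREVIOUS token matches t's CURRENT token. Values copy
    # that position's own token: "same continuation as last time".
    scale = 10.0  # sharpen attention toward a hard match
    scores = scale * (resid[:, :vocab] @ resid[:, vocab:].T)
    mask = np.tril(np.ones((T, T)))            # causal: no looking ahead
    scores = np.where(mask > 0, scores, -1e9)
    attn = softmax(scores, axis=-1)
    logits = attn @ resid[:, :vocab]           # copy the attended token
    return logits.argmax(-1)

# "A B ... A" -> predicts B at the final position:
seq = [3, 7, 1, 5, 3]
print(run_induction_toy(seq)[-1])  # 7: the token that followed 3 before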

Best,
Byung-Doh

=================
Byung-Doh Oh (he/him/his)
Ph.D. Student
Department of Linguistics
The Ohio State University
