Goals We will read two papers: Marcolli, Chomsky, and Berwick Mathematical structure of syntactic merge and Nemecek Coinductive guide to inductive transformer heads.

The first paper recasts Merge, a central operation in Chomsky's Minimalist Grammar (Chomsky 1995), in terms of Hopf algebras, while the second one argues that all building blocks of transformer models can be expressed via Hopf algebras. If both of these claims make sense, we are witnessing a convergence between the theoretical and the computational approaches to syntax not seen since the 1960s.

Methods We will review both papers with great care. Chomsky and others have been wrestling with Merge from the get-go, with several alternative formulations proposed over the years (see e.g. Collins and Stabler 2016). But experience shows that very complex definitions can have significant bugs, see e.g. Svenonius (1958), so a great deal of care is required, and mechanization of proofs seems advisable.

We will also probe some larger questions that go beyond bug-bashing. First, (natural language) syntax already has some highly abstract formulations, including Lambek (1958, 2004), Sadrzadeh (2010, 2015), and others. What are the natural language constructs that these formulations address, or fail to adress? Many practicing syntacticians feel that the gap between the abstract theories and the concrete linguistic facts needs to be bridged.

Second, since Hopf algebras are canonically built over vector spaces, investing some effort in relating them to the current use of vector spaces in computational linguistics would be a good idea.

Third, algebra and coalgebra are important tools not just for investigating natural languages but also for theoretical informatics in general. Where Hopf algebras fit into this larger picture needs to be better understood.

Fourth, language acquisition is a major issue. We are interested not just in some abstract representational target which can be leveraged into efficient computational blocks, but also in the question of how these can be acquired from the data, In general, structures with finite combinatorics are incredibly hard to acquire (Angluin 1980, 1981, 1987). Obviously, the quantum physics people who also use Hopf algebras have no interest in this issue, but for linguistics acquisition has been considerd central since Chomsky (1965).

Format We will have weekly or biweekly hybrid meetings, if you wish to participate, please sign up here. For those living in Budapest, physical attendance will be possible.

Time and place We start June 21 (Wed) at 2PM CET, SZTAKI Institute of Computer Science, Rm L306 (hybrid format). For those in faraway timezones, we will have a 2nd meeting at 11PM CET (zoom only). For the zoom links, please sign up first at the link above. Some course resources are made available at https://nessie.ilab.sztaki.hu/~kornai/2023/Hopf/Resources

In the first class we start by building Hopf algebras from the ground up, starting with the Wiki page. The goal is to relate HAs to standard algebraic structures in terms of signature, structure theorems, etc. and to see how much of universal algebra is applicable to them.