This repo contains code and data for "An information-theoretic characterization of morphological fusion" (at EMNLP 2021).
Contact neilrathi@gmail.com with any questions!
- `code` contains the code for creating fusion data for a language, as well as analysis code
- `result_plots` contains the plots used in the paper (main figure, paradigm size vs. fusion, frequency vs. fusion)
- `langdata` has data for
  - fusion by part-of-speech and language
  - paradigm size by part-of-speech and language (vs. fusion)
  - form frequency by feature and language (vs. fusion)

Dependencies:
- R. We used version 4.0.3. Analyses and plot generation require `tidyr`, `dplyr`, `ggplot2`, and `rPref`.
- Python 3.8
- GPU TensorFlow. We used version 2.2.0.