publications

You can also find an up-to-date list of my publications on Google Scholar.

2024

  1. Architectural styles of curiosity in global Wikipedia mobile app readership
    Dale Zhou, Shubhankar Patankar, David M. Lydon-Staley, Perry Zurn, Martin Gerlach*, and Dani S. Bassett*
    Science Advances, 2024
  2. Entity Insertion in Multilingual Linked Corpora: The Case of Wikipedia
    Tomás Feith, Akhil Arora, Martin Gerlach, Debjit Paul, and Robert West
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
  3. An Open Multilingual System for Scoring Readability of Wikipedia
    Mykola Trokhymovych, Indira Sen, and Martin Gerlach
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024
  4. Orphan Articles: The Dark Matter of Wikipedia
    Akhil Arora, Robert West, and Martin Gerlach
    In Proceedings of the Eighteenth International AAAI Conference on Web and Social Media, 2024
  5. Curious Rhythms: Temporal Regularities of Wikipedia Consumption
    Tiziano Piccardi, Martin Gerlach, and Robert West
    In Proceedings of the Eighteenth International AAAI Conference on Web and Social Media, 2024

2023

  1. A Large-Scale Characterization of How Readers Browse Wikipedia
    Tiziano Piccardi, Martin Gerlach, Akhil Arora, and Robert West
    ACM Transactions on the Web, 2023

2022

  1. Going Down the Rabbit Hole: Characterizing the Long Tail of Wikipedia Reading Sessions
    Tiziano Piccardi, Martin Gerlach, and Robert West
    In Companion Proceedings of The Web Conference 2022, 2022
  2. Wikipedia Reader Navigation: When Synthetic Data Is Enough
    Akhil Arora, Martin Gerlach, Tiziano Piccardi, Alberto García-Durán, and Robert West
    In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, 2022

2021

  1. Language-agnostic Topic Classification for Wikipedia
    Isaac Johnson, Martin Gerlach, and Diego Sáez-Trumper
    In Companion Proceedings of the Web Conference 2021, 2021
  2. Multilingual Entity Linking System for Wikipedia with a Machine-in-the-Loop Approach
    Martin Gerlach, Marshall Miller, Rita Ho, Kosta Harlan, and Djellel Difallah
    In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021
  3. Multilayer networks for text analysis with multiple data types
    Charles C Hyland, Yuanming Tao, Lamiae Azizi, Martin Gerlach, Tiago P Peixoto, and Eduardo G Altmann
    EPJ Data Science, 2021

2020

  1. A Taxonomy of Knowledge Gaps for Wikimedia Projects (Second Draft)
    Miriam Redi, Martin Gerlach, Isaac Johnson, Jonathan Morgan, and Leila Zia
    Unpublished, 2020
  2. Information-theory-based benchmarking and feature selection algorithm improve cell type annotation and reproducibility of single cell RNA-seq data analysis pipelines
    Ziyou Ren, Martin Gerlach, Hanyu Shi, G R Scott Budinger, and Luís A Nunes Amaral
    Unpublished, 2020
  3. A Standardized Project Gutenberg Corpus for Statistical Analysis of Natural Language and Quantitative Linguistics
    Martin Gerlach and Francesc Font-Clos
    Entropy, 2020

2019

  1. A new evaluation framework for topic modeling algorithms based on synthetic corpora
    Hanyu Shi*, Martin Gerlach*, Isabel Diersen, Doug Downey, and Luis Amaral
    In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, 2019
  2. Testing Statistical Laws in Complex Systems
    Martin Gerlach and Eduardo G Altmann
    Physical Review Letters, 2019
  3. Large-scale analysis of micro-level citation patterns reveals nuanced selection criteria
    Julia Poncela-Casasnovas, Martin Gerlach, Nathan Aguirre, and Luís A N Amaral
    Nature Human Behaviour, 2019
  4. A universal information theoretic approach to the identification of stopwords
    Martin Gerlach, Hanyu Shi, and Luís A Nunes Amaral
    Nature Machine Intelligence, 2019
  5. Reply to: Four personality types may be neither robust nor exhaustive
    Martin Gerlach, William Revelle, and Luís A Nunes Amaral
    Nature Human Behaviour, 2019
  6. Reply to Katahira et al. “Distribution of personality: Types or skewness?”
    Martin Gerlach, William Revelle, and Luis A N Amaral
    Unpublished, 2019

2018

  1. Large-scale investigation of the reasons why potentially important genes are ignored
    Thomas Stoeger, Martin Gerlach, Richard I Morimoto, and Luís A Nunes Amaral
    PLoS Biology, 2018
  2. Reply to "Far away from the lamppost"
    Thomas Stoeger, Martin Gerlach, Richard I Morimoto, and Luís A Nunes Amaral
    PLoS Biology, 2018
  3. A network approach to topic models
    Martin Gerlach, Tiago P Peixoto, and Eduardo G Altmann
    Science Advances, 2018
  4. A robust data-driven approach identifies four personality types across four large data sets
    Martin Gerlach, Beatrice Farb, William Revelle, and Luís A Nunes Amaral
    Nature Human Behaviour, 2018
  5. Using text analysis to quantify the similarity and evolution of scientific disciplines
    Laércio Dias, Martin Gerlach, Joachim Scharloth, and Eduardo G Altmann
    Royal Society Open Science, 2018

2017

  1. Generalized entropies and the similarity of texts
    Eduardo G Altmann, Laércio Dias, and Martin Gerlach
    Journal of Statistical Mechanics: Theory and Experiment, 2017

2016

  1. Universality and variability in the statistics of data with fat-tailed distributions: the case of word frequencies in natural languages
    Martin Gerlach
    Doctoral Thesis (Dr. rer. nat.), presented to the Physics Department of Dresden University of Technology; produced at the Max Planck Institute for the Physics of Complex Systems, 2016
  2. Similarity of Symbol Frequency Distributions with Heavy Tails
    Martin Gerlach, Francesc Font-Clos, and Eduardo G Altmann
    Physical Review X, 2016
  3. Statistical laws in linguistics
    Eduardo G Altmann and Martin Gerlach
    In Creativity and Universality in Language, 2016
  4. Is this scaling nonlinear?
    Jorge C. Leitão, Jose M. Miotto, Martin Gerlach, and Eduardo G. Altmann
    Royal Society Open Science, 2016

2014

  1. Extracting information from S-curves of language change
    Fakhteh Ghanbarnejad*, Martin Gerlach*, Jose M Miotto, and Eduardo G Altmann
    Journal of the Royal Society Interface, 2014
  2. Scaling laws and fluctuations in the statistics of word frequencies
    Martin Gerlach and Eduardo G Altmann
    New Journal of Physics, 2014

2013

  1. Stochastic Model for the Vocabulary Growth in Natural Languages
    Martin Gerlach and Eduardo G Altmann
    Physical Review X, 2013

2012

  1. Kicking electrons
    Martin Gerlach, Sebastian Wüster, and Jan M Rost
    Journal of Physics B: Atomic, Molecular and Optical Physics, 2012

2011

  1. Hamiltonians dominanter Wechselwirkung
    Martin Gerlach
    Diploma Thesis (Diplom-Physiker), presented to the Physics Department of Dresden University of Technology; produced at the Max Planck Institute for the Physics of Complex Systems, 2011