[Google scholar] [orcid]
Papers
-
Dale Zhou, Shubhankar P. Patankar, David M. Lydon-Staley, Perry Zurn, Martin Gerlach*, Dani S. Bassett* (* equal contribution)
Architectural styles of curiosity in global Wikipedia mobile app readership
under review
[arxiv] [meta]
-
Akhil Arora, Robert West, Martin Gerlach
Orphan Articles: The Dark Matter of Wikipedia
ICWSM (2024), to appear
[arxiv] [meta]
-
Tiziano Piccardi, Martin Gerlach, Robert West
Curious Rhythms: Temporal Regularities of Wikipedia Consumption
ICWSM (2024), to appear
[arxiv] [meta]
-
Tiziano Piccardi, Martin Gerlach, Akhil Arora, Robert West
A Large-Scale Characterization of How Readers Browse Wikipedia
ACM Transactions on the Web (2023)
[paper] [arxiv] [meta]
-
Tiziano Piccardi, Martin Gerlach, Robert West
Going Down the Rabbit Hole: Characterizing the Long Tail of Wikipedia Reading Sessions
WikiWorkshop 2022, Companion Proceedings of The Web Conference 2022 (WWW ’22)
[paper] [arxiv]
-
Akhil Arora, Martin Gerlach, Tiziano Piccardi, Alberto García-Durán, Robert West
Wikipedia Reader Navigation: When Synthetic Data Is Enough
WSDM 2022, Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
[paper] [arxiv] [code] [meta]
-
Martin Gerlach, Marshall Miller, Rita Ho, Kosta Harlan, Djellel Difallah
A Multilingual Entity Linking System for Wikipedia with a Machine-in-the-Loop Approach
CIKM 2021, Proceedings of the 30th ACM International Conference on Information & Knowledge Management
[paper] [arxiv] [code] [meta] [tool]
-
Charles C. Hyland, Yuanming Tao, Lamiae Azizi, Martin Gerlach, Tiago P. Peixoto, Eduardo G. Altmann
Multilayer networks for text analysis with multiple data types
EPJ Data Science (2021)
[paper] [arxiv] [data&code: TopSBM]
-
Isaac Johnson, Martin Gerlach, Diego Saez-Trumper
Language-agnostic Topic Classification for Wikipedia
WikiWorkshop 2021, Companion Proceedings of The Web Conference 2021 (WWW ’21)
[paper] [arxiv] [code] [data] [meta]
-
Miriam Redi, Martin Gerlach, Isaac Johnson, Jonathan Morgan, Leila Zia
A Taxonomy of Knowledge Gaps for Wikimedia Projects (Second Draft)
unpublished
[arxiv] [meta]
-
Ziyou Ren, Martin Gerlach, Hanyu Shi, GR Scott Budinger, Luis A.N. Amaral
Information-theory-based benchmarking and feature selection algorithm improve cell type annotation and reproducibility of single cell RNA-seq data analysis pipelines
under review
[biorxiv] [code]
-
Martin Gerlach, Francesc Font-Clos
A standardized Project Gutenberg corpus for statistical analysis of natural language and quantitative linguistics
Entropy (2020)
[paper] [arxiv] [code] [data]
-
Martin Gerlach*, Hanyu Shi*, Luis A.N. Amaral (* equal contribution)
A universal information theoretic approach to the identification of stopwords
Nature Machine Intelligence (2019)
[paper] [pdf (read-only)] [pdf] [data&code]
Media Coverage:
+ Northwestern News
-
Martin Gerlach, Eduardo G. Altmann
Testing statistical laws in complex systems
Physical Review Letters (2019)
[paper] [arxiv] [data&code]
-
Julia Poncela-Casasnovas, Martin Gerlach, Nathan Aguirre, Luis A.N. Amaral
Large scale analysis of micro-level citation patterns reveals nuanced selection criteria
Nature Human Behaviour (2019)
[paper] [pdf] [data&code]
Media Coverage:
+ phys.org
-
Hanyu Shi, Martin Gerlach, Isabel Diersen, Doug Downey, Luis A.N. Amaral
A new evaluation framework for topic modeling algorithms based on synthetic corpora
AISTATS (2019), Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics
[paper] [arxiv] [code]
-
Thomas Stoeger, Martin Gerlach, Richard I. Morimoto, Luis A.N. Amaral
Large-scale investigation of the reasons why potentially important genes are ignored
PLOS Biology (2018)
[paper] [data&code]
Media Coverage:
+ Top 10% most cited papers in PLOS Biology
+ PLOS Biology Primer by Ian Dunham
+ New York Times by Carl Zimmer
+ The Atlantic by Ed Yong
+ Northwestern News (Video)
+ The Economist
+ Science Magazine
+ Nature (Daily Briefing)
+ F1000-prime
Comments/Replies:
+ Reply to “Far away from the lamppost”, PLOS Biology (2019)
-
Martin Gerlach, Beatrice Farb, William Revelle, Luis A.N. Amaral
A robust data-driven approach identifies four personality types across four large data sets
Nature Human Behaviour (2018)
[paper] [pdf] [data&code]
Media Coverage:
+ Northwestern News (Video)
+ Scientific American
+ Science Magazine
+ Time Magazine
+ Washington Post
+ Süddeutsche Zeitung (German)
Comments/Replies:
+ Reply to: Four personality types may be neither robust nor exhaustive, Nature Human Behaviour (2019) [pdf]
+ Reply to Katahira et al. “Distribution of personality: Types or skewness?”, PsyArXiv preprint (2019)
-
Martin Gerlach, Tiago P. Peixoto, Eduardo G. Altmann
A network approach to topic models
Science Advances (2018)
[paper] [arxiv] [code: TopSBM]
Media Coverage:
+ Northwestern Data Science Initiative
+ TechXplore
-
Laercio Dias, Martin Gerlach, Joachim Scharloth, Eduardo G. Altmann
Using text analysis to quantify the similarity and evolution of scientific disciplines
Royal Society Open Science (2018)
[paper] [arxiv]
-
Eduardo G. Altmann, Laercio Dias, Martin Gerlach
Generalized Entropies and the Similarity of Texts
Journal of Statistical Mechanics (2017)
[paper] [arxiv]
-
Jorge C. Leitão, Jose M. Miotto, Martin Gerlach, Eduardo G. Altmann
Is this scaling nonlinear?
Royal Society Open Science (2016)
[paper] [arxiv] [data&code]
-
Martin Gerlach, Francesc Font-Clos, Eduardo G. Altmann
Similarity of symbol frequency distributions with heavy tails
Physical Review X (2016)
[paper] [arxiv]
Media Coverage:
+ APS Focus by Philip Ball
+ Physics Today
+ Ria Novotny (Russian)
-
Eduardo G. Altmann, Martin Gerlach
Statistical laws in linguistics
In Creativity and Universality in Language (Springer, 2016)
[paper] [arxiv]
-
Martin Gerlach, Eduardo G. Altmann
Scaling laws and fluctuations in the statistics of word frequencies
New Journal of Physics (2014)
[paper] [arxiv]
-
Fakhteh Ghanbarnejad*, Martin Gerlach*, Jose M. Miotto, Eduardo G. Altmann (* equal contribution)
Extracting information from S-curves of language change
Journal of The Royal Society Interface (2014)
[paper] [arxiv] [data&code]
Media Coverage:
+ Spiegel Online (German)
-
Martin Gerlach, Eduardo G. Altmann
Stochastic Model for the Vocabulary Growth in Natural Languages
Physical Review X (2013)
[paper] [arxiv]
-
Martin Gerlach, Sebastian Wüster, Jan-Michael Rost
Kicking Electrons
Journal of Physics B (2012)
[paper] [arxiv]
Media Coverage:
+ Selected for Highlights of 2012 by the editors of Journal of Physics B
Blogposts
-
Martin Gerlach, Isaac Johnson, and Nazia Tasnim
From hell to HTML: releasing a Python package to easily work with Wikimedia HTML dumps
Wikimedia Tech-blog (2023)
[tool] [code]
-
Muniza A., Isaac Johnson, and Martin Gerlach
Analyzing the Wikipedia clickstream just got easier with WikiNav
Wikimedia Tech-blog (2021)
[tool] [code] [meta]
-
Cristina Butoiu, Martin Gerlach, and Leighanna Mixter
World Suicide Prevention Day and the opportunity to increase access to mental health information on Wikimedia projects
Wikimedia Diff-blog (2021)
Thesis
-
Doctoral Thesis (2016)
Universality and variability in the statistics of data with fat-tailed distributions: The case of word frequencies in natural languages
presented to the Physics Department of Dresden University of Technology; produced at the Max Planck Institute for the Physics of Complex Systems
[link] [pdf]
-
Diploma Thesis (2011)
Hamiltonians dominanter Wechselwirkung
presented to the Physics Department of Dresden University of Technology; produced at the Max Planck Institute for the Physics of Complex Systems
Academic service
I have reviewed for the following journals/conferences:
- ACM Computing Surveys
- Artificial Life
- Communications Physics
- CSCW (Computer-supported Cooperative Work and Social Computing)
- Entropy
- European Physical Journal B (EPJB)
- EPL (Europhysics Letters)
- EPJ Data Science
- Frontiers in Digital Humanities
- Harvard Data Science Review
- Information Sciences
- Journal of Language Evolution
- Journal of Physics Communications
- Journal of Statistical Mechanics: Theory and Experiment (JSTAT)
- Language Dynamics and Change
- Nature Human Behaviour
- Nature Scientific Reports
- New Journal of Physics
- PeerJ Computer Science
- Physical Review E
- Physical Review Letters
- Physical Review Research
- Physical Review X
- Physics Letters A
- PLOS One
- PLOS Computational Biology
- Proceedings of the Royal Society B
- Royal Society Open Science
- Scientific Reports
- WSDM (International Conference on Web Search and Data Mining)
- WWW (The Web Conference)
- WikiWorkshop
I have been a handling editor for PNAS.
I have been a co-organizer of WikiWorkshop 2023 and WikiWorkshop 2024.
Conferences & Seminars
-
EPFL hosted by Data Science Lab. Lausanne, 02/2024 [Invited talk]
-
Computational Social Science Seminar hosted by the Centre Marc Bloch. Berlin, 06/2023 [Invited talk]
-
FOSDEM 23. Online, 02/2023 [talk]
-
Cavendish QI Seminars hosted by the Cavendish QI Group and the Hitachi Cambridge Laboratory at the University of Cambridge. Online, 03/2022 [Invited talk]
-
Wikimania 2021. Online, 08/2021 [lecture, workshop]
-
PCNet 21: satellite workshop on political communication networks as part of Networks 21: A Joint Sunbelt and NetSci Conference. Online, 06/2021 [Keynote]
-
Max Planck Institute for the History of Science, Berlin, 06/2019 [Invited talk]
-
AISTATS 2019, Naha, Okinawa (Japan), 04/2019 [poster]
-
Department of Network and Data Science, Central European University (Budapest), 03/2019 [Invited talk]
-
Department of Mathematical Sciences, University of Bath, 03/2019 [Invited talk]
-
IC2S2 2018: 4th Annual International Conference on Computational Social Science, Evanston, IL, 07/2018 [talk]
-
Centre for Translational Data Science, University of Sydney, 02/2018 [Invited talk]
-
Centre for Complex Systems, University of Sydney, 02/2018 [Invited talk]
-
Computing@PNNL Seminar Series, Pacific Northwest National Laboratory, 01/2018 [Invited talk]
-
Statistics of Languages: Theories and Experiments, Warsaw, 07/2017 [Invited talk] [Video]
-
ISSID 2017: Conference of the International Society for the Study of Individual Differences, Warsaw, 07/2017 [talk]
-
ICCSS 2017: 3rd Annual International Conference on Computational Social Science, Cologne, 07/2017 [poster]
-
ICCSS 2016: 2nd Annual International Conference on Computational Social Science, Evanston, IL, 06/2016 [talk]
-
NetSci-X 2016: International School and Conference on Network Science, Wrocław, 01/2016 [talk]
-
Granada Seminar on Computational and Statistical Physics, Granada, 06/2015 [talk]
-
Causality in the Language Sciences, Leipzig, 04/2015 [poster]
-
DPG Spring Meeting 2015, Berlin, 03/2015 [talk]
-
Open Access Ambassadors 2014, Munich, 12/2014
-
ECCS 2014, Lucca, 09/2014 [poster]
-
4th day of corpuslinguistics and humanities, Dresden Center for Digital Linguistics, 07/2014 [Invited talk]
-
Flow Machines Workshop: Creativity and Universality in Language, Paris, 06/2014 [poster]
-
Discussion Workshop: Physical Origins of correlated extreme events, Dresden, 06/2014 [poster]
-
DPG Spring Meeting 2014, Dresden, 04/2014 [talk]
-
Visions in Science 2013, Dresden, 09/2013
-
Lipari School on Computational Complex Systems: “Dynamic Networks and Social Behavior”, Lipari (Italy), 07/2013
-
DPG Spring Meeting 2013, Regensburg, 03/2013 [talk]
-
DPG Spring Meeting 2011, Dresden, 03/2011 [poster]
-
Workshop Extreme Atomic Systems, Riezlern (Austria), 02/2011 [talk]
-
International Workshop on Atomic Physics, Dresden, 11/2010 [poster]