Marco Landt-Hayen, CAU Kiel and GEOMAR, firstname.lastname@example.org
- Prof. Dr. Peer Kröger, CAU Kiel, Institut für Informatik; LMU München, Munich Center for Machine Learning, email@example.com
- Prof. Dr. Martin, GEOMAR, RU Ocean Dynamics, firstname.lastname@example.org
- Dr. Willi Rath, GEOMAR, RU Ocean Dynamics, email@example.com
Disciplines: Physical Oceanography, Climate Science, Machine Learning, Deep Learning
Keywords: climate-event attribution, climate prediction, explainable AI, interpretable ML, deep learning architectures, model interpretation, feature importance and visualization
Motivation: Enhancing the explainability of complex predictive models in general, and of deep models (artificial neural networks, ANNs) in particular, is of utmost importance for the acceptance and usefulness of AI/ML methods in many fields. Deep models have considerably advanced the state of the art on many hard problems (e.g. image classification and speech recognition), but they tend to produce black-box results that are difficult to interpret, even for ML experts. Consequently, "Explainable AI" has gained a lot of attention in the AI/ML community and has stimulated a large amount of fundamental research [AI06]. This project therefore addresses an active topic in basic AI/ML research.
Identifying causally linked modes of climate variability is key to understanding the climate system and to improving the predictive skill of forecast systems. Classical approaches for identifying teleconnections or precursors of climate events use linear statistical models that assume stationary relationships; such stationarity, however, is often absent [CLIM01, CLIM02, CLIM03]. Non-linear models are often not used because they are difficult to interpret. Explainable AI may help build interpretable non-linear models for use in climate-event attribution and in the exploration of non-stationary teleconnections.
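To illustrate why a stationarity assumption can hide a teleconnection, the following minimal sketch compares a full-period correlation with a sliding-window correlation. The data are synthetic and purely hypothetical: a "driver" index and a "response" index whose coupling flips sign halfway through the record. The function names, window length, and noise level are assumptions made for this illustration and do not come from the cited studies.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def sliding_correlation(x, y, window):
    """Correlation of x and y inside every window of the given length."""
    return [
        pearson(x[i:i + window], y[i:i + window])
        for i in range(len(x) - window + 1)
    ]


def make_synthetic_pair(n=400, noise=0.3, seed=0):
    """Hypothetical indices: the response follows the driver in the first
    half of the record and opposes it in the second half."""
    rng = random.Random(seed)
    driver = [rng.gauss(0, 1) for _ in range(n)]
    response = []
    for t, d in enumerate(driver):
        sign = 1.0 if t < n // 2 else -1.0
        response.append(sign * d + noise * rng.gauss(0, 1))
    return driver, response


driver, response = make_synthetic_pair()
full_r = pearson(driver, response)
windowed_r = sliding_correlation(driver, response, window=100)

print(f"full-period correlation: {full_r:+.2f}")
print(f"first window:            {windowed_r[0]:+.2f}")
print(f"last window:             {windowed_r[-1]:+.2f}")
```

In this constructed case, the full-period correlation is weak although each half of the record is strongly coupled, which is the kind of regime dependence a single stationary linear model cannot represent.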
Q1: How to design an ANN that enhances interpretability of the results for climate data analyses?
Q2: How to present the learned features of hidden layers trained with climate data to the user?
Q3: To what extent can the methods developed in Q1 and Q2 be used to enhance the understanding of the dynamics and mechanisms of the climate system?
Approach: For Q1, we focus on the network topology (representation) and the training process (processing). In particular, we define new generic design patterns for the network topology that promote interpretability, and we explore the interoperability of these patterns. Furthermore, we identify generic design patterns for setting up a training process that enables transparent training of ANNs. For all these design patterns, we examine the trade-off between explainability, robustness, and performance.
For Q2, we explore how to adapt techniques from computer vision developed for image data [AI03] to climate data. The original representation of the features learned in the hidden layers, which consists of weights and biases, is transformed into more intuitive representations based on their relationship to the input data. This relationship likely depends on the network architecture (see Q1). Furthermore, we explore suitable visualizations of these new representations that help scientists understand the key characteristics of the learned model.
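As one possible reading of "representations based on their relationship to the input data", the sketch below describes each hidden unit by a correlation map: the correlation between the unit's activation and every input grid point across many samples. This is an illustrative stand-in, not the method the project will develop. The tiny network, its hand-picked weights (nothing is trained here), the six-point "grid", and all function names are assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def hidden_activations(sample, weights, biases):
    """tanh activations of one hidden layer for a single input sample."""
    return [
        math.tanh(sum(w * v for w, v in zip(row, sample)) + b)
        for row, b in zip(weights, biases)
    ]


def correlation_maps(samples, weights, biases):
    """For every hidden unit, correlate its activation with each input
    grid point; the result is one map in input space per unit."""
    activations = [hidden_activations(s, weights, biases) for s in samples]
    n_inputs = len(samples[0])
    maps = []
    for unit in range(len(weights)):
        unit_act = [a[unit] for a in activations]
        maps.append([
            pearson(unit_act, [s[i] for s in samples])
            for i in range(n_inputs)
        ])
    return maps


# Hypothetical setup: 500 random "fields" on a 6-point grid and two
# hand-picked hidden units (assumed weights, not a trained model).
rng = random.Random(1)
samples = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(500)]
weights = [
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],   # unit 0 sums grid points 0 and 1
    [0.0, 0.0, 0.0, 0.0, 1.0, -1.0],  # unit 1 contrasts grid points 4 and 5
]
biases = [0.0, 0.0]

maps = correlation_maps(samples, weights, biases)
for unit, unit_map in enumerate(maps):
    print(unit, [f"{r:+.2f}" for r in unit_map])
```

Each map lives on the input grid, so it can be plotted like any other climate field. For deeper or trained networks, the relationship between a unit and the input is less direct, which is where the dependence on network architecture noted above comes in.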
For Q3, we identify climate phenomena that are well understood, reproduce our understanding in a data-driven way, and validate the data-driven error estimates. Then, we explore currently unattributed events or phenomena, with the aim of using explainable AI to generate understanding that has not been achieved with classical methods.
[AI01] M. T. Ribeiro, S. Singh, and C. Guestrin, “Why should I trust you?: Explaining the predictions of any classifier,” Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2016.
[AI02] J. R. Zilke, E. L. Mencía, and F. Janssen, “DeepRED – rule extraction from deep neural networks,” in Proc. Int. Conf. on Discovery Science, 2016.
[AI03] D. Bau, B. Zhou, A. Khosla, A. Oliva, and A. Torralba, “Network dissection: Quantifying interpretability of deep visual representations,” Computer Vision and Pattern Recognition, 2017.
[AI04] T. Xiao, Y. Xu, K. Yang, J. Zhang, Y. Peng, and Z. Zhang, “The application of two-level attention models in deep convolutional neural network for fine-grained image classification”, Computer Vision and Pattern Recognition, 2015.
[AI05] A. Nguyen, A. Dosovitskiy, J. Yosinski, T. Brox, and J. Clune, “Synthesizing the preferred inputs for neurons in neural networks via deep generator networks”, Advances in Neural Information Processing Systems, 2016.
[AI06] L. H. Gilpin, D. Bau, B. Z. Yuan, A. Bajwa, M. Specter and L. Kagal, “Explaining explanations: An overview of interpretability of machine learning”, arXiv:1806.00069v3 [cs.AI] 3 Feb 2019
[CLIM01] Park, YH., Kim, BM., Pak, G. et al., “A key process of the nonstationary relationship between ENSO and the Western Pacific teleconnection pattern”, Sci Rep 8, 9512 (2018). https://doi.org/10.1038/s41598-018-27906-z
[CLIM02] Pak, Gyundo, Young-Hyang Park, Frederic Vivier, Young-Oh Kwon, and Kyung-Il Chang. “Regime-Dependent Nonstationary Relationship between the East Asian Winter Monsoon and North Pacific Oscillation”, Journal of Climate 27.21 (2014): 8185-8204. https://doi.org/10.1175/JCLI-D-13-00500.1
[CLIM03] Zhang, Wenjun, Xuebin Mei, Xin Geng, Andrew G. Turner, and Fei-Fei Jin. “A Nonstationary ENSO–NAO Relationship Due to AMO Modulation”, Journal of Climate 32.1 (2019): 33-43. https://doi.org/10.1175/JCLI-D-18-0365.1
[CLIM04] L. Li, R. W. Schmitt, C. C. Ummenhofer, K. B. Karnauskas, “North Atlantic salinity as a predictor of Sahel rainfall”, Sci. Adv. 2, e1501588 (2016). https://doi.org/10.1126/sciadv.1501588
[CLIM05] Trenberth, K., Fasullo, J. & Shepherd, T., “Attribution of climate extreme events”, Nature Clim Change 5, 725–730 (2015). https://doi.org/10.1038/nclimate2657