Temporal trends in environmental data – if, when, where, and why

Last modified: 14 September 2023

Claudia von Brömssen.

Environmental challenges are typically complex and interlinked. Environmental monitoring programs, while providing extensive and high-quality data, are usually not sufficiently detailed to provide ad hoc insights into all relevant processes. The use of appropriate statistical methods is therefore crucial for extracting and understanding the information contained in large datasets. The statistical approaches used need to highlight the information in the data that is relevant to the research question at hand, while keeping model complexity reasonable, avoiding overfitting, and not imposing unrealistic assumptions on the data.

Determining whether a temporal trend is present in monitoring data during a given period is not difficult, and a number of methods are available that account for various data properties and different functional forms of the trend. Identifying when changes occur, how observed trends are distributed geographically, and how trends in different variables are connected to each other is more complex. Smoothing methods, such as generalized additive models, local linear regression, geographically weighted regression, and similar techniques, are used in environmental science because of their flexibility and their ability to identify patterns in time and space. As these models do not result in simple statistical summaries, such as single p-values and estimates of the average magnitude of change, new ways to extract and visualize the relevant information provided by the models are needed. In addition, great care needs to be taken to avoid overfitting when models are routinely applied to many stations or variables.

In this lecture, I will illustrate how statistical smoothing models can be used, adjusted, and improved to tackle current environmental questions, and how the information they provide can be visualized to support understanding of complex environmental processes and decision making in environmental management. I will discuss how general data features, such as spatial and temporal autocorrelation, outliers, and different data distributions, as well as the computational burden of the statistical models, influence what can and cannot be done. The focus is on the spatial presentation of nonlinear temporal trends and on how this yields new knowledge from environmental monitoring data collected as time series or in serially alternating monitoring designs (so-called ‘omdrev’).