Katharine Miller - Academia.edu

Papers by Katharine Miller

Genetic Characterization of Juvenile Chum Salmon (Oncorhynchus keta) Migrating out of the Yukon River Delta

Technical report, Dec 1, 2019

To identify critical life-history stages for salmon survival, it may be informative to compare adult returns with abundances at various life-history stages. The transitional period from freshwater to saltwater is speculated to be a major source of mortality for salmon, and information about early life stages may help reduce uncertainty around survival estimates and future run-size predictions. Past genetic studies demonstrated that relative abundances of Yukon summer-run and fall-run juvenile chum salmon (Oncorhynchus keta) caught on the eastern Bering Sea shelf during late summer/early fall are correlated with adult returns for their respective year classes (Kondzela et al. 2016). We are interested in testing whether earlier life-history stages are also correlated with adult returns. Our study provides insights into the relative proportions of summer-run and fall-run juvenile chum salmon that outmigrate from the Yukon River during the spring/summer period.

Disentangling Population Level Differences in Juvenile Migration Phenology for Three Species of Salmon on the Yukon River

Journal of Marine Science and Engineering

Migration phenology influences many important ecological processes. For juvenile Pacific salmon, the timing of the seaward migration from fresh to marine waters is linked to early marine survival and adult returns. Seaward migration phenology is determined by interactions between the intrinsic attributes of individual species and environmental factors that are acting upon them. Temperature and discharge are two factors of the freshwater environment that have been shown to influence intra- and interannual variation in juvenile salmon phenology, but these factors may affect the migrations of sympatric species differently. Understanding how variations in phenology change with environmental heterogeneity is a critical first step in evaluating how the future climate may affect salmon. This is especially crucial for high-latitude rivers, where the pace of climate change is nearly twice as rapid as it is for more temperate areas. This research investigates the influence of river conditions...

Use of Machine Learning (ML) for Predicting and Analyzing Ecological and ‘Presence Only’ Data: An Overview of Applications and a Good Outlook

Machine Learning for Ecology and Sustainable Natural Resource Management, 2018

Machine learning (ML) has been established and used in science-based applications since the 1970s. The advent and maturation of mathematical algorithms and concepts like Neural Networks, Entropy, and Classification and Regression Trees (CARTs), as well as the enhancement of computational power on personal computers worldwide, have allowed for the development of many new applications and good approaches to analyzing highly complex systems and their data. Improvements to classical ML techniques, such as boosting, bagging, and ensembles, have been developed and combined with ML algorithms to yield powerful new tools for both data exploration and analysis (e.g. classification and prediction). Together with the increasing availability of online datasets (public and private), these tools have formed a new ‘science-culture’ that has yet to be fully embraced by the broader scientific community. ML can be used extremely well for data mining and classification, as well as to draw generalizable inference from powerful predictions (Breiman L, Stat Sci 16:199–231 (2001a); Breiman L, Mach Learn J 45:5–32 (2001b)). Thus, it offers a new scientific platform that can help overcome many of the earlier limitations associated with sparse field data, statistical model-fitting, p-values, parsimony (e.g., AIC), Bayesian and post-hoc studies. In contrast to conventional, statistical model-based data analysis, ML is usually non-parametric, so it does not require a priori assumptions about the structure and complexity of a model, nor is it based on just single linear algorithms. This eliminates potential biases and constraints being built into models that result from these assumptions and traditional singular algorithms. In contrast, ML techniques are classification tools of choice and convenience. They can decipher relevant relationships (‘extract the signal’) directly from virtually any data (e.g. messy, ‘gappy’, very large or rather small). Thus, ML can be seen as a new science philosophy with a newly available statistical approach that allows for faster, alternative and more encompassing results that more adequately generalize and reflect the very complex structure of ecological systems. Because ML is not only flexible but also efficient, it is an ideal tool for application in the science-based wildlife and conservation management arenas as well as ecology, where decisions need to be robust but time-critical. Here we review some of the advantages and assumed application pitfalls of several key ML algorithms with published examples from the wildlife ecology and biodiversity disciplines using ‘location only’ (presence) data. We then provide a simulation case study to illustrate our key points, and evaluate how ML has the potential to change the way we use information to manage wildlife in times of a rapidly changing global environment and its ongoing crisis.

Impacts to Essential Fish Habitat from Non-fishing Activities in Alaska

Predicting distributions of estuarine associated fish and invertebrates in Southeast Alaska

Estuaries in Southeast Alaska provide habitat for juveniles and adults of several commercial fish and invertebrate species; however, because of the area's size and challenging environment, very little is known about the spatial structure and distribution of estuarine species in relation to the biotic and abiotic environment. This study uses advanced machine learning algorithms (random forest and multivariate random forest) and landscape and seascape-scale environmental variables to develop predictive models of species occurrence and community composition within Southeast Alaskan estuaries. Species data were obtained from trawl and seine sampling in 49 estuaries throughout the study area. Environmental data were compiled and extracted from existing spatial datasets. Individual models for species occurrence were validated using independent data from seine surveys in 88 estuaries. Prediction accuracy for individual species models ranged from 63% to 94%, with 76% of the fish species models and 72% of the invertebrate models having a predictive accuracy of 70% or better. The models elucidated complex species-habitat relationships that can be used to identify habitat protection priorities and to guide future research. The multivariate models demonstrated that community composition was strongly related to regional patterns of precipitation and tidal energy, as well as to local abundance of intertidal habitat and vegetation. The models provide insight into how changes in species abundance are influenced by both environmental variation and the co-occurrence of other species. Taxonomic diversity in the region was high (74%) and functional diversity was relatively low (23%). Functional diversity was not linearly correlated to species richness, indicating that the number of species in the estuary was not a good predictor of functional diversity or redundancy. Functional redundancy differed across estuary clusters, suggesting that some estuaries have a greater potential for loss of functional diversity with species removal than others.

Dedication: This dissertation is dedicated to my husband, who is my biggest fan, my strongest advocate, and my best friend.

Multivariate random forest models of estuarine-associated fish and invertebrate communities

Marine Ecology Progress Series, 2014

Models that evaluate species-habitat relationships at the community level have been gaining attention with increasing interest in ecosystem management. Developing models that can incorporate both a large number of predictor variables and a multivariate response (a vector of individual species occurrences or abundances) is challenging. One promising new approach is multivariate random forests (MRF), a method that combines multivariate regression trees with bootstrap resampling and predictor subsampling from traditional random forests. Random forest models have been shown to be highly accurate and powerful in their predictive ability in a wide variety of applications. They can effectively model nonlinear and interacting variables. Our research evaluated change in estuarine assemblage composition along habitat gradients in Southeast Alaska using landscape-scale habitat variables and MRF. For 541 estuaries, we identified 24 predictor variables describing the geomorphic and habitat environment on land and in the estuary. MRF models were constructed in R software for combined fish and invertebrate assemblages. Cluster analysis of model proximities revealed strong spatial variation in community composition in relation to differences in tidal range, precipitation, percent of eelgrass, and amount of intertidal habitat. This research presents a new science-based management template that can be used to inform and assess species management and protection strategies, as well as to guide future research on species distributions.
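
The MRF idea described in this abstract (multivariate regression trees combined with bootstrap resampling and predictor subsampling) can be illustrated with a toy sketch. This is not the authors' implementation, which the abstract says was built in R. It is a minimal pure-Python illustration that assumes depth-one trees (stumps), midpoint split thresholds, and made-up data; the function names, the two predictors, and the two-species response are all hypothetical.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length response vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sse(rows):
    """Squared error around the mean, summed over ALL response columns."""
    mu = mean_vector(rows)
    return sum((v - m) ** 2 for r in rows for v, m in zip(r, mu))

def fit_stump(X, Y, features):
    """Depth-one multivariate regression tree: one split, vector-valued leaves."""
    best = None
    for j in features:
        vals = sorted({x[j] for x in X})
        for a, b in zip(vals, vals[1:]):
            t = (a + b) / 2  # midpoint between neighbouring observed values
            left = [y for x, y in zip(X, Y) if x[j] <= t]
            right = [y for x, y in zip(X, Y) if x[j] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t, mean_vector(left), mean_vector(right))
    if best is None:  # no usable split: both leaves predict the overall mean
        mu = mean_vector(Y)
        return (features[0], float("inf"), mu, mu)
    return best[1:]

def fit_forest(X, Y, n_trees=50, n_sub=1, seed=0):
    """Grow stumps on bootstrap samples, each seeing a random predictor subset."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample (with replacement)
        feats = rng.sample(range(p), n_sub)         # predictor subsample
        forest.append(fit_stump([X[i] for i in idx], [Y[i] for i in idx], feats))
    return forest

def predict(forest, x):
    """Average the vector-valued leaf predictions of all trees."""
    preds = [left if x[j] <= t else right for j, t, left, right in forest]
    return mean_vector(preds)

# Hypothetical toy data: columns of X are (tidal range, precipitation);
# columns of Y are presence (1) or absence (0) of two made-up species.
X = [[1.0, 10.0], [1.2, 12.0], [0.9, 11.0], [4.0, 30.0], [4.2, 28.0], [3.9, 31.0]]
Y = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]

forest = fit_forest(X, Y)
pred = predict(forest, [1.0, 10.0])
print(pred)  # predicted occurrence vector for a low-tidal-range, low-precipitation site
```

The key difference from a single-response random forest is in `sse`: each candidate split is scored against the whole response vector at once, so one tree describes the assemblage rather than one species. A realistic model would grow deeper trees and redraw the predictor subset at every split.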
