Developing genomic resources using striped bass (Morone saxatilis): genetic structure, associations, and text-mining
University of New Brunswick
The advent of next-generation sequencing technologies has revolutionized the field of molecular ecology, facilitating increasingly fine-scale detection of genetic differences among populations and of adaptively significant mutations. In this thesis, I use genomics to help solve longstanding mysteries of Striped Bass genetics and lay the groundwork for future studies. In the first chapter, I characterize a group of Striped Bass that were thought to have been extirpated from the Saint John River, but that likely survive as a remnant population. In the second chapter, I investigate connectivity and relatedness of Striped Bass populations more widely across their native range on the North American Atlantic Coast. I found that the Gulf of St. Lawrence, Shubenacadie River, and Saint John River populations were all very distinct from each other and from US populations. US Striped Bass could be separated into three major regions: Hudson River-Kennebec River, Chesapeake Bay-Delaware River, and Roanoke River-Cape Fear River. These results are directly useful for management: my SNP loci assigned 99% of Striped Bass to these six regions, marking the first time that Roanoke River Striped Bass have been reliably distinguished from Chesapeake Bay Striped Bass. Additionally, the presence of apparent US-origin Striped Bass on the northeastern coast of Nova Scotia raises important questions about movement patterns of Striped Bass in this area and highlights the importance of further study. In the third chapter, I use computer simulations to assess the performance of four recent techniques for finding associations between phenotypes and genotypes. I found that the Random Forest algorithm with population correction performed similarly to a recent, complex model implemented in confounder-adjusted multiple testing.
Finally, in the fourth chapter, I create nine novel tools and use them to build an automated text-mining pipeline that scans full-text articles and extracts sentences containing associations between genes and ecological variables. This pipeline is the first step toward improving genome annotations of non-model organisms such as Striped Bass. Together, these four chapters lay important groundwork for future genomic research on both Striped Bass and other ecologically important species.