TY - JOUR
T1 - Scalable Sparse Testing Genomic Selection Strategy for Early Yield Testing Stage
AU - Atanda, Sikiru Adeniyi
AU - Olsen, Michael
AU - Crossa, Jose
AU - Burgueño, Juan
AU - Rincent, Renaud
AU - Dzidzienyo, Daniel
AU - Beyene, Yoseph
AU - Gowda, Manje
AU - Dreher, Kate
AU - Boddupalli, Prasanna M.
AU - Tongoona, Pangirayi
AU - Danquah, Eric Yirenkyi
AU - Olaoye, Gbadebo
AU - Robbins, Kelly R.
N1 - Publisher Copyright:
Copyright © 2021 Atanda, Olsen, Crossa, Burgueño, Rincent, Dzidzienyo, Beyene, Gowda, Dreher, Boddupalli, Tongoona, Danquah, Olaoye and Robbins.
PY - 2021/6/22
Y1 - 2021/6/22
AB - To enable a scalable sparse testing genomic selection (GS) strategy at preliminary yield trials in the CIMMYT maize breeding program, optimal approaches to incorporate genotype-by-environment interaction (GEI) in genomic prediction models are explored. Two cross-validation schemes were evaluated: CV1, predicting the genetic merit of new bi-parental populations that have been evaluated in some environments and not others, and CV2, predicting the genetic merit of half of a bi-parental population that has been phenotyped in some environments and not others, using the coefficient of determination (CDmean) to determine optimized subsets of a full-sib family to be evaluated in each environment. We report similar prediction accuracies in CV1 and CV2; however, CV2 has an intuitive appeal in that all bi-parental populations have representation across environments, allowing efficient use of information across environments. It is also ideal for building robust historical data because all individuals of a full-sib family have phenotypic data, albeit in different environments. Results show that grouping environments according to similar growing/management conditions improved prediction accuracy and reduced computational requirements, providing a scalable, parsimonious approach to multi-environmental trials and GS in early testing stages. We further demonstrate that complementing the full-sib calibration set with optimized historical data improves prediction accuracy for the cross-validation schemes.
KW - CDmean
KW - factor analytic
KW - genomic selection
KW - prediction accuracy
KW - preliminary yield trials
KW - unstructured model
UR - http://www.scopus.com/inward/record.url?scp=85112122670&partnerID=8YFLogxK
U2 - 10.3389/fpls.2021.658978
DO - 10.3389/fpls.2021.658978
M3 - Article
AN - SCOPUS:85112122670
SN - 1664-462X
VL - 12
JO - Frontiers in Plant Science
JF - Frontiers in Plant Science
M1 - 658978
ER -