Using LASSO to estimate marker effects for genomic selection

2009

Abstract

Here we propose a least absolute shrinkage and selection operator (LASSO) approach to estimating marker effects for genomic selection. It uses the least angle regression (LARS) algorithm, modified to include a cross-validation step that defines the best subset of markers to include in the model. LASSO-LARS was tested on simulated data consisting of 5,865 individuals and 6,000 SNPs. The last generations of this dataset were the selection candidates. Using only animals from generations prior to the candidates, three approaches to splitting the population into training and validation sets for cross-validation were evaluated, and different sizes of the validation sample were tested. BLUP and Bayesian methods were also run for comparison. The most reliable cross-validation method was random splitting of the overall population with a validation sample size of 50% of the reference population. The accuracy of the GEBVs (correlation with true breeding values) in the candidate population obtained by LASSO-LARS was 0.89 with 156 explanatory SNPs. This value was higher than those obtained with BLUP and Bayesian methods, which were 0.75 and 0.84, respectively. It was concluded that the LASSO-LARS approach is a good alternative way to estimate marker effects for genomic selection.
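
The general idea of choosing a sparse marker set by validating on a random 50% split can be illustrated with a small sketch. This is not the authors' implementation: it solves the LASSO by coordinate descent rather than LARS, picks the penalty from a fixed grid, and scores each penalty by the correlation between predicted and observed phenotypes in the validation half, all of which are assumptions made for illustration. The function names (`lasso_cd`, `choose_lambda`, `simulate_toy`) and the toy data are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - mu - X b||^2 + lam*||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    mu, beta = 0.0, [0.0] * p
    resid = list(y)
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        shift = sum(resid) / n  # unpenalised intercept update
        mu += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_ss[j] == 0.0:  # marker with all-zero genotypes, skip
                continue
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_ss[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return mu, beta


def predict(X, mu, beta):
    """Predicted values: intercept plus the sum of marker effects."""
    return [mu + sum(b * x for b, x in zip(beta, row)) for row in X]


def choose_lambda(X, y, lambdas, val_fraction=0.5, seed=0):
    """Pick the penalty with the best validation correlation on one random split."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_val = int(len(idx) * val_fraction)
    val, train = idx[:n_val], idx[n_val:]
    X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
    X_va, y_va = [X[i] for i in val], [y[i] for i in val]
    best_lam, best_r = None, float("-inf")
    for lam in lambdas:
        mu, beta = lasso_cd(X_tr, y_tr, lam)
        r = pearson(predict(X_va, mu, beta), y_va)
        if r > best_r:
            best_lam, best_r = lam, r
    return best_lam


def simulate_toy(n=120, p=20, seed=1):
    """Toy data (hypothetical): SNPs coded 0/1/2, three of them with an effect."""
    rng = random.Random(seed)
    effects = {0: 2.0, 5: -1.5, 10: 1.0}
    X = [[rng.choice((0, 1, 2)) for _ in range(p)] for _ in range(n)]
    tbv = [sum(e * row[j] for j, e in effects.items()) for row in X]
    y = [g + rng.gauss(0.0, 0.5) for g in tbv]
    return X, y, tbv
```

In this sketch, the chosen penalty would then be used to refit on the whole reference population, and accuracy would be computed as `pearson(predict(X_candidates, mu, beta), tbv_candidates)`, mirroring the definition of accuracy given in the abstract.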
Using LASSO to estimate marker effects for genomic selection. 8:Suppl. 2 (2009), pp. 168-170. [10.4081/ijas.2009.s2.168]
Use this identifier to cite or link to this document: https://hdl.handle.net/11388/264847