AASC2020 AGM Webinars

December 3, 2020

The 2020 AASC AGM took place on 3 December 2020. You can find the videos of each presentation below with the abstracts.

ASReml and Genomic models

Arthur R Gilmour

It is 50 years since Arthur began working as a biometrician with NSW DPI (1970) having majored in biometry in the 4th year of his B Sc AGr (Sydney). He wrote software for the group as well as analysing variety trials and consulting with research scientists. This became the pattern for his career.

His alliance with Robin Thompson commenced in 1981 when Robin visited Massey University and continues today. Robin proposed the Average Information algorithm to Arthur in 1991 resulting in the development of ASReml in 1996. ASReml was particularly targetedat the analysis of plant breeding data with spatially correlated residuals and of animal breeding data utilising pedigree relationships. It was immediately adopted by many researchers, its core being soon included in Genstat by Sue Welham and into the ASReml-R procedure by David Butler.

Arthur retired from NSW DPI in 2009 and VSN purchased ASReml from NSW DPI and RR in 2010/12. The University of Wollongong funded Arthur to continue supporting ASReml for 3 years until VSN took up that role in 2014. The contract was revised in 2018/9 to resolve issues arising from the development of Echidna (EchidnaMMS.org) whose commercial use is restricted to ASReml licensees. Echidna has proved a useful platform to trial various improvements to ASReml.

ASReml 4.1 (2014) involved a major upgrade to the user interface with the adoption of a functional syntax for model specification similar to that already developed for ASReml-R.

Since then, the focus has been on improving the speed of ASReml 4.2 when fitting less sparse models. The SPARSE matrix methods as implemented in ASReml were inefficient for less sparse models and it has taken 6 years to make reasonable progress speeding up 10 different blocks of code for these models. Three things have contributed to the speed up: use of parallel processing, use of BLOCK processing rather than row by row processing and more efficient ordering of some operations. Even so, I expect further improvement is possible. Recent benchmarking against ASReml 4.1 for a model including a dense genomic relationship matrix of order 3200 showed 4.2 was 10 times faster than 4.1. The factor varies from 1 to 100 depending on the data set and model.

Modelling enhancements include (Double) Hierarchal Generalized Linear Models and bivariate GLMM (but not for multiple threshold models). ASReml-R 4 is based on ASReml 4.1 and a free student licence is available. An update based on ASReml 4.2 is in preparation by David Butler. ASReml 4.2 is also free for approved students.

I thank Jesus Christ and VSN for their continued support and the opportunity to help explore aspects of this incredible world He created.

Development of Genomic tools to support ASReml-R

Salvador Gezan

ASReml is a mature statistical software for the analysis of complex linear mixed models using Residual Maximum likelihood (REML) that is widely used in plant and animal breeding. ASReml is a batch program that can be executed using operating system level commands. An interactive interface, ASReml-W allows to organize, run commands, and display input and output files in a friendly manner using ASReml in the background.

In contrast, ASReml-R is a package that runs inside the R environment with the same capabilities as ASReml, but it uses all the available statistical computing and graphics that R provides. R language is widely used by the scientific community to perform a range of statistical and mathematical analyses through its thousands of libraries available. This environment has allowed ASReml-R users to exploit all its capabilities by developing analytical pipelines around data and output from ASReml-R.

The increased availability of molecular data for many crops and species has expanded the quantitative genetic analyses and allowed for accelerating selection of outstanding genotypes and greater genetic gains. Here, the use of genomic prediction (GP) and genome-wide association studies (GWAS) have become a critical tool to achieve these benefits. ASReml-R is particularly suited to fit genomic selection models, such as GBLUP and rrBLUP, with the use of its powerful REML and mixed model machinery allowing for flexible and complex model fitting. However, these analyses require some complex preprocessing of molecular, phenotypic and, in some cases, pedigree datasets. In addition, the output from these models often requires post-processing to provide the summarized information breeders require for their programs.

ASRgenomics is an R supporting package developed within VSN International to complement and facilitate the use of this genomic information within ASReml-R. This package contains efficient and useful R functions that help on pre-processing of data, such as filtering and generating additive and dominant genomic matrices, for GBLUP or ssGBLUP analyses. In addition, it has functions that help to assess the quality of the data to perform genomic studies by pairing pedigree-based against genomicbased relationship matrices using both statistical and graphical output, among other things. All these functions are designed to generate objects to be used directly by ASReml-R, or objects from the model fitting that generate additional information.

An open supporting and independent package such as ASRgenomics will allow user to facilitate and expedite analytical pipelines, but will also allow for fast and better interaction with other scientists from the community to incorporate new genomic developments using the machinery of ASReml successfully.

Genstat development in difficult times

Roger Payne

The AASC conference is always a highlight for us at VSN, and it is a real disappointment that we cannot meet this year. Nevertheless it is good to be able to keep in touch by email, MS Teams and Zoom - and to know that you are keeping safe.

The Genstat team is dispersed over the two hemispheres, and so we are used to collaborating at a distance. Genstat development has continued almost unaffected, and I am glad to have this opportunity to tell you what we have been doing. We are developing Genstat-based apps for future cloud services, which we shall be able to demonstrate when we can next meet together. The main activity, though, has been to complete the 21st Edition of Genstat.

A major focus of the 21st Edition has been to provide new menus and procedures to help determine the appropriate fixed terms to include in a REML analysis. This is a weak point of all REML software (including previous Genstat editions), as the focus has always been more on the randommodel. However, with wider usage of REML outside variety trials, QTL analysis etc, the fixed model is becoming more important. You can now explore the model by adding and dropping terms, just as you would in ordinary regression. You can also fit all subsets, and use statistics such as the Akaike or Schwarz Bayesian Information Coefficients to decide on the best model.

There are new menus and procedures for multi-treatment meta analysis based on summary results from the individual experiments, for response surfaces and for separation plots. There are also many minor extensions, including the ability to provide BLUPs from ANOVA.

Design is another important development area, and we have new commands to construct augmented block designs and doubly resolvable row-column designs, which should be very useful in all our core biological areas and especially variety trials.

Other new commands allow you to obtain details of the syntax of commands and the source code of procedures stored in a procedure library. This will be especially useful for the support calls that sometimes occur, where the user still has the library but has lost the original source. The source can now be recovered for updates and modification. These provide the basis for new menus to encourage use of Genstat’s commands. You can insert standard sections of code into a text box, for example for loops and if-blocks. There is also a menu where you can enter the options and parameters for a command to use any directive or procedure.

So Genstat’s life goes on in these difficult times. Please keep safe until Covid 19 is less of a danger, so that we can meet again in 2022.

CycDesigN Version 7.0 Update

Nye John, Dave Whitaker and Emplyn Williams

CycDesigN version 7.0 (CD7) has extensive features for the design of field and glasshouse experiments, including plant phenotyping. This talk will focus on the substantial new facilities available in CD7, in particular neighbour balance and evenness of distribution (NB & ED) designs as applied to resolvable, non-resolvable, single and multi-location partially replicated designs. A number of examples will be presented.