Aims

To support the free and open dissemination of research findings and information on alcoholism and alcohol-related problems. To encourage open access to peer-reviewed articles free for all to view.

For full versions of posted research articles, readers are encouraged to email requests for "electronic reprints" (text files, PDF files, or fax copies) to the corresponding or lead author, who is highlighted in the posting.

___________________________________________

Saturday, May 1, 2010

An Incomplete-Data Quasi-Likelihood Approach to Haplotype-Based Genetic Association Studies on Related Individuals


We propose an incomplete-data, quasi-likelihood framework for estimation and score tests that accommodates both dependent and partially observed data. The motivation comes from genetic association studies, where we address the problems of estimating haplotype frequencies and testing association between a disease and haplotypes of multiple, tightly linked genetic markers, using case-control samples containing related individuals.

We consider a more general setting in which the complete data are dependent with marginal distributions following a generalized linear model. We form a vector, Z, whose elements are conditional expectations of the elements of the complete-data vector, given selected functions of the incomplete data. Assuming that the covariance matrix of Z is available, we create an optimal linear estimating function based on Z, which we solve by an iterative method.
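As a rough illustration of the kind of estimating equation described above, the following minimal sketch solves a generic quasi-likelihood equation D' Sigma^{-1} (Z - mu(beta)) = 0 by Fisher scoring. It is not the authors' implementation. The scalar parameter beta, the logit link, the covariate list x, and the helper names solve, expit, and fit_scalar are all assumptions made for illustration. The sketch also treats Z and its covariance matrix as fixed inputs, whereas in the abstract's setting Z is a vector of conditional expectations that may itself depend on the parameters.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # pivot row
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def expit(t):
    """Inverse logit (an assumed link function for this sketch)."""
    return 1.0 / (1.0 + math.exp(-t))


def fit_scalar(z, x, sigma, beta=0.0, tol=1e-10, max_iter=50):
    """Fisher scoring for a scalar beta with mean mu_i = expit(beta * x_i).

    z     : observed vector (stands in for Z; treated as fixed here)
    x     : hypothetical covariates, one per element of z
    sigma : known symmetric covariance matrix of z, encoding dependence
    """
    for _ in range(max_iter):
        mu = [expit(beta * xi) for xi in x]
        d = [xi * m * (1.0 - m) for xi, m in zip(x, mu)]  # d mu_i / d beta
        w = solve(sigma, d)                               # Sigma^{-1} d
        score = sum(wi * (zi - mi) for wi, zi, mi in zip(w, z, mu))
        info = sum(wi * di for wi, di in zip(w, d))       # d' Sigma^{-1} d
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta
```

Calling fit_scalar(z, x, sigma) with a block-structured sigma (nonzero off-diagonal entries for related individuals, zeros elsewhere) shows how known dependence enters only through the weighting by the inverse covariance, while the mean model stays marginal.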

This approach addresses key difficulties in haplotype frequency estimation and testing problems in related individuals: (a) dependence that is known but can be complicated; (b) data that are incomplete for structural reasons, as well as possibly missing, with different amounts of information for different observations; (c) the need for computational speed to analyze large numbers of markers; and (d) a well-established null model but an alternative model that is unknown and is difficult to specify fully in related individuals.

For haplotype analysis, we give sufficient conditions for consistency and asymptotic normality of the estimator and asymptotic χ² null distribution of the score test.

We apply the method to test for association of haplotypes with alcoholism in the GAW 14 COGA data set.




_______________________________________________________