Ultra-high-dimensional data with grouping structures arise naturally in many contemporary statistical problems, such as genome-wide association studies and multi-factor analysis of variance (ANOVA). To address this issue, we propose a group screening method that performs variable selection on groups of variables in linear models. The method is based on a working-independence assumption, and we establish the sure screening property for it. To enhance the finite-sample performance, we develop a data-driven thresholding rule and a two-stage iterative procedure. To the best of our knowledge, screening for grouped variables has rarely appeared in the literature, and this method can be regarded as an important and non-trivial extension of screening for individual variables. An extensive simulation study and a real data analysis demonstrate its finite-sample performance.
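To make the group screening idea concrete, here is a minimal Python sketch. It is not the authors' exact procedure: the abstract does not state the screening statistic, so the sketch assumes a marginal utility equal to the sum of squares explained by regressing the response on one group at a time (ignoring all other groups, in the spirit of working independence), divided by the group size. The function name `group_screen` and the fixed cut-off `keep` are hypothetical; the data-driven threshold and the two-stage iteration are not implemented.

```python
import numpy as np


def group_screen(X, y, groups, keep):
    """Rank groups by an assumed marginal utility and keep the top `keep`.

    X      : (n, p) design matrix
    y      : (n,) response vector
    groups : (p,) array of group labels, one per column of X
    keep   : number of groups to retain (hypothetical fixed cut-off)
    """
    y = y - y.mean()                      # centre the response
    labels = np.unique(groups)
    utilities = {}
    for g in labels:
        Xg = X[:, groups == g]
        Xg = Xg - Xg.mean(axis=0)         # centre the group's columns
        # Marginal least-squares fit of y on this group alone.
        coef, *_ = np.linalg.lstsq(Xg, y, rcond=None)
        fitted = Xg @ coef
        # Explained sum of squares, normalised by group size (an assumption).
        utilities[g] = float(fitted @ fitted) / Xg.shape[1]
    ranked = sorted(labels, key=lambda g: utilities[g], reverse=True)
    return ranked[:keep], utilities
```

Dividing by the group size is one simple way to keep large groups from being favoured merely because they contain more columns; other normalisations are possible.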
The unified weighting scheme for the local-linear smoother in analysing functional data can deal with data that are dense, sparse or of neither type. In this paper, we focus on the convergence rate of functional principal component analysis using this method. We establish almost sure asymptotic consistency and rates of convergence for the estimators of the eigenvalues and eigenfunctions. We also provide the convergence rate of the variance estimator of the measurement error. Under these results, the number of observations within each curve can be of any rate relative to the sample size, which is consistent with earlier conclusions about the asymptotic properties of the mean and covariance estimators.
Suppose that we observe y | θ, τ ∼ N_p(Xθ, τ^{−1}I_p), where θ is an unknown coefficient vector and τ is an unknown precision. Estimating the regression coefficient θ with known τ has been well studied. However, statistical properties such as admissibility in estimating θ with unknown τ are not well studied. Han [(2009). Topics in shrinkage estimation and in causal inference (PhD thesis). Wharton School, University of Pennsylvania] appears to be the first to consider the problem, developing sufficient conditions for the admissibility of estimators of the means of multivariate normal distributions with unknown variance. We generalise these sufficient conditions for admissibility and apply the results to the normal linear regression model. For two-level and three-level hierarchical models with unknown precision τ, we investigate when a standard class of hierarchical priors leads to admissible estimators of θ under the normalised squared error loss. One reason to consider this problem is the importance of admissibility in hierarchical prior selection, and we expect that our study could provide some reference for choosing hierarchical priors.
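For reference, the setting of the preceding abstract can be written out as follows. The abstract does not give the loss explicitly, so the form below is an assumption: it is the usual precision-scaled squared error. The symbols δ (an estimator of θ) and R (its risk) are introduced here only for illustration.

```latex
% Model as stated in the abstract
y \mid \theta, \tau \sim N_p\!\left(X\theta,\ \tau^{-1} I_p\right)

% Assumed form of the normalised squared error loss
L(\theta, \tau; \delta) = \tau \,\lVert \delta - \theta \rVert^2,
\qquad
R(\theta, \tau; \delta) = \mathbb{E}_{\theta,\tau}\, L\bigl(\theta, \tau; \delta(y)\bigr)

% Admissibility: \delta is admissible if there is no \delta' with
R(\theta, \tau; \delta') \le R(\theta, \tau; \delta) \quad \text{for all } (\theta, \tau),
% and strict inequality for at least one (\theta, \tau).
```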
Quantile treatment effects can be important causal estimands in the evaluation of biomedical treatments or interventions for health outcomes such as medical cost and utilisation. We consider their estimation in observational studies with many possible covariates, under the assumption that treatment and potential outcomes are independent conditional on all covariates. To obtain valid and efficient treatment effect estimators, we replace the set of all covariates by lower-dimensional sets when estimating the quantiles of potential outcomes. These lower-dimensional sets are obtained using sufficient dimension reduction tools and are outcome specific. We justify our choice from an efficiency point of view. We prove the asymptotic normality of our estimators, and our theory is complemented by simulation results and an application to data from the University of Wisconsin Health Accountable Care Organization.
Multivariate mixtures are encountered in situations where the data are repeated or clustered measurements in the presence of heterogeneity among the observations with unknown proportions. In such situations, the main interest may be not only in estimating the component parameters, but also in obtaining reliable estimates of the mixing proportions. In this paper, we propose an empirical likelihood approach combined with a novel dimension reduction procedure for estimating the parameters of a two-component multivariate mixture model. The performance of the new method is compared to fully parametric as well as almost nonparametric methods used in the literature.