library("tidyverse")

Download the data here and load them using:

data <- read_rds("assets/genotoy.rds")

The data variable is a list containing the following objects

Q1 and B1 correspond to highly polygenic phenotypes, whereas Q2 and B2 correspond to mildly polygenic phenotypes.

  1. Plot the distribution of the minor allele frequencies
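
As a rough illustration of what this step involves, the sketch below computes minor allele frequencies on simulated genotypes rather than on the course data. The matrix `G` (individuals in rows, polymorphisms in columns, allele counts 0/1/2) and all other names are assumptions made for the example; the actual layout of the objects in `data` may differ.

```r
# Simulated stand-in for a genotype matrix (NOT the course data):
# rows are individuals, columns are polymorphisms, entries are allele counts.
set.seed(1)
n <- 200   # individuals
m <- 500   # polymorphisms
freq <- runif(m, 0.05, 0.95)
G <- sapply(freq, function(f) rbinom(n, size = 2, prob = f))

# Frequency of the counted allele, then folded so that it refers to the
# rarer of the two alleles.
af <- colMeans(G, na.rm = TRUE) / 2
maf <- pmin(af, 1 - af)

hist(maf, breaks = 30,
     xlab = "Minor allele frequency",
     main = "Distribution of minor allele frequencies")
```

Folding with `pmin()` matters only if the counted allele is not already the minor one; it is harmless otherwise.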

For each phenotype, analyse the whole dataset:

  1. Compute the effect sizes and the \(p\)-values
  2. Which polymorphisms are significant at the 5% family-wise error rate?
  3. Which polymorphisms are significant at the 25% FDR?
  4. Plot the quantile plot for the \(p\)-values
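
The four steps above can be sketched as follows, again on simulated data. The example assumes one marginal regression per polymorphism (linear for quantitative traits, logistic for binary ones), Bonferroni correction for the family-wise error rate, and the Benjamini-Hochberg procedure for the FDR. These are common choices, not necessarily the ones intended by the exercise, and the helper `assoc()` is hypothetical.

```r
# Simulated genotypes and phenotypes (NOT the course data).
set.seed(2)
n <- 300
m <- 200
G <- sapply(runif(m, 0.1, 0.5), function(f) rbinom(n, size = 2, prob = f))
beta <- c(rep(0.5, 5), rep(0, m - 5))   # first 5 polymorphisms are causal
y_quant <- as.vector(G %*% beta + rnorm(n))
y_bin <- rbinom(n, 1, plogis(as.vector(scale(G %*% beta))))

# Hypothetical helper: one marginal regression per polymorphism,
# returning the estimated effect size and its p-value.
assoc <- function(y, G, family = gaussian()) {
  res <- apply(G, 2, function(g) {
    fit <- glm(y ~ g, family = family)
    coef(summary(fit))["g", c(1, 4)]   # estimate and p-value
  })
  data.frame(snp = seq_len(ncol(G)), beta = res[1, ], p = res[2, ])
}

gwas_q <- assoc(y_quant, G)                    # quantitative trait
gwas_b <- assoc(y_bin, G, family = binomial()) # binary trait

# 5% family-wise error rate (Bonferroni) and 25% FDR (Benjamini-Hochberg).
gwas_q$fwer <- p.adjust(gwas_q$p, method = "bonferroni") < 0.05
gwas_q$fdr  <- p.adjust(gwas_q$p, method = "BH") < 0.25

# Quantile plot: observed against expected -log10(p) under the null.
observed <- -log10(sort(gwas_q$p))
expected <- -log10(ppoints(m))
plot(expected, observed,
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)")
abline(0, 1, lty = 2)
```

Points that rise above the dashed diagonal in the upper right corner indicate more small \(p\)-values than expected by chance.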

For each phenotype:

  1. Compute a polygenic score for each of the thresholds \(p < .003\), \(p < .01\), \(p < .03\), \(p < .1\) and \(p < .3\), using only the polymorphisms that pass the threshold
  2. Evaluate its accuracy using Pearson’s \(\rho\) for quantitative traits and the area under the ROC curve (AUC) for binary traits
  3. What’s the best choice of threshold for the \(p\)-value?
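
One possible way to organise this comparison is sketched below on simulated data. It assumes that effect sizes are estimated on a training subset and that the score is evaluated on the remaining individuals; the exercise does not state how the data should be split, so treat the split, the helper `auc()`, and every variable name as assumptions.

```r
# Simulated quantitative trait (NOT the course data).
set.seed(3)
n <- 600
m <- 300
G <- sapply(runif(m, 0.1, 0.5), function(f) rbinom(n, size = 2, prob = f))
beta_true <- rnorm(m, sd = 0.1) * rbinom(m, 1, 0.2)
y <- as.vector(G %*% beta_true + rnorm(n))
train <- seq_len(n) <= 400   # assumption: first 400 individuals for estimation

# Marginal effect sizes and p-values, estimated on the training set only.
stats <- t(apply(G[train, ], 2, function(g) {
  coef(summary(lm(y[train] ~ g)))["g", c(1, 4)]
}))
beta_hat <- stats[, 1]
p <- stats[, 2]

# Hypothetical helper: area under the ROC curve from ranks (Mann-Whitney),
# for a binary trait coded 0/1.
auc <- function(score, label) {
  r <- rank(score)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# One polygenic score per threshold, evaluated on the held-out individuals.
thresholds <- c(0.003, 0.01, 0.03, 0.1, 0.3)
accuracy <- sapply(thresholds, function(thr) {
  keep <- p < thr
  if (!any(keep)) return(NA_real_)   # no polymorphism passes the threshold
  score <- as.vector(G[!train, keep, drop = FALSE] %*% beta_hat[keep])
  cor(score, y[!train])              # for a binary trait: auc(score, y[!train])
})
names(accuracy) <- thresholds
accuracy
```

The threshold with the highest held-out accuracy is the natural candidate for the last question. Evaluating on the same individuals used to estimate the effect sizes would favour the most permissive threshold because of overfitting.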