How often does the GDC update the workflow/reference genome? If the GDC updates the workflow/reference genome, does the GDC re-process all data sets?
For the reference genome, the GDC has been using an augmented version of GRCh38.p2 (with additional decoy sequences and virus sequences) since inception. The GDC does not use alternative contigs, and only derives high-level data from the major chromosomes, so the same reference genome is used for both gene model GENCODE v22 (from Data Release 1 to 31) and GENCODE v36 (from Data Release 32). As future versions of the reference genome are released, e.g., GRCh39, the GDC will evaluate the benefits of updating data to utilize the new version.