SubKluster: Novel method to bin scaffolds from cereal genomes into subgenomes using substring frequency analysis
The genome of the Belinda variety of hexaploid oat (Avena sativa) has recently been sequenced and assembled. This project aims to improve the assembly by clustering its thousands of scaffolds into the three ancestral subgenomes using Principal Component Analysis (PCA) of k-mer and repeat-element frequencies. The method was developed using a chromosome-level assembly of hexaploid wheat (Tritiu