Fig. 5 | Genome Biology

Fig. 5

From: Resolving intra-repeat variation in medically relevant VNTRs from short-read sequencing data using the cardiovascular risk gene LPA as a model

Region of Interest (ROI) for KIV-2 read extraction from aligned WES data. The LPA gene is represented from KIV-1 to KIV-3 in the transcription direction and, for clarity, not to scale. The KIV-2 VNTR is represented with 6 repeats as in the reference genomes hg19 and hg38, with the third repeat in the transcription direction being a KIV-2B unit. The other KIV-2 units are KIV-2A (in each unit, the first exon is represented in yellow, the second exon in blue). The KIV-2B unit in the VNTR is highly homologous to KIV-3, which flanks the KIV-2 VNTR, showing the same first exon (purple) and 96% homology in the second exon. Additionally, all second KIV-2 exons are identical to the second KIV-1 exon (blue; see also supplementary Fig. 12 of ref. [2] for a detailed per-base identity matrix). In the first 200 intronic bp flanking each exon, 70–100% per-base identity is observed between KIV-1, KIV-2, KIV-2B, and KIV-3. Thus, minor differences in the ROI used for read extraction can extract highly homologous reads that create spurious variants if aligned to only one repeat, as done in this variant calling pipeline. ROI-1 extracts reads from the complete LPA region (from KIV-1 to the protease domain, represented by an arrow). ROI-2 to ROI-9 progressively restrict the extraction region to the KIV-2 VNTR only. Since the KIV-2B unit appears at a fixed position in the human genome reference, ROI-2 (previously published in [3]) and ROI-5 do not extract reads mapped to the KIV-2.2 exon 2 and KIV-2.3 subtype B, while ROI-6 and ROI-9 do not extract reads mapped to the KIV-2.3 subtype B. In combination with the presence or absence of KIV-2B units in the sequenced sample, this creates different spurious variant calls in a sample-specific manner. The size of the region enclosed by each ROI is given in bp between the start and end coordinates on hg38. Precise coordinates are provided in Additional file 1: Table S3.
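
The effect of ROI choice on read extraction can be illustrated with a minimal sketch. It is not the pipeline described in the article: the coordinates, read names, and the helpers `Interval`, `Read`, `overlaps`, and `extract_reads` are hypothetical and use an arbitrary toy axis rather than the hg38 coordinates of Table S3. The sketch only shows that a narrower ROI silently drops reads whose alignment falls outside it, even when those reads come from highly homologous units.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """Half-open interval [start, end) on a toy coordinate axis (hypothetical)."""
    start: int
    end: int


class Read(NamedTuple):
    """An aligned read, reduced to its name and alignment span (hypothetical)."""
    name: str
    start: int
    end: int


def overlaps(read: Read, roi: Interval) -> bool:
    """True if the read's alignment span shares at least one base with the ROI."""
    return read.start < roi.end and read.end > roi.start


def extract_reads(reads: List[Read], roi: Interval) -> List[str]:
    """Return the names of all reads overlapping the ROI, in input order."""
    return [r.name for r in reads if overlaps(r, roi)]


# Toy alignments: positions are invented for illustration only.
reads = [
    Read("read_kiv1", 100, 250),
    Read("read_kiv2_unit1", 1100, 1250),
    Read("read_kiv2_unit3_typeB", 3100, 3250),
    Read("read_kiv3", 7100, 7250),
]

roi_wide = Interval(0, 8000)       # analogous to an ROI spanning the whole region
roi_narrow = Interval(1000, 3000)  # analogous to an ROI restricted to part of the VNTR

wide = extract_reads(reads, roi_wide)
narrow = extract_reads(reads, roi_narrow)
```

With these toy values, `wide` contains all four reads, whereas `narrow` keeps only `read_kiv2_unit1`. Which homologous reads reach the downstream alignment therefore depends on the ROI boundaries alone.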
