Resource

Id pipeline/GPF_SFARI_annotation
Type annotation_pipeline
Version 0
Summary
Description
Labels

Pipeline Documentation

preamble

Summary GPF SFARI Annotation Pipeline
Description This is the pipeline used to annotate SFARI GPF instance resources
Input reference genome hg38/genomes/GRCh38.p14

Annotators

worst_effect
Type:

Worst effect across all transcripts.

source: worst_effect
worst_effect_genes
Type:

Comma-separated list of genes with the worst effect.

source: worst_effect_genes
gene_effects
Type:

<gene_1>:<effect_1>|... A gene can be repeated.

source: gene_effects
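
The gene_effects format above can be unpacked with a few lines of Python. This is only a sketch: the function name and the gene and effect values in the example are hypothetical, and it assumes each item holds exactly one gene and one effect separated by the first colon.

```python
from collections import defaultdict


def parse_gene_effects(value: str) -> dict[str, list[str]]:
    """Group effects by gene from a '<gene_1>:<effect_1>|...' string."""
    effects_by_gene: dict[str, list[str]] = defaultdict(list)
    for item in value.split("|"):
        if not item:
            continue  # tolerate an empty value or a trailing separator
        gene, effect = item.split(":", 1)
        # A gene can be repeated, so effects accumulate in a list.
        effects_by_gene[gene].append(effect)
    return dict(effects_by_gene)


# Hypothetical attribute value, for illustration only.
example = "GENE1:missense|GENE1:intron|GENE2:synonymous"
print(parse_gene_effects(example))
```

Keeping a list per gene preserves repeated genes instead of overwriting earlier effects.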
effect_details
Type:

Effect details for each affected transcript. Format: <transcript 1>:<gene 1>:<effect 1>:<details 1>|...

source: effect_details
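
A minimal sketch of reading the effect_details format into one record per transcript. The class and function names and the example value are hypothetical. It assumes the first three fields contain no colon, so that any further colons belong to the details field.

```python
from typing import NamedTuple


class TranscriptEffect(NamedTuple):
    transcript: str
    gene: str
    effect: str
    details: str


def parse_effect_details(value: str) -> list[TranscriptEffect]:
    """Split a '<transcript>:<gene>:<effect>:<details>|...' string into records."""
    records = []
    for item in value.split("|"):
        if not item:
            continue
        # Split at most three times so the details field keeps any colons
        # it may contain (an assumption, not a documented guarantee).
        # An item with fewer than four fields raises ValueError here.
        transcript, gene, effect, details = item.split(":", 3)
        records.append(TranscriptEffect(transcript, gene, effect, details))
    return records


# Hypothetical attribute value, for illustration only.
example = "TX1:GENE1:missense:10/200(Ala->Thr)|TX2:GENE1:intron:3/12"
for record in parse_effect_details(example):
    print(record.transcript, record.gene, record.effect, record.details)
```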
gene_list
Type: (Internal)

List of all genes

source: gene_list
Annotator type: effect_annotator

Annotator to identify the effect of the variant on protein coding.

More info

Resource
Type: genome
Summary:
Nucleotide sequence of the GRCh38.p14 genome assembly
Resource
Type: gene_models
Summary:
GENCODE 49, basic gene annotation on the primary assembly (chromosomes and scaffolds) sequence regions
normalized_allele
Type: (Internal)

Normalized allele.

source: normalized_allele
Annotator type: normalize_allele_annotator
No description
Resource
Type: genome
Summary:
Nucleotide sequence of the GRCh38.p14 genome assembly
dbSNP_RS
Type:

dbSNP ID (i.e. rs number)

allele_aggregator: list

HISTOGRAM
source: RS
Annotator type: allele_score

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

  • allele (default): exact chrom/pos/ref/alt match.
  • region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.
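
The difference between the two modes can be illustrated with a small in-memory sketch. It is not the annotator's implementation: the ScoreLine class, the lookup function, and the definition of the span (from pos to pos + len(ref) - 1, 1-based and inclusive) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class ScoreLine:
    chrom: str
    pos: int  # 1-based position of the first reference base
    ref: str
    alt: str
    score: float


def lookup(
    lines: Iterable[ScoreLine],
    chrom: str,
    pos: int,
    ref: str,
    alt: str,
    mode: str = "allele",
    aggregator: Callable[[list], float] = max,
) -> Optional[float]:
    """Return an aggregated score for one allele, or None when nothing matches."""
    if mode == "allele":
        # Exact chrom/pos/ref/alt match.
        hits = [
            line.score
            for line in lines
            if (line.chrom, line.pos, line.ref, line.alt) == (chrom, pos, ref, alt)
        ]
    elif mode == "region":
        # Every allele line overlapping the span of the queried allele.
        begin, end = pos, pos + len(ref) - 1
        hits = [
            line.score
            for line in lines
            if line.chrom == chrom
            and line.pos <= end
            and line.pos + len(line.ref) - 1 >= begin
        ]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return aggregator(hits) if hits else None


table = [
    ScoreLine("chr1", 100, "A", "G", 0.1),
    ScoreLine("chr1", 100, "A", "T", 0.3),
    ScoreLine("chr1", 101, "C", "T", 0.5),
]
print(lookup(table, "chr1", 100, "A", "G"))                  # allele mode
print(lookup(table, "chr1", 100, "AC", "A", mode="region"))  # region mode
```

In allele mode only the first line matches; in region mode all three lines overlap the two-base span and the aggregator decides the result.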

More info

  • input_annotatable: normalized_allele
Resource
Type: allele_score
Summary:
dbSNP: A public database of genetic variations for research and clinical use.
gnomad_v4_exomes_ALL_af
Type:

Alternate allele frequency

allele_aggregator: max

HISTOGRAM
source: AF
gnomad_v4_exomes_ALL_af_percent
Type:

Alternate allele frequency as percent

allele_aggregator: max

HISTOGRAM
source: AF_percent
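
As a small illustration of how the two frequency attributes relate, the sketch below converts a frequency to a percentage. It assumes that the percent value is simply the frequency multiplied by 100; the function name is hypothetical.

```python
def af_to_percent(af: float) -> float:
    """Convert an allele frequency in [0, 1] to a percentage in [0, 100]."""
    if not 0.0 <= af <= 1.0:
        raise ValueError(f"allele frequency out of range: {af}")
    return af * 100.0


# Illustrative value only: a frequency of 0.0005 corresponds to 0.05 percent.
print(af_to_percent(0.0005))
```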
Annotator type: allele_score

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

  • allele (default): exact chrom/pos/ref/alt match.
  • region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

More info

  • input_annotatable: normalized_allele
Resource
Type: allele_score
Summary:
gnomAD v4.1.0 exome variants (ALL)
gnomad_v4_genomes_ALL_af
Type:

Alternate allele frequency

allele_aggregator: max

HISTOGRAM
source: AF
gnomad_v4_genomes_ALL_af_percent
Type:

Alternate allele frequency as percent

allele_aggregator: max

HISTOGRAM
source: AF_percent
Annotator type: allele_score

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

  • allele (default): exact chrom/pos/ref/alt match.
  • region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

More info

  • input_annotatable: normalized_allele
Resource
Type: allele_score
Summary:
gnomAD v4.1.0 genome variants (ALL)
CLNDN
Type:

ClinVar's preferred disease name for the concept specified by disease identifiers in CLNDISDB

allele_aggregator: list

HISTOGRAM
source: CLNDN
CLNSIG
Type:

Aggregate germline classification for this single variant; multiple values are separated by a vertical bar

allele_aggregator: list

HISTOGRAM
source: CLNSIG
Annotator type: allele_score

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

  • allele (default): exact chrom/pos/ref/alt match.
  • region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

More info

  • input_annotatable: normalized_allele
Resource
Type: allele_score
Summary:
Measure used to assess the clinical significance of genetic variants
phylop100way
Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phyloP100way
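
The mean position aggregator can be pictured as averaging the per-position scores across the positions an annotatable covers. The sketch below assumes scores are held in a dictionary keyed by 1-based position and that positions without a score are skipped; the function name and these conventions are assumptions for illustration.

```python
from statistics import mean
from typing import Callable, Mapping, Optional


def aggregate_position_scores(
    scores_by_pos: Mapping[int, float],
    begin: int,
    end: int,
    aggregator: Callable[[list], float] = mean,
) -> Optional[float]:
    """Aggregate the scores of all scored positions in [begin, end], inclusive."""
    values = [
        scores_by_pos[pos]
        for pos in range(begin, end + 1)
        if pos in scores_by_pos  # positions without a score are skipped
    ]
    return aggregator(values) if values else None


# Illustrative scores only; position 102 has no score.
scores = {100: 1.0, 101: 2.0, 103: 6.0}
print(aggregate_position_scores(scores, 100, 103))
```

For a single-nucleotide variant the span is one position, so the mean equals that position's score.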
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 100 species
phylop30way
Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phyloP30way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 30 species
phylop20way
Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phyloP20way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 20 species
phylop7way
Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phyloP7way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 7 species
phastcons100way
Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phastCons100way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 100 species
phastcons30way
Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phastCons30way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 30 species
phastcons20way
Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phastCons20way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 20 species
phastcons7way
Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phastCons7way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 7 species
cadd_raw
Type:

CADD raw score for functional prediction of a SNP. The larger the score, the more likely the SNP has a damaging effect.

allele_aggregator: max

HISTOGRAM
source: cadd_raw
cadd_phred
Type:

CADD phred-like score. This is a phred-like rank score based on whole-genome CADD raw scores. The larger the score, the more likely the SNP has a damaging effect.

allele_aggregator: max

HISTOGRAM
source: cadd_phred
Annotator type: allele_score

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

  • allele (default): exact chrom/pos/ref/alt match.
  • region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

More info

Resource
Type: allele_score
Summary:
CADD (Combined Annotation Dependent Depletion) is a score that predicts the potential impact of a SNP
am_pathogenicity
Type:

AlphaMissense Pathogenicity score is a deleteriousness score for missense variants

allele_aggregator: max

HISTOGRAM
source: am_pathogenicity
Annotator type: allele_score

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

  • allele (default): exact chrom/pos/ref/alt match.
  • region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

More info

Resource
Type: allele_score
Summary:
Functional impact of mutations on protein function
hg19_annotatable
Type: (Internal)

The lifted over annotatable

source: liftover_annotatable
Annotator type: liftover_annotator

Annotator to lift over a variant from one reference genome to another.

More info

Resource
Type: liftover_chain
Summary:
Liftover Chain Hg38 to Hg19
Resource
Type: genome
Summary:
Nucleotide sequence of the GRCh38.p14 genome assembly
Resource
Type: genome
Summary:
HG19 reference genome
mpc
Type:

Missense badness, PolyPhen-2, and Constraint. A deleteriousness prediction score for missense variants.

allele_aggregator: max

HISTOGRAM
source: MPC
Annotator type: allele_score

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

  • allele (default): exact chrom/pos/ref/alt match.
  • region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: allele_score
Summary:
MPC (Missense badness, PolyPhen-2, and Constraint) is a composite score that predicts the impact of missense variants.
worst_effect_MANE_1.4
Type:

Worst effect across all transcripts.

source: worst_effect
effect_details_MANE_1.4
Type:

Effect details for each affected transcript. Format: <transcript 1>:<gene 1>:<effect 1>:<details 1>|...

source: effect_details
gene_effects_MANE_1.4
Type:

<gene_1>:<effect_1>|... A gene can be repeated.

source: gene_effects
Annotator type: effect_annotator

Annotator to identify the effect of the variant on protein coding.

More info

Resource
Type: genome
Summary:
Nucleotide sequence of the GRCh38.p14 genome assembly
Resource
Type: gene_models
Summary:
MANE gene model version 1.4
number_of_deletions_in_SSC_affected
Type:

The number of CNVs overlapping with the annotatable.

source: count
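
Counting the CNVs that overlap an annotatable amounts to an interval-overlap test. The sketch below assumes CNVs are given as (chrom, begin, end) tuples with 1-based inclusive coordinates; the function name and the example coordinates are hypothetical.

```python
from typing import Iterable, Tuple

Region = Tuple[str, int, int]  # (chrom, begin, end), 1-based and inclusive


def count_overlapping_cnvs(
    cnvs: Iterable[Region], chrom: str, begin: int, end: int
) -> int:
    """Count the CNVs sharing at least one position with chrom:begin-end."""
    return sum(
        1
        for cnv_chrom, cnv_begin, cnv_end in cnvs
        if cnv_chrom == chrom and cnv_begin <= end and cnv_end >= begin
    )


# Illustrative collection only.
collection = [
    ("chr1", 1_000, 5_000),
    ("chr1", 4_500, 9_000),
    ("chr2", 1_000, 5_000),
]
print(count_overlapping_cnvs(collection, "chr1", 4_800, 4_800))
```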
Annotator type: cnv_collection
No description
Resource
Type: cnv_collection
Summary:
De novo CNVs from SSC and AGRE WGS
number_of_duplications_in_SSC_affected
Type:

The number of CNVs overlapping with the annotatable.

source: count
Annotator type: cnv_collection
No description
Resource
Type: cnv_collection
Summary:
De novo CNVs from SSC and AGRE WGS
in_a_SFARI_gene_CNV
Type:

The number of CNVs overlapping with the annotatable.

source: count
Annotator type: cnv_collection
No description
Resource
Type: cnv_collection
Summary:
SFARI_Gene CNV collection

Files

Filename               Size      md5
annotation.yaml        3.29 KB   ac9013f7bcd11a306b6e9e37f6c39179
genomic_resource.yaml  196.0 B   d0b001f30fff2e7ed506b17cdfb008ed
statistics/