Lect 3: Models of CD

H. L. Mencken: For every complex human problem there is a neat and simple answer that is wrong

1) Planning stage: Gathering basic knowledge

2) Choosing a strategy to identify risk alleles of CD

a) Linkage studies

  • traditionally gold standard
  • used if primary target is to study families
  • identifies chromosomal regions that co-segregate with disease
  • Est likelihood of physical link bet an allele at locus & disease
  • Linkage degree -> LOD score
    • Z > 3 is sig +ve linkage & Z < -2 excludes linkage
  • Successful if penetrance high
  • Now post-genome -> precise location of human genes known + HapMap + SNPMap -> mutations & polymorph known -> testing candidate genes for r/s with disease susceptibility easier
  • Allele sharing method
  • Used in most mendelian disease to identify cause
  • Family studies & affected sib pairs (x many studies based on sib pairs)
  • Entirely computer dep

  • Non-parametric linkage studies


    • model-free -> applied to complex diseases
    • ignores unaffected individuals
    • looks at sharing of alleles in affected individuals only
    • method of choice in CD
      • sufficient fam members available
      • use extended fam
      • use affected sib-pair analysis
      • x case control studies
    • alleles/haplotypes: identical by descent (IBD, within family) or identical by state (IBS)
  • Limitations
    • Req informative families -> since CD sporadic -> statistical power low
    • Even if sig family available -> common diseases -> complex & phenotype det by environment + genetics -> x cluster in Mendelian fashion
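The LOD thresholds above (Z > 3, Z < -2) can be illustrated with a minimal sketch of a two-point LOD score. It assumes the simplest textbook case (phase-known, fully informative meioses, so the likelihood is binomial in the recombination fraction theta); real linkage software handles far more general pedigrees. The function names and the example counts are hypothetical.

```python
import math


def lod_score(recombinants: int, non_recombinants: int, theta: float) -> float:
    """Two-point LOD score Z(theta) = log10[L(theta) / L(0.5)].

    Simplifying assumption: phase-known, fully informative meioses,
    so the likelihood is binomial in the recombination fraction theta.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")

    def log_term(count: int, prob: float) -> float:
        # count * log10(prob), treating 0 * log10(0) as 0
        if count == 0:
            return 0.0
        if prob == 0.0:
            return -math.inf
        return count * math.log10(prob)

    n = recombinants + non_recombinants
    log_linked = log_term(recombinants, theta) + log_term(non_recombinants, 1.0 - theta)
    log_unlinked = n * math.log10(0.5)  # null hypothesis of no linkage: theta = 0.5
    return log_linked - log_unlinked


def interpret_lod(z: float) -> str:
    """Apply the conventional thresholds quoted in the notes."""
    if z > 3:
        return "significant linkage"
    if z < -2:
        return "linkage excluded"
    return "inconclusive"


# Hypothetical data: 1 recombinant among 10 informative meioses, tested at theta = 0.1
z = lod_score(recombinants=1, non_recombinants=9, theta=0.1)
print(round(z, 2), interpret_lod(z))
```

In practice Z is evaluated over a range of theta values and the maximum is reported; the single theta here is only for illustration.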


b) Association Studies

  • Sporadic cases analysed
  • Family cases may be collected but intro at end -> avoid bias
  • vital tool to identify risk alleles in CD
  • x physical link
  • 2 grps -> from same popn -> cases & healthy controls observed
  • 1st major -> MHC on chr 6p21.3
  • Association with increased risk -> OR
  • Used in complex diseases
  • To identify risk alleles
  • Commonly used -> Popn based case-control studies -> compare freq of 1 or more alleles -> higher in cases vs controls -> increased risk; higher in controls vs cases -> reduced risk (if statistically sig.)
  • Limitations
    • False +ves -> proved difficult to replicate
    • Small studies -> weak but impt associations missed
    • OR values -> estimates & shld be considered with their confidence interval (CI) range

  • Strategy used in CD (case control association studies)
    • Unrelated cases & healthy controls
    • Compare distribution of candidate alleles
    • Result -> genetic association
    • RISK -> measured as odds ratio
      • 1 = no association; <1 = reduced risk; >1 = increased risk
    • E.g. WTCCC1 -> 1st huge study -> 7 diseases incl. T1D
  • More apt for ID where transmission is horizontal
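The odds ratio and its confidence interval, as used in the case-control strategy above, can be sketched from a 2x2 table of allele carriers vs non-carriers. This is a minimal sketch assuming the Woolf (log-based) approximation for the 95% CI, which the notes do not specify; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Return (OR, lower, upper) for a 2x2 case-control table.

    The interval uses the Woolf approximation (an assumption here):
    exp(ln(OR) +/- z * SE), SE = sqrt(1/a + 1/b + 1/c + 1/d).
    """
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be > 0 for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 60/100 cases and 40/100 controls carry the candidate allele
or_value, ci_low, ci_high = odds_ratio_ci(60, 40, 40, 60)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

If the interval includes 1, the association is not statistically significant at that level, which is why a bare OR should not be read without its CI.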

G1: Identify risk alleles as an aid to diagnosis

  • Genetic test -> improves ability to distinguish between diseases at diagnosis in clinics
  • Risk -> measured as odds ratio -> low: x diagnostic use but good for pathogenesis knowledge

G2: Identify risk alleles & pathways involved in disease pathogenesis -> better understand disease pathology

  • Early pathogenesis of diseases unknown

G3: Identify risk alleles (involved in disease pathogenesis) – as an aid to patient mgmt & therapy choice -> to target for therapy more effectively & develop new therapies

Genome wide studies -> GWAS or GWLS dep on families available

  • Human Genome Mapping Project + SNPmap +Hapmap -> extensive study
  • Impute missing data from databases easily -> extending data
  • Sufficient fully genotyped samples with complete haplotypes in database
  • E.g. Impute missing genotypes based on expected linkage pattern -> exploits linkage diseqm
  • Imputations -> PLINK software
  • Greater the no of WGS -> greater the quality of imputed data
  • Success of method dep on degree of linkage diseqm between tagged SNPs on target haplotypes
  • x req hypothesis initially -> flexible; hypothesis generating
  • Hence hypothesis free GWAS -> data -> identifies regions arnd genes that contribute to biological systems -> impt in understanding disease pathology

Each strategy -> sub-strategy - info to det plan reqd -> genome wide or selective approach

  • if selective -> whole genome or single candidate gene looked at
  • dep on knowledge of previous & current studies & disease pathology

Studying extended haplotypes - lengthy DNA sections where multiple alleles inherited tgt in specific grps

  • Linkage diseqm (alleles of 2 or more genes found tgt more often than expected if they segregated independently)
  • Several impt genes all related to same pathway -> e.g. MHC contains key genes for T cell immunity -> widespread polymorph -> precise susceptibility allele -> difficult to identify
    • But data set large -> x issue
  • Combining haplotype data with GWAS data using imputation to assign HLA haplotypes -> interesting results in autoimmune liver disease primary biliary cirrhosis
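Linkage diseqm between two biallelic loci is usually quantified as D, D' and r². The following is a minimal sketch from phased haplotype counts; the function name and counts are hypothetical, and it assumes haplotype phase is already known (in practice phase often has to be inferred).

```python
def ld_stats(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, D', r^2) for two biallelic loci from haplotype counts.

    D   = p_AB - p_A * p_B (0 if the alleles associate at random)
    D'  = D / D_max (signed; |D'| = 1 means no evidence of recombination)
    r^2 = D^2 / [p_A (1 - p_A) p_B (1 - p_B)]
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")

    d = p_AB - p_A * p_B
    # D_max depends on the sign of D
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return d, d / d_max, d * d / denom


# Hypothetical counts: haplotypes AB and ab are over-represented
d, d_prime, r2 = ld_stats(n_AB=40, n_Ab=10, n_aB=10, n_ab=40)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

A high r² between a tagged SNP and an untyped variant is what makes imputation and tagging workable; with r² near 0 the tagged SNP carries little information about its neighbour.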

Field of statistics vital - Sample size (stats confidence), case & control selection, sampling errors, publication bias

a) Prevalence & Incidence

  • Incidence = no. of new cases/time
  • Prevalence = total no. of cases of disease at time X (new & pre-existing cases)
  • Diff reflect environmental factors
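A small worked example of the two measures above. This sketch assumes incidence is expressed as a rate per person-time at risk (the notes only say "per time") and prevalence as a point proportion; the function names and figures are hypothetical.

```python
def incidence_rate(new_cases: int, person_years_at_risk: float) -> float:
    """New cases per person-year at risk over the observation period."""
    if person_years_at_risk <= 0:
        raise ValueError("person-years at risk must be positive")
    return new_cases / person_years_at_risk


def point_prevalence(existing_cases: int, population: int) -> float:
    """Proportion of the population with the disease at time X (new + pre-existing)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases / population


# Hypothetical figures: 50 new cases over 100,000 person-years; 300 cases alive at time X
per_100k = 100_000
print(f"incidence:  {incidence_rate(50, 100_000) * per_100k:.0f} per 100,000 person-years")
print(f"prevalence: {point_prevalence(300, 100_000) * per_100k:.0f} per 100,000")
```

A chronic disease with low incidence can still have high prevalence, since prevalence accumulates pre-existing cases.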

b) Family, twin, adoption studies -> check signs of heritability


i) Family Studies - Informative but families in complex disease rare & x conform to Mendelian patterns

  • Incidence of traits
  • Inheritance pattern
  • Geography vital -> consider migration
  • Beware heritability -> when talking about genes

ii) Twin Studies

a) MZ twins (identical)
b) DZ twins (non-identical)
c) Identify genetic variation levels

  • higher concordance in MZ vs DZ -> sig genetic component
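The MZ vs DZ comparison can be sketched with pairwise concordance (concordant pairs divided by all pairs with at least one affected twin). Pairwise concordance is one of several definitions and is an assumption here; the function name and pair counts are hypothetical.

```python
def pairwise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Fraction of twin pairs (with >= 1 affected twin) in which both twins are affected."""
    total = concordant_pairs + discordant_pairs
    if total == 0:
        raise ValueError("no twin pairs supplied")
    return concordant_pairs / total


# Hypothetical study: 100 MZ pairs and 100 DZ pairs, each with at least one affected twin
mz = pairwise_concordance(concordant_pairs=30, discordant_pairs=70)
dz = pairwise_concordance(concordant_pairs=5, discordant_pairs=95)
print(f"MZ concordance = {mz:.2f}, DZ concordance = {dz:.2f}")
if mz > dz:
    print("MZ > DZ: consistent with a genetic component")
if mz < 1.0:
    print("MZ < 100%: non-genetic (e.g. environmental) factors also contribute")
```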

c) Linkage analysis -> map susceptibility loci -> families and several affected individuals
d) Association analysis -> narrow down region -> x req families with several affected individuals
e) Identify DNA seq variants conferring susceptibility
f) Define biochemical action

Before starting a study

  • Simple measures: Risk ratio; Concordance in twins; disease, characteristic, disorder/trait freq
  • Indicators: Familial aggregation; geographic clustering
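One common way to put a number on familial aggregation is the sibling recurrence risk ratio (often written λs): risk to siblings of an affected individual divided by population prevalence. Reading the "risk ratio" above as λs is an assumption, and the function name and figures below are hypothetical.

```python
def sibling_risk_ratio(sibling_risk: float, population_prevalence: float) -> float:
    """lambda_s = risk in siblings of affected individuals / population prevalence."""
    if not 0.0 < population_prevalence <= 1.0:
        raise ValueError("population prevalence must be in (0, 1]")
    if not 0.0 <= sibling_risk <= 1.0:
        raise ValueError("sibling risk must be in [0, 1]")
    return sibling_risk / population_prevalence


# Hypothetical figures: 3% risk in siblings vs 0.3% in the general population
lam = sibling_risk_ratio(sibling_risk=0.03, population_prevalence=0.003)
print(f"lambda_s = {lam:.1f}")
```

A ratio well above 1 indicates familial clustering, but shared environment can inflate it too, so it is a signal of heritability rather than proof.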

Studying selected chromosomal region - 2nd phase process & undertaken if prior studies identified regions of interest

  • Tagged SNPs indicate association with specific gene but more likely across an area
    • High resolution analysis using tagged SNPs -> identify association peaks in/arnd genes
  • However commercial SNP chip unavailable/costly to apply GWAS to single chromosome

Investigating pathways


  • Systems biology -> focuses on study of pathway interaction
  • With prior data -> able to study CD
  • E.g. Identification of NOD2 (CARD15) link to pathways associated with immune tolerance to commensal gut bacteria -> Crohn's disease -> value of studying processes instead of individual genes in CD

Candidate gene approach - single gene - Hypothesis driven -> so limited to observing associations at specified loci

  • loci selected -> hv some functional r/s bet gene pdt & disease
  • limitations:
    • lack understanding of human biology & disease pathology
    • Narrow view -> miss other associations
  • But if prior studies solid -> high resolution genotyping of gene to identify polymorph associated with disease susceptibility relevant
  • Early pathogenesis of diseases poorly understood (disease presents late) -> poor hypothesis -> gene candidate selection poor -> candidate gene studies rarely successful
  • Pathology changes but genome same -> so genome wide then observe pathways, haplotypes & candidate genes
  • Biological defects known: target specific gene or genes in biochemical pathway
  • If unknown biological defect or characterised with uncertainty
    • Whole genome scanning -> microsatellite/SNPs
    • Multiple candidates -> multiple pathways

Positional approach to select candidate

  • Single gene (selected candidate)
  • Chromosomal region (e.g. MHC)
  • Whole chromosome & GWAS

Functional approach to select candidates -> hypothesis driven

  • Candidate region/gene -> MHC on chromosome 6p21.3 -> genetic association with disease -> high as impt with immunity
  • Pathway (HSCR)
  • Both -> Complementary -> NOD2 (gene, also known as CARD15) in Crohn's Disease