The Chromosome 7 Annotation Project
About the Project
Overview
The objective of this project is to generate the most comprehensive description of human chromosome 7 to facilitate biological discovery, disease gene research and medical genetic applications. In our vision, the DNA sequence of chromosome 7 should be made available in a user-friendly manner, with every biologically and medically relevant feature annotated along its length. We have established this website and database as one step towards this goal. In addition to being a primary data source, we foresee this site serving as a "weighing station" for testing community ideas and information to produce highly curated data to be submitted to other databases such as NCBI, Ensembl, and UCSC. Therefore, any useful data submitted to us will be curated and shown in this database. For data sharing and submissions, please send any enquiry to tcag-chr7@sickkids.ca. A major challenge ahead will be to represent chromosome alterations, variants, and polymorphisms and their related phenotypes (or lack thereof) in an accessible way. This is our first attempt to accomplish this and, as will continue to be the case with the DNA sequence and gene annotation, improvements will occur incrementally. The project will be considered a success when an equal number of molecular biologists, medical geneticists, and physicians utilize the information.
I. Chromosome 7 Features (March 2003)
Chromosome type: Submetacentric
Overall: 157,953,789 bp
Short arm (7p):
Long arm (7q):
Centromere:
Two different polymorphic alpha satellite arrays are known, with an average size of 2,580 kb for one (D7Z1) and 265 kb for the other (D7Z2), as determined by pulsed-field gel electrophoresis. D7Z2 lies adjacent to 7p and D7Z1 adjacent to 7q. Therefore, to represent the 'average' chromosome we have inserted 2,700,000 bp
Overall average: 181 cM
Male average:
Female average:
Average recombination length:
G+C content: 40.75%
SINEs:
LINEs:
Segmental duplication (>5 kb and >90%):
Intra-chromosomal: 8.3 Mb (5.3%)
Trans-chromosomal:
Gene structure number: 1,917
Known genes:
Novel genes:
Partial genes:
Predicted genes:
Putative and non-coding:
TCR gene segments:
TCR pseudogenes:
Pseudogenes:
Alternative splicing:
Gene density:
Average size known gene:
Intergenic distance:
Largest gene:
Smallest gene:
Largest mRNA:
Overlapping genes:
% chromosome transcribed:
CpG islands:
Fragile sites:
FRA7A (rare, 7p11.2); FRA7B (7p22), FRA7C (7p14.2), FRA7D (7p13), FRA7E (7q21.2), FRA7F (7q22), FRA7G (7q31.2) and FRA7H (7q32.3) are all common fragile sites. FRA7E, FRA7G, FRA7H and FRA7I have been characterized at the molecular level.
Chromosome heteromorphisms:
An inversion polymorphism has been detected in some Williams-Beuren patients and carrier parents. Naturally occurring heteromorphism at the centromere is observed in the normal population. Centromeric heteromorphism has been found in spurious monosomy 7 in leukemia.
Synteny:
Six murine chromosomes (5, 6, 9, 11, 12 and 13) grouped into 36 blocks.
Imprinted genes and regions:
MEG1/GRB10 at 7p12 shows isoform- and tissue-specific imprinting in humans. In fetal brain, most isoforms are expressed from the paternal allele. In skeletal muscle, one isoform of GRB10 (gamma1) is expressed from the maternal allele alone, whereas in other fetal tissues, all GRB10 spliced isoforms are expressed from both parental alleles. Mouse Grb10 was found to be expressed from the maternal allele only. It functions as a growth suppressor via inhibitory interactions with the receptors for IGF-1, growth hormone and insulin.
SGCE at 7q21 is a component of the dystrophin-sarcoglycan complex. Imprinting of human SGCE has not been experimentally confirmed but the mouse gene is expressed from the paternal allele only. Moreover, heterozygous loss-of-function mutations in SGCE have been identified in myoclonus-dystonia syndrome (MDS), which demonstrates a marked difference in penetrance in MDS depending on the parental origin of the disease allele.
PEG10 at 7q21, located just distal to the SGCE gene, contains two long open reading frames with homology to the gag and pol proteins of some vertebrate retrotransposons. Paternal expression of PEG10 has been demonstrated in placental villi.
PEG1/MEST at 7q32 is maternally imprinted in many human and mouse fetal tissues. The function of the protein is unknown but it shows sequence similarity with the alpha/beta hydrolase fold family. Two isoforms, distinguished by having unique first exons, have been identified. In lymphoblastoid cells only isoform 1 is imprinted, whereas isoform 2 is biallelically expressed. Mest-deficient mice show growth retardation and abnormal maternal behavior.
COPG2 at 7q32 encodes coatomer protein complex subunit gamma 2. It is located adjacent to PEG1/MEST and the two genes have overlapping 3’-UTRs. Initially, COPG2 was determined to be maternally imprinted. Subsequent studies could not reproduce this result but instead indicated that the observation was due to overlap with the PEG1/MEST gene.
CIT1 at 7q32 is an antisense transcript located within intron 20 of COPG2. It is maternally imprinted in all fetal tissues examined. The DNA marker Mit1/Lb9, located within intron 20 of mouse Copg2, is also maternally imprinted. The mRNAs for CIT1 and Mit1/Lb9 are only partially known and, so far, they appear to be non-coding RNAs. The CPA4 gene has also recently been shown to be imprinted in some tissues.
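As a rough consistency check on the figures quoted under Chromosome 7 Features, the short Python sketch below recomputes the share of the chromosome covered by the 8.3 Mb of segmental duplication and derives an average recombination rate. It assumes that the percentage is expressed relative to the full 157,953,789 bp length and that 181 cM is the sex-averaged genetic length; the function and constant names are illustrative only and are not part of the project database.

```python
# Figures quoted in the Chromosome 7 Features list (March 2003).
CHR7_LENGTH_BP = 157_953_789   # overall chromosome length in base pairs
SEG_DUP_BP = 8_300_000         # 8.3 Mb of segmental duplication
GENETIC_LENGTH_CM = 181.0      # assumed here to be the sex-averaged genetic length


def percent_of_chromosome(span_bp, total_bp=CHR7_LENGTH_BP):
    """Return span_bp as a percentage of the chromosome length."""
    return 100.0 * span_bp / total_bp


def recombination_rate_cm_per_mb(genetic_cm, total_bp=CHR7_LENGTH_BP):
    """Return the average recombination rate in cM per megabase."""
    return genetic_cm / (total_bp / 1_000_000)


seg_dup_pct = percent_of_chromosome(SEG_DUP_BP)
rate = recombination_rate_cm_per_mb(GENETIC_LENGTH_CM)
print(f"Segmental duplication: {seg_dup_pct:.1f}% of the chromosome")
print(f"Average recombination rate: {rate:.2f} cM/Mb")
```

Under these assumptions, 8.3 Mb works out to about 5.3% of the chromosome, in agreement with the quoted percentage, and the average rate is roughly 1.15 cM per Mb.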
II. Cytogenetic characteristics:
Disease related:
-7q11.23 microdeletion in Williams-Beuren syndrome
-maternal uniparental disomy (UPD) of chromosome 7 observed in Russell-Silver syndrome (paternal UPD has also been detected but no defined phenotype is associated)
-various deletions of 7p associated with craniosynostosis
Cancer related:
-t(7;12) involving the ETV6 gene at 12p13 in myeloid disorders of children and other hematologic malignancies
-t(7;11)(p15;p15) involving fusion of HOXA9 and NUP98 in human myeloid leukemia
-t(2;7)(p12;q21) involving Ig kappa sequence with CDK6 in splenic marginal zone lymphoma
-t(7;17)(p15;q21) involving JAZF1 and JJAZ1 in endometrial stromal tumors
-t(3;7)(q27;p12) involving BCL6 and Ikaros in diffuse large B-cell lymphoma
Cytogenetic evidence suggesting that a disease-causing cancer gene resides on chromosome 7 (gene not yet identified):
Disease | Chromosome 7 abnormality
Acute lymphoblastic leukemia | Deletion 7q11
Acute lymphoblastic leukemia | Deletion 7q22
Acute myelogenous leukemia | Deletion 7q33-q35
Acute myeloid leukemia | Deletion 7q22
Acute non-lymphoblastic leukemia | Monosomy 7
B-cell chronic lymphoproliferative disorder | Monosomy 7
Bladder carcinoma | Trisomy 7
Brain tumors | Trisomy 7
Breast carcinoma | Deletion 7q11
Breast carcinoma | Deletion 7q31-q32
Chronic myeloid leukemia | Deletion 7q11
Colorectal carcinoma | Trisomy 7
Gastric MALT lymphoma | Deletion of 7p
Head and neck squamous cell carcinoma | Deletion 7q
Intestinal polyps | Trisomy 7
Lung adenocarcinoma | Deletion 7q22
Malignant andrological neoplasias | Deletion 7q31.1-q32
Malignant melanoma | Deletion 7q
Mesothelioma | Monosomy 7
Myelodysplastic syndrome | Monosomy 7
Myelodysplastic syndrome | Deletion 7q22
Myeloid disorders | Deletion 7q31-q32
Neurofibromatosis | Monosomy 7
Non-Hodgkin’s lymphoma | Monosomy 7
Non-Hodgkin’s lymphoma | Deletion 7q22
Ovarian carcinoma | Deletion 7q31-q32
Papillary renal carcinoma | Trisomy 7
Primary lung carcinoma | Trisomy 7
Primary prostate carcinoma | Trisomy 7
Primary prostate carcinoma | Deletion 7q
Skin carcinoma in situ | Deletion 7q31-q32
Small lymphocytic lymphoma | Deletion 7q32
Thyroid carcinoma | Deletion 7q22
Thyroid tumors | Trisomy 7
Uterine leiomyoma | Deletion 7q22-q31.1
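The disease and abnormality pairs listed above lend themselves to a simple lookup keyed by abnormality, which makes it easy to see which conditions share a candidate region such as 7q22. The Python sketch below is only an illustration of that idea: it uses a handful of rows copied from the list rather than the full set, and the names `ROWS`, `diseases_by_abnormality` and `abnormality_class` are hypothetical, not part of the project database.

```python
from collections import defaultdict

# A few (disease, abnormality) rows copied from the list above; not the full set.
ROWS = [
    ("Acute lymphoblastic leukemia", "Deletion 7q11"),
    ("Acute lymphoblastic leukemia", "Deletion 7q22"),
    ("Acute non-lymphoblastic leukemia", "Monosomy 7"),
    ("Bladder carcinoma", "Trisomy 7"),
    ("Myelodysplastic syndrome", "Monosomy 7"),
    ("Myelodysplastic syndrome", "Deletion 7q22"),
    ("Thyroid carcinoma", "Deletion 7q22"),
    ("Uterine leiomyoma", "Deletion 7q22-q31.1"),
]


def diseases_by_abnormality(rows):
    """Map each abnormality label to the sorted list of diseases listed with it."""
    index = defaultdict(set)
    for disease, abnormality in rows:
        index[abnormality].add(disease)
    return {abn: sorted(names) for abn, names in index.items()}


def abnormality_class(abnormality):
    """Coarse class taken from the first word: 'Deletion', 'Monosomy' or 'Trisomy'."""
    return abnormality.split()[0]


by_abn = diseases_by_abnormality(ROWS)
print(by_abn["Deletion 7q22"])
print(sorted({abnormality_class(abn) for abn in by_abn}))
```

Exact string matching is used on purpose: "Deletion 7q22" and "Deletion 7q22-q31.1" are kept as separate keys, since merging overlapping band intervals would require parsing the cytogenetic band notation.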
III. General History and Discussion:
Human Gene Mapping:
Chromosome 7 has been one of the more extensively studied chromosomes because of the many interesting genes and diseases that map to it. In the 1980s, the T-cell receptor gene families (TCRB and TCRG), erythropoietin (EPO), the multi-drug resistance genes (PGY1 and PGY3) and the homeobox A (HOXA) gene family were localized to chromosome 7, placing it in the spotlight of genetics research. In 1985, the cystic fibrosis locus was mapped to chromosome 7 by Lap-Chee Tsui and colleagues in
Key figures in the subsequent study of chromosome 7 and the human genome project emerged from the cystic fibrosis gene hunt. Helen Donis-Keller, then at Collaborative Research, shared reagents with Tsui in the mapping of the cystic fibrosis gene to chromosome 7. Donis-Keller, members of Collaborative Research, and others later produced the first genetic map of the human genome using restriction fragment length polymorphisms (RFLPs). On a side note, the CEO of Collaborative Research, Orrie Friedman, shocked the scientific community by stating that their company “owned chromosome 7”. Some consider this statement a foreshadowing of the events around the human genome sequencing race that would occur between the public and private sectors a decade later. Francis Collins, who would lead the U.S. National Institutes of Health (NIH) genome initiative, collaborated with Tsui in the cloning of the cystic fibrosis gene. The laboratories of Tsui, Donis-Keller, and Karl-Heinz Grzeschik
(
Genetic and Physical Maps:
With the advent of the human genome initiative in the early 1990s, many of the
initial NIH pilot study grants also focussed efforts on studying chromosome 7
at a higher resolution. A group led by Donis-Keller, Maynard Olson and David
Schlessinger at Washington University in St. Louis set out to make
chromosome-specific genetic and physical maps based on the sequence-tagged site
(STS) concept. Donis-Keller and Olson were working on chromosome 7 and Schlessinger
on the X chromosome. In 1994, a member of this group, Eric Green, moved the
chromosome 7 project to NIH operating within Francis Collins’ newly formed
intramural National Human Genome Research Institute (NHGRI) in
DNA Sequence Maps:
In the late 1990s, chromosome 7 once again took center stage when it was slated to be one of the first large human chromosomes to be sequenced. The bulk of the work by publicly funded sources would be completed at
Annotation of Chromosome 7 DNA Sequence:
With the majority of the DNA sequence of chromosome 7 finished, efforts will now include properly cataloguing all of the genes and their functions, identifying the genes and DNA sequence variations that either directly cause or are associated with disease, and defining all of the structural and functional features of the chromosome. Chromosome 7 contains many important genes for development and maintenance including Sonic hedgehog, the homeobox genes PAX4, HLXB9, GBX1, the HOXA cluster, DLX5 and DLX6, a cluster of cytochrome P450 oxidases, and the leptin gene, to name just a few.
There are over 360 disease-associated genes or loci on chromosome 7. The phenomenon of uniparental disomy (UPD) in humans was first described in a female cystic fibrosis patient found to have inherited two copies of chromosome 7, both of maternal origin (with no paternal contribution). Many individuals with maternal UPD of chromosome 7 have a growth deficiency disorder called Russell-Silver syndrome. Paternal uniparental disomy of chromosome 7 has been observed in a small number of individuals, but no consistent disease condition is associated with it. Recently, Anthony Monaco's group identified FOXP2 as the first gene found to be causative in a speech and language disorder. This discovery captured public attention, and the pursuit of the disease gene was described in Matt Ridley’s book Genome: The Autobiography of a Species in 23 Chapters. Media attention focussed on the fact that the
identification of FOXP2 provided yet
another example of how some human behavioral characteristics can be reduced to
gene function, or lack thereof. In fact, one of the first examples of a genetic
lesion underlying a behavioral condition was the finding that a 1.5 million
base pair deletion (and in some cases inversion) of 7q11.23 in Williams-Beuren
syndrome leaves affected individuals with a specific cognitive profile with
deficits in visuo-spatial reconstruction. There are also putative disease loci
for other neuropsychiatric diseases on chromosome 7 including autism,
alcoholism, bipolar affective disorder, panic disorder and schizophrenia. The
mapping of the first autism locus to 7q (by
Monosomy 7 is one of the most frequent chromosomal abnormalities observed in myelodysplasia and acute myelogenous leukemia (AML). The incidence of monosomy 7 is particularly high in those individuals who have been previously exposed to drugs, radiotherapy, or toxins. Monosomy 7 is also observed in constitutional disorders such as Fanconi's anemia, congenital neutropenia and familial monosomy 7. These congenital bone marrow disorders predispose to leukemia, usually through a myelodysplastic phase. Other cytogenetic abnormalities of chromosome 7 are found in many different types of human neoplasia, some of which present consistent patterns of genetic alteration (see above). It has been postulated that the long arm of chromosome 7 may contain multiple tumor suppressor genes and oncogenes, and even an anti-senescence gene. So far, the ST7 tumor suppressor gene in colon cancer, the MET proto-oncogene in hereditary papillary renal cell carcinoma, the CDK6 gene in splenic lymphoma, and a few other fusion proteins have been identified. Currently, intensive searches are ongoing for a putative tumor suppressor gene at 7q22 and another at 7q34-q35 involved in AML. All of these studies will benefit from the vast wealth of resources, data, and literature available for human chromosome 7.