General Information
Symbol
Dmel\Sox21a
Species
D. melanogaster
Name
Sox21a
Annotation Symbol
CG7345
Feature Type
FlyBase ID
FBgn0036411
Gene Model Status
Stock Availability
Gene Summary
Sox21a (Sox21a) encodes a Sox family transcription factor involved in the differentiation of stem cells in the midgut. [Date last reviewed: 2019-03-14] (FlyBase Gene Snapshot)
Key Links
Genomic Location
Cytogenetic map
Sequence location
Recombination map
3-42
RefSeq locus
NT_037436 REGION:14103481..14108118
Sequence
Other Genome Views
The following external sites may use different assemblies or annotations than FlyBase.
Function
GO Summary Ribbons
Gene Ontology (GO) Annotations (10 terms)
Molecular Function (2 terms)
Terms Based on Experimental Evidence (0 terms)
Terms Based on Predictions or Assertions (2 terms)
CV Term
Evidence
References
inferred from biological aspect of ancestor with PANTHER:PTN000030384
(assigned by GO_Central )
inferred from biological aspect of ancestor with PANTHER:PTN000030384
(assigned by GO_Central )
Biological Process (7 terms)
Terms Based on Experimental Evidence (3 terms)
CV Term
Evidence
References
inferred from mutant phenotype
inferred from mutant phenotype
inferred from mutant phenotype
Terms Based on Predictions or Assertions (4 terms)
CV Term
Evidence
References
inferred from biological aspect of ancestor with PANTHER:PTN000030384
(assigned by GO_Central )
inferred from biological aspect of ancestor with PANTHER:PTN000030384
(assigned by GO_Central )
inferred from biological aspect of ancestor with PANTHER:PTN000814418
(assigned by GO_Central )
inferred from biological aspect of ancestor with PANTHER:PTN000030384
(assigned by GO_Central )
Cellular Component (1 term)
Terms Based on Experimental Evidence (1 term)
CV Term
Evidence
References
located_in nucleus
inferred from direct assay
Terms Based on Predictions or Assertions (0 terms)
Protein Family (UniProt)
-
Summaries
Gene Snapshot
Sox21a (Sox21a) encodes a Sox family transcription factor involved in the differentiation of stem cells in the midgut. [Date last reviewed: 2019-03-14]
Gene Group (FlyBase)
HIGH MOBILITY GROUP BOX TRANSCRIPTION FACTORS -
The High mobility group box (HMGB) transcription factors are sequence-specific DNA binding proteins that regulate transcription. The HMGB proteins have a characteristic L-shaped HMGB domain of about 80 amino acid residues, which binds the DNA minor groove and induces DNA bending. The HMGB domains are found in one or more copies and are involved in the regulation of DNA-dependent processes such as transcription, replication and chromatin remodeling. (Adapted from FBrf0194706, FBrf0108466, PMID:24086078 and PMID:23153957).
Gene Model and Products
Number of Transcripts
2
Number of Unique Polypeptides
2

Please see the JBrowse view of Dmel\Sox21a for information on other features

To submit a correction to a gene model please use the Contact FlyBase form

Protein Domains (via Pfam)
Isoform displayed:
Pfam protein domains
InterPro name
classification
start
end
Protein Domains (via SMART)
Isoform displayed:
SMART protein domains
InterPro name
classification
start
end
Structure
Protein 3D structure (Predicted by AlphaFold) (AlphaFold entry M9PFG6)

Model Confidence:
  • Very high (pLDDT > 90)
  • Confident (90 > pLDDT > 70)
  • Low (70 > pLDDT > 50)
  • Very low (pLDDT < 50)

AlphaFold produces a per-residue confidence score (pLDDT) between 0 and 100. Some regions with low pLDDT may be unstructured in isolation.
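The confidence bands listed above can be applied to a list of per-residue scores with a few lines of Python. This is a minimal sketch, not part of FlyBase or AlphaFold: the function names `plddt_band` and `summarize_bands` are hypothetical, the example scores are invented for illustration, and the legend does not say where scores of exactly 90, 70, or 50 belong, so the sketch assumes they fall into the lower band.

```python
from collections import Counter
from typing import Iterable


def plddt_band(score: float) -> str:
    """Map one pLDDT score (0-100) to the confidence band named in the legend.

    Assumption: boundary values (exactly 90, 70, 50) go to the lower band,
    because the legend uses strict inequalities and leaves them unassigned.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"pLDDT must be between 0 and 100, got {score}")
    if score > 90:
        return "Very high"
    if score > 70:
        return "Confident"
    if score > 50:
        return "Low"
    return "Very low"


def summarize_bands(scores: Iterable[float]) -> Counter:
    """Count how many residues fall into each confidence band."""
    return Counter(plddt_band(s) for s in scores)


# Invented example scores, not taken from the M9PFG6 model.
example_scores = [95.2, 80.0, 60.5, 30.1, 90.0]
band_counts = summarize_bands(example_scores)
```

A large share of residues in the "Low" or "Very low" bands would be consistent with the note above that such regions may be unstructured in isolation.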

Comments on Gene Model

Gene model reviewed during 5.44

Stop-codon suppression (UGA) postulated; FBrf0216884.

Gene model reviewed during 5.46

Sequence Ontology: Class of Gene
Transcript Data
Annotated Transcripts
FlyBase ID     Length (nt)   Assoc. CDS (aa)
FBtr0075748    2335          388
FBtr0330178    2335          407
Additional Transcript Data and Comments
Reported size (kB)
Comments
External Data
Crossreferences
Polypeptide Data
Annotated Polypeptides
FlyBase ID     Predicted MW (kDa)   Length (aa)   Theoretical pI
FBpp0075490    40.8                 388           10.58
FBpp0303211    42.6                 407           10.57
Polypeptides with Identical Sequences

None of the polypeptides share 100% sequence identity.

Additional Polypeptide Data and Comments
Reported size (kDa)
Comments
External Data
Crossreferences
InterPro - A database of protein families, domains and functional sites
Linkouts
Sequences Consistent with the Gene Model
Mapped Features

Click to get a list of regulatory features (enhancers, TFBS, etc.) and gene disruptions (point mutations, indels, etc.) within or overlapping Dmel\Sox21a using the Feature Mapper tool.

External Data
Crossreferences
Eukaryotic Promoter Database - A collection of databases of experimentally validated promoters for selected model organisms.
Linkouts
Expression Data
Expression Summary Ribbons
Colored tiles in ribbon indicate that expression data has been curated by FlyBase for that anatomical location. Colorless tiles indicate that there is no curated data for that location.
For complete stage-specific expression data, view the modENCODE Development RNA-Seq section under High-Throughput Expression below.
Transcript Expression
Polypeptide Expression
immunolocalization
Stage
Tissue/Position (including subcellular localization)
Reference
Additional Descriptive Data

Sox21a appears to be preferentially expressed in differentiating enteroblasts in the midgut epithelium. It is not detected in newly eclosed adults. Weak expression is observed in enteroblasts and intestinal stem cells of the midgut epithelium at 5 days and it increases with age. Expression is higher in enteroblasts than in intestinal stem cells.

Marker for
 
Subcellular Localization
CV Term
Evidence
References
located_in nucleus
inferred from direct assay
Expression Deduced from Reporters
High-Throughput Expression Data
Associated Tools

GBrowse - Visual display of RNA-Seq signals

View Dmel\Sox21a in GBrowse 2
RNA-Seq by Region - Search RNA-Seq expression levels by exon or genomic region
Reference
See Gelbart and Emmert, 2013 for analysis details and data files for all genes.
Developmental Proteome: Life Cycle
Developmental Proteome: Embryogenesis
External Data and Images
Linkouts
BDGP expression data - Patterns of gene expression in Drosophila embryogenesis
DRscDB - A single-cell RNA-seq resource for data mining and data comparison across species
EMBL-EBI Single Cell Expression Atlas - Single cell expression across species
FlyAtlas - Adult expression by tissue, using Affymetrix Dros2 array
FlyAtlas2 - A Drosophila melanogaster expression atlas with RNA-Seq, miRNA-Seq and sex-specific data
Flygut - An atlas of the Drosophila adult midgut
Images
Alleles, Insertions, Transgenic Constructs, and Aberrations
Classical and Insertion Alleles (8)
Other relevant insertions
Transgenic Constructs (14)
Transgenic constructs containing/affecting coding region of Sox21a
Transgenic constructs containing regulatory region of Sox21a
Aberrations (Deficiencies and Duplications) (1)
Inferred from experimentation (1)
Gene disrupted in
Inferred from location (0)
Alleles Representing Disease-Implicated Variants
Phenotypes
For more details about a specific phenotype click on the relevant allele symbol.
Lethality
Allele
Sterility
Allele
Other Phenotypes
Allele
Phenotype manifest in
Allele
Orthologs
Human Orthologs (via DIOPT v8.0)
Homo sapiens (Human) (24)
Species\Gene Symbol
Score
Best Score
Best Reverse Score
Alignment
Complementation?
Transgene?
7 of 15
Yes
No
6 of 15
No
No
6 of 15
No
No
0  
5 of 15
No
No
5 of 15
No
No
5 of 15
No
Yes
1  
5 of 15
No
No
4 of 15
No
No
4 of 15
No
No
4 of 15
No
Yes
3 of 15
No
No
3 of 15
No
Yes
3 of 15
No
Yes
2 of 15
No
No
2 of 15
No
No
2 of 15
No
No
2 of 15
No
No
2 of 15
No
No
2 of 15
No
Yes
1 of 15
No
Yes
1 of 15
No
Yes
1 of 15
No
Yes
1 of 15
No
No
1 of 15
No
No
Model Organism Orthologs (via DIOPT v8.0)
Mus musculus (laboratory mouse) (22)
Species\Gene Symbol
Score
Best Score
Best Reverse Score
Alignment
Complementation?
Transgene?
7 of 15
Yes
No
6 of 15
No
No
6 of 15
No
No
5 of 15
No
No
4  
5 of 15
No
No
5 of 15
No
Yes
5 of 15
No
No
4 of 15
No
No
4 of 15
No
No
4 of 15
No
Yes
4 of 15
No
Yes
3 of 15
No
No
3 of 15
No
No
3 of 15
No
No
3 of 15
No
No
2 of 15
No
No
2 of 15
No
No
2 of 15
No
No
2 of 15
No
No
2 of 15
No
Yes
1 of 15
No
Yes
1 of 15
No
Yes
Rattus norvegicus (Norway rat) (25)
5 of 13
Yes
No
5 of 13
Yes
Yes
4 of 13
No
No
4 of 13
No
No
4 of 13
No
Yes
4 of 13
No
No
3 of 13
No
No
3 of 13
No
No
2 of 13
No
No
2 of 13
No
No
2 of 13
No
No
2 of 13
No
No
2 of 13
No
No
2 of 13
No
No
1 of 13
No
Yes
1 of 13
No
Yes
1 of 13
No
Yes
1 of 13
No
Yes
1 of 13
No
Yes
1 of 13
No
Yes
1 of 13
No
Yes
1 of 13
No
Yes
1 of 13
No
Yes
1 of 13
No
Yes
1 of 13
No
Yes
Xenopus tropicalis (Western clawed frog) (20)
7 of 12
Yes
Yes
6 of 12
No
Yes
6 of 12
No
Yes
6 of 12
No
Yes
6 of 12
No
No
5 of 12
No
No
5 of 12
No
Yes
4 of 12
No
No
4 of 12
No
Yes
3 of 12
No
No
2 of 12
No
No
2 of 12
No
No
2 of 12
No
No
2 of 12
No
No
2 of 12
No
No
2 of 12
No
No
1 of 12
No
Yes
1 of 12
No
Yes
1 of 12
No
Yes
1 of 12
No
Yes
Danio rerio (Zebrafish) (31)
7 of 15
Yes
No
6 of 15
No
No
5 of 15
No
No
5 of 15
No
No
5 of 15
No
No
5 of 15
No
Yes
5 of 15
No
Yes
5 of 15
No
Yes
4 of 15
No
No
4 of 15
No
No
4 of 15
No
No
4 of 15
No
No
4 of 15
No
No
4 of 15
No
Yes
4 of 15
No
Yes
3 of 15
No
No
3 of 15
No
No
3 of 15
No
No
3 of 15
No
No
2 of 15
No
No
2 of 15
No
No
2 of 15
No
No
2 of 15
No
No
2 of 15
No
No
2 of 15
No
No
1 of 15
No
No
1 of 15
No
No
1 of 15
No
No
1 of 15
No
No
1 of 15
No
No
Caenorhabditis elegans (Nematode, roundworm) (5)
8 of 15
Yes
Yes
5 of 15
No
No
5 of 15
No
Yes
4 of 15
No
No
1 of 15
No
No
Arabidopsis thaliana (thale-cress) (2)
1 of 9
Yes
No
1 of 9
Yes
No
Saccharomyces cerevisiae (Brewer's yeast) (1)
5 of 15
Yes
Yes
Schizosaccharomyces pombe (Fission yeast) (3)
3 of 12
Yes
Yes
2 of 12
No
Yes
2 of 12
No
No
Other Organism Orthologs (via OrthoDB)
Paralogs
Paralogs (via DIOPT v8.0)
Drosophila melanogaster (Fruit fly) (7)
7 of 10
6 of 10
6 of 10
6 of 10
6 of 10
3 of 10
2 of 10
Human Disease Associations
FlyBase Human Disease Model Reports
Disease Model Summary Ribbon
Disease Ontology (DO) Annotations
Models Based on Experimental Evidence (5)
Potential Models Based on Orthology (4)
Modifiers Based on Experimental Evidence (3)
Allele
Disease
Interaction
References
is ameliorated by Mmp2dsRNA.UAS
is ameliorated by TimpUAS.cPa
is ameliorated by bskDN.UAS
is ameliorated by hep1
is exacerbated by pucE69
is ameliorated by upd2HMS00901
is ameliorated by upd2NIG.5988R
Disease Associations of Human Orthologs (via DIOPT v8.0 and OMIM)
Note that ortholog calls supported by only 1 or 2 algorithms (DIOPT score < 3) are not shown.
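The display rule in the note above can be sketched in Python, assuming score strings of the form "7 of 15" as they appear in the ortholog tables of this report. The function names `parse_diopt_score` and `is_shown` are hypothetical and this is not how FlyBase or DIOPT implement the filter; it only illustrates the stated threshold (calls with a score below 3 are hidden).

```python
from typing import List, Tuple


def parse_diopt_score(text: str) -> Tuple[int, int]:
    """Parse a score string such as '7 of 15' into (supporting, total)."""
    supporting, separator, total = text.partition(" of ")
    if not separator:
        raise ValueError(f"unexpected score format: {text!r}")
    return int(supporting), int(total)


def is_shown(text: str, minimum: int = 3) -> bool:
    """Return True when the call is supported by at least `minimum` algorithms."""
    supporting, _total = parse_diopt_score(text)
    return supporting >= minimum


# Score strings in the format used by the ortholog tables of this report.
scores: List[str] = ["7 of 15", "3 of 15", "2 of 15", "1 of 15"]
shown = [s for s in scores if is_shown(s)]
```

With the default threshold, the calls scored "2 of 15" and "1 of 15" are dropped and the other two are kept.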
Functional Complementation Data
Functional complementation data is computed by FlyBase using a combination of the orthology data obtained from DIOPT and OrthoDB and the allele-level genetic interaction data curated from the literature.
Interactions
Summary of Physical Interactions
Summary of Genetic Interactions
esyN Network Diagram
esyN Network Key:
Suppression
Enhancement

Please look at the allele data for full details of the genetic interactions
Starting gene(s)
Interaction type
Interacting gene(s)
Reference
Starting gene(s)
Interaction type
Interacting gene(s)
Reference
External Data
Linkouts
DroID - A comprehensive database of gene and protein interactions.
MIST (genetic) - An integrated Molecular Interaction Database
MIST (protein-protein) - An integrated Molecular Interaction Database
Pathways
Signaling Pathways (FlyBase)
Metabolic Pathways
FlyMet - A comprehensive tissue-specific metabolomics resource for Drosophila.
External Data
Linkouts
Genomic Location and Detailed Mapping Data
Chromosome (arm)
3L
Recombination map
3-42
Cytogenetic map
Sequence location
FlyBase Computed Cytological Location
Cytogenetic map
Evidence for location
70D2-70D2
Limits computationally determined from genome sequence between P{PZ}l(3)70Da02402&P{PZ}btl00208 and P{PZ}Mpcp00564
Experimentally Determined Cytological Location
Cytogenetic map
Notes
References
Experimentally Determined Recombination Data
Location
Left of (cM)
Right of (cM)
Notes
Stocks and Reagents
Stocks (15)
Genomic Clones (27)
cDNA Clones (16)
 

Please note: this section lists cDNAs and ESTs that fall within the genomic extent of the gene model, which may include cDNAs and ESTs of genes within introns, or of overlapping genes. Please see GBrowse for alignment of the cDNAs and ESTs to the gene model.

cDNA clones, fully sequenced
BDGP DGC clones
Other clones
Drosophila Genomics Resource Center cDNA clones

For each fully sequenced cDNA, the DGRC maintains various forms of the cDNA (e.g., tagged or untagged) in several different host vectors for subsequent cloning and expression in Drosophila and Drosophila cell lines.

cDNA Clones, End Sequenced (ESTs)
BDGP DGC clones
    RNAi and Array Information
    Linkouts
    DRSC - Results from RNAi screens
    GenomeRNAi - A database for cell-based and in vivo RNAi phenotypes and reagents
    Antibody Information
    Laboratory Generated Antibodies
    Commercially Available Antibodies
     
    Other Information
    Relationship to Other Genes
    Source for database identity of

    Source for identity of: Sox21a CG7345

    Source for database merge of

    Source for merge of: Sox21a SoxB2.3

    Additional comments
    Other Comments

    S2 cells treated with dsRNA generated against this gene show reduced phagocytosis of Candida albicans compared to untreated cells.

    Origin and Etymology
    Discoverer
    Etymology
    Identification
    External Crossreferences and Linkouts (36)
    Sequence Crossreferences
    NCBI Gene - Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.
    GenBank Protein - A collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB.
    RefSeq - A comprehensive, integrated, non-redundant, well-annotated set of reference sequences including genomic, transcript, and protein.
    UniProt/GCRP - The gene-centric reference proteome (GCRP) provides a 1:1 mapping between genes and UniProt accessions in which a single 'canonical' isoform represents the product(s) of each protein-coding gene.
    UniProt/TrEMBL - Automatically annotated and unreviewed records of protein sequence and functional information
    Other crossreferences
    AlphaFold DB - AlphaFold provides open access to protein structure predictions for the human proteome and other key proteins of interest, to accelerate scientific research.
    BDGP expression data - Patterns of gene expression in Drosophila embryogenesis
    Drosophila Genomics Resource Center - Drosophila Genomics Resource Center (DGRC) cDNA clones
    DRscDB - A single-cell RNA-seq resource for data mining and data comparison across species
    EMBL-EBI Single Cell Expression Atlas - Single cell expression across species
    Eukaryotic Promoter Database - A collection of databases of experimentally validated promoters for selected model organisms.
    FlyAtlas - Adult expression by tissue, using Affymetrix Dros2 array
    FlyAtlas2 - A Drosophila melanogaster expression atlas with RNA-Seq, miRNA-Seq and sex-specific data
    Flygut - An atlas of the Drosophila adult midgut
    FlyMet - A comprehensive tissue-specific metabolomics resource for Drosophila.
    GenomeRNAi - A database for cell-based and in vivo RNAi phenotypes and reagents
    iBeetle-Base - RNAi phenotypes in the red flour beetle (Tribolium castaneum)
    InterPro - A database of protein families, domains and functional sites
    KEGG Genes - Molecular building blocks of life in the genomic space.
    MARRVEL_MODEL - MARRVEL (model organism gene)
    modMine - A data warehouse for the modENCODE project
    Linkouts
    DroID - A comprehensive database of gene and protein interactions.
    DRSC - Results from RNAi screens
    FlyCyc Genes - Genes from a BioCyc PGDB for Dmel
    FlyMine - An integrated database for Drosophila genomics
    MIST (genetic) - An integrated Molecular Interaction Database
    MIST (protein-protein) - An integrated Molecular Interaction Database
    Synonyms and Secondary IDs (7)
    Datasets (0)
    Study focus (0)
    Experimental Role
    Project
    Project Type
    Title
    Study result (0)
    Result
    Result Type
    Title
    References (61)