
Extract exon list from UCSC

Hassan Foroughi edited this page Oct 30, 2018 · 2 revisions
  • To download exon information from UCSC:
    1. Download the knownCanonical genes from the UCSC track with the columns chrom, chromStart, chromEnd, known gene ID, and known gene symbol, and save the result as knowngene.bed.
    2. Download the exons in BED format from the UCSC track (table: knownGene; select BED as the output, then select exons), and save the result as knowngene_exon.bed.
    3. Join the two files on the known gene ID with this bash command:
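# Needs bash (process substitution). Both inputs are sorted on column 4, the known gene ID.
# Output columns: chrom, start, end, gene symbol, third "_"-separated part of the exon name, strand.
join -1 4 -2 4 -o 2.1,2.2,2.3,1.5,2.5,2.6 \
	<(sort -k4,4 knowngene.bed) \
	<(awk '{split($4, a, "_"); print $1"\t"$2"\t"$3"\t"a[1]"\t"a[3]"\t"$6}' \
		knowngene_exon.bed | sort -k4,4) \
	| tr ' ' '\t' > CanonicalExon.bed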
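
The join step can be checked on a small made-up input before running it on the real downloads. The sketch below is only an illustration: every ID, symbol, and coordinate is hypothetical, the exon names merely mimic the `<id>_exon_<index>_...` pattern that the awk step assumes, and temporary files replace process substitution so that it also runs in a plain POSIX shell. Setting `LC_ALL=C` is an added assumption that keeps `sort` and `join` on the same collation order.

```shell
# Toy check of the join step; all IDs, symbols, and coordinates are made up.
export LC_ALL=C                       # assumption: same collation for sort and join
tmp=$(mktemp -d)

# knowngene.bed: chrom, chromStart, chromEnd, known gene ID, gene symbol
printf 'chr1\t100\t400\tuc000aaa.1\tGENEA\n' >  "$tmp/knowngene.bed"
printf 'chr1\t900\t950\tuc000bbb.1\tGENEB\n' >> "$tmp/knowngene.bed"

# knowngene_exon.bed: chrom, start, end, name, score, strand
printf 'chr1\t100\t200\tuc000aaa.1_exon_0_0_chr1_101_f\t0\t+\n'   >  "$tmp/knowngene_exon.bed"
printf 'chr1\t300\t400\tuc000aaa.1_exon_1_0_chr1_301_f\t0\t+\n'   >> "$tmp/knowngene_exon.bed"
printf 'chr1\t900\t950\tuc000bbb.1_exon_0_0_chr1_901_r\t0\t-\n'   >> "$tmp/knowngene_exon.bed"
# This ID is absent from knowngene.bed, so the join should drop it.
printf 'chr1\t5000\t5100\tuc000zzz.1_exon_0_0_chr1_5001_f\t0\t+\n' >> "$tmp/knowngene_exon.bed"

# Sort both inputs on column 4 (the gene ID) into temporary files.
sort -k4,4 "$tmp/knowngene.bed" > "$tmp/genes.sorted"
awk '{split($4, a, "_"); print $1"\t"$2"\t"$3"\t"a[1]"\t"a[3]"\t"$6}' \
	"$tmp/knowngene_exon.bed" | sort -k4,4 > "$tmp/exons.sorted"

# Same join and column selection as the command above.
join -1 4 -2 4 -o 2.1,2.2,2.3,1.5,2.5,2.6 "$tmp/genes.sorted" "$tmp/exons.sorted" \
	| tr ' ' '\t' > "$tmp/CanonicalExon.bed"

cat "$tmp/CanonicalExon.bed"
```

With this input the result should contain three tab-separated rows: two exons for GENEA and one for GENEB, with no row for the unmatched `uc000zzz.1` exon.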