Re-annotating the GPL570 platform: obtaining lncRNA expression profiles
Recommended reference:
Steps described in the paper:
LncRNA classification pipeline:
To evaluate the lncRNA expressions in the probe ID-centric glioma gene expression data, we developed a lncRNA classification pipeline to identify the lncRNAs represented on the Affymetrix array using the following steps.
First, the Affymetrix HG-U133 Plus 2.0 probe set ID was mapped to the NetAffx Annotation Files (HG-U133 Plus 2.0 Annotations, CSV format, release 31, 08/23/10). The annotations included the probe set ID, gene symbol, gene title, Ensembl gene ID, Refseq transcript ID and other informative items for the specific probe set. The probe ID-centric gene expression data were joined with the annotation files on the probe set ID.
Second, the probe sets that were assigned with a Refseq transcript ID and/or Ensembl gene ID in the NetAffx annotations were extracted. For the probe sets with Refseq IDs, we only retained those labeled as “NR_” (NR indicates non-coding RNA in the Refseq database). For the probe sets with Ensembl gene IDs, we only retained those annotated with “lincRNA”, “processed transcripts”, “non-coding” or “misc_RNA” in Ensembl annotations (accessible at the UCSC genome browser: http://www.genome.ucsc.edu/).
Third, we filtered the probe sets obtained in step 2 by filtering out pseudogenes, rRNAs, microRNAs and other short RNAs including tRNAs, snRNAs and snoRNAs.
Finally, 2448 annotated lncRNA transcripts with corresponding Affymetrix probe IDs were generated.
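The steps above can be sketched in Python with pandas. This is only a rough illustration under several assumptions, not the paper's actual code. It assumes the NetAffx CSV uses the column names "Probe Set ID", "RefSeq Transcript ID" and "Ensembl", that multiple IDs in one cell are separated by "///", and that "---" marks a missing value. The biotype table (ensembl_gene_id, biotype), the exact biotype spellings, the function names and the toy IDs are all hypothetical. Here a probe set with RefSeq IDs is kept only if every one of them starts with "NR_", which is one possible reading of step 2. Step 3 is applied through Ensembl biotypes only.

```python
import pandas as pd

# Assumed spellings of the Ensembl biotype labels named in the paper.
KEEP_BIOTYPES = {"lincRNA", "processed_transcript", "non_coding", "misc_RNA"}
DROP_BIOTYPES = {"pseudogene", "rRNA", "miRNA", "tRNA", "snRNA", "snoRNA"}


def split_ids(cell):
    """Split a multi-value annotation cell ('A /// B'); '---' means missing."""
    if pd.isna(cell) or cell.strip() in ("", "---"):
        return []
    return [x.strip() for x in cell.split("///") if x.strip()]


def classify_lncrna_probes(annot, biotypes):
    """Return the probe set IDs that pass steps 2 and 3 of the pipeline."""
    biotype_of = dict(zip(biotypes["ensembl_gene_id"], biotypes["biotype"]))
    keep = []
    for _, row in annot.iterrows():
        refseq = split_ids(row["RefSeq Transcript ID"])
        types = {biotype_of.get(g) for g in split_ids(row["Ensembl"])} - {None}
        # Step 2: all RefSeq IDs are NR_, or an Ensembl gene has a kept biotype.
        only_nr = bool(refseq) and all(r.startswith("NR_") for r in refseq)
        has_nc_biotype = bool(types & KEEP_BIOTYPES)
        # Step 3: drop pseudogenes, rRNAs, microRNAs and other short RNAs.
        is_excluded = bool(types & DROP_BIOTYPES)
        if (only_nr or has_nc_biotype) and not is_excluded:
            keep.append(row["Probe Set ID"])
    return keep


# Toy data with made-up IDs, standing in for the real files.
annot = pd.DataFrame({
    "Probe Set ID": ["p1", "p2", "p3", "p4"],
    "RefSeq Transcript ID": ["NR_000001", "NM_000002 /// NR_000003",
                             "NR_000004", "---"],
    "Ensembl": ["---", "ENSG01", "ENSG02", "ENSG03"],
})
biotypes = pd.DataFrame({
    "ensembl_gene_id": ["ENSG01", "ENSG02", "ENSG03"],
    "biotype": ["protein_coding", "miRNA", "lincRNA"],
})
expr = pd.DataFrame({
    "Probe Set ID": ["p1", "p2", "p3", "p4"],
    "sample_A": [5.1, 7.3, 2.2, 6.0],
})

# Step 1: join the expression data with the annotation on the probe set ID.
merged = expr.merge(annot, on="Probe Set ID", how="inner")

# Steps 2 and 3: keep only the probe sets classified as lncRNA.
lnc_ids = classify_lncrna_probes(annot, biotypes)
lnc_expr = merged[merged["Probe Set ID"].isin(lnc_ids)]
print(lnc_expr[["Probe Set ID", "sample_A"]])
```

With real data, annot would be read from the NetAffx CSV and biotypes from an Ensembl export. The column names should be checked against the actual files first.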
Fellow members of this community: anyone interested is welcome to message me privately to discuss!