Converting between long and wide data in R

The long-format data is shown in the figure:

The wide-format data is shown in the figure:

library(magrittr)
library(tidyr)
library(reshape2)
#Wide to long

    #gather method
    #ddd is assumed to be the wide table: years as row names, one column per month
    test <- as.data.frame(ddd)
    test$year <- rownames(test)
    test1 <- gather(test,key="month",value="temperature",-year) %>%
    .[order(.$year),]
    write.table(test1,file="chang_data.txt",sep="\t",quote=FALSE,col.names=TRUE,row.names=FALSE)

    #melt method
    test2 <- melt(test,id.vars=c('year'),variable.name='month',value.name='temperature') %>% 
    .[order(.$year),]

#Long to wide

    #spread method
    kuan_data_1 <- spread(test2,month,temperature)

    #dcast method (the formula refers to columns by their bare names)
    kuan_data_2 <- dcast(test2,year~month,value.var='temperature')
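
To make the two shapes concrete, here is a minimal sketch of what the reshaping does, written in plain Python rather than R. The function names `wide_to_long` and `long_to_wide` and the sample values are made up for illustration; this is not how tidyr or reshape2 are implemented.

```python
def wide_to_long(rows, id_col, key_name, value_name):
    """Turn every non-id column of each wide row into its own long row."""
    long_rows = []
    for row in rows:
        for col, val in row.items():
            if col != id_col:
                long_rows.append({id_col: row[id_col], key_name: col, value_name: val})
    return long_rows


def long_to_wide(rows, id_col, key_name, value_name):
    """Group long rows by id and spread the key column back into columns."""
    wide = {}
    for row in rows:
        wide.setdefault(row[id_col], {id_col: row[id_col]})[row[key_name]] = row[value_name]
    return list(wide.values())


# Made-up values, only to show the two shapes.
wide = [
    {"year": "2001", "Jan": 1.5, "Feb": 2.0},
    {"year": "2002", "Jan": 0.5, "Feb": 3.0},
]
long_data = wide_to_long(wide, "year", "month", "temperature")
print(long_data[0])
print(long_to_wide(long_data, "year", "month", "temperature") == wide)
```

Each wide row with two month columns becomes two long rows, and grouping the long rows by year restores the original table.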

Converting between long and wide data in Perl (long -> wide)

The long-format data is shown in the figure.

The Perl code is as follows:

#!/usr/bin/perl -w
use strict;
use warnings;
use Data::Dumper;

my $usage=<<USAGE;
Usage:
    perl $0 inputfile
USAGE
if(@ARGV==0){die $usage};

my $file=$ARGV[0];
my %timeinfo=();
my $reftime=\ %timeinfo;
my $i=0;

open(RF,$file) || die $!;
open(WF,">kuan_data.txt") || die $!;
while(my $line=<RF>){
    next if ($.==1);
    chomp($line);
    my @arr=split('\t',$line);
    $reftime -> {$arr[0]} -> {$arr[1]} =$arr[2];

}

# Take the column (month) order from the first year; every year is assumed
# to contain the same set of months.
my @years  = sort { $a cmp $b } keys %{$reftime};
my @months = sort { $a <=> $b } keys %{ $reftime->{ $years[0] } };

# Header row: month labels only (no label above the year column).
print WF join("\t", @months), "\n";

# One row per year: the year followed by its values in month order.
for my $year (@years) {
    print WF join("\t", $year, map { $reftime->{$year}{$_} } @months), "\n";
}

close(RF);
close(WF);

This produces the final wide-format data, as shown in the figure.
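
The script above relies on every year having the same set of months. As a hedged sketch of one way to handle gaps, the Python function below pivots the same kind of input and fills missing cells with a placeholder. The name `long_to_wide_tsv`, the `fill` parameter, and the sample values are assumptions for illustration; integer month labels are assumed, matching the numeric sort in the Perl script.

```python
import csv
import io


def long_to_wide_tsv(text, fill="NA"):
    """Pivot 'year<TAB>month<TAB>value' lines (after a header line) into a wide table."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader, None)  # skip the header line, as the Perl script does
    table = {}
    months = set()
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or short lines
        year, month, value = row[:3]
        table.setdefault(year, {})[month] = value
        months.add(month)
    # Assumption: month labels are integers.
    columns = sorted(months, key=int)
    lines = ["\t".join(columns)]  # header holds month labels only, like kuan_data.txt
    for year in sorted(table):
        cells = [table[year].get(m, fill) for m in columns]
        lines.append("\t".join([year] + cells))
    return "\n".join(lines) + "\n"


# Made-up sample: 2002 has no value for month 1.
sample = "year\tmonth\ttemperature\n2001\t1\t3.2\n2001\t2\t4.1\n2002\t2\t5.0\n"
print(long_to_wide_tsv(sample), end="")
```

Collecting the month labels across all years, rather than from the first year only, keeps the columns aligned when a year is incomplete.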


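Converting between long and wide data in Perl (wide -> long)

The input file looks like the one in the figure. To convert it to long format, I wrote the following Perl code: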

#!/usr/bin/perl -w
use strict;
use warnings;
use Data::Dumper;

my $file=$ARGV[0];
my @matrix=();
my @month=();
my @year=();

open(RF,$file) || die $!;
open(WF,">tidy_data.txt") || die $!;

my $i=0;
while (my $line=<RF>){
    next if ($.==1);
    chomp($line);
    my @arr=split('\t',$line);
    shift(@arr);
    for my $j (0..@arr-1){
        $matrix[$i][$j]=$arr[$j];
    }
    $i++;
}
close(RF);

open(RF,$file) || die $!;
open(WFF,">process.txt") || die $!;

while(my $line=<RF>){
    if ($.==1){
        chomp($line);
        @month=split('\t',$line);
        next;
    }
    chomp($line);
    my @arr=split('\t',$line);
    print WFF $arr[0],"\n";
}
close(RF);
close(WFF);

open(RFF,"process.txt") || die $!;
while(my $line=<RFF>){
    chomp($line);
    push @year,$line;
}
close(RFF);

for my $i (0..$#year){
    for my $j (0..$#month){
        print WF $year[$i],"\t",$month[$j],"\t",$matrix[$i][$j],"\n";
    }
}

close(WF);
unlink("process.txt"); # remove the temporary file; unlink also works outside Windows, unlike "del"

This performs the conversion; the result is shown in the figure.
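
The same conversion can also be done in a single pass, without a second read of the input or a temporary file. The sketch below is in Python and assumes the layout the Perl script expects: a header line holding only the month labels, then one line per year with the year in the first field. The name `wide_to_long_tsv` and the sample values are made up for illustration.

```python
def wide_to_long_tsv(text):
    """Convert a wide table to 'year<TAB>month<TAB>value' lines in one pass."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ""
    months = lines[0].split("\t")  # header: month labels only
    out = []
    for line in lines[1:]:
        year, *values = line.split("\t")
        # zip stops at the shorter sequence, so a short row cannot run past the header
        for month, value in zip(months, values):
            out.append(f"{year}\t{month}\t{value}")
    return "".join(f"{row}\n" for row in out)


# Made-up sample in the assumed layout.
sample = "1\t2\n2001\t3.2\t4.1\n2002\t2.8\t5.0\n"
print(wide_to_long_tsv(sample), end="")
```

Like the Perl script, this writes no header line to the long output.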


Installing gdc-client from source on CentOS 6.5

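  source ~/.bashrc_python-2.7.11 # Python 2.7 or later must already be installed
  git clone git@github.com:NCI-GDC/gdc-client.git
  vim requirements.txt
          -e /home/train/gdc-tool/parcel # install parcel from a local copy; otherwise it cannot be downloaded because the external network is unreachable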
  pip install -r requirements.txt
  pip install -r dev-requirements.txt
  python setup.py install
  python -m pytest --cov=gdc_client --cov-branch --cov-report term tests/

  echo 'PATH=$PATH:/home/train/gdc-tool/gdc-client/bin' >> ~/.bashrc
  source ~/.bashrc

  # If you hit the error 'DownloadStream' object has no attribute 'url'
  cd /home/train/gdc-tool/gdc-client/gdc_client/download
  cp client.py client.py.bak
  # In client.py, change stream.url to stream.uri on the line below these two comments:
  # gdc-client calls parcel's parallel_download,
  # which is where most of the downloading takes place
      file_id = stream.uri.split('/')[-1] # l -> i

  cd /home/train/gdc-tool/gdc-client
  python setup.py install
  rm -rf /home/train/gdc-tool/gdc-client/gdc_client/

TCGA: easy-to-use websites

  • UALCAN -> look up genes of interest in TCGA, with survival curves, differential expression, and so on
  • cBioPortal -> look up differentially expressed genes in TCGA
  • Oncomine -> compute gene expression signatures, clusters and gene-set modules, automatically extracting biological insights from the data
  • LinkedOmics -> The web application has three analytical modules: LinkFinder, LinkInterpreter and LinkCompare. LinkFinder allows users to search for attributes that are associated with a query attribute, such as mRNA or protein expression signatures of genomic alterations, candidate biomarkers of clinical attributes, and candidate target genes of transcriptional factors, microRNAs, or protein kinases. Analysis results can be visualized by scatter plots, box plots, or Kaplan-Meier plots. To derive biological insights from the association results, the LinkInterpreter module performs enrichment analysis based on Gene Ontology, biological pathways, network modules, among other functional categories. The LinkCompare module uses visualization functions (interactive Venn diagram, scatter plot, and sortable heat map) and meta-analysis to compare and integrate association results generated by the LinkFinder module, which supports multi-omics analysis in a cancer type or pan-cancer analysis.
  • GeneMANIA -> predict the function of your favorite genes and gene sets
  • GEPIA -> Gene Expression Profiling Interactive Analysis; a second version is now available as well
  • TCGAportal -> a simple analysis site for TCGA data, of the very basic kind
  • dbDEMC -> a site for analyzing microRNAs
  • GSCA -> cancer gene set analysis

Installing the GitHub version of clusterProfiler

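#With R-3.3.1 this failed, because the final install step requires a newer version of R
#Choose the 0-Cloud mirror
#install.packages("installr")
#library(installr)
#Used to copy previously installed R packages into the new R library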
#copy.packages.between.libraries(from="D:/R/R-3.3.1/library",to="D:/R/R-3.5.2/library",ask=T,keep_old=T,do_NOT_override_packages_in_new_R=T)
#install.packages("rlang")
#install.packages("debugme")
#install.packages("git2r")
#install.packages("devtools")
#If you get ERROR: dependency 'pkgload' is not available for package 'devtools', download pkgload manually and install it
#Package download URL: https://cran.r-project.org/bin/windows/contrib/3.6/pkgload_1.0.2.zip
#Package description page: https://cran.r-project.org/web/packages/pkgload/index.html
#install_github("GuangchuangYu/enrichplot") ===> requires R newer than 3.5.1
#install_github("GuangchuangYu/clusterProfiler") ====> requires R newer than 3.4

#The R version must be 3.5.1 or later
install.packages("devtools")
library(devtools)
#Rtools must be installed; URL: https://cran.r-project.org/bin/windows/Rtools/
install_github("GuangchuangYu/enrichplot")
install_github("GuangchuangYu/clusterProfiler")