Merging TCGA Transcript Data

#!/usr/bin/perl -w
use strict;
use warnings;
use Data::Dumper;
use File::Basename;
use JSON;

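my $usage=<<USAGE;
Usage:
    perl dz_merge.pl /PATH/TO/meta.json /path/to/cartfile /path/to/filename
USAGE
# All three arguments are required: the metadata JSON, the directory of
# downloaded expression files, and the output file.
if(@ARGV!=3){die $usage};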

my $file1=$ARGV[0];
my $file2=$ARGV[1];
my $file3=$ARGV[2];
my $js;

# Read the whole metadata JSON file into $js.
open(RF,"<",$file1) || die "cannot open json doc $file1: $!";
while (my $line=<RF>){
    $js .= $line;
}
close(RF);

my $json=decode_json($js);

my %hash;
my %gene_exp;

# Remove a leftover dz_result.txt so the glob below does not pick it up.
if (-e ($file2."/dz_result.txt")){
    unlink ($file2."/dz_result.txt");
}
my @cat_file=glob($file2."/*.txt");
for my $doc (0..$#cat_file){
    my $basename = basename $cat_file[$doc];
    $hash{$basename}++;
}

my @tcgaid;
my @genename;
my $jishu=0;

for my $i (@{$json}){
    my $docname=$i -> {file_name};
    my @arr=split(/\./,$docname);
    my $filename="$arr[0].$arr[1].$arr[2]";
    if (exists $hash{$filename}){
        push(@tcgaid,$i -> {associated_entities} -> [0] -> {entity_submitter_id});
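        # The path built here is <cartfile>/<name>/<name>.
        open(RF,"<",$file2."/".$filename."/".$filename) || die "cannot open $file2/$filename/$filename: $!";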
        while(my $line=<RF>){
            chomp ($line);
            my @genearr=split("\t",$line);
            $gene_exp{$i -> {associated_entities} -> [0] -> {entity_submitter_id}}{$genearr[0]}=$genearr[1];
            if ($jishu == 0){
                push(@genename,$genearr[0]);
            }
        }
        close(RF);
        $jishu=$jishu+1;
    }
}

my $normal=0;
my $tumor=0;

# ">>" appends, so an existing output file is extended rather than overwritten.
open(WF,">>",$file3) || die "cannot open output file $file3: $!";
print WF "id";
for my $i (0..$#tcgaid){
    my @arr=split("-",$tcgaid[$i]);
    if ($arr[3]=~/^1/){#normal tissue
        print WF "\t",$tcgaid[$i];
        $normal++;
    }elsif($arr[3]=~/^0/){#tumor tissue
        print WF "\t",$tcgaid[$i];
        $tumor++;
    }
}

print WF "\n";
print "normal tissue:",$normal,"\n";
print "tumor tissue:",$tumor,"\n";

for my $j (0..$#genename){
    print WF $genename[$j];
    for my $i (0..$#tcgaid){
        my @arr=split("-",$tcgaid[$i]);
        if ($arr[3]=~/^1/){
            print WF "\t",$gene_exp{$tcgaid[$i]}{$genename[$j]};
        }elsif($arr[3]=~/^0/){
            print WF "\t",$gene_exp{$tcgaid[$i]}{$genename[$j]};
        }
    }
    print WF "\n";
}

close(WF);

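Because the samples in the merged table are not yet sorted, the table is awkward to use for differential analysis, so the samples need to be sorted by sample type: sorting script
R merge script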
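As a rough illustration of sorting by sample type, here is a minimal Perl sketch. It is not the sorting script linked above. It assumes the same rule as the merge script (the fourth barcode field starts with 0 for tumor and 1 for normal) and, as an arbitrary choice, puts normal samples before tumor samples. The sub names `sample_group` and `sort_by_sample_type` and the barcodes are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify a TCGA barcode by its fourth field, using the same tests as the
# merge script: a leading 0 means tumor, a leading 1 means normal.
sub sample_group {
    my ($barcode) = @_;
    my @parts = split /-/, $barcode;
    return 'other' unless defined $parts[3];
    return 'tumor'  if $parts[3] =~ /^0/;
    return 'normal' if $parts[3] =~ /^1/;
    return 'other';
}

# Return the barcodes with normal samples first, then tumor samples.
# Barcodes in neither group are dropped, as in the merge script.
# The normal-first order is an assumption made for this sketch.
sub sort_by_sample_type {
    my @ids = @_;
    my @normal = grep { sample_group($_) eq 'normal' } @ids;
    my @tumor  = grep { sample_group($_) eq 'tumor'  } @ids;
    return (@normal, @tumor);
}

# Hypothetical barcodes, for illustration only.
my @ids = qw(
    TCGA-AA-0001-01A-11R-0000-07
    TCGA-AA-0002-11A-11R-0000-07
    TCGA-AA-0003-01A-11R-0000-07
);
print join("\n", sort_by_sample_type(@ids)), "\n";
```

Run on the three example barcodes, this prints the normal sample first, followed by the two tumor samples in their original order. Applying the same ordering to `@tcgaid` before the header and data rows are written would yield a merged table whose columns are already grouped by sample type.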

This entry was posted in the Perl and TCGA categories.

5 responses to "Merging TCGA Transcript Data"

Lukies says:

August 14, 2019 at 4:15 PM

Hi, I used your script to merge TCGA tumor methylation data, but it reported:
Use of uninitialized value $file2 in concatenation (.) or string at C:\Users\DELL\Desktop\livermRNA\methy\02_download\geneMerge1.pl line 30.
Use of uninitialized value $file2 in concatenation (.) or string at C:\Users\DELL\Desktop\livermRNA\methy\02_download\geneMerge1.pl line 33.
Use of uninitialized value $file3 in concatenation (.) or string at C:\Users\DELL\Desktop\livermRNA\methy\02_download\geneMerge1.pl line 67.
wrong output file ! at C:\Users\DELL\Desktop\livermRNA\methy\02_download\geneMerge1.pl line 67.
I can't get the merge to work and it is driving me crazy. What should I do? Could you help me see what the problem is? My email is yuminzhongda@163.com. Thanks.

- daizao says:
  
  August 14, 2019 at 5:06 PM
  
  There is probably a problem with the input files.
- 求教 says:
  
  January 16, 2020 at 9:24 PM
  
  Hello, I have the same problem, and it is urgent.
- 陶德 says:
  
  March 16, 2020 at 4:14 PM
  
  This script can only merge transcriptome data; it cannot merge methylation data.
安璐璐 says:

May 28, 2021 at 3:41 PM

Could you help me see why I am getting this?
Use of uninitialized value $file2 in concatenation (.) or string at mergess.pl line 30.
Use of uninitialized value $file2 in concatenation (.) or string at mergess.pl line 33.
Use of uninitialized value $file3 in concatenation (.) or string at mergess.pl line 67.
wrong output file ! at mergess.pl line 67.
