dxy logo
首页丁香园病例库全部版块
搜索
登录

求助一perl程序

最后编辑于 2022-10-09 · IP 上海上海
929 浏览
这个帖子发布于 19 年零 209 天前,其中的信息可能已发生改变或有所发展。
我现在有20组cDNA序列(每组4~20个)
每组序列想经过如下处理:
1.翻译成protein
2.align protein序列
3.删除gap
4.将protein还原为DNA
处理的目的:用PAML的codeml找正选择位点
发现 http://bioperl.org/HOWTOs/html/PAML.html 上部分模块可是实现这个功能,如下:
use Bio::Tools::Run::Alignment::Clustalw;

# for projecting alignments from protein to R/DNA space
use Bio::Align::Utilities qw(aa_to_dna_aln);

# for input of the sequence data
use Bio::SeqIO;
use Bio::AlignIO;

my $aln_factory = new Bio::Tools::Run::Alignment::Clustalw();

my $seqdata = 'cdna.fa';

my $seqio = new Bio::SeqIO(-file => $seqdata,
-format => 'fasta');
my %seqs;
my @prots;
# process each sequence
while ( my $seq = $seqio->next_seq ) {
$seqs{$seq->display_id} = $seq;
# translate them into protein
my $protein = $seq->translate();
my $pseq = $protein->seq();
if( $pseq =~ /\*/ &&
  $pseq !~ /\*$/ ) {
  warn("provided a cDNA sequence with a stop codon, PAML will choke!");
  exit(0);
}
# Tcoffee can't handle '*' even if it is trailing
$pseq =~ s/\*//g;
$protein->seq($pseq);
push @prots, $protein;
}

if( @prots < 2 ) {
warn("Need at least 2 cDNA sequences to proceed");
exit(0);
}

open(OUT, ">align_output.txt") ||
die("cannot open output $output for writing");
# Align the sequences with clustalw
my $aa_aln = $aln_factory->align(\@prots);
# project the protein alignment back to cDNA coordinates
my $dna_aln = &aa_to_dna_aln($aa_aln, \%seqs);
my @each = $dna_aln->each_seq();
......
运行到my $dna_aln = &aa_to_dna_aln($aa_aln, \%seqs);这一行老是报错(见图),请问这个问题怎么解决,而且怎么将删除gap后的DNA序列读到输出中?
img
























































2 1 点赞

全部讨论(0)

默认最新
avatar
2
分享帖子
share-weibo分享到微博
share-weibo分享到微信
认证
返回顶部