1#
2# BioPerl module for Bio::Search::Tiling::MapTiling
3#
4# Please direct questions and support issues to <bioperl-l@bioperl.org>
5#
6# Cared for by Mark A. Jensen <maj@fortinbras.us>
7#
8# Copyright Mark A. Jensen
9#
10# You may distribute this module under the same terms as perl itself
11
12# POD documentation - main docs before the code
13
14=head1 NAME
15
16Bio::Search::Tiling::MapTiling - An implementation of an HSP tiling
17algorithm, with methods to obtain frequently-requested statistics
18
19=head1 SYNOPSIS
20
21 # get a BLAST $hit from somewhere, then
22 $tiling = Bio::Search::Tiling::MapTiling->new($hit);
23
24 # stats
25 $numID = $tiling->identities();
26 $numCons = $tiling->conserved();
27 $query_length = $tiling->length('query');
28 $subject_length = $tiling->length('subject'); # or...
29 $subject_length = $tiling->length('hit');
30
31 # get a visual on the coverage map
32 print $tiling->coverage_map_as_text('query',$context,'LEGEND');
33
34 # tilings
35 $context = $tiling->_context( -type => 'subject', -strand=> 1, -frame=>1);
36 @covering_hsps_for_subject = $tiling->next_tiling('subject',$context);
37 $context = $tiling->_context( -type => 'query', -strand=> -1, -frame=>0);
38 @covering_hsps_for_query   = $tiling->next_tiling('query', $context);
39
40=head1 DESCRIPTION
41
42Frequently, users want to use a set of high-scoring pairs (HSPs)
43obtained from a BLAST or other search to assess the overall level of
44identity, conservation, or coverage represented by matches between a
45subject and a query sequence. Because a set of HSPs frequently
46describes multiple overlapping sequence fragments, a simple summation of
47statistics over the HSPs will generally overestimate those
48statistics. To obtain an accurate estimate of global hit statistics, a
49'tiling' of HSPs onto either the subject or the query sequence must be
50performed, in order to properly correct for this.
51
52This module will execute a tiling algorithm on a given hit based on an
53interval decomposition I'm calling the "coverage map". Internal object
54methods compute the various statistics, which are then stored in
55appropriately-named public object attributes. See
56L<Bio::Search::Tiling::MapTileUtils> for more info on the algorithm.
57
58=head2 STRAND/FRAME CONTEXTS
59
60In BLASTX, TBLASTN, and TBLASTX reports, strand and frame information
61are reported for the query, subject, or query and subject,
62respectively, for each HSP. Tilings for these sequence types are only
63meaningful when they include HSPs in the same strand and frame, or
64"context". So, in these situations, the context must be specified
65in the method calls or the methods will throw.
66
67Contexts are specified as strings: C<[ 'all' | [m|p][_|0|1|2] ]>, where
68C<all> = all HSPs (will throw if context must be specified), C<m> = minus
69strand, C<p> = plus strand, and C<_> = no frame info, C<0,1,2> = respective
70(absolute) frame. The L</_make_context_key> method will convert a (strand,
71frame) specification to a context string, e.g.:
72
73    $context = $self->_context(-type=>'query', -strand=>-1, -frame=>-2);
74
75returns C<m2>.
76
77The contexts present among the HSPs in a hit are identified and stored
78for convenience upon object construction. These are accessed off the
79object with the L</contexts> method. If contexts don't apply for the
80given report, this returns C<('all')>.
81
82=head1 TILED ALIGNMENTS
83
84The experimental method L<ALIGNMENTS/get_tiled_alns> will use a tiling
85to concatenate tiled hsps into a series of L<Bio::SimpleAlign>
86objects:
87
88 @alns = $tiling->get_tiled_alns($type, $context);
89
90Each alignment contains two sequences with ids 'query' and 'subject',
91and consists of a concatenation of tiling HSPs which overlap or are
92directly adjacent. The alignment are returned in C<$type> sequence
93order. When HSPs overlap, the alignment sequence is taken from the HSP
94which comes first in the coverage map array.
95
96The sequences in each alignment contain features (even though they are
97L<Bio::LocatableSeq> objects) which map the original query/subject
98coordinates to the new alignment sequence coordinates. You can
99determine the original BLAST fragments this way:
100
101 $aln = ($tiling->get_tiled_alns)[0];
102 $qseq = $aln->get_seq_by_id('query');
103 $hseq = $aln->get_seq_by_id('subject');
104 foreach my $feat ($qseq->get_SeqFeatures) {
105    $org_start = ($feat->get_tag_values('query_start'))[0];
106    $org_end = ($feat->get_tag_values('query_end'))[0];
107    # original fragment as represented in the tiled alignment:
108    $org_fragment = $feat->seq;
109 }
110 foreach my $feat ($hseq->get_SeqFeatures) {
111    $org_start = ($feat->get_tag_values('subject_start'))[0];
112    $org_end = ($feat->get_tag_values('subject_end'))[0];
113    # original fragment as represented in the tiled alignment:
114    $org_fragment = $feat->seq;
115 }
116
117=head1 DESIGN NOTE
118
119The major calculations are made just-in-time, and then memoized. So,
120for example, for a given MapTiling object, a coverage map would
121usually be calculated only once (for the query), and at most twice (if
122the subject perspective is also desired), and then only when a
123statistic is first accessed. Afterward, the map and/or any statistic
124is read from storage. So feel free to call the statistic methods
125frequently if it suits you.
126
127=head1 FEEDBACK
128
129=head2 Mailing Lists
130
131User feedback is an integral part of the evolution of this and other
132Bioperl modules. Send your comments and suggestions preferably to
133the Bioperl mailing list.  Your participation is much appreciated.
134
135  bioperl-l@bioperl.org                  - General discussion
136  http://bioperl.org/wiki/Mailing_lists  - About the mailing lists
137
138=head2 Support
139
140Please direct usage questions or support issues to the mailing list:
141
142I<bioperl-l@bioperl.org>
143
144rather than to the module maintainer directly. Many experienced and
145reponsive experts will be able look at the problem and quickly
146address it. Please include a thorough description of the problem
147with code and data examples if at all possible.
148
149=head2 Reporting Bugs
150
151Report bugs to the Bioperl bug tracking system to help us keep track
152of the bugs and their resolution. Bug reports can be submitted via
153the web:
154
155  https://github.com/bioperl/bioperl-live/issues
156
157=head1 AUTHOR - Mark A. Jensen
158
159Email maj -at- fortinbras -dot- us
160
161=head1 APPENDIX
162
163The rest of the documentation details each of the object methods.
164Internal methods are usually preceded with a _
165
166=cut
167
168# Let the code begin...
169
170package Bio::Search::Tiling::MapTiling;
171$Bio::Search::Tiling::MapTiling::VERSION = '1.7.7';
172use strict;
173use warnings;
174
175use Bio::Root::Root;
176use Bio::Search::Tiling::TilingI;
177use Bio::Search::Tiling::MapTileUtils;
178use Bio::LocatableSeq;
179
180# use base qw(Bio::Root::Root Bio::Search::Tiling::TilingI);
181use base qw(Bio::Root::Root Bio::Search::Tiling::TilingI);
182
183=head1 CONSTRUCTOR
184
185=head2 new
186
187 Title   : new
188 Usage   : my $obj = new Bio::Search::Tiling::GenericTiling();
189 Function: Builds a new Bio::Search::Tiling::GenericTiling object
190 Returns : an instance of Bio::Search::Tiling::GenericTiling
191 Args    : -hit    => $a_Bio_Search_Hit_HitI_object
192           general filter function:
193           -hsp_filter => sub { my $this_hsp = shift;
194                                ...;
195                                return 1 if $wanted;
196                                return 0; }
197
198=cut
199
200sub new {
201    my $class = shift;
202    my @args = @_;
203    my $self = $class->SUPER::new(@args);
204    my($hit, $filter) = $self->_rearrange( [qw( HIT HSP_FILTER)],@args );
205
206    $self->throw("HitI object required") unless $hit;
207    $self->throw("Argument must be HitI object") unless ( ref $hit && $hit->isa('Bio::Search::Hit::HitI') );
208    $self->{hit} = $hit;
209    $self->_set_attributes();
210    $self->{"_algorithm"} = $hit->algorithm;
211
212    my @hsps = $hit->hsps;
213    # apply filter function if requested
214    if ( defined $filter ) {
215	if ( ref($filter) eq 'CODE' ) {
216	    @hsps = map { $filter->($_) ? $_ : () } @hsps;
217	}
218	else {
219	    $self->warn("-filter is not a coderef; ignoring");
220	}
221    }
222
223    # identify available contexts
224    for my $t (qw( query hit )) {
225	my %contexts;
226	for my $i (0..$#hsps) {
227	    my $ctxt = $self->_context(
228		-type => $t,
229		-strand => $hsps[$i]->strand($t),
230		-frame  => $hsps[$i]->frame($t));
231	    $contexts{$ctxt} ||= [];
232	    push @{$contexts{$ctxt}}, $i;
233	}
234	$self->{"_contexts_${t}"} = \%contexts;
235    }
236
237    $self->warn("No HSPs present in hit after filtering") unless (@hsps);
238    $self->hsps(\@hsps);
239    return $self;
240}
241
242# a tiling is based on the set of hsps contained in a single hit.
243# check all the boundaries - zero hsps, one hsp, all disjoint hsps
244
245=head1 TILING ITERATORS
246
247=head2 next_tiling
248
249 Title   : next_tiling
250 Usage   : @hsps = $self->next_tiling($type);
251 Function: Obtain a tiling: a minimal set of HSPs covering the $type
252           ('hit', 'subject', 'query') sequence
253 Example :
254 Returns : an array of HSPI objects
255 Args    : scalar $type: one of 'hit', 'subject', 'query', with
256           'subject' an alias for 'hit'
257
258=cut
259
260sub next_tiling{
261    my $self = shift;
262    my ($type, $context) = @_;
263    $self->_check_type_arg(\$type);
264    $self->_check_context_arg($type, \$context);
265    return $self->_tiling_iterator($type, $context)->();
266}
267
268=head2 rewind_tilings
269
270 Title   : rewind_tilings
271 Usage   : $self->rewind_tilings($type)
272 Function: Reset the next_tilings($type) iterator
273 Example :
274 Returns : True on success
275 Args    : scalar $type: one of 'hit', 'subject', 'query';
276           default is 'query'
277
278=cut
279
280sub rewind_tilings{
281    my $self = shift;
282    my ($type,$context) = @_;
283    $self->_check_type_arg(\$type);
284    $self->_check_context_arg($type, \$context);
285    return $self->_tiling_iterator($type, $context)->('REWIND');
286}
287
288=head1 ALIGNMENTS
289
290=head2 get_tiled_alns()
291
292 Title   : get_tiled_alns
293 Usage   : @alns = $tiling->get_tiled_alns($type, $context)
294 Function: Use a tiling to construct a minimal set of alignment
295           objects covering the region specified by $type/$context
296           by splicing adjacent HSP tiles
297 Returns : an array of Bio::SimpleAlign objects; see Note below
298 Args    : scalar $type: one of 'hit', 'subject', 'query'
299           default is 'query'
300           scalar $context: strand/frame context string
301           Following $type and $context, an array of
302           ordered, tiled HSP objects can be specified; this is
303           the tiling that will directly the alignment construction
304           default -- the first tiling provided by a tiling iterator
305 Notes   : Each returned alignment is a concatenation of adjacent tiles.
306           The set of alignments will cover all regions described by the
307           $type/$context pair in the hit. The pair of sequences in each
308           alignment have ids 'query' and 'subject', and each sequence
309           possesses SeqFeatures that map the original query or subject
310           coordinates to the sequence coordinates in the tiled alignment.
311
312=cut
313
314sub get_tiled_alns {
315    my $self = shift;
316    my ($type, $context) = @_;
317    $self->_check_type_arg(\$type);
318    $self->_check_context_arg($type, \$context);
319    my $t = shift; # first arg after type/context, arrayref to a tiling
320    my @tiling;
321    if ($t && (ref($t) eq 'ARRAY')) {
322	@tiling = @$t;
323    }
324    else { # otherwise, take the first tiling available
325
326	@tiling = $self->_make_tiling_iterator($type,$context)->();
327    }
328    my @ret;
329
330    my @map = $self->coverage_map($type, $context);
331    my @intervals = map {$_->[0]} @map; # disjoint decomp
332    # divide into disjoint covering groups
333    my @groups = covering_groups(@intervals);
334
335    require Bio::SimpleAlign;
336    require Bio::SeqFeature::Generic;
337    # cut hsp aligns along the disjoint decomp
338    # look for gaps...or add gaps?
339    my ($q_start, $h_start, $q_strand, $h_strand);
340    # build seqs
341    for my $grp (@groups) {
342	my $taln = Bio::SimpleAlign->new();
343	my (@qfeats, @hfeats);
344	my $query_string = '';
345	my $hit_string = '';
346	my ($qlen,$hlen) = (0,0);
347	my ($qinc, $hinc, $qstart, $hstart);
348	for my $intvl (@$grp) {
349	    # following just chooses the first available hsp containing the
350	    # interval -- this is arbitrary, could be smarter.
351	    my $aln = ( containing_hsps($intvl, @tiling) )[0]->get_aln;
352	    my $qseq = $aln->get_seq_by_pos(1);
353	    my $hseq = $aln->get_seq_by_pos(2);
354	    $qstart ||= $qseq->start;
355	    $hstart ||= $hseq->start;
356	    # calculate the slice boundaries
357	    my ($beg, $end);
358	    for ($type) {
359		/query/ && do {
360		    $beg = $aln->column_from_residue_number($qseq->id, $intvl->[0]);
361		    $end = $aln->column_from_residue_number($qseq->id, $intvl->[1]);
362		    last;
363		};
364		/subject|hit/ && do {
365		    $beg = $aln->column_from_residue_number($hseq->id, $intvl->[0]);
366		    $end = $aln->column_from_residue_number($hseq->id, $intvl->[1]);
367		    last;
368		};
369	    }
370	    $aln = $aln->slice($beg, $end);
371	    $qseq = $aln->get_seq_by_pos(1);
372	    $hseq = $aln->get_seq_by_pos(2);
373	    $qinc = $qseq->length - $qseq->num_gaps($Bio::LocatableSeq::GAP_SYMBOLS);
374	    $hinc = $hseq->length - $hseq->num_gaps($Bio::LocatableSeq::GAP_SYMBOLS);
375
376	    push @qfeats, Bio::SeqFeature::Generic->new(
377		-start => $qlen+1,
378		-end => $qlen+$qinc,
379		-strand => $qseq->strand,
380		-primary => 'query',
381		-source_tag => 'BLAST',
382		-display_name => 'query coordinates',
383		-tag => { query_id => $qseq->id,
384			  query_desc => $qseq->desc,
385			  query_start => $qstart + (($qseq->strand && $qseq->strand < 0) ? -1 : 1)*$qlen,
386			  query_end => $qstart + (($qseq->strand && $qseq->strand < 0) ? -1 : 1)*($qlen+$qinc-1),
387		}
388		);
389	    push @hfeats, Bio::SeqFeature::Generic->new(
390		-start => $hlen+1,
391		-end => $hlen+$hinc,
392		-strand => $hseq->strand,
393		-primary => 'subject/hit',
394		-source_tag => 'BLAST',
395		-display_name => 'subject/hit coordinates',
396		-tag => { subject_id => $hseq->id,
397			  subject_desc => $hseq->desc,
398			  subject_start => $hstart + (($hseq->strand && $hseq->strand < 0) ? -1 : 1)*$hlen,
399			  subject_end => $hstart + (($hseq->strand && $hseq->strand < 0) ? -1 : 1)*($hlen+$hinc-1)
400		}
401		);
402	    $query_string .= $qseq->seq;
403	    $hit_string .= $hseq->seq;
404	    $qlen += $qinc;
405	    $hlen += $hinc;
406	}
407	# create the LocatableSeqs and add the features to each
408	# then add the seqs to the new aln
409	# note in MapTileUtils, Bio::FeatureHolderI methods have been
410	# mixed into Bio::LocatableSeq
411	my $qseq = Bio::LocatableSeq->new( -id => 'query',
412					   -seq => $query_string);
413	$qseq->add_SeqFeature(@qfeats);
414	my $hseq = Bio::LocatableSeq->new( -id => 'subject',
415					   -seq => $hit_string );
416	$hseq->add_SeqFeature(@hfeats);
417	$taln->add_seq($qseq);
418	$taln->add_seq($hseq);
419	push @ret, $taln;
420    }
421
422    return @ret;
423}
424
425=head1 STATISTICS
426
427=head2 identities
428
429 Title   : identities
430 Usage   : $tiling->identities($type, $action, $context)
431 Function: Retrieve the calculated number of identities for the invocant
432 Example :
433 Returns : value of identities (a scalar)
434 Args    : scalar $type: one of 'hit', 'subject', 'query'
435           default is 'query'
436           option scalar $action: one of 'exact', 'est', 'fast', 'max'
437           default is 'exact'
438           option scalar $context: strand/frame context string
439 Note    : getter only
440
441=cut
442
443sub identities{
444    my $self = shift;
445    my ($type, $action, $context) = @_;
446    $self->_check_type_arg(\$type);
447    $self->_check_action_arg(\$action);
448    $self->_check_context_arg($type, \$context);
449    if (!defined $self->{"identities_${type}_${action}_${context}"}) {
450	$self->_calc_stats($type, $action, $context);
451    }
452    return $self->{"identities_${type}_${action}_${context}"};
453}
454
455=head2 conserved
456
457 Title   : conserved
458 Usage   : $tiling->conserved($type, $action)
459 Function: Retrieve the calculated number of conserved sites for the invocant
460 Example :
461 Returns : value of conserved (a scalar)
462 Args    : scalar $type: one of 'hit', 'subject', 'query'
463           default is 'query'
464           option scalar $action: one of 'exact', 'est', 'fast', 'max'
465           default is 'exact'
466           option scalar $context: strand/frame context string
467 Note    : getter only
468
469=cut
470
471sub conserved{
472    my $self = shift;
473    my ($type, $action, $context) = @_;
474    $self->_check_type_arg(\$type);
475    $self->_check_action_arg(\$action);
476    $self->_check_context_arg($type, \$context);
477    if (!defined $self->{"conserved_${type}_${action}_${context}"}) {
478	$self->_calc_stats($type, $action, $context);
479    }
480    return $self->{"conserved_${type}_${action}_${context}"};
481}
482
483=head2 length
484
485 Title   : length
486 Usage   : $tiling->length($type, $action, $context)
487 Function: Retrieve the total length of aligned residues for
488           the seq $type
489 Example :
490 Returns : value of length (a scalar)
491 Args    : scalar $type: one of 'hit', 'subject', 'query'
492           default is 'query'
493           option scalar $action: one of 'exact', 'est', 'fast', 'max'
494           default is 'exact'
495           option scalar $context: strand/frame context string
496 Note    : getter only
497
498=cut
499
500sub length{
501    my $self = shift;
502    my ($type,$action,$context) = @_;
503    $self->_check_type_arg(\$type);
504    $self->_check_action_arg(\$action);
505    $self->_check_context_arg($type, \$context);
506    if (!defined $self->{"length_${type}_${action}_${context}"}) {
507	$self->_calc_stats($type, $action, $context);
508    }
509    return $self->{"length_${type}_${action}_${context}"};
510}
511
512=head2 frac
513
514 Title   : frac
515 Usage   : $tiling->frac($type, $denom, $action, $context, $method)
516 Function: Return the fraction of sequence length consisting
517           of desired kinds of pairs (given by $method),
518           with respect to $denom
519 Returns : scalar float
520 Args    : -type => one of 'hit', 'subject', 'query'
521           -denom => one of 'total', 'aligned'
522           -action => one of 'exact', 'est', 'fast', 'max'
523           -context => strand/frame context string
524           -method => one of 'identical', 'conserved'
525 Note    : $denom == 'aligned', return desired_stat/num_aligned
526           $denom == 'total', return desired_stat/_reported_length
527             (i.e., length of the original input sequences)
528 Note    : In keeping with the spirit of Bio::Search::HSP::HSPI,
529           reported lengths of translated dna are reduced by
530           a factor of 3, to provide fractions relative to
531           amino acid coordinates.
532
533=cut
534
535sub frac {
536    my $self = shift;
537    my @args = @_;
538    my ($type, $denom, $action, $context, $method) = $self->_rearrange([qw(TYPE DENOM ACTION CONTEXT METHOD)],@args);
539    $self->_check_type_arg(\$type);
540    $self->_check_action_arg(\$action);
541    $self->_check_context_arg($type, \$context);
542    unless ($method and grep(/^$method$/, qw( identical conserved ))) {
543	$self->throw("-method must specified; one of ('identical', 'conserved')");
544    }
545    $denom ||= 'total';
546    unless (grep /^$denom/, qw( total aligned )) {
547	$self->throw("Denominator selection must be one of ('total', 'aligned'), not '$denom'");
548    }
549    my $key = "frac_${method}_${type}_${denom}_${action}_${context}";
550    my $stat;
551    for ($method) {
552	$_ eq 'identical' && do {
553	    $stat = $self->identities($type, $action, $context);
554	    last;
555	};
556	$_ eq 'conserved' && do {
557	    $stat = $self->conserved($type, $action, $context);
558	    last;
559	};
560	do {
561	    $self->throw("What are YOU doing here?");
562	};
563    }
564    if (!defined $self->{$key}) {
565	for ($denom) {
566	    /total/ && do {
567		$self->{$key} =
568		    $stat/$self->_reported_length($type); # need fudge fac??
569		last;
570	    };
571	    /aligned/ && do {
572		$self->{$key} =
573		    $stat/$self->length($type,$action,$context);
574		last;
575	    };
576	    do {
577		$self->throw("What are YOU doing here?");
578	    };
579	}
580    }
581    return $self->{$key};
582}
583
584=head2 frac_identical
585
586 Title   : frac_identical
587 Usage   : $tiling->frac_identical($type, $denom, $action, $context)
588 Function: Return the fraction of sequence length consisting
589           of identical pairs, with respect to $denom
590 Returns : scalar float
591 Args    : -type => one of 'hit', 'subject', 'query'
592           -denom => one of 'total', 'aligned'
593           -action => one of 'exact', 'est', 'fast', 'max'
594           -context => strand/frame context string
595 Note    : $denom == 'aligned', return conserved/num_aligned
596           $denom == 'total', return conserved/_reported_length
597             (i.e., length of the original input sequences)
598 Note    : In keeping with the spirit of Bio::Search::HSP::HSPI,
599           reported lengths of translated dna are reduced by
600           a factor of 3, to provide fractions relative to
601           amino acid coordinates.
602 Note    : This an alias that calls frac()
603
604=cut
605
606sub frac_identical{
607    my $self = shift;
608    my @args = @_;
609    my ($type, $denom, $action,$context) = $self->_rearrange( [qw[ TYPE DENOM ACTION CONTEXT]],@args );
610    $self->frac( -type=>$type, -denom=>$denom, -action=>$action, -method=>'identical', -context=>$context);
611}
612
613=head2 frac_conserved
614
615 Title   : frac_conserved
616 Usage   : $tiling->frac_conserved($type, $denom, $action, $context)
617 Function: Return the fraction of sequence length consisting
618           of conserved pairs, with respect to $denom
619 Returns : scalar float
620 Args    : -type => one of 'hit', 'subject', 'query'
621           -denom => one of 'total', 'aligned'
622           -action => one of 'exact', 'est', 'fast', 'max'
623           -context => strand/frame context string
624 Note    : $denom == 'aligned', return conserved/num_aligned
625           $denom == 'total', return conserved/_reported_length
626             (i.e., length of the original input sequences)
627 Note    : In keeping with the spirit of Bio::Search::HSP::HSPI,
628           reported lengths of translated dna are reduced by
629           a factor of 3, to provide fractions relative to
630           amino acid coordinates.
631 Note    : This an alias that calls frac()
632
633=cut
634
635sub frac_conserved{
636    my $self = shift;
637    my @args = @_;
638    my ($type, $denom, $action, $context) = $self->_rearrange( [qw[ TYPE DENOM ACTION CONTEXT]],@args );
639    $self->frac( -type=>$type, -denom=>$denom, -action=>$action, -context=>$context, -method=>'conserved');
640}
641
642=head2 frac_aligned
643
644 Title   : frac_aligned
645 Aliases : frac_aligned_query - frac_aligned(-type=>'query',...)
646           frac_aligned_hit   - frac_aligned(-type=>'hit',...)
647 Usage   : $tiling->frac_aligned(-type=>$type,
648                                 -action=>$action,
649                                 -context=>$context)
650 Function: Return the fraction of input sequence length
651           that was aligned by the algorithm
652 Returns : scalar float
653 Args    : -type => one of 'hit', 'subject', 'query'
654           -action => one of 'exact', 'est', 'fast', 'max'
655           -context => strand/frame context string
656
657=cut
658
659sub frac_aligned{
660    my ($self, @args) = @_;
661    my ($type, $action, $context) = $self->_rearrange([qw(TYPE ACTION CONTEXT)],@args);
662    $self->_check_type_arg(\$type);
663    $self->_check_action_arg(\$action);
664    $self->_check_context_arg($type, \$context);
665    if (!$self->{"frac_aligned_${type}_${action}_${context}"}) {
666	$self->{"frac_aligned_${type}_${action}_${context}"} = $self->num_aligned($type,$action,$context)/$self->_reported_length($type);
667    }
668    return $self->{"frac_aligned_${type}_${action}_${context}"};
669}
670
671sub frac_aligned_query { shift->frac_aligned(-type=>'query', @_) }
672sub frac_aligned_hit { shift->frac_aligned(-type=>'hit', @_) }
673
674
675=head2 num_aligned
676
677 Title   : num_aligned
678 Usage   : $tiling->num_aligned(-type=>$type)
679 Function: Return the number of residues of sequence $type
680           that were aligned by the algorithm
681 Returns : scalar int
682 Args    : -type => one of 'hit', 'subject', 'query'
683           -action => one of 'exact', 'est', 'fast', 'max'
684           -context => strand/frame context string
685 Note    : Since this is calculated from reported coordinates,
686           not symbol string counts, it is already in terms of
687           "logical length"
688 Note    : Aliases length()
689
690=cut
691
692sub num_aligned { shift->length( @_ ) };
693
694=head2 num_unaligned
695
696 Title   : num_unaligned
697 Usage   : $tiling->num_unaligned(-type=>$type)
698 Function: Return the number of residues of sequence $type
699           that were left unaligned by the algorithm
700 Returns : scalar int
701 Args    : -type => one of 'hit', 'subject', 'query'
702           -action => one of 'exact', 'est', 'fast', 'max'
703           -context => strand/frame context string
704 Note    : Since this is calculated from reported coordinates,
705           not symbol string counts, it is already in terms of
706           "logical length"
707
708=cut
709
710sub num_unaligned {
711    my $self = shift;
712    my ($type,$action,$context) = @_;
713    my $ret;
714    $self->_check_type_arg(\$type);
715    $self->_check_action_arg(\$action);
716    $self->_check_context_arg($type, \$context);
717    if (!defined $self->{"num_unaligned_${type}_${action}_${context}"}) {
718	$self->{"num_unaligned_${type}_${action}_${context}"} = $self->_reported_length($type)-$self->num_aligned($type,$action,$context);
719    }
720    return $self->{"num_unaligned_${type}_${action}_${context}"};
721}
722
723
724=head2 range
725
726 Title   : range
727 Usage   : $tiling->range(-type=>$type)
728 Function: Returns the extent of the longest tiling
729           as ($min_coord, $max_coord)
730 Returns : array of two scalar integers
731 Args    : -type => one of 'hit', 'subject', 'query'
732           -context => strand/frame context string
733
734=cut
735
736sub range {
737    my ($self, $type, $context) = @_;
738    $self->_check_type_arg(\$type);
739    $self->_check_context_arg($type, \$context);
740    my @a = $self->_contig_intersection($type,$context);
741    return ($a[0][0], $a[-1][1]);
742}
743
744
745
746=head1 ACCESSORS
747
748=head2 coverage_map
749
750 Title   : coverage_map
751 Usage   : $map = $tiling->coverage_map($type)
752 Function: Property to contain the coverage map calculated
753           by _calc_coverage_map() - see that for
754           details
755 Example :
756 Returns : value of coverage_map_$type as an array
757 Args    : scalar $type: one of 'hit', 'subject', 'query'
758           default is 'query'
759 Note    : getter
760
761=cut
762
763sub coverage_map{
764    my $self = shift;
765    my ($type, $context) = @_;
766    $self->_check_type_arg(\$type);
767    $self->_check_context_arg($type, \$context);
768
769    if (!defined $self->{"coverage_map_${type}_${context}"}) {
770	# following calculates coverage maps in all strands/frames
771	# if necessary
772	$self->_calc_coverage_map($type, $context);
773    }
774    # if undef is returned, then there were no hsps for given strand/frame
775    if (!defined $self->{"coverage_map_${type}_${context}"}) {
776	$self->warn("No HSPS present for type '$type' in context '$context' for this hit");
777	return undef;
778    }
779    return @{$self->{"coverage_map_${type}_${context}"}};
780}
781
782=head2 coverage_map_as_text
783
784 Title   : coverage_map_as_text
785 Usage   : $tiling->coverage_map_as_text($type, $legend_flag)
786 Function: Format a text-graphic representation of the
787           coverage map
788 Returns : an array of scalar strings, suitable for printing
789 Args    : $type: one of 'query', 'hit', 'subject'
790           $context: strand/frame context string
791           $legend_flag: boolean; add a legend indicating
792            the actual interval coordinates for each component
793            interval and hsp (in the $type sequence context)
794 Example : print $tiling->coverage_map_as_text('query',1);
795
796=cut
797
798sub coverage_map_as_text{
799    my $self = shift;
800    my ($type, $context, $legend_q) = @_;
801    $self->_check_type_arg(\$type);
802    $self->_check_context_arg($type, \$context);
803
804    my @map = $self->coverage_map($type, $context);
805    my @ret;
806    my @hsps = $self->hit->hsps;
807    my %hsps_i;
808    require Tie::RefHash;
809    tie %hsps_i, 'Tie::RefHash';
810    @hsps_i{@hsps} = (0..$#hsps);
811    my @mx;
812    foreach (0..$#map) {
813	my @hspx = ('') x @hsps;
814	my @these_hsps = @{$map[$_]->[1]};
815	@hspx[@hsps_i{@these_hsps}] = ('*') x @these_hsps;
816	$mx[$_] = \@hspx;
817    }
818    untie %hsps_i;
819
820    push @ret, "\tIntvl\n";
821    push @ret, "HSPS\t", join ("\t", (0..$#map)), "\n";
822    foreach my $h (0..$#hsps) {
823	push @ret, join("\t", $h, map { $mx[$_][$h] } (0..$#map)  ),"\n";
824    }
825    if ($legend_q) {
826	push @ret, "Interval legend\n";
827	foreach (0..$#map) {
828	    push @ret, sprintf("%d\t[%d, %d]\n", $_, @{$map[$_][0]});
829	}
830	push @ret, "HSP legend\n";
831	my @ints = get_intervals_from_hsps($type,@hsps);
832	foreach (0..$#hsps) {
833	    push @ret, sprintf("%d\t[%d, %d]\n", $_, @{$ints[$_]});
834	}
835    }
836    return @ret;
837}
838
839=head2 hit
840
841 Title   : hit
842 Usage   : $tiling->hit
843 Function:
844 Example :
845 Returns : The HitI object associated with the invocant
846 Args    : none
847 Note    : getter only
848
849=cut
850
851sub hit{
852    my $self = shift;
853    $self->warn("Getter only") if @_;
854    return $self->{'hit'};
855}
856
857=head2 hsps
858
859 Title   : hsps
860 Usage   : $tiling->hsps()
861 Function: Container for the HSP objects associated with invocant
862 Example :
863 Returns : an array of hsps associated with the hit
864 Args    : on set, new value (an arrayref or undef, optional)
865
866=cut
867
868sub hsps{
869    my $self = shift;
870    return $self->{'hsps'} = shift if @_;
871    return @{$self->{'hsps'}};
872}
873
874=head2 contexts
875
876 Title   : contexts
877 Usage   : @contexts = $tiling->context($type) or
878           @indices = $tiling->context($type, $context)
879 Function: Retrieve the set of available contexts in the hit,
880           or the indices of hsps having the given context
881           (integer indices for the array returned by $self->hsps)
882 Returns : array of scalar context strings or
883           array of scalar positive integers
884           undef if no hsps in given context
885 Args    : $type: one of 'query', 'hit', 'subject'
886           optional $context: context string
887
888=cut
889
890sub contexts{
891    my $self = shift;
892    my ($type, $context) = @_;
893    $self->_check_type_arg(\$type);
894    return keys %{$self->{"_contexts_$type"}} unless defined $context;
895    return undef unless $self->{"_contexts_$type"}{$context};
896    return @{$self->{"_contexts_$type"}{$context}};
897}
898
899=head2 mapping
900
901 Title   : mapping
902 Usage   : $tiling->mapping($type)
903 Function: Retrieve the mapping coefficient for the sequence type
904           based on the underlying algorithm
905 Returns : scalar integer (mapping coefficient)
906 Args    : $type: one of 'query', 'hit', 'subject'
907 Note    : getter only (set in constructor)
908
909=cut
910
911sub mapping{
912    my $self = shift;
913    my $type = shift;
914    $self->_check_type_arg(\$type);
915    return $self->{"_mapping_${type}"};
916}
917
918=head2 default_context
919
920 Title   : default_context
921 Usage   : $tiling->default_context($type)
922 Function: Retrieve the default strand/frame context string
923           for the sequence type based on the underlying algorithm
924 Returns : scalar string (context string)
925 Args    : $type: one of 'query', 'hit', 'subject'
926 Note    : getter only (set in constructor)
927
928=cut
929
930sub default_context{
931    my $self = shift;
932    my $type = shift;
933    $self->_check_type_arg(\$type);
934    return $self->{"_def_context_${type}"};
935}
936
937=head2 algorithm
938
939 Title   : algorithm
940 Usage   : $tiling->algorithm
941 Function: Retrieve the algorithm name associated with the
942           invocant's hit object
943 Returns : scalar string
944 Args    : none
945 Note    : getter only (set in constructor)
946
947=cut
948
949sub algorithm{
950    my $self = shift;
951    $self->warn("Getter only") if @_;
952    return $self->{"_algorithm"};
953}
954
955=head1 "PRIVATE" METHODS
956
957=head2 Calculators
958
959See L<Bio::Search::Tiling::MapTileUtils> for lower level
960calculation methods.
961
962=head2 _calc_coverage_map
963
964 Title   : _calc_coverage_map
965 Usage   : $tiling->_calc_coverage_map($type)
966 Function: Calculates the coverage map for the object's associated
967           hit from the perspective of the desired $type (see Args:)
968           and sets the coverage_map() property
969 Returns : True on success
970 Args    : optional scalar $type: one of 'hit'|'subject'|'query'
971           default is 'query'
972 Note    : The "coverage map" is an array with the following format:
973           ( [ $component_interval => [ @containing_hsps ] ], ... ),
974           where $component_interval is a closed interval (see
975           DESCRIPTION) of the form [$a0, $a1] with $a0 <= $a1, and
976           @containing_hsps is an array of all HspI objects in the hit
977           which completely contain the $component_interval.
978           The set of $component_interval's is a disjoint decomposition
979           of the minimum set of minimal intervals that completely
980           cover the hit's HSPs (from the perspective of the $type)
981 Note    : This calculates the map for all strand/frame contexts available
982           in the hit
983
984=cut
985
986sub _calc_coverage_map {
987    my $self = shift;
988    my ($type) = @_;
989    $self->_check_type_arg(\$type);
990
991    # obtain the [start, end] intervals for all hsps in the hit (relative
992    # to the type)
993    unless ($self->{'hsps'}) {
994	$self->warn("No HSPs for this hit");
995	return;
996    }
997
998    my (@map, @hsps, %filters, @intervals);
999
1000
1001    # conversion here?
1002    my $c = $self->mapping($type);
1003
1004    # create the possible maps
1005    for my $context ($self->contexts($type)) {
1006	@map = ();
1007	@hsps = ($self->hsps)[$self->contexts($type, $context)];
1008	@intervals = get_intervals_from_hsps( $type, @hsps );
1009	# the "frame"
1010	my $f = ($intervals[0]->[0] - 1) % $c;
1011
1012	# convert interval endpoints...
1013	for (@intervals) {
1014	    $$_[0] = ($$_[0] - $f + $c - 1)/$c;
1015	    $$_[1]  = ($$_[1] - $f)/$c;
1016	}
1017
1018	# determine the minimal set of disjoint intervals that cover the
1019	# set of hsp intervals
1020	my @dj_set = interval_tiling(\@intervals);
1021
1022	# decompose each disjoint interval into another set of disjoint
1023	# intervals, each of which is completely contained within the
1024	# original hsp intervals with which it overlaps
1025	my $i=0;
1026	my @decomp;
1027	for my $dj_elt (@dj_set) {
1028	    my ($covering, $indices) = @$dj_elt;
1029	    my @covering_hsps = @hsps[@$indices];
1030	    my @coverers = @intervals[@$indices];
1031	    @decomp = decompose_interval( \@coverers );
1032	    for (@decomp) {
1033		my ($component, $container_indices) = @{$_};
1034		push @map, [ $component,
1035			     [@covering_hsps[@$container_indices]] ];
1036	    }
1037	    1;
1038	}
1039
1040	# unconvert the components:
1041#####
1042	foreach (@map) {
1043	    $$_[0][0] = $c*$$_[0][0] - $c + 1 + $f;
1044	    $$_[0][1] = $c*$$_[0][1] + $f;
1045	}
1046	foreach (@dj_set) {
1047	    $$_[0][0] = $c*$$_[0][0] - $c + 1 + $f;
1048	    $$_[0][1] = $c*$$_[0][1] + $f;
1049	}
1050
1051	# sort the map on the interval left-ends
1052	@map = sort { $a->[0][0]<=>$b->[0][0] } @map;
1053	$self->{"coverage_map_${type}_${context}"} = [@map];
1054	# set the _contig_intersection attribute here (side effect)
1055	$self->{"_contig_intersection_${type}_${context}"} = [map { $$_[0] } @map];
1056    }
1057
1058    return 1; # success
1059}
1060
1061=head2 _calc_stats
1062
1063 Title   : _calc_stats
1064 Usage   : $tiling->_calc_stats($type, $action, $context)
1065 Function: Calculates [estimated] tiling statistics (identities, conserved sites
1066           length) and sets the public accessors
1067 Returns : True on success
1068 Args    : scalar $type: one of 'hit', 'subject', 'query'
1069           default is 'query'
1070           optional scalar $action: requests calculation method
1071            currently one of 'exact', 'est', 'fast', 'max'
1072           option scalar $context: strand/frame context string
1073 Note    : Action: The statistics are calculated by summing quantities
1074           over the disjoint component intervals, taking into account
1075           coverage of those intervals by multiple HSPs. The action
1076           tells the algorithm how to obtain those quantities--
1077           'exact' will use Bio::Search::HSP::HSPI::matches
1078            to count the appropriate segment of the homology string;
1079           'est' will estimate the statistics by multiplying the
1080            fraction of the HSP overlapped by the component interval
1081            (see MapTileUtils) by the BLAST-reported identities/postives
1082            (this may be convenient for BLAST summary report formats)
1083           * Both exact and est take the average over the number of HSPs
1084             that overlap the component interval.
1085           'max' uses the exact method to calculate the statistics,
1086            and returns only the maximum identites/positives over
1087            overlapping HSP for the component interval. No averaging
1088            is involved here.
1089           'fast' doesn't involve tiling at all (hence the name),
1090            but it seems like a very good estimate, and uses only
1091            reported values, and so does not require sequence data. It
1092            calculates an average of reported identities, conserved
1093            sites, and lengths, over unmodified hsps in the hit,
1094            weighted by the length of the hsps.
1095
1096=cut
1097
1098sub _calc_stats {
1099    my $self = shift;
1100    my ($type, $action, $context) = @_;
1101    # need to check args here, in case method is called internally.
1102    $self->_check_type_arg(\$type);
1103    $self->_check_action_arg(\$action);
1104    $self->_check_context_arg($type, \$context);
1105    my ($ident, $cons, $length) = (0,0,0);
1106
1107    # fast : avoid coverage map altogether, get a pretty damn
1108    # good estimate with a weighted average of reported hsp
1109    # statistics
1110
1111    ($action eq 'fast') && do {
1112	my @hsps = $self->hit->hsps;
1113	@hsps = @hsps[$self->contexts($type, $context)];
1114	# weights for averages
1115	my @wt = map {$_->length($type)} @hsps;
1116	my $sum = eval( join('+',@wt) );
1117	$_ /= $sum for (@wt);
1118	for (@hsps) {
1119	    my $wt = shift @wt;
1120	    $ident  += $wt*$_->matches_MT($type,'identities');
1121	    $cons   += $wt*$_->matches_MT($type,'conserved');
1122	    $length += $wt*$_->length($type);
1123	}
1124    };
1125
1126    # or, do tiling
1127
1128    # calculate identities/conserved sites in tiling
1129    # estimate based on the fraction of the component interval covered
1130    # and ident/cons reported by the HSPs
1131    ($action ne 'fast') && do {
1132	foreach ($self->coverage_map($type, $context)) {
1133	    my ($intvl, $hsps) = @{$_};
1134	    my $len = ($$intvl[1]-$$intvl[0]+1);
1135	    my $ncover = ($action eq 'max') ? 1 : scalar @$hsps;
1136	    my ($acc_i, $acc_c) = (0,0);
1137	    foreach my $hsp (@$hsps) {
1138		for ($action) {
1139		    ($_ eq 'est') && do {
1140			my ($inc_i, $inc_c) = $hsp->matches_MT(
1141			    -type   => $type,
1142			    -action => 'searchutils',
1143			    );
1144			my $frac = $len/$hsp->length($type);
1145			$acc_i += $inc_i * $frac;
1146			$acc_c += $inc_c * $frac;
1147			last;
1148		    };
1149		    ($_ eq 'max') && do {
1150			my ($inc_i, $inc_c) = $hsp->matches_MT(
1151			    -type   => $type,
1152			    -action => 'searchutils',
1153			    -start => $$intvl[0],
1154			    -end   => $$intvl[1]
1155			    );
1156			$acc_i = ($acc_i > $inc_i) ? $acc_i : $inc_i;
1157			$acc_c = ($acc_c > $inc_c) ? $acc_c : $inc_c;
1158			last;
1159		    };
1160		    (!$_ || ($_ eq 'exact')) && do {
1161			my ($inc_i, $inc_c) = $hsp->matches_MT(
1162			    -type   => $type,
1163			    -action => 'searchutils',
1164			    -start  => $$intvl[0],
1165			    -end    => $$intvl[1]
1166			    );
1167			$acc_i += $inc_i;
1168			$acc_c += $inc_c;
1169			last;
1170		    };
1171		}
1172	    }
1173	    $ident += ($acc_i/$ncover);
1174	    $cons  += ($acc_c/$ncover);
1175	    $length += $len;
1176	}
1177    };
1178
1179    $self->{"identities_${type}_${action}_${context}"} = $ident;
1180    $self->{"conserved_${type}_${action}_${context}"} = $cons;
1181    $self->{"length_${type}_${action}_${context}"} = $length;
1182
1183    return 1;
1184}
1185
1186=head2 Tiling Helper Methods
1187
1188=cut
1189
1190# coverage_map is of the form
1191# ( [ $interval, \@containing_hsps ], ... )
1192
1193# so, for each interval, pick one of the containing hsps,
1194# and return the union of all the picks.
1195
1196# use the combinatorial generating iterator, with
1197# the urns containing the @containing_hsps for each
1198# interval
1199
1200=head2 _make_tiling_iterator
1201
1202 Title   : _make_tiling_iterator
1203 Usage   : $self->_make_tiling_iterator($type)
1204 Function: Create an iterator code ref that will step through all
1205           minimal combinations of HSPs that produce complete coverage
1206           of the $type ('hit', 'subject', 'query') sequence,
1207           and set the correct iterator property of the invocant
1208 Example :
1209 Returns : The iterator
1210 Args    : scalar $type, one of 'hit', 'subject', 'query';
1211           default is 'query'
1212
1213=cut
1214
1215sub _make_tiling_iterator {
1216    ### create the urns
1217    my $self = shift;
1218    my ($type, $context) = @_;
1219    $self->_check_type_arg(\$type);
1220    $self->_check_context_arg($type, \$context);
1221
1222    # initialize the urns
1223    my @urns = map { [0,  $$_[1]] } $self->coverage_map($type, $context);
1224
1225    my $FINISHED = 0;
1226    my $iter = sub {
1227	# rewind?
1228	if (my $rewind = shift) {
1229	    # reinitialize urn indices
1230	    $$_[0] = 0 for (@urns);
1231	    $FINISHED = 0;
1232	    return 1;
1233	}
1234	# check if done...
1235        return if $FINISHED;
1236
1237        my $finished_incrementing = 0;
1238	# @ret is the collector of urn choices
1239	my @ret;
1240
1241	for my $urn (@urns) {
1242	    my ($n, $hsps) = @$urn;
1243	    push @ret, $$hsps[$n];
1244	    unless ($finished_incrementing) {
1245		if ($n == $#$hsps) { $$urn[0] = 0; }
1246		else { ($$urn[0])++; $finished_incrementing = 1 }
1247	    }
1248	}
1249
1250	# backstop...
1251        $FINISHED = 1 unless $finished_incrementing;
1252	# uniquify @ret
1253	# $hsp->rank is a unique identifier for an hsp in a hit.
1254	# preserve order in @ret
1255
1256	my (%order, %uniq);
1257	@order{(0..$#ret)} = @ret;
1258	$uniq{$order{$_}->rank} = $_ for (0..$#ret);
1259	@ret = @order{ sort {$a<=>$b} values %uniq };
1260
1261        return @ret;
1262    };
1263    return $iter;
1264}
1265
1266=head2 _tiling_iterator
1267
1268 Title   : _tiling_iterator
1269 Usage   : $tiling->_tiling_iterator($type,$context)
1270 Function: Retrieve the tiling iterator coderef for the requested
1271           $type ('hit', 'subject', 'query')
1272 Example :
1273 Returns : coderef to the desired iterator
1274 Args    : scalar $type, one of 'hit', 'subject', 'query'
1275           default is 'query'
1276           option scalar $context: strand/frame context string
1277 Note    : getter only
1278
1279=cut
1280
1281sub _tiling_iterator {
1282    my $self = shift;
1283    my ($type, $context) = @_;
1284    $self->_check_type_arg(\$type);
1285    $self->_check_context_arg($type, \$context);
1286
1287    if (!defined $self->{"_tiling_iterator_${type}_${context}"}) {
1288	$self->{"_tiling_iterator_${type}_${context}"} =
1289	    $self->_make_tiling_iterator($type,$context);
1290    }
1291    return $self->{"_tiling_iterator_${type}_${context}"};
1292}
1293=head2 Construction Helper Methods
1294
1295See also L<Bio::Search::Tiling::MapTileUtils>.
1296
1297=cut
1298
1299sub _check_type_arg {
1300    my $self = shift;
1301    my $typeref = shift;
1302    $$typeref ||= 'query';
1303    $self->throw("Unknown type '$$typeref'") unless grep(/^$$typeref$/, qw( hit query subject ));
1304    $$typeref = 'hit' if $$typeref eq 'subject';
1305    return 1;
1306}
1307
1308sub _check_action_arg {
1309    my $self = shift;
1310    my $actionref = shift;
1311    if (!$$actionref) {
1312	$$actionref = ($self->_has_sequence_data ? 'exact' : 'est');
1313    }
1314    else {
1315	$self->throw("Calc action '$$actionref' unrecognized") unless grep /^$$actionref$/, qw( est exact fast max );
1316	if ($$actionref ne 'est' and !$self->_has_sequence_data) {
1317	    $self->warn("Blast file did not possess sequence data; defaulting to 'est' action");
1318	    $$actionref = 'est';
1319	}
1320    }
1321    return 1;
1322}
1323
1324sub _check_context_arg {
1325    my $self = shift;
1326    my ($type, $contextref) = @_;
1327    if (!$$contextref) {
1328	$self->throw("Type '$type' requires strand/frame context for algorithm ".$self->algorithm) unless ($self->mapping($type) == 1);
1329	# set default according to default_context attrib
1330	$$contextref = $self->default_context($type);
1331    }
1332    else {
1333	($$contextref =~ /^[mp]$/) && do { $$contextref .= '_' };
1334	$self->throw("Context '$$contextref' unrecognized") unless
1335	    $$contextref =~ /all|[mp][0-2_]/;
1336    }
1337
1338}
1339
1340=head2 _make_context_key
1341
1342 Title   : _make_context_key
1343 Alias   : _context
1344 Usage   : $tiling->_make_context_key(-strand => $strand, -frame => $frame)
1345 Function: create a string indicating strand/frame context; serves as
1346           component of memoizing hash keys
1347 Returns : scalar string
1348 Args    : -type => one of ('query', 'hit', 'subject')
1349           -strand => one of (1,0,-1)
1350           -frame  => one of (-2, 1, 0, 1, -2)
1351           called w/o args: returns 'all'
1352
1353=cut
1354
1355sub _make_context_key {
1356    my $self = shift;
1357    my @args = @_;
1358    my ($type, $strand, $frame) = $self->_rearrange([qw(TYPE STRAND FRAME)], @args);
1359    _check_type_arg(\$type);
1360    return 'all' unless (defined $strand or defined $frame);
1361    if ( defined $strand && $self->_has_strand($type) ) {
1362	if (defined $frame && $self->_has_frame($type)) {
1363	    return ($strand >= 0 ? 'p' : 'm').abs($frame);
1364	}
1365	else {
1366	    return ($strand >= 0 ? 'p_' : 'm_');
1367	}
1368    }
1369    else {
1370	if (defined $frame && $self->_has_frame($type)) {
1371	    $self->warn("Frame defined without strand; punting with plus strand");
1372	    return 'p'.abs($frame);
1373	}
1374	else {
1375	    return 'all';
1376	}
1377    }
1378}
1379
1380=head2 _context
1381
1382 Title   : _context
1383 Alias   : _make_context_key
1384 Usage   : $tiling->_make_context_key(-strand => $strand, -frame => $frame)
1385 Function: create a string indicating strand/frame context; serves as
1386           component of memoizing hash keys
1387 Returns : scalar string
1388 Args    : -type => one of ('query', 'hit', 'subject')
1389           -strand => one of (1,0,-1)
1390           -frame  => one of (-2, 1, 0, 1, -2)
1391           called w/o args: returns 'all'
1392
1393=cut
1394
1395sub _context { shift->_make_context_key(@_) }
1396
1397=head2 Predicates
1398
1399Most based on a reading of the algorithm name with a configuration lookup.
1400
1401=over
1402
1403=item _has_sequence_data()
1404
1405=cut 
1406
1407sub _has_sequence_data {
1408    my $self = shift;
1409    $self->throw("Hit attribute  not yet set") unless defined $self->hit;
1410    return (($self->hit->hsps)[0]->seq_str('match') ? 1 : 0);
1411}
1412
1413=item _has_logical_length()
1414
1415=cut
1416
1417sub _has_logical_length {
1418    my $self = shift;
1419    my $type = shift;
1420    $self->_check_type_arg(\$type);
1421    # based on mapping coeff
1422    $self->throw("Mapping coefficients not yet set") unless defined $self->mapping($type);
1423    return ($self->mapping($type) > 1);
1424}
1425
1426=item _has_strand()
1427
1428=cut
1429
1430sub _has_strand {
1431    my $self = shift;
1432    my $type = shift;
1433    $self->_check_type_arg(\$type);
1434    return $self->{"_has_strand_${type}"};
1435}
1436
1437=item _has_frame()
1438
1439=cut
1440
1441sub _has_frame {
1442    my $self = shift;
1443    my $type = shift;
1444    $self->_check_type_arg(\$type);
1445    return $self->{"_has_frame_${type}"};
1446}
1447
1448=back
1449
1450=head1 Private Accessors
1451
1452=head2 _contig_intersection
1453
1454 Title   : _contig_intersection
1455 Usage   : $tiling->_contig_intersection($type)
1456 Function: Return the minimal set of $type coordinate intervals
1457           covered by the invocant's HSPs
1458 Returns : array of intervals (2-member arrayrefs; see MapTileUtils)
1459 Args    : scalar $type: one of 'query', 'hit', 'subject'
1460
1461=cut
1462
1463sub _contig_intersection {
1464    my $self = shift;
1465    my ($type, $context) = @_;
1466    $self->_check_type_arg(\$type);
1467    $self->_check_context_arg($type, \$context);
1468    if (!defined $self->{"_contig_intersection_${type}_${context}"}) {
1469	$self->_calc_coverage_map($type);
1470    }
1471    return @{$self->{"_contig_intersection_${type}_${context}"}};
1472}
1473
1474=head2 _reported_length
1475
1476 Title   : _reported_length
1477 Usage   : $tiling->_reported_length($type)
1478 Function: Get the total length of the seq $type
1479           for the invocant's hit object, as reported
1480           by (not calculated from) the input data file
1481 Returns : scalar int
1482 Args    : scalar $type: one of 'query', 'hit', 'subject'
1483 Note    : This is kludgy; the hit object does not currently
1484           maintain accessors for these values, but the
1485           hsps possess these attributes. This is a wrapper
1486           that allows a consistent access method in the
1487           MapTiling code.
1488 Note    : Since this number is based on a reported length,
1489           it is already a "logical length".
1490
1491=cut
1492
1493sub _reported_length {
1494    my $self = shift;
1495    my $type = shift;
1496    $self->_check_type_arg(\$type);
1497    my $key = uc( $type."_LENGTH" );
1498    return ($self->hsps)[0]->{$key};
1499}
1500
15011;
1502
1503