1package Geo::BUFR; 2 3# Copyright (C) 2010-2020 MET Norway 4# 5# This module is free software; you can redistribute it and/or 6# modify it under the same terms as Perl itself. 7 8=begin General_remarks 9 10Some general remarks on variables 11--------------------------------- 12 13@data = data array 14@desc = descriptor array 15 16These 2 arrays are in one to one correspondence, but note that some C 17descriptors (2.....) are included in @desc even though there is no 18associated data value in message (the corresponding element in @data 19is set to ''). These descriptors without value are printed in 20dumpsection4 without line number, to distinguish them from 'real' data 21descriptors. 22 23$idesc = index of descriptor in @desc (and @data) 24$bm_idesc = index of bit mapped descriptor in @data (and @desc, see below) 25 26Variables related to bit maps: 27 28$self->{BUILD_BITMAP} 29$self->{BITMAP_INDEX} 30$self->{NUM_BITMAPS} 31$self->{BACKWARD_DATA_REFERENCE} 32 33These are explained in sub new 34 35$self->{BITMAP_OPERATORS} 36 37Reference to an array containing operators in BUFR table C which are 38associated with bit maps, i.e. one of 22[2-5]000 and 232000; the 39operator being added when it is met in section 3 in message. Note that 40an operator may occur multiple times, which is why we have to use an 41array, not a hash. 42 43$self->{CURRENT_BITMAP} 44 45Reference to an array which contains the indexes of data values for 46which data is marked as present in 031031 in the current used bit map. 47E.g. [2,3,6] if bitmap = 1100110. 48 49$self->{BITMAP_START} 50 51Array containing for each bit map the index of the first element 52descriptor for which the bit map relates. 53 54$self->{BITMAPS} 55 56Reference to an array, one element added for each bit map operator in 57$self->{BITMAP_OPERATORS} and each subset (although for compression we 58assume all subset have identical bitmaps and operate with subset 0 59only, i.e. $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[0] instead of 60...->[$isub]), the element being a reference to an array containing 61consecutive pairs of indexes ($idesc, $bm_idesc), used to look up in 62@data and @desc arrays for the value/descriptor and corresponding bit 63mapped value/descriptor. 64 65$self->{REUSE_BITMAP} 66 67Gets defined when 237000 is met, undefined if 237255 or 235000 is met. 68Originally for each subset (but defined for subset 0 only if 69compression) set to reference an array of the indexes of data values 70to which the last used bitmap relates (fetched from $self->{BITMAPS}), 71then shifted as the new element in $self->{BITMAPS} is built up. 72 73For operator 222000 ('Quality information follows') the bit mapped 74descriptor should be a 033-descriptor. For 22[3-5]/232 the bit mapped 75value should be the data value of the 22[3-5]255/232255 descriptors 76following the operator in BUFR section 3, with bit mapped descriptor 77$desc[bm_idesc] equal to $desc[$idesc] (with data width and reference 78value changed for 225255) 79 80=end General_remarks 81 82=cut 83 84require 5.006; 85use strict; 86use warnings; 87use Carp; 88use Cwd qw(getcwd); 89use FileHandle; 90use File::Spec::Functions qw(catfile); 91use Scalar::Util qw(looks_like_number); 92use Time::Local qw(timegm); 93# Also requires Storable if sub copy_from() is called 94 95require DynaLoader; 96our @ISA = qw(DynaLoader); 97our $VERSION = '1.38'; 98 99# This loads BUFR.so, the compiled version of BUFR.xs, which 100# contains bitstream2dec, bitstream2ascii, dec2bitstream, 101# ascii2bitstream and null2bitstream 102bootstrap Geo::BUFR $VERSION; 103 104 105# Some package globals 106our $Verbose = 0; 107 108# $Verbose or $self->{VERBOSE} > 0 leads to the following output, all 109# except for level 6 on lines starting with 'BUFR.pm: ': 110# 1 -> B,C,D tables used (full path) 111# 2 -> Identifying stages of processing, displaying length of sections 112# and some additional data from section 1 and 3 113# 3 -> All descriptors and values extracted 114# 4 -> Operator specific information, including delayed replication 115# and repetition 116# 5 -> BUFR compression specific information 117# 6 -> Calling dumpsection0,1,3 118 119our $Spew = 0; # To avoid the overhead of subroutine calls to _spew 120 # (which is called a lot), $Spew is set to 1 if global 121 # $Verbose or at least one object VERBOSE is set > 1. 122 # This should speed up execution a bit in the common 123 # situation when no verbose output (except possibly 124 # the BUFR tables used) is requested 125our $Nodata = 0; # If set to true will prevent decoding of section 4 126our $Noqc = 0; # If set to true will prevent decoding (or encoding) of 127 # any descriptors after 222000 is met 128our $Reuse_current_ahl = 0; 129 # If set to true will cause cet_current_ahl() to return 130 # last AHL extracted and not undef if currently 131 # processed BUFR message has no (immediately preceding) 132 # AHL 133our $Strict_checking = 0; # Ignore recoverable errors in BUFR format 134 # met during decoding. User might set 135 # $Strict_checking to 1: Issue warning 136 # (carp) but continue decoding, or to 2: 137 # Croak instead of carp 138 139# The next 2 operators are separated for readability. Public interface should 140# provide only set_show_all_operators() to set both of these (to the same value) 141our $Show_all_operators = 0; # = 0: show just the most informative C operators in dumpsection4 142 # = 1: show all operators (as far as possible) 143our $Show_replication = 0; # = 0: don't include replication descriptors (F=1) in dumpsection4 144 # = 1: include replication descriptors(F=1) in dumpsection4, 145 # with X in FXY replaced with actual number X' of replicated descriptors. 146 # X' is replaced by 0 if X' > 99 147 148our %BUFR_table; 149# Keys: PATH -> full path to the chosen directory of BUFR tables 150# FORMAT -> supported formats are BUFRDC and ECCODES 151# B$version -> hash containing the B table $BUFR_table/B$version 152# key: element descriptor (6 digits) 153# value: a \0 separated string containing the B table fields 154# $name, $unit, $scale, $refval, $bits 155# C$version -> hash containing the C table $BUFR_table/C$version 156# key: table B descriptor (6 digits) of the code/flag table 157# value: a new hash, with keys the possible values listed in 158# the code table, the value the corresponding text 159# D$version -> hash containing the D table $BUFR_table/D$version 160# key: sequence descriptor 161# value: a space separated string containing the element 162# descriptors (6 digits) the sequence descriptor expands to 163$BUFR_table{FORMAT} = 'BUFRDC'; # Default. Might in the future be changed to ECCODES 164 165our %Descriptors_already_expanded; 166# Keys: Text string "$table_version $unexpanded_descriptors" 167# Values: Space separated string of expanded descriptors 168 169sub _croak { 170 my $msg = shift; 171 croak "BUFR.pm ERROR: $msg"; 172} 173 174## Carp or croak (or ignore) according to value of $Strict_checking 175sub _complain { 176 my $msg = shift; 177 if ($Strict_checking == 1) { 178 carp "BUFR.pm WARNING: $msg"; 179 } elsif ($Strict_checking > 1) { 180 croak "BUFR.pm ERROR: $msg"; 181 } 182 return; 183} 184 185sub _spew { 186 my $self = shift; 187 my $level = shift; 188 if (ref($self)) { 189 # Global $Verbose overrides object VERBOSE 190 return if $level > $self->{VERBOSE} && $level > $Verbose; 191 } else { 192 return if $level > $Verbose; 193 } 194 my $format = shift; 195 if (@_) { 196 printf "BUFR.pm: $format\n", @_; 197 } else { 198 print "BUFR.pm: $format\n"; 199 } 200 return; 201} 202 203## Object constructor 204sub new { 205 my $class = shift; 206 my $self = {}; 207 $self->{VERBOSE} = 0; 208 $self->{CURRENT_MESSAGE} = 0; 209 $self->{CURRENT_SUBSET} = 0; 210 $self->{BUILD_BITMAP} = 0; # Will be set to 1 if a bit map needs to 211 # be built 212 $self->{BITMAP_INDEX} = 0; # Used for building up bit maps; will 213 # be incremented for each 031031 214 # encountered, then reset to 0 when bit 215 # map is finished built 216 $self->{NUM_BITMAPS} = 0; # Will be incremented each time an 217 # operator descriptor which uses a bit 218 # map is encountered in section 3 219 $self->{BACKWARD_DATA_REFERENCE} = 1; # Number the first bitmap in 220 # a possible sequence of bitmaps which 221 # relate to the same scope of data 222 # descriptors. Starts as 1 when (or 223 # rather before) the first bitmap is 224 # constructed, will then be reset to 225 # the number of the next bitmap to be 226 # constructed each time 235000 is met 227 $self->{NUM_CHANGE_OPERATORS} = 0; # Will be incremented for 228 # each of the operators CHANGE_WIDTH, 229 # CHANGE_CCITTIA5_WIDTH, CHANGE_SCALE, 230 # CHANGE_REFERENCE_VALUE (actually 231 # NEW_REFVAL_OF), CHANGE_SRW and 232 # DIFFERENCE_STATISTICAL_VALUE in effect 233 234 # If number of arguments is odd, first argument is expected to be 235 # a string containing the BUFR message(s) 236 if (@_ % 2) { 237 $self->{IN_BUFFER} = shift; 238 } 239 240 # This part is not documented in the POD. Better to remove it? 241 while (@_) { 242 my $parameter = shift; 243 my $value = shift; 244 $self->{$parameter} = $value; 245 } 246 bless $self, ref($class) || $class; 247 return $self; 248} 249 250## Copy contents of the bufr object in first argument. With no extra 251## arguments, will copy (clone) everything. With 'metadata' as second 252## argument, will copy just the metadata in section 0, 1 and 3 (and 253## all of section 2 if present) 254sub copy_from { 255 my $self = shift; 256 my $bufr = shift; 257 _croak("First argument to copy_from must be a Geo::BUFR object") 258 unless ref($bufr) eq 'Geo::BUFR'; 259 my $what = shift || 'all'; 260 if ($what eq 'metadata') { 261 for (qw( 262 BUFR_EDITION 263 MASTER_TABLE CENTRE SUBCENTRE UPDATE_NUMBER OPTIONAL_SECTION 264 DATA_CATEGORY INT_DATA_SUBCATEGORY LOC_DATA_SUBCATEGORY 265 MASTER_TABLE_VERSION LOCAL_TABLE_VERSION YEAR MONTH DAY 266 HOUR MINUTE SECOND LOCAL_USE DATA_SUBCATEGORY YEAR_OF_CENTURY 267 NUM_SUBSETS OBSERVED_DATA COMPRESSED_DATA DESCRIPTORS_UNEXPANDED 268 SEC2_STREAM 269 )) { 270 if (exists $bufr->{$_}) { 271 $self->{$_} = $bufr->{$_}; 272 } else { 273 # This cleanup might be necessary if BUFR edition changes 274 delete $self->{$_} if exists $self->{$_}; 275 } 276 } 277 } elsif ($what eq 'all') { 278 %$self = (); 279 while (my ($key, $value) = each %{$bufr}) { 280 if ($key eq 'FILEHANDLE') { 281 # If a file has been associated with the copied 282 # object, make a new filehandle rather than just 283 # copying the reference 284 $self->fopen($bufr->{FILENAME}); 285 } elsif (ref($value) and $key !~ /[BCD]_TABLE/) { 286 # Copy the whole structure, not merely the reference. 287 # Using Clone would be cheaper, but unfortunately 288 # Clone is not a core module, while Storable is 289 require Storable; 290 import Storable qw(dclone); 291 $self->{$key} = dclone($value); 292 } else { 293 $self->{$key} = $value; 294 } 295 } 296 } else { 297 _croak("Don't recognize second argument '$what' to copy_from()"); 298 } 299 return 1; 300} 301 302 303## Set debug level. Also set $Spew to true if debug level > 1 is set 304## (we don't bother to reset $Spew to 0 if all debug levels later are 305## reset to 0 or 1) 306sub set_verbose { 307 my $self = shift; 308 my $verbose = shift; 309 if (ref($self)) { 310 # Just myself 311 $self->{VERBOSE} = $verbose; 312 $self->_spew(2, "Verbosity level for object set to %d", $verbose); 313 } else { 314 # Whole class 315 $Verbose = $verbose; 316 Geo::BUFR->_spew(2, "Verbosity level for class set to %d", $verbose); 317 } 318 $Spew = $verbose if $verbose > 1; 319 return 1; 320} 321 322## Turn off (or on) decoding of section 4 323sub set_nodata { 324 my $self = shift; 325 my $n = shift; 326 $Nodata = defined $n ? $n : 1; # Default is 1 327 Geo::BUFR->_spew(2, "Nodata set to %d", $Nodata); 328 return 1; 329} 330 331## Turn off (or on) decoding of quality information 332sub set_noqc { 333 my $self = shift; 334 my $n = shift; 335 $Noqc = defined $n ? $n : 1; # Default is 1 336 Geo::BUFR->_spew(2, "Noqc set to %d", $Noqc); 337 return 1; 338} 339 340## Require strict checking of BUFR format 341sub set_strict_checking { 342 my $self = shift; 343 my $n = shift; 344 _croak "Value for strict checking not provided" 345 unless defined $n; 346 $Strict_checking = $n; 347 Geo::BUFR->_spew(2, "Strict_checking set to %d", $Strict_checking); 348 return 1; 349} 350 351## Show replication descriptors (with X in FXY replaced by actual 352## number of descriptors replicated, adjusted to 0 if > 99) and all 353## data description operators when calling dumpsection4 354sub set_show_all_operators { 355 my $self = shift; 356 my $n = shift; 357 $Show_all_operators = defined $n ? $n : 1; # Default is 1 358 $Show_replication = $Show_all_operators; 359 Geo::BUFR->_spew(2, "Show_all_operators set to %d", $Show_all_operators); 360 return 1; 361} 362 363## Accessor methods for BUFR sec0-3 ## 364sub get_bufr_length { 365 my $self = shift; 366 return defined $self->{BUFR_LENGTH} ? $self->{BUFR_LENGTH} : undef; 367} 368sub set_bufr_edition { 369 my ($self, $bufr_edition) = @_; 370 _croak "BUFR edition number not provided in set_bufr_edition" 371 unless defined $bufr_edition; 372 _croak "BUFR edition number must be an integer, is '$bufr_edition'" 373 unless $bufr_edition =~ /^\d+$/; 374 _croak "Not an allowed value for BUFR edition number: $bufr_edition" 375 unless $bufr_edition >= 0 and $bufr_edition < 5; 376 # BUFR edition 0 is in fact in use in ECMWF MARS archive 377 $self->{BUFR_EDITION} = $bufr_edition; 378 return 1; 379} 380sub get_bufr_edition { 381 my $self = shift; 382 return defined $self->{BUFR_EDITION} ? $self->{BUFR_EDITION} : undef; 383} 384sub set_master_table { 385 my ($self, $master_table) = @_; 386 _croak "BUFR master table not provided in set_master_table" 387 unless defined $master_table; 388 _croak "BUFR master table must be an integer, is '$master_table'" 389 unless $master_table =~ /^\d+$/; 390 # Max value that can be stored in 1 byte is 255 391 _croak "BUFR master table exceeds limit 255, is '$master_table'" 392 if $master_table > 255; 393 $self->{MASTER_TABLE} = $master_table; 394 return 1; 395} 396sub get_master_table { 397 my $self = shift; 398 return defined $self->{MASTER_TABLE} ? $self->{MASTER_TABLE} : undef; 399} 400sub set_centre { 401 my ($self, $centre) = @_; 402 _croak "Originating/generating centre not provided in set_centre" 403 unless defined $centre; 404 _croak "Originating/generating centre must be an integer, is '$centre'" 405 unless $centre =~ /^\d+$/; 406 # Max value that can be stored in 2 bytes is 65535 407 _croak "Originating/generating centre exceeds limit 65535, is '$centre'" 408 if $centre > 65535; 409 $self->{CENTRE} = $centre; 410 return 1; 411} 412sub get_centre { 413 my $self = shift; 414 return defined $self->{CENTRE} ? $self->{CENTRE} : undef; 415} 416sub set_subcentre { 417 my ($self, $subcentre) = @_; 418 _croak "Originating/generating subcentre not provided in set_subcentre" 419 unless defined $subcentre; 420 _croak "Originating/generating subcentre must be an integer, is '$subcentre'" 421 unless $subcentre =~ /^\d+$/; 422 _croak "Originating/generating subcentre exceeds limit 65535, is '$subcentre'" 423 if $subcentre > 65535; 424 $self->{SUBCENTRE} = $subcentre; 425 return 1; 426} 427sub get_subcentre { 428 my $self = shift; 429 return defined $self->{SUBCENTRE} ? $self->{SUBCENTRE} : undef; 430} 431sub set_update_sequence_number { 432 my ($self, $update_number) = @_; 433 _croak "Update sequence number not provided in set_update_sequence_number" 434 unless defined $update_number; 435 _croak "Update sequence number must be a nonnegative integer, is '$update_number'" 436 unless $update_number =~ /^\d+$/; 437 _croak "Update sequence number exceeds limit 255, is '$update_number'" 438 if $update_number > 255; 439 $self->{UPDATE_NUMBER} = $update_number; 440 return 1; 441} 442sub get_update_sequence_number { 443 my $self = shift; 444 return defined $self->{UPDATE_NUMBER} ? $self->{UPDATE_NUMBER} : undef; 445} 446sub set_optional_section { 447 my ($self, $optional_section) = @_; 448 _croak "Optional section (0 or 1) not provided in set_optional_section" 449 unless defined $optional_section; 450 _croak "Optional section must be 0 or 1, is '$optional_section'" 451 unless $optional_section eq '0' or $optional_section eq '1'; 452 $self->{OPTIONAL_SECTION} = $optional_section; 453 return 1; 454} 455sub get_optional_section { 456 my $self = shift; 457 return defined $self->{OPTIONAL_SECTION} ? $self->{OPTIONAL_SECTION} : undef; 458} 459sub set_data_category { 460 my ($self, $data_category) = @_; 461 _croak "Data category not provided in set_data_category" 462 unless defined $data_category; 463 _croak "Data category must be an integer, is '$data_category'" 464 unless $data_category =~ /^\d+$/; 465 _croak "Data category exceeds limit 255, is '$data_category'" 466 if $data_category > 255; 467 $self->{DATA_CATEGORY} = $data_category; 468 return 1; 469} 470sub get_data_category { 471 my $self = shift; 472 return defined $self->{DATA_CATEGORY} ? $self->{DATA_CATEGORY} : undef; 473} 474sub set_int_data_subcategory { 475 my ($self, $int_data_subcategory) = @_; 476 _croak "International data subcategory not provided in set_int_data_subcategory" 477 unless defined $int_data_subcategory; 478 _croak "International data subcategory must be an integer, is '$int_data_subcategory'" 479 unless $int_data_subcategory =~ /^\d+$/; 480 _croak "International data subcategory exceeds limit 255, is '$int_data_subcategory'" 481 if $int_data_subcategory > 255; 482 $self->{INT_DATA_SUBCATEGORY} = $int_data_subcategory; 483 return 1; 484} 485sub get_int_data_subcategory { 486 my $self = shift; 487 return defined $self->{INT_DATA_SUBCATEGORY} ? $self->{INT_DATA_SUBCATEGORY} : undef; 488} 489sub set_loc_data_subcategory { 490 my ($self, $loc_data_subcategory) = @_; 491 _croak "Local subcategory not provided in set_loc_data_subcategory" 492 unless defined $loc_data_subcategory; 493 _croak "Local data subcategory must be an integer, is '$loc_data_subcategory'" 494 unless $loc_data_subcategory =~ /^\d+$/; 495 _croak "Local data subcategory exceeds limit 255, is '$loc_data_subcategory'" 496 if $loc_data_subcategory > 255; 497 $self->{LOC_DATA_SUBCATEGORY} = $loc_data_subcategory; 498 return 1; 499} 500sub get_loc_data_subcategory { 501 my $self = shift; 502 return defined $self->{LOC_DATA_SUBCATEGORY} ? $self->{LOC_DATA_SUBCATEGORY} : undef; 503} 504sub set_data_subcategory { 505 my ($self, $data_subcategory) = @_; 506 _croak "Data subcategory not provided in set_data_subcategory" 507 unless defined $data_subcategory; 508 _croak "Data subcategory must be an integer, is '$data_subcategory'" 509 unless $data_subcategory =~ /^\d+$/; 510 _croak "Data subcategory exceeds limit 255, is '$data_subcategory'" 511 if $data_subcategory > 255; 512 $self->{DATA_SUBCATEGORY} = $data_subcategory; 513 return 1; 514} 515sub get_data_subcategory { 516 my $self = shift; 517 return defined $self->{DATA_SUBCATEGORY} ? $self->{DATA_SUBCATEGORY} : undef; 518} 519sub set_master_table_version { 520 my ($self, $master_table_version) = @_; 521 _croak "Master table version not provided in set_master_table_version" 522 unless defined $master_table_version; 523 _croak "BUFR master table version must be an integer, is '$master_table_version'" 524 unless $master_table_version =~ /^\d+$/; 525 _croak "BUFR master table version exceeds limit 255, is '$master_table_version'" 526 if $master_table_version > 255; 527 $self->{MASTER_TABLE_VERSION} = $master_table_version; 528 return 1; 529} 530sub get_master_table_version { 531 my $self = shift; 532 return defined $self->{MASTER_TABLE_VERSION} 533 ? $self->{MASTER_TABLE_VERSION} : undef; 534} 535sub set_local_table_version { 536 my ($self, $local_table_version) = @_; 537 _croak "Local table version not provided in set_local_table_version" 538 unless defined $local_table_version; 539 _croak "Local table version must be an integer, is '$local_table_version'" 540 unless $local_table_version =~ /^\d+$/; 541 _croak "Local table version exceeds limit 255, is '$local_table_version'" 542 if $local_table_version > 255; 543 $self->{LOCAL_TABLE_VERSION} = $local_table_version; 544 return 1; 545} 546sub get_local_table_version { 547 my $self = shift; 548 return defined $self->{LOCAL_TABLE_VERSION} 549 ? $self->{LOCAL_TABLE_VERSION} : undef; 550} 551sub set_year_of_century { 552 my ($self, $year_of_century) = @_; 553 _croak "Year of century not provided in set_year_of_century" 554 unless defined $year_of_century; 555 _croak "Year of century must be an integer, is '$year_of_century'" 556 unless $year_of_century =~ /^\d+$/; 557 _complain "year_of_century > 100 in set_year_of_century: $year_of_century" 558 if $year_of_century > 100; 559 # A common mistake is to set year_of_century for year 2000 to 0, should be 100 560 $self->{YEAR_OF_CENTURY} = $year_of_century == 0 ? 100 : $year_of_century; 561 return 1; 562} 563sub get_year_of_century { 564 my $self = shift; 565 if (defined $self->{YEAR_OF_CENTURY}) { 566 return $self->{YEAR_OF_CENTURY}; 567 } elsif (defined $self->{YEAR}) { 568 my $yy = $self->{YEAR} % 100; 569 return $yy == 0 ? 100 : $yy; 570 } else { 571 return undef; 572 } 573} 574sub set_year { 575 my ($self, $year) = @_; 576 _croak "Year not provided in set_year" 577 unless defined $year; 578 _croak "Year must be an integer, is '$year'" 579 unless $year =~ /^\d+$/; 580 _croak "Year exceeds limit 65535, is '$year'" 581 if $year > 65535; 582 $self->{YEAR} = $year; 583 return 1; 584} 585sub get_year { 586 my $self = shift; 587 return defined $self->{YEAR} ? $self->{YEAR} : undef; 588} 589sub set_month { 590 my ($self, $month) = @_; 591 _croak "Month not provided in set_month" 592 unless defined $month; 593 _croak "Month must be an integer, is '$month'" 594 unless $month =~ /^\d+$/; 595 _complain "Month must be 1-12 in set_month, is '$month'" 596 if $month == 0 || $month > 12; 597 $self->{MONTH} = $month; 598 return 1; 599} 600sub get_month { 601 my $self = shift; 602 return defined $self->{MONTH} ? $self->{MONTH} : undef; 603} 604sub set_day { 605 my ($self, $day) = @_; 606 _croak "Day not provided in set_day" 607 unless defined $day; 608 _croak "Day must be an integer, is '$day'" 609 unless $day =~ /^\d+$/; 610 _complain "Day must be 1-31 in set_day, is '$day'" 611 if $day == 0 || $day > 31; 612 $self->{DAY} = $day; 613 return 1; 614} 615sub get_day { 616 my $self = shift; 617 return defined $self->{DAY} ? $self->{DAY} : undef; 618} 619sub set_hour { 620 my ($self, $hour) = @_; 621 _croak "Hour not provided in set_hour" 622 unless defined $hour; 623 _croak "Hour must be an integer, is '$hour'" 624 unless $hour =~ /^\d+$/; 625 _complain "Hour must be 0-23 in set_hour, is '$hour'" 626 if $hour > 23; 627 $self->{HOUR} = $hour; 628 return 1; 629} 630sub get_hour { 631 my $self = shift; 632 return defined $self->{HOUR} ? $self->{HOUR} : undef; 633} 634sub set_minute { 635 my ($self, $minute) = @_; 636 _croak "Minute not provided in set_minute" 637 unless defined $minute; 638 _croak "Minute must be an integer, is '$minute'" 639 unless $minute =~ /^\d+$/; 640 _complain "Minute must be 0-59 in set_minute, is '$minute'" 641 if $minute > 59; 642 $self->{MINUTE} = $minute; 643 return 1; 644} 645sub get_minute { 646 my $self = shift; 647 return defined $self->{MINUTE} ? $self->{MINUTE} : undef; 648} 649sub set_second { 650 my ($self, $second) = @_; 651 _croak "Second not provided in set_second" 652 unless defined $second; 653 _croak "Second must be an integer, is '$second'" 654 unless $second =~ /^\d+$/; 655 _complain "Second must be 0-59 in set_second, is '$second'" 656 if $second > 59; 657 $self->{SECOND} = $second; 658 return 1; 659} 660sub get_second { 661 my $self = shift; 662 return defined $self->{SECOND} ? $self->{SECOND} : undef; 663} 664sub set_local_use { 665 my ($self, $local_use) = @_; 666 _croak "Local use not provided in set_local use" 667 unless defined $local_use; 668 $self->{LOCAL_USE} = $local_use; 669 return 1; 670} 671sub get_local_use { 672 my $self = shift; 673 return defined $self->{LOCAL_USE} ? $self->{LOCAL_USE} : undef; 674} 675sub set_number_of_subsets { 676 my ($self, $number_of_subsets) = @_; 677 _croak "Number of subsets not provided in set_number_of_subsets" 678 unless defined $number_of_subsets; 679 _croak "Number of subsets must be an integer, is '$number_of_subsets'" 680 unless $number_of_subsets =~ /^\d+$/; 681 _croak "Number of subsets exceeds limit 65535, is '$number_of_subsets'" 682 if $number_of_subsets > 65535; 683 $self->{NUM_SUBSETS} = $number_of_subsets; 684 return 1; 685} 686sub get_number_of_subsets { 687 my $self = shift; 688 return defined $self->{NUM_SUBSETS} ? $self->{NUM_SUBSETS} : undef; 689} 690sub set_observed_data { 691 my ($self, $observed_data) = @_; 692 _croak "Observed data (0 or 1) not provided in set_observed_data" 693 unless defined $observed_data; 694 _croak "Observed data must be 0 or 1, is '$observed_data'" 695 unless $observed_data eq '0' or $observed_data eq '1'; 696 $self->{OBSERVED_DATA} = $observed_data; 697 return 1; 698} 699sub get_observed_data { 700 my $self = shift; 701 return defined $self->{OBSERVED_DATA} ? $self->{OBSERVED_DATA} : undef; 702} 703sub set_compressed_data { 704 my ($self, $compressed_data) = @_; 705 _croak "Compressed data (0 or 1) not provided in set_compressed_data" 706 unless defined $compressed_data; 707 _croak "Compressed data must be 0 or 1, is '$compressed_data'" 708 unless $compressed_data eq '0' or $compressed_data eq '1'; 709 _complain "Not allowed to use compression for one subset messages!" 710 if $compressed_data 711 and defined $self->{NUM_SUBSETS} and $self->{NUM_SUBSETS} == 1; 712 $self->{COMPRESSED_DATA} = $compressed_data; 713 return 1; 714} 715sub get_compressed_data { 716 my $self = shift; 717 return defined $self->{COMPRESSED_DATA} ? $self->{COMPRESSED_DATA} : undef; 718} 719sub set_descriptors_unexpanded { 720 my ($self, $descriptors_unexpanded) = @_; 721 _croak "Unexpanded descriptors not provided in set_descriptors_unexpanded" 722 unless defined $descriptors_unexpanded; 723 $self->{DESCRIPTORS_UNEXPANDED} = $descriptors_unexpanded; 724 return 1; 725} 726sub get_descriptors_unexpanded { 727 my $self = shift; 728 return defined $self->{DESCRIPTORS_UNEXPANDED} 729 ? $self->{DESCRIPTORS_UNEXPANDED} : undef; 730} 731############################################# 732## End of accessor methods for BUFR sec0-3 ## 733############################################# 734 735sub get_current_subset_number { 736 my $self = shift; 737 return defined $self->{CURRENT_SUBSET} ? $self->{CURRENT_SUBSET} : undef; 738} 739 740sub get_current_message_number { 741 my $self = shift; 742 return defined $self->{CURRENT_MESSAGE} ? $self->{CURRENT_MESSAGE} : undef; 743} 744 745sub get_current_ahl { 746 my $self = shift; 747 return defined $self->{CURRENT_AHL} ? $self->{CURRENT_AHL} : undef; 748} 749 750sub reuse_current_ahl { 751 my $self = shift; 752 my $n = shift; 753 $Reuse_current_ahl = defined $n ? $n : 1; # Default is 1 754 Geo::BUFR->_spew(2, "Reuse_current_ahl set to %d", $Reuse_current_ahl); 755 return 1; 756} 757 758sub ahl_is_reused { 759 my $self = shift; 760 return defined $self->{REUSED_CURRENT_AHL} ? $self->{REUSED_CURRENT_AHL} : undef; 761} 762 763sub set_filter_cb { 764 my $self = shift; 765 my $cb = shift; 766 767 if (ref $cb eq 'CODE') { 768 $self->{FILTER_CB} = $cb; 769 @{$self->{FILTER_ARGS}} = ($self, @_); 770 } else { 771 $self->{FILTER_CB} = undef; 772 delete $self->{FILTER_ARGS}; 773 } 774 return 1; 775} 776 777sub is_filtered { 778 my $self = shift; 779 return defined $self->{IS_FILTERED} ? $self->{IS_FILTERED} : undef; 780} 781 782sub bad_bufrlength { 783 my $self = shift; 784 return defined $self->{BAD_LENGTH} ? $self->{BAD_LENGTH} : undef; 785} 786 787sub set_tableformat { 788 my $self = shift; 789 790 my $format = shift; 791 _croak "Table format not provided. Possible values are BUFRDC and ECCODES" 792 unless defined $format; 793 _croak "Supported table formats are BUFRDC and ECCODES" 794 unless uc($format) eq 'BUFRDC' || uc($format) eq 'ECCODES'; 795 $BUFR_table{FORMAT} = uc($format); 796 Geo::BUFR->_spew(2, "BUFR table format set to %s", $BUFR_table{FORMAT}); 797 return 1; 798} 799 800sub get_tableformat { 801 my $self = shift; 802 return exists $BUFR_table{FORMAT} ? $BUFR_table{FORMAT} : ''; 803} 804 805## Set the path for BUFR table files 806## Usage: Geo::BUFR->set_tablepath(directory_list) 807## where directory_list is a list of colon-separated strings. 808## Example: Geo::BUFR->set_tablepath("/foo/bar:/foo/baz", "/some/where/else") 809sub set_tablepath { 810 my $self = shift; 811 812 $BUFR_table{PATH} = join ":", map {split /:/} @_; 813 Geo::BUFR->_spew(2, "BUFR table path set to %s", $BUFR_table{PATH}); 814 return 1; 815} 816 817sub get_tablepath { 818 my $self = shift; 819 820 if (exists $BUFR_table{PATH}) { 821 return wantarray ? split(/:/, $BUFR_table{PATH}) : $BUFR_table{PATH}; 822 } else { 823 return ''; 824 } 825} 826 827## Return table version from table if provided, or else from section 1 828## information in BUFR message. For BUFRDC, this is a stripped down 829## version of table name. For ECCODES, this is last path of table 830## location (e.g. '0/wmo/29'), and a stringified list of two such 831## paths (master and local) if local tables are used 832## (e.g. '0/wmo/29,0/local/8/78/236'). Returns undef/empty list if 833## impossible to determine table version. 834sub get_table_version { 835 my $self = shift; 836 my $table = shift; 837 838 if ($table) { 839 if ($BUFR_table{FORMAT} eq 'BUFRDC') { 840 # First check if this actually is an attempt to load an ECCODES table 841 if ($table =~ /wmo/ || $table =~ /local/) { 842 _croak("$table cannot be a BUFRDC table. " 843 . "Did you forget to set tableformat to ECCODES?"); 844 } 845 (my $version = $table) =~ s/^(?:[BCD]?)(.*?)(?:\.TXT)?$/$1/; 846 return $version; 847 } elsif ($BUFR_table{FORMAT} eq 'ECCODES') { 848 # Mainly meant to catch attempts to load a BUFRDC table 849 # with tableformat mistakingly set to ECCODES 850 _croak("$table cannot be an ecCodes table") 851 unless ($table =~ /wmo/ || $table =~ /local/); 852 return $table; 853 } 854 } 855 856 # No table provided. Decide version from section 1 information. 857 # First check that the necessary metadata exist 858 foreach my $metadata (qw(MASTER_TABLE LOCAL_TABLE_VERSION 859 CENTRE SUBCENTRE)) { 860 return undef if ! defined $self->{$metadata}; 861 } 862 863 # If master table version, use centre 0 and subcentre 0 (in ECMWF 864 # BUFRDC this is the convention from version 320 onwards) 865 my $centre = $self->{CENTRE}; 866 my $subcentre = $self->{SUBCENTRE}; 867 my $local_table_version = $self->{LOCAL_TABLE_VERSION}; 868 if ($local_table_version == 0 || $local_table_version == 255) { 869 $centre = 0; 870 $subcentre = 0; 871 $local_table_version = 0; 872 } 873 874 my $master_table = $self->{MASTER_TABLE}; 875 my $master_table_version = $self->{MASTER_TABLE_VERSION}; 876 if ($BUFR_table{FORMAT} eq 'BUFRDC') { 877 # naming convention used in BUFRDC version >= 000270 878 return sprintf "%03d%05d%05d%03d%03d", 879 $master_table,$subcentre,$centre,$master_table_version,$local_table_version; 880 } elsif ($BUFR_table{FORMAT} eq 'ECCODES') { 881 if ($local_table_version == 0) { 882 return catfile($master_table,'wmo',$master_table_version); 883 } else { 884 return catfile($master_table,'wmo',$master_table_version) . ',' . 885 catfile($master_table,'local',$local_table_version,$centre,$subcentre); 886 } 887 } 888} 889 890# Search through $BUFR_table{PATH} to find first path for which $fname 891# exists, or (for BUFRDC) if no such path exists, first path for which the 892# corresponding master file exists, in which case 893# $self->{LOCAL_TABLES_NOT_FOUND} is set to the local table initially 894# searched for (this variable should be undefined as soon as the 895# message is finished processing). Returns empty list if no such path 896# could be found, else returns the path and the table name for which 897# path was found. 898sub _locate_table { 899 my ($self,$fname) = @_; 900 901 _croak "BUFR table path not set, did you forget to call set_tablepath()?" 902 unless $BUFR_table{PATH}; 903 904 my $path; 905 foreach (split /:/, $BUFR_table{PATH}) { 906 if (-e catfile($_, $fname)) { 907 $path = $_; 908 $path =~ s|/$||; 909 return ($path,$fname); 910 } 911 } 912 913 if ($BUFR_table{FORMAT} eq 'BUFRDC') { 914 # Path couldn't be found for $fname. Then try again for master table 915 my $master_table; 916 ($master_table,$path) = $self->_locate_master_table($fname); 917 if ($path) { 918 $self->{LOCAL_TABLES_NOT_FOUND} = $fname; 919 return ($path,$master_table); 920 } 921 } 922 923 # No table found 924 return; 925} 926 927# Return master table and path corresponding to local table $fname, or 928# empty list if $fname actually is a master table or if no path for the 929# master table could be found. 930sub _locate_master_table { 931 my ($self,$fname) = @_; 932 933 my $master_table; 934 if ($BUFR_table{FORMAT} eq 'BUFRDC') { 935 _croak("$fname is not a valid name for BUFRDC tables") 936 if length($fname) < 20; 937 $master_table = substr($fname,0,4) . '00000' . '00000' 938 . substr($fname,14,3) . '000.TXT'; 939 } elsif ($BUFR_table{FORMAT} eq 'ECCODES') { 940 foreach my $metadata (qw(MASTER_TABLE MASTER_TABLE_VERSION)) { 941 return if ! defined $self->{$metadata}; 942 } 943 $master_table = catfile($self->{MASTER_TABLE},'wmo',$self->{MASTER_TABLE_VERSION}); 944 } 945 return if ($master_table eq $fname); # Already tried 946 947 my $path; 948 foreach (split /:/, $BUFR_table{PATH}) { 949 if (-e catfile($_, $master_table)) { 950 $path = $_; 951 $path =~ s|/$||; 952 return ($master_table,$path); 953 } 954 } 955 return; 956} 957 958## Read in a B table file into a hash, e.g. 959## $B_table{'001001'} = "WMO BLOCK NUMBER\0NUMERIC\0 0\0 0\0 7" 960## where the B table values for 001001 are \0 (NUL) separated 961sub _read_B_table_bufrdc { 962 my ($self,$version) = @_; 963 964 my $fname = "B$version.TXT"; 965 my ($path,$tname) = $self->_locate_table($fname) 966 or _croak "Couldn't find BUFR table $fname in $BUFR_table{PATH}." 967 . " Wrong tablepath?"; 968 969 # If we are forced to try master table because local table 970 # couldn't be found, check if this might already have been loaded 971 if ($tname ne $fname) { 972 my $master_version = substr($tname,1,-4); 973 return $BUFR_table{"B$master_version"} if exists $BUFR_table{"B$master_version"}; 974 } 975 976 my $tablefile = catfile($path,$tname); 977 open(my $TABLE, '<', $tablefile) 978 or _croak "Couldn't open BUFR table B $tablefile: $!"; 979 my $txt = "Reading table $tablefile"; 980 $txt .= " (since local table " . $self->{LOCAL_TABLES_NOT_FOUND} 981 . " couldn't be found)" if $self->{LOCAL_TABLES_NOT_FOUND}; 982 $self->_spew(1, "%s", $txt); 983 984 my %B_table; 985 while (<$TABLE>) { 986 my ($s1,$fxy,$s2,$name,$s3,$unit,$s4,$scale,$s5,$refval,$s6,$bits) 987 = unpack('AA6AA64AA24AA3AA12AA3', $_); 988 next unless defined $bits; 989 $name =~ s/\s+$//; 990 $refval =~ s/-\s+(\d+)/-$1/; # Remove blanks between minus sign and value 991 $B_table{$fxy} = join "\0", $name, $unit, $scale, $refval, $bits; 992 } 993 # When installing Geo::BUFR on Windows Vista with Strawberry Perl, 994 # close sometimes returned an empty string. Therefore removed 995 # check on return value for close. 996 close $TABLE; # or _croak "Closing $tablefile failed: $!"; 997 998 $BUFR_table{"B$version"} = \%B_table; 999 return \%B_table; 1000} 1001 1002sub _read_B_table_eccodes { 1003 my ($self,$version) = @_; 1004 1005 my ($path,$tname) = $self->_locate_table(catfile($version,'element.table')); 1006 1007 if (! $path) { 1008 if ($version =~ /wmo/) { 1009 _croak "Couldn't find BUFR table " . catfile($version,'element.table') 1010 . " in $BUFR_table{PATH}. Wrong tablepath?"; 1011 } else { 1012 # This might actually not be an error, since local table 1013 # might be provided for D only. But if later a local 1014 # element descriptor is requested, we should complain 1015 $self->{LOCAL_TABLES_NOT_FOUND} = $version; 1016 return; 1017 } 1018 } 1019 my $tablefile = catfile($path,$tname); 1020 1021 open(my $TABLE, '<', $tablefile) 1022 or _croak "Couldn't open BUFR table B $tablefile: $!"; 1023 $self->_spew(1, "Reading table %s", $tablefile); 1024 1025 my %B_table; 1026 while (<$TABLE>) { 1027 # Skip comments (expexted to be in first line only) 1028 next if /^#/; 1029 1030 # $rest is crex_unit|crex_scale|crex_width 1031 my ($code,$abbreviation,$type,$name,$unit,$scale,$reference,$width,$rest) 1032 = split /[|]/; 1033 next unless defined $width; # shouldn't happen 1034 $unit = 'CCITTIA5' if $unit eq 'CCITT IA5'; 1035 $B_table{$code} = join "\0", $name, $unit, $scale, $reference, $width; 1036 } 1037 close $TABLE; 1038 1039 $BUFR_table{"B$version"} = \%B_table; 1040 return \%B_table; 1041} 1042 1043## Reads a D table file into a hash, e.g. 1044## $D_table->{307080} = '301090 302031 ...' 1045## There are two different types of lines in D*.TXT, e.g. 1046## 307080 13 301090 BUFR template for synoptic reports 1047## 302031 1048## We choose to ignore the number of lines in expansion (here 13) 1049## because this number is sometimes in error. Instead we consider a 1050## line starting with 5 spaces to be of the second type above, else of 1051## the first type 1052sub _read_D_table_bufrdc { 1053 my ($self,$version) = @_; 1054 1055 my $fname = "D$version.TXT"; 1056 my ($path,$tname) = $self->_locate_table($fname) 1057 or _croak "Couldn't find BUFR table $fname in $BUFR_table{PATH}." 1058 . "Wrong tablepath?"; 1059 1060 # If we are forced to try master table because local table 1061 # couldn't be found, check if this might already have been loaded 1062 if ($tname ne $fname) { 1063 my $master_version = substr($tname,1,-4); 1064 return $BUFR_table{"D$master_version"} if exists $BUFR_table{"D$master_version"}; 1065 } 1066 1067 my $tablefile = catfile($path,$tname); 1068 open(my $TABLE, '<', $tablefile) 1069 or _croak "Couldn't open BUFR table D $tablefile: $!"; 1070 my $txt = "Reading table $tablefile"; 1071 $txt .= " (since local table " . $self->{LOCAL_TABLES_NOT_FOUND} 1072 . " couldn't be found)" if $self->{LOCAL_TABLES_NOT_FOUND}; 1073 $self->_spew(1, "%s", $txt); 1074 1075 my (%D_table, $alias); 1076 while (my $line = <$TABLE>) { 1077 $line =~ s/\s+$//; 1078 next if $line =~ /^\s*$/; # Blank line 1079 1080 if (substr($line,0,5) eq ' ' x 5) { 1081 $line =~ s/^\s+//; 1082 $D_table{$alias} .= " $line"; 1083 } else { 1084 $line =~ s/^\s+//; 1085 # In table version 17 a descriptor with more than 100 1086 # entries occurs, causing no space between alias and 1087 # number of entries (so split /\s+/ doesn't work) 1088 my ($ali, $skip, $desc) = unpack('A6A4A6', $line); 1089 $alias = $ali; 1090 $D_table{$alias} = $desc; 1091 } 1092 } 1093 close $TABLE; # or _croak "Closing $tablefile failed: $!"; 1094 1095 $BUFR_table{"D$version"} = \%D_table; 1096 return \%D_table; 1097} 1098 1099sub _read_D_table_eccodes { 1100 my ($self,$version) = @_; 1101 1102 my ($path,$tname) = $self->_locate_table(catfile($version,'sequence.def')); 1103 1104 if (! $path) { 1105 if ($version =~ /wmo/) { 1106 _croak "Couldn't find BUFR table " . catfile($version,'sequence.def') 1107 . " in $BUFR_table{PATH}. Wrong tablepath?"; 1108 } else { 1109 # This might actually not be an error, since local table 1110 # might be provided for B only. But if later a local 1111 # sequence descriptor is requested, we should complain 1112 $self->{LOCAL_TABLES_NOT_FOUND} = $version; 1113 } 1114 return; 1115 } 1116 my $tablefile = catfile($path,$tname); 1117 1118 open(my $TABLE, '<', $tablefile) 1119 or _croak "Couldn't open BUFR table B $tablefile: $!"; 1120 $self->_spew(1, "Reading table %s", $tablefile); 1121 1122## sequence.def is expected to contain lines like 1123#"301196" = [ 301011, 301013, 301021 ] 1124## which should be converted to 1125# 301196 3 301011 1126# 301013 1127# 301021 1128## Must also handle descriptors spanning more than one line, like 1129#"301046" = [ 001007, 001012, 002048, 021119, 025060, 202124, 002026, 002027, 202000, 005040 1130# ] 1131## and 1132#"301058" = [ 301011, 301012, 201152, 202135, 004006, 202000, 201000, 301021, 020111, 020112, 1133# 020113, 020114, 020115, 020116, 020117, 020118, 020119, 025035, 020121, 020122, 1134# 020123, 020124, 025175, 020023, 025063, 202136, 201136, 002121, 201000, 202000, 1135# 025061, 002184, 002189, 025036, 101000, 031002, 301059 ] 1136 my %D_table; 1137 my $txt; 1138 while (<$TABLE>) { 1139 if (substr($_,0,1) eq '"') { 1140 # New sequence descriptor, parse and store the previous 1141 _parse_sequence(\%D_table,$txt) if $txt; 1142 chomp; 1143 $txt = $_; 1144 } else { 1145 chomp; 1146 $txt .= $_; 1147 } 1148 } 1149 _parse_sequence(\%D_table,$txt) if $txt; 1150 1151 close $TABLE; # or _croak "Closing $tablefile failed: $!"; 1152 1153 $BUFR_table{"D$version"} = \%D_table; 1154 return \%D_table; 1155} 1156 1157sub _parse_sequence { 1158 my ($Dtable, $txt) = @_; 1159 1160 my ($seq, $rest) = ($txt =~ /^"(\d{6})" = \[(.*)\]/); 1161 my @list = split(/,/, $rest); 1162 foreach (@list) { 1163 s/^ +//; 1164 s/ +$//; 1165 } 1166 $Dtable->{$seq} = join(' ', @list); 1167} 1168 1169## Read the flag and code tables, which in ECMWF BUFRDC tables are 1170## put in tables C$version.TXT (not to be confused with BUFR C tables, 1171## which contain the operator descriptors). Note that even though 1172## number of code values and number of lines are included in the 1173## tables, we choose to ignore them, because these values are often 1174## found to be in error. Instead we trust that the text starts at 1175## fixed positions in file. Returns reference to the C table, or undef 1176## if failing to open table file. 1177sub _read_C_table { 1178 my ($self,$version) = @_; 1179 1180 # For ECCODES loading 2 different codetables directories might be necessary 1181 if ($BUFR_table{FORMAT} eq 'ECCODES') { 1182 if ($version =~ /,/) { 1183 my ($master, $local) = (split /,/, $version); 1184 $self->_read_C_table_eccodes($master); 1185 return $self->_read_C_table_eccodes($local); 1186 } else { 1187 return $self->_read_C_table_eccodes($version); 1188 } 1189 } 1190 1191 # Rest of code is for BUFRDC 1192 my $fname = "C$version.TXT"; 1193 my ($path,$tname) = $self->_locate_table($fname); 1194 return undef unless $path; 1195 1196 # If we are forced to try master table because local table 1197 # couldn't be found, check if this might already have been loaded 1198 if ($tname ne $fname) { 1199 my $master_version = substr($tname,1,-4); 1200 return $BUFR_table{"C$master_version"} if exists $BUFR_table{"C$master_version"}; 1201 } 1202 1203 my $tablefile = catfile($path,$tname); 1204 open(my $TABLE, '<', $tablefile) 1205 or _croak "Couldn't open BUFR table C $tablefile: $!"; 1206 my $txt = "Reading table $tablefile"; 1207 $txt .= " (since local table " . $self->{LOCAL_TABLES_NOT_FOUND} 1208 . " couldn't be found)" if $self->{LOCAL_TABLES_NOT_FOUND}; 1209 $self->_spew(1, "%s", $txt); 1210 1211 my (%C_table, $table, $value); 1212 while (my $line = <$TABLE>) { 1213 $line =~ s/\s+$//; 1214 next if $line =~ /^\s*$/; # Blank line 1215 1216 if (substr($line,0,15) eq ' ' x 15) { 1217 $line =~ s/^\s+//; 1218 next if $line eq 'NOT DEFINED' || $line eq 'RESERVED'; 1219 $C_table{$table}{$value} .= $line . "\n"; 1220 } elsif (substr($line,0,10) eq ' ' x 10) { 1221 $line =~ s/^\s+//; 1222 my ($val, $nlines, $txt) = split /\s+/, $line, 3; 1223 $value = $val+0; 1224 next if !defined $txt || $txt eq 'NOT DEFINED' || $txt eq 'RESERVED'; 1225 $C_table{$table}{$value} .= $txt . "\n"; 1226 } else { 1227 my ($tbl, $nval, $val, $nlines, $txt) = split /\s+/, $line, 5; 1228 $table = sprintf "%06d", $tbl; 1229 # For tables listed 2 or more times, use last instance only. 1230 # This prevents $txt to be duplicated in $C_table{$table}{$value} 1231 undef $C_table{$table} if defined $C_table{$table}; 1232 $value = $val+0; 1233 next if !defined $txt || $txt eq 'NOT DEFINED' || $txt eq 'RESERVED'; 1234 $C_table{$table}{$value} = $txt . "\n"; 1235 } 1236 } 1237 close $TABLE; # or _croak "Closing $tablefile failed: $!"; 1238 1239 $BUFR_table{"C$version"} = \%C_table; 1240 return \%C_table; 1241} 1242 1243sub _read_C_table_eccodes { 1244 my ($self,$version) = @_; 1245 1246 my ($path,$tname) = $self->_locate_table(catfile($version,'codetables')); 1247 1248 if (! $path) { 1249 if ($version =~ /wmo/) { 1250 _croak "Couldn't find BUFR table " . catfile($version,'element.table') 1251 . " in $BUFR_table{PATH}. Wrong tablepath?" 1252 if (! $path && $version =~ /wmo/); 1253 } else { 1254 # This might actually not be an error, if none of the 1255 # local descriptors are of type code or flag table. So 1256 # prefer to keep silent in this case. 1257 return; 1258 } 1259 } 1260 1261 my $tabledir = catfile($path,$tname); 1262 my $cwd = getcwd(); 1263 chdir $tabledir || croak "Couldn't chdir to $tabledir: $!"; 1264 1265 my @table_files = map { $_->[1] } 1266 sort { $a->[0] <=> $b->[0] } 1267 map { [_get_tableid_eccodes($_), $_] } 1268 glob("*.table"); 1269 $self->_spew(1, "Reading tables in %s", $tabledir) if @table_files; 1270 1271 my %C_table; 1272 foreach my $table_file (@table_files) { 1273 my ($table) = ($table_file =~ /(\d+)\.table$/); 1274 die "Unexpected name of table file: $table_file" unless $table; 1275 $table = sprintf "%06d", $table; 1276 1277 open my $IN, '<', $table_file 1278 or croak "Couldn't open $table_file: $!"; 1279 while (<$IN>) { 1280 chomp; 1281 my ($num, $val, $txt) = split(/ /, $_, 3); 1282 _complain("Unexpected: first 2 fields in $table_file in $tabledir are unequal: $num $val") 1283 if ($Strict_checking and $num ne $val); 1284 1285 # Fix a common problem in ecCodes codetables with long 1286 # lines, hopefully not changing valid use of '"' in local 1287 # tables (e.g. 8/78/0/codetables/8198.table: ""Nebenamtliche"" measurement 1288 $txt =~ s/(?<!")" +//; 1289## $txt =~ s/" +//; 1290 1291 $C_table{$table}{$val} = $txt . "\n"; 1292 } 1293 1294 _complain("$table_file in $tabledir is empty!") 1295 if ($Strict_checking and not $C_table{$table}); 1296 close $IN; 1297 } 1298 chdir $cwd; 1299 1300 $BUFR_table{"C$version"} = \%C_table; 1301 return \%C_table; 1302} 1303 1304sub _get_tableid_eccodes { 1305 my $table_file = shift; 1306 my ($id) = ($table_file =~ /(\d+)\.table$/); 1307 return $id; 1308} 1309 1310 1311sub load_BDtables { 1312 my $self = shift; 1313 my $table = shift || ''; 1314 1315 my $version = $self->{TABLE_VERSION} = $self->get_table_version($table) 1316 or _croak "Not enough info to decide which tables to load"; 1317 1318 if ($BUFR_table{FORMAT} eq 'BUFRDC') { 1319 $self->{B_TABLE} = $BUFR_table{"B$version"} || $self->_read_B_table_bufrdc($version); 1320 $self->{D_TABLE} = $BUFR_table{"D$version"} || $self->_read_D_table_bufrdc($version); 1321 } elsif ($BUFR_table{FORMAT} eq 'ECCODES') { 1322 if ($version =~ /,/) { 1323 my ($master, $local) = (split /,/, $version); 1324 $self->{B_TABLE} = $BUFR_table{"B$master"} || $self->_read_B_table_eccodes($master); 1325 $self->{D_TABLE} = $BUFR_table{"D$master"} || $self->_read_D_table_eccodes($master); 1326 1327 # Append local table to the master table (should work even if empty) 1328 my $local_Btable = (exists($BUFR_table{"B$local"})) ? $BUFR_table{"B$local"} 1329 : $self->_read_B_table_eccodes($local); 1330 @{$self->{B_TABLE}}{ keys %$local_Btable } = values %$local_Btable; 1331 my $local_Dtable = (exists($BUFR_table{"D$local"})) ? $BUFR_table{"D$local"} 1332 : $self->_read_D_table_eccodes($local); 1333 @{$self->{D_TABLE}}{ keys %$local_Dtable } = values %$local_Dtable;; 1334 1335 } else { 1336 $self->{B_TABLE} = $BUFR_table{"B$version"} || $self->_read_B_table_eccodes($version); 1337 $self->{D_TABLE} = $BUFR_table{"D$version"} || $self->_read_D_table_eccodes($version); 1338 } 1339 } 1340 return $version; 1341} 1342 1343sub load_Ctable { 1344 my $self = shift; 1345 my $table = shift || ''; 1346 my $default_table = shift || ''; 1347 1348 my $version = $self->get_table_version($table) || ''; 1349 _croak "Not enough info to decide which C table to load" 1350 if not $version and not $default_table; 1351 1352 if ($BUFR_table{FORMAT} eq 'BUFRDC') { 1353 $self->{C_TABLE} = $BUFR_table{"C$version"} || $self->_read_C_table($version); 1354 } elsif ($BUFR_table{FORMAT} eq 'ECCODES') { 1355 if ($version =~ /,/) { 1356 my ($master, $local) = (split /,/, $version); 1357 $self->{C_TABLE} = $BUFR_table{"$master"} || $self->_read_C_table($master); 1358 1359 # Append local table to the master table (should work even if empty) 1360 my $local_Ctable = (exists($BUFR_table{"C$local"})) ? $BUFR_table{"C$local"} 1361 : $self->_read_C_table_eccodes($local); 1362 @{$self->{C_TABLE}}{ keys %$local_Ctable } = values %$local_Ctable; 1363 1364 } else { 1365 $self->{C_TABLE} = $BUFR_table{"C$version"} || $self->_read_C_table_eccodes($version); 1366 } 1367 } 1368 1369 if ($default_table and not $self->{C_TABLE}) { 1370 # Was not able to load $table. Try $default_table instead. 1371 $version = $self->get_table_version($default_table); 1372 _croak "Not enough info to decide which C table to load" 1373 if not $version; 1374 if ($BUFR_table{FORMAT} eq 'BUFRDC') { 1375 $self->{C_TABLE} = $BUFR_table{"C$version"} || $self->_read_C_table($version); 1376 } else { 1377 $self->{C_TABLE} = $BUFR_table{"C$version"} || $self->_read_C_table_eccodes($version); 1378 } 1379 } 1380 if (not $self->{C_TABLE}) { 1381 if ($BUFR_table{FORMAT} eq 'BUFRDC') { 1382 _croak "Unable to load C table (C$version.TXT)"; 1383 } else { 1384 _croak "Unable to load codetables for $version"; 1385 } 1386 } 1387 1388 return $version; 1389} 1390 1391 1392## Specify BUFR file to read 1393sub fopen { 1394 my $self = shift; 1395 my $filename = shift 1396 or _croak "fopen() called without an argument"; 1397 _croak "File $filename doesn't exist!" unless -e $filename; 1398 _croak "$filename is not a plain file" unless -f $filename; 1399 1400 # Open file for reading 1401 $self->{FILEHANDLE} = new FileHandle; 1402 open $self->{FILEHANDLE}, '<', $filename 1403 or _croak "Couldn't open file $filename for reading"; 1404 1405 $self->_spew(2, "File %s opened for reading", $filename); 1406 1407 # For some OS this is necessary 1408 binmode $self->{FILEHANDLE}; 1409 1410 $self->{FILENAME} = $filename; 1411 return 1; 1412} 1413 1414sub fclose { 1415 my $self = shift; 1416 if ($self->{FILEHANDLE}) { 1417 close $self->{FILEHANDLE} 1418 or _croak "Couldn't close BUFR file opened by fopen()"; 1419 $self->_spew(2, "Closed file %s", $self->{FILENAME}); 1420 } 1421 delete $self->{FILEHANDLE}; 1422 delete $self->{FILENAME}; 1423 # Much more might be considered deleted here, but usually the bufr 1424 # object goes out of scope immediately after a fclose anyway 1425 return 1; 1426} 1427 1428sub eof { 1429 my $self = shift; 1430 return ($self->{EOF} || 0); 1431} 1432 1433# Go to start of input buffer or start of file associated with the object 1434sub rewind { 1435 my $self = shift; 1436 if (exists $self->{FILEHANDLE}) { 1437 seek $self->{FILEHANDLE}, 0, 0 or _croak "Cannot seek: $!"; 1438 } elsif (! $self->{IN_BUFFER}) { 1439 _croak "Cannot rewind: no file or input buffer associated with this object"; 1440 } 1441 $self->{CURRENT_MESSAGE} = 0; 1442 $self->{CURRENT_SUBSET} = 0; 1443 delete $self->{START_POS}; 1444 delete $self->{POS}; 1445 delete $self->{EOF}; 1446 return 1; 1447} 1448 1449## Read in next BUFR message from file if $self->{FILEHANDLE} is set, 1450## else from $self->{IN_BUFFER} (string argument to 1451## constructor). Decodes section 0 and sets $self->{START_POS} to 1452## start of message and $self->{POS} to end of BUFR message (or after 1453## first 8 bytes of truncated/corrupt BUFR message for which we still 1454## want to attempt decoding). $self->{CURRENT_AHL} is updated if a 1455## GTS ahl is found (implemented for file reading only), and 1456## $self->{EOF} is set if no more 'BUFR' in file/buffer. Croaks if an 1457## error occurs when reading BUFR message. 1458 1459## Returns BUFR message from section 1 on, or undef if no BUFR message 1460## is found. 1461sub _read_message { 1462 my $self = shift; 1463 1464 my $filehandle = $self->{FILEHANDLE} ? $self->{FILEHANDLE} : undef; 1465 my $in_buffer = $self->{IN_BUFFER} ? $self->{IN_BUFFER} : undef; 1466 _croak "_read_message: Neither BUFR file nor BUFR text is given" 1467 unless $filehandle or $in_buffer; 1468 1469 # Locate next 'BUFR' and set $pos to this position in file/string, 1470 # also finding corresponding GTS ahl if exists (for file 1471 # only). Possibly sets $self->{EOF} 1472 my $pos = defined $self->{POS} ? $self->{POS} : 0; 1473 my $ahl; 1474 ($pos, $ahl) = $self->_find_next_BUFR($filehandle,$in_buffer,$pos,''); 1475 return if $pos < 0; 1476 $self->{REUSED_CURRENT_AHL} = 0; 1477 if ($ahl) { 1478 $self->{CURRENT_AHL} = $ahl; 1479 } elsif (! $Reuse_current_ahl) { 1480 $self->{CURRENT_AHL} = undef; 1481 } elsif (defined $self->{CURRENT_AHL}) { 1482 $self->{REUSED_CURRENT_AHL} = 1; 1483 } 1484 1485 # Remember start position of BUFR message in case we need to 1486 # rewind later because length of BUFR cannot be trusted 1487 $self->{START_POS} = $pos; 1488 1489 # Report (if verbose setting) where we found the BUFR message 1490 $self->_spew(2, "BUFR message at position %d", $pos) if $Spew; 1491 1492 # Read (rest) of Section 0 (length of BUFR message and edition number) 1493 my $sec0; # Section 0 is BUFR$sec0 1494 if ($filehandle) { 1495 if ((read $filehandle, $sec0, 8) != 8) { 1496 $self->{EOF} = 1; 1497 _croak "Error reading section 0 in file '$self->{FILENAME}', position " 1498 . tell($filehandle); 1499 } 1500 $sec0 = substr $sec0, 4; 1501 } else { 1502 if (length($in_buffer) < $pos+8) { 1503 $self->{EOF} = 1; 1504 _croak "Error reading section 0: this is not a BUFR message?" 1505 } 1506 $sec0 = substr $in_buffer, $pos+4, 4; 1507 } 1508 $self->{SEC0_STREAM} = "BUFR$sec0"; 1509 1510 # Extract length and edition number 1511 my ($length, $edition) = unpack 'NC', "\0$sec0"; 1512 $self->{BUFR_LENGTH} = $length; 1513 $self->{BUFR_EDITION} = $edition; 1514 $self->_spew(2, "Message length: %d, Edition: %d", $length, $edition) if $Spew; 1515 _croak "Cannot handle BUFR edition $edition" if $edition < 2 || $edition > 4; 1516 1517 # Read rest of BUFR message (section 1-5) 1518 my $msg; 1519 my $msgisOK = 1; 1520 if ($filehandle) { 1521 if ((read $filehandle, $msg, $length-8) != $length-8) { 1522 # Probably a corrupt or truncated BUFR message. We choose 1523 # to decode as much as possible (maybe the length in 1524 # section 0 is all that is wrong), but obviously we cannot 1525 # trust the stated length of BUFR message, so reset 1526 # position of filehandle to just after section 0 1527 $self->{BAD_LENGTH} = 1; 1528 $msgisOK = 0; 1529 seek $filehandle, $pos+8, 0; 1530 $self->_spew(2, "Danger: file %s not big enough to contain the stated" 1531 . " length of BUFR message", $self->{FILENAME}); 1532 $pos += 8; 1533 } else { 1534 $pos = tell($filehandle); 1535 if (substr($msg, -4) ne '7777') { 1536 $self->{BAD_LENGTH} = 1; 1537 $self->_spew(2, "Danger: BUFR length in sec 0 can't be correct, " 1538 . "last 4 bytes are not '7777'"); 1539 } 1540 } 1541 } else { 1542 if (length($in_buffer) < $pos+$length) { 1543 $self->{BAD_LENGTH} = 1; 1544 $msgisOK = 0; 1545 $self->_spew(2, "Danger: buffer not big enough " 1546 . "to contain the stated length of BUFR message"); 1547 $msg = substr $in_buffer, $pos+8, $length-8; 1548 $pos += 8; 1549 } else { 1550 $msg = substr $in_buffer, $pos+8, $length-8; 1551 $pos += $length; 1552 if (substr($msg, -4) ne '7777') { 1553 $self->{BAD_LENGTH} = 1; 1554 $self->_spew(2, "Danger: BUFR length in sec 0 can't be correct, " 1555 . "last 4 bytes are not '7777'"); 1556 } 1557 } 1558 } 1559 if ($Spew) { 1560 if ($msgisOK) { 1561 $self->_spew(2, "Successfully read BUFR message; position now %d", $pos); 1562 } else { 1563 $self->_spew(2, "Resetting position to %d", $pos); 1564 } 1565 } 1566 1567 # Reset $self->{POS} to end of BUFR message (or after first 8 1568 # bytes of truncated/corrupt BUFR message) 1569 $self->{POS} = $pos; 1570 1571 return $msg; 1572} 1573 1574my $ahl_regex = qr{[A-Z]{4}\d\d [A-Z]{4} \d{6}(?: (?:(?:RR|CC|AA|PA)[A-Z])| COR| RTD)?}; 1575# BBB=Pxx (segmentation) was allowed until 2007, but at least one 1576# centre still uses PAA as of 2014. COR and RTD shouldn't be 1577# allowed (from ?), but are still used 1578 1579## Advance to first occurrence of 'BUFR', or to the possibly preceding 1580## GTS ahl if this is requested in $at. Returns the new position and 1581## (if called in array context) the possibly preceding ahl. If no 1582## 'BUFR' is found, sets $self->{EOF} and returns -1 for the new 1583## position. 1584sub _find_next_BUFR { 1585 my $self = shift; 1586 my ($filehandle, $in_buffer, $pos, $at) = @_; 1587 1588 my ($new_pos, $ahl); 1589 if ($filehandle) { 1590 my $oldeol = $/; 1591 $/ = "BUFR"; 1592 my $slurp = <$filehandle> || ' '; 1593 $/ = $oldeol; 1594 if (CORE::eof($filehandle) or substr($slurp,-4) ne 'BUFR') { 1595 $self->{EOF} = 1; 1596 } else { 1597 # Get the GTS ahl (TTAAii CCCC DTG [BBB]) before 'BUFR', 1598 # if present. Use '\n+' not '\n' since adding an extra 1599 # '\n' in bulletin has been seen. Allow also for not 1600 # including \r\r (which might be how the bulletin file was 1601 # prepared originally, or might catch cases where ahl is 1602 # mistakingly included twice) 1603 my $reset = 4; 1604 if ($slurp =~ /(${ahl_regex})((?:\r\r)?\n+BUFR)$/) { 1605 $ahl = $1; 1606 # Don't use lenght($&), since this slows down execution for 1607 # Perl 5.16 or earlier. See the WARNING at the end of 1608 # the Capture Buffers section of the perlre documentation 1609 $reset = length($1) + length($2) if $at eq 'at_ahl'; 1610 1611 $self->_spew(2,"GTS ahl found: %s",$ahl) if $Spew; 1612 } 1613 # Reset position of filehandle to just before 'BUFR', or 1614 # if requested, before possible preceding AHL 1615 seek($filehandle, -$reset, 1); 1616 $new_pos = tell $filehandle; 1617 } 1618 } else { 1619 $new_pos = index($in_buffer, 'BUFR', $pos); 1620 if ($new_pos < 0) { 1621 $self->{EOF} = 1; 1622 } else { 1623 if (substr($in_buffer,$pos,$new_pos-$pos) =~ /(${ahl_regex})((?:\r\r)?\n+)$/) { 1624 $ahl = $1; 1625 $self->_spew(2,"GTS ahl found: %s",$ahl) if $Spew; 1626 if ($at eq 'at_ahl') { 1627 $new_pos -= length($1) + length($2); 1628 } 1629 } 1630 } 1631 } 1632 1633 if ($self->{EOF}) { 1634 if ($pos == 0) { 1635 if ($filehandle) { 1636 $self->_spew(2,"No BUFR message in file %s",$self->{FILENAME}) 1637 if $Spew; 1638 } else { 1639 $self->_spew(2, "No BUFR message found") if $Spew; 1640 } 1641 } 1642 return -1; 1643 } 1644 1645 return wantarray ? ($new_pos,$ahl) : $new_pos; 1646} 1647 1648## Returns the BUFR message in raw (binary) form, '' if errors encountered 1649sub get_bufr_message { 1650 my $self = shift; 1651 1652 if ($self->{BAD_LENGTH} || $self->{ERROR_IN_MESSAGE}) { 1653 $self->_spew(2, "Skipping erroneous BUFR message"); 1654 return ''; 1655 } 1656 if (!$self->{FILEHANDLE} && !$self->{IN_BUFFER}) { 1657 $self->_spew(2, "No file or input buffer associated with this object"); 1658 return ''; 1659 } 1660 if (!exists $self->{START_POS} || !$self->{BUFR_LENGTH}) { 1661 $self->_spew(2, "No bufr message to return"); 1662 return ''; 1663 } 1664 1665 my $msg; 1666 if (exists $self->{FILEHANDLE}) { 1667 my $fh = $self->{FILEHANDLE}; 1668 my $old_pos = tell($fh); 1669 seek($fh, $self->{START_POS}, 0); 1670 read($fh, $msg, $self->{BUFR_LENGTH}); 1671 seek($fh, $old_pos, 0); 1672 $self->_spew(2, "BUFR message extracted from file"); 1673 } elsif (exists $self->{IN_BUFFER}) { 1674 $msg = substr $self->{IN_BUFFER}, $self->{START_POS}, $self->{BUFR_LENGTH}; 1675 $self->_spew(2, "BUFR message extracted"); 1676 } 1677 1678 return $msg; 1679} 1680 1681## Decode section 1 to 5. Section 0 is already decoded in _read_message. 1682sub _decode_sections { 1683 my $self = shift; 1684 my $msg = shift; 1685 1686 $self->{BUFR_STREAM} = $msg; 1687 $self->{SEC1_STREAM} = undef; 1688 $self->{SEC2_STREAM} = undef; 1689 $self->{SEC3_STREAM} = undef; 1690 $self->{SEC4_STREAM} = undef; 1691 $self->{SEC5_STREAM} = undef; 1692 1693 # Breaking the rule that all debugging should be on lines starting 1694 # with 'BUFR.pm:', therefore using $verbose=6 1695 $self->_spew(6, "%s", $self->dumpsection0()) if $Spew; 1696 1697 ## Decode Section 1 (Identification Section) ## 1698 1699 $self->_spew(2, "Decoding section 1") if $Spew; 1700 1701 # Extract Section 1 information 1702 if ($self->{BUFR_EDITION} < 4) { 1703 # N means 4 byte integer, so put an extra null byte ('\0') in 1704 # front of string to get first 3 bytes as integer 1705 my @sec1 = unpack 'NC14', "\0" . $self->{BUFR_STREAM}; 1706 1707 # Check that stated length of section 1 makes sense 1708 _croak "Length of section 1 too small (< 17): $sec1[0]" 1709 if $sec1[0] < 17; 1710 _croak "Rest of BUFR message shorter (" . length($self->{BUFR_STREAM}) 1711 . " bytes) than stated length of section 1 ($sec1[0] bytes)" 1712 if $sec1[0] > length($self->{BUFR_STREAM}); 1713 1714 push @sec1, (unpack 'a*', substr $self->{BUFR_STREAM},17,$sec1[0]-17); 1715 $self->{SEC1_STREAM} = substr $self->{BUFR_STREAM}, 0, $sec1[0]; 1716 $self->{BUFR_STREAM} = substr $self->{BUFR_STREAM}, $sec1[0]; 1717 $self->{SEC1} = \@sec1; 1718 $self->{MASTER_TABLE} = $sec1[1]; 1719 $self->{SUBCENTRE} = $sec1[2]; 1720 $self->{CENTRE} = $sec1[3]; 1721 $self->{UPDATE_NUMBER} = $sec1[4]; 1722 $self->{OPTIONAL_SECTION} = vec($sec1[5] & 0x80,0,1); # 1. bit 1723 $self->{DATA_CATEGORY} = $sec1[6]; 1724 $self->{DATA_SUBCATEGORY} = $sec1[7]; 1725 $self->{MASTER_TABLE_VERSION} = $sec1[8]; 1726 $self->{LOCAL_TABLE_VERSION} = $sec1[9]; 1727 $self->{YEAR_OF_CENTURY} = $sec1[10]; 1728 $self->{MONTH} = $sec1[11]; 1729 $self->{DAY} = $sec1[12]; 1730 $self->{HOUR} = $sec1[13]; 1731 $self->{MINUTE} = $sec1[14]; 1732 $self->{LOCAL_USE} = $sec1[15]; 1733 # In case previous message was edition 4 1734 foreach my $key (qw(INT_DATA_SUBCATEGORY LOC_DATA_SUBCATEGORY 1735 YEAR SECOND)) { 1736 undef $self->{$key}; 1737 } 1738 } elsif ($self->{BUFR_EDITION} == 4) { 1739 my @sec1 = unpack 'NCnnC7nC5', "\0" . $self->{BUFR_STREAM}; 1740 1741 # Check that stated length of section 1 makes sense 1742 _croak "Length of section 1 too small (< 22): $sec1[0]" 1743 if $sec1[0] < 22; 1744 _croak "Rest of BUFR message shorter (" . length($self->{BUFR_STREAM}) 1745 . " bytes) than stated length of section 1 ($sec1[0] bytes)" 1746 if $sec1[0] > length($self->{BUFR_STREAM}); 1747 1748 push @sec1, (unpack 'a*', substr $self->{BUFR_STREAM},22,$sec1[0]-22); 1749 $self->{SEC1_STREAM} = substr $self->{BUFR_STREAM}, 0, $sec1[0]; 1750 $self->{BUFR_STREAM} = substr $self->{BUFR_STREAM}, $sec1[0]; 1751 $self->{SEC1} = \@sec1; 1752 $self->{MASTER_TABLE} = $sec1[1]; 1753 $self->{CENTRE} = $sec1[2]; 1754 $self->{SUBCENTRE} = $sec1[3]; 1755 $self->{UPDATE_NUMBER} = $sec1[4]; 1756 $self->{OPTIONAL_SECTION} = vec($sec1[5] & 0x80,0,1); # 1. bit 1757 $self->{DATA_CATEGORY} = $sec1[6]; 1758 $self->{INT_DATA_SUBCATEGORY} = $sec1[7]; 1759 $self->{LOC_DATA_SUBCATEGORY} = $sec1[8]; 1760 $self->{MASTER_TABLE_VERSION} = $sec1[9]; 1761 $self->{LOCAL_TABLE_VERSION} = $sec1[10]; 1762 $self->{YEAR} = $sec1[11]; 1763 $self->{MONTH} = $sec1[12]; 1764 $self->{DAY} = $sec1[13]; 1765 $self->{HOUR} = $sec1[14]; 1766 $self->{MINUTE} = $sec1[15]; 1767 $self->{SECOND} = $sec1[16]; 1768 $self->{LOCAL_USE} = ($sec1[0] > 22) ? $sec1[17] : undef; 1769 # In case previous message was edition 3 or lower 1770 foreach my $key (qw(DATA_SUBCATEGORY YEAR_OF_CENTURY)) { 1771 undef $self->{$key}; 1772 } 1773 } 1774 $self->_spew(2, "BUFR edition: %d Optional section: %d Update sequence number: %d", 1775 $self->{BUFR_EDITION}, $self->{OPTIONAL_SECTION}, $self->{UPDATE_NUMBER}) if $Spew; 1776 $self->_spew(6, "%s", $self->dumpsection1()) if $Spew; 1777 1778 $self->_validate_datetime() if ($Strict_checking); 1779 1780 ## Decode Section 2 (Optional Section) if present ## 1781 1782 $self->_spew(2, "Decoding section 2") if $Spew; 1783 1784 if ($self->{OPTIONAL_SECTION}) { 1785 my @sec2 = unpack 'N', "\0" . $self->{BUFR_STREAM}; 1786 1787 # Check that stated length of section 2 makes sense 1788 _croak "Length of section 2 too small (< 4): $sec2[0]" 1789 if $sec2[0] < 4; 1790 _croak "Rest of BUFR message shorter (" . length($self->{BUFR_STREAM}) 1791 . " bytes) than stated length of section 2 ($sec2[0] bytes)" 1792 if $sec2[0] > length($self->{BUFR_STREAM}); 1793 1794 push @sec2, substr $self->{BUFR_STREAM}, 4, $sec2[0]-4; 1795 $self->{SEC2_STREAM} = substr $self->{BUFR_STREAM}, 0, $sec2[0]; 1796 $self->{BUFR_STREAM} = substr $self->{BUFR_STREAM}, $sec2[0]; 1797 $self->{SEC2} = \@sec2; 1798 $self->_spew(2, "Length of section 2: %d", $sec2[0]) if $Spew; 1799 } else { 1800 $self->{SEC2} = undef; 1801 $self->{SEC2_STREAM} = undef; 1802 } 1803 1804 ## Decode Section 3 (Data Description Section) ## 1805 1806 $self->_spew(2, "Decoding section 3") if $Spew; 1807 1808 my @sec3 = unpack 'NCnC', "\0".$self->{BUFR_STREAM}; 1809 1810 # Check that stated length of section 3 makes sense 1811 _croak "Length of section 3 too small (< 8): $sec3[0]" 1812 if $sec3[0] < 8; 1813 _croak "Rest of BUFR message shorter (" . length($self->{BUFR_STREAM}) 1814 . " bytes) than stated length of section 3 ($sec3[0] bytes)" 1815 if $sec3[0] > length($self->{BUFR_STREAM}); 1816 1817 push @sec3, substr $self->{BUFR_STREAM},7,($sec3[0]-7)&0x0ffe; # $sec3[0]-7 will be reduced by one if odd integer, 1818 # so will not push last byte if length of sec3 is even, 1819 # which might happen for BUFR edition < 4 (padding byte) 1820 $self->{SEC3_STREAM} = substr $self->{BUFR_STREAM}, 0, $sec3[0]; 1821 $self->{BUFR_STREAM} = substr $self->{BUFR_STREAM}, $sec3[0]; 1822 1823 $self->{SEC3} = \@sec3; 1824 $self->{NUM_SUBSETS} = $sec3[2]; 1825 $self->{OBSERVED_DATA} = vec($sec3[3] & 0x80,0,1); # extract 1. bit 1826 $self->{COMPRESSED_DATA} = vec($sec3[3] & 0x40,1,1); # extract 2. bit 1827 $self->_spew(2, "Length of section 3: %d", $sec3[0]) if $Spew; 1828 $self->_spew(2, "Number of subsets: %d Observed data: %d Compressed data: %d", 1829 $self->{NUM_SUBSETS}, $self->{OBSERVED_DATA}, $self->{COMPRESSED_DATA}) if $Spew; 1830 _complain("0 subsets in BUFR message") 1831 if ($Strict_checking and $self->{NUM_SUBSETS} == 0); 1832 _complain("Bits 3-8 in octet 7 in section 3 are not 0 (octet 7 = $sec3[3])") 1833 if ($Strict_checking and ($sec3[3] & 0x3f) != 0); 1834 if ($Spew == 6 || $Nodata) { 1835 my @unexpanded = _int2fxy(unpack 'n*', $self->{SEC3}[4]); 1836 $self->{DESCRIPTORS_UNEXPANDED} = @unexpanded ? 1837 join(' ', @unexpanded) : ''; 1838 $self->_spew(6, "%s", $self->dumpsection3()); 1839 } 1840 1841 $self->{IS_FILTERED} = defined $self->{FILTER_CB} 1842 ? $self->{FILTER_CB}->(@{$self->{FILTER_ARGS}}) : 0; 1843 return if $self->{IS_FILTERED} || $Nodata; 1844 1845 ## Decode Section 4 (Data Section) ## 1846 1847 $self->_spew(2, "Decoding section 4") if $Spew; 1848 1849 my $sec4_len = unpack 'N', "\0$self->{BUFR_STREAM}"; 1850 $self->_spew(2, "Length of section 4: %d", $sec4_len) if $Spew; 1851 1852 # Check that stated length of section 4 makes sense 1853 _croak "Length of section 4 too small (< 4): $sec4_len" 1854 if $sec4_len < 4; 1855 _croak "Rest of BUFR message (" . length($self->{BUFR_STREAM}) . " bytes)" 1856 . " shorter than stated length of section 4 ($sec4_len bytes)." 1857 . " Probably the BUFR message is truncated" 1858 if $sec4_len > length($self->{BUFR_STREAM}); 1859 1860 $self->{SEC4_STREAM} = substr $self->{BUFR_STREAM}, 0, $sec4_len; 1861 $self->{SEC4_RAWDATA} = substr $self->{BUFR_STREAM}, 4, $sec4_len-4; 1862 $self->{BUFR_STREAM} = substr $self->{BUFR_STREAM}, $sec4_len; 1863 1864 ## Decode Section 5 (End Section) ## 1865 1866 $self->_spew(2, "Decoding section 5") if $Spew; 1867 1868 # Next 4 characters should be '7777' and these should be end of 1869 # message, but allow more characters (i.e. length of message in 1870 # section 0 has been set too big) if $Strict_checking not set 1871 my $str = $self->{BUFR_STREAM}; 1872 my $len = length($str); 1873 if ($len > 4 1874 || ($len == 4 && substr($str,0,4) ne '7777')) { 1875 my $err_msg = "Section 5 is not '7777' but the $len" 1876 . " characters (in hex): " 1877 . join(' ', map {sprintf "0x%02X", $_} unpack('C*', $str)); 1878 if ($len > 4 && substr($str,0,4) eq '7777') { 1879 _complain($err_msg); 1880 } elsif ($len == 4 && substr($str,0,4) ne '7777') { 1881 _croak($err_msg); 1882 } 1883 } 1884 1885 return; 1886} 1887 1888## Read next BUFR message and decode. Set $self->{ERROR_IN_MESSAGE} if 1889## anything goes seriously wrong, so that sub next_observation can use 1890## this to skip to next message if user chooses to trap the call to 1891## next_observation in an eval and then calls next_observation again. 1892sub _next_message { 1893 my $self = shift; 1894 1895 $self->_spew(2, "Reading next BUFR message") if $Spew; 1896 1897 $self->{ERROR_IN_MESSAGE} = 0; 1898 $self->{BAD_LENGTH} = 0; 1899 1900 my $msg; 1901 eval { 1902 # Read BUFR message and decode section 0 (needed to get length 1903 # of message) 1904 $msg = $self->_read_message(); 1905 1906 # Unpack section 1-5 1907 $self->_decode_sections($msg) if $msg; 1908 }; 1909 if ($@) { 1910 $self->{ERROR_IN_MESSAGE} = 1; 1911 $self->{CURRENT_MESSAGE}++; 1912 die $@; # Could use croak, but then 2 "at ... line ..." will 1913 # be printed to STDERR 1914 } 1915 if (!$msg) { 1916 # Nothing to decode. $self->{EOF} should have been set 1917 $self->_spew(2, "No more BUFR messages found") if $Spew; 1918 return; 1919 } 1920 1921 $self->{CURRENT_MESSAGE}++; 1922 1923 return if $Nodata || $self->{IS_FILTERED}; 1924 1925 # Load the relevant code tables 1926 my $table_version; 1927 eval { $table_version = $self->load_BDtables() }; 1928 if ($@) { 1929 $self->{ERROR_IN_MESSAGE} = 1; 1930 die $@; 1931 } 1932 1933 # Get the data descriptors and expand them 1934 my @unexpanded = _int2fxy(unpack 'n*', $self->{SEC3}[4]); 1935 _croak "No data description in section 3" if !defined $unexpanded[0]; 1936 # Using master table because local tables couldn't be found is 1937 # risky, so catch missing descriptors here to be able to give 1938 # informative error messages 1939 $self->_check_descriptors(\@unexpanded) if $self->{LOCAL_TABLES_NOT_FOUND}; 1940 $self->{DESCRIPTORS_UNEXPANDED} = join ' ', @unexpanded; 1941 $self->_spew(2, "Unexpanded data descriptors: %s", $self->{DESCRIPTORS_UNEXPANDED}) if $Spew; 1942 1943 $self->_spew(2, "Expanding data descriptors") if $Spew; 1944 my $alias = "$table_version " . $self->{DESCRIPTORS_UNEXPANDED}; 1945 if (exists $Descriptors_already_expanded{$alias}) { 1946 $self->{DESCRIPTORS_EXPANDED} = $Descriptors_already_expanded{$alias}; 1947 } else { 1948 eval { 1949 $Descriptors_already_expanded{$alias} = $self->{DESCRIPTORS_EXPANDED} 1950 = join " ", _expand_descriptors($self->{D_TABLE}, @unexpanded); 1951 }; 1952 if ($@) { 1953 $self->{ERROR_IN_MESSAGE} = 1; 1954 die $@; 1955 } 1956 } 1957 1958 # Unpack data from bitstream 1959 $self->_spew(2, "Unpacking data") if $Spew; 1960 eval { 1961 if ($self->{COMPRESSED_DATA}) { 1962 $self->_decompress_bitstream(); 1963 } else { 1964 $self->_decode_bitstream(); 1965 } 1966 }; 1967 if ($@) { 1968 $self->{ERROR_IN_MESSAGE} = 1; 1969 die $@; 1970 } 1971 1972 return; 1973} 1974 1975## Check if all element and sequence descriptors given are found in 1976## B/D-tables (but skip check for those preceded by 206-operator) 1977sub _check_descriptors { 1978 my ($self,$unexpanded) = @_; 1979 1980 my $B_table = $self->{B_TABLE}; 1981 my $D_table = $self->{D_TABLE}; 1982 my $skip_next = 0; 1983 foreach my $id (@{$unexpanded}) { 1984 # Skip descriptors preceded by 206-operator 1985 if ($skip_next) { 1986 $skip_next = 0; 1987 } elsif (substr($id,0,3) eq '206') { 1988 $skip_next = 1; 1989 } elsif ( (substr($id,0,1) eq '0' && ! exists $B_table->{$id}) 1990 || (substr($id,0,1) eq '3' && ! exists $D_table->{$id}) ) { 1991 my $version = ($BUFR_table{FORMAT} eq 'BUFRDC') 1992 ? substr($self->{LOCAL_TABLES_NOT_FOUND},1,-4) 1993 : $self->{LOCAL_TABLES_NOT_FOUND}; 1994 undef $BUFR_table{"B$version"}; 1995 undef $BUFR_table{"D$version"}; 1996 $self->{ERROR_IN_MESSAGE} = 1; 1997 _croak("Data descriptor $id is not in master table." 1998 . " You need to get the local tables B/D$version.TXT"); 1999 } 2000 } 2001 return; 2002} 2003 2004## Get next observation, i.e. next subset in current BUFR message or 2005## first subset in next message. Returns (reference to) data and 2006## descriptors, or empty list if either no observation is found (in 2007## which case $self->{EOF} should have been set) or if decoding of 2008## section 4 is not requested (in which case all of sections 0-3 have 2009## been decoded in next message). 2010sub next_observation { 2011 my $self = shift; 2012 2013 $self->_spew(2, "Fetching next observation") if $Spew; 2014 2015 # If an error occurred during decoding of previous message, we 2016 # don't know if stated length in section 0 is to be trusted, 2017 # so rewind to next 'BUFR', or setting EOF if no such exists 2018 if ($self->{ERROR_IN_MESSAGE}) { 2019 # First rewind to right after 'BUFR' in previous (faulty) 2020 # message. We cannot go further if file/buffer starts as 2021 # 'BUFRBUFR' 2022 my $pos = $self->{START_POS} + 4; 2023 seek($self->{FILEHANDLE}, $pos, 0) if $self->{FILEHANDLE}; 2024 $self->_spew(2, "Error in processing BUFR message (check STDERR for " 2025 . "details), rewinding to next 'BUFR'") if $Spew; 2026 # Prepare for (a possible) next call to _read_message by 2027 # advancing to next 'BUFR', not skipping a preceding ahl 2028 my $new_pos = $self->_find_next_BUFR($self->{FILEHANDLE}, 2029 $self->{IN_BUFFER},$pos,'at_ahl'); 2030 if ($self->{EOF}) { 2031 $self->_spew(2, "Last BUFR message (reached end of file)") if $Spew; 2032 return; 2033 } else { 2034 $self->{POS} = $new_pos; 2035 } 2036 } 2037 2038 # Read next BUFR message 2039 if ($self->{CURRENT_MESSAGE} == 0 2040 or $self->{ERROR_IN_MESSAGE} 2041 or $self->{CURRENT_SUBSET} >= $self->{NUM_SUBSETS}) { 2042 2043 $self->{CURRENT_SUBSET} = 0; 2044 # The bit maps must be rebuilt for each message 2045 undef $self->{BITMAPS}; 2046 undef $self->{BITMAP_OPERATORS}; 2047 undef $self->{BITMAP_START}; 2048 undef $self->{REUSE_BITMAP}; 2049 $self->{NUM_BITMAPS} = 0; 2050 $self->{BACKWARD_DATA_REFERENCE} = 1; 2051 # Some more tidying after decoding of previous message might 2052 # be necessary 2053 $self->{NUM_CHANGE_OPERATORS} = 0; 2054 undef $self->{CHANGE_WIDTH}; 2055 undef $self->{CHANGE_CCITTIA5_WIDTH}; 2056 undef $self->{CHANGE_SCALE}; 2057 undef $self->{CHANGE_REFERENCE_VALUE}; 2058 undef $self->{NEW_REFVAL_OF}; 2059 undef $self->{CHANGE_SRW}; 2060 undef $self->{ADD_ASSOCIATED_FIELD}; 2061 undef $self->{LOCAL_TABLES_NOT_FOUND}; 2062 undef $self->{DATA}; 2063 undef $self->{DESC}; 2064 # Note that we should NOT undef metadata in section 1-3 here, 2065 # since if the next call (_next_message) finds no more 2066 # messages, we don't want to lose the metadata of the last 2067 # valid message extracted. sub join_subsets is based on this 2068 # assumption 2069 2070 $self->_next_message(); 2071 return if $self->{EOF}; 2072 2073 if ($Nodata || $self->{IS_FILTERED}) { 2074 # Make a simple check that section 4 and 5 are complete 2075 if ($self->{BAD_LENGTH}) { 2076 # We could have set $self->{ERROR_IN_MESSAGE} here and 2077 # let next_observation() take care of the rewinding. 2078 # But we don't want error messages to be displayed if 2079 # e.g. message is to be filtered 2080 $self->{POS} = $self->{START_POS} + 4; 2081 seek($self->{FILEHANDLE}, $self->{POS}, 0) if $self->{FILEHANDLE}; 2082 $self->_spew(2, "Possibly truncated message found (last 4 bytes" 2083 . " are not '7777'), so rewinding to position %d", 2084 $self->{POS}) if $Spew; 2085 } 2086 # This will ensure next call to next_observation to read next message 2087 $self->{CURRENT_SUBSET} = $self->{NUM_SUBSETS}; 2088 return; 2089 } 2090 } 2091 2092 $self->{CURRENT_SUBSET}++; 2093 2094 # Return references to data and descriptor arrays 2095 if ($self->{COMPRESSED_DATA}) { 2096 return ($self->{DATA}[$self->{CURRENT_SUBSET}], 2097 $self->{DESC}); 2098 } else { 2099 return ($self->{DATA}[$self->{CURRENT_SUBSET}], 2100 $self->{DESC}[$self->{CURRENT_SUBSET}]); 2101 } 2102} 2103 2104# Dumping contents of a subset (including section 0, 1 and 3 if this is 2105# first subset) in a BUFR message, also displaying message number and 2106# ahl (if found) and subset number 2107sub dumpsections { 2108 my $self = shift; 2109 my $data = shift; 2110 my $descriptors = shift; 2111 my $options = shift || {}; 2112 2113 my $width = $options->{width} || 15; 2114 my $bitmap = exists $options->{bitmap} ? $options->{bitmap} : 1; 2115 2116 my $current_subset_number = $self->get_current_subset_number(); 2117 my $current_message_number = $self->get_current_message_number(); 2118 my $current_ahl = $self->get_current_ahl() || ''; 2119 2120 my $txt; 2121 if ($current_subset_number == 1) { 2122 $txt = "\nMessage $current_message_number"; 2123 $txt .= defined $current_ahl ? " $current_ahl\n" : "\n"; 2124 $txt .= $self->dumpsection0() . $self->dumpsection1() . $self->dumpsection3(); 2125 } 2126 2127 # If this is last message and there is a BUFR formatting error 2128 # caught by user with eval, we might end up here with current 2129 # subset number 0 (and no section 4 to dump) 2130 if ($current_subset_number > 0) { 2131 $txt .= "\nSubset $current_subset_number\n"; 2132 $txt .= $bitmap ? $self->dumpsection4_with_bitmaps($data,$descriptors, 2133 $current_subset_number,$width) 2134 : $self->dumpsection4($data,$descriptors,$width); 2135 } 2136 2137 return $txt; 2138} 2139 2140sub dumpsection0 { 2141 my $self = shift; 2142 _croak "BUFR object not properly initialized to call dumpsection0. " 2143 . "Did you forget to call next_observation()?" unless $self->{BUFR_LENGTH}; 2144 2145 my $txt = <<"EOT"; 2146 2147Section 0: 2148 Length of BUFR message: $self->{BUFR_LENGTH} 2149 BUFR edition: $self->{BUFR_EDITION} 2150EOT 2151 return $txt; 2152} 2153 2154sub dumpsection1 { 2155 my $self = shift; 2156 _croak "BUFR object not properly initialized to call dumpsection1. " 2157 . "Did you forget to call next_observation()?" unless $self->{SEC1_STREAM}; 2158 2159 my $txt; 2160 if ($self->{BUFR_EDITION} < 4) { 2161 $txt = <<"EOT"; 2162 2163Section 1: 2164 Length of section: @{[ length $self->{SEC1_STREAM} ]} 2165 BUFR master table: $self->{MASTER_TABLE} 2166 Originating subcentre: $self->{SUBCENTRE} 2167 Originating centre: $self->{CENTRE} 2168 Update sequence number: $self->{UPDATE_NUMBER} 2169 Optional section present: $self->{OPTIONAL_SECTION} 2170 Data category (table A): $self->{DATA_CATEGORY} 2171 Data subcategory: $self->{DATA_SUBCATEGORY} 2172 Master table version number: $self->{MASTER_TABLE_VERSION} 2173 Local table version number: $self->{LOCAL_TABLE_VERSION} 2174 Year of century: $self->{YEAR_OF_CENTURY} 2175 Month: $self->{MONTH} 2176 Day: $self->{DAY} 2177 Hour: $self->{HOUR} 2178 Minute: $self->{MINUTE} 2179EOT 2180 } else { 2181 $txt = <<"EOT"; 2182 2183Section 1: 2184 Length of section: @{[ length $self->{SEC1_STREAM} ]} 2185 BUFR master table: $self->{MASTER_TABLE} 2186 Originating centre: $self->{CENTRE} 2187 Originating subcentre: $self->{SUBCENTRE} 2188 Update sequence number: $self->{UPDATE_NUMBER} 2189 Optional section present: $self->{OPTIONAL_SECTION} 2190 Data category (table A): $self->{DATA_CATEGORY} 2191 International data subcategory: $self->{INT_DATA_SUBCATEGORY} 2192 Local data subcategory: $self->{LOC_DATA_SUBCATEGORY} 2193 Master table version number: $self->{MASTER_TABLE_VERSION} 2194 Local table version number: $self->{LOCAL_TABLE_VERSION} 2195 Year: $self->{YEAR} 2196 Month: $self->{MONTH} 2197 Day: $self->{DAY} 2198 Hour: $self->{HOUR} 2199 Minute: $self->{MINUTE} 2200 Second: $self->{SECOND} 2201EOT 2202 } 2203 # Last part of section 1: "Reserved for local use by ADP centres" 2204 # is considered so uninteresting (and rare), that it is displayed 2205 # only if verbose >= 2, in a _spew statement. Note that for BUFR 2206 # edition < 4 there is always one byte here (to make an even 2207 # number of bytes in section 1). 2208 $self->_spew(2, "Reserved for local use: 0x@{[unpack('H*', $self->{LOCAL_USE})]}") 2209 if $self->{LOCAL_USE} and length $self->{LOCAL_USE} > 1; 2210 2211 return $txt; 2212} 2213 2214sub dumpsection2 { 2215 my $self = shift; 2216 return '' if not defined $self->{SEC2}; 2217 2218 my $sec2_code_ref = shift; 2219 _croak "dumpsection2: no code ref provided" 2220 unless defined $sec2_code_ref && ref($sec2_code_ref) eq 'CODE'; 2221 2222 my $txt = <<"EOT"; 2223 2224Section 2: 2225 Length of section: @{[ length $self->{SEC2_STREAM} ]} 2226EOT 2227 2228 return $txt . $sec2_code_ref->($self->{SEC2_STREAM}) . "\n"; 2229} 2230 2231sub dumpsection3 { 2232 my $self = shift; 2233 _croak "BUFR object not properly initialized to call dumpsection3. " 2234 . "Did you forget to call next_observation()?" unless $self->{SEC3_STREAM}; 2235 $self->{DESCRIPTORS_UNEXPANDED} ||= ''; 2236 2237 my $txt = <<"EOT"; 2238 2239Section 3: 2240 Length of section: @{[ length $self->{SEC3_STREAM} ]} 2241 Number of data subsets: $self->{NUM_SUBSETS} 2242 Observed data: $self->{OBSERVED_DATA} 2243 Compressed data: $self->{COMPRESSED_DATA} 2244 Data descriptors unexpanded: $self->{DESCRIPTORS_UNEXPANDED} 2245EOT 2246 return $txt; 2247} 2248 2249sub dumpsection4 { 2250 my $self = shift; 2251 my $data = shift; 2252 my $descriptors = shift; 2253 my $width = shift || 15; # Optional argument 2254 # Since last (optional) argument to dumpsection() is an anonymous 2255 # hash, check that this is not mistakenly applied here also 2256 _croak "Last optional argument to dumpsection4 should be integer" 2257 if ref($width) || $width !~ /^\d+$/; 2258 2259 my $txt = "\n"; 2260 my $B_table = $self->{B_TABLE}; 2261 # Add the artificial descriptor for associated field 2262 $B_table->{999999} = "ASSOCIATED FIELD\0NUMERIC"; 2263 my $C_table = $self->{C_TABLE} || ''; 2264 my $idx = 0; 2265 my $line_no = 0; # Precede each line with a line number, except 2266 # for replication descriptors and for operator 2267 # descriptors with no data value in section 4 2268 ID: 2269 foreach my $id (@{$descriptors}) { 2270 my $value = defined $data->[$idx] ? $data->[$idx] : 'missing'; 2271 $idx++; 2272 my $f = substr($id, 0, 1); 2273 if ($f == 1) { 2274 $txt .= sprintf " %6d\n", $id; 2275 next ID; 2276 } elsif ($f == 2) { 2277 if ($id =~ /^205/) { # Character information operator 2278 $txt .= sprintf "%6d %06d %${width}.${width}s %s\n", 2279 ++$line_no, $id, $value, "CHARACTER INFORMATION"; 2280 next ID; 2281 } else { 2282 my $operator_name = _get_operator_name($id); 2283 if ($operator_name) { 2284 $txt .= sprintf " %06d %${width}.${width}s %s\n", 2285 $id, "", $operator_name; 2286 } 2287 next ID; 2288 } 2289 } elsif ($f == 9 && $id != 999999) { 2290 $txt .= sprintf "%6d %06d %${width}.${width}s %s %06d\n", 2291 ++$line_no, $id, $value, 'NEW REFERENCE VALUE FOR', $id - 900000; 2292 next ID; 2293 } elsif ($id == 31031) { # This is the only data descriptor 2294 # where all bits set to one should 2295 # not be rendered as missing value 2296 # (for replication/repetition factors in 2297 # class 31 $value has been adjusted already) 2298 $value = 1 if $value eq 'missing'; 2299 } 2300 _croak "Data descriptor $id is not present in BUFR table B" 2301 unless exists $B_table->{$id}; 2302 my ($name, $unit, $bits) = (split /\0/, $B_table->{$id})[0,1,4]; 2303 # Code or flag table number equals $id, so no need to display this in [unit] 2304 my $short_unit = $unit; 2305 my $unit_start = uc(substr($unit, 0, 4)); 2306 if ($unit_start eq 'CODE') { 2307 $short_unit = 'CODE TABLE'; 2308 } elsif ($unit_start eq 'FLAG') { 2309 $short_unit = 'FLAG TABLE'; 2310 } 2311 $txt .= sprintf "%6d %06d %${width}.${width}s %s\n", 2312 ++$line_no, $id, $value, "$name [$short_unit]"; 2313 2314 # Check for illegal flag value 2315 if ($Strict_checking && $short_unit eq 'FLAG TABLE' && $bits > 1) { 2316 if ($value ne 'missing' && $value % 2) { 2317 $bits += 0; # get rid of spaces 2318 my $max_value = 2**$bits - 1; 2319 _complain("$id - $value: rightmost bit $bits is set indicating missing value" 2320 . " but then value should be $max_value"); 2321 } 2322 } 2323 2324 # Resolve flag and code table values if code table is loaded 2325 # (but don't bother about 031031 - too much uninformative output) 2326 if ($C_table && $id != 31031 && $value ne 'missing') { 2327 my $num_spaces = $width + 18; 2328 $txt .= _get_code_table_txt($id,$value,$unit,$B_table,$C_table,$num_spaces) 2329 } 2330 } 2331 return $txt; 2332} 2333 2334# Operators which should always be displayed in dumpsection4 2335my %OPERATOR_NAME_A = 2336 ( 222000 => 'QUALITY INFORMATION FOLLOW', 2337 223000 => 'SUBSTITUTED VALUES FOLLOW', 2338 224000 => 'FIRST ORDER STATISTICS FOLLOW', 2339 225000 => 'DIFFERENCE STATISTICAL VALUES FOLLOW', 2340 232000 => 'REPLACE/RETAINED VALUES FOLLOW', 2341 235000 => 'CANCEL BACKWARD DATA REFERENCE', 2342 236000 => 'DEFINE DATA PRESENT BIT MAP', 2343 237000 => 'USE PREVIOUSLY DEFINED BIT MAP', 2344 ); 2345# Operators which should normally not be displayed in dumpsection4 2346my %OPERATOR_NAME_B = 2347 ( 201000 => 'CANCEL CHANGE DATA WIDTH', 2348 202000 => 'CANCEL CHANGE SCALE', 2349 203000 => 'CANCEL CHANGE REFERENCE VALUES', 2350 207000 => 'CANCEL INCREASE SCALE, REFERENCE VALUE AND DATA WIDTH', 2351 208000 => 'CANCEL CHANGE WIDTH OF CCITT IA5 FIELD', 2352 203255 => 'STOP CHANGING REFERENCE VALUES', 2353 223255 => 'SUBSTITUTED VALUES MARKER OPERATOR', 2354 224255 => 'FIRST ORDER STATISTICAL VALUES MARKER OPERATOR', 2355 225255 => 'DIFFERENCE STATISTICAL STATISTICAL VALUES MARKER OPERATOR', 2356 232255 => 'REPLACED/RETAINED VALUES MARKER OPERATOR', 2357 237255 => 'CANCEL DEFINED DATA PRESENT BIT MAP', 2358 ); 2359# Operator classes which should normally not be displayed in dumpsection4 2360my %OPERATOR_NAME_C = 2361 ( 201 => 'CHANGE DATA WIDTH', 2362 202 => 'CHANGE SCALE', 2363 203 => 'CHANGE REFERENCE VALUES', 2364 204 => 'ADD ASSOCIATED FIELD', 2365 # This one is displayed, treated specially (and named CHARACTER INFORMATION) 2366## 205 => 'SIGNIFY CHARACTER', 2367 206 => 'SIGNIFY DATA WIDTH FOR THE IMMEDIATELY FOLLOWING LOCAL DESCRIPTOR', 2368 207 => 'INCREASE SCALE, REFERENCE VALUE AND DATA WIDTH', 2369 208 => 'CHANGE WIDTH OF CCITT IA5 FIELD', 2370 221 => 'DATA NOT PRESENT', 2371 ); 2372sub _get_operator_name { 2373 my $id = shift; 2374 my $operator_name = ''; 2375 if ($OPERATOR_NAME_A{$id}) { 2376 $operator_name = $OPERATOR_NAME_A{$id} 2377 } elsif ($Show_all_operators) { 2378 if ($OPERATOR_NAME_B{$id}) { 2379 $operator_name = $OPERATOR_NAME_B{$id} 2380 } else { 2381 my $fx = substr $id, 0, 3; 2382 if ($OPERATOR_NAME_C{$fx}) { 2383 $operator_name = $OPERATOR_NAME_C{$fx}; 2384 } 2385 } 2386 } 2387 return $operator_name; 2388} 2389 2390## Display bit mapped values on same line as the original value. This 2391## offer a much shorter and easier to read dump of section 4 when bit 2392## maps has been used (i.e. for 222000 quality information, 223000 2393## substituted values, 224000 first order statistics, 225000 2394## difference statistics, 232000 replaced/retained values). '*******' 2395## is displayed if data is not present in bit map (bit set to 1 in 2396## 031031 or data not covered by the 031031 descriptors), 'missing' is 2397## displayed if value is missing. But note that we miss other 2398## descriptors like 001031 and 001032 if these come after 222000 etc 2399## with the current implementation. And there are more shortcomings, 2400## described in CAVEAT section in POD for bufrread.pl 2401sub dumpsection4_with_bitmaps { 2402 my $self = shift; 2403 my $data = shift; 2404 my $descriptors = shift; 2405 my $isub = shift; 2406 my $width = shift || 15; # Optional argument 2407 2408 # If no bit maps call the ordinary dumpsection4 2409 if (not defined $self->{BITMAPS}) { 2410 return $self->dumpsection4($data, $descriptors, $width); 2411 } 2412 2413 # $Show_all_operators must be turned off for this sub to work correctly 2414 _croak "Cannot dump section 4 properly with bitmaps" 2415 . " when Show_all_operators is set" if $Show_all_operators; 2416 2417 # The kind of bit maps (i.e. the operator descriptors) used in BUFR message 2418 my @bitmap_desc = @{ $self->{BITMAP_OPERATORS} }; 2419 2420 my @bitmap_array; # Will contain for each bit map a reference to a hash with 2421 # key: index (in data and descriptor arrays) for data value 2422 # value: index for bit mapped value 2423 2424 # For compressed data all subsets use same bit map (we assume) 2425 $isub = 0 if $self->{COMPRESSED_DATA}; 2426 2427 my $txt = "\n"; 2428 my $space = ' '; 2429 my $line = $space x (17 + $width); 2430 foreach my $bitmap_num (0..$#bitmap_desc) { 2431 $line .= " $bitmap_desc[$bitmap_num]"; 2432 # Convert the sequence of ($data_idesc,$bitmapped_idesc) pairs into a hash 2433 my %hash = @{ $self->{BITMAPS}->[$bitmap_num + 1]->[$isub] }; 2434 $bitmap_array[$bitmap_num] = \%hash; 2435 } 2436 # First make a line showing the operator descriptors using bit maps 2437 $txt .= "$line\n"; 2438 2439 my $B_table = $self->{B_TABLE}; 2440 # Add the artificial descriptor for associated field 2441 $B_table->{999999} = "ASSOCIATED FIELD\0Numeric"; 2442 my $C_table = $self->{C_TABLE} || ''; 2443 2444 my $idx = 0; 2445 # Loop over data descriptors 2446 ID: 2447 foreach my $id (@{$descriptors}) { 2448 # Stop printing when the bit map part starts 2449 last ID if (substr($id,0,1) eq '2' 2450 and ($id =~ /^22[2-5]/ || $id =~ /^232/)); 2451 2452 # Get the data value 2453 my $value = defined $data->[$idx] ? $data->[$idx] : 'missing'; 2454 _croak "Data descriptor $id is not present in BUFR table B" 2455 unless exists $B_table->{$id}; 2456 my ($name, $unit, $bits) = (split /\0/, $B_table->{$id})[0,1,4]; 2457 $line = sprintf "%6d %06d %${width}.${width}s ", 2458 $idx+1, $id, $value; 2459 2460 # Then get the corresponding bit mapped values, using '*******' 2461 # if 'data not present' in bit map 2462 my $max_len = 7; 2463 foreach my $bitmap_num (0..$#bitmap_desc) { 2464 my $val; 2465 if ($bitmap_array[$bitmap_num]->{$idx}) { 2466 # data marked as 'data present' in bitmap 2467 my $bitmapped_idesc = $bitmap_array[$bitmap_num]->{$idx}; 2468 $val = defined $data->[$bitmapped_idesc] 2469 ? $data->[$bitmapped_idesc] : 'missing'; 2470 $max_len = length($val) if length($val) > $max_len; 2471 } else { 2472 $val = '*******'; 2473 } 2474 # If $max_len has been increased, this might not always 2475 # print very pretty, but at least there is no truncation 2476 # of digits in value 2477 $line .= sprintf " %${max_len}.${max_len}s", $val; 2478 } 2479 # Code or flag table number equals $id, so no need to display this in [unit] 2480 my $short_unit = $unit; 2481 my $unit_start = uc(substr($unit, 0, 4)); 2482 if ($unit_start eq 'CODE') { 2483 $short_unit = 'CODE TABLE'; 2484 } elsif ($unit_start eq 'FLAG') { 2485 $short_unit = 'FLAG TABLE'; 2486 } 2487 $line .= sprintf " %s\n", "$name [$short_unit]"; 2488 $txt .= $line; 2489 2490 # Check for illegal flag value 2491 if ($Strict_checking && $short_unit eq 'FLAG TABLE' && $bits > 1) { 2492 if ($value ne 'missing' and $value % 2) { 2493 my $max_value = 2**$bits - 1; 2494 $bits += 0; # get rid of spaces 2495 _complain("$id - $value: rightmost bit $bits is set indicating missing value" 2496 . " but then value should be $max_value"); 2497 } 2498 } 2499 2500 # Resolve flag and code table values if code table is loaded 2501 if ($C_table && $value ne 'missing') { 2502 my $num_spaces = $width + 19 + 7*@bitmap_desc; 2503 $txt .= _get_code_table_txt($id,$value,$unit,$B_table,$C_table,$num_spaces) 2504 } 2505 $idx++; 2506 } 2507 return $txt; 2508} 2509 2510## Return the text found in flag or code tables for value $value of 2511## descriptor $id. The empty string is returned if $unit is neither 2512## CODE TABLE nor FLAG TABLE, or if $unit is CODE TABLE but for this 2513## $value there is no text in C table. Returns a "... does not exist!" 2514## message if flag/code table is not found. If $check_illegal is 2515## defined, an 'Illegal value' message is returned if $value is bigger 2516## than allowed or has highest bit set without having all other bits 2517## set. 2518sub _get_code_table_txt { 2519 my ($id,$value,$unit,$B_table,$C_table,$num_spaces,$check_illegal) = @_; 2520 2521 my $txt = ''; 2522 # Need case insensitive matching, since local tables from at least 2523 # DWD use 'Code table', not 'CODE TABLE', in the ECMWF ecCodes 2524 # distribution 2525 if ($unit =~ m/^CODE[ ]?TABLE/i) { 2526 my $code_table = sprintf "%06d", $id; 2527 return "Code table $code_table does not exist!\n" 2528 if ! exists $C_table->{$code_table}; 2529 if ($C_table->{$code_table}{$value}) { 2530 my @lines = split "\n", $C_table->{$code_table}{$value}; 2531 foreach (@lines) { 2532 $txt .= sprintf "%s %s\n", ' ' x ($num_spaces), lc $_; 2533 } 2534 } 2535 } elsif ($unit =~ m/^FLAG[ ]?TABLE/i) { 2536 my $flag_table = sprintf "%06d", $id; 2537 return "Flag table $flag_table does not exist!\n" 2538 if ! exists $C_table->{$flag_table}; 2539 2540 my $width = (split /\0/, $B_table->{$flag_table})[4]; 2541 $width += 0; # Get rid of spaces 2542 # Cannot handle more than 32 bits flags with current method 2543 _croak "Unable to handle > 32 bits flag; $id has width $width" 2544 if $width > 32; 2545 2546 my $max_value = 2**$width - 1; 2547 2548 if (defined $check_illegal and $value > $max_value) { 2549 $txt = "Illegal value: $value is bigger than maximum allowed ($max_value)\n"; 2550 } elsif ($value == $max_value) { 2551 $txt = sprintf "%s=> %s", ' ' x ($num_spaces), "bit $width set:" 2552 . sprintf "%s %s\n", ' ' x ($num_spaces), "missing value\n"; 2553 } else { 2554 # Convert to bitstring and localize the 1 bits 2555 my $binary = pack "N", $value; # Packed as 32 bits in big-endian order 2556 my $bitstring = substr unpack('B*',$binary), 32-$width; 2557 for my $i (1..$width) { 2558 if (substr($bitstring, $i-1, 1) == 1) { 2559 $txt .= sprintf "%s=> %s", ' ' x ($num_spaces), 2560 "bit $i set"; 2561 if ($C_table->{$flag_table}{$i}) { 2562 my @lines = split "\n", $C_table->{$flag_table}{$i}; 2563 $txt .= ': ' . lc (shift @lines) . "\n"; 2564 foreach (@lines) { 2565 $txt .= sprintf "%s %s\n", ' ' x ($num_spaces), lc $_; 2566 } 2567 } else { 2568 $txt .= "\n"; 2569 } 2570 } 2571 } 2572 if (defined $check_illegal and $txt =~ /bit $width set/) { 2573 $txt = "Illegal value ($value): bit $width is set indicating missing value," 2574 . " but then value should be $max_value\n"; 2575 } 2576 } 2577 } 2578 return $txt; 2579} 2580 2581## Convert from integer to descriptor 2582sub _int2fxy { 2583 my @fxy = map {sprintf("%1d%02d%03d", ($_>>14)&0x3, ($_>>8)&0x3f, $_&0xff)} @_; 2584 return @_ > 1 ? @fxy : $fxy[0]; 2585} 2586 2587## Expand a list of descriptors using BUFR table D, also expanding 2588## simple replication but not delayed replication 2589sub _expand_descriptors { 2590 my $D_table = shift; 2591 my @expanded = (); 2592 2593 for (my $di = 0; $di < @_; $di++) { 2594 my $descriptor = $_[$di]; 2595 _croak "$descriptor is not a BUFR descriptor" 2596 if $descriptor !~ /^\d{6}$/; 2597 my $f = int substr($descriptor, 0, 1); 2598 if ($f == 1) { 2599 my $x = substr $descriptor, 1, 2; # Replicate next $x descriptors 2600 my $y = substr $descriptor, 3; # Number of replications 2601 if ($y > 0) { 2602 # Simple replication (replicate next x descriptors y times) 2603 _croak "Cannot expand: Not enough descriptors following " 2604 . "replication descriptor $descriptor (or there is " 2605 . "a problem in nesting of replication)" if $di+$x+1 > @_; 2606 my @r = (); 2607 push @r, @_[($di+1)..($di+$x)] for (1..$y); 2608 # Recursively expand replicated descriptors $y times 2609 my @s = (); 2610 @s = _expand_descriptors($D_table, @r) if @r; 2611 if ($Show_replication) { 2612 # Adjust x since replicated descriptors might have been expanded 2613 # Unfortunately _spew is not available here to report the x>99 -> x=0 hack 2614 my $z = @s/$y > 99 ? 0 : @s/$y; 2615 substr($_[$di], 1, 2) = sprintf "%02d", $z; 2616 push @expanded, $_[$di]; 2617 } 2618 push @expanded, @s if @s; 2619 $di += $x; 2620 } else { 2621 # Delayed replication. Next descriptor ought to be the 2622 # delayed descriptor replication (and data repetition) 2623 # factor, i.e. one of 0310(00|01|02|11|12), followed 2624 # by the x descriptors to be replicated 2625 if ($di+2 == @_ && $_[$di+1] =~ /^0310(00|01|02|11|12)$/) { 2626 _complain "Missing the $x descriptors which should follow" 2627 . " $descriptor $_[$di+1]"; 2628 push @expanded, @_[$di,$di+1]; 2629 last; 2630 } 2631 _croak "Cannot expand: Not enough descriptors following delayed" 2632 . " replication descriptor $descriptor (or there is " 2633 . "a problem in nesting of replication)" if $di+$x+1 > @_; 2634 _croak "Cannot expand: Delayed replication descriptor " 2635 . "$descriptor is not followed by one of " 2636 . "0310(00|01|02|11|12) but by $_[$di+1]" 2637 if $_[$di+1] !~ /^0310(00|01|02|11|12)$/; 2638 my @r = @_[($di+2)..($di+$x+1)]; 2639 # Here we just expand the D descriptors in the 2640 # descriptors to be replicated. The final expansion 2641 # using delayed replication factor has to wait until 2642 # data part is decoded 2643 my @s = (); 2644 @s = _expand_descriptors($D_table, @r) if @r; 2645 # Must adjust x since replicated descriptors might have been expanded 2646 substr($_[$di], 1, 2) = sprintf "%02d", scalar @s; 2647 push @expanded, @_[$di,$di+1], @s; 2648 $di += 1+$x; # NOTE: 1 is added to $di on next iteration 2649 } 2650 next; 2651 } elsif ($f == 3) { 2652 _croak "No sequence descriptor $descriptor in BUFR table D" 2653 if not exists $D_table->{$descriptor}; 2654 # Expand recursively, if necessary 2655 push @expanded, 2656 _expand_descriptors($D_table, split /\s/, $D_table->{$descriptor}); 2657 } else { # f=0,2 2658 push @expanded, $descriptor; 2659 } 2660 } 2661 2662 return @expanded; 2663} 2664 2665## Return a text string suitable for printing information about the given 2666## BUFR table descriptors 2667## 2668## $how = 'fully': Expand all D descriptors fully into B descriptors, 2669## with name, unit, scale, reference value and width (each on a 2670## numbered line, except for replication operators which are not 2671## numbered). 2672## 2673## $how = 'partially': Like 'fully, but expand D descriptors only once 2674## and ignore replication. 2675## 2676## $how = 'noexpand': Like 'partially', but do not expand D 2677## descriptors at all. 2678## 2679## $how = 'simply': Like 'partially', but list the descriptors on one 2680## single line with no extra information provided. 2681sub resolve_descriptor { 2682 my $self = shift; 2683 my $how = shift; 2684 foreach (@_) { 2685 _croak("'$_' is not an integer argument to resolve_descriptor!") 2686 unless /^\d+$/; 2687 } 2688 my @desc = map { sprintf "%06d", $_ } @_; 2689 2690 my @allowed_hows = qw( simply fully partially noexpand ); 2691 _croak "First argument in resolve_descriptor must be one of" 2692 . " '@allowed_hows', is: '$how'" 2693 unless grep { $how eq $_ } @allowed_hows; 2694 2695 if (! $self->{B_TABLE}) { 2696 if ($BUFR_table{FORMAT} eq 'ECCODES' && $self->{LOCAL_TABLES_NOT_FOUND}) { 2697 _croak "Local table " . $self->{LOCAL_TABLES_NOT_FOUND} . " couldn't be found," 2698 . " or you might need to load WMO master table also?"; 2699 } else { 2700 _croak "No B table is loaded - did you forget to call load_BDtables?"; 2701 } 2702 } 2703 my $B_table = $self->{B_TABLE}; 2704 2705 # Some local tables are provided only for element descriptors, and 2706 # we might in fact not need the sequence descriptors for resolving 2707 my $D_table; 2708 my $need_Dtable = 0; 2709 foreach my $id (@desc) { 2710 if (substr($id,0,1) eq '3') { 2711 $need_Dtable = 1; 2712 } 2713 } 2714 if ($need_Dtable && ! $self->{D_TABLE}) { 2715 if ($BUFR_table{FORMAT} eq 'ECCODES' && $self->{LOCAL_TABLES_NOT_FOUND}) { 2716 _croak "Local table " . $self->{LOCAL_TABLES_NOT_FOUND} . " couldn't be found," 2717 . " or you might need to load WMO master table also?"; 2718 } else { 2719 _croak "No D table is loaded - did you forget to call load_BDtables?"; 2720 } 2721 } else { 2722 # Could consider omitting this if $need_Dtable = 0 ... 2723 $D_table = $self->{D_TABLE}; 2724 } 2725 2726 my $txt = ''; 2727 2728 if ($how eq 'simply' or $how eq 'partially') { 2729 my @expanded; 2730 foreach my $id (@desc) { 2731 my $f = substr $id, 0, 1; 2732 if ($f == 3) { 2733 _croak "$id is not in table D, unable to expand" 2734 unless $D_table->{$id}; 2735 push @expanded, split /\s/, $D_table->{$id}; 2736 } else { 2737 push @expanded, $id; 2738 } 2739 } 2740 if ($how eq 'simply') { 2741 return $txt = "@expanded\n"; 2742 } else { 2743 @desc = @expanded; 2744 } 2745 } 2746 if ($how eq 'fully') { 2747 if (@desc == 1 and $desc[0] =~ /^1/) { 2748 # This is simply a replication descriptor; do not try to expand 2749 } else { 2750 @desc = _expand_descriptors($D_table, @desc); 2751 } 2752 } 2753 2754 my $count = 0; 2755 foreach my $id (@desc) { 2756 if ($id =~ /^[123]/) { 2757 $txt .= sprintf " %06d\n", $id; 2758 } elsif ($B_table->{$id}) { 2759 my ($name,$unit,$scale,$refval,$width) = split /\0/, $B_table->{$id}; 2760 $txt .= sprintf "%3d %06d %s [%s] %d %d %d\n", 2761 ++$count,$id,$name,$unit,$scale,$refval,$width; 2762 } else { 2763 $txt .= sprintf "%3d %06d Not in table B\n", 2764 ++$count,$id; 2765 } 2766 } 2767 return $txt; 2768} 2769 2770## Return BUFR table B information for an element descriptor for the 2771## last table loaded, as an array of name, unit, scale, reference 2772## value and data width in bits. Returns false if the descriptor is 2773## not found or no data width is defined, or croaks if no table B has 2774## been loaded. 2775sub element_descriptor { 2776 my $self = shift; 2777 my $desc = shift; 2778 _croak "Argument to element_descriptor must be an integer\n" 2779 unless $desc =~ /^\d+$/; 2780 $desc = sprintf "%06d", $desc; 2781 _croak "No BUFR B table loaded\n" unless defined $self->{B_TABLE}; 2782 return unless defined $self->{B_TABLE}->{$desc}; 2783 my ($name, $unit, $scale, $refval, $width) 2784 = split /\0/, $self->{B_TABLE}->{$desc}; 2785 return unless defined $width && $width =~ /\d+$/; 2786 return ($name, $unit, $scale+0, $refval+0, $width+0); 2787} 2788 2789## Return BUFR table D information for a sequence descriptor for the 2790## last table loaded, as a space separated string of the descriptors 2791## in the direct (nonrecursive) lookup in table D. Returns false if 2792## the sequence descriptor is not found, or croaks if no table D has 2793## been loaded. 2794sub sequence_descriptor { 2795 my $self = shift; 2796 my $desc = shift; 2797 _croak "Argument to element_descriptor must be an integer\n" 2798 unless $desc =~ /^\d+$/; 2799 _croak "No BUFR D table loaded\n" unless defined $self->{D_TABLE}; 2800 return unless defined $self->{D_TABLE}->{$desc}; 2801 if (wantarray) { 2802 return split / /, $self->{D_TABLE}->{$desc}; 2803 } else { 2804 return $self->{D_TABLE}->{$desc}; 2805 } 2806} 2807 2808## Return a text string telling which bits are set and the meaning of 2809## the bits set when $value is interpreted as a flag value, also 2810## checking for illegal values. The empty string is returned if $value=0. 2811sub resolve_flagvalue { 2812 my $self = shift; 2813 my ($value,$flag_table,$table,$default_table,$num_leading_spaces) = @_; 2814 _croak "Flag value can't be negative!\n" if $value < 0; 2815 $num_leading_spaces ||= 0; # Default value 2816 2817 $self->load_Ctable($table,$default_table); 2818 my $C_table = $self->{C_TABLE}; 2819 2820 # Number of bits used for the flag is hard to extract from C 2821 # table; it is much easier to obtain from B table 2822 $self->load_BDtables($table); 2823 my $B_table = $self->{B_TABLE}; 2824 2825 my $unit = 'FLAG TABLE'; 2826 return _get_code_table_txt($flag_table,$value,$unit, 2827 $B_table,$C_table,$num_leading_spaces,'check_illegal'); 2828} 2829 2830## Return the contents of code table $code_table, or empty string if 2831## code table is not found 2832sub dump_codetable { 2833 my $self = shift; 2834 my ($code_table,$table,$default_table) = @_; 2835 _croak("code_table '$code_table' is not a (positive) integer in dump_codetable()") 2836 unless $code_table =~ /^\d+$/; 2837 $code_table = sprintf "%06d", $code_table; 2838 2839 $self->load_Ctable($table,$default_table); 2840 my $C_table = $self->{C_TABLE}; 2841 2842 return '' unless $C_table->{$code_table}; 2843 2844 my $dump; 2845 foreach my $value (sort {$a <=> $b} keys %{ $C_table->{$code_table} }) { 2846 my $txt = $C_table->{$code_table}{$value}; 2847 chomp $txt; 2848 $txt =~ s/\n/\n /g; 2849 $dump .= sprintf "%3d -> %s\n", $value, $txt; 2850 } 2851 return $dump; 2852} 2853 2854## Decode bitstream (data part of section 4) while working through the 2855## (expanded) descriptors in section 3. The final data and 2856## corresponding descriptors are put in $self->{DATA} and 2857## $self->{DESC} (indexed by subset number) 2858sub _decode_bitstream { 2859 my $self = shift; 2860 $self->{CODING} = 'DECODE'; 2861 my $bitstream = $self->{SEC4_RAWDATA} . "\0\0\0\0"; 2862 my $maxpos = 8*length($self->{SEC4_RAWDATA}); 2863 my $pos = 0; 2864 my @operators; 2865 my $ref_values_ref; # Hash ref to reference values with descriptors as keys; 2866 # to be implemented later (not used yet) 2867 my @subset_data; # Will contain data values for subset 1,2... 2868 my @subset_desc; # Will contain the set of descriptors for subset 1,2... 2869 # expanded to be in one to one correspondance with the data 2870 my $repeat_X; # Set to number of descriptors to be repeated if 2871 # delayed descriptor and data repetition factor is 2872 # in effect 2873 my $repeat_factor; # Set to number of times descriptors (and data) 2874 # are to be repeated if delayed descriptor and 2875 # data repetition factor is in effect 2876 my @repeat_desc; # The descriptors to be repeated 2877 my @repeat_data; # The data to be repeated 2878 my $B_table = $self->{B_TABLE}; 2879 2880 # Has to fully expand @desc for each subset in turn, as delayed 2881 # replication factors might be different for each subset, 2882 # resulting in different full expansions. During the expansion the 2883 # effect of operator descriptors are taken into account, causing 2884 # most of them to be eliminated (unless $Show_all_operators is 2885 # set), so that @desc and the equivalent $subset_desc[$isub] ends 2886 # up being in one to one correspondence with the data values in 2887 # $subset_data[$isub] (the operators included having data value 2888 # '') 2889 S_LOOP: foreach my $isub (1..$self->{NUM_SUBSETS}) { 2890 $self->_spew(2, "Decoding subset number %d", $isub) if $Spew; 2891 2892 # Bit maps might vary from subset to subset, so must be rebuilt 2893 undef $self->{BITMAP_OPERATORS}; 2894 undef $self->{BITMAP_START}; 2895 undef $self->{REUSE_BITMAP}; 2896 $self->{NUM_BITMAPS} = 0; 2897 $self->{BACKWARD_DATA_REFERENCE} = 1; 2898 $self->{NUM_CHANGE_OPERATORS} = 0; 2899 2900 my @desc = split /\s/, $self->{DESCRIPTORS_EXPANDED}; 2901 2902 # Note: @desc as well as $idesc may be changed during this loop, 2903 # so we cannot use a foreach loop instead 2904 D_LOOP: for (my $idesc = 0; $idesc < @desc; $idesc++) { 2905 my $id = $desc[$idesc]; 2906 my $f = substr($id,0,1); 2907 my $x = substr($id,1,2)+0; 2908 my $y = substr($id,3,3)+0; 2909 2910 if ($f == 1) { 2911 if ($Show_replication) { 2912 push @{$subset_desc[$isub]}, $id; 2913 push @{$subset_data[$isub]}, ''; 2914 $self->_spew(4, "X=0 in $id for F=1, might have been > 99 in expansion") 2915 if $Spew && $x == 0; 2916 } 2917 next D_LOOP if $y > 0; # Nothing more to do for normal replication 2918 2919 if ($x == 0) { 2920 _complain("Nonsensical replication of zero descriptors ($id)"); 2921 $idesc++; 2922 next D_LOOP; 2923 } 2924 2925 $_ = $desc[$idesc+1]; 2926 _croak "$id Erroneous replication factor" 2927 unless /^0310(00|01|02|11|12)/ && exists $B_table->{$_}; 2928 2929 my $width = (split /\0/, $B_table->{$_})[-1]; 2930 my $factor = bitstream2dec($bitstream, $pos, $width); 2931 $pos += $width; 2932 # Delayed descriptor replication factors (and 2933 # associated fields) are the only values in section 4 2934 # where all bits being 1 is not to be interpreted as a 2935 # missing value 2936 if (not defined $factor) { 2937 $factor = 2**$width - 1; 2938 } 2939 if ($Spew) { 2940 if ($_ eq '031011' || $_ eq '031012') { 2941 $self->_spew(4, "$_ Delayed repetition factor: %s", $factor); 2942 } else { 2943 $self->_spew(4, "$_ Delayed replication factor: %s", $factor); 2944 } 2945 } 2946 # Include the delayed replication in descriptor and data list 2947 splice @desc, $idesc++, 0, $_; 2948 push @{$subset_desc[$isub]}, $_; 2949 push @{$subset_data[$isub]}, $factor; 2950 2951 if ($_ eq '031011' || $_ eq '031012') { 2952 # For delayed repetition, descriptor *and* data are 2953 # to be repeated 2954 $repeat_X = $x; 2955 $repeat_factor = $factor; 2956 } 2957 my @r = (); 2958 push @r, @desc[($idesc+2)..($idesc+$x+1)] while $factor--; 2959 splice @desc, $idesc, 2+$x, @r; 2960 2961 if ($repeat_factor) { 2962 # Skip to the last set to be repeated, which will 2963 # then be included $repeat_factor times 2964 $idesc += $x * ($repeat_factor - 1); 2965 $self->_spew(4, "Delayed repetition ($id $_ -> @r)") if $Spew; 2966 } else { 2967 $self->_spew(4, "Delayed replication ($id $_ -> @r)") if $Spew; 2968 } 2969 if ($idesc < @desc) { 2970 redo D_LOOP; 2971 } else { 2972 last D_LOOP; # Might happen if delayed factor is 0 2973 } 2974 2975 } elsif ($f == 2) { 2976 my $flow; 2977 my $bm_idesc; 2978 ($pos, $flow, $bm_idesc, @operators) 2979 = $self->_apply_operator_descriptor($id, $x, $y, $pos, $isub, 2980 $desc[$idesc+1], @operators); 2981 if ($flow eq 'redo_bitmap') { 2982 # Data value is associated with the descriptor 2983 # defined by bit map. Remember original and new 2984 # index in descriptor array for the bit mapped 2985 # values ('dr' = data reference) 2986 my $dr_idesc; 2987 if (!defined $bm_idesc) { 2988 $dr_idesc = shift @{$self->{REUSE_BITMAP}->[$isub]}; 2989 } elsif (!$Show_all_operators) { 2990 $dr_idesc = $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 2991 + $bm_idesc; 2992 } else { 2993 $dr_idesc = $self->{BITMAP_START}[$self->{NUM_BITMAPS}]; 2994 # Skip operator descriptors 2995 while ($bm_idesc-- > 0) { 2996 $dr_idesc++; 2997 $dr_idesc++ while ($desc[$dr_idesc] >= 200000); 2998 } 2999 } 3000 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[$isub] }, 3001 $dr_idesc, $idesc; 3002 if ($Show_all_operators) { 3003 push @{$subset_desc[$isub]}, $id; 3004 push @{$subset_data[$isub]}, ''; 3005 } 3006 $desc[$idesc] = $desc[$dr_idesc]; 3007 redo D_LOOP; 3008 } elsif ($flow eq 'signify_character') { 3009 push @{$subset_desc[$isub]}, $id; 3010 # Extract ASCII string 3011 my $value = bitstream2ascii($bitstream, $pos, $y); 3012 $pos += 8*$y; 3013 # Trim string, also removing nulls 3014 $value = _trim($value, $id); 3015 push @{$subset_data[$isub]}, $value; 3016 next D_LOOP; 3017 } elsif ($flow eq 'no_value') { 3018 # Some operator descriptors ought to be included 3019 # in expanded descriptors even though they have no 3020 # corresponding data value, because they contain 3021 # valuable information to be displayed in 3022 # dumpsection4 (e.g. 222000 'Quality information follows') 3023 push @{$subset_desc[$isub]}, $id; 3024 push @{$subset_data[$isub]}, ''; 3025 next D_LOOP; 3026 } 3027 3028 if ($Show_all_operators) { 3029 push @{$subset_desc[$isub]}, $id; 3030 push @{$subset_data[$isub]}, ''; 3031 } else { 3032 # Remove operator descriptor from @desc 3033 splice @desc, $idesc--, 1; 3034 } 3035 3036 next D_LOOP if $flow eq 'next'; 3037 last D_LOOP if $flow eq 'last'; 3038 if ($flow eq 'skip') { 3039 $idesc++; 3040 next D_LOOP; 3041 } 3042 } 3043 3044 if ($self->{CHANGE_REFERENCE_VALUE}) { 3045 # The data descriptor is to be associated with a new 3046 # reference value, which is fetched from data stream 3047 _croak "Change reference operator 203Y is not followed by element" 3048 . " descriptor, but $id" if $f > 0; 3049 my $num_bits = $self->{CHANGE_REFERENCE_VALUE}; 3050 my $new_refval = bitstream2dec($bitstream, $pos, $num_bits); 3051 $pos += $num_bits; 3052 # Negative value if most significant bit is set (one's complement) 3053 $new_refval = $new_refval & (1<<$num_bits-1) 3054 ? -($new_refval & ((1<<$num_bits-1)-1)) 3055 : $new_refval; 3056 $self->_spew(4, "$id * Change reference value: ". 3057 ($new_refval > 0 ? "+" : "")."$new_refval") if $Spew; 3058 $self->{NEW_REFVAL_OF}{$id}{$isub} = $new_refval; 3059 # Identify new reference values by setting f=9 3060 push @{$subset_desc[$isub]}, $id + 900000; 3061 push @{$subset_data[$isub]}, $new_refval; 3062 next D_LOOP; 3063 } 3064 3065 # If operator 204$y 'Add associated field is in effect', 3066 # each data value is preceded by $y bits which should be 3067 # decoded separately. We choose to provide a descriptor 3068 # 999999 in this case (like the ECMWF BUFRDC software) 3069 if ($self->{ADD_ASSOCIATED_FIELD} and $id ne '031021') { 3070 # First extract associated field 3071 my $width = $self->{ADD_ASSOCIATED_FIELD}; 3072 my $value = bitstream2dec($bitstream, $pos, $width); 3073 # All bits set to 1 for associated field is NOT 3074 # interpreted as missing value 3075 $value = 2**$width - 1 if ! defined $value; 3076 $pos += $width; 3077 push @{$subset_desc[$isub]}, 999999; 3078 push @{$subset_data[$isub]}, $value; 3079 $self->_spew(4, "Added associated field: %s", $value) if $Spew; 3080 } 3081 3082 # We now have a "real" data descriptor 3083 push @{$subset_desc[$isub]}, $id; 3084 3085 # For quality information, if this relates to a bit map we 3086 # need to store index of the data ($data_idesc) for which 3087 # the quality information applies, as well as the new 3088 # index ($idesc) in the descriptor array for the bit 3089 # mapped values 3090 if (substr($id,0,3) eq '033' 3091 && defined $self->{BITMAP_OPERATORS} 3092 && $self->{BITMAP_OPERATORS}->[-1] eq '222000') { 3093 if (defined $self->{REUSE_BITMAP}) { 3094 my $data_idesc = shift @{ $self->{REUSE_BITMAP}->[$isub] }; 3095 _croak "$id: Not enough quality values provided" 3096 if not defined $data_idesc; 3097 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[$isub] }, 3098 $data_idesc, $idesc; 3099 } else { 3100 my $data_idesc = shift @{ $self->{CURRENT_BITMAP} }; 3101 _croak "$id: Not enough quality values provided" 3102 if not defined $data_idesc; 3103 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[$isub] }, 3104 $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 3105 + $data_idesc, $idesc; 3106 } 3107 } 3108 3109 # Find the relevant entry in BUFR table B 3110 _croak "Data descriptor $id is not present in BUFR table B" 3111 unless exists $B_table->{$id}; 3112 my ($name,$unit,$scale,$refval,$width) = split /\0/, $B_table->{$id}; 3113 $self->_spew(3, "%6s %-20s %s", $id, $unit, $name) if $Spew; 3114 3115 # Override Table B values if Data Description Operators are in effect 3116 if ($self->{NUM_CHANGE_OPERATORS} > 0) { 3117 if ($unit ne 'CCITTIA5' && $unit !~ /^(CODE|FLAG)/) { 3118 if (defined $self->{CHANGE_SRW}) { 3119 $scale += $self->{CHANGE_SRW}; 3120 $width += int((10*$self->{CHANGE_SRW}+2)/3); 3121 $refval *= 10*$self->{CHANGE_SRW}; 3122 } else { 3123 $scale += $self->{CHANGE_SCALE} if defined $self->{CHANGE_SCALE}; 3124 $width += $self->{CHANGE_WIDTH} if defined $self->{CHANGE_WIDTH}; 3125 } 3126 } elsif ($unit eq 'CCITTIA5' && defined $self->{CHANGE_CCITTIA5_WIDTH}) { 3127 $width = $self->{CHANGE_CCITTIA5_WIDTH} 3128 } 3129 # To prevent autovivification (see perldoc -f exists) we 3130 # need this laborious test for defined 3131 $refval = $self->{NEW_REFVAL_OF}{$id}{$isub} if defined $self->{NEW_REFVAL_OF}{$id} 3132 && defined $self->{NEW_REFVAL_OF}{$id}{$isub}; 3133 # Difference statistical values use different width and reference value 3134 if ($self->{DIFFERENCE_STATISTICAL_VALUE}) { 3135 $width += 1; 3136 $refval = -2**$width; 3137 undef $self->{DIFFERENCE_STATISTICAL_VALUE}; 3138 $self->{NUM_CHANGE_OPERATORS}--; 3139 } 3140 } 3141 _croak "$id Data width <= 0" if $width <= 0; 3142 3143 my $value; 3144 if ($unit eq 'CCITTIA5') { 3145 # Extract ASCII string 3146 _croak "Width for unit CCITTIA5 must be integer bytes\n" 3147 . "is $width bits for descriptor $id" if $width % 8; 3148 $value = bitstream2ascii($bitstream, $pos, $width/8); 3149 $self->_spew(3, " %s", defined $value ? $value : 'missing') if $Spew; 3150 # Trim string, also removing nulls 3151 $value = _trim($value, $id); 3152 } else { 3153 $value = bitstream2dec($bitstream, $pos, $width); 3154 if (defined $value) { 3155 # Compute and format decoded value 3156 ($scale) = $scale =~ /(-?\d+)/; # untaint 3157 $value = $scale <= 0 ? ($value + $refval)/10**$scale 3158 : sprintf "%.${scale}f", ($value + $refval)/10**$scale; 3159 } 3160 $self->_spew(3, " %s", defined $value ? $value : 'missing') if $Spew; 3161 } 3162 $pos += $width; 3163 push @{$subset_data[$isub]}, $value; 3164 # $value = undef if missing value 3165 3166 if ($repeat_X) { 3167 # Delayed repetition factor (030011/030012) is in 3168 # effect, so descriptors and data are to be repeated 3169 push @repeat_desc, $id; 3170 push @repeat_data, $value; 3171 if (--$repeat_X == 0) { 3172 # Store $repeat_factor repetitions of data and descriptors 3173 # (one repetition has already been included) 3174 while (--$repeat_factor) { 3175 push @{$subset_desc[$isub]}, @repeat_desc; 3176 push @{$subset_data[$isub]}, @repeat_data; 3177 } 3178 @repeat_desc = (); 3179 @repeat_data = (); 3180 } 3181 } 3182 3183 if ($id eq '031031' and $self->{BUILD_BITMAP}) { 3184 # Store the index of expanded descriptors if data is 3185 # marked as present in data present indicator: 0 is 3186 # 'present', 1 (undef value) is 'not present'. E.g. 3187 # bitmap = 1100110 => (2,3,6) is stored in $self->{CURRENT_BITMAP} 3188 if (defined $value) { 3189 push @{$self->{CURRENT_BITMAP}}, $self->{BITMAP_INDEX}; 3190 } 3191 $self->{BITMAP_INDEX}++; 3192 if ($self->{BACKWARD_DATA_REFERENCE} == $self->{NUM_BITMAPS}) { 3193 my $numb = $self->{NUM_BITMAPS}; 3194 if (!defined $self->{BITMAP_START}[$numb]) { 3195 # Look up the element descriptor immediately 3196 # preceding the bitmap operator 3197 my $i = $idesc; 3198 $i-- while ($desc[$i] ne $self->{BITMAP_OPERATORS}->[-1] 3199 && $i >=0); 3200 $i-- while ($desc[$i] > 100000 && $i >=0); 3201 _croak "No element descriptor preceding bitmap" if $i < 0; 3202 $self->{BITMAP_START}[$numb] = $i; 3203 } else { 3204 $self->{BITMAP_START}[$numb]--; 3205 _croak "Bitmap too big" 3206 if $self->{BITMAP_START}[$numb] < 0; 3207 } 3208 } 3209 } elsif ($self->{BUILD_BITMAP} and $self->{BITMAP_INDEX} > 0) { 3210 # We have finished building the bit map 3211 $self->{BUILD_BITMAP} = 0; 3212 $self->{BITMAP_INDEX} = 0; 3213 if ($self->{BACKWARD_DATA_REFERENCE} != $self->{NUM_BITMAPS}) { 3214 $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 3215 = $self->{BITMAP_START}[$self->{BACKWARD_DATA_REFERENCE}]; 3216 } 3217 } 3218 } # End D_LOOP 3219 } # END S_LOOP 3220 3221 # Check that length of section 4 corresponds to what expected from section 3 3222 $self->_check_section4_length($pos,$maxpos); 3223 3224 $self->{DATA} = \@subset_data; 3225 $self->{DESC} = \@subset_desc; 3226 return; 3227} 3228 3229## Decode bitstream (data part of section 4 encoded using BUFR 3230## compression) while working through the (expanded) descriptors in 3231## section 3. The final data and corresponding descriptors are put in 3232## $self->{DATA} and $self->{DESC} (the data indexed by subset number) 3233sub _decompress_bitstream { 3234 my $self = shift; 3235 $self->{CODING} = 'DECODE'; 3236 my $bitstream = $self->{SEC4_RAWDATA}."\0\0\0\0"; 3237 my $nsubsets = $self->{NUM_SUBSETS}; 3238 my $B_table = $self->{B_TABLE}; 3239 my $maxpos = 8*length($self->{SEC4_RAWDATA}); 3240 my $pos = 0; 3241 my @operators; 3242 my @subset_data; # Will contain data values for subset 1,2..., 3243 # i.e. $subset[$i] is a reference to an array 3244 # containing the data values for subset $i 3245 my @desc_exp; # Will contain the set of descriptors for one 3246 # subset, expanded to be in one to one 3247 # correspondance with the data, i.e. element 3248 # descriptors only 3249 my $repeat_X; # Set to number of descriptors to be repeated if 3250 # delayed descriptor and data repetition factor is 3251 # in effect. Will be decremented while (repeated) 3252 # data sets are extracted 3253 my $repeat_XX; # Like $repeat_X, but will not be decremented 3254 my $repeat_factor; # Set to number of times descriptors (and data) 3255 # are to be repeated if delayed descriptor and 3256 # data repetition factor is in effect 3257 my @repeat_desc; # The descriptors to be repeated 3258 my @repeat_data; # The data to be repeated (reference to an array 3259 # containing the data values for subset $i) 3260 3261 _complain("Compression set in section 1 for one subset message") 3262 if $nsubsets == 1; 3263 3264 $#subset_data = $nsubsets; 3265 3266 my @desc = split /\s/, $self->{DESCRIPTORS_EXPANDED}; 3267 # This will be further expanded to be in one to one correspondance 3268 # with the data, taking replication and table C operators into account 3269 3270 # All subsets in a compressed BUFR message must have exactly the same 3271 # fully expanded section 3, i.e. all replications factors must be the same 3272 # in all subsets. So, as opposed to noncompressed messages, it is enough 3273 # to run through the set of descriptors once. 3274 D_LOOP: for (my $idesc = 0; $idesc < @desc; $idesc++) { 3275 my $id = $desc[$idesc]; 3276 my $f = substr($id,0,1); 3277 my $x = substr($id,1,2)+0; 3278 my $y = substr($id,3,3)+0; 3279 3280 if ($f == 1) { 3281 if ($Show_replication) { 3282 push @desc_exp, $id; 3283 foreach my $isub (1..$nsubsets) { 3284 push @{$subset_data[$isub]}, ''; 3285 } 3286 $self->_spew(4, "X=0 in $id for F=1, might have been > 99 in expansion") 3287 if $Spew && $x == 0; 3288 } 3289 next D_LOOP if $y > 0; # Nothing more to do for normal replication 3290 3291 if ($x == 0) { 3292 _complain("Nonsensical replication of zero descriptors ($id)"); 3293 $idesc++; 3294 next D_LOOP; 3295 } 3296 3297 $_ = $desc[$idesc+1]; 3298 _croak "$id Erroneous replication factor" 3299 unless /^0310(00|01|02|11|12)/ && exists $B_table->{$_}; 3300 3301 my $width = (split /\0/, $B_table->{$_})[-1]; 3302 my $factor = bitstream2dec($bitstream, $pos, $width); 3303 $pos += $width + 6; # 6 bits for the bit count (which we 3304 # skip because we know it has to be 0 3305 # for delayed replication) 3306 # Delayed descriptor replication factors (and associated 3307 # fields) are the only values in section 4 where all bits 3308 # being 1 is not interpreted as a missing value 3309 if (not defined $factor) { 3310 $factor = 2**$width - 1; 3311 } 3312 # Include the delayed replication in descriptor and data list 3313 push @desc_exp, $_; 3314 splice @desc, $idesc++, 0, $_; 3315 foreach my $isub (1..$nsubsets) { 3316 push @{$subset_data[$isub]}, $factor; 3317 } 3318 3319 if ($_ eq '031011' || $_ eq '031012') { 3320 # For delayed repetition, descriptor *and* data is 3321 # to be repeated 3322 $repeat_X = $repeat_XX = $x; 3323 $repeat_factor = $factor; 3324 $self->_spew(4, "$_ Delayed repetition factor: $factor") if $Spew; 3325 } else { 3326 $self->_spew(4, "$_ Delayed replication factor: $factor") if $Spew; 3327 } 3328 my @r = (); 3329 push @r, @desc[($idesc+2)..($idesc+$x+1)] while $factor--; 3330 splice @desc, $idesc, 2+$x, @r; 3331 if ($Spew) { 3332 if ($repeat_factor) { 3333 $self->_spew(4, "$_ Delayed repetition ($id $_ -> @r)"); 3334 } else { 3335 $self->_spew(4, "$_ Delayed replication ($id $_ -> @r)"); 3336 } 3337 } 3338 3339 if ($idesc < @desc) { 3340 redo D_LOOP; 3341 } else { 3342 last D_LOOP; # Might happen if delayed factor is 0 3343 } 3344 3345 } elsif ($f == 2) { 3346 my $flow; 3347 my $bm_idesc; 3348 ($pos, $flow, $bm_idesc, @operators) 3349 = $self->_apply_operator_descriptor($id, $x, $y, $pos, 0, 3350 $desc[$idesc+1], @operators); 3351 if ($flow eq 'redo_bitmap') { 3352 # Data value is associated with the descriptor 3353 # defined by bit map. Remember original and new 3354 # index in descriptor array for the bit mapped 3355 # values ('dr' = data reference) 3356 my $dr_idesc; 3357 if (!defined $bm_idesc) { 3358 $dr_idesc = shift @{ $self->{REUSE_BITMAP}->[0] }; 3359 } elsif (!$Show_all_operators) { 3360 $dr_idesc = $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 3361 + $bm_idesc; 3362 } else { 3363 $dr_idesc = $self->{BITMAP_START}[$self->{NUM_BITMAPS}]; 3364 # Skip operator descriptors 3365 while ($bm_idesc-- > 0) { 3366 $dr_idesc++; 3367 $dr_idesc++ while ($desc[$dr_idesc] >= 200000); 3368 } 3369 } 3370 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[0] }, 3371 $dr_idesc, $idesc; 3372 if ($Show_all_operators) { 3373 push @desc_exp, $id; 3374 foreach my $isub (1..$nsubsets) { 3375 push @{$subset_data[$isub]}, ''; 3376 } 3377 } 3378 $desc[$idesc] = $desc[$dr_idesc]; 3379 redo D_LOOP; 3380 } elsif ($flow eq 'signify_character') { 3381 push @desc_exp, $id; 3382 $pos = $self->_extract_compressed_value($id, $idesc, $pos, $bitstream, 3383 $nsubsets, \@subset_data); 3384 next D_LOOP; 3385 } elsif ($flow eq 'no_value') { 3386 # Some operator descriptors ought to be included 3387 # in expanded descriptors even though they have no 3388 # corresponding data value, because they contain 3389 # valuable information to be displayed in 3390 # dumpsection4 (e.g. 222000 'Quality information follows') 3391 push @desc_exp, $id; 3392 foreach my $isub (1..$nsubsets) { 3393 push @{$subset_data[$isub]}, ''; 3394 } 3395 next D_LOOP; 3396 } 3397 3398 if ($Show_all_operators) { 3399 push @desc_exp, $id; 3400 foreach my $isub (1..$nsubsets) { 3401 push @{$subset_data[$isub]}, ''; 3402 } 3403 } else { 3404 # Remove operator descriptor from @desc 3405 splice @desc, $idesc--, 1; 3406 } 3407 3408 next D_LOOP if $flow eq 'next'; 3409 last D_LOOP if $flow eq 'last'; 3410 if ($flow eq 'skip') { 3411 $idesc++; 3412 next D_LOOP; 3413 } 3414 } 3415 3416 if ($self->{CHANGE_REFERENCE_VALUE}) { 3417 # The data descriptor is to be associated with a new 3418 # reference value, which is fetched from data stream 3419 _croak "Change reference operator 203Y is not followed by element" 3420 . " descriptor, but $id" if $f > 0; 3421 my $num_bits = $self->{CHANGE_REFERENCE_VALUE}; 3422 my $new_refval = bitstream2dec($bitstream, $pos, $num_bits); 3423 $pos += $num_bits + 6; 3424 # Negative value if most significant bit is set (one's complement) 3425 $new_refval = $new_refval & (1<<$num_bits-1) 3426 ? -($new_refval & ((1<<$num_bits-1)-1)) 3427 : $new_refval; 3428 $self->_spew(4, "$id * Change reference value: ". 3429 ($new_refval > 0 ? "+" : "")."$new_refval") if $Spew; 3430 $self->{NEW_REFVAL_OF}{$id} = $new_refval; 3431 # Identify new reference values by setting f=9 3432 push @desc_exp, $id + 900000; 3433 foreach my $isub (1..$nsubsets) { 3434 push @{$subset_data[$isub]}, $new_refval; 3435 } 3436 next D_LOOP; 3437 } 3438 3439 # If operator 204$y 'Add associated field is in effect', 3440 # each data value is preceded by $y bits which should be 3441 # decoded separately. We choose to provide a descriptor 3442 # 999999 in this case (like the ECMWF BUFRDC software) 3443 if ($self->{ADD_ASSOCIATED_FIELD} and $id ne '031021') { 3444 # First extract associated field 3445 push @desc_exp, 999999; 3446 $pos = $self->_extract_compressed_value(999999, $idesc, $pos, $bitstream, 3447 $nsubsets, \@subset_data); 3448 } 3449 3450 # We now have a "real" data descriptor, so add it to the descriptor list 3451 push @desc_exp, $id; 3452 3453 $pos = $self->_extract_compressed_value($id, $idesc, $pos, $bitstream, 3454 $nsubsets, \@subset_data, \@desc); 3455 if ($repeat_X) { 3456 # Delayed repetition factor (030011/030012) is in 3457 # effect, so descriptors and data are to be repeated 3458 push @repeat_desc, $id; 3459 foreach my $isub (1..$nsubsets) { 3460 push @{$repeat_data[$isub]}, $subset_data[$isub]->[-1]; 3461 } 3462 if (--$repeat_X == 0) { 3463 # Store $repeat_factor repetitions of data and descriptors 3464 # (one repetition has already been included) 3465 while (--$repeat_factor) { 3466 push @desc_exp, @repeat_desc; 3467 foreach my $isub (1..$nsubsets) { 3468 push @{$subset_data[$isub]}, @{$repeat_data[$isub]}; 3469 } 3470 $idesc += $repeat_XX; 3471 } 3472 @repeat_desc = (); 3473 @repeat_data = (); 3474 $repeat_XX = 0; 3475 } 3476 } 3477 } 3478 3479 # Check that length of section 4 corresponds to what expected from section 3 3480 $self->_check_section4_length($pos,$maxpos); 3481 3482 $self->{DATA} = \@subset_data; 3483 $self->{DESC} = \@desc_exp; 3484 return; 3485} 3486 3487## Extract the data values for descriptor $id (with index $idesc in 3488## the final expanded descriptor array) for each subset, into 3489## $subset_data_ref->[$isub], $isub = 1...$nsubsets (number of 3490## subsets). Extraction starts at position $pos in $bitstream. 3491sub _extract_compressed_value { 3492 my $self = shift; 3493 my ($id, $idesc, $pos, $bitstream, $nsubsets, $subset_data_ref, $desc_ref) = @_; 3494 my $B_table = $self->{B_TABLE}; 3495 3496 # For quality information, if this relates to a bit map we 3497 # need to store index of the data ($data_idesc) for which 3498 # the quality information applies, as well as the new 3499 # index ($idesc) in the descriptor array for the bit 3500 # mapped values 3501 if (substr($id,0,3) eq '033' 3502 && defined $self->{BITMAP_OPERATORS} 3503 && $self->{BITMAP_OPERATORS}->[-1] eq '222000') { 3504 if (defined $self->{REUSE_BITMAP}) { 3505 my $data_idesc = shift @{ $self->{REUSE_BITMAP}->[0] }; 3506 _croak "$id: Not enough quality values provided" 3507 if not defined $data_idesc; 3508 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[0] }, 3509 $data_idesc, $idesc; 3510 } else { 3511 my $data_idesc = shift @{ $self->{CURRENT_BITMAP} }; 3512 _croak "$id: Not enough quality values provided" 3513 if not defined $data_idesc; 3514 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[0] }, 3515 $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 3516 + $data_idesc, $idesc; 3517 } 3518 } 3519 3520 # Find the relevant entry in BUFR table B 3521 my ($name,$unit,$scale,$refval,$width); 3522 if ($id == 999999) { 3523 $name = 'ASSOCIATED FIELD'; 3524 $unit = 'NUMERIC'; 3525 $scale = 0; 3526 $refval = 0; 3527 $width = $self->{ADD_ASSOCIATED_FIELD}; 3528 } elsif ($id =~ /^205(\d\d\d)/) { # Signify character 3529 $name = 'CHARACTER INFORMATION'; 3530 $unit = 'CCITTIA5'; 3531 $scale = 0; 3532 $refval = 0; 3533 $width = 8*$1; 3534 } else { 3535 _croak "Data descriptor $id is not present in BUFR table B" 3536 if not exists $B_table->{$id}; 3537 ($name,$unit,$scale,$refval,$width) = split /\0/, $B_table->{$id}; 3538 3539 # Override Table B values if Data Description Operators are in effect 3540 if ($self->{NUM_CHANGE_OPERATORS} > 0) { 3541 if ($unit ne 'CCITTIA5' && $unit !~ /^(CODE|FLAG)/) { 3542 if (defined $self->{CHANGE_SRW}) { 3543 $scale += $self->{CHANGE_SRW}; 3544 $width += int((10*$self->{CHANGE_SRW}+2)/3); 3545 $refval *= 10*$self->{CHANGE_SRW}; 3546 } else { 3547 $scale += $self->{CHANGE_SCALE} if defined $self->{CHANGE_SCALE}; 3548 $width += $self->{CHANGE_WIDTH} if defined $self->{CHANGE_WIDTH}; 3549 } 3550 } elsif ($unit eq 'CCITTIA5' && defined $self->{CHANGE_CCITTIA5_WIDTH}) { 3551 $width = $self->{CHANGE_CCITTIA5_WIDTH} 3552 } 3553 $refval = $self->{NEW_REFVAL_OF}{$id} if defined $self->{NEW_REFVAL_OF}{$id}; 3554 # Difference statistical values use different width and reference value 3555 if ($self->{DIFFERENCE_STATISTICAL_VALUE}) { 3556 $width += 1; 3557 $refval = -2**$width; 3558 undef $self->{DIFFERENCE_STATISTICAL_VALUE}; 3559 $self->{NUM_CHANGE_OPERATORS}--; 3560 } 3561 } 3562 } 3563 $self->_spew(3, "%6s %-20s %s", $id, $unit, $name) if $Spew; 3564 _croak "$id Data width <= 0" if $width <= 0; 3565 3566 if ($unit eq 'CCITTIA5') { 3567 # Extract ASCII string ('minimum value') 3568 _croak "Width for unit CCITTIA5 must be integer bytes\n" 3569 . "is $width bits for descriptor $id" if $width % 8; 3570 my $minval = bitstream2ascii($bitstream, $pos, $width/8); 3571 if ($Spew) { 3572 if ($minval eq "\0" x ($width/8)) { 3573 $self->_spew(5, " Local reference value has all bits zero"); 3574 } else { 3575 $self->_spew(5, " Local reference value: %s", $minval); 3576 } 3577 } 3578 $pos += $width; 3579 # Extract number of bytes for next subsets 3580 my $deltabytes = bitstream2dec($bitstream, $pos, 6); 3581 $self->_spew(5, " Increment width (bytes): %d", $deltabytes) if $Spew; 3582 $pos += 6; 3583 if ($deltabytes && defined $minval) { 3584 # Extract compressed data for all subsets. According 3585 # to 94.6.3 (2) (i) in FM 94 BUFR, the first value for 3586 # character data shall be set to all bits zero 3587 my $nbytes = $width/8; 3588 _complain("Local reference value for compressed CCITTIA5 data " 3589 . "hasn't all bits set to zero, but is '$minval'") 3590 if $Strict_checking and $minval ne "\0" x $nbytes; 3591 my $incr_values; 3592 foreach my $isub (1..$nsubsets) { 3593 my $string = bitstream2ascii($bitstream, $pos, $deltabytes); 3594 if ($Spew) { 3595 $incr_values .= defined $string ? "$string," : ','; 3596 } 3597 # Trim string, also removing nulls 3598 $string = _trim($string, $id); 3599 push @{$subset_data_ref->[$isub]}, $string; 3600 $pos += 8*$deltabytes; 3601 } 3602 if ($Spew) { 3603 chop $incr_values; 3604 $self->_spew(5, " Increment values: %s", $incr_values); 3605 } 3606 } else { 3607 # If min value is defined => All subsets set to min value 3608 # If min value is undefined => Data in all subsets are undefined 3609 my $value = defined $minval ? $minval : undef; 3610 # Trim string, also removing nulls 3611 $value = _trim($value, $id); 3612 foreach my $isub (1..$nsubsets) { 3613 push @{$subset_data_ref->[$isub]}, $value; 3614 } 3615 $pos += $nsubsets*8*$deltabytes; 3616 } 3617 $self->_spew(3, " %s", join ',', 3618 map { defined($subset_data_ref->[$_][-1]) ? 3619 $subset_data_ref->[$_][-1] : 'missing'} 1..$nsubsets) if $Spew; 3620 } else { 3621 # Extract minimum value 3622 my $minval = bitstream2dec($bitstream, $pos, $width); 3623 $minval += $refval if defined $minval; 3624 $pos += $width; 3625 $self->_spew(5, " Local reference value: %d", $minval) if $Spew && defined $minval; 3626 3627 # Extract number of bits for next subsets 3628 my $deltabits = bitstream2dec($bitstream, $pos, 6); 3629 $pos += 6; 3630 $self->_spew(5, " Increment width (bits): %d", $deltabits) if $Spew; 3631 3632 if ($deltabits && defined $minval) { 3633 # Extract compressed data for all subsets 3634 my $incr_values; 3635 foreach my $isub (1..$nsubsets) { 3636 my $value = bitstream2dec($bitstream, $pos, $deltabits); 3637 _complain("value " . ($value + $minval) . " in subset $isub for " 3638 . "$id too big to be encoded without compression") 3639 if ($Strict_checking && defined $value && 3640 ($value + $minval) > 2**$width); 3641 $incr_values .= defined $value ? "$value," : ',' if $Spew; 3642 if (defined $value) { 3643 # Compute and format decoded value 3644 ($scale) = $scale =~ /(-?\d+)/; # untaint 3645 $value = $scale <= 0 ? ($value + $minval)/10**$scale 3646 : sprintf "%.${scale}f", ($value + $minval)/10**$scale; 3647 } 3648 # All bits set to 1 for associated field is NOT 3649 # interpreted as missing value 3650 if ($id == 999999 and ! defined $value) { 3651 $value = 2**$width - 1; 3652 } 3653 push @{$subset_data_ref->[$isub]}, $value; 3654 $pos += $deltabits; 3655 } 3656 if ($Spew) { 3657 chop $incr_values; 3658 $self->_spew(5, " Increment values: %s", $incr_values); 3659 } 3660 } else { 3661 # If minimum value is defined => All subsets set to minimum value 3662 # If minimum value is undefined => Data in all subsets are undefined 3663 my $value; 3664 if (defined $minval) { 3665 # Compute and format decoded value 3666 ($scale) = $scale =~ /(-?\d+)/; # untaint 3667 $value = $scale <= 0 ? $minval/10**$scale 3668 : sprintf "%.${scale}f", $minval/10**$scale; 3669 } 3670 # Exception: all bits set to 1 for associated field is NOT 3671 # interpreted as missing value 3672 if ($id == 999999 and ! defined $value) { 3673 $value = 2**$width - 1; 3674 } 3675 foreach my $isub (1..$nsubsets) { 3676 push @{$subset_data_ref->[$isub]}, $value; 3677 } 3678 $pos += $nsubsets*$deltabits if defined $deltabits; 3679 } 3680 3681 # Bit maps need special treatment. We are only able to 3682 # handle those where all subsets have exactly the same 3683 # bit map with the present method. 3684 if ($id eq '031031' and $self->{BUILD_BITMAP}) { 3685 _croak "$id: Unable to handle bit maps which differ between subsets" 3686 . " in compressed data" if $deltabits; 3687 # Store the index of expanded descriptors if data is 3688 # marked as present in data present indicator: 0 is 3689 # 'present', 1 (undef value) is 'not present' 3690 if (defined $minval) { 3691 push @{$self->{CURRENT_BITMAP}}, $self->{BITMAP_INDEX}; 3692 } 3693 $self->{BITMAP_INDEX}++; 3694 if ($self->{BACKWARD_DATA_REFERENCE} == $self->{NUM_BITMAPS}) { 3695 my $numb = $self->{NUM_BITMAPS}; 3696 if (!defined $self->{BITMAP_START}[$numb]) { 3697 # Look up the element descriptor immediately 3698 # preceding the bitmap operator 3699 my $i = $idesc; 3700 $i-- while ($desc_ref->[$i] ne $self->{BITMAP_OPERATORS}->[-1] 3701 && $i >=0); 3702 $i-- while ($desc_ref->[$i] > 100000 && $i >=0); 3703 _croak "No element descriptor preceding bitmap" if $i < 0; 3704 $self->{BITMAP_START}[$numb] = $i; 3705 } else { 3706 if ($Show_all_operators) { 3707 my $i = $self->{BITMAP_START}[$numb] - 1; 3708 $i-- while ($desc_ref->[$i] > 100000 && $i >=0); 3709 $self->{BITMAP_START}[$numb] = $i; 3710 } else { 3711 $self->{BITMAP_START}[$numb]--; 3712 } 3713 _croak "Bitmap too big" 3714 if $self->{BITMAP_START}[$numb] < 0; 3715 } 3716 } 3717 } elsif ($self->{BUILD_BITMAP} and $self->{BITMAP_INDEX} > 0) { 3718 # We have finished building the bit map 3719 $self->{BUILD_BITMAP} = 0; 3720 $self->{BITMAP_INDEX} = 0; 3721 if ($self->{BACKWARD_DATA_REFERENCE} != $self->{NUM_BITMAPS}) { 3722 $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 3723 = $self->{BITMAP_START}[$self->{BACKWARD_DATA_REFERENCE}]; 3724 } 3725 } 3726 $self->_spew(3, " %s", join ' ', 3727 map { defined($subset_data_ref->[$_][-1]) ? 3728 $subset_data_ref->[$_][-1] : 'missing'} 1..$nsubsets) if $Spew; 3729 } 3730 return $pos; 3731} 3732 3733## Takes a text $decoded_message as argument and returns BUFR messages 3734## which would give the same output as $decoded_message when running 3735## dumpsection0(), dumpsection1(), dumpsection3() and dumpsection4() in 3736## turn on each of the reencoded BUFR messages 3737sub reencode_message { 3738 my $self = shift; 3739 my $decoded_message = shift; 3740 my $width = shift || 15; # Optional argument 3741 # Data values usually start at column 31, but if a $width 3742 # different from 15 was used in dumpsection4 you should use the 3743 # same value here 3744 3745 my @lines = split /\n/, $decoded_message; 3746 my $bufr_messages = ''; 3747 my $i = 0; 3748 3749 MESSAGE: while ($i < @lines) { 3750 # Some tidying after decoding of previous message might be 3751 # necessary 3752 $self->{NUM_CHANGE_OPERATORS} = 0; 3753 undef $self->{CHANGE_WIDTH}; 3754 undef $self->{CHANGE_CCITTIA5_WIDTH}; 3755 undef $self->{CHANGE_SCALE}; 3756 undef $self->{CHANGE_REFERENCE_VALUE}; 3757 undef $self->{NEW_REFVAL_OF}; 3758 undef $self->{CHANGE_SRW}; 3759 undef $self->{ADD_ASSOCIATED_FIELD}; 3760 undef $self->{BITMAPS}; 3761 undef $self->{BITMAP_OPERATORS}; 3762 undef $self->{REUSE_BITMAP}; 3763 $self->{NUM_BITMAPS} = 0; 3764 # $self->{LOCAL_USE} is always set for BUFR edition < 4 in _encode_sec1 3765 undef $self->{LOCAL_USE}; 3766 3767 # Extract section 0 info 3768 $i++ while $lines[$i] !~ /^Section 0/ and $i < @lines-1; 3769 last MESSAGE if $i >= @lines-1; # Not containing any decoded BUFR message 3770 $i++; # Skip length of BUFR message 3771 ($self->{BUFR_EDITION}) = $lines[++$i] 3772 =~ /BUFR edition:\s+(\d+)/; 3773 _croak "BUFR edition number not provided or is not a number" 3774 unless defined $self->{BUFR_EDITION}; 3775 3776 # Extract section 1 info 3777 $i++ while $lines[$i] !~ /^Section 1/; 3778 _croak "reencode_message: Don't find decoded section 1" if $i >= @lines; 3779 $i++; # Skip length of section 1 3780 if ($self->{BUFR_EDITION} < 4 ) { 3781 ($self->{MASTER_TABLE}) = $lines[++$i] 3782 =~ /BUFR master table:\s+(\d+)/; 3783 ($self->{SUBCENTRE}) = $lines[++$i] 3784 =~ /Originating subcentre:\s+(\d+)/; 3785 ($self->{CENTRE}) = $lines[++$i] 3786 =~ /Originating centre:\s+(\d+)/; 3787 ($self->{UPDATE_NUMBER}) = $lines[++$i] 3788 =~ /Update sequence number:\s+(\d+)/; 3789 ($self->{OPTIONAL_SECTION}) = $lines[++$i] 3790 =~ /Optional section present:\s+(\d+)/; 3791 ($self->{DATA_CATEGORY}) = $lines[++$i] 3792 =~ /Data category \(table A\):\s+(\d+)/; 3793 ($self->{DATA_SUBCATEGORY}) = $lines[++$i] 3794 =~ /Data subcategory:\s+(\d+)/; 3795 ($self->{MASTER_TABLE_VERSION}) = $lines[++$i] 3796 =~ /Master table version number:\s+(\d+)/; 3797 ($self->{LOCAL_TABLE_VERSION}) = $lines[++$i] 3798 =~ /Local table version number:\s+(\d+)/; 3799 ($self->{YEAR_OF_CENTURY}) = $lines[++$i] 3800 =~ /Year of century:\s+(\d+)/; 3801 ($self->{MONTH}) = $lines[++$i] 3802 =~ /Month:\s+(\d+)/; 3803 ($self->{DAY}) = $lines[++$i] 3804 =~ /Day:\s+(\d+)/; 3805 ($self->{HOUR}) = $lines[++$i] 3806 =~ /Hour:\s+(\d+)/; 3807 ($self->{MINUTE}) = $lines[++$i] 3808 =~ /Minute:\s+(\d+)/; 3809 _croak "reencode_message: Something seriously wrong in decoded section 1" 3810 unless defined $self->{MINUTE}; 3811 } elsif ($self->{BUFR_EDITION} == 4) { 3812 ($self->{MASTER_TABLE}) = $lines[++$i] 3813 =~ /BUFR master table:\s+(\d+)/; 3814 ($self->{CENTRE}) = $lines[++$i] 3815 =~ /Originating centre:\s+(\d+)/; 3816 ($self->{SUBCENTRE}) = $lines[++$i] 3817 =~ /Originating subcentre:\s+(\d+)/; 3818 ($self->{UPDATE_NUMBER}) = $lines[++$i] 3819 =~ /Update sequence number:\s+(\d+)/; 3820 ($self->{OPTIONAL_SECTION}) = $lines[++$i] 3821 =~ /Optional section present:\s+(\d+)/; 3822 ($self->{DATA_CATEGORY}) = $lines[++$i] 3823 =~ /Data category \(table A\):\s+(\d+)/; 3824 ($self->{INT_DATA_SUBCATEGORY}) = $lines[++$i] 3825 =~ /International data subcategory:\s+(\d+)/; 3826 ($self->{LOC_DATA_SUBCATEGORY}) = $lines[++$i] 3827 =~ /Local data subcategory:\s+(\d+)/; 3828 ($self->{MASTER_TABLE_VERSION}) = $lines[++$i] 3829 =~ /Master table version number:\s+(\d+)/; 3830 ($self->{LOCAL_TABLE_VERSION}) = $lines[++$i] 3831 =~ /Local table version number:\s+(\d+)/; 3832 ($self->{YEAR}) = $lines[++$i] 3833 =~ /Year:\s+(\d+)/; 3834 ($self->{MONTH}) = $lines[++$i] 3835 =~ /Month:\s+(\d+)/; 3836 ($self->{DAY}) = $lines[++$i] 3837 =~ /Day:\s+(\d+)/; 3838 ($self->{HOUR}) = $lines[++$i] 3839 =~ /Hour:\s+(\d+)/; 3840 ($self->{MINUTE}) = $lines[++$i] 3841 =~ /Minute:\s+(\d+)/; 3842 ($self->{SECOND}) = $lines[++$i] 3843 =~ /Second:\s+(\d+)/; 3844 _croak "reencode_message: Something seriously wrong in decoded section 1" 3845 unless defined $self->{SECOND}; 3846 } 3847 3848 # Extract section 3 info 3849 $i++ while $lines[$i] !~ /^Section 3/; 3850 _croak "reencode_message: Don't find decoded section 3" if $i >= @lines; 3851 $i++; # Skip length of section 3 3852 3853 ($self->{NUM_SUBSETS}) = $lines[++$i] 3854 =~ /Number of data subsets:\s+(\d+)/; 3855 _croak "Don't support reencoding of 0 subset message" 3856 if $self->{NUM_SUBSETS} == 0; 3857 ($self->{OBSERVED_DATA}) = $lines[++$i] 3858 =~ /Observed data:\s+(\d+)/; 3859 ($self->{COMPRESSED_DATA}) = $lines[++$i] 3860 =~ /Compressed data:\s+(\d+)/; 3861 ($self->{DESCRIPTORS_UNEXPANDED}) = $lines[++$i] 3862 =~ /Data descriptors unexpanded:\s+(\d+.*)/; 3863 _croak "reencode_message: Something seriously wrong in decoded section 3" 3864 unless defined $self->{DESCRIPTORS_UNEXPANDED}; 3865 3866 # Extract data values to use in section 4 3867 my ($data_refs, $desc_refs); 3868 my $subset = 0; 3869 SUBSET: while ($i < @lines-1) { 3870 $_ = $lines[++$i]; 3871 next SUBSET if /^$/ or /^Subset/; 3872 last SUBSET if /^Message/; 3873 $_ = substr $_, 0, $width + 16; 3874 s/^\s+//; 3875 next SUBSET if not /^\d/; 3876 my ($n, $desc, $value) = split /\s+/, $_, 3; 3877 $subset++ if $n == 1; 3878 if (defined $value) { 3879 $value =~ s/\s+$//; 3880 $value = undef if $value eq '' or $value eq 'missing'; 3881 } else { 3882 # Some descriptors are not numbered (like 222000) 3883 $desc = $n; 3884 $value = ''; 3885 } 3886 push @{$data_refs->[$subset]}, $value; 3887 push @{$desc_refs->[$subset]}, $desc; 3888 } 3889 3890 # If optional section is present, pretend it is not, because we 3891 # are not able to encode this section 3892 if ($self->{OPTIONAL_SECTION}) { 3893 $self->{OPTIONAL_SECTION} = 0; 3894 carp "Warning: 'Optional section present' changed from 1 to 0'\n"; 3895 } 3896 3897 $bufr_messages .= $self->encode_message($data_refs, $desc_refs); 3898 } 3899 3900 return $bufr_messages; 3901} 3902 3903 3904## Encode a new BUFR message. All relevant metadata 3905## ($self->{BUFR_EDITION} etc) must have been initialized already or 3906## else the _encode_sec routines will croak. 3907sub encode_message { 3908 my $self = shift; 3909 my ($data_refs, $desc_refs) = @_; 3910 3911 _croak "encode_message: No data/descriptors provided" unless $desc_refs; 3912 3913 $self->{MESSAGE_NUMBER}++; 3914 $self->_spew(2, "Encoding message number %d", $self->{MESSAGE_NUMBER}) if $Spew; 3915 3916 $self->load_BDtables(); 3917 3918 $self->_spew(2, "Encoding section 1-3") if $Spew; 3919 my $sec1_stream = $self->_encode_sec1(); 3920 my $sec2_stream = $self->_encode_sec2(); 3921 my $sec3_stream = $self->_encode_sec3(); 3922 $self->_spew(2, "Encoding section 4") if $Spew; 3923 my $sec4_stream = $self->_encode_sec4($data_refs, $desc_refs); 3924 3925 # Compute length of whole message and encode section 0 3926 my $msg_len = 8 + length($sec1_stream) + length($sec2_stream) 3927 + length($sec3_stream) + length($sec4_stream) + 4; 3928 my $msg_len_binary = pack("N", $msg_len); 3929 my $bufr_edition_binary = pack('n', $self->{BUFR_EDITION}); 3930 my $sec0_stream = 'BUFR' . substr($msg_len_binary,1,3) 3931 . substr($bufr_edition_binary,1,1); 3932 3933 my $new_message = $sec0_stream . $sec1_stream . $sec2_stream 3934 . $sec3_stream . $sec4_stream . '7777'; 3935 return $new_message; 3936} 3937 3938## Encode and return section 1 3939sub _encode_sec1 { 3940 my $self = shift; 3941 3942 my $bufr_edition = $self->{BUFR_EDITION} or 3943 _croak "_encode_sec1: BUFR edition not defined"; 3944 3945 my @keys = qw( MASTER_TABLE CENTRE SUBCENTRE UPDATE_NUMBER 3946 OPTIONAL_SECTION DATA_CATEGORY MASTER_TABLE_VERSION 3947 LOCAL_TABLE_VERSION MONTH DAY HOUR MINUTE ); 3948 if ($bufr_edition < 4) { 3949 push @keys, qw( DATA_SUBCATEGORY YEAR_OF_CENTURY ); 3950 } elsif ($bufr_edition == 4) { 3951 push @keys, qw( INT_DATA_SUBCATEGORY LOC_DATA_SUBCATEGORY YEAR SECOND ); 3952 } 3953 3954 # Check that the required variables for section 1 are provided 3955 foreach my $key (@keys) { 3956 _croak "_encode_sec1: $key not given" 3957 unless defined $self->{$key}; 3958 } 3959 3960 $self->_validate_datetime() if ($Strict_checking); 3961 3962 my $sec1_stream; 3963 # Byte 4- 3964 if ($bufr_edition < 4) { 3965 $self->{LOCAL_USE} = "\0" if !defined $self->{LOCAL_USE}; 3966 $sec1_stream = pack 'C14a*', 3967 $self->{MASTER_TABLE}, 3968 $self->{SUBCENTRE}, 3969 $self->{CENTRE}, 3970 $self->{UPDATE_NUMBER}, 3971 $self->{OPTIONAL_SECTION} ? 128 : 0, 3972 $self->{DATA_CATEGORY}, 3973 $self->{DATA_SUBCATEGORY}, 3974 $self->{MASTER_TABLE_VERSION}, 3975 $self->{LOCAL_TABLE_VERSION}, 3976 $self->{YEAR_OF_CENTURY}, 3977 $self->{MONTH}, 3978 $self->{DAY}, 3979 $self->{HOUR}, 3980 $self->{MINUTE}, 3981 $self->{LOCAL_USE}; 3982 } elsif ($bufr_edition == 4) { 3983 $sec1_stream = pack 'CnnC7nC5', 3984 $self->{MASTER_TABLE}, 3985 $self->{CENTRE}, 3986 $self->{SUBCENTRE}, 3987 $self->{UPDATE_NUMBER}, 3988 $self->{OPTIONAL_SECTION} ? 128 : 0, 3989 $self->{DATA_CATEGORY}, 3990 $self->{INT_DATA_SUBCATEGORY}, 3991 $self->{LOC_DATA_SUBCATEGORY}, 3992 $self->{MASTER_TABLE_VERSION}, 3993 $self->{LOCAL_TABLE_VERSION}, 3994 $self->{YEAR}, 3995 $self->{MONTH}, 3996 $self->{DAY}, 3997 $self->{HOUR}, 3998 $self->{MINUTE}, 3999 $self->{SECOND}; 4000 $sec1_stream .= pack 'a*', $self->{LOCAL_USE} 4001 if defined $self->{LOCAL_USE}; 4002 } 4003 4004 my $sec1_len = 3 + length $sec1_stream; 4005 if ($bufr_edition < 4) { 4006 # Each section should be an even number of octets 4007 if ($sec1_len % 2) { 4008 $sec1_stream .= "\0"; 4009 $sec1_len++; 4010 } 4011 } 4012 4013 # Byte 1-3 4014 my $sec1_len_binary = substr pack("N", $sec1_len), 1, 3; 4015 4016 return $sec1_len_binary . $sec1_stream; 4017} 4018 4019## Encode and return section 2 (empty string if no optional section) 4020sub _encode_sec2 { 4021 my $self = shift; 4022 if ($self->{OPTIONAL_SECTION}) { 4023 _croak "_encode_sec2: No optional section provided" 4024 unless defined $self->{SEC2_STREAM}; 4025 return $self->{SEC2_STREAM}; 4026 } else { 4027 return ''; 4028 } 4029} 4030 4031## Encode and return section 3 4032sub _encode_sec3 { 4033 my $self = shift; 4034 4035 # Check that the required variables for section 3 are provided 4036 foreach my $key (qw(NUM_SUBSETS OBSERVED_DATA COMPRESSED_DATA 4037 DESCRIPTORS_UNEXPANDED)) { 4038 _croak "_encode_sec3: $key not given" 4039 unless defined $self->{$key}; 4040 } 4041 4042 my @desc = split / /, $self->{DESCRIPTORS_UNEXPANDED}; 4043 4044 # Byte 5-6 4045 my $nsubsets_binary = pack "n", $self->{NUM_SUBSETS}; 4046 4047 # Byte 7 4048 my $flag = pack 'C', $self->{OBSERVED_DATA}*128 + 4049 $self->{COMPRESSED_DATA}*64; 4050 4051 # Byte 8- 4052 my $desc_binary = "\0\0" x @desc; 4053 my $pos = 0; 4054 foreach my $desc (@desc) { 4055 my $f = substr($desc,0,1); 4056 my $x = substr($desc,1,2)+0; 4057 my $y = substr($desc,3,3)+0; 4058 dec2bitstream($f, $desc_binary, $pos, 2); 4059 $pos += 2; 4060 dec2bitstream($x, $desc_binary, $pos, 6); 4061 $pos += 6; 4062 dec2bitstream($y, $desc_binary, $pos, 8); 4063 $pos += 8; 4064 } 4065 4066 my $sec3_len = 7 + length $desc_binary; 4067 if ($self->{BUFR_EDITION} < 4) { 4068 # Each section should be an even number of octets 4069 if ($sec3_len % 2) { 4070 $desc_binary .= "\0"; 4071 $sec3_len++; 4072 } 4073 } 4074 4075 # Byte 1-4 4076 my $sec3_len_binary = pack("N", $sec3_len); 4077 my $sec3_start = substr($sec3_len_binary, 1, 3) . "\0"; 4078 4079 return $sec3_start . $nsubsets_binary . $flag . $desc_binary; 4080} 4081 4082## Encode and return section 4 4083sub _encode_sec4 { 4084 my $self = shift; 4085 my ($data_refs, $desc_refs) = @_; 4086 4087 # Check that dimension of argument arrays agrees with number of 4088 # subsets in section 3 4089 my $nsubsets = $self->{NUM_SUBSETS}; 4090 _croak "Wrong number of subsets ($nsubsets) in section 3?\n" 4091 . "Disagrees with dimension of descriptor array used as argument " 4092 . "to encode_message()" 4093 unless @$desc_refs == $nsubsets + 1; 4094 4095 my ($bitstream, $byte_len) = $self->{COMPRESSED_DATA} 4096 ? $self->_encode_compressed_bitstream($data_refs, $desc_refs) 4097 : $self->_encode_bitstream($data_refs, $desc_refs); 4098 4099 my $sec4_len = $byte_len + 4; 4100 my $sec4_len_binary = pack("N", $sec4_len); 4101 my $sec4_stream = substr($sec4_len_binary, 1, 3) . "\0" . $bitstream; 4102 4103 return $sec4_stream; 4104} 4105 4106## Encode a nil message, i.e. all values set to missing except delayed 4107## replication factors and the (descriptor, value) pairs in the hash 4108## ref $stationid_ref. Delayed replication factors will all be set to 4109## 1 unless $delayed_repl_ref is provided, in which case the 4110## descriptors 031001 and 031002 will get the values contained in 4111## @$delayed_repl_ref. Note that data in section 1 and 3 must have 4112## been set before calling this method. 4113sub encode_nil_message { 4114 my $self = shift; 4115 my ($stationid_ref, $delayed_repl_ref) = @_; 4116 4117 _croak "encode_nil_message: No station descriptors provided" 4118 unless $stationid_ref; 4119 4120 my $bufr_edition = $self->{BUFR_EDITION} or 4121 _croak "encode_nil_message: BUFR edition not defined"; 4122 4123 # Since a nil message necessarily is a one subset message, some 4124 # metadata might need to be adjusted (saving the user for having 4125 # to remember this) 4126 $self->set_number_of_subsets(1); 4127 $self->set_compressed_data(0); 4128 4129 $self->load_BDtables(); 4130 4131 $self->_spew(2, "Encoding NIL message") if $Spew; 4132 my $sec1_stream = $self->_encode_sec1(); 4133 my $sec3_stream = $self->_encode_sec3(); 4134 my $sec4_stream = $self->_encode_nil_sec4($stationid_ref, 4135 $delayed_repl_ref); 4136 4137 # Compute length of whole message and encode section 0 4138 my $msg_len = 8 + length($sec1_stream) + length($sec3_stream) 4139 + length($sec4_stream) + 4; 4140 my $msg_len_binary = pack("N", $msg_len); 4141 my $bufr_edition_binary = pack('n', $bufr_edition); 4142 my $sec0_stream = 'BUFR' . substr($msg_len_binary,1,3) 4143 . substr($bufr_edition_binary,1,1); 4144 4145 my $new_message = $sec0_stream . $sec1_stream . $sec3_stream . $sec4_stream 4146 . '7777'; 4147 return $new_message; 4148} 4149 4150## Encode and return section 4 with all values set to missing except 4151## delayed replication factors and the (descriptor, value) pairs in 4152## the hash ref $stationid_ref. Delayed replication factors will all 4153## be set to 1 unless $delayed_repl_ref is provided, in which case the 4154## descriptors 031001 and 031002 will get the values contained in 4155## @$delayed_repl_ref (in that order). 4156sub _encode_nil_sec4 { 4157 my $self = shift; 4158 $self->{CODING} = 'ENCODE'; 4159 my ($stationid_ref, $delayed_repl_ref) = @_; 4160 my @delayed_repl = defined $delayed_repl_ref ? @$delayed_repl_ref : (); 4161 4162 # Get the expanded list of descriptors (i.e. expanded with table D) 4163 if (not $self->{DESCRIPTORS_EXPANDED}) { 4164 _croak "_encode_nil_sec4: DESCRIPTORS_UNEXPANDED not given" 4165 unless $self->{DESCRIPTORS_UNEXPANDED}; 4166 my @unexpanded = split / /, $self->{DESCRIPTORS_UNEXPANDED}; 4167 _croak "_encode_nil_sec4: D_TABLE not given" 4168 unless $self->{D_TABLE}; 4169 my $alias = "$self->{TABLE_VERSION} " . $self->{DESCRIPTORS_UNEXPANDED}; 4170 if (exists $Descriptors_already_expanded{$alias}) { 4171 $self->{DESCRIPTORS_EXPANDED} = $Descriptors_already_expanded{$alias}; 4172 } else { 4173 $Descriptors_already_expanded{$alias} = $self->{DESCRIPTORS_EXPANDED} 4174 = join " ", _expand_descriptors($self->{D_TABLE}, @unexpanded); 4175 } 4176 } 4177 4178 # The rest is very similar to sub _decode_bitstream, except that we 4179 # now are encoding, not decoding a bitstream, with most values set 4180 # to missing value, and we do not need to fully expand the 4181 # descriptors. 4182 my $B_table = $self->{B_TABLE}; 4183 my @operators; 4184 my $bitstream = chr(255) x 65536; # one bits only 4185 my $pos = 0; 4186 4187 my @desc = split /\s/, $self->{DESCRIPTORS_EXPANDED}; 4188 D_LOOP: for (my $idesc = 0; $idesc < @desc; $idesc++) { 4189 4190 my $id = $desc[$idesc]; 4191 my $f = substr($id,0,1); 4192 my $x = substr($id,1,2)+0; 4193 my $y = substr($id,3,3)+0; 4194 4195 if ($f == 1) { 4196 # Delayed replication 4197 if ($x == 0) { 4198 _complain("Nonsensical replication of zero descriptors ($id)"); 4199 $idesc++; 4200 next D_LOOP; 4201 } 4202 _croak "$id _expand_descriptors() did not do its job" 4203 if $y > 0; 4204 4205 $_ = $desc[$idesc+1]; 4206 _croak "$id Erroneous replication factor" 4207 unless /^0310(00|01|02|11|12)/ && exists $B_table->{$_}; 4208 my $factor = 1; 4209 if (@delayed_repl && /^03100(1|2)/) { 4210 $factor = shift @delayed_repl; 4211 croak "Delayed replication factor must be positive integer in " 4212 . "encode_nil_message, is '$factor'" 4213 if ($factor !~ /^\d+$/ || $factor == 0); 4214 } 4215 my ($name,$unit,$scale,$refval,$width) = split /\0/, $B_table->{$_}; 4216 if ($Spew) { 4217 $self->_spew(3, "%6s %-20s %s", $id, $unit, $name); 4218 $self->_spew(3, " %s", $factor); 4219 } 4220 dec2bitstream($factor, $bitstream, $pos, $width); 4221 $pos += $width; 4222 # Include the delayed replication in descriptor list 4223 splice @desc, $idesc++, 0, $_; 4224 4225 my @r = (); 4226 push @r, @desc[($idesc+2)..($idesc+$x+1)] while $factor--; 4227 $self->_spew(4, "Delayed replication ($id $_ -> @r)") if $Spew; 4228 splice @desc, $idesc, 2+$x, @r; 4229 4230 if ($idesc < @desc) { 4231 redo D_LOOP; 4232 } else { 4233 last D_LOOP; # Might happen if delayed factor is 0 4234 } 4235 4236 } elsif ($f == 2) { 4237 my $next_id = $desc[$idesc+1]; 4238 my $flow; 4239 my $bm_idesc; 4240 ($pos, $flow, $bm_idesc, @operators) 4241 = $self->_apply_operator_descriptor($id, $x, $y, $pos, 0, 4242 $next_id, @operators); 4243 next D_LOOP if $flow eq 'next'; 4244 } 4245 4246 # We now have a "real" data descriptor 4247 4248 # Find the relevant entry in BUFR table B 4249 _croak "Data descriptor $id is not present in BUFR table B" 4250 unless exists $B_table->{$id}; 4251 my ($name,$unit,$scale,$refval,$width) = split /\0/, $B_table->{$id}; 4252 $self->_spew(3, "%6s %-20s %s", $id, $unit, $name) if $Spew; 4253 4254 # Override Table B values if Data Description Operators are in effect 4255 if ($self->{NUM_CHANGE_OPERATORS} > 0) { 4256 if ($unit ne 'CCITTIA5' && $unit !~ /^(CODE|FLAG)/) { 4257 if (defined $self->{CHANGE_SRW}) { 4258 $scale += $self->{CHANGE_SRW}; 4259 $width += int((10*$self->{CHANGE_SRW}+2)/3); 4260 $refval *= 10*$self->{CHANGE_SRW}; 4261 } else { 4262 $scale += $self->{CHANGE_SCALE} if defined $self->{CHANGE_SCALE}; 4263 $width += $self->{CHANGE_WIDTH} if defined $self->{CHANGE_WIDTH}; 4264 } 4265 } elsif ($unit eq 'CCITTIA5' && defined $self->{CHANGE_CCITTIA5_WIDTH}) { 4266 $width = $self->{CHANGE_CCITTIA5_WIDTH} 4267 } 4268 $refval = $self->{NEW_REFVAL_OF}{$id} if defined $self->{NEW_REFVAL_OF}{$id}; 4269 } 4270 _croak "$id Data width <= 0" if $width <= 0; 4271 4272 if ($stationid_ref->{$id}) { 4273 my $value = $stationid_ref->{$id}; 4274 $self->_spew(3, " %s", $value) if $Spew; 4275 if ($unit eq 'CCITTIA5') { 4276 # Encode ASCII string in $width bits (left justified, 4277 # padded with spaces) 4278 my $num_bytes = int($width/8); 4279 _croak "Ascii string too long to fit in $width bits: $value" 4280 if length($value) > $num_bytes; 4281 $value .= ' ' x ($num_bytes - length($value)); 4282 ascii2bitstream($value, $bitstream, $pos, $num_bytes); 4283 } else { 4284 # Encode value as integer in $width bits 4285 $value = int($value * 10**$scale - $refval + 0.5); 4286 _croak "Data value no $id is negative: $value" 4287 if $value < 0; 4288 dec2bitstream($value, $bitstream, $pos, $width); 4289 } 4290 } else { 4291 # Missing value is encoded as 1 bits 4292 } 4293 $pos += $width; 4294 } 4295 4296 # Pad with 0 bits if necessary to get an even or integer number of 4297 # octets, depending on bufr edition 4298 my $padnum = $self->{BUFR_EDITION} < 4 ? (16-($pos%16)) % 16 : (8-($pos%8)) % 8; 4299 if ($padnum > 0) { 4300 null2bitstream($bitstream, $pos, $padnum); 4301 } 4302 my $len = ($pos + $padnum)/8; 4303 $bitstream = substr $bitstream, 0, $len; 4304 4305 # Encode section 4 4306 my $sec4_len_binary = pack("N", $len + 4); 4307 my $sec4_stream = substr($sec4_len_binary, 1, 3) . "\0" . $bitstream; 4308 4309 return $sec4_stream; 4310} 4311 4312## Encode bitstream using the data values in $data_refs, first 4313## expanding section 3 fully (and comparing with $desc_refs to check 4314## for consistency). This sub is very similar to sub _decode_bitstream 4315sub _encode_bitstream { 4316 my $self = shift; 4317 $self->{CODING} = 'ENCODE'; 4318 my ($data_refs, $desc_refs) = @_; 4319 4320 # Expand section 3 except for delayed replication and operator descriptors 4321 my @unexpanded = split / /, $self->{DESCRIPTORS_UNEXPANDED}; 4322 my $alias = "$self->{TABLE_VERSION} " . $self->{DESCRIPTORS_UNEXPANDED}; 4323 if (exists $Descriptors_already_expanded{$alias}) { 4324 $self->{DESCRIPTORS_EXPANDED} = $Descriptors_already_expanded{$alias}; 4325 } else { 4326 $Descriptors_already_expanded{$alias} = $self->{DESCRIPTORS_EXPANDED} 4327 = join " ", _expand_descriptors($self->{D_TABLE}, @unexpanded); 4328 } 4329 4330 my $nsubsets = $self->{NUM_SUBSETS}; 4331 my $B_table = $self->{B_TABLE}; 4332 my $maxlen = 1024; 4333 my $bitstream = chr(255) x $maxlen; # one bits only 4334 my $pos = 0; 4335 my @operators; 4336 4337 S_LOOP: foreach my $isub (1..$nsubsets) { 4338 $self->_spew(2, "Encoding subset number %d", $isub) if $Spew; 4339 4340 # Bit maps might vary from subset to subset, so must be rebuilt 4341 undef $self->{BITMAP_OPERATORS}; 4342 undef $self->{BITMAP_START}; 4343 undef $self->{REUSE_BITMAP}; 4344 $self->{NUM_BITMAPS} = 0; 4345 $self->{BACKWARD_DATA_REFERENCE} = 1; 4346 $self->{NUM_CHANGE_OPERATORS} = 0; 4347 4348 # The data values to use for this subset 4349 my $data_ref = $data_refs->[$isub]; 4350 # The descriptors from expanding section 3 4351 my @desc = split /\s/, $self->{DESCRIPTORS_EXPANDED}; 4352 # The descriptors to compare with for this subset 4353 my $desc_ref = $desc_refs->[$isub]; 4354 4355 # Note: @desc as well as $idesc may be changed during this loop, 4356 # so we cannot use a foreach loop instead 4357 D_LOOP: for (my $idesc = 0; $idesc < @desc; $idesc++) { 4358 my $id = $desc[$idesc] 4359 || _croak("No descriptor no. $idesc defined. Consider using --strict_checking 2" 4360 . " or --verbose 4 to explore what went wrong in the encoding"); 4361 my $f = substr($id,0,1); 4362 my $x = substr($id,1,2)+0; 4363 my $y = substr($id,3,3)+0; 4364 4365 if ($f == 1) { 4366 # Delayed replication 4367 if ($x == 0) { 4368 _complain("Nonsensical replication of zero descriptors ($id)"); 4369 $idesc++; 4370 next D_LOOP; 4371 } 4372 _croak "$id _expand_descriptors() did not do its job" 4373 if $y > 0; 4374 4375 my $next_id = $desc[$idesc+1]; 4376 _croak "$id Erroneous replication factor" 4377 unless $next_id =~ /^0310(00|01|02|11|12)/ && exists $B_table->{$next_id}; 4378 _croak "Descriptor no $idesc is $desc_ref->[$idesc], expected $next_id" 4379 if $desc_ref->[$idesc] != $next_id; 4380 my $factor = $data_ref->[$idesc]; 4381 my ($name,$unit,$scale,$refval,$width) = split /\0/, $B_table->{$next_id}; 4382 if ($Spew) { 4383 $self->_spew(3, "%6s %-20s %s", $next_id, $unit, $name); 4384 $self->_spew(3, " %s", $factor); 4385 } 4386 ($bitstream, $pos, $maxlen) 4387 = $self->_encode_value($factor,$isub,$unit,$scale,$refval, 4388 $width,$next_id,$bitstream,$pos,$maxlen); 4389 # Include the delayed replication/repetition in descriptor list 4390 splice @desc, $idesc++, 0, $next_id; 4391 4392 my @r = (); 4393 push @r, @desc[($idesc+2)..($idesc+$x+1)] while $factor--; 4394 splice @desc, $idesc, 2+$x, @r; 4395 4396 if ($next_id eq '031011' || $next_id eq '031012') { 4397 # For delayed repetition we should include data just 4398 # once, so skip to the last set in data array 4399 $idesc += $x * ($data_ref->[$idesc-1] - 1); 4400 # We ought to check that the data sets we skipped are 4401 # indeed equal to the last set! 4402 $self->_spew(4, "Delayed repetition ($id $next_id -> @r)") if $Spew; 4403 } else { 4404 $self->_spew(4, "Delayed replication ($id $next_id -> @r)") if $Spew; 4405 } 4406 if ($idesc < @desc) { 4407 redo D_LOOP; 4408 } else { 4409 last D_LOOP; # Might happen if delayed factor is 0 4410 } 4411 4412 } elsif ($f == 2) { 4413 my $flow; 4414 my $bm_idesc; 4415 ($pos, $flow, $bm_idesc, @operators) 4416 = $self->_apply_operator_descriptor($id, $x, $y, $pos, $isub, 4417 $desc[$idesc+1], @operators); 4418 if ($flow eq 'redo_bitmap') { 4419 # Data value is associated with the descriptor 4420 # defined by bit map. Remember original and new 4421 # index in descriptor array for the bit mapped 4422 # values ('dr' = data reference) 4423 my $dr_idesc; 4424 if (!defined $bm_idesc) { 4425 $dr_idesc = shift @{ $self->{REUSE_BITMAP}->[$isub]}; 4426 } elsif (!$Show_all_operators) { 4427 $dr_idesc = $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 4428 + $bm_idesc; 4429 } else { 4430 $dr_idesc = $self->{BITMAP_START}[$self->{NUM_BITMAPS}]; 4431 # Skip operator descriptors 4432 while ($bm_idesc-- > 0) { 4433 $dr_idesc++; 4434 $dr_idesc++ while ($desc[$dr_idesc] >= 200000); 4435 } 4436 } 4437 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[$isub] }, 4438 $dr_idesc, $idesc; 4439 $desc[$idesc] = $desc[$dr_idesc]; 4440 redo D_LOOP; 4441 } elsif ($flow eq 'signify_character') { 4442 _croak "Descriptor no $idesc is $desc_ref->[$idesc], expected $id" 4443 if $desc_ref->[$idesc] != $id; 4444 # Get ASCII string 4445 my $value = $data_ref->[$idesc]; 4446 my $name = 'SIGNIFY CHARACTER'; 4447 my $unit = 'CCITTIA5'; 4448 my ($scale, $refval, $width) = (0, 0, 8*$y); 4449 ($bitstream, $pos, $maxlen) 4450 = $self->_encode_value($value,$isub,$unit,$scale,$refval,$width,"205$y",$bitstream,$pos,$maxlen); 4451 next D_LOOP; 4452 } elsif ($flow eq 'no_value') { 4453 next D_LOOP; 4454 } 4455 4456 # Remove operator descriptor from @desc 4457 splice @desc, $idesc--, 1; 4458 4459 next D_LOOP if $flow eq 'next'; 4460 last D_LOOP if $flow eq 'last'; 4461 } 4462 4463 if ($self->{CHANGE_REFERENCE_VALUE}) { 4464 # The data descriptor is to be associated with a new 4465 # reference value, which is fetched from data stream, 4466 # possibly with f=9 instead of f=0 for descriptor 4467 $id -= 900000 if $id =~ /^9/; 4468 _croak "Change reference operator 203Y is not followed by element" 4469 . " descriptor, but $id" if $f > 0; 4470 my $new_refval = $data_ref->[$idesc]; 4471 $self->{NEW_REFVAL_OF}{$id}{$isub} = $new_refval; 4472 ($bitstream, $pos, $maxlen) 4473 = $self->_encode_reference_value($new_refval,$id,$bitstream,$pos,$maxlen); 4474 next D_LOOP; 4475 } 4476 4477 # If operator 204$y 'Add associated field' is in effect, 4478 # each data value is preceded by $y bits which should be 4479 # encoded separately. We choose to provide a descriptor 4480 # 999999 in this case (like the ECMWF BUFRDC software) 4481 if ($self->{ADD_ASSOCIATED_FIELD} and $id ne '031021') { 4482 # First encode associated field 4483 _croak "Descriptor no $idesc is $desc_ref->[$idesc], expected 999999" 4484 if $desc_ref->[$idesc] != 999999; 4485 my $value = $data_ref->[$idesc]; 4486 my $name = 'ASSOCIATED FIELD'; 4487 my $unit = 'NUMERIC'; 4488 my ($scale, $refval) = (0, 0); 4489 my $width = $self->{ADD_ASSOCIATED_FIELD}; 4490 $self->_spew(4, "Added associated field: %s", $value) if $Spew; 4491 ($bitstream, $pos, $maxlen) 4492 = $self->_encode_value($value,$isub,$unit,$scale,$refval,$width,999999,$bitstream,$pos,$maxlen); 4493 # Insert the artificial 999999 descriptor for the 4494 # associated value and increment $idesc to prepare for 4495 # handling the 'real' value below 4496 splice @desc, $idesc++, 0, 999999; 4497 } 4498 4499 4500 4501 # For quality information, if this relates to a bit map we 4502 # need to store index of the data ($data_idesc) for which 4503 # the quality information applies, as well as the new 4504 # index ($idesc) in the descriptor array for the bit 4505 # mapped values 4506 if (substr($id,0,3) eq '033' 4507 && defined $self->{BITMAP_OPERATORS} 4508 && $self->{BITMAP_OPERATORS}->[-1] eq '222000') { 4509 if (defined $self->{REUSE_BITMAP}) { 4510 my $data_idesc = shift @{ $self->{REUSE_BITMAP}->[$isub] }; 4511 _croak "$id: Not enough quality values provided" 4512 if not defined $data_idesc; 4513 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[$isub] }, 4514 $data_idesc, $idesc; 4515 } else { 4516 my $data_idesc = shift @{ $self->{CURRENT_BITMAP} }; 4517 _croak "$id: Not enough quality values provided" 4518 if not defined $data_idesc; 4519 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[$isub] }, 4520 $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 4521 + $data_idesc, $idesc; 4522 } 4523 } 4524 4525 my $value = $data_ref->[$idesc]; 4526 4527 if ($id eq '031031' and $self->{BUILD_BITMAP}) { 4528 # Store the index of expanded descriptors if data is 4529 # marked as present in data present indicator: 0 is 4530 # 'present', 1 (undef value) is 'not present'. E.g. 4531 # bitmap = 1100110 => (2,3,6) is stored in $self->{CURRENT_BITMAP} 4532 if (defined $value and $value == 0) { 4533 push @{$self->{CURRENT_BITMAP}}, $self->{BITMAP_INDEX}; 4534 } 4535 $self->{BITMAP_INDEX}++; 4536 if ($self->{BACKWARD_DATA_REFERENCE} == $self->{NUM_BITMAPS}) { 4537 my $numb = $self->{NUM_BITMAPS}; 4538 if (!defined $self->{BITMAP_START}[$numb]) { 4539 # Look up the element descriptor immediately 4540 # preceding the bitmap operator 4541 my $i = $idesc; 4542 $i-- while ($desc[$i] ne $self->{BITMAP_OPERATORS}->[-1] 4543 && $i >=0); 4544 $i-- while ($desc[$i] > 100000 && $i >=0); 4545 _croak "No element descriptor preceding bitmap" if $i < 0; 4546 $self->{BITMAP_START}[$numb] = $i; 4547 } else { 4548 $self->{BITMAP_START}[$numb]--; 4549 _croak "Bitmap too big" 4550 if $self->{BITMAP_START}[$numb] < 0; 4551 } 4552 } 4553 } elsif ($self->{BUILD_BITMAP} and $self->{BITMAP_INDEX} > 0) { 4554 # We have finished building the bit map 4555 $self->{BUILD_BITMAP} = 0; 4556 $self->{BITMAP_INDEX} = 0; 4557 if ($self->{BACKWARD_DATA_REFERENCE} != $self->{NUM_BITMAPS}) { 4558 $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 4559 = $self->{BITMAP_START}[$self->{BACKWARD_DATA_REFERENCE}]; 4560 } 4561 } 4562 4563 _croak "Not enough descriptors provided (expected no $idesc to be $id)" 4564 unless exists $desc_ref->[$idesc]; 4565 _croak "Descriptor no $idesc is $desc_ref->[$idesc], expected $id" 4566 if $desc_ref->[$idesc] != $id; 4567 4568 # Find the relevant entry in BUFR table B 4569 _croak "Error: Data descriptor $id is not present in BUFR table B" 4570 unless exists $B_table->{$id}; 4571 my ($name,$unit,$scale,$refval,$width) = split /\0/, $B_table->{$id}; 4572 $refval = $self->{NEW_REFVAL_OF}{$id}{$isub} if defined $self->{NEW_REFVAL_OF}{$id} 4573 && defined $self->{NEW_REFVAL_OF}{$id}{$isub}; 4574 if ($Spew) { 4575 $self->_spew(3, "%6s %-20s %s", $id, $unit, $name); 4576 $self->_spew(3, " %s", defined $value ? $value : 'missing'); 4577 } 4578########### call to_encode_value inlined for speed 4579 # Override Table B values if Data Description Operators are in 4580 # effect (except for associated fields) 4581 if ($self->{NUM_CHANGE_OPERATORS} > 0 && $id != 999999) { 4582 if ($unit ne 'CCITTIA5' && $unit !~ /^(CODE|FLAG)/) { 4583 if (defined $self->{CHANGE_SRW}) { 4584 $scale += $self->{CHANGE_SRW}; 4585 $width += int((10*$self->{CHANGE_SRW}+2)/3); 4586 $refval *= 10*$self->{CHANGE_SRW}; 4587 } else { 4588 $scale += $self->{CHANGE_SCALE} if defined $self->{CHANGE_SCALE}; 4589 $width += $self->{CHANGE_WIDTH} if defined $self->{CHANGE_WIDTH}; 4590 } 4591 } elsif ($unit eq 'CCITTIA5' && defined $self->{CHANGE_CCITTIA5_WIDTH}) { 4592 $width = $self->{CHANGE_CCITTIA5_WIDTH} 4593 } 4594 _croak "$id Data width is $width which is <= 0" if $width <= 0; 4595 $refval = $self->{NEW_REFVAL_OF}{$id}{$isub} if defined $self->{NEW_REFVAL_OF}{$id} 4596 && defined $self->{NEW_REFVAL_OF}{$id}{$isub}; 4597 # Difference statistical values use different width and reference value 4598 if ($self->{DIFFERENCE_STATISTICAL_VALUE}) { 4599 $width += 1; 4600 $refval = -2**$width; 4601 undef $self->{DIFFERENCE_STATISTICAL_VALUE}; 4602 $self->{NUM_CHANGE_OPERATORS}--; 4603 } 4604 } 4605 4606 # Ensure that bitstream is big enough to encode $value 4607 while ($pos + $width > $maxlen*8) { 4608 $bitstream .= chr(255) x $maxlen; 4609 $maxlen *= 2; 4610 } 4611 4612 if (not defined($value)) { 4613 # Missing value is encoded as 1 bits 4614 $pos += $width; 4615 } elsif ($unit eq 'CCITTIA5') { 4616 # Encode ASCII string in $width bits (left justified, 4617 # padded with spaces) 4618 my $num_bytes = int ($width/8); 4619 _croak "Ascii string too long to fit in $width bits: $value" 4620 if length($value) > $num_bytes; 4621 $value .= ' ' x ($num_bytes - length($value)); 4622 ascii2bitstream($value, $bitstream, $pos, $num_bytes); 4623 $pos += $width; 4624 } else { 4625 # Encode value as integer in $width bits 4626 _croak "Value '$value' is not a number for descriptor $id" 4627 unless looks_like_number($value); 4628 $value = int($value * 10**$scale - $refval + 0.5); 4629 _croak "Encoded data value for $id is negative: $value" if $value < 0; 4630 my $max_value = 2**$width - 1; 4631 _croak "Encoded data value for $id is too big to fit in $width bits: $value" 4632 if $value > $max_value; 4633 # Check for illegal flag value 4634 if ($Strict_checking && $unit =~ /^FLAG[ ]?TABLE/ && $width > 1 4635 && $value < $max_value && $value % 2) { 4636 _complain("$id - $value: rightmost bit $width is set indicating missing value" 4637 . " but then value should be $max_value"); 4638 } 4639 dec2bitstream($value, $bitstream, $pos, $width); 4640 $pos += $width; 4641 } 4642########### end inlining of_encode_value 4643 } # End D_LOOP 4644 } # END S_LOOP 4645 4646 4647 4648 4649 # Pad with 0 bits if necessary to get an even or integer number of 4650 # octets, depending on bufr edition 4651 my $padnum = $self->{BUFR_EDITION} < 4 ? (16-($pos%16)) % 16 : (8-($pos%8)) % 8; 4652 if ($padnum > 0) { 4653 null2bitstream($bitstream, $pos, $padnum); 4654 } 4655 my $len = ($pos + $padnum)/8; 4656 $bitstream = substr $bitstream, 0, $len; 4657 4658 return ($bitstream, $len); 4659} 4660 4661sub _encode_reference_value { 4662 my $self = shift; 4663 my ($refval,$id,$bitstream,$pos,$maxlen) = @_; 4664 4665 my $width = $self->{CHANGE_REFERENCE_VALUE}; 4666 4667 # Ensure that bitstream is big enough to encode $value 4668 while ($pos + $width > $maxlen*8) { 4669 $bitstream .= chr(255) x $maxlen; 4670 $maxlen *= 2; 4671 } 4672 4673 $self->_spew(4, "Encoding new reference value %d for %6s in %d bits", 4674 $refval, $id, $width) if $Spew; 4675 if ($refval >= 0) { 4676 _croak "Encoded reference value for $id is too big to fit " 4677 . "in $width bits: $refval" 4678 if $refval > 2**$width - 1; 4679 dec2bitstream($refval, $bitstream, $pos, $width); 4680 } else { 4681 # Negative reference values should be encoded by setting first 4682 # bit to 1 and then encoding absolute value 4683 _croak "Encoded reference value for $id is too big to fit " 4684 . "in $width bits: $refval" 4685 if -$refval > 2**($width-1) - 1; 4686 dec2bitstream(-$refval, $bitstream, $pos+1, $width-1); 4687 } 4688 $pos += $width; 4689 4690 return ($bitstream, $pos, $maxlen); 4691} 4692 4693sub _encode_value { 4694 my $self = shift; 4695 my ($value,$isub,$unit,$scale,$refval,$width,$id,$bitstream,$pos,$maxlen) = @_; 4696 4697 # Override Table B values if Data Description Operators are in 4698 # effect (except for associated fields) 4699 if ($self->{NUM_CHANGE_OPERATORS} > 0 && $id != 999999) { 4700 if ($unit ne 'CCITTIA5' && $unit !~ /^(CODE|FLAG)/) { 4701 if (defined $self->{CHANGE_SRW}) { 4702 $scale += $self->{CHANGE_SRW}; 4703 $width += int((10*$self->{CHANGE_SRW}+2)/3); 4704 $refval *= 10*$self->{CHANGE_SRW}; 4705 } else { 4706 $scale += $self->{CHANGE_SCALE} if defined $self->{CHANGE_SCALE}; 4707 $width += $self->{CHANGE_WIDTH} if defined $self->{CHANGE_WIDTH}; 4708 } 4709 } elsif ($unit eq 'CCITTIA5' && defined $self->{CHANGE_CCITTIA5_WIDTH}) { 4710 $width = $self->{CHANGE_CCITTIA5_WIDTH} 4711 } 4712 _croak "$id Data width is $width which is <= 0" if $width <= 0; 4713 $refval = $self->{NEW_REFVAL_OF}{$id}{$isub} if defined $self->{NEW_REFVAL_OF}{$id} 4714 && defined $self->{NEW_REFVAL_OF}{$id}{$isub}; 4715 # Difference statistical values use different width and reference value 4716 if ($self->{DIFFERENCE_STATISTICAL_VALUE}) { 4717 $width += 1; 4718 $refval = -2**$width; 4719 undef $self->{DIFFERENCE_STATISTICAL_VALUE}; 4720 $self->{NUM_CHANGE_OPERATORS}--; 4721 } 4722 } 4723 4724 # Ensure that bitstream is big enough to encode $value 4725 while ($pos + $width > $maxlen*8) { 4726 $bitstream .= chr(255) x $maxlen; 4727 $maxlen *= 2; 4728 } 4729 4730 if (not defined($value)) { 4731 # Missing value is encoded as 1 bits 4732 $pos += $width; 4733 } elsif ($unit eq 'CCITTIA5') { 4734 # Encode ASCII string in $width bits (left justified, 4735 # padded with spaces) 4736 my $num_bytes = int ($width/8); 4737 _croak "Ascii string too long to fit in $width bits: $value" 4738 if length($value) > $num_bytes; 4739 $value .= ' ' x ($num_bytes - length($value)); 4740 ascii2bitstream($value, $bitstream, $pos, $num_bytes); 4741 $pos += $width; 4742 } else { 4743 # Encode value as integer in $width bits 4744 _croak "Value '$value' is not a number for descriptor $id" 4745 unless looks_like_number($value); 4746 $value = int($value * 10**$scale - $refval + 0.5); 4747 _croak "Encoded data value for $id is negative: $value" if $value < 0; 4748 my $max_value = 2**$width - 1; 4749 _croak "Encoded data value for $id is too big to fit in $width bits: $value" 4750 if $value > $max_value; 4751 # Check for illegal flag value 4752 if ($Strict_checking && $unit =~ /^FLAG[ ]?TABLE/ && $width > 1 4753 && $value < $max_value && $value % 2) { 4754 _complain("$id - $value: rightmost bit $width is set indicating missing value" 4755 . " but then value should be $max_value"); 4756 } 4757 dec2bitstream($value, $bitstream, $pos, $width); 4758 $pos += $width; 4759 } 4760 4761 return ($bitstream, $pos, $maxlen); 4762} 4763 4764# Encode reference value using BUFR compression, assuming all subsets 4765# have same reference value 4766sub _encode_compressed_reference_value { 4767 my $self = shift; 4768 my ($refval,$id,$nsubsets,$bitstream,$pos,$maxlen) = @_; 4769 4770 my $width = $self->{CHANGE_REFERENCE_VALUE}; 4771 4772 # Ensure that bitstream is big enough to encode $value 4773 while ($pos + ($nsubsets+1)*$width + 6 > $maxlen*8) { 4774 $bitstream .= chr(255) x $maxlen; 4775 $maxlen *= 2; 4776 } 4777 4778 $self->_spew(4, "Encoding new reference value %d for %6s in %d bits", 4779 $refval, $id, $width) if $Spew; 4780 # Encode value as integer in $width bits 4781 if ($refval >= 0) { 4782 _croak "Encoded reference value for $id is too big to fit " 4783 . "in $width bits: $refval" if $refval > 2**$width - 1; 4784 dec2bitstream($refval, $bitstream, $pos, $width); 4785 } else { 4786 # Negative reference values should be encoded by setting first 4787 # bit to 1 and then encoding absolute value 4788 _croak "Encoded reference value for $id is too big to fit " 4789 . "in $width bits: $refval" if -$refval > 2**($width-1) - 1; 4790 dec2bitstream(-$refval, $bitstream, $pos+1, $width-1); 4791 } 4792 $pos += $width; 4793 4794 # Increment width set to 0 4795 dec2bitstream(0, $bitstream, $pos, 6); 4796 $pos += 6; 4797 4798 return ($bitstream, $pos, $maxlen); 4799} 4800 4801sub _encode_compressed_value { 4802 my $self = shift; 4803 my ($bitstream,$pos,$maxlen,$unit,$scale,$refval,$width,$id,$data_refs,$idesc,$nsubsets) = @_; 4804 4805 # Override Table B values if Data Description Operators are in 4806 # effect (except for associated fields) 4807 if ($self->{NUM_CHANGE_OPERATORS} > 0 && $id != 999999) { 4808 if ($unit ne 'CCITTIA5' && $unit !~ /^(CODE|FLAG)/) { 4809 if (defined $self->{CHANGE_SRW}) { 4810 $scale += $self->{CHANGE_SRW}; 4811 $width += int((10*$self->{CHANGE_SRW}+2)/3); 4812 $refval *= 10*$self->{CHANGE_SRW}; 4813 } else { 4814 $scale += $self->{CHANGE_SCALE} if defined $self->{CHANGE_SCALE}; 4815 $width += $self->{CHANGE_WIDTH} if defined $self->{CHANGE_WIDTH}; 4816 } 4817 } elsif ($unit eq 'CCITTIA5' && defined $self->{CHANGE_CCITTIA5_WIDTH}) { 4818 $width = $self->{CHANGE_CCITTIA5_WIDTH} 4819 } 4820 _croak "$id Data width <= 0" if $width <= 0; 4821 $refval = $self->{NEW_REFVAL_OF}{$id} if defined $self->{NEW_REFVAL_OF}{$id}; 4822 # Difference statistical values use different width and reference value 4823 if ($self->{DIFFERENCE_STATISTICAL_VALUE}) { 4824 $width += 1; 4825 $refval = -2**$width; 4826 undef $self->{DIFFERENCE_STATISTICAL_VALUE}; 4827 $self->{NUM_CHANGE_OPERATORS}--; 4828 } 4829 } 4830 4831 # Ensure that bitstream is big enough to encode $value 4832 while ($pos + ($nsubsets+1)*$width + 6 > $maxlen*8) { 4833 $bitstream .= chr(255) x $maxlen; 4834 $maxlen *= 2; 4835 } 4836 4837 # Get all values for this descriptor 4838 my @values; 4839 my $first_value = $data_refs->[1][$idesc]; 4840 my $all_equal = 1; # Set to 0 if at least 2 elements differ 4841 foreach my $value (map { $data_refs->[$_][$idesc] } 2..$nsubsets) { 4842 if (defined $value && $unit ne 'CCITTIA5' && !looks_like_number($value)) { 4843 _croak "Value '$value' is not a number for descriptor $id" 4844 } 4845 # This used to be a sub (_check_equality), but inlined for speed 4846 if ($all_equal) { 4847 if (defined $value && defined $first_value) { 4848 if ($unit eq 'CCITTIA5') { 4849 $all_equal = 0 if $value ne $first_value; 4850 } else { 4851 $all_equal = 0 if $value != $first_value; 4852 } 4853 } elsif (defined $value || defined $first_value) { 4854 $all_equal = 0; 4855 } 4856 } 4857 if (not defined $value) { 4858 push @values, undef; 4859 } elsif ($unit eq 'CCITTIA5') { 4860 push @values, $value; 4861 } else { 4862 push @values, int($value * 10**$scale - $refval + 0.5); 4863 } 4864 # Check for illegal flag value 4865 if ($Strict_checking and $unit =~ /^FLAG[ ]?TABLE/ and $width > 1) { 4866 if (defined $value and $value ne 'missing' and $value % 2) { 4867 my $max_value = 2**$width - 1; 4868 _complain("$id - value $value in subset $_:\n" 4869 . "rightmost bit $width is set indicating missing value" 4870 . " but then value should be $max_value"); 4871 } 4872 } 4873 } 4874 4875 if ($all_equal) { 4876 # Same value in all subsets. No need to calculate or store increments 4877 if (defined $first_value) { 4878 if ($unit eq 'CCITTIA5') { 4879 # Encode ASCII string in $width bits (left justified, 4880 # padded with spaces) 4881 my $num_bytes = int ($width/8); 4882 _croak "Ascii string too long to fit in $width bits: $first_value" 4883 if length($first_value) > $num_bytes; 4884 $first_value .= ' ' x ($num_bytes - length($first_value)); 4885 ascii2bitstream($first_value, $bitstream, $pos, $num_bytes); 4886 } else { 4887 # Encode value as integer in $width bits 4888 _croak "First value '$first_value' is not a number for descriptor $id" 4889 unless looks_like_number($first_value); 4890 $first_value = int($first_value * 10**$scale - $refval + 0.5); 4891 _croak "Encoded data value for $id is negative: $first_value" 4892 if $first_value < 0; 4893 _croak "Encoded data value for $id is too big to fit " 4894 . "in $width bits: $first_value" 4895 if $first_value > 2**$width - 1; 4896 dec2bitstream($first_value, $bitstream, $pos, $width); 4897 } 4898 } else { 4899 # Missing value is encoded as 1 bits, but bitstream is 4900 # padded with 1 bits already 4901 } 4902 $pos += $width; 4903 # Increment width set to 0 4904 dec2bitstream(0, $bitstream, $pos, 6); 4905 $pos += 6; 4906 } else { 4907 if ($unit eq 'CCITTIA5') { 4908 unshift @values, $first_value; 4909 # Local reference value set to 0 bits 4910 null2bitstream($bitstream, $pos, $width); 4911 $pos += $width; 4912 # Do not store more characters than needed: remove leading 4913 # and trailing spaces, then right pad with spaces so that 4914 # all strings has same length as largest string 4915 my $largest_length = _trimpad(\@values); 4916 dec2bitstream($largest_length, $bitstream, $pos, 6); 4917 $pos += 6; 4918 # Store the character values 4919 foreach my $value (@values) { 4920 if (defined $value) { 4921 # Encode ASCII string in $largest_length bytes 4922 ascii2bitstream($value, $bitstream, $pos, $largest_length); 4923 } else { 4924 # Missing value is encoded as 1 bits, but 4925 # bitstream is padded with 1 bits already 4926 } 4927 $pos += $largest_length * 8; 4928 } 4929 } else { 4930 _croak "First value '$first_value' is not a number for descriptor $id" 4931 if defined($first_value) && !looks_like_number($first_value); 4932 unshift @values, defined $first_value 4933 ? int($first_value * 10**$scale - $refval + 0.5) 4934 : undef; 4935 # Numeric data. First find minimum value 4936 my ($min_value, $isub) = _minimum(\@values); 4937 _croak "Encoded data value for $id and subset $isub is negative: $min_value" 4938 if $min_value < 0; 4939 my @inc_values = 4940 map { defined $_ ? $_ - $min_value : undef } @values; 4941 # Find how many bits are required to hold the increment 4942 # values (or rather: the highest increment value pluss one 4943 # (except for associated values), to be able to store 4944 # missing values also) 4945 my $max_inc = _maximum(\@inc_values); 4946 my $deltabits = ($id eq '999999') 4947 ?_get_number_of_bits_to_store($max_inc) 4948 : _get_number_of_bits_to_store($max_inc + 1); 4949 # Store local reference value 4950 $self->_spew(5, " Local reference value: %d", $min_value) if $Spew; 4951 dec2bitstream($min_value, $bitstream, $pos, $width); 4952 $pos += $width; 4953 # Store increment width 4954 $self->_spew(5, " Increment width (bits): %d", $deltabits) if $Spew; 4955 dec2bitstream($deltabits, $bitstream, $pos, 6); 4956 $pos += 6; 4957 # Store values 4958 $self->_spew(5, " Increment values: %s", 4959 join(',', map { defined $inc_values[$_] 4960 ? $inc_values[$_] : ''} 0..$#inc_values)) 4961 if $Spew; 4962 foreach my $value (@inc_values) { 4963 if (defined $value) { 4964 _complain("value " . ($value + $min_value) . " for $id too big" 4965 . " to be encoded without compression") 4966 if ($Strict_checking && ($value + $min_value) > 2**$width -1); 4967 dec2bitstream($value, $bitstream, $pos, $deltabits); 4968 } else { 4969 # Missing value is encoded as 1 bits, but 4970 # bitstream is padded with 1 bits already 4971 } 4972 $pos += $deltabits; 4973 } 4974 } 4975 } 4976 4977 return ($bitstream, $pos, $maxlen); 4978} 4979 4980## Encode bitstream using the data values in $data_refs, first 4981## expanding section 3 fully (and comparing with $desc_refs to check 4982## for consistency). This sub is very similar to sub 4983## _decompress_bitstream 4984sub _encode_compressed_bitstream { 4985 my $self = shift; 4986 $self->{CODING} = 'ENCODE'; 4987 my ($data_refs, $desc_refs) = @_; 4988 4989 # Expand section 3 except for delayed replication and operator 4990 # descriptors. This expansion is the same for all subsets, since 4991 # delayed replication has to be the same (this needs to be 4992 # checked) for compression to be possible 4993 my @unexpanded = split / /, $self->{DESCRIPTORS_UNEXPANDED}; 4994 my $alias = "$self->{TABLE_VERSION} " . $self->{DESCRIPTORS_UNEXPANDED}; 4995 if (exists $Descriptors_already_expanded{$alias}) { 4996 $self->{DESCRIPTORS_EXPANDED} = $Descriptors_already_expanded{$alias}; 4997 } else { 4998 $Descriptors_already_expanded{$alias} = $self->{DESCRIPTORS_EXPANDED} 4999 = join " ", _expand_descriptors($self->{D_TABLE}, @unexpanded); 5000 } 5001 my @desc = split /\s/, $self->{DESCRIPTORS_EXPANDED}; 5002 5003 my $nsubsets = $self->{NUM_SUBSETS}; 5004 my $B_table = $self->{B_TABLE}; 5005 my $maxlen = 1024; 5006 my $bitstream = chr(255) x $maxlen; # one bits only 5007 my $pos = 0; 5008 my @operators; 5009 5010 my $desc_ref = $desc_refs->[1]; 5011 5012 # All subsets should have same set of expanded descriptors. This 5013 # is checked later, but we also need to check that the number of 5014 # descriptors in each subset is the same for all subsets 5015 my $num_desc = @{$desc_ref}; 5016 foreach my $isub (2..$nsubsets) { 5017 my $num_d = @{$desc_refs->[$isub]}; 5018 _croak "Compression impossible: Subset 1 contains $num_desc descriptors," 5019 . " while subset $isub contains $num_d descriptors" 5020 if $num_d != $num_desc; 5021 } 5022 5023 5024 D_LOOP: for (my $idesc = 0; $idesc < @desc; $idesc++) { 5025 my $id = $desc[$idesc]; 5026 my $f = substr($id,0,1); 5027 my $x = substr($id,1,2)+0; 5028 my $y = substr($id,3,3)+0; 5029 5030 if ($f == 1) { 5031 # Delayed replication 5032 if ($x == 0) { 5033 _complain("Nonsensical replication of zero descriptors ($id)"); 5034 $idesc++; 5035 next D_LOOP; 5036 } 5037 _croak "$id _expand_descriptors() did not do its job" 5038 if $y > 0; 5039 5040 my $next_id = $desc[$idesc+1]; 5041 _croak "$id Erroneous replication factor" 5042 unless $next_id =~ /^0310(00|01|02|11|12)/ && exists $B_table->{$next_id}; 5043 _croak "Descriptor no $idesc is $desc_ref->[$idesc], expected $next_id" 5044 if $desc_ref->[$idesc] != $next_id; 5045 my $factor = $data_refs->[1][$idesc]; 5046 my ($name,$unit,$scale,$refval,$width) = split /\0/, $B_table->{$next_id}; 5047 if ($Spew) { 5048 $self->_spew(3, "%6s %-20s %s", $next_id, $unit, $name); 5049 $self->_spew(3, " %s", $factor); 5050 } 5051 ($bitstream, $pos, $maxlen) 5052 = $self->_encode_compressed_value($bitstream,$pos,$maxlen, 5053 $unit,$scale,$refval,$width, 5054 $next_id,$data_refs,$idesc,$nsubsets); 5055 # Include the delayed replication/repetition in descriptor list 5056 splice @desc, $idesc++, 0, $next_id; 5057 5058 my @r = (); 5059 push @r, @desc[($idesc+2)..($idesc+$x+1)] while $factor--; 5060 splice @desc, $idesc, 2+$x, @r; 5061 5062 if ($next_id eq '031011' || $next_id eq '031012') { 5063 # For delayed repetition we should include data just 5064 # once, so skip to the last set in data array 5065 $idesc += $x * ($data_refs->[1][$idesc-1] - 1); 5066 # We ought to check that the data sets we skipped are 5067 # indeed equal to the last set! 5068 $self->_spew(4, "Delayed repetition ($id $next_id -> @r)") if $Spew; 5069 } else { 5070 $self->_spew(4, "Delayed replication ($id $next_id -> @r)") if $Spew; 5071 } 5072 if ($idesc < @desc) { 5073 redo D_LOOP; 5074 } else { 5075 last D_LOOP; # Might happen if delayed factor is 0 5076 } 5077 5078 } elsif ($f == 2) { 5079 my $flow; 5080 my $bm_idesc; 5081 ($pos, $flow, $bm_idesc, @operators) 5082 = $self->_apply_operator_descriptor($id, $x, $y, $pos, 0, 5083 $desc[$idesc+1], @operators); 5084 if ($flow eq 'redo_bitmap') { 5085 # Data value is associated with the descriptor 5086 # defined by bit map. Remember original and new 5087 # index in descriptor array for the bit mapped 5088 # values ('dr' = data reference) 5089 my $dr_idesc; 5090 if (!defined $bm_idesc) { 5091 $dr_idesc = shift @{ $self->{REUSE_BITMAP}->[0] }; 5092 } elsif (!$Show_all_operators) { 5093 $dr_idesc = $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 5094 + $bm_idesc; 5095 } else { 5096 $dr_idesc = $self->{BITMAP_START}[$self->{NUM_BITMAPS}]; 5097 # Skip operator descriptors 5098 while ($bm_idesc-- > 0) { 5099 $dr_idesc++; 5100 $dr_idesc++ while ($desc[$dr_idesc] >= 200000); 5101 } 5102 } 5103 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[0] }, 5104 $dr_idesc, $idesc; 5105 $desc[$idesc] = $desc[$dr_idesc]; 5106 redo D_LOOP; 5107 } elsif ($flow eq 'signify_character') { 5108 _croak "Descriptor no $idesc is $desc_ref->[$idesc], expected $id" 5109 if $desc_ref->[$idesc] != $id; 5110 # Get ASCII string 5111 my @values = map { $data_refs->[$_][$idesc] } 1..$nsubsets; 5112 my $name = 'SIGNIFY CHARACTER'; 5113 my $unit = 'CCITTIA5'; 5114 my ($scale, $refval, $width) = (0, 0, 8*$y); 5115 ($bitstream, $pos, $maxlen) 5116 = $self->_encode_compressed_value($bitstream,$pos,$maxlen, 5117 $unit,$scale,$refval,$width, 5118 "205$y",$data_refs,$idesc,$nsubsets); 5119 next D_LOOP; 5120 } elsif ($flow eq 'no_value') { 5121 next D_LOOP; 5122 } 5123 5124 # Remove operator descriptor from @desc 5125 splice @desc, $idesc--, 1; 5126 5127 next D_LOOP if $flow eq 'next'; 5128 last D_LOOP if $flow eq 'last'; 5129 } 5130 5131 if ($self->{CHANGE_REFERENCE_VALUE}) { 5132 # The data descriptor is to be associated with a new 5133 # reference value, which is fetched from data stream, 5134 # possibly with f=9 instead of f=0 for descriptor 5135 $id -= 900000 if $id =~ /^9/; 5136 _croak "Change reference operator 203Y is not followed by element" 5137 . " descriptor, but $id" if $f > 0; 5138 my @new_ref_values = map { $data_refs->[$_][$idesc] } 1..$nsubsets; 5139 my $new_refval = $new_ref_values[0]; 5140 # Check that they are all the same 5141 foreach my $val (@new_ref_values[1..$#new_ref_values]) { 5142 _croak "Change reference value differ between subsets" 5143 . " which cannot be combined with BUFR compression" 5144 if $val != $new_refval; 5145 } 5146 $self->{NEW_REFVAL_OF}{$id} = $new_refval; 5147 ($bitstream, $pos, $maxlen) 5148 = $self->_encode_compressed_reference_value($new_refval,$id,$nsubsets,$bitstream,$pos,$maxlen); 5149 next D_LOOP; 5150 } 5151 5152 # If operator 204$y 'Add associated field' is in effect, 5153 # each data value is preceded by $y bits which should be 5154 # encoded separately. We choose to provide a descriptor 5155 # 999999 in this case (like the ECMWF BUFRDC software) 5156 if ($self->{ADD_ASSOCIATED_FIELD} and $id ne '031021') { 5157 # First encode associated field 5158 _croak "Descriptor no $idesc is $desc_ref->[$idesc], expected 999999" 5159 if $desc_ref->[$idesc] != 999999; 5160 my @values = map { $data_refs->[$_][$idesc] } 1..$nsubsets; 5161 my $name = 'ASSOCIATED FIELD'; 5162 my $unit = 'NUMERIC'; 5163 my ($scale, $refval) = (0, 0); 5164 my $width = $self->{ADD_ASSOCIATED_FIELD}; 5165 if ($Spew) { 5166 $self->_spew(3, "%6s %-20s %s", $id, $unit, $name); 5167 $self->_spew(3, " %s", 999999); 5168 } 5169 ($bitstream, $pos, $maxlen) 5170 = $self->_encode_compressed_value($bitstream,$pos,$maxlen, 5171 $unit,$scale,$refval,$width, 5172 999999,$data_refs,$idesc,$nsubsets); 5173 # Insert the artificial 999999 descriptor for the 5174 # associated value and increment $idesc to prepare for 5175 # handling the 'real' value below 5176 splice @desc, $idesc++, 0, 999999; 5177 } 5178 5179 5180 5181 # For quality information, if this relates to a bit map we 5182 # need to store index of the data ($data_idesc) for which 5183 # the quality information applies, as well as the new 5184 # index ($idesc) in the descriptor array for the bit 5185 # mapped values 5186 if (substr($id,0,3) eq '033' 5187 && defined $self->{BITMAP_OPERATORS} 5188 && $self->{BITMAP_OPERATORS}->[-1] eq '222000') { 5189 if (defined $self->{REUSE_BITMAP}) { 5190 my $data_idesc = shift @{ $self->{REUSE_BITMAP}->[0] }; 5191 _croak "$id: Not enough quality values provided" 5192 if not defined $data_idesc; 5193 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[0] }, 5194 $data_idesc, $idesc; 5195 } else { 5196 my $data_idesc = shift @{ $self->{CURRENT_BITMAP} }; 5197 _croak "$id: Not enough quality values provided" 5198 if not defined $data_idesc; 5199 push @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}]->[0] }, 5200 $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 5201 + $data_idesc, $idesc; 5202 } 5203 } 5204 5205 if ($id eq '031031' and $self->{BUILD_BITMAP}) { 5206 # Store the index of expanded descriptors if data is 5207 # marked as present in data present indicator: 0 is 5208 # 'present', 1 (undef value) is 'not present'. E.g. 5209 # bitmap = 1100110 => (2,3,6) is stored in $self->{CURRENT_BITMAP} 5210 5211 # NB: bit map might vary betwen subsets!!!!???? 5212 if ($data_refs->[1][$idesc] == 0) { 5213 push @{$self->{CURRENT_BITMAP}}, $self->{BITMAP_INDEX}; 5214 } 5215 $self->{BITMAP_INDEX}++; 5216 if ($self->{BACKWARD_DATA_REFERENCE} == $self->{NUM_BITMAPS}) { 5217 my $numb = $self->{NUM_BITMAPS}; 5218 if (!defined $self->{BITMAP_START}[$numb]) { 5219 # Look up the element descriptor immediately 5220 # preceding the bitmap operator 5221 my $i = $idesc; 5222 $i-- while ($desc[$i] ne $self->{BITMAP_OPERATORS}->[-1] 5223 && $i >=0); 5224 $i-- while ($desc[$i] > 100000 && $i >=0); 5225 _croak "No element descriptor preceding bitmap" if $i < 0; 5226 $self->{BITMAP_START}[$numb] = $i; 5227 } else { 5228 $self->{BITMAP_START}[$numb]--; 5229 _croak "Bitmap too big" 5230 if $self->{BITMAP_START}[$numb] < 0; 5231 } 5232 } 5233 } elsif ($self->{BUILD_BITMAP} and $self->{BITMAP_INDEX} > 0) { 5234 # We have finished building the bit map 5235 $self->{BUILD_BITMAP} = 0; 5236 $self->{BITMAP_INDEX} = 0; 5237 if ($self->{BACKWARD_DATA_REFERENCE} != $self->{NUM_BITMAPS}) { 5238 $self->{BITMAP_START}[$self->{NUM_BITMAPS}] 5239 = $self->{BITMAP_START}[$self->{BACKWARD_DATA_REFERENCE}]; 5240 } 5241 } 5242 5243 # We now have a "real" data descriptor 5244 _croak "Descriptor no $idesc is $desc_ref->[$idesc], expected $id" 5245 if $desc_ref->[$idesc] != $id; 5246 5247 # Find the relevant entry in BUFR table B 5248 _croak "Data descriptor $id is not present in BUFR table B" 5249 unless exists $B_table->{$id}; 5250 my ($name,$unit,$scale,$refval,$width) = split /\0/, $B_table->{$id}; 5251 if ($Spew) { 5252 $self->_spew(3, "%6s %-20s %s", $id, $unit, $name); 5253 $self->_spew(3, " %s", join ' ', 5254 map { defined($data_refs->[$_][$idesc]) ? 5255 $data_refs->[$_][$idesc] : 'missing'} 1..$nsubsets ); 5256 } 5257 ($bitstream, $pos, $maxlen) 5258 = $self->_encode_compressed_value($bitstream,$pos,$maxlen, 5259 $unit,$scale,$refval,$width, 5260 $id,$data_refs,$idesc,$nsubsets); 5261 } # End D_LOOP 5262 5263 # Pad with 0 bits if necessary to get an even or integer number of 5264 # octets, depending on bufr edition 5265 my $padnum = $self->{BUFR_EDITION} < 4 ? (16-($pos%16)) % 16 : (8-($pos%8)) % 8; 5266 if ($padnum > 0) { 5267 null2bitstream($bitstream, $pos, $padnum); 5268 } 5269 my $len = ($pos + $padnum)/8; 5270 $bitstream = substr $bitstream, 0, $len; 5271 5272 return ($bitstream, $len); 5273} 5274 5275## Check that the length of data section computed from expansion of 5276## section 3 ($comp_len) equals actual length of data part of section 5277## 4, allowing for padding zero bits according to BUFR Regulation 94.1.3 5278## Strict checking should also check that padding actually consists of 5279## zero bits only. 5280sub _check_section4_length { 5281 my $self = shift; 5282 my ($comp_len, $actual_len) = @_; 5283 5284 if ($comp_len > $actual_len) { 5285 _croak "More descriptors in expansion of section 3" 5286 . " than what can fit in the given length of section 4" 5287 . " ($comp_len versus $actual_len bits)"; 5288 } else { 5289 return if not $Strict_checking; # Excessive bytes in section 4 5290 # does not prevent further decoding 5291 return if $Noqc; # No more sensible checks to do in this case 5292 5293 my $bufr_edition = $self->{BUFR_EDITION}; 5294 my $actual_bytes = $actual_len/8; # This is sure to be an integer 5295 if ($bufr_edition < 4 and $actual_bytes % 2) { 5296 _complain("Section 4 is odd number ($actual_bytes) of bytes," 5297 . " which is an error in BUFR edition $bufr_edition"); 5298 } 5299 my $comp_bytes = int($comp_len/8); 5300 $comp_bytes++ if $comp_len % 8; # Need to pad with zero bits 5301 $comp_bytes++ if $bufr_edition < 4 and $comp_bytes % 2; # Need to pad with an extra byte of zero bits 5302 if ($actual_bytes > $comp_bytes) { 5303 _complain("Binary data part of section 4 longer ($actual_bytes bytes)" 5304 . " than expected from section 3 ($comp_bytes bytes)"); 5305 } 5306 } 5307 return; 5308} 5309 5310# Trim string, also removing nulls (and _complain if nulls found). 5311# If strict_checking, checks also for bit 1 set in each character 5312sub _trim { 5313 my ($str, $id) = @_; 5314 return unless defined $str; 5315 if ($str =~ /\0/) { 5316 (my $str2 = $str) =~ s|\0|\\0|g; 5317 _complain("Nulls (" . '\0' 5318 . ") found in string '$str2' for descriptor $id"); 5319 $str =~ s/\0//g; 5320 } elsif ($Strict_checking && $str =~/^ +$/) { 5321 _complain("Only spaces ('$str') found for descriptor $id, " 5322 . "ought to have been encoded as missing value "); 5323 } 5324 5325 $str =~ s/\s+$//; 5326 $str =~ s/^\s+//; 5327 5328 if ($Strict_checking && $str ne '') { 5329 foreach my $char (split //, $str) { 5330 if (ord($char) > 127) { 5331 _complain("Character $char (ascii value " . ord($char) . 5332 ") in string '$str' is not allowed in CCITTIA5"); 5333 last; # Don't want to warn for every bad character 5334 } 5335 } 5336 } 5337 return $str; 5338} 5339 5340## Remove leading and trailing spaces in the strings provided, then add 5341## spaces if necessary so that all strings have same length as largest 5342## trimmed string. This length (in bytes) is returned 5343sub _trimpad { 5344 my $string_ref = shift; 5345 my $largest_length = 0; 5346 foreach my $string (@{$string_ref}) { 5347 if (defined $string) { 5348 $string =~ s/^\s+//; 5349 $string =~ s/\s+$//; 5350 if (length $string > $largest_length) { 5351 $largest_length = length $string; 5352 } 5353 } 5354 } 5355 foreach my $string (@{$string_ref}) { 5356 if (defined $string) { 5357 $string .= ' ' x ($largest_length - length $string); 5358 } 5359 } 5360 return $largest_length; 5361} 5362 5363## Use timegm in Time::Local to validate date and time in section 1 5364sub _validate_datetime { 5365 my $self = shift; 5366 my $bufr_edition = $self->{BUFR_EDITION}; 5367 my $year = $bufr_edition < 4 ? $self->{YEAR_OF_CENTURY} + 2000 5368 : $self->{YEAR}; 5369 my $month = $self->{MONTH} - 1; 5370 my $second = $bufr_edition == 4 ? $self->{SECOND} : 0; 5371 5372 # All datetime variables set to 0 should be considered ok 5373 return if ($self->{MINUTE} == 0 && $self->{HOUR} == 0 5374 && $self->{DAY} == 0 && $self->{MONTH} == 0 5375 && $second == 0 && ($year == 0 || $year == 2000)); 5376 5377 eval { 5378 my $dummy = timegm($second,$self->{MINUTE},$self->{HOUR}, 5379 $self->{DAY},$month,$year); 5380 }; 5381 5382 _complain("Invalid date in section 1: $@") if $@; 5383} 5384 5385## Return number of bits necessary to store the nonnegative number $n 5386## (1 for 0,1, 2 for 2,3, 3 for 4,5,6,7 etc) 5387sub _get_number_of_bits_to_store { 5388 my $n = shift; 5389 return 1 if $n == 0; 5390 my $x = 1; 5391 my $i = 0; 5392 while ($x < $n) { 5393 $i++; 5394 $x *= 2; 5395 } 5396 return $x==$n ? $i+1 : $i; 5397} 5398 5399## Find minimum value among set of numbers (undefined values 5400## permitted, but at least one value must be defined). Also returns 5401## for which number the minimum occurs (counting from 1). 5402sub _minimum { 5403 my $v_ref = shift; 5404 my $min = 2**63; 5405 my $idx = 0; 5406 my $i=0; 5407 foreach my $v (@{$v_ref}) { 5408 $i++; 5409 next if not defined $v; 5410 if ($v < $min) { 5411 $min = $v; 5412 $idx = $i; 5413 } 5414 } 5415 return ($min, $idx); 5416} 5417 5418## Find maximum value among set of nonnegative numbers or undefined values 5419sub _maximum { 5420 my $v_ref = shift; 5421 my $max = 0; 5422 foreach my $v (@{$v_ref}) { 5423 next if not defined $v; 5424 if ($v > $max) { 5425 $max = $v; 5426 } 5427 } 5428 _croak "Internal error: Found no maximum value" if $max < 0; 5429 return $max; 5430} 5431 5432## Return index of first occurrence av $value in $list, undef if no match 5433sub _get_index_in_list { 5434 my ($list, $value) = @_; 5435 for (my $i=0; $i <= $#{$list}; $i++) { 5436 if ($list->[$i] eq $value) { # Match 5437 return $i; 5438 } 5439 } 5440 # No match 5441 return undef; 5442} 5443 5444## Apply the operator descriptor $id, adjusting $pos and 5445## @operators. Also returning $bm_idesc (explained in start of module) 5446## and a hint of what to do next in $flow 5447sub _apply_operator_descriptor { 5448 my $self = shift; 5449 my ($id, $x, $y, $pos, $isub, $next_id, @operators) = @_; 5450 # $isub should be 0 for compressed messages, else subset number 5451 5452 my $flow = ''; 5453 my $bm_idesc = ''; 5454 5455 if ($y == 0 && $x =~ /^[12378]$/) { 5456 # 20[12378]000 Cancellation of a data descriptor operator 5457 _complain("$id Cancelling unused operator") 5458 if $Strict_checking and !grep {$_ == $x} @operators; 5459 @operators = grep {$_ != $x} @operators; 5460 if ($x == 1) { 5461 $self->{NUM_CHANGE_OPERATORS}-- if $self->{CHANGE_WIDTH}; 5462 undef $self->{CHANGE_WIDTH}; 5463 } elsif ($x == 2) { 5464 $self->{NUM_CHANGE_OPERATORS}-- if $self->{CHANGE_SCALE}; 5465 undef $self->{CHANGE_SCALE}; 5466 } elsif ($x == 3) { 5467 $self->{NUM_CHANGE_OPERATORS}-- if $self->{NEW_REFVAL_OF}; 5468 undef $self->{NEW_REFVAL_OF}; 5469 } elsif ($x == 7) { 5470 $self->{NUM_CHANGE_OPERATORS}-- if $self->{CHANGE_SRW}; 5471 undef $self->{CHANGE_SRW}; 5472 } elsif ($x == 8) { 5473 $self->{NUM_CHANGE_OPERATORS}-- if $self->{CHANGE_CCITTIA5_WIDTH}; 5474 undef $self->{CHANGE_CCITTIA5_WIDTH}; 5475 } 5476 $self->_spew(4, "$id * Reset %s", 5477 ("width of CCITTIA5 field","data width","scale","reference values",0,0,0, 5478 "increase of scale, reference value and data width")[$x % 8]) if $Spew; 5479 $flow = 'next'; 5480 } elsif ($x == 1) { 5481 # ^201 Change data width 5482 _croak "201 operator cannot be nested within 207 operator" 5483 if grep {$_ == 7} @operators; 5484 $self->{NUM_CHANGE_OPERATORS}++ if !$self->{CHANGE_WIDTH}; 5485 $self->{CHANGE_WIDTH} = $y-128; 5486 $self->_spew(4, "$id * Change data width: %d", $self->{CHANGE_WIDTH}) if $Spew; 5487 push @operators, $x; 5488 $flow = 'next'; 5489 } elsif ($x == 2) { 5490 # ^202 Change scale 5491 _croak "202 operator cannot be nested within 207 operator" 5492 if grep {$_ == 7} @operators; 5493 $self->{NUM_CHANGE_OPERATORS}++ if !$self->{CHANGE_SCALE}; 5494 $self->{CHANGE_SCALE} = $y-128; 5495 $self->_spew(4, "$id * Change scale: %d", $self->{CHANGE_SCALE}) if $Spew; 5496 push @operators, $x; 5497 $flow = 'next'; 5498 } elsif ($x == 3 && $y == 255) { 5499 # 203255 Terminate change reference value definition 5500 $self->_spew(4, "$id * Terminate reference value definition %s", 5501 '203' . (defined $self->{CHANGE_REFERENCE_VALUE} 5502 ? sprintf("%03d", $self->{CHANGE_REFERENCE_VALUE}) : '???')) if $Spew; 5503 _complain("$id no current change reference value to terminate") 5504 unless defined $self->{CHANGE_REFERENCE_VALUE}; 5505 undef $self->{CHANGE_REFERENCE_VALUE}; 5506 $flow = 'next'; 5507 } elsif ($x == 3) { 5508 # ^203 Change reference value 5509 _croak "203 operator cannot be nested within 207 operator" 5510 if grep {$_ == 7} @operators; 5511 $self->_spew(4, "$id * Change reference value") if $Spew; 5512 # Get reference value from data stream ($y == number of bits) 5513 $self->{NUM_CHANGE_OPERATORS}++ if !$self->{CHANGE_REFERENCE_VALUE}; 5514 $self->{CHANGE_REFERENCE_VALUE} = $y; 5515 push @operators, $x; 5516 $flow = 'next'; 5517 } elsif ($x == 4) { 5518 # ^204 Add associated field 5519 if ($y > 0) { 5520 _croak "$id Nesting of Add associated field is not implemented" 5521 if $self->{ADD_ASSOCIATED_FIELD}; 5522 $self->{ADD_ASSOCIATED_FIELD} = $y; 5523 $flow = 'next'; 5524 } else { 5525 _complain "$id No previous Add associated field" 5526 unless defined $self->{ADD_ASSOCIATED_FIELD}; 5527 undef $self->{ADD_ASSOCIATED_FIELD}; 5528 $flow = 'next'; 5529 } 5530 } elsif ($x == 5) { 5531 # ^205 Signify character (i.e. the following $y bytes is 5532 # character information) 5533 $flow = 'signify_character'; 5534 } elsif ($x == 6) { 5535 # ^206 Signify data width for the immediately following local 5536 # descriptor. If we find this local descriptor in BUFR table B 5537 # with data width $y bits, we assume we can use this table 5538 # entry to decode/encode the value properly, and can just 5539 # ignore the operator descriptor. Else we skip the local 5540 # descriptor and the corresponding value if decoding, or have 5541 # to give up if encoding 5542 my $ff = substr($next_id,0,1); 5543 _croak("Descriptor $next_id following Signify data width" 5544 . " operator $_ is not an element descriptor") 5545 if $ff != 0; 5546 if ($Strict_checking) { 5547 my $xx = substr($next_id,1,2); 5548 my $yy = substr($next_id,3,3); 5549 _complain("Descriptor $next_id following Signify data width" 5550 . " operator $id is not a local descriptor") 5551 if ($xx < 48 && $yy < 192); 5552 } 5553 if (exists $self->{B_TABLE}->{$next_id} 5554 and (split /\0/, $self->{B_TABLE}->{$next_id})[-1] == $y) { 5555 $self->_spew(4, "Found $next_id with data width $y, ignoring $id") if $Spew; 5556 $flow = 'next'; 5557 } else { 5558 _croak "Cannot encode descriptor $next_id (following $id), not found in table B" 5559 if $self->{CODING} eq 'ENCODE'; 5560 $self->_spew(4, "$_: Did not find $next_id in table B." 5561 . " Skipping $id and $next_id.") if $Spew; 5562 $pos += $y; # Skip next $y bits in bitstream if decoding 5563 $flow = 'skip'; 5564 } 5565 5566 } elsif ($x == 7) { 5567 # ^207 Increase scale, reference value and data width 5568 _croak "207 operator cannot be nested within 201/202/203 operators" 5569 if grep {$_ == 1 || $_ == 2 || $_ == 3} @operators; 5570 $self->{NUM_CHANGE_OPERATORS}++ if !$self->{CHANGE_SRW}; 5571 $self->{CHANGE_SRW} = $y; 5572 $self->_spew(4, "$id * Increase scale, reference value and data width: %d", $y) if $Spew; 5573 push @operators, $x; 5574 $flow = 'next'; 5575 } elsif ($x == 8) { 5576 # ^208 Change data width for ascii data 5577 $self->{NUM_CHANGE_OPERATORS}++ if !$self->{CHANGE_CCITTIA5_WIDTH}; 5578 $self->{CHANGE_CCITTIA5_WIDTH} = $y*8; 5579 $self->_spew(4, "$id * Change width for CCITTIA5 field: %d bytes", $y) if $Spew; 5580 push @operators, $x; 5581 $flow = 'next'; 5582 } elsif ($x == 9) { 5583 # ^209 IEEE floating point representation 5584 _croak "$id IEEE floating point representation (not implemented)"; 5585 } elsif ($x == 21) { 5586 # ^221 Data not present 5587 _croak "$id Data not present (not implemented)"; 5588 } elsif ($x == 22 && $y == 0) { 5589 # 222000 Quality information follows 5590 push @{ $self->{BITMAP_OPERATORS} }, '222000'; 5591 $self->{NUM_BITMAPS}++; 5592 # Mark that a bit map probably needs to be built 5593 $self->{BUILD_BITMAP} = 1; 5594 $self->{BITMAP_INDEX} = 0; 5595 $flow = $Noqc ? 'last' : 'no_value'; 5596 } elsif ($x == 23 && $y == 0) { 5597 # 223000 Substituted values follow, each one following a 5598 # descriptor 223255. Which value they are a substitute for is 5599 # defined by a bit map, which already may have been defined 5600 # (if descriptor 23700 is encountered), or will shortly be 5601 # defined by data present indicators (031031) 5602 push @{ $self->{BITMAP_OPERATORS} }, '223000'; 5603 $self->{NUM_BITMAPS}++; 5604 # Mark that a bit map probably needs to be built 5605 $self->{BUILD_BITMAP} = 1; 5606 $self->{BITMAP_INDEX} = 0; 5607 $flow = 'no_value'; 5608 } elsif ($x == 23 && $y == 255) { 5609 # 223255 Substituted values marker operator 5610 _croak "$id No bit map defined" 5611 unless (defined $self->{CURRENT_BITMAP} || defined $self->{REUSE_BITMAP}) 5612 && $self->{BITMAP_OPERATORS}[-1] eq '223000'; 5613 if (defined $self->{REUSE_BITMAP}) { 5614 _croak "More 223255 encountered than current bit map allows" 5615 unless @{ $self->{REUSE_BITMAP}->[$isub] }; 5616 $bm_idesc = undef; 5617 } else { 5618 _croak "More 223255 encountered than current bit map allows" 5619 unless @{$self->{CURRENT_BITMAP}}; 5620 $bm_idesc = shift @{$self->{CURRENT_BITMAP}}; 5621 } 5622 $flow = 'redo_bitmap'; 5623 } elsif ($x == 24 && $y == 0) { 5624 # 224000 First order statistical values follow 5625 push @{ $self->{BITMAP_OPERATORS} }, '224000'; 5626 $self->{NUM_BITMAPS}++; 5627 # Mark that a bit map probably needs to be built 5628 $self->{BUILD_BITMAP} = 1; 5629 $self->{BITMAP_INDEX} = 0; 5630 $flow = 'no_value'; 5631 } elsif ($x == 24 && $y == 255) { 5632 # 224255 First order statistical values marker operator 5633 _croak "$id No bit map defined" 5634 unless (defined $self->{CURRENT_BITMAP} || defined $self->{REUSE_BITMAP}) 5635 && $self->{BITMAP_OPERATORS}[-1] eq '224000'; 5636 if (defined $self->{REUSE_BITMAP}) { 5637 _croak "More 224255 encountered than current bit map allows" 5638 unless @{ $self->{REUSE_BITMAP}->[$isub] }; 5639 $bm_idesc = undef; 5640 } else { 5641 _croak "More 224255 encountered than current bit map allows" 5642 unless @{$self->{CURRENT_BITMAP}}; 5643 $bm_idesc = shift @{$self->{CURRENT_BITMAP}}; 5644 } 5645 $flow = 'redo_bitmap'; 5646 } elsif ($x == 25 && $y == 0) { 5647 # 225000 Difference statistical values follow 5648 push @{ $self->{BITMAP_OPERATORS} }, '225000'; 5649 $self->{NUM_BITMAPS}++; 5650 # Mark that a bit map probably needs to be built 5651 $self->{BUILD_BITMAP} = 1; 5652 $self->{BITMAP_INDEX} = 0; 5653 $flow = 'no_value'; 5654 } elsif ($x == 25 && $y == 255) { 5655 # 225255 Difference statistical values marker operator 5656 _croak "$id No bit map defined\n" 5657 unless (defined $self->{CURRENT_BITMAP} || defined $self->{REUSE_BITMAP}) 5658 && $self->{BITMAP_OPERATORS}[-1] eq '225000'; 5659 if (defined $self->{REUSE_BITMAP}) { 5660 _croak "More 225255 encountered than current bit map allows" 5661 unless @{ $self->{REUSE_BITMAP}->[$isub] }; 5662 $bm_idesc = undef; 5663 } else { 5664 _croak "More 225255 encountered than current bit map allows" 5665 unless @{$self->{CURRENT_BITMAP}}; 5666 $bm_idesc = shift @{$self->{CURRENT_BITMAP}}; 5667 } 5668 # Must remember to change data width and reference value 5669 $self->{NUM_CHANGE_OPERATORS}++ if !$self->{DIFFERENCE_STATISTICAL_VALUE}; 5670 $self->{DIFFERENCE_STATISTICAL_VALUE} = 1; 5671 $flow = 'redo_bitmap'; 5672 } elsif ($x == 32 && $y == 0) { 5673 # 232000 Replaced/retained values follow, each one following a 5674 # descriptor 232255. Which value they are a replacement for is 5675 # defined by a bit map, which already may have been defined 5676 # (if descriptor 23700 is encountered), or will shortly be 5677 # defined by data present indicators (031031) 5678 push @{ $self->{BITMAP_OPERATORS} }, '232000'; 5679 $self->{NUM_BITMAPS}++; 5680 # Mark that a bit map probably needs to be built 5681 $self->{BUILD_BITMAP} = 1; 5682 $self->{BITMAP_INDEX} = 0; 5683 $flow = 'no_value'; 5684 } elsif ($x == 32 && $y == 255) { 5685 # 232255 Replaced/retained values marker operator 5686 _croak "$id No bit map defined" 5687 unless (defined $self->{CURRENT_BITMAP} || defined $self->{REUSE_BITMAP}) 5688 && $self->{BITMAP_OPERATORS}[-1] eq '232000'; 5689 if (defined $self->{REUSE_BITMAP}) { 5690 _croak "More 232255 encountered than current bit map allows" 5691 unless @{ $self->{REUSE_BITMAP}->[$isub] }; 5692 $bm_idesc = undef; 5693 } else { 5694 _croak "More 232255 encountered than current bit map allows" 5695 unless @{$self->{CURRENT_BITMAP}}; 5696 $bm_idesc = shift @{$self->{CURRENT_BITMAP}}; 5697 } 5698 $flow = 'redo_bitmap'; 5699 } elsif ($x == 35 && $y == 0) { 5700 # 235000 Cancel backward data reference 5701 undef $self->{REUSE_BITMAP}; 5702 $self->{BACKWARD_DATA_REFERENCE} = $self->{NUM_BITMAPS} + 1; 5703 $flow = 'no_value'; 5704 } elsif ($x == 36 && $y == 0) { 5705 # 236000 Define data present bit map 5706 undef $self->{CURRENT_BITMAP}; 5707 $self->{BUILD_BITMAP} = 1; 5708 $self->{BITMAP_INDEX} = 0; 5709 $flow = 'no_value'; 5710 } elsif ($x == 37 && $y == 0) { 5711 # 237000 Use defined data present bit map 5712 _croak "$id No previous bit map defined" 5713 unless defined $self->{BITMAPS}; 5714 my %hash = @{ $self->{BITMAPS}->[$self->{NUM_BITMAPS}-1]->[$isub] }; 5715 $self->{REUSE_BITMAP}->[$isub] = [sort {$a <=> $b} keys %hash]; 5716 $flow = 'no_value'; 5717 } elsif ($x == 37 && $y == 255) { 5718 # 237255 Cancel 'use defined data present bit map' 5719 _complain("$id No data present bit map to cancel") 5720 unless defined $self->{REUSE_BITMAP}; 5721 undef $self->{REUSE_BITMAP}; 5722 $flow = 'next'; 5723 } elsif ($x == 41 && $y == 0) { 5724 # 241000 Define event 5725 _croak "$id Define event (not implemented)"; 5726 } elsif ($x == 41 && $y == 255) { 5727 # 241255 Cancel define event 5728 _croak "$id Cancel define event (not implemented)"; 5729 } elsif ($x == 42 && $y == 0) { 5730 # 242000 Define conditioning event 5731 _croak "$id Define conditioning event (not implemented)"; 5732 } elsif ($x == 42 && $y == 255) { 5733 # 242255 Cancel define conditioning event 5734 _croak "$id Cancel define conditioning event (not implemented)"; 5735 } elsif ($x == 43 && $y == 0) { 5736 # 243000 Categorial forecast values follow 5737 _croak "$id Categorial forecast values follow (not implemented)"; 5738 } elsif ($x == 43 && $y == 255) { 5739 # 243255 Cancel categorial forecast values follow 5740 _croak "$id Cancel categorial forecast values follow (not implemented)"; 5741 } else { 5742 _croak "$id Unknown data description operator"; 5743 } 5744 5745 return ($pos, $flow, $bm_idesc, @operators); 5746} 5747 5748## Extract data from selected subsets in selected bufr objects, joined 5749## into a single ($data_refs, $desc_refs), to later be able to make a 5750## single BUFR message by calling encode_message. Also returns number 5751## of subsets extracted. 5752sub join_subsets { 5753 my $self = shift; 5754 my (@bufr, @subset_list); 5755 my $last_arg_was_bufr; 5756 my $num_objects = 0; 5757 while (@_) { 5758 my $arg = shift; 5759 if (ref($arg) eq 'Geo::BUFR') { 5760 $bufr[$num_objects++] = $arg; 5761 $last_arg_was_bufr = 1; 5762 } elsif (ref($arg) eq 'ARRAY') { 5763 _croak "Wrong input (multiple array refs) to join_subsets" 5764 unless $last_arg_was_bufr; 5765 $subset_list[$num_objects-1] = $arg; 5766 $last_arg_was_bufr = 0; 5767 } else { 5768 _croak "Input is not Geo::BUFR object or array ref in join_subsets"; 5769 } 5770 } 5771 5772 my ($data_refs, $desc_refs); 5773 my $n = 1; # Number of subsets included 5774 # Ought to check for common section 3 also? 5775 for (my $i=0; $i < $num_objects; $i++) { 5776 $bufr[$i]->rewind(); 5777 my $isub = 1; 5778 if (!exists $subset_list[$i]) { # grab all subsets from this object 5779 while (not $bufr[$i]->eof()) { 5780 my ($data, $descriptors) = $bufr[$i]->next_observation(); 5781 last if !$data; 5782 $self->_spew(2, "Joining subset %d from bufr object %d", $isub, $i) if $Spew; 5783 $data_refs->[$n] = $data; 5784 $desc_refs->[$n++] = $descriptors; 5785 $isub++; 5786 } 5787 } else { # grab the subsets specified, also inserting them in the specified order 5788 my $num_found = 0; 5789 while (not $bufr[$i]->eof()) { 5790 my ($data, $descriptors) = $bufr[$i]->next_observation(); 5791 last if !$data; 5792 my $index = _get_index_in_list($subset_list[$i], $isub); 5793 if (defined $index) { 5794 $self->_spew(2, "Joining subset %d from subset %d" 5795 . " in bufr object %d", $isub, $index, $i) if $Spew; 5796 $data_refs->[$n + $index] = $data; 5797 $desc_refs->[$n + $index] = $descriptors; 5798 $num_found++; 5799 } 5800 $isub++; 5801 } 5802 _croak "Mismatch between number of subsets found ($num_found) and " 5803 . "expected from argument [@{$subset_list[$i]}] to join_subsets" 5804 if $num_found != @{$subset_list[$i]}; 5805 $n += $num_found; 5806 } 5807 $bufr[$i]->rewind(); 5808 } 5809 $n--; 5810 return ($data_refs, $desc_refs, $n) 5811} 5812 58131; # Make sure require or use succeeds. 5814 5815 5816__END__ 5817# Below is documentation for the module. You'd better read it! 5818 5819=pod 5820 5821=encoding utf8 5822 5823=head1 NAME 5824 5825Geo::BUFR - Perl extension for handling of WMO BUFR files. 5826 5827=head1 SYNOPSIS 5828 5829 # A simple program to print decoded contents of a BUFR file. Note 5830 # that a more sophisticated program (bufrread.pl) is included in the 5831 # package 5832 5833 use Geo::BUFR; 5834 5835 Geo::BUFR->set_tableformat('BUFRDC'); # ECCODES is also possible 5836 Geo::BUFR->set_tablepath('path to BUFR tables'); 5837 5838 my $bufr = Geo::BUFR->new(); 5839 5840 $bufr->fopen('name of BUFR file'); 5841 5842 while (not $bufr->eof()) { 5843 my ($data, $descriptors) = $bufr->next_observation(); 5844 print $bufr->dumpsections($data, $descriptors) if $data; 5845 } 5846 5847 $bufr->fclose(); 5848 5849 5850=head1 DESCRIPTION 5851 5852B<BUFR> = B<B>inary B<U>niversal B<F>orm for the B<R>epresentation of 5853meteorological data. BUFR is approved by WMO (World Meteorological 5854Organization) as the standard universal exchange format for 5855meteorological observations, gradually replacing a lot of older 5856alphanumeric data formats. 5857 5858This module provides methods for decoding and encoding BUFR messages, 5859and for displaying information in BUFR B and D tables and in BUFR flag 5860and code tables. 5861 5862Installing this module also installs some programs: C<bufrread.pl>, 5863C<bufrresolve.pl>, C<bufrextract.pl>, C<bufrencode.pl>, 5864C<bufr_reencode.pl> and C<bufralter.pl>. See 5865L<https://wiki.met.no/bufr.pm/start> for examples of use. For the 5866majority of potential users of Geo::BUFR I would expect these programs 5867to be all that you will need Geo::BUFR for. 5868 5869Note that being Perl, this module cannot compete in speed with for 5870example the (free) ECMWF BUFRDC Fortran library. Still, some effort 5871has been invested in making the module reasonable fast in that the 5872core routines for encoding and decoding bitstreams are implemented in 5873C. 5874 5875 5876=head1 METHODS 5877 5878The C<get_> methods will return undef if the requested information is 5879not available. The C<set_> methods as well as C<fopen>, C<fclose>, 5880C<copy_from> and C<rewind> will always return 1, or croak if failing. 5881 5882Create a new object: 5883 5884 $bufr = Geo::BUFR->new(); 5885 $bufr = Geo::BUFR->new($BUFRmessages); 5886 5887The second form of C<new> is useful if you want to provide the BUFR 5888messages to decode directly as an input buffer (string). Note that 5889merely calling C<new($BUFRmessages)> will not decode anything in the 5890BUFR messages, for that you need to call C<next_observation()> from 5891the newly created object. You also have the option of providing the 5892BUFR messages in a file, using the no argument form of C<new()> and 5893then calling C<fopen>. 5894 5895Associate the object with a file for reading of BUFR messages: 5896 5897 $bufr->fopen($filename); 5898 5899Close the associated file that was opened by fopen: 5900 5901 $bufr->fclose(); 5902 5903Check for end-of-file (or end of the input buffer provided as argument 5904to C<new>): 5905 5906 $bufr->eof(); 5907 5908Returns true if end-of-file (or end of input buffer) is reached, false 5909if not. 5910 5911Ensure that next call to C<next_observation> will decode first subset 5912in first BUFR message: 5913 5914 $bufr->rewind(); 5915 5916Copy from an existing object: 5917 5918 $bufr1->copy_from($bufr2,$what); 5919 5920If $what is 'all' or not provided, will copy everything in $bufr2 into 5921$bufr1, i.e. making a clone. If $what is 'metadata', only the metadata 5922in section 0, 1 and 3 will be copied (and all of section 2 if present). 5923 5924Load B and D tables: 5925 5926 $bufr->load_BDtables($table); 5927 5928$table is optional, and should for BUFRDC be (base)name of a file 5929containing a BUFR table B or D, using the ECMWF BUFRDC naming 5930convention, i.e. [BD]'table_version'.TXT. For ECCODES, use last part 5931of path, e.g. on UNIX-like systems '0/wmo/18' for master tables and 5932'0/local/8/78/236' for local tables, or both if that is needed, 5933e.g. '0/wmo/18,0/local/8/78/236'. If no argument is provided, 5934C<load_BDtables()> will use BUFR section 1 information in the $bufr 5935object to decide which tables to load (which for ECCODES might be up 5936to 4 table files, both local and master tables). Previously loaded 5937tables are kept in memory, and C<load_BDtables> will return 5938immediately if the tables already have been loaded. Will die (croak) 5939if tables cannot be found, but (in the no argument version) not if 5940these are local tables (Local table version number > 0) and the 5941corresponding master tables exist (Local table version number = 0), 5942which then will be loaded instead. Returns table version for the 5943tables loaded (see C<get_table_version>). 5944 5945Load C table: 5946 5947 $bufr->load_Ctable($table,$default_table); 5948 5949Both $table and $default_table are optional. This will load the flag 5950and code tables (if not already loaded), which in ECMWF BUFRDC are put 5951in tables C'table_version'.TXT (not to be confused with WMO BUFR table 5952C, which contains the operator descriptors). $default_table will be 5953used if $table is not found. For $table and $default_table in ECCODES, 5954use (just like for C<load_BDtables>) last part of path, e.g. on 5955UNIX-like systems '0/wmo/18' for master tables and '0/local/8/78/236' 5956for local tables, or both if that is needed, 5957e.g. '0/wmo/18,0/local/8/78/236'. Will for ECCODES then load all 5958tables in the codetables subdirectory. If no arguments are provided, 5959C<load_Ctable()> will use BUFR section 1 information in the $bufr 5960object to decide which table(s) to load. Will die (croak) if table 5961cannot be found, but not if this is a local table and the 5962corresponding master table exists, which then will be loaded 5963instead. Returns table version for the table loaded. 5964 5965Get next observation (next subset in current BUFR message or first subset 5966in next message): 5967 5968 ($data, $descriptors) = $bufr->next_observation(); 5969 5970where $descriptors is a reference to the array of fully expanded 5971descriptors for this subset, $data is a reference to the corresponding 5972values. This method is meant to be used to iterate through all BUFR 5973messages in the file or input buffer (see C<new>) associated with the 5974$bufr object, see example program in L</SYNOPSIS>. Whenever a new BUFR 5975message is reached, section 0-3 will also be decoded, the contents of 5976which is then available through the access methods listed below. This 5977is the main BUFR decoding routine in Geo::BUFR, and will call 5978C<load_BDtables()> internally (unless decoding of section 4 has been 5979turned off by use of C<set_nodata> or C<set_filter_db>), but not 5980C<load_Ctable>. Consult L</"DECODING/ENCODING"> if you want more 5981precise info about what is returned in $data and $descriptors. 5982 5983C<next_observation> will return the empty list (so both $data and 5984$descriptors will be undef) in the following cases: if there are no 5985more BUFR messages in file/input buffer (so next call to C<eof()> will 5986return false), if no decoding of section 4 was requested in 5987C<set_nodata>, if filtering was turned on in C<set_filter_db> and the 5988BUFR message met the filter criteria in the user defined callback 5989function, or if the BUFR message contained 0 subsets. If you need to 5990distinguish the first case from the rest, one way would be to check 5991C<get_current_subset_number()> which will return 0 only in this first 5992case. 5993 5994If an error is met during decoding, it is possible to trap the error 5995in an eval and then continue calling C<next_observation> (as 5996demonstrated in source code of C<bufrread.pl>). Care has been taken 5997that BUFR messages with incorrectly stated BUFR length should not 5998cause later proper BUFR messages to be skipped. But the possibility of 5999an erroneous last BUFR message in file led to abandonment of the 6000convenient feature retained until Geo::BUFR version 1:25 of C<eof> 6001always returning false if there were no more BUFR messages in 6002file/input buffer. Instead you should expect last call to 6003C<next_observation> to return false (empty list). 6004 6005Filter BUFR messages: 6006 6007 $bufr->set_filter_cb(\&callback,@args); 6008 6009Here user is responsible for defining the callback subroutine. This 6010subroutine will then be called in C<next_observation> (with arguments 6011@args if provided) right after section 3 is decoded, and, if returning 6012true, will cause C<next_observation> to return immediately, without 6013even trying to decode section 4 (the data section). Here is a simple 6014example of such a callback (without arguments), filtering on AHL and 6015Data category (table A) of the BUFR message. 6016 6017 sub callback { 6018 my $obj = shift; 6019 return 1 if $obj->get_data_category != 0; 6020 my $ahl = $obj->get_current_ahl() || ''; 6021 return ($ahl =~ /^IS.... (ENMI|TEST)/); 6022 } 6023 6024Check result of filtering: 6025 6026 $bufr->is_filtered(); 6027 6028Will return true (1) if C<next_observation> returned immediately as 6029described for C<set_filter_cb> above. But calling C<is_filtered> 6030should rarely be needed, as in most cases the simple check 'next if 6031!$data' after calling C<next_observation> would be the natural way to 6032proceed. 6033 6034Print the contents of a subset in BUFR message: 6035 6036 print $bufr->dumpsections($data,$descriptors,$options); 6037 6038$options is optional. If this is first subset in message, will start 6039by printing message number and, if this is first message in a GTS 6040bulletin, AHL (Abbreviated Header Line), as well as contents of 6041sections 0, 1 and 3. For section 4, will also print subset 6042number. $options should be an anonymous hash with possible keys 6043'width' and 'bitmap', e.g. { width => 20, bitmap => 0 }. 'bitmap' 6044controls which of C<dumpsection4> and C<dumpsection4_with_bitmaps> 6045will be called internally by C<dumpsections>. Default value for 6046'bitmap' is 1, causing C<dumpsection4_with_bitmaps> to be 6047called. 'width' controls the value of $width used by the 6048C<dumpsection4...> methods, default is 15. If you intend to provide 6049the output from C<dumpsections> as input to C<reencode_message>, be 6050sure to set 'bitmap' to 0, and 'width' not smaller than the largest 6051data width in bytes among the descriptors with unit CCITTIA5 occuring 6052in the message. 6053 6054Normally C<dumpsections> is called after C<next_observation>, with 6055same arguments $data,$descriptors as returned from this call. From the 6056examples given at L<https://wiki.met.no/bufr.pm/start#bufrreadpl> you 6057can get an impression of what the output might look like. If 6058C<dumpsections> does not give you exactly what you want, you might 6059prefer to instead call the individual dumpsection methods below. 6060 6061Print the contents of sections 0-3 in BUFR message: 6062 6063 print $bufr->dumpsection0(); 6064 print $bufr->dumpsection1(); 6065 print $bufr->dumpsection2($sec2_code_ref); 6066 print $bufr->dumpsection3(); 6067 6068C<dumpsection2> returns an empty string if there is no optional 6069section in the message. The argument should be a reference to a 6070subroutine which takes the optional section as (a string) argument and 6071returns the text you want displayed after the 'Length of section:' 6072line. For general BUFR messages probably the best you can do is 6073displaying a hex dump, in which case 6074 6075 sub {return ' Hex dump:' . ' 'x26 . unpack('H*',substr(shift,4))} 6076 6077might be a suitable choice for $sec2_code_ref. For most applications 6078there should be no real need to call C<dumpsection2>. 6079 6080Print the data of a subset (descriptor, value, name and unit): 6081 6082 print $bufr->dumpsection4($data,$descriptors,$width); 6083 print $bufr->dumpsection4_with_bitmaps($data,$descriptors,$width); 6084 6085$width fixes the number of characters used for displaying the data 6086values, and is optional (defaults to 15). $data and $descriptors are 6087references to arrays of data values and BUFR descriptors respectively, 6088likely to have been fetched from C<next_observation>. Code and flag 6089values will be resolved if a C table has been loaded, i.e. if 6090C<load_Ctable> has been called earlier on. C<dumpsection4_with_bitmaps> 6091will display the bit-mapped values side by side with the corresponding 6092data values. If there is no bit-map in the BUFR message, 6093C<dumpsection4_with_bitmaps> will provide same output as 6094C<dumpsection4>. See L</"DECODING/ENCODING"> for some more information 6095about what is printed, and 6096L<https://wiki.met.no/bufr.pm/start#bufrreadpl> for real life examples 6097of output. 6098 6099Set verbose level: 6100 6101 Geo::BUFR->set_verbose($level); # 0 <= $level <= 6 6102 $bufr->set_verbose($level); 6103 6104Some info about what is going on in Geo::BUFR will be printed to 6105STDOUT if $level > 0. With $level set to 1, all that is printed is the 6106B, C and D tables used (with full path). Each line of verbose output 6107starts with 'BUFR.pm: ', except for the level 6 specific 6108output. Setting verbose level > 1 might be helpful when debugging, or 6109for example if you want to extract as much information as possible 6110from an incorrectly formatted BUFR message. 6111 6112No decoding of section 4 (data section): 6113 6114 Geo::BUFR->set_nodata($n); 6115 - $n=1 (or not provided): Skip decoding of section 4 (might speed up 6116 processing considerably if only metadata in section 1-3 is sought for) 6117 - $n=0: Decode section 4 (default in Geo::BUFR) 6118 6119No decoding of quality information: 6120 6121 Geo::BUFR->set_noqc($n); 6122 - $n=1 (or not provided): Don't decode quality information (more 6123 specifically: skip all descriptors after 222000) 6124 - $n=0: Decode quality information (default in Geo::BUFR) 6125 6126Enable/disable strict checking of BUFR format for recoverable errors 6127(like using BUFR compression for one subset message etc): 6128 6129 Geo::BUFR->set_strict_checking($n); 6130 - $n=0: disable checking (default in Geo::BUFR) 6131 - $n=1: warn (carp) if error but continue decoding 6132 - $n=2: die (croak) if error 6133 6134Confer L</STRICT CHECKING> for details of what is being checked if 6135strict checking is enabled. 6136 6137Show all BUFR table C operators (data description operators, F=2) as well 6138as all replication descriptors (F=1) when calling dumpsection4: 6139 6140 Geo::BUFR->set_show_all_operators($n); 6141 - $n=1 (or not provided): Show replication descriptors and all operators 6142 - $n=0: Show no replication descriptors and only the really informative 6143 data description operators (default in Geo::BUFR) 6144 6145C<set_show_all_operators(1)> cannot be combined with C<dumpsections> 6146with bitmap option set (which is the default). 6147 6148Set or get tableformat: 6149 6150 Geo::BUFR->set_tableformat($tableformat); 6151 $tableformat = Geo::BUFR->get_tableformat(); 6152 6153Set or get tablepath: 6154 6155 Geo::BUFR->set_tablepath($tablepath); 6156 $tablepath = Geo::BUFR->get_tablepath(); 6157 6158Get table version: 6159 6160 $table_version = $bufr->get_table_version($table); 6161 6162$table is optional. Return table version from $table if provided, or 6163else from section 1 information in the currently processed BUFR 6164message. For BUFRDC, this is a stripped down version of table name. If 6165for example $table = 'B0000000000088013001.TXT', will return 6166'0000000000088013001'. For ECCODES, this is last path of table 6167location (e.g. '0/wmo/29'), and a stringified list of two such paths 6168(master and local) if local tables are used 6169(e.g. '0/wmo/29,0/local/8/78/236'). Returns undef if impossible to 6170determine table version. 6171 6172Get number of subsets: 6173 6174 $nsubsets = $bufr->get_number_of_subsets(); 6175 6176Get current subset number: 6177 6178 $subset_no = $bufr->get_current_subset_number(); 6179 6180If decoding of section 4 has been skipped (due to use of C<set_nodata> 6181or C<set_filter_cb>), will return number of subsets. For a BUFR 6182message with 0 subsets, will actually return 1 (a bit weird perhaps, 6183but then this is a really weird kind of BUFR message to handle). 6184 6185Get current message number: 6186 6187 $message_no = $bufr->get_current_message_number(); 6188 6189Get current BUFR message: 6190 6191 $binary_msg = get_bufr_message(); 6192 6193This returns the original raw (binary, not the decoded) BUFR 6194message. An empty string will be returned if no BUFR message is found, 6195or if the currently processed BUFR message is erroneous (even if 6196section 4 is not decoded, there will at least be a check for finding 6197'7777' at expected end of BUFR message, as calculated from length of 6198BUFR message decoded from section 0). 6199 6200Get Abbreviated Header Line (AHL) before current message: 6201 6202 $ahl = $bufr->get_current_ahl(); 6203 6204If there is no AHL immediately preceding current message, default is 6205for C<get_current_ahl> to return undef. Sometimes that might not be 6206what you want, e.g. when processing a file with GTS bulletins with 6207possibly more than one BUFR message in each bulletin, and especially 6208so if filtering on AHL using C<set_filter_cb>. 6209 6210 Geo::BUFR->reuse_current_ahl($n); 6211 - $n=1 (or not provided): Will cause C<get_current_ahl> to return last 6212 AHL extracted and not undef if currently processed BUFR message has 6213 no (immediately preceding) AHL 6214 - $n=0: Reset C<get_current_ahl> to default behaviour as described 6215 above 6216 6217Check if AHL has been reused: 6218 6219 $bufr->ahl_is_reused(); 6220 6221Will return true (1) if the AHL returned by C<get_current_ahl> is a 6222reused one, i.e. the AHL is not immediately preceding the current BUFR 6223message. 6224 6225Check length of BUFR message (as stated in section 0): 6226 6227 $bufr->bad_bufrlength(); 6228 6229Will return true (1) if no '7777' is found at the end of BUFR message 6230(as calculated from the stated length of BUFR message in section 0), 6231which usually means that the BUFR message is badly corrupted 6232(e.g. truncated). But note that there should be no need to call 6233C<bad_bufrlength> if section 4 is decoded, as in this case you should 6234expect C<next_observation> to die with a more precise error message 6235describing the kind of corruption found. If no decoding of section 4 6236is done (because C<set_nodata> or C<set_filter_cb> were called), 6237however, C<next_observation> is likely not to throw an error, and you 6238can use C<bad_bufrlength> to decide what to do next (see source code of 6239C<bufrextract.pl> for example of use). 6240 6241Accessor methods for section 0-3: 6242 6243 $bufr->set_<variable>($variable); 6244 $variable = $bufr->get_<variable>(); 6245 6246where E<lt>variableE<gt> is one of 6247 6248 bufr_length (get only) 6249 bufr_edition 6250 master_table 6251 subcentre 6252 centre 6253 update_sequence_number 6254 optional_section (0 or 1) 6255 data_category 6256 int_data_subcategory 6257 loc_data_subcategory 6258 data_subcategory 6259 master_table_version 6260 local_table_version 6261 year_of_century 6262 year 6263 month 6264 day 6265 hour 6266 minute 6267 second 6268 local_use 6269 number_of_subsets 6270 observed_data (0 or 1) 6271 compressed_data (0 or 1) 6272 descriptors_unexpanded 6273 6274C<set_year_of_century(0)> will set year of century to 100. 6275C<get_year_of_century> will for BUFR edition 4 calculate year of 6276century from year in section 1. 6277 6278 6279Encode a new BUFR message: 6280 6281 $new_message = $bufr->encode_message($data_refs,$desc_refs); 6282 6283where $desc_refs->[$i] is a reference to the array of fully expanded 6284descriptors for subset number $i ($i=1 for first subset), 6285$data_refs->[$i] is a reference to the corresponding values, using 6286undef for missing values. The required metadata in section 0, 1 and 3 6287must have been set in $bufr before calling this method. See 6288L</"DECODING/ENCODING"> for meaning of 'fully expanded descriptors'. 6289 6290Encode a (single subset) NIL message: 6291 6292 $new_message = $bufr->encode_nil_message($stationid_ref,$delayed_repl_ref); 6293 6294$delayed_repl_ref is optional. In section 4 all values will be set to 6295missing except delayed replication factors and the (descriptor, value) 6296pairs in the hashref $stationid_ref. $delayed_repl_ref (if provided) 6297should be a reference to an array of data values for all descriptors 6298031001 and 031002 occuring in the message (these values must all be 6299nonzero), e.g. [3,1,2] if there are 3 such descriptors which should 6300have values 3, 1 and 2, in that succession. If $delayed_repl_ref is 6301omitted, all delayed replication factors will be set to 1. The 6302required metadata in section 0, 1 and 3 must have been set in $bufr 6303before calling this method (although number of subsets and BUFR 6304compression will automatically be set to 1 and 0 respectively, 6305whatever value they had before). 6306 6307Reencode BUFR message(s): 6308 6309 $new_messages = $bufr->reencode_message($decoded_messages,$width); 6310 6311$width is optional. Takes a text $decoded_messages as argument and 6312returns a (binary) string of BUFR messages which, when printed to file 6313and then processed by C<bufrread.pl> with no output modifying options set 6314(except possibly C<--width>), would give output equal to 6315$decoded_messages. If C<bufrread.pl> is to be called with C<--width 6316$width>, this $width must be provided to C<reencode_message> also. 6317 6318Join subsets from several messages: 6319 6320 ($data_refs,$desc_refs,$nsub) = Geo::BUFR->join_subsets($bufr_1,$subset_ref_1, 6321 ... $bufr_n,$subset_ref_n); 6322 6323where each $subset_ref_i is optional. Will return the data and 6324descriptors needed by C<encode_message> to encode a multi subset 6325message, extracting the subsets from the first message of each $bufr_i 6326object. All subsets in (first message of) $bufr_i will be used, unless 6327next argument is an array reference $subset_ref_i, in which case only 6328the subset numbers listed will be included, in the order specified. On 6329return $nsub will contain the total number of subsets thus 6330extracted. After a call to C<join_subsets>, the metadata (of the first 6331message) in each object will be available through the C<get_>-methods, 6332while a call to C<next_observation> will start extracting the first 6333subset in the first message. Here is an example of use, fetching first 6334subset from bufr object 1, all subsets from bufr object 2, and subsets 63354 and 2 from bufr object 3, then building up a new multi subset BUFR 6336message (which will succeed only if the bufr objects all have the same 6337descriptors in section 3): 6338 6339 my ($data_refs,$desc_refs,$nsub) = Geo::BUFR->join_subsets($bufr1, 6340 [1],$bufr2,$bufr3,[4,2]); 6341 my $new_bufr = Geo::BUFR->new(); 6342 # Get metadata from one of the objects, then reset those metadata 6343 # which might not be correct for the new message 6344 $new_bufr->copy_from($bufr1,'metadata'); 6345 $new_bufr->set_number_of_subsets($nsub); 6346 $new_bufr->set_update_sequence_number(0); 6347 $new_bufr->set_compressed_data(0); 6348 my $new_message = $new_bufr->encode_message($data_refs,$desc_refs); 6349 6350Extract BUFR table B information for an element descriptor: 6351 6352 ($name,$unit,$scale,$refval,$width) = $bufr->element_descriptor($desc); 6353 6354Will fetch name, unit, scale, reference value and data width in bits 6355for element descriptor $desc in the last table B loaded in the $bufr 6356object. Returns false if the descriptor is not found. 6357 6358Extract BUFR table D information for a sequence descriptor: 6359 6360 @descriptors = $bufr->sequence_descriptor($desc); 6361 $string = $bufr->sequence_descriptor($desc); 6362 6363Will return the descriptors in a direct (nonrecursive) lookup for the 6364sequence descriptor $desc in the last table D loaded in the $bufr 6365object. In scalar context the descriptors will be returned as a space 6366separated string. Returns false if the descriptor is not found. 6367 6368Resolve BUFR table descriptors (for printing): 6369 6370 print $bufr->resolve_descriptor($how,@descriptors); 6371 6372where $how is one of 'fully', 'partially', 'simply' and 'noexpand'. 6373Returns a text string suitable for printing information about the BUFR 6374table descriptors given. $how = 'fully': Expand all D descriptors 6375fully into B descriptors, with name, unit, scale, reference value and 6376width (each on a numbered line, except for replication operators which 6377are not numbered). $how = 'partially': Like 'fully', but expand D 6378descriptors only once and ignore replication. $how = 'noexpand': Like 6379'partially', but do not expand D descriptors at all. $how = 'simply': 6380Like 'partially', but list the descriptors on one single line with no 6381extra information provided. The relevant B/D table must have been 6382loaded before calling C<resolve_descriptor>. 6383 6384Resolve flag table value (for printing): 6385 6386 print $bufr->resolve_flagvalue($value,$flag_table,$B_table, 6387 $default_B_table,$num_leading_spaces); 6388 6389Last 2 arguments are optional. $default_B_table will be used if 6390$B_table is not found, $num_leading_spaces defaults to 0. 6391Examples: 6392 6393 print $bufr->resolve_flagvalue(4,8006,'B0000000000098013001.TXT') # BUFRDC 6394 print $bufr->resolve_flagvalue(4,8006,'0/wmo/13') # ECCODES, master table 6395 print $bufr->resolve_flagvalue(4,8193,'0/local/1/98/0') # ECCODES, local table 6396 6397Print the contents of BUFR code (or flag) table: 6398 6399 print $bufr->dump_codetable($code_table,$table,$default_table); 6400 6401where in BUFRDC $table is (base)name of the C...TXT file containing the code 6402tables, optionally followed by a default table which will be used if 6403$table is not found. 6404 6405C<resolve_flagvalue> and C<dump_codetable> will return empty string if 6406flag value or code table is not found. 6407 6408 6409Manipulate binary data (these are implemented in C for speed and primarily 6410intended as module internal subroutines): 6411 6412 $value = Geo::BUFR->bitstream2dec($bitstream,$bitpos,$num_bits); 6413 6414Extracts $num_bits bits from $bitstream, starting at bit $bitpos. The 6415extracted bits are interpreted as a nonnegative integer. Returns 6416undef if all bits extracted are 1 bits. 6417 6418 $ascii = Geo::BUFR->bitstream2ascii($bitstream,$bitpos,$num_bytes); 6419 6420Extracts $num_bytes bytes from bitstream, starting at $bitpos, and 6421interprets the extracted bytes as an ascii string. Returns undef if 6422the extracted bytes are all 1 bits. 6423 6424 Geo::BUFR->dec2bitstream($value,$bitstream,$bitpos,$bitlen); 6425 6426Encodes nonnegative integer value $value in $bitlen bits in 6427$bitstream, starting at bit $bitpos. Last byte will be padded with 1 6428bits. $bitstream must have been initialized to a string long enough to 6429hold $value. The parts of $bitstream before $bitpos and after last 6430encoded byte are not altered. 6431 6432 Geo::BUFR->ascii2bitstream($ascii,$bitstream,$bitpos,$width); 6433 6434Encodes ASCII string $ascii in $width bytes in $bitstream, starting at 6435$bitpos. Last byte will be padded with 1 bits. $bitstream must have 6436been initialized to a string long enough to hold $ascii. The parts of 6437$bitstream before $bitpos and after last encoded byte are not altered. 6438 6439 Geo::BUFR->null2bitstream($bitstream,$bitpos,$num_bits); 6440 6441Sets $num_bits bits in bitstream starting at bit $bitpos to 0 bits. 6442Last byte affected will be padded with 1 bits. $bitstream must be at 6443least $bitpos + $num_bits bits long. The parts of $bitstream before 6444$bitpos and after last encoded byte are not altered. 6445 6446=head1 DECODING/ENCODING 6447 6448The term 'fully expanded descriptors' used in the description of 6449C<encode_message> (and C<next_observation>) in L</METHODS> might need 6450some clarification. The short version is that the list of descriptors 6451should be exactly those which will be written out by running 6452C<dumpsection4> (or C<bufrread.pl> without any modifying options set) 6453on the encoded message. If you don't have a similar BUFR message at 6454hand to use as an example when wanting to encode a new message, you 6455might need a more specific prescription. Which is that for every data 6456value which occurs in the section 4 bitstream, you should include the 6457corresponding BUFR descriptor, using the artificial 999999 for 6458associated fields following the 204Y operator, I<and> including the 6459data operator descriptors 22[2345]000 and 23[2567]000 with data value 6460set to the empty string, if these occurs among the descriptors in 6461section 3 (rather: in the expansion of these, use C<bufrresolve.pl> to 6462check!). Element descriptors defining new reference values (following 6463the 203Y operator) will have F=0 (first digit in descriptor) replaced 6464with F=9 in C<next_observation>, while in C<encode_message> both F=0 6465and F=9 will be accepted for new reference values. When encoding 6466delayed repetition you should repeat the set of data (and descriptors) 6467to be repeated the number of times indicated by 031011 or 031012 (if 6468given the feedback that this is considered cumbersome, an option for 6469including the set of data/descriptors just once might be added later, 6470both for encoding end decoding). 6471 6472Some words about the procedure used for decoding and encoding data in 6473section 4 might shed some light on this choice of design. 6474 6475When decoding section 4 for a subset, first of all the BUFR 6476descriptors provided in section 3 are expanded as far as possible 6477without looking at the actual bitstream, i.e. by eliminating 6478nondelayed replication descriptors (F=1) and by using BUFR table D to 6479expand sequence descriptors (F=3). Then, for each of the thus expanded 6480descriptors, the data value is fetched from the bitstream according to 6481the prescriptions in BUFR table B, applying the data operator 6482descriptors (F=2) from BUFR table C as they are encountered, and 6483reexpanding the remaining descriptors every time a delayed replication 6484factor is fetched from bitstream. The resulting set of data values is 6485returned in an array @data, with the corresponding B (and sometimes 6486also some C) BUFR table descriptors in an array 6487@descriptors. C<next_observation> returns references to these two 6488arrays. For convenience, some of the data operator descriptors without 6489a corresponding data value (like 222000) are included in the 6490@descriptors because they are considered to provide valuable 6491information to the user, with corresponding value in @data set to the 6492empty string. These descriptors without a value are written by the 6493dumpsection4 methods on unnumbered lines, thereby distinguishing them 6494from descriptors corresponding to 'real' data values in section 4, 6495which are numbered consecutively. 6496 6497Encoding a subset is done in a very similar way, by expanding the 6498descriptors in section 3 as described above, but instead fetching the 6499data values from the @data array that the user supplies (actually 6500@{$data_refs->{$i}} where $i is subset number), and then finally 6501encoding this value to bitstream. 6502 6503The input parameter $desc_ref to C<encode_message> is in fact not 6504strictly necessary to be able to encode a new BUFR message. But there 6505is a good reason for requiring it. During encoding the descriptors 6506from expanding section 3 will consecutively be compared with the 6507descriptors in the user supplied $desc_ref, and if these at some point 6508differ, encoding will be aborted with an error message stating the 6509first descriptor which deviated from the expected one. By requiring 6510$desc_ref as input, the risk for encoding an erroneous section 4 is 6511thus greatly reduced, and also provides the user with highly valuable 6512debugging information if encoding fails. 6513 6514When decoding character data (unit CCITTIA5), any null characters 6515found are silently (unless $Strict_checking is set) removed, as well 6516as leading and trailing white space. 6517 6518=head1 BUFR TABLE FILES 6519 6520The BUFR table files should follow the format and naming conventions 6521used by one of these two ECMWF software packages: either BUFRDC 6522(download from https://confluence.ecmwf.int/display/BUFR/Releases), or 6523ecCodes (download from https://confluence.ecmwf.int/display/ECC/Releases). 6524 6525The utility programs in Geo::BUFR will look for table files by default 6526in the standard installation directories, which in Unix-like systems 6527will be /usr/local/lib/bufrtables for BUFRDC and 6528/usr/local/share/eccodes/definitions/bufr/tables for eCcodes. You can 6529change that behaviour by either providing the environment variable 6530BUFR_TABLES, or setting path explicitly by using the 6531C<--tablepath>. Note that while BUFR_TABLES is a well known concept in 6532BUFRDC software, the closest you get in eCcodes is probably 6533ECCODES_DEFINITION_PATH (see 6534e.g. https://confluence.ecmwf.int/display/ECC/BUFR%3A+Local+configuration), 6535for which BUFR_TABLES should (or could) be set to 6536ECCODES_DEFINITION_PATH/bufr/tables (again in Unix-like systems). 6537 6538=head1 STRICT CHECKING 6539 6540The package global $Strict_checking defaults to 6541 6542 0: Ignore recoverable errors in BUFR format met during decoding or encoding 6543 6544but can be changed to 6545 6546 1: Issue warning (carp) but continue decoding/encoding 6547 6548 2: Croak (die) instead of carp 6549 6550by calling C<set_strict_checking>. The following is checked for when 6551$Strict_checking is set to 1 or 2: 6552 6553=over 6554 6555=item * 6556 6557Total length of BUFR message as stated in section 0 bigger than actual length 6558 6559=item * 6560 6561Excessive bytes in section 4 (section longer than computed from section 3) 6562 6563=item * 6564 6565Compression set in section 3 for one subset message (BUFR reg. 94.6.3.2) 6566 6567=item * 6568 6569Bits 3-8 in octet 7 in section 3 not set to zero 6570 6571=item * 6572 6573Local reference value for compressed character data not having all 6574bits set to zero (94.6.3.2.i) 6575 6576=item * 6577 6578Illegal flag values (rightmost bit set for non-missing values) (Note (9) 6579to Table B in FM 94 BUFR) 6580 6581=item * 6582 6583Character data not being CCITTIA5 (Note (9) in FM 94 BUFR first page) 6584 6585=item * 6586 6587Null characters in CCITTIA5 data (Note (4) to Table B in FM 94 BUFR) 6588 6589=item * 6590 6591Missing CCITTIA5 value encoded as spaces 6592 6593=item * 6594 6595Invalid date and/or time in section 1 6596 6597=item * 6598 6599Cancellation operators (20[1-4]00, 203255 etc) when there is nothing to cancel 6600 6601=item * 6602 66030 subsets in message. This may not break any formal rules, but is 6604likely to cause problems in further data processing (and Geo::BUFR 6605will not allow you to encode or reencode such a message anyway). 6606 6607=item * 6608 6609Leaving out descriptors to be repeated when corresponding delayed 6610replication/repetition factor in section 4 is 0 and this is last data 6611item. E.g. ending 'Data descriptors unexpanded' in section 3 with 6612'106000 031001' when data value for 031001 is 0. This (mal)practice, 6613however, defies the very point of replication operations (BUFR 6614reg. 94.5.4). Presumably the purpose is to save some space in the BUFR 6615message, but then why not leave out also '106000 031001' and the (0) 6616data value for 031001? 6617 6618=item * 6619 6620Value encoded using BUFR compression which would be too big to encode 6621without compression. For example, for a data descriptor with data 6622width 9 bits a value of 510 ought to be the biggest value possible to 6623encode, but in a multisubset message using BUFR compression it is 6624possible to encode almost arbitrarily large values in single subsets 6625as long as the average over all subsets is contained within 9 6626bits. This is not breaking any formal rules, but almost certainly not 6627desirable. 6628 6629=back 6630 6631Plus some few more checks not considered interesting enough to be 6632mentioned here. 6633 6634=begin more_on_strict_checking 6635 6636These are: 6637- Replication of 0 descriptors (F=1, X=0) 6638- year_of_century > 100 6639- 206Y operator is not followed by a local descriptor 6640 6641 6642=end more_on_strict_checking 6643 6644=head1 BUGS OR MISSING FEATURES 6645 6646Some BUFR table C operators are not implemented or are untested, 6647mainly because I do not have access to BUFR messages containing such 6648operators. If you happen to come over a BUFR message which the current 6649module fails to decode properly, I would therefore highly appreciate 6650if you could mail me this. 6651 6652=head1 AUTHOR 6653 6654Pål Sannes E<lt>pal.sannes@met.noE<gt> 6655 6656=head1 CREDITS 6657 6658I am very grateful to Alvin Brattli, who (while employed as a 6659researcher at the Norwegian Meteorological Institute) wrote the first 6660version of this module, with the sole purpose of being able to decode 6661some very specific BUFR satellite data, but still provided the main 6662framework upon which this module is built. 6663 6664=head1 SEE ALSO 6665 6666Guide to WMO Table Driven Code Forms: FM 94 BUFR and FM 95 CREX; Layer 3: 6667Detailed Description of the Code Forms (for programmers of encoder/decoder 6668software) 6669 6670L<https://wiki.met.no/bufr.pm/start> 6671 6672=head1 COPYRIGHT 6673 6674Copyright (C) 2010-2020 MET Norway 6675 6676This module is free software; you can redistribute it and/or 6677modify it under the same terms as Perl itself. 6678 6679=cut 6680