# Copyright (C) 2005-2010, Parrot Foundation.

=head1 PDD 21: Namespaces

=head2 Abstract

Description and implementation of Parrot namespaces.

=head2 Description

=over 4

=item - Namespaces should be stored under first-level namespaces corresponding
to the HLL name

=item - Namespaces should be hierarchical

=item - The C<get_namespace> opcode takes a multidimensional hash key or an
array of name strings

=item - Namespaces follow the semantics of the HLL in which they're defined

=item - Exports follow the semantics of the library's language

=item - Two interfaces: typed and untyped

=back

=head2 Definitions

=head3 "HLL"

A High Level Language, such as Perl, Python, or Tcl, in contrast to PIR, which
is a low-level language.

=head3 "current namespace"

The I<current namespace> at runtime is the namespace associated with the
currently executing subroutine.  PASM assigns each subroutine a namespace when
compilation of the subroutine begins.  Don't change the associated namespace
of a subroutine unless you're prepared for weird consequences.

(PASM also has its own separate concept of current namespace which is used to
initialize the runtime current namespace as well as determine where to store
compiled symbols.)


=head2 Implementation

=head3 Namespace Indexing Syntax

Namespaces are denoted in Parrot as simple strings, multidimensional
hash keys, or arrays of name strings.

A namespace may appear in Parrot source code as the string C<"a"> or the
key C<["a"]>.

A nested namespace "b" inside the namespace "a" will appear as the key
C<["a"; "b"]>.

There is no limit to namespace nesting.
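
As a minimal sketch of the key and array forms, the following program looks up
the same nested namespace both ways and checks that the two results are the
same PMC.  The names C<a>, C<b>, and C<placeholder> are illustrative only, and
the sketch assumes that defining a sub under a C<.namespace> directive is
enough for that namespace to exist once the code is loaded.

=begin PIR

  .namespace ['a'; 'b']
  .sub placeholder
  .end

  .namespace []
  .sub main :main
    .local pmc by_key, names, by_array
    # Key form: the nesting is spelled out at compile time.
    by_key = get_hll_namespace ['a'; 'b']
    # Array form: the same path, built at run time.
    names = split '::', 'a::b'
    by_array = get_hll_namespace names
    # Both lookups are expected to yield the same namespace PMC.
    $I0 = issame by_key, by_array
    say $I0
  .end

=end PIR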

=head3 Naming Conventions

Parrot's target languages have a wide variety of namespace models. By
implementing an API and standard conventions, it should be possible to
allow interoperability while still allowing each one to choose the best
internal representation.


=over 4

=item True Root Namespace

The true root namespace is hidden from common usage, but it is available
via the C<get_root_namespace> opcode. For example:

  $P0 = get_root_namespace

This root namespace stringifies to the empty string.

=item HLL Root Namespaces

Each HLL must store public items in a namespace named with the lowercased
name of the HLL.  This is the HLL root namespace.  For instance, Tcl's
user-created namespaces should live in the C<tcl> namespace.  This
eliminates any accidental collisions between languages.

An HLL root namespace must be stored at the first level in Parrot's namespace
hierarchy.  These top-level namespaces should also be specified in a standard
Unicode encoding.  The reason for these restrictions is to allow compilers
to remain completely ignorant of each other.

Parrot internals are stored in the default HLL root namespace C<parrot>.

=item HLL Implementation Namespaces

Each HLL must store implementation internals (private items) in an HLL
root namespace named with an underscore and the lowercased name of the
HLL.  For instance, Tcl's implementation internals should live in the
C<_tcl> namespace.

=item HLL User-Created Namespaces

Each HLL must store all user-created namespaces under the HLL root
namespace.  It is suggested that HLLs use hierarchical namespaces to the
extent practical.  A single flat namespace can be made to work, but it
complicates symbol exportation.

=back
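
The relationship between the true root and an HLL root can be sketched as
follows.  This assumes code compiled without a C<.HLL> directive, which runs in
the default C<parrot> HLL; the local names are illustrative.

=begin PIR

  .sub main :main
    .local pmc hll_root, via_root
    # The HLL root of the currently executing sub.
    hll_root = get_hll_namespace
    # The same namespace, reached from the true root by its lowercased name.
    via_root = get_root_namespace ['parrot']
    # The two are expected to be the same namespace PMC.
    $I0 = issame hll_root, via_root
    say $I0
  .end

=end PIR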

=head3 Namespace PMC API

Most languages leave their symbols plain, which makes lookups quite
straightforward.  Others use sigils or other mangling techniques, complicating
the problem of interoperability.

Parrot namespaces assist with interoperability by providing two interface
subsets: the I<untyped interface> and the I<typed interface>.

=head4 Untyped Interface

Each HLL may, when working with its own namespace objects, use the I<untyped
interface>, which allows direct naming in the native style of the namespace's
HLL.

This interface consists of the standard Parrot hash interface, with all its
keys, values, lookups, deletions, etc.  Just treat the namespace like a
hash.  (It probably is one, really, deep down.)

The untyped interface also has one method:

=over 4

=item C<get_name>

=begin PIR_FRAGMENT

    $P1 = $P2.'get_name'()

=end PIR_FRAGMENT

Gets the name of the namespace $P2 as an array of strings.  For example,
if $P2 is a Perl 5 namespace "Some::Module", within the Perl 5 HLL, then
get_name() on $P2 returns an array of "perl5", "Some", "Module". It
returns the literal namespace names as the HLL stored them, without
filtering for name mangling.

NOTE: Due to aliasing, this value may be wrong -- i.e. it may disagree with
the namespace name with which you found the namespace in the first place.

=back
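
A minimal sketch of the untyped interface: the namespace is treated as a hash
for storing and fetching, and C<get_name> reports its full path.  The
namespace C<Some::Module>, the key C<greeting>, and the sub C<placeholder> are
assumptions made for the example.

=begin PIR

  .namespace ['Some'; 'Module']
  .sub placeholder
  .end

  .namespace []
  .sub main :main
    .local pmc ns, value, fetched, name
    ns = get_hll_namespace ['Some'; 'Module']
    # Untyped interface: store and fetch as if the namespace were a hash.
    value = new 'String'
    value = 'stored via the hash interface'
    ns['greeting'] = value
    fetched = ns['greeting']
    say fetched
    # get_name returns the full path, starting with the HLL name.
    name = ns.'get_name'()
    $S0 = join '::', name
    say $S0
  .end

=end PIR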

=head4 Typed Interface

When a given namespace's HLL is either different from the current HLL or
unknown, an HLL should generally use only the language-agnostic namespace
interface.  This interface isolates HLLs from each others' naming quirks.  It
consists of C<add_foo()>, C<find_foo()>, and C<del_foo()> methods, for
values of "foo" including "sub" (something executable), "namespace"
(something in which to find more names), and "var" (anything).

NOTE: The job of the typed interface is to bridge I<naming> differences, and
I<only> naming differences.  Therefore: 1) It does not enforce, nor even
notice, the interface requirements of "sub" or "namespace": e.g.
execution of C<add_sub("foo", $P0)> does I<not> automatically guarantee
that $P0 is an invokable subroutine; and 2) it does not prevent
overwriting one type with another.
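
As a rough illustration of the C<add_foo()>, C<find_foo()>, and C<del_foo()>
pattern, the sketch below stores, looks up, and deletes a variable in the
current namespace.  The name C<answer> is arbitrary, and the sketch assumes
that C<find_var> on the base namespace PMC returns a null PMC for a missing
name.

=begin PIR

  .sub main :main
    .local pmc ns, value, found
    ns = get_namespace
    value = new 'Integer'
    value = 42
    # Store and look up a variable through the typed interface.
    ns.'add_var'('answer', value)
    found = ns.'find_var'('answer')
    say found
    # Delete it, then confirm that the lookup no longer finds it.
    ns.'del_var'('answer')
    found = ns.'find_var'('answer')
    if null found goto deleted
    say 'still present'
    .return ()
  deleted:
    say 'deleted'
  .end

=end PIR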

=over 4

=item C<add_namespace>

=begin PIR_FRAGMENT

    $P1.'add_namespace'($S2, $P3)

=end PIR_FRAGMENT

Store $P3 as a namespace under the namespace $P1, with the name of $S2.

=item C<add_sub>

=begin PIR_FRAGMENT

    $P1.'add_sub'($S2, $P3)

=end PIR_FRAGMENT

Store $P3 as a subroutine with the name of $S2 in the namespace $P1.

=item C<add_var>

=begin PIR_FRAGMENT

    $P1.'add_var'($S2, $P3)

=end PIR_FRAGMENT

Store $P3 as a variable with the name of $S2 in the namespace $P1.

IMPLEMENTATION NOTE: Perl namespace implementations may choose to implement
add_var() by checking which parts of the variable interface are
implemented by $P3 (scalar, array, and/or hash) so it can decide on an
appropriate sigil.

=item C<del_namespace>, C<del_sub>, C<del_var>

=begin PIR_FRAGMENT

    $P1.'del_namespace'($S2)
    $P1.'del_sub'($S2)
    $P1.'del_var'($S2)

=end PIR_FRAGMENT

Delete the namespace, sub, or variable named $S2 from the namespace $P1.

=item C<find_namespace>, C<find_sub>, C<find_var>

=begin PIR_FRAGMENT

    $P1 = $P2.'find_namespace'($S3)
    $P1 = $P2.'find_sub'($S3)
    $P1 = $P2.'find_var'($S3)

=end PIR_FRAGMENT

Find the namespace, sub, or variable named $S3 in the namespace $P2.

IMPLEMENTATION NOTE: Perl namespace implementations should implement
find_var() to check all variable sigils, but the order is not to be counted on
by users.  If you're planning to let Python code see your module, you should
avoid exporting both C<our $A> and C<our @A>.  (Well, you might want to
consider not exporting variables at all, but that's a style issue.)

=item C<export_to>

=begin PIR_FRAGMENT

    $P1.'export_to'($P2, $P3)

=end PIR_FRAGMENT

Export items from the namespace $P1 into the namespace $P2.  The items to
export are named in $P3, which may be an array of strings, a hash, or null.
If $P3 is an array of strings, interpretation of items in an array follows
the conventions of the source (exporting) namespace.
If $P3 is a hash, the keys correspond to the names in the source namespace,
and the values correspond to the names in the destination namespace.
If a hash value is null or an empty string, the name in the hash key is used.
A null $P3 requests the 'default' set of items.
Any other type passed into $P3 throws an exception.

The base Parrot namespace export_to() function interprets item names as
literals -- no wildcards or other special meaning.  There is no default list
of items to export, so $P3 of null and $P3 of an empty array have the same
behavior.

NOTE: Exportation may entail non-obvious, odd, or even mischievous behavior.
For example, Perl's pragmata are implemented as exports, and they don't
actually export anything.

IMPLEMENTATION EXAMPLES: Suppose a Perl program were to import some Tcl module
with an import pattern of "c*" -- something that might be expressed in Perl 6
as C<use tcl:Some::Module 'c*'>.  This operation would import all the commands
that start with 'c' from the given Tcl namespace into the current Perl
namespace.  This is so because, regardless of whether 'c*' is a Perl 6 style
export pattern, it I<is> a valid Tcl export pattern.

{XXX - The ':' for HLL is just proposed. This example will need to be
updated later.}

IMPLEMENTATION NOTE: Most namespace C<export_to> implementations will restrict
themselves to using the typed interface on the target namespace.  However,
they may also decide to check the type of the target namespace and, if it
turns out to be of a compatible type, to use same-language shortcuts.

DESIGN TODO: Figure out a good convention for a default export list in the
base namespace PMC.  Maybe a standard method "expand_export_list()"?

=back
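
The array form of C<export_to> might be exercised as in the sketch below, which
exports one sub from a source namespace into a destination namespace and then
calls it through the destination.  The namespaces C<Source> and C<Target> and
the subs C<greet> and C<placeholder> are invented for the example, and the
behavior shown is that of the base Parrot namespace PMC, which treats item
names as literals.

=begin PIR

  .namespace ['Source']
  .sub greet
    say 'hello from Source'
  .end

  .namespace ['Target']
  .sub placeholder
  .end

  .namespace []
  .sub main :main
    .local pmc src, dest, names, imported
    src = get_hll_namespace ['Source']
    dest = get_hll_namespace ['Target']
    # Name the items to export as an array of strings.
    names = new 'ResizableStringArray'
    push names, 'greet'
    src.'export_to'(dest, names)
    # The exported sub should now be reachable through the destination.
    imported = dest.'find_sub'('greet')
    imported()
  .end

=end PIR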

=head3 Compiler PMC API

=head4 Methods

=over 4

=item C<parse_name>

=begin PIR_FRAGMENT

    $P1 = $P2.'parse_name'($S3)

=end PIR_FRAGMENT

Parse the name in $S3 using the rules specific to the compiler $P2, and
return an array of individual name elements.

For example, a Java compiler would turn 'C<a.b.c>' into C<['a','b','c']>,
while a Perl compiler would turn 'C<a::b::c>' into the same result.
Meanwhile, due to Perl's sigil rules, 'C<$a::b::c>' would become
C<['a','b','$c']>.

=item C<get_namespace>

=begin PIR_FRAGMENT

    $P1 = $P2.'get_namespace'($P3)

=end PIR_FRAGMENT

Ask the compiler $P2 to find the namespace named by the elements of the
array in $P3.  If $P3 is a null PMC or an empty array, C<get_namespace>
retrieves the base namespace for the HLL.  It returns a namespace PMC on
success and a null PMC on failure.

This method allows other HLLs to know one name (the HLL) and then work with
that HLL's modules without having to know the name it chose for its namespace
tree.  (If you really want to know the name, the get_name() method should work
on the returned namespace PMC.)

Note that this method is basically a convenience and/or performance hack, as
it does the equivalent of C<get_root_namespace> followed by
zero or more calls to C<find_namespace()> on the resulting namespaces.
However, any compiler is free to cheat if it doesn't get caught, e.g. to use
the untyped namespace interface if the language doesn't mangle namespace names.

=item C<load_library>

=begin PIR_FRAGMENT

    $P1.'load_library'($P2, $P3)

=end PIR_FRAGMENT

Ask this compiler to load a library/module named by the elements of the array
in $P2, with optional control information in $P3.

For example, Perl 5's module named "Some::Module" should be loaded using (in
pseudo Perl 6): C<perl5.load_library(["Some", "Module"], null)>.

The meaning of $P3 is compiler-specific.  The only universal legal value is
Null, which requests a "normal" load.  The meaning of "normal" varies, but
the ideal would be to perform only the minimal actions required.

On failure, an exception is thrown.

=back
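
To make the C<parse_name> behavior concrete, here is a minimal sketch of the
splitting rule described for a Perl-style compiler.  The sub
C<parse_perl_name> is a hypothetical stand-alone helper, not a Parrot API or
the method of any real compiler PMC, and it handles only the C<$> sigil.  For
the input C<$a::b::c> it should print C<a,b,$c>.

=begin PIR

  # Hypothetical helper, not part of Parrot: mimics what a Perl-style
  # compiler's parse_name might return.
  .sub 'parse_perl_name'
    .param string name
    .local string sigil, first, last
    .local pmc parts
    sigil = ''
    first = substr name, 0, 1
    if first != '$' goto split_name
    # Strip the leading sigil; it belongs on the final element.
    sigil = '$'
    name = substr name, 1
  split_name:
    parts = split '::', name
    last = pop parts
    last = concat sigil, last
    push parts, last
    .return (parts)
  .end

  .sub main :main
    $P0 = 'parse_perl_name'('$a::b::c')
    $S0 = join ',', $P0
    say $S0
  .end

=end PIR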

=head3 Subroutine PMC API

Some information about subroutines must be available in order to implement
correct namespace behavior.

=head4 Methods

=over 4

=item C<get_namespace>

=begin PIR_FRAGMENT

    $P1 = $P2.'get_namespace'()

=end PIR_FRAGMENT

Retrieve the namespace $P1 where the subroutine $P2 was defined. (As
opposed to the namespace(s) that it may have been exported to.)

=back
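
A short sketch of this method: fetch a sub, ask it for its defining namespace,
and print that namespace's name.  The namespace C<Example> and the sub
C<helper> are illustrative names.

=begin PIR

  .namespace ['Example']
  .sub helper
  .end

  .namespace []
  .sub main :main
    .local pmc helper_sub, home, name
    helper_sub = get_hll_global ['Example'], 'helper'
    # Ask the sub where it was defined, then print that namespace's name.
    home = helper_sub.'get_namespace'()
    name = home.'get_name'()
    $S0 = join '::', name
    say $S0
  .end

=end PIR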

=head3 Namespace Opcodes

The namespace opcodes all have three variants: one that operates from the
currently selected namespace (i.e. the namespace of the currently
executing subroutine), one that operates from the HLL root namespace
(identified by "hll" in the opcode name), and one that operates from the
true root namespace (identified by "root" in the name).

=over 4

=item C<set_namespace>

=begin PIR_FRAGMENT_INVALID

    set_namespace ['key'], $P1
    set_hll_namespace ['key'], $P1
    set_root_namespace ['key'], $P1

=end PIR_FRAGMENT_INVALID

Add the namespace PMC $P1 under the name denoted by a multidimensional
hash key.

=begin PIR_FRAGMENT_INVALID

    set_namespace $P1, $P2
    set_hll_namespace $P1, $P2
    set_root_namespace $P1, $P2

=end PIR_FRAGMENT_INVALID

Add the namespace PMC $P2 under the name denoted by an array of name
strings $P1.

=item C<get_namespace>

=begin PIR_FRAGMENT

    $P1 = get_namespace
    $P1 = get_hll_namespace
    $P1 = get_root_namespace

=end PIR_FRAGMENT

Retrieve the current namespace, the HLL root namespace, or the true
root namespace and store it in $P1.

=begin PIR_FRAGMENT_INVALID

    $P1 = get_namespace [key]
    $P1 = get_hll_namespace [key]
    $P1 = get_root_namespace [key]

=end PIR_FRAGMENT_INVALID

Retrieve the namespace denoted by a multidimensional hash key and
store it in C<$P1>.

=begin PIR_FRAGMENT

    $P1 = get_namespace $P2
    $P1 = get_hll_namespace $P2
    $P1 = get_root_namespace $P2

=end PIR_FRAGMENT

Retrieve the namespace denoted by the array of names $P2 and store it in
C<$P1>.

Thus, to get the "Foo::Bar" namespace from the top-level of the HLL if
the name was known at compile time, you could retrieve the namespace
with a key:

=begin PIR_FRAGMENT

  $P0 = get_hll_namespace ["Foo"; "Bar"]

=end PIR_FRAGMENT

If the name was not known at compile time, you would retrieve the
namespace with an array instead:

=begin PIR_FRAGMENT

  $P1 = split "::", "Foo::Bar"
  $P0 = get_hll_namespace $P1

=end PIR_FRAGMENT

=item C<make_namespace>

=begin PIR_FRAGMENT_INVALID

    $P1 = make_namespace [key]
    $P1 = make_hll_namespace [key]
    $P1 = make_root_namespace [key]

=end PIR_FRAGMENT_INVALID

Create and retrieve the namespace denoted by a multidimensional hash key
and store it in C<$P1>. If the namespace already exists, only retrieve
it.

=begin PIR_FRAGMENT_INVALID

    $P1 = make_namespace $P2
    $P1 = make_hll_namespace $P2
    $P1 = make_root_namespace $P2

=end PIR_FRAGMENT_INVALID

Create and retrieve the namespace denoted by the array of names $P2 and
store it in C<$P1>. If the namespace already exists, only retrieve it.

=item C<get_global>

=begin PIR_FRAGMENT

    $P1 = get_global $S2
    $P1 = get_hll_global $S2
    $P1 = get_root_global $S2

=end PIR_FRAGMENT

Retrieve the symbol named $S2 in the current namespace, HLL root
namespace, or true root namespace.

=begin PIR_FRAGMENT

    .local pmc key
    $P1 = get_global [key], $S2
    $P1 = get_hll_global [key], $S2
    $P1 = get_root_global [key], $S2

=end PIR_FRAGMENT

Retrieve the symbol named $S2 from the namespace denoted by a multidimensional
hash key, relative to the current namespace, HLL root namespace, or true root
namespace.

=begin PIR_FRAGMENT

    $P1 = get_global $P2, $S3
    $P1 = get_hll_global $P2, $S3
    $P1 = get_root_global $P2, $S3

=end PIR_FRAGMENT

Retrieve the symbol named $S3 from the namespace denoted by the array of names
$P2, relative to the current namespace, HLL root namespace, or true root
namespace.

=item C<set_global>

=begin PIR_FRAGMENT

    set_global $S1, $P2
    set_hll_global $S1, $P2
    set_root_global $S1, $P2

=end PIR_FRAGMENT

Store $P2 as the symbol named $S1 in the current namespace, HLL root
namespace, or true root namespace.

=begin PIR_FRAGMENT

    .local pmc key
    set_global [key], $S1, $P2
    set_hll_global [key], $S1, $P2
    set_root_global [key], $S1, $P2

=end PIR_FRAGMENT

Store $P2 as the symbol named $S1 in the namespace denoted by a
multidimensional hash key, relative to the current namespace, HLL root
namespace, or true root namespace.  If the given namespace does not exist, it
is created.

=begin PIR_FRAGMENT

    set_global $P1, $S2, $P3
    set_hll_global $P1, $S2, $P3
    set_root_global $P1, $S2, $P3

=end PIR_FRAGMENT

Store $P3 as the symbol named $S2 in the namespace denoted by the array of
names $P1, relative to the current namespace, HLL root namespace, or true root
namespace.  If the given namespace does not exist, it is created.

=back
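
The global opcodes can be combined into a round trip, sketched below: a value
is stored under a nested namespace (created on demand by C<set_hll_global>)
and then read back using both the key form and the array form.  The namespace
C<Demo::Config> and the symbol C<greeting> are made up for the example.

=begin PIR

  .sub main :main
    .local pmc value, by_key, names, by_array
    value = new 'String'
    value = 'hello'
    # Storing creates the Demo and Demo::Config namespaces if needed.
    set_hll_global ['Demo'; 'Config'], 'greeting', value
    # Read it back with a compile-time key.
    by_key = get_hll_global ['Demo'; 'Config'], 'greeting'
    say by_key
    # Read it back with a namespace path built at run time.
    names = split '::', 'Demo::Config'
    by_array = get_hll_global names, 'greeting'
    say by_array
  .end

=end PIR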

=head3 HLL Namespace Mapping

In order to make this work, Parrot must somehow figure out what type of
namespace PMC to create.

=head4 Default Namespace

The default namespace PMC will implement Parrot's current behavior.

=head4 Compile-time Creation

This Perl:

  #!/usr/bin/perl
  package Foo;
  $x = 5;

should map roughly to this PIR:

=begin PIR_INVALID

  .HLL "Perl5"
  .loadlib "perl5_group"
  .namespace [ "Foo" ]
  .sub main :main
    $P0 = new 'PerlInt'
    $P0 = 5
    set_global "$x", $P0
  .end

=end PIR_INVALID

In this case, the C<main> sub would be tied to Perl 5 by the C<.HLL>
directive, so a Perl 5 namespace would be created.

=head4 Run-time Creation

Consider the following Perl 5 program:

  #!/usr/bin/perl
  $a = 'x';
  ${"Foo::$a"} = 5;

The C<Foo::> namespace is created at run-time (without any optimizations).  In
these cases, Parrot should create the namespace based on the HLL of the PIR
subroutine that calls the store function.

=begin PIR_INVALID

  .HLL "Perl5"
  .loadlib "perl5_group"
  .sub main :main
    # $a = 'x';
    $P0 = new 'PerlString'
    $P0 = "x"
    set_global "$a", $P0
    # ${"Foo::$a"} = 5;
    $P1 = new 'PerlString'
    $P1 = "Foo::"
    $P1 .= $P0
    $S0 = $P1
    $P2 = split "::", $S0
    $S0 = pop $P2
    $S0 = "$" . $S0
    $P3 = new 'PerlInt'
    $P3 = 5
    set_global $P2, $S0, $P3
  .end

=end PIR_INVALID

In this case, C<set_global> should see that it was called from "main",
which is in a Perl 5 namespace, so it will create the "Foo" namespace as
a Perl 5 namespace.

=head2 Language Notes

=head3 Perl 6

=head4 Sigils

Perl 6 may wish to be able to access the namespace as a hash with sigils.
That is certainly possible, even with subroutines and methods.  It is not
important that an HLL use the typed namespace API; it is only important that
it provide the API for others to use.

So Perl 6 may implement C<get_keyed> and C<set_keyed> VTABLE slots that
allow the namespace PMC to be used as a hash.  The C<find_sub> method would,
in this case, prepend a "&" sigil to the sub/method name and search in the
internal hash.

=head3 Python

=head4 Importing from Python

Since functions and variables overlap in Python's namespaces, when exporting
to another HLL's namespace, the Python namespace PMC's C<export_to> method
should use introspection to determine whether C<x> should be added using
C<add_var> or C<add_sub>.  C<$I0 = does $P0, "Sub"> may be enough to decide
correctly.
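
The introspection step might look like the sketch below.  C<export_item> and
C<callable_item> are hypothetical helpers, not part of Parrot, and the sketch
uses the C<isa> opcode with the class name C<Sub> in place of the C<does>
check mentioned above; a real Python namespace PMC may need a broader test.

=begin PIR

  # Hypothetical helper, not part of Parrot: adds an item through the typed
  # interface, choosing add_sub or add_var by inspecting the item.
  .sub 'export_item'
    .param pmc dest
    .param string name
    .param pmc item
    $I0 = isa item, 'Sub'
    if $I0 goto as_sub
    dest.'add_var'(name, item)
    .return ()
  as_sub:
    dest.'add_sub'(name, item)
  .end

  .sub 'callable_item'
    say 'called'
  .end

  .sub main :main
    .local pmc dest, item
    dest = get_namespace
    # A plain value is added as a variable.
    item = new 'Integer'
    item = 7
    'export_item'(dest, 'x', item)
    # A subroutine is added as a sub.
    item = get_global 'callable_item'
    'export_item'(dest, 'f', item)
    $P0 = dest.'find_var'('x')
    say $P0
    $P1 = dest.'find_sub'('f')
    $P1()
  .end

=end PIR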

=head4 Subroutines and Namespaces

Since Python's subroutines and namespaces are just variables (all three share
a single set of names), the Python PMC's C<find_var> method may return
subroutines as variables.


=head3 Examples

=head4 Aliasing

Perl:

  #!/usr/bin/perl6
  sub foo {...}
  %Foo::{"&bar"} = &foo;

PIR:

=begin PIR

  .sub main :main
    $P0 = get_global "&foo"
    $P1 = get_namespace ["Foo"]

    # A smart perl6 compiler would emit this,
    # because it knows that Foo is a perl6 namespace:
    $P1["&bar"] = $P0

    # But a naive perl6 compiler would emit this:
    $P1.'add_sub'("bar", $P0)

  .end

  # Named with its sigil so that the get_global "&foo" lookup above matches.
  .sub '&foo'
    #...
  .end

=end PIR

=head4 Cross-language Exportation

Perl 5:

  #!/usr/bin/perl
  use tcl:Some::Module 'w*';   # XXX - ':' after HLL may change in Perl 6
  write("this is a tcl command");

PIR (without error checking):

=begin PIR

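  .sub main :main
    .local pmc tcl
    .local pmc ns
    .local pmc patterns
    tcl = compreg "tcl"
    # Name the module as an array of name strings.
    ns = new 'ResizableStringArray'
    push ns, "Some"
    push ns, "Module"
    null $P0
    tcl.'load_library'(ns, $P0)
    $P0 = tcl.'get_namespace'(ns)
    $P1 = get_namespace
    # export_to expects an array of strings, a hash, or null.
    patterns = new 'ResizableStringArray'
    push patterns, 'w*'
    $P0.'export_to'($P1, patterns)
    "write"("this is a tcl command")
  .end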

=end PIR

=head2 References

None.

=cut

__END__
Local Variables:
  fill-column:78
End:
vim: expandtab shiftwidth=4: