Design Decisions

The Lichen language design involves some different choices to those taken in Python's design. Many of these choices are motivated by the following criteria:

Lichen is in many ways a restricted form of Python. In particular, restrictions on the attribute names supported by each object help to clearly define the object types in a program, allowing us to identify those objects when they are used. Consequently, optimisations that can be employed in a Lichen program become possible in situations where they would have been difficult or demanding to employ in a Python program.

Some design choices evoke memories of earlier forms of Python. Removing nested scopes simplifies the inspection of programs and run-time representations and mechanisms. Other choices seek to remedy difficult or defective aspects of Python, notably the behaviour of Python's import system.

  1. Attributes
    1. Fixed Attribute Names
    2. No Shadowing
    3. No Dynamic Attributes
  2. Naming
    1. Traditional Local, Global and Built-In Scopes Only
    2. Reserved Self
    3. Instance Attribute Initialisers
  3. Inheritance and Binding
    1. Inherited Class Attributes
    2. Unbound Methods
    3. Assigning Functions to Class Attributes
    4. Well-Defined Base Classes
    5. Class Definitions and Functions
  4. Modules and Packages
    1. Independent Modules
    2. Specific Name Imports Only
    3. Modules Imported Independently
  5. Syntax and Control-Flow
    1. No If Expressions or Comprehensions
    2. No With Statement
    3. No Generators
    4. Positional and Keyword Arguments Only
    5. Positional Parameters Only

Attributes

Lichen

Python

Rationale

Objects have a fixed set of attribute names

Objects can gain and lose attributes at run-time

Having fixed sets of attributes helps identify object types

Instance attributes may not shadow class attributes

Instance attributes may shadow class attributes

Forbidding shadowing simplifies access operations

Attributes are simple members of object structures

Dynamic handling and computation of attributes is supported

Forbidding dynamic attributes simplifies access operations

Fixed Attribute Names

Attribute names are bound for classes through assignment in the class namespace, for modules in the module namespace, and for instances in methods through assignment to self. Class and instance attributes are propagated to descendant classes and instances of descendant classes respectively. Once bound, attributes can be modified, but new attributes cannot be bound by other means, such as the assignment of an attribute to an arbitrary object that would not already support such an attribute.

class C:
    a = 123
    def __init__(self):
        self.x = 234

C.b = 456 # not allowed (b not bound in C)
C().y = 567 # not allowed (y not bound for C instances)

Permitting the addition of attributes to objects would then require that such addition attempts be associated with particular objects, leading to a potentially iterative process involving object type deduction and modification, also causing imprecise results.

No Shadowing

Instances may not define attributes that are provided by classes.

class C:
    a = 123
    def shadow(self):
        self.a = 234 # not allowed (attribute shadows class attribute)

Permitting this would oblige instances to support attributes that, when missing, are provided by consulting their classes but, when not missing, may also be provided directly by the instances themselves.

No Dynamic Attributes

Instance attributes cannot be provided dynamically, such that any missing attribute would be supplied by a special method call to determine the attribute's presence and to retrieve its value.

class C:
    def __getattr__(self, name): # not supported
        if name == "missing":
            return 123

Permitting this would require object types to potentially support any attribute, undermining attempts to use attributes to identify objects.

Naming

Lichen

Python

Rationale

Names may be local, global or built-in: nested namespaces must be initialised explicitly

Names may also be non-local, permitting closures

Limited name scoping simplifies program inspection and run-time mechanisms

self is a reserved name and is optional in method parameter lists

self is a naming convention, but the first method parameter must always refer to the accessed object

Reserving self assists deduction; making it optional is a consequence of the method binding behaviour

Instance attributes can be initialised using .name parameter notation

Workarounds involving decorators and introspection are required for similar brevity

Initialiser notation eliminates duplication in program code and is convenient

Traditional Local, Global and Built-In Scopes Only

Namespaces reside within a hierarchy within modules: classes containing classes or functions; functions containing other functions. Built-in names are exposed in all namespaces, global names are defined at the module level and are exposed in all namespaces within the module, locals are confined to the namespace in which they are defined.

However, locals are not inherited by namespaces from surrounding or enclosing namespaces.

def f(x):
    def g(y):
        return x + y # not permitted: x is not inherited from f in Lichen (it is in Python)
    return g

def h(x):
    def i(y, x=x): # x is initialised but held in the namespace of i
        return x + y # succeeds: x is defined
    return i

Needing to access outer namespaces in order to access any referenced names complicates the way in which such dynamic namespaces would need to be managed. Although the default initialisation technique demonstrated above could be automated, explicit initialisation makes programs easier to follow and avoids mistakes involving globals having the same name.

Reserved Self

The self name can be omitted in method signatures, but in methods it is always initialised to the instance on which the method is operating.

class C:
    def f(y): # y is not the instance
        self.x = y # self is the instance

The assumption in methods is that self must always be referring to an instance of the containing class or of a descendant class. This means that self cannot be initialised to another kind of value, which Python permits through the explicit invocation of a method with the inclusion of the affected instance as the first argument. Consequently, self becomes optional in the signature because it is not assigned in the same way as the other parameters.

Instance Attribute Initialisers

In parameter lists, a special notation can be used to indicate that the given name is an instance attribute that will be assigned the argument value corresponding to the parameter concerned.

class C:
    def f(self, .a, .b, c): # .a and .b indicate instance attributes
        self.c = c # a traditional assignment using a parameter

To use the notation, such dot-qualified parameters must appear only in the parameter lists of methods, not plain functions. The qualified parameters are represented as locals having the same name, and assignments to the corresponding instance attributes are inserted into the generated code.

class C:
    def f1(self, .a, .b): # equivalent to f2, below
        pass

    def f2(self, a, b):
        self.a = a
        self.b = b

    def g(self, .a, .b, a): # not permitted: a appears twice
        pass

Naturally, self, being a reserved name in methods, can also be omitted from such parameter lists. Moreover, such initialising parameters can have default values.

class C:
    def __init__(.a=1, .b=2):
        pass

c1 = C()
c2 = C(3, 4)
print c1.a, c1.b # 1 2
print c2.a, c2.b # 3 4

Inheritance and Binding

Lichen

Python

Rationale

Class attributes are propagated to class hierarchy members during initialisation: rebinding class attributes does not affect descendant class attributes

Class attributes are propagated live to class hierarchy members and must be looked up by the run-time system if not provided by a given class

Initialisation-time propagation simplifies access operations and attribute table storage

Unbound methods must be bound using a special function taking an instance

Unbound methods may be called using an instance as first argument

Forbidding instances as first arguments simplifies the invocation mechanism

Functions assigned to class attributes do not become unbound methods

Functions assigned to class attributes become unbound methods

Removing method assignment simplifies deduction: methods are always defined in place

Base classes must be well-defined

Base classes may be expressions

Well-defined base classes are required to establish a well-defined hierarchy of types

Classes may not be defined in functions

Classes may be defined in any kind of namespace

Forbidding classes in functions prevents the definition of countless class variants that are awkward to analyse

Inherited Class Attributes

Class attributes that are changed for a class do not change for that class's descendants.

class C:
    a = 123

class D(C):
    pass

C.a = 456
print D.a # remains 123 in Lichen, becomes 456 in Python

Permitting this requires indirection for all class attributes, requiring them to be treated differently from other kinds of attributes. Meanwhile, class attribute rebinding and the accessing of inherited attributes changed in this way is relatively rare.

Unbound Methods

Methods are defined on classes but are only available via instances: they are instance methods. Consequently, acquiring a method directly from a class and then invoking it should fail because the method will be unbound: the "context" of the method is not an instance. Furthermore, the Python technique of supplying an instance as the first argument in an invocation to bind the method to an instance, thus setting the context of the method, is not supported. See "Reserved Self" for more information.

class C:
    def f(self, x):
        self.x = x
    def g(self):
        C.f(123) # not permitted: C is not an instance
        C.f(self, 123) # not permitted: self cannot be specified in the argument list
        get_using(C.f, self)(123) # binds C.f to self, then the result is called

Binding methods to instances occurs when acquiring methods via instances or explicitly using the get_using built-in. The built-in checks the compatibility of the supplied method and instance. If compatible, it provides the bound method as its result.

Normal functions are callable without any further preparation, whereas unbound methods need the binding step to be performed and are not immediately callable. Were functions to become unbound methods upon assignment to a class attribute, they would need to be invalidated by having the preparation mechanism enabled on them. However, this invalidation would only be relevant to the specific case of assigning functions to classes and this would need to be tested for. Given the added complications, such functionality is arguably not worth supporting.

Assigning Functions to Class Attributes

Functions can be assigned to class attributes but do not become unbound methods as a result.

class C:
    def f(self): # will be replaced
        return 234

def f(self):
    return self

C.f = f # makes C.f a function, not a method
C().f() # not permitted: f requires an explicit argument
C().f(123) # permitted: f has merely been exposed via C.f

Methods are identified as such by their definition location, they contribute information about attributes to the class hierarchy, and they employ certain structure details at run-time to permit the binding of methods. Since functions can defined in arbitrary locations, no class hierarchy information is available, and a function could combine self with a range of attributes that are not compatible with any class to which the function might be assigned.

Well-Defined Base Classes

Base classes must be clearly identifiable as well-defined classes. This facilitates the cataloguing of program objects and further analysis on them.

class C:
    x = 123

def f():
    return C

class D(f()): # not permitted: f could return anything
    pass

If base class identification could only be done reliably at run-time, class relationship information would be very limited without running the program or performing costly and potentially unreliable analysis. Indeed, programs employing such dynamic base classes are arguably resistant to analysis, which is contrary to the goals of a language like Lichen.

Class Definitions and Functions

Classes may not be defined in functions because functions provide dynamic namespaces, but Lichen relies on a static namespace hierarchy in order to clearly identify the principal objects in a program. If classes could be defined in functions, despite seemingly providing the same class over and over again on every invocation, a family of classes would, in fact, be defined.

def f(x):
    class C: # not permitted: this describes one of potentially many classes
        y = x
    return f

Moreover, issues of namespace nesting also arise, since the motivation for defining classes in functions would surely be to take advantage of local state to parameterise such classes.

Modules and Packages

Lichen

Python

Rationale

Modules are independent: package hierarchies are not traversed when importing

Modules exist in hierarchical namespaces: package roots must be imported before importing specific submodules

Eliminating module traversal permits more precise imports and reduces superfluous code

Only specific names can be imported from a module or package using the from statement

Importing "all" from a package or module is permitted

Eliminating "all" imports simplifies the task of determining where names in use have come from

Modules must be specified using absolute names

Imports can be absolute or relative

Using only absolute names simplifies the import mechanism

Modules are imported independently and their dependencies subsequently resolved

Modules are imported as import statements are encountered

Statically-initialised objects can be used declaratively, although an initialisation order may still need establishing

Independent Modules

The inclusion of modules in a program affects only explicitly-named modules: they do not have relationships implied by their naming that would cause such related modules to be included in a program.

from compiler import consts # defines consts
import compiler.ast # defines ast, not compiler

ast # is defined
compiler # is not defined
consts # is defined

Where modules should have relationships, they should be explicitly defined using from and import statements which target the exact modules required. In the above example, compiler is not routinely imported because modules within the compiler package have been requested.

Specific Name Imports Only

Lichen, unlike Python, also does not support the special __all__ module attribute.

from compiler import * # not permitted
from compiler import ast, consts # permitted

interpreter # undefined in compiler (yet it might be thought to reside there) and in this module

The __all__ attribute supports from ... import * statements in Python, but without identifying the module or package involved and then consulting __all__ in that module or package to discover which names might be involved (which might require the inspection of yet other modules or packages), the names imported cannot be known. Consequently, some names used elsewhere in the module performing the import might be assumed to be imported names when, in fact, they are unknown in both the importing and imported modules. Such uncertainty hinders the inspection of individual modules.

Modules Imported Independently

When indicating an import using the from and import statements, the toolchain does not attempt to immediately import other modules. Instead, the imports act as declarations of such other modules or names from other modules, resolved at a later stage. This permits mutual imports to a greater extent than in Python.

# Module M
from N import C # in Python: fails attempting to re-enter N

class D(C):
    y = 456

# Module N
from M import D # in Python: causes M to be entered, fails when re-entered from N

class C:
    x = 123

class E(D):
    z = 789

# Main program
import N

Such flexibility is not usually needed, and circular importing usually indicates issues with program organisation. However, declarative imports can help to decouple modules and avoid combining import declaration and module initialisation order concerns.

Syntax and Control-Flow

Lichen

Python

Rationale

If expressions and comprehensions are not supported

If expressions and comprehensions are supported

Omitting such syntactic features simplifies program inspection and translation

The with statement is not supported

The with statement offers a mechanism for resource allocation and deallocation using context managers

This syntactic feature can be satisfactorily emulated using existing constructs

Generators are not supported

Generators are supported

Omitting generator support simplifies run-time mechanisms

Only positional and keyword arguments are supported

Argument unpacking (using * and **) is supported

Omitting unpacking simplifies generic invocation handling

All parameters must be specified

Catch-all parameters (* and **) are supported

Omitting catch-all parameter population simplifies generic invocation handling

No If Expressions or Comprehensions

In order to support the classic ternary operator, a construct was added to the Python syntax that needed to avoid problems with the existing grammar and notation. Unfortunately, it reorders the components from the traditional form:

# Not valid in Lichen, only in Python.

# In C: condition ? true_result : false_result
true_result if condition else false_result

# In C: (condition ? inner_true_result : inner_false_result) ? true_result : false_result
true_result if (inner_true_result if condition else inner_false_result) else false_result

Since if expressions may participate within expressions, they cannot be rewritten as if statements. Nor can they be rewritten as logical operator chains in general.

# Not valid in Lichen, only in Python.

a = 0 if x else 1 # x being true yields 0

# Here, x being true causes (x and 0) to complete, yielding 0.
# But this causes ((x and 0) or 1) to complete, yielding 1.

a = x and 0 or 1 # not valid

But in any case, it would be more of a motivation to support the functionality if a better syntax could be adopted instead. However, if expressions are not particularly important in Python, and despite enhancement requests over many years, everybody managed to live without them.

List and generator comprehensions are more complicated but share some characteristics of if expressions: their syntax contradicts the typical conventions established by the rest of the Python language; they create implicit state that is perhaps most appropriately modelled by a separate function or similar object. Since Lichen does not support generators at all, it will obviously not support generator expressions.

Meanwhile, list comprehensions quickly encourage barely-readable programs:

# Not valid in Lichen, only in Python.

x = [0, [1, 2, 0], 0, 0, [0, 3, 4]]
a = [z for y in x if y for z in y if z]

Supporting the creation of temporary functions to produce list comprehensions, while also hiding temporary names from the enclosing scope, adds complexity to the toolchain for situations where programmers would arguably be better creating their own functions and thus writing more readable programs.

No With Statement

The with statement introduced the concept of context managers in Python 2.5, with such objects supporting a programming interface that aims to formalise certain conventions around resource management. For example:

# Not valid in Lichen, only in Python.

with connection = db.connect(connection_args):
    with cursor = connection.cursor():
        cursor.execute(query, args)

Although this makes for readable code, it must be supported by objects which define the __enter__ and __exit__ special methods. Here, the connect method invoked in the first with statement must return such an object; similarly, the cursor method must also provide an object with such characteristics.

However, the "pre-with" solution is as follows:

connection = db.connect(connection_args)
try:
    cursor = connection.cursor()
    try:
        cursor.execute(query, args)
    finally:
        cursor.close()
finally:
    connection.close()

Although this seems less readable, its behaviour is more obvious because magic methods are not being called implicitly. Moreover, any parameterisation of the acts of resource deallocation or closure can be done in the finally clauses where such parameterisation would seem natural, rather than being specified through some kind of context manager initialisation arguments that must then be propagated to the magic methods so that they may take into consideration contextual information that is readily available in the place where the actual resource operations are being performed.

No Generators

Generators were added to Python in the 2.2 release and became fully part of the language in the 2.3 release. They offer a convenient way of writing iterator-like objects, capturing execution state instead of obliging the programmer to manage such state explicitly.

# Not valid in Lichen, only in Python.

def fib():
    a, b = 0, 1
    while 1:
        yield b
        a, b = b, a+b

# Alternative form valid in Lichen.

class fib:
    def __init__(self):
        self.a, self.b = 0, 1

    def next(self):
        result = self.b
        self.a, self.b = self.b, self.a + self.b
        return result

# Main program.

seq = fib()
i = 0
while i < 10:
    print seq.next()
    i += 1

However, generators make additional demands on the mechanisms provided to support program execution. The encapsulation of the above example generator in a separate class illustrates the need for state that persists outside the execution of the routine providing the generator's results. Generators may look like functions, but they do not necessarily behave like them, leading to potential misunderstandings about their operation even if the code is superficially tidy and concise.

Positional and Keyword Arguments Only

When invoking callables, only positional arguments and keyword arguments can be used. Python also supports * and ** arguments which respectively unpack sequences and mappings into the argument list, filling the list with sequence items (using *) and keywords (using **).

def f(a, b, c, d):
    return a + b + c + d

l = range(0, 4)
f(*l) # not permitted

m = {"c" : 10, "d" : 20}
f(2, 4, **m) # not permitted

While convenient, such "unpacking" arguments obscure the communication between callables and undermine the safety provided by function and method signatures. They also require run-time support for the unpacking operations.

Positional Parameters Only

Similarly, signatures may only contain named parameters that correspond to arguments. Python supports * and ** in parameter lists, too, which respectively accumulate superfluous positional and keyword arguments.

def f(a, b, *args, **kw): # not permitted
    return a + b + sum(args) + kw.get("c", 0) + kw.get("d", 0)

f(1, 2, 3, 4)
f(1, 2, c=3, d=4)

Such accumulation parameters can be useful for collecting arbitrary data and applying some of it within a callable. However, they can easily proliferate throughout a system and allow erroneous data to propagate far from its origin because such parameters permit the deferral of validation until the data needs to be accessed. Again, run-time support is required to marshal arguments into the appropriate parameter of this nature, but programmers could just write functions and methods that employ general sequence and mapping parameters explicitly instead.