Saturday, March 20, 2021

[crexbapy] default lambda variable

first, introduce a special keyword LAMBDAVARIABLE so that an expression containing LAMBDAVARIABLE gets rewritten by the compiler to

\x -> expression

where x is a fresh symbol, and every instance of LAMBDAVARIABLE in the expression is replaced by x.  (we have used Haskell syntax for lambda functions.)

for example, (LAMBDAVARIABLE + 10) rewrites to (\x -> x + 10).

the benefits are not having to type "x" that one extra time at the beginning ("Don't Repeat Yourself"), and not having to think of a fresh name x.  the latter may be useful for generated code.
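
a minimal sketch of this rewrite in Haskell, assuming a toy expression type invented purely for illustration. the names Expr, Hole, names, hasHole, fill, fresh, and rewrite are all hypothetical and not part of any real compiler; Hole plays the role of LAMBDAVARIABLE.

```haskell
-- A tiny expression language, invented for this sketch.
data Expr
  = Num Integer
  | Var String
  | Hole                 -- stands for LAMBDAVARIABLE
  | Add Expr Expr
  | Mul Expr Expr
  | Lam String Expr
  deriving (Eq, Show)

-- Every variable name mentioned in an expression, bound or free.
names :: Expr -> [String]
names (Var v)   = [v]
names (Add a b) = names a ++ names b
names (Mul a b) = names a ++ names b
names (Lam v b) = v : names b
names _         = []

-- Does the expression mention LAMBDAVARIABLE anywhere?
hasHole :: Expr -> Bool
hasHole Hole      = True
hasHole (Add a b) = hasHole a || hasHole b
hasHole (Mul a b) = hasHole a || hasHole b
hasHole (Lam _ b) = hasHole b
hasHole _         = False

-- Replace every Hole by the given variable.
fill :: String -> Expr -> Expr
fill v Hole      = Var v
fill v (Add a b) = Add (fill v a) (fill v b)
fill v (Mul a b) = Mul (fill v a) (fill v b)
fill v (Lam w b) = Lam w (fill v b)
fill _ e         = e

-- The first of x, x1, x2, ... that the expression does not already use.
fresh :: Expr -> String
fresh e = head (filter (`notElem` names e) candidates)
  where candidates = "x" : ["x" ++ show n | n <- [1 :: Integer ..]]

-- Wrap the whole expression in a lambda, but only if a Hole occurs in it.
rewrite :: Expr -> Expr
rewrite e
  | hasHole e = Lam v (fill v e)
  | otherwise = e
  where v = fresh e
```

with these definitions, rewrite (Add Hole (Num 10)) yields Lam "x" (Add (Var "x") (Num 10)), matching the (LAMBDAVARIABLE + 10) example. if the expression already mentions x, fresh picks x1 instead.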

problems:

  1. the construct cannot be nested.
  2. you do not know that an expression is a lambda function until you have parsed deep enough into it to encounter the special keyword LAMBDAVARIABLE.
  3. does (10 * (5 + LAMBDAVARIABLE)) mean (\x -> 10 * (5+x)) or (10 * (\x -> 5+x))?
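
to make problem 3 concrete, here is a sketch using a different expression, chosen (as an assumption for illustration) because both readings typecheck in Haskell: the list [LAMBDAVARIABLE + 1, LAMBDAVARIABLE * 2]. the names outerReading and innerReading are hypothetical.

```haskell
-- Outer reading: one lambda around the whole list expression.
outerReading :: Integer -> [Integer]
outerReading = \x -> [x + 1, x * 2]

-- Inner reading: each element becomes its own lambda.
innerReading :: [Integer -> Integer]
innerReading = [\x -> x + 1, \x -> x * 2]
```

the two readings have different types (a function returning a list versus a list of functions), so the compiler needs some rule, or an explicit delimiter, to choose between them.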

solution: augment with another construct LAMBDA(...) which delimits a lambda expression and allows nesting.  for example,

LAMBDA(LAMBDAVARIABLE + LAMBDA(LAMBDAVARIABLE^2)(LAMBDAVARIABLE+1))

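rewrites to (\x -> x + (\x -> x^2)(x+1)).  note that the inner x (and inner LAMBDAVARIABLE) shadows the outer one via static scoping.  the x in (x+1) is still the outer one, because that argument lies outside the inner LAMBDA(...).  if you need access to an outer variable from inside a nested lambda, don't use this construct.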
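
the rewritten expression is valid Haskell as it stands, so the scoping can be checked directly. in this small sketch the names nested and renamed are hypothetical; renamed is the same function with the inner variable given a distinct name.

```haskell
-- The rewritten form, with the inner x shadowing the outer x.
nested :: Integer -> Integer
nested = \x -> x + (\x -> x ^ 2) (x + 1)

-- The same function with the inner variable renamed to y.
renamed :: Integer -> Integer
renamed = \x -> x + (\y -> y ^ 2) (x + 1)
```

for example, nested 3 is 3 + (3+1)^2 = 19, and renamed agrees with it on every input.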

Mathematica calls lambda functions pure functions (or anonymous functions).  there, LAMBDAVARIABLE is # and LAMBDA is postfix &, so (# + 10)& corresponds to (\x -> x + 10).

previously: lambda with more than one variable.  for two arguments, consider keywords BINARYLAMBDA, LAMBDAVARIABLE1, LAMBDAVARIABLE2.  beyond two arguments, it's probably best to force the programmer to explicitly name them.
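
a sketch of the two-argument case, assuming a hypothetical surface form BINARYLAMBDA(LAMBDAVARIABLE1 - LAMBDAVARIABLE2) that rewrites to (\x y -> x - y). the name binaryExample is invented for illustration.

```haskell
-- What BINARYLAMBDA(LAMBDAVARIABLE1 - LAMBDAVARIABLE2) would rewrite to,
-- under the assumed syntax above.
binaryExample :: Integer -> Integer -> Integer
binaryExample = \x y -> x - y

-- A typical use: as the combining function of a left fold.
foldedExample :: Integer
foldedExample = foldl binaryExample 0 [1, 2, 3]
```

subtraction is used because it is not commutative, so the order of LAMBDAVARIABLE1 and LAMBDAVARIABLE2 matters: foldedExample is ((0 - 1) - 2) - 3 = -6.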

there's a parallel here with lambda expressions themselves being shortcuts:

(\x -> expression) rewrites to
(let { f x = expression } in f)

the benefits are the same: not having to think of a fresh identifier f, and not having to Repeat Yourself by writing f twice.
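
both forms are ordinary Haskell, so the parallel can be checked with the earlier x + 10 example. the names viaLambda and viaLet are hypothetical.

```haskell
-- The lambda form.
viaLambda :: Integer -> Integer
viaLambda = \x -> x + 10

-- The let form it abbreviates: name the function f, then return f.
viaLet :: Integer -> Integer
viaLet = let { f x = x + 10 } in f
```

the two definitions denote the same function; the let form just has to invent the name f and mention it twice.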

2 comments:

Rupert Swarbrick said...

In case you don't already know about them, this is sort of equivalent to "anaphoric" macros as used in (Common) Lisp. The usual convention is that the bound variable is called "it".

The result is that you can write things like "(awhen (get-a-foo) (use it))" as syntactic sugar for "(let ((foo (get-a-foo))) (when foo (use foo)))".

As with your example, this introduces shadowing so doesn't work well with nesting (de Bruijn indices, anyone?!)

Ganesh Sittampalam said...

As Rupert suggested, this does seem like it's heading towards de Bruijn notation. I haven't come across that being used as a source syntax before though.