1 Introduction
Dependently typed programming languages provide an expressive system that supports both programming and theorem proving. Agda is an implementation of such a language [7].
Using these languages, it is possible to prove that two functions return the same output whenever they receive the same input, a property known as extensional equality [6].
In Agda, totality is a requirement of the language: every function must return a value for every input and must be shown to terminate, which gives us a proof of termination for any construct written in Agda [7].
Agda was chosen because of our familiarity with it; nevertheless, there are other alternatives such as Coq [3] or Lean [8]. Programs can be developed using a transformational approach, in which an initial program whose correctness is easy to verify is written and then transformed into a more efficient program that preserves the same properties and semantics [14].
Proving that the transformed program works the same way as the original program is usually done by using algebraic reasoning [4], but this can also be done using dependently typed programming [13], with the advantage of the proof being verified by the compiler.
The accumulation strategy is a well-known program transformation technique to improve the efficiency of recursive functions [5].
This technique is the focus of this paper, in which dependently typed programming is used to develop a strategy to prove extensional equality between the original recursive programs and their tail-recursive counterparts. The source code of this paper is available here.
2 A Simple Example: List Length
Let us start with a simple example: a function to compute the length of a list. This function can be defined recursively as follows:
len : List A → ℕ
len []       = 0
len (x ∷ xs) = suc (len xs)
Nonetheless, this function requires space proportional to the length of the list due to the recursive calls. This program can be transformed into a tail-recursive function, which can be optimized automatically by the compiler to use constant space [2]. The transformed function is shown below:
len-tl : List A → ℕ → ℕ
len-tl [] n       = n
len-tl (x ∷ xs) n = len-tl xs (suc n)
In this example, it is easy to see that both functions return the same result for every possible input list. This fact can be represented in Agda using dependent function types:
len≡len-tl : ∀ (xs : List A) → len xs ≡ len-tl xs 0
The notion of “sameness” used here is the one of intensional equality, which is an inductively defined family of types [9, 13] with the following definition:
data _≡_ {a} {A : Set a} (x : A) : A → Set a where
  instance refl : x ≡ x
This means that two terms are equal if they are exactly the same term. Additionally, in Agda, if both terms reduce to the same term, we can state that they are intensionally equal. For example:
refl : 2 + 3 ≡ 5
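As a minimal self-contained sketch (assuming the standard library, and using a fresh name since refl itself is the constructor), this example type-checks because both sides normalize to 5:

open import Data.Nat using (ℕ; _+_)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

2+3≡5 : 2 + 3 ≡ 5
2+3≡5 = refl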
This notion of equality, together with the universal quantifier, allows us to state a kind of equality for functions, known as point-wise equality or extensional equality [6].
To prove extensional equality for the length functions, we can proceed inductively over the list, which has the [] and x∷xs cases:
len≡len-tl : ∀ (xs : List A) → len xs ≡ len-tl xs 0
len≡len-tl []       = ?
len≡len-tl (x ∷ xs) = ?
The base case is trivial, because both sides of the equality in the resulting type reduce to the same term, which is:
len []      = 0    (by definition)
len-tl [] 0 = 0
Therefore, we can fill the first hole in our proof with the refl constructor, such that the resulting equation is:
len≡len-tl [] = refl
For the inductive case, we can reduce both sides of the equality instantiated with the argument and check what remains to be proved.
Note that this can be done automatically by querying Agda, and it is particularly useful when using the Agda mode in Emacs [17]. The reductions are shown below and follow from the definition:
len (x ∷ xs)      = suc (len xs)
len-tl (x ∷ xs) 0 = len-tl xs (suc 0) = len-tl xs 1
We need to prove that suc (len xs) ≡ len-tl xs 1. This time, we cannot simply use refl, because both sides do not reduce to the same term. For this reason, we proceed by calling the function recursively on the tail of the list.
This is justified because of the Curry-Howard correspondence, and the fact that we are making a proof by induction. The result of the recursive call gives us the induction hypothesis:
len≡len-tl (x ∷ xs) = let ind-h = len≡len-tl xs in ?
The type of ind-h is len xs ≡ len-tl xs 0. The left-hand sides of the induction hypothesis and of the goal are almost the same. To make them match, we can apply the congruence property of equality, which has the following type:
cong : ∀ (f : A → B) {x y} → x ≡ y → f x ≡ f y
Applying this function to the induction hypothesis, we get the function below:
len≡len-tl (x ∷ xs) =
  let ind-h    = len≡len-tl xs
      suc-cong = cong suc ind-h
  in ?
The suc-cong term has the type:
suc (len xs) ≡ suc (len-tl xs 0)
As we can see, the left-hand sides match, so we can change our goal to proving that the right-hand side of suc-cong is equal to the right-hand side of the goal, by making use of the transitivity property of equality, which has the following type in Agda:
trans : ∀ {x y z} → x ≡ y → y ≡ z → x ≡ z
Therefore, now our proof is:
len≡len-tl (x ∷ xs) =
  let ind-h    = len≡len-tl xs
      suc-cong = cong suc ind-h
  in trans suc-cong ?
The type of the term required to fill the hole is:
suc (len-tl xs 0) ≡ len-tl xs 1
We need to “pull” the 1 from the accumulator somehow, and convert it to a suc call. We can extract this new goal into a helper function:
len-pull : ∀ (xs : List A) → suc (len-tl xs 0) ≡ len-tl xs 1
We can try to prove this goal by straightforward induction over the list, but we reach a dead end:
len-pull []       = refl
len-pull (x ∷ xs) = ?
The base case is trivial: following the definition of the function, both terms reduce to 1. The problem is the inductive case, which reduces as follows:
suc (len-tl (x ∷ xs) 0) = suc (len-tl xs (suc 0)) = suc (len-tl xs 1)
len-tl (x ∷ xs) 1       = len-tl xs (suc 1)       = len-tl xs 2
So, we are left with the following goal, which is very similar to the one we started with:
suc (len-tl xs 1) ≡ len-tl xs 2
We could try to prove this proposition by straightforward induction too, but that would require proving a similar proposition for the next values 2 and 3, and so on.
To solve this issue, we can use a generalization strategy to prove this inductive property [1]. The generalized property will allow us to vary the value of the accumulator in the different cases of the inductive proof, but we will need to introduce another variable for it.
It is important to note that after processing the first n items of the list, we will get n + len-tl xs 0 on the left side and len-tl xs n on the right one. Combining the generalization strategy and this fact, we can see that the property we have to prove is:
len-pull-generalized : ∀ (xs : List A) (n p : ℕ) → n + len-tl xs p ≡ len-tl xs (n + p)
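Note that the helper len-pull is exactly the instance n = 1, p = 0 of this property, since 1 + m reduces to suc m and 1 + 0 to 1; this instantiation is how len-pull is recovered in the complete proof at the end of this section:

len-pull : ∀ (xs : List A) → suc (len-tl xs 0) ≡ len-tl xs 1
len-pull xs = len-pull-generalized xs 1 0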
This function can be proved by induction over the list:
len-pull-generalized [] n p       = refl
len-pull-generalized (x ∷ xs) n p = ?
The base case is trivial: replacing the xs argument with [] and performing a single reduction step on both sides, we reach the common term n + p. The inductive case is more interesting. Reducing both sides of the equation proceeds as follows:
n + len-tl (x ∷ xs) p   = n + len-tl xs (suc p)
len-tl (x ∷ xs) (n + p) = len-tl xs (suc (n + p))
We can see that we have pretty much the induction hypothesis, with the only difference being the accumulating parameter p. Nevertheless, as we have generalized the proposition, we can pick a value for p when using the induction hypothesis:
len-pull-generalized (x ∷ xs) n p = len-pull-generalized xs n (suc p)
This takes us closer to the goal we want to prove. Unfortunately, we are left with the following goal after performing the substitution of p with suc p:
n + len-tl xs (suc p) ≡ len-tl xs (n + suc p)
This is almost what we want, except for suc (n + p) not being equal to n + suc p. However, these two terms are indeed equal, but not definitionally, because the plus function is defined by induction on the first argument, and not on the second one:
_+_ : ℕ → ℕ → ℕ
zero  + m = m
suc n + m = suc (n + m)
Therefore, applying reduction steps does not allow Agda to deduce the equality of these two terms. Fortunately, the fact that these terms are equal can be easily proved inductively as follows:
+-suc : ∀ m n → m + suc n ≡ suc (m + n)
+-suc zero    n = refl
+-suc (suc m) n = cong suc (+-suc m n)
The remaining step is to “replace” the suc (n + p) term with n + suc p. Agda provides the rewrite construct to perform this transformation:
len-pull-generalized (x ∷ xs) n p rewrite (sym (+-suc n p)) = len-pull-generalized xs n (suc p)
We make use of the symmetric property of equality in the rewriting step, which allows us to flip the sides of the equality:
sym : ∀ {x y} → x ≡ y → y ≡ x
With all this in place, we can finally prove the remaining goals, giving as a result the complete proof:
len-pull-generalized : ∀ (xs : List A) (n p : ℕ) → n + len-tl xs p ≡ len-tl xs (n + p)
len-pull-generalized [] n p = refl
len-pull-generalized (x ∷ xs) n p rewrite sym (+-suc n p) =
  len-pull-generalized xs n (suc p)

len-pull : ∀ (xs : List A) → suc (len-tl xs 0) ≡ len-tl xs 1
len-pull xs = len-pull-generalized xs 1 0

len≡len-tl : ∀ (xs : List A) → len xs ≡ len-tl xs 0
len≡len-tl [] = refl
len≡len-tl (x ∷ xs) =
  let ind-h    = len≡len-tl xs
      suc-cong = cong suc ind-h
      suc-pull = len-pull xs
  in trans suc-cong suc-pull
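As a quick sanity check, the proof can be instantiated at a concrete list; a minimal sketch, assuming A is instantiated to ℕ:

_ : len (1 ∷ 2 ∷ 3 ∷ []) ≡ len-tl (1 ∷ 2 ∷ 3 ∷ []) 0
_ = len≡len-tl (1 ∷ 2 ∷ 3 ∷ [])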
3 Another Example: List Reverse
The list reversal function follows a similar pattern to the one we have seen before:
reverse : List A → List A
reverse []       = []
reverse (x ∷ xs) = reverse xs ++ (x ∷ [])

reverse-tl : List A → List A → List A
reverse-tl [] ys       = ys
reverse-tl (x ∷ xs) ys = reverse-tl xs (x ∷ ys)
It should not come as a surprise that the equality proof is very similar too:
reverse-pull-generalized : ∀ (xs ys zs : List A) →
  reverse-tl xs ys ++ zs ≡ reverse-tl xs (ys ++ zs)
reverse-pull-generalized [] ys zs = refl
reverse-pull-generalized (x ∷ xs) ys zs = reverse-pull-generalized xs (x ∷ ys) zs

reverse-pull : ∀ (x : A) (xs : List A) →
  reverse-tl xs [] ++ (x ∷ []) ≡ reverse-tl xs (x ∷ [])
reverse-pull x xs = reverse-pull-generalized xs [] (x ∷ [])

reverse≡reverse-tl : ∀ (xs : List A) → reverse xs ≡ reverse-tl xs []
reverse≡reverse-tl [] = refl
reverse≡reverse-tl (x ∷ xs) =
  let ind-h       = reverse≡reverse-tl xs
      append-cong = cong (_++ (x ∷ [])) ind-h
      append-pull = reverse-pull x xs
  in trans append-cong append-pull
There are minor variations in the function signatures and the order of the parameters, but the structure is identical:
– Start proving by induction on the list.
– Fill the base case with refl.
– Take the inductive hypothesis by using a recursive call.
– Apply an operator to both sides of the equality, using cong.
– Create a function to pull the accumulator, and prove it using a generalized version of this function that allows varying the accumulator.
– Compose the two equalities using the trans function.
4 Generalization
Looking at the function definitions, we can see that they follow the same recursive pattern. We can write this pattern in Agda; it is just a specialization of a fold function [11, 12]:
reduce : List A → R
reduce []       = empty
reduce (x ∷ xs) = f x <> reduce xs
where
– R is the result type of the function.
– empty is the term to return when the list is empty.
– f is a function to transform each element of the list into the result type.
– <> is the function to combine the current item and the recursive result.
In the case of the len function, the result type is ℕ, the natural numbers; empty is 0; the function to transform each element is a constant function that ignores its argument and returns 1; and the function to combine the current item and the result of the recursive call is the addition function.
For the reverse function, the result type is the same type as the original list, List A; empty is the empty list; the function to transform each element creates just a singleton list from its parameter; and the function to combine the current transformed item and the result of the recursive call is the flipped concatenation function.
The flipping is necessary to make the function concatenate its first argument to the right:
reduce (x ∷ xs) = (λ a → a ∷ []) x <> reduce xs
                = (x ∷ []) <> reduce xs
                = (λ xs ys → ys ++ xs) (x ∷ []) (reduce xs)
                = reduce xs ++ (x ∷ [])
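Since reduce is a specialization of a fold [11, 12], the two instantiations described above can also be written directly with the standard library's foldr; the following is a hedged sketch (the primed names are ours, not part of the development):

open import Data.List using (List; []; _∷_; _++_; foldr)
open import Data.Nat using (ℕ; _+_)

-- length: f = λ _ → 1, _<>_ = _+_, empty = 0
len′ : ∀ {A : Set} → List A → ℕ
len′ = foldr (λ _ n → 1 + n) 0

-- reverse: f = λ a → a ∷ [], _<>_ = flipped _++_, empty = []
reverse′ : ∀ {A : Set} → List A → List A
reverse′ = foldr (λ a r → r ++ (a ∷ [])) []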
The functions that follow this pattern can be defined in a tail-recursive way as follows:
reduce-tl : List A → R → R
reduce-tl [] r       = r
reduce-tl (x ∷ xs) r = reduce-tl xs (r <> f x)
We can check manually that this function matches the tail-recursive definition in the case of the reverse function:
reverse-tl (x ∷ xs) r = reduce-tl xs (r <> (λ a → a ∷ []) x)
                      = reduce-tl xs (r <> (x ∷ []))
                      = reduce-tl xs ((λ xs ys → ys ++ xs) r (x ∷ []))
                      = reduce-tl xs ((x ∷ []) ++ r)
                      = reduce-tl xs (x ∷ r)
Now we can proceed to prove that these two functions are extensionally equal in the general case. The proof follows the same pattern as the one for the len function:
reduce≡reduce-tl : ∀ (xs : List A) → reduce xs ≡ reduce-tl xs empty
reduce≡reduce-tl [] = refl
reduce≡reduce-tl (x ∷ xs) =
  let ind-h   = reduce≡reduce-tl xs
      op-cong = cong (f x <>_) ind-h
      op-pull = reduce-pull (f x) xs
  in trans op-cong op-pull
We make use of a piece of syntactic sugar called sections, which allows us to write the function (λ r → f x <> r) as (f x <>_). Apart from that, the proof is identical to the ones we have seen before.
However, to prove the accumulator pulling function, we need to use a different strategy. We are required to prove that:
reduce-pull : ∀ (r : R) (xs : List A) → r <> reduce-tl xs empty ≡ reduce-tl xs (empty <> r)
We can prove this proposition by induction over the list, which first requires us to prove the proposition when xs is []:
r <> reduce-tl [] empty   = r <> empty
reduce-tl [] (empty <> r) = empty <> r
So we are required to prove that r <> empty ≡ empty <> r. We could require the <> function to be commutative, but we can “ask for less” by just requiring empty to be a left and right identity for <>. This is expressed in Agda as:
<>-identityˡ : ∀ (r : R) → empty <> r ≡ r
<>-identityʳ : ∀ (r : R) → r <> empty ≡ r
This way, we can use those identities to rewrite our goals and make them match over the term r, and then complete the base case using the trivial equality proof refl:
reduce-pull r [] rewrite <>-identityˡ r | <>-identityʳ r = refl
The inductive case goal is:
r <> reduce-tl (x ∷ xs) empty   = r <> reduce-tl xs (empty <> f x)
reduce-tl (x ∷ xs) (empty <> r) = reduce-tl xs ((empty <> r) <> f x)
This cannot be proved directly by straightforward induction, as we have seen before, but at least we can simplify it by using the left identity property over empty <> f x and then over empty <> r:
reduce-pull r (x ∷ xs) rewrite <>-identityˡ (f x) | <>-identityˡ r =
  reduce-pull-generalized r (f x) xs
Finally, we just need to prove the generalized accumulation pulling function, which has the following type signature:
reduce-pull-generalized : ∀ (r s : R) (xs : List A) →
  r <> reduce-tl xs s ≡ reduce-tl xs (r <> s)
Note that the base case is trivial, and it is quite similar to the ones we have already proved, so we are going to focus on the inductive case. Following the same kind of reductions we have been doing before, we can see that our goal is:
r <> reduce-tl (x ∷ xs) s   = r <> reduce-tl xs (s <> f x)
reduce-tl (x ∷ xs) (r <> s) = reduce-tl xs ((r <> s) <> f x)
Following the generalization strategy, we call the function recursively, replacing s by s <> f x, which almost gives what is required, except that the accumulator on the right-hand side is associated the wrong way:
r <> reduce-tl xs (s <> f x) ≡ reduce-tl xs (r <> (s <> f x))
Associativity is indeed the last property that the <> function needs to satisfy. This can be expressed in Agda straightforwardly as:
<>-assoc : ∀ (r s t : R) → (r <> s) <> t ≡ r <> (s <> t)
This helps us complete the proof:
reduce-pull-generalized r s [] = refl
reduce-pull-generalized r s (x ∷ xs) rewrite <>-assoc r s (f x) =
  reduce-pull-generalized r (s <> f x) xs
All of these properties match the definition of a monoid. We can complete the formalization and encapsulate it in a ready-to-use parametrized module, using the standard library definition of a monoid:
open import Algebra.Structures using (IsMonoid)

module GenericBasic {A : Set} {R : Set}
  (f : A → R) (_<>_ : R → R → R) (empty : R)
  (m : IsMonoid _≡_ _<>_ empty) where

  open IsMonoid m using ()
    renaming ( identityˡ to <>-identityˡ
             ; identityʳ to <>-identityʳ
             ; assoc     to <>-assoc )
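The body of the module then collects the generic definitions and proofs developed above. The following is a sketch of how it could be assembled (the layout and ordering are ours; refl, cong, trans and rewrite are assumed to be available as usual):

  reduce : List A → R
  reduce []       = empty
  reduce (x ∷ xs) = f x <> reduce xs

  reduce-tl : List A → R → R
  reduce-tl [] r       = r
  reduce-tl (x ∷ xs) r = reduce-tl xs (r <> f x)

  reduce-pull-generalized : ∀ (r s : R) (xs : List A) →
    r <> reduce-tl xs s ≡ reduce-tl xs (r <> s)
  reduce-pull-generalized r s [] = refl
  reduce-pull-generalized r s (x ∷ xs) rewrite <>-assoc r s (f x) =
    reduce-pull-generalized r (s <> f x) xs

  reduce-pull : ∀ (r : R) (xs : List A) →
    r <> reduce-tl xs empty ≡ reduce-tl xs (empty <> r)
  reduce-pull r [] rewrite <>-identityˡ r | <>-identityʳ r = refl
  reduce-pull r (x ∷ xs) rewrite <>-identityˡ (f x) | <>-identityˡ r =
    reduce-pull-generalized r (f x) xs

  reduce≡reduce-tl : ∀ (xs : List A) → reduce xs ≡ reduce-tl xs empty
  reduce≡reduce-tl [] = refl
  reduce≡reduce-tl (x ∷ xs) =
    let ind-h   = reduce≡reduce-tl xs
        op-cong = cong (f x <>_) ind-h
        op-pull = reduce-pull (f x) xs
    in trans op-cong op-pull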
5 Using the Module with the Examples
With the module in place, we can start using it to derive the recursive function, the tail-recursive counterpart, and the proof that both functions are extensionally equal.
The length function uses the usual sum monoid over the natural numbers:
open import GenericBasic {A = ℕ} (λ _ → 1) _+_ 0 +-0-isMonoid
  renaming ( reduce           to len
           ; reduce-tl        to len-tl
           ; reduce≡reduce-tl to len≡len-tl )
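Here +-0-isMonoid is the standard library's proof that (ℕ, _+_, 0) forms a monoid; it can be brought into scope with (assuming a recent standard library version):

open import Data.Nat.Properties using (+-0-isMonoid)

After this opening, len, len-tl and len≡len-tl are ready to use.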
The reverse function requires us to create an instance of a flipped monoid for ++, which can be done with the already defined properties for list concatenation, flipping them where necessary:
++-flipped-isMonoid {A} = record
  { isSemigroup = record
      { isMagma = record
          { isEquivalence = isEquivalence
          ; ∙-cong        = cong₂ (flip _++_)
          }
      ; assoc = λ x y z → sym (++-assoc z y x)
      }
  ; identity = ++-identityʳ , ++-identityˡ
  }
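With this monoid in place, the reverse functions and their proof can be derived in the same way as for length; the following sketch parallels the previous instantiation (specialized to lists of naturals for concreteness, with flip taken from the standard library's Function module):

open import GenericBasic {A = ℕ} (λ a → a ∷ []) (flip _++_) [] ++-flipped-isMonoid
  renaming ( reduce           to reverse
           ; reduce-tl        to reverse-tl
           ; reduce≡reduce-tl to reverse≡reverse-tl )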
Finally, the indices function, which returns the positions at which a given value occurs in a list, also requires a custom monoid. The original indices function, specialized to lists of natural numbers, is the following:
indices : ℕ → List ℕ → List ℕ
indices n []       = []
indices n (x ∷ xs) with n ≟ x
... | yes _ = 0 ∷ map suc (indices n xs)
... | no  _ = map suc (indices n xs)
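For example, the following check holds by computation (refl suffices because all the involved terms are closed):

_ : indices 3 (3 ∷ 1 ∷ 3 ∷ []) ≡ 0 ∷ 2 ∷ []
_ = refl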
The monoid for this function has the following operation and identity element:
IndicesData : Set
IndicesData = ℕ × List ℕ

empty : IndicesData
empty = 0 , []

_<>_ : IndicesData → IndicesData → IndicesData
(ln , ll) <> (rn , rl) = ln + rn , ll ++ map (ln +_) rl
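The per-element transform is not shown above; the following is a hypothetical sketch of it (the name f and the use of _≟_ are our assumptions): each element advances the position counter by one and contributes the index 0 when it equals the searched value n. In practice f would be partially applied to n before instantiating the module, and the list of indices is then the second component of the reduce result.

open import Data.Nat.Properties using (_≟_)
open import Relation.Nullary using (yes; no)

-- hypothetical per-element transform
f : ℕ → ℕ → IndicesData
f n x with n ≟ x
... | yes _ = 1 , (0 ∷ [])
... | no  _ = 1 , []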
6 Conclusions
A technique to prove extensional equality between a recursive function and its tail-recursive counterpart has been presented, along with an Agda module to automatically generate the functions and the proof from an arbitrary monoid.
As far as we know, there is no related work in the literature that gives a formal proof of the extensional equality involved in transforming a recursive function into a tail-recursive one; this is the main contribution of this work.
The tail-recursive function generally improves the running time of the original recursive function and opens the possibility of tail-call optimization by the compiler, leading to a more space-efficient execution [2, 15].
There are some caveats with this technique which are exemplified by the indices function.
Even though the generated function avoids mapping over the entire recursive call result, it introduces inefficiency by doing nested concatenations to the left, which leads to quadratic time complexity.
This could be solved by using higher-order functions as the accumulating monoid [10], but proving the corresponding monoid laws would require transforming extensional equality into intensional equality, which is not possible in Agda without cubical type theory [6, 16]; that is out of the scope of this paper.
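For illustration, a hedged sketch of the higher-order accumulator in the style of [10] (difference lists); the carrier and operation are easy to define, but their monoid laws only hold point-wise, which is exactly where the extensionality issue appears:

DList : Set → Set
DList A = List A → List A

empty : ∀ {A : Set} → DList A
empty xs = xs

_<>_ : ∀ {A : Set} → DList A → DList A → DList A
f <> g = λ xs → f (g xs)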
Further work can be done in order to generalize this result to arbitrary recursive data types and recursion schemes [12].
References
1. Abdali, S. K., Vytopil, J. (1984). Generalization heuristics for theorems related to recursively defined functions. Proceedings of the Fourth AAAI Conference on Artificial Intelligence, pp. 1–5. DOI: 10.5555/2886937.2886938.
2. Bauer, A. (2003). Compilation of functional programming languages using gcc—tail calls. Master’s thesis, Institut für Informatik, Technische Universität München, Germany.
3. Bertot, Y., Castéran, P. (2013). Interactive theorem proving and program development: Coq’Art: the calculus of inductive constructions. DOI: 10.1007/978-3-662-07964-5.
4. Bird, R., De Moor, O. (1996). The algebra of programming. NATO ASI DPD, Vol. 152, pp. 167–203.
5. Bird, R. S. (1984). The promotion and accumulation strategies in transformational programming. ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 6, No. 4, pp. 487–504. DOI: 10.1145/1780.1781.
6. Botta, N., Brede, N., Jansson, P., Richter, T. (2021). Extensional equality preservation and verified generic programming. Journal of Functional Programming, Vol. 31. DOI: 10.1017/S0956796821000204.
7. Bove, A., Dybjer, P., Norell, U. (2009). A brief overview of Agda – a functional language with dependent types. International Conference on Theorem Proving in Higher Order Logics, pp. 73–78. DOI: 10.1007/978-3-642-03359-9.
8. de Moura, L., Kong, S., Avigad, J., van Doorn, F., von Raumer, J. (2015). The Lean theorem prover (system description). International Conference on Automated Deduction, pp. 378–388. DOI: 10.1007/978-3-319-21401-6_26.
9. Dybjer, P. (1994). Inductive families. Formal aspects of computing, Vol. 6, No. 4, pp. 440–465. DOI: 10.1007/BF01211308.
10. Hughes, J. (1986). A novel representation of lists and its application to the function “reverse”. Information Processing Letters, Vol. 22, No. 3, pp. 141–144. DOI: 10.1016/0020-0190(86)90059-1.
11. Hutton, G. (1999). A tutorial on the universality and expressiveness of fold. Journal of Functional Programming, Vol. 9, No. 4, pp. 355–372. DOI: 10.1017/S0956796899003500.
12. Meijer, E., Fokkinga, M., Paterson, R. (1991). Functional programming with bananas, lenses, envelopes and barbed wire. Conference on functional programming languages and computer architecture, pp. 124–144.
13. Mu, S. C., Ko, H. S., Jansson, P. (2008). Algebra of programming using dependent types. Lecture Notes in Computer Science, pp. 268–283. DOI: 10.1007/978-3-540-70594-9_15.
14. Pettorossi, A., Proietti, M. (1993). Rules and strategies for program transformation. Formal Program Development, pp. 263–304. DOI: 10.1007/3-540-57499-9_23.
15. Rubio-Sánchez, M. (2017). Introduction to recursive programming.
16. Vezzosi, A., Mörtberg, A., Abel, A. (2021). Cubical Agda: A dependently typed programming language with univalence and higher inductive types. Journal of Functional Programming, Vol. 31. DOI: 10.1145/3341691.
17. Wadler, P. (2018). Programming language foundations in Agda. Brazilian Symposium on Formal Methods, pp. 56–73. DOI: 10.1007/978-3-030-03044-5_5.