- iterators as such
- stateful iterators
- Nelua's stateless iterators
- higher order functions as iterators
- generic iterators
- coroutine iterators
iterators as such
Suppose you want to loop over the numbers from 1 to 10 inclusive, printing each. With only labels and conditional jumps you could write this, which really works:
local i = 1 ::next_i:: print(i) if i < 10 then goto bump_i end goto done_i ::bump_i:: i = i + 1 goto next_i ::done_i::
Gradually, structured programming came along and both a) formalized common patterns of control flow like this, and b) strongly suggested that people stick to them, and throw in the garbage the much larger group of hard-to-follow and error-prone control flow that goto is also capable of. So even if you're pretending that Nelua's numeric for loop doesn't exist, you'd write code more like:
local i = 1 while i < 10 do print(i) i = i + 1 end
And if you wrote that you'd have an error, because this loop doesn't include print(10).
To avoid such errors, especially in a codebase where you could have thousands of separate loops like this, and to repeat the success of structured programming, you could write the loop logic once in a function that you could get right, and then you could call that function an 'iterator'. Languages don't really need any special support for iterators as long as programmers can stick to a convention for them. For example:
local function upto(i: *integer, limit: integer): boolean if $i >= limit then return false end $i = $i + 1 return true end local i = 0 while upto(&i, 10) do print(i) end
Here the convention is: create a local variable with the initial state, then call the iterator with a pointer to that variable and any other information it needs, then use the variable if the iterator returns true.
Here's an infinite iterator that lazily yields the same value forever: function wow(_: pointer) return true end
This isn't a convention that I made up, by the way. Odin does something very similar with its strings.split_iterator. A demerit with this convention is that the explicit state can be modified separately from the iterator code, resulting in much more complex logic if done intentionally, and in bugs if done unintentionally.
stateful iterators
Lua has language support for this convention: an iterator is a function that returns a closure that, when called, either returns nil or returns the next value(s) to use. This is Lua:
function upto(limit)
local i = 0
return function()
if i >= limit then return nil end
i = i + 1
return i
end
end
for i in upto(10) do
print(i)
endThis convention would be hard for Nelua to mimic as it requires both closures and dynamic typing, but you can come close. This is Nelua:
local Upto = @record{
i: integer,
limit: integer,
}
local function upto(limit: integer): (Upto, auto)
return {0, limit}, function(state: *Upto): (integer, boolean)
if state.i >= state.limit then return 0, false end
state.i = state.i + 1
return state.i, true
end
end
local iter, iterf = upto(10)
while true do
local i, ok = iterf(&iter)
if not ok then break end
print(i)
endInstead of a closure, a function that receives what would be closed-over state as an argument. Instead of dynamic typing, an additional boolean return value. With some compiler support the loop could look like for i in upto(10) do print(i) end and rest of it would be handled for you.
Nelua's stateless iterators
Lua also has stateless iterators that don't require closures. These are what Nelua supports, albeit with the same boolean alternative to returning nil. This is a Nelua iterator with language support:
local function upto_next(invariant: integer, control: integer): (boolean, integer) <inline> if control >= invariant then return false end return true, control+1 end local function upto(n: integer): (auto, integer, integer) <inline> return upto_next, n, 0 end for i in upto(10) do print(i) end
upto returns a function, the invariant of the iterator (here, the limit of 10, but commonly a pointer to a list or array being iterated over), and the initial control variable of the iterator. The language then calls that function for you, passing it the invariant and control, accepting the next control variable, and terminating the loop if the boolean is false.
Explicitly:
local iterf, invariant, control = upto(10) local more: boolean while true do more, control = iterf(invariant, control) if not more then break end print(control) end
There's some risk of confusion here because the control value has a dual purpose of controlling iteration and representing the numbers wanted for this task. So here's something slightly different:
local function parity_upto_next(invariant: integer, i: integer): (boolean, integer, string) <inline>
i = i + 1
if i > invariant then return false end
if i % 2 == 0 then
return true, i, 'even'
else
return true, i, 'odd'
end
end
local function parity_upto(n: integer): (auto, integer, integer) <inline>
return parity_upto_next, n, 0
end
for _, p in parity_upto(10) do print(p) endAnd another, an iterator that yields only digits from a string:
require 'string'
require 'C.ctype'
local function digits_next(inv: string, i: usize): (boolean, usize, string)
for i=i, #inv do
if 0 ~= C.isdigit(string.byte(inv, i)) then
return true, i+1, string.sub(inv, i, i)
end
end
return false
end
local function digits(s: string): (auto, string, usize)
return digits_next, s, 1
end
for i, c in digits('a string 123 with some 7 digits 1 1 in it 1') do
print(i, c)
endhigher order functions as iterators
Programming in Lua has an aside about 'true' iterators which instead are functions that take the 'body of the loop' as a function, and call it with values that would've been yielded by an iterator. 'True' iterators have the advantage of being much easier to think about and implement, and also give the iterator much more control over iteration - for example, a 'true' iterator has a very obvious point at which to free any resources (that point: return), and a very obvious place to stash additional state for the loop (that place: the stack), but a stateless iterator could be called outside of a loop, or the loop could break early, without the iterator ever being run to completion.
In a language with closures, 'true' iterators are also very easy to use.
In Nelua, the first alternative to closures is quite unpleasant:
require 'string'
require 'C.ctype'
local function digits(s: string, f: auto, state: pointer)
for i=1, #s do
if 0 ~= C.isdigit(string.byte(s, i)) then
f(string.sub(s, i, i), state)
end
end
end
digits('a string 123 with some 7 digits 1 1 in it 1', function(c: string, _: pointer)
print(c)
end)
local sum = 0
digits('a string 123 with some 7 digits 1 1 in it 1', function(_: string, state: pointer)
local count = (@*integer)(state)
$count = $count + 1
end, &sum)
print(string.format('found %d digits in string', sum))Trying to make that neater will run you against impermissible pointer conversions, the impossibility of polymorphic anonymous functions, limits on the unpacking of varargs, etc.
How about borrowing from FP and threading state through the function, instead of expecting the function to mutate a pointer?
require 'string'
require 'C.ctype'
require 'C.stdlib'
local function digits(s: string, f: auto, state: auto)
local stop: boolean
for i=1, #s do
if 0 ~= C.isdigit(string.byte(s, i)) then
stop, state = f(string.sub(s, i, i), state)
if stop then break end
end
end
return state
end
local st = 'a string 123 with some 7 digits 1 1 in it 1'
digits(st, function(c: string, _: integer)
print(c)
return false, 0
end, 0)
local sum = digits(st, function(_: string, n: integer)
return false, n + 1
end, 0)
print(string.format('found %d digits in string', sum))
local n = C.atoi(digits(st, function(digit: string, acc: string)
if digit > '5' then return true, acc end
return false, acc .. digit
end, ''))
assert(n == 123)That's a little bit easier to work with.
generic iterators
Consider this iterator:
require 'span'
require 'string'
local function repeat_elems_next(sp: span(byte), i: usize): (boolean, usize, byte)
if i < #sp-1 then return true, i+1, sp[i+1] end
return true, 0, sp[0]
end
local function repeat_elems(sp: span(byte))
return repeat_elems_next, sp, #sp
end
local s: span(byte) = "hello"
for i, c in repeat_elems(s) do
print(i, c, string.format('%c', c))
endThis works as intended, and has endless output, starting
0 104 h 1 101 e 2 108 l 3 108 l 4 111 o 0 104 h 1 101 e 2 108 l 3 108 l 4 111 o 0 104 h
But it only works on span(byte), or something convertible to it. If you try to make it generic as you would in most languages with generics, you'll easily discover that returned functions - i.e., repeat_elems_next - cannot be polymorphic. Your generic iterator must return a specific function. Like so:
require 'span'
require 'string'
local function repeat_elems_next(sp: auto, i: usize): (boolean, usize, byte)
if i < #sp-1 then return true, i+1, sp[i+1] end
return true, 0, sp[0]
end
local function repeat_elems(sp: auto)
local function proxy(sp: #[sp.type]#, i: usize) return repeat_elems_next(sp, i) end
return proxy, sp, #sp
end
local limit = 3
local s: span(byte) = "abcd"
for i, b in repeat_elems(s) do
if limit < 1 then break end
print(i, string.format('%c', b))
limit = limit - 1
end
limit = 3
local s: span(integer) = {1, 2, 3, 4}
for i, n in repeat_elems(s) do
if limit < 1 then break end
print(i, n)
limit = limit - 1
endcoroutine iterators
Instead of an iterator to generate values, you could have a coroutine generate the values. This enormously simplifies the 'generating' logic, while also allowing communication back into the generator, but the loop consuming the iterator is no longer as simple.
require 'coroutine'
require 'utf8'
local function runes(s: string)
for _, r in utf8.codes(s) do
coroutine.yield(r)
end
end
local co = coroutine.spawn(runes, 'сети')
defer coroutine.destroy(co) end
repeat
local r: int32
coroutine.pop(co, &r)
local s <close> = utf8.char(r)
print(s)
coroutine.resume(co)
until coroutine.status(co) ~= 'suspended'At least, without some help:
require 'coroutine'
require 'utf8'
local function generator(f: function(string):void, arg: string)
local co = coroutine.spawn(f, arg)
return function(co: coroutine, i: integer)
if coroutine.status(co) ~= 'suspended' then
coroutine.destroy(co)
return false
end
local r: int32
coroutine.pop(co, &r)
coroutine.resume(co)
return true, i+1, r
end, co, 0
end
local function runes(s: string)
for _, r: int32 in utf8.codes(s) do
coroutine.yield(r)
end
end
for i, r in generator(runes, 'сети') do
local s <close> = utf8.char(r)
print(i, s)
end