The Introduction to luajr vignette shows you how to get started with calling Lua code from R. This vignette gives more details on how to create and manipulate R objects – such as vectors, lists, and data.frames – in your Lua code.
The luajr.lua module
Whenever the luajr R package opens a new Lua state,
it opens all the standard Lua
libraries in that state, as well as the luajr.lua
module.
The luajr.lua module provides some key capabilities for
your Lua code to interface well with R code, particularly around working
with R variables in your Lua code. This vignette documents those
capabilities of the luajr.lua module.
Specifically, this vignette describes:
- Vector types, which allow you to use R vectors (logical, integer, numeric, character) and lists.
- Additional types, including environments, R functions, matrices, and data frames.
- Constants defined by the luajr.lua module.
- Parallel processing features of the luajr.lua module.
- Utility functions for luajr.lua.
- Considerations for using long vectors in Lua.
Vector types
The vector types are:
- luajr.logical
- luajr.integer
- luajr.numeric
- luajr.character
- luajr.list
If you pass an R value into Lua using the arg code auto,
logical, integer, numeric,
character, or vector / list, your
Lua code will receive a luajr vector type. You can also
create a new vector type in Lua with the constructors described under Creating and testing vector types below.
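A minimal sketch of creating a vector from a Lua table. The name my_vec and its values are illustrative assumptions, chosen so that the third element is 42 as in the snippets that follow.

```lua
-- Illustrative values only: a numeric vector whose 3rd element is 42.
local my_vec = luajr.numeric({ 1, 2, 42 })
```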
Here are some of the things you can do in Lua with a vector type variable:
print(my_vec[3]) -- print the 3rd element of the vector (i.e. 42)
my_vec[3] = 43 -- change the value of the 3rd element
local len = #my_vec -- get the length of the vector
my_vec:clear() -- erase all elements of the vector
All these and more are described in more detail below.
Important: For vector types, indexes start at 1, not
at 0, and writing to an index that is less than 1 or greater than the
length of the vector is undefined behaviour. You must
be very careful not to access or write out of these bounds, as the
luajr.lua module does not do bounds checking. Going out of
bounds will cause crashes. Also note that unlike Lua tables, vector
types can only be indexed with integers from 1 to the vector length, not
with strings or any other types (although there are methods of getting
or setting elements by name; see below).
Important: Because vector types depend upon R, and R is not thread safe, they must be used with care in parallel processing. In particular, do not create new vector types inside a worker; see Parallel processing below for what is and is not safe.
Creating and testing vector types
luajr.logical(a, b),
luajr.integer(a, b), luajr.numeric(a, b),
luajr.character(a, b),
luajr.list(a, b)
These functions can be used to create R vectors in Lua code. The
meaning of a and b depends on the type of
a and b and whether both are present.
Specifically,
luajr.logical() -- Size-0 logical vector
luajr.integer(s) -- Integer vector copy of SEXP
luajr.numeric(n, x) -- Size-n numeric vector, all entries equal to x
luajr.character(v) -- Character vector copy of v
luajr.list(t) -- List copied from Lua table t
luajr.numeric(z) -- Numeric vector copied from "vector-ish" object z
Above, s is an R.sexp of the same
underlying type as the vector, n is a nonnegative number,
x is a boolean, number, string, etc. (as appropriate for
the vector type) or nil, v is another vector
object of the same type, t is a Lua table, and
z is a Lua table or vector type. If t is a
table and the vector is luajr.list type, then the names
from t will be copied over into the
luajr.list.
If x == nil in the (n, x) form of the above
functions, then the values of the vector are left uninitialized (except
for luajr.character, where the values are set to the empty
string; for luajr.list, the values are set to
luajr.NULL).
luajr.is_logical(obj),
luajr.is_integer(obj), luajr.is_numeric(obj),
luajr.is_character(obj),
luajr.is_list(obj)
Check whether a value obj is one of the corresponding
vector types. These return true if obj is of
the corresponding type, and false otherwise.
Vector type methods
All the vector types have the following methods.
Note on attribute behaviour: the names
attribute is preserved across length-changing operations
(push_back, pop_back, insert,
erase, resize); whether those operations
preserve dim or dimnames attributes is
undefined, so do not call them on a matrix-shaped value unless you
intend the matrix shape to be discarded. “Replacing” operations
(v:assign(), v:clear()) drop all structural
attributes (names, dim, dimnames)
consistently. If you need a result to keep matrix shape after such
changes, re-attach dim via
v:set_attr("dim", ...) before returning it.
#v
Returns the number of elements in the vector.
Note: This is distinct from the vector’s capacity (returned by
v:capacity()), which can be larger after
v:reserve() or if v:push_back() overallocates
to amortize future growth (see below).
x = v[i], v[i] = x
Get or set the ith element of the vector.
Again: Note that for vector types, indexes start at 1, not at 0. You
must be very careful not to access or write out of these bounds, as the
luajr.lua module does not do bounds
checking. Going out of bounds will cause crashes or other undefined
behaviour. Unlike Lua tables, vector types can only be indexed with
integers from 1 to the vector length, not with strings or any other
types.
x = v(i)
Gets the ith element of the vector, but as a new vector
type instead of as a ‘bare’ value. If you are returning a single value
to R, prefer to use this form as it will preserve NAs; for
example, if v = luajr.integer({1, 2, luajr.NA_integer_}),
then return v[3] will return -2147483648 to R, but
return v(3) will correctly return NA of
integer type.
Instead of a numeric index, you can also pass a string key as
i. If this string key is not found, v(i) will
return the appropriate NA for atomic vectors or
NULL for luajr.list.
v:is_na(i)
Returns true if the ith element of the
vector is NA, and false otherwise. This is the
recommended way to test for NA because direct comparison
with luajr.NA_real_ always returns false (due
to NaN semantics in IEEE-754: NaN == NaN is false). For
luajr.numeric, v:is_na(i) matches R’s
is.na() semantics and returns true for both
NA_real_ and any other NaN value.
v:is_na(i) errors on luajr.list, since list
elements are themselves R values and have no per-element NA concept.
ipairs(v), pairs(v)
Use with a Lua for loop to iterate over the values of
the vector. ipairs iterates over the integer keys and
values of the vector.
At the moment, pairs is the same as ipairs,
but in the future, pairs may be changed so that it provides
vector element names instead of integer keys.
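A minimal sketch of both loop forms, assuming a small numeric vector built with the constructor described earlier. The vector v and its values are illustrative.

```lua
local v = luajr.numeric({ 10, 20, 30 })

-- ipairs yields the integer index and the element value.
for i, x in ipairs(v) do
    print(i, x)
end

-- At the moment, pairs behaves the same way for vector types.
for i, x in pairs(v) do
    print(i, x)
end
```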
tostring(v)
Converts the vector to a human-readable string; output is capped at
the first 100 entries, with a ... (N elements) tail for
longer vectors. Names, if present, are shown as
"name = value" for each element. For
luajr.list vectors, each element is shown as a short type
tag and length (e.g. "num[5]", "chr[3]")
rather than its full contents.
v:assign(a, b)
Assign a new value to the vector; a and b
have the same meaning as in the vector
constructors. Unlike construction, assign reuses the
vector’s existing storage in place when capacity allows, and only
allocates a new buffer if the new length would exceed the current
capacity.
v:concat(sep)
Returns a string comprised of the elements of the vector converted
into strings and concatenated together with sep as a
separator. If sep is missing, the default is
",".
v:debug_str()
Returns a compact string representation of the vector, useful mainly for debugging. This contains the length of the vector, then the capacity of the vector, then each of the vector elements separated by commas.
v:reserve(n)
If n is larger than the vector’s current capacity,
enlarges the vector’s capacity to n. Otherwise, does
nothing.
v:capacity()
Returns the capacity of the vector.
v:shrink_to_fit()
If the vector’s capacity is larger than its length, reallocates the vector so that its capacity is equal to its length.
v:clear()
Sets the size of the vector to 0, and removes the names, dim, and dimnames attributes.
v:resize(n, val)
Sets the size of the vector to n. If n is
smaller than the vector’s current length, removes elements at the end of
the vector. If n is larger than the vector’s current
length, adds new elements at the end equal to val.
val can be nil or missing.
v:push_back(val)
Adds val to the end of the vector. If this operation
would grow the vector past its current capacity, this allocates a new
vector with the capacity for double the number of current elements. This
is so that push_back() can be used in a loop without too
much of a performance hit.
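To see why doubling keeps repeated push_back() calls cheap, the growth policy can be simulated in plain Lua. This is only a sketch of the bookkeeping, not luajr's implementation: the starting capacity of 0 and the minimum capacity of 1 are assumptions, and simulate_push_back is a hypothetical helper.

```lua
-- Plain-Lua simulation of the doubling policy described above.
-- Tracks only length, capacity, and the number of reallocations.
local function simulate_push_back(n)
    local len, cap, reallocs = 0, 0, 0
    for _ = 1, n do
        if len + 1 > cap then
            -- Grow to double the current number of elements (assumed minimum: 1).
            cap = math.max(1, 2 * len)
            reallocs = reallocs + 1
        end
        len = len + 1
    end
    return len, cap, reallocs
end

-- 1000 pushes end with capacity 1024 after only 11 reallocations.
local len, cap, reallocs = simulate_push_back(1000)
print(len, cap, reallocs)
```

If the final size is known in advance, calling v:reserve(n) once before the loop avoids even these reallocations.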
v:pop_back()
Removes one element from the end of the vector.
v:insert(i, a, b)
Inserts new elements before position i, which must be
between 1 and #v + 1. a and b
have the same meaning as in the vector
constructors.
v:erase(first, last)
Removes elements from position first to position
last, inclusive (e.g. so that v:erase(1, #v)
erases the whole vector). If last is nil or
missing, just erases the single element at position
first.
v:detach()
Allocates new memory for the vector and copies its contents to the new memory. If the vector has been passed into a Lua function by reference, or is aliasing an R SEXP, this ‘detaches’ the reference/alias. Not needed in normal use.
v:get_attr(k)
Gets the R attribute with name k.
v:set_attr(k, v)
Sets the R attribute named k to v.
v:unname()
Deletes any name attributes set on the vector.
v:find(name)
Returns the numerical index of the first element of the vector with
name name. If no such element exists, returns
nil.
v:get(name)
Gets the value of the vector element with name name.
Returns nil if no such element exists. For
luajr.list, the returned element functions as a copy, so
mutating an element of a list does not write through to the list
itself.
To update an element in place, read it, mutate the local element, and
assign back with v:set(). For example:
local df = luajr.dataframe({ x = luajr.numeric({1, 2, 3}),
                             y = luajr.logical({true, false, true}) })
local col = df:get('x')
col[2] = 42
df:set('x', col)
v:set(name, v)
Sets the vector element named name to value
v. If no such name exists, adds a new element to the back
of the vector with that name and sets the value to v.
Additional types
Besides the vector types, there are several additional types that
enrich the functionality of the luajr.lua module.
R function
The luajr.rfunction type wraps an R function so it can
be called from Lua.
luajr.rfunction(a, b)
Constructs a luajr.rfunction around an R function. The
forms are:
luajr.rfunction(s) -- wrap an R function from an R.sexp
luajr.rfunction(name) -- look up `name` in the global environment
luajr.rfunction(name, env) -- look up `name` in `env`
Above, s must be a SEXP of type CLOSXP,
SPECIALSXP, or BUILTINSXP; name
is a Lua string; env is a luajr.environment
(or R.sexp of type ENVSXP).
luajr.is_rfunction(obj)
Returns true if obj is a
luajr.rfunction, and false otherwise.
f(...)
Calls the R function with positional arguments. Each argument is converted to an R type automatically, and the result is returned as a luajr value depending on what the R function returned. For example:
local mean = luajr.rfunction("mean")
local m = mean(luajr.numeric({1, 2, 3})) -- m == luajr.numeric({2})
f:call(args, env)
Calls the R function using a Lua table of arguments and an optional
evaluation environment. Integer keys 1..#args are passed as
positional arguments to the R function, in order; string keys are passed
as named arguments. env is a luajr.environment
(or R.sexp of type ENVSXP) and defaults to the
global environment when nil or missing. For example:
paste = luajr.rfunction("paste")
result = paste:call({"a", "b", "c", sep = "-"}) -- "a-b-c"
local sum = luajr.rfunction("sum")
result = sum:call({luajr.numeric({1, 2, luajr.NA_real_}), ["na.rm"] = true}) -- 3
Note that R argument names containing . (such as
na.rm, length.out, row.names)
cannot be written as Lua identifiers, so the bracket form
["na.rm"] = true is needed.
Integer keys must form a sequence (1..#args with no
holes); any other integer key, or any key whose type is not number or
string, raises an error.
Environment
The luajr.environment type wraps an R environment so its
bindings can be read, written, and inspected from Lua.
luajr.environment(a)
Constructs a luajr.environment. The forms are:
luajr.environment() -- new empty hashed environment under EmptyEnv
luajr.environment(s) -- wrap an existing R environment SEXP
luajr.environment(name) -- look up a registered namespace by name
Above, s is an R.sexp of type
ENVSXP; name is a Lua string naming a loaded R
package’s namespace (e.g. "stats").
luajr.is_environment(obj)
Returns true if obj is a
luajr.environment, and false otherwise.
e:get(k)
Returns the value bound to k in the environment, or
nil if no such binding exists. The returned value is
converted using the same rules as luajr.from_sexp, so atomic vectors come
back as the corresponding luajr vector type, R
functions as luajr.rfunction, etc.
e:set(k, v)
Binds the name k to value v in the
environment. v is converted to a SEXP using the same rules
as luajr.to_sexp.
e:exists(k)
Returns true if k is bound in the
environment (equivalent to e:get(k) ~= nil), and
false otherwise.
e:remove(k)
Removes the binding for k from the environment, if it
exists.
e:get_parent()
Returns the parent (enclosing) environment as a new
luajr.environment.
e:set_parent(env)
Sets the parent of e to env, which must be
a luajr.environment or an R.sexp of type
ENVSXP.
e:ls(all, sorted)
Returns the names bound in e as a
luajr.character vector. If all is
true, names beginning with . are also included
(default: false). If sorted is
true, the names are returned in lexicographic order
(default: true).
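The environment methods above can be combined as in the following sketch. It uses only the methods documented in this section; the binding names and values are illustrative.

```lua
-- A fresh, empty environment.
local e = luajr.environment()

-- Bind two names; values are converted as by luajr.to_sexp.
e:set("answer", 42)
e:set(".hidden", "secret")

print(e:exists("answer"))   -- the name is bound
print(e:exists("missing"))  -- no such binding

-- ls() skips names starting with "." unless `all` is true.
print(tostring(e:ls()))
print(tostring(e:ls(true)))

-- Remove a binding again.
e:remove("answer")
print(e:exists("answer"))
```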
Data frame
A data frame can be created with
luajr.dataframe(a, b)
This is just a luajr.list with the "class"
attribute set to "data.frame". a and
b have the same meaning as in the vector
constructors. When a list with this class gets returned to R, it
gets turned into an R data.frame and row names are set
automatically. Otherwise it is equivalent to luajr.list and
all the methods of luajr.list can be used. Note that this
allows you to do, e.g.,
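As one illustration, a data frame can be built up column by column with the list methods described earlier. This is a sketch under the assumption that luajr.dataframe() with no arguments gives an empty data frame, by analogy with the vector constructors; the column names and values are made up.

```lua
-- Start from an empty data frame and add columns by name.
local df = luajr.dataframe()
df:set("id", luajr.integer({ 1, 2, 3 }))
df:set("score", luajr.numeric({ 0.5, 1.5, 2.5 }))

-- List methods work on it: look up a column and inspect its length.
local score = df:get("score")
print(#df, #score)
```

Returning df from the Lua function then hands it back to R as a data.frame.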
Matrix
A matrix can be created with
luajr.matrix(nrow, ncol, init)
This is a special kind of luajr.numeric with the
"dim" attribute set to
luajr.integer({ nrow, ncol }), and all entries set to
init. If init is omitted or nil, the matrix is
uninitialised.
This gets recognized as a matrix when returned to R. However, you can only access elements with a single index that starts at 1 and goes in column-major order. So, for example, for a 2x2 matrix, the top-left element has index 1, bottom-left has index 2, top-right has index 3 and bottom-right has index 4.
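The column-major rule can be wrapped in a small helper that maps a (row, column) pair to the single index. mat_index is a hypothetical name, not part of luajr.lua, and the sketch runs in plain Lua.

```lua
-- 1-based, column-major linear index of element (row, col)
-- in a matrix with nrow rows.
local function mat_index(row, col, nrow)
    return (col - 1) * nrow + row
end

-- The 2x2 case from the text:
print(mat_index(1, 1, 2))  -- 1 (top-left)
print(mat_index(2, 1, 2))  -- 2 (bottom-left)
print(mat_index(1, 2, 2))  -- 3 (top-right)
print(mat_index(2, 2, 2))  -- 4 (bottom-right)
```

With a matrix m created with nrow rows, element (row, col) would then be read as m[mat_index(row, col, nrow)].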
Data matrix
A data matrix can be created with
This is similar to luajr.matrix, but it has
names as column names. names can be passed in
as a Lua table (e.g. { "foo", "bar" }) or as a
luajr.character object.
Using a data matrix is very slightly faster than using a data frame because only one memory allocation needs to be made, but this difference is on the order of microseconds for normal sized data. Also, if you are just going to convert the returned matrix into a data frame anyway, you lose the speed advantage. So don’t worry about it too much.
Constants
The following constants are defined in the luajr
table:
luajr.TRUE -- defined as 1
luajr.FALSE -- defined as 0
luajr.NA_logical_ -- equal to R.NA_LOGICAL
luajr.NA_integer_ -- equal to R.NA_INTEGER
luajr.NA_real_ -- equal to R.NA_REAL
luajr.NA_character_ -- equal to R.NA_STRING
luajr.NULL -- equal to R.NilValue
Note that Lua’s semantics in dealing with logical types and NA values are very different from R’s semantics. Lua does not “understand” NA values in the same way that R does, and luajr sees the R logical type as fundamentally an integer, not a boolean.
Specifically, whereas in R you can use an element of a logical vector directly as the condition of an if statement,
in Lua you have to compare it explicitly against luajr.TRUE:
x = luajr.logical({ luajr.TRUE, luajr.FALSE, luajr.NA_logical_ })
if x[1] == luajr.TRUE then print("First element of x is TRUE!") end
This is because under the hood, R’s TRUE is defined as the integer 1
(though any nonzero integer besides NA will also test as TRUE), FALSE is
defined as 0, and NA (when logical or integer) is defined as a special
‘flag’ value of -2147483648 (i.e. -2^31). However, in Lua, anything
other than nil or false evaluates as
true, meaning that the following Lua code
x = luajr.logical({ luajr.TRUE, luajr.FALSE, luajr.NA_logical_ })
for i = 1, #x do
    if x[i] then print("Element", i, "of x is TRUE!") end
end
will incorrectly claim that luajr.TRUE,
luajr.FALSE, and luajr.NA_logical_ are all
“TRUE”. So, instead, explicitly set logical values to
either luajr.TRUE, or luajr.FALSE, or
luajr.NA_logical_, and explicitly test them against the
same values.
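The pitfall can be reproduced in plain Lua, without luajr, by using the integer values given above as stand-ins for the constants. The local names TRUE, FALSE, and NA_logical and the helper describe are hypothetical and exist only for this sketch.

```lua
-- Stand-ins for the luajr constants, using the integer values given above.
local TRUE, FALSE, NA_logical = 1, 0, -2147483648

-- Hypothetical helper: classify an R-style logical value explicitly.
local function describe(x)
    if x == NA_logical then
        return "NA"
    elseif x == FALSE then
        return "FALSE"
    end
    return "TRUE"  -- any other nonzero integer counts as TRUE in R
end

for _, x in ipairs({ TRUE, FALSE, NA_logical }) do
    -- `if x` succeeds for every value, because all numbers are truthy in Lua.
    local truthy = false
    if x then truthy = true end
    print(describe(x), truthy)
end
```

Every row reports truthy as true, while the explicit comparison tells the three values apart.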
Note that the different NA constants are not interchangeable. So,
when testing for NA in Lua, you have to check if a value is equal to
(==) or not equal to (~=) the corresponding NA
constant above, depending on the type of the variable in question. The
one exception is luajr.NA_real_: because it is a
NaN value, IEEE-754 semantics make
x == luajr.NA_real_ always false, even when
x is NA_real_. For
luajr.numeric vectors, use the v:is_na(i) method instead of
== to test for NA (or NaN).
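The NaN behaviour is a property of floating-point numbers rather than of luajr, so it can be checked in plain Lua. In this sketch a NaN produced by 0 / 0 stands in for luajr.NA_real_, and is_nan is a hypothetical helper.

```lua
local nan = 0 / 0          -- a NaN, standing in for luajr.NA_real_

print(nan == nan)          -- false: NaN never equals anything, even itself
print(nan ~= nan)          -- true: the usual plain-Lua test for NaN

-- Hypothetical helper: true for any NaN value.
local function is_nan(x)
    return x ~= x
end
print(is_nan(nan), is_nan(1.5))
```

Like v:is_na(i) on a luajr.numeric, this test cannot distinguish NA_real_ from other NaN values.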
All of this applies only when using R values, e.g. with the
vector types described above. Lua also has its own boolean type (with
possible values true and false) which will
never compare equal with either luajr.TRUE,
luajr.FALSE or any of the NA values. A Lua string can never
compare equal with luajr.NA_character_, but conversely a
Lua number may sometimes compare equal with
luajr.NA_logical_ or luajr.NA_integer_
(luajr.NA_real_, being a NaN, never compares equal to anything),
so do be careful not to mix Lua types and
these R constants.
Finally, luajr.NULL can be used to represent a
NULL value, either on its own or as part of a table or
luajr.list that gets returned to R. If you pass NULL in to
Lua through arg code "native", it will come out as
nil in Lua; but if you use arg code "auto", it
will come out as luajr.NULL.
Parallel processing
The luajr.lua module has facilities for running Lua code
in parallel across multiple threads. To use this functionality, you
first need to create a “worker pool”; each worker has its own
independent Lua state. You can then assign work to each worker in the
pool using some basic functions detailed below.
This is an advanced feature and you have to know what you are doing
to use it effectively. Note that because the vector types described
above depend on R, and R is not thread-safe, you should not allocate new
R objects from inside a worker (do not create new
luajr.numeric, luajr.list, etc.) and in general
you should not call any R functions. If you break these guidelines your
program will probably crash. Reading and writing the elements of a
vector that was allocated in the main state and passed in to the worker
is safe, provided different workers do not write to the same slot; this
is the standard pattern for collecting per-worker results, illustrated
below.
The following example uses parallel processing to generate an image
of what is known as a “Buddhabrot”: a Monte Carlo image of the
Mandelbrot set built by sampling random points, iterating each one, and
accumulating the trajectories of those that escape the Mandelbrot set.
Each worker accumulates into its own image buffer (to avoid write races
on shared pixels), and the main state sums them at the end, then
displays the result with R’s image():
buddhabrot <- lua_func("function(cfg)
    local pool = luajr.workers()
    -- Per-worker setup: seed the RNG so each worker draws different samples.
    -- srun runs this once in each worker, in sequence.
    pool:srun(function(thread_id)
        math.randomseed(thread_id * 7919)
    end)
    -- One image buffer per worker, allocated in the main state. Each worker
    -- writes only to its own buffer, so there's no race on shared pixels.
    local bufs = {}
    for t = 1, #pool do
        bufs[t] = luajr.matrix(cfg.W, cfg.H, 0)
    end
    -- Each worker samples `n_per` random points and accumulates the orbits
    -- of those that escape into its own buffer.
    local n_per = math.ceil(cfg.samples / #pool)
    pool:prun(function(bufs, cfg, n_per, thread_id)
        local img = bufs[thread_id]
        local W, H, max_iter = cfg.W, cfg.H, cfg.max_iter
        local cx, cy, scale = cfg.cx, cfg.cy, cfg.scale
        local xr = scale * W / H -- half-width in cr
        for s = 1, n_per do
            local cr = cx + (math.random() * 2 - 1) * xr
            local ci = cy + (math.random() * 2 - 1) * scale
            local zr, zi, iter = 0, 0, 0
            -- First pass: does the orbit escape?
            while zr*zr + zi*zi < 4 and iter < max_iter do
                zr, zi = zr*zr - zi*zi + cr, 2*zr*zi + ci
                iter = iter + 1
            end
            -- Only escaping orbits contribute (the Buddhabrot proper).
            if iter < max_iter then
                local escape_iters = iter
                zr, zi = 0, 0
                for k = 1, escape_iters do
                    zr, zi = zr*zr - zi*zi + cr, 2*zr*zi + ci
                    local i = math.floor((zr - cx) / xr * W / 2 + W / 2) + 1
                    local j = math.floor((zi - cy) / scale * H / 2 + H / 2) + 1
                    if i >= 1 and i <= W and j >= 1 and j <= H then
                        img[(j - 1) * W + i] = img[(j - 1) * W + i] + 1
                    end
                end
            end
        end
    end, bufs, cfg, n_per)
    pool:close()
    -- Sum the per-worker buffers into one result.
    local result = luajr.matrix(cfg.W, cfg.H, 0)
    local n = cfg.W * cfg.H
    for t = 1, #bufs do
        local b = bufs[t]
        for k = 1, n do
            result[k] = result[k] + b[k]
        end
    end
    return result
end", "$.")
cfg <- list(W = 600, H = 400, max_iter = 200, samples = 1e8,
            cx = -0.5, cy = 0, scale = 1.25)
img <- buddhabrot(cfg)
image(img, col = hcl.colors(256, "Oslo", rev = TRUE),
      useRaster = TRUE, axes = FALSE)
Stepping through the above example:
- luajr.workers() creates a pool of one worker per core; #pool gives the number of workers. You can also pass a number of requested workers to luajr.workers(n).
- pool:srun(setup) runs setup in each worker once, in sequence. This is used here to give each worker a different RNG seed (via thread_id) so the workers don’t redundantly sample the same points. Because srun is sequential, the setup doesn’t need to be thread-safe.
- bufs is a Lua table of luajr.matrix buffers (one per worker), all allocated in the main state. Passing the table into the workers gives each one access to all the buffers. Each worker picks its own via thread_id, so writes to the buffers don’t race between threads.
- pool:prun(work, bufs, cfg, n_per) runs work once in every worker, concurrently. Each worker draws n_per random points, runs the Mandelbrot escape test, and for escaping orbits, re-traces the orbit and increments the pixel that each visited point maps to.
- pool:close() shuts the workers down. This is done automatically by Lua garbage collection, but you can also run it manually.
- The main state sums the per-worker buffers into one result. Back in R, the image is displayed using the function image().
The complete reference for luajr’s parallel processing functionality follows.
luajr.ncores()
Returns a suggested number of concurrent threads for the current
machine (the value of std::thread::hardware_concurrency()
from C++). This is usually the number of logical cores (hardware
threads) on the machine, which can be larger than the number of
physical cores.
luajr.workers(n)
Constructs a pool of n worker Lua states. If
n is nil or missing, the default is
luajr.ncores(). Each worker is created with the standard
Lua libraries and the luajr.lua module already loaded.
The pool’s workers are torn down automatically when the pool is
garbage collected, or explicitly via pool:close(). A pool
must be created and used within a single call from R into Lua, because
the workers are provided with references to all required data, and
these references may only be valid within that scope. Specifically,
cdata arguments (vectors, environments, R functions) carried into
workers reference the calling state’s R objects, which R may only keep
alive for the duration of the .Call; pointer-type cdata
transferred as light userdata is similarly subject to whatever lifetime
the source state guarantees.
pool:preload(f, ...)
Loads the function f and the additional arguments
... into every worker state. f is a Lua
function, and the additional arguments are copied to each worker by
value (for plain Lua values) or by reference (for cdata / R values).
pool:srun(f, ...)
Sequentially runs the preloaded function in each worker. If
f is non-nil, calls preload first. The
function is invoked in each worker as
f(arg1, arg2, ..., thread_id), where
arg1..argN are the preloaded arguments and
thread_id is the 1-based index of the worker. The main use
case is running code that touches each worker’s Lua state, such as
loading modules, setting globals, or otherwise preparing per-worker
state, without having to worry about thread safety, since each worker
runs sequentially.
#pool
Returns the number of workers in the pool. This may not be the same
as the number of requested workers; for example, in debug mode, parallel
processing is disabled and so #pool will always be 1.
pool:prun(f, ...)
Same as srun, but runs the workers concurrently on
separate threads. If any worker throws an error, the first error is
propagated back to the calling state after all workers complete.
pool:pfor(i0, i1, f, ...)
Distributes the iterations i0..i1 (inclusive) across the
workers using atomic work-stealing. The function is invoked once per
iteration as f(i, arg1, ..., thread_id), where
i is the iteration index. As with prun, errors
from any worker are propagated after the loop completes.
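A minimal sketch of pfor, assuming the calling convention described above. The function name squares is hypothetical. The output vector is allocated in the main state before any worker runs, and iteration i writes only to slot i, so no two workers touch the same element.

```lua
-- Intended to run inside a single call from R into Lua.
local function squares(n)
    -- Allocate the output in the main state, not inside a worker.
    local out = luajr.numeric(n, 0)

    local pool = luajr.workers()
    -- Called once per iteration as f(i, out, thread_id).
    pool:pfor(1, n, function(i, out, thread_id)
        out[i] = i * i
    end, out)
    pool:close()

    return out
end
```

The worker function refers only to its own parameters rather than to variables from the enclosing scope, since each worker runs in a separate Lua state.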
pool:close()
Closes all worker states in the pool and releases their resources. Safe to call more than once. Workers cannot be used after the pool has been closed.
Utility functions
These are general-purpose helpers exposed by the
luajr.lua module.
luajr.to_sexp(v)
Converts a Lua value v to an R SEXP. The conversion
rules are:
- nil → R_NilValue
- Lua number → REALSXP of length 1
- Lua boolean → LGLSXP of length 1
- Lua string → STRSXP of length 1 (or RAWSXP if the string contains embedded null bytes)
- Lua table → VECSXP; integer keys become positional entries, string keys become named entries (see also luajr.list)
- luajr vector/environment/rfunction → the underlying SEXP
- bare R.sexp cdata → the SEXP unchanged
Returns the result as an R.sexp. The returned SEXP is
unprotected; most callers consume it immediately (e.g. as an argument to
R.SET_VECTOR_ELT or R.defineVar).
luajr.from_sexp(s, argcode)
Converts an R SEXP s to the corresponding Lua/luajr
value. argcode is an optional numeric byte value that
selects the target representation; it is not the string
format (e.g. "$.", "&V") accepted by lua_func(), but the underlying
encoded byte that those strings ultimately compile to. If
argcode is nil or missing, the default selects
the auto representation.
luajr.readline(prompt)
Reads one line of input from R’s console using
R_ReadConsole, displaying prompt first.
Returns the line as a Lua string with the trailing newline removed. Used
internally by the debugger but also available for any Lua code that
needs an interactive prompt.
luajr.dbg(...)
Invokes the debugger.lua module on the given arguments, opening an interactive debugger session at the current Lua frame.
Long vectors
luajr can handle what are called “long vectors” in R (i.e., vectors with 2^31 elements or more), but there are some restrictions.
Vector types are allocated by R, so their size is limited by R’s
vector memory limit. This limit can be set and queried using the R
function mem.maxVSize().
There are limits on how large of a table you can create in Lua when
using the lua_createtable() function, which is what
luajr uses to allocate space for tables. This means
that if you try to pass a large vector into Lua as a native Lua array
(e.g. arg code $N for an atomic vector, or $V
for a list), the operation will fail if there are too many elements in
the vector. The maximum number of elements is currently set by LuaJIT to
2^27, or just over 134 million elements. This is also the maximum length
of an R list that can be passed into Lua. Passing a vector
as a SEXP (arg code S) or as a luajr
vector type (arg codes ., L, I,
N, C, V) does not go through
lua_createtable and is not subject to this limit.
As of R 4.6.0, a data.frame cannot hold long vectors;
nor can a data.table or tibble. However, a
list can hold long vectors.
If you are working with lots of data, you may want to keep it on the
hard drive rather than trying to load it all into working memory. You
could use something like C’s mmap
for this, an R package like arrow,
or something else.