Luau
luau-lang.org.web.brid.gy
Luau (lowercase u, /ˈlu.aʊ/) is a fast, small, safe, gradually typed embeddable scripting language derived from Lua.

[bridged from https://luau-lang.org/ on the web: https://fed.brid.gy/web/luau-lang.org ]
Luau Recap: July 2024
Hello everyone!

While the Luau team is actively working on a big rewrite of the type inference and type checking engines (more news about that in the near future), we wanted to go over other changes and updates since our last recap back in October.

## Official Luau mascot

Luau has recently adopted a Hawaiian monk seal mascot named Hina, after the Hawaiian goddess of the moon.

Please welcome Hina the Seal!

## Native Code Generation

We are happy to announce that the native code feature is out of the 'Preview' state and is fully supported for X64 (Intel/AMD) and A64 (ARM) processor architectures.

As a refresher, native code generation is the feature that allows Luau scripts that were previously executed by interpreting bytecode inside the Luau VM to instead be compiled to machine code that the CPU understands and executes directly.

Since the release of the Preview, we have worked on improving code performance, memory use of the system, correctness and stability. Some highlights:

* Improved performance of the bit32 library functions
* Improved performance of numerical loops
* Optimized table array and property lookups
* Added native support for new buffer type operations
* Code optimizations based on knowing which types are returned from operations
* Code optimizations based on function argument type annotations
  * This includes support for SIMD operations for annotated vector arguments

There are many other small improvements in generated code performance and size and we have plans for additional optimizations.

### Native function attribute

For better control of what code runs natively, we have introduced new syntax for function attributes:

    @native -- function compiles natively
    local function foo()
        ...
    end

This is the first attribute to become available and we are working on the ability to mark functions as deprecated using the `@deprecated` attribute. More on that here.
### Type information for runtime optimizations

Native code generation works on any code without having to modify it. In certain situations, this means that the native compiler cannot be sure about the types involved in the operation.

Consider a simple function, working on a few values:

    local function MulAddScaled(a, b, c)
        return a * b * 0.75 + c * 0.25
    end

The native compiler assumes that operations are most likely being performed on numbers and generates the appropriate fast path.

But what if the function is actually called with a vector type?

    local intlPos = MulAddScaled(Part.Position, v, vector(12, 0, 0))

To handle this, a slower path was generated to handle any other potential type of the argument. Because this path is not chosen as the first possible option, extra checking overhead prevents code from running as fast as it can.

When we announced the last update, we had already added some support for following the types used as arguments.

    local function MulAddScaled(a: vector, b: vector, c: vector)
        return a * b * 0.75 + c * 0.25
    end

> **_NOTE:_** `vector` type is not enabled by default, check out `defaultOptions` and `setupVectorHelpers` functions in `Conformance.test.cpp` file as an example of the `vector` library setup.

Since then, we have extended this to support type information on locals, following complex types and even inferring results of additional operations.

    type Vertex = { p: vector, uv: vector, n: vector, t: vector, b: vector, h: number }
    type Mesh = { vertices: {Vertex}, indices: {number} }

    function calculate_normals(mesh: Mesh)
        for i = 1,#mesh.indices,3 do
            local a = mesh.vertices[mesh.indices[i]]
            local b = mesh.vertices[mesh.indices[i + 1]]
            local c = mesh.vertices[mesh.indices[i + 2]]

            local vba = a.p - b.p -- Inferred as a vector operation
            local vca = a.p - c.p

            local n = vba:Cross(vca) -- Knows that Cross returns vector

            a.n += n -- Inferred as a vector operation
            b.n += n
            c.n += n
        end
    end

As can be seen, often it's enough to annotate the type of the data structure and correct fast-path vector code will be generated from that without having to specify the type of each local or temporary.

> **_NOTE:_** Advanced inference and operation lowering is enabled by using custom `HostIrHooks` callbacks. Check out 'Vector' test with 'IrHooks' option in `Conformance.test.cpp` and `ConformanceIrHooks.h` file for an example of the setup.

Note that support for native lowering hooks allows generation of CPU code that is multiple times faster than a generic metatable call.

Even when the native compiler doesn't have a specific optimization for a type, if the type can be resolved, shorter code sequences are generated and more optimizations can be made between separate operations.

> **_NOTE:_** `HostIrHooks` callbacks also enable type inference and lowering for your custom userdata types. Check out 'NativeUserdata' test with 'IrHooks' option in `Conformance.test.cpp` and `ConformanceIrHooks.h` file for an example of the setup.

## Runtime changes

### Stricter `utf8` library validation

The `utf8` library will now correctly validate UTF-8 and reject inputs that have surrogates. `utf8.len` will return `nil` followed by the byte offset, `utf8.codepoint` and `utf8.codes` will error. This matches how other kinds of input errors were previously handled by those functions.
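To make the stricter validation concrete, here is a minimal sketch of checking a string from script code. The helper name `isValidUtf8` and the sample strings are illustrative assumptions, not part of the Luau library; only `utf8.len` itself is the documented call.

```luau
-- Hypothetical helper: true when `s` is well-formed UTF-8.
local function isValidUtf8(s: string): boolean
    -- utf8.len returns the code point count on success,
    -- or nil followed by a byte offset when the input is invalid.
    return utf8.len(s) ~= nil
end

local valid = "héllo"
-- 'a' followed by the bytes ED A0 80, which encode the surrogate U+D800
local withSurrogate = "a\xED\xA0\x80"

print(isValidUtf8(valid))          -- well-formed input passes
print(isValidUtf8(withSurrogate))  -- surrogates are rejected

-- The second return value points at the offending byte.
local count, badOffset = utf8.len(withSurrogate)
print(count, badOffset)
```

Because `utf8.len` reports failure through return values rather than by raising an error, a check like this does not need to be wrapped in `pcall`.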
Strings that are validated using `utf8.len` will now always work properly with `utf8.nfcnormalize` and `utf8.graphemes` functions. Custom per-character validation logic is no longer required to check if a string is valid under `utf8` requirements.

### Imprecise integer number warning

Luau stores numbers as 64-bit floating-point values. Integer values up to 2^53 are supported, but higher numbers might experience rounding.

For example, both 10000000000000000 and 9223372036854775808 are larger than 2^53 but are exactly representable, so they are unaffected by rounding, while 10000000000000001 gets rounded down to 10000000000000000.

In cases where rounding takes place, you will get a warning message. If the large value is intended and rounding can be ignored, just add ".0" to the number to remove the warning:

    local a = 10000000000000001 -- Number literal exceeded available precision and was truncated to closest representable number
    local b = 10000000000000001.0 -- Ok, but rounds to 10000000000000000

### Leading `|` and `&` in types

It is now possible to start your union and intersection types with a leading `|` or `&` symbol. This can help align the type components more cleanly:

    type Options =
        | { tag: "cat", laziness: number }
        | { tag: "dog", happiness: number }

You can find more information and examples in the proposal.

## Analysis Improvements

While our main focus is on a type-checking engine rewrite that is nearing completion, we have fixed some of the issues in the current one.
* Relational operator errors are more conservative now and generate fewer false positive errors
* It is not an error to iterate over table properties when an indexer is not part of the type
* Type packs with cycles are now correctly described in error messages
* Improved error message when a value that is not a function is being used in a call
* Fixed stability issues which caused Studio to crash
* Improved performance for code bases with a large number of scripts and complex types

## Runtime Improvements

When converting numbers to strings in scientific notation, we will now skip the trailing '.'.

For example, `tostring(1e+30)` now outputs '1e+30' instead of '1.e+30'. This improves compatibility with data formats like JSON. But please keep in mind that unless you are using JSON5, Luau can still output 'inf' and 'nan' numbers which might not be supported.

* Construction of tables with 17-32 properties or 33-64 array elements is now 30% faster.
* `table.concat` method is now 2x faster when the separator is not used and 40% faster otherwise.
* `table.maxn` method is now 5-14x faster.
* vector constants are now stored in the constant table and avoid runtime construction.
* Operations like 5/x and 5-x with any constant on the left-hand-side are now performed faster, one less minor thing to think about!
* It is no longer possible to crash the server on a hang in the `string` library methods.

## Luau as a supported language on GitHub

Lastly, if you have open-source or even private projects on GitHub which use Luau, you might be happy to learn that Luau now has official support on GitHub for the `.luau` file extension. This includes recognizing files as using the Luau programming language and having support for syntax highlighting.

A big thanks goes to our open source community for their generous contributions including pushing for broader Luau support:

* birds3345
* bjornbytes
* Gskartwii
* jackdotink
* JohnnyMorganz
* khvzak
* kostadinsh
* mttsner
* mxruben
* petrihakkinen
* zeux
luau.org
December 22, 2024 at 2:49 AM
Luau Recap: October 2023
We're still quite busy working on some big type checking updates that we hope to talk about soon, but we have a few equally exciting updates to share in the meantime!

Let's dive in!

## Floor Division

Luau now has a floor division operator. It is spelled `//`:

    local a = 10 // 3 -- a == 3
    a //= 2 -- a == 1

For numbers, `a // b` is equivalent to `math.floor(a / b)`, and you can also overload this operator by implementing the `__idiv` metamethod. The syntax and semantics are borrowed from Lua 5.3 (although Lua 5.3 has an integer type while we don't, we tried to match the behavior to be as close as possible).

## Native Codegen Preview

We are actively working on our new native code generation module that can significantly improve the performance of compute-dense scripts by compiling them to X64 (Intel/AMD) or A64 (ARM) machine code and executing that natively. We aim to support all AArch64 hardware with the current focus being Apple Silicon (M1-M3) chips, and all Intel/AMD hardware that supports AVX1 (with no planned support for earlier systems). When the hardware does not support native code generation, any code that would be compiled as native just falls back to the interpreted execution.

When working with open-source releases, binaries now have native code generation support compiled in by default; you need to pass `--codegen` command line flag to enable it. If you use Luau as a library in a third-party application, you would need to manually link `Luau.CodeGen` library and call the necessary functions to compile specific modules as needed - or keep using the interpreter if you want to! If you work in Roblox Studio, we have integrated native code generation preview as a beta feature, which currently requires manual annotation of select scripts with `--!native` comment.
Our goal for the native code generation is to help reach ultimate performance for code that needs to process data very efficiently, but not necessarily to accelerate every line of code, and not to replace the interpreter. We remain committed to maximizing interpreted execution performance, as not all platforms will support native code generation, and it's not always practical to use native code generation for large code bases because it has a larger memory impact than bytecode. We intend for this to unlock new performance opportunities for complex features and algorithms, e.g. code that spends a lot of time working with numbers and arrays, but not to dramatically change performance on UI code or code that spends a lot of its time calling Lua functions like `table.sort`, or external C functions (like Roblox engine APIs).

Importantly, native code generation does not change our behavior or correctness expectations. Code compiled natively should give the same results when it executes as non-native code (just take a little less time), and it should not result in any memory safety or sandboxing issues. If you ever notice native code giving a different result from non-native code, please submit a bug report.

We continue to work on many code size and performance improvements; here's a short summary of what we've done in the last couple of months, and there's more to come!

* Repeated access to table fields with the same object and name are now optimized (e.g. `t.x = t.x + 5` is faster)
* Numerical `for` loops are now compiled more efficiently, yielding significant speedups on hot loops
* Bit operations with constants are now compiled more efficiently on X64 (for example, `bit32.lshift(x, 1)` is faster); this optimization was already in place for A64
* Repeated access to array elements with the same object and index is now faster in certain cases
* Performance of function calls has been marginally improved on X64 and A64
* Fix code generation for some `bit32.extract` variants where we could produce incorrect results
* `table.insert` is now faster when called with two arguments as it's compiled directly to native code
* To reduce code size, module code outside of functions is not compiled natively unless it has loops

## Analysis Improvements

The `break` and `continue` keywords can now be used in loop bodies to refine variables. This was contributed by a community member - thank you, AmberGraceSoftware!

    function f(objects: { { value: string? } })
        for _, object in objects do
            if not object.value then
                continue
            end

            local x: string = object.value -- ok!
        end
    end

When type information is present, we will now emit a warning when `#` or `ipairs` is used on a table that has no numeric keys or indexers. This helps avoid common bugs like using `#t == 0` to check if a dictionary is empty.

    local message = { data = { 1, 2, 3 } }

    if #message == 0 then -- Using '#' on a table without an array part is likely a bug
    end

Finally, some uses of `getfenv`/`setfenv` are now flagged as deprecated. We do not plan to remove support for `getfenv`/`setfenv` but we actively discourage its use as it disables many optimizations throughout the compiler, runtime, and native code generation, and interferes with type checking and linting.
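As an illustration of the `#` pitfall described above, the sketch below contrasts `#t == 0` with a `next`-based emptiness check. The helper name `isEmpty` is an assumption made for this example; `next` is the standard library function.

```luau
-- Hypothetical helper: true when the table has no entries at all,
-- regardless of whether they live in the array part or under other keys.
local function isEmpty(t): boolean
    return next(t) == nil
end

local message = { data = { 1, 2, 3 } }

-- '#' only counts the array part, so this reports 0
-- even though `message` has a `data` field.
print(#message == 0)     -- true
print(isEmpty(message))  -- false
print(isEmpty({}))       -- true
```

The `#message` line is exactly the kind of code the new warning is meant to catch when type information is available.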
## Autocomplete Improvements

We used to have a bug that would arise in the following situation:

    --!strict
    type Direction = "Left" | "Right"
    local dir: Direction = "Left"

    if dir == ""| then
    end

(imagine the cursor is at the position of the `|` character in the `if` statement)

We used to suggest `Left` and `Right` even though they are not valid completions at that position. This is now fixed.

We've also added a complete suggestion for anonymous functions if one would be valid at the requested position. For example:

    local p = Instance.new('Part')
    p.Touched:Connect(

You will see a completion suggestion `function (anonymous autofilled)`. Selecting that will cause the following to be inserted into your code:

    local p = Instance.new('Part')
    p.Touched:Connect(function(otherPart: BasePart) end

We also fixed some confusing editor feedback in the following case:

    game:FindFirstChild(

Previously, the signature help tooltip would erroneously tell you that you needed to pass a `self` argument. We now correctly offer the signature `FindFirstChild(name: string, recursive: boolean?): Instance`

## Runtime Improvements

* `string.format`'s handling of `%*` and `%s` is now 1.5-2x faster
* `tonumber` and `tostring` are now 1.5x and 2.5x faster respectively when working on primitive types
* Compiler now recognizes `math.pi` and `math.huge` and performs constant folding on the expressions that involve these at `-O2`; for example, `math.pi*2` is now free.
* Compiler now optimizes `if...then...else` expressions into AND/OR form when possible (for example, `if x then x else y` now compiles as `x or y`)
* We had a few bugs around `repeat..until` statements when the `until` condition referred to local variables defined in the loop body. These bugs have been fixed.
* Fix an oversight that could lead to `string.char` and `string.sub` generating potentially unlimited amounts of garbage and exhausting all available memory.
* We had a bug that could cause the compiler to unroll loops that it really shouldn't. This could result in massive bytecode bloat. It is now fixed.

## luau-lang on GitHub

If you've been paying attention to our GitHub projects, you may have noticed that we've moved `luau` repository to a new luau-lang GitHub organization! This is purely an organizational change but it's helping us split a few repositories for working with documentation and RFCs and be more organized with pull requests in different areas. Make sure to update your bookmarks and star our main repository if you haven't already!

Lastly, a big thanks to our open source community for their generous contributions:

* MagelessMayhem
* cassanof
* LoganDark
* j-hui
* xgqt
* jdpatdiscord
* Someon1e
* AmberGraceSoftware
* RadiantUwU
* SamuraiCrow
Luau Recap: July 2023
Our team is still spending a lot of time working on the upcoming replacement for our type inference engine as well as working on native code generation to improve runtime performance.

However, we also worked on unrelated improvements during this time that are summarized here.

Cross-posted to the Roblox Developer Forum.

## Analysis improvements

Indexing table intersections using `x["prop"]` syntax has been fixed and no longer reports a false positive error:

    type T = { foo: string } & { bar: number }
    local x: T = { foo = "1", bar = 2 }

    local y = x["bar"] -- This is no longer an error

Generic `T...` type is now convertible to `...any` variadic parameter.

This solves issues people had with variadic functions and variadic arguments:

    local function foo(...: any)
        print(...)
    end

    local function bar<T...>(...: T...)
        foo(...) -- This is no longer an error
    end

We have also improved our general typechecking performance by ~17% and by an additional ~30% in modules with complex types.

Other fixes include:

* Fixed issue with type `T?` not being convertible to `T | T` or `T?` which could've generated confusing errors
* Return type of `os.date` is now inferred as `DateTypeResult` when argument is "*t" or "!*t"

## Runtime improvements

Out-of-memory exception handling has been improved. `xpcall` handlers will now actually be called with a "not enough memory" string and whatever string/object they return will be correctly propagated.

Other runtime improvements we've made:

* Performance of `table.sort` was improved further. It now guarantees N*log(N) time complexity in the worst case
* Performance of `table.concat` was improved by ~5-7%
* Performance of `math.noise` was improved by ~30%
* Inlining of functions is now possible even when they used to compute their own arguments
* Improved logic for determining whether inlining a function or unrolling a loop is profitable

## Autocomplete improvements

An issue with exported types not being suggested is now fixed.
## Debugger improvements

We have fixed the search for the closest executable breakpoint line.

Previously, breakpoints might have been skipped in `else` blocks at the end of a function. This simplified example shows the issue:

    local function foo(isIt)
        if isIt then
            print("yes")
        else
            -- When 'true' block exits the function, breakpoint couldn't be placed here
            print("no")
        end
    end

## Thanks

A very special thanks to all of our open source contributors:

* Petri Häkkinen
* JohnnyMorganz
* Gael
* Jan
* Alex Orlenko
* mundusnine
* Ben Mactavsin
* RadiatedExodus
* Lodinu Kalugalage
* MagelessMayhem
* Someon1e
Luau Recap: March 2023
How the time flies! The team has been busy since the last November Luau Recap working on some large updates that are coming in the future, but before those arrive, we have some improvements that you can already use!

Cross-posted to the Roblox Developer Forum.

## Improved type refinements

Type refinements handle constraints placed on variables inside conditional blocks.

In the following example, while variable `a` is declared to have type `number?`, inside the `if` block we know that it cannot be `nil`:

    local function f(a: number?)
        if a ~= nil then
            a *= 2 -- no type errors
        end
        ...
    end

One limitation we had previously is that after a conditional block, refinements were discarded.

But there are cases where `if` is used to exit the function early, making the following code essentially act as a hidden `else` block.

We now correctly preserve such refinements and you should be able to remove `assert` function calls that were only used to get rid of false positive errors about types being `nil`.

    local function f(x: string?)
        if not x then return end

        -- x is a 'string' here
    end

Throwing calls like `error()` or `assert(false)` instead of a `return` statement are also recognized.

    local function f(x: string?)
        if not x then error('first argument is nil') end

        -- x is 'string' here
    end

Existing complex refinements like `type`/`typeof`, tagged union checks and others continue to work as expected.

## Marking table.getn/foreach/foreachi as deprecated

`table.getn`, `table.foreach` and `table.foreachi` were deprecated in Lua 5.1, which Luau is based on, and removed in Lua 5.2.

`table.getn(x)` is equivalent to `rawlen(x)` when 'x' is a table; when 'x' is not a table, `table.getn` produces an error.

It's difficult to imagine code where `table.getn(x)` is better than either `#x` (idiomatic) or `rawlen(x)` (fully compatible replacement).

`table.getn` is also slower than both alternatives and was marked as deprecated.

`table.foreach` is equivalent to a `for .. pairs` loop; `table.foreachi` is equivalent to a `for .. ipairs` loop; both may also be replaced by generalized iteration.

Both functions are significantly slower than equivalent `for` loop replacements and are more restrictive because the callback function can't yield.

Because both functions bring no value over other library or language alternatives, they were marked deprecated as well.

You may have noticed linter warnings about places where these functions are used. For compatibility, these functions are not going to be removed.

## Autocomplete improvements

When a table key type is defined to be a union of string singletons, those keys can now autocomplete in locations marked as '^':

    type Direction = "north" | "south" | "east" | "west"

    local a: {[Direction]: boolean} = {[^] = true}
    local b: {[Direction]: boolean} = {["^"]}
    local b: {[Direction]: boolean} = {^}

We also fixed incorrect and incomplete suggestions inside the header of `if`, `for` and `while` statements.

## Runtime improvements

On the runtime side, we added multiple optimizations.

`table.sort` is now ~4.1x faster (when not using a predicate) and ~2.1x faster when using a simple predicate.

We also have ideas on how to improve the sorting performance in the future.

`math.floor`, `math.ceil` and `math.round` now use specialized processor instructions. We have measured ~7-9% speedup in math benchmarks that heavily used those functions.

A small improvement was made to builtin library function calls, getting a 1-2% improvement in code that contains a lot of fastcalls.

Finally, a fix was made to table array part resizing that brings a large improvement to performance of large tables filled as an array, but at an offset (for example, starting at 10000 instead of 1).

Aside from performance, a correctness issue was fixed in multi-assignment expressions.

    arr[1], n = n, n - 1

In this example, `n - 1` was assigned to `n` before `n` was assigned to `arr[1]`. This issue has now been fixed.
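A minimal sketch of the expected multi-assignment behavior after the fix: every expression on the right-hand side is evaluated before any assignment is performed. The starting value `3` is arbitrary and chosen only for illustration.

```luau
local arr = {}
local n = 3

-- Both right-hand expressions (n and n - 1) are evaluated first,
-- and only then assigned to arr[1] and n.
arr[1], n = n, n - 1

print(arr[1], n)  -- 3  2
```

With the old bug, `arr[1]` would have received the already-decremented value of `n` instead of the original one.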
## Analysis improvements

Multiple changes were made to improve error messages and type presentation.

* Table type strings are now shown with newlines, to make them easier to read
* Fixed unions of `nil` types displaying as a single `?` character
* "Type pack A cannot be converted to B" error is now reported instead of a cryptic "Failed to unify type packs"
* Improved error message for value count mismatch in assignments like `local a, b = 2`

You may have seen error messages like `Type 'string' cannot be converted to 'string?'` even though usually it is valid to assign `local s: string? = 'hello'` because `string` is a sub-type of `string?`.

This is true in what are called Covariant use contexts, but doesn't hold in Invariant use contexts, like in the example below:

    local a: { x: Model }
    local b: { x: Instance } = a -- Type 'Model' could not be converted into 'Instance' in an invariant context

In this example, `Model` is a sub-type of `Instance` and can be used where `Instance` is required.

The same is not true for a table field because when using table `b`, `b.x` can be assigned an `Instance` that is not a `Model`. When `b` is an alias to `a`, this assignment is not compatible with `a`'s type annotation.

* * *

Some other light changes to type inference include:

* `string.match` and `string.gmatch` are now defined to return optional values as a match is not guaranteed at runtime
* Added an error when unrelated types are compared with `==`/`~=`
* Fixed issues where a variable after `typeof(x) == 'table'` could not have been used as a table

## Thanks

A very special thanks to all of our open source contributors:

* niansa/tuxifan
* B. Gibbons
* Epix
* Harold Cindy
* Qualadore
Luau Recap: November 2022
While the team is busy working on some bigger things for the future, we have made some small improvements this month.

Cross-posted to the Roblox Developer Forum.

## Analysis improvements

We have improved tagged union type refinements to only include unhandled type cases in the `else` branch of the `if` statement:

    type Ok<T> = { tag: "ok", value: T }
    type Err = { tag: "error", msg: string }
    type Result<T> = Ok<T> | Err

    function unwrap<T>(r: Result<T>): T?
        if r.tag == "ok" then
            return r.value
        else
            -- Luau now understands that 'r' here can only be the 'Err' part
            print(r.msg)
            return nil
        end
    end

For better inference, we updated the definition of `Enum.SomeType:GetEnumItems()` to return `{Enum.SomeType}` instead of the common `{EnumItem}`, and the return type of the `next` function now includes the possibility of the key being `nil`.

Finally, if you use the `and` operator on non-boolean values, the `boolean` type will no longer be added by the type inference:

    local function f1(a: number?)
        -- 'x' is still a 'number?' and doesn't become 'boolean | number'
        local x = a and 5
    end

## Error message improvements

We now give an error when built-in types are being redefined:

    type string = number -- Now an error: Redefinition of type 'string'

We were also missing a parse error for the case where you forgot your default type pack parameter value. We accepted the following code silently without raising an issue:

    type Foo<T... = > = nil -- Now an error: Expected type, got '>'

The error about function argument count mismatch no longer points at the last argument, but instead at the function in question. So, instead of:

    function myfunction(a: number, b:number) end
    myfunction(123)
               ~~~

We now highlight this:

    function myfunction(a: number, b:number) end
    myfunction(123)
    ~~~~~~~~~~

If you iterate over a table value that could also be `nil`, you get a better explanation in the error message:

    local function f(t: {number}?)
        for i,v in t do -- Value of type {number}? could be nil
            --...
        end
    end

Previously it was `Cannot call non-function {number}?` which was confusing.

And speaking of confusing, some of you might have seen an error like `Type 'string' could not be converted into 'string'`.

This was caused by Luau having both a primitive type `string` and a table type coming from the `string` library. Since the way you can get the type of the `string` library table is by using `typeof(string)`, the updated error message will mirror that and report `Type 'string' could not be converted into 'typeof(string)'`.

Parsing now recovers with a more precise error message if you forget a comma in a table constructor spanning multiple lines:

    local t = {
        a = 1
        b = 2 -- Expected ',' after table constructor element
        c = 3 -- Expected ',' after table constructor element
    }
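For the "Value of type {number}? could be nil" error mentioned earlier in this post, one common way to satisfy the type checker is to fall back to an empty table before iterating. The sketch below is only one possible approach, and the function name `sum` is an assumption made for this example.

```luau
-- Hypothetical example: sums an optional array of numbers.
local function sum(t: { number }?): number
    local total = 0
    -- `t or {}` evaluates to an empty table when `t` is nil,
    -- so the loop always iterates over a real table.
    for _, v in t or {} do
        total += v
    end
    return total
end

print(sum({ 1, 2, 3 }))  -- 6
print(sum(nil))          -- 0
```

An explicit `if t then ... end` guard around the loop works as well and avoids creating a temporary table.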
Luau origins and evolution
At the heart of Roblox technology lies Luau, a scripting language derived from Lua 5.1 that is being developed by an internal team of programming language experts with the help of open source contributors. It powers all user-generated content on Roblox, providing access to a very rich set of APIs that allows manipulation of objects in the 3D world, backend API access, UI interaction and more. Hundreds of thousands of developers write code in Luau every month, with top experiences using hundreds of thousands of lines of code, adding up to hundreds of millions of lines of code across the platform. For many of them, it is the first programming language they learn, and one they spend the majority of their time programming in. Using a set of extended APIs developers also customize their workflows by writing plugins to Roblox Studio, where they work on their experiences, using an extended API surface to interact with all aspects of the editor.

It also powers a lot of application code that Roblox engineers are writing: Universal App, the gateway to the worlds of Roblox that is used by tens of millions of people every day, has 95% of its functionality implemented in Luau, and Roblox Studio has a lot of builtin critical functionality such as part and terrain editors, marketplace browser, avatar and animation editors, material manager and more, implemented in Luau as a plugin, mostly using the same APIs that developers have access to. Every week, updates to this internal codebase, which is now over 2 million lines large, are shipped to all Roblox users.

In addition to Roblox use cases, Luau is also open-source and is seeing increased adoption in other projects and applications.

But why did we use Lua in the first place, and why did we decide to pursue building a new language on top of it?

# Early beginnings

Around 2006, when a very early version of the Roblox platform was developed, the question of user generated behaviors emerged.
Before that, users were able to build non-interactive content on Roblox, and the only form of interaction was physics simulation. While this provided rich emergent behavior, it was hard to build gameplay on top of this: for example, to build a Capture The Flag game, you need to handle collision between players and flags spread throughout the map with a bit of logic that dictates how to adjust team points and when to remove or recreate the objects.

After an early and brief misstep when we decided to add a few gameplay objects to the core definition of Roblox worlds (some developers may recognize FlagStand as a class name…), the Roblox co-founder Erik Cassel realized that an approach like this fundamentally limits the power of user generated content. It's not enough to give creators the basic blocks on top of which to build their creations, it's critical to expose the power of a full Turing-complete programming language. Without this, the expressive capability and the reach of the platform would have been restricted far too much.

But which programming language to choose? This is where Lua, which was, and still is, one of the dominant programming languages used in video games, comes in. In addition to its simplicity, which made the language easy to learn and get productive in, Lua was the fastest scripting language compared to popular alternatives like Python or JavaScript at the time [1]. It was designed to be embedded, which meant an easy ability to expose APIs from the host application to the scripts as well as a high degree of execution control from the host. It also implemented coroutines, a very powerful concurrency primitive that made it easy and intuitive to script behaviors for independent actors in a game using linear control flow.
Instead of having a large standard library, the expectation was that the embedding application would define a set of APIs that that application needed, as well as establish policies of running the code - which gave us a lot of freedom in how to structure the APIs and when the scripts would get triggered during the simulation of a single frame.

# Power of simplicity

Lua is a simple language. What does simplicity mean for us?

Being a simple language means having a small set of features. Lua has all the fundamental features but doesn't have a lot of syntax sugar - this means the language is easier to teach and learn, and you rarely run into code that's difficult to understand syntactically because it uses an unfamiliar construct. Of course, this also means that some programs in Lua are longer than equivalent programs in languages that have more dedicated constructs to solve specific problems, such as list comprehensions in Python.

Being a simple language means having a minimal set of rules for every feature. Lua does deviate from this in certain respects (which is to say, the language could have been even simpler!), but notably for a dynamic language the behavior of fundamental operators is generally easy to explain and unsurprising - for example, two values in Lua are equal iff they have the same type and the same value, so `0 == "0"` is `false`; as another example, `for` loops introduce unique variable bindings on every iteration, so capturing the iteration variable in a closure produces unique values. These decisions lead to a more concise and efficient implementation and eliminate a class of bugs in programs.

Being a simple language means having a small implementation. This may be immaterial to people writing code in the language, but it leads to an implementation that can be of higher quality; simpler implementations can also be easier to optimize for memory or performance, and are easier to build upon.
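The two rules used as examples above, type-sensitive equality and per-iteration loop bindings, can be checked with a short sketch. The variable names are illustrative, and the `any` annotations are there only so that a type checker does not object to comparing a number with a string.

```luau
-- Equality never coerces between types: a number is not equal to a string.
local zero: any = 0
local zeroText: any = "0"
print(zero == zeroText)  -- false

-- Each loop iteration gets its own `i`, so every closure
-- remembers the value from the iteration that created it.
local getters = {}
for i = 1, 3 do
    getters[i] = function()
        return i
    end
end

print(getters[1](), getters[2](), getters[3]())  -- 1  2  3
```

In a language where the loop variable is shared across iterations, all three closures would observe the same final value, which is the class of bugs the per-iteration binding rule avoids.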
Developers on the Roblox platform have very diverse programming backgrounds. Some are writing their first line of code in Roblox Studio, while others have computer science degrees and experience working in multiple different programming languages. While it’s always possible to support two different programming languages that target different segments of the audience, that fragments the ecosystem and makes the programming story less consistent (impacting documentation, tutorials, code reuse and the ability for community members to help each other write code, and presenting challenges with interaction between different languages in the same experience and more). A better outcome is one where a single language can serve both audiences - this requires a language that strikes a balance between simplicity and generality, and while Lua isn’t perfect here, it’s great as a foundation for a language like this².

In many ways, Lua is simultaneously simple and pragmatic: many parts of the language are difficult to make much better without a lot of added complexity, but at the same time it requires little in the way of extra functionality to be able to solve problems efficiently. That said, no language is perfect, and within several areas of Lua we felt that the tradeoffs weren’t quite right for our use case.

# Respectful evolution

In 2019, we decided to build Luau - a language derived from Lua and compatible with Lua 5.1, which is the version we’ve been using all these years. At the time we evaluated other routes, but ultimately settled on this as the best long-term option. On one hand, we loved a lot of things about Lua, both design-wise and implementation-wise: while there were some decisions we felt were suboptimal, by and large it was an almost perfect foundation for what we’ve set out to achieve.
On the other hand, we’ve been running into the limitations of Lua on large code bases in absence of type checking, performance was good but not great, and some missing features would have been great to have. Some of the things we’ve been missing have been added in later versions of Lua, yet we were still using Lua 5.1. While we would have loved to use a later version of the language standard, Lua 5.x releases are not backwards compatible, and some releases remove support for features that are in wide use at Roblox. For Roblox, backwards compatibility is an essential feature of the platform - while we don’t have a guarantee that content created 10 years ago still works, to the extent that we can achieve that without restricting the platform evolution too much, we try. What we’ve realized is that Lua is a great foundation for a perfect language that we can build for Roblox. We would maintain backwards compatibility with Lua 5.1 but evolve the language from there; sometimes this means taking later features from Lua that don’t conflict with the existing language or our design values, sometimes this means innovating beyond what Lua has done. Crucially, we must maintain the balance between simplicity and power - we still value simplicity, we still need to avoid a feature explosion to ensure that the features compose and are of high quality, and we still need the language to be a good fit for beginners. One of the largest limitations that we’ve seen is the lack of type checking making it easy to make mistakes in large code bases, as such support for type checking was a requirement for Luau. However, it’s important that the type checker is mostly transparent to the developers who don’t want to invest the time to learn it - anything else would change the learning curve too much for the language to be suitable for beginners. 
As such, we’ve been investing in gradual typing, and our type checker is learning to strike a balance between inferring useful types for completely untyped programs (which, among other things, greatly enhances the editing experience through type-aware autocomplete), and avoiding false positive diagnostics that can be confusing and distracting.

While we did need to introduce extra syntax to the language - most notably, to support optional type annotations - it was important for us to maintain the cohesion of the overall syntax. We aren’t seeking to make a new language with a syntax alien to Lua programmers - Luau programs are still recognizably Lua, and to the extent possible we try to avoid new syntactic features. In a sense, we still want the syntax, semantics, and the runtime to be simple and minimal - but at the same time we have important problems to solve with respect to ergonomics, robustness and performance of the language, and solving some of them requires having slightly more complex syntax, semantics, or implementation. So in finding ways to evolve Luau, we strive to design features that feel like they would be at home in Lua.

At the same time, we’ve adopted a more open evolution process - the language development is driven through RFCs, which are designs open to the public that anyone can contribute to. This is in contrast with Lua, which has a very closed development process, and is one of the reasons why it would have been difficult for us to keep using Lua, as we wouldn’t get a say in its development. Even so, to ensure the design criteria are met, it’s important that the Luau development team at Roblox maintains a final say over the design and implementation of the language³, while taking the community’s proposals and input into consideration.
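As a small illustration of what "optional type annotations" means in practice, the sketch below shows the same function written without and with annotations. The function names are hypothetical and chosen only for this example.

```luau
-- Untyped: this is also valid Lua 5.1; the checker infers what it can.
local function area(width, height)
    return width * height
end

-- Annotated: same runtime behavior, but the type checker can now flag
-- a call such as typedArea("2", 3) before the script ever runs.
local function typedArea(width: number, height: number): number
    return width * height
end

print(area(2, 3), typedArea(2, 3)) --> 6 6
```

Both versions can coexist in one script, which is the point of gradual typing: annotations are added where they pay off and omitted elsewhere.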
# Importance of co-design

The Luau language is developed in concert with the language compiler, runtime, type checker and other analysis tools, autocomplete engine and other tooling, and that development is guided by the vast volume of existing Luau code, both internal and external. This is one of the key principles behind our evolution philosophy - no layer is developed in isolation, and instead concerns at every level inform all other aspects of the language design and implementation.

This means that when designing language features, we make sure that they can be implemented efficiently, type checked properly, supported well in editing and analysis tools, and have a positive impact on the code internal and external engineers write. When we find issues in any component, we can always ask what changes to other components or even to the language design would make for a better overall solution. This avoids some classes of design problems: for example, we won’t specify a language feature that has a prohibitively high implementation cost, as it violates our simplicity criteria, or that is impractical to implement efficiently, as that would create a performance hazard.

This also means that when implementing various components of the language we cross-check the concerns and applicability of these across the entire stack - for example, we’ve reworked our auto-complete system to use the same type inference engine that the type checking / analysis tools use, which had immense benefits for the experience of editing code, but also applied significant back pressure on the type inference itself, forcing us to improve it substantially and fix a lot of corner cases that would otherwise have lingered unnoticed.

Whenever we develop features or optimizations, improve our analysis engine or enhance the standard libraries, we also heavily rely on code written in Luau to validate our hypotheses.
When working on new features we find motivation in the real problems that we see our developers face. For example, we implemented the new ternary operator after seeing a large set of cases where Lua’s existing `a and b or c` pattern was error-prone for boolean values, which made it easy to accidentally introduce a mistake that was hard to identify automatically. All optimizations and new analysis features are validated on our internal 2M LOC codebase before being added to Luau, which allows us to quickly get initial validation of ideas, or invalidate some approaches as infeasible / unhelpful.

In addition to that, while we don’t have direct access to community-developed source code for privacy reasons, we can run experiments and collect telemetry⁴, which also helps us make decisions regarding backwards compatibility. Due to Hyrum’s law, technically any change in the language or libraries, no matter how small, would be backwards incompatible - so we instead strike a balance between strict backwards compatibility⁵ and pragmatic compatibility concerns. For example, later versions of Lua make some library functions like `table.insert`/`table.remove` more strict with how they handle out of range indices. We have evaluated this change for compatibility by collecting telemetry on the use of out of range indices in these functions on the Roblox platform and concluded that applying the stricter checking would break existing programs, and instead had to slightly adjust the rules for out of range behavior in ways that were benign for existing code but prevented catastrophic performance degradation for large out of range indices.
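To make the `a and b or c` pitfall mentioned above concrete, here is a minimal sketch comparing it with Luau's `if ... then ... else ...` expression. The helper names are hypothetical; the behavior shown follows directly from how `and`/`or` treat `false`.

```luau
-- The classic Lua idiom: breaks when the "true" branch value is falsy.
local function pickWithAndOr(cond: boolean, a: any, b: any): any
    return cond and a or b
end

-- Luau's if-then-else expression always selects by the condition alone.
local function pickWithIfElse(cond: boolean, a: any, b: any): any
    return if cond then a else b
end

print(pickWithAndOr(true, false, "fallback"))  --> fallback
print(pickWithIfElse(true, false, "fallback")) --> false
```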
Because we couldn’t afford to introduce new runtime errors in `table.insert`/`table.remove`, we also added a set of linting rules to our analysis engine to flag potential misuse of these functions before the code ever gets to run - these diagnostics are informational and as such don’t affect backwards compatibility, but do help prevent mistakes.

There are also cases where this co-design approach prevents the introduction of features that can lead to easy misuse, which can be difficult to see in the design of the feature itself, but becomes more apparent when you consider features in the context of the entire ecosystem. This is a good thing - it means co-design acts as a forcing function on the language simplicity and makes it easier to flag potential bad interactions between different language features, or language features and tooling, or language features and existing programming patterns that are in widespread use in real-world code. By making sure that all features are validated for their impact across the stack and on code written in Luau, we ultimately get a better, simpler and more cohesive language.

# Efficient execution

One of the critical goals in front of Luau is efficiency, both from the performance and memory perspective. There are only so many milliseconds in a frame, and we simultaneously see the need to increase the scale and complexity of simulated experiences, which requires more memory and computation, as well as the need to fit more comfortably into smaller performance and memory budgets for a better experience on smaller devices. In fact, one of the motivations for Luau in 2019 has been improved performance, as we saw many opportunities to go beyond Lua with a redesigned implementation.

Crucially, our performance needs are somewhat unique and require somewhat unique solutions. We need Luau to run on many platforms where native code generation is either prohibited by the platform vendor or impractical due to tight memory constraints.
As such, in terms of execution performance it’s critical that we have a very fast interpreter⁶. However, we have freedom in terms of the high level design of the entire stack - for example, clients never see the source code of the scripts as all compilation to bytecode happens on the server; this gives us an opportunity to perform more involved and expensive optimizations during that process as well as have the smallest possible startup time on the client without complex pre-parse steps. Notably, our bytecode compiler performs a series of high level optimizations, including function inlining and loop unrolling, that in other dynamic languages are often left to the just-in-time compiler.

Another area where performance is critical is garbage collection. Garbage collection is crucial for the language’s simplicity as it makes memory management easier to reason about, but it does require a substantial amount of implementation effort to keep it efficient. For Roblox and for any other game engine or interactive simulation, latency is critical and so our collector is heavily optimized for that - to the extent possible collection is incremental and stop-the-world pauses are very brief. Another part of the performance story here however is the language and data structure design - by making sure that core data types are efficient in how they are laid out in memory we reduce the amount of work the garbage collector does to trace the heap, and, as another example of co-design, we try to make sure that language features are conscious of the impact they have on memory and garbage collection efficiency.

However, from a whole-platform standpoint there are a lot of performance aspects that go beyond single-threaded execution. This is an active area of research and development for the team, as to really leverage the hardware the code is running on we need to think about SIMD, hardware thread utilization as well as running code in a cluster of nodes.
These considerations inform current and future development of the runtime and the language (for example, our runtime now supports efficient operations on short SIMD vectors even in interpreted mode, and the VM is fairly lightweight to instantiate which makes running many VMs per core practical, with message passing or access to shared Roblox data model used to make gameplay features feasible to implement), but we’re definitely in the early days here - our first implementation of parallel script execution in Roblox just shipped earlier this year. This is likely the area where a lot of future innovations will happen as well.

# Future

We’re very happy with the success of Luau - in several years we’ve established consistent processes for evolving the language and so far we found a good balance between simplicity, ease of use, performance and robustness of the language, its implementation and the tooling surrounding it. The language keeps continuously evolving but at a pace that is easy to stay on top of - in 2022 we shipped a few syntactic extensions for type annotations but no changes to the syntax of the language outside of types, and only one major semantic change to the for loop iteration that actually made the language easier to use by avoiding the need to specify the table traversal style via `pairs`/`ipairs`. We try to make sure that the features are general and provide enough extensibility so that libraries can be built on top of the language to make it easier to write code, while also making it practical to use the language without complex supporting frameworks.

There’s still a lot of ground to cover, and we’ll be working on Luau for years to come.
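The `for` loop change mentioned above (iterating a table directly, without `pairs`/`ipairs`) looks like this in practice. This is a small sketch with made-up data; both loops visit the same entries.

```luau
local scores = { alice = 3, bob = 5 }

-- Lua 5.1 style: the traversal function is spelled out.
local viaPairs = 0
for _, score in pairs(scores) do
    viaPairs += score
end

-- Luau generalized iteration: the table is iterated directly.
local viaGeneralized = 0
for _, score in scores do
    viaGeneralized += score
end

print(viaPairs, viaGeneralized) --> 8 8
```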
We’re in the process of building the next version of our type inference / checking engine to make sure that all users of the language regardless of their expertise benefit from it, we’ve started investing in native code generation as we’re reaching the limits of interpreted performance (although some exciting opportunities for compiler optimization are still on the horizon), and there’s still a lot of hard design and implementation work ahead of us for some important language features and standard libraries. And as mentioned, our execution model will likely see a lot of innovation as we push the boundaries of hardware utilization across cores and nodes.

Overall, Luau is like an iceberg - the surface is simple to learn and use, but it hides the tremendous amount of careful design, engineering and attention to detail, and we plan to continue to invest in it while trying to keep the outer surface comparatively small. We’re excited to see how far we can take it!

1. High-performance JavaScript engines didn’t exist at the time! LuaJIT was around the corner and redefined the performance expectations of dynamic languages.
2. In fact, scaling to large teams of expert programmers is one of the core motivations behind our creating Luau, while a requirement to still be suitable for beginner programmers guides our evolution direction.
3. This would have been difficult to drive in any existing large established language like JavaScript or Python.
4. This is limited to Roblox platform and doesn’t exist in open-source releases.
5. Which we do follow in some areas, such as syntactic compatibility - all existing programs that parse must continue to parse the same way as the language evolves.
6. Some design decisions and implementation techniques are documented on our performance page.
luau.org
December 22, 2024 at 2:50 AM
Luau Recap: September & October 2022
Luau is our new language that you can read more about at https://luau-lang.org. Cross-posted to the Roblox Developer Forum.

## Semantic subtyping

One of the most important goals for Luau is to avoid _false positives_, that is cases where Script Analysis reports a type error, but in fact the code is correct. This is very frustrating, especially for beginners. Spending time chasing down a gnarly type error only to discover that it was the type system that’s wrong is nobody’s idea of fun!

We are pleased to announce that a major component of minimizing false positives has landed, _semantic subtyping_, which removes a class of false positives caused by failures of subtyping. For example, in the program

    local x : CFrame = CFrame.new()
    local y : Vector3 | CFrame
    if (math.random()) then
      y = CFrame.new()
    else
      y = Vector3.new()
    end
    local z : Vector3 | CFrame = x * y -- Type Error!

an error is reported, even though there is no problem at runtime. This is because `CFrame`’s multiplication has two overloads:

    ((CFrame, CFrame) -> CFrame)
      & ((CFrame, Vector3) -> Vector3)

The current syntax-driven algorithm for subtyping is not sophisticated enough to realize that this is a subtype of the desired type:

    (CFrame, Vector3 | CFrame) -> (Vector3 | CFrame)

Our new algorithm is driven by the semantics of subtyping, not the syntax of types, and eliminates this class of false positives.

If you want to know more about semantic subtyping in Luau, check out our technical blog post on the subject.

## Other analysis improvements

* Improve stringification of function types.
* Improve parse error warnings in the case of missing tokens after a comma.
* Improve typechecking of expressions involving variadics such as `{ ... }`.
* Make sure modules don’t return unbound generic types.
* Improve cycle detection in stringifying types.
* Improve type inference of combinations of intersections and generic functions.
* Improve typechecking when calling a function which returns a variadic e.g. `() -> (number...)`.
* Improve typechecking when passing a function expression as a parameter to a function.
* Improve error reporting locations.
* Remove some sources of memory corruption and crashes.

## Other runtime and debugger improvements

* Improve performance of accessing debug info.
* Improve performance of `getmetatable` and `setmetatable`.
* Remove a source of freezes in the debugger.
* Improve GC accuracy and performance.

## Thanks

Thanks for all the contributions!

* AllanJeremy
* JohnnyMorganz
* jujhar16
* petrihakkinen
luau.org
December 22, 2024 at 2:49 AM
Semantic Subtyping in Luau
Luau is the first programming language to put the power of semantic subtyping in the hands of millions of creators.

## Minimizing false positives

One of the issues with type error reporting in tools like the Script Analysis widget in Roblox Studio is _false positives_. These are warnings that are artifacts of the analysis, and don’t correspond to errors which can occur at runtime. For example, the program

    local x = CFrame.new()
    local y
    if (math.random()) then
      y = CFrame.new()
    else
      y = Vector3.new()
    end
    local z = x * y

reports a type error which cannot happen at runtime, since `CFrame` supports multiplication by both `Vector3` and `CFrame`. (Its type is `((CFrame, CFrame) -> CFrame) & ((CFrame, Vector3) -> Vector3)`.)

False positives are especially poor for onboarding new users. If a type-curious creator switches on typechecking and is immediately faced with a wall of spurious red squiggles, there is a strong incentive to immediately switch it off again.

Inaccuracies in type errors are inevitable, since it is impossible to decide ahead of time whether a runtime error will be triggered. Type system designers have to choose whether to live with false positives or false negatives. In Luau this is determined by the mode: `strict` mode errs on the side of false positives, and `nonstrict` mode errs on the side of false negatives.

While inaccuracies are inevitable, we try to remove them whenever possible, since they result in spurious errors, and imprecision in type-driven tooling like autocomplete or API documentation.

## Subtyping as a source of false positives

One of the sources of false positives in Luau (and many other similar languages like TypeScript or Flow) is _subtyping_. Subtyping is used whenever a variable is initialized or assigned to, and whenever a function is called: the type system checks that the type of the expression is a subtype of the type of the variable.
For example, if we add types to the above program

    local x : CFrame = CFrame.new()
    local y : Vector3 | CFrame
    if (math.random()) then
      y = CFrame.new()
    else
      y = Vector3.new()
    end
    local z : Vector3 | CFrame = x * y

then the type system checks that the type of `CFrame` multiplication is a subtype of `(CFrame, Vector3 | CFrame) -> (Vector3 | CFrame)`.

Subtyping is a very useful feature, and it supports rich type constructs like type union (`T | U`) and intersection (`T & U`). For example, `number?` is implemented as a union type `(number | nil)`, inhabited by values that are either numbers or `nil`.

Unfortunately, the interaction of subtyping with intersection and union types can have odd results. A simple (but rather artificial) case in older Luau was:

    local x : (number?) & (string?) = nil
    local y : nil = nil
    y = x -- Type '(number?) & (string?)' could not be converted into 'nil'
    x = y

This error is caused by a failure of subtyping: the old subtyping algorithm reports that `(number?) & (string?)` is not a subtype of `nil`. This is a false positive, since `number & string` is uninhabited, so the only possible inhabitant of `(number?) & (string?)` is `nil`.

This is an artificial example, but creators have raised real issues caused by this problem, for example https://devforum.roblox.com/t/luau-recap-july-2021/1382101/5. Currently, these issues mostly affect creators making use of sophisticated type system features, but as we make type inference more accurate, union and intersection types will become more common, even in code with no type annotations.

This class of false positives no longer occurs in Luau, as we have moved from our old approach of _syntactic subtyping_ to an alternative called _semantic subtyping_.

## Syntactic subtyping

AKA “what we did before.”

Syntactic subtyping is a syntax-directed recursive algorithm.
The interesting cases to deal with intersection and union types are:

* Reflexivity: `T` is a subtype of `T`
* Intersection L: `(T₁ & … & Tⱼ)` is a subtype of `U` whenever some of the `Tᵢ` are subtypes of `U`
* Union L: `(T₁ | … | Tⱼ)` is a subtype of `U` whenever all of the `Tᵢ` are subtypes of `U`
* Intersection R: `T` is a subtype of `(U₁ & … & Uⱼ)` whenever `T` is a subtype of all of the `Uᵢ`
* Union R: `T` is a subtype of `(U₁ | … | Uⱼ)` whenever `T` is a subtype of some of the `Uᵢ`.

For example:

* By Reflexivity: `nil` is a subtype of `nil`
* so by Union R: `nil` is a subtype of `number?`
* and: `nil` is a subtype of `string?`
* so by Intersection R: `nil` is a subtype of `(number?) & (string?)`.

Yay! Unfortunately, using these rules:

* `number` isn’t a subtype of `nil`
* so by Union L: `(number?)` isn’t a subtype of `nil`
* and: `string` isn’t a subtype of `nil`
* so by Union L: `(string?)` isn’t a subtype of `nil`
* so by Intersection L: `(number?) & (string?)` isn’t a subtype of `nil`.

This is typical of syntactic subtyping: when it returns a “yes” result, it is correct, but when it returns a “no” result, it might be wrong. The algorithm is a _conservative approximation_, and since a “no” result can lead to type errors, this is a source of false positives.

## Semantic subtyping

AKA “what we do now.”

Rather than thinking of subtyping as being syntax-directed, we first consider its semantics, and later return to how the semantics is implemented. For this, we adopt semantic subtyping:

* The semantics of a type is a set of values.
* Intersection types are thought of as intersections of sets.
* Union types are thought of as unions of sets.
* Subtyping is thought of as set inclusion.

For example:

Type | Semantics
---|---
`number` | { 1, 2, 3, … }
`string` | { "foo", "bar", … }
`nil` | { nil }
`number?` | { nil, 1, 2, 3, … }
`string?` | { nil, "foo", "bar", … }
`(number?) & (string?)` | { nil, 1, 2, 3, … } ∩ { nil, "foo", "bar", … } = { nil }

and since subtypes are interpreted as set inclusions:

Subtype | Supertype | Because
---|---|---
`nil` | `number?` | { nil } ⊆ { nil, 1, 2, 3, … }
`nil` | `string?` | { nil } ⊆ { nil, "foo", "bar", … }
`nil` | `(number?) & (string?)` | { nil } ⊆ { nil }
`(number?) & (string?)` | `nil` | { nil } ⊆ { nil }

So according to semantic subtyping, `(number?) & (string?)` is equivalent to `nil`, but syntactic subtyping only supports one direction.

This is all fine and good, but if we want to use semantic subtyping in tools, we need an algorithm, and it turns out checking semantic subtyping is non-trivial.

## Semantic subtyping is hard

NP-hard to be precise.

We can reduce graph coloring to semantic subtyping by coding up a graph as a Luau type such that checking subtyping on types has the same result as checking for the impossibility of coloring the graph.

For example, coloring a three-node, two color graph can be done using types:

    type Red = "red"
    type Blue = "blue"
    type Color = Red | Blue
    type Coloring = (Color) -> (Color) -> (Color) -> boolean
    type Uncolorable = (Color) -> (Color) -> (Color) -> false

Then a graph can be encoded as an overloaded function type with subtype `Uncolorable` and supertype `Coloring`, that is, as an overloaded function which returns `false` when a constraint is violated. Each overload encodes one constraint. For example a line has constraints saying that adjacent nodes cannot have the same color:

    type Line = Coloring
      & ((Red) -> (Red) -> (Color) -> false)
      & ((Blue) -> (Blue) -> (Color) -> false)
      & ((Color) -> (Red) -> (Red) -> false)
      & ((Color) -> (Blue) -> (Blue) -> false)

A triangle is similar, but the end points also cannot have the same color:

    type Triangle = Line
      & ((Red) -> (Color) -> (Red) -> false)
      & ((Blue) -> (Color) -> (Blue) -> false)

Now, `Triangle` is a subtype of `Uncolorable`, but `Line` is not, since the line can be 2-colored.
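The claims about the two graphs themselves (not the type-level encoding) are easy to sanity-check by brute force. The sketch below is only an illustration: `isColorable`, `line` and `triangle` are hypothetical names, and the search simply tries every assignment of two colors to three nodes.

```luau
local COLORS = { "red", "blue" }

-- edges: list of {i, j} pairs of 1-based node indices that must differ in color.
local function isColorable(nodeCount: number, edges: { { number } }): boolean
    local assignment = {}

    local function try(node: number): boolean
        if node > nodeCount then
            -- Every node has a color: check all adjacency constraints.
            for _, edge in edges do
                if assignment[edge[1]] == assignment[edge[2]] then
                    return false
                end
            end
            return true
        end
        for _, color in COLORS do
            assignment[node] = color
            if try(node + 1) then
                return true
            end
        end
        return false
    end

    return try(1)
end

local line = { { 1, 2 }, { 2, 3 } }
local triangle = { { 1, 2 }, { 2, 3 }, { 1, 3 } }

print(isColorable(3, line))     --> true
print(isColorable(3, triangle)) --> false
```

The exhaustive search is exponential in the number of nodes, which is the same blow-up a subtype checker faces when such a graph is encoded in types.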
The graph-coloring encoding can be generalized to any finite graph with any finite number of colors, and so subtype checking is NP-hard.

We deal with this in two ways:

* we cache types to reduce memory footprint, and
* give up with a “Code Too Complex” error if the cache of types gets too large.

Hopefully this doesn’t come up in practice much. Experience with type systems like that of Standard ML, which is EXPTIME-complete, is good evidence that issues like this don’t arise in practice: you have to go out of your way to code up Turing Machine tapes as types.

## Type normalization

The algorithm used to decide semantic subtyping is _type normalization_. Rather than being directed by syntax, we first rewrite types to be normalized, then check subtyping on normalized types.

A normalized type is a union of:

* a normalized nil type (either `never` or `nil`)
* a normalized number type (either `never` or `number`)
* a normalized boolean type (either `never` or `true` or `false` or `boolean`)
* a normalized function type (either `never` or an intersection of function types)

etc

Once types are normalized, it is straightforward to check semantic subtyping.

Every type can be normalized (sigh, with some technical restrictions around generic type packs). The important steps are:

* removing intersections of mismatched primitives, e.g. `number & boolean` is replaced by `never`, and
* removing unions of functions, e.g. `((number?) -> number) | ((string?) -> string)` is replaced by `(nil) -> (number | string)`.

For example, normalizing `(number?) & (string?)` removes `number & string`, so all that is left is `nil`.

Our first attempt at implementing type normalization applied it liberally, but this resulted in dreadful performance (complex code went from typechecking in less than a minute to running overnight).
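The primitive part of this idea can be sketched with a toy model in which a normalized type is just the set of primitive components it contains. This is an assumption-laden illustration, not Luau's actual normalizer: `Norm`, `prim`, `union`, `intersect` and `isSubtype` are hypothetical names, and function types, singletons and generics are left out entirely.

```luau
-- A toy "normalized type": the set of primitive components it contains.
-- An empty set plays the role of `never`.
type Norm = { [string]: boolean }

local function prim(name: string): Norm
    return { [name] = true }
end

local function union(a: Norm, b: Norm): Norm
    local result: Norm = {}
    for name in a do result[name] = true end
    for name in b do result[name] = true end
    return result
end

-- Mismatched primitives (e.g. number & string) simply drop out.
local function intersect(a: Norm, b: Norm): Norm
    local result: Norm = {}
    for name in a do
        if b[name] then result[name] = true end
    end
    return result
end

-- Subtyping on normalized types is component-wise set inclusion.
local function isSubtype(a: Norm, b: Norm): boolean
    for name in a do
        if not b[name] then return false end
    end
    return true
end

local NIL = prim("nil")
local optNumber = union(prim("number"), NIL) -- number?
local optString = union(prim("string"), NIL) -- string?
local both = intersect(optNumber, optString) -- (number?) & (string?)

print(isSubtype(both, NIL), isSubtype(NIL, both)) --> true true
```

In this model `(number?) & (string?)` and `nil` are subtypes of each other, which is the equivalence that the syntactic rules could only establish in one direction.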
The reason for the slowdown is annoyingly simple: there is an optimization in Luau’s subtyping algorithm to handle reflexivity (`T` is a subtype of `T`) that performs a cheap pointer equality check. Type normalization can convert pointer-identical types into semantically-equivalent (but not pointer-identical) types, which significantly degrades performance.

Because of these performance issues, we still use syntactic subtyping as our first check for subtyping, and only perform type normalization if the syntactic algorithm fails. This is sound, because syntactic subtyping is a conservative approximation to semantic subtyping.

## Pragmatic semantic subtyping

Off-the-shelf semantic subtyping is slightly different from what is implemented in Luau, because it requires models to be _set-theoretic_, which requires that inhabitants of function types “act like functions.” There are two reasons why we drop this requirement.

**Firstly**, we normalize function types to an intersection of functions, for example a horrible mess of unions and intersections of functions:

    ((number?) -> number?) | (((number) -> number) & ((string?) -> string?))

normalizes to an overloaded function:

    ((number) -> number?) & ((nil) -> (number | string)?)

Set-theoretic semantic subtyping does not support this normalization, and instead normalizes functions to _disjunctive normal form_ (unions of intersections of functions). We do not do this for ergonomic reasons: overloaded functions are idiomatic in Luau, but DNF is not, and we do not want to present users with such non-idiomatic types.

Our normalization relies on rewriting away unions of function types:

    ((A) -> B) | ((C) -> D)   →   (A & C) -> (B | D)

This normalization is sound in our model, but not in set-theoretic models.

**Secondly**, in Luau, the type of a function application `f(x)` is `B` if `f` has type `(A) -> B` and `x` has type `A`. Unexpectedly, this is not always true in set-theoretic models, due to uninhabited types.
In set-theoretic models, if `x` has type `never` then `f(x)` has type `never`. We do not want to burden users with the idea that function application has a special corner case, especially since that corner case can only arise in dead code.

In set-theoretic models, `(never) -> A` is a subtype of `(never) -> B`, no matter what `A` and `B` are. This is not true in Luau.

For these two reasons (which are largely about ergonomics rather than anything technical) we drop the set-theoretic requirement, and use _pragmatic_ semantic subtyping.

## Negation types

The other difference between Luau’s type system and off-the-shelf semantic subtyping is that Luau does not support all negated types.

The common case for wanting negated types is in typechecking conditionals:

    -- initially x has type T
    if (type(x) == "string") then
      -- in this branch x has type T & string
    else
      -- in this branch x has type T & ~string
    end

This uses a negated type `~string` inhabited by values that are not strings.

In Luau, we only allow this kind of typing refinement on _test types_ like `string`, `function`, `Part` and so on, and _not_ on structural types like `(A) -> B`, which avoids the common case of general negated types.

## Prototyping and verification

During the design of Luau’s semantic subtyping algorithm, there were changes made (for example initially we thought we were going to be able to use set-theoretic subtyping). During this time of rapid change, it was important to be able to iterate quickly, so we initially implemented a prototype rather than jumping straight to a production implementation.

Validating the prototype was important, since subtyping algorithms can have unexpected corner cases. For this reason, we adopted Agda as the prototyping language. As well as supporting unit testing, Agda supports mechanized verification, so we are confident in the design.
The prototype does not implement all of Luau, just the functional subset, but this was enough to discover subtle feature interactions that would probably have surfaced as difficult-to-fix bugs in production.

Prototyping is not perfect, for example the main issues that we hit in production were about performance and the C++ standard library, which are never going to be caught by a prototype. But the production implementation was otherwise fairly straightforward (or at least as straightforward as a 3kLOC change can be).

## Next steps

Semantic subtyping has removed one source of false positives, but we still have others to track down:

* overloaded function applications and operators,
* property access on expressions of complex type,
* read-only properties of tables,
* variables that change type over time (aka typestates),
* …

The quest to remove spurious red squiggles continues!

## Acknowledgments

Thanks to Giuseppe Castagna and Ben Greenman for helpful comments on drafts of this post.

## Further reading

If you want to find out more about Luau and semantic subtyping, you might want to check out…

* Luau. https://luau-lang.org/
* Lily Brown, Andy Friesen and Alan Jeffrey, _Goals of the Luau Type System_, Human Aspects of Types and Reasoning Assistants (HATRA), 2021. https://arxiv.org/abs/2109.11397
* Luau Typechecker Prototype. https://github.com/luau-lang/agda-typeck
* Agda. https://agda.readthedocs.io/
* Andrew M. Kent, _Down and Dirty with Semantic Set-theoretic Types_, 2021. https://pnwamk.github.io/sst-tutorial/
* Giuseppe Castagna, _Covariance and Contravariance_, Logical Methods in Computer Science 16(1), 2022. https://arxiv.org/abs/1809.01427
* Giuseppe Castagna and Alain Frisch, _A gentle introduction to semantic subtyping_, Proc. Principles and practice of declarative programming (PPDP), pp 198–208, 2005.
https://doi.org/10.1145/1069774.1069793
* Giuseppe Castagna, Mickaël Laurent, Kim Nguyễn, Matthew Lutze, _On Type-Cases, Union Elimination, and Occurrence Typing_, Principles of Programming Languages (POPL), 2022. https://doi.org/10.1145/3498674
* Giuseppe Castagna, _Programming with union, intersection, and negation types_, 2022. https://arxiv.org/abs/2111.03354
* Sam Tobin-Hochstadt and Matthias Felleisen, _Logical types for untyped languages_, International Conference on Functional Programming (ICFP), 2010. https://doi.org/10.1145/1863543.1863561
* José Valim, _My Future with Elixir: set-theoretic types_, 2022. https://elixir-lang.org/blog/2022/10/05/my-future-with-elixir-set-theoretic-types/

Some other languages which support semantic subtyping…

* ℂDuce https://www.cduce.org/
* Ballerina https://ballerina.io
* Elixir https://elixir-lang.org/
* eqWAlizer https://github.com/WhatsApp/eqwalizer

And if you want to see the production code, it’s in the C++ definitions of tryUnifyNormalizedTypes and NormalizedType in the open source Luau repo.
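
For readers who would rather see the two-stage check from this post in miniature than read the C++, here is a toy sketch of "try the cheap syntactic check first, normalize only if it fails." It is not the production algorithm: the `ToyType` representation (a union written as a list of primitive type names) and every function name below are assumptions made up for illustration.

```luau
-- Toy model (hypothetical, not Luau's implementation): a type is a list of
-- primitive type names read as a union, e.g. { "number", "nil" } for `number?`.
type ToyType = { string }

-- Cheap first check: pointer equality, then position-by-position equality.
-- It is conservative: it may answer false for types that really are subtypes,
-- but it never answers true for types that are not.
local function syntacticSubtype(a: ToyType, b: ToyType): boolean
    if a == b then
        return true -- reflexivity via pointer equality
    end
    if #a ~= #b then
        return false
    end
    for i, name in a do
        if b[i] ~= name then
            return false
        end
    end
    return true
end

-- "Normalization" in the toy model: turn the union into a set of names.
local function normalize(t: ToyType): { [string]: boolean }
    local set: { [string]: boolean } = {}
    for _, name in t do
        set[name] = true
    end
    return set
end

-- Semantic check: every inhabitant of `a` is also an inhabitant of `b`.
local function semanticSubtype(a: ToyType, b: ToyType): boolean
    local bSet = normalize(b)
    for name in normalize(a) do
        if not bSet[name] then
            return false
        end
    end
    return true
end

-- Try the cheap check first; pay for normalization only if it fails.
local function isSubtype(a: ToyType, b: ToyType): boolean
    return syntacticSubtype(a, b) or semanticSubtype(a, b)
end

local numberOrNil: ToyType = { "number", "nil" }
local nilOrNumber: ToyType = { "nil", "number" }

print(syntacticSubtype(numberOrNil, nilOrNumber)) --> false
print(isSubtype(numberOrNil, nilOrNumber))        --> true
print(isSubtype(numberOrNil, { "number" }))       --> false
```

The second call is the interesting one: the two unions are written in a different order, so the syntactic check gives up, and only the set-based comparison recognizes them as the same type.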
luau.org
December 22, 2024 at 2:49 AM
Luau Recap: July & August 2022
Luau is our new language that you can read more about at https://luau-lang.org. Cross-posted to the Roblox Developer Forum.

## Tables now support `__len` metamethod

See the RFC Support `__len` metamethod for tables and `rawlen` function for more details.

With generalized iteration released in May, custom containers are easier than ever to use. The only thing missing was the fact that tables didn’t respect `__len`.

Simply put, tables now honor the `__len` metamethod, and `rawlen` has been added with semantics similar to `rawget` and `rawset`:

    local my_cool_container = setmetatable({ items = { 1, 2 } }, {
        __len = function(self) return #self.items end
    })

    print(#my_cool_container) --> 2
    print(rawlen(my_cool_container)) --> 0

## `never` and `unknown` types

See the RFC `never` and `unknown` types for more details.

We’ve added two new types, `never` and `unknown`. These two types are opposites of each other: no value inhabits the type `never`, and, dually, every value inhabits the type `unknown`.

Type inference may infer a variable to have the type `never` if and only if the set of possible types becomes empty, for example through type refinements.

    function f(x: string | number)
        if typeof(x) == "string" and typeof(x) == "number" then
            -- x: never
        end
    end

This is useful because we still needed to ascribe a type to `x` here, but the type we used previously had unsound semantics. For example, it was possible to _expand_ the domain of a variable once the user had proved it impossible. With `never`, narrowing a type from `never` yields `never`.

Conversely, `unknown` can be used to enforce a stronger contract than `any`. That is, `unknown` and `any` are alike in that every type can be converted into them. The difference is that `any` can itself be converted into any other type, whereas `unknown` can only be converted into `unknown` or `any`.

    function any(): any return 5 end
    function unknown(): unknown return 5 end

    -- no type error, but assigns a number to x which expects string
    local x: string = any()

    -- has type error, unknown cannot be converted into string
    local y: string = unknown()

To be able to do this soundly, you must apply type refinements on a variable of type `unknown`.

    local u = unknown()

    if typeof(u) == "string" then
        local y: string = u -- no type error
    end

A use case of `unknown` is to enforce type safety at implementation sites for data that does not originate in code, but arrives over the wire.

## Argument names in type packs when instantiating a type

We had a bug in the parser which erroneously allowed argument names in type packs that didn’t fold into a function type. That is, the syntax below did not generate a parse error when it should have.

    Foo<(a: number, b: string)>

## New IntegerParsing lint

See the announcement for more details. We include this here for posterity.

We’ve introduced a new lint called IntegerParsing. Right now, it lints three classes of errors:

1. Truncation of binary literals that resolve to a value over 64 bits,
2. Truncation of hexadecimal literals that resolve to a value over 64 bits, and
3. Double hexadecimal prefix.

Cases 1 and 2 are currently not planned to become parse errors, so action is not strictly required here. Case 3 will be a breaking change! See the rollout plan for details.

## New ComparisonPrecedence lint

We’ve also introduced a new lint called `ComparisonPrecedence`. It fires in two particular cases:

1. `not X op Y` where `op` is `==` or `~=`, or
2. `X op Y op Z` where `op` is any of the comparison or equality operators.

In languages that use `!` to negate a boolean, `!x == y` looks fine because `!x` _visually_ binds more tightly than Lua’s equivalent, `not x`. Unfortunately, the precedences here are identical, that is `!x == y` is `(!x) == y` in the same way that `not x == y` is `(not x) == y`.
We also apply this to other operators, e.g. `x <= y == y`.

    -- not X == Y is equivalent to (not X) == Y; consider using X ~= Y, or wrap one of the expressions in parentheses to silence
    if not x == y then end

    -- not X ~= Y is equivalent to (not X) ~= Y; consider using X == Y, or wrap one of the expressions in parentheses to silence
    if not x ~= y then end

    -- not X <= Y is equivalent to (not X) <= Y; wrap one of the expressions in parentheses to silence
    if not x <= y then end

    -- X <= Y == Z is equivalent to (X <= Y) == Z; wrap one of the expressions in parentheses to silence
    if x <= y == 0 then end

As a special exception, this lint pass will not warn for cases like `x == not y` or `not x == not y`, both of which look intentional as written and are interpreted that way.

## Function calls returning singleton types incorrectly widened

Fix a bug where widening was a little too happy to fire in the case of function calls returning singleton types or unions thereof. This was an artifact of the logic that knows not to infer singleton types in cases where it makes no sense to.

    function f(): "abc" | "def"
        return if math.random() > 0.5 then "abc" else "def"
    end

    -- previously reported that 'string' could not be converted into '"abc" | "def"'
    local x: "abc" | "def" = f()

## `string` can be a subtype of a table with a shape similar to `string`

The function `my_cool_lower` is a function `<a...>(t: t1) -> a... where t1 = {+ lower: (t1) -> a... +}`.

    function my_cool_lower(t)
        return t:lower()
    end

Even though `t1` is a table type, we know `string` is a subtype of `t1` because `string` also has `lower` which is a subtype of `t1`’s `lower`, so this call site now type checks.

    local s: string = my_cool_lower("HI")

## Other analysis improvements

* `string.gmatch`/`string.match`/`string.find` may now return a more precise type depending on the patterns used
* Fix a bug where the type arena ownership invariant could be violated, causing stability issues
* Fix a bug where an internal type error could be presented to the user
* Fix a false positive with optionals & nested tables
* Fix a false positive in non-strict mode when using generalized iteration
* Improve autocomplete behavior in certain cases for `:` calls
* Fix minor inconsistencies in synthesized names for types with metatables
* Fix autocomplete not suggesting globals defined after the cursor
* Fix DeprecatedGlobal warning text in cases when the global is deprecated without a suggested alternative
* Fix an off-by-one error in type error text for incorrect use of `string.format`

## Other runtime improvements

* Comparisons with constants are now significantly faster when using clang as a compiler (10-50% gains on internal benchmarks)
* When calling non-existent methods on tables or strings, `foo:bar` now produces a more precise error message
* Improve performance for iteration of tables
* Fix a bug with negative zero in vector components when using vectors as table keys
* Compiler can now constant fold builtins under -O2, for example `string.byte("A")` is compiled to a constant
* Compiler can model the cost of builtins for the purpose of inlining/unrolling
* Local reassignment i.e.
`local x = y :: T` is free iff neither `x` nor `y` is mutated/captured
* Improve `debug.traceback` performance by 1.15-1.75x depending on the platform
* Fix a corner case with table assignment semantics when key didn’t exist in the table and `__newindex` was defined: we now use Lua 5.2 semantics and call `__newindex`, which results in less wasted space, support for NaN keys in `__newindex` path and correct support for frozen tables
* Reduce parser C stack consumption which fixes some stack overflow crashes on deeply nested sources
* Improve performance of `bit32.extract`/`replace` when width is implied (~3% faster chess)
* Improve performance of `bit32.extract` when field/width are constants (~10% faster base64)
* `string.format` now supports a new format specifier, `%*`, that accepts any value type and formats it using `tostring` rules

## Thanks

Thanks for all the contributions!

* natteko
* JohnnyMorganz
* khvzak
* Anaminus
* memery-rbx
* jaykru
* Kampfkarren
* XmiliaH
* Mactavsin
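
As a small illustration of the `%*` specifier listed under the runtime improvements, the sketch below formats a number, a boolean and a table with a `__tostring` metamethod in one call. The `point` table and the `describe` helper are made-up names for this example, and the metamethod behavior shown assumes "`tostring` rules" includes honoring `__tostring`, as `tostring` itself does.

```luau
-- Hypothetical value with a custom string form, used only for this example.
local point = setmetatable({ x = 1, y = 2 }, {
    __tostring = function(self)
        return "(" .. tostring(self.x) .. ", " .. tostring(self.y) .. ")"
    end,
})

-- `%*` accepts any value and formats it the way `tostring` would,
-- so one helper can label values of different types.
local function describe(label: string, value: any): string
    return string.format("%s = %*", label, value)
end

print(describe("count", 42))    --> count = 42
print(describe("flag", true))   --> flag = true
print(describe("point", point)) --> point = (1, 2)
```

Before `%*`, the same helper would have needed an explicit `tostring(value)` paired with `%s`.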
luau.org
December 22, 2024 at 2:50 AM