A Tale of Two Fixes.

Can you help improve your favourite game? Hardcore C mages, talented artists, and players with any level of experience are welcome!
nef
Elite
Posts: 324
Joined: Mon Jun 25, 2018 5:01 pm

A Tale of Two Fixes.

Post by nef »

A Tale of Two Fixes.

IT WAS the best of times, it was the worst of times, it was ... er, no. Wrong story.

This tale is about tolua functions: listenv() and _freeciv_state_dump().

Both of these can be used for debugging Lua scripts, but the latter is also key to the save/restore process reinstating 'scalar' variables such as booleans, numbers, strings and userdata (but not tables, functions, or threads).

Both are broken but nobody seems to have noticed; perhaps for a good reason being that the problems with both are likely to be experienced only by those who are 'serious' Lua scribes. If this is not you then you may safely skim this thread (FOR NOW), or skip it entirely.

Taking them one at a time .....
nef
Elite
Posts: 324
Joined: Mon Jun 25, 2018 5:01 pm

Re: A Tale of Two Fixes.

Post by nef »

listenv().

For listenv there are a number of pathological problems:

A. The function chokes when it encounters a key data type it does not expect.
  • For a demonstration, start or restart a game (not using my hut enter code) and enter the following server command (in the chatline):
    • /lua cmd _G[true] = true
    Next try listenv:
    • /lua cmd listenv()
    and stand back for the pyrotechnic display of listenv crashing and burning. (To fix the damage enter /lua cmd _G[true] = nil.)

    [If you are using the hut enter code you can create a 'private' copy of default.lua in the ruleset directory: copy a distro version of default/default.lua to the ruleset you are using for the test.]

    So what`s going wrong here? The problem is that listenv expects the key of a key/value field in a table to be a string. This illustrates that listenv() was written with a fundamental misunderstanding of Lua`s table structure. Now it is the case that examples of table indexing invariably show the key to be either a string OR A NUMBER, but it is also the case that the official Lua manual clearly states that both the key and the value can each be of ANY data type (except nil).
B. The function is unable to deal with recursive table structures.
  • For a demonstration, start or restart a game and enter the following server command (in the chatline):
    • /lua cmd t = {}; t.x = t
    Next try listenv:
    • /lua cmd listenv()
    In this case you should kill the session before you run out of disk space, but you have time to see the problem in the chat printout. (You will actually have to wait a few moments before you see anything at all.)
C. Somewhat more obscurely, listenv() does not cope well when a table has a metamethod __pairs. I will not demonstrate this because it requires 'some code' so you will have to take my word for it. (Oh alright: t = setmetatable ({}, {__pairs = function() assert (nil, "GOTCHA") end}).) This particular problem may have crept up on fc. If so then it would be a maintenance problem rather than an original design fault. (Fix: t = nil)


The replacement fixes these problems, but it also does much more. There are three reasons for considering this new version:

1. It doesn`t crash (AFAIK).

2. Much better facilities:
  • a. user options - see the block comment in the file for details,
    b. results printed are detailed and comprehensive (see below).
3. It is in clear text:
  • a. you can see what`s going on,
    b. you can change it as you see fit.
fce.txt
(7.65 KiB) Downloaded 155 times
EDIT: this file has been updated

This function is only ever intended for debugging so there is no particular need to include the code in any script.lua (or default.lua) so may be kept as a separate file. Place the file directly under 'data' (for conveniece) and change the extension to .lua. Use a server command to load the Lua code prior to use.
Last edited by nef on Mon Dec 20, 2021 4:02 pm, edited 1 time in total.
nef
Elite
Posts: 324
Joined: Mon Jun 25, 2018 5:01 pm

Re: A Tale of Two Fixes.

Post by nef »

Some comments on the printout:

The function preserves the basic hierarchical style but adds the following features:
  • • values and (sub)types for both the 'key' and 'value' of a table field are printed (on a single line). Generally values are self identifying (with %s or %q formatting) but in many cases this is just a memory address, so for these the function uses a more convenient enumeration. Duplicates are then highlighted by appending an equals ('='). Tables referenced are parsed ONCE ONLY. Types are broken done to subtypes where possible. In addition to tolua.type() and math.type() the function also separately identifies those strings that could be used as identifiers. Moreover, some strings can cause problems for %q (and %s) so for these the function replaces unprintable characters with an asterisk ('*'). The type is then written as STRING# where # is the number of substitutions made. In addition to tables, the function also gives details on other Lua 'object' types - except functions (by policy). Userdata that is 'data' in nature rather than 'metadata' also has selected data printed. The former ('data') is created by the game; the latter by the ruleset (and therefore has a rule_name - except for direction)
    • metatables for each datum are identified on the same line. Metatables are themselves regular Lua tables so each datum can have attached a 'chain' of metatables; each one being parsed (for key/value pairs) as for any other table.
The function will also return to the caller the lists used for enumeration. 'data' type userdata will also be recorded but with the id as numerator. Note that "Nonexistent" has no id so it is also enumerated. For a simplistic analysis one can use the function itself, for example:
  • /lua cmd fce(fce(_G), 2, " lists ")
but the result is primitive, so those with an interest can do some additional coding to suit their requirements.

This example shows that fce can be used for ANY table, in fact any datum. For a really exciting ride try: fce(true). Even better - compare fce(1) to fce(1.0). One can see the potential for a variety of debugging/diagnostic purposes.

In development the utility of the replacement for general debugging became evident so the default for the starting table (_ENV) was removed. This allows reporting of nil (and false). To replicate listenv() behaviour a minor code change can be made or (better) use a wrapper.


The name of the replacement function (as supplied) is fce but this can be easily changed (even to listenv).

Also, I have used log.error for printout, but this too can be easily changed. Any suggestions here would be welcome. (For debugging a vars script log.error is needed.)

And finally, for really advanced use, one can enable a code line to print the hex memory address of functions if you really must have it. (Tables and threads are already there.) Alternately, an analysis of the returned tables would do the same without disturbing the function.
nef
Elite
Posts: 324
Joined: Mon Jun 25, 2018 5:01 pm

Re: A Tale of Two Fixes.

Post by nef »

_freeciv_state_dump()

This function has a number of problems starting with:

A. As for listenv() _freeciv_state_dump chokes when it encounters key types it does not expect. In fact it is a little worse since the vars script it prepares assumes that all keys are identifiers.

B. Implementation of userdata has serious omissions which _freeciv_state_dump is not aware of. More details of this in a later post.

C. Inability to save tables. Not a bug as such, but a serious restriction. More on this in the next post.
nef
Elite
Posts: 324
Joined: Mon Jun 25, 2018 5:01 pm

Re: A Tale of Two Fixes.

Post by nef »

Tables:

Yes, this replacement version can be used to save tables!!!

The idea of saving tables is not straight forward because it needs to be selective. 'System' tables provided by Lua and tolua are established before script.lua and default.lua are run and there is no need to interfere, in fact, quite the opposite. Something similar can be said about tables established by scripts, but it is not absolute. Conversely, tables established by callbacks are likely candidates for saving but this also is not absolute.

There are several ways of providing a means to allow the Lua scribe to specify which tables to be saved but after considering the issues I settled on the idea of using a metatable with a field _fc_keep.
nef
Elite
Posts: 324
Joined: Mon Jun 25, 2018 5:01 pm

Re: A Tale of Two Fixes.

Post by nef »

Userdata:

Before discussing the issues it is worth briefly discussing the natue of userdata. In fc the userdata can be classified into two distinct classes: data and metadata.

The distinction between these is that data is created, modified, and (typically) destoyed by in-game events. The userdata subtypes for data are:
  • Tile
    Player
    City
    Unit
    Nonexistent
Conversely, metadata is fixed for the duration of the game (not quite - the user can change the ruleset between a save and the restore), and is defined by the ruleset or is hardcoded. With one infamous exception (direction), metadata can be identified by 'rule_name' (in addition to 'id'). The hardcoded metadata is:
  • Action
    Direction
All other metadata is user defined in sections with a header in the format [prefix_*] where prefix is reserved text identifying the type of metadata.

The issues with userdata are as follows:

I. Missing identifiers.

There are two of these as mentioned above:
i. Direction. As cases go this is a basket case. So bad it has been "withdrawn" and will not be 'fixed' until fc 3.1. My reading of the ticket is that it will likely remain a basket case, so my advice here is to plan on NEVER having Direction as userdata that can be saved.
ii. Nonexistent. Turns out that this also looks like a basket case but there are nuances that can be exploited. The details are so substantial I will have to explain these in a separate post.

II. Missing finders.

Quite a few of these.
i. Direction. (What a surprise!)
ii. Disaster. No current ticket?
iii. Achievement. No current ticket?
iv. Action. Scheduled for fc 3.1, so you have to wait until then.

The current distro version of _freeciv_state_dump is not aware of these omissions so the vars script will crash and burn if you try to save any of these userdata types. The replacement version will AUTOMATICALLY detect cases of 'no finder', so will not try to save these data. If/when the finder (and identifier) becomes available with a new release the userdata WILL be saved by the replacement.
nef
Elite
Posts: 324
Joined: Mon Jun 25, 2018 5:01 pm

Re: A Tale of Two Fixes.

Post by nef »

Nonexistent.

Warning: this is long and boring meant only for the tragic Lua scribe.

So first, what is this subtype?

Well the idea appears to be that when fc data is destroyed it is replaced by a tolua analogue of the Lua type nil. What happens is that the original subtype (Player, City, Unit) is changed to Nonexistent. But there is redundancy here: the method :exists() is common to all of these and returns true for 'live' data and false for Nonexistent. It is this gratuitous redundacy that is at the heart of the problem. More on this later, but for now the issues.

When an fc data object 'dies' and the object changes to Nonexistent the identity of the object is preserved - that is, it is unique and therefore different to any other instance of Nonexistent. This DIRECTLY conflicts with the return value of the finder ( find.nonexistent() ) which only ever returns a reference to a single private instance. Now, of course, there will be a temptation to 'fix' this and make ALL versions of Nonexistent just one object, but this is fraught with problems. It is bad enough to have the associated 'value' of a field suddenly change, but it becomes a serious issue for Lua if keys change in this way. I should point out that Lua makes a distinction between two classes of variable values: tables, functions, and userdata values are objects and the variable contains a reference. Other types (nil, boolean, number, string) contain the actual value. The point here is that Lua expects that objects may have internal details change (notably for tables) but NOT THE REFERENCE.

So the plan is to NOT use find.nonexistent but, instead, create and kill a data object each time a new version of Nonexistent is needed. This is a kludge that is not as robust as I would like for two reasons:
  • a. Some feature in a ruleset may cause the kludge to fail. I am not aware of anything in fc 2.6 (or anything currently announced for fc 3.x) that would allow this. But this could change at the drop of a hat.
    b. Interaction with the unit_lost callback can give 'unexpected' results. (See the support package for details on one way to deal with this.)

The code used in the vars script could be easily broken by misguided attempts to 'fix' Nonexistent. But having said that, the code may survive for some time, firstly, because of inertia in anything being done to tolua and, secondly, attempts to 'fix' Nonexistent may easily result in an endless cascade of new bugs requiring a similar endless stream of reversals.

It does not have to be this way. IMO there is a simple robust solution to this. And this is to remove Nonexistent as the subtype for dead data. The method :exists() is sufficient to identify the condition. This still leaves a problem for the vars script, but this can easily be solved by allowing the finders to accept an 'invalid' id and return a related instance of the userdata with :exists() returning false. If this technique is used the dead userdata will need to keep the id field, but other fields may be problematic for the vars script. I will note that id`s are not recycled. If they ever are then some additional considerations apply (such as the metamethod __gc). For legacy reasons the current finder (find.nonexistent) should be left as is until no one remembers why it was there. The replacement version of _freeciv_state_dump uses it for legacy purposes.
nef
Elite
Posts: 324
Joined: Mon Jun 25, 2018 5:01 pm

Re: A Tale of Two Fixes.

Post by nef »

Design notes for saving tables.

The first point to note is that in Lua any data type (except nil) can be used for the key, as well as the value for any table field. The replacement version of _freeciv_state_dump() UNDERSTANDS THIS. (The distro version does not!)

The next is to realise that some data cannot be saved mostly because the data has no 'literal' form needed to write the vars script. The obvious consequence of this is that a table field can only be saved if both the key AND value are eligible.

The replacement version of _freeciv_state_dump asseses data to be eligible as follows:
  • boolean
    number
    string (provided it has no unprintable characters; see also special note)
    userdata (subject to special considerations described below)
    table (provided it is designated as saveable explained below)
special note for string:
A simple field in the environment (_ENV) with a key that is an identifier starting with an underscore will NOT be saved (so that, for example _VERSION will not be overwritten).

userdata:
The two criteria for userdata are that it must be identifiable, and that there must be a finder. Both criteria are assessed automatically so as fc software changes, eligibily will change with it. See a previous post for current status as of fc 2.6. I will add a special note here. As previously noted userdata that is metadata has both id and rule_name as identifiers. The current version of _freeciv_state_dump only ever uses id, but this should be deprecated because it contributes to the save file compatibility problem of not being able to remove or reorganise the defining sections in the ruleset - the reason being that id`s are assigned by the order of those defining sections. Note that this issue is MOOT if the metadata is hardcoded (obviously) but also would be if the id is also user defined. Unfortunately, this is not the case. Currently in fc2.6 metadata of type Terrain has an "identifier =" field which is supposedly used for save restore processes but it is NOT used as the tolua .id field. In any event there is no apparent harm in using rule_name wherever possible, and it removes one excuse for dev`s to not fix the general save/restore misuse of id`s for metadata. (One might note the drip feed of problem reports of save file incompatibilities due to misuse of metadata id in the general save/restore process. Considering the warnings given to modders about preserving rule_name compatibility this must be a classic case of 'Do as I say, not as I do'.)

table:
As mentioned previously, the replacement version of _freeciv_state_dump will consider a table to be eligible if it has a metatable with a SAVEABLE field with the identifier _fc_keep as the key. Clearly the key is eligible but one must ensure that the value is also eligible. Note that this INCLUDES ELIGIBLE TABLES. To preserve elibility from one save/restore to the next the metatable will also be saved (and linked as the table`s metatable). Metatables are ordinary Lua tables so they too could have a metatable. This would be highly unusual and I know of only one person who would do such a silly thing, but it is possible. So accordingly, a policy decision was made to save the entire chain of metatables if the originating table and metatable were saved. At the moment you are unlikely to ever see this happen - unless you actually want to test for it. There are two ways to terminate this chain:
  • a. The metatable has been previously encoutered and since this will have already been processed nothing more needs to be done.
    b. The chain terminates in a table that has no metatable thus creating a 'tail'.

Unfortunately the latter termination raises an obscure issue: one could use a 'system' table as a tail to the chain. There is no easy way to reliably reconstruct this structure so the replacement will create private copies - to the extent possible.

But all is not lost. Included in the vars script is a conditional call to a user supplied function with the name _fc_restore. This function is called with two arguments - the first being a list of tables restored. This will allow the Lua scribe to fix the mess (and much more - in fact the primary purpose was to allow the scribe to recreate those fields that could not be saved).


One final design note:

The allowance for the associated value of _fc_keep to be a table can create another 'chain'. Like the metatable chain this is also linear and can be terminated in similar ways but with some important differences. The conditions are:
  • a. The associated value of some _fc_keep field will be clearly identified as ineligible in which case the entire chain will be abandoned.
    b. Similar to (a) but the value is clearly identified as eligible so the chain will be eligible.
    c. The value is a table that has previously been encountered in the current chain as a potentially saveable table - that is one that has a metatable with an _fc_keep field. In this case the chain terminates in a loop and WILL BE ASSESSED AS BEING ELIGIBLE. Note that this assessment excludes tables that have been saved ONLY because they are part of a saved metatable chain. This exclusion is made to avoid a minor technical issue that will be of interest to no one.
I will close this post with a caveat 'for the record'. A metatable can be hidden if it contains the metamethod __metatable. I know of no way to defeat this. The regular scribe will have no need for this for saveable tables and so should be avoided as it will corrupt the restored table structure.
nef
Elite
Posts: 324
Joined: Mon Jun 25, 2018 5:01 pm

Re: A Tale of Two Fixes.

Post by nef »

There are three reasons for considering this new version:

1. It doesn`t crash (AFAIK).

2. Much better facilities:
  • a. correctly reconstructs Nonexistent
    b. tables can be saved
3. It is in clear text:
  • a. you can see what`s going on,
    b. you can change it as you see fit.
fsd.txt
(5.8 KiB) Downloaded 155 times
EDIT: this file has been updated

The replacement version of _freeciv_state_dump should be placed in default.lua since it should be OK for all rulesets. The function name SHOULD NOT BE CHANGED.
Last edited by nef on Mon Dec 20, 2021 3:57 pm, edited 1 time in total.
nef
Elite
Posts: 324
Joined: Mon Jun 25, 2018 5:01 pm

Re: A Tale of Two Fixes.

Post by nef »

In developing the replacement _freeciv_state_dump I discovered a character limit for log.normal() likely due to a limit in the client/server protocol but could be anywhere.

The new vars script easily runs into this limit so I am also providing a function to print the script in pieces. This function (named getfsd()) takes two optional arguments
  • arg_1: The string to be printed. Default is the return value of a call to _freeciv_state_dump.
    arg_2: The function used to print the pieces. Default is log.error(). (log.error() is needed if getfsd() is run as part of the vars script (e.g. by being called from _fc_restore() ).
The fragmentation looks for a newline so is not entirely bullet proof; if you find some data being dropped you can try with a smaller 'size'.

I also have included materials that can be used to test and/or demonstrate features of _freeciv_state_dump (and fce).
testfsd.txt
(2.44 KiB) Downloaded 207 times
The material is intended for debugging so may be kept as a separate file. Place the file directly under 'data' (for conveniece) and change the extension to .lua. Use a server command to load the Lua code prior to use. (Consider combining it with fce.lua if that`s what you plan on using.)

Some hints on interpreting the body of the vars script:

Each line in the body sets a table field and is in the form rawset(t[<number>]), key, value).
t is a table of tables being restored which are individually identified by <number>. The key and value will be the literal form of any saveable data, but there are two special cases:
  • • userdata type Nonexistent. The data will be identified by indexing the table ne; each unique index will be for a unique version of Nonexistent. ne[1] is for find.nonexistent().

    • tables. These are identified using a function call to the table t (referred to above). The numbers used in the argument list will be the indexes for the chain of metatables (in REVERSE ORDER). The last number is the prime table being referenced. The first number is either the last metatable in the tail or is a table that has been already processed in the same or previous rawset(). t[1] is reserved for the direct environment (_ENV).
Post Reply