Solra Bizna

Custom architecture, runSynchronized() vs. runThreaded() vs. time

Question

(I don't know if this is the right forum to post this in, but I didn't see a clearly better one.)

Still working on my custom architecture, and I've got some open questions, mostly about issues of time.

First, I note that `runThreaded` can return a "sleep" result. Does the worker pool operate such that `runThreaded` is called repeatedly on every non-sleeping instance, and only when all are sleeping is a game tick allowed to complete? If so, is there ever a situation where a number of game ticks other than exactly the requested sleep period will pass? (Consistency with "real" time is irrelevant here, only consistency against game ticks matters.)

Second, `runThreaded` can request that `runSynchronized` be called. Will such a request ever cross a tick boundary?

Third, is `runSynchronized` ever called other than when requested by `runThreaded`? (e.g. when the machine first starts up)

Fourth, what is the preferred way to get the current tick count from inside `runThreaded`? If sleeps do not drift, this question is moot.

Last, the life cycle of a machine started and stopped in one session is pretty clear, but the exact sequence of calls when a machine is loaded is less so. Which calls are called, and in what order? When `load()` is called, is `machine.host().internalComponents()` valid and populated?


6 answers to this question

Recommended Posts

  • Solution

The sleep result type is used to tell the machine running the architecture how long it may wait before continuing execution. This is essentially a non-busy idle. The sleep may be interrupted at any moment, however. In particular, this is the case when a signal arrives. Similarly, the sleep never actually takes place if there are still signals in the queue.

Threaded execution has no (direct) impact on tick time; it runs completely in parallel. The only dependency is that sleep times for machines are only checked every server tick, so sleeps can only last a multiple of 50ms (the duration of a single tick), nothing more fine-grained than that. If a worker does not sleep (or switch to a synchronized call), it is immediately re-queued in the worker pool, which means that unless the thread pool is quite busy, it will resume pretty much immediately.

 

Threaded execution can switch to a synchronized execution mode to allow architectures to interact with the Minecraft world in a thread-safe fashion. Generally, each callback is annotated with whether it is a "direct" call (performed directly in the worker thread) or not (in which case the architecture must switch to synchronized mode before making the actual call). Synchronized calls are executed in the machine's server-thread-driven update loop, which also means synchronized calls can only occur once per tick. After the synchronized call has completed, the architecture is resumed in threaded mode. Therefore, unless there are hiccups or delays, an architecture can usually perform one synchronized call per tick (i.e. it can queue the next synchronized call before the next server tick).
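
To illustrate the round trip (just a sketch, not taken from the Lua architecture; `needsWorldAccess()` is a made-up placeholder for however your architecture decides it needs a non-direct call):

```java
// Sketch: switching from threaded to synchronized execution and back,
// using li.cil.oc.api.machine.ExecutionResult and its nested classes.
@Override
public ExecutionResult runThreaded(boolean isSynchronizedReturn) {
    if (needsWorldAccess()) {
        // Ask the machine to call runSynchronized() from the server thread.
        return new ExecutionResult.SynchronizedCall();
    }
    // Otherwise keep going in the worker thread; e.g. idle for one tick.
    return new ExecutionResult.Sleep(1);
}

@Override
public void runSynchronized() {
    // Running in the server thread now, so non-direct callbacks and world
    // access are safe here. Afterwards the architecture is resumed in
    // threaded mode (presumably with isSynchronizedReturn set to true).
}
```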

 

Indeed, runSynchronized is only called when so requested by runThreaded.

 

To get the current world tick time (which may indeed change during the execution of a worker) use `li.cil.oc.api.machine.Machine.worldTime()`, which is the thread-safe way to get it (it's updated in the machine when it ticks).
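
For example (a sketch; the `machine` field and the `lastWorldTime` bookkeeping are your own, not part of the API):

```java
// Sketch: elapsed-tick bookkeeping inside the architecture.
// machine is the li.cil.oc.api.machine.Machine the architecture was given.
private long lastWorldTime;

private long elapsedTicks() {
    long now = machine.worldTime();
    long elapsed = now - lastWorldTime; // 0 if still within the same tick
    lastWorldTime = now;
    return elapsed;
}
```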

 

When loading, the architecture is constructed last, i.e. internalComponents is indeed populated. The incoming calls will be, in order: <init>, recomputeMemory, initialize, load. However! Due to how loading in MC works, at this point there will be no network and no world object available. Once the machine has been added to the network and hooked up with its components that way, the architecture's onConnect method will be called. So if you need to perform any initialization that depends on actually accessing other components, I'd recommend doing it there. (OC's Lua architectures don't use this callback anymore; they did in very early versions of the mod to connect the built-in ROM filesystem, which has since been removed and replaced with EEPROMs and the OpenOS floppy.)
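
If it helps, a sketch of deferring component-dependent setup (the `romCache` field and the `copyEepromContents()` helper are made-up names, and I'm assuming load() receives the machine's NBTTagCompound as in the Architecture interface):

```java
// Sketch: restore pure architecture state in load(), but don't touch
// components until onConnect(), when the network has been rebuilt.
private byte[] romCache;

@Override
public void load(NBTTagCompound nbt) {
    // Restore registers, RAM, etc. from nbt here; no component access yet.
}

@Override
public void onConnect() {
    // Safe to reach other components now; copy what we need exactly once.
    if (romCache == null) {
        romCache = copyEepromContents();
    }
}
```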

 

 

 

Phew, I hope that answers most of it (and that I didn't misremember anything). If there's any more, lemme know!

 

 

 

PS: oh, and also a note on call limits. Some callbacks, while direct, are marked as having a "limit". This basically determines the "cost" of calling that function, with a machine having a certain call budget per tick (depending on the CPU tier). If that limit is exceeded when making a component call (via Machine.invoke, which is what you want to use for that), a `LimitReachedException` will be thrown. Catch that and make the next call to that callback in a synchronized fashion to honor the call limit without having to sleep (because sleeping could be interrupted, leading to lost signals).
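
For instance, roughly (a sketch; `pendingCall` and how you replay it from runSynchronized are up to your architecture):

```java
// Sketch: a component call from the worker thread, falling back to a
// synchronized call when the per-tick budget for that callback is spent.
try {
    Object[] result = machine.invoke(address, "read", new Object[]{count});
    // ... consume result ...
} catch (LimitReachedException e) {
    // Budget exhausted for this tick: remember the call and redo it from
    // runSynchronized() instead of sleeping, then have runThreaded return
    // new ExecutionResult.SynchronizedCall().
    pendingCall = new Object[]{address, "read", new Object[]{count}};
} catch (Exception e) {
    // Other errors thrown by the callback; report them to the running program.
}
```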

Edited by Sangar

That thoroughly answers all my original questions. Thanks. Now I have new ones instead. :)

I can definitely see why this is done the way it is, particularly since the Lua architecture doesn't (and, really, can't) have built-in fine-grained timeslicing. My own emulation code, however, is designed with precisely controllable timing down to the cycle level. If, some day, support were added for architectures with "hard timing", processed as part of the world tick... my architecture, at least, would be ready to support it. (Hypothetically. :P)

So, my current plan timing-wise is to have runThreaded first calculate how many cycles to add to the cycle budget, based on how many game ticks have passed, and then attempt to run until the cycle budget is exhausted. If the program performs a synchronized call (whether by making an indirect (synchronized) call or by exceeding the direct call limit without requesting a non-blocking call), I zero the cycle budget and request a runSynchronized call. If the program requests a sleep, I request a sleep, which, as designed, will end immediately if there is already a signal queued (a case I should detect so that I don't zero the cycle budget) and early if a signal arrives during the sleep. If the cycle budget runs out... this is where it gets hairy, since I must cross a tick boundary here before resuming execution. In this situation, I assume I can request a runSynchronized call (whether I actually do anything in it or not) and be guaranteed to have crossed at least one tick boundary the next time runThreaded is called?
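
Roughly, in code (just a sketch of the plan; `cycleBudget`, `CYCLES_PER_TICK`, `lastWorldTime`, `core.step()`, and `StepResult` are names I'm making up here):

```java
// Sketch: refill the cycle budget from elapsed game ticks, then emulate
// until it runs out or the program sleeps / needs a synchronized call.
// lastWorldTime is assumed to be initialized in initialize().
@Override
public ExecutionResult runThreaded(boolean isSynchronizedReturn) {
    long now = machine.worldTime();
    cycleBudget += (now - lastWorldTime) * CYCLES_PER_TICK;
    lastWorldTime = now;
    while (cycleBudget > 0) {
        StepResult r = core.step();           // run one instruction
        cycleBudget -= r.cyclesTaken;
        if (r.wantsSynchronizedCall) {
            cycleBudget = 0;                  // charge the rest of the budget
            return new ExecutionResult.SynchronizedCall();
        }
        if (r.wantsSleep) {
            return new ExecutionResult.Sleep(r.sleepTicks);
        }
    }
    // Budget exhausted: force at least one tick boundary before resuming by
    // requesting a (possibly empty) synchronized call.
    return new ExecutionResult.SynchronizedCall();
}
```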

If I have understood the way timing works correctly (and not made a mistake in my planning), this should result in a CPU that on average runs at a specific clock speed relative to game time, but doesn't bog the Minecraft world down during times of heavy CPU load. (We can blame those unstable redstone clocks for the jitter ;))

Loading, however, poses a slight problem, if I have understood correctly. During the initialization process, the speed and physical memory map for the machine are established. This means determining the CPU tier (which is currently done during recomputeMemory because it's convenient), the speed and size of memory modules (which is what recomputeMemory is for), and the contents of the EEPROM. This latter part requires some special work if I cannot invoke on the EEPROM during initialize()/load(). Is this the case? (If so, I can defer access to ROM until runSynchronized can be called or onConnect is called.)


In this situation, I assume I can request a runSynchronized call (whether I actually do anything in it or not) and be guaranteed to have crossed at least one tick boundary the next time runThreaded is called?

Indeed, since whatever you do in `runSynchronized` is up to you, you can request one and do nothing in the actual call, to force a wait for the next game tick.

 

This latter part requires some special work if I cannot invoke on the EEPROM during initialize()/load(). Is this the case?

Correct. The component cannot be accessed at that time, because the network has not yet been rebuilt. I'm curious: why do you need the contents of the EEPROM at that point? Wouldn't it have been copied into the byte array used as memory, and therefore restored already, if the machine were restored from a running state? And if the machine wasn't running, it wouldn't matter, since it'd be enough to access it once the machine is requested to start up (i.e. when initialize is called the "normal" way). But if you really do need it, yes, delaying full initialization until onConnect would be necessary (which should always happen before the first run, IIRC).


I'm curious: why do you need the contents of the EEPROM at that point? Wouldn't it have been copied into the byte array used as memory, and therefore restored already, if the machine were restored from a running state?

The way I'm handling the EEPROM is by, conceptually, having reads from memory above 0xFFFF0000 be accesses to the currently-installed EEPROM. What's actually happening is that every time an EEPROM is added or removed (or flashed), I'm copying the byte array off and caching it. I can store the cached byte array and restore it with the other state, but I'd rather not, since there wouldn't actually be a situation where it would be different from the current state of the EEPROM. It's just a question of when I do the copy, really.

The difference between this approach and the approach used by the Lua machines is that the EEPROM is not, conceptually, being copied into the machine's memory (unless its code does so). It's acting as though it's being read directly, as needed.
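
In other words, something like this (a sketch; `ram`, `eepromCache`, and the exact address decode are specific to my design and made up for illustration):

```java
// Sketch: route physical reads at/above 0xFFFF0000 to the cached EEPROM
// image instead of RAM.
private int readByte(long physAddr) {
    if (physAddr >= 0xFFFF0000L) {
        int offset = (int) (physAddr - 0xFFFF0000L);
        byte[] rom = eepromCache;
        return offset < rom.length ? rom[offset] & 0xFF : 0;
    }
    return ram[(int) physAddr] & 0xFF;
}
```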

I haven't implemented this yet, but I'm planning to do some similar sleight of hand regarding the "volatile data" on EEPROMs, where it can be read and written as ordinary (but slow and cramped) RAM and is kept up-to-date behind the scenes. The benefits would be two-fold: one, it would be easy for a bootloader to read the volatile data; two, it would allow "RAMless" EEPROM-based programs if they can be made to work in only a few hundred bytes of (slow) RAM. (Edit: The use of "RAMless" ARM machines would be a config option, defaulting to off, because it could be considered cheating.)

Thanks for the answers, I'm out of questions for now. :D


Ahh, I see. That makes sense then, yeah. Sounds like a sound design! (hah)

 

Maybe flag the EEPROM memory as dirty (after loading / construction / it changing) and update it when accessing it, in that case? Would avoid any special code for loading, I think.

 

The RAM-less mode sounds like an interesting challenge. And useful. It could allow a BIOS to bind the GPU to a screen, so that "need memory to boot" errors can actually be displayed on those, and you don't need an analyzer to get that error message.


Maybe flag the EEPROM memory as dirty (after loading / construction / it changing) and update it when accessing it, in that case? Would avoid any special code for loading, I think.

The call in question is a direct one, so I can totally do that. It would even avoid the copy in the normal case of the EEPROM never being accessed again after boot. (Why didn't I think of that? :))
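
Something like this, I imagine (a sketch; `eepromDirty`, `eepromCache`, and `readEepromBytes()` are names I'm inventing for illustration):

```java
// Sketch: lazily (re)copy the EEPROM contents the first time the ROM window
// is actually read after construction, loading, or a flash.
private volatile boolean eepromDirty = true;
private byte[] eepromCache = new byte[0];

private byte[] eepromBytes() {
    if (eepromDirty) {
        eepromCache = readEepromBytes(); // a direct call, fine from the worker
        eepromDirty = false;
    }
    return eepromCache;
}
```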

 

The RAM-less mode sounds like an interesting challenge. And useful. It could allow a BIOS to bind the GPU to a screen, so that "need memory to boot" errors can actually be displayed on those, and you don't need an analyzer to get that error message.

Hm. That's a good point. In that case, maybe I should enable it by default...