Disclaimer: I'm not a game developer; but, I've worked on a lot of projects with...

notacoward · on Sept 21, 2019

> it was generally non-deterministic

This is basically it. People who have never worked on actual real-time systems just never seem to get that in those environments determinism often matters more than raw performance. I don't know about "soft" real-time (e.g. games, or audio/video) but in "hard" real-time (e.g. avionics, industrial control) it's pretty routine to do things like disable caches and take a huge performance hit for the sake of determinism. If you can run 10% faster but miss deadlines 0.1% more often, that's a fail. It's too easy for tyros to say pauses don't matter. In many environments they do.

Skunkleton · on Sept 21, 2019

Its not just pauses either. Non-deterministic memory usage is a big deal too.

GuB-42 · on Sept 21, 2019

I have actually worked on a system where malloc() was forbidden. In fact it always returned null. Buffers were all statically allocated and stack usage was kept to a minimum (it was only a few kB anyways).

The software was shipped with a memory map file so you know exactly what each memory address is used for. A lot of test procedures involved reading and writing at specific memory locations.

It was for avionics BTW. As you may have guessed, it was certified code with hard real time constraints. Exceeding the cycle time is equivalent to a crash, causing the watchdog to trigger a reset.

Impossible · on Sept 22, 2019

Malloc was forbidden in a AAA title I worked on. Interestingly enough we also had two (!!!) embedded garbage collected scripting languages that could only allocate in small (32MB, 16 MB) arenas and would assert if they over allocated.

zippie · on Sept 22, 2019

In modern C libraries malloc is mmap’ed anon pages underneath for an increasingly larger set of conditions these days.

hackits · on Sept 21, 2019

Sounds very similar to NASA C programming guidelines. Each module during the initialization period would allocate its static memory size. Every loop had a upper bound's max iteration to prevent infinite loops.

I think they posted the guidelines and it was a wonderful read about how they developed real time systems.

mike00632 · on Sept 22, 2019

Do you happen to have a link to the guidelines?

kimown · on Sept 22, 2019

https://ntrs.nasa.gov/search.jsp?R=19950022400

http://mechatronics.me.wisc.edu/labresources/DataSheets/NASA...

hackits · on Sept 22, 2019

http://spinroot.com/gerard/pdf/P10.pdf

neop1x · on Sept 23, 2019

I was wondering why I couldn't open this pdf on my android pdf viewer and this link gives me cloudflare's captcha. It uses google's ReCaptcha - storefronts, bicycles, semaphores. What a terrible practice. People stop overusing cloudflare!

loeg · on Sept 22, 2019

Why even define a malloc() in this environment if it always returns NULL?

seanmcdirmid · on Sept 22, 2019

Because even if you don't use it directly, you might use some code that uses it. And even then, it might not ever be called given the way you are reusing the code, so...a dynamic check is the best way to ensure it never actually gets used.

loeg · on Sept 22, 2019

> Because even if you don't use it directly, you might use some code that uses it.

It seems extremely unlikely that any general purpose code you might adopt, which happens to invoke malloc() at all, would be fit for purpose in such a restricted environment without substantial modification; in which case you would just remove the malloc() calls as well.

> And even then, it might not ever be called given the way you are reusing the code, so...a dynamic check is the best way to ensure it never actually gets used.

In such a restricted environment, it is unlikely you just have unknown deadcode in your project. "Oh, those parts that call malloc()? They're probably not live code and we'll find out via a crash at runtime." That's like the opposite of what you want in a hard realtime system.

So, no — a static, compile/link time check is a strictly superior way to ensure it never gets used.

lonelappde · on Sept 22, 2019

Spoken like someone who never had a 3rd party binary blob as a critical dependency.

jorams · on Sept 22, 2019

If you need your system to never dynamically allocate memory, an opaque 3rd party binary blob which might still do that doesn't seem like a valid inclusion.

loeg · on Sept 22, 2019

Again, we're talking about a very restricted hard realtime environment. How do you trust, let alone qualify, a 3rd party binary blob that you know calls malloc(), which is a direct violation of your own design requirements?

alxlaz · on Sept 22, 2019

Not OP but the most common reason I've seen is that this was an additional restriction that they imposed for their project. The standard library includes a perfectly working malloc. You override it so that you can't end up calling it accidentally, either explicitly (i.e. your brain farts and you do end up malloc()-ing something) or implicitly (i.e by calling a function that ends up calling malloc down the line). The latter is what happens, and surprisingly easy and often. Not all libraries have the kind of documentation that glibc has, nor can you always see the code.

the_mitsuhiko · on Sept 22, 2019

Some things are okay to allocate at some point in time. So you can disable malloc later when you no longer want it to provide memory.

loeg · on Sept 22, 2019

That's plausible, but not how OP defined it:

> a system where malloc() was forbidden. In fact it always returned null.

drblast · on Sept 22, 2019

So you crash immediately if someone tries to use it or a library gets added that does?

Never experienced the malloc() thing but do throw exceptions and fail fast under conditions like these so they're caught in testing.

a_t48 · on Sept 22, 2019

I think what he's asking to do is to make it a linker failure by not adding it to the standard lib at all. Compile fails are nicer than runtime fails.

loeg · on Sept 22, 2019

Like sibling commenter says, a compile/link error is even better and faster than finding out at runtime with a crash.

aetherspawn · on Sept 21, 2019

I’d be interested in knowing more if you care to write more about this. I’m currently costing/scoping out what’s required in writing software at that level but don’t really have some real industry insight into what other people do

Static memory is something we already do but we’re pretty interested whether industry actually adopts redundant code paths, monitors, to what extent watchdogs (how many cycles can be missed?), etc.

rramadass · on Sept 22, 2019

I suggest you take a look at the books by Michael Pont and the offerings through his company "SafeTTy Systems". His/Company's work deals with hard real-time systems with all the associated standards.

80x86 · on Sept 21, 2019

Wow - disabling the caches is an extreme measure. I get it though. After doing that, you can (probably) do cycle counting on routines again. Just like the 80s or earlier.

kabdib · on Sept 22, 2019

There's probably enough jitter in the memory system and in instruction parallelism that accurate cycle counting will still be challenging.

Also, you probably want some padding so that newer versions of the CPU can be used without too much worry. It's possible for cycle counts of some routines to increase, depending on how new chips implement things under the hood.

[says a guy who was counting cycles, in the 1980s :-)]

viraptor · on Sept 22, 2019

I don't. Maybe we're thinking about different kind of caches, but if these are transparent, no-performance-impact caches, then why wouldn't you prove the system works well with caches off (guarantee deadlines are met), then enable caches for opportunistic power gains?

notacoward · on Sept 22, 2019

> if these are transparent, no-performance-impact caches

If there were no performance impact, there would be no point. I'm not just being snarky; there's an important point here. Caches exist to have a performance impact. In many domains it's OK to think about caching as a normal case, and to consider cache hit ratio during designs. When you say "no performance impact" you mean no negative performance impact, and that might be technically true (or it might not), but...

But that's not how a hard real-time system is designed. In that world, uncached has to be treated as the normal case. Zero cache hit ratio, all the time. That's what you have to design against, even counting cycles and so on if you need to. If you're designing and testing a system to do X work in Y time every time under worst-case assumptions, then any positive impact of caches doesn't buy you anything. Does completing a task before deadline because of caching allow you to do more work? No, because it's generally considered preferable to keep those idle cycles as a buffer against unforeseen conditions (or future system changes) than try to use them. Anything optional should have been passed off to the non-realtime parts of the larger system anyway. There should be nothing to fill that space. If that means the system was overbuilt, so be it.

The only thing caches can do in such a system is mask the rare cases where a code path really is over budget, so it slips through testing and then misses a deadline in a live system where the data-dependent cache behavior is less favorable. Oops. That's a good way for your product and your company to lose trust in that market. Once you're designing for predictability in the worst case, it's actually safer for the worst case to be the every-time case.

It's really different than how most people in other domains think about performance, I know, but within context it actually makes a lot of sense. I for one am glad that computing's not all the same, that there are little pockets of exotic or arcane practice like this, and kudos to all those brave enough to work in them.

orlp · on Sept 22, 2019

While you might test your hard real-time requirements with caches disabled, there's still reason to run the code with caches afterwards.

E.g. errors that didn't match a branch or input scenario during testing which would go over budget without cache, but with cache might prevent a crash.

Another could be power consumption, latency optimization, or improvement of accuracy. E.g. some signal analysis doesn't work at all if the real-time code is above some required Nyquist threshold, but faster performance improves the maximum frequency that can be handled, improving accuracy.

notacoward · on Sept 22, 2019

You could be right on some of those. That didn't seem to be the prevailing attitude when I worked in that area, but as I said that was a long time ago - and it was in only a few specific sub-domains as well.

notacoward · on Sept 22, 2019

Forgot to mention: cache misses can be more expensive than uncached accesses, so testing with caches off and then turning them on in production can be a disaster if you hit a cache-busting access pattern. Always run what you tested.

Baeocystin · on Sept 22, 2019

Because you lose deterministic behavior, and there are cases where that is non-negotiable, regardless of performance cost.

piaste · on Sept 22, 2019

Who will provide a guarantee that the caches are truly transparent and will not trigger any new bugs?

Essentially, you would need to prove the statement "if a system works well with caches off, then it works well with caches on" to the satisfaction of whatever authority is giving you such stringent requirements.

squeaky-clean · on Sept 22, 2019

You'd have to very solidly prove that in the wors-case a cache only ever make execution time equal to or faster than a processor not using cache and never causes anything to be slower.

darnir · on Sept 24, 2019

Even that is not enough. The cache may make everything faster, but it could lead to higher contention on a different physical resource slowing things down there. The cache cannot be guaranteed to prevent that.

PorterDuff · on Sept 21, 2019

""soft" real-time (e.g. games, or audio/video) "

Hah. Who sez that audio and video products have 'soft' real time? Go on now.

notacoward · on Sept 22, 2019

I didn't make up the terminology. It was already in common use at least thirty years ago when I was actively working in that domain. To simplify, it's basically about whether a missed deadline is considered fatal or recoverable. That difference leads to very different design choices. Perhaps some kinds of video software is hard real-time by that definition, but for sure a lot of it isn't. I'd apologize, but it was never meant to be pejorative in the first place and being on one side of the line is cause for neither pride nor shame. They're just different kinds of systems.

detaro · on Sept 22, 2019

What percentage of the market is A/V build to actual hard real-time standards, and not expected to run on devices that can't provide it (so no PCs with normal OSes, no smartphones)? For the vast majority, soft real-time is fine, since an occasional deadline-miss results in minor inconvenience, not property damage, injury or death.

I assume some dedicated devices are more or less hard real time, due to running way simpler software stacks on dedicated hardware.

PorterDuff · on Sept 22, 2019

I take it that scarcely anyone here has written software for video switchers, routers, DVEs, linear editors, audio mixers, glue products, master control systems, character generators, etc. etc. Missing a RT schedule rarely results in death, but you'd think so given the attitude from the customer. That's a silly definition for it.

There's a whole world out there of hard real time, the world is not simply made up of streaming video and cell phones.

The cool thing on HN is you can get down voted for simply making that observation. It's a sign of the times I'm afraid.

AnimalMuppet · on Sept 22, 2019

I actually have written software for video routers and character generators. We didn't consider them hard real time, though I wouldn't claim that such was standard industry usage.

For example, if you're doing a take, you have to complete it during the blanking interval, but usually the hardware guarantees that. In the software, you want you take to happen in one particular vertical blanking interval (and yes, it really is a frame-accurate industry). But if you miss, you're only going to miss by one. We didn't (so far as I know) specify guarantees to the customer ("If you get your command to the router X ms before the vertical interval, your take will happen in that vertical"), so we could always claim that the customer didn't get the command to the router in time. Again, so far as I know - there may have been guarantees given to the customer, but I didn't know about them.

But that was 20 years ago, back in the NTSC 525 days.

Nice name, by the way. Do you know of any video cards that will do a true Porter & Duff composite these days? I recall looking (again, 20 years ago) at off-the-shelf video cards, and while they could do an alpha composite, it wasn't right (and therefore wasn't useful to us).

ajanuary · on Sept 22, 2019

I currently work on software controlling the hardware like video routers, and this is definitely my experience. It’s all very much soft real-time.

In terms of customers and how much they care, the North American market seems to care less than Europe.

jcelerier · on Sept 22, 2019

I work on an open-source music sequencer (https://ossia.io) and no later than two days ago I had a fair amount of mails with someone who wanted to know the best settings for his machine to not have any clicks during the show (which are the audio symptoms of "missed deadline"). I've met some users who did not care, but the overwhelming majority does, even for a single click in a 1-hour long concert.

littlestymaar · on Sept 22, 2019

If it's running in a consumer OS (not a RT one) and it counts on having enough CPU available to avoid missing the deadline, that's exactly what soft-realtime is.

Compare your “not a single click in an hour [for quality reason]” to a “not a single missed deadline in 30 years of the life expectancy of a plane, on a fleet of a few thousands planes [for safety reasons]”. That's the difference of requirements between hard and soft RT.

I did some soft real-time (video decoding) and I have a friend working on hard real-time (avionics) and we clearly didn't worked in the same world.

AnimalMuppet · on Sept 22, 2019

Yeah. To me, hard real time is when you count cycle (or have a tool that does it for you), to guarantee that you make your timing requirements. We never did that.

fastball · on Sept 22, 2019

You just agreed with the OC.

RT video/audio failing never results in death. Where as failures in "avionics, industrial control" absolutely can / do. That seems to be where OC was drawing the line.

detaro · on Sept 22, 2019

Seems to be a common distinction, although GP is right with the addition that the production side of things is more demanding (and at least would suffer financial damage if problems occur to often) than the playback side formed by random consumer gear, and has some, especially low-level/synchronization-related, gear to hard standards. But often soft is enough, as long as it's reliable enough on average.

coldtea · on Sept 22, 2019

>For the vast majority, soft real-time is fine, since an occasional deadline-miss results in minor inconvenience, not property damage, injury or death.

A "minor inconvenience" like a recording session going wrong, a live show with stuttering audio, skipped frames in a live TV show, and so on?

squeaky-clean · on Sept 22, 2019

Most professional recording studios are using consumer computer hardware that can't do hard realtime with software that doesn't support hard realtime.

People like deadmau5, Daft Punk, Lady Gaga all perform with Ableton Live and a laptop or desktop behind their rig. If it were anything more than a minor inconvenience, these people wouldn't use this.

It's very unlikely to have audio drop outs, a proper setup will basically never have them. But still if you have one audio dropout in your life, you're not dead, your audience isn't dead, a fire doesn't start, a medical device doesn't fail to pump, and so on.

And yes you can badly configure and system, but the point is you can't configure these to be 100% guaranteed, 99.99% is perfectly fine.

Edit: Sometimes people call these "firm" realtime systems. Implying the deadline cannot be missed for it to operate, but also that failure to meet deadlines doesn't result in something serious like death (e.g in a video game you can display frames slower than realtime and it kind of works but feels laggy, however you cannot also slow down the audio processing because you'll a lowered pitch, so you have to drop the audio.)

detaro · on Sept 22, 2019

As long as the individual event happens seldom enough few of these actually are a big problem. Soft real-time being allowed to blow deadlines doesn't mean it can't be expected to have a very high rate of success (at least that's the definitions I've learned), and clearly a sufficiently low rate of failure is tolerated. There's a vast difference between "there's an audio stutter every day/week/month/..." and "noticeably stuttering audio". The production side is obviously a lot more sensitive about this than playback, but will still run parts e.g. on relatively normal desktop systems because the failure rate is low enough.

tripzilch · on Sept 23, 2019

The production side usually renders the final audio mix off-line, so no real-time requirements there for getting optimum sound quality. I'd say the occasional rare pop or stutter is worse to have during a live performance than when mixing and producing music.

greggman2 · on Sept 22, 2019

There's probably 500+ successful GC based games on the Steam store another 100000 to a million hobby games doing just fine with GC.

I started game programming on the Atari 800, Apple 2, TRS-80. Wrote NES games with 2k of ram. I wrote games in C throughout the 90s including games on 3DO and PS1 and at the arcade.

I was a GC hater forever and I'm not saying you can ignore it but the fact that Unity runs in C# with GC and that so many quality and popular shipping games exist using it is really proof that GC is not the demon that people make it out to be

Some games made with GC include Cuphead, Kerbal Space Program, Just Shapes & Beats, Subnautica, Ghost of a Tale, Beat Saber, Hollow Knight, Cities: Skylines, Broforce, Ori and the Blind Forest

sgtnoodle · on Sept 22, 2019

Kerbal Space Program is a lot of fun, but for me it freezes for half a second every 10 seconds... due to garbage collection. Drives me crazy.

exDM69 · on Sept 22, 2019

Even on a fast PC, Kerbal Space Program audio is choppy because of GC pauses.

It's successful in spite of that, but that doesn't make it any better.

kungito · on Sept 22, 2019

But maybe it had to do with them being able to use a higher level library, not worry about gc, and focus on other things

mcdevilkiller · on Sept 22, 2019

And a very important one, Minecraft, atleast the Java version. It doesn't matter until you are running a couple hundred mods with lots of blocks and textures, when the max memory allocated gets saturated and it stutters like hell.

ClumsyPilot · on Sept 22, 2019

I don't think the argument is "you can't ship succesfull game with GC"

You might spend more time fighting the GC than benefitting from it. And that seems to be the experience for large games - simpler ones might not care.

Unity offers a lot more than just a language, and developers have to choose, are they willing to put up with GC to get the rest of what Unity offers.

EdwardDiego · on Sept 22, 2019

Minecraft is the only Java based popular game I can think of. And damn did I love Subnautica.

Daiz · on Sept 22, 2019

The recent roguelite hit Slay the Spire (over a million copies sold on Steam) is also made in Java.

pdpi · on Sept 21, 2019

> and the times you find yourself "at the mercy" of the garbage collector is pretty damn frustrating.

You're still at the mercy of the malloc implementation. I've seen some fairly nasty behaviour involving memory leaks and weird pauses on free coming from a really hostile allocation pattern causing fragmentation in jemalloc's internal data.

Gene_Parmesan · on Sept 21, 2019

Which is why you generally almost never use the standard malloc to do your piecemeal allocations. A fair number of codebases I've seen allocate their big memory pools at startup, and then have custom allocators which provide memory for (often little-'o') objects out of that pool. You really aren't continually asking the OS for memory on the heap.

In fact, doing that is often a really bad idea in general because of the extreme importance of cache effects. In a high-performance game engine, you need to have a fine degree of control over where your game objects get placed, because you need to ensure your iterations are blazingly fast.

mlthoughts2018 · on Sept 21, 2019

Doesn’t this just change semantics? Whatever custom handlers you wrote for manipulating that big chunk of memory are now the garbage collector. You’re just asking for finer grained control than what the native garbage collection implementation supports, but you are not omitting garbage collection.

Ostensibly you could do the exact same thing in e.g. Python if you wanted, by disabling with the gc module and just writing custom allocation and cleanup in e.g. Cython. Probably similar in many different managed environment languages.

therein · on Sept 21, 2019

I mean, nobody is suggesting they leave the garbage around and not clean up after themselves.

But instead what you can do is to reuse the "slots" you are handing out from your allocator's memory arena for allocations of some specific type/kind/size/lifetime. If you are controlling how that arena is managed, you will find yourself coming across many opportunities to avoid doing things a general purpose GC/allocator would choose to do in favor of the needs dictated by your specific use case.

For instance you can choose to draw the frame and throw away all the resources you used to draw that frame in one go.

mntmoss · on Sept 21, 2019

The semantics matter. A lot of game engines use a mark-and-release per-frame allocation buffer. It is temporary throwaway data for that frame's computation. It does not get tracked or freed piecemeal - it gets blown away.

Garbage collection emulates the intent of this method with generational collection strategies, but it has to use a heuristic to do so. And you can optimize your code to behave very similarly within a GC, but the UI to the strategy is full of workarounds. It is more invasive to your code than applying an actual manual allocator.

PrototypeNM1 · on Sept 22, 2019

> A lot of game engines use a mark-and-release per-frame allocation buffer.

I've heard of this concept but a search for "mark-and-release per-frame allocation buffer" returned this thread. Is there something else I could search?

theresistor · on Sept 22, 2019

It’s just a variation of arena allocation. You allocate everything for the current frame in an arena. When the frame is complete. You free the entire arena, without needing any heap walking.

A generational GC achieves a similar end result, but has to heuristically discover the generations, whereas an arena allocator achieves the same result deterministically And without extra heap walking.

meheleventyone · on Sept 22, 2019

Linear or stack allocator are other common terms. Just a memory arena where an allocation is just a pointer bump and you free the whole buffer at once by returning the pointer to the start of the arena.

asveikau · on Sept 22, 2019

Getting rid of this buffer is literally nothing. There is no free upon the individual objects needed. You just forget there was anything there and use the same buffer for the next frame. Vs. Waiting for a GC to detect thousands of unused objects in that buffer and discard them, meanwhile creating a new batch of thousands of objects and having to figure out where to put those.

pyrale · on Sept 21, 2019

You can do many things in many languages. You may realize in the process that doing useful things is made harder when your use case is not a common concern in the language.

correct_horse · on Sept 21, 2019

C's free() gives memory back to the operating system(1), whereas, as a performance optimization, many GCd languages don't give memory back after they run a garbage collection (see https://stackoverflow.com/questions/324499/java-still-uses-s...). Every Python program is using a "custom allocator," only it is built in to the Python runtime. You may argue that this is a dishonest use of the term custom allocator, but custom is difficult to define (It could be defined as any allocator used in only one project, but that definition has multiple problems). The way I see it, there are allocators that free to the OS and those that don't or usually don't (hereafter referred to as custom). In C, a custom allocator conceivably could be built into, say, a game engine. You might call ge_free(ptr) which would signal to the custom allocator that chunk of memory is available and ge_malloc() would use the first biggest chunk of internally allocated memory, calling normal malloc() if necessary. Custom allocators in C are a bit more than just semantics, and affect performance (for allocation-heavy code). Furthermore, they are distinct from GC, as they can work with allocate/free semantics, rather than allocate/forget (standard GC) semantics. Yes, one could technically change any GCd language to use a custom allocator written by one's self. But Python can't use allocate/free semantics (so don't expect any speedup). Python code never attempts manual memory management, (i.e. 3rd party functions allocate on the heap all the time without calling free()) because that is how Python is supposed to work. To use manual memory management semantics in Python, you would need to rewrite every Python method with a string or any user defined type in it to properly free.

(1) malloc implementations generally allocate a page at a time and give the page back to the OS when all objects in the page are gone. ptr = malloc(1); malloc(1); free(ptr); doesn't give the single allocated page back to the OS.

takeda · on Sept 22, 2019

Python is a bad example to talk about gc, because it uses different garbage collector than most of languages. It is also the primary reason why getting rid of GIL and retaing performance is so hard. Python uses reference counters and as soon as the reference count drops to 0 it immediately frees the object, so in a way it is more predictable. It has also a traditional GC and I guess that's what was mentioned you can disable it. The reason for it is that reference count won't free memory of there is a loop (e.g. object A references B and B references A, in that case both have reference count 1 even though nothing is using them), do that's where the traditional GC steps in.

ncmncm · on Sept 22, 2019

Freeing memory to the OS causes TLB cache stalls in all other threads in the process.

If the program runs for any length of time, it will probably need the same memory again, so freeing it is a pessimization.

Standard C library free() implementations very, very rarely free memory back to the OS.

angry_octet · on Sept 22, 2019

It's not a performance optimisation not to give space back. GCs could easily give space back after a GC if they know a range (bigger than a page) is empty, it's just that they rarely know it is empty unless they GC everything, and even then there is likely to be a few bytes used. Hence the various experiments with generational GC, to try to deal with fragmentation.

Many C/C++ allocators don't release to the OS often or ever.

Rusky · on Sept 21, 2019

That's true, and it's why the alternative to GC is generally not "malloc and free" or "RAII" but "custom allocators."

Games are very friendly to that approach- with a bit of thought you can use arenas and object pools to cover 99% of what you need, and cut out all of the failure modes of a general purpose GC or malloc implementation.

xyproto · on Sept 21, 2019

Interestingly, it's fully possible to disable the automatic garbage collection in Go to achieve this.

Disable the garbage collector:

  debug.SetGCPercent(-1)

Trigger garbage collection:

  runtime.GC()

It is also possible to allocate a large block of memory and then manage it yourself.

littlestymaar · on Sept 21, 2019

Due the low throughput of Go's GC (which trades a lot of it in favor of short pause duration), you risk running out if memory if you have a lot of allocations and you don't run your GC enough times.

xyproto · on Sept 21, 2019

For a computer game, if you start out by allocating a large block of memory, then manage it yourself, I don't see how this would be a problem.

littlestymaar · on Sept 21, 2019

You're not using the GC at all then. Why use Go (and praise its GC) in that case?

Skunkleton · on Sept 21, 2019

For the joy of getting the round peg through the square hole of course.

xyproto · on Sept 21, 2019

Go has many advantages over C that are not related to GC.

littlestymaar · on Sept 22, 2019

In a context where you don't allocate memory, you lose a lot of those (for instance, you almost cannot use interfaces, because indirect calls cause parameters to those calls to be judged escaping and unconditionally allocated on the heap).

Go is a good language for web backend and other network services, but it's not a C replacement.

xyproto · on Sept 25, 2019

If you allocate a large block of memory manually at the start of the program, then trigger the GC manually when it suits you, won't you get the best of both worlds?

Valmar · on Sept 22, 2019

Go also has many disadvantages, compared to plain C.

xyproto · on Sept 22, 2019

Can't think of any. Do you have an example?

littlestymaar · on Sept 22, 2019

You can't call native libraries without going through cgo. So unless you don't want to have audio, draw text and have access to the graphic APIs, you'll need cgo, which is really slow due to Go's runtime. For game dev, that's a no go (pun intended).

Additionally, the Go compiler isn't trying really hard at optimizing your code, which makes it several times slower on a CPU-bound task. That's for a good reason: because for Go's usecase, compile-time is a priority over performances.

Saying that there is no drawbacks in Go is just irrational fandom…

xyproto · on Sept 22, 2019

You are only talking about the Go compiler from Google.

GCC also supports Go (gccgo) and can call native libraries just like from C.

I'm not saying there are no drawbacks in Go, just that I can't think of any advantages of C over Go.

takeda · on Sept 22, 2019

Go was pushed as a C replacement, but very few C programmers switched to it, it seems like it took hearts of some of Python, Ruby, Java etc programmers.

freyr · on Sept 22, 2019

Nonetheless, Go has many advantages over C that are not related to GC.

takeda · on Sept 22, 2019

So does Python or Ruby, that doesn't mean it is a C replacement.

majewsky · on Sept 21, 2019

> It is also possible to allocate a large block of memory and then manage it yourself.

At which point you're mostly just writing C in Go.

pm90 · on Sept 21, 2019

Actually you're not.

I would very much prefer a stripped down version of Go used for these situations rather than throwing more C at it. The main benefits of using Go are not the garbage collection, its the tooling, the readability (and thus maintainability) of the code base, the large number of folks who are versatile in using it.

kerkeslager · on Sept 21, 2019

Readability is subjective.

Large user base? C is number 2. Go isn't even in the top 10.[1]

Tooling? C has decades of being one of the most commonly used languages, and a general culture of building on existing tools instead of jumping onto the latest hotness every few months. As a result, C has a very mature tool set.

[1] https://www.tiobe.com/tiobe-index/

strken · on Sept 21, 2019

Unfortunately the excellent standard library is a major benefit of Go, and it uses the GC, so if you set GOGC=off you're left to write your own standard library.

I would also like to see a stripped-down version of Go that disables most heap allocations, but I have no idea what it would look like.

metiscus · on Sept 21, 2019

Are you saying that there are more go developers than c developers? Is there a user survey that shows such things? I'm curious what the ratio is.

rdbell · on Sept 21, 2019

I'd be willing to wager that C programmers would be more comfortable working with a Golang codebase than Golang programmers would be working with a C codebase.

There may be more "C programmers" by number but a Golang codebase is going to be more accessible to a wider pool of applicants.

weberc2 · on Sept 21, 2019

In my experience it takes a few days for a moderate programmer to come up to speed on Go, whereas it takes several months for C. You need to hire C programmers for a C position, you can hire any programmers for a Go position.

cozzyd · on Sept 21, 2019

If they don't already know C though, how well will they cope with manual memory management?

nicoburns · on Sept 22, 2019

How do people learn C without knowing about manual memory management? They learn about it as they learn the language. This can be done in any language that allows for manual memory management (and most have much better safeguards and documentation than C, which has a million ways to shoot yourself in the foot)

weberc2 · on Sept 22, 2019

It will be a learning curve, but a much, much smaller one than learning C.

kerkeslager · on Sept 23, 2019

But the entire point of this line of questioning is that there are more programmers who already know C.

kerkeslager · on Sept 21, 2019

He's wrong: https://www.tiobe.com/tiobe-index/

weberc2 · on Sept 21, 2019

You’re writing in a much improved C. Strong type system (including closures/interfaces/arrays/slices/maps), sane build tooling (including dead simple cross compilation), no null-terminated strings, solid standard library, portability, top notch parallelism/concurrency implementation, memory safety (with far fewer caveats, anyway), etc. Go has it’s own issues and C is still better for many things, but “Go with manually-triggered GC” is still far better than C for 99.9% of use cases.

bufo · on Sept 21, 2019

Go’s compiler is not at all optimized for generating fast floating point instructions like AVX and its very cumbersome to add any kind of intrinsics. This might not matter for light games but an issue when you want to simply switch to wide floating point operations to optimize some math.

weberc2 · on Sept 22, 2019

Yeah, C compilers optimize much more than Go compilers. Performance is C’s most noteworthy advantage over Go.

xyproto · on Sept 22, 2019

GCC can compile both C and Go. I searched for benchmarks but found none for GCC 9 that compares the performance of C and Go. Do you have any sources on this?

weberc2 · on Sept 23, 2019

I don’t have a source, but it’s common knowledge in the Go community. Not sure how GCC works, but it definitely produces slower binaries than gc (the standard Go compiler). There are probably some benchmarks where this is not the case, but the general rule is that gcc is slower. gc purposefully doesn’t optimize as aggressively in order to keep compile times low.

Personally I would love for a —release mode that had longer compile times in exchange for C-like performance, but I use Python by day (about 3 orders of magnitude slower than C) so I’d be happy to have speeds that were half as fast C. :)

xyproto · on Sept 22, 2019

Which compiler? The one from Google, GCC (gccgo) or TinyGo?

omaranto · on Sept 21, 2019

Does Go really let you use closures, arrays, slices and maps when you disable the garbage collector? If so, does that just leak memory?

weberc2 · on Sept 22, 2019

Yes, the idea is that you must invoke the GC when you’re not in a critical section. Alternatively you can just avoid allocations using arenas or similar. (You can use arrays and slices without the GC).

omaranto · on Sept 22, 2019

To make sure I understand, is this an accurate expansion of your comment?

Yes it would leak, to avoid leaking you could invoke the GC when you’re not in a critical section. Alternatively, if you don't use maps and instead structure all your data into arrays, slices and structs, you can just avoid allocations using arenas or similar. (You can use arrays and slices without the GC, but maps require it).

weberc2 · on Sept 22, 2019

Yes, that is correct. Anything that allocates on the heap requires GC or it will leak memory. Go doesn’t have formal semantics about what allocates on the heap and what allocates on the stack, but it’s more or less intuitive and the toolchain can tell you where your allocations are so you can optimize them away. If you’re putting effort into minimizing allocations, you can probably even leave the GC on and the pause times will likely be well under 1ms.

Insanity · on Sept 21, 2019

And not to forget that using Go correctly you'd end up doing mostly stack pushes and pops

z3t4 · on Sept 21, 2019

With JS the trick is to avoid creating new objects and instead have a pool of objects that are always referenced.

bcrescimanno · on Sept 21, 2019

Definitely! Object pools are common in game dev too from what I know. We used them extensively to reduce the amount of GC we needed.

eutropia · on Sept 21, 2019

Speedrunners thank you for reusing objects! I'm certain that decisions like this are what lead to interesting teleportation techniques and item duplications. Games wouldn't be the same without these fun Easter eggs!

tomxor · on Sept 21, 2019

Once I wrote a very small vector library in JS for this very reason: almost all JS vector libraries out there tend to dynamically create new vectors for every vector binary operation, this makes JS GC go nuts. It's also prohibitively expensive to dynamically instantiate typedarray based vectors on the fly, even though they are generally faster to operate on... most people focus on fixing the latter in order to be able to use typedarrays by creating vector pools (often as part of the same library), but this creates a not-insignificant overhead.

Instead my miniature library obviated pools by simply having binary operators operate directly on one of the vector objects passed to it, if more than one vector was required for the operation internally they would be "statically allocated" by defining them in the function definition's context (some variants i would also return one of these internal vectors - which was only safe to use until a subsequent call of the same operator!).

The result this had on the calling code looked quite out of place for JS, because you would effectively end up doing a bunch of static memory allocation by assigning a bunch of persistent vectors for each function in it's definition context, and then you would often need to explicitly reinitialize the vectors if they were expected to be zero.

... it was however super fast and always smooth - I wish it was possible to turn the GC off in cases like this when you know it's not necessary. It was more of a toy as a library, but i did write some small production simulations with it - i'm not sure how well the method would extend to comprehensive vector and matrix libraries, I think the main problem is that most users would not be willing to use a vector library this way, because they want to focus on the math and not have to think about memory.

RobertKerans · on Sept 21, 2019

You wouldn't have this still up somewhere like GH would you? I'm currently writing a toy ECS implementation and have somewhat similar needs, and I've been trying to build up a reference library of sorts covering novel ways of dealing with these kind of JS issues

tomxor · on Sept 21, 2019

No sorry, this is from more than 5 years ago and it was never FOSS, but I only implemented rudimentary operators anyway, you could easily adapt existing libraries or write your own, the above concept is more valuable than any specific implementation details... the core concept being, never implicitly generate objects, e.g operate directly on parameter objects, or re-use them for return value, or return persistent internal objects (potentially dangerously since they will continue to be referenced and used internally).

All of these ideas require more care when using the library though.

RobertKerans · on Sept 22, 2019

Thank you though, what you've written here is very useful -- you're describing things I'm immediately recognising in what I'm doing. As I say, it's just a small toy (and I think one that would definitely be easier in a different language, but anyway...). At the minute I'm actually at the point where I'm preallocating and persisting a set of internal objects, and at a very very small scale it's ok, but each exploration in structure starts to become manageable pretty quickly.

meheleventyone · on Sept 22, 2019

Three.js implements their math this way.

https://github.com/mrdoob/three.js/blob/master/src/math/Vect...

You can see most operations act on the Vector and there are some shared temporary variables that have been preallocated. If you look through some of the other parts you can see closures used to capture pre-allocated temporaries per function as well.

RobertKerans · on Sept 22, 2019

Ah brilliant, that's definitely useful. Thank you for the pointer

rienbdj · on Sept 21, 2019

This kind of place orientated programming can make the actual algorithm very hard to follow. I really hope that JS gets custom value types in the future.

chrisseaton · on Sept 21, 2019

But then that damages performance because your objects are always globally visible rather than being able to be optimised away.

weberc2 · on Sept 21, 2019

Go’s GC is low latency and it allows you to explicitly trigger a collection as well as prevent the collector from running for a time. I would wager that the developer time/frustration spent taming the GC would be more than made up for by the dramatic improvement in ergonomics everywhere else. Of course, the game dev libraries for Go would have to catch up before the comparison is valid.

tomxor · on Sept 21, 2019

Why is this downvoted? is it factually wrong? (I don't know Go so I'm asking)

weberc2 · on Sept 22, 2019

It’s factually correct. My other post got downvoted for pointing out that not every GC is optimized for throughout. Seems like I’m touching on some cherished beliefs, but not sure exactly which ones.