thingfish · 2025-04-28T10:35:32

OpenBSD has pkg_add and friends. Which I can't release yet. They need Perl which I can't release yet either. Although stable already, there still is one vital thing I need implemented in MinC.

thingfish · 2025-04-28T10:00:48

I wrote an entire 3-year (Dutch) course around it [0]. My predecessor started his course by making them do endless exercises with user rights and vim, which is very boring for 16-year-olds. I start them on the Apache web server. This generates immediate interest, and at the same time tricks them into learning about user rights and using vim.

[0] http://commandlinerevolution.nl/Huiswerk/

thingfish · 2025-04-28T09:29:04

My choice of this particular IRC client (not naming it) came from the Mr. Robot series (S04E04 at 59:38:00). I am teaching 16-year-olds. They like that kind of thing.

thingfish · on April 24, 2025

Cool. You found out why MinC is not Cygwin. To fork(), Cygwin starts a second copy of the entire executable. CreateThread() is much closer to what UNIX does. Any thoughts? BTW, there's another "easteregg".

cryptonector · on April 24, 2025

Not sure I'll look for the other "easteregg", though I noticed that you set the new thread's impersonation token, which is presumably how you implement setuid() and friends [yes, that is how you implement them] -- is that the other easteregg? I also see that you use the last RID of SIDs as the UID/GID, and that you have hardcoded passwd(4)/group(4) entries, so you only support local users (which is fine).
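To make the "last RID of the SID as UID/GID" idea concrete, here is a minimal sketch in C that works on a SID in its string form. The helper name `rid_from_sid_string` is hypothetical and this is not MinC's actual code; it only illustrates the mapping described above.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (an assumption, not taken from MinC): extract the
 * last sub-authority (the RID) from a SID string such as
 * "S-1-5-21-...-1001" so it can serve as a Unix-style UID or GID.
 * Returns 0 on success and stores the RID in *rid, -1 on malformed input. */
int rid_from_sid_string(const char *sid, unsigned long *rid)
{
    const char *dash;
    char *end;
    unsigned long value;

    if (sid == NULL || rid == NULL || strncmp(sid, "S-1-", 4) != 0)
        return -1;
    dash = strrchr(sid, '-');
    /* Require at least one sub-authority after "S-1-<authority>". */
    if (dash <= sid + 3 || !isdigit((unsigned char)dash[1]))
        return -1;
    errno = 0;
    value = strtoul(dash + 1, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;
    *rid = value;
    return 0;
}
```

Because only the RID is kept, two accounts from different domains that share a RID would map to the same ID, which fits the local-users-only observation above.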

Oh, ah, _task_copy(). You swap stacks... I think because of course.

Your execve() is real though, since you use CreateProcessAsUser(), which is good.

The compromise you made is pretty neat, though obviously some things might not work well. Though I'm guessing this is working quite well for you.

Your file names are weird; they often tell me nothing about what to expect to find in them.

I looked for fork() because that's the difficult one to implement on WIN32. What Cygwin does is some crazy gymnastics that sometimes fails. You might know that I think fork() is "evil" [0], or if you didn't, you do now, and I guess you're likely to agree with that take, but I'd love to hear your thoughts on that, because clearly you've thought about fork().

I've to say that you've clearly thought a lot about this, and though I've not tried it (and probably won't) you seem to have done an excellent job.

[0] https://gist.github.com/nicowilliams/a8a07b0fc75df05f684c23c...

thingfish · on April 25, 2025

> I've to say that you've clearly thought a lot about this, and though I've not tried it (and probably won't) you seem to have done an excellent job.

Thank you. I needed that. I started the project in 2016 and have been obsessed with it since then. The weird thing is that I never had to use workarounds or hacks that are mentioned by the Cygwin team [0]. For instance, the select() call does map cleanly on top of the Win32 API. I just needed to use WSAWaitForMultipleEvents() instead of WaitForMultipleObjects() (the other "easteregg"). Why the Cygwin people didn't figure this out baffles me. I guess their current code base doesn't allow the rewrite. My big breakthrough was when I realized that the "inconsistent interfaces" [1] in Win32 file handles can be implemented as virtual file systems. One for each handle type (char, disk, pipe, etc.). That was my "throwing away 1000 lines of code" [2] moment.

As to the weird file names, I use the file names OpenBSD uses. My rule is to always use the file name of the header (.h) file where the system call is declared in OpenBSD. I also use their struct and constant names, prefixed with "WIN_".

The "fork is evil" thing is discussed a lot in the programming community. I myself find it quite clever. Threads are highly volatile and are very hard to program without running into race conditions. The solution is to make a copy of everything the child will be using: duplicate file descriptors, the stack, globals (rss). The kernel does all this for you in one system call. I often wonder how the people who complain about the absence of real concurrency in their programming languages [3] would actually use this feature. In my opinion the best way to use concurrency is to string individual programs into a pipeline. This will never go "evil" on you.
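As an illustration of the pipeline style of concurrency, here is a minimal sketch in C of what a shell does for `producer | consumer`. The names `start_stage` and `run_pipeline` are hypothetical, the stages are run through `/bin/sh -c` for brevity, and error handling is kept deliberately thin.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start `sh -c cmd` with stdin/stdout optionally
 * redirected (-1 means "inherit"). Returns the child's pid, or -1. */
static pid_t start_stage(const char *cmd, int in_fd, int out_fd,
                         const int *to_close, int n_close)
{
    pid_t pid = fork();
    int i;

    if (pid != 0)
        return pid;                 /* parent, or -1 if fork() failed */
    if (in_fd >= 0)
        dup2(in_fd, STDIN_FILENO);
    if (out_fd >= 0)
        dup2(out_fd, STDOUT_FILENO);
    for (i = 0; i < n_close; i++)
        close(to_close[i]);         /* drop the original pipe ends */
    execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
    _exit(127);                     /* only reached if execl() failed */
}

/* Run "producer | consumer" and capture the consumer's stdout in buf.
 * Returns the number of bytes captured, or -1 on error. */
ssize_t run_pipeline(const char *producer, const char *consumer,
                     char *buf, size_t len)
{
    int fds[4];     /* fds[0..1]: link between stages, fds[2..3]: capture */
    pid_t p1, p2;
    ssize_t n;
    size_t total = 0;

    if (pipe(fds) < 0)
        return -1;
    if (pipe(fds + 2) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    p1 = start_stage(producer, -1, fds[1], fds, 4);
    p2 = start_stage(consumer, fds[0], fds[3], fds, 4);
    close(fds[0]);
    close(fds[1]);
    close(fds[3]);

    while (total < len && (n = read(fds[2], buf + total, len - total)) > 0)
        total += (size_t)n;
    close(fds[2]);

    if (p1 > 0)
        waitpid(p1, NULL, 0);
    if (p2 > 0)
        waitpid(p2, NULL, 0);
    return (p1 > 0 && p2 > 0) ? (ssize_t)total : -1;
}
```

The two stages share nothing except the pipe, so there is no shared state to race on; the kernel does the synchronisation through blocking reads and writes.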

[0] https://cygwin.com/cygwin-ug-net/highlights.html

[1] https://www.usenix.org/legacy/publications/library/proceedin...

[2] https://skeptics.stackexchange.com/questions/43800/did-the-c...

[3] https://news.ycombinator.com/item?id=32408577

cryptonector · on April 25, 2025

> Thank you. I needed that. I started the project in 2016 and have been obsessed with it since then.

Well, like I said, I'm impressed.

> I myself find [fork()] quite clever.

Oh for sure it is clever. Though vfork() would have been more clever. The thing that fork() did that was very nice is make it really easy to spawn processes in a shell, which meant not having to design a spawn() system call (such APIs are invariably large), which greatly simplified Unix development in the 70s, both kernel-land and user-land. vfork() would have been more clever, but that didn't occur to Ritchie, Thompson, et al. I wonder how things would have gone if they had thought of vfork().

thingfish · on April 26, 2025

Wait, I sense some genuine concern here. In fact, you tricked me into learning, which is what I also do with my students. I've read the article you provided in your earlier comment, plus the one from Microsoft [0].

They expose a dark secret behind the fork() call and I felt it too when implementing it myself. Almost gave me a heart failure. So here's my simplified take: what if the parent did a malloc() and put the resulting pointer on the stack (which is what git does, BTW). A simple copy of the stack to the child wouldn't be sufficient. The kernel then would have to follow the pointer, malloc() new memory and copy the data. It's not hard to see where this is going. What if there is malloc()ed memory in this copy? It's madness. I suspect this could bring a whole system to its knees.

This is a problem a kernel should not try to solve. Only user space knows about its application of memory. While reading the source code of the software I include in MinC, I always thought this is bad programming, perpetuated by Torvalds' #1 rule "don't mess with user-space", leading to things like copy-on-write.

This all leads me to believe I can get away with doing flat copies of stack and globals during fork(), implement spawn(), never implement copy-on-write and patch the userland code if needed. Am I right?

[0] https://www.microsoft.com/en-us/research/wp-content/uploads/...

cryptonector · 2025-04-26T22:56:00

> Wait, I sense some genuine concern here. In fact, you tricked me into learning, which is what I also do with my students.

Haha, well, tricking you was not my intent. I'm glad I did though!

> The kernel then would have to follow [...]

The kernel doesn't have to do anything like that because the kernel doesn't know about user-space allocators. The kernel only knows about the pages used in constructing the user-land address space for the process. What you're really getting at is "fork-safety" (like thread safety, but for fork()).
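A small sketch in C of why no pointer chasing is needed with a real fork() on a Unix system: the child gets the same virtual addresses backed by its own copy of the pages, so a stack-held pointer to malloc()ed memory stays valid. The function name `heap_survives_fork` is hypothetical.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the child could read the parent's heap data through the
 * inherited pointer and the child's write did not leak back into the
 * parent; -1 otherwise. */
int heap_survives_fork(void)
{
    int status, result = -1;
    char *msg = malloc(8);      /* pointer on the stack, data on the heap */
    pid_t pid;

    if (msg == NULL)
        return -1;
    strcpy(msg, "parent");

    pid = fork();
    if (pid == 0) {
        /* Same virtual address, private copy of the page. */
        int ok = (strcmp(msg, "parent") == 0);
        msg[0] = 'P';           /* changes the child's copy only */
        _exit(ok ? 0 : 1);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        strcmp(msg, "parent") == 0)
        result = 0;
    free(msg);
    return result;
}
```

A fork() that keeps parent and child in one address space cannot give this guarantee for the heap, which is where the fork-safety concern comes from.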

Whether using fork() or vfork(), in principle the child[0] process is only permitted to use async-signal-safe functions, which are typically the system calls needed to do everything up to an execve() (which is also safe).

In practice however, many of us know how to write multi-process daemons that do very much use async-signal-UNsafe functions on both sides of the fork(), and it's OK if you know what you're doing, and if it's a _real_ fork(). If it's more like a combination of threads and vfork() then it's not safe at all to use async-signal-UNsafe functions on the child side!

And malloc() (and free()!) is absolutely NOT async-signal-safe! Which is what you noticed in thinking about this.
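For reference, a minimal sketch in C of the "by the book" pattern, where the child side does nothing but async-signal-safe calls before execve(). The helper name `run_and_wait` is hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run `path` with `argv` and return its exit status.
 * Returns 127 if execve() failed in the child (shell convention) and -1
 * if fork()/waitpid() failed or the child did not exit normally. */
int run_and_wait(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child side: async-signal-safe calls only. No malloc(), no
         * printf(), and _exit() rather than exit(), which would run
         * atexit() handlers and flush stdio buffers. */
        execve(path, argv, environ);
        _exit(127);             /* only reached if execve() failed */
    }
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Code written this way should behave the same whether the underlying fork() copies the address space or not, since the child never touches allocator or stdio state.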

So a fork() that creates a new thread but not a new address space, and which swaps the stack back and forth as each of the parent or child execute, is NOT safe to use with async-signal-UNsafe functions on the child side of fork().

So your fork() implementation, if I understood what it does, is probably only safe for a certain class of programs that happen to be using fork() exactly as the fork(2) man page says.

So you might need to patch some fork()-using OpenBSD programs to function correctly in MinC. And any other fork()-using programs one might want to use under MinC may also need to be patched.

Programs using posix_spawn() will be OK _if_ OpenBSD's implementation uses vfork() and the MinC kernel implements vfork().
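A minimal sketch in C of the posix_spawn() style, for comparison. The helper name `spawn_and_wait` is hypothetical, and nothing here assumes how any particular libc implements posix_spawn() internally.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: spawn `path` with `argv`, wait for it, and return
 * its exit status, or -1 on any failure. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid;
    int status;

    /* No file actions and no spawn attributes: the child inherits the
     * caller's descriptors and signal setup. */
    if (posix_spawn(&pid, path, NULL, NULL, argv, environ) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The caller never runs any code of its own between process creation and exec, so the fork-safety question does not arise at this level.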

With vfork() the danger of using anything other than async-signal-safe functions on the child-side is so much clearer that it is paradoxically and in my opinion safer than fork().

Although I called fork() "evil", I use it lots in my code. I've written many versions of daemon(3) that have the parent exit only when the child signals that it is ready (this is to avoid race conditions in multi-service systems and testing). I've written multi-processed daemons that do use async-signal-UNsafe functions on both sides of fork(). But I don't really condone that :cry-laugh:. One has to be quite aware of the dangers, and understand them, in order to use fork() like that.
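The "parent exits only when the child signals that it is ready" idea can be sketched in C with a pipe. This is an illustration under assumptions, not the code from any of those daemons: `fork_and_wait_ready`, `setup_fn` and `example_setup` are hypothetical names, and the child exits immediately where a real daemon would enter its service loop.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical setup hook: a real daemon would bind sockets, open logs,
 * and so on. Returns 0 when the service is ready. */
typedef int (*setup_fn)(void);

int example_setup(void)
{
    return 0;                   /* placeholder: nothing to set up */
}

/* Fork a child, let it run setup(), and block until the child reports the
 * outcome over a pipe. Returns 0 if the child reported ready, -1 otherwise.
 * A daemon(3)-style wrapper would have the parent _exit(0) after a 0. */
int fork_and_wait_ready(setup_fn setup, pid_t *child)
{
    int fds[2];
    char ok = 0;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        ok = (char)(setup() == 0);
        if (write(fds[1], &ok, 1) != 1)     /* report readiness or failure */
            _exit(1);
        close(fds[1]);
        /* A real daemon would enter its service loop here. */
        _exit(ok ? 0 : 1);
    }
    close(fds[1]);
    do {
        n = read(fds[0], &ok, 1);   /* blocks until the child writes or dies */
    } while (n < 0 && errno == EINTR);
    close(fds[0]);
    *child = pid;
    return (n == 1 && ok) ? 0 : -1;
}
```

If the child dies before reporting, the parent sees end-of-file on the pipe and returns -1, so a caller never concludes the service is up when it is not.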

BTW, I think it would be interesting to have a new try at implementing fork() in WIN32. I wonder if one can create a copy of the parent's address space in the child w/o having to use any of the LoadLibrary*() functions to load DLLs, thus avoiding the ASLR issues for example. I imagine that it must be possible, but also that it must be very tricky. You can see that abandoning fork() for vfork() and spawn-type APIs would be best for running Unix software on Windows...

[1] is an implementation of daemonization that spawns a child instead of fork()-and-continue. That has an option to exec on the child-side to make it possible to test on Unix logic that otherwise would only be tested on Windows. One could use the same approach to build multi-processed servers, where you'd spawn each child rather than fork() each child -- i.e., vfork() then execve() with a special command-line option or env var to indicate "you are a worker process". OpenSSH's sshd nowadays always execs on the child-side of fork().

[0] daemon(3) inherently violates the requirement that the child-side of fork() not use async-signal-UNsafe functions, but this is OK because the real [but unstated] requirement is that only one of the parent or child may use async-signal-UNsafe functions.

[1] https://github.com/heimdal/heimdal/blob/master/lib/roken/det...

thingfish · on April 24, 2025

I was a limper too, until I saw this demonstration by the creator of PowerShell from 2004: https://www.youtube.com/watch?v=4mBRA7pqITM

thingfish · on April 24, 2025

I understand your point, but I'm intrigued. Could you elaborate with some kind of example why it is disappointing that people don't date their web-pages? Is this a generic problem or is it with specific web-sites only?

dredmorbius · on April 24, 2025

Time matters.

A recently-released project may be intriguing but something risk-averse entities (individuals/organisations) might prefer to hold back on for fear of, let's call it "infant death syndrome".

An old project with no active development is generally perceived as "dead", with risks that security- or bug-fixes could remain unaddressed for long times, or that there may be current zero-day exploits possible.

An old project with a healthy activity stream counters both points: the project has exhibited staying power and addresses ongoing maintenance concerns. Even active projects might give caution (say: feature creep or enshittification), but that's beyond the scope of merely giving initial / latest activity timestamps.

NB: I've well over 30 years of professional IT experience in shops ranging from small operations to multi-billion-dollar firms. Advocating for, or against, various technologies and solutions is a large part of that role. It's also something that carries into my choices for my own personal systems.

I'd be unlikely to do much with MinC myself as I don't use MS Windows, though it fits in with a long tradition of similar tools I have used, often with fond memories, including Cygwin, David Korn's UWIN, MKS Toolkit (Mortice Kern Systems, licenced by Microsoft for early versions of SFU, precursor to WSL, I've just learned), VMware, Xen, QEMU, VirtualBox, Parallels, etc. These have different architectures but all basically address the problem of "run programs from OS X on OS Y", which turns out to be a fairly-frequently-encountered challenge. I might well recommend MinC to those with such needs.

thingfish · on April 23, 2025

Ah! You got it. Are you a teacher? You should try it.

skrebbel · on April 23, 2025

Nope, not at all. In fact I'm a Windows user who doesn't actually know how to pipe things into sed. :-)

thingfish · on April 23, 2025

I don't mind. Many of the comments are about my incorrect use of the term Linux. In a way, they are right. But if one wants to teach Automotive Technology to future BMW mechanics, one starts them working on a VW Beetle. Wax on, wax off.

kergonath · on April 23, 2025

> I don't mind.

That’s good. It’s an interesting project and the discussion here is overall very interesting. Thanks for posting it.

oblio · on April 23, 2025

You're wrong, it should be BMW Beetle.

:-p

thingfish · on April 23, 2025

You hit an important point here. Teaching IT to 16-year-old children is very much like being an admin for a large corporation. Most of them have installed VMware, VirtualBox, Hyper-V and pfSense, depending on the preference of the other teachers. Most first-graders don't know anything about networking, so I didn't want to add WSL to that. I used Cygwin. It worked perfectly until I started to teach Sendmail. Sendmail runs in unprivileged mode. This means that it starts as root, but then switches to a user with very few rights. Cygwin couldn't handle that at first, but I got it working. But then students couldn't uninstall Cygwin anymore because of changed file ownership in Windows. I had to install a second Cygwin to uninstall the first. With MinC I took extreme care to get all the different models of Access Control Lists that are used in Windows right.
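For readers unfamiliar with the "starts as root, then switches to a user with very few rights" pattern, here is a minimal sketch in C of the usual Unix sequence. The helper name `drop_privileges` is hypothetical, and this is neither Sendmail's nor MinC's actual code.

```c
#define _DEFAULT_SOURCE         /* for setgroups() on glibc */
#include <grp.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: permanently switch to the given uid/gid.
 * Returns 0 on success, -1 on failure. */
int drop_privileges(uid_t uid, gid_t gid)
{
    /* Only root may change the supplementary group list. */
    if (geteuid() == 0 && setgroups(1, &gid) < 0)
        return -1;
    /* Group first: once the uid is dropped, changing the gid may no
     * longer be permitted. */
    if (setgid(gid) < 0)
        return -1;
    if (setuid(uid) < 0)
        return -1;
    /* Sanity check: regaining root must now fail. */
    if (uid != 0 && setuid(0) == 0)
        return -1;
    return 0;
}
```

An emulation layer has to map each of these calls onto the host's security model, which is where the file ownership and ACL details described above come into play.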

hughw · on April 23, 2025

wait, first graders?!?

nottorp · on April 24, 2025

He said 16 year old. I was installing Slackware from floppies at 16 so why wouldn't modern kids play with VMs?