That'll certainly help but you'd still have issues with any connected devices (USB, network API calls, etc).
I don't think it is a solvable problem though. There is no way to know ahead of time which usage is safe and which is destructive because of the variety of UNIX-like configurations out there, the variety of coreutils and the variety of optional flags. Not to mention all the 3rd party tools out there. And some seemingly destructive usages are vital for safe usage (eg tmp files). Worse still, by the time you build a safe container you'll have spent more time spinning up a sandbox than you'd have spent using the CLI preview tools anyway (yeah, ZFS could help here but few people run ZFS on their dev machines and you still haven't solved mocking APIs for connected devices).
I just can't see how this can be achieved safely without completely redesigning how SYSCALLs work from the ground up to follow a method more akin to iOS or Firefox's on-demand permissions. At which point you've gone well beyond the realm of "yak shaving".
But my point is even the safest of coreutils will make dozens of SYSCALLs. The moment you start pipelining commands there will be several writes since that's how pipelining works and if you break even one of those writes the entire pipeline fails and the usefulness of this pipeline preview utility falls flat.
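To make the "pipelining is itself a write" point concrete, here is a minimal Python sketch. The function name `pass_through_pipe` is made up for illustration, and it assumes Python 3.8+ and a payload small enough to fit in the kernel's pipe buffer. All it shows is that handing data from one stage to the next goes through a write on a pipe file descriptor, so a blanket "deny all writes" rule would stop even a harmless pipeline.

```python
import os


def pass_through_pipe(data: bytes) -> bytes:
    """Send data through an anonymous pipe, as one pipeline stage would to the next."""
    read_fd, write_fd = os.pipe()
    try:
        # The "upstream" stage: this is a write(2) on a pipe file descriptor.
        # Assumes a small payload so the write does not block on a full pipe.
        os.write(write_fd, data)
    finally:
        # Closing the write end lets the reader see end-of-file.
        os.close(write_fd)

    chunks = []
    try:
        # The "downstream" stage: read until end-of-file.
        while chunk := os.read(read_fd, 4096):
            chunks.append(chunk)
    finally:
        os.close(read_fd)
    return b"".join(chunks)


if __name__ == "__main__":
    print(pass_through_pipe(b"one line of pipeline data\n"))
```

So any sandbox has to tell writes like this one apart from writes that touch real state, which is exactly the hard part.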
There isn't a way you can make this utility safe and useful.
You can make it safe, but then you're drifting into a whole new field of computing with regards to analysing suspicious binaries. Unfortunately you then don't have reliable, accurate, real-time pipeline previews. Or you can have the real-time previews, but then you have to accept there is some risk involved. You can't have it both real time and risk free.
This is why murex took the approach of having a safe list of trusted executables. It doesn't remove the risk but it at least reduces the risk to a subset of commands that are typically read only. However even that is far from a perfect solution.
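For anyone curious what a safe-list check could look like, here is a rough Python sketch. To be clear, this is not murex's code or its actual list: `SAFE_COMMANDS` and `is_previewable` are hypothetical names, the list contents are just an example, and it assumes Python 3.8+ for the `shlex` behaviour used. It rejects anything it cannot confidently classify (redirections, `;`, `&&`, command substitution, unknown commands).

```python
import shlex

# Hypothetical allowlist for illustration only; not murex's real list.
SAFE_COMMANDS = {"cat", "grep", "head", "tail", "sort", "uniq", "wc", "cut", "tr"}

# Characters shlex groups into operator tokens when punctuation_chars=True.
OPERATOR_CHARS = "();<>|&"


def is_previewable(pipeline: str) -> bool:
    """Return True only if every stage of a '|' pipeline starts with an allowlisted command."""
    if "\n" in pipeline:
        return False  # a newline would start a second command in a real shell
    lexer = shlex.shlex(pipeline, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    expect_command = True  # the next word token must be a command name
    seen_command = False
    try:
        for token in lexer:
            if token == "|":
                if expect_command:
                    return False  # empty stage, e.g. "| grep x"
                expect_command = True
            elif token and all(c in OPERATOR_CHARS for c in token):
                return False  # redirection, ';', '&&', '||', subshell, ...
            elif "`" in token or "$" in token:
                return False  # possible command substitution or expansion
            elif expect_command:
                if token not in SAFE_COMMANDS:
                    return False
                expect_command = False
                seen_command = True
    except ValueError:
        return False  # unbalanced quotes
    return seen_command and not expect_command


if __name__ == "__main__":
    print(is_previewable("cat access.log | grep 404 | sort | uniq -c"))
    print(is_previewable("cat access.log | rm -rf /tmp/x"))
    print(is_previewable("sort -o out.txt in.txt"))
```

Note the last example: `sort -o out.txt in.txt` passes the check even though `-o` writes a file. That is the "variety of optional flags" problem in miniature, and why a safe list only reduces the risk rather than removing it.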
Thinking about it more, the kind of guarantees you want are the kind given by a VM for running untrusted binaries. The only difference is that you don't mind read-access to the host system.
Yes, there will be certain things that involve system-calls or network activity that your previewer will not accurately show. But those cases fall far outside the goal of "live preview of text output of data processing pipelines".
How do you differentiate between a write which is safe and a write which isn't? File system location gives a hint but it's no guarantee. Likewise for network connections: SELinux couldn't tell an unsafe HTTPS POST request apart from a side-effect-free GET request.
And the amount of SELinux tuning you'd need to cover every single executable in every single configuration available on an SELinux-enabled machine (and even then, you're limiting your tool to only work on systems with SELinux) is monumental, manual and inevitably error prone.
This thread was originally about using a container / overlay FS that has read access to the underlying filesystem, and writes happen into a temporary overlay.
Network connections and certain system-calls might still be scary. However, VM / container systems that are supposed to defend against hostile hosted machines should also be safe here. The only difference is that here information leaking from the host into the VM is the point, instead of dangerous.
No, this thread was originally about a version of said tool that I built and the dilemma I had about making it safe.
A VM would be too slow to spin up to be practical for a live preview. A container would be too host specific. And neither solution fully solves the problem because all you’re addressing is file system access and, as mentioned several times already, that’s only scratching the surface of the interactions a command line tool can make.
My shell, murex, is designed around making networking access as seamless as writing to a local file. Bash does this too with its /dev/tcp/$host/$port and /dev/udp/$host/$port pseudo file system. So you can’t just dismiss non-local changes as irrelevant.
You're right, it wouldn't be perfect. I was being a bit facetious :) A perfect solution is never really possible and in this case, if you can't have pretty good guarantees of getting it right, probably best not to try. The solution here might just be to be very careful when using such a tool.