That'll certainly help but you'd still have issues with any connected devices (U...

rocqua · on March 31, 2021

Network API calls seem like a scary one. But for things like USB and sys-calls, I'd suggest a "block and log" method.

Like "running this command shows this output (with some extra error messages) and tried to delete these files and make these system calls".

hnlmorg · on March 31, 2021

But my point is even the safest of coreutils will make dozens of SYSCALLs. The moment you start pipelining commands there will be several writes since that's how pipelining works and if you break even one of those writes the entire pipeline fails and the usefulness of this pipeline preview utility falls flat.
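
To make the "that's how pipelining works" point concrete, here is a minimal Python sketch (POSIX assumed; the helper name `pass_through_pipe` is made up for illustration). It shows that moving data from one stage of a pipeline to the next is itself a write(2) on a pipe, so a blanket "block every write" policy would also block the pipeline's own plumbing.

```python
import os


def pass_through_pipe(payload: bytes) -> bytes:
    """Send bytes through an anonymous pipe, as one pipeline stage would."""
    read_fd, write_fd = os.pipe()
    try:
        # The upstream stage's stdout is just this file descriptor;
        # emitting output means calling write(2) on it.
        os.write(write_fd, payload)
    finally:
        # Closing the write end signals EOF to the downstream stage.
        os.close(write_fd)
    try:
        # The downstream stage reads the same bytes from its stdin.
        return os.read(read_fd, len(payload) + 1)
    finally:
        os.close(read_fd)


if __name__ == "__main__":
    print(pass_through_pipe(b"b\na\n"))
```

A sandbox therefore has to tell writes to pipes apart from writes to files, devices and sockets, which is where the hard part starts.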

There isn't a way you can make this utility safe and useful.

You can make it safe, but then you're drifting into a whole new field of computing with regards to analysing suspicious binaries. Unfortunately you then don't have reliable, accurate, real time pipeline previews. Or you can have the real time previews but you then have to accept there is some risk involved. But you can't have it both real time and risk free.

This is why murex took the approach of having a safe list of trusted executables. It doesn't remove the risk but it at least reduces the risk to a subset of commands that are typically read only. However even that is far from a perfect solution.
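
As a rough illustration of the safe list idea (this is not murex's actual implementation; the command set and the function names `split_pipeline` and `is_previewable` are hypothetical), a previewer could refuse to run any pipeline containing a command it doesn't recognise:

```python
import shlex

# Hypothetical safe list: commands that are typically read only.
# Even these can write in some configurations (e.g. `sort -o file`),
# which is why this reduces the risk rather than removing it.
SAFE_COMMANDS = {"cat", "grep", "head", "tail", "sort", "uniq", "wc", "cut", "tr"}

SHELL_PUNCTUATION = set("();<>|&")


def split_pipeline(pipeline: str) -> list[list[str]]:
    """Split a command line into stages on `|`.

    Raises ValueError on any other shell operator (redirects, `;`, `&&`, ...)
    or on unbalanced quotes. Deliberately conservative: a quoted operator
    such as '>' is also rejected.
    """
    lexer = shlex.shlex(pipeline, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    stages: list[list[str]] = [[]]
    for token in lexer:
        if token == "|":
            stages.append([])
        elif set(token) <= SHELL_PUNCTUATION:
            raise ValueError(f"unsupported operator: {token}")
        else:
            stages[-1].append(token)
    return stages


def is_previewable(pipeline: str) -> bool:
    """True only if every stage starts with a command on the safe list."""
    try:
        stages = split_pipeline(pipeline)
    except ValueError:
        return False
    return all(stage and stage[0] in SAFE_COMMANDS for stage in stages)


if __name__ == "__main__":
    print(is_previewable("grep foo file.txt | sort | uniq -c"))
    print(is_previewable("cat file.txt | rm -rf build"))
```

Anything the check rejects would simply not get a live preview until the user explicitly runs it.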

rocqua · on March 31, 2021

Thinking about it more, the kind of guarantees you want are the kind given by a VM for running untrusted binaries. The only difference is that you don't mind read-access to the host system.

Yes, there will be certain things that involve system-calls or network activity that your previewer will not accurately show. But those cases fall far outside the goal of "live preview of text output of data processing pipelines".

hnlmorg · on March 31, 2021

No they don’t. Believe it or not a lot of people make network API calls from the command line.

I know this because I’m one of them. I’ve even published guides on how to make said API calls and parse them, all as one command pipeline.

michaelmior · on March 31, 2021

Sounds like SELinux might do the trick here :)

hnlmorg · on March 31, 2021

How do you differentiate between a write which is safe and a write which isn't? File system location gives a hint but it's no guarantee. Likewise for network connections, SELinux couldn't protect against an unsafe HTTPS POST request vs side-effect free GET request.

And the amount of SELinux tuning you'd need to cover every single executable in every single configuration available on an SELinux-enabled machine (and even then, you're limiting your tool to only work on systems with SELinux) is monumental, manual, and inevitably error prone.

There isn't a practical way of making this safe.

rocqua · on March 31, 2021

This thread was originally about using a container / overlay FS that has read access to the underlying filesystem, and writes happen into a temporary overlay.

Network connections and certain system-calls might still be scary. However, VM / container systems that are supposed to defend against hostile hosted machines should also be safe here. The only difference is that here, information leaking from the host into the VM is the point rather than a danger.

hnlmorg · on March 31, 2021

No, this thread was originally about a version of said tool that I built and the dilemma I had about making it safe.

A VM would be too slow to spin up to be practical for a live preview. A container would be too host specific. And neither solution fully solves the problem because all you’re addressing is file system access and, as mentioned several times already, that’s only scratching the surface of the interactions a command line tool can make.

My shell, murex, is designed around making networking access as seamless as writing to a local file. Bash does this too with its /dev/$protocol/$host/$port pseudo file system (e.g. /dev/tcp or /dev/udp). So you can’t just dismiss non-local changes as irrelevant.

michaelmior · on March 31, 2021

You're right, it wouldn't be perfect. I was being a bit facetious :) A perfect solution is never really possible and in this case, if you can't have pretty good guarantees of getting it right, probably best not to try. The solution here might just be to be very careful when using such a tool.