I just can't find the words to describe how happy I was to receive so much feedback and understanding from you all. This community is truly wonderful <3
I'll work on some of your (legal) ideas in my spare time, which is not much, but I work fast! So far I've designed a generalized edit distance function with some pretty cool properties that would make it useful for running an organization under hostile conditions:
$ cat message1.txt
Liberal anti-fascism is a reactionary idea. Anti-fascism is not practical without being anti-capitalist.
$ cat message2.txt
The anti-fascism of liberals is not a progressive idea. It is impractical with no anti-capitalism.
$ cat leak.txt
Liberals' anti-fascism is not useful without anti-capitalism, not a progressive idea.
$ ./trace.py leak.txt message*.txt
Delta | File name
----------------------
19 | message1.txt
8 | message2.txt
Predicted origin of 'leak.txt' is 'message2.txt'.
So far it works with every sufficiently long text I've thrown at it. Make two versions of any text, rewrite either of the two into a third file, and the algorithm will trace the rewrite back to the version it came from. Also the math is pretty elegant!
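The general shape of the idea (score every candidate against the leak, then pick the smallest delta) can be sketched with a plain word-level Levenshtein distance. This is only a stand-in and not the generalized edit distance described above, so its deltas will not match the ones in the transcript. The names `tokens`, `edit_distance`, and `predict_origin` are made up for illustration.

```python
import re


def tokens(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def edit_distance(a, b):
    """Classic Levenshtein distance between two token sequences.

    Assumption: unit cost for insert, delete, and substitute. The real
    tool presumably uses a more general cost model.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def predict_origin(leak, candidates):
    """Return (name of the closest candidate, all deltas keyed by name)."""
    leak_tokens = tokens(leak)
    deltas = {name: edit_distance(leak_tokens, tokens(text))
              for name, text in candidates.items()}
    return min(deltas, key=deltas.get), deltas


if __name__ == "__main__":
    candidates = {
        "a.txt": "The quick brown fox leaps.",
        "b.txt": "A fast dark wolf runs.",
    }
    origin, deltas = predict_origin("the quick brown fox jumps", candidates)
    for name, delta in deltas.items():
        print(f"{delta:5d} | {name}")
    print(f"Predicted origin is '{origin}'.")
```

Plain word-level distance is sensitive to reordering, so a heavily rearranged rewrite like the leak above may need a cost model that handles moved phrases.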
For obvious reasons I won't be publishing any of it any time soon :) Maybe the RTC will advise me on what to do. Or maybe I'll just hoard a bunch of software like this. Anyway, thanks, I'll keep at it.