Could you expand a little bit on the archiving of the links? How does that work with http at the moment?
My plan is to make a lib for both client and server, and then create mini binaries to demonstrate the usage.
So at some point it will be entirely possible to either use it in a script or write your own Rust program for that purpose.
At the moment it only handles the 20 (success) response code. That sounds like enough for archiving purposes, but a lot of cleanup is still needed.
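To make the "only handles 20" part concrete, here is a minimal sketch of parsing a Gemini response header and accepting only status 20. This is not poptea's actual code: the names `Header`, `parse_header` and `is_archivable` are made up for illustration, and it assumes the header line has already been read from the TLS stream.

```rust
/// Parsed Gemini response header: a two-digit status plus the META field.
/// (Hypothetical type, not taken from poptea.)
#[derive(Debug, PartialEq)]
pub struct Header {
    pub status: u8,
    pub meta: String,
}

/// Parse a header line such as "20 text/gemini\r\n".
/// Returns None if the status is not exactly two ASCII digits.
pub fn parse_header(line: &str) -> Option<Header> {
    // Drop the trailing CRLF (or a bare LF) before splitting.
    let line = line.trim_end_matches(&['\r', '\n'][..]);
    // The status and META are separated by a single space.
    let (code, meta) = match line.split_once(' ') {
        Some((code, meta)) => (code, meta),
        None => (line, ""),
    };
    if code.len() != 2 || !code.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let status: u8 = code.parse().ok()?;
    Some(Header {
        status,
        meta: meta.to_string(),
    })
}

/// Assumption for this sketch: only a plain success (20) is worth archiving;
/// redirects, input prompts and errors are ignored for now.
pub fn is_archivable(header: &Header) -> bool {
    header.status == 20
}
```

For a 20 response, `meta` holds the MIME type of the body, so an archiver could use it to decide what to do with the content.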
Basically, it fetches the url,
if it's a pdf it tries to get plain text output,
if it's html it pipes it to a rust binary that uses a "readability" algorithm library to get the content as plain text (here's the code)
It also calls w3m to get formatted text output from the same HTML.
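Roughly, the dispatch described above could be sketched like this. It is not the actual sic code: the `Extractor` enum and `extractors_for` are hypothetical names, and treating text/gemini and text/plain as "store verbatim" is only an assumption about how gemini content might fit in.

```rust
/// The processing steps the archiver could run for a fetched document.
/// (Hypothetical names, for illustration only.)
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Extractor {
    /// Extract plain text from a PDF.
    PdfText,
    /// Pipe HTML through the readability-based binary.
    Readability,
    /// Render HTML to formatted text with w3m.
    W3m,
    /// Store the body as-is (assumed here for gemtext and plain text).
    Verbatim,
}

/// Choose extractors from a MIME type such as "text/html; charset=utf-8".
pub fn extractors_for(mime: &str) -> Vec<Extractor> {
    // Ignore parameters like "; charset=utf-8" and normalise case.
    let essence = mime
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    match essence.as_str() {
        "application/pdf" => vec![Extractor::PdfText],
        // HTML gets both outputs: readability text and w3m formatted text.
        "text/html" => vec![Extractor::Readability, Extractor::W3m],
        "text/gemini" | "text/plain" => vec![Extractor::Verbatim],
        // Anything else is not archived in this sketch.
        _ => Vec::new(),
    }
}
```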
So my thought was that if it's gemini, not http, we can get the gemtext with a small tool like fetch_remote_content.
Yup, sounds like the right tool for the job. I'll publish the poptea crate once it has clean TLS client code and TOFU is optional, and then I'll make a PR for the sic tools.
I suppose gemtext is plain enough, or would you like some formatting as well (as in the readability case)?
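If some light formatting turns out to be wanted, a minimal sketch could look like the following. It is only one possible approach, not something poptea or sic does today: `gemtext_to_plain` is a hypothetical helper that rewrites link lines as "label (url)", drops the preformatted toggle lines, and leaves everything else untouched.

```rust
/// Convert gemtext to plain text (hypothetical helper, illustration only).
/// Link lines "=> URL label" become "label (URL)"; the ``` toggle lines are
/// dropped; headings, lists, quotes and preformatted text pass through as-is.
pub fn gemtext_to_plain(src: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    let mut preformatted = false;
    for line in src.lines() {
        if line.starts_with("```") {
            // Toggle preformatted mode and skip the fence line itself.
            preformatted = !preformatted;
            continue;
        }
        if preformatted {
            out.push(line.to_string());
        } else if let Some(rest) = line.strip_prefix("=>") {
            let rest = rest.trim_start();
            // A link line is "URL", optionally followed by whitespace and a label.
            let (url, label) = match rest.split_once(char::is_whitespace) {
                Some((url, label)) => (url, label.trim()),
                None => (rest, ""),
            };
            if label.is_empty() {
                out.push(url.to_string());
            } else {
                out.push(format!("{} ({})", label, url));
            }
        } else {
            out.push(line.to_string());
        }
    }
    out.join("\n")
}
```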
Kudos on your project :) Could we use it to archive sic submissions with gemini urls? It's done for http only at the moment.