Hi, I’m into programming, sexual transmutation and psychedelics!

  • 10 Posts
  • 32 Comments
Joined 2 years ago
Cake day: June 13th, 2023







  • Hi! First of all, thank you so much for the detailed explanation!

    What I’m trying to do is scrape some content.

    Yes, I’m trying to return all the links (maybe in a vector). I have a list of elements (Select, which is actually scraper::html::Select<'_, '_>) that essentially contains selections of HTML nodes. I would like to grab each of them, extract the actual link value (&str), convert it into a String, and push it first into a vector containing all the links and then into an instance of a struct that will later hold several pieces of data about the scraped page.

    I was trying to use a for loop because that was the first structure that came to my mind. I’m finding it hard to wrap my head around ownership and error handling in Rust. Using the if let construct could be a good idea, and I hadn’t considered using break!

    I also managed to build the “match version” of what I was trying to achieve:

    fn get_links(link_nodes: scraper::html::Select<'_, '_>) -> Vec<String> {
        let mut links = vec![];

        for node in link_nodes {
            match node.value().attr("href") {
                Some(link) => {
                    links.push(link.to_string());
                }
                None => (),
            }
        }

        dbg!(&links);
        links
    }
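
    For comparison, here is a minimal sketch of the same loop written with if let, plus an iterator version. To keep it standalone it does not use the scraper crate: the hypothetical hrefs argument stands in for the Option<&str> values that node.value().attr("href") yields, and both function names are made up for illustration.

    ```rust
    /// Collects every link that is present, skipping the `None`s.
    /// `hrefs` is a stand-in (assumption) for the values returned by
    /// `node.value().attr("href")` in the scraper-based version.
    fn collect_links<'a>(hrefs: impl IntoIterator<Item = Option<&'a str>>) -> Vec<String> {
        let mut links = Vec::new();
        for href in hrefs {
            // `if let` only runs the body when the value is `Some`,
            // so no explicit `None => ()` arm is needed.
            if let Some(link) = href {
                links.push(link.to_string());
            }
        }
        links
    }

    /// Same result using iterator adapters: `flatten` drops the `None`s
    /// and unwraps the `Some`s, then each `&str` becomes an owned `String`.
    fn collect_links_iter<'a>(hrefs: impl IntoIterator<Item = Option<&'a str>>) -> Vec<String> {
        hrefs.into_iter().flatten().map(str::to_string).collect()
    }
    ```

    Both functions should return the same vector for the same input; which one reads better is mostly a matter of taste.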
    

    I hadn’t understood that every arm of a match on an Option has to evaluate to the same type. I thought that, since enum variants can hold different types, the arms could too. So if the Some arm evaluates to (), the None arm has to do the same…

    If I try with a simpler example, I still can’t understand why I can’t do something like:

    enum OperativeSystem {
        Linux,
        Windows,
        Mac,
        Unrecognised,
    }

    let absent_os = OperativeSystem::Unrecognised;
    find_os(absent_os);

    fn find_os(os: OperativeSystem) -> String {
        match os {
            debian => {
                let answer = "This pc uses Linux";
                answer.to_string()
            }
            windows10home => {
                let answer = "This pc uses Windows, unlucky you!";
                answer.to_string()
            }
            ios15 => {
                let answer = "This pc uses Mac, I'm really sorry!";
                answer.to_string()
            }
            _ => {
                let is_unrecognised = true;
                is_unrecognised
            }
        }
    }
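
    A minimal sketch of one way this could compile. Two things seem to be going on in the snippet above: lowercase names such as debian in pattern position are new variable bindings that match any value (so the first arm would catch everything), and the last arm evaluates to a bool while the function promises a String. Enum variants can carry different data, but a match used as an expression still has to produce one type. Returning Option<String> for the unrecognised case is just one possible choice, not the only one.

    ```rust
    enum OperativeSystem {
        Linux,
        Windows,
        Mac,
        Unrecognised,
    }

    /// Every arm evaluates to `Option<String>`, so the match has a single type.
    /// Patterns name the variants by path instead of introducing new bindings.
    fn find_os(os: OperativeSystem) -> Option<String> {
        match os {
            OperativeSystem::Linux => Some("This pc uses Linux".to_string()),
            OperativeSystem::Windows => Some("This pc uses Windows, unlucky you!".to_string()),
            OperativeSystem::Mac => Some("This pc uses Mac, I'm really sorry!".to_string()),
            // "Not recognised" is expressed as the absence of an answer.
            OperativeSystem::Unrecognised => None,
        }
    }
    ```

    A caller can then handle the unrecognised case with its own match or if let on the returned Option.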
    

    match is much more intuitive for a beginner; there’s a lot of stuff that goes on under the hood with ?
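
    To make the "under the hood" part a bit more concrete, here is a rough sketch of what ? does on an Option, next to a hand-written match that behaves roughly the same. Both function names are hypothetical, and this only covers the Option case (on a Result, ? also converts the error type).

    ```rust
    /// Parses the first whitespace-separated token as a number, using `?`.
    fn first_number_with_question_mark(input: &str) -> Option<i32> {
        // `?` unwraps the `Some`, or returns `None` from the function early.
        let token = input.split_whitespace().next()?;
        token.parse::<i32>().ok()
    }

    /// The same logic with the early return spelled out in a `match`.
    fn first_number_with_match(input: &str) -> Option<i32> {
        let token = match input.split_whitespace().next() {
            Some(t) => t,
            None => return None,
        };
        token.parse::<i32>().ok()
    }
    ```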












  • I mean, theoretically, if you are hosting your own chat server, for example on Matrix, you can easily make all the chats inaccessible from the clients by issuing a command to shut down your server (or just the chat server service), as long as there’s no content cached locally.

    I think you can do this pretty easily with a Raspberry Pi by connecting via SSH…

    Just use a shell script that changes the static IP to something else after the command to shut down the service or wipe the data (depending on what your goal is) has been issued, or use a VPN or something like that if possible, because anyone issuing the command would need to know your server’s IP.

    And issuing a command over SSH to a remote server, from either a smartphone or a PC, should be easy enough that you could build a very small app for it, or use an app that creates shortcuts that connect directly and issue custom commands.

    That way you are forced to give people your new IP every time the chats become inaccessible or get deleted, and nobody can connect back, even if they want to, without talking to you, unless you decide to reuse the old IP for whatever reason.

    Of course, not exposing your real IP and using some service like a VPN or a proxy (or Tor?) would be much better here, but I don’t really know how.

    That can give you full control over the chat history and create the aforementioned “panic button” for every client involved.
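
    A very rough sketch of what the "very small app" could look like, assuming the ssh client is installed locally and key-based login to the server already works. The function name, user, host and the remote command are all hypothetical placeholders; the actual service name depends on how the chat server was installed.

    ```rust
    use std::process::Command;

    /// Builds (but does not run) an `ssh` invocation that executes
    /// `remote_command` on `user@host`. All three values are placeholders
    /// supplied by the caller, e.g. a command that stops the chat service.
    fn panic_button_command(user: &str, host: &str, remote_command: &str) -> Command {
        let mut cmd = Command::new("ssh");
        // First argument is the login target, second is the command to run remotely.
        cmd.arg(format!("{user}@{host}")).arg(remote_command);
        cmd
    }
    ```

    Calling .status() on the returned Command is what would actually run ssh and wait for it to finish; building the command separately makes it easy to check what would be sent before wiring it to a button or shortcut.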




  • The RSS feature is amazing. I wanted to do something like that with RSS Bridge, but it looks like both Instagram and Facebook are doing their best to block exactly this kind of thing, so it works half the time and needs to be fixed quite often; I think it doesn’t work very well right now either… It is also very complicated to set up if you don’t know a bit of PHP. Of course I’m willing to learn, but all the blocking that projects like this (see Barinsta or Bibliogram) run into is really discouraging. I think Meta content is probably among the worst to scrape.

    Regarding Proxygram: for now it works. I’m using a public instance to grab some RSS feeds, and if it proves to be reliable I will be happy to host my own instance as well, if possible :) It’s sometimes slow to grab data (I guess because sessions easily get blocked or limited, returning error 500), but that’s not really a problem, as I just want to see new events every couple of days. One issue, though, is that the RSS feed doesn’t show all the posts (only the last three), which can be annoying, as you may miss something if you don’t see it and save it.

    EDIT: It actually does get other posts as well, just reaaally slowly, meaning that if you follow really large accounts, in a week or so you can find your feed full of older posts marked as unread.

    Anyway, thanks to whoever is doing the hard job of building/owning an Instagram scraper, I really know it can be tough.




  • Not necessarily. Of course open source is better, but I really just want something that gives me power over the content I can see. That is usually what you get with open source, which is why I posted here, but I’m open to closed-source solutions. I didn’t know that modded stuff like this existed. How can they mod a closed-source app? Are they partnered with Instagram? How does it work? Do they just scrape content like anyone else? How are they not getting banned like other scraper apps?