Just here for good conversation with good people.

  • 0 Posts
  • 77 Comments
Joined 2 years ago
Cake day: July 20th, 2023










  • OP, you’re looking for something called “Bot as a Service”. A growing number of companies cater to people who need bot infrastructure. Bright Data, ScrapingBee, ZenRows, and Apify are some of the more common services I work against that offer what you’re looking for.

    Edit: If you’re just looking for performance testing, you can use a service like Loadster.






  • It can be both. Reddit has a history of fabricating conversations. The way they sell advertising implies a certain level of engagement from their user base, which can lead to bots pushing products through reviews or casual mentions.

    I think it’s worth noting that Reddit, at one time, did have third-party bot protection; however, it only protected their advertising. I can only imagine what the rest of their traffic looks like, but I would not be surprised if they were using bots of their own.

    Like you said, they can make some money selling your information, but they can also control the narrative however they choose.






  • Unfortunately it is out of date.

    • IPs used by bots are now *highly* distributed. We will see the same bot use hundreds of thousands of IP addresses. Each IP may make only one or two requests, which is hard to catch with volume-based detections. Also, I’m not sure where you are in the world, but outside of North America it’s more common for IP addresses to be heavily shared. Not to mention, there are companies in Europe that will pay you for the use of your IP address explicitly for bots.

    • You might think you could limit by IP classification, but bots increasingly use IPs classified as residential.

    • As for allowing good bots, that isn’t so much an issue: they respect the robots.txt that companies implement. More and more, though, we see bots scraping data for LLMs that don’t respect this file. Bots that scrape prices or do anything else you don’t want, like credential stuffing, aren’t going to respect it either.

    • In terms of using a VPN, absolutely limit outside access to sensitive infrastructure, but that’s not really where most companies experience pain from bots. That’s not to say we don’t see bots attempting vulnerability scanning; those requests can be highly distributed too.
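
    To make the first point concrete, here is a minimal sketch in Python of a simple per-IP volume limit. Everything in it is an assumption for illustration: the `blocked_ips` helper, the threshold of 10, and the made-up IP addresses are hypothetical and not how any particular vendor does detection. It only shows why a bot spread across many IPs slips under a per-IP threshold while a single noisy client gets caught.

```python
from collections import Counter


def blocked_ips(requests, limit):
    """Return the set of IPs that made more than `limit` requests.

    `requests` is an iterable of (ip, path) tuples. This is a naive,
    hypothetical volume-based check, not a real product's logic.
    """
    counts = Counter(ip for ip, _path in requests)
    return {ip for ip, n in counts.items() if n > limit}


# Hypothetical traffic: one noisy client sending 50 requests from one IP...
noisy = [("198.51.100.7", "/login")] * 50

# ...and one bot sending 1,000 requests, each from a different (made-up) IP.
distributed = [(f"10.0.{i // 256}.{i % 256}", "/login") for i in range(1000)]

flagged = blocked_ips(noisy + distributed, limit=10)

# Only the single noisy IP crosses the threshold; the distributed bot
# sent 20x more requests in total but is never flagged.
print(flagged)
```

    Lowering the limit doesn’t help here: at one request per IP, any threshold low enough to catch the bot would also block every ordinary visitor.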

    Companies ultimately reach out to vendors like Cloudflare because the usual methods aren’t working for them. While onboarding some clients, I’ve seen more bot requests than human requests, which can be detrimental to the business.

    I’m happy to answer any other questions you might have. While I do work in the industry, I don’t know everything. I just want to reiterate that I am not a fan of how things currently are on the Internet. I wish this were illegal, as I think that would cut down on a lot of bot traffic and make it much more manageable for everyone.