Davriellelouna@lemmy.world to Technology@lemmy.world · English · 6 days ago (edited)
The AI company Perplexity is complaining their bots can't bypass Cloudflare's firewall (www.searchenginejournal.com) · 250 comments
Electricd@lemmybefree.net · 5 days ago (edited)
They do have a point though. It would be great to let per-prompt searches go through, but not mass scraping.
I believe a lot of websites don’t want either, though.

threeganzi@sh.itjust.works · 4 days ago
Does it not need to be scraped to be indexed, assuming it’s semi-typical RAG stuff?

Electricd@lemmybefree.net · 4 days ago
I assume their script does some search engine stuff, like querying Google or Bing, and then “scrapes” the links it lands on.
Some Selenium stuff.
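
For anyone wondering what the "per-prompt" flow guessed at above might look like, here is a rough sketch. It is not Perplexity's actual code, which nobody in this thread has seen. The names `answer_context`, `html_to_text`, `search`, and `fetch` are all made up for illustration: `search` stands in for a query to a search engine such as Google or Bing, and `fetch` stands in for whatever loads the page (plain HTTP, or a Selenium-driven browser). The point is only that a per-prompt lookup touches a handful of pages per question, unlike a mass crawl.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, Iterable


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Reduce an HTML document to its visible text, space-separated."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def answer_context(
    prompt: str,
    search: Callable[[str], Iterable[str]],  # hypothetical: prompt -> result URLs
    fetch: Callable[[str], str],             # hypothetical: URL -> raw HTML
    max_pages: int = 3,
) -> Dict[str, str]:
    """Per-prompt retrieval: search once, fetch only the top few hits."""
    pages = {}
    for url in list(search(prompt))[:max_pages]:
        pages[url] = html_to_text(fetch(url))
    return pages
```

Passing `search` and `fetch` in as arguments keeps the sketch free of any real network or browser dependency, so it can be tried with stub functions. The extracted text is what would then be handed to the model as context, which is the "R" in the RAG setup mentioned earlier.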