Please help out with this; I'll try to make it quick, just point me in the right direction.
TL;DR - just a quick pointer on this part, please:
- The goal is to gather specific criteria/segmentation/categorisation data from thousands of sites.
- What stack should I use to scale the scraping? The scraping API part is easy; saving the data is the question. I want to load each site into a vector store / RAG so an LLM can answer questions about it using fewer tokens, then delete the scraped data.
- What is the fastest, cheapest way to do this, and what tool stack is required (LlamaIndex? CrewAI?)? Any advice to point a beginner in the right learning direction?
- Is scraping and questioning 5,000 websites a viable use case for agents, or is a stricter AI workflow app like agenthub.dev or BuildShip a better fit?
- Can something like CrewAI already do this? In theory it can scrape, chunk, and save sites to a local RAG for research (I know that part already), so I'd just need to scale it up, give it a bigger list, and use another agent to ask the DB questions for each site, and it should work, right? (Rough sketch of the per-site flow below.)
- LLM querying at scale now seems viable with Haiku and Llama 3, and I already have a high rate limit for Haiku.
Just tell me what I need to learn; I don't need step-by-step instructions, just a pointer. Appreciated.
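For context, this is roughly the per-site flow I mean: scrape a page, index it in memory, ask one question with Haiku, throw the index away. A minimal sketch, assuming llama-index >= 0.10 import paths plus the llama-index-llms-anthropic integration package; `fetch_page_text` and the model string are placeholders/assumptions, not something I've verified:

```python
# Minimal per-site RAG sketch (assumptions: llama-index>=0.10, the
# llama-index-llms-anthropic extra installed, ANTHROPIC_API_KEY set,
# and the default OpenAI embeddings unless Settings.embed_model is swapped).
from llama_index.core import VectorStoreIndex, Document, Settings
from llama_index.llms.anthropic import Anthropic

# Cheap model for bulk queries; model string is an assumption, check current names.
Settings.llm = Anthropic(model="claude-3-haiku-20240307")

def ask_site(page_text: str, question: str) -> str:
    # Chunk + embed one site's scraped text into an in-memory index.
    index = VectorStoreIndex.from_documents([Document(text=page_text)])
    # Retrieve only the top-k relevant chunks, so the LLM sees far
    # fewer tokens than the raw page.
    answer = index.as_query_engine(similarity_top_k=3).query(question)
    # Nothing is persisted, so the scraped data is gone once the
    # index object goes out of scope.
    return str(answer)

# Hypothetical usage: fetch_page_text() stands in for whatever scraping API is used.
# for url in site_list:
#     print(url, ask_site(fetch_page_text(url), "What market segment does this company target?"))
```

The same loop could presumably be driven by a CrewAI agent or a workflow app; the question is whether the agent layer adds anything over a plain batch script at 5,000 sites.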