Improving LLM Performance
I'm trying to figure out whether it's possible to use LLMs to categorize the internet. By back-of-the-napkin math, if it takes a few seconds per web page, this would cost $XXX,XXX+ to process Common Crawl. Does anyone have tips on speeding up LLM inference? Is it possible to use LLMs to train a cheaper student model? Thanks!
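For concreteness, the distillation setup I have in mind is roughly: have the LLM label a sample of pages, then fit a cheap student on those labels and let the student handle the bulk of the crawl. A minimal sketch, assuming you already have (text, category) pairs from an LLM labeling pass; the categories, example texts, and the TF-IDF + logistic regression student are all placeholders, not a real pipeline:

```python
"""Sketch of LLM -> student distillation for page categorization."""
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Stand-in for LLM-labeled training data; in practice you'd label
# tens of thousands of sampled pages and load them from disk.
llm_labeled = [
    ("fed raises interest rates again amid inflation worries", "news"),
    ("slow cooker pulled pork recipe with dry rub", "cooking"),
    ("benchmarking inference throughput on consumer gpus", "tech"),
    ("official trailer drops for the new superhero sequel", "entertainment"),
    ("senate passes budget bill after long debate", "news"),
    ("easy weeknight pasta with garlic and olive oil", "cooking"),
]
texts, labels = zip(*llm_labeled)

# Cheap student: TF-IDF features + logistic regression classifies
# millions of pages per hour on CPU, vs. seconds per page for an LLM.
student = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)
student.fit(texts, labels)

# The student, not the LLM, now classifies the rest of the crawl.
print(student.predict(["quarterback injured in season opener"]))
```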
Common Crawl is full of real junk; you'd need some kind of classifier just to pick out the stuff that's worth classifying...
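Even dumb heuristics get you a long way before you spend any LLM time. A rough sketch, loosely in the spirit of C4/Gopher-style quality filters; the thresholds here are guesses, not tuned values:

```python
def looks_like_junk(text: str) -> bool:
    """Cheap pre-filter so only plausible pages reach the classifier."""
    words = text.split()
    if len(words) < 50:  # too short to be worth classifying
        return True
    alpha_ratio = sum(w.isalpha() for w in words) / len(words)
    if alpha_ratio < 0.7:  # mostly symbols/numbers -> likely boilerplate
        return True
    mean_len = sum(len(w) for w in words) / len(words)
    if not (3 <= mean_len <= 10):  # gibberish or run-together tokens
        return True
    if len(set(words)) / len(words) < 0.3:  # heavy repetition (nav menus, spam)
        return True
    return False

# Only pages that survive the filter get the expensive treatment.
print(looks_like_junk("buy cheap pills " * 40))  # True: heavy repetition
```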