Notes on OpenAI Q&A Finetuning GPT-3 Vs Semantic Search - Which to Use, When, and Why
A great video about finetuning vs semantic search. Finetuning teaches a model to write new patterns, not to have a theory of mind.
https://www.youtube.com/watch?v=9qq6HTr7Ocw
Overall Thoughts
This is a really great video; it was thorough and had great analogies. I didn't know about unfreezing only part of the model -- that's a neat fact!
I've had many office hours with people bringing finetuning-related questions, and I believe 99% of the time they'd be better off with semantic search, with Hypothetical Document Embeddings (HyDE) for the rarest of cases. Here's a rough sketch of what I mean by both.
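A minimal sketch of plain semantic search plus the HyDE variant, where you embed a hypothetical answer instead of the raw question. The embed()/generate() helpers are stand-ins for whatever embedding and completion models you use -- they're assumptions here, not a real SDK.

```python
import numpy as np

# Stand-in helpers (assumptions): wire these to your actual embedding
# and completion models before using the sketch.
def embed(text: str) -> np.ndarray: ...
def generate(prompt: str) -> str: ...

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_search(query: str, docs: list[str], top_k: int = 3) -> list[str]:
    """Plain semantic search: embed the query, rank docs by similarity."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]

def hyde_search(query: str, docs: list[str], top_k: int = 3) -> list[str]:
    """HyDE: embed a *hypothetical* answer rather than the raw question,
    so the query vector sits closer to answer-shaped documents."""
    fake_answer = generate(f"Write a short passage that answers: {query}")
    q = embed(fake_answer)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]
```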
Notated Transcript
01:21
a history of transfer learning
01:49
long live NLU, rip NLP
02:30
fine tuning is tweaking a task
03:46
the only similarity between finetuning and Q&A search is that they both use embeddings at some point
04:34
fine tuning unfreezes part of a model -- does not stop confabulation (hallucination)
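To make "unfreezing part of a model" concrete, here's a generic PyTorch sketch of the idea (my own toy stack, not OpenAI's actual fine-tuning setup): freeze everything, then unfreeze and train only the top layer.

```python
import torch
from torch import nn

# Toy stack standing in for a pretrained model (assumption: not GPT-3's real architecture).
model = nn.Sequential(
    nn.Embedding(1000, 64),
    nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
        num_layers=4,
    ),
    nn.Linear(64, 1000),
)

# Freeze the whole network...
for p in model.parameters():
    p.requires_grad = False

# ...then unfreeze only the final layer; only this part changes during fine-tuning.
for p in model[-1].parameters():
    p.requires_grad = True

# The optimizer only sees the unfrozen parameters.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```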
05:30
unfreezing an entire model is expensive af
06:00
models barf out patterns, they do not have a theory of mind or knowledge
bigger models are more convincing, but even the largest model will never know itself (as an information store)
08:20
finetuning is way more difficult than prompt engineering (10,000x harder)
09:20
finetuning at scale is very hard -- how much do we share in alignment
11:11
cost of fine tuning goes up with more data -- needs constant retraining
12:20
instruct -> question + body of info -> is answer in here?
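Roughly, that prompt shape looks like this (the wording is mine, not from the video):

```python
def build_prompt(question: str, passages: list[str]) -> str:
    # "question + body of info -> is the answer in here?"
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the information below. "
        "If the answer is not in the information, say you don't know.\n\n"
        f"Information:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```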
12:58
finetuning teaches model to write a new pattern
14:27
formulate -> research -> criticize -> answer
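One way to sketch that loop, with llm and search as hypothetical callables for your completion model and retrieval step (names and prompts are my own):

```python
def answer(question, llm, search):
    # Formulate: rewrite the user question as a focused search query.
    query = llm(f"Rewrite this as a search query: {question}")
    # Research: pull relevant passages out of the corpus.
    passages = "\n\n".join(search(query))
    # Criticize: draft an answer, then have the model check it against the sources.
    draft = llm(
        f"Answer using only these passages:\n{passages}\n\nQuestion: {question}"
    )
    # Answer: return the revision after the critique pass.
    return llm(
        "Revise the draft so every claim is supported by the passages.\n\n"
        f"Passages:\n{passages}\n\nDraft: {draft}"
    )
```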
15:29
Dewey Decimal analogy -- indexing narrows the search to a smaller set of data; compile all the relevant research from it and scale that up