Notes on OpenAI Q&A Finetuning GPT-3 Vs Semantic Search - Which to Use, When, and Why

A great video about finetuning vs semantic search. Finetuning teaches a model to write new patterns, not to have a theory of mind.

https://www.youtube.com/watch?v=9qq6HTr7Ocw

Overall Thoughts

This is a really great video; it was very thorough and had great analogies. I didn't know about the unfreezing of the partial model, that's a neat fact!

I've held many office hours with people bringing finetune-related questions, and I believe 99% of the time they would be better off with semantic search, reaching for Hypothetical Document Embeddings (HyDE) in the rarest of cases.
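
For the curious, here's a minimal HyDE sketch in Python; `generate` and `embed` are hypothetical stand-ins for an LLM completion call and an embedding-model call, not any specific API:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def hyde_search(question, corpus, generate, embed, k=3):
    # HyDE: have the LLM draft a *hypothetical* answer first...
    hypothetical = generate(f"Write a short passage that answers: {question}")
    # ...then embed that draft instead of the raw question.
    query_vec = embed(hypothetical)
    # Rank the real documents by similarity to the hypothetical answer.
    ranked = sorted(corpus, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:k]
```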

Notated Transcript

 01:21

a history of transfer learning

 01:49

long live NLU, rip NLP

 02:30

fine tuning is tweaking a task

 03:46

only similarity b/w finetuning and q/a search is that they both use embeddings at some point

 04:34

fine tuning unfreezes part of a model -- does not stop confabulation (hallucination)
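
A rough illustration of what partial unfreezing means, as a generic PyTorch sketch with a toy stand-in model -- not OpenAI's actual finetuning internals:

```python
import torch.nn as nn

# Toy stand-in for a pretrained network; real models are transformers.
model = nn.Sequential(
    nn.Linear(768, 768), nn.ReLU(),
    nn.Linear(768, 768), nn.ReLU(),
    nn.Linear(768, 10),  # task head
)

# Freeze the whole model...
for param in model.parameters():
    param.requires_grad = False

# ...then unfreeze only the head. Cheaper than full finetuning, but the
# frozen layers still generate from their old patterns -- unfreezing
# part of a model does nothing to stop confabulation.
for param in model[-1].parameters():
    param.requires_grad = True
```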

 05:30

unfreezing an entire model is expensive af

 06:00

models barf out patterns; they do not have a theory of mind or knowledge

bigger models are more convincing, but even the largest model will never know itself (as an information store)

 08:20

finetuning is way more difficult than prompt engineering (10,000x harder)

 09:20

finetuning at scale is very hard -- how much do we share in alignment

 11:11

cost of fine tuning goes up with more data -- needs constant retraining
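
Back-of-envelope illustration of that scaling; the price and token counts below are made-up placeholders, not real OpenAI rates:

```python
# Made-up placeholder numbers, purely to show the linear growth.
price_per_1k_tokens = 0.03
tokens_per_example = 500

for n_examples in (1_000, 10_000, 100_000):
    cost = n_examples * tokens_per_example / 1000 * price_per_1k_tokens
    print(f"{n_examples:>7} examples -> ${cost:,.2f} per training run")
# Cost grows linearly with data, and every refresh of the corpus
# means paying for another full training run.
```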

 12:20

instruct -> question + body of info -> is answer in here?
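
A minimal sketch of that "is the answer in here?" prompt; the wording is my own, not from the video:

```python
def build_prompt(question, passages):
    # passages come from semantic search over your corpus
    context = "\n\n".join(passages)
    return (
        "Answer the question using ONLY the information below. "
        'If the answer is not there, say "I don\'t know."\n\n'
        f"Information:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```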

 12:58

finetuning teaches a model to write a new pattern
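
For context, legacy GPT-3 finetuning takes JSONL prompt/completion pairs; a toy sketch (the separator conventions here are illustrative, not required):

```python
import json

# The model learns the *shape* of the completions, not new facts.
examples = [
    {"prompt": "Summarize: The meeting moved to 3pm.\n\n###\n\n",
     "completion": " Meeting rescheduled to 3pm. END"},
    {"prompt": "Summarize: Invoice #42 is overdue.\n\n###\n\n",
     "completion": " Invoice #42 overdue. END"},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```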

 14:27

formulate -> research -> criticize -> answer
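
A sketch of that loop as code; `llm` and `search` are hypothetical callables:

```python
def answer_question(question, llm, search):
    # formulate: restate the question as a focused search query
    query = llm(f"Rewrite as a search query: {question}")
    # research: pull relevant passages via semantic search
    passages = search(query)
    context = "\n".join(passages)
    draft = llm(f"Answer {question!r} using only this context:\n{context}")
    # criticize: check the draft against the retrieved context
    critique = llm(
        f"List any claims in this answer not supported by the context:\n{draft}"
    )
    # answer: revise with the critique in hand
    return llm(f"Revise the answer.\nAnswer: {draft}\nCritique: {critique}")
```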

 15:29

dewey decimal analogy -- semantic search is indexing on a smaller set of data; compile all the relevant research, then scale it
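
A sketch of that "index once, query cheaply" idea with numpy; `embed` is again a hypothetical embedding call:

```python
import numpy as np

def build_index(corpus, embed):
    # Precompute once -- like assigning call numbers to every book.
    matrix = np.stack([embed(doc) for doc in corpus])
    return matrix / np.linalg.norm(matrix, axis=1, keepdims=True)

def top_k(index, corpus, question, embed, k=5):
    q = embed(question)
    scores = index @ (q / np.linalg.norm(q))  # cosine via normalized dot
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]
```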