videoplayback
Jun 20, 2026 05:15
· 2:11:11
· English
· Whisper Turbo
· 6 speakers
Bản ghi này hết hạn trong 24 2 ngày.
Tăng cấp cho lưu trữ vĩnh viễn →
Chỉ hiển thị
0:00
S…
Speaker 2 (videoplayback)
Hi, everyone.
0:00
S…
Speaker 2 (videoplayback)
So in this video,
0:01
S…
Speaker 2 (videoplayback)
I would like to continue our general audience series on large
0:05
S…
Speaker 2 (videoplayback)
language models like ChatGPT.
0:07
S…
Speaker 1 (videoplayback)
Now,
0:08
S…
Speaker 2 (videoplayback)
in a previous video,
0:09
S…
Speaker 2 (videoplayback)
deep dive into LLMs that you can find on my YouTube,
0:11
S…
Speaker 2 (videoplayback)
we went into a lot of the under -the -hood fundamentals of how these models are trained and
0:15
S…
Speaker 2 (videoplayback)
how you should think about their cognition or psychology.
0:18
S…
Speaker 2 (videoplayback)
Now, in this video,
0:20
S…
Speaker 2 (videoplayback)
I want to go into more practical applications of these tools.
0:23
S…
Speaker 2 (videoplayback)
I want to show you lots of examples.
0:25
S…
Speaker 2 (videoplayback)
I want to take you through all the different settings that are available.
0:27
S…
Speaker 2 (videoplayback)
And I want to show you how I use these tools and how you can also use them in
0:32
S…
Speaker 2 (videoplayback)
your own life and work.
0:33
S…
Speaker 2 (videoplayback)
So let's dive in.
0:35
S…
Speaker 1 (videoplayback)
Okay,
0:35
S…
Speaker 2 (videoplayback)
so first of all, the web page that I have pulled up here is chatgpt .com.
0:39
S…
Speaker 1 (videoplayback)
Now,
0:40
S…
Speaker 2 (videoplayback)
as you might know, ChatGPT was developed by OpenAI and deployed in 2022.
0:45
S…
Speaker 2 (videoplayback)
So this was the first time that people could actually just kind of like talk to a large language model
0:49
S…
Speaker 2 (videoplayback)
through a text interface.
0:50
S…
Speaker 2 (videoplayback)
And this went viral and all over the place on the internet.
0:53
S…
Speaker 2 (videoplayback)
And this was huge.
0:55
S…
Speaker 1 (videoplayback)
Now,
0:56
S…
Speaker 2 (videoplayback)
since then,
0:56
S…
Speaker 2 (videoplayback)
though, the ecosystem has grown a lot.
0:58
S…
Speaker 2 (videoplayback)
So I'm going to be showing you a lot of examples of ChatGPT specifically.
1:01
S…
Speaker 2 (videoplayback)
But now in 2025,
1:04
S…
Speaker 2 (videoplayback)
there's many other apps that are kind of like ChatGPT -like.
1:08
S…
Speaker 2 (videoplayback)
And this is now a much bigger and richer ecosystem.
1:11
S…
Speaker 2 (videoplayback)
So in particular,
1:11
S…
Speaker 2 (videoplayback)
I think ChatGPT by OpenAI is this original gangster incumbent.
1:15
S…
Speaker 2 (videoplayback)
It's most popular and most feature -rich also because it's been around the
1:19
S…
Speaker 1 (videoplayback)
longest.
1:20
S…
Speaker 2 (videoplayback)
But there are many other kind of clones available,
1:23
S…
Speaker 2 (videoplayback)
I would say.
1:23
S…
Speaker 2 (videoplayback)
I don't think it's too unfair to say.
1:25
S…
Speaker 2 (videoplayback)
But in some cases,
1:26
S…
Speaker 2 (videoplayback)
there are kind of like unique experiences that are not found in ChatGPT.
1:29
S…
Speaker 2 (videoplayback)
And we're going to see examples of those.
1:32
S…
Speaker 2 (videoplayback)
So for example,
1:33
S…
Speaker 2 (videoplayback)
Big Tech has followed with a lot of kind of ChatGPT -like experiences.
1:37
S…
Speaker 2 (videoplayback)
So for example,
1:38
S…
Speaker 2 (videoplayback)
Gemini, Meta AI,
1:39
S…
Speaker 2 (videoplayback)
and Copilot from Google,
1:40
S…
Speaker 2 (videoplayback)
Meta, and Microsoft,
1:41
S…
Speaker 1 (videoplayback)
respectively.
1:42
S…
Speaker 2 (videoplayback)
And there's also a number of startups.
1:43
S…
Speaker 2 (videoplayback)
So for example,
1:44
S…
Speaker 2 (videoplayback)
Anthropic has Cloud,
1:46
S…
Speaker 2 (videoplayback)
which is kind of like a ChatGPT equivalent.
1:48
S…
Speaker 2 (videoplayback)
XAI,
1:49
S…
Speaker 2 (videoplayback)
which is Elon's company,
1:50
S…
Speaker 2 (videoplayback)
has Grok.
1:51
S…
Speaker 2 (videoplayback)
And there's many others.
1:53
S…
Speaker 2 (videoplayback)
So all of these here are from the United States.
1:57
S…
Speaker 2 (videoplayback)
companies basically deep seek is a chinese company and le chat is
2:01
S…
Speaker 2 (videoplayback)
a french company mistral now where can you find these and how can
2:05
S…
Speaker 2 (videoplayback)
you keep track of them well number one on the internet somewhere but there are some leaderboards and
2:09
S…
Speaker 2 (videoplayback)
in the previous video i've shown you chatbot arena is one of them so here you
2:13
S…
Speaker 2 (videoplayback)
can come to some ranking of different models and you can see sort of their strength
2:18
S…
Speaker 2 (videoplayback)
or elo score
2:19
S…
Speaker 2 (videoplayback)
And so this is one place where you can keep track of them.
2:21
S…
Speaker 2 (videoplayback)
I would say like another place maybe is this seal leaderboard
2:26
S…
Speaker 2 (videoplayback)
from scale.
2:27
S…
Speaker 2 (videoplayback)
And so here you can also see different kinds of evals and different kinds
2:31
S…
Speaker 2 (videoplayback)
of models and how well they rank.
2:32
S…
Speaker 2 (videoplayback)
And you can also come here to see which models are currently performing the best
2:36
S…
Speaker 2 (videoplayback)
on a wide variety of tasks.
2:40
S…
Speaker 2 (videoplayback)
So understand that the ecosystem is fairly rich,
2:42
S…
Speaker 2 (videoplayback)
but for now,
2:43
S…
Speaker 2 (videoplayback)
I'm going to start with OpenAI because it is the incumbent and is most feature -rich,
2:47
S…
Speaker 2 (videoplayback)
but I'm going to show you others over time as well.
2:50
S…
Speaker 2 (videoplayback)
So let's start with ChatGPT.
2:51
S…
Speaker 2 (videoplayback)
What is this text box and what do we put in here?
2:54
S…
Speaker 2 (videoplayback)
Okay, so the most basic form of interaction with the language model is that we give it
2:58
S…
Speaker 2 (videoplayback)
text and then we get some text back in response.
3:00
S…
Speaker 2 (videoplayback)
So as an example,
3:02
S…
Speaker 2 (videoplayback)
we can ask to get a haiku about what it's like to be a large language model.
3:06
S…
Speaker 2 (videoplayback)
So this is a good kind of example task for a language model because these
3:11
S…
Speaker 2 (videoplayback)
models are really good at writing.
3:12
S…
Speaker 2 (videoplayback)
So writing haikus or poems or cover letters or
3:16
S…
Speaker 2 (videoplayback)
resumes or email replies,
3:18
S…
Speaker 2 (videoplayback)
they're just good at writing.
3:20
S…
Speaker 2 (videoplayback)
So when we ask for something like this,
3:22
S…
Speaker 2 (videoplayback)
what happens looks as follows.
3:23
S…
Speaker 2 (videoplayback)
The model basically responds,
3:25
S…
Speaker 2 (videoplayback)
words flow like a stream,
3:28
S…
Speaker 2 (videoplayback)
endless echoes nevermind,
3:29
S…
Speaker 2 (videoplayback)
ghost of thought unseen.
3:31
S…
Speaker 2 (videoplayback)
Okay,
3:32
S…
Speaker 2 (videoplayback)
it's pretty dramatic.
3:34
S…
Speaker 2 (videoplayback)
But what we're seeing here in ChatGPT is something that looks a bit like a conversation that you
3:38
S…
Speaker 2 (videoplayback)
would have with a friend.
3:38
S…
Speaker 2 (videoplayback)
These are kind of like chat bubbles.
3:40
S…
Speaker 1 (videoplayback)
Now,
3:41
S…
Speaker 2 (videoplayback)
we saw in the previous video is that what's going on under the hood here is
3:45
S…
Speaker 2 (videoplayback)
that this is what we call a user query,
3:47
S…
Speaker 2 (videoplayback)
this piece of text.
3:49
S…
Speaker 2 (videoplayback)
And this piece of text and also the response from the model,
3:52
S…
Speaker 2 (videoplayback)
this piece of text is chopped up into little text chunks that
3:56
S…
Speaker 2 (videoplayback)
we call tokens.
3:58
S…
Speaker 2 (videoplayback)
So this sequence of text is under the hood a token sequence,
4:03
S…
Speaker 2 (videoplayback)
one -dimensional token sequence.
4:04
S…
Speaker 2 (videoplayback)
Now the way we can see those tokens is we can use an app like,
4:07
S…
Speaker 2 (videoplayback)
for example, TickTokenizer.
4:08
S…
Speaker 2 (videoplayback)
So making sure that GPT -40 is selected,
4:10
S…
Speaker 2 (videoplayback)
I can paste my text here.
4:12
S…
Speaker 2 (videoplayback)
And this is actually what the model sees under the hood.
4:14
S…
Speaker 2 (videoplayback)
My piece of text to the model looks like a sequence of
4:19
S…
Speaker 2 (videoplayback)
exactly 15 tokens.
4:20
S…
Speaker 2 (videoplayback)
And these are the little text chunks that the model sees.
4:25
S…
Speaker 2 (videoplayback)
Now, there's a vocabulary here of 200 ,000 roughly of possible
4:29
S…
Speaker 2 (videoplayback)
tokens.
4:30
S…
Speaker 2 (videoplayback)
And then these are the token IDs corresponding to all these little
4:34
S…
Speaker 2 (videoplayback)
text chunks that are part of my query.
4:35
S…
Speaker 2 (videoplayback)
And you can play with this and update it.
4:37
S…
Speaker 2 (videoplayback)
And you can see that,
4:38
S…
Speaker 2 (videoplayback)
for example, this is kate -sensitive.
4:39
S…
Speaker 2 (videoplayback)
You would get different tokens.
4:40
S…
Speaker 2 (videoplayback)
And you can kind of edit it and see live how the token sequence changes.
4:44
S…
Speaker 2 (videoplayback)
So our query was 15 tokens.
4:46
S…
Speaker 2 (videoplayback)
And then the model response is right here.
4:50
S…
Speaker 2 (videoplayback)
And it responded back to us with a sequence of exactly 19
4:54
S…
Speaker 1 (videoplayback)
tokens.
4:54
S…
Speaker 2 (videoplayback)
So that haiku is this sequence of 19 tokens.
4:58
S…
Speaker 1 (videoplayback)
Now...
5:00
S…
Speaker 2 (videoplayback)
So we said 15 tokens and it said 19 tokens back.
5:03
S…
Speaker 1 (videoplayback)
Now,
5:04
S…
Speaker 2 (videoplayback)
because this is a conversation and we want to actually maintain a lot of the metadata that
5:08
S…
Speaker 2 (videoplayback)
actually makes up a conversation object,
5:10
S…
Speaker 2 (videoplayback)
this is not all that's going on under the hood.
5:13
S…
Speaker 2 (videoplayback)
And we saw in the previous video a little bit about the conversation format.
5:17
S…
Speaker 2 (videoplayback)
So it gets a little bit more complicated in that we have to take our user
5:21
S…
Speaker 1 (videoplayback)
query.
5:22
S…
Speaker 2 (videoplayback)
And we have to actually use this chat format.
5:24
S…
Speaker 2 (videoplayback)
So let me delete the system message.
5:26
S…
Speaker 2 (videoplayback)
I don't think it's very important for the purposes of understanding what's going on.
5:29
S…
Speaker 2 (videoplayback)
Let me paste my message as the user.
5:31
S…
Speaker 2 (videoplayback)
And then let me paste the model response as an assistant.
5:35
S…
Speaker 2 (videoplayback)
And then let me crop it here properly.
5:38
S…
Speaker 2 (videoplayback)
The tool doesn't do that properly.
5:40
S…
Speaker 2 (videoplayback)
So here we have it as it actually
5:44
S…
Speaker 2 (videoplayback)
happens under the hood.
5:46
S…
Speaker 2 (videoplayback)
There are all these special tokens that basically begin a message from
5:50
S…
Speaker 2 (videoplayback)
the user, and then the user says,
5:52
S…
Speaker 2 (videoplayback)
and this is the content of what we said.
5:54
S…
Speaker 2 (videoplayback)
And then the user ends,
5:56
S…
Speaker 2 (videoplayback)
and then the assistant begins and says this,
5:59
S…
Speaker 1 (videoplayback)
etc.
6:00
S…
Speaker 1 (videoplayback)
Now,
6:01
S…
Speaker 2 (videoplayback)
the precise details of the conversation format are not important.
6:03
S…
Speaker 2 (videoplayback)
What I want to get across here is that what looks to you and I as little chat
6:08
S…
Speaker 2 (videoplayback)
bubbles going back and forth,
6:09
S…
Speaker 2 (videoplayback)
under the hood,
6:10
S…
Speaker 2 (videoplayback)
we are collaborating with the model.
6:12
S…
Speaker 2 (videoplayback)
And we're both writing into a token stream.
6:16
S…
Speaker 2 (videoplayback)
And these two bubbles back and forth were in
6:20
S…
Speaker 2 (videoplayback)
a sequence of exactly 42 tokens under the hood.
6:23
S…
Speaker 2 (videoplayback)
I contributed some of the first tokens,
6:25
S…
Speaker 2 (videoplayback)
and then the model continued the sequence of tokens with its response.
6:29
S…
Speaker 2 (videoplayback)
And we could alternate and continue adding tokens here.
6:32
S…
Speaker 2 (videoplayback)
And together,
6:33
S…
Speaker 2 (videoplayback)
we are building out a token window,
6:35
S…
Speaker 2 (videoplayback)
a one -dimensional sequence of tokens.
6:38
S…
Speaker 1 (videoplayback)
Okay,
6:39
S…
Speaker 2 (videoplayback)
so let's come back to ChachiPT now.
6:42
S…
Speaker 2 (videoplayback)
What we are seeing here is kind of like little bubbles going back and forth between us and the
6:46
S…
Speaker 2 (videoplayback)
model. Under the hood,
6:47
S…
Speaker 2 (videoplayback)
we are building out a one -dimensional token sequence.
6:49
S…
Speaker 2 (videoplayback)
When I click new chat here,
6:52
S…
Speaker 2 (videoplayback)
that wipes the token window.
6:54
S…
Speaker 2 (videoplayback)
That resets the tokens to basically zero again and
6:58
S…
Speaker 2 (videoplayback)
restarts the conversation from scratch.
7:00
S…
Speaker 1 (videoplayback)
Now,
7:00
S…
Speaker 2 (videoplayback)
the cartoon diagram that I have in my mind when I'm speaking to a model looks something like this.
7:05
S…
Speaker 2 (videoplayback)
When we click new chat,
7:07
S…
Speaker 2 (videoplayback)
we begin a token sequence.
7:10
S…
Speaker 2 (videoplayback)
So this is a one -dimensional sequence of tokens.
7:12
S…
Speaker 2 (videoplayback)
The user,
7:13
S…
Speaker 2 (videoplayback)
we can write tokens into this stream.
7:16
S…
Speaker 2 (videoplayback)
And then when we hit enter,
7:18
S…
Speaker 2 (videoplayback)
we transfer control over to the language model.
7:21
S…
Speaker 2 (videoplayback)
And the language model responds with its own token streams.
7:24
S…
Speaker 2 (videoplayback)
And then the language model has a special token that basically says something
7:29
S…
Speaker 2 (videoplayback)
along the lines of,
7:29
S…
Speaker 1 (videoplayback)
I'm done.
7:30
S…
Speaker 2 (videoplayback)
So when it emits that token,
7:32
S…
Speaker 2 (videoplayback)
the chat GPT application transfers control back to us,
7:35
S…
Speaker 2 (videoplayback)
and we can take turns.
7:36
S…
Speaker 2 (videoplayback)
Together,
7:37
S…
Speaker 2 (videoplayback)
we are building out the token stream,
7:40
S…
Speaker 2 (videoplayback)
which we also call the context window.
7:42
S…
Speaker 2 (videoplayback)
So the context window is kind of like this working memory of
7:46
S…
Speaker 2 (videoplayback)
tokens, and anything that is inside this context window is kind of like in the working
7:50
S…
Speaker 2 (videoplayback)
memory of this conversation,
7:52
S…
Speaker 2 (videoplayback)
and is very directly accessible by the model.
7:56
S…
Speaker 2 (videoplayback)
Now, what is this entity here that we are talking to and how should we think about it?
8:00
S…
Speaker 1 (videoplayback)
Well,
8:01
S…
Speaker 2 (videoplayback)
this language model here,
8:02
S…
Speaker 2 (videoplayback)
we saw that the way it is trained in the previous video,
8:05
S…
Speaker 2 (videoplayback)
we saw there are two major stages,
8:07
S…
Speaker 2 (videoplayback)
the pre -training stage and the post -training stage.
8:10
S…
Speaker 2 (videoplayback)
The pre -training stage is kind of like taking all of
8:14
S…
Speaker 2 (videoplayback)
internet, chopping it up into tokens,
8:17
S…
Speaker 2 (videoplayback)
and then compressing it into a single kind of like zip file.
8:21
S…
Speaker 2 (videoplayback)
But the zip file is not exact.
8:23
S…
Speaker 2 (videoplayback)
The zip file is lossy and probabilistic zip file because
8:27
S…
Speaker 2 (videoplayback)
we can't possibly represent all of internet in just one sort of like,
8:30
S…
Speaker 2 (videoplayback)
say, terabyte of zip file because
8:35
S…
Speaker 2 (videoplayback)
there's just way too much information.
8:36
S…
Speaker 2 (videoplayback)
So we just kind of get the gestalt or the vibes inside this
8:41
S…
Speaker 2 (videoplayback)
zip file.
8:42
S…
Speaker 1 (videoplayback)
Now,
8:44
S…
Speaker 2 (videoplayback)
what's actually inside the zip file are the parameters of a neural
8:48
S…
Speaker 1 (videoplayback)
network.
8:49
S…
Speaker 2 (videoplayback)
And so,
8:49
S…
Speaker 2 (videoplayback)
for example, a one terabyte zip file would correspond to roughly,
8:53
S…
Speaker 2 (videoplayback)
say, one trillion parameters inside this neural network.
8:56
S…
Speaker 2 (videoplayback)
And what this neural network is trying to do is it's trying to basically
9:01
S…
Speaker 2 (videoplayback)
take tokens and it's trying to predict the next token in a sequence.
9:04
S…
Speaker 2 (videoplayback)
But it's doing that on internet documents.
9:07
S…
Speaker 2 (videoplayback)
So it's kind of like this internet document generator,
9:10
S…
Speaker 1 (videoplayback)
right?
9:11
S…
Speaker 2 (videoplayback)
And in the process of predicting the next token in a sequence on internet,
9:15
S…
Speaker 2 (videoplayback)
the neural network gains a huge amount of knowledge about
9:19
S…
Speaker 1 (videoplayback)
the world.
9:20
S…
Speaker 2 (videoplayback)
And this knowledge is all represented and stuffed and compressed
9:24
S…
Speaker 2 (videoplayback)
inside the 1 trillion parameters,
9:26
S…
Speaker 2 (videoplayback)
roughly, of this language model.
9:27
S…
Speaker 1 (videoplayback)
Now,
9:29
S…
Speaker 2 (videoplayback)
this pre -training stage also we saw is fairly costly.
9:31
S…
Speaker 2 (videoplayback)
So this can be many tens of millions of dollars,
9:34
S…
Speaker 2 (videoplayback)
say like three months of training and so on.
9:36
S…
Speaker 2 (videoplayback)
So this is a costly long phase.
9:39
S…
Speaker 2 (videoplayback)
For that reason,
9:40
S…
Speaker 2 (videoplayback)
this phase is not done that often.
9:43
S…
Speaker 2 (videoplayback)
So for example,
9:44
S…
Speaker 2 (videoplayback)
GPT -40,
9:45
S…
Speaker 2 (videoplayback)
this model was pre -trained probably many
9:49
S…
Speaker 2 (videoplayback)
months ago, maybe like even a year ago by now.
9:51
S…
Speaker 2 (videoplayback)
And so that's why these models are a little bit out of date.
9:54
S…
Speaker 2 (videoplayback)
They have what's called a knowledge cutoff because that knowledge cutoff
9:58
S…
Speaker 2 (videoplayback)
corresponds to when the model was...
10:00
S…
Speaker 2 (videoplayback)
pre -trained and its knowledge only goes up to that point.
10:03
S…
Speaker 1 (videoplayback)
Now,
10:04
S…
Speaker 2 (videoplayback)
some knowledge can come into
10:08
S…
Speaker 2 (videoplayback)
the model through the post -training phase,
10:10
S…
Speaker 2 (videoplayback)
which we'll talk about in a second.
10:12
S…
Speaker 2 (videoplayback)
But roughly speaking,
10:13
S…
Speaker 2 (videoplayback)
you should think of these models as kind of like a little bit out of date because pre
10:17
S…
Speaker 2 (videoplayback)
-training is way too expensive and happens infrequently.
10:21
S…
Speaker 2 (videoplayback)
So any kind of recent information,
10:23
S…
Speaker 2 (videoplayback)
like if you wanted to talk to your model about something that happened last week or so on,
10:26
S…
Speaker 2 (videoplayback)
we're going to need other ways of providing that information to the model because it's not
10:30
S…
Speaker 2 (videoplayback)
stored in the knowledge of the model.
10:32
S…
Speaker 2 (videoplayback)
So we're going to have various tool use to give that information to the
10:36
S…
Speaker 1 (videoplayback)
model. Now,
10:37
S…
Speaker 2 (videoplayback)
after pre -training,
10:38
S…
Speaker 2 (videoplayback)
there's a second stage called post -training.
10:40
S…
Speaker 2 (videoplayback)
And the post -training stage is really attaching a smiley face to this zip file.
10:45
S…
Speaker 2 (videoplayback)
Because we don't want to generate internet documents.
10:47
S…
Speaker 2 (videoplayback)
We want this thing to take on the persona of an assistant that
10:52
S…
Speaker 2 (videoplayback)
responds to user queries.
10:54
S…
Speaker 2 (videoplayback)
And that's done in the process of post -training,
10:56
S…
Speaker 2 (videoplayback)
where we swap out the dataset for a dataset of conversations that are built
11:00
S…
Speaker 2 (videoplayback)
out by humans.
11:02
S…
Speaker 2 (videoplayback)
So this is basically where the model takes on this persona so that
11:06
S…
Speaker 2 (videoplayback)
we can ask questions and it responds with answers.
11:09
S…
Speaker 2 (videoplayback)
So it takes on the style of an assistant,
11:13
S…
Speaker 2 (videoplayback)
that's post -training,
11:14
S…
Speaker 2 (videoplayback)
but it has the knowledge of all of internet,
11:17
S…
Speaker 2 (videoplayback)
and that's by pre -training.
11:19
S…
Speaker 2 (videoplayback)
So these two are combined in this artifact.
11:23
S…
Speaker 2 (videoplayback)
Now the important thing to understand here,
11:26
S…
Speaker 2 (videoplayback)
I think, for this section is that what you are talking to is a fully
11:30
S…
Speaker 2 (videoplayback)
self -contained entity by default.
11:32
S…
Speaker 2 (videoplayback)
This language model,
11:33
S…
Speaker 2 (videoplayback)
think of it as a one terabyte file on a disk.
11:36
S…
Speaker 2 (videoplayback)
Secretly,
11:37
S…
Speaker 2 (videoplayback)
that represents one trillion parameters and their precise settings inside the neural network
11:42
S…
Speaker 2 (videoplayback)
that's trying to give you the next token in the sequence.
11:44
S…
Speaker 2 (videoplayback)
But this is the fully self -contained entity.
11:47
S…
Speaker 2 (videoplayback)
There's no calculator.
11:48
S…
Speaker 2 (videoplayback)
There's no computer and Python interpreter.
11:51
S…
Speaker 2 (videoplayback)
There's no worldwide web browsing.
11:53
S…
Speaker 2 (videoplayback)
There's none of that.
11:54
S…
Speaker 2 (videoplayback)
There's no tool use yet in what we've talked about so far.
11:56
S…
Speaker 2 (videoplayback)
You're talking to a zip file.
11:58
S…
Speaker 2 (videoplayback)
If you stream tokens to it,
12:00
S…
Speaker 2 (videoplayback)
it will respond with tokens back.
12:02
S…
Speaker 2 (videoplayback)
And the zip file has the knowledge from pre -training and it has the
12:06
S…
Speaker 2 (videoplayback)
style and form from post -training.
12:09
S…
Speaker 2 (videoplayback)
And so that's roughly how you can think about this
12:13
S…
Speaker 1 (videoplayback)
entity. Okay,
12:14
S…
Speaker 2 (videoplayback)
so if I had to summarize what we talked about so far,
12:16
S…
Speaker 2 (videoplayback)
I would probably do it in the form of an introduction of ChatGPT in a way that I think
12:20
S…
Speaker 2 (videoplayback)
you should think about it.
12:21
S…
Speaker 2 (videoplayback)
So the introduction would be,
12:23
S…
Speaker 2 (videoplayback)
hi, I'm ChatGPT.
12:24
S…
Speaker 2 (videoplayback)
I'm a one terabyte zip file.
12:26
S…
Speaker 2 (videoplayback)
My knowledge comes from the internet,
12:28
S…
Speaker 2 (videoplayback)
which I read in its entirety.
12:31
S…
Speaker 2 (videoplayback)
about six months ago,
12:32
S…
Speaker 2 (videoplayback)
and I only remember vaguely,
12:34
S…
Speaker 1 (videoplayback)
okay?
12:35
S…
Speaker 2 (videoplayback)
And my winning personality was programmed,
12:37
S…
Speaker 2 (videoplayback)
by example,
12:38
S…
Speaker 2 (videoplayback)
by human labelers at OpenAI.
12:40
S…
Speaker 2 (videoplayback)
So the personality is programmed in post -training,
12:44
S…
Speaker 2 (videoplayback)
and the knowledge comes from compressing the internet during pre
12:48
S…
Speaker 1 (videoplayback)
-training.
12:49
S…
Speaker 2 (videoplayback)
And this knowledge is a little bit out of date and it's a probabilistic and slightly vague.
12:53
S…
Speaker 2 (videoplayback)
Some of the things that probably are mentioned very frequently on the internet,
12:57
S…
Speaker 2 (videoplayback)
I will have a lot better recollection of than some of the things that are discussed very
13:02
S…
Speaker 2 (videoplayback)
rarely, very similar to what you might expect with a human.
13:05
S…
Speaker 2 (videoplayback)
So let's now talk about some of the repercussions of this entity
13:09
S…
Speaker 2 (videoplayback)
and how we can talk to it and what kinds of things we can expect from it.
13:12
S…
Speaker 1 (videoplayback)
Now,
13:12
S…
Speaker 2 (videoplayback)
I'd like to use real examples when we actually go through this.
13:15
S…
Speaker 2 (videoplayback)
So for example,
13:16
S…
Speaker 2 (videoplayback)
this morning, I asked ChatGPT the following.
13:18
S…
Speaker 2 (videoplayback)
How much caffeine is in one shot of Americana?
13:20
S…
Speaker 2 (videoplayback)
And I was curious because I was comparing it to matcha.
13:22
S…
Speaker 1 (videoplayback)
Now,
13:23
S…
Speaker 2 (videoplayback)
Chachi PT will tell me that this is roughly 63 milligrams of caffeine or so.
13:27
S…
Speaker 2 (videoplayback)
Now, the reason I'm asking ChachiPT this question that I think this is okay is,
13:31
S…
Speaker 2 (videoplayback)
number one, I'm not asking about any knowledge that is very recent.
13:35
S…
Speaker 2 (videoplayback)
So I do expect that the model has sort of read about how much caffeine there is
13:39
S…
Speaker 2 (videoplayback)
in one shot.
13:40
S…
Speaker 2 (videoplayback)
I don't think this information has changed too much.
13:43
S…
Speaker 2 (videoplayback)
And number two,
13:43
S…
Speaker 2 (videoplayback)
I think this information is extremely frequent on the internet.
13:46
S…
Speaker 2 (videoplayback)
This kind of a question and this kind of information has occurred all over the place on the internet.
13:50
S…
Speaker 2 (videoplayback)
And because there were so many mentions of it,
13:52
S…
Speaker 2 (videoplayback)
I expect the model to have good memory of it and its knowledge.
13:56
S…
Speaker 2 (videoplayback)
So there's no tool use,
13:58
S…
Speaker 2 (videoplayback)
and the model,
13:58
S…
Speaker 2 (videoplayback)
the zip file,
13:59
S…
Speaker 2 (videoplayback)
responded that there's roughly 63 milligrams.
14:01
S…
Speaker 2 (videoplayback)
Now,
14:02
S…
Speaker 2 (videoplayback)
I'm not guaranteed that this is the correct answer.
14:05
S…
Speaker 2 (videoplayback)
This is just its vague recollection of the internet.
14:09
S…
Speaker 2 (videoplayback)
But I can go to primary sources and maybe I can look up,
14:12
S…
Speaker 2 (videoplayback)
okay,
14:13
S…
Speaker 2 (videoplayback)
caffeine and Americano and I could verify that,
14:16
S…
Speaker 2 (videoplayback)
yeah, it looks to be about 63 is roughly right.
14:18
S…
Speaker 2 (videoplayback)
And you can look at primary sources to decide if this is true or not.
14:21
S…
Speaker 2 (videoplayback)
So I'm not strictly speaking guaranteed that this is true,
14:24
S…
Speaker 2 (videoplayback)
but I think probably this is the kind of thing that ChatGPT would know.
14:27
S…
Speaker 2 (videoplayback)
Here's an example of a conversation I had two days ago,
14:30
S…
Speaker 1 (videoplayback)
actually.
14:31
S…
Speaker 2 (videoplayback)
And there's another example of a knowledge -based conversation and
14:35
S…
Speaker 2 (videoplayback)
things that I'm comfortable asking of ChatGPT with some caveats.
14:38
S…
Speaker 2 (videoplayback)
So I'm a bit sick.
14:39
S…
Speaker 2 (videoplayback)
I have runny nose and I want to get meds that help with that.
14:42
S…
Speaker 2 (videoplayback)
So it told me a bunch of stuff.
14:43
S…
Speaker 2 (videoplayback)
And I want my nose
14:47
S…
Speaker 2 (videoplayback)
to not be runny.
14:48
S…
Speaker 2 (videoplayback)
So I gave it a clarification based on what it said.
14:50
S…
Speaker 2 (videoplayback)
And then it kind of gave me some of the things that might be helpful with that.
14:53
S…
Speaker 2 (videoplayback)
And then I looked at some of the meds that I have at home and I said,
14:57
S…
Speaker 2 (videoplayback)
does day cool or night cool work?
15:00
S…
Speaker 2 (videoplayback)
It went off and it kind of like went over the ingredients of Dayquil and Nikol and
15:04
S…
Speaker 2 (videoplayback)
whether or not they helped mitigate running nose.
15:08
S…
Speaker 2 (videoplayback)
Now, when these ingredients are coming here,
15:10
S…
Speaker 2 (videoplayback)
again, remember, we are talking to a zip file that has a recollection of the internet.
15:13
S…
Speaker 2 (videoplayback)
I'm not guaranteed that these ingredients are correct.
15:16
S…
Speaker 2 (videoplayback)
And in fact,
15:17
S…
Speaker 2 (videoplayback)
I actually took out the box and I looked at the ingredients and I made sure that NyQuil
15:21
S…
Speaker 2 (videoplayback)
ingredients are exactly these ingredients.
15:23
S…
Speaker 2 (videoplayback)
And I'm doing that because I don't always fully trust what's coming out here,
15:28
S…
Speaker 2 (videoplayback)
right? This is just a probabilistic statistical recollection of the internet.
15:31
S…
Speaker 2 (videoplayback)
But that said,
15:33
S…
Speaker 2 (videoplayback)
conversations of NyQuil and NyQuil,
15:35
S…
Speaker 2 (videoplayback)
these are very common meds.
15:37
S…
Speaker 2 (videoplayback)
Probably there's tons of information about a lot of this on the internet.
15:40
S…
Speaker 2 (videoplayback)
And this is the kind of things that the model have pretty good recollection of.
15:44
S…
Speaker 2 (videoplayback)
So actually these were all correct.
15:46
S…
Speaker 2 (videoplayback)
And then I said,
15:47
S…
Speaker 2 (videoplayback)
okay, well, I have Nikol.
15:48
S…
Speaker 2 (videoplayback)
How fast would it act roughly?
15:51
S…
Speaker 2 (videoplayback)
And it kind of tells me.
15:52
S…
Speaker 2 (videoplayback)
And then is acetaminophen basically a Tylenol?
15:56
S…
Speaker 2 (videoplayback)
And it says yes.
15:57
S…
Speaker 2 (videoplayback)
So this is a good example of how ChatGPT was useful to me.
16:00
S…
Speaker 2 (videoplayback)
It is a knowledge -based query.
16:01
S…
Speaker 2 (videoplayback)
This knowledge sort of isn't recent knowledge.
16:04
S…
Speaker 2 (videoplayback)
This is all coming from the knowledge of the model.
16:07
S…
Speaker 2 (videoplayback)
I think this is common information.
16:08
S…
Speaker 2 (videoplayback)
This is not a high -stakes situation.
16:10
S…
Speaker 2 (videoplayback)
I'm checking ChatGPT a little bit,
16:13
S…
Speaker 2 (videoplayback)
but also this is not a high -stakes situation,
16:15
S…
Speaker 2 (videoplayback)
so no big deal.
16:16
S…
Speaker 2 (videoplayback)
So I popped an I call,
16:17
S…
Speaker 2 (videoplayback)
and indeed it helped.
16:18
S…
Speaker 2 (videoplayback)
But that's roughly how I'm thinking about what's coming back here.
16:22
S…
Speaker 1 (videoplayback)
Okay,
16:22
S…
Speaker 2 (videoplayback)
so at this point, I want to make two nodes.
16:25
S…
Speaker 2 (videoplayback)
The first note I want to make is that naturally as you interact with these models,
16:28
S…
Speaker 2 (videoplayback)
you'll see that your conversations are growing longer,
16:31
S…
Speaker 1 (videoplayback)
right?
16:31
S…
Speaker 2 (videoplayback)
Anytime you are switching topic,
16:34
S…
Speaker 2 (videoplayback)
I encourage you to always start a new chat.
16:37
S…
Speaker 2 (videoplayback)
When you start a new chat,
16:39
S…
Speaker 2 (videoplayback)
as we talked about,
16:40
S…
Speaker 2 (videoplayback)
you are wiping the context window of tokens and resetting it back to zero.
16:44
S…
Speaker 2 (videoplayback)
If it is the case that those tokens are not any more useful to your next query,
16:48
S…
Speaker 2 (videoplayback)
I encourage you to do this because these tokens in this window are expensive.
16:52
S…
Speaker 2 (videoplayback)
And they're expensive in kind of like two ways.
16:55
S…
Speaker 2 (videoplayback)
Number one, if you have lots of tokens here,
16:58
S…
Speaker 2 (videoplayback)
then the model can actually find it a little bit distracting.
17:01
S…
Speaker 2 (videoplayback)
So if this was a lot of tokens,
17:04
S…
Speaker 2 (videoplayback)
this is kind of like the working memory of the model.
17:07
S…
Speaker 2 (videoplayback)
The model might be distracted by all the tokens in the past when it is trying
17:11
S…
Speaker 2 (videoplayback)
to sample tokens much later on.
17:13
S…
Speaker 2 (videoplayback)
So it could be distracting and it could actually decrease the accuracy of the
17:17
S…
Speaker 2 (videoplayback)
model and of its performance.
17:18
S…
Speaker 2 (videoplayback)
And number two,
17:19
S…
Speaker 2 (videoplayback)
the more tokens are in the window,
17:22
S…
Speaker 2 (videoplayback)
the more expensive it is by a little bit,
17:24
S…
Speaker 2 (videoplayback)
not by too much,
17:25
S…
Speaker 2 (videoplayback)
but by a little bit to sample the next token in the sequence.
17:28
S…
Speaker 2 (videoplayback)
So your model is actually slightly slowing down.
17:30
S…
Speaker 2 (videoplayback)
It's becoming more expensive to calculate the next token and the more
17:34
S…
Speaker 2 (videoplayback)
tokens there are here.
17:37
S…
Speaker 2 (videoplayback)
And so think of the tokens in the context window as a precious resource.
17:41
S…
Speaker 2 (videoplayback)
Think of that as the working memory of the model and don't
17:45
S…
Speaker 2 (videoplayback)
overload it with irrelevant information and keep it as short as you can.
17:49
S…
Speaker 2 (videoplayback)
And you can expect that to work faster and slightly better.
17:52
S…
Speaker 1 (videoplayback)
Of course,
17:53
S…
Speaker 2 (videoplayback)
if the information actually is related to your task,
17:56
S…
Speaker 2 (videoplayback)
you may want to keep it in there.
17:57
S…
Speaker 2 (videoplayback)
But I encourage you to,
17:58
S…
Speaker 2 (videoplayback)
as often as you can,
17:59
S…
Speaker 2 (videoplayback)
basically start a new chat whenever you are switching topic.
18:03
S…
Speaker 2 (videoplayback)
The second thing is that I always encourage you to keep in mind what model you are actually
18:07
S…
Speaker 2 (videoplayback)
using. So here on the top left,
18:09
S…
Speaker 2 (videoplayback)
we can drop down and we can see that we are currently using GPT -40.
18:12
S…
Speaker 1 (videoplayback)
Now,
18:13
S…
Speaker 2 (videoplayback)
there are many different models of many different flavors and there are
18:17
S…
Speaker 2 (videoplayback)
too many actually,
18:18
S…
Speaker 2 (videoplayback)
but we'll go through some of these over time.
18:19
S…
Speaker 2 (videoplayback)
So we are using GPT -40 right now and in everything that I've shown you,
18:23
S…
Speaker 2 (videoplayback)
this is GPT -40.
18:24
S…
Speaker 1 (videoplayback)
Now,
18:25
S…
Speaker 2 (videoplayback)
when I open a new incognito window,
18:27
S…
Speaker 2 (videoplayback)
so if I go to chatgpt .com and I'm not logged in,
18:32
S…
Speaker 2 (videoplayback)
The model that I'm talking to here,
18:33
S…
Speaker 2 (videoplayback)
so if I just say hello,
18:34
S…
Speaker 2 (videoplayback)
the model that I'm talking to here might not be GPT 4 .0.
18:37
S…
Speaker 2 (videoplayback)
It might be a smaller version.
18:39
S…
Speaker 2 (videoplayback)
Now,
18:40
S…
Speaker 2 (videoplayback)
unfortunately, OpenAI does not tell me when I'm not logged in what model I'm using,
18:44
S…
Speaker 2 (videoplayback)
which is kind of unfortunate.
18:45
S…
Speaker 2 (videoplayback)
But it's possible that you are using a smaller,
18:47
S…
Speaker 2 (videoplayback)
kind of dumber model.
18:49
S…
Speaker 2 (videoplayback)
So if we go to the ChatGPT pricing page here,
18:51
S…
Speaker 2 (videoplayback)
we see that they have three basic tiers for individuals,
18:55
S…
Speaker 2 (videoplayback)
the free,
18:56
S…
Speaker 2 (videoplayback)
plus, and pro.
18:57
S…
Speaker 2 (videoplayback)
And in the free tier,
19:00
S…
Speaker 2 (videoplayback)
you have access to what's called GPT -40 Mini.
19:02
S…
Speaker 2 (videoplayback)
And this is a smaller version of GPT -40.
19:05
S…
Speaker 2 (videoplayback)
It is a smaller model with a smaller number of parameters.
19:08
S…
Speaker 2 (videoplayback)
It's not going to be as creative,
19:10
S…
Speaker 2 (videoplayback)
like its writing might not be as good.
19:12
S…
Speaker 2 (videoplayback)
Its knowledge is not going to be as good.
19:14
S…
Speaker 2 (videoplayback)
It's going to probably hallucinate a bit more,
19:16
S…
Speaker 1 (videoplayback)
etc.
19:16
S…
Speaker 2 (videoplayback)
But it is kind of like the free offering,
19:19
S…
Speaker 2 (videoplayback)
the free tier.
19:19
S…
Speaker 2 (videoplayback)
They do say that you have limited access to 4 .0 and O3 Mini,
19:23
S…
Speaker 2 (videoplayback)
but I'm not actually 100 % sure.
19:24
S…
Speaker 1 (videoplayback)
Like,
19:25
S…
Speaker 2 (videoplayback)
it didn't tell us which model we were using,
19:26
S…
Speaker 2 (videoplayback)
so we just fundamentally don't know.
19:29
S…
Speaker 2 (videoplayback)
Now, when you pay for $20 per month,
19:31
S…
Speaker 2 (videoplayback)
even though it doesn't say this,
19:33
S…
Speaker 2 (videoplayback)
I think basically like they're screwing up on how they're describing this.
19:37
S…
Speaker 2 (videoplayback)
But if you go to fine print,
19:38
S…
Speaker 2 (videoplayback)
limits apply,
19:38
S…
Speaker 2 (videoplayback)
we can see that the plus users get 80
19:43
S…
Speaker 2 (videoplayback)
messages every three hours for GPT -40.
19:45
S…
Speaker 2 (videoplayback)
So that's the flagship biggest model that's currently available as of today.
19:50
S…
Speaker 2 (videoplayback)
That's available and that's what we want to be using.
19:53
S…
Speaker 2 (videoplayback)
So if you pay $20 per month,
19:55
S…
Speaker 2 (videoplayback)
you have that with some limits.
19:57
S…
Speaker 2 (videoplayback)
And then if you pay for $200 per month, you get the price.
20:00
S…
Speaker 2 (videoplayback)
and there's a bunch of additional goodies as well as unlimited GPT -4O.
20:03
S…
Speaker 2 (videoplayback)
And we're going to go into some of this because I do pay for pro subscription.
20:06
S…
Speaker 1 (videoplayback)
Now,
20:08
S…
Speaker 2 (videoplayback)
the whole takeaway I want you to get from this is be mindful of
20:12
S…
Speaker 2 (videoplayback)
the models that you're using.
20:13
S…
Speaker 2 (videoplayback)
Typically with these companies,
20:14
S…
Speaker 2 (videoplayback)
the bigger models are more expensive to...
20:18
S…
Speaker 1 (videoplayback)
calculate.
20:18
S…
Speaker 2 (videoplayback)
And so therefore,
20:19
S…
Speaker 2 (videoplayback)
the companies charge more for the bigger models.
This transcript was generated by AI (automatic speech recognition). May contain errors — verify against the original audio for critical use. AI policy
Tóm tắt
Nhấn tổng hợp để tạo một tóm tắt AI của bản ghi này.
Tóm tắt...
Hỏi AI về bản ghi này
Hỏi bất cứ điều gì về bản ghi chép này - AI sẽ tìm thấy các phần liên quan và trả lời.