videoplayback

0:00

S… Speaker 2 (videoplayback)

Hi, everyone.

0:00

S… Speaker 2 (videoplayback)

So in this video,

0:01

S… Speaker 2 (videoplayback)

I would like to continue our general audience series on large

0:05

S… Speaker 2 (videoplayback)

language models like ChatGPT.

0:07

S… Speaker 1 (videoplayback)

Now,

0:08

S… Speaker 2 (videoplayback)

in a previous video,

0:09

S… Speaker 2 (videoplayback)

deep dive into LLMs that you can find on my YouTube,

0:11

S… Speaker 2 (videoplayback)

we went into a lot of the under -the -hood fundamentals of how these models are trained and

0:15

S… Speaker 2 (videoplayback)

how you should think about their cognition or psychology.

0:18

S… Speaker 2 (videoplayback)

Now, in this video,

0:20

S… Speaker 2 (videoplayback)

I want to go into more practical applications of these tools.

0:23

S… Speaker 2 (videoplayback)

I want to show you lots of examples.

0:25

S… Speaker 2 (videoplayback)

I want to take you through all the different settings that are available.

0:27

S… Speaker 2 (videoplayback)

And I want to show you how I use these tools and how you can also use them in

0:32

S… Speaker 2 (videoplayback)

your own life and work.

0:33

S… Speaker 2 (videoplayback)

So let's dive in.

0:35

S… Speaker 1 (videoplayback)

Okay,

0:35

S… Speaker 2 (videoplayback)

so first of all, the web page that I have pulled up here is chatgpt .com.

0:39

S… Speaker 1 (videoplayback)

Now,

0:40

S… Speaker 2 (videoplayback)

as you might know, ChatGPT was developed by OpenAI and deployed in 2022.

0:45

S… Speaker 2 (videoplayback)

So this was the first time that people could actually just kind of like talk to a large language model

0:49

S… Speaker 2 (videoplayback)

through a text interface.

0:50

S… Speaker 2 (videoplayback)

And this went viral and all over the place on the internet.

0:53

S… Speaker 2 (videoplayback)

And this was huge.

0:55

S… Speaker 1 (videoplayback)

Now,

0:56

S… Speaker 2 (videoplayback)

since then,

0:56

S… Speaker 2 (videoplayback)

though, the ecosystem has grown a lot.

0:58

S… Speaker 2 (videoplayback)

So I'm going to be showing you a lot of examples of ChatGPT specifically.

1:01

S… Speaker 2 (videoplayback)

But now in 2025,

1:04

S… Speaker 2 (videoplayback)

there's many other apps that are kind of like ChatGPT -like.

1:08

S… Speaker 2 (videoplayback)

And this is now a much bigger and richer ecosystem.

1:11

S… Speaker 2 (videoplayback)

So in particular,

1:11

S… Speaker 2 (videoplayback)

I think ChatGPT by OpenAI is this original gangster incumbent.

1:15

S… Speaker 2 (videoplayback)

It's most popular and most feature -rich also because it's been around the

1:19

S… Speaker 1 (videoplayback)

longest.

1:20

S… Speaker 2 (videoplayback)

But there are many other kind of clones available,

1:23

S… Speaker 2 (videoplayback)

I would say.

1:23

S… Speaker 2 (videoplayback)

I don't think it's too unfair to say.

1:25

S… Speaker 2 (videoplayback)

But in some cases,

1:26

S… Speaker 2 (videoplayback)

there are kind of like unique experiences that are not found in ChatGPT.

1:29

S… Speaker 2 (videoplayback)

And we're going to see examples of those.

1:32

S… Speaker 2 (videoplayback)

So for example,

1:33

S… Speaker 2 (videoplayback)

Big Tech has followed with a lot of kind of ChatGPT -like experiences.

1:37

S… Speaker 2 (videoplayback)

So for example,

1:38

S… Speaker 2 (videoplayback)

Gemini, Meta AI,

1:39

S… Speaker 2 (videoplayback)

and Copilot from Google,

1:40

S… Speaker 2 (videoplayback)

Meta, and Microsoft,

1:41

S… Speaker 1 (videoplayback)

respectively.

1:42

S… Speaker 2 (videoplayback)

And there's also a number of startups.

1:43

S… Speaker 2 (videoplayback)

So for example,

1:44

S… Speaker 2 (videoplayback)

Anthropic has Cloud,

1:46

S… Speaker 2 (videoplayback)

which is kind of like a ChatGPT equivalent.

1:48

S… Speaker 2 (videoplayback)

XAI,

1:49

S… Speaker 2 (videoplayback)

which is Elon's company,

1:50

S… Speaker 2 (videoplayback)

has Grok.

1:51

S… Speaker 2 (videoplayback)

And there's many others.

1:53

S… Speaker 2 (videoplayback)

So all of these here are from the United States.

1:57

S… Speaker 2 (videoplayback)

companies basically deep seek is a chinese company and le chat is

2:01

S… Speaker 2 (videoplayback)

a french company mistral now where can you find these and how can

2:05

S… Speaker 2 (videoplayback)

you keep track of them well number one on the internet somewhere but there are some leaderboards and

2:09

S… Speaker 2 (videoplayback)

in the previous video i've shown you chatbot arena is one of them so here you

2:13

S… Speaker 2 (videoplayback)

can come to some ranking of different models and you can see sort of their strength

2:18

S… Speaker 2 (videoplayback)

or elo score

2:19

S… Speaker 2 (videoplayback)

And so this is one place where you can keep track of them.

2:21

S… Speaker 2 (videoplayback)

I would say like another place maybe is this seal leaderboard

2:26

S… Speaker 2 (videoplayback)

from scale.

2:27

S… Speaker 2 (videoplayback)

And so here you can also see different kinds of evals and different kinds

2:31

S… Speaker 2 (videoplayback)

of models and how well they rank.

2:32

S… Speaker 2 (videoplayback)

And you can also come here to see which models are currently performing the best

2:36

S… Speaker 2 (videoplayback)

on a wide variety of tasks.

2:40

S… Speaker 2 (videoplayback)

So understand that the ecosystem is fairly rich,

2:42

S… Speaker 2 (videoplayback)

but for now,

2:43

S… Speaker 2 (videoplayback)

I'm going to start with OpenAI because it is the incumbent and is most feature -rich,

2:47

S… Speaker 2 (videoplayback)

but I'm going to show you others over time as well.

2:50

S… Speaker 2 (videoplayback)

So let's start with ChatGPT.

2:51

S… Speaker 2 (videoplayback)

What is this text box and what do we put in here?

2:54

S… Speaker 2 (videoplayback)

Okay, so the most basic form of interaction with the language model is that we give it

2:58

S… Speaker 2 (videoplayback)

text and then we get some text back in response.

3:00

S… Speaker 2 (videoplayback)

So as an example,

3:02

S… Speaker 2 (videoplayback)

we can ask to get a haiku about what it's like to be a large language model.

3:06

S… Speaker 2 (videoplayback)

So this is a good kind of example task for a language model because these

3:11

S… Speaker 2 (videoplayback)

models are really good at writing.

3:12

S… Speaker 2 (videoplayback)

So writing haikus or poems or cover letters or

3:16

S… Speaker 2 (videoplayback)

resumes or email replies,

3:18

S… Speaker 2 (videoplayback)

they're just good at writing.

3:20

S… Speaker 2 (videoplayback)

So when we ask for something like this,

3:22

S… Speaker 2 (videoplayback)

what happens looks as follows.

3:23

S… Speaker 2 (videoplayback)

The model basically responds,

3:25

S… Speaker 2 (videoplayback)

words flow like a stream,

3:28

S… Speaker 2 (videoplayback)

endless echoes nevermind,

3:29

S… Speaker 2 (videoplayback)

ghost of thought unseen.

3:31

S… Speaker 2 (videoplayback)

Okay,

3:32

S… Speaker 2 (videoplayback)

it's pretty dramatic.

3:34

S… Speaker 2 (videoplayback)

But what we're seeing here in ChatGPT is something that looks a bit like a conversation that you

3:38

S… Speaker 2 (videoplayback)

would have with a friend.

3:38

S… Speaker 2 (videoplayback)

These are kind of like chat bubbles.

3:40

S… Speaker 1 (videoplayback)

Now,

3:41

S… Speaker 2 (videoplayback)

we saw in the previous video is that what's going on under the hood here is

3:45

S… Speaker 2 (videoplayback)

that this is what we call a user query,

3:47

S… Speaker 2 (videoplayback)

this piece of text.

3:49

S… Speaker 2 (videoplayback)

And this piece of text and also the response from the model,

3:52

S… Speaker 2 (videoplayback)

this piece of text is chopped up into little text chunks that

3:56

S… Speaker 2 (videoplayback)

we call tokens.

3:58

S… Speaker 2 (videoplayback)

So this sequence of text is under the hood a token sequence,

4:03

S… Speaker 2 (videoplayback)

one -dimensional token sequence.

4:04

S… Speaker 2 (videoplayback)

Now the way we can see those tokens is we can use an app like,

4:07

S… Speaker 2 (videoplayback)

for example, TickTokenizer.

4:08

S… Speaker 2 (videoplayback)

So making sure that GPT -40 is selected,

4:10

S… Speaker 2 (videoplayback)

I can paste my text here.

4:12

S… Speaker 2 (videoplayback)

And this is actually what the model sees under the hood.

4:14

S… Speaker 2 (videoplayback)

My piece of text to the model looks like a sequence of

4:19

S… Speaker 2 (videoplayback)

exactly 15 tokens.

4:20

S… Speaker 2 (videoplayback)

And these are the little text chunks that the model sees.

4:25

S… Speaker 2 (videoplayback)

Now, there's a vocabulary here of 200 ,000 roughly of possible

4:29

S… Speaker 2 (videoplayback)

tokens.

4:30

S… Speaker 2 (videoplayback)

And then these are the token IDs corresponding to all these little

4:34

S… Speaker 2 (videoplayback)

text chunks that are part of my query.

4:35

S… Speaker 2 (videoplayback)

And you can play with this and update it.

4:37

S… Speaker 2 (videoplayback)

And you can see that,

4:38

S… Speaker 2 (videoplayback)

for example, this is kate -sensitive.

4:39

S… Speaker 2 (videoplayback)

You would get different tokens.

4:40

S… Speaker 2 (videoplayback)

And you can kind of edit it and see live how the token sequence changes.

4:44

S… Speaker 2 (videoplayback)

So our query was 15 tokens.

4:46

S… Speaker 2 (videoplayback)

And then the model response is right here.

4:50

S… Speaker 2 (videoplayback)

And it responded back to us with a sequence of exactly 19

4:54

S… Speaker 1 (videoplayback)

tokens.

4:54

S… Speaker 2 (videoplayback)

So that haiku is this sequence of 19 tokens.

4:58

S… Speaker 1 (videoplayback)

Now...

5:00

S… Speaker 2 (videoplayback)

So we said 15 tokens and it said 19 tokens back.

5:03

S… Speaker 1 (videoplayback)

Now,

5:04

S… Speaker 2 (videoplayback)

because this is a conversation and we want to actually maintain a lot of the metadata that

5:08

S… Speaker 2 (videoplayback)

actually makes up a conversation object,

5:10

S… Speaker 2 (videoplayback)

this is not all that's going on under the hood.

5:13

S… Speaker 2 (videoplayback)

And we saw in the previous video a little bit about the conversation format.

5:17

S… Speaker 2 (videoplayback)

So it gets a little bit more complicated in that we have to take our user

5:21

S… Speaker 1 (videoplayback)

query.

5:22

S… Speaker 2 (videoplayback)

And we have to actually use this chat format.

5:24

S… Speaker 2 (videoplayback)

So let me delete the system message.

5:26

S… Speaker 2 (videoplayback)

I don't think it's very important for the purposes of understanding what's going on.

5:29

S… Speaker 2 (videoplayback)

Let me paste my message as the user.

5:31

S… Speaker 2 (videoplayback)

And then let me paste the model response as an assistant.

5:35

S… Speaker 2 (videoplayback)

And then let me crop it here properly.

5:38

S… Speaker 2 (videoplayback)

The tool doesn't do that properly.

5:40

S… Speaker 2 (videoplayback)

So here we have it as it actually

5:44

S… Speaker 2 (videoplayback)

happens under the hood.

5:46

S… Speaker 2 (videoplayback)

There are all these special tokens that basically begin a message from

5:50

S… Speaker 2 (videoplayback)

the user, and then the user says,

5:52

S… Speaker 2 (videoplayback)

and this is the content of what we said.

5:54

S… Speaker 2 (videoplayback)

And then the user ends,

5:56

S… Speaker 2 (videoplayback)

and then the assistant begins and says this,

5:59

S… Speaker 1 (videoplayback)

etc.

6:00

S… Speaker 1 (videoplayback)

Now,

6:01

S… Speaker 2 (videoplayback)

the precise details of the conversation format are not important.

6:03

S… Speaker 2 (videoplayback)

What I want to get across here is that what looks to you and I as little chat

6:08

S… Speaker 2 (videoplayback)

bubbles going back and forth,

6:09

S… Speaker 2 (videoplayback)

under the hood,

6:10

S… Speaker 2 (videoplayback)

we are collaborating with the model.

6:12

S… Speaker 2 (videoplayback)

And we're both writing into a token stream.

6:16

S… Speaker 2 (videoplayback)

And these two bubbles back and forth were in

6:20

S… Speaker 2 (videoplayback)

a sequence of exactly 42 tokens under the hood.

6:23

S… Speaker 2 (videoplayback)

I contributed some of the first tokens,

6:25

S… Speaker 2 (videoplayback)

and then the model continued the sequence of tokens with its response.

6:29

S… Speaker 2 (videoplayback)

And we could alternate and continue adding tokens here.

6:32

S… Speaker 2 (videoplayback)

And together,

6:33

S… Speaker 2 (videoplayback)

we are building out a token window,

6:35

S… Speaker 2 (videoplayback)

a one -dimensional sequence of tokens.

6:38

S… Speaker 1 (videoplayback)

Okay,

6:39

S… Speaker 2 (videoplayback)

so let's come back to ChachiPT now.

6:42

S… Speaker 2 (videoplayback)

What we are seeing here is kind of like little bubbles going back and forth between us and the

6:46

S… Speaker 2 (videoplayback)

model. Under the hood,

6:47

S… Speaker 2 (videoplayback)

we are building out a one -dimensional token sequence.

6:49

S… Speaker 2 (videoplayback)

When I click new chat here,

6:52

S… Speaker 2 (videoplayback)

that wipes the token window.

6:54

S… Speaker 2 (videoplayback)

That resets the tokens to basically zero again and

6:58

S… Speaker 2 (videoplayback)

restarts the conversation from scratch.

7:00

S… Speaker 1 (videoplayback)

Now,

7:00

S… Speaker 2 (videoplayback)

the cartoon diagram that I have in my mind when I'm speaking to a model looks something like this.

7:05

S… Speaker 2 (videoplayback)

When we click new chat,

7:07

S… Speaker 2 (videoplayback)

we begin a token sequence.

7:10

S… Speaker 2 (videoplayback)

So this is a one -dimensional sequence of tokens.

7:12

S… Speaker 2 (videoplayback)

The user,

7:13

S… Speaker 2 (videoplayback)

we can write tokens into this stream.

7:16

S… Speaker 2 (videoplayback)

And then when we hit enter,

7:18

S… Speaker 2 (videoplayback)

we transfer control over to the language model.

7:21

S… Speaker 2 (videoplayback)

And the language model responds with its own token streams.

7:24

S… Speaker 2 (videoplayback)

And then the language model has a special token that basically says something

7:29

S… Speaker 2 (videoplayback)

along the lines of,

7:29

S… Speaker 1 (videoplayback)

I'm done.

7:30

S… Speaker 2 (videoplayback)

So when it emits that token,

7:32

S… Speaker 2 (videoplayback)

the chat GPT application transfers control back to us,

7:35

S… Speaker 2 (videoplayback)

and we can take turns.

7:36

S… Speaker 2 (videoplayback)

Together,

7:37

S… Speaker 2 (videoplayback)

we are building out the token stream,

7:40

S… Speaker 2 (videoplayback)

which we also call the context window.

7:42

S… Speaker 2 (videoplayback)

So the context window is kind of like this working memory of

7:46

S… Speaker 2 (videoplayback)

tokens, and anything that is inside this context window is kind of like in the working

7:50

S… Speaker 2 (videoplayback)

memory of this conversation,

7:52

S… Speaker 2 (videoplayback)

and is very directly accessible by the model.

7:56

S… Speaker 2 (videoplayback)

Now, what is this entity here that we are talking to and how should we think about it?

8:00

S… Speaker 1 (videoplayback)

Well,

8:01

S… Speaker 2 (videoplayback)

this language model here,

8:02

S… Speaker 2 (videoplayback)

we saw that the way it is trained in the previous video,

8:05

S… Speaker 2 (videoplayback)

we saw there are two major stages,

8:07

S… Speaker 2 (videoplayback)

the pre -training stage and the post -training stage.

8:10

S… Speaker 2 (videoplayback)

The pre -training stage is kind of like taking all of

8:14

S… Speaker 2 (videoplayback)

internet, chopping it up into tokens,

8:17

S… Speaker 2 (videoplayback)

and then compressing it into a single kind of like zip file.

8:21

S… Speaker 2 (videoplayback)

But the zip file is not exact.

8:23

S… Speaker 2 (videoplayback)

The zip file is lossy and probabilistic zip file because

8:27

S… Speaker 2 (videoplayback)

we can't possibly represent all of internet in just one sort of like,

8:30

S… Speaker 2 (videoplayback)

say, terabyte of zip file because

8:35

S… Speaker 2 (videoplayback)

there's just way too much information.

8:36

S… Speaker 2 (videoplayback)

So we just kind of get the gestalt or the vibes inside this

8:41

S… Speaker 2 (videoplayback)

zip file.

8:42

S… Speaker 1 (videoplayback)

Now,

8:44

S… Speaker 2 (videoplayback)

what's actually inside the zip file are the parameters of a neural

8:48

S… Speaker 1 (videoplayback)

network.

8:49

S… Speaker 2 (videoplayback)

And so,

8:49

S… Speaker 2 (videoplayback)

for example, a one terabyte zip file would correspond to roughly,

8:53

S… Speaker 2 (videoplayback)

say, one trillion parameters inside this neural network.

8:56

S… Speaker 2 (videoplayback)

And what this neural network is trying to do is it's trying to basically

9:01

S… Speaker 2 (videoplayback)

take tokens and it's trying to predict the next token in a sequence.

9:04

S… Speaker 2 (videoplayback)

But it's doing that on internet documents.

9:07

S… Speaker 2 (videoplayback)

So it's kind of like this internet document generator,

9:10

S… Speaker 1 (videoplayback)

right?

9:11

S… Speaker 2 (videoplayback)

And in the process of predicting the next token in a sequence on internet,

9:15

S… Speaker 2 (videoplayback)

the neural network gains a huge amount of knowledge about

9:19

S… Speaker 1 (videoplayback)

the world.

9:20

S… Speaker 2 (videoplayback)

And this knowledge is all represented and stuffed and compressed

9:24

S… Speaker 2 (videoplayback)

inside the 1 trillion parameters,

9:26

S… Speaker 2 (videoplayback)

roughly, of this language model.

9:27

S… Speaker 1 (videoplayback)

Now,

9:29

S… Speaker 2 (videoplayback)

this pre -training stage also we saw is fairly costly.

9:31

S… Speaker 2 (videoplayback)

So this can be many tens of millions of dollars,

9:34

S… Speaker 2 (videoplayback)

say like three months of training and so on.

9:36

S… Speaker 2 (videoplayback)

So this is a costly long phase.

9:39

S… Speaker 2 (videoplayback)

For that reason,

9:40

S… Speaker 2 (videoplayback)

this phase is not done that often.

9:43

S… Speaker 2 (videoplayback)

So for example,

9:44

S… Speaker 2 (videoplayback)

GPT -40,

9:45

S… Speaker 2 (videoplayback)

this model was pre -trained probably many

9:49

S… Speaker 2 (videoplayback)

months ago, maybe like even a year ago by now.

9:51

S… Speaker 2 (videoplayback)

And so that's why these models are a little bit out of date.

9:54

S… Speaker 2 (videoplayback)

They have what's called a knowledge cutoff because that knowledge cutoff

9:58

S… Speaker 2 (videoplayback)

corresponds to when the model was...

10:00

S… Speaker 2 (videoplayback)

pre -trained and its knowledge only goes up to that point.

10:03

S… Speaker 1 (videoplayback)

Now,

10:04

S… Speaker 2 (videoplayback)

some knowledge can come into

10:08

S… Speaker 2 (videoplayback)

the model through the post -training phase,

10:10

S… Speaker 2 (videoplayback)

which we'll talk about in a second.

10:12

S… Speaker 2 (videoplayback)

But roughly speaking,

10:13

S… Speaker 2 (videoplayback)

you should think of these models as kind of like a little bit out of date because pre

10:17

S… Speaker 2 (videoplayback)

-training is way too expensive and happens infrequently.

10:21

S… Speaker 2 (videoplayback)

So any kind of recent information,

10:23

S… Speaker 2 (videoplayback)

like if you wanted to talk to your model about something that happened last week or so on,

10:26

S… Speaker 2 (videoplayback)

we're going to need other ways of providing that information to the model because it's not

10:30

S… Speaker 2 (videoplayback)

stored in the knowledge of the model.

10:32

S… Speaker 2 (videoplayback)

So we're going to have various tool use to give that information to the

10:36

S… Speaker 1 (videoplayback)

model. Now,

10:37

S… Speaker 2 (videoplayback)

after pre -training,

10:38

S… Speaker 2 (videoplayback)

there's a second stage called post -training.

10:40

S… Speaker 2 (videoplayback)

And the post -training stage is really attaching a smiley face to this zip file.

10:45

S… Speaker 2 (videoplayback)

Because we don't want to generate internet documents.

10:47

S… Speaker 2 (videoplayback)

We want this thing to take on the persona of an assistant that

10:52

S… Speaker 2 (videoplayback)

responds to user queries.

10:54

S… Speaker 2 (videoplayback)

And that's done in the process of post -training,

10:56

S… Speaker 2 (videoplayback)

where we swap out the dataset for a dataset of conversations that are built

11:00

S… Speaker 2 (videoplayback)

out by humans.

11:02

S… Speaker 2 (videoplayback)

So this is basically where the model takes on this persona so that

11:06

S… Speaker 2 (videoplayback)

we can ask questions and it responds with answers.

11:09

S… Speaker 2 (videoplayback)

So it takes on the style of an assistant,

11:13

S… Speaker 2 (videoplayback)

that's post -training,

11:14

S… Speaker 2 (videoplayback)

but it has the knowledge of all of internet,

11:17

S… Speaker 2 (videoplayback)

and that's by pre -training.

11:19

S… Speaker 2 (videoplayback)

So these two are combined in this artifact.

11:23

S… Speaker 2 (videoplayback)

Now the important thing to understand here,

11:26

S… Speaker 2 (videoplayback)

I think, for this section is that what you are talking to is a fully

11:30

S… Speaker 2 (videoplayback)

self -contained entity by default.

11:32

S… Speaker 2 (videoplayback)

This language model,

11:33

S… Speaker 2 (videoplayback)

think of it as a one terabyte file on a disk.

11:36

S… Speaker 2 (videoplayback)

Secretly,

11:37

S… Speaker 2 (videoplayback)

that represents one trillion parameters and their precise settings inside the neural network

11:42

S… Speaker 2 (videoplayback)

that's trying to give you the next token in the sequence.

11:44

S… Speaker 2 (videoplayback)

But this is the fully self -contained entity.

11:47

S… Speaker 2 (videoplayback)

There's no calculator.

11:48

S… Speaker 2 (videoplayback)

There's no computer and Python interpreter.

11:51

S… Speaker 2 (videoplayback)

There's no worldwide web browsing.

11:53

S… Speaker 2 (videoplayback)

There's none of that.

11:54

S… Speaker 2 (videoplayback)

There's no tool use yet in what we've talked about so far.

11:56

S… Speaker 2 (videoplayback)

You're talking to a zip file.

11:58

S… Speaker 2 (videoplayback)

If you stream tokens to it,

12:00

S… Speaker 2 (videoplayback)

it will respond with tokens back.

12:02

S… Speaker 2 (videoplayback)

And the zip file has the knowledge from pre -training and it has the

12:06

S… Speaker 2 (videoplayback)

style and form from post -training.

12:09

S… Speaker 2 (videoplayback)

And so that's roughly how you can think about this

12:13

S… Speaker 1 (videoplayback)

entity. Okay,

12:14

S… Speaker 2 (videoplayback)

so if I had to summarize what we talked about so far,

12:16

S… Speaker 2 (videoplayback)

I would probably do it in the form of an introduction of ChatGPT in a way that I think

12:20

S… Speaker 2 (videoplayback)

you should think about it.

12:21

S… Speaker 2 (videoplayback)

So the introduction would be,

12:23

S… Speaker 2 (videoplayback)

hi, I'm ChatGPT.

12:24

S… Speaker 2 (videoplayback)

I'm a one terabyte zip file.

12:26

S… Speaker 2 (videoplayback)

My knowledge comes from the internet,

12:28

S… Speaker 2 (videoplayback)

which I read in its entirety.

12:31

S… Speaker 2 (videoplayback)

about six months ago,

12:32

S… Speaker 2 (videoplayback)

and I only remember vaguely,

12:34

S… Speaker 1 (videoplayback)

okay?

12:35

S… Speaker 2 (videoplayback)

And my winning personality was programmed,

12:37

S… Speaker 2 (videoplayback)

by example,

12:38

S… Speaker 2 (videoplayback)

by human labelers at OpenAI.

12:40

S… Speaker 2 (videoplayback)

So the personality is programmed in post -training,

12:44

S… Speaker 2 (videoplayback)

and the knowledge comes from compressing the internet during pre

12:48

S… Speaker 1 (videoplayback)

-training.

12:49

S… Speaker 2 (videoplayback)

And this knowledge is a little bit out of date and it's a probabilistic and slightly vague.

12:53

S… Speaker 2 (videoplayback)

Some of the things that probably are mentioned very frequently on the internet,

12:57

S… Speaker 2 (videoplayback)

I will have a lot better recollection of than some of the things that are discussed very

13:02

S… Speaker 2 (videoplayback)

rarely, very similar to what you might expect with a human.

13:05

S… Speaker 2 (videoplayback)

So let's now talk about some of the repercussions of this entity

13:09

S… Speaker 2 (videoplayback)

and how we can talk to it and what kinds of things we can expect from it.

13:12

S… Speaker 1 (videoplayback)

Now,

13:12

S… Speaker 2 (videoplayback)

I'd like to use real examples when we actually go through this.

13:15

S… Speaker 2 (videoplayback)

So for example,

13:16

S… Speaker 2 (videoplayback)

this morning, I asked ChatGPT the following.

13:18

S… Speaker 2 (videoplayback)

How much caffeine is in one shot of Americana?

13:20

S… Speaker 2 (videoplayback)

And I was curious because I was comparing it to matcha.

13:22

S… Speaker 1 (videoplayback)

Now,

13:23

S… Speaker 2 (videoplayback)

Chachi PT will tell me that this is roughly 63 milligrams of caffeine or so.

13:27

S… Speaker 2 (videoplayback)

Now, the reason I'm asking ChachiPT this question that I think this is okay is,

13:31

S… Speaker 2 (videoplayback)

number one, I'm not asking about any knowledge that is very recent.

13:35

S… Speaker 2 (videoplayback)

So I do expect that the model has sort of read about how much caffeine there is

13:39

S… Speaker 2 (videoplayback)

in one shot.

13:40

S… Speaker 2 (videoplayback)

I don't think this information has changed too much.

13:43

S… Speaker 2 (videoplayback)

And number two,

13:43

S… Speaker 2 (videoplayback)

I think this information is extremely frequent on the internet.

13:46

S… Speaker 2 (videoplayback)

This kind of a question and this kind of information has occurred all over the place on the internet.

13:50

S… Speaker 2 (videoplayback)

And because there were so many mentions of it,

13:52

S… Speaker 2 (videoplayback)

I expect the model to have good memory of it and its knowledge.

13:56

S… Speaker 2 (videoplayback)

So there's no tool use,

13:58

S… Speaker 2 (videoplayback)

and the model,

13:58

S… Speaker 2 (videoplayback)

the zip file,

13:59

S… Speaker 2 (videoplayback)

responded that there's roughly 63 milligrams.

14:01

S… Speaker 2 (videoplayback)

Now,

14:02

S… Speaker 2 (videoplayback)

I'm not guaranteed that this is the correct answer.

14:05

S… Speaker 2 (videoplayback)

This is just its vague recollection of the internet.

14:09

S… Speaker 2 (videoplayback)

But I can go to primary sources and maybe I can look up,

14:12

S… Speaker 2 (videoplayback)

okay,

14:13

S… Speaker 2 (videoplayback)

caffeine and Americano and I could verify that,

14:16

S… Speaker 2 (videoplayback)

yeah, it looks to be about 63 is roughly right.

14:18

S… Speaker 2 (videoplayback)

And you can look at primary sources to decide if this is true or not.

14:21

S… Speaker 2 (videoplayback)

So I'm not strictly speaking guaranteed that this is true,

14:24

S… Speaker 2 (videoplayback)

but I think probably this is the kind of thing that ChatGPT would know.

14:27

S… Speaker 2 (videoplayback)

Here's an example of a conversation I had two days ago,

14:30

S… Speaker 1 (videoplayback)

actually.

14:31

S… Speaker 2 (videoplayback)

And there's another example of a knowledge -based conversation and

14:35

S… Speaker 2 (videoplayback)

things that I'm comfortable asking of ChatGPT with some caveats.

14:38

S… Speaker 2 (videoplayback)

So I'm a bit sick.

14:39

S… Speaker 2 (videoplayback)

I have runny nose and I want to get meds that help with that.

14:42

S… Speaker 2 (videoplayback)

So it told me a bunch of stuff.

14:43

S… Speaker 2 (videoplayback)

And I want my nose

14:47

S… Speaker 2 (videoplayback)

to not be runny.

14:48

S… Speaker 2 (videoplayback)

So I gave it a clarification based on what it said.

14:50

S… Speaker 2 (videoplayback)

And then it kind of gave me some of the things that might be helpful with that.

14:53

S… Speaker 2 (videoplayback)

And then I looked at some of the meds that I have at home and I said,

14:57

S… Speaker 2 (videoplayback)

does day cool or night cool work?

15:00

S… Speaker 2 (videoplayback)

It went off and it kind of like went over the ingredients of Dayquil and Nikol and

15:04

S… Speaker 2 (videoplayback)

whether or not they helped mitigate running nose.

15:08

S… Speaker 2 (videoplayback)

Now, when these ingredients are coming here,

15:10

S… Speaker 2 (videoplayback)

again, remember, we are talking to a zip file that has a recollection of the internet.

15:13

S… Speaker 2 (videoplayback)

I'm not guaranteed that these ingredients are correct.

15:16

S… Speaker 2 (videoplayback)

And in fact,

15:17

S… Speaker 2 (videoplayback)

I actually took out the box and I looked at the ingredients and I made sure that NyQuil

15:21

S… Speaker 2 (videoplayback)

ingredients are exactly these ingredients.

15:23

S… Speaker 2 (videoplayback)

And I'm doing that because I don't always fully trust what's coming out here,

15:28

S… Speaker 2 (videoplayback)

right? This is just a probabilistic statistical recollection of the internet.

15:31

S… Speaker 2 (videoplayback)

But that said,

15:33

S… Speaker 2 (videoplayback)

conversations of NyQuil and NyQuil,

15:35

S… Speaker 2 (videoplayback)

these are very common meds.

15:37

S… Speaker 2 (videoplayback)

Probably there's tons of information about a lot of this on the internet.

15:40

S… Speaker 2 (videoplayback)

And this is the kind of things that the model have pretty good recollection of.

15:44

S… Speaker 2 (videoplayback)

So actually these were all correct.

15:46

S… Speaker 2 (videoplayback)

And then I said,

15:47

S… Speaker 2 (videoplayback)

okay, well, I have Nikol.

15:48

S… Speaker 2 (videoplayback)

How fast would it act roughly?

15:51

S… Speaker 2 (videoplayback)

And it kind of tells me.

15:52

S… Speaker 2 (videoplayback)

And then is acetaminophen basically a Tylenol?

15:56

S… Speaker 2 (videoplayback)

And it says yes.

15:57

S… Speaker 2 (videoplayback)

So this is a good example of how ChatGPT was useful to me.

16:00

S… Speaker 2 (videoplayback)

It is a knowledge -based query.

16:01

S… Speaker 2 (videoplayback)

This knowledge sort of isn't recent knowledge.

16:04

S… Speaker 2 (videoplayback)

This is all coming from the knowledge of the model.

16:07

S… Speaker 2 (videoplayback)

I think this is common information.

16:08

S… Speaker 2 (videoplayback)

This is not a high -stakes situation.

16:10

S… Speaker 2 (videoplayback)

I'm checking ChatGPT a little bit,

16:13

S… Speaker 2 (videoplayback)

but also this is not a high -stakes situation,

16:15

S… Speaker 2 (videoplayback)

so no big deal.

16:16

S… Speaker 2 (videoplayback)

So I popped an I call,

16:17

S… Speaker 2 (videoplayback)

and indeed it helped.

16:18

S… Speaker 2 (videoplayback)

But that's roughly how I'm thinking about what's coming back here.

16:22

S… Speaker 1 (videoplayback)

Okay,

16:22

S… Speaker 2 (videoplayback)

so at this point, I want to make two nodes.

16:25

S… Speaker 2 (videoplayback)

The first note I want to make is that naturally as you interact with these models,

16:28

S… Speaker 2 (videoplayback)

you'll see that your conversations are growing longer,

16:31

S… Speaker 1 (videoplayback)

right?

16:31

S… Speaker 2 (videoplayback)

Anytime you are switching topic,

16:34

S… Speaker 2 (videoplayback)

I encourage you to always start a new chat.

16:37

S… Speaker 2 (videoplayback)

When you start a new chat,

16:39

S… Speaker 2 (videoplayback)

as we talked about,

16:40

S… Speaker 2 (videoplayback)

you are wiping the context window of tokens and resetting it back to zero.

16:44

S… Speaker 2 (videoplayback)

If it is the case that those tokens are not any more useful to your next query,

16:48

S… Speaker 2 (videoplayback)

I encourage you to do this because these tokens in this window are expensive.

16:52

S… Speaker 2 (videoplayback)

And they're expensive in kind of like two ways.

16:55

S… Speaker 2 (videoplayback)

Number one, if you have lots of tokens here,

16:58

S… Speaker 2 (videoplayback)

then the model can actually find it a little bit distracting.

17:01

S… Speaker 2 (videoplayback)

So if this was a lot of tokens,

17:04

S… Speaker 2 (videoplayback)

this is kind of like the working memory of the model.

17:07

S… Speaker 2 (videoplayback)

The model might be distracted by all the tokens in the past when it is trying

17:11

S… Speaker 2 (videoplayback)

to sample tokens much later on.

17:13

S… Speaker 2 (videoplayback)

So it could be distracting and it could actually decrease the accuracy of the

17:17

S… Speaker 2 (videoplayback)

model and of its performance.

17:18

S… Speaker 2 (videoplayback)

And number two,

17:19

S… Speaker 2 (videoplayback)

the more tokens are in the window,

17:22

S… Speaker 2 (videoplayback)

the more expensive it is by a little bit,

17:24

S… Speaker 2 (videoplayback)

not by too much,

17:25

S… Speaker 2 (videoplayback)

but by a little bit to sample the next token in the sequence.

17:28

S… Speaker 2 (videoplayback)

So your model is actually slightly slowing down.

17:30

S… Speaker 2 (videoplayback)

It's becoming more expensive to calculate the next token and the more

17:34

S… Speaker 2 (videoplayback)

tokens there are here.

17:37

S… Speaker 2 (videoplayback)

And so think of the tokens in the context window as a precious resource.

17:41

S… Speaker 2 (videoplayback)

Think of that as the working memory of the model and don't

17:45

S… Speaker 2 (videoplayback)

overload it with irrelevant information and keep it as short as you can.

17:49

S… Speaker 2 (videoplayback)

And you can expect that to work faster and slightly better.

17:52

S… Speaker 1 (videoplayback)

Of course,

17:53

S… Speaker 2 (videoplayback)

if the information actually is related to your task,

17:56

S… Speaker 2 (videoplayback)

you may want to keep it in there.

17:57

S… Speaker 2 (videoplayback)

But I encourage you to,

17:58

S… Speaker 2 (videoplayback)

as often as you can,

17:59

S… Speaker 2 (videoplayback)

basically start a new chat whenever you are switching topic.

18:03

S… Speaker 2 (videoplayback)

The second thing is that I always encourage you to keep in mind what model you are actually

18:07

S… Speaker 2 (videoplayback)

using. So here on the top left,

18:09

S… Speaker 2 (videoplayback)

we can drop down and we can see that we are currently using GPT -40.

18:12

S… Speaker 1 (videoplayback)

Now,

18:13

S… Speaker 2 (videoplayback)

there are many different models of many different flavors and there are

18:17

S… Speaker 2 (videoplayback)

too many actually,

18:18

S… Speaker 2 (videoplayback)

but we'll go through some of these over time.

18:19

S… Speaker 2 (videoplayback)

So we are using GPT -40 right now and in everything that I've shown you,

18:23

S… Speaker 2 (videoplayback)

this is GPT -40.

18:24

S… Speaker 1 (videoplayback)

Now,

18:25

S… Speaker 2 (videoplayback)

when I open a new incognito window,

18:27

S… Speaker 2 (videoplayback)

so if I go to chatgpt .com and I'm not logged in,

18:32

S… Speaker 2 (videoplayback)

The model that I'm talking to here,

18:33

S… Speaker 2 (videoplayback)

so if I just say hello,

18:34

S… Speaker 2 (videoplayback)

the model that I'm talking to here might not be GPT 4 .0.

18:37

S… Speaker 2 (videoplayback)

It might be a smaller version.

18:39

S… Speaker 2 (videoplayback)

Now,

18:40

S… Speaker 2 (videoplayback)

unfortunately, OpenAI does not tell me when I'm not logged in what model I'm using,

18:44

S… Speaker 2 (videoplayback)

which is kind of unfortunate.

18:45

S… Speaker 2 (videoplayback)

But it's possible that you are using a smaller,

18:47

S… Speaker 2 (videoplayback)

kind of dumber model.

18:49

S… Speaker 2 (videoplayback)

So if we go to the ChatGPT pricing page here,

18:51

S… Speaker 2 (videoplayback)

we see that they have three basic tiers for individuals,

18:55

S… Speaker 2 (videoplayback)

the free,

18:56

S… Speaker 2 (videoplayback)

plus, and pro.

18:57

S… Speaker 2 (videoplayback)

And in the free tier,

19:00

S… Speaker 2 (videoplayback)

you have access to what's called GPT -40 Mini.

19:02

S… Speaker 2 (videoplayback)

And this is a smaller version of GPT -40.

19:05

S… Speaker 2 (videoplayback)

It is a smaller model with a smaller number of parameters.

19:08

S… Speaker 2 (videoplayback)

It's not going to be as creative,

19:10

S… Speaker 2 (videoplayback)

like its writing might not be as good.

19:12

S… Speaker 2 (videoplayback)

Its knowledge is not going to be as good.

19:14

S… Speaker 2 (videoplayback)

It's going to probably hallucinate a bit more,

19:16

S… Speaker 1 (videoplayback)

etc.

19:16

S… Speaker 2 (videoplayback)

But it is kind of like the free offering,

19:19

S… Speaker 2 (videoplayback)

the free tier.

19:19

S… Speaker 2 (videoplayback)

They do say that you have limited access to 4 .0 and O3 Mini,

19:23

S… Speaker 2 (videoplayback)

but I'm not actually 100 % sure.

19:24

S… Speaker 1 (videoplayback)

Like,

19:25

S… Speaker 2 (videoplayback)

it didn't tell us which model we were using,

19:26

S… Speaker 2 (videoplayback)

so we just fundamentally don't know.

19:29

S… Speaker 2 (videoplayback)

Now, when you pay for $20 per month,

19:31

S… Speaker 2 (videoplayback)

even though it doesn't say this,

19:33

S… Speaker 2 (videoplayback)

I think basically like they're screwing up on how they're describing this.

19:37

S… Speaker 2 (videoplayback)

But if you go to fine print,

19:38

S… Speaker 2 (videoplayback)

limits apply,

19:38

S… Speaker 2 (videoplayback)

we can see that the plus users get 80

19:43

S… Speaker 2 (videoplayback)

messages every three hours for GPT -40.

19:45

S… Speaker 2 (videoplayback)

So that's the flagship biggest model that's currently available as of today.

19:50

S… Speaker 2 (videoplayback)

That's available and that's what we want to be using.

19:53

S… Speaker 2 (videoplayback)

So if you pay $20 per month,

19:55

S… Speaker 2 (videoplayback)

you have that with some limits.

19:57

S… Speaker 2 (videoplayback)

And then if you pay for $200 per month, you get the price.

20:00

S… Speaker 2 (videoplayback)

and there's a bunch of additional goodies as well as unlimited GPT -4O.

20:03

S… Speaker 2 (videoplayback)

And we're going to go into some of this because I do pay for pro subscription.

20:06

S… Speaker 1 (videoplayback)

Now,

20:08

S… Speaker 2 (videoplayback)

the whole takeaway I want you to get from this is be mindful of

20:12

S… Speaker 2 (videoplayback)

the models that you're using.

20:13

S… Speaker 2 (videoplayback)

Typically with these companies,

20:14

S… Speaker 2 (videoplayback)

the bigger models are more expensive to...

20:18

S… Speaker 1 (videoplayback)

calculate.

20:18

S… Speaker 2 (videoplayback)

And so therefore,

20:19

S… Speaker 2 (videoplayback)

the companies charge more for the bigger models.

Tóm tắt

Hỏi AI về bản ghi này