"GPT-4 upgraded, API prices cut, and a GPT Store."

These were the three keywords from last night's OpenAI DevDay. In the early hours of November 7, Beijing time, OpenAI's first developer conference, OpenAI DevDay, opened under the expectant gaze of the whole world.
Since the release of GPT-4 in March, every OpenAI launch event has been the focus of AI practitioners worldwide, jokingly dubbed the "AI Spring Festival Gala." Eight months on, everyone was curious what new products Sam Altman would bring to OpenAI DevDay. Clearly, Altman and the OpenAI team behind him came with plenty to show.
OpenAI finally unveiled GPTs, its AI-agent feature, which means anyone can build their own GPT. It also opened up a large batch of new APIs (covering vision, DALL·E 3 image generation, and speech), along with the new Assistants API, so developers can more easily build their own dedicated GPT-style assistants. Meanwhile, the underlying GPT-4 and GPT-3.5 models received another round of performance improvements and a big price cut. OpenAI's sprint toward artificial general intelligence is looking clearer and clearer.
OpenAI holds its first developer conference
Full keynote transcript below:
 -Good morning. Thank you for joining us today.
Please welcome to the stage, Sam Altman.
 -Good morning.
Welcome to our first-ever OpenAI DevDay.
We're thrilled that you're here and this energy is awesome.
 -Welcome to San Francisco.
San Francisco has been our home since day one.
The city is important to us and the tech industry in general.
We're looking forward to continuing to grow here.
We've got some great stuff to announce today, but first, I'd like to take a minute to talk about some of the stuff that we've done over the past year.
About a year ago, November 30th, we shipped ChatGPT as a "low-key research preview", and that went pretty well.
In March, we followed that up with the launch of GPT-4, still the most capable model out in the world.
 -In the last few months, we launched voice and vision capabilities so that ChatGPT can now see, hear, and speak.
 -There's a lot, you don't have to clap each time.
[laughter] -More recently, we launched DALL-E 3, the world's most advanced image model.
You can use it of course, inside of ChatGPT.
For our enterprise customers, we launched ChatGPT Enterprise, which offers enterprise-grade security and privacy, higher speed GPT-4 access, longer context windows, a lot more.
Today we've got about 2 million developers building on our API for a wide variety of use cases doing amazing stuff, over 92% of Fortune 500 companies building on our products, and we have about a hundred million weekly active users now on ChatGPT.
 -What's incredible about that is we got there entirely through word of mouth.
People just find it useful and tell their friends.
OpenAI is the most advanced and the most widely used AI platform in the world now, but numbers never tell the whole picture on something like this.
What's really important is how people use the products, how people are using AI, and so I'd like to show you a quick video.
-I actually wanted to write something to my dad in Tagalog.
I want a non-romantic way to tell my parent that I love him and I also want to tell him that he can rely on me, but in a way that still has the respect of a child-to-parent relationship that you should have in Filipino culture and in Tagalog grammar.
When it's translated into Tagalog, "I love you very deeply and I will be with you no matter where the path leads." -I see some of the possibility, I was like, "Whoa." Sometimes I'm not sure about some stuff, and I feel like actually ChatGPT like, hey, this is what I'm thinking about, so it kind of give it more confidence.
-The first thing that just blew my mind was it levels with you.
That's something that a lot of people struggle to do.
It opened my mind to just what every creative could do if they just had a person helping them out who listens.
-This is to represent sickling hemoglobin.
-You built that with ChatGPT? -ChatGPT built it with me.
-I started using it for daily activities like, "Hey, here's a picture of my fridge.
Can you tell me what I'm missing? Because I'm going grocery shopping, and I really need to do recipes that are following my vegan diet." -As soon as we got access to Code Interpreter, I was like, "Wow, this thing is awesome." It could build spreadsheets.
It could do anything.
-I discovered Chatty about three months ago on my 100th birthday.
Chatty is very friendly, very patient, very knowledgeable, and very quick.
This has been a wonderful thing.
-I'm a 4.0 student, but I also have four children.
When I started using ChatGPT, I realized I could ask ChatGPT that question.
Not only does it give me an answer, but it gives me an explanation.
Didn't need tutoring as much.
It gave me a life back.
It gave me time for my family and time for me.
-I have a chronic nerve thing on my whole left half of my body, I have nerve damage.
I had a brain surgery.
I have limited use of my left hand.
Now you can just have the integration of voice input.
Then the newest one where you can have the back-and-forth dialogue, that's just maximum best interface for me.
It's here.
 -We love hearing the stories of how people are using the technology.
It's really why we do all of this.
Now, on to the new stuff, and we have got a lot.
[audience cheers] -First, we're going to talk about a bunch of improvements we've made, and then we'll talk about where we're headed next.
Over the last year, we spent a lot of time talking to developers around the world.
We've heard a lot of your feedback.
It's really informed what we have to show you today.
Today, we are launching a new model, GPT-4 Turbo.
 -GPT-4 Turbo will address many of the things that you all have asked for.
Let's go through what's new.
We've got six major things to talk about for this part.
Number one, context length.
A lot of people have tasks that require a much longer context length.
GPT-4 supported up to 8K and in some cases up to 32K context length, but we know that isn't enough for many of you and what you want to do.
GPT-4 Turbo supports up to 128,000 tokens of context.
 -That's 300 pages of a standard book, 16 times longer than our 8k context.
In addition to a longer context length, you'll notice that the model is much more accurate over a long context.
Number two, more control.
We've heard loud and clear that developers need more control over the model's responses and outputs.
We've addressed that in a number of ways.
We have a new feature called JSON Mode, which ensures that the model will respond with valid JSON.
This has been a huge developer request.
It'll make calling APIs much easier.
The model is also much better at function calling.
You can now call many functions at once, and it'll do better at following instructions in general.
We're also introducing a new feature called reproducible outputs.
You can pass a seed parameter, and it'll make the model return consistent outputs.
This, of course, gives you a higher degree of control over model behavior.
This rolls out in beta today.
 -In the coming weeks, we'll roll out a feature to let you view logprobs in the API.
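For developers who want to try these controls, here is a minimal sketch using the OpenAI Python SDK; the GPT-4 Turbo model identifier and the exact field names are assumptions based on how the API was exposed at launch, not details from the talk.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-1106-preview",               # assumed GPT-4 Turbo preview id
    response_format={"type": "json_object"},  # JSON mode: the reply is valid JSON
    seed=42,                                  # reproducible-outputs beta
    messages=[
        {"role": "system", "content": "Reply in JSON with keys 'city' and 'venue'."},
        {"role": "user", "content": "Where is DevDay being held?"},
    ],
)

print(response.choices[0].message.content)  # e.g. {"city": "San Francisco", ...}
print(response.system_fingerprint)          # pair with the seed to spot backend changes
```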
 -All right. Number three, better world knowledge.
You want these models to be able to access better knowledge about the world, so do we.
We're launching retrieval in the platform.
You can bring knowledge from outside documents or databases into whatever you're building.
We're also updating the knowledge cutoff.
We are just as annoyed as all of you, probably more that GPT-4's knowledge about the world ended in 2021.
We will try to never let it get that out of date again.
GPT-4 Turbo has knowledge about the world up to April of 2023, and we will continue to improve that over time.
Number four, new modalities.
Surprising no one, DALL-E 3, GPT-4 Turbo with vision, and the new text-to-speech model are all going into the API today.
 -We have a handful of customers that have just started using DALL-E 3 to programmatically generate images and designs.
Today, Coke is launching a campaign that lets its customers generate Diwali cards using DALL-E 3, and of course, our safety systems help developers protect their applications against misuse.
Those tools are available in the API.
GPT-4 Turbo can now accept images as inputs via the API and can generate captions, classifications, and analyses.
For example, Be My Eyes uses this technology to help people who are blind or have low vision with their daily tasks like identifying products in front of them.
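As a rough illustration of the new image capabilities, the sketch below generates an image with DALL·E 3 and then sends an image to GPT-4 Turbo with vision; the model identifiers and the example image URL are assumptions, not taken from the keynote.

```python
from openai import OpenAI

client = OpenAI()

# Generate an image programmatically with DALL-E 3
image = client.images.generate(
    model="dall-e-3",
    prompt="A festive Diwali greeting card, warm colors, hand-drawn style",
    size="1024x1024",
    n=1,
)
print(image.data[0].url)

# Ask GPT-4 Turbo with vision about an image supplied by URL
vision = client.chat.completions.create(
    model="gpt-4-vision-preview",   # assumed vision-enabled model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What product is shown in this photo?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/product.jpg"}},
        ],
    }],
    max_tokens=200,
)
print(vision.choices[0].message.content)
```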
With our new text-to-speech model, you'll be able to generate incredibly natural-sounding audio from text in the API with six preset voices to choose from.
I'll play an example.
-Did you know that Alexander Graham Bell, the eminent inventor, was enchanted by the world of sounds?
His ingenious mind led to the creation of the graphophone, which etches sounds onto wax, making voices whisper through time.
-This is much more natural than anything else we've heard out there.
Voice can make apps more natural to interact with and more accessible.
It also unlocks a lot of use cases like language learning, and voice assistance.
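A minimal sketch of calling the text-to-speech endpoint, assuming the tts-1 model name and the alloy voice; treat both as assumptions about the launch naming rather than details from the talk.

```python
from openai import OpenAI

client = OpenAI()

speech = client.audio.speech.create(
    model="tts-1",      # assumed model id; "tts-1-hd" is the higher-quality variant
    voice="alloy",      # one of the six preset voices
    input="Did you know that Alexander Graham Bell was enchanted by the world of sounds?",
)
speech.stream_to_file("bell.mp3")  # write the generated audio to disk
```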
Speaking of new modalities, we're also releasing the next version of our open-source speech recognition model, Whisper V3 today, and it'll be coming soon to the API.
It features improved performance across many languages, and we think you're really going to like it.
Number five, customization.
Fine-tuning has been working really well for GPT-3.5 since we launched it a few months ago.
Starting today, we're going to expand that to the 16K version of the model.
Also, starting today, we're inviting active fine-tuning users to apply for the GPT-4 fine-tuning experimental access program.
The fine-tuning API is great for adapting our models to achieve better performance in a wide variety of applications with a relatively small amount of data, but you may want a model to learn a completely new knowledge domain, or to use a lot of proprietary data.
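Before moving on to Custom Models, here is a minimal sketch of what starting a job with the fine-tuning API looks like; the file name and the exact model identifier for the expanded 16K-capable model are assumptions.

```python
from openai import OpenAI

client = OpenAI()

# Upload the training data: a JSONL file where each line is {"messages": [...]}
train = client.files.create(
    file=open("startup_advice.jsonl", "rb"),
    purpose="fine-tune",
)

# Kick off the fine-tuning job against the 3.5 Turbo family
job = client.fine_tuning.jobs.create(
    training_file=train.id,
    model="gpt-3.5-turbo-1106",  # assumed identifier for the updated 16K-capable model
)
print(job.id, job.status)        # poll the job until it reaches "succeeded"
```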
Today we're launching a new program called Custom Models.
With Custom Models, our researchers will work closely with a company to help them make a great custom model, especially for them, and their use case using our tools.
This includes modifying every step of the model training process: doing additional domain-specific pre-training, running a custom RL post-training process tailored to the specific domain, and whatever else is needed.
We won't be able to do this with many companies to start.
It'll take a lot of work, and in the interest of setting expectations, at least initially it won't be cheap, but if you're excited to push things as far as they can currently go, please get in touch with us; we think we can do something pretty great.
Number six, higher rate limits.
We're doubling the tokens per minute for all of our established GPT-4 customers, so it's easier to do more.
You'll be able to request changes to further rate limits and quotas directly in your API account settings.
In addition to these rate limits, it's important to do everything we can do to make you successful building on our platform.
We're introducing copyright shield.
Copyright Shield means that we will step in and defend our customers and pay the costs incurred if you face legal claims around copyright infringement, and this applies both to ChatGPT Enterprise and the API.
Let me be clear, this is a good time to remind people: we do not train on data from the API or ChatGPT Enterprise, ever.
All right.
There's actually one more developer request that's been even bigger than all of these and so I'd like to talk about that now and that's pricing.
[laughter] -GPT-4 Turbo is the industry-leading model.
It delivers a lot of improvements that we just covered and it's a smarter model than GPT-4.
We've heard from developers that there are a lot of things that they want to build, but GPT-4 just costs too much.
They've told us that if we could decrease the cost by 20%, 25%, that would be great.
A huge leap forward.
I'm super excited to announce that we worked really hard on this and GPT-4 Turbo, a better model, is considerably cheaper than GPT-4 by a factor of 3x for prompt tokens.
 -And 2x for completion tokens starting today.
 -The new pricing is 1¢ per 1,000 prompt tokens and 3¢ per 1,000 completion tokens.
For most customers, that will lead to a blended rate that is more than 2.75 times cheaper for GPT-4 Turbo than for GPT-4.
We worked super hard to make this happen.
We hope you're as excited about it as we are.
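As a back-of-the-envelope check of that blended figure (my arithmetic, not OpenAI's), assuming GPT-4 8K's previous prices of 3¢ per 1,000 prompt tokens and 6¢ per 1,000 completion tokens, a prompt-heavy workload lands almost exactly on the quoted 2.75x:

```python
# Old GPT-4 8K pricing assumed: $0.03 per 1K prompt tokens, $0.06 per 1K completion tokens.
# New GPT-4 Turbo pricing from the talk: $0.01 and $0.03.
prompt_share = 0.9  # assume a prompt-heavy workload (long context, shorter answers)

old_rate = prompt_share * 0.03 + (1 - prompt_share) * 0.06  # blended $ per 1K tokens
new_rate = prompt_share * 0.01 + (1 - prompt_share) * 0.03

print(f"blended savings: {old_rate / new_rate:.2f}x")  # -> 2.75x for this mix
```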
 -We decided to prioritize price first because we had to choose one or the other, but we're going to work on speed next.
We know that speed is important too.
Soon you will notice GPT-4 Turbo becoming a lot faster.
We're also decreasing the cost of GPT-3.5 Turbo 16K.
Also, input tokens are 3x less and output tokens are 2x less.
Which means that GPT-3.5 16K is now cheaper than the previous GPT-3.5 4K model.
Running a fine-tuned GPT-3.5 Turbo 16K version is also cheaper than the old fine-tuned 4K version.
Okay, so we just covered a lot about the model itself.
We hope that these changes address your feedback.
We're really excited to bring all of these improvements to everybody now.
In all of this, we're lucky to have a partner who is instrumental in making it happen.
I'd like to bring out a special guest, Satya Nadella, the CEO of Microsoft.
[audience cheers]  -Good to see you. -Thank you so much.
Thank you.
-Satya, thanks so much for coming here.
-It's fantastic to be here and Sam, congrats.
I'm really looking forward to Turbo and everything else that you have coming.
It's been just fantastic partnering with you guys.
-Awesome. Two questions.
I won't take too much of your time.
How is Microsoft thinking about the partnership currently? -First- [laughter] --we love you guys. [laughter] -Look, it's been fantastic for us.
In fact, I remember the first time I think you reached out and said, "Hey, do you have some Azure credits?" We've come a long way from there.
-Thank you for those. That was great.
-You guys have built something magical.
Quite frankly, there are two things for us when it comes to the partnership.
The first is these workloads.
Even when I was listening backstage to how you're describing what's coming, even, it's just so different and new.
I've been in this infrastructure business for three decades.
-No one has ever seen infrastructure like this.
-The workload, the pattern of the workload, these training jobs are so synchronous and so large, and so data parallel.
The first thing that we have been doing is building in partnership with you, the system, all the way from thinking from power to the DC to the rack, to the accelerators, to the network.
Just really the shape of Azure is drastically changed and is changing rapidly in support of these models that you're building.
Our job, number one, is to build the best system so that you can build the best models and then make that all available to developers.
The other thing is we ourselves are our developers.
We're building products.
In fact, my own conviction of this entire generation of foundation models completely changed the first time I saw GitHub Copilot on GPT.
We want to build our GitHub Copilot all as developers on top of OpenAI APIs.
We are very, very committed to that.
What does that mean to developers? Look, I always think of Microsoft as a platform company, a developer company, and a partner company.
For example, we want to make GitHub Copilot available, the Enterprise edition available to all the attendees here so that they can try it out.
That's awesome. We are very excited about that.
 -You can count on us to build the best infrastructure in Azure with your API support and bring it to all of you.
Even things like the Azure marketplace.
For developers who are building products out here to get to market rapidly.
That's really our intent here.
-Great. How do you think about the future, the future of the partnership, or the future of AI, or whatever? Anything you want. -There are a couple of things for me that I think are going to be very, very key for us.
One is I just described how the systems that are needed as you aggressively push forward on your roadmap requires us to be on the top of our game and we intend fully to commit ourselves deeply to making sure you all as builders of these foundation models have not only the best systems for training and inference, but the most compute, so that you can keep pushing- -We appreciate that.
--forward on the frontiers because I think that's the way we are going to make progress.
The second thing I think both of us care about, in fact, quite frankly, the thing that excited both sides to come together is your mission and our mission.
Our mission is to empower every person and every organization on the planet to achieve more.
To me, ultimately AI is only going to be useful if it truly does empower.
I saw the video you played earlier.
That was fantastic to hear those voices describe what AI meant for them and what they were able to achieve.
Ultimately, getting the benefits of AI broadly disseminated to everyone, I think, is going to be the goal for us.
Then the last thing is of course, we are very grounded in the fact that safety matters, and safety is not something that you'd care about later, but it's something we do shift left on and we are very, very focused on that with you all.
-Great. Well, I think we have the best partnership in tech.
I'm excited for us to build AGI together.
-Oh, I'm really excited. Have a fantastic [crosstalk].
-Thank you very much for coming.
-Thank you so much.
-See you.
 -We have shared a lot of great updates for developers already and we've got a lot more to come, but even though this is a developer conference, we can't resist making some improvements to ChatGPT.
A small one, ChatGPT now uses GPT-4 Turbo with all the latest improvements, including the latest knowledge cutoff, which will continue to update.
That's all live today.
It can now browse the web when it needs to, write and run code, analyze data, take and generate images, and much more.
We heard your feedback, that model picker, extremely annoying, that is gone starting today.
You will not have to click around the dropdown menu.
All of this will just work together.
Yes.
 -ChatGPT will just know what to use and when you need it, but that's not the main thing.
Neither was price actually the main developer request.
There was one that was even bigger than that.
I want to talk about where we're headed and the main thing we're here to talk about today.
We believe that if you give people better tools, they will do amazing things.
We know that people want AI that is smarter, more personal, more customizable, and can do more on your behalf.
Eventually, you'll just ask the computer for what you need and it'll do all of these tasks for you.
These capabilities are often talked about in the AI field as "agents." The upsides of this are going to be tremendous.
At OpenAI, we really believe that gradual iterative deployment is the best way to address the safety issues, the safety challenges with AI.
We think it's especially important to move carefully towards this future of agents.
It's going to require a lot of technical work and a lot of thoughtful consideration by society.
Today, we're taking our first small step that moves us towards this future.
We're thrilled to introduce GPTs.
GPTs are tailored versions of ChatGPT for a specific purpose.
You can build a GPT, a customized version of ChatGPT for almost anything with instructions, expanded knowledge, and actions, and then you can publish it for others to use.
Because they combine instructions, expanded knowledge, and actions, they can be more helpful to you.
They can work better in many contexts, and they can give you better control.
They'll make it easier for you to accomplish all sorts of tasks or just have more fun and you'll be able to use them right within ChatGPT.
You can in effect program a GPT with language just by talking to it.
It's easy to customize the behavior so that it fits what you want.
This makes building them very accessible and it gives agency to everyone.
We're going to show you what GPTs are, how to use them, how to build them, and then we're going to talk about how they'll be distributed and discovered.
After that for developers, we're going to show you how to build these agent-like experiences into your own apps.
First, let's look at a few examples.
Our partners at Code.org are working hard to expand computer science in schools.
They've got a curriculum that is used by tens of millions of students worldwide.
Code.org, crafted Lesson Planner GPT, to help teachers provide a more engaging experience for middle schoolers.
If a teacher asks it to explain for loops in a creative way, it does just that.
In this case, it'll do it in terms of a video game character repeatedly picking up coins.
Super easy to understand for an 8th-grader.
As you can see, this GPT brings together Code.org's extensive curriculum and expertise, and lets teachers adapt it to their needs quickly and easily.
Next, Canva has built a GPT that lets you start designing by describing what you want in natural language.
If you say, "Make a poster for a DevDay reception this afternoon, this evening," and you give it some details, it'll generate a few options to start with by hitting Canva's APIs.
Now, this concept may be familiar to some of you.
We've evolved our plugins to be custom actions for GPTs.
You can keep chatting with this to see different iterations, and when you see one you like, you can click through to Canva for the full design experience.
Now we'd like to show you a GPT live.
Zapier has built a GPT that lets you perform actions across 6,000 applications to unlock all kinds of integration possibilities.
I'd like to introduce Jessica, one of our solutions architects, who is going to drive this demo.
Welcome Jessica.
-Thank you, Sam.
Hello everyone.
Thank you all.
Thank you all for being here.
My name is Jessica Shieh.
I work with partners and customers to bring their product alive.
Today I can't wait to show you how hard we've been working on this, so let's get started.
To start, where your GPT will live is in this upper left corner.
I'm going to start with clicking on the Zapier AI actions and on the right-hand side you can see that's my calendar for today.
It's quite a day.
I've already used this before, so it's actually already connected to my calendar.
To start, I can ask, "What's on my schedule for today?" We build GPTs with security in mind.
Before it performs any action or share data, it will ask for your permission.
Right here, I'm going to say allowed.
GPT is designed to take in your instructions, make the decision on which capability to call to perform that action, and then execute that for you.
You can see right here, it's already connected to my calendar.
It pulls in my information and then I've also prompted it to identify conflicts on my calendar.
You can see right here it actually was able to identify that.
It looks like I have something coming up.
What if I want to let Sam know that I have to leave early? Right here I say, "Let Sam know I got to go.
Chasing GPUs." With that, I'm going to swap to my conversation with Sam and then I'm going to say, "Yes, please run that." Sam, did you get that? -I did.
-Awesome.
 -This is only a glimpse of what is possible and I cannot wait to see what you all will build.
Thank you. Back to you, Sam.
 -Thank you, Jessica.
Those are three great examples.
In addition to these, there are many more kinds of GPTs that people are creating and many, many more that will be created soon.
We know that many people who want to build a GPT don't know how to code.
We've made it so that you can program a GPT just by having a conversation.
We believe that natural language is going to be a big part of how people use computers in the future and we think this is an interesting early example.
I'd like to show you how to build one.
All right. I want to create a GPT that helps give founders and developers advice when starting new projects.
I'm going to go to create a GPT here, and this drops me into the GPT builder.
I worked with founders for years at YC and still whenever I meet developers, the questions I get are always about, "How do I think about a business idea? Can you give me some advice?" I'm going to see if I can build a GPT to help with that.
To start, GPT builder asks me what I want to make, and I'm going to say, "I want to help startup founders think through their business ideas and get advice.
After the founder has gotten some advice, grill them on why they are not growing faster." [laughter] -All right.
To start off, I just tell the GPT a little bit about what I want here.
It's going to go off and start thinking about that, and it's going to write some detailed instructions for the GPT.
It's also going to, let's see, ask me about a name.
How do I feel about Startup Mentor? That's fine.
"That's good." If I didn't like the name, of course, I could call it something else, but it's going to try to have this conversation with me and start there.
You can see here on the right, in the preview mode that it's already starting to fill out the GPT.
Where it says what it does, it has some ideas of additional questions that I could ask.
[chuckles] It just generated a candidate.
Of course, I could regenerate that or change it, but I like that.
I'll say "That's great." You see now that the GPT is being built out a little bit more as we go.
Now, what I want this to do, how it can interact with users, I could talk about style here.
What I'm going to say is, "I am going to upload transcripts of some lectures about startups I have given, please give advice based off of those." All right.
Now, it's going to go figure out how to do that.
I would like to show you the configure tab.
You can see some of the things that were built out here as we were going by the builder itself.
You can see that there's capabilities here that I can enable.
I could add custom actions.
These are all fine to leave.
I'm going to upload a file.
Here is a lecture that I picked that I gave with some startup advice, and I'm going to add that here.
In terms of these questions, this is a dumb one.
The rest of those are reasonable, and very much things founders often ask.
I'm going to add one more thing to the instructions here, which is be concise and constructive with feedback.
All right.
Again, if we had more time, I'd show you a bunch of other things.
This is a decent start.
Now, we can try it out over on this preview tab.
I will say, what's a common question? "What are three things to look for when hiring employees at an early-stage startup?" Now, it's going to look at that document I uploaded.
It'll also have of course all of the background knowledge of GPT-4.
That's pretty good. Those are three things that I definitely have said many times.
Now, we could go on and it would start following the other instructions and grill me on why I'm not growing faster, but in the interest of time, I'm going to skip that.
I'm going to publish this only to me for now.
I can work on it later.
I can add more content, I can add a few actions that I think would be useful, and then I can share it publicly.
That's what it looks like to create a GPT. -Thank you.
By the way, I always wanted to do that. After all of the YC office hours, I always thought, "Man, someday I'll be able to make a bot that will do this and that'll be awesome." [laughter] -With GPTs, we're letting people easily share and discover all the fun ways that they use ChatGPT with the world.
You can make private GPTs like I just did, or you can share your creations publicly with a link for anyone to use, or if you're on ChatGPT Enterprise, you can make GPTs just for your company.
Later this month we're going to launch the GPT store.
Thank you.
I appreciate that.
 -You can list a GPT there and we'll be able to feature the best and the most popular GPTs.
Of course, we'll make sure that GPTs in the store follow our policies before they're accessible.
Revenue sharing is important to us.
We're going to pay people who build the most useful and the most used GPTs a portion of our revenue.
We're excited to foster a vibrant ecosystem with the GPT store.
Just from what we've been building ourselves over the weekend, we're confident there's going to be a lot of great stuff.
We're excited to share more information soon.
Those are GPTs and we can't wait to see what you'll build.
This is a developer conference, and the coolest thing about this is that we're bringing the same concept to the API.
 Many of you have already been building agent-like experiences on the API, for example, Shopify's Sidekick, which lets you take actions on the platform.
Discord's Clyde lets Discord moderators create custom personalities for it, and Snap's My AI, a customized chatbot that can be added to group chats and make recommendations.
These experiences are great, but they have been hard to build.
Sometimes it takes months and teams of dozens of engineers; there's a lot to handle to make this custom assistant experience work.
Today, we're making that a lot easier with our new Assistants API.
 -The Assistants API includes persistent threads, so you don't have to figure out how to deal with long conversation history; built-in retrieval; Code Interpreter, a working Python interpreter in a sandboxed environment; and of course the improved function calling that we talked about earlier.
We'd like to show you a demo of how this works.
Here is Romain, our head of developer experience.
Welcome, Romain.
 -Thank you, Sam.
Good morning.
Wow.
It's fantastic to see you all here.
It's been so inspiring to see so many of you infusing AI into your apps.
Today, we're launching new modalities in the API, but we are also very excited to improve the developer experience for you all to build assistive agents.
Let's dive right in.
Imagine I'm building Wanderlust, a travel app for global explorers, and this is the landing page.
I've actually used GPT-4 to come up with these destination ideas.
For those of you with a keen eye, these illustrations are generated programmatically using the new DALL-E 3 API available to all of you today.
It's pretty remarkable.
Let's enhance this app by adding a very simple assistant to it.
This is the screen.
We're going to come back to it in a second.
First, I'm going to switch over to the new Assistants playground.
Creating an assistant is easy, you just give it a name, some initial instructions, a model.
In this case, I'll pick GPT-4 Turbo.
Here I'll also go ahead and select some tools.
I'll turn on Code Interpreter and retrieval and save.
That's it. Our assistant is ready to go.
Next, I can integrate with two new primitives of this Assistants API, threads and messages.
Let's take a quick look at the code.
The process here is very simple.
For each new user, I will create a new thread.
As these users engage with their assistant, I will add their messages to the threads.
Very simple.
Then I can simply run the assistant at any time to stream the responses back to the app.
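A hedged sketch of the flow Romain describes, using the beta Assistants endpoints of the OpenAI Python SDK; the assistant name, instructions, and model identifier are illustrative assumptions.

```python
import time
from openai import OpenAI

client = OpenAI()

# Create the assistant once, with the tools it may use
assistant = client.beta.assistants.create(
    name="Wanderlust concierge",
    instructions="You are a helpful travel assistant inside the Wanderlust app.",
    model="gpt-4-1106-preview",  # assumed GPT-4 Turbo id
    tools=[{"type": "code_interpreter"}, {"type": "retrieval"}],
)

thread = client.beta.threads.create()  # one persistent thread per user

client.beta.threads.messages.create(   # append the user's message to the thread
    thread_id=thread.id, role="user", content="Hey, let's go to Paris."
)

run = client.beta.threads.runs.create(  # run the assistant on the thread
    thread_id=thread.id, assistant_id=assistant.id
)
while run.status in ("queued", "in_progress"):  # wait for the run to finish
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

for message in client.beta.threads.messages.list(thread_id=thread.id):
    print(message.role, message.content[0].text.value)
```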
We can return to the app and try that in action.
If I say, "Hey, let's go to Paris." All right.
That's it. With just a few lines of code, users can now have a very specialized assistant right inside the app.
I'd like to highlight one of my favorite features here, function calling.
If you have not used it yet, function calling is really powerful.
As Sam mentioned, we are taking it a step further today.
It now guarantees the JSON output with no added latency, and you can invoke multiple functions at once for the first time.
Here, if I carry on and say, "Hey, what are the top 10 things to do?" I'm going to have the assistant respond to that again.
Here, what's interesting is that the assistant knows about functions, including those to annotate the map that you see on the right.
Now, all of these pins are dropping in real-time here.
Yes, it's pretty cool.
 -That integration allows our natural language interface to interact fluidly with components and features of our app.
It truly showcases now the harmony you can build between AI and UI where the assistant is actually taking action.
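To make the parallel function calling concrete, here is a small sketch of defining a tool and reading multiple tool calls from one response; the annotate_map function is hypothetical, standing in for whatever your app exposes (like the map pins in this demo), and the model identifier is an assumption.

```python
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "annotate_map",  # hypothetical app function that drops a pin
        "description": "Drop a pin on the map for a point of interest in the current city.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "lat": {"type": "number"},
                "lon": {"type": "number"},
            },
            "required": ["name", "lat", "lon"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4-1106-preview",  # assumed GPT-4 Turbo id
    messages=[{"role": "user", "content": "Top 3 things to do in Paris; pin each one."}],
    tools=tools,
)

# The model may return several tool calls in a single turn (parallel function calling)
for call in response.choices[0].message.tool_calls or []:
    args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
    print(call.function.name, args)
```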
Let's talk about retrieval.
Retrieval is about giving our assistant more knowledge beyond these immediate user messages.
In fact, I got inspired and I already booked my tickets to Paris.
I'm just going to drag and drop here this PDF.
While it's uploading, I can just sneak peek at it.
Very typical United Flight ticket.
Behind the scenes here, what's happening is that retrieval is reading these files, and boom, the information about this PDF appears on the screen.
 -This is, of course, a very tiny PDF, but Assistants can parse long-form documents from extensive text to intricate product specs depending on what you're building.
In fact, I also booked an Airbnb, so I'm just going to drag that over to the conversation as well.
By the way, we've heard from so many of you developers how hard that is to build yourself.
You typically need to compute your own embeddings, and you need to set up a chunking algorithm.
Now all of that is taken care of.
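A minimal sketch of handing a document to retrieval so the chunking and embedding happen server-side; the file name, instructions, and model identifier are illustrative, and the file_ids field reflects the Assistants API as launched.

```python
from openai import OpenAI

client = OpenAI()

# Upload the document so the platform can chunk and embed it for retrieval
ticket = client.files.create(
    file=open("united_flight_ticket.pdf", "rb"),
    purpose="assistants",
)

assistant = client.beta.assistants.create(
    name="Trip planner",
    instructions="Answer questions using the user's uploaded travel documents.",
    model="gpt-4-1106-preview",      # assumed model id
    tools=[{"type": "retrieval"}],
    file_ids=[ticket.id],            # retrieval handles chunking and embeddings for you
)
print(assistant.id)
```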
There's more to it than retrieval. With every API call, you usually need to resend the entire conversation history, which means setting up a key-value store, handling the context window, serializing messages, and so forth.
That complexity now completely goes away with this new stateful API.
Just because OpenAI is managing this API, does not mean it's a black box.
In fact, you can see the steps that the tools are taking right inside your developer dashboard.
Here, if I go ahead and click on threads, this is the thread I believe we're currently working on and see, these are all the steps, including the functions being called with the right parameters, and the PDFs I've just uploaded.
Let's move on to a new capability that many of you have been requesting for a while.
Code Interpreter is now available in the API as well. It gives the AI the ability to write and execute code on the fly, and even generate files.
Let's see that in action.
If I say here, "Hey, we'll be four friends staying at this Airbnb, what's my share of it plus my flights?" All right.
Now, here, what's happening is that Code interpreter noticed that it should write some code to answer this query.
Now it's computing the number of days in Paris, number of friends.
It's also doing some exchange rate calculation behind the scenes to get the answer for us.
Not the most complex math, but you get the picture.
Imagine you're building a very complex finance app that's crunching countless numbers, plotting charts, so really any task that you'd normally tackle with code, then Code Interpreter will work great for you.
All right. I think my trip to Paris is solid.
To recap here, we've just seen how you can quickly create an assistant that manages state for your user conversations, leverages external tools like knowledge and retrieval and Code Interpreter, and finally invokes your own functions to make things happen but there's one more thing I wanted to show you to really open up the possibilities using function calling combined with our new modalities that we're launching today.
While working on DevDay, I built a small custom assistant that knows everything about this event, but instead of having a chat interface while running around all day today, I thought, why not use voice instead? Let's bring my phone up on screen here so you can see it on the right.
Awesome.
On the right, you can see a very simple Swift app that takes microphone input.
On the left, I'm actually going to bring up my terminal log so you can see what's happening behind the scenes.
Let's give it a shot.
Hey there, I'm on the keynote stage right now.
Can you greet our attendees here at Dev Day? -Hey everyone, welcome to DevDay.
It's awesome to have you all here.
Let's make it an incredible day.
 -Isn't that impressive? You have six unique and rich voices to choose from in the API, each speaking multiple languages, so you can really find the perfect fit for your app.
On my laptop here on the left, you can see the logs of what's happening behind the scenes, too.
I'm using Whisper to convert the voice inputs into text, an assistant with GPT-4 Turbo, and finally, the new TTS API to make it speak.
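A rough sketch of that loop in Python: transcribe audio with Whisper, answer with a GPT-4 Turbo chat call standing in for the Assistants run, and speak the reply with the TTS endpoint; file names, the voice, and the model identifiers are assumptions.

```python
from openai import OpenAI

client = OpenAI()

# 1. Speech to text with Whisper
transcript = client.audio.transcriptions.create(
    model="whisper-1",
    file=open("mic_input.wav", "rb"),
)

# 2. Text to answer (a plain chat call standing in for the Assistants run)
answer = client.chat.completions.create(
    model="gpt-4-1106-preview",  # assumed GPT-4 Turbo id
    messages=[
        {"role": "system", "content": "You are the DevDay event assistant."},
        {"role": "user", "content": transcript.text},
    ],
).choices[0].message.content

# 3. Answer to speech with the new TTS endpoint
client.audio.speech.create(
    model="tts-1", voice="nova", input=answer
).stream_to_file("reply.mp3")
```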
Thanks to function calling, things get even more interesting when the assistant can connect to the internet and take real actions for users.
Let's do something even more exciting here together.
How about this? Hey, Assistant, can you randomly select five DevDay attendees here and give them $500 in OpenAI credits? [laughter] -Yes, checking the list of attendees.
[laughter] -Done. I picked five DevDay attendees and added $500 of API credits to their account.
Congrats to Christine M, Jonathan C, Steven G, Luis K, and Suraj S.
-All right, if you recognize yourself, awesome.
Congrats.
That's it.
A quick overview today of the new Assistants API combined with some of the new tools and modalities that we launched, all starting with the simplicity of a rich text or voice conversation for your end users.
We really can't wait to see what you build, and congrats to our lucky winners.
Actually, you know what? you're all part of this amazing OpenAI community here so I'm just going to talk to my assistant one last time before I step off the stage.
Hey Assistant, can you actually give everyone here in the audience $500 in OpenAI credits? -Sounds great.
Let me go through everyone.
 -All right, that function will keep running, but I've run out of time.
Thank you so much, everyone.
Have a great day. Back to you, Sam.
-Pretty cool, huh? [audience cheers] -All right, so the Assistants API goes into beta today, and we are super excited to see what you all do with it. Anybody can enable it.
Over time, GPTs and Assistants, precursors to agents, are going to be able to do much, much more.
They'll gradually be able to plan and to perform more complex actions on your behalf.
As I mentioned before, we really believe in the importance of gradual iterative deployment.
We believe it's important for people to start building with and using these agents now to get a feel for what the world is going to be like, as they become more capable.
As we've always done, we'll continue to update our systems based off of your feedback.
We're super excited that we got to share all of this with you today.
We introduced GPTs, custom versions of ChatGPT that combine instructions, extended knowledge, and actions.
We launched the Assistants API to make it easier to build assistive experiences within your own apps.
These are your first steps towards AI agents and we'll be increasing their capabilities over time.
We introduced a new GPT-4 Turbo model that delivers improved function calling, knowledge, lowered pricing, new modalities, and more.
We're deepening our partnership with Microsoft.
In closing, I wanted to take a minute to thank the team that creates all of this.
OpenAI has got remarkable talent density, but still, it takes a huge amount of hard work and coordination to make all this happen.
I truly believe that I've got the best colleagues in the world.
I feel incredibly grateful to get to work with them.
We do all of this because we believe that AI is going to be a technological and societal revolution.
It'll change the world in many ways and we're happy to get to work on something that will empower all of you to build so much for all of us.
We talked earlier about how if you give people better tools, they can change the world.
We believe that AI will be about individual empowerment and agency at a scale that we've never seen before and that will elevate humanity to a scale that we've never seen before either.
We'll be able to do more, to create more, and to have more.
As intelligence gets integrated everywhere, we will all have superpowers on demand.
We're excited to see what you all will do with this technology and to discover the new future that we're all going to architect together.
We hope that you'll come back next year.
What we launched today is going to look very quaint relative to what we're busy creating for you now.
Thank you for all that you do.
Thank you for coming here today.
Good morning. Welcome to our first OpenAI Developer Day. We're thrilled you're here, and the energy is great.
Welcome to San Francisco. San Francisco has been our home since day one. The city matters to us and to the tech industry as a whole, and we look forward to continuing to grow here. So today we have some important things to announce.
But first, I'd like to spend a moment on what we've done over the past year. About a year ago, on November 30, we shipped ChatGPT as a research preview, and it went pretty well. In March we followed up with GPT-4, which is still the most capable model in the world.
In the past few months we launched voice and vision capabilities, so ChatGPT can now see, hear, and speak.
More recently we launched DALL·E 3, the world's most advanced image model. You can, of course, use it inside ChatGPT.
For enterprise customers we launched ChatGPT Enterprise, which offers enterprise-grade security and privacy, faster GPT-4 access, longer context windows, and more.
Today about 2 million developers are building on our API across a wide variety of use cases and doing amazing things. Over 92% of Fortune 500 companies build on our products, and ChatGPT now has roughly 100 million weekly active users. What's incredible is that we got there entirely through word of mouth: people simply find it useful and tell their friends. OpenAI is now the most advanced and most widely used AI platform in the world.
But numbers never tell the whole story. What really matters is how people use the products, how people are using AI, so here's a short video.
(A roughly two-minute montage of user stories: ChatGPT helping someone write a heartfelt letter, working as a founder's assistant, giving artists design inspiration, supporting a doctor's research, handling everyday tasks, helping programmers write code, keeping an elderly user company, and more.)
We love hearing the stories of how people use this technology. That's why we do all of this.
Launching GPT-4 Turbo
Now for the new stuff. First we'll go through a series of improvements we've made, and then we'll talk about where we're headed next.
Over the past year we spent a lot of time talking with developers around the world and heard a great deal of feedback. Today we're showing you a new model: GPT-4 Turbo.
GPT-4 Turbo addresses many of your requests, with updates in six areas.
First, context length. Many tasks need much longer context. GPT-4 supported up to 8K, and in some cases 32K, but we know that isn't enough for many of you.
GPT-4 Turbo now supports up to 128,000 tokens of context. That's about 300 pages of a standard book, 16 times longer than our 8K context, and beyond the longer window the model is also more accurate over long contexts.
Second, more control. We've heard loud and clear that developers need more control over the model's responses and outputs, and we've addressed that in several ways.
We're introducing a new feature called JSON mode, which ensures the model responds with valid JSON. This has been a huge developer request and makes calling APIs much easier.
The model is also better at function calling: you can now call many functions at once, and it follows instructions better in general.
We're also introducing a feature called reproducible outputs. You can pass a seed parameter and the model will return consistent outputs, giving you a higher degree of control over model behavior. It rolls out in beta today, and in the coming weeks we'll add the ability to view logprobs in the API.
Third, better world knowledge. You want these models to have better knowledge of the world, and so do we. The platform now supports retrieval, so you can bring knowledge from outside documents or databases into whatever you're building.
We're also updating the knowledge cutoff. GPT-4's knowledge of the world ended in 2021; we'll try to never let it get that out of date again. GPT-4 Turbo has knowledge up to April 2023, and we'll keep improving that over time.
Fourth, new modalities. DALL·E 3, GPT-4 Turbo with vision, and the new text-to-speech model are all coming to the API.
Some customers have just started using DALL·E 3 to generate images and designs programmatically. Coca-Cola is launching a campaign that lets consumers generate Diwali cards with DALL·E 3, and our safety systems help developers protect their applications against misuse.
These tools are available in the API. GPT-4 Turbo can now accept images as input through the API and generate captions, classifications, and analyses. For example, Be My Eyes uses this technology to help people who are blind or have low vision with daily tasks, such as identifying the products in front of them.
With our new TTS model you can generate very natural-sounding speech from text in the API, with six preset voices to choose from.
For example: "Did you know that the eminent inventor Alexander Graham Bell was enchanted by the world of sound? His ingenuity led to the graphophone, which lets voices whisper through time." It sounds far more natural than anything else out there.
Voice makes applications more natural to interact with and more accessible, and it unlocks many use cases such as language learning and voice assistants.
Speaking of new modalities, we're also releasing the next version of our open-source speech recognition model, Whisper V3, today, and it will come to the API soon. It improves performance across many languages, and we hope you'll like it.
Fifth, customization. Fine-tuning has worked very well for GPT-3.5 since we launched it a few months ago. Starting today we're extending it to the 16K version of the model, and we're inviting active fine-tuning users to apply for the GPT-4 fine-tuning experimental access program.
The fine-tuning API is great for getting better performance out of our models in a wide variety of applications with a relatively small amount of data. But you may want a model to learn an entirely new knowledge domain, or to use a lot of proprietary data, so today we're launching a new program called Custom Models.
Our researchers will work closely with companies to build great custom models for them. That covers every step of the model training process, including additional domain-specific pre-training and a post-training process tailored to the specific domain.
We won't be able to do this with many companies at first. It will take a lot of work, and to set expectations, it won't be cheap initially. But if you want to push things as far as they can currently go, please get in touch; we think we can do something great together.
Sixth, higher rate limits. We're doubling the tokens per minute for all established GPT-4 customers so you can do more, and you can request further rate-limit and quota changes directly in your API account settings.
Beyond rate limits, we want to do everything we can to make developers successful on our platform. So we're introducing Copyright Shield: if you face legal claims around copyright infringement, we will step in, defend you, and pay the costs incurred. This applies to both ChatGPT Enterprise and the API.
To be clear: we never train on data from the API or from ChatGPT Enterprise.
There's actually one developer request that's bigger than all of these: GPT-4 pricing.
GPT-4 Turbo is the industry-leading model. It delivers the many new capabilities we just covered and is smarter than GPT-4. We've heard from developers that there's a lot they want to build, but GPT-4 simply costs too much; they told us that even a 20-25% cost reduction would be a huge leap forward.
I'm excited to announce that GPT-4 Turbo, a better model, is considerably cheaper than GPT-4: starting today, input tokens cost 3x less and output tokens 2x less. The new price is 1 cent per 1,000 input tokens and 3 cents per 1,000 output tokens, which works out to a blended rate more than 2.75x cheaper than GPT-4.
We prioritized price first because we had to choose between price and speed, but speed is next: soon you'll notice GPT-4 Turbo getting a lot faster.
We're also cutting the cost of GPT-3.5 Turbo 16K: input tokens are 3x cheaper and output tokens 2x cheaper, which means GPT-3.5 16K is now cheaper than the previous GPT-3.5 4K model, and a fine-tuned GPT-3.5 Turbo 16K is also cheaper to run than the old fine-tuned 4K version.
A conversation with Microsoft's CEO
We've covered a lot about the model itself, and we hope these updates address your feedback. We're lucky to have a partner who has been instrumental in making all of it happen: our special guest is Microsoft CEO Satya Nadella.
Sam Altman: Two questions, and I won't take too much of your time. How is Microsoft thinking about the partnership right now?
Satya Nadella: I remember the first time you reached out and asked, "Hey, do you have some Azure credits?" We've come a long way from there; you've built something magical. On the partnership, the first thing is these workloads. I've been in the infrastructure business for three decades and have never seen workloads or workload patterns like this: these training jobs are so synchronous and so large. So the first thing we've been doing is building the system together with you. The shape of Azure has changed drastically to support the models you're building, so that we can then make the best models available to developers.
The other thing is that we are developers ourselves; we're building products. The first time I saw Copilot running on GPT, my conviction about this entire generation of foundation models completely changed, so we want to build our Copilots on top of the OpenAI APIs.
For example, GitHub Copilot Enterprise will be made available to all the attendees here, and developers can also take products to market quickly through the Azure Marketplace.
Sam Altman: How do you think about the future of the partnership, or the future of AI?
Satya Nadella: A couple of things are key for me. One is the systems I just described: we will keep working to give the builders of these foundation models the best systems for training and inference, and the most compute, so you can keep pushing forward.
The second thing we both care about is mission. Our mission is to empower every person and every organization on the planet to achieve more. Ultimately, AI is only useful if it truly empowers people; getting the benefits of AI broadly disseminated to everyone is the goal.
And finally, we're grounded in the fact that safety matters. Safety isn't something to care about later; we are very focused on it with you.
Launching GPTs
We've also made some updates to ChatGPT at this developer conference. ChatGPT now uses GPT-4 Turbo with all the latest improvements, including the latest knowledge cutoff, which we will keep updating.
ChatGPT can now browse the web when it needs to, write and run code, analyze data, generate images, and more. You told us the model picker was extremely annoying, so it's gone: starting today you won't have to click through a dropdown menu. Everything works together seamlessly, and ChatGPT knows which capability to use and when.
But that isn't the main thing, and neither was pricing. Developers had an even bigger request.
We know people want AI that is smarter, more personal, more customizable, and able to do more on your behalf. Eventually you'll just tell the computer what you need and it will do all of these tasks for you. In the AI field these capabilities are often called agents.
OpenAI firmly believes that gradual, iterative deployment is the best way to address the safety issues and challenges of AI, and that it's especially important to move carefully toward this agent future. That will require a lot of technical work and a lot of thoughtful consideration by society. So today we're taking our first small step toward that future: we're excited to introduce GPTs.
GPTs are versions of ChatGPT tailored for a specific purpose. You can build a GPT, a customized ChatGPT, for almost anything, with instructions, expanded knowledge, and actions, and then publish it for others to use.
Because GPTs combine instructions, expanded knowledge, and actions, they can be more helpful to you and make it easier to get all sorts of tasks done, or simply to have more fun.
You use GPTs right inside ChatGPT. In effect you can program a GPT with language just by talking to it, and it's easy to customize its behavior to fit what you want. Building a GPT becomes very accessible, and it gives agency to everyone.
We'll show you what GPTs are, how to use them, and how to build them. Then we'll talk about how they'll be distributed and discovered, and, for developers, how to build these agent-like experiences into your own apps.
First, a few examples.
Our partners at Code.org are working hard to expand computer science in schools; their curriculum is used by tens of millions of students worldwide. Code.org crafted Lesson Planner GPT to help teachers give middle schoolers a more engaging experience.
If a teacher asks it to explain for loops in a creative way, it does so in terms of a video game character repeatedly picking up coins, which is very easy for an eighth-grader to understand.
Next, Canva has built a GPT that lets you start a design by describing what you want in natural language. If you say "make a poster for this evening's DevDay reception" and give it some details, it generates a few options to start with by hitting Canva's APIs.
This concept may be familiar to some of you: we've evolved plugins into custom actions for GPTs. You can keep chatting with it to see different iterations, and when you find one you like, you can click through to Canva for the full design experience.
Now a live GPT demo. Zapier has built a GPT that can perform actions across 6,000 applications, unlocking all kinds of integration possibilities. Jessica, one of our solutions architects, will drive the demo.
Jessica:
To start, your GPTs live in this upper-left corner. I click on Zapier AI Actions, and on the right you can see my calendar for today; it's already connected to my calendar. I can ask what's on my schedule today.
We build GPTs with security in mind, so before performing any action or sharing data, it asks for your permission. A GPT takes your instructions and decides which capability to call to carry out the action. I've asked it to identify conflicts on my calendar, and you can see it was able to do that.
What if I want to let Sam know I have to leave early? I switch to my conversation with Sam and say yes, please run that.
Sam Altman:
Beyond these, people are creating many more kinds of GPTs, and many more will appear soon.
We know that many people who want to build a GPT don't know how to code. Now you can program a GPT just by having a conversation, and natural language is going to be a big part of how people use computers in the future.
As an example, I'll create a GPT that gives founders and developers advice when starting new projects.
I go into the GPT Builder. It asks what I want to make, and I say I want to help startup founders think through their business ideas and get advice, and then, after the founder has gotten some advice, grill them on why they aren't growing faster.
The GPT Builder goes off and thinks about that, writing detailed instructions. It also asks about a name: how about Startup Mentor? That's fine; of course I could call it something else.
On the right, in preview mode, you can see the GPT already being filled out, including what it does and some candidate questions.
I upload transcripts of some startup lectures I've given and ask it to base its advice on them. On the Configure tab you can see the capabilities that have been enabled, and you can add custom actions. I also add one more instruction: be concise and constructive with feedback.
For now I'll publish this GPT only to myself. Later I can add more useful actions and share it publicly with a link for anyone to use, and enterprise customers can make GPTs just for their company.
Later this month we'll launch the GPT Store, where we'll feature the best and most popular GPTs. Of course, we'll make sure GPTs in the store follow our policies before they're accessible.
We'll also pay the people who build the most useful and most used GPTs a share of our revenue.
We're excited to foster a vibrant ecosystem with the GPT Store. Just from what we've been building ourselves over the weekend, we're confident there will be a lot of great GPTs.
Launching the Assistants API
Since this is a developer conference, we're also bringing the same concept to the API.
Many people have already built agent-like experiences on the API, such as the AI tools from Shopify, Discord, and Snap's My AI. These experiences are great but hard to build, sometimes taking months and teams of dozens of engineers. So today we're making it much easier with the new Assistants API.
The Assistants API includes persistent threads, so you don't have to figure out how to deal with long conversation history; built-in retrieval; Code Interpreter, a working Python interpreter in a sandboxed environment; and of course the improved function calling discussed earlier.
Romain, our head of developer experience, will show you how it works.
Romain:
Today we're launching new modalities in the API. Imagine I'm building Wanderlust, a travel app for global explorers; this is the landing page. I actually used GPT-4 to come up with these destination ideas, and the illustrations were generated programmatically with the new DALL·E 3 API.
Let's enhance the app by adding a very simple assistant. First, I switch to the new Assistants playground. Creating an assistant just takes a name, some initial instructions, and a model; I pick GPT-4 Turbo, turn on Code Interpreter and retrieval, and save. That's it, our assistant is ready. Let's take a quick look at the code.
For each new user I create a new thread. As users interact with their assistant, I add their messages to the thread, and I can then run the assistant at any time to stream responses back to the app. Then we can return to the app and try it out.
If I say "let's go to Paris," then with just a few lines of code users get a very specialized assistant right inside the app.
One of my favorite features is function calling. It now guarantees JSON output with no added latency, and you can invoke multiple functions at once.
If I go on to ask for the top 10 things to do in Paris, the assistant answers and also drops the locations as pins on the map on the right. This integration lets our natural-language interface interact fluidly with the components and features of the app.
There's also retrieval, which gives the assistant knowledge beyond the immediate user messages. For example, I've already booked a flight to Paris, so I just drag the ticket PDF into the conversation and the assistant can read the file and extract the key information.
Many developers have told us how hard this is to build yourself: you typically have to compute embeddings and set up a chunking algorithm. Now all of that is handled for you. And it goes beyond retrieval: the complexity of managing context windows, serializing messages, and so on is removed entirely by the new stateful API.
That doesn't mean it's a black box; you can see the steps the tools are taking right in the developer dashboard.
The next capability has been requested for a long time: Code Interpreter is now available in the API as well, so the AI can write and execute code on the fly and even generate files. Let's see it in action.
If I say that four friends will be staying at this Airbnb and ask what my share is, plus my flights, it writes some code to answer the question: it counts the number of days in Paris and even does an exchange-rate calculation behind the scenes to get the answer.
I think my trip to Paris is set. To recap, we've just seen how to quickly create an assistant that manages the state of user conversations, uses external tools such as knowledge retrieval and Code Interpreter, and finally calls your own functions to make things happen.
There's one more example of what's possible with function calling combined with the new modalities.
While working on DevDay, I built a small custom assistant that knows everything about the event. This is my phone; on the right you can see a very simple Swift app that takes microphone input. The API offers six unique, rich voices, each speaking multiple languages, so you can find the right fit for your app.
On the left are the behind-the-scenes logs: I use Whisper to turn the voice input into text, an assistant built on GPT-4 Turbo, and finally the new TTS API to make it speak.
Function calling gets even more interesting when the assistant can connect to the internet and take real actions for users. We had the assistant randomly select five attendees and give them $500 in OpenAI credits. You can see it checking the attendee list, and once it's done, it reports that it picked five DevDay attendees and added $500 in API credits to their accounts.
Wrap-up
Sam Altman:
Pretty cool. The Assistants API goes into beta today, and we're excited to see what you do with it; anyone can enable it. GPTs and Assistants are precursors to agents that will be able to do far more, gradually planning and performing more complex actions on your behalf.
As I mentioned, we really believe in the importance of gradual, iterative deployment. We think it's important for people to start building with and using these agents now, to get a feel for what the world will be like as they become more capable, and we'll keep updating our systems based on your feedback.
Today we introduced GPTs, custom versions of ChatGPT that combine instructions, expanded knowledge, and actions. We launched the Assistants API to make it easier to build assistive experiences in your own apps. These are our first steps toward AI agents, and their capabilities will keep growing over time.
We introduced the new GPT-4 Turbo model with improved function calling and knowledge, lower prices, new modalities, and more.
We're deepening our partnership with Microsoft.
Finally, I want to take a moment to thank the team that creates all of this. OpenAI has remarkable talent density, but it still takes an enormous amount of hard work and coordination to make everything happen, and I'm incredibly grateful to work with these colleagues. We do all of this because we believe AI will be a technological and societal revolution that changes the world in many ways.
As we said earlier, if you give people better tools, they can change the world. AI will bring individual empowerment and agency at a scale we've never seen before, and it will elevate humanity to a scale we've never seen either. We'll be able to do more, create more, and have more.
As intelligence gets integrated everywhere, we'll all have superpowers on demand. We're excited to see what you build with this technology and to discover the new future we'll architect together. We hope you'll come back next year. Thank you.

01
Six major upgrades:
from GPT-4 to GPT-4 Turbo
The most important part of this event was the further upgrade of the GPT line.
Sam Altman introduced GPT-4 Turbo, rolling out in both ChatGPT and the API. Based on user feedback, GPT-4 Turbo brings six major upgrades: longer context, more control, refreshed model knowledge, multimodality, model fine-tuning and customization, and higher rate limits.
First among the six upgrades is longer context input.
OpenAI previously offered input lengths of up to 32K; GPT-4 Turbo raises that to 128K, leapfrogging competitor Anthropic's 100K context length.
What does moving from a 32K to a 128K context window actually change?
A longer context window means the model can reason over far more material at once: it keeps earlier content in mind, connects ideas that sit far apart, digs into details precisely, and tracks complex information across a larger span. Put simply, the more text a model can handle at once, the more capable it appears.
Second, OpenAI added several stronger control mechanisms to make it easier for developers to call APIs and functions, including JSON mode and the ability to call multiple functions at once.
Third, upgrades to both built-in and external knowledge. As Altman put it at the event, "We are just as annoyed as all of you, probably more, that GPT-4's knowledge about the world ended in 2021."
The updated GPT's built-in knowledge now extends to April 2023, and users can also upload their own external knowledge bases.
Fourth, multimodal upgrades. On the image side, the new model not only supports DALL·E 3 but also accepts images as input. On the speech side, a new text-to-speech model lets developers choose from six preset voices.
Fifth, model fine-tuning and customization. The GPT-3.5 Turbo 16K version can now be fine-tuned, at a lower price than the previous generation, and GPT-4 will join the fine-tuning lineup in the future.
In addition, OpenAI launched a custom-model service for enterprises, though Altman noted on stage that "OpenAI won't be able to do many of these custom models, and they won't be cheap."
The enterprise custom-model service reportedly covers every step of the model training process, including additional domain-specific pre-training and domain-specific post-training.
The last item is higher rate limits. GPT-4 customers get their per-minute rate limits doubled right after the event, meaning they can send more requests and tokens to GPT-4 in the same amount of time and get more output.
On pricing, the issue individual users care about most, Altman said that despite the stronger capabilities, prices are lower: at a blended rate, GPT-4 Turbo is more than 2.75x cheaper than GPT-4, at 1 cent per 1,000 input tokens and 3 cents per 1,000 output tokens.
02
Letting people who can't code
easily define their own GPT
While other big players are still chasing raw generation capability, OpenAI is already competing on the ecosystem behind its models.
Altman said on stage, "Just as Apple changed technology forever by launching the iPhone in 2007 and the App Store in 2008, we're launching the GPT Store."
The GPT Store is a further upgrade of the plugin store launched this May. Building on that earlier store, OpenAI has shifted its strategy from targeting "developers" to targeting "everyone," and anyone can offer their own customized GPT in the GPT Store.
OpenAI calls this new concept GPTs.
Altman described each GPT as "a customized version of ChatGPT built for a specific purpose," and demonstrated the feature by creating a "Startup Mentor" GPT live on stage.
Concretely, GPTs let you create customized GPT roles and capabilities through natural language, so users can build their own tailored agents to fit their needs.

Such a GPT can understand the knowledge of a specific industry in depth, provide a personalized conversational experience, extend the depth and breadth of its knowledge, improve how efficiently it executes specific tasks, and even incorporate up-to-date information to support decisions.
This greatly raises GPT's value in professional domains and gives individuals and organizations highly customized intelligent solutions, opening a new chapter in the practical usefulness of AI.
Besides "letting people who can't code create applications," OpenAI also wants to make it "easier for developers to build applications." Its take on the AI agent here goes by a different name: the Assistants API.
Developers can now create AI assistants inside their own applications. Following instructions, an assistant uses OpenAI's models and tools to complete tasks such as data analysis and programming, and it comes with convenient building blocks like persistent threads, retrieval, code execution, and function calling.
The Assistants API is available in beta now, and developers can call an AI assistant in the Assistants Playground to get a taste of "zero-code programming."
Beyond these upgrades, the "GPT all tools" integration unveiled at the event was another highlight, combining a simplified interface with consolidated capabilities. It automatically selects and combines the most suitable tools based on the user's input, making GPT-4 feel like an intelligent, flexible AI assistant rather than just a text generator.
Bringing all the capabilities together changes what GPT can be: it breaks through the current functional silos and can act as a full-sensory virtual assistant, an unbounded creative workbench, an information butler, an interactive learning partner, a multilingual communication bridge, and more.