GPT-4 Technical Report
from experience. Care should be taken when using the outputs of GPT-4, particularly in contexts where reliability is important.

GPT-4's capabilities and limitations create significant and novel safety challenges, and we believe careful study of these challenges is an important area of research given the potential societal impact. This report includes an extensive system card (after the Appendix) describing some of the risks we foresee around bias, disinformation, over-reliance, privacy, cybersecurity, proliferation, and more. It also describes interventions we made to mitigate potential harms from the deployment of GPT-4, including adversarial testing with domain experts, and a model-assisted safety pipeline.

2 Scope and Limitations of this Technical Report

This report focuses on the capabilities, limitations, and safety properties of GPT-4. GPT-4 is a Transformer-style model [33] pre-trained to predict the next token in a document, using both publicly available data (such as internet data) and data licensed from third-party providers. The model was then fine-tuned using Reinforcement Learning from Human Feedback (RLHF) [34]. Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.

We are committed to independent auditing of our technologies, and shared some initial steps and ideas in this area in the system card accompanying this release.² We plan to make further technical details available to additional third parties who can advise us on how to weigh the competitive and safety considerations above against the scientific value of further transparency.

3 Predictable Scaling

A large focus of the GPT-4 project was building a deep learning stack that scales predictably. The primary reason is that for very large training runs like GPT-4, it is not feasible to do extensive model-specific tuning. To address this, we developed infrastructure and optimization methods that have very predictable behavior across multiple scales. These improvements allowed us to reliably predict some aspects of the performance of GPT-4 from smaller models trained using 1,000× to 10,000× less compute.

3.1 Loss Prediction

The final loss of properly-trained large language models is thought to be well approximated by power laws in the amount of compute used to train the model [35, 36, 2, 14, 15].

To verify the scalability of our optimization infrastructure, we predicted GPT-4's final loss on our internal codebase (not part of the training set) by fitting a scaling law with an irreducible loss term (as in Henighan et al. [15]): L(C) = aC^b + c, from models trained using the same methodology but using at most 10,000× less compute than GPT-4. This prediction was made shortly after the run started, without use of any partial results. The fitted scaling law predicted GPT-4's final loss with high accuracy (Figure 1).
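As a rough illustration of the kind of fit described in Section 3.1, the sketch below fits L(C) = aC^b + c to a handful of (compute, loss) points from hypothetical small runs and extrapolates to the full compute budget. The data values, the compute normalization, the initial guesses, and the use of scipy.optimize.curve_fit are all assumptions made for illustration; they are not the procedure or numbers behind Figure 1.

# Minimal sketch (illustrative only): fit an irreducible-loss scaling law
# L(C) = a * C**b + c to (compute, final loss) pairs from small runs,
# then extrapolate to the full compute budget.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(C, a, b, c):
    # Power law in compute with an irreducible loss term c.
    return a * np.power(C, b) + c

# Hypothetical data: compute normalized so the full run is C = 1.0.
compute = np.array([1e-5, 3e-5, 1e-4, 3e-4, 1e-3, 1e-2, 1e-1])
loss = np.array([4.10, 3.70, 3.35, 3.08, 2.85, 2.45, 2.15])

# Initial guesses: a > 0, b < 0 (loss falls with compute), c > 0 (irreducible term).
(a, b, c), _ = curve_fit(scaling_law, compute, loss, p0=[1.0, -0.1, 1.0], maxfev=10000)

predicted_final_loss = scaling_law(1.0, a, b, c)
print(f"fit: a={a:.3f}, b={b:.3f}, c={c:.3f}")
print(f"predicted loss at full compute: {predicted_final_loss:.3f}")

Because the fit uses only small-run measurements, a forecast of this kind can be produced before the large run yields any results, which is consistent with the report's note that the prediction was made shortly after the run started.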
3.2 Scaling of Capabilities on HumanEval

Having a sense of the capabilities of a model before training can improve decisions around alignment, safety, and deployment. In addition to predicting final loss, we developed methodology to predict more interpretable metrics of capability. One such metric is pass rate on the HumanEval dataset [37], which measures the ability to synthesize Python functions of varying complexity. We successfully predicted the pass rate on a subset of the HumanEval dataset by extrapolating from models trained with at most 1,000× less compute (Figure 2).

For an individual problem in HumanEval, performance may occasionally worsen with scale. Despite these challenges, we find an approximate power law relationship -E_P[log(pass_rate(C))] = α·C^(-k), where k and α are positive constants and P is a subset of problems in the dataset.

² In addition to the accompanying system card, OpenAI will soon publish additional thoughts on the social and economic implications of AI systems, including the need for effective regulation.
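The extrapolation in Section 3.2 can be sketched in the same way: fit the power law to the negative mean log pass rate measured on small models, then convert the extrapolated value back to a pass rate. Everything below (the data points, the normalization, the fitted constants, and the use of curve_fit) is a hypothetical illustration, not the report's actual methodology or data.

# Minimal sketch (illustrative only): fit -E_P[log(pass_rate(C))] = alpha * C**(-k)
# on small-model measurements and extrapolate to the full run.
import numpy as np
from scipy.optimize import curve_fit

def neg_mean_log_pass_rate(C, alpha, k):
    # alpha and k are positive constants; C is compute relative to the full run.
    return alpha * np.power(C, -k)

# Hypothetical measurements: -mean(log pass_rate) on a fixed problem subset.
compute = np.array([1e-3, 3e-3, 1e-2, 3e-2, 1e-1])
observed = np.array([3.8, 3.1, 2.5, 2.0, 1.6])

(alpha, k), _ = curve_fit(neg_mean_log_pass_rate, compute, observed, p0=[1.0, 0.2])

# Extrapolate to the full run (C = 1.0); exp(-x) recovers a geometric-mean pass rate.
predicted = neg_mean_log_pass_rate(1.0, alpha, k)
print(f"alpha={alpha:.3f}, k={k:.3f}")
print(f"predicted geometric-mean pass rate at full compute: {np.exp(-predicted):.3f}")

The fit is over the mean of log pass rates on a subset of problems rather than any single problem, consistent with the report's observation that individual problems can occasionally get worse with scale.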