OpenAI has launched the GPT-4.1 model family (GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano) on its API platform. Designed for real-world performance, the models offer significant improvements in coding, instruction following, and long-context understanding, while operating at a fraction of the cost and latency of previous models.

The launch signals a decisive move to phase out GPT-4.5 Preview, with OpenAI setting a July 14, 2025 sunset date. GPT-4.1 now stands as the company’s flagship API-only model, providing both superior performance and pricing advantages.

Coding Capabilities Reimagined

GPT-4.1 outperforms all previous GPT models on SWE-bench Verified, completing 54.6% of tasks, a 21-point leap over GPT-4o. The model excels in real-world software engineering, particularly when editing large code files or generating patches in diff format.

Aider’s polyglot benchmark confirms GPT-4.1’s strength in multilingual coding tasks: the model achieves over 52% accuracy in diff generation, more than doubling GPT-4o’s score and beating GPT-4.5. Developers can also take advantage of expanded output token limits (up to 32,768 tokens) and use optimized prompting for faster iteration cycles.
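For readers unfamiliar with the term, a patch in "diff format" lists only the changed lines of a file rather than the whole file. The sketch below uses Python's standard difflib module to produce a unified diff between two versions of a small file. The file name cart.py and both code snippets are hypothetical, and the exact diff format GPT-4.1 emits may differ from what difflib prints.

```python
import difflib

# Hypothetical "before" and "after" versions of a small source file.
original = [
    "def total(items):\n",
    "    result = 0\n",
    "    for item in items:\n",
    "        result += item\n",
    "    return result\n",
]
patched = [
    "def total(items):\n",
    "    return sum(items)\n",
]


def make_patch(before, after, path="cart.py"):
    """Return a unified diff that turns `before` into `after`.

    `path` is an illustrative file name used only in the diff headers.
    """
    return "".join(
        difflib.unified_diff(
            before, after, fromfile=f"a/{path}", tofile=f"b/{path}"
        )
    )


patch_text = make_patch(original, patched)
print(patch_text)
```

The printed patch contains the file headers, one hunk marker, the removed lines prefixed with "-", and the added line prefixed with "+". Emitting only this delta instead of rewriting the whole file is what keeps edits to large files cheap in output tokens.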

“GPT-4.1 offers a game-changing experience for engineers, particularly in debugging, refactoring, and real-time code generation,” said a developer at Windsurf, one of OpenAI’s alpha partners.

Sharper Instruction Following and Response Fidelity

With a 38.3% score on Scale’s MultiChallenge benchmark, 10.5 percentage points higher than GPT-4o, GPT-4.1 stands out for its improved comprehension of nuanced, multi-step instructions. In OpenAI’s own instruction-following tests, it reached 49.1% accuracy on hard prompts, well ahead of GPT-4o.

The model’s ability to parse and follow ordered, conditional, and format-specific instructions makes it highly effective in enterprise applications like tax compliance (as seen with Blue J) and database querying (as evidenced by performance at Hex).

Revolutionary Long Context: Up to 1 Million Tokens

GPT-4.1 introduces a context window of up to 1 million tokens, roughly eight copies of the entire React codebase. On new evaluations such as OpenAI-MRCR and Graphwalks, GPT-4.1 has proven capable of both retrieval and multi-hop reasoning across vast documents, outperforming GPT-4o in all positions and tests.
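To get a feel for what a 1-million-token window means in practice, the sketch below estimates whether a set of documents would fit. It is a rough illustration only: the 4-characters-per-token ratio is an assumed heuristic rather than a real tokenizer, and reserving room for the 32,768-token maximum output is a conservative assumption, not a documented requirement. The function names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # figure quoted in the article
MAX_OUTPUT_TOKENS = 32_768         # output limit quoted in the article
CHARS_PER_TOKEN = 4                # ASSUMPTION: crude heuristic, not a tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text` by ceiling division on its length."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents, reserved_for_output=MAX_OUTPUT_TOKENS):
    """Return (fits, estimated_tokens) for a list of document strings.

    ASSUMPTION: space for the response is reserved inside the window.
    """
    used = sum(estimate_tokens(doc) for doc in documents)
    budget = CONTEXT_WINDOW_TOKENS - reserved_for_output
    return used <= budget, used


# Example: 50 hypothetical contracts of about 40,000 characters each.
contracts = ["x" * 40_000 for _ in range(50)]
fits, used = fits_in_context(contracts)
print(f"estimated tokens: {used}, fits: {fits}")
```

For production use, an actual tokenizer would replace the character heuristic, since token counts vary considerably with language and content type.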

Partners like Thomson Reuters and Carlyle report major gains in accuracy and document analysis, from complex legal workflows to extracting financial data from multi-format files.

“GPT-4.1 allows us to process legal contracts across dozens of documents with minimal supervision,” noted a representative from Thomson Reuters.

Mini and Nano: Performance at the Edge of Speed and Cost

GPT-4.1 mini matches GPT-4o’s intelligence but cuts latency nearly in half and slashes cost by 83%.
GPT-4.1 nano is the fastest and cheapest model to date, ideal for classification and autocomplete tasks, with impressive scores: 80.1% on MMLU and 50.3% on GPQA.

These models enable high performance at scale, supporting contexts of up to 1 million tokens with response times as fast as 5 seconds for large prompts.

Multimodal and Visual Understanding Enhanced

Visual benchmarks also show major improvements. On tests like MMMU and MathVista, GPT-4.1 and GPT-4.1 mini beat GPT-4o and even rival GPT-4.5. The models excel at interpreting charts, mathematical diagrams, and scientific papers, capabilities that are critical for education, research, and analytics use cases.

Pricing and Availability

The entire GPT-4.1 series is now live in the OpenAI API, including through the Batch API with an additional 50% discount. Here’s the pricing breakdown per 1 million tokens:

Model	Input	Cached Input	Output	Blended Pricing
GPT-4.1	$2.00	$0.50	$8.00	$1.84
GPT-4.1 mini	$0.40	$0.10	$1.60	$0.42
GPT-4.1 nano	$0.10

Prompt caching discounts have increased to 75%, allowing developers to optimize both performance and cost.
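The listed rates translate into per-request costs with simple arithmetic. The sketch below uses the per-million-token prices from the table for GPT-4.1 and GPT-4.1 mini; GPT-4.1 nano is left out because its row is incomplete here. The function name, the dictionary keys, and the way the Batch API discount is applied to the whole total are assumptions made for illustration, not OpenAI's billing logic.

```python
# USD per 1 million tokens, copied from the pricing table in the article.
PRICES_PER_MILLION = {
    "gpt-4.1": {"input": 2.00, "cached_input": 0.50, "output": 8.00},
    "gpt-4.1-mini": {"input": 0.40, "cached_input": 0.10, "output": 1.60},
}
BATCH_DISCOUNT = 0.50  # additional Batch API discount quoted in the article


def estimate_cost(model, input_tokens, output_tokens, cached_tokens=0, batch=False):
    """Estimate the USD cost of one request.

    `cached_tokens` is the portion of `input_tokens` served from the prompt cache.
    ASSUMPTION: the batch discount is applied uniformly to the whole total.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    rates = PRICES_PER_MILLION[model]
    fresh_tokens = input_tokens - cached_tokens
    cost = (
        fresh_tokens * rates["input"]
        + cached_tokens * rates["cached_input"]
        + output_tokens * rates["output"]
    ) / 1_000_000
    return cost * (1 - BATCH_DISCOUNT) if batch else cost


# The cached-input rate is a quarter of the regular input rate: a 75% discount.
rates = PRICES_PER_MILLION["gpt-4.1"]
cache_discount = 1 - rates["cached_input"] / rates["input"]

# Example: 100,000 input tokens (80,000 of them cached) and 2,000 output tokens.
example_cost = estimate_cost("gpt-4.1", 100_000, 2_000, cached_tokens=80_000)
print(f"cache discount: {cache_discount:.0%}, example cost: ${example_cost:.3f}")
```

In this example the 20,000 uncached input tokens cost $0.04, the 80,000 cached tokens cost $0.04, and the 2,000 output tokens cost $0.016, for a total of about $0.096. Without caching, the same request would cost about $0.216.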

What’s Next?

With GPT-4.1’s improvements in code generation, long-context reasoning, instruction following, and vision, OpenAI paves the way for more robust agentic applications. The company plans to continue incorporating learnings from GPT-4.5 and earlier iterations into its evolving suite of models.

As GPT-4.5 Preview sunsets, developers are urged to migrate their workloads to GPT-4.1 to benefit from its enhanced capabilities and lower operational costs.

“GPT-4.1 is the practical AI leap we’ve been waiting for,” said a product manager at Qodo. “It’s smarter, faster, and more aligned with the challenges developers face today.”

OpenAI Launches GPT-4.1 API with Breakthrough Features