StreamLineCrypto.com

OpenAI GPT-4o ranked as best AI model for writing Solidity smart contract code by IQ

October 21, 2024 (Updated: October 21, 2024)

SolidityBench by IQ has launched as the first leaderboard to evaluate LLMs in Solidity code generation. Available on Hugging Face, it introduces two new benchmarks, NaïveJudge and HumanEval for Solidity, designed to assess and rank the proficiency of AI models in generating smart contract code.

Developed by IQ’s BrainDAO as part of its forthcoming IQ Code suite, SolidityBench serves to refine their own EVMind LLMs and compare them against generalist and community-created models. IQ Code aims to offer AI models tailored for generating and auditing smart contract code, addressing the growing need for secure and efficient blockchain applications.

As IQ told CryptoSlate, NaïveJudge offers a novel approach by tasking LLMs with implementing smart contracts based on detailed specifications derived from audited OpenZeppelin contracts. These contracts provide a gold standard for correctness and efficiency. The generated code is evaluated against a reference implementation using criteria such as functional completeness, adherence to Solidity best practices and security standards, and optimization efficiency.

The evaluation process uses advanced LLMs, including different versions of OpenAI’s GPT-4 and Claude 3.5 Sonnet, as impartial code reviewers. They assess the code against rigorous criteria, including implementation of all key functionality, handling of edge cases, error management, proper syntax usage, and overall code structure and maintainability.

Optimization considerations such as gas efficiency and storage management are also evaluated. Scores range from 0 to 100, providing a comprehensive assessment across functionality, security, and efficiency, mirroring the complexities of professional smart contract development.
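The article does not say how the reviewers' per-criterion assessments are combined into a single 0 to 100 score. As a rough illustration only, the sketch below assumes three equally weighted criteria and a simple mean across judges; the criterion names, the function `aggregate_judge_scores`, and the example scores are all hypothetical and not taken from SolidityBench.

```python
from statistics import mean

# Hypothetical criterion names, loosely following the three areas the article lists.
CRITERIA = ("functionality", "security", "efficiency")


def aggregate_judge_scores(reviews: dict[str, dict[str, float]]) -> float:
    """Combine per-criterion scores (0-100) from several judges into one 0-100 score.

    Assumption: criteria are equally weighted and judges are averaged.
    The real NaiveJudge weighting is not described in the article.
    """
    if not reviews:
        raise ValueError("at least one review is required")
    per_judge = []
    for judge, scores in reviews.items():
        missing = [c for c in CRITERIA if c not in scores]
        if missing:
            raise ValueError(f"{judge} is missing criteria: {missing}")
        for criterion in CRITERIA:
            if not 0 <= scores[criterion] <= 100:
                raise ValueError(f"{judge}/{criterion} is outside 0-100")
        # Equal-weight mean over the criteria for this judge.
        per_judge.append(mean(scores[c] for c in CRITERIA))
    # Simple mean across judges.
    return mean(per_judge)


# Made-up scores for illustration only; not taken from the leaderboard.
example_reviews = {
    "judge_a": {"functionality": 80, "security": 70, "efficiency": 60},
    "judge_b": {"functionality": 90, "security": 80, "efficiency": 70},
}
print(aggregate_judge_scores(example_reviews))
```

Averaging across several judges is one plausible way to dampen the bias of any single reviewing model; a weighted scheme would only require replacing the inner `mean` with a weighted sum.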

Which AI models are best for Solidity smart contract development?

Benchmarking results showed that OpenAI’s GPT-4o model achieved the highest overall score of 80.05, with a NaïveJudge score of 72.18 and HumanEval for Solidity pass rates of 80% at pass@1 and 92% at pass@3.

Interestingly, newer reasoning models like OpenAI’s o1-preview and o1-mini were beaten to the top spot, scoring 77.61 and 75.08, respectively. Models from Anthropic and xAI, including Claude 3.5 Sonnet and grok-2, demonstrated competitive performance with overall scores hovering around 74. Nvidia’s Llama-3.1-Nemotron-70B scored lowest in the top 10 at 52.54.

SolidityBench scores for LLMs (Hugging Face)

Per IQ, HumanEval for Solidity adapts OpenAI’s original HumanEval benchmark from Python to Solidity, encompassing 25 tasks of varying difficulty. Each task includes corresponding tests compatible with Hardhat, a popular Ethereum development environment, facilitating accurate compilation and testing of generated code. The evaluation metrics, pass@1 and pass@3, measure the model’s success on the first attempt and within three attempts, offering insights into both precision and problem-solving capabilities.
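To make the pass@k metrics concrete, here is a minimal sketch using the unbiased estimator from the original HumanEval paper, pass@k = 1 - C(n-c, k) / C(n, k), where n is the number of samples generated for a task and c is the number that pass its tests. Whether SolidityBench uses this exact estimator is an assumption; the function names and the sample data are illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples, drawn from n of which c pass, is correct."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any draw of k must include a passing one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks; each entry is (samples generated, samples passing)."""
    if not results:
        raise ValueError("at least one task result is required")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Made-up results for four tasks with three samples each; not leaderboard data.
sample_results = [(3, 3), (3, 1), (3, 0), (3, 2)]
print(f"pass@1 = {benchmark_pass_at_k(sample_results, 1):.2f}")
print(f"pass@3 = {benchmark_pass_at_k(sample_results, 3):.2f}")
```

With three samples per task, pass@3 reduces to the share of tasks where at least one sample passed, while pass@1 is the average per-sample success rate, which is why pass@3 is never lower than pass@1.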

Goals of using AI models in smart contract development

By introducing these benchmarks, SolidityBench seeks to advance AI-assisted smart contract development. It encourages the creation of more sophisticated and reliable AI models while providing developers and researchers with valuable insights into AI’s current capabilities and limitations in Solidity development.

The benchmarking toolkit aims to advance IQ Code’s EVMind LLMs and also to set new standards for AI-assisted smart contract development across the blockchain ecosystem. The initiative hopes to address a critical need in the industry, where the demand for secure and efficient smart contracts continues to grow.

Developers, researchers, and AI enthusiasts are invited to explore and contribute to SolidityBench, which aims to drive the continuous refinement of AI models, promote best practices, and advance decentralized applications.

Visit the SolidityBench leaderboard on Hugging Face to learn more and begin benchmarking Solidity generation models.
