AI+Web3 Ecosystem Panorama: New Opportunities from Computing Power Sharing to Privacy Computing

2025-07-09 17:54:47

AI+Web3: Towers and Squares

TL;DR

Web3 projects with AI concepts have become targets for capital attraction in both primary and secondary markets.
The opportunities of Web3 in the AI industry are reflected in: using distributed incentives to coordinate potential supplies in the long tail------across data, storage, and computation; at the same time, establishing an open-source model and a decentralized market for AI agents.
AI's main application in the Web3 industry is on-chain finance (crypto payments, trading, data analysis) and assisting development.
The utility of AI + Web3 is reflected in the complementarity of the two: Web3 is expected to counter the centralization of AI, while AI is expected to help Web3 break out of its circle.

Introduction

In the past two years, the development of AI has been like pressing the acceleration button. This butterfly effect triggered by ChatGPT has not only opened a new world of generative artificial intelligence but has also stirred up a tidal wave in the realm of Web3.

With the support of AI concepts, the funding boost in the relatively sluggish cryptocurrency market is evident. Media statistics show that in the first half of 2024, a total of 64 Web3+AI projects completed funding, with the AI-based operating system Zyber365 achieving a maximum funding amount of 100 million dollars in its Series A round.

The secondary market is even more prosperous. According to data from the crypto aggregation site Coingecko, in just over a year, the total market value of the AI sector has reached 48.5 billion USD, with a 24-hour trading volume approaching 8.6 billion USD. The positive impact of mainstream AI technological advancements is obvious; after the release of a certain company's Sora text-to-video model, the average price of the AI sector surged by 151%. The AI effect has also radiated to one of the cryptocurrency money-making sectors, Meme: the first AI Agent concept MemeCoin------GOAT quickly became popular and achieved a valuation of 1.4 billion USD, successfully igniting the AI Meme craze.

The research and topics surrounding AI+Web3 are equally heated, from AI+Depin to AI Memecoin, and now to AI Agent and AI DAO, the FOMO sentiment can no longer keep up with the speed of the new narrative rotation.

AI+Web3, this term combination filled with hot money, opportunities, and futuristic fantasies, is inevitably seen by some as a marriage arranged by capital. It seems difficult for us to discern whether beneath this magnificent robe lies a stage for speculators or the eve of a dawn explosion?

To answer this question, a key consideration for both parties is whether it will get better with the other involved. Can one benefit from the other's model? In this article, we also attempt to examine this pattern from the shoulders of predecessors: How can Web3 play a role in various aspects of the AI technology stack, and what new vitality can AI bring to Web3?

Part.1 What opportunities does Web3 have under the AI stack?

Before diving into this topic, we need to understand the technical stack of AI large models:

To express the entire process in simpler terms: a "large model" is like the human brain. In the early stages, this brain belongs to a newborn baby that has just arrived in the world, needing to observe and intake a vast amount of external information to understand this world. This is the "collection" phase of data. Since computers do not possess multiple senses like human sight and hearing, large-scale unlabelled information from the outside world needs to be converted into a format that computers can understand and utilize through "preprocessing" before training.

After inputting data, the AI constructs a model with understanding and predictive abilities through "training", which can be seen as the process of an infant gradually understanding and learning about the outside world. The parameters of the model are like the language skills that the infant adjusts continuously during the learning process. When the learning content begins to specialize, or when feedback from communication with others is received and corrections are made, it enters the "fine-tuning" stage of the large model.

As children gradually grow up and learn to speak, they can understand meanings and express their feelings and thoughts in new conversations. This stage is similar to the "reasoning" of large AI models, where the model can predict and analyze new language and text inputs. Infants express feelings, describe objects, and solve various problems through language abilities, which is also akin to how large AI models are applied in the reasoning phase to various specific tasks after training and deployment, such as image classification, speech recognition, and so on.

The AI Agent is moving closer to the next form of large models - capable of independently executing tasks and pursuing complex goals. It not only possesses the ability to think but also has memory, planning skills, and can use tools to interact with the world.

Currently, in response to the pain points of AI across various stacks, Web3 has preliminarily formed a multi-layered, interconnected ecosystem that encompasses all stages of the AI model process.

1. Basic Layer: Computing Power and Data's Airbnb

▎Hash Rate

Currently, one of the highest costs of AI is the computational power and energy required for training models and inference models.

One example is that a company's LLAMA3 requires 16,000 H100 GPUs produced by a certain company (this is a top-tier graphics processing unit designed for artificial intelligence and high-performance computing workloads) and takes 30 days to complete training. The unit price of the latter's 80GB version ranges from $30,000 to $40,000, which necessitates an investment of $400 million to $700 million in computing hardware (GPU + network chips), while the monthly training consumes 1.6 billion kilowatt-hours, with energy expenditures approaching $20 million per month.

The release of AI computing power is also one of the earliest intersections of Web3 and AI------DePin (Decentralized Physical Infrastructure Network). Currently, the DePin Ninja data website has displayed over 1,400 projects, among which representative projects for GPU computing power sharing include io.net, Aethir, Akash, Render Network, and so on.

The main logic is that the platform allows individuals or entities with idle GPU resources to contribute their computing power in a decentralized way without permission, similar to an online marketplace for buyers and sellers like a ride-hailing platform or an accommodation platform, thereby improving the utilization of underutilized GPU resources. End users also obtain more cost-effective and efficient computing resources as a result; at the same time, the staking mechanism ensures that if there is a violation of the quality control mechanism or a network interruption, resource providers will face corresponding penalties.

Its characteristics are:

Gather idle GPU resources: The suppliers mainly consist of surplus computing power resources from third-party independent small and medium-sized data centers, cryptocurrency mining farms, etc., with mining hardware based on the PoS consensus mechanism, such as FileCoin and ETH miners. Currently, there are also projects aimed at launching devices with lower entry thresholds, such as exolab utilizing local devices like MacBook, iPhone, iPad, etc., to establish a computing power network for running large model inference.
Facing the long tail market of AI computing power:

a. "In terms of technology, a decentralized computing power market is more suitable for inference steps. Training relies more on the data processing capabilities brought by ultra-large cluster scale GPUs, while inference requires relatively lower GPU computing performance, such as Aethir focusing on low-latency rendering tasks and AI inference applications."

b. "From the demand side perspective," small and medium computing power demanders will not train their own large models individually, but will only choose to optimize and fine-tune around a few leading large models, and these scenarios are naturally suitable for distributed idle computing power resources.

Decentralized ownership: The technological significance of blockchain lies in the fact that resource owners always retain control over their resources, allowing for flexible adjustments based on demand while also generating profits.

▎Data

Data is the foundation of AI. Without data, computation is as useless as a floating weed, and the relationship between data and models is like the saying "Garbage in, Garbage out"; the quantity of data and the quality of input determine the final output quality of the model. For the training of current AI models, data determines the model's language ability, understanding ability, and even values and human-like performance. Currently, the data demand dilemma of AI mainly focuses on the following four aspects:

Data Hunger: AI model training relies on a large amount of data input. Public information shows that a certain company's training of GPT-4 has reached a trillion-level parameter count.
Data Quality: With the integration of AI and various industries, the timeliness of data, diversity of data, specialization of vertical data, and the incorporation of emerging data sources such as social media sentiment have posed new requirements for its quality.
Privacy and compliance issues: Currently, various countries and enterprises are gradually recognizing the importance of high-quality datasets and are imposing restrictions on dataset scraping.
High data processing costs: large data volume and complex processing. Public information shows that more than 30% of AI companies' R&D costs are used for basic data collection and processing.

Currently, web3 solutions are reflected in the following four aspects:

Data Collection: The availability of freely accessible real-world data for scraping is rapidly diminishing, and AI companies are increasing their spending on data year by year. However, this expenditure has not benefited the true contributors of the data; platforms have entirely enjoyed the value creation brought by the data. For instance, a certain social media platform generated a total income of $203 million through data licensing agreements with AI companies.

The vision of Web3 is to allow users who contribute genuinely to also participate in the value creation brought by data, and to obtain more private and valuable data from users in a low-cost manner through distributed networks and incentive mechanisms.

Grass is a decentralized data layer and network that allows users to contribute idle bandwidth and relay traffic by running Grass nodes to capture real-time data from across the internet and earn token rewards.
Vana has introduced a unique Data Liquidity Pool (DLP) concept, allowing users to upload their private data (such as shopping records, browsing habits, social media activities, etc.) to a specific DLP and flexibly choose whether to authorize the use of this data by specific third parties;
In PublicAI, users can use #AI 或#Web3 as a classification tag on a certain social platform and @PublicAI to achieve data collection.

Data Preprocessing: In the data processing of AI, the collected data is often noisy and contains errors, so it must be cleaned and transformed into a usable format before training the model, involving standardization, filtering, and handling missing values as repetitive tasks. This stage is one of the few manual processes in the AI industry, giving rise to the profession of data annotators. As the model's requirements for data quality increase, the threshold for data annotators has also risen, and this task is naturally suitable for the decentralized incentive mechanisms of Web3.

Currently, Grass and OpenLayer are both considering adding data labeling as a key step.
Synesis has proposed the concept of "Train2earn," emphasizing data quality, where users can earn rewards by providing labeled data, annotations, or other forms of input.
The data labeling project Sapien gamifies the labeling tasks and allows users to stake points to earn more points.

Data Privacy and Security: It is important to clarify that data privacy and security are two different concepts. Data privacy involves the handling of sensitive data, while data security protects data information from unauthorized access, destruction, and theft. Thus, the advantages of Web3 privacy technologies and potential application scenarios are reflected in two aspects: (1) Training of sensitive data; (2) Data collaboration: Multiple data owners can participate in AI training together without sharing their raw data.

Current common privacy technologies in Web3 include:

Trusted Execution Environment ( TEE ), such as Super Protocol;
Fully Homomorphic Encryption (FHE), such as BasedAI, Fhenix.io, or Inco Network;
Zero-knowledge technology (zk), such as Reclaim Protocol which uses zkTLS technology, generates zero-knowledge proofs for HTTPS traffic, allowing users to securely import activity, reputation, and identity data from external websites without exposing sensitive information.

However, the field is still in its early stages, and most projects are still in exploration. A current dilemma is the cost of computation.

View Original

This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.

22 Likes