FaultSeeker: LLM-Empowered Blockchain Fault Localization

System Output

Human Cognitive Layer      AI Architecture Mapping    FaultSeeker Implementation
──────────────────────  →  ─────────────────────── →  ──────────────────────────
Perception Layer        →  Input Processing        →  Blockchain Data Collection
Working Memory          →  Context Management      →  RAG + Dynamic Loading
Attention Mechanism     →  Focus Selection         →  Stage 1 Transaction Forensics
Long-term Memory        →  Knowledge Base          →  Expert Agent Pre-trained Knowledge
Reasoning Layer         →  Synthesis Decision      →  Orchestrator Coordination & Synthesis

┌─────────────────────────────────────────────────────────┐
│                FaultSeeker Architecture                  │
├─────────────────────────────────────────────────────────┤
│                                                          │
│  Stage 1: Transaction-Level Forensics                   │
│  ┌──────────────────────────────────────────────┐      │
│  │ • Global Scan: Analyze blockchain tx graph   │      │
│  │ • Pattern Recognition: Detect anomalous fund │      │
│  │   flows, Gas consumption patterns            │      │
│  │ • Initial Screening: Locate suspicious txs   │      │
│  │   and related addresses                      │      │
│  └──────────────────────────────────────────────┘      │
│           ↓ (Attention Mechanism - Focus on Key Info)   │
│                                                          │
│  Stage 2: Coordinated Specialist Agents                 │
│  ┌──────────────────────────────────────────────┐      │
│  │        Orchestrator                           │      │
│  │  ┌────────────┬────────────┬────────────┐   │      │
│  │  │ Reentrancy │ FlashLoan  │  Price     │   │      │
│  │  │   Expert   │   Expert   │  Oracle    │   │      │
│  │  │            │            │  Expert    │   │      │
│  │  └────────────┴────────────┴────────────┘   │      │
│  │  ┌────────────┬────────────┐                │      │
│  │  │  Access    │  Logic     │ ...           │      │
│  │  │  Control   │   Flaw     │               │      │
│  │  │  Expert    │   Expert   │               │      │
│  │  └────────────┴────────────┘                │      │
│  │   ↓ (Working Memory - Iterative Reasoning)   │      │
│  │    Synthesis → Root Cause → Evidence Chain   │      │
│  └──────────────────────────────────────────────┘      │
│                                                          │
└─────────────────────────────────────────────────────────┘

# Stage 1 Pseudocode (Transaction-Level Forensics)
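class TransactionForensics:
    def __init__(self, blockchain):
        # `blockchain`: any client object that exposes get_transaction(tx_hash)
        self.blockchain = blockchain

    def analyze(self, transaction_hash):
        # 1. Extract transaction metadata
        tx_data = self.blockchain.get_transaction(transaction_hash)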

        # 2. Build interaction graph
        interaction_graph = self.build_interaction_graph(tx_data)
        # Involved addresses: [0xAttacker, 0xVictim, 0xFlashLoanProvider, ...]
        # Call chain: Attacker → FlashLoan → Victim.withdraw() → ...

        # 3. Anomaly pattern detection
        anomalies = []
        if self.detect_unusual_gas_pattern(tx_data):
            anomalies.append("High Gas consumption (flash loan signature)")
        if self.detect_rapid_fund_flow(interaction_graph):
            anomalies.append("Rapid multi-hop fund transfers")
        if self.detect_reentrancy_pattern(tx_data.logs):
            anomalies.append("Suspected reentrancy call sequence")

        # 4. Identify key contracts
        suspicious_contracts = self.identify_key_contracts(
            interaction_graph, anomalies
        )

        return {
            "focus_contracts": suspicious_contracts,
            "attack_indicators": anomalies,
            "evidence_snapshot": interaction_graph
        }
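The screening step can be made concrete with a small, self-contained sketch. It is an illustration under stated assumptions, not FaultSeeker's actual heuristics: the 300K gas cutoff is borrowed from the "normal tx <300K" figure quoted in the sample output, while `MIN_HOPS`, `longest_hop_chain`, and `screen` are hypothetical names introduced here.

```python
from collections import defaultdict

GAS_THRESHOLD = 300_000  # assumed cutoff, from the "normal tx <300K" sample figure
MIN_HOPS = 3             # assumption: three or more chained transfers is "multi-hop"


def build_interaction_graph(transfers):
    """Map each sender to the set of addresses it sent funds to."""
    graph = defaultdict(set)
    for sender, receiver, _amount in transfers:
        graph[sender].add(receiver)
    return graph


def longest_hop_chain(graph, start, seen=None):
    """Length, in edges, of the longest simple path that starts at `start`."""
    seen = (seen or frozenset()) | {start}
    best = 0
    for nxt in graph.get(start, ()):
        if nxt not in seen:
            best = max(best, 1 + longest_hop_chain(graph, nxt, seen))
    return best


def screen(gas_used, transfers, origin):
    """Return the list of anomaly labels raised for one transaction."""
    graph = build_interaction_graph(transfers)
    anomalies = []
    if gas_used > GAS_THRESHOLD:
        anomalies.append("High Gas consumption")
    if longest_hop_chain(graph, origin) >= MIN_HOPS:
        anomalies.append("Rapid multi-hop fund transfers")
    return anomalies


# Toy transfer list that mirrors the call chain in the comments above.
transfers = [
    ("0xAttacker", "0xFlashLoanProvider", 0),
    ("0xFlashLoanProvider", "0xDex", 2804),
    ("0xDex", "0xVictim", 2804),
    ("0xVictim", "0xAttacker", 3000),
]
print(screen(4_800_000, transfers, "0xAttacker"))
```

The path search ignores amounts and timing. A real screen would also need to bound the search depth, since the simple-path search is exponential on dense graphs.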

# Stage 2 Pseudocode (Coordinated Specialist Agents)
class OrchestratorAgent:
    def __init__(self):
        self.experts = {
            "reentrancy": ReentrancyExpert(),
            "flashloan": FlashLoanExpert(),
            "price_oracle": PriceOracleExpert(),
            "access_control": AccessControlExpert(),
            "logic_flaw": LogicFlawExpert()
        }

    def coordinate_analysis(self, forensics_result):
        # 1. Task assignment
        tasks = self.plan_analysis(forensics_result)
        # E.g., High Gas + rapid fund flow → assign to flashloan expert
        #       Repeated call pattern → assign to reentrancy expert

        # 2. Expert analysis (shown as a sequential loop; the experts are
        #    independent of each other, so the calls can run in parallel)
        expert_findings = {}
        for task in tasks:
            expert = self.experts[task.expert_type]
            result = expert.analyze(
                contracts=forensics_result["focus_contracts"],
                context=forensics_result["evidence_snapshot"]
            )
            expert_findings[task.expert_type] = result

        # 3. Synthesis reasoning
        root_cause = self.synthesize(expert_findings)
        # Cross-validation: flashloan expert finds lending + price_oracle expert finds price anomaly
        #                  → Synthesize as "flash loan + price manipulation combo attack"

        # 4. Generate report
        report = self.generate_report(
            root_cause=root_cause,
            evidence_chain=expert_findings,
            recommendations=self.propose_fixes(root_cause)
        )

        return report

class FlashLoanExpert:
    """Flash Loan Attack Specialist Agent Example"""
    def analyze(self, contracts, context):
        # LLM prompt engineering
        prompt = f"""
        You are a blockchain security expert specializing in flash loan attack analysis.

        Analyze the following smart contract code:
        {contracts}

        Transaction context:
        {context}

        Please answer:
        1. Is there flash loan borrowing behavior? (Check Aave/Uniswap flash loan interface calls)
        2. Are borrowed funds used for price manipulation? (Examine DEX trades, oracle calls)
        3. How does the attacker profit from price deviation? (Analyze arbitrage logic)
        4. Provide detailed evidence chain (call sequences, state changes, fund flows)
        """

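        analysis = llm.generate(prompt)  # `llm` is a placeholder for the model client
        return {
            # Placeholder values: a real agent would derive the score and the
            # verdict from `analysis` instead of hard-coding them
            "confidence": 0.95,  # Score based on evidence strength
            "finding": "Flash loan attack confirmed",
            "evidence": analysis
        }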
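Because each expert reads the same forensics result and writes only its own finding, the dispatch step lends itself to a thread pool. The sketch below is one possible way to do this with the standard library. `StubExpert` and `run_experts` are hypothetical names, and the stub stands in for an LLM-backed expert.

```python
from concurrent.futures import ThreadPoolExecutor


class StubExpert:
    """Hypothetical stand-in for an LLM-backed expert agent."""

    def __init__(self, name):
        self.name = name

    def analyze(self, contracts, context):
        return {"expert": self.name, "contracts_seen": len(contracts)}


def run_experts(experts, expert_types, forensics_result):
    """Run the selected experts concurrently; return findings keyed by type."""
    with ThreadPoolExecutor(max_workers=max(1, len(expert_types))) as pool:
        futures = {
            etype: pool.submit(
                experts[etype].analyze,
                contracts=forensics_result["focus_contracts"],
                context=forensics_result["evidence_snapshot"],
            )
            for etype in expert_types
        }
        # result() blocks until that expert finishes and re-raises its errors.
        return {etype: fut.result() for etype, fut in futures.items()}


experts = {name: StubExpert(name) for name in ("flashloan", "price_oracle")}
forensics = {"focus_contracts": ["0xVictim", "0xDex"], "evidence_snapshot": {}}
print(run_experts(experts, ["flashloan", "price_oracle"], forensics))
```

Threads suit this case because LLM calls are I/O-bound. An exception in any expert propagates from `result()`, so a production orchestrator would likely catch it per expert instead of failing the whole analysis.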

Scenario: Average 10 security incidents requiring deep analysis per year

Traditional approach cost:
- Manual analysis: 10 × $1,670 = $16,700
- Time cost: 10 × 16.7h = 167 analyst hours

FaultSeeker approach cost:
- Direct cost: 10 × $4.53 = $45.30
- Time savings: 167h - 1.4h = 165.6h freed for proactive defense

Annual savings: $16,654.70 ($16,700 - $45.30) + 165.6h analyst time
Additional value: Faster response limits loss escalation (hard to quantify, but potentially millions of dollars)
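The arithmetic in this scenario can be checked with a few lines of Python. The inputs are the figures quoted above. The one interpretive assumption is that 1.4h is FaultSeeker's total time across all ten incidents, which is how the subtraction uses it. The function name `annual_savings` is introduced here for illustration.

```python
INCIDENTS_PER_YEAR = 10
MANUAL_COST_USD = 1670.0       # per incident, manual analysis
MANUAL_HOURS = 16.7            # per incident, manual analysis
FAULTSEEKER_COST_USD = 4.53    # per incident
FAULTSEEKER_TOTAL_HOURS = 1.4  # assumed to be the total for all incidents


def annual_savings(n, manual_cost, tool_cost, manual_hours, tool_total_hours):
    """Return (dollars saved, analyst hours freed) per year."""
    cost_saved = n * manual_cost - n * tool_cost
    hours_saved = n * manual_hours - tool_total_hours
    return round(cost_saved, 2), round(hours_saved, 1)


dollars, hours = annual_savings(
    INCIDENTS_PER_YEAR,
    MANUAL_COST_USD,
    FAULTSEEKER_COST_USD,
    MANUAL_HOURS,
    FAULTSEEKER_TOTAL_HOURS,
)
print(f"Saved ${dollars:,.2f} and {hours} analyst hours per year")
```

The result scales linearly with the incident count, so the same function can be reused for other yearly volumes as long as the tool's total hours are adjusted to match.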

Detected anomalies:
✓ High Gas consumption (4.8M gas, normal tx <300K)
✓ Single flash loan borrowing 2,804 ETH (from Aave)
✓ Rapid DEX transaction sequence (6 Uniswap V2 txs in same block)
✓ Abnormal price fluctuation (crETH price plunged 40% briefly)

Key contracts identified:
- 0x2db0E83599a91b508Ac268a6197b8B14F5e72840 (Attacker contract)
- 0x8C3B7a4320ba70f8239F83770c4015B5bc4e6F91 (Cream crETH)
- 0x7d2768dE32b0b80b7a3454c06BdAc94A69DDc7A9 (Aave flash loan pool)

[FlashLoan Expert] Analysis result:
- Confirmed Aave flash loan borrowing 2,804 ETH
- Repayment verification: Successfully repaid 2,804.84 ETH (0.84 ETH fee, about 0.03%)
- Attacker net profit: ~$18.5M (via Cream protocol arbitrage)
- Confidence: 0.98

[PriceOracle Expert] Analysis result:
- Cream uses Uniswap V2 TWAP price oracle
- Attacker manipulated Uniswap pool in same block (add massive liquidity → borrow → remove liquidity)
- Caused crETH price undervaluation by 40%
- Confidence: 0.96

[AccessControl Expert] Analysis result:
- Cream contract lacks limits on single large borrows
- No delayed oracle update mechanism (vulnerable to same-block manipulation)
- Confidence: 0.92

[Orchestrator] Synthesis:
Root cause: Flash loan + Uniswap price manipulation + Cream oracle design flaw
Attack path:
  1. Flash loan borrowed 2,804 ETH
  2. Injected massive ETH into Uniswap, manipulating crETH/ETH price
  3. Borrowed massive assets from Cream at undervalued price
  4. Repaid flash loan, retained arbitrage profit

Remediation recommendations:
  - Use Chainlink or other decentralized oracles
  - Implement TWAP delayed updates (at least 3 blocks)
  - Add single-borrow caps (liquidity % check)
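The TWAP recommendation rests on a simple property: a time-weighted average over a long window barely moves when the spot price is distorted for a single block. The sketch below illustrates this with a Uniswap V2-style cumulative price accumulator, where the average is the difference of two cumulative readings divided by the elapsed time. The 30-minute window, the 12-second block, and the function names are assumptions chosen for the example.

```python
def cumulative_price(observations):
    """Accumulate price * seconds between consecutive observations.

    observations: time-sorted list of (timestamp, spot_price); each price is
    assumed to hold until the next timestamp.
    Returns a list of (timestamp, cumulative_value) readings.
    """
    cumulative = 0.0
    readings = [(observations[0][0], cumulative)]
    for (t0, p0), (t1, _next_price) in zip(observations, observations[1:]):
        cumulative += p0 * (t1 - t0)
        readings.append((t1, cumulative))
    return readings


def twap(start_reading, end_reading):
    """Time-weighted average price between two cumulative readings."""
    (t0, c0), (t1, c1) = start_reading, end_reading
    return (c1 - c0) / (t1 - t0)


# Price is 1.0 for 1788 s, drops 40% for one 12 s block, then recovers.
observations = [(0, 1.0), (1788, 0.6), (1800, 1.0)]
readings = cumulative_price(observations)
window_twap = twap(readings[0], readings[-1])
print(f"spot low: 0.6, 30-minute TWAP: {window_twap:.4f}")
```

In this toy series a 40% single-block dip shifts the 30-minute average by well under 1%. The trade-off is latency: a longer window resists manipulation better but tracks genuine price moves more slowly.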

[AccessControl Expert] Key finding:
- EthCrossChainManager contract has permission verification flaw
- putCurEpochConPubKeyBytes() function lacks caller permission check
- Attacker constructed malicious cross-chain messages, replaced verification public key
- Gained arbitrary cross-chain asset transfer permission

Root cause:
function putCurEpochConPubKeyBytes(bytes memory curEpochPkBytes) public {
    // ❌ Missing onlyOwner or onlyKeeper modifier
    ConKeepersPkBytes[..] = curEpochPkBytes;
}

Remediation recommendation:
+ function putCurEpochConPubKeyBytes(...) public onlyOwner {
    ConKeepersPkBytes[..] = curEpochPkBytes;
  }

Security incident occurs
    ↓
1. Real-time monitoring system (Forta/OpenZeppelin Defender) detects anomalous tx
    ↓
2. Automatically triggers FaultSeeker analysis
    ↓ (4-8 minutes)
3. Generates preliminary fault report
    ├─ Root cause localization
    ├─ Attack path reconstruction
    ├─ Affected scope assessment
    └─ Remediation recommendations
    ↓
4. Security team validates report
    ↓
5. Execute emergency response
    ├─ Pause affected contracts
    ├─ Deploy remediation patches
    └─ User notification

Claim submitted
    ↓
FaultSeeker rapid analysis (5-8 minutes)
    ├─ Verify attack authenticity
    ├─ Confirm loss amount
    ├─ Determine if within policy coverage
    └─ Identify potential fraud (e.g., self-directed attacks)
    ↓
Claims decision recommendation → Final human approval

# Multi-layer verification mechanism
class VerificationPipeline:
    def validate_finding(self, llm_result):
        # 1. Static analysis tool cross-validation
        slither_check = run_slither(llm_result.contract)
        if not slither_check.confirms(llm_result.vulnerability):
            llm_result.confidence *= 0.7  # Reduce confidence

        # 2. Symbolic execution verification
        if llm_result.type == "reentrancy":
            mythril_result = run_mythril(llm_result.contract)
            if mythril_result.confirms_reentrancy():
                # Increase confidence, capped at 1.0 so it stays a valid score
                llm_result.confidence = min(1.0, llm_result.confidence * 1.2)

        # 3. Human review threshold
        if llm_result.confidence < 0.8:
            return "NEEDS_HUMAN_REVIEW"
        elif llm_result.confidence > 0.95:
            return "HIGH_CONFIDENCE_AUTO_APPROVE"
        else:
            return "MEDIUM_CONFIDENCE"
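To see how the multipliers and thresholds interact, it helps to trace a few findings through them. The sketch below isolates that logic in two pure functions using the same 0.7 and 1.2 factors and the same 0.8 and 0.95 thresholds as above. The function names are hypothetical, the tool results are passed in as booleans instead of being produced by Slither or Mythril, and the cap at 1.0 is an added safeguard.

```python
def adjust_confidence(confidence, slither_confirms, mythril_confirms=False):
    """Apply the cross-validation multipliers and cap the result at 1.0."""
    if not slither_confirms:
        confidence *= 0.7  # static analysis disagrees: reduce confidence
    if mythril_confirms:
        confidence *= 1.2  # symbolic execution agrees: increase confidence
    return min(confidence, 1.0)


def triage(confidence):
    """Map a confidence score to a review decision."""
    if confidence < 0.8:
        return "NEEDS_HUMAN_REVIEW"
    if confidence > 0.95:
        return "HIGH_CONFIDENCE_AUTO_APPROVE"
    return "MEDIUM_CONFIDENCE"


cases = [
    ("slither disagrees", adjust_confidence(0.90, slither_confirms=False)),
    ("both tools agree", adjust_confidence(0.90, True, mythril_confirms=True)),
    ("slither agrees only", adjust_confidence(0.85, slither_confirms=True)),
]
for label, score in cases:
    print(f"{label}: {score:.2f} -> {triage(score)}")
```

One consequence worth noting: a single Slither disagreement pushes any starting score below 0.8 after the 0.7 multiplier unless Mythril also confirms, so disagreement between the LLM and static analysis almost always routes a finding to human review.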

┌─────────────────────────────────────┐
│  FaultSeeker 2.0 Architecture Vision │
├─────────────────────────────────────┤
│                                      │
│  1. Development Phase                │
│     ├─ IDE plugins (Remix/VSCode)   │
│     ├─ Real-time security tips      │
│     └─ Historical case similarity   │
│                                      │
│  2. Pre-Deployment Review            │
│     ├─ CI/CD integration            │
│     ├─ Automated security scoring   │
│     └─ Auto-reject high-risk patterns│
│                                      │
│  3. Runtime Monitoring               │
│     ├─ Mempool tx pre-analysis      │
│     ├─ Real-time suspicious tx      │
│     │   interception (<1 sec)       │
│     └─ Firewall contract linkage    │
│                                      │
│  4. Post-Incident Forensics          │
│     └─ 4-8 minute deep analysis     │
│                                      │
└─────────────────────────────────────┘

┌─────────────────────────────────────────┐
│    DeFi Attack Knowledge Graph (DAKG)   │
├─────────────────────────────────────────┤
│                                          │
│  Entity Types:                           │
│  • Attacker addresses                    │
│  • Victim protocols                      │
│  • Vulnerability types                   │
│  • Attack techniques                     │
│  • Remediation patterns                  │
│                                          │
│  Relationship Types:                     │
│  • Attacker -[executes]-> Attack event  │
│  • Attack event -[exploits]-> Vuln type │
│  • Vuln type -[maps to]-> Fix pattern   │
│  • Attack event -[evolved from]->        │
│    Historical                            │
│                                          │
│  Applications:                           │
│  • Attacker profiling (identify          │
│    repeat offenders)                     │
│  • Attack pattern evolution analysis     │
│  • Auto-generate defense strategies      │
│  • Predict future attack trends          │
│                                          │
└─────────────────────────────────────────┘
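A graph with these entity and relationship types can be prototyped as a list of subject-relation-object triples. The sketch below is a minimal in-memory illustration, not the DAKG design itself: the identifiers, relation names such as `maps_to`, and the sample data are all invented for the example.

```python
from collections import defaultdict

# Invented sample data: (subject, relation, object)
TRIPLES = [
    ("attacker:0xA", "executes", "event:1"),
    ("attacker:0xA", "executes", "event:2"),
    ("attacker:0xB", "executes", "event:3"),
    ("event:1", "exploits", "vuln:price_oracle"),
    ("event:2", "exploits", "vuln:reentrancy"),
    ("event:3", "exploits", "vuln:price_oracle"),
    ("vuln:price_oracle", "maps_to", "fix:twap_oracle"),
    ("event:2", "evolved_from", "event:1"),
]


def build_index(triples):
    """Index triples as subject -> relation -> list of objects."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, relation, obj in triples:
        index[subject][relation].append(obj)
    return index


def repeat_offenders(index, min_events=2):
    """Attackers linked to at least `min_events` attack events."""
    return sorted(
        subject
        for subject, relations in index.items()
        if len(relations.get("executes", [])) >= min_events
    )


def fixes_for_event(index, event):
    """Follow event -[exploits]-> vuln -[maps_to]-> fix."""
    fixes = []
    for vuln in index.get(event, {}).get("exploits", []):
        fixes.extend(index.get(vuln, {}).get("maps_to", []))
    return fixes


graph = build_index(TRIPLES)
print(repeat_offenders(graph))
print(fixes_for_event(graph, "event:1"))
```

The two queries correspond to the "attacker profiling" and "auto-generate defense strategies" applications listed in the diagram. A production system would more likely sit on a graph database, but the traversal pattern is the same.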

┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│  Project A    │  │  Project B    │  │  Project C    │
│  (On-premise) │  │  (On-premise) │  │  (On-premise) │
│  ┌────────┐  │  │  ┌────────┐  │  │  ┌────────┐  │
│  │Local   │  │  │  │Local   │  │  │  │Local   │  │
│  │Model   │  │  │  │Model   │  │  │  │Model   │  │
│  └────────┘  │  │  └────────┘  │  │  └────────┘  │
│      ↓        │  │      ↓        │  │      ↓        │
│  ┌────────┐  │  │  ┌────────┐  │  │  ┌────────┐  │
│  │Gradient│  │  │  │Gradient│  │  │  │Gradient│  │
│  │Encrypt │  │  │  │Encrypt │  │  │  │Encrypt │  │
│  └────────┘  │  │  └────────┘  │  │  └────────┘  │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                  │                  │
       └──────────────────┼──────────────────┘
                          ↓
                 ┌─────────────────┐
                 │  Central         │
                 │  Aggregation     │
                 │  Server          │
                 │  (Gradients only)│
                 │  ┌───────────┐  │
                 │  │Global Model│  │
                 │  └───────────┘  │
                 └─────────────────┘
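The aggregation step in this diagram can be sketched as sample-weighted gradient averaging, in the style of federated averaging. This is an assumption about how the server might combine updates, since the diagram does not name an algorithm. The encryption layer is omitted entirely, and the function names and numbers are illustrative.

```python
def federated_average(updates):
    """Combine local gradients into one global gradient.

    updates: list of (gradient_vector, num_samples), one entry per project.
    Each project's gradient is weighted by how many samples produced it.
    """
    total_samples = sum(n for _gradient, n in updates)
    dim = len(updates[0][0])
    return [
        sum(gradient[i] * n for gradient, n in updates) / total_samples
        for i in range(dim)
    ]


def apply_update(weights, gradient, learning_rate=0.1):
    """One gradient-descent step on the global model."""
    return [w - learning_rate * g for w, g in zip(weights, gradient)]


# Two projects report gradients; raw incident data never leaves their premises.
updates = [
    ([1.0, 2.0], 100),  # Project A: 100 local samples
    ([3.0, 0.0], 300),  # Project B: 300 local samples
]
global_gradient = federated_average(updates)
global_weights = apply_update([0.0, 0.0], global_gradient)
print(global_gradient, global_weights)
```

Weighting by sample count keeps a project with few incidents from dominating the global model. In the architecture shown, the server would operate on encrypted or masked gradients, which this sketch does not attempt to model.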

┌────────────────────────────────────────────────────────┐
│    FaultSeeker + Certora + K Framework Integration     │
├────────────────────────────────────────────────────────┤
│                                                         │
│  Layer 1: FaultSeeker Heuristic Analysis               │
│  ┌──────────────────────────────────────────┐         │
│  │ • Rapid scan (4-8 min)                    │         │
│  │ • Generate vulnerability hypotheses       │         │
│  │ • Confidence scoring (0-1)                │         │
│  └──────────────────────────────────────────┘         │
│                    ↓                                    │
│  Layer 2: K Framework Static Verification (Mid-tier)   │
│  ┌──────────────────────────────────────────┐         │
│  │ • KEVM semantic analysis (5-10 min)       │         │
│  │ • Symbolic execution verification         │         │
│  │ • State space exploration                 │         │
│  └──────────────────────────────────────────┘         │
│                    ↓                                    │
│  Layer 3: Certora Formal Proof (Final tier)            │
│  ┌──────────────────────────────────────────┐         │
│  │ • SMT solver verification (30-60 min)     │         │
│  │ • Mathematical proof generation           │         │
│  │ • Counterexample construction             │         │
│  └──────────────────────────────────────────┘         │
│                                                         │
└────────────────────────────────────────────────────────┘

1. FaultSeeker rapid scan (4-8 minutes)
    ↓
2. Suspected vulnerabilities found (confidence > 0.7) → trigger K Framework
    ↓
3. K Framework KEVM analysis (5-10 minutes)
    ├─ Confirms suspicion → proceed to Certora deep verification
    ├─ Rules out risk → mark FaultSeeker false positive
    └─ Uncertain → manual review
    ↓
4. Certora formal proof (30-60 minutes)
    ├─ Proves vulnerability exists → high-priority remediation + generate PoC
    ├─ Proves vulnerability doesn't exist → update FaultSeeker model
    └─ Cannot prove → mark as complex edge case
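The escalation workflow is a small decision procedure, which can be written down directly. The sketch below encodes the branches listed above. `Verdict` and `next_action` are hypothetical names, and stopping below the 0.7 threshold is an assumption, since the workflow does not say what happens to low-confidence findings.

```python
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "confirmed"
    RULED_OUT = "ruled_out"
    UNCERTAIN = "uncertain"


TRIGGER_CONFIDENCE = 0.7  # threshold quoted in step 2


def next_action(confidence, kevm=None, certora=None):
    """Return the next step for a finding, given the verdicts known so far."""
    if confidence <= TRIGGER_CONFIDENCE:
        return "stop: below escalation threshold"  # assumed behavior
    if kevm is None:
        return "run K Framework KEVM analysis"
    if kevm is Verdict.RULED_OUT:
        return "mark FaultSeeker false positive"
    if kevm is Verdict.UNCERTAIN:
        return "manual review"
    # KEVM confirmed the suspicion, so the finding moves on to Certora.
    if certora is None:
        return "run Certora formal proof"
    if certora is Verdict.CONFIRMED:
        return "high-priority remediation + generate PoC"
    if certora is Verdict.RULED_OUT:
        return "update FaultSeeker model"
    return "mark as complex edge case"


print(next_action(0.85))
print(next_action(0.85, kevm=Verdict.CONFIRMED))
print(next_action(0.85, kevm=Verdict.CONFIRMED, certora=Verdict.CONFIRMED))
```

Keeping the function free of side effects means the same routing can be replayed as verdicts arrive, with the slowest tier invoked only for findings that survive the earlier ones.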

Human Expert Analysis Process               FaultSeeker Architecture Mapping
───────────────────────────────────────  →  ─────────────────────────────────────
1. Attention: Global scan, focal points  →  Stage 1: Tx-level forensics
2. Working Memory: Load relevant info    →  RAG + context management
3. Long-term Memory: Invoke expertise    →  Expert agents (pre-trained knowledge)
4. Reasoning: Synthesize info, conclude  →  Orchestrator coordination & synthesis

Conclusion: Flash loan attack (Confidence: 0.92)

Evidence chain:
✅ [0.98] Aave flash loan interface call confirmed (FlashLoanExpert)
✅ [0.96] Uniswap 40% abnormal price fluctuation (PriceOracleExpert)
⚠️ [0.65] No obvious reentrancy pattern found (ReentrancyExpert)
✅ [0.92] Single-block borrow-manipulate-repay completion (LogicFlawExpert)

Synthesis logic:
1. Flash loan confirmed + price manipulation → combo attack hypothesis
2. No reentrancy pattern → rule out reentrancy attack
3. Single-block execution → matches flash loan typical signature
→ Conclusion: Flash loan + price manipulation combo attack

Refutation test:
? Could this be normal arbitrage rather than attack?
  → Check profit source: Victim protocol lost $18.5M, inconsistent with normal arbitrage
  → Determination: Malicious attack

Executive Summary

Key Findings

Chapter 1: Blockchain Security Challenges and Technical Opportunities

1.1 DeFi Security Threat Landscape

Technical-Layer Threats

Analysis Bottlenecks

1.2 LLM Breakthroughs in Code Analysis

Chapter 2: FaultSeeker Technical Architecture Deep Dive

2.1 Cognitive Science-Inspired Two-Stage Design

Theoretical Foundation: COLMA Cognitive Layered Memory Architecture

Architecture Design

2.2 Technical Comparison with Existing Solutions

Technical Dimension Comparison

Detailed Performance Data Comparison (115 Real Cases)

2.3 Key Implementation Challenges

Challenge 1: Context Window Management

Challenge 2: LLM Hallucination Control

Challenge 3: Cost-Performance Balance

Chapter 3: Empirical Evaluation and Performance Analysis

3.1 Evaluation Dataset

3.2 Performance Metrics Comparison

Time Efficiency

Cost Efficiency

Accuracy Comparison

3.3 Detailed Case Analysis

Case 1: Cream Finance Flash Loan Attack (August 2021)

Case 2: Poly Network Cross-Chain Bridge Attack (August 2021)

3.4 Boundary Case Analysis

Scenarios Where FaultSeeker Excels

Scenarios Where FaultSeeker Faces Limitations

Chapter 4: Practical Application Scenarios and Deployment Strategies

4.1 DeFi Protocol Incident Response

4.2 Security Audit Firm Efficiency Enhancement

4.3 Investment Due Diligence

4.4 Insurance Industry Applications

4.5 Regulatory Compliance Support

Chapter 5: Technical Challenges and Future Evolution

5.1 Current Technical Limitations

Limitation 1: LLM Hallucination Risk

Limitation 2: Novel Attack Generalization

Limitation 3: Cross-Chain Analysis Complexity

5.2 Evolution Directions

Direction 1: Proactive Defense Capability

Direction 2: Knowledge Graph Enhancement

Direction 3: Federated Learning & Privacy Protection

Direction 4: Integration with Formal Verification

5.3 Ethical and Social Impact

Double-Edged Sword Issue

Employment Impact

Fairness Issues

Chapter 6: Industry Ecosystem and Competitive Landscape

6.1 Target Market Segmentation

6.2 Competitive Analysis

Direct Competitors

Potential Competitors

6.3 Business Models

Model 1: SaaS Subscription (Mainstream)

Model 2: Pay-per-Use

Model 3: Private Deployment

Model 4: Data Services

6.4 Go-to-Market Strategy

Stage 1: Early Adopters (0-6 months)

Stage 2: Scaling (6-18 months)

Stage 3: Ecosystem Dominance (18-36 months)

Chapter 7: Academic Contributions and Future Research Directions

7.1 Theoretical Contributions

Contribution 1: Cognitive Architecture Application in Software Security

Contribution 2: Multi-Agent Collaboration Paradigm Validation

Contribution 3: Domain-Specialized LLM Methodology

7.2 Future Research Directions

Direction 1: Self-Supervised Learning and Model Distillation

Direction 2: Explainable AI and Trust Mechanisms

Direction 3: Adversarial Testing and Robustness

Direction 4: Cross-Chain Security Analysis Theory

7.3 ASE 2025 Conference Significance

Conclusion

Feng Ning (风宁)
