Digital Phablet

Tencent Creates “Minecraft” Hack: 400 Screenshots Help AI Mine and Beat Levels

by Seok Chen
September 4, 2025
in AI

Most people see Minecraft simply as a highly open-ended sandbox game, but a joint research team from the Hong Kong University of Science and Technology (Guangzhou) and Tencent treats it as a digital training ground for general artificial intelligence (AI).

The team has introduced VistaWise, a framework that combines a “cross-modal knowledge graph” with lightweight visual fine-tuning. According to the team, this is the first time such an approach has been applied to open-world intelligent agents with so little data. The system uses only 471 images from the game, so the visual model can be fine-tuned on a single 24GB GPU, a task that typically demands high-end hardware and extensive datasets.

Experimental results are promising: on the “obtain diamond” task chain, VistaWise improved on the previous state-of-the-art methods by 8 percentage points, with reported success rates of over 73% across all nine sub-tasks. It also set a new record for non-API methods on this chain, with a success rate of 33%.

The joint work was recently accepted at EMNLP 2025, one of the top conferences in natural language processing.

The framework follows a “graph-retrieval-control” design, built around the idea of “one graph, two enhancements, three collaborations.” At its core is a lightweight knowledge graph that fuses text-based strategies for the open-world game with real-time visual perception; it can be updated within 20 milliseconds from a single 1080p image. A lightweight object detector (YOLOv10-L), fine-tuned on just 471 images, locates game entities precisely, and pixel-based depth estimation keeps the computational cost low.
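
To make the idea of a graph that mixes text knowledge with per-frame vision more concrete, here is a minimal Python sketch. It is not the VistaWise implementation: the `Detection` format, the `CrossModalGraph` class, and the use of bounding-box area as a rough closeness proxy are all assumptions made for illustration, and the detector itself is replaced by hand-written boxes.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    """One bounding box from an object detector (hypothetical format)."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float


@dataclass
class CrossModalGraph:
    """Text-derived edges plus visual facts about currently visible entities."""
    # Static (head, relation, tail) edges taken from game text.
    text_edges: set = field(default_factory=set)
    # Visual facts keyed by entity label, rebuilt for every frame.
    visible: dict = field(default_factory=dict)

    def add_text_fact(self, head, relation, tail):
        self.text_edges.add((head, relation, tail))

    def update_from_frame(self, detections, frame_w, frame_h, min_conf=0.5):
        """Replace the visual layer with facts from the latest frame."""
        self.visible = {}
        for det in detections:
            if det.confidence < min_conf:
                continue  # ignore low-confidence boxes
            center = ((det.x1 + det.x2) / 2, (det.y1 + det.y2) / 2)
            # Assumed proxy: a box covering more of the frame is closer.
            box_area = (det.x2 - det.x1) * (det.y2 - det.y1)
            area_ratio = box_area / (frame_w * frame_h)
            best = self.visible.get(det.label)
            if best is None or area_ratio > best["area_ratio"]:
                self.visible[det.label] = {"center": center, "area_ratio": area_ratio}

    def facts_about(self, entity):
        """Return text edges mentioning the entity and its visual fact, if seen."""
        edges = sorted(e for e in self.text_edges if entity in (e[0], e[2]))
        return edges, self.visible.get(entity)


graph = CrossModalGraph()
graph.add_text_fact("iron_pickaxe", "mines", "diamond_ore")
graph.add_text_fact("wooden_pickaxe", "mines", "stone")

# Hand-written boxes standing in for detector output on a 1920x1080 frame.
frame = [
    Detection("diamond_ore", 900, 500, 1020, 620, 0.91),
    Detection("diamond_ore", 100, 100, 130, 130, 0.80),
    Detection("stone", 0, 0, 50, 50, 0.30),
]
graph.update_from_frame(frame, 1920, 1080)
edges, seen = graph.facts_about("diamond_ore")
```

In this sketch the larger of the two `diamond_ore` boxes wins and the low-confidence `stone` box is dropped, so a downstream planner sees one text fact and one screen position for the ore.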

For task-specific information retrieval, the system uses a two-stage “Path-Searching + Entity-Matching” approach, which lets it narrow down quickly to the relevant visual and textual cues. The control layer is simple as well: it is built on PyAutoGUI and drives the keyboard and mouse directly, so the AI can click, drag, and craft without relying on external APIs such as Mineflayer.
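
One way to picture a two-stage retrieval of this kind is sketched below in Python. The article does not describe the actual algorithm, so this is only an illustration under stated assumptions: stage one is taken to be a breadth-first walk over a small dependency table, stage two a filter that keeps visible entities named on that path. The `REQUIRES` table and both function names are hypothetical.

```python
from collections import deque

# Hypothetical, simplified dependency table: item -> items it requires.
REQUIRES = {
    "diamond": ["iron_pickaxe"],
    "iron_pickaxe": ["iron_ingot", "stick"],
    "iron_ingot": ["iron_ore"],
    "stick": ["planks"],
    "planks": ["log"],
}


def search_path(requires, goal):
    """Stage 1: breadth-first walk from the goal over 'requires' edges."""
    order, seen, queue = [], {goal}, deque([goal])
    while queue:
        item = queue.popleft()
        order.append(item)
        for dep in requires.get(item, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order


def match_entities(relevant, visible):
    """Stage 2: keep only the visible entities that the task path mentions."""
    wanted = set(relevant)
    return {name: info for name, info in visible.items() if name in wanted}


path = search_path(REQUIRES, "iron_pickaxe")

# Entities currently on screen (positions are made up for the example).
visible = {
    "iron_ore": {"center": (400, 300)},
    "cow": {"center": (900, 650)},
    "log": {"center": (1200, 200)},
}
matched = match_entities(path, visible)
```

Here the cow is discarded because nothing on the path to an iron pickaxe refers to it, which is the kind of narrowing that keeps the prompt sent to the reasoning model short.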

Each decision passes through four stages: perception, retrieval, reasoning, and execution. The AI detects objects in the environment, updates the knowledge graph in real time, generates the next action with GPT-4o, and translates that instruction into commands that drive the Minecraft client. The system runs on a laptop with only 8GB of GPU memory, and training was completed on a single 24GB GPU, which reportedly cuts hardware costs by more than 87%.
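
The four stages can be sketched as a single step function in Python. This is a structural illustration only, not the team's code: the perception, retrieval, and reasoning stages are stubs (no detector or GPT-4o call is made), the action dictionary format is an assumption, and `RecordingBackend` is a hypothetical stand-in for a real input library. PyAutoGUI's `press` and `click` functions accept the same positional arguments, so the `pyautogui` module could plausibly be passed as the backend instead.

```python
class RecordingBackend:
    """Records input events instead of sending them to the operating system."""

    def __init__(self):
        self.events = []

    def press(self, key):
        self.events.append(("press", key))

    def click(self, x, y):
        self.events.append(("click", x, y))


def execute(action, backend):
    """Execution stage: translate one structured action into input events."""
    if action["type"] == "press":
        backend.press(action["key"])
    elif action["type"] == "click":
        backend.click(action["x"], action["y"])
    else:
        raise ValueError(f"unsupported action: {action['type']}")


def run_step(frame, perceive, retrieve, reason, backend):
    """Run one perception -> retrieval -> reasoning -> execution cycle."""
    detections = perceive(frame)    # 1. perception
    context = retrieve(detections)  # 2. retrieval
    action = reason(context)        # 3. reasoning (an LLM call in the article)
    execute(action, backend)        # 4. execution
    return action


# Stub stages so the loop can run without a game, a detector, or an LLM.
def fake_perceive(frame):
    return [{"label": "log", "center": (640, 360)}]


def fake_retrieve(detections):
    return {"goal": "log", "targets": detections}


def fake_reason(context):
    targets = context["targets"]
    if targets:
        x, y = targets[0]["center"]
        return {"type": "click", "x": x, "y": y}
    return {"type": "press", "key": "w"}  # nothing in view: walk forward


backend = RecordingBackend()
action = run_step(None, fake_perceive, fake_retrieve, fake_reason, backend)
```

Passing the backend in as a parameter keeps the loop testable without touching the real keyboard and mouse.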

Compared with multimodal large language models, VistaWise uses less data and fewer tokens for inference while matching or exceeding their performance. For example, the entire process of obtaining diamonds, which often demands vast datasets and high-end hardware, costs roughly $1.28 instead of $25, or about 5% of what previous methods required.
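
As a quick check of the “about 5%” figure, the ratio can be computed from the two dollar amounts quoted above. The variable names are illustrative only.

```python
previous_cost_usd = 25.00
vistawise_cost_usd = 1.28

# Share of the previous cost that the new pipeline is reported to need.
ratio = vistawise_cost_usd / previous_cost_usd
share = f"{ratio:.1%}"
saving = f"{1 - ratio:.1%}"
print(share, saving)
```

The ratio works out to 5.12%, which is consistent with the rounded figure in the article.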

The result suggests that small datasets can be enough to build efficient, cost-effective AI agents capable of complex reasoning and decision-making in open-world environments.

The team’s core researcher, Wang Hao, is an assistant professor and doctoral supervisor at HKUST (Guangzhou) who earned his Ph.D. from Nanyang Technological University in Singapore. His work spans generative AI agents and 3D reconstruction, with over 50 publications in top venues such as TPAMI, IJCV, CVPR, and NeurIPS. He has also received funding from the National Natural Science Foundation of China, the Ministry of Science and Technology, and various industry grants.

For more details, consult the full paper available at https://arxiv.org/abs/2508.18722.

Seok Chen is a mass communication graduate from the City University of Hong Kong.

© 2025 Digital Phablet
