The device resembles a small power bank and can run large language models (LLMs) with up to 120 billion parameters locally, without connecting to the cloud or relying on traditional graphics processing units. (Parameters are the "brain" of the model: in general, the more a model has, the more capable it is of solving complex problems.)
Through this innovation, Tiiny AI aims to enable individuals to use advanced artificial intelligence without relying on cloud data centers, while providing processing power comparable to that of supercomputers, and reducing energy consumption and privacy risks.
The device weighs less than a pound and brings server-scale AI to the palm of your hand.
Launched on December 10, the AI Pocket Lab is a practical solution to address the sustainability challenges and high energy costs associated with traditional AI infrastructure.
In a press release, Summer Pogue, Marketing and Sales Director at Tiiny AI, said: "Cloud AI has made great strides, but it has created challenges in reliability, security, and sustainability. With Tiiny AI Pocket Lab, we believe that AI should be accessible to individuals, making advanced AI private, personal, and available on each person's own device."
The device is designed to meet all personal AI usage needs, and is beneficial to creators, developers, researchers, and students alike.
The device enables multi-step reasoning, deep context understanding, agent workflow management, content creation, and secure handling of sensitive data, all without requiring an internet connection. It also stores user data and documents locally with bank-grade encryption, providing privacy and long-term storage that surpasses cloud systems.
The device supports models ranging from 10 billion to 100 billion parameters, and can handle models of up to 120 billion parameters, providing GPT-4-class performance for multi-step reasoning and analysis while keeping data on the device.
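To put those parameter counts in context, here is a rough back-of-envelope estimate of the memory a model's weights occupy at different precisions (a sketch for illustration only; the quantization levels shown are common industry choices, not Tiiny AI's published specifications):

```python
def model_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight-storage footprint of an LLM.

    params_billions: parameter count in billions (e.g. 120 for a 120B model)
    bits_per_weight: precision after quantization (16 = fp16, 4 = int4)
    """
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal gigabytes

# A 120B-parameter model at 16-bit precision needs ~240 GB just for
# weights, while 4-bit quantization cuts that to ~60 GB -- which is why
# aggressive quantization is essential for running such models on a
# pocket-sized device.
print(model_memory_gb(120, 16))  # 240.0
print(model_memory_gb(120, 4))   # 60.0
```

This counts only the weights; activations, the key-value cache, and runtime overhead add more on top.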
The device is powered by a 65 W, 12-core ARMv9.2 processor, delivering strong performance with a low carbon footprint compared with traditional systems.
The device relies on two core technologies:

TurboSparse: Improves efficiency by activating only the neurons needed for a given input, without degrading model quality.

PowerInfer: An open-source inference engine that distributes AI workloads between the CPU and NPU to reduce power consumption while maintaining performance.
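The core idea behind TurboSparse, computing only the neurons a given input actually needs, can be illustrated with a toy feed-forward layer (a minimal sketch of activation sparsity in general, not TurboSparse's or Tiiny AI's actual implementation; all names here are illustrative):

```python
def sparse_ffn(x, w_in, w_out, threshold=0.0):
    """Toy feed-forward layer that skips inactive neurons.

    Dense equivalent: h = relu(x @ w_in); y = h @ w_out.
    Here we compute each neuron's pre-activation first and only
    propagate through w_out for neurons whose ReLU output is nonzero;
    inactive neurons contribute nothing, so their work is skipped.
    """
    n_out = len(w_out[0])
    y = [0.0] * n_out
    n_active = 0
    for j, col in enumerate(zip(*w_in)):  # column j = neuron j's input weights
        pre = sum(xi * wi for xi, wi in zip(x, col))
        if pre > threshold:               # ReLU gate: neuron fires
            n_active += 1
            for k in range(n_out):
                y[k] += pre * w_out[j][k]
    return y, n_active

# Only neuron 0 fires for this input; neuron 1 is skipped entirely.
y, n_active = sparse_ffn([1.0, -1.0],
                         [[1.0, 0.5], [-2.0, 1.0]],  # w_in: 2 inputs x 2 neurons
                         [[1.0], [1.0]])             # w_out: 2 neurons x 1 output
print(y, n_active)  # [3.0] 1
```

Because ReLU zeroes out negative pre-activations, a sparse runtime can skip the output-projection work for every neuron that does not fire; engines like PowerInfer extend this by predicting which neurons will be inactive and scheduling the remaining work across CPU and NPU.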
The device runs an open-source system and supports installing popular open models such as Llama, DeepSeek, and GPT-OSS.
Users will also receive regular updates and over-the-air upgrades, with these features launching at CES in January 2026.
