Earlier this year, Mainframe released fullmoon, a free and open-source app that lets users run large language models such as Llama and DeepSeek entirely on their device. It launched on January 13, 2025, the date of a full moon and a playful nod to the app's name (fuLLMoon).
While many AI chat tools depend on constant internet connectivity, fullmoon runs its models locally on Apple silicon, which shortens response times and keeps conversation data on the device with no third-party servers involved. The app works fully offline across iPhone, iPad, Mac, and Vision Pro, and requires iOS 17.6, iPadOS 17.6, macOS 14.0, or visionOS 2.0 or later. For privacy-conscious users, conversations remain entirely personal, and the offline capability also makes the app useful in low-connectivity environments or for anyone who prefers a self-contained workflow.
The app supports several models, including Llama-3.2-1B-Instruct-4bit, Llama-3.2-3B-Instruct-4bit, and DeepSeek-R1-Distill-Qwen-1.5B in both 4-bit and 8-bit versions, all optimized for Apple’s M-series chips to balance performance with efficiency. Users can further tailor their experience through customizable themes, fonts, and system prompts, and Apple Shortcuts integration lets them embed AI functionality into automated tasks.
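For readers curious about what this kind of local, quantized inference involves, the sketch below uses the mlx-lm Python package, a separate general-purpose tool for running MLX models on Apple silicon, rather than fullmoon's own Swift code. The Hugging Face model identifier and the exact API shown are assumptions about that package, not a description of the app's internals; treat it as an illustration of running a 4-bit Llama 3.2 model entirely on-device.

```python
# Illustrative sketch only: mlx-lm is a separate open-source tool for running
# quantized models on Apple silicon; fullmoon's internal implementation may differ.
# Assumes `pip install mlx-lm` on an Apple silicon Mac.
from mlx_lm import load, generate

# Assumed model identifier: a 4-bit quantized Llama 3.2 1B Instruct build
# published on Hugging Face. It is downloaded once, then cached locally.
model, tokenizer = load("mlx-community/Llama-3.2-1B-Instruct-4bit")

# Build a chat-style prompt and run generation entirely on-device;
# no network call is made at inference time.
messages = [{"role": "user", "content": "Explain 4-bit quantization in one sentence."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
print(response)
```

The same pattern would apply to the other models listed above; only the model identifier changes, with the 8-bit variants trading some memory efficiency for slightly higher fidelity.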
Fullmoon began as a TestFlight beta, where over 3,500 participants helped shape the final release. Community feedback has been integral to its development, and the open-source nature of the project—hosted on GitHub—ensures that anyone can inspect, modify, or contribute to the code. This transparency has helped it gain traction among both developers and everyday users looking for trustworthy AI tools.
Since its launch, fullmoon has earned positive reviews for its speed, reliability, and user-friendly interface. The ability to use advanced models without internet access has been particularly appreciated by travelers, field professionals, and those seeking private AI experimentation. Even with more capable models, performance remains smooth, highlighting the efficiency of running LLMs directly on Apple hardware.
Rather than being just another chatbot app, fullmoon represents a shift toward personal AI: powerful, private, and locally controlled. It demonstrates that running advanced language models on consumer devices is not only possible but practical, paving the way for more accessible and privacy-focused AI experiences.