FrontierNews.ai

LM Studio's New iPhone App Lets You Control a Powerful Home AI Rig From Your Phone

LM Studio has released a new iPhone app, Locally, paired with LM Link, a private encrypted connection that lets you run large AI models on your home computer and reach them from your phone over cellular or Wi-Fi. All inference and chat history stay on your own hardware. The update, which arrived in early June 2026, means a locally hosted model is no longer tied to the desk it runs on.

What Problem Does This Solve for Home AI Users?

If you own a powerful home computer with a good graphics card or Apple Silicon chip, you've probably faced a frustration: your phone can't run the large language models (LLMs) your desktop can. LLMs are AI systems trained on vast amounts of text to understand and generate human language. A 70-billion-parameter model (the parameter count is a measure of the model's scale and complexity) requires far more memory than any smartphone has. But your home rig has plenty of power sitting idle when you're away from your desk.
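To put the memory claim in rough numbers, here is a back-of-the-envelope sketch. It counts only the weights (parameters times bytes per parameter) and ignores the context cache and runtime overhead, so real usage is higher. The function name and the precisions listed are illustrative assumptions, not figures from LM Studio.

```python
def weight_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Illustrative precisions: full 16-bit weights and two common quantized sizes.
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"70B model at {label}: ~{weight_memory_gb(70, bits):.0f} GB of weights")
```

Even with aggressive 4-bit quantization, the weights of a 70-billion-parameter model come to roughly 35 GB before any working memory is counted, which is why the model has to live on the home machine.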

LM Link solves this by making your phone a thin client, meaning it acts as a remote control rather than doing the heavy computational work itself. The model weights, which are the learned parameters that make the AI work, stay loaded in your home computer's memory. Your phone sends prompts and receives responses, but all the actual thinking happens on your home machine.

How Does the Connection Stay Private and Secure?

The architecture relies on Tailscale, a mesh VPN technology built on WireGuard encryption, but you don't need to install or configure it yourself. LM Studio embeds tsnet, a userspace library version of Tailscale that runs entirely inside the app. This matters because it requires no kernel changes and no administrator rights, and it does not reroute your other internet traffic.

The connection punches through network obstacles like CGNAT (carrier-grade network address translation), corporate firewalls, and double-NAT home routers without requiring you to open ports on your router or expose anything to the public internet. The two devices find each other through Tailscale's coordination servers and then connect directly.

End-to-end encryption via WireGuard means prompts, responses, model listings, and hardware information travel only between your devices. According to the developers, neither Tailscale nor LM Studio's backend can read the contents. The only thing that touches LM Studio's servers is your device discovery list, so the two devices can find each other.

What Are the Speed and Performance Trade-offs?

Performance over LM Link has two components: network latency and token generation speed. Network latency is typically tens of milliseconds on LTE and lower on 5G, which is minimal. The real factor that determines whether the experience feels responsive is how fast your home computer generates tokens, which are the individual words or word fragments that the AI produces.

Comfortable reading speed is roughly 7 to 10 tokens per second. On a phone, where you read in shorter bursts, anything above 15 tokens per second feels essentially instant. Here's how different model sizes perform on a Mac Studio M4 Max, which has 546 gigabytes per second of memory bandwidth:

  • Qwen2.5 7B model: Generates approximately 87 tokens per second, feeling instant as text appears faster than you can read.
  • 14-billion-parameter model: Generates 40 to 50 tokens per second, still well ahead of your natural reading pace, so it feels instant.
  • Llama 3.3 70B model: Generates 20 to 28 tokens per second, comfortable and faster than reading speed with only a slight wait on the first token.
  • 70B model at long context: Drops toward 18 tokens per second, still readable but with a noticeable pause before the first word appears.

The takeaway is that even a 70-billion-parameter model at 20 to 28 tokens per second comfortably outpaces phone reading speed, so the remote experience feels good. The real bottleneck you'll hit is time to first token on large models with long prompts, where the host computer must process your context before the first word appears.
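The interplay between network latency, time to first token, and generation speed can be sketched as simple arithmetic. The thresholds below come from the reading-speed figures quoted above (about 7 tokens per second as the lower end of comfortable reading, above 15 as instant on a phone); the function names, the treatment of the 10-to-15 range as "comfortable", and the sample inputs are assumptions for illustration only.

```python
def response_time_s(output_tokens: int, tokens_per_s: float,
                    ttft_s: float = 0.0, network_rtt_s: float = 0.0) -> float:
    """Rough wall-clock time until the full reply has arrived on the phone."""
    return network_rtt_s + ttft_s + output_tokens / tokens_per_s


def feel(tokens_per_s: float) -> str:
    """Classify a generation speed against the article's reading-speed thresholds."""
    if tokens_per_s > 15:
        return "instant"
    if tokens_per_s >= 7:
        return "comfortable"
    return "slower than reading"


if __name__ == "__main__":
    # Hypothetical case: a 300-token reply from a 70B model at 24 tokens/s,
    # with a 2 s wait for the first token and a 50 ms network round trip.
    total = response_time_s(300, 24, ttft_s=2.0, network_rtt_s=0.05)
    print(f"{feel(24)}: full reply in about {total:.1f} s")
```

In this sketch the network round trip is a rounding error next to the other two terms, which matches the point above: generation speed and time to first token dominate how the session feels.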

How to Set Up LM Studio with iPhone Remote Access

  • Update your host machine: Install LM Studio 0.4.16 or later on the computer with the GPU or Apple Silicon, then sign in to your LM Studio account in the top-right corner.
  • Download a model and enable LM Link: Download at least one model you want to reach remotely (a 7-billion to 14-billion-parameter model is the sweet spot for phone use), then open LM Studio's settings and toggle LM Link on.
  • Verify the local server: Run a quick test to confirm the OpenAI-compatible endpoint is working by checking that your model appears in the local server's model list.
  • Install Locally on your phone: Download the Locally app from the App Store on your iPhone or iPad, sign in with the same LM Studio account, and enable LM Link in Locally's settings.
  • Test the remote connection: Put your phone on cellular only by turning off Wi-Fi, then send a prompt. A reply confirms that the model is running on your home rig and reachable over the cellular network with no ports open.
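For the "verify the local server" step, a minimal sketch of the check might look like the following. It assumes the local server is enabled and listening on LM Studio's usual default address (http://localhost:1234), and that the endpoint returns the standard OpenAI-style model list; adjust BASE_URL if your setup differs. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: LM Studio's local server is running on its default port.
BASE_URL = "http://localhost:1234/v1"


def parse_model_ids(payload: dict) -> list:
    """Pull model identifiers out of an OpenAI-style {"data": [{"id": ...}]} reply."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_models(base_url: str = BASE_URL, timeout: float = 5.0) -> list:
    """Ask the local server which models it currently exposes."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))


if __name__ == "__main__":
    try:
        ids = list_models()
    except OSError as err:  # covers connection refused, timeouts, URLError
        print(f"Could not reach the local server at {BASE_URL}: {err}")
    else:
        print("\n".join(ids) if ids else "Server is up but reported no models.")
```

If the model you downloaded shows up in the printed list, the host side is ready and any remaining problems are on the phone or account side.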

What Are the Current Limitations?

LM Link is currently account-gated and LM Studio-only on both ends, meaning it's a convenience layer for existing LM Studio users rather than a general remote-inference server that works with other tools. At launch, Locally is available for iPhone and iPad only; an Android version has not been announced.

The feature initially rolled out behind a request-gated waitlist, but as of June 8, 2026, the waitlist has been removed and LM Link is open to everyone. The default context length, which is the amount of conversation history the model can consider at once, was bumped to 8,000 tokens at the same time.

For those who already run LM Studio on a capable Mac or PC at home, LM Link is the cleanest remote-access setup available, with no reverse proxy, no exposed ports, and no SSH tunnel required. The trade-off is that both ends must be LM Studio, so it is not a general solution for every remote AI inference scenario.