Figure AI's Production Surge Reveals the Hidden Labor Behind Humanoid Robots

Figure AI is ramping up humanoid robot production at an exponential pace, but behind every robot lies a growing network of low-wage workers whose hand movements and household tasks are being converted into training data. CEO Brett Adcock recently shared production data showing Figure manufactured approximately 150 units in April 2026 alone, a dramatic acceleration from single-digit monthly outputs just months earlier. Yet this manufacturing milestone masks a larger, more troubling reality: the data infrastructure required to train these robots depends on a global workforce of thousands earning poverty-adjacent wages while their movements are extracted, labeled, and resold without full transparency.

How Is Figure Training Its Humanoid Robots at Scale?

Figure and competitors like Tesla are facing a critical bottleneck: they need billions of hours of real-world human motion data to train their robots, but collecting this data at scale requires an entirely new labor model. Companies like Micro1, a Palo Alto-based data collection firm, have recruited approximately 4,000 workers across 71 countries to solve this problem. Workers wear head-mounted cameras with LiDAR sensors (light detection and ranging technology that creates 3D maps) to record themselves performing household tasks like folding clothes, loading dishwashers, and opening refrigerators. Each worker submits at least 10 hours of video per week, generating over 160,000 hours of footage monthly.

The process is more invasive than it sounds. Candidates first interview with an AI agent named Zara, which assesses their suitability and requests a trial video. Upon approval, workers receive a headband mount, a recording guide, and a task checklist. The instructions require hands to remain visible at all times and movements to occur at a "natural pace," yet workers report that natural speed looks too fast on camera, forcing them to slow down deliberately until their movements resemble "sleepwalking." After submission, videos undergo dual review by AI and human reviewers, and only about half are ultimately approved. Rejection reasons include insufficient lighting, hands moving out of frame, movements that are too fast, and unauthorized objects in the background.

Why Are Workers Paid So Differently Across Countries?

The compensation structure reveals a stark geographic hierarchy. Workers in the United States can earn three times more than workers in India or Vietnam for identical tasks, such as folding clothes. This disparity exists because robot companies assume U.S. consumers will be the first to purchase humanoid robots, making data recorded in American homes more valuable than data from Chennai or Manila. A tutor in New Delhi named Arjun reported that it typically takes him an hour to brainstorm enough household tasks to fill a 15-minute recording, yet he earns only $15 per hour. That wage is competitive in cities like Nairobi or Manila, but it pales beside the billions of dollars invested in robotics companies.

The information asymmetry compounds the problem. Micro1 does not disclose its client list to workers, citing confidentiality, and workers remain unclear about how their data will be stored or whether it will be resold to third parties. Workers sign agreements and receive payment, but they occupy the bottom of an information chain with little knowledge of the full scope of their participation.

Steps to Understanding the Data Collection Pipeline

  • Video Capture: Workers in 71 countries wear head-mounted iPhones whose cameras and LiDAR sensors record household tasks, with each worker submitting at least 10 hours of footage weekly.
  • Quality Review: Submitted videos undergo dual review by AI and human reviewers, with an approximately 50% rejection rate based on lighting, hand visibility, movement speed, and background objects.
  • Annotation and Labeling: Approved videos enter a second labor phase where human annotators label action categories, object names, and motion trajectories frame by frame, creating the structured training data robots need.
  • Data Aggregation: Monthly footage from thousands of workers converges at companies in Palo Alto and San Francisco, where it is transformed into training datasets for humanoid robot AI systems.
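The quality-review step above amounts to checking each clip against a fixed rubric. As a rough sketch of that logic (every field name and threshold here is hypothetical; Micro1's actual pipeline is not public), the stated rejection reasons could be expressed as:

```python
# Hypothetical rule-based pre-screen for submitted videos, modeled on the
# rejection reasons described above. Field names and thresholds are
# illustrative assumptions, not Micro1's real system.

from dataclasses import dataclass

@dataclass
class SubmissionMetadata:
    avg_lux: float              # estimated scene brightness
    hands_visible_ratio: float  # fraction of frames with both hands in frame
    avg_hand_speed_cm_s: float  # mean hand speed across the clip
    unauthorized_objects: int   # count of flagged background objects

def review(meta: SubmissionMetadata) -> list[str]:
    """Return the list of rejection reasons; an empty list means approved."""
    reasons = []
    if meta.avg_lux < 100:
        reasons.append("insufficient lighting")
    if meta.hands_visible_ratio < 0.95:
        reasons.append("hands out of frame")
    if meta.avg_hand_speed_cm_s > 40:
        reasons.append("movements too fast")
    if meta.unauthorized_objects > 0:
        reasons.append("unauthorized objects in background")
    return reasons

clip = SubmissionMetadata(avg_lux=80, hands_visible_ratio=0.99,
                          avg_hand_speed_cm_s=25, unauthorized_objects=0)
print(review(clip))  # ['insufficient lighting']
```

A roughly 50% rejection rate under rules like these implies that workers' unpaid re-recording time is a built-in cost of the pipeline.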

Arian Sadeghi, Vice President of Micro1, acknowledged the scale of the challenge: "1.6 million hours of monthly footage is far from sufficient. We likely need billions of hours. We haven't even begun collecting data on human-to-human interactions; we're still at the most basic level of household tasks." At current collection rates, gathering billions of hours would take anywhere from decades to millennia, depending on the rate and the target.
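Sadeghi's math is worth checking. The article cites two monthly collection figures (160,000 hours earlier, and 1.6 million hours in the quote); with illustrative targets for "billions of hours," a back-of-envelope calculation shows how widely the timeline swings depending on those assumptions:

```python
# Back-of-envelope: years needed to collect a target number of hours at a
# fixed monthly rate. Both rates appear in the article; the targets are
# illustrative readings of "billions of hours".

def years_to_collect(target_hours: float, hours_per_month: float) -> float:
    return target_hours / (hours_per_month * 12)

for rate in (160_000, 1_600_000):   # hours collected per month
    for target in (1e9, 10e9):      # one billion vs. ten billion hours
        print(f"{rate:>9,} h/mo, target {target:.0e} h: "
              f"{years_to_collect(target, rate):,.0f} years")
```

Even under the most generous reading (1.6 million hours per month chasing a single billion hours), the gap is more than fifty years of continuous collection.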

What Is "Data Colonialism" and Why Does It Matter?

Anthropologist Mary Gray and computer scientist Siddharth Suri described this phenomenon in their 2019 book "Ghost Work": the human labor that makes AI systems appear intelligent yet never appears in product descriptions. When Gray asked engineers who was performing this work, responses included "I'm not sure" and "I don't dare to check." The labor has since evolved. Ghost work once happened primarily in front of screens, through clicking, labeling, and reviewing. Now the human body itself, from the gesture of folding clothes to the rhythm of cooking to the motion of opening a refrigerator, has become raw material that can be collected, priced, and resold.

Scholars Nick Couldry and Ulises Mejias proposed the framework of "data colonialism" to describe this dynamic, arguing that tech companies' appropriation of data structurally continues colonialism's historical extraction of land and resources, turning human daily life itself into raw material for capital. In Micro1's case, the $15 per hour some workers earn may be competitive in their regions, but it is a sliver of the billions invested in robotics companies and of the long-term value of the data being extracted.

Despite the isolation inherent in this work, Gray discovered something that impressed her: workers often spontaneously find one another and form informal mutual support networks, because the work itself provides almost no support. Isolation is the default state of this kind of labor, making peer connection a form of resistance and survival.

How Does Figure's Production Ramp-Up Connect to This Labor Model?

Figure's exponential production growth depends directly on the data collection infrastructure that Micro1 and similar companies are building. The company's BotQ facility is designed to produce up to 12,000 units per year, and if Figure keeps doubling monthly output, its demand for training data will grow just as fast. The Figure 03 platform, with upgraded head and hand-mounted cameras and fingertip tactile sensors that can detect forces as light as 3 grams, requires massive amounts of real-world motion data to train its AI systems.
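The arithmetic behind that tension is simple: BotQ's stated 12,000 units per year works out to about 1,000 per month, and a run rate that doubles monthly from roughly 150 units crosses that ceiling within a quarter. A minimal sketch:

```python
# Illustrative: months until a monthly-doubling production run exceeds
# BotQ's stated capacity of 12,000 units/year (~1,000 units/month).
# The starting figure of ~150 units/month comes from the article.

capacity_per_month = 12_000 / 12  # ~1,000 units/month
units, month = 150, 0
while units <= capacity_per_month:
    units *= 2
    month += 1
print(month, units)  # 3 months, reaching 1,200 units/month
```

In other words, the doubling trajectory is bounded by the factory within months, which is why capacity, not demand, is the constraint the company keeps citing.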

This hardware maturation is a prerequisite for Figure's broader "Software 2.0" ambitions. Training the Helix 02 architecture, which computes torque directly from pixels, requires a large fleet of identical machines gathering real-world data, and an influx of 150 or more robots per month would provide the "compute-in-the-loop" needed to pursue room-scale autonomy. Yet each robot in the field generates more demand for training data, creating a feedback loop that accelerates the need for workers in developing countries to record their household tasks.

The global humanoid robot market is projected to reach $4.23 billion in 2026, with mass production plans by companies such as Tesla driving global cumulative installations past 100,000 units by 2027. These robots are likely to enter factories and homes to take over physical labor, and the data used to train them comes from people who currently rely on physical labor to make a living. This creates a paradox: the workers whose movements train robots to replace human labor are themselves economically vulnerable and dependent on the wages this work provides.

Philosopher Michael Polanyi wrote in 1958 about "tacit knowledge," the vast store of knowledge humans possess that cannot be written down but is instead embodied in actions, perceptions, and intuition. Riding a bicycle is the classic example: you know how to keep your balance, but you cannot write down a set of rules that would teach someone else. What companies like Figure are attempting is to extract this tacit knowledge from the human body and convert it into data that machines can process. The camera on a worker's forehead captures not just the motion of folding clothes, but also how the fingers sense the weight of the fabric, how the wrist flips at just the right moment, and how the gaze tracks the fabric's edge throughout the fold.

As Figure scales production and other robotics companies follow suit, the demand for this invisible labor will only intensify. The workers recording these videos remain largely unaware of the full commercial value of their contributions or the long-term implications of their participation in training systems designed to automate their own forms of work.

" }