Blog

Building a Vision Transformer Model From Scratch

April 4, 2024

By Matt Nguyen. The self-attention-based transformer model was first introduced by Vaswani et al. in their 2017 paper Attention Is All You Need and has been widely used in natural language processing. A transformer model is what is used by OpenAI to...

A humanoid performing assembly. Image by the author via miramuseai.net.

The Future of Robotic Assembly

March 28, 2024

Since the introduction of mass production in 1913, assembly lines have remained mostly human; humanoids might change this. Henry Ford is known as the father of mass production, streamlining the production of his “Model T” and making cars widespread and affordable. One of the key innovations at the time...

Grasping With Common Sense using VLMs and LLMs

March 10, 2024

How to leverage large language models for robotic grasping and code generation. Grasping and manipulation remain a hard, unsolved problem in robotics. Grasping is not just about identifying where on an object to put your fingers to create sufficient constraints. Grasping is also about applying just enough force to...

A humanoid cleaning up (its own?) mess while preparing a meal. The humanoid form factor holds tremendous promise for seamless integration into existing value creation processes. Image: author via miramuseai.net

Are the Humanoids Here to Stay?

March 1, 2024

Humanoids might finally solve the “brownfield” problem that plagues robotic adaptation, and recent breakthroughs in multi-modal transformers and diffusion models might actually make it happen. Not a week goes by without a flurry of humanoid companies releasing a new update. Optimus can walk? Digit has just moved an empty tote?...

Left: Performance of the “CLIP” model on accurately providing labels for images, dramatically outperforming previous work. Image from https://arxiv.org/pdf/2103.00020.pdf. Right: Summarizing a model’s performance by a single number is only one piece of information. Once this information is actually used to make a decision, we will also need to understand the different ways the model can fail. Image: own work.

Reasoning About Uncertainty using Markov Chains

Feb. 26, 2024

Formal methods to tackle “Trial-and-Error” problems. The ability to deal with unseen objects in a zero-shot manner makes machine learning models very attractive for applications in robotics, allowing robots to enter previously unseen environments and manipulate unknown objects therein. While their accuracy in doing so is incredible compared with...

Principal Component Analysis of a random 2D point cloud using PyTorch’s built-in function. Image by the author.

Understanding Principal Component Analysis in PyTorch

Feb. 18, 2024

Built-in function vs. numerical methods. PCA is an important tool for dimensionality reduction in data science and for computing grasp poses for robotic manipulation from point cloud data. PCA can also be used directly within a larger machine learning framework, as it is differentiable. Using the two principal components of a...
