While few people noticed at the time, by controlling the whole stack, Amazon Web Services (AWS) has outpaced competitors like AMD and Nvidia in delivering innovative solutions.
Amazon designed CPUs, AI accelerators, servers, and data centres as a vertically integrated operation.
Meanwhile, AMD and Nvidia did their best, trying to control the full stack by acquiring server, software, and interconnect companies, but AWS was already ahead of them.
According to IEEE Spectrum's interview with Ali Saidi, technical lead for the AWS Graviton series of CPUs, and Rami Sinno, director of engineering at Annapurna Labs, Amazon built its lead by focusing on vertically integrated design and on Amazon's scale.
Sinno said he had been working at Arm and was looking for his next adventure, considering where the industry was heading and what he wanted his legacy to be.
"I looked at two things: One is vertically integrated companies, because this is where most of the innovation is; the interesting stuff is happening when you control the full hardware and software stack and deliver directly to customers."
"And the second thing was machine learning, AI in general, would be big. I didn't know exactly which direction it would take, but I knew that something would be generational, and I wanted to be part of that. I already had that experience when I was part of the group building the chips that go into the Blackberries; that was a fundamental shift in the industry. That feeling was incredible, to be part of something so big, so fundamental. And I thought, 'Okay, I have another chance to be part of something fundamental.'"
Sinno emphasises the importance of delivering complete servers in the data centre directly to customers: "If you think from that perspective, you'll be able to optimise and innovate across the full stack. It might not be at the transistor level or at the substrate level or at the board level. It could be something completely different. It could be purely software. And having that knowledge, having that visibility, will allow the engineers to be significantly more productive and deliver to the customer significantly faster. We're not going to bang our head against the wall to optimise the transistor where three lines of code downstream will solve these problems, right...?"
Saidi said that chip design involves many competing optimisation points. Designers face conflicting requirements: cost, scheduling, power consumption, size, which DRAM technologies are available, and when the design will intersect them.
"It ends up being this fun, multifaceted optimisation problem to figure out the best thing you can build in a timeframe. And you need to get it right."