Large language models (LLMs) are ushering in a revolutionary era with their remarkable capabilities. From enhancing everyday applications to transforming complex systems, generative AI is becoming an integral part of our lives.
However, the surge in demand for AI-powered solutions exposes a critical challenge: the scarcity of computational resources required to meet the growing appetite for logic- and voice-based interfaces. This scarcity creates a pressing need for cost-efficient platforms that can support the development and deployment of LLMs.
Industrializing AI software development will require transforming the processes for developing, deploying and maintaining AI systems from a research or ad-hoc approach into a structured, systematic and scalable industrial process. By focusing on cloud cost optimization and platform engineering, businesses can foster growth, profitability, and innovation in the field of AI.
Now, in a provocative new story, the Wall Street Journal fleshes out where the cracks are starting to form. Basically, monetizing AI is hard, user interest is leveling off or declining, and running the hardware behind these products is often very expensive — meaning that while the tech does sometimes offer a substantial “wow” factor, its path to a stable business model is looking rockier than ever.
“Personalize or perish”: that is how one leading newspaper aptly summarizes the critical importance of personalization 2.0, or hyper-personalization, for businesses.
We live in an era where customers expect businesses to understand their wants and needs. Today, companies must not only meet customers’ needs but anticipate and exceed them. And for this, they must pivot to a digital-first mindset to create stronger, more authentic customer interactions.
How do they do this? Through a hyper-personalized, AI-powered business strategy where products, ads, and interactions are tailor-made for each customer or a group of customers.
What may not be obvious is that the sleeping giant, Google, has woken up, and it is iterating at a pace that will smash GPT-4’s total pre-training FLOPS by 5x before the end of the year. The path is clear to 100x by the end of next year given their current infrastructure buildout. Whether Google has the stomach to put these models out publicly without neutering their creativity or their existing business model is a different discussion.
Today we want to discuss Google’s training systems for Gemini, the iteration velocity for Gemini models, Google’s Viperfish (TPUv5) ramp, Google’s competitiveness going forward versus the other frontier labs, and a crowd we are dubbing the GPU-Poor.
Access to compute is a bimodal distribution. There are a handful of firms with 20k+ A/H100 GPUs, where individual researchers can access 100s or 1,000s of GPUs for pet projects. Chief among these are researchers at OpenAI, Google, Anthropic, Inflection, X, and Meta, who will have the highest ratios of compute resources to researchers. A few of the firms above, as well as multiple Chinese firms, will have 100k+ by the end of next year, although we are unsure of the ratio of compute to researchers in China, only the GPU volumes.
Aude Oliva is a prominent Cognitive and Computer Scientist directing the MIT Computational Perception and Cognition group at CSAIL while also leading the MIT-IBM Watson AI Lab and co-leading the MIT AI Hardware Program. With research spanning computational neuroscience, cognition, and computer vision, she pioneers the integration of human perception and machine recognition. Her contributions extend across academia, industry, and research, making her a distinguished figure at MIT.
Construction is set to break ground by the end of this year, and the company expects to move into the new space by the end of 2024. The production facility for semiconductor quartz will include a clean room and a high-purity cleaning system, and will allow the company to expand the automation component of its business that it has been capitalizing on for years.
“We knew that our customers all over the world were expanding at a rate we couldn’t keep up with,” said Scott Lingren, SXT’s managing director and U.S. chairman. “As you see all these expansions from Samsung in Taylor to Texas Instruments Inc. in the Dallas area to all over the world … we just have to keep up.”
SXT – which is headquartered in the Netherlands and owned by the privately-held Schunk Group in Germany – supplies semiconductor manufacturers around the world, like Samsung, which has had a presence in Central Texas for decades and is potentially adding to its existing Austin campus and its new site in Taylor. Other major players in the industry include Taiwan Semiconductor Manufacturing Co., which is expanding in Arizona, and Intel Corp., which is expanding to Ohio.
What happens in femtoseconds in nature can now be observed in milliseconds in the lab by scientists at the University of Sydney.
The University of Sydney is a public research university located in Sydney, New South Wales, Australia. Founded in 1850, it is the oldest university in Australia and is consistently ranked among the top universities in the world. The University of Sydney has a strong focus on research and offers a wide range of undergraduate and postgraduate programs across a variety of disciplines, including arts, business, engineering, law, medicine, and science.
How will AI affect businesses and employees? It’s the million-dollar question, and according to Harvard Business School’s Raffaella Sadun, the answer will depend on how well an organization connects the new technologies to both a broad corporate vision and individual employee growth.
One without the other is a recipe for job elimination and fewer new opportunities for all. Luckily, she points out, we are early in our AI journey, and nothing is predetermined. Smart leaders don’t need to understand every technicality of AI. But they do need to identify the best use cases for their specific business and communicate a clear strategy for reskilling their teams.
For this episode of our video series “The New World of Work”, HBR editor in chief Adi Ignatius sat down with Sadun, who wrote the HBR article, “Reskilling in the Age of AI” (https://hbr.org/2023/09/reskilling-in-the-age-of-ai), to discuss:
• How leaders should use GenAI to augment their own decision making, without entrusting it to make the actual decisions.
• Why, even in the age of AI, the top management skills will be a mixture of technical (“hard”) and social (“soft”) skills. Those who excel will comprehend their organization’s complexity while communicating a clear vision to all employees.
• How to handle change management when everyone is uncertain about the future and regular employees are especially fearful.
This interview is part of a series called “The New World of Work,” which explores how top-tier executives see the future and how their companies are trying to set themselves up for success. Each week, Adi will interview a leader on LinkedIn Live — and then share an inside look at those conversations and solicit questions for future discussions in a newsletter just for HBR subscribers. If you’re a subscriber, you can sign up for the newsletter here: https://hbr.org/my-library/preferences?movetile=newworldofwork.