🤖 LLMs won't stop hallucinating

And scaling at Slack

Hey everybody,

In today's newsletter, we’re diving into how Slack enhances safety and scalability within their Chef infrastructure, alongside Péter Szász’s insights on leading engineering teams during tough times.

Also, there’s a really interesting study on why Large Language Models will always hallucinate. Plus, I’ll be taking a look at OpenAI’s new o1 models and what sets them apart.

Quick Links

🧵 Enhancing Safety and Scalability at Slack
In a post on the Slack Engineering blog, Archie Gunasekara dives into how Slack enhances safety and scalability within its Chef infrastructure. With tens of thousands of EC2 instances hosting services like Vitess databases and Kubernetes workers, Slack relies on Chef to efficiently manage and deploy changes across these instances. The post gives a really good breakdown of the evolution of this setup and the challenges they faced along the way.

🔥 How to Lead Your Team when the House Is on Fire
Péter Szász's latest blog post outlines the three key areas for Engineering Managers: aligning delivery with company goals, building high-performing teams, and fostering individual growth. He also shares strategies for navigating these priorities during turbulent times in the tech industry.

🌳 Learn Git Branching
Looking to get started with Git? "Learn Git Branching" is the most interactive and visual tool out there. It breaks down Git concepts with fun challenges, demos, and step-by-step tutorials. Perfect for beginners who want to grasp powerful Git features while enjoying the process.

🫂 Streamline Your Entire Business with a Free CRM *
HubSpot provides a comprehensive customer relationship management platform to help you grow. With powerful features to manage leads and improve customer relationships, HubSpot’s CRM is completely free, with no restrictions on users or data, making it ideal for businesses at any stage.

🤖 LLMs will always hallucinate (research) 
A recent DataLabs study argues that hallucinations in Large Language Models are not just occasional errors but an unavoidable feature. The research suggests these hallucinations arise from the fundamental structure of LLMs and cannot be completely eliminated through improvements in architecture, datasets, or fact-checking. The study essentially argues that every step of the LLM process carries a non-zero probability of error, and those errors compound across steps, challenging the idea that AI companies can ever fully eliminate hallucinations.
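To make that compounding argument concrete, here’s a toy illustration (mine, not from the paper): if each stage of generation has an independent, non-zero chance of going wrong, the probability of a fully correct output shrinks toward zero as the number of steps grows.

```python
def p_fully_correct(per_step_error: float, num_steps: int) -> float:
    """Probability that every step succeeds, assuming independent per-step errors."""
    return (1.0 - per_step_error) ** num_steps


# Even a tiny 0.1% per-step error rate compounds over long generations.
for steps in (10, 100, 1_000, 10_000):
    print(f"{steps:>6} steps -> P(no error) = {p_fully_correct(0.001, steps):.3f}")
```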

Sponsored Content *

OpenAI o1

OpenAI unveiled its new o1 models last week, offering ChatGPT users a fresh experience with AI that can “pause to think” before responding. Internally codenamed “Strawberry,” the o1 models have been highly anticipated as OpenAI’s next step toward developing more human-like intelligence.

These models excel at writing code and solving complex, multistep problems—an area where previous iterations often struggled. However, the trade-off comes in performance, as o1 models are slower and more expensive than the widely-used GPT-4o.

According to Jerry Tworek, OpenAI’s research lead, the training behind o1 is fundamentally different from its predecessors. He mentions that the company employed a new optimization algorithm and a tailored dataset specifically designed for this model. However, OpenAI has remained vague about the precise details of these changes.

Unlike previous models, which were trained to replicate patterns from vast amounts of data, o1 relies on reinforcement learning. This technique teaches the AI to solve problems on its own by rewarding correct answers and penalizing mistakes. The system uses a “chain of thought” process, allowing it to break down complex queries step-by-step, similar to how humans reason through difficult tasks.
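As a rough mental model only (a deliberately tiny sketch, nothing like OpenAI’s actual training setup), the “reward correct answers, penalize mistakes” loop looks something like this:

```python
import random

# Hypothetical candidate answers to one fixed question, one of which is correct.
answers = ["4", "5", "22"]
correct = "4"
scores = {a: 0.0 for a in answers}  # the "policy": a preference score per answer


def pick(scores):
    # Sample an answer, favoring higher-scoring ones (weights kept positive).
    weights = [max(scores[a], 0.0) + 1.0 for a in answers]
    return random.choices(answers, weights=weights)[0]


for _ in range(1000):
    choice = pick(scores)
    reward = 1.0 if choice == correct else -1.0  # reward correct, penalize mistakes
    scores[choice] += 0.1 * reward               # nudge the preference accordingly

print(scores)  # the correct answer ends up with the highest preference score
```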

Despite its advancements, the o1 model has limitations. It is roughly four times more expensive to use than GPT-4o and lacks many of the features that made GPT-4o a standout, such as multimodal capabilities and quicker response times.

OpenAI is keeping the model’s thought process under wraps and is aggressively warning users not to probe into how it works. Since launching o1-preview and o1-mini last week, the company has threatened to ban anyone attempting to uncover the model’s inner workings.

Programming Humor

Until next week,

Travis.
