IBM throws its Red Hat into open source AI ring with RHEL AI
IBM and Red Hat open source their first LLMs, but IT experts say RHEL AI is more likely to stand out in the ways it links AI to hybrid cloud infrastructure.
DENVER -- Red Hat executives proclaimed that open source AI is too difficult for most companies to contribute to and incorporate into their specific applications. RHEL AI, a new open source AI platform, is a bid to change that.
Red Hat Enterprise Linux AI includes Red Hat's RHEL operating system packaged as a bootable container image using the bootc Linux utility, which makes it portable across infrastructures. RHEL AI also folds in a newly open sourced series of IBM Granite large language models, a subset of the LLMs that underpin IBM's Watsonx and Red Hat's Ansible Lightspeed coding and chat assistants, along with InstructLab AI alignment tools. InstructLab, also released to open source under an Apache license by IBM Research and Red Hat this week, lets users fine-tune pre-trained Granite models using a knowledge and skills taxonomy that generates a synthetic data set.
"You can now teach a foundation model a new skill … with five examples that before might have taken 5,000," said Red Hat President and CEO Matt Hicks, during a keynote presentation to kick off Red Hat Summit 2024 this week. "With the ability to teach smaller models the skills relevant to your use case, everything gets better -- training costs are lower, inference costs are lower, deployment options expand."
These updates represent a shift in stance from a year ago, when Hicks said at Red Hat Summit 2023 that Red Hat did not plan to get into AI models -- a shift acknowledged at a press session here this week.
"The context was quite different," Red Hat CTO Chris Wright said of last year's remarks. "The industry was really rallied around proprietary models, and we don't deliver proprietary solutions like that. The transition over the last year has been more openness in that model space."
Red Hat still isn't developing its own open source AI models, but it is supporting IBM Research projects along with other open source AI development and deployment tools, Wright said.
Red Hat President and CEO Matt Hicks presents RHEL AI at Tuesday's Red Hat Summit keynote.
RHEL AI takes on open source AI issues
RHEL AI aims to address some of the common problems with open source AI as it has emerged alongside proprietary LLMs such as OpenAI's GPT. While open source models offer richer opportunities for collaboration, the data sets used to train them are often not available along with the model's source code. With this week's update, IBM Research is also releasing Granite Code Instruct models and disclosing the data sets used to train them, which include metadata from IBM's CodeNet, according to a company blog post.
Training custom open source AI on internal infrastructure also typically requires massive resources that most mainstream companies can't afford or manage. Cloud providers have filled in this gap so far with hosted LLM services, although early adopters have had to exercise caution to avoid cost overruns. RHEL AI, by contrast, targets large organizations that have AI workloads at the edge, on premises and in multiple clouds, offering them portable RHEL container images and OpenShift hybrid cloud automation tools. The open source InstructLab, meanwhile, is meant to make fine-tuning AI models more accessible to the masses by requiring fewer data inputs than hosted LLMs do.
"InstructLab being open source is unique," said Rob Strechay, lead analyst at enterprise tech media company TheCube, in an interview with TechTarget Editorial this week. "It's hard to get simulated data to train on -- InstructLab can connect the dots between data science tooling and data. Organizations want that easy button."
Red Hat officials said they believe that open source AI can also overcome some common problems with LLMs in general, such as meeting standards of objectivity and appropriateness of results through community collaboration. As Microsoft and GitHub do with their proprietary LLMs and associated services, Red Hat will keep users' data sets private and indemnify users of open source Granite LLMs.
RHEL AI is in developer preview, and it remains unclear at this early stage how far IBM and Red Hat plan to take indemnification. Asked whether the company will indemnify users of Granite open source AI tools against prompt injection attacks, Red Hat officials said they'd follow IBM's Watsonx indemnification policies; these are described as protections against copyright and IP infringement in a September 2023 IBM press release.
This could be problematic with AI training tools out in the open, said one industry analyst during this week's press session.
"From the stage today, there was this talk about aggregating a whole bunch of information from patients, from customers, in order to train models," said Bret Ellis, an analyst at Forrester Research, during the session. "So what happens when you have this aggregation point as a target for a cyber gang, and they can find ways to push the model to disclose information -- how are you putting guardrails around that, specifically?"
The industry as a whole, whether proprietary or open source, is still figuring out the answers to that question, Wright said.
"The way we've done work in Linux with a defense-in-depth model, providing mandatory access controls all the way down the operating system with SELinux or ACS, integrating StackRox directly into Kubernetes … will play out here," he said. "It's just that the tools are less well understood at this point."
Dr. Rudolph Pienaar (left) and Dr. Ellen Grant of Boston Children's Hospital discuss the role of open source AI in their work during the Red Hat Summit keynote.
RHEL AI images link AI to IT infrastructure
While Red Hat laid out a vision for a future of collaborative open source AI development, IT analysts said RHEL AI is most likely to stand out because of its image mode deployment mechanism, which will link AI runtimes to its popular operating system and make them more easily deployable on hybrid cloud infrastructure.
RHEL AI could offer a more familiar alternative to still-developing server-side WebAssembly tools to add hybrid cloud portability to AI apps, said Torsten Volk, an analyst at Enterprise Management Associates, in an interview.
"The models don't really differentiate [vendors] all that much," Volk said. "What differentiates them is everything around the model that gets the data scientists, developers, platform and data engineers to collaboratively create, deploy, manage, monitor, refine and share between organizations, because that's what's not happening right now."
This week, Red Hat also demonstrated an extensive set of integrated generative AI application development and infrastructure management tools that went well beyond RHEL, including updates to OpenShift AI, Podman and VS Code IDE integrations, along with expanded partnerships with chip vendors Intel, AMD and Nvidia.
Tuesday's keynote presentation also showcased customers of those products, Adobe and Boston Children's Hospital, who talked about the importance of multi-cloud and hybrid cloud management using OpenShift as the basis for distributed LLM projects.
One analyst in the keynote audience said OpenShift will play a key role in the IT market as enterprise generative AI adoption grows.
"I see a significant percentage of AI workloads being deployed on premises when they reach production for data privacy and sovereignty reasons," said Steven Dickens, an analyst at the Futurum Group, in an interview.
RHEL AI and other open source AI tools will be equally important for collaboration and data sharing among medical institutions with varying levels of financial resources, said Dr. Rudolph Pienaar, staff scientist at Boston Children's Hospital, during his keynote presentation.
"Open source means a community can analyze and vet and test and build trust," Pienaar said, citing the work Boston Children's is doing with the Massachusetts Open Cloud OpenShift Service and ChRIS, an open source project for analyzing radiology images to foster such collaboration. "Open source levels the playing field -- because the tools we use at Boston Children's run on OpenShift and because ChRIS is open source, a hospital anywhere can … use the same computing algorithms we use to help kids wherever they are."
Beth Pariseau, senior news writer for TechTarget Editorial, is an award-winning veteran of IT journalism covering DevOps. Have a tip? Email her or reach out @PariseauTT.