Radar Trends to Watch: May 2023

Developments in Programming, Security, Web, and More

May 2, 2023

Large language models continue to colonize the technology landscape. They’ve broken out of the AI category, and now are showing up in security, programming, and even the web. That’s a natural progression, and not something we should be afraid of: they’re not coming for our jobs. But they are remaking the technology industry.

One part of this remaking is the proliferation of “small” large language models. We’ve noted the appearance of llama.cpp, Alpaca, Vicuna, Dolly 2.0, Koala, and a few others. But that’s just the tip of the iceberg. Small LLMs are appearing every day, and some will even run in a web browser. This trend promises to be even more important than the rise of the “large” LLMs, like GPT-4. Only a few organizations can build, train, and run the large LLMs. But almost anyone can train a small LLM that will run on a well-equipped laptop or desktop.

AI

NVIDIA has announced NeMo Guardrails, a product whose purpose is to keep large language models operating safely. It prevents an LLM from straying off-topic and from answering questions that it is not allowed to answer, checks facts (using other LLMs), and allows the LLM to access only third-party applications known to be safe.
QuiLLMan is an open source voice chat. It uses the Vicuna-13B model, with OpenAI Whisper to transcribe the user’s audio, and Metavoice Tortoise to convert the response back to spoken audio.
The RedPajama project intends to create a fully open source large language model. The first step in this process is the release of a 1.2 trillion token dataset for training.
AI does fashion: Researchers (in Italy, where else?) have developed a Multimodal Garment Designer that uses diffusion models to create realistic images of humans wearing clothes described in prompts.
We talk casually about prompt engineering; Mitchell Hashimoto (founder of Hashicorp) discusses what it means for prompt engineering to be a real engineering discipline.
WasmGPT provides yet another way to run a ChatGPT-like AI chatbot in the browser, this time with WebAssembly. It uses a version of the Cerebras-GPT-1.3B model. Although it is very prone to hallucination, it demonstrates what can be done with WASM and without exotic hardware.
Stability.ai, the creator of Stable Diffusion, has just announced a new large language model, StableLM. The model is open source, and can be used in commercial applications. It was trained with a new dataset, based on The Pile but much larger.
LLaVA (Large Language and Vision Assistant) is a new multimodal language model that allows you to upload images and ask questions about them.
Just as there are techniques for training specialized LLMs, it’s possible to train specialized diffusion models for image generation. Dreambooth is one practical technique for personalizing diffusion models.
GPT-4’s image capabilities are still disabled. A research group has created MiniGPT-4, which allows users to upload and chat about images. It is based on Vicuna, so it can (probably) run on a well-equipped laptop or desktop.
Web LLM is a project that runs the Vicuna 7B large language model entirely in the Chrome browser, using WebGPU (available in the current Chrome beta). Its performance is surprisingly good.
AWS has released its own large language model called Titan, plus a new service for training and deploying LLMs called Bedrock. Their goal is to help users develop their own chatbots, which will presumably run on AWS.
What’s beyond ChatGPT? AutoGPT creates ChatGPT agents that execute tasks for the user without intervention. These tasks typically include additional ChatGPT requests, with automatically generated prompts.
Databricks has released Dolly 2.0, a 12B parameter model that is entirely open source and has been trained with data that is independent of the GPT models (unlike Alpaca and other small LLMs). The model and its training data are available on GitHub and HuggingFace.
One of GPT-4’s plugins is a sandbox that allows it to run Python programs. GPT-3.5 and GPT-4 have frequently written programs, but could only “guess” at their output. The sandbox could be a big step forward in GPT-4’s accuracy, at least for programming tasks.
Alibaba has announced that it will roll out a ChatGPT-like bot, named Tongyi Qianwen. It plans to integrate the bot into all of its products, starting with Alibaba’s workplace messaging app.
Facebook has developed SAM, a universal segmentation model that can detect and mark all of the individual objects in an image. Natural language prompts specify which objects in an image you want to isolate.
Generative agents use large language models and other generative AI tools to simulate human behavior. In a simulation which was prompted only by a suggestion that the agents throw a party, they planned, sent invitations, made acquaintances, and executed many other human behaviors.
We are experiencing a proliferation of small large language models. They are based on Meta’s LLaMA, have 6B to 13B parameters, can run on a well-equipped laptop or desktop with a GPU, and have had additional training based on prompt/response pairs from ChatGPT. The latest are Vicuna and Koala; there will no doubt be others.
The use of ChatGPT has been banned in Italy because of privacy issues. (The ban was lifted at the end of April after OpenAI addressed issues raised by the regulators.) It’s likely that Germany will follow, and possibly other European nations.
On at least three occasions, Samsung employees have inadvertently disclosed technology secrets by using ChatGPT. Their prompts and ChatGPT’s responses were incorporated into ChatGPT’s language model, from which they leaked to the outside world.
Google has enabled Bard’s code generation capabilities. It has also added arithmetic and logic capabilities, making Bard less likely to make mistakes in simple arithmetic and logic.
Researchers have created a new AI architecture that combines neural networks with symbolic models in a way that overcomes the limitations of both.
The generative art application Midjourney appears to have temporarily suspended its free trial accounts program in response to deep fakes that have been generated on the platform. Free trials have been suspended until the next “improvement to the system” has been deployed.
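The AutoGPT item above describes agents that carry out a task through repeated model requests with automatically generated prompts. A minimal sketch of that loop follows. It is not AutoGPT's actual code: `run_agent`, the prompt wording, the `DONE` convention, and `fake_llm` (a canned stand-in for a real chat-model call) are all assumptions made for illustration.

```python
from typing import Callable, List


def run_agent(goal: str, llm: Callable[[str], str], max_steps: int = 5) -> List[str]:
    """Ask the model for the next step until it answers DONE (assumed convention)."""
    history: List[str] = []
    for _ in range(max_steps):
        # The prompt is generated automatically from the goal and earlier results;
        # the user does not intervene between requests.
        prompt = f"Goal: {goal}\nCompleted so far: {history}\nNext step (or DONE):"
        step = llm(prompt).strip()
        if step == "DONE":
            break
        history.append(step)
    return history


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-model call; returns canned steps in order."""
    plan = ["search for venues", "draft invitation", "send invitation"]
    completed = sum(1 for step in plan if step in prompt)
    return plan[completed] if completed < len(plan) else "DONE"


steps = run_agent("plan a party", fake_llm)
print(steps)
```

The `max_steps` cap is a deliberate choice in this sketch: an agent that generates its own prompts needs some bound so that it cannot loop indefinitely.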

Programming

Pushup is a new web framework for Go. It is an “opinionated” template-based framework in the style of Ruby on Rails or Django. Ignore the ill-informed Java bashing; the framework looks like it’s worth investigating.
Docs-as-Code: Etsy has built tools to make the development of documentation as rigorous and maintainable as the development of code, integrating documentation into their development and deployment pipelines.
AWS has opened up CodeWhisperer, a competitor to GitHub Copilot, for general use. It is free for personal use.
According to a survey, Kubernetes deployments are trending towards “Managed Kubernetes,” in which responsibility for running Kubernetes is delegated to another company, typically a cloud vendor.
FerretDB is a new open source database that’s an alternative to MongoDB. Because MongoDB uses the Server Side Public License (SSPL), it can no longer be considered open source.
A new database, NAM-DB, demonstrates that distributed transactions can scale.
Flyte is an open source container orchestration platform that has been designed specifically for data science workloads. It is based on Kubernetes.

Security

An important report highlights the security risks of AI systems. AI has all the vulnerabilities of traditional software, in addition to its own; and while it isn’t yet an attack vector of choice, attacks have been seen in the wild, and will no doubt proliferate as AI is deployed more widely.
There are many ways to get cryptography wrong, and the problems are a lot more subtle than “don’t implement cryptographic algorithms yourself.” Here’s a post on Cryptographic Best Practices that shows how to get it right.
eBPF (extended Berkeley Packet Filter) is a powerful tool for detecting attacks and threats against containers; it is usable in situations where traditional security monitoring doesn’t work.
A new prompt injection attack allows an attacker to steal chat data by tricking the user into copying and pasting a prompt into ChatGPT.
SAP has created a Risk Explorer that can help users evaluate the risks in their software supply chains. It’s a hierarchy of known attacks, with explanations, that can be explored through a graphical interface.
PassGAN is an AI-based password cracking tool. Despite fear-mongering hype, it is not better than brute force methods. More important, its developers are recommending that users change their passwords every 3 to 6 months, a change that makes sites more vulnerable, and that goes against recommendations from NIST, the FTC, Microsoft, and others.
An attack that works against most modern cars hijacks the CAN bus (Controller Area Network), which connects all of a car’s systems. It requires some vandalism; on a locked car, the easiest way to access the CAN bus is through the headlights. The attack has been seen in the wild.
Workload Security Rings are a new approach to isolating workloads based on their security requirements while minimizing compromises to efficiency. Workloads fall into one of three classes: sensitive, hardened, and trusted.
The FBI has shut down Genesis Market, an online store for stolen data and malware.
The creators of large language models are not keeping up with the attacks against them. Security is, as they say, a “hard problem”; but with the models already in widespread use, LLM-based fraud won’t be far behind.
A research project at CMU installed hundreds of networked sensors, including microphones, throughout a new CS department building. This installation has created a significant controversy about the meaning and future of privacy.
Fake Ransomware sounds like an April Fool’s joke, but it’s real. Some threat actors threaten to sell or reveal stolen data, without having actually obtained the data. It’s a weird kind of phishing, and surprisingly effective.
A large set of leaked documents describes Russia’s far-reaching cyberwarfare efforts.
Security Copilot is a chat assistant to help IT staff with incident response. It is based on GPT-4, with an additional model integrating data from Microsoft’s knowledge of security incidents.
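The Workload Security Rings item above names three classes: sensitive, hardened, and trusted. One way to picture isolation by class is the sketch below. The rule that workloads share a machine only within the same ring is an assumption made for illustration, not a description of the published design, and `Ring`, `Workload`, `can_share_host`, and `place` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List


class Ring(Enum):
    """The three classes named in the item above."""
    SENSITIVE = "sensitive"
    HARDENED = "hardened"
    TRUSTED = "trusted"


@dataclass(frozen=True)
class Workload:
    name: str
    ring: Ring


def can_share_host(a: Workload, b: Workload) -> bool:
    """Assumed policy: two workloads may share a machine only within one ring."""
    return a.ring is b.ring


def place(workloads: Iterable[Workload]) -> Dict[Ring, List[str]]:
    """Group workload names into one host pool per ring."""
    pools: Dict[Ring, List[str]] = {ring: [] for ring in Ring}
    for workload in workloads:
        pools[workload.ring].append(workload.name)
    return pools


jobs = [
    Workload("key-service", Ring.SENSITIVE),
    Workload("web-frontend", Ring.HARDENED),
    Workload("batch-report", Ring.TRUSTED),
    Workload("auth-proxy", Ring.SENSITIVE),
]
pools = place(jobs)
print(pools[Ring.SENSITIVE])
```

Pooling by ring limits isolation boundaries to three, which is one plausible reading of "minimizing compromises to efficiency": workloads inside a pool can still be packed densely.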

Web

Consent-O-Matic is a browser plugin that automatically fills in annoying cookie popups in a way that maximizes privacy. It is available from browsers’ web stores; source code is in GitHub.
Google’s Environmental Insights Explorer provides access to data about the environment and sustainability for over 40,000 cities worldwide.
Perseus is a new high performance Web framework for Rust. It runs on WebAssembly.
CGI makes a comeback! Of course, it’s never really gone away. But WCGI, using WebAssembly to run CGI applications, is safer and faster.
WebGPU is shipping in Chrome 113 (currently in Beta), and development is in progress for Firefox and Safari. WebGPU is a JavaScript API for interacting with GPUs and other advanced graphics hardware from the browser.
Salesforce has created a platform that allows companies to create NFT-based customer loyalty programs. These programs give companies direct access to customer data, eliminating the need to work within restrictions on the use of cookies. Are crypto wallets the new cookies?
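For readers who have not seen CGI in a while, the contract mentioned in the WCGI item above is small: the server passes request details in environment variables such as `QUERY_STRING`, and the program writes headers, a blank line, and a body to standard output. The sketch below shows plain CGI only. It says nothing about how WCGI packages or runs such a program in WebAssembly, and `handle_request` is a hypothetical name.

```python
import os
import sys
from urllib.parse import parse_qs


def handle_request(environ, out):
    """Write a CGI response for the request described by `environ`."""
    # The web server supplies the query string through the environment.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    # Headers come first, then a blank line, then the body.
    out.write("Content-Type: text/plain\r\n\r\n")
    out.write(f"Hello, {name}!\n")


if __name__ == "__main__":
    handle_request(os.environ, sys.stdout)
```

Because the program touches only environment variables and standard streams, the model suits a sandboxed runtime, where those few channels are all the host has to expose.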

Augmented and Virtual Reality

Facebook/Meta is using undercover content moderators to police Horizon Worlds.
Is privacy possible in virtual reality? Probably not. So much relies on motion, and motion is identifiable. Headsets leave a trail of data that will be very hard to anonymize.
Augmented reality isn’t dead. Snap is launching AR “mirrors” for stores that show customers what they will look like wearing clothes without trying them on.