The New 2GB Raspberry Pi 5: Another Option for Linux Sysadmins https://thenewstack.io/the-new-2gb-raspberry-pi-5-another-option-for-linux-sysadmins/ (Mon, 02 Sep 2024)

The Raspberry Pi single-board computer has long been the darling of hobbyists, even though plenty of sysadmins, scientists, data crunchers, and other users have made it their own. Unfortunately, Raspberry Pi news has at times been overshadowed by supply issues, but those kinks appear to be getting worked out, and the popular platform continues to soldier on.

With that in mind, this article introduces the Raspberry Pi 5 with 2GB of RAM, released on Aug. 19, 2024. At first glance, it might seem a bit underpowered compared to other Raspberry Pi 5 options, but it fits nicely into a specific niche that just might be attractive to IT administrators and other power users.

How the 2GB Pi Compares

How does the new Pi compare to other models? The primary difference is the quantity of RAM, so this platform is best for your low-memory projects. Maybe you’re setting up some IoT sensors or planning to rotate static informational or advertising images in a lobby or other public space.

If you want to use a Raspberry Pi as a daily driver desktop system or multimedia streaming box, you should probably spend a few more bucks on the 4GB or 8GB Raspberry Pi versions to support these more RAM-intensive scenarios.

There’s more to a computer (even a single-board system!) than just memory. The basic Raspberry Pi 5 features other essential specs, including:

  • 2.4GHz quad-core 64-bit Arm Cortex-A76 (Broadcom BCM2712) processor
  • VideoCore VII GPU
  • Dual 4Kp60 HDMI display outputs
  • 802.11ac wireless support
  • Bluetooth support
  • USB 2.0 and 3.0 ports
  • Dual camera receivers

The board also includes onboard Gigabit Ethernet and a single-lane PCIe 2.0 interface; optional HATs such as the M.2 HAT+ use that PCIe connector to add NVMe storage and other expansion.

In his release announcement for the 2GB Raspberry Pi 5, Raspberry Pi CEO Eben Upton goes into detail about changes to the BCM2712 processor that help keep the price down for this model.

The short version of the explanation is that the modified processor drops capabilities Broadcom normally includes for other platforms but that the Raspberry Pi does not use. That’s an important consideration, and it’s nice to see that the folks at Raspberry Pi continue to focus on the intended use and audience.

Raspberry Pi OS is optimized for this hardware, adding performance, power management and features specific to it while eliminating unneeded extra capabilities. If that distribution isn’t to your liking, there are a variety of alternative Pi-optimized OS choices, including Ubuntu, AlmaLinux, Pop!_OS and even Kali Linux.

2GB Raspberry Pi Use Cases

The Raspberry Pi is often seen in the context of hobbyist or enthusiast users, but it has a place in the business world. Consider some of the following use cases for your new 2GB Raspberry Pi 5:

  • Digital sign and media displays
  • X (Twitter) bot for marketing
  • Thin-client systems for end users or kiosks
  • Server room environment sensors
  • Simple and lightweight Python programming platform (see the short sketch after this list)
  • Embedded systems
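
As a minimal sketch of that Python programming use case (the file path is standard on Raspberry Pi OS and most Linux systems, while the alert threshold and polling interval are only illustrative assumptions), a few lines can turn the board into a simple temperature monitor:

import time

def cpu_temp_celsius() -> float:
    # The kernel exposes the SoC temperature in millidegrees Celsius.
    with open("/sys/class/thermal/thermal_zone0/temp") as f:
        return int(f.read().strip()) / 1000.0

while True:
    temp = cpu_temp_celsius()
    print(f"CPU temperature: {temp:.1f} C")
    if temp > 70.0:  # illustrative alert threshold, not an official limit
        print("Warning: running hot, check cooling")
    time.sleep(60)  # sample once a minute

A real environment-sensor project would read from an external probe over GPIO or I2C instead, but the structure is the same: poll, log and alert.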

Pay attention to the RAM requirements for your projects — remember that this newest Raspberry Pi 5 model offers less memory. It’s not going to be the tool for every job. You may find yourself reaching for the 4GB or 8GB Raspberry Pi solutions instead.

Other Options and Use Cases

The current Raspberry Pi lineup is pretty diverse, including small Pi Pico options and much more powerful Raspberry Pi 5 configurations.

The 2GB Raspberry Pi 5 meets the goal of accessibility and flexibility, coming in at a mere US$50, making it the least expensive of the Raspberry Pi 5 choices. The 4GB Raspberry Pi 5 will set you back $60, and another $20 will get you the 8GB device.

These configurations are more suited to web servers, name resolution services, NTP time devices, or other similar uses. You may also prefer the additional memory for more extensive automation, AI, security, or prototyping projects. As you phase out CentOS, remember there’s an AlmaLinux image specifically for the Raspberry Pi.

Other Raspberry Pi Models

Raspberry Pi 4, 3, and Zero hardware is still available if your project requires that specific hardware, with costs running from $5 up to around $75.

However, if you’re just starting fresh with the Raspberry Pi platform and deploying it in an enterprise environment to support essential sensor or IoT functions, I suggest you start with version 5 devices.

Get Your Own 2GB Raspberry Pi

Various resellers worldwide offer the Raspberry Pi, and the 2GB version is already available. The Raspberry Pi 5 product information page has links to order devices and accessories.

Various add-on devices extend the Pi’s capabilities. Many of these are project-specific. Modifications include:

  • HATs (Hardware Attached on Top): Add functionality, including additional processing or expansion slots.
  • Cameras: Add visual elements to your Pi projects.
  • Project kits: Some preconfigured kits are available that include all hardware needed for specific projects.

Don’t forget to pick up a case to protect your Raspberry Pi. The Pi 5 also benefits from active cooling, so be sure to keep that in mind with your case and other storage options.

Wrap Up

The 2GB Raspberry Pi platform is a great choice for many business cases and home projects. Its expandability and processing power are the same as those of the 4GB and 8GB versions, even though the processor is a stripped-down variant of the one in other Pi 5 boards. Watch for projects and applications with minimal RAM requirements, including embedded systems, display-only situations, and basic server or network functionality.

The 2GB Pi is available now for $50, maintaining the hardware’s accessible price point. Remember to add cooling and a case, depending on your project.

One Company Rethinks Diff to Cut Code Review Times https://thenewstack.io/one-company-rethinks-diff-to-cut-code-review-times/ (Mon, 02 Sep 2024)

A Stack Overflow blog post calls it “the oldest tool still widely used by contemporary developers.” The file-comparing program diff has now been around for literally half a century.

Screenshot from Wikipedia's entry on diff

And to this day, its underlying “Myers diff algorithm” still finds its way into our workflows — including the way we see changes on GitHub (with its red highlighting for changed code, and green highlighting for new code).

Is it time for a fresh look? Even diff’s official info file notes that the GNU project “has identified some improvements as potential programming projects for volunteers.”

Screenshot: improvement suggestions from the diff info file

But the Stack Overflow blog post offers the fascinating case study of one developer tooling company that decided to try building a better diff…

All About Alloy

Alloy.dev’s website says it’s in the business of “fine software products,” with a subhead promising their tools are “dogfooded daily by people who love building software.” And it specifically touts two products that “help ambitious creators squeeze the most value from every working hour” — one of which is Amplenote, a note-taking/to-do list app.

And then there’s “GitClear,” which the company released in 2018 after three years of iteration. “For GitClear, we’re eager to make pull request review take more like 1-5% of the average dev team’s week, instead of 20%,” says Alloy’s webpage.

In its research, the company found that only 5% of code changes are truly “substantive” changes, Alloy founder/CEO Bill Harding said in his 2022 presentation, with the rest being what they consider “change noise.”

And about 30% of all changed lines in a pull request are just chunks of code that were only moved to a new location. “Why are developers still reading pull requests where this 30% of unchanged code is emphasized equally alongside the substantive changes that deserve attention?” Harding asked.
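
A quick way to see the problem is to run a conventional line-based diff over a file in which a function has merely been moved. In this sketch (the two snippets are invented for illustration, and Python’s standard difflib is used here simply because it produces the familiar added/removed unified output, not because it is the tool GitClear builds on), the relocated function shows up as a block of deletions plus a block of additions even though not a character of it changed:

import difflib

before = """def load_config():
    return read_file("config.yaml")

def main():
    print(load_config())
""".splitlines(keepends=True)

after = """def main():
    print(load_config())

def load_config():
    return read_file("config.yaml")
""".splitlines(keepends=True)

# Every line of the moved function is reported as "-" noise and then "+" noise.
print("".join(difflib.unified_diff(before, after, "before.py", "after.py")))

GitClear’s premise is that this kind of move deserves far less emphasis than genuinely new code.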

As Harding said in a June demonstration, “We want to help developers review as little code as possible.”

Building a Better Diff?

In a guest post on Stack Overflow, Harding describes how Alloy first ran experiments with a new set of diff operators. His goal was to see whether “a deeper lexicon” could condense the way commits are represented. “Can change be shown more concisely than what was possible nearly 40 years ago?” So besides additions and deletions, Harding and his team added new ways to show when code had simply been moved, received minor updates, or experienced name changes from a find-and-replace command.

Reducing lines makes things better, Harding says in another video. “This can end up being hours more time that you have available to write code instead of reviewing it.”

Alloy presents several examples and videos to substantiate their claim that their tool results in 30% less code to review in pull requests.

Screenshot from Stack Overflow blog post about GitClear and Commit Cruncher vs Myers diff algorithm.

How It Works

Obviously, its usefulness depends on what it’s trying to summarize. But when a chunk of code is moved into a separate function, GitClear doesn’t highlight all that moved-but-still-the-same code — only the newly-added method definition.

And Harding’s blog post also highlights a case where they’d made a minor change to a constant value — adding a 0 in front of it. Rather than displaying this as a line deleted and then another different line added, the tool simply displays the changed line with its changed character highlighted (and shown inline).

The end result? Roughly 28% fewer “changed lines” to review — which Harding sees as a clear win. “This implies that updating git diff processing tools could reduce the volume of lines requiring review by almost a third.”

Harding even says they recruited 48 test subjects from CodeMentor to review pull requests — half of which came from GitClear. The results found “equivalent comprehension” of code. But with fewer lines to review, less time was spent reviewing: 23% less on average and 36% less at the median.

Other Features

When I submitted a URL to the company’s “alternative pull request review tool,” GitClear sent me an email highlighting just how many lines of code I wouldn’t have to review using the tool…

Screenshot of GitClear email

But their tool includes other features.

Visiting the pull request pulls up an overview page offering what Harding calls “high-level details of where the pull request is at… and how it compares with previous pull requests that have been submitted.” One graph shows how many days the pull request has been open — and even lets you compare that to other files in your repository — or to pull requests for all your repositories, or even “to other companies within your industry.” (Another graph performs the same comparisons for the amount of test coverage in a pull request.)

And then there’s a graph of the “diff delta,” which the company’s site touts as GitClear’s proprietary but “empirically-validated appraisal of how much durable change occurred per commit,” weaving together a commit’s entire history to track “the long-term fate of each line of code authored — through moves, renames, and other updates.”

They’ve added other ways to try to improve the code-reviewing experience. In a video demo, Harding notes their tool also offers a view of just “unreviewed commits since last review”:

“For the way that our team works, this might be the single feature that saves the most time… Because if your team goes through multiple rounds of review on a pull request, you’ve certainly felt the pain of trying to look at a file you’ve already seen before and pick out, ‘Which of the changes were in response to my feedback? And which of these changes were already here, and I’ve already looked at before?'”

And it’s also possible to look back in time. Hovering over a line displays its whole history of commit messages, “which often elucidates why a particular line evolved into its final form,” Harding explained in his blog post. And this means that even when a line is moved — and may be changed by a find-and-replace command — that line’s original position is still available. [Their essay argues it’s useful seeing that the code appeared in past revisions of an application.]

Less Time on Code Reviews

Harding’s essay closes by making the case for their tool. How many hours are spent reviewing those additional 28% lines of code? Harding estimates it costs a 10-person team thousands of dollars a week. And Harding also suggests developers might appreciate spending less time on code reviews.

“Considering that code review is often one of the most unpleasant, high-willpower chores included in a developer’s responsibilities, the morale improvement gained by reducing code review time may rival the gains in ‘time saved.'”

It’s not the only alternative. In the comments on Harding’s story, one reader championed another tool, Difftastic.

And in a 2022 presentation at GitKon, Harding acknowledged there are at least 10 companies now offering Git analysis/developer analytics tools (often combined with issue tracking).

But Harding’s post sparked some lively discussions around the web about the current state of code-reviewing tools. Various long-time Stack Overflow readers left their comments on his blog post:

  • “It would be great to recognise [sic] a complete function with the comment block and usually a trailing empty line as one unit and therefore one change”
  • “It would be great if a diff recognised [sic] a renamed file…”
  • “A lot of improvements in the review process can come from intentionally separating the commits in a way that facilitates review.”

Hacker News comment about diff alternatives in the 1990s

Whatever the future holds, there’s clearly an appetite for the best possible code-reviewing tools. And maybe it’s not hard to understand why.

In another 2022 presentation, Harding said with a smile that “developers naturally have a lot more enthusiasm for writing code than we do for reviewing it.”

Did Broadcom’s VMware Hit Nutanix Where It Hurts? https://thenewstack.io/did-broadcoms-vmware-hit-nutanix-where-it-hurts/ (Mon, 02 Sep 2024)

The much-anticipated product releases during VMware Explore, the platform’s annual user conference now run by Broadcom, offered not so much pleasant surprises as promising additions to VMware’s product portfolio. VMware has made considerable investments in streamlining DevOps in pursuit of that one single platform as a service — the “Holy Grail” if successful — for developers and platform engineers with its VMware Tanzu Platform 10 release. It also introduced the VMware Cloud Foundation (VCF) platform for private clouds, which Broadcom CEO Hock Tan said would play a critical role in DevOps in the future. Other releases during Explore worth a look cover edge deployments and AI (but of course).

However, these releases come in the wake of the controversy following Broadcom’s acquisition of VMware, in which a number of enterprises and organizations saw their pricing and fees skyrocket. That is not the whole picture, though, as certain customers have seen their prices decrease. (More on that below.)

Organizations are also not going to limit what they adopt over hurt feelings about licensing changes, and may (or may not) opt for some of the rather ambitious offerings VMware announced in its attempt to streamline CI/CD: a major Tanzu revamp, AI integration that DevOps teams can rely on, and private cloud advantages.

Meanwhile, just a few days before the conference, Nutanix CEO Rajiv Ramaswami — who previously held executive roles at VMware — had this to say in a statement about the perceived negative effects the Broadcom and VMware merger had on customers:

“What we’re hearing from customers over and over is that they’re concerned about Broadcom’s impact on VMware, from pricing increases, to support changes, and lack of innovation. Customers want to know how their cloud infrastructure platform will support their business critical applications now but also in the future, from cloud native applications to AI and beyond.

 

“Customers are looking at Nutanix and seeing a platform that can help them achieve this, all supported by our 90+ NPS score. We have benefitted from customers leaving VMware, like Treasure Island Hotel & Casino and Computershare, and we see this as a significant multi-year tailwind.”

During a conference call with analysts this week following the release of its latest quarterly earnings, Ramaswami continued to speak about closing some “additional deals that I would consider to be influenced by the Broadcom VMware transaction.”

“In fact, again, going back to that bank win that we talked about, the G2K bank, and that is certainly influenced by this transaction,” Ramaswami said. “As we mentioned, it was a dual vendor strategy, but going forward, it’s going to be a single vendor strategy with us.”

Ramaswami noted that many customers signed multi-year Enterprise License Agreements (ELAs) with VMware before Broadcom’s purchase closed, in some instances to lock in lower pricing, which means customers mulling whether to make the switch to Nutanix have some time to decide.

But in the immediate, “there continues to be certainly a lot of concerns around all the stuff we’ve talked about in the past; pricing, increased pricing potentially, dropping support levels, etc.,” Ramaswami said. “So, we have a significant pipeline of opportunities and it’s growing, and a good degree of engagement with prospects driven by these concerns,” Ramaswami said. “It’s just difficult to predict timing and magnitude of events, and we continue to expect some benefit from these, influenced by this transaction, and we certainly factored that in, into our guidance for this fiscal year.”

A Tale of Two Suppliers

Nutanix and VMware compete directly in the cloud infrastructure platform market, but their approaches differ. Both offer compute, networking and storage management automation, along with their respective hypervisors for building cloud platforms. vSphere factors in hugely as well for both Nutanix and VMware customers. Gartner’s 2023 report, “Hype Cycle for Infrastructure Platforms,” characterized the infrastructure platform market in which Nutanix and VMware, as well as Microsoft, compete.

“Market momentum around HCI software in the cloud now creates a market for multiple hardware vendors to build software management and integration services,” Gartner analyst Dennis Smith wrote in the report.

But again, the approaches these cloud native leaders are taking differ. VMware’s move to a more simplified offering, consisting primarily of VMware Cloud Foundation and VMware vSphere Foundation, has certainly made up for its more complicated previous pricing scheme, and large enterprises appreciate what VMware says are more services and features for cloud native environments than what they are paying for.

The Coliseum

As mentioned above, following Broadcom’s acquisition, VMware’s product licensing changed. Instead of offering perpetual licenses to companies, Broadcom now sells VMware products as subscriptions or term licenses, a change that has been in effect since December.

The portfolio has since featured either VMware Cloud Foundation, an enterprise hybrid cloud solution, or VMware vSphere Foundation, which the company says is a simpler enterprise workload platform for mid-sized to smaller customers. However, this also means that many smaller customers have seen a price hike.

The licensing scheme change is at the heart of the controversy. “Since the acquisition by Broadcom VMware has lost a lot of trust through their rather unfortunate communication of license model changes, product discontinuations, changes in their partner program and key staff leaving the company,” Torsten Volk,  an analyst at TechTarget’s Enterprise Strategy Group, said. “This definitely has resulted in customers looking for alternatives, especially for cloud native applications.”

However, customers and partners that are deeply invested in VMware infrastructure will have a lot of “detangling” and upskilling to do to move to different platforms, Volk said. “VMware has a little more time to convince these stakeholders to stay on,” Volk said. “However, we do see organizations seeking out alternatives for their cloud native projects and for the modernization of legacy apps.”

In this battle, nothing is guaranteed, of course, as ideally VMware and Nutanix will battle it out by using their resources to deliver what DevOps teams can really use and want, with added motivation to deliver to the developer and platform engineering teams. Nutanix CEO Ramaswami seemed to agree when speaking during the analyst call:

“Again, I think we are still at a point where we have a significant and growing pipeline of opportunities that we are engaged in with prospects. But it’s hard to predict what portion of those we’ll win, how much will they, for example, bring us in as a second vendor or the sole vendor or just use us as a negotiating lever to get a better deal from VMware.

 

“So, there’s a lot of uncertainty and lack of predictability there for the long term. So, we’ve focused on the timing, for example, for when this might happen. So, that’s why we have — we’ve modeled some level of share gains here into our forecast for the year and it’s incorporated into the guidance. But it’s going to be — I would again emphasize that it’s going to be a multiyear thing for us here, and it’s going to be a bit of timing and the exact share gains are going to be a little unpredictable.”

In order to fend off competitors such as Nutanix as well as Red Hat, VMware must show that it has the vision and ability to deliver a tightly integrated portfolio of cloud native solutions, Volk said. “Tanzu AI clearly shows that VMware is thinking about how to best capture emerging workloads, such as LLMs and the overall Tanzu 10 release has shown that the company wants to execute on its vision of delivering a simple developer platform for the masses,” Volk said. “However, to convince organizations to move their cloud native workloads to Tanzu will require a lot more than was included in the Tanzu 10 release.”

Meanwhile, vendors such as Nutanix and Red Hat will work on making workload migration from VMware “as easy and attractive as possible,” Volk said. “Clearly, the current situation is an opportunity for these vendors to eat some of VMware’s pie, but at the same time it is up to Broadcom to double down on investing in its cloud native portfolio and other platforms.”

Again, Broadcom is at least posturing to meet the challenge with a clear vision of what it plans to deliver — and how needs are evolving and changing. As Broadcom’s Tan emphasized during his keynote, the future lies in truly achieving the goal of breaking down those silos, which is what DevOps was supposed to do years ago. “You want to deploy a new application, you need to write a ticket. You might need to write a ticket to your IT department, and you might just get that virtual machine two months later… At VMware, we believe we have the solution,” Tan said.

VCF will play a large part in what will be on offer, to help “create a single platform” that can not only run on private clouds but can extend to public clouds as needed, Tan said. “It’s resilient, it’s secure, and it costs much less than public cloud,” Tan said.

Build Platform Engineering as a Product for Dev Adoption https://thenewstack.io/build-platform-engineering-as-a-product-for-dev-adoption/ (Mon, 02 Sep 2024)

So, your company is creating a platform engineering initiative and is on board to use the resulting internal platform to create, test and deploy all the software applications built inside your operations. That is great, but now you need to take another important step — approach it as a product from its inception and throughout its development, so that your company keeps it fresh, popular and responsive for the developers who will be tasked with using it.

“If you actually want to build something that is going to solve a problem and be good over the long term, you really must understand what the problem is,” Daniel Bryant, a platform engineer, developer relations specialist, go-to-market professional, software developer, and the head of marketing for platform engineering vendor Syntasso, told The New Stack. “You chat with your customers — your developers in this case — and ask, ‘Hey, what is your biggest pain point? Where do you have friction in [your work] day?’”

Using this approach, developer pain points can be addressed, resolved and minimized, which will encourage adoption of an internal platform that is seen as a critical tool by the companies that invest in planning and building these frameworks, said Bryant. These are the same approaches companies must take when marketing and selling a product, he added.

“You have to do marketing within the company and you market it as a product” to encourage internal developers to “come and use my platform,” said Bryant. The move to platform engineering brings in a standardized set of tools for all developers to use, which is maintained and updated by system administrators who choose the tools, package them together, and present them as a platform for the developer teams.

The idea is to provide curated, self-service sets of development tools that encourage developers to be able to tackle their jobs without having to maintain, collate, and update their own toolsets. In the big picture, the idea of platform engineering is to ensure that developers can spend their valuable time generating great, clean, and innovative code for their companies, rather than looking for their own tools and wasting time.

But for that to work, for all these efforts to be a success, companies that are using platform engineering must be sure that they get critical buy-in from their developers so that the platforms are adopted and utilized to do their code building, said Bryant.

“Because if they are doing their own thing [assembling and using] their own tools, that is not solving this problem,” he said.

But by approaching these platforms as products that are aimed directly at their developer users, Bryant said that he believes it can inspire better buy-in from users. “Anecdotally, at Syntasso we see that. And there are industry reports that lean in that direction.”

For this to happen, the planning for successful platform as product approaches must start at the nascent stages of a platform engineering strategy as it is being envisioned and implemented, he said.

That means getting ideas from developers at every step through the software development processes, including coding, shipping, production, and more that will help to provide value and insights that allow the platform as a product concept to succeed, said Bryant.

“It is basically applying product thinking,” he said. “It is all the thinking we do when we design our iPhones or apps or whatever, just applying that methodology, that thinking, to the internal developer platform (IDP) [that is being built]. And it is important to think long term.”

Top Tips for Platform as a Product Success

So, how can a successful platform engineering implementation for an IDP be built from the ground up to succeed as a product that drives innovation and success for developers and companies?

Bryant has several tips for systems administrators and IT managers who are tasked with making platform engineering work within their companies.

  • “Ensure that someone within the team has a product owner mindset,” he said. “The infrastructure folks, they have not been building products. They have been racking and stacking. And it is not in a disrespectful way that I say this, but … we need to train the folks that are doing it to have a product mindset. That is key.”
  • Also critical is that the platform as a product team and organizers must talk to developers about what is being planned, implemented and finalized, said Bryant. “Too many times in my 20-year IT career I saw us building stuff without talking to our customers [developers]. The same goes for the platform – who you are building it for, why [it is being built] and what their problems are. You will never build the right thing [without addressing these issues].”
  • “And the third thing I would say, is to measure the results from day one” to get a good feedback loop, said Bryant. “Get that baseline, because some folks do not know what impact they have had because they have not measured from day one. They will think they have made it faster, [but will not know unless they] measure from day one.”
  • Always make improvements and tweaks to keep the platform product vital, efficient, and up to date for its users because the process is a voyage, not a destination, said Bryant. “Focus early on delivering value as quickly as possible,” he said. “We talk a lot about [creating] the thinnest, minimum viable platform (MVP)” on which to base a company’s IDP to keep it simple and create a solid product for developers who once they see it will realize that they need to have it.
  • Do not worry about establishing quick wins, but instead be sure to deliver real value that shows a path for further progress in the future, said Bryant. “That is a hard balance. I have seen people try to get quick wins and then the product only works for a month or so. It is not very good. I have seen some folks obsess about long-term value, and then they do not show any value upfront. So, the platform itself gets canceled from a budgeting point of view. So, you need to get this balance of showing value early, fixing real problems, ensuring that developers have a minimum viable product in this form, but with an eye to … sustainably evolve the platform to meet more and more use cases, to provide more and more value.”

Ultimately, an IDP that will be successfully created inside a company is one that is continuously maintained, improved, changed, and that incorporates feedback and operational insights, said Bryant.

“We talk about platform decay quite a bit at Syntasso,” he said. “You know, just entropy in the world in general, like new versions of things come out. And then people do not maintain things. Then the platform decays. You need to be on that kind of thing, too.”

Manage Multiple Jupyter Instances in the Same Cluster Safely https://thenewstack.io/manage-multiple-jupyter-instances-in-the-same-cluster-safely/ (Mon, 02 Sep 2024)

Jupyter notebooks are interactive, efficient tools that allow data scientists to explore datasets and add models productively. Many organizations — 42% in a recent JetBrains study — leverage Jupyter notebooks to provide users with programmatic access to sensitive data assets.

Screenshots of Jupyter Notebook interface

Jupyter notebooks have become a staple in data science and research for reasons including:

  • Interactivity
  • Flexibility
  • Integration
  • Collaboration
  • Ease of use

However, have you ever wondered about the threats this model poses to data security? Attackers or unethical users can exploit it to gain unauthorized access to sensitive information.

In this article, I will walk you through common Jupyter notebook threats and explain how to use zero trust security to protect them.

Since Jupyter notebooks are widely used and popular, preventing security threats is not just beneficial but necessary.

Common Jupyter Notebook Threats and Exploits

Because Jupyter notebooks execute arbitrary Python code, an attacker who can run code in a notebook can use it to change operating system settings and files. This poses major security risks and can potentially compromise local assets.
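
A harmless illustration of why this matters: any cell in a notebook runs with the privileges of the server process, so code like the following sketch (nothing here is specific to an exploit, it simply shows the level of access ordinary notebook code has) can inspect the account, environment and host from inside a cell:

import getpass
import os
import subprocess

# A notebook cell executes as the same user as the Jupyter server process.
print("Running as:", getpass.getuser())

# Environment variables, which sometimes hold tokens or credentials, are readable.
print("PATH =", os.environ.get("PATH", ""))

# Arbitrary shell commands can be launched from a cell.
print(subprocess.run(["uname", "-a"], capture_output=True, text=True).stdout)

An attacker who can execute a cell can do the same things with far less benign intent, which is why the threats below need to be taken seriously.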

Here are some of the most common security threats that Jupyter notebooks can face due to their design.

Common security threats that Jupyter Notebooks can face

Remote Command Injection

Remote command injection happens when an attacker exploits vulnerabilities in a Jupyter notebook environment to run arbitrary commands on the host server.

This can occur through poorly sanitized inputs or malicious notebooks. Once the attacker gains command execution, they can control the server, access sensitive data and even move to other systems within the network. This can cause extensive damage and data breaches.

Unauthorized Access to Remote Trusted Entities

By gaining unauthorized access to external services or systems trusted by the Jupyter notebook, attackers can exploit vulnerabilities or misconfigurations. They can use this access to impersonate legitimate users or services to access sensitive data or perform other unauthorized actions. This not only jeopardizes the security of the trusted entities but also undermines the integrity of the data and operations within the Jupyter environment.

Unauthorized Access to Another Jupyter User Pod in the Same Namespace

In environments where multiple users share a Jupyter deployment, such as Kubernetes namespaces, attackers exploit vulnerabilities to gain unauthorized access to another user’s pod. This allows them to execute arbitrary code, steal data or disrupt the operations of other users. Such breaches can lead to significant security incidents, especially in multitenant environments where data isolation is crucial.

Control and Beacon via a Remote C&C Server

Attackers can establish a command-and-control (C&C) server to remotely control compromised Jupyter notebook instances. By doing so, they can issue commands, exfiltrate data and perform other malicious activities without direct access to the environment. This type of attack can be particularly stealthy and persistent, as the C&C server can continuously direct the compromised notebook to perform harmful actions.

Unauthorized Access to Another Customer Pod via Namespace Escape

Namespace escape attacks occur when an attacker exploits vulnerabilities to break out of their isolated environment (namespace) and access other customers’ pods. This is particularly concerning in cloud environments where multiple customers share the same underlying infrastructure. Such an attack can lead to unauthorized data access and system manipulation, and potentially compromise the entire infrastructure’s security.

Data Exfiltration Through Malicious Resources

Data exfiltration involves the unauthorized transfer of data from the Jupyter notebook environment to an external location. Attackers use malicious notebooks or scripts to read sensitive data and send it out to a controlled server. This type of attack can result in significant data breaches, exposing confidential information and causing financial and reputational damage to the affected organization.

Supply Chain Attack

Attackers can compromise the software supply chain by injecting malicious code into trusted software components or libraries used within the Jupyter notebook environment. When these components are integrated, the malicious code executes, allowing the attacker to compromise the system. This type of attack can be particularly insidious, as it exploits the inherent trust in widely used software components.

MITM Attack for Connection to Remote External Trusted Entities

A man-in-the-middle (MITM) attack occurs when an attacker intercepts and potentially alters communications between the Jupyter notebook and remote trusted entities. This enables the attacker to eavesdrop on sensitive information, inject malicious data and disrupt the integrity of the communication. This type of attack can compromise the confidentiality and reliability of data exchanges, leading to significant security risks.

Safely Managing Multiple Jupyter Instances in the Same K8s Cluster

To demonstrate how these threats can affect data science environments, I will use a sample deployment scenario and share some best practices.

First, set up your Jupyter notebook instances in a Kubernetes (K8s) cluster for data science workloads.

K8s setup: The deployment uses a Kubernetes cluster with three nodes on Google Kubernetes Engine (GKE). The cluster is set up with the default configurations provided by the regular channel, and it uses a container-optimized operating system image for efficiency and performance.

Jupyter Notebook config

Jupyter notebook setup: Two namespaces are created within the Kubernetes cluster, each hosting its own Jupyter notebook instance. When a user logs in, the system dynamically spins up a user-specific pod named jupyter-<username>. This ensures that each user has their own isolated environment in which to run their Jupyter notebooks, enhancing security and resource allocation.

User-specific pod

Follow these best practices for managing multiple Jupyter instances within the same cluster:

  • Running multiple instances: To run multiple Jupyter notebook instances in the same Kubernetes cluster, create separate Docker images for each instance. Then set up Kubernetes deployments and services for these instances.
  • Namespace isolation: Namespace isolation is used to ensure that each Jupyter notebook instance operates in its own isolated environment. This helps prevent potential security issues and resource conflicts between different users or projects. (A sample NetworkPolicy sketch follows this list.)
  • Security measures: Implementing security measures involves configuring the Kubernetes cluster and Jupyter notebooks to minimize vulnerabilities. This might include setting up network policies, role-based access controls and monitoring for potential threats.
  • Resource allocation: Proper resource allocation ensures that each Jupyter notebook instance receives its necessary CPU, memory and storage resources without impacting others. This is critical for maintaining performance and reliability in a multiuser environment.
  • Threat mitigation strategies: Specific strategies are put in place to mitigate threats such as unauthorized access, data exfiltration and command injection. This might involve using secure configurations, regularly updating software and monitoring for suspicious activities.
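
As a minimal sketch of the namespace isolation bullet above (the namespace name and pod label are placeholders, and the cluster’s network plugin must support NetworkPolicy; on GKE that means enabling network policy enforcement or Dataplane V2), a default-deny ingress policy plus a narrow allow rule keeps one team’s notebook pods from being reached by workloads outside their namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: jupyter-team-a        # placeholder namespace
spec:
  podSelector: {}                  # applies to every pod in the namespace
  policyTypes:
    - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-hub-proxy
  namespace: jupyter-team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: jupyterhub-proxy   # placeholder label for the hub/proxy pods
      ports:
        - protocol: TCP
          port: 8888                  # default Jupyter notebook port

Combine this with Kubernetes RBAC and per-namespace resource quotas so that isolation covers API access and resource consumption as well as network traffic.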

Using Zero Trust Security for Jupyter Notebooks

The way we use digital technologies is changing fast, making old perimeter-based cybersecurity defenses ineffective. Perimeters can’t keep up with the dynamic nature of digital changes. Only zero trust security can handle these challenges by carefully checking and approving access requests at every part of a network.

The principle of least privilege ensures that no user or system has full access to the entire network. Each access request is constantly checked based on factors like who the user is, the health of their device and where they’re trying to access from. This reduces the chance of unauthorized access and protects important data and systems.

If there’s a security breach, microsegmentation is crucial. It divides the network into smaller parts, stopping any sideways movement within the network. This containment limits the damage a hacker can do.

Using zero trust not only boosts security but also meets today’s business needs for flexibility, growth and strong data protection. By using these strategies, companies can protect their digital assets from new threats while keeping their operations safe and efficient.

How Zero Trust Helps To Secure a Jupyter Notebook

A comprehensive zero trust cloud native application protection platform (CNAPP) solution delivers superior protection and control with features like:

  • Granular control: Achieve precise management of user actions to mitigate the risk of security incidents effectively.
  • Real-time protection: Monitor system activities continuously, and promptly address any unauthorized actions with proactive security measures.
  • User-friendly configuration: Security policies are straightforward to set up and adjust, ensuring accessibility for users with varying levels of expertise.

What To Look for in a Zero Trust Solution

If you have decided to secure your organization’s data with a zero trust security strategy, choosing the right service provider is crucial.

The most important factor to look for is whether the service provider offers inline security or post-attack mitigation.

Inline security vs. post-attack mitigation

While post-attack mitigation might seem to offer similar levels of security and might be affordable, it can cost your enterprise more in the long run.

Post-attack mitigation reacts post-exploitation; once a security mishap has occurred, it identifies and stops it. On the other hand, inline security or runtime security responds to potential attacks before they happen. It offers a more proactive and real-time threat mitigation approach than post-attack mitigation.

Here are a few other features to seek out:

  • User execution control: The ability to limit user execution of binaries to specific, predefined paths is crucial. This feature helps prevent unauthorized access to critical system binaries and enhances overall system security by controlling user actions.
  • Path restriction: Defining explicit paths, such as /usr/local/bin and /bin/, for execution is vital. Controlling the scope of binary execution minimizes the risk of potential vulnerabilities and restricts users to trusted paths, reducing the likelihood of malicious activities.
  • Prohibition of new binaries: Implementing rules to disallow the creation of new binaries within specified paths is an essential security measure. This mitigates the risk of introducing unknown executables and safeguards the system against potential threats by controlling binary additions.
  • Enforcement of execution rules: A robust security provider should identify and enforce rules for executing binaries from essential paths. Defining strict measures for binary execution from paths like /usr/local/bin and /bin/ significantly enhances the security of the system.
  • Prevention of write operations: Applying strict measures to prevent any write operations within critical paths ensures system integrity. This approach aligns with the principles of zero trust, which is fundamental for maintaining a secure environment.

Takeaways

  • Jupyter Notebook is the most used data science integrated development environment (IDE), but it carries significant security risks. Organizations must understand these vulnerabilities to mitigate threats effectively.
  • Zero trust security is one of the best ways to safeguard Jupyter notebook environments. By verifying every access request and assuming zero trust, organizations can prevent unauthorized access and data breaches.
  • If you decide to work with a zero trust service provider, go for solutions offering real-time threat mitigation and inline security measures.

Linux: SSH and Key-Based Authentication https://thenewstack.io/linux-ssh-and-key-based-authentication/ (Sun, 01 Sep 2024)

The Secure Shell (SSH) is a critical remote administration tool for Linux systems and network devices. It’s also essential for macOS access and is often added to Windows computers (or used in conjunction with PowerShell). I’ll demonstrate concepts and configurations using OpenSSH.

SSH’s primary benefits include the following:

  • Remote access to a wide variety of platforms.
  • Remote command execution.
  • Default installation on most Linux distributions.
  • Strong authentication mechanisms.
  • Support for secure file transfers, such as SCP and SFTP.
  • Provides tunneling for other non-secure applications.
  • Enhances automation and scripting.

Learning to leverage SSH is an essential Linux sysadmin skill. This article covers basic SSH configurations, password-based authentication and general security settings. It also shows how to improve SSH functionality with key-based authentication for better remote administration and integration with automation tools.

SSH helps mitigate eavesdropping attacks by encrypting authentication and network traffic. It’s a critical means of protecting administrative connections to servers, routers, switches, IoT devices and even cloud connections.

This article provides commands for managing remote Linux systems. I suggest using a Linux lab environment when completing these exercises. Review Understand the Linux Command Line to work with these commands better.

Establish an SSH Connection Using a Password

You might already know the basic SSH syntax for connecting to remote machines. Use the ssh command and target a particular hostname or IP address:

$ ssh server07


Enhance the command by including the username for the remote user account you want to authenticate. For example, to connect using remote user admin03, type:

$ ssh admin03@server07


SSH prompts you to enter the password of the user account hosted on the remote system. On most systems, the command prompt will change to show the hostname of the remote computer.

At this point, you can begin executing Linux commands and run any programs installed on the remote device, such as Vim, Apache, or MariaDB. Remember that you may need to use sudo to elevate your privileges on the remote system.

Once you complete your remote administration tasks, type exit or logout to disconnect the SSH session.

Common SSH Use Cases

There are plenty of examples for using remote SSH connections, including:

  • Run remote backups with Duplicity, Kopia, tar or other tools.
  • Compile or install applications using compilers or package managers.
  • Modify system and application configuration files for web and database services.
  • Restart services. (Remember, you’ll be disconnected if you restart network or firewall services.)

However, the above use cases only allow for manual remote administration, where the administrator connects to one system at a time and runs commands (or scripts). It also means passwords must be tracked and maintained, which could be challenging when dealing with multiple remote devices.

Modern SSH implementations offer a far more robust way of proving your identity called key-based authentication. Implementing key-based authentication initially simplifies authentication for remote administration, but it is especially critical for automation functions.

Key-based authentication allows automation tools to authenticate to remote systems without requiring an administrator to type a password (or store a password in a configuration file). I examine this idea in more detail below.

What Is Key-Based Authentication?

Key-based authentication is a major improvement in SSH authentication, and it replaces password authentication. It relies on asymmetric key cryptography, a method built on two mathematically related keys. Each key plays a specific role. The keys are:

  • Public key: This key can be transferred to remote systems across the network. Any data encrypted with the public key can only be decrypted with the related private key.
  • Private key: This key remains securely stored on the local device and never traverses the network. Any data encrypted with the private key can only be decrypted with the public key.

You’ll generate a public-private key pair on the administration workstation (the administrator’s local computer), then copy the public key to one or more remote servers.

During the connection attempt, the remote server encrypts a message challenge using the admin workstation’s public key. This message can only be decrypted with the admin workstation’s private key. If the workstation decrypts the challenge and replies with the correct information, the remote server knows the workstation’s identity is confirmed.

The asymmetric keys are much harder to guess or brute-force than standard passwords, making this approach far more secure and reliable than passwords that may be based on predictable words or phrases.

Configure Key-Based Authentication for SSH

Implementing key-based SSH authentication is straightforward. The general steps are generating a key pair, copying the public key to the remote device and testing the connection.

Here is an explanation of the steps:

  1. Generate the key pair using the ssh-keygen command. It creates two hidden files in the current user’s home directory: ~/.ssh/id_rsa (the private key) and ~/.ssh/id_rsa.pub (the public key). You’ll typically press Enter through the interactive prompts.
  2. Copy the public key to the remote SSH device using the ssh-copy-id command with the specified user. You must enter your password during this step, but this is the last time you’ll do so. The utility also prompts you for a yes or no confirmation. The public key is appended to the ~/.ssh/authorized_keys file on the remote host.
  3. Test the connection by typing ssh admin03@server07 (substitute your own credentials and hostname). The remote system should not challenge you for a password. The authentication is silent.

You’ll establish authenticated remote connections using the key pair from this point forward.

The following list summarizes the commands:

$ ssh-keygen
$ ssh-copy-id admin03@server07
$ ssh admin03@server07


Figure 1: Use the ssh-keygen command to generate a public/private key pair.

When you generate the key pair, you’ll be offered the chance to add a passphrase. You can also specify encryption algorithms and key sizes at this time. Most administrators press Enter through these prompts, skipping the optional passphrase.

After you copy the client’s public key to the remote server, you’ll no longer be challenged for a password during the connection attempt. Type the regular SSH connection command and the authentication process silently succeeds.

Use Key-Based Authentication for General Administration

The initial benefit of key-based authentication is simplicity. You’ll no longer be challenged for difficult-to-remember passwords. Authentication happens silently. The process is quicker, and you can begin your admin tasks immediately.

This is especially handy when you use SSH to quickly run a single remote command without manual intervention, such as:

$ ssh admin03@server07 'run-backup.sh'


There’s no question this quicker authentication is helpful, but the real benefit of key-based SSH authentication occurs when automation gets involved.

Use Key-Based Authentication with Automation

SSH connectivity continues to be relevant in modern DevOps and Infrastructure as Code (IaC) environments. Many configuration management utilities must connect to remote systems to inventory software, manage settings, or initiate software testing. These tools must still authenticate to the remote systems if they use SSH.

Early approaches paused the configuration management tasks until administrators manually entered passwords. Clearly, that method does not enhance automation. Other designs embedded passwords or other authentication information directly in management files, risking accidental exposure to anyone who could access the files (or instances of the files, such as those found in backups).

Modern configuration management tools that use SSH can take advantage of key-based authentication to establish remote connections for a completely zero-touch solution.

Here are just a few automated configuration management tools that can use SSH connectivity:

  • Ansible (agentless, using SSH by default)
  • Puppet Bolt
  • Salt (in its agentless salt-ssh mode)

Implementing key-based authentication means remote connections can be defined within these configuration management tools, and they’ll run without pausing for a manually entered password. There is no need for user intervention, which is essential when configuration management tasks run in the middle of the night or during scale-up incidents.

Another benefit of using keys for authentication is avoiding the need to embed passwords in deployment and configuration files. This risky practice can easily expose passwords for admin accounts.
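
As a quick, hedged illustration (the inventory file name and group name are placeholders; the user and hosts are the same ones used earlier in this article), an Ansible ad-hoc ping rides on the same key-based SSH setup and completes without any password prompt:

$ cat inventory.ini
[webservers]
server07
server09

$ ansible webservers -i inventory.ini -m ping -u admin03

If the keys are in place, each host answers with “pong,” confirming that fully automated playbook runs can authenticate the same way.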

Use Key-Based Authentication with Multiple Remote Servers

What if the admin workstation actually needs to connect to multiple remote SSH servers? You could maintain separate key pairs for each, but this would be very tedious. With a few quick configuration file edits, you can use the same key pair to authenticate to multiple remote devices. This approach even supports different connection options for each target system.

The steps to configure the local system for key-based authentication to multiple target servers begin in the same way as above. However, do not generate new key pairs for each connection. Each time you run the ssh-keygen command, it overwrites the existing key pair. You’ll use the same public and private keys for all connections.

The first two steps in the process are:

  1. Generate a key pair on the local system using the ssh-keygen command.
  2. Copy the new public key to each remote server using the ssh-copy-id command.

The most significant configuration change when dealing with multi-server connections is editing the client’s user-specific local SSH configuration file. Create (or edit) the ~/.ssh/config file. You have several choices, including:

  • Hostnames.
  • Client identity files for various private keys.
  • Alternate port numbers.

For example, you might set up the following configuration to connect to various remote systems using a single private key named id_rsa:

Host server07
   Hostname server07
   User admin03
   IdentityFile ~/.ssh/id_rsa

Host server09
   Hostname server09
   User admin03
   IdentityFile ~/.ssh/id_rsa


Finally, test the key-based authentication connection to ensure it can reach the remote device and that the correct settings are applied.
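
With those entries in place, the alias supplies the user name and identity file automatically, so connecting to either host needs nothing more than the alias itself (server07 and server09 are the same placeholder hosts used in the configuration above):

$ ssh server07
$ ssh server09

Any option you later add under a Host block, such as a non-standard port, is picked up the same way without changing how you type the command.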

Configure Additional SSH Security Settings

SSH includes various other options to enhance security and customize its use in your environment. The primary SSH server configuration file is usually stored at /etc/ssh/sshd_config. It contains many entries. Review the comments and best practices carefully.

Here are some standard security configurations you may consider implementing.

  • Set SSH to refuse password-based authentication: PasswordAuthentication no.
  • Set SSH to refuse direct root logins across the network: PermitRootLogin no.
  • Change the default SSH port from 22 to a non-standard port to control connectivity.
  • Set a banner warning message.
  • Configure idle times to reduce hung connections. (Be careful of this setting with configuration managers, as it may be difficult to anticipate how long they will need to be connected.)
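
Combined in /etc/ssh/sshd_config, these settings might look like the following excerpt. The directive names are standard OpenSSH; the port, banner path and timeout values shown here are only illustrative and should match your own policy:

Port 2222
PasswordAuthentication no
PermitRootLogin no
Banner /etc/issue.net
ClientAliveInterval 300
ClientAliveCountMax 2

Remember to restart the SSH service (the unit is named sshd or ssh, depending on the distribution) after editing the file so the changes take effect.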

It's poor security practice to log on to a local or remote Linux system as the root (administrator) user. Most systems force you to log on as a regular user and then use the sudo (super user do) command to elevate your privileges. You might be prompted for your password when using sudo.

The PermitRootLogin no configuration mentioned above is a great way of enforcing this. You'll establish the connection by authenticating with a non-privileged user on the remote SSH target, then elevate your privileges using sudo on that box. Combine this method with key-based authentication to manage SSH security better.

Figure 2: Customize the SSH server configuration file to match your company’s security policy.

Configure the Firewall for SSH

Remember to update your firewall settings. If SSH was preinstalled and running with your Linux distribution, the firewall is probably already open for port 22. If you have to add it, don’t forget to update the firewall rules to permit remote connections.

If you will only manage the server from a single admin workstation or jump box, restrict inbound SSH connections to that device's IP address. That prevents SSH connections from any other network node.
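
As a sketch, on a firewalld-based system you might replace the blanket ssh service rule with a rich rule that accepts SSH only from the management host (the source address below is a placeholder for your admin workstation or jump box):

firewall-cmd --permanent --remove-service=ssh
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.0.2.10" service name="ssh" accept'
firewall-cmd --reload

On ufw-based systems, the rough equivalent is ufw allow from 192.0.2.10 to any port 22 proto tcp.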

Figure 3: Most distributions default to port 22 open.

Audit Log Files for SSH Connections

Audit log files regularly for remote SSH connections to identify any unauthorized connections or repeated failed connection attempts. These may indicate users or malicious actors attempting to access the remote server.
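
As a quick manual check on a single host, you can search the authentication log for failures. The log location and systemd unit name vary by distribution (typically /var/log/auth.log and the ssh unit on Debian-based systems, /var/log/secure and sshd on Red Hat-based ones):

grep 'Failed password' /var/log/auth.log
journalctl -u sshd --since today | grep -i fail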

If you manage several remote Linux servers using SSH, consider centralizing your log files using rsyslog. Doing so makes reviewing SSH connections easier, helping you verify that only authorized connections occur.

Wrap Up

Linux and network administrators rely on SSH for secure, convenient access to remote systems. It’s an essential part of their toolboxes. Password-based authentication to a few remote devices is viable, but it’s not convenient when implementing automation with lots of target servers.

Integrating SSH into your larger CI/CD and orchestration pipelines provides a simple, secure solution for remote connectivity. SSH functions with Linux, macOS, Windows and many network devices (routers, switches, etc.), making it a standard administration utility.

Start today by auditing your current SSH communications, then implement key-based authentication and automate as many configurations as possible. You might just find your environment is more secure and easier to work with.

The post Linux: SSH and Key-Based Authentication appeared first on The New Stack.

]]>
AI Demands More Than Just Technical Skills From Developers https://thenewstack.io/ai-demands-more-than-just-technical-skills-from-developers/ Sun, 01 Sep 2024 17:00:50 +0000 https://thenewstack.io/?p=22757525

What skills do aspiring developers need to acquire? This question has been debated for decades, back to the ’80s and

The post AI Demands More Than Just Technical Skills From Developers appeared first on The New Stack.

]]>

What skills do aspiring developers need to acquire? This question has been debated for decades, back to the ’80s and ’90s, when schools focused their curricula on “hard skills” like advanced knowledge of intricate programming languages. As development environments grew more collaborative, employers emphasized hiring versatile developers with soft skills like teamwork and communication alongside traditional technical chops.

Today, the integration of AI into development environments is reigniting the skills debate yet again. By giving AI a more significant role in the coding process, organizations are placing greater value on hiring “well-rounded” developers who can think, adapt, solve problems, and coax the best solutions from their AI assistants.

Marco Argenti, CIO for Goldman Sachs, recently wrote about the phenomenon in the Harvard Business Review. He argues that engineers should study philosophy to code successfully in the age of AI. Studying philosophy, he argues, helps aspiring developers think clearly and logically about why they’re doing what they’re doing.

Whether developers need to take philosophy classes or not, the reasoning is sound. Generative AI transformed the way we think and work. Unlike in the past, when developers took instructions from a team lead and executed tasks as individual contributors, now they’re outsourcing problem-solving and code generation to AI tools and models. By partnering with GenAI to solve complex problems, developers who were once individual contributors are now becoming team leads in their own right. This new workflow requires developers to elevate their critical-thinking skills and empathy for end-users. No longer can they afford to operate with a superficial understanding of the task at hand. Now, it’s paramount that developers understand the why that is driving their initiative so that they can lead their AI counterparts to the most desirable outcomes.

Understanding the Problem First

In the new world of GenAI, well-rounded developers must fully understand the problem and required outcome before GenAI-assisted problem-solving begins. Their understanding of the problem space must match that of a product manager or end-user. After all, the wrong prompt could result in a response that perpetuates the problem at hand. Give an image generation tool like Dall-E a basic prompt (show me a developer in an office), and follow up with a detailed prompt (show me a developer in an office coding on a laptop in an urban environment with young co-workers). You’ll end up with two completely different pictures.

Key Soft Skills for Developers

What soft skills matter most in the age of AI? Four that stand out are reasoning, curiosity, creativity, and accountability.

Reasoning and Context Matter

One of the most important lessons I learned from a previous boss is that context matters. If you're trying to convince someone to do something, explaining “the why” is the most important part. It's what creates linkage and trust. GenAI doesn't do that on its own. We're at a point now where GenAI produces a good but not great output. A human touch is still needed to inject that last 20% of work to push the chatbot and iterate.

You have to treat your GenAI like an intern — someone who needs coaching and context so that they can ultimately help you get what you need and learn more about the process along the way. That means your job is to provide the reason and context to convince the AI/intern to do things correctly.

Embrace Curiosity and Exploration

When they use GenAI, developers have to probe for more information continually. They should think of themselves as reporters uncovering facts. Is there anything else I missed? After AI creates a first take, probe further in a second version, making the questions more action-oriented. Think of it as having a conversation with the GPT. If you’re creating content, tell the GenAI to pretend it’s an employee, share three questions an employee would have, and then answer them. Then, have the GPT rework the draft with the answers again. Using this approach while embracing the diversity of thought with your unique skill set and problem-solving abilities will be essential to effectively serve a diverse set of customers.

Creativity in Developer Prompts

GenAI does what it’s told. It culls information from available sources and applies it systematically based on the prompts that it is given. The creativity a developer exercises in delivering those prompts can encourage an AI tool to present coding options that the organization may not have anticipated. Like writers who keep their works fresh by varying their syntax, pacing, and tone, developers can issue directives in different ways to elicit “out-of-the-box” responses.

Accountability in the Age of AI

We're on the border of an ethical conundrum, and well-rounded developers will be needed to get us through. Just because developers can get GenAI to do something doesn't mean they should. Developers are now co-creating IP. Who owns the IP? Does the prompt engineer? Does the GenAI tool? If developers write code with a certain tool, do they own that code? In an industry where tool sets are moving so quickly, the answer varies based on which tool you're using and which version; different tools from the same vendor can even have different rules. Intellectual property rights are evolving. It's like the wild, wild west. Reasoning through that and understanding the context of what developers should get their tools to do is an important skill.

Conclusion

For top-performing developers, the increasing integration of GenAI into development workflows does not diminish the importance of hard skills. However, for developers who seek to advance their careers and contributions, up-leveling their soft skills like customer empathy and critical thinking will go a long way in making them well-rounded developers in a post-GenAI landscape.

The advancement of developers' soft skills will not only make them more effective collaborators in the workplace but also reinforce their value to organizations exploring how to leverage GenAI to achieve new levels of productivity and success.

The post AI Demands More Than Just Technical Skills From Developers appeared first on The New Stack.

]]>
Microsoft Builds AutoGen Studio for AI Agent Prototyping https://thenewstack.io/microsoft-builds-autogen-studio-for-ai-agent-prototyping/ Sun, 01 Sep 2024 13:00:10 +0000 https://thenewstack.io/?p=22757541

Research has introduced AutoGen Studio, a new low-code interface designed to revolutionize the way developers prototype AI agents. Built on

The post Microsoft Builds AutoGen Studio for AI Agent Prototyping appeared first on The New Stack.

]]>

Microsoft Research has introduced AutoGen Studio, a new low-code interface designed to revolutionize the way developers prototype AI agents.

Built on top of the open source AutoGen framework, this breakthrough tool aims to simplify the complex process of creating and managing multi-agent workflows, the company said.

Elvis Saravia, a researcher in machine learning and natural language processing at the Distributed AI Research Institute (DAIR.AI), posted about the technology on X (formerly known as Twitter).

“The term ‘agent’ refers to an autonomous piece of software that achieves specific business goals independent of other software in its environment,” said Jason Bloomberg, an analyst at Intellyx. “Just how autonomous they are and what they actually do, however, depends upon whom you ask.”

Empowering Developers with Low-Code Solutions

AutoGen Studio offers a user-friendly approach to AI agent development, allowing developers to rapidly prototype AI agents, enhance agents with specialized skills, compose agents into complex workflows and interact with agents to accomplish various tasks.

“This is a very cool project from Microsoft, which has actually been brewing for a few months now,” said Brad Shimmin, an analyst with Omdia. “Basically, it runs on top of Microsoft’s LLM orchestration framework, AutoGen, and really does speed up the prototyping process for enterprise practitioners looking to build GenAI outcomes — not just agentic, but really any outcome where they may want to exert some control over how an LLM runs.”

The tool provides both a web interface and a Python API, enabling developers to represent LLM-enabled agents using JSON-based specifications. This flexibility caters to a wide range of development preferences and skill levels.

“It’s really a nice, graphical counterpoint to the capabilities you get from other agentic frameworks like LangGraph and CrewAI,” Shimmin said. “For developers building on top of Azure AI in particular, this tool plus framework can help them move from PoC to production without a lot of headaches and with some additional perks like plugging into tools like Microsoft [Azure] Purview to better secure AI data.”

Key Features for Streamlined Development

AutoGen Studio includes several features designed to streamline the development process, such as an intuitive drag-and-drop UI for specifying agent workflows; interactive evaluation and debugging capabilities; and a gallery of reusable agent components.

These features are built upon four core design principles for no-code multi-agent developer tools, though Microsoft has not yet disclosed these principles in detail.

Work In Progress

While AutoGen Studio represents a significant step forward in AI agent development, Microsoft notes that it is a research project that is still under development and may never become a product in its own right. The company includes the following warning: “AutoGen Studio is currently under active development and we are iterating quickly. Kindly consider that we may introduce breaking changes in the releases during the upcoming weeks…”

However, the underlying AutoGen framework has already found applications across various industries, including:

  • Advertising
  • Customer support
  • Cybersecurity
  • Data analytics
  • Education
  • Finance
  • Software engineering

Big Potential

This wide-ranging applicability highlights the potential impact of AutoGen Studio in diverse sectors.

“AI agents can play an important role in organizations’ cloud native strategies, as each agent can run statelessly in containers,” Bloomberg said. “As a result, each agent platform has the ability to scale agents automatically, deploying as many identical agents as necessary to address any situation.”

Moreover, GenAI-based agents are rapidly displacing Robotic Process Automation (RPA) bots — but there’s more to the story, Bloomberg told The New Stack. “Such technologies are gradually supplanting not only RPA but also business process automation, low-code/no-code platforms, rules engines, data integration technologies and more.”

Microsoft encourages developers to use AutoGen Studio for prototyping and demonstration purposes, rather than as a production-ready application. For deployed applications requiring features like authentication and advanced security, developers are advised to build directly on the AutoGen framework.

As AI continues to evolve and reshape various industries, tools like AutoGen Studio are poised to play a crucial role in democratizing AI development and fostering innovation in multi-agent systems.

The post Microsoft Builds AutoGen Studio for AI Agent Prototyping appeared first on The New Stack.

]]>
The Future of LLMs Is in Your Pocket https://thenewstack.io/the-future-of-llms-is-in-your-pocket/ Sat, 31 Aug 2024 17:00:08 +0000 https://thenewstack.io/?p=22757508

Imagine a world where one device acts as the user interface and is connected remotely to a second device that

The post The Future of LLMs Is in Your Pocket appeared first on The New Stack.

]]>

Imagine a world where one device acts as the user interface and is connected remotely to a second device that performs the actual computations. This was common in the 1960s. Teletypes used for inputting commands and outputting results could be found in places like office settings and high school libraries. Code execution, however, was too resource-intensive to be put in every room. Instead, each teletype would connect remotely to a large, time-shared computer among many clients.

The current generative AI architecture is in the teletype era: an app runs on the phone, but it depends on a model that can only be hosted in the cloud. This is a relic of the past. Over the decades, teletypes and mainframes gave way to PCs. Similarly, generative AI will eventually run on consumer-grade hardware — but this transition will happen much more quickly.

And this shift has significant implications for application developers.

How We Got Here

You are probably aware that generative AI models are defined by computationally intensive steps that transform an input (e.g., a prompt) into an output (e.g., an answer). Such models are specified by billions of parameters (aka weights), meaning that producing an output also requires billions of operations that can be parallelized across as many cores as the hardware offers. GPUs have thousands of cores, which are an excellent fit for running generative AI models. Unfortunately, because consumer-grade GPUs have limited memory, they can’t hold models that are 10s of GB in size. As a result, generative AI workloads have shifted to data centers featuring a (costly) network of industrial GPUs where resources are pooled.

We keep hearing that the models will “keep getting better” and that, consequently, they will continue to get bigger. We have a contrarian take: While any step function change in model performance can be revolutionary when it arrives, enabling the current generation of models to run on user devices has equally profound implications — and it is already possible today.

Why This Matters

One question to answer before discussing the feasibility of local models is: Why bother? In short, local models change everything for generative AI developers, and applications that rely on cloud models risk becoming obsolete.

The first reason is that, due to the cost of GPUs, generative AI has broken the near-zero marginal cost model that SaaS has enjoyed. Today, anything bundling generative AI commands a high seat price simply to make the product economically viable. This detachment from underlying value is consequential for many products that can’t price optimally to maximize revenue. In practice, some products are constrained by a pricing floor (e.g., it is impossible to discount 50% to 10x the volume), and some features can’t be launched because the upsell doesn’t pay for the inference cost (e.g., AI characters in video games). With local models, the price is gone as a concern: they are entirely free.

The second reason is that the user experience with remote models could be better: generative AI enables useful new features, but they often come at the expense of a worse experience. Applications that didn’t depend on an internet connection (e.g., photo editors) now require it. Remote inference introduces additional friction, such as latency. Local models remove the dependency on an internet connection.

The third reason has to do with how models handle user data. This plays out in two dimensions. First, serious concerns have been raised about sharing growing amounts of private information with AI systems. Second, most generative AI adopters have been forced to use generic (aka foundation) models because scaling the distribution of personalized models was too challenging. Local models guarantee data privacy and open up the door to as many model variants as there are devices.

Do We Need 1T Parameters?

The idea that generative AI models will run locally might sound surprising. Having grown in size over the years, some models, like SOTA (state-of-the-art) LLMs, have reached 1T+ parameters. These (and possibly larger models under development) will not run on smartphones soon.

However, most generative applications only require models that can already run on consumer hardware. This is the case wherever the bleeding-edge models are already small enough to fit in the device’s memory, as for non-LLM applications such as transcription (e.g., Whisper at ~1.5B) and image generation (e.g., Flux at ~12B). It is less evident for LLMs, as some can run on an iPhone (e.g., Llama-3.1-8B), but their performance is significantly worse than the SOTA.

That’s not the end of the story. While small LLMs know less about the world (i.e., they hallucinate more) and are less reliable at following prompt instructions, they can pass the Turing test (i.e., speak without hiccups). This has been a recent development — in fact, in our opinion, it’s the main advancement seen in the last year, in stark contrast with the lackluster progress in SOTA LLMs. It results from leveraging larger datasets of better quality in training and applying techniques such as quantization, pruning, and knowledge distillation to reduce model size further.

The knowledge and skills gap can now be bridged by fine-tuning — teaching the model how to handle a specific task, which is more challenging than prompting a SOTA LLM. A known method is to use a large LLM as the coach. In a nutshell, if a SOTA LLM is competent at the task, it can be used to produce many successful completion examples, and the small LLM can learn from those. Until recently, this method was not usable in practice because the terms of use for proprietary SOTA models like OpenAI’s GPT-4 explicitly forbade it. The introduction of open-source SOTA models like Llama-3.1-405B without such restrictions solves this.

Finally, a potential concern would be that the replacement for the one-stop-shop 1T-parameter model would be a hundred task-specific 10B-parameter models. In reality, the task-specific models are all essentially identical, which is why a method called LoRA can capture the differences in “adapters” that can be less than 1% the size of the foundation model they modify. It's a win in many dimensions. Among other things, it simplifies fine-tuning (lighter hardware requirements), model distribution to end-users (small size of adapters), and context switching between applications (fast swapping due to size).

The Catalysts Are Here

Small models that can deliver best-in-class capabilities in all contexts (audio, image, and language) arrive at the same time as the necessary ecosystem to run them.

On the hardware side, Apple led the way with its ARM processors. The architecture was prescient, making macOS and iOS devices capable of running generative AI models before they became fashionable. These chips bundle a capable GPU and pack high-bandwidth unified memory, which is often the limiting factor in inference speed.

Apple is not alone, and the shift is coming to the entire hardware lineup. Not to be left behind, laptops with Microsoft’s Copilot+ seal of approval can also run generative models. These machines rely on new chips like Qualcomm’s Snapdragon X Elite, showing that hardware is now being designed to be capable of local inference.

On the software side, while PyTorch has remained king in the cloud, a new series of libraries are well-positioned to leverage consumer-grade hardware better. These include Apple’s MLX and GGML. Native applications, like on-device ChatGPT alternatives, are already using these tools as a backend, and the release of WASM bindings enables any website loaded from a browser to do the same.

There are some wrinkles left to iron out, particularly concerning what developers can expect to find on a given device. Small foundation models are still a few GB large, so they’re not a practical standalone dependency for almost any application, web or native. With the release of Apple Intelligence, however, we expect macOS and iOS to bundle and expose an LLM within the operating system. This will enable developers to ship LoRA adapters in the 10s of MBs, and other operating systems will follow.

While a potential problem for developers could be the inconsistency between the models bundled by each device, convergence is likely to occur. We can’t say for sure how that will happen, but Apple’s decision with DCLM was to open-source both the model weights and the training dataset, which encourages and enables others to train models that behave similarly.

Implications for App Developers

The shift to on-device processing has significant implications for application developers.

First, start with the assumption that LLM inference is free, which removes the cost floor on any generative AI functionality. What new things can you build, and how does this affect your existing products? We predict three scenarios:

  • Where generative AI is a feature of a much larger product, it will blend in more seamlessly with existing SaaS tiers, putting it at the same level as other premium features driving upgrades.
  • Where generative AI is the core value proposition, and the product is priced at “cost plus” (i.e., the cost determines the price point), the products will get cheaper, but this will be more than offset by much larger volumes.
  • Where generative AI is the core value proposition, and the product was priced on value to the user (i.e., well above cost), the impact will be limited to margin improvements.

Second, realize that there is a shift in how applications are developed, especially those that depend on LLMs: “prompt engineering” and “few-shot training” are out, fine-tuning is now front and center. This means that organizations building generative AI applications will require different capabilities. A benefit of SOTA LLMs was that software engineers were detached from the model, which was seen as an API that worked like any other microservice. This eliminated the dependency on internal teams of ML engineers and data scientists, which were resources many organizations didn’t have or certainly didn’t have at the scale needed to introduce generative AI across the board. On the other hand, those profiles are required for many of the workflows that local models demand. While software engineers without ML backgrounds have indeed leveled up their ML skills with the increased focus on AI, this is a higher step up to take. In the short term, the products are more challenging to build because they require differentiated models instead of relying on SOTA foundation LLMs. In the long run, however, the differentiated small models make the resulting product more valuable.

These are positive evolutions, but only those who pay the most attention to the disruptive dynamic will be ahead and reap the benefits.

Opportunities for Infrastructure Innovation

Finally, a shift to local models requires a revised tech stack. Some categories that already existed in the context of cloud-hosted models become even more necessary and might need to expand their offering:

  • Foundation models: Foundation model companies started with a single goal: creating the best SOTA models. Although many have shifted partially or entirely to building models with the best cost-to-performance ratio, targeting consumer-grade hardware has not been a priority. As local models become the primary means of consumption, priorities will shift, but for now there is a lot of whitespace to cover.
  • Observability and guardrails: As developers have shipped AI applications into production, the media has spotlighted their erratic behavior (e.g., hallucinations, toxicity). This has led to a need for tools that provide observability and, in some cases, hard constraints around model behavior. With a proliferation of distributed instances of the models, these challenges are aggravated, and the importance of such tools grows.
  • Synthetic data and fine-tuning: While fine-tuning has been an afterthought for many application developers in the era of SOTA models, it will be front and center when dealing with fewer parameters. We argued that open-source SOTA models make it possible to synthesize fine-tuning datasets, and anyone can set up their own fine-tuning pipelines. Nevertheless, we know the people required to do these things are scarce, so we believe synthetic data and on-demand fine-tuning are areas where demand will grow significantly.

At the same time, the requirements of local models lead us to believe that several new categories will arise:

  • Model CI/CD: One thing we don’t have a good sense for yet is how developers will deliver models (or model adapters) to applications. For example, will models be shipped with native application binaries, or will they be downloaded from some repository when the application loads? This brings up other questions, like how frequently models will be updated and how model versions will be handled. We believe that solutions will emerge that solve these problems.
  • Adapter marketplaces: While a single SOTA LLM can serve all applications, we established that making small models work across tasks requires different adapters. Many applications will undoubtedly rely on independently developed adapters, but certain adapters can also have a purpose in many applications, e.g., summarization and rephrasing. Only some developers will want to manage such standard adapters’ development and delivery lifecycle independently.
  • Federated execution: While not an entirely new category, running models on consumer hardware is a new paradigm for those thinking about federated ML, that is, distributed training and inference. The focus here is less on massive fleets of devices connected over the internet and more on small clusters of devices in a local network, e.g., in the same office or home. We’re already seeing innovation here that enables more compute-intensive workloads like training or inference on medium-sized models by distributing the job across two or three devices.

Looking Forward

There's a future where AI leaves the cloud and lands on user devices. The ingredients to make this possible are already in place, and the shift will lead to better products at a lower cost. In this new paradigm, organizations will need to update go-to-market strategies, organizational skills, and developer toolkits. While that evolution will have meaningful consequences, we don't believe it to be the end of the story.

Today, AI remains highly centralized up and down the supply chain. SOTA GPUs are designed by only one company, which depends on a single foundry for manufacturing. Hyperscalers hosting that hardware can be counted on one hand, as can the LLM providers that developers settle for when looking for SOTA models. The innovation potential is much greater in a world where models run on commodity consumer hardware. That is something to be excited about.

The post The Future of LLMs Is in Your Pocket appeared first on The New Stack.

]]>
Introduction To Plotly Dash, the Most Popular AI Data Tool https://thenewstack.io/introduction-to-plotly-dash-the-most-popular-ai-data-tool/ Sat, 31 Aug 2024 13:00:51 +0000 https://thenewstack.io/?p=22757385

The go-to language for data analysis, and to some extent AI development, is Python. Plotly Dash is a presentation graphing

The post Introduction To Plotly Dash, the Most Popular AI Data Tool appeared first on The New Stack.

]]>

The go-to language for data analysis, and to some extent AI development, is Python. Plotly Dash is a presentation graphing tool for supporting data apps. Or in their words, “Dash is the original low-code framework for rapidly building data apps in Python.” But as usual, low code still requires a reasonable grasp of programming.

Earlier this month, Plotly Dash was named the number one most popular tool in Databricks’ State of Data + AI report — even above Langchain! So it’s clearly a trendy tool in the AI engineering ecosystem. “For more than 2 years, Dash has held its position as No. 1, which speaks to the growing pressure on data scientists to develop production-grade data and AI applications,” wrote Databricks.

In this post, I’ll install and play around with Dash, and maybe in a future post, we can build something with it. I’ve used Jupyter notebooks before, but here we’ll just use a classic web server to host the outcome.

So in my trusty Warp shell, we’ll install the two requirements. As I’m not a regular Python guy, I didn’t have the recommended Python version in my .zshrc shell configuration file, so I added that:

# python
export PATH="$HOME/Library/Python/3.9/bin:$PATH"


Then I used pip to install the dependent modules:

pip install dash 
pip install pandas


Dash effectively mirrors HTML elements as its own components, and it has some specially written interactive graphs and tables too.

To test that things are working, we’ll just try the “minimal” app.py, and run it.

from dash import Dash, html, dcc, callback, Output, Input 
import plotly.express as px 
import pandas as pd 

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/gapminder_unfiltered.csv') 

app = Dash() 

app.layout = [ 
    html.H1(children='Title of Dash App', style={'textAlign':'center'}), 
    dcc.Dropdown(df.country.unique(), 'Canada', id='dropdown-selection'), 
    dcc.Graph(id='graph-content') 
] 

@callback( 
    Output(component_id='graph-content', component_property='figure'), 
    Input(component_id='dropdown-selection', component_property='value') 
) 

def update_graph(value): 
    dff = df[df.country==value] 
    return px.line(dff, x='year', y='pop') 

if __name__ == '__main__': 
    app.run(debug=True)


We can see that a layout is established, and a couple of callbacks. So we’ll have to figure out what they are doing later. I’m guessing from the update_graph method that this is a population growth graph, even though the CSV link name gives us no clue.

After making the app.py file and running it, eventually, I get a response:

So looking at the local site at the local address stated, I have:

Note that “Canada” is the set choice in the dropdown, and that the graph changes immediately if I select another nation. So this gives us a bit of a clue as to what the callbacks are doing.

As expected, if I look at the CSV file contents, it has a bunch of data points:

country,continent,year,lifeExp,pop,gdpPercap 
Afghanistan,Asia,1952,28.801,8425333,779.4453145 
Afghanistan,Asia,1957,30.332,9240934,820.8530296 
... 
Canada,Americas,1950,68.28,14011422,10581.26552 
Canada,Americas,1951,68.55,14330675,10932.46678 
Canada,Americas,1952,68.75,14785584,11367.16112 
...


This means we can see what the x and y-axis labels refer to. We can also see the other data we could choose to graph.

Let's analyze the code until we have figured the rest out. The pandas read_csv function returns a DataFrame (hence “df”), which is the structure we work with from here on. You can read from an Excel data sheet directly too.

The dcc module (Dash Core Components) gives us both the dropdown and the graph. Altogether, the layout is just a list of components: in our case a title, a dropdown and the graph.

At this point, it is interesting to note that neither the Graph nor the Dropdown components are ever referred to directly again. Indeed, the Graph does not even take in the DataFrame. Clearly, there is some studied decoupling going on.

Now, we use the IDs “dropdown-selection” and “graph-content”.

... 
@callback( 
    Output(component_id='graph-content', component_property='figure'), 
    Input(component_id='dropdown-selection', component_property='value') 
) 
...


We have an Output callback that refers to the “graph-content” ID first defined for the Graph component and uses the “figure” property of the component. Here, I think “figure” just means the diagram to display. The Input refers to the Dropdown component via the “dropdown-selection” ID, and reads the “value” property.

... 
def update_graph(value): 
    dff = df[df.country==value]
    return px.line(dff, x='year', y='pop')
...


As there is only one method mentioned, update_graph, and we don't use that in the code, it is clearly used by the graph component to, er, update the graph. This just takes the country value from the dropdown. In other words, I could replace dff = df[df.country==value] with dff = df[df.country=='Canada'] to see Canada's stats from the DataFrame. You can go ahead and change the code with the live page — it hot reloads.

So when we change country, the graph is rebuilt: the dropdown's new value is passed to update_graph, which filters the DataFrame to that country's rows, and px.line draws a line from point to point.

Let’s experiment. If we have understood this correctly, we should be able to add, say, a table using the same data. Now, assuming we get hold of the table constructor, what would we need?

  • We will need the import line.
  • Add it as a line to the layout.

We won’t need anything else if the table doesn’t interact — the data table is already a fully interactive component.

Next, I’ll add the table import to the end of existing imports:

from dash import Dash, html, dcc, callback, Output, Input, dash_table


I’ll also add the table constructor to the existing layout. We know it is a big table, so I’ll use a page size:

app.layout = [ 
    html.H1(children='Title of Dash App', style={'textAlign':'center'}), 
    dcc.Dropdown(df.country.unique(), 'Canada', id='dropdown-selection'), 
    dcc.Graph(id='graph-content'), 
    dash_table.DataTable(data=df.to_dict('records'), page_size=10) 
]


That already works, but we need to limit the columns to Country, Population and Year:

app.layout = [
    html.H1(children='Population by year', style={'textAlign':'center'}),
    dcc.Dropdown(df.country.unique(), 'Canada', id='dropdown-selection'),
    dcc.Graph(id='graph-content'),
    dash_table.DataTable(data=df.to_dict('records'),
       columns=[
          {'name': 'Country', 'id': 'country', 'type': 'text'},
          {'name': 'Population', 'id': 'pop', 'type': 'numeric'},
          {'name': 'Year', 'id': 'year', 'type': 'numeric'}
       ],
       page_size=5,
       style_cell={'textAlign': 'left'}
    )
]


Notice that I added left alignment, a smaller page size and a nicer title. This gives us:

The Verdict

Dash was pretty straightforward to work with, even though my Python is very much at a basic level. I was looking at controlling which data flows into the DataTable, and that was a bit trickier.

It doesn't feel entirely standardized, however, so you will need to read the notes for every component you want to use. But I recommend trying it the next time you want to show off some data.

The post Introduction To Plotly Dash, the Most Popular AI Data Tool appeared first on The New Stack.

]]>
Is React Now a Full Stack Framework? And Other Dev News https://thenewstack.io/is-react-now-a-full-stack-framework-and-other-dev-news/ Sat, 31 Aug 2024 12:00:52 +0000 https://thenewstack.io/?p=22757471

Maybe we’re not so much living in a post-React world as we are living with a new React paradigm: React

The post Is React Now a Full Stack Framework? And Other Dev News appeared first on The New Stack.

]]>

Maybe we’re not so much living in a post-React world as we are living with a new React paradigm: React is becoming a full stack framework with the addition of React Server Components and Server Actions, software engineer and freelance developer Robin Wieruch argued recently.

“This marks just the beginning of full-stack development with React,” Wieruch wrote. “As developers start to access databases directly through Server Components and Server Actions, there will be a learning curve ahead to tame the complexities beyond simple CRUD applications.”

This will allow frontend developers to quickly master implementing backend architectures with layers, design patterns and best practices, Wieruch added.

Claude Can Now Generate Artifacts

Artifacts give Claude AI users a dedicated window to see, iterate and build on any work created in Claude. For developers, this provides a separate window to see code or to make architecture diagrams from codebases, according to the Claude team.

“Artifacts turn conversations with Claude into a more creative and collaborative experience,” the Claude blog stated. “With Artifacts, you have a dedicated window to instantly see, iterate, and build on the work you create with Claude.”

A screenshot of Claude showing the Artifacts interface with Python code.

Screenshot via Claude.ai

Artifacts are now available for all Claude.ai users across the platform's Free, Pro, and Team plans. Artifacts can also be created and viewed in Claude's iOS and Android apps. Artifacts can include:
  • Code snippets
  • Flowcharts
  • SVG graphics
  • Websites in single page React or HTML
  • Interactive dashboards

The Anthropic post includes a video that describes how this feature was created and explores other use cases outside development, but for a deeper read on how it can be used to build web applications, check out this post by Pragmatic Engineer, which explores in-depth the capabilities and creation of Artifacts.

“While the feature is small, it feels like it could [be] a leap in using LLMs for collaborative work — as every Artifact can be shared, used by others, and remixed,” explained Gergely Orosz, who writes Pragmatic Engineer.

Catching More Bugs With TypeScript

The release candidate for TypeScript 5.6 is out, and Microsoft TypeScript Product Manager Daniel Rosenwasser offered a roundup of what’s new, including disallowed nullish and truthy checks for catching more bugs. Rosenwasser lists several examples of code that do not do what the author intended, but are still valid JavaScript code. Previously, TypeScript would just accept these examples, he wrote. No more.

“But with a little bit of experimentation, we found that many many bugs could be caught from flagging down suspicious examples like above,” he wrote. “In TypeScript 5.6, the compiler now errors when it can syntactically determine a truthy or nullish check will always evaluate in a specific way.”

“But with a little bit of experimentation, we found that many many bugs could be caught from flagging down suspicious examples like above.”
— Daniel Rosenwasser, Microsoft TypeScript Product Manager

TypeScript 5.6 also introduces a new type called IteratorObject and the post provides code examples of how it’s defined.

Rosenwasser writes that there is an AsyncIteratorObject type for parity.

“AsyncIterator does not yet exist as a runtime value in JavaScript that brings the same methods for AsyncIterables, but it is an active proposal and this new type prepares for it,” he explained.

Project IDX Combines Code Editor With Languages and Tools

Project IDX is a browser-based development experience built on Google Cloud Workstations and powered by Codey, a foundational AI model trained on code and built on PaLM 2. Its goal is to make it easier to build, manage and deploy full-stack web and multiplatform applications, with popular frameworks and languages.

Project IDX seeks to unify the two main parts of a development environment: the code editor and the languages and tools required to build and run the code, the team noted in a recent reflection on developing Project IDX over the past year.

“At the heart of Project IDX is our conviction that you should be able to develop from anywhere, on any device, with the full fidelity of local development,” The Project IDX team wrote in introducing it last year. “Every Project IDX workspace has the full capabilities of a Linux-based VM, paired with the universal access that comes with being hosted in the cloud, in a data center near you.”

This yearly update noted the Project IDX team had three areas of focus:

The team has integrated generative AI features, provided by Gemini, into the code.

“These provide context-aware code suggestions, unit test generation, comment writing, programming language conversions, and technical question answering — all without ever leaving your workflow,” the post noted. “On the development environment side, we’ve built a robust system based on Nix, allowing for effortless environment configurations. With minimal setup, you can customize your Project IDX workspace with the precise languages, tools, and extensions you need to hit the ground running.”

Nix is a functional package manager that assigns unique identifiers to each dependency, which ultimately means an environment can contain multiple versions of the same dependency, seamlessly, the post added.

The post Is React Now a Full Stack Framework? And Other Dev News appeared first on The New Stack.

]]>
CTO to CTPO: Navigating the Dual Role in Tech Leadership https://thenewstack.io/cto-to-ctpo-navigating-the-dual-role-in-tech-leadership/ Fri, 30 Aug 2024 17:00:12 +0000 https://thenewstack.io/?p=22757459

Since joining as CTO of Exclaimer a little over a year ago, my role has evolved into that of a

The post CTO to CTPO: Navigating the Dual Role in Tech Leadership appeared first on The New Stack.

]]>

Since joining as CTO of Exclaimer a little over a year ago, my role has evolved into that of a CTPO. While this isn’t unusual, it’s a topic that’s sparked much debate in the tech world about whether it’s better to merge product and technology leadership or keep them separate. They’re both full-time jobs in their own right, after all. Having made the transition firsthand, I can passionately argue for and against combining these roles.

As a starting point, I want to stress that technology and product are both challenging areas for any business. I’d argue they’re the two least understood departments within any organization. They’re also massive investments in a company that is so highly interdependent, so when there’s misalignment or friction between them, they both become ineffectual. That’s why the prospect of merging the Chief Technology Officer (CTO) and Chief Product Officer (CPO) roles into a single Chief Product and Technology Officer (CPTO) is such an attractive proposition for some organizations.

Based on my own experience, here’s my take on the overlapping responsibilities of these individual roles and the benefits and challenges of combining or separating them.

The CTO: A Business Leader With a Technical Edge

Wearing my CTO hat, my primary focus is on supporting the organization with scalable, reliable, and efficient technology. This involves more than just coding and systems but making strategic decisions that align with our business goals.

At Exclaimer, handling up to 24 billion transactions annually on our core platform brings real technical scalability and security challenges. As CTO, it’s my responsibility to ensure that the technology we use supports the business’s strategic goals, balancing innovation with operational efficiency.

The CPO: Bridging Product Vision and Customer Value

As CPO, especially in a product-led organization like Exclaimer, my role involves constantly asking: “What new value can we deliver to our customers?”, “Why would our customers choose our product over a competitor’s?” and “How can we innovate to stay ahead?”

This requires balancing immediate product needs with long-term visionary planning, looking at what we’re building today and what we want to achieve tomorrow.

Building value for the customer also means understanding the market. This is a moving target, so work always continues. The tech world moves quickly, and we must understand how to not just adapt to change but drive it to deliver value to our customers.

But none of this matters if it doesn’t drive business growth. So, it’s vital to link product development back to the business’s commercial performance. We need to understand how the work we do translates into ROI and that we’re building products suitable for the organization’s operational capabilities and needs.

The Case for Distinct Roles

The need for distinct CPO and CTO roles depends greatly on organizational maturity. If there's a lot of work to do for technical stability, developer productivity, or data capabilities, you'll likely need a dedicated CTO. Similarly, if the path for the product is unknown and product-market fit (PMF) is elusive, a full-time CPO role is required. If there's maturity in the organization for technology and product, it's possible to combine the roles.

Exclaimer is a mature company with a strong PMF, which is why we've combined the roles. But there are days when it's more obvious that I'm wearing two hats. I could be discussing the next five years of Exclaimer's product roadmap in one meeting and then pivoting to optimize hosting costs in the next. This duality can be mentally demanding but also incredibly rewarding when approached correctly.

The Benefit of a Single, Unified Leader

Not all organizations need two separate leaders. For smaller companies or those in the early stages of growth, having a single point of contact simplifies decision-making and aligning priorities. A competent CPTO can streamline processes, reduce the risk of misalignment, and offer a clear vision for both product and technology initiatives. This approach can also be cost-effective, as executive roles come with high salaries and significant demands.

Combining these roles simplifies the organizational structure, providing a single point of contact for research and development. This works well in environments where product and technology are closely integrated and both functions are mature.

In my role, most of my day-to-day activities are focused on the product. I’m very conscious that I don’t have a counterpart to challenge my thinking, so I spend a lot of time with senior business stakeholders to ensure the debates and discussions occur. I also encourage this in my leadership team to ensure that technology and product leaders are rigorous in their thinking and decision-making.

Striking the Right Balance

Ultimately, deciding to have one or two roles for product and technology depends on a company’s specific needs, maturity, and strategic priorities. For some, clarity and focus come from having both a CPO and a CTO. For others, the simplicity and unified vision that comes from a single leader makes more sense.

In my role, I’ve seen firsthand how combining product and technology leadership can drive innovation and efficiency. However, I would caution that it’s important to remain vigilant about the potential downsides, primarily the risk of overloading one person with too many responsibilities. As a business grows and evolves, it’s important to constantly monitor and assess the leadership structure to ensure those in charge are equipped to meet their strategic objectives. Keeping a watchful eye on this becomes critical for business when you combine roles.

Ultimately, whether the organization opts for one role or two, the key takeaway is that product and technology efforts must be aligned with the broader business strategy. Only then can they effectively drive growth and deliver exceptional value to customers.

The post CTO to CTPO: Navigating the Dual Role in Tech Leadership appeared first on The New Stack.

]]>
Implementing IAM in NestJS: The Essential Guide https://thenewstack.io/implementing-iam-in-nestjs-the-essential-guide/ Fri, 30 Aug 2024 13:26:42 +0000 https://thenewstack.io/?p=22757402

Identity and access management (IAM) is an essential component of application security. It helps ensure that the right individuals can

The post Implementing IAM in NestJS: The Essential Guide appeared first on The New Stack.

]]>

Identity and access management (IAM) is an essential component of application security. It helps ensure that the right individuals can access the right technology resources, like emails, databases, data and applications, while keeping unauthorized users out.

NestJS is a popular Node.js framework for building scalable and efficient server-side applications. Implementing IAM in NestJS can greatly improve security while enhancing your user experience. In this guide, I will explore how to implement IAM in a NestJS application from start to finish.

What Is IAM?

IAM is a framework of technologies and policies that helps manage user identities and control access to user resources. It includes authentication, authorization, user provisioning, role-based access control (RBAC) and audit logging. With IAM, you can:

  • Ensure secure authentication mechanisms.
  • Implement appropriate authorization rules.
  • Maintain user roles and permissions.
  • Monitor and audit access to resources.

OK, I Get IAM … but What Is NestJS?

NestJS is an extensive Node.js framework that helps you build server-side applications. NestJS leverages TypeScript and uses a modular architecture inspired by Angular, making it a strong choice for scalable applications and providing a solid foundation for implementing IAM.

Implement JWT Authentication in NestJS

Authentication is the process of verifying a user’s identity using authentication strategies including JSON Web Tokens (JWT) and OAuth2. Follow these steps to set up JWT authentication in a NestJS application.

First, install the necessary dependencies:

npm install @nestjs/jwt @nestjs/passport passport-jwt


Next, create a module for authentication. This module will handle user login, token generation and token validation.
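
A minimal sketch of such a module, assuming a separate UsersModule for user lookups and a JWT secret supplied through an environment variable; the file names and the one-hour token lifetime are illustrative:

import { Module } from '@nestjs/common';
import { JwtModule } from '@nestjs/jwt';
import { PassportModule } from '@nestjs/passport';
import { UsersModule } from '../users/users.module';
import { AuthService } from './auth.service';
import { AuthController } from './auth.controller';
import { JwtStrategy } from './jwt.strategy';

@Module({
  imports: [
    UsersModule,
    PassportModule,
    JwtModule.register({
      // Load the signing secret from configuration; never hard-code it.
      secret: process.env.JWT_SECRET ?? 'dev-only-secret',
      signOptions: { expiresIn: '1h' },
    }),
  ],
  providers: [AuthService, JwtStrategy],
  controllers: [AuthController],
})
export class AuthModule {}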

Create the AuthService to handle authentication logic:
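
A minimal sketch of the service, assuming a UsersService that exposes findOne(username) and stores bcrypt password hashes (bcrypt is an extra dependency: npm install bcrypt):

import { Injectable, UnauthorizedException } from '@nestjs/common';
import { JwtService } from '@nestjs/jwt';
import * as bcrypt from 'bcrypt';
import { UsersService } from '../users/users.service';

@Injectable()
export class AuthService {
  constructor(
    private readonly usersService: UsersService,
    private readonly jwtService: JwtService,
  ) {}

  // Verify the credentials and return a signed JWT on success.
  async login(username: string, password: string) {
    const user = await this.usersService.findOne(username);
    if (!user || !(await bcrypt.compare(password, user.passwordHash))) {
      throw new UnauthorizedException('Invalid credentials');
    }
    const payload = { sub: user.id, username: user.username, roles: user.roles };
    return { access_token: await this.jwtService.signAsync(payload) };
  }
}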

Next, define the JwtStrategy to handle token validation:
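
A sketch of the strategy, reading the token from the Authorization: Bearer header; whatever validate() returns is attached to the request as request.user:

import { Injectable } from '@nestjs/common';
import { PassportStrategy } from '@nestjs/passport';
import { ExtractJwt, Strategy } from 'passport-jwt';

@Injectable()
export class JwtStrategy extends PassportStrategy(Strategy) {
  constructor() {
    super({
      jwtFromRequest: ExtractJwt.fromAuthHeaderAsBearerToken(),
      ignoreExpiration: false,
      // Must match the secret used to sign tokens in the AuthModule.
      secretOrKey: process.env.JWT_SECRET ?? 'dev-only-secret',
    });
  }

  // Runs only after the token signature and expiration have been verified.
  async validate(payload: { sub: number; username: string; roles: string[] }) {
    return { userId: payload.sub, username: payload.username, roles: payload.roles };
  }
}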

Finally, create the AuthController for user login:
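
A sketch of the controller, exposing a single POST /auth/login endpoint that delegates to the service above:

import { Body, Controller, Post } from '@nestjs/common';
import { AuthService } from './auth.service';
import { LoginDto } from './login.dto';

@Controller('auth')
export class AuthController {
  constructor(private readonly authService: AuthService) {}

  @Post('login')
  login(@Body() loginDto: LoginDto) {
    return this.authService.login(loginDto.username, loginDto.password);
  }
}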

The LoginDto defines the expected request body for the login endpoint:
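
A bare-bones version of the DTO; in a real application you would typically add class-validator decorators and a global ValidationPipe to reject malformed requests:

export class LoginDto {
  username: string;
  password: string;
}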

Now you have a basic JWT authentication system in place. Users can log in and receive a JWT token, which they can use to access protected routes.

Implement RBAC Authorization in NestJS

Authorization is the process of determining whether a user has permission to access certain resources. RBAC is a common approach to authorization in NestJS.

To implement RBAC, first, create a RolesGuard that checks if a user has the appropriate role to access a resource:
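
A sketch of such a guard, assuming the ROLES_KEY metadata constant defined with the decorator in the next step and a request.user populated by the JWT strategy above:

import { CanActivate, ExecutionContext, Injectable } from '@nestjs/common';
import { Reflector } from '@nestjs/core';
import { ROLES_KEY } from './roles.decorator';

@Injectable()
export class RolesGuard implements CanActivate {
  constructor(private readonly reflector: Reflector) {}

  canActivate(context: ExecutionContext): boolean {
    // Roles required by the handler or, failing that, by its controller.
    const requiredRoles = this.reflector.getAllAndOverride<string[]>(ROLES_KEY, [
      context.getHandler(),
      context.getClass(),
    ]);
    if (!requiredRoles || requiredRoles.length === 0) {
      return true; // No roles required, so allow access.
    }
    const { user } = context.switchToHttp().getRequest();
    return requiredRoles.some((role) => user?.roles?.includes(role));
  }
}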

Define a custom decorator to specify required roles:
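
One way to write the decorator, storing the required roles as route metadata:

import { SetMetadata } from '@nestjs/common';

export const ROLES_KEY = 'roles';
export const Roles = (...roles: string[]) => SetMetadata(ROLES_KEY, roles);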

With these components, you can create a protected route that requires specific roles:
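
For example, a route that requires a valid JWT and the admin role might look like this (the controller and route names are illustrative):

import { Controller, Get, UseGuards } from '@nestjs/common';
import { AuthGuard } from '@nestjs/passport';
import { Roles } from './roles.decorator';
import { RolesGuard } from './roles.guard';

@Controller('admin')
export class AdminController {
  @Get('reports')
  @UseGuards(AuthGuard('jwt'), RolesGuard) // Authenticate first, then check roles.
  @Roles('admin')
  getReports() {
    return { message: 'Visible to admins only' };
  }
}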

Enable User Provisioning and Audit Logging

Beyond authentication and authorization, user provisioning and audit logging are crucial components of IAM.

Set Up User Provisioning

User provisioning involves creating, updating and deleting user accounts. You can implement a user service to manage these operations:
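
A sketch of such a service using an in-memory array for brevity; a real implementation would back this with a database and hash passwords before storing them:

import { Injectable, NotFoundException } from '@nestjs/common';

export interface User {
  id: number;
  username: string;
  passwordHash: string;
  roles: string[];
}

@Injectable()
export class UsersService {
  private users: User[] = [];
  private nextId = 1;

  create(data: Omit<User, 'id'>): User {
    const user = { id: this.nextId++, ...data };
    this.users.push(user);
    return user;
  }

  findOne(username: string): User | undefined {
    return this.users.find((u) => u.username === username);
  }

  update(id: number, changes: Partial<Omit<User, 'id'>>): User {
    const user = this.users.find((u) => u.id === id);
    if (!user) throw new NotFoundException(`User ${id} not found`);
    Object.assign(user, changes);
    return user;
  }

  remove(id: number): void {
    this.users = this.users.filter((u) => u.id !== id);
  }
}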

Implement Audit Logging

Audit logging helps track user activities, providing insights into who accessed what and when.

Middleware in NestJS provides a centralized way to apply logic to incoming requests before they reach controllers, making it ideal for logging, authentication checks, rate limiting, etc. By placing audit logging in a middleware, you can capture and record relevant information consistently for all or specific endpoints without duplicating logic across controllers.

Here’s an example of how you might implement audit logging as middleware in a NestJS application:

Create Middleware for Audit Logging

Define a middleware that logs relevant information for each request, such as the HTTP method, URL, user identity (if authenticated) and timestamp.
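
A sketch of such a middleware for an Express-based NestJS application; it logs the method, URL, status code and the user (if a guard attached one to the request) once the response finishes:

import { Injectable, Logger, NestMiddleware } from '@nestjs/common';
import { NextFunction, Request, Response } from 'express';

@Injectable()
export class AuditLoggerMiddleware implements NestMiddleware {
  private readonly logger = new Logger('Audit');

  use(req: Request, res: Response, next: NextFunction) {
    const startedAt = new Date().toISOString();

    // By the time the response finishes, guards have run, so request.user
    // (set by the JWT strategy) is available for authenticated requests.
    res.on('finish', () => {
      const user = (req as any).user?.username ?? 'anonymous';
      this.logger.log(
        `${startedAt} ${user} ${req.method} ${req.originalUrl} -> ${res.statusCode}`,
      );
    });

    next();
  }
}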

Apply Middleware to the Module

To ensure that the middleware runs for specific routes or globally, register it in the corresponding module(s).

Apply Middleware Globally

To apply the middleware globally, add it to the root module’s configure method:
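
For instance, with the module metadata omitted for brevity:

import { MiddlewareConsumer, Module, NestModule } from '@nestjs/common';
import { AuditLoggerMiddleware } from './audit-logger.middleware';

@Module({
  // imports, controllers and providers omitted
})
export class AppModule implements NestModule {
  configure(consumer: MiddlewareConsumer) {
    consumer.apply(AuditLoggerMiddleware).forRoutes('*'); // every route
  }
}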

Apply Middleware to Specific Routes

If you want to apply the middleware only to specific routes, you can specify the routes to which it should apply:
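
For example, to audit only the login endpoint and the user-management routes (the paths shown are illustrative):

import { MiddlewareConsumer, Module, NestModule, RequestMethod } from '@nestjs/common';
import { AuditLoggerMiddleware } from './audit-logger.middleware';

@Module({
  // imports, controllers and providers omitted
})
export class AppModule implements NestModule {
  configure(consumer: MiddlewareConsumer) {
    consumer
      .apply(AuditLoggerMiddleware)
      .forRoutes(
        { path: 'auth/login', method: RequestMethod.POST },
        { path: 'users', method: RequestMethod.ALL },
      );
  }
}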

Conclusion

Implementing IAM in a NestJS application involves several key components, including authentication, authorization, user provisioning and audit logging.

This article provided a comprehensive guide with practical examples to help you implement IAM in NestJS. With these components in place, your application will be more secure and better equipped to manage user identities and access to resources.

Are you looking to scale your team with skilled NodeJS specialists like Chesvic? Our guide How to Hire a NodeJS Developer: Finding the Perfect Fit can help you source the right skills for your organization.

The post Implementing IAM in NestJS: The Essential Guide appeared first on The New Stack.

]]>
3 Lessons in Accessible Development From an Expert Tester https://thenewstack.io/3-lessons-in-accessible-development-from-an-expert-tester/ Thu, 29 Aug 2024 19:00:12 +0000 https://thenewstack.io/?p=22757376

Computer engineer Suleyman Gokyigit has a requirement for his development team that’s a bit unusual: Programmers must talk with people

The post 3 Lessons in Accessible Development From an Expert Tester appeared first on The New Stack.

]]>

Computer engineer Suleyman Gokyigit has a requirement for his development team that’s a bit unusual: Programmers must talk with people with accessibility issues.

For Gokyigit, accessibility isn’t just a professional issue. It is a personal one as well — he’s been blind since he was two years old.

In addition to being the chief information officer for FIRE, a First Amendment advocacy group, he works as a testing expert with the accessibility testing firm Applause. Gokyigit recently spoke with The New Stack to share what he’s learned about developing for accessibility as a technologist and as a user.

“As somebody with blindness, I rely so much on technology,” he said. “Everything that I do is on the computer and using websites.”

Gokyigit’s Personal Experience

That wasn’t always possible without a special assistive device or additional software. Gokyigit began learning about computers as a child in the late 1980s, but computers were still relatively primitive and only some software could speak.

In the ’90s, things started to change, just as Gokyigit was pursuing his passion for computer science, which led to a graduate degree from Caltech in Pasadena. At first, even operating systems were not accessible until years after their first release, he said.

“That gap became less and less over time, and now we have a pretty reasonable expectation that most things are accessible upon release,” he said. “There’s obviously certain exceptions, like things that are very graphical in nature, but your productivity applications [and] most of the day-to-day stuff is usable right out of the box.”

He credits Apple and its release of the iPhone 3G back in 2008 with being the first company to design accessibility into an off-the-shelf product, making it the first mainstream device that didn’t require an extra piece of hardware or software such as the Job Access with Speech (JAWS) screen reader to work, he said.

“That was a huge thing, because prior to that point, even when things were accessible and you could make them accessible quickly, it still had a significant cost associated to it, even JAWS,” he said. “All those things cost money, and you either had to figure out a way to afford it yourself, or you had to go through agencies.”

Even with the iPhone’s accessibility features, the majority of applications were inaccessible, he added. Apple put in all the necessary design elements and a framework to design accessible apps, but developers often didn’t use them due to a lack of awareness.

1. Accessibility-First Leads to Better User Experience

Apple’s approach to building accessibility into the product from inception offers a key lesson to technologists. Waiting until after the first release makes accessibility harder to address, and it tends to get deprioritized in favor of bugs and feature demands, Gokyigit said. It also can lead to rework and higher costs.

“You’d be surprised how many websites don’t have, for example, labeled images, alt text on images.”
— Suleyman Gokyigit, accessibility expert

“This really needs to come down from management and from the top, and it needs to be part of the whole design process,” he said.

Designing and building for accessibility also makes for a better experience for all users, said Gokyigit. He pointed to Flash, which was not accessible and created an unpleasant user experience.

“Especially with UI design, … it might look attractive when you first take a look at it, but the actual experience of using it ends up becoming more and more complicated,” said Gokyigit, who now programs in Python, Rust and Lua. “It’s the same thing with just software design in general.”

When developers are designing software, they have to think about the user experience. Lose touch with the user experience, and the product will not be received as well, he said.

“Accessibility just puts a stronger framework around what is considered to be good design. And I think people like that. It makes things less bloated, usually more efficient, faster,” he said.

2. Incomplete Accessibility Is Useless

Accessibility isn’t really something developers can roll out in stages, either, he cautioned. He’s seen that in the gaming industry, where accessibility features are supposed to be available yet often just don’t work.

“As a blind person, I’m not going to play it if it’s only 50% accessible.”
— Gokyigit

“When you’re talking about accessibility, if you go only part way, and you make some things work, but not other things, it pretty much makes the whole effort wasted,” he said. “As a blind person, I’m not going to play it if it’s only 50% accessible. That just doesn’t make any sense. So go all the way and do it correctly.”

It’s also important for frontend developers to follow web standards, such as the Web Content Accessibility Guidelines (WCAG), he added.

“You’d be surprised how many websites don’t have, for example, labeled images, alt text on images,” Gokyigit said. “It just says graphic or unlabeled button. I mean, it takes two seconds to write an alt description or alt text for a button, and they don’t.”

3. The One Thing Development Teams Need to Add

Gokyigit offered his final lesson: Let developers learn about accessibility from the people who most need it.

Talking with someone who has a disability is something any company can arrange, even if no one on staff has accessibility challenges, he added.

Software development teams can start by asking around, he said, since most people know someone with a disability. If that’s not an option, there are also companies like Applause that offer professional testing by people with disabilities, he said. Another option might be to contact organizations that support people with specific disabilities.

He also suggested reaching out to other development teams who have made technology accessibility a priority.

It’s important for programmers to meet with someone navigating a disability, he said, because it helps them understand that person’s challenges, and it offers an unexpected benefit to the developers themselves.

“When they see the impact they’re having, it’s really a motivating thing. It’s a feel-good thing, and they should feel good.”

The post 3 Lessons in Accessible Development From an Expert Tester appeared first on The New Stack.

]]>
Deploy on Friday? Moratorium Doesn’t Achieve Admirable Goal https://thenewstack.io/deploy-on-friday-moratorium-doesnt-achieve-admirable-goal/ Thu, 29 Aug 2024 18:00:38 +0000 https://thenewstack.io/?p=22757237

Avoiding Friday deployments is driven by a goal that, while admirable, is better achieved in other ways. Yet, organizations are

The post Deploy on Friday? Moratorium Doesn’t Achieve Admirable Goal appeared first on The New Stack.

]]>

Avoiding Friday deployments is driven by a goal that, while admirable, is better achieved in other ways. Yet, organizations are more likely to deploy on Friday than on a Monday, which shows the industry doesn’t buy the myth that you shouldn’t deploy on Friday.

Few Practices Are Truly Best Practices

While you used to hear about best practices all the time, you’ll have noticed that industry experts have become increasingly cautious about the term. Best practices do exist, such as using version control instead of a network share to store your source code. Many other practices turn out to be simply good, rather than best.

The idea of best practices is that they are table stakes for software delivery, and working without them can be harmful. Good practices, on the other hand, are options. You don’t have to adopt every good practice, as you can cover risk areas with a carefully selected option from the menu of all tried-and-tested ways to solve a problem.

If your end users are reporting escaped bugs, you can apply one or more testing and monitoring good practices to catch similar issues before users experience them. You look at different testing and monitoring approaches and try one to see its effectiveness. The problem could be solved after you adopt one practice, or it might take several complementary approaches to improve the situation.

Crucially, you should stop adding practices once you’ve solved the problem. Adding further practices that aren’t required results in unnecessary complexity, which can be harmful to software delivery.

Finally, there are practices that are actively harmful when pursued. These are often well-intentioned and almost always presented as best practices, but empirical evidence and rigorous research have proven them to be false. Examples of these harmful practices include Gitflow, working in large batches, and heavyweight change-approval processes.

Graph showing practices from beneficial to harmful.

A good heuristic for spotting an anti-pattern is that you’re being told something is a best practice, but it doesn’t sound like table-stakes, basic software delivery. If there’s any doubt, look for trusted advice from someone with plenty of experience and cross-reference dependable research, such as the Accelerate “State of DevOps Report.” (The 2024 report is due out in October.)

Never Deploy on a Friday

With the practices primer firmly in mind, let’s consider whether we should deploy on Friday. After the recent CrowdStrike outage, many people suggested things could have been improved if they hadn’t deployed on Friday.

It’s hard to understand how deploying on Thursday would have improved the CrowdStrike issue. Do we not fly planes or use computers on Thursday? We can explore this confusing claim by looking at the perceived benefits of banning Friday deployments.

Why You Should Avoid Friday Deployments

I asked folks why we should avoid Friday deployments to see if there was some merit to the claims. As someone whose mind was changed on tabs vs. spaces by solid arguments, I believed I could be convinced given a compelling argument.

The most common argument for a Friday moratorium was that it would allow developers to spend the weekend with their families instead of working to restore services.

This is only true if your time to recover from a failed deployment is almost exactly a day. If it takes more than a day, you’ll be working on Saturday to recover from your Thursday deployment, unless you also stop deployments on Thursday. If you can recover in less than a day, you could deploy on Friday and still get to the beach on Saturday.

The longer your recovery times, the further in the past you need to deploy to avoid working on the weekend. For example, if it takes you a week to recover from a failed deployment, you need to deploy last Friday to avoid working this weekend, though this does mean you have to work last weekend.

The reality is that recovery times aren’t constant. Some problems are easier to recover from than others. Having a distribution of recovery times means you are taking a probabilistic approach to protecting the weekend. If you deploy every day, you’ll discover that Friday deployments have a higher probability of disrupting the weekend than Monday deployments.

When you take this approach, you’ll find there are plenty of improvements you can make to your deployment pipeline that reduce the probability of disruption. Banning deployments doesn’t reduce your overall change failure rate; it just means your Monday deployments will fail more often. Improving your deployment pipeline reduces the risk of failed deployments overall, which is far more positive than simply choosing which day will be a bad day.

Work-life balance applies to Monday night just as much as it applies to the weekend. People have personal commitments every day of the week, from school runs to birthdays and anniversaries to arrangements made to meet an old friend. Each of these is important, so making improvements that reduce failed deployments protects work-life balance more than shifting them to a different workday.

Real Reasons To Plan Deployments

There are reasons to avoid deploying at specific times, but they are related to your industry. For example, in the retail industry, you avoid updating the software on tills when the store is open. You don’t need a new version of software while you’re checking out a customer and have a queue of shoppers waiting.

Your release plan would set out the times to deploy and the times to avoid. It would also capture the process for training retail store staff so they know what’s changed before they raise the shutters.

By thinking of the people affected by our software, we consider many more people than if we just think of ourselves. If we think of all the people depending on our software, we can consider the work-life balance of a thousand times more people. Ideally, we’d never have a failed deployment, and our rollout would have no downtime. In reality, things do sometimes go wrong, and having a strong recovery time can minimize the impact.

The research suggests there are capabilities that let you deploy more often while also reducing the failure rate. The top performers are deploying many times a day, including Fridays.

Deploying on Friday Is Commonplace

To understand whether the Friday deployment ban was affecting real organizations, I analyzed over 32 million deployments to see when they were taking place. The data cuts across all organization sizes and many industries. It turns out that deploying on Friday is more common than deploying on Monday.

This is great news, as it demonstrates that the myth of avoiding Fridays has little traction in practice.

Graph showing weekly deployments by day of the week.

 

Decide for Yourself but Don’t Propagate the Myth

It’s perfectly acceptable for your organization to decide they don’t want to deploy on Fridays, though some curiosity about why will either turn up interesting domain knowledge or highlight some things you can improve in your deployment pipeline. What’s less appealing is broadcasting “don’t deploy on Fridays” as a best practice.

Working in small batches is known to improve software delivery performance. If you are following continuous delivery and DevOps, you might be deploying five times a day. If you rule out Fridays, you build up a Monday deployment that’s five times bigger than your normal batch size. You are opting to downgrade your performance from “on demand” to “daily.”

The more continuously you deploy, the worse the idea of stopping for a day becomes. That means the members of the “don’t deploy on Friday” movement, in the absence of specific organizational context, are opting for mediocrity.

Don’t just start deploying on Fridays. Take this as an opportunity to assess your situation, understand users and customers, and start improving your software delivery capabilities.

The post Deploy on Friday? Moratorium Doesn’t Achieve Admirable Goal appeared first on The New Stack.

]]>
Reimagining Observability: The Case for a Disaggregated Stack https://thenewstack.io/reimagining-observability-the-case-for-a-disaggregated-stack/ Thu, 29 Aug 2024 17:00:26 +0000 https://thenewstack.io/?p=22757285

Observability, often abbreviated as o11y, is crucial for understanding the state and behavior of systems through the collection, processing, and

The post Reimagining Observability: The Case for a Disaggregated Stack appeared first on The New Stack.

]]>

Observability, often abbreviated as o11y, is crucial for understanding the state and behavior of systems through the collection, processing, and analysis of telemetry data. Yet in 2024, I’ve observed the traditional o11y stack is losing traction, and it’s time for a disaggregated o11y stack.

This article covers the key benefits of disaggregation, such as flexibility, data autonomy, extensibility and cost-effectiveness. It also offers actionable recommendations and blueprints for data teams on structuring their data architectures to embrace this disaggregated model.

A chart about observability data

The Observability Stack

Here’s what a typical o11y stack looks like:

  • Agents: These processes run on your infrastructure alongside your microservices and applications, collecting o11y data and shipping it to a central location for further analysis.
  • Collection: This layer collects the incoming data from all the various agents and facilitates its transfer to the subsequent layers.
  • Storage and Query: This layer stores the data from the collection step and makes it available for querying.
  • Visualization: This includes applications for querying this data, such as tools for metrics visualization, monitoring, alerting, log exploration, and analytics.

Prominent solutions in the o11y space today include technologies like Datadog and Splunk.

Rise of the Disaggregated Stack

The Problem and the Opportunity

The two main problems with the o11y solutions available today are a lack of flexibility and high cost. These issues arise because providers typically offer an all-or-nothing solution. For instance, although Datadog provides comprehensive monitoring capability, you cannot use that platform to generate real-time insights with the same data. Similarly, some of these solutions focus on the end-to-end experience and may not be optimized for efficient storage and query computation, thus trading higher cost for simplicity.

A chart showing the lack of flexibility

Many companies today are leaning towards a disaggregated stack, which provides the following key benefits:

  • Flexibility: Companies often have a highly opinionated data stack and can choose specific technologies in each layer that suit their needs.
  • Reusability: Data is the most essential commodity for any company. With a disaggregated stack, they can build one platform to leverage their datasets for various use cases, including observability.
  • Cost Efficiency: A disaggregated stack allows a choice of storage-optimized systems that lower the overall service cost.

A chart showing a disaggregated stack

Let’s review each layer and understand how a disaggregated stack can help overcome the corresponding issues.

Agents

Vendors have invested significantly in their agents, tailored to their stacks with specific formats. This specificity increases the overall cost of the solution. In addition, different companies have varied data governance requirements, which may require additional work to accommodate proprietary agents.

At the same time, observability agents have become commoditized, with standards like OpenTelemetry emerging. These standards make it easy to ship data to various backends, removing the constraint of specific formats tied to specific backends and opening up a world of possibilities for the rest of the stack.
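
As a rough illustration of that decoupling, the sketch below wires up the OpenTelemetry Node.js SDK to export traces over OTLP to whatever backend sits behind the collector; the collector URL is a placeholder, and the exact package set depends on your runtime.

    import { NodeSDK } from '@opentelemetry/sdk-node';
    import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
    import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';

    // Any OTLP-compatible backend can sit behind this endpoint: an
    // OpenTelemetry Collector, a vendor agent or a gateway in front of Kafka.
    const sdk = new NodeSDK({
      traceExporter: new OTLPTraceExporter({
        url: 'http://otel-collector:4318/v1/traces', // placeholder address
      }),
      instrumentations: [getNodeAutoInstrumentations()],
    });

    sdk.start();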

A chart showing agents with a disaggregated stack

Collection

Vendor-specific collection systems need to be able to handle the following challenges:

  • Volume: Companies of all sizes generate a very high data volume for logs and metrics. Tens or hundreds of terabytes of data are expected to be generated daily.
  • Variety: Metrics, logs, and traces come in various formats and may need special handling.
  • Network Cost: These proprietary collection systems typically reside in a different cloud VPC, thus driving up egress costs.

In a disaggregated stack, streaming systems such as Kafka and RedPanda are popular choices for the collection layer and are often already deployed as a part of the data ecosystem. Several prominent organizations have tested these systems at scale for high-throughput, real-time ingestion use cases. For instance, Kafka has been known to reach a scale of 1 million+ events per second at organizations such as LinkedIn and Stripe. These systems are agnostic to agent formats and can easily interface with OTEL or other formats. They also have good connector ecosystems and native integrations with storage systems.
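
As a small, hypothetical sketch of this layer (using the kafkajs client; broker addresses and topic names are placeholders), a consumer can pull telemetry events off a topic and hand them to the storage layer’s ingestion path:

    import { Kafka } from 'kafkajs';

    const kafka = new Kafka({
      clientId: 'o11y-ingest',
      brokers: ['kafka-1:9092', 'kafka-2:9092'], // placeholder brokers
    });

    const consumer = kafka.consumer({ groupId: 'metrics-ingest' });

    async function run() {
      await consumer.connect();
      // Agents (OpenTelemetry Collector, Vector, etc.) publish telemetry here.
      await consumer.subscribe({ topic: 'otlp-metrics', fromBeginning: false });

      await consumer.run({
        eachMessage: async ({ message }) => {
          // Decode the payload and forward it to the storage layer
          // (for example, a Pinot or Clickhouse ingestion endpoint).
          const event = JSON.parse(message.value?.toString() ?? '{}');
          console.log('received metric event at', event.timestamp);
        },
      });
    }

    run().catch(console.error);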

A chart showing a disaggregated stack with collection

Storage and Query

The storage and query layer is the most challenging part, significantly impacting the system’s cost, flexibility, and performance. One of the significant problems with the storage and query layer of present-day solutions is the lack of flexibility to use one’s data for other purposes. In an all-or-nothing solution, once one’s data is in the vendor’s stack, it’s essentially locked in. You can’t use the data stores to build additional applications on top of it.

Another aspect is the cost and performance at the o11y scale. The storage and query system must handle the extremely high volume of data at a very high velocity. The variety of data means you’ll see many more input formats, data types, and unstructured payloads with high cardinality dimensions. This variety makes ingestion into these systems complex, and the need for optimal storage formats, encoding, and indexing becomes high.

A graphic showing the data store

In the case of a disaggregated stack, choosing the right storage system is extremely important. Here are some of the things to consider when making this choice.

Integration With Real-Time Sources

The system must integrate seamlessly with real-time streaming sources such as Kafka, RedPanda, and Kinesis. A pluggable architecture is crucial, allowing for the easy addition of custom features like decoders for specialized formats such as Prometheus or OTEL with minimal effort. This flexibility is particularly important for o11y data, as it requires support for a wide variety of data formats from different agents.

Ability To Store Metrics Data Efficiently

Here’s an example of a typical metrics event. It contains a timestamp column representing the time of the event at millisecond granularity, metric name and value columns representing the metrics emitted by your system, and a labels column.
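
The original example appears to have shipped as an image; the illustrative record below (with invented values) shows the shape such an event typically takes:

    // Illustrative metrics event; all values are made up.
    interface MetricEvent {
      timestamp: number;              // event time, millisecond granularity
      metricName: string;
      metricValue: number;
      labels: Record<string, string>; // high-cardinality dimensions as a JSON map
    }

    const sample: MetricEvent = {
      timestamp: 1724949000123,
      metricName: 'http_server_request_latency_ms',
      metricValue: 42.7,
      labels: {
        serverIp: '10.0.12.34',
        kubernetesVersion: '1.29',
        containerId: 'c0ffee42d1a7',
      },
    };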

There are several challenges associated with such datasets:

  • High cardinality columns: They need special handling like Gorilla encoding for efficient compression.
  • Various indexing techniques: Range, inverted or sorted indexes for efficient lookup and filtering of timestamps, high-variability metric values and metric names.
  • Advanced data layout: The ability to partition on frequently accessed columns to minimize work done during query processing (only process certain partitions).
  • JSON column support: The “labels” column is typically represented as a JSON map containing a variety of dimension name-value pairs (e.g., values for server IP, Kubernetes version, container ID and so on). Ingesting the data as is puts the onus on query processing, which then needs to do runtime JSON extraction. On the other hand, materializing all such keys at ingestion time is also challenging, since the keys are dynamic and keep changing.

Existing technologies have workarounds to overcome these challenges. For instance, Prometheus treats each key-value pair as a unique time series, which simplifies JSON handling but runs into scalability issues. In some systems, like Datadog, costs increase as more top-level dimensions are added from these labels. If you go with a key-value store, you’ll again face the perils of combinatorial explosion and loss of freshness when keeping real-time data in sync. Therefore, it is crucial that the storage system you choose can handle such high cardinality, as well as complex data types and payloads.

Ability To Store Logs Data Efficiently

A typical log event includes a timestamp and several top-level attributes such as thread name, log level, and class name, followed by a large unstructured text payload, which is the log line. For the timestamp and attributes, you need similar encoding and indexing features to those required for metrics data. The log message itself is completely unstructured text. Querying this unstructured text involves free-form text search queries, as well as filtering by other attributes and performing aggregations. Therefore, robust text indexing capabilities are essential to efficiently handle regex matching and free-form text search.
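
For illustration only (the fields mirror the description above and the values are invented), a single log event might look like this:

    // Illustrative log event; all values are made up.
    interface LogEvent {
      timestamp: number;  // event time, millisecond granularity
      threadName: string;
      logLevel: 'DEBUG' | 'INFO' | 'WARN' | 'ERROR';
      className: string;
      message: string;    // large, unstructured log line
    }

    const sample: LogEvent = {
      timestamp: 1724949000456,
      threadName: 'grpc-default-executor-12',
      logLevel: 'ERROR',
      className: 'com.example.checkout.PaymentService',
      message: 'Failed to reserve inventory for order 8841: upstream timed out after 2000ms',
    };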

A screenshot showing logs data

Storing entire log messages results in extremely high volumes of log data. Logs often require long-term retention due to compliance requirements or offline analysis and retrospection, leading to substantial storage demands (tens of terabytes per day) and significant costs. Practical compression algorithms are crucial. For instance, the Compressed Log Processor (CLP) developed by engineers at Uber is designed to encode unstructured log messages in a highly compressible format while retaining searchability. Widely adopted within Uber’s Pinot installations, CLP achieved a dramatic 169x compression factor on Spark logs compared to raw logs.

Another important feature for managing the high costs associated with large volumes of data is the ability to use multiple storage tiers, such as SSDs, HDDs, and cloud object stores. This tiering should not come at the expense of flexibility or increase operational burden. Ideally, the system should natively handle the complexity of tiering and data movement policies. This is particularly important for o11y data, which often needs to be retained for long periods even though queries beyond the most recent few days or weeks are infrequent. Utilizing more cost-effective storage for older data is crucial in managing these costs effectively.

Some systems, such as Loki, offer 100% persistence to object stores. Others, like Clickhouse and Elastic, provide multiple tiers but rely on techniques like lazy loading, which can incur significant performance penalties. Systems like Apache Pinot offer tiering and apply advanced techniques, including the ability to pin parts of the data (such as indexes) locally and employ block-level fetching with pipelining of fetch and execution, significantly enhancing performance.

Considerations of Trace Data

Now let’s talk about trace events. These events contain a call graph of spans and associated attributes for each span. Due to the semi-structured, nested nature of the payload, challenges similar to metrics data arise in storing these cost-effectively and querying them efficiently. Native support for ingesting and indexing these payloads efficiently is crucial.

To summarize the challenges, we need a system that can handle petabytes of storage cost-effectively while managing long-term retention. It must ingest various formats at high velocity and serve the data with high freshness and low latency. The system should efficiently encode and store complex semi-structured data. Robust indexing is crucial, as a system that optimizes performance and minimizes workload will scale much more effectively.

Compared to all-in-one solutions, systems purpose-built for low-latency, high-throughput real-time analytics — such as Apache Pinot, Clickhouse, StarRocks, and Apache Druid — are better suited for storing and querying o11y data (read more about the popular systems in this area here). These systems come with rich ingestion integrations for many real-time data sources, and the recipes have been proven to scale for use cases in different domains. Their columnar storage lets them handle storage optimally, offering a variety of encoding and indexing techniques. Many provide good text search capabilities (e.g., Elasticsearch is well known for free-text search query capabilities). Apache Pinot and Clickhouse also offer native storage tiering capability.

Apache Pinot offers a blueprint for tackling every nuance of o11y, as depicted in the figure below. You can also view this tech talk on the o11y strategy in Apache Pinot.

Fig: Observability capabilities in Apache Pinot.

It also features a pluggable architecture, enabling easy support for new formats, specialized data types and advanced compression techniques. It has a solid indexing story, including advanced options like JSON indexing, which enhances its ability to handle JSON payloads efficiently. Real-world examples of successful implementations include Uber and Cisco, which have leveraged these niche systems to enhance their o11y solutions, demonstrating their effectiveness in managing high volumes of data with high performance and cost-efficiency.

Integration With Visualization Tools

Tools like Grafana are becoming increasingly popular due to their ease of use and rich customization options, allowing users to build comprehensive dashboards. The available widgets include time series, heatmaps, bar charts, gauges and more. Additionally, Grafana supports the creation of full-blown applications with extensive custom integrations. It offers flexible, pluggable connectors for different backends, avoiding vendor lock-in. Building a plugin, whether a whole connector or a panel, is straightforward. Popular storage systems like Clickhouse, Elastic and Pinot have Grafana plugins. Grafana also supports query protocols like LogQL and PromQL, which are gaining popularity.

Another popular tool is Superset, known for its rich UI widgets and ease of use and customization. Superset integrates seamlessly with many popular databases, allows users to create and share dashboards quickly, and, similar to Grafana, offers extensive charting capabilities.

A visualization of the disaggregated stack

BYOC

Earlier, we saw that one reason vendor solutions are costly is the high data egress costs when transferring data from agents in your account to the rest of the stack in the vendor’s account. With vendors who support BYOC (Bring Your Own Cloud), this issue is eliminated. The agents and the rest of the stack remain within your account, ensuring that your data doesn’t leave your premises, thereby avoiding additional costs associated with data transfer.

Conclusion

Adopting a disaggregated stack for o11y in modern distributed architectures offers significant benefits in cost-effectiveness and reusability. By decoupling the various components of the o11y stack — such as agents, collection, storage, and visualization — enterprises can choose best-of-breed solutions tailored to specific needs. This approach enhances flexibility, allowing organizations to integrate specialized systems like Kafka, RedPanda, Clickhouse, Pinot, Grafana, and Superset, which have proven capabilities at scale.

The disaggregated model addresses the high costs associated with traditional all-or-nothing vendor solutions by eliminating data egress fees and leveraging more efficient storage solutions. Moreover, the flexibility to use different layers independently promotes reusability and data autonomy, preventing data lock-in and enabling better adaptability to changing requirements.

By embracing a disaggregated o11y stack, organizations can achieve greater agility, optimize performance, and significantly reduce costs while maintaining the ability to scale and adapt their o11y solutions to meet evolving business needs.

The post Reimagining Observability: The Case for a Disaggregated Stack appeared first on The New Stack.

]]>
Orca Security Launches First K8s Testing/Staging Environment https://thenewstack.io/orca-security-launches-first-k8s-testing-staging-environment/ Thu, 29 Aug 2024 16:00:18 +0000 https://thenewstack.io/?p=22757320

By their very nature, highly complex and constantly evolving containerization environments present significant security challenges. These can include misconfigurations, over-privileged

The post Orca Security Launches First K8s Testing/Staging Environment appeared first on The New Stack.

]]>

By their very nature, highly complex and constantly evolving containerization environments present significant security challenges. These can include misconfigurations, over-privileged roles and vulnerabilities in components. Such issues cannot be allowed to reach production environments, or the business could soon be toast.

To control and reduce this risk, testing cluster configurations and settings in a trusted staging environment before deployment has become mandatory. The problem is, this often requires tedious time/cost overhead for setup and research that nobody likes to undertake because it’s not about innovation or moving the business forward.

In the decade-long existence of containers as key components in current IT systems, this dull but important task has cried out for an efficient, automated tool. That’s what Orca Security set out to build, and as of today, it’s available.

The Orca Research Pod has created KTE, an open source Kubernetes Testing Environment for AWS (EKS), Microsoft Azure (AKS), and Google Cloud (GKE), to help organizations improve their Kubernetes security by providing a safe and controlled space to identify and address potential vulnerabilities before they affect production systems.

Orca’s security research team discovers and analyzes cloud risks and vulnerabilities to strengthen the Orca platform and promote cloud-security best practices. The Orca Research Pod, composed of about a dozen cybersecurity experts, claims to have discovered more than 20 major vulnerabilities on public cloud platforms that were eventually fixed. The team continually analyzes the security of public cloud assets being scanned by the Orca platform, while observing attacker tactics and techniques in the wild.

KTE Available Under Apache 2.0 License

Starting on August 27, Orca will provide KTE to the open source community under the Apache 2.0 License on Orca’s main GitHub repository. KTE creates a free and vendor-agnostic KSPM (Kubernetes Security Posture Management) experience by bringing together a set of open source offerings.

Orca says it will continue to maintain KTE to support the K8s security community and assist their staging endeavors so they can pinpoint and triage security threats. Users are encouraged to replace the default helm chart with one of their own so they can use this project to test an operational staging environment.

This is the first K8s testing/staging project of its kind, Roi Nisimi, a security researcher at Orca, told The New Stack.

“There are currently no similar projects with the same long-term goal in mind: becoming a one-stop shop for all your Kubernetes security concerns, with a single click, regardless of your cloud provider choice and open to the community. This is the power of KTE,” Nisimi said.

What Comprises KTE?

Using the GitHub repository, developers are able to test several security products on their K8s environment. Whether managing systems on AWS, GCP or Azure, participants can use KTE to deploy and scan their clusters. Most importantly, they are provided with clear visibility into their scan results via web-based dashboards, Nisimi said.

Are there any notable differences in versioning between GCP, Azure and AWS?

“The main goal was to provide an infrastructure for any type of Kubernetes user — whether they use GCP, Azure or AWS,” Nisimi said. “In terms of security findings, the currently supported open source tools provide native Kubernetes insights with a few differences, but we expect to see the deployment of many proprietary vendor-specific tools in the future.”

KTE has the potential to become a standard DevSecOps tool. What might it replace?

“The project gives developers the opportunity to test their Kubernetes resources against a vast array of security offerings, and hence achieve a fortified security posture. It will most likely not replace home-built solutions but actually help invent them, allowing [teams] to easily test and consolidate quality and varied security data,” Nisimi said.

Orca has created and will further maintain KTE to support the K8s security community and assist their staging endeavors so they can pinpoint and triage security threats — not through a single tool, but many open source offerings, with the aim to include all, Nisimi said. This will guarantee a robust and powerful approach to identify K8s misconfigurations and security weaknesses, Nisimi said.

The post Orca Security Launches First K8s Testing/Staging Environment appeared first on The New Stack.

]]>
Balancing AI Innovation and Tech Debt in the Cloud https://thenewstack.io/balancing-ai-innovation-and-tech-debt-in-the-cloud/ Thu, 29 Aug 2024 15:00:49 +0000 https://thenewstack.io/?p=22757251

In recent years, but specifically since November 2022 when ChatGPT launched, AI has been driving innovation at an unprecedented rate,

The post Balancing AI Innovation and Tech Debt in the Cloud appeared first on The New Stack.

]]>

In recent years, but specifically since November 2022 when ChatGPT launched, AI has been driving innovation at an unprecedented rate, transforming various industries and the way businesses operate. Every single company today, and their executive leadership, understands that AI needs to be a part of their future strategy, or they will be left behind. That is why we are witnessing a race to deliver the greatest possible innovations on top of AI-powered everything. This is largely a byproduct of AI’s democratization that has made it consumable for the masses — from users to innovators.

Tech executives today are telling their engineering teams, “We need to have an AI story NOW,” with little regard for how this is ultimately implemented in their systems. The race is real, and it has its own set of unique implications, especially for those managing cloud infrastructure. This AI rush is creating AI tech debt at an unprecedented scale, and understanding these implications is crucial for ensuring that our cloud environments remain efficient, secure and cost-effective.

The Dual Impact of AI

As a cloud asset management company, we are witnessing an AI disruption through the consumption of AI-driven cloud services and assets in growing numbers through our telemetry data. From the GPUs to the managed retrieval-augmented generation (RAG) databases, large language models (LLMs) and everything else, all of this AI innovation is built upon some of the most costly cloud resources today. We urge you to check the costs of managed graph databases.

AI’s influence is twofold, affecting both consumers and the infrastructure that supports it. For consumers, there’s a growing need to ensure that the code generated by AI is aware of and compatible with their environments. This includes making sure that AI-driven applications adhere to existing policies, security protocols and compliance requirements.

On the infrastructure side, AI demands significant resources and scalability. The recent Datadog “State of Cloud Costs” 2024 report highlights a 40% increase in spending on GPU instances as organizations experiment with AI, with spending on GPU instances alone now making up 14% of compute costs. Arm spend has doubled in the past year; Arm is the new backbone of AI-driven development and the chosen architecture for processors like AWS’s Graviton that are powering this AI revolution.

This surge in resource requirements and cloud spend can lead to the AI tech debt that many CTOs are starting to lament. We are at a point where the velocity of AI development often outpaces the organization’s ability to manage and optimize it effectively. This shows up in costly machines being spun up without proper teardown or cleanup, with cloud costs spiraling out of control. It also shows up in data that is not properly managed before being fed into models and machines, which later expose it in unexpected ways. These are just some examples.

Balancing Innovation With Governance

While AI presents incredible opportunities for innovation, it also sheds light on the need to reevaluate existing governance awareness and frameworks to include AI-driven development. Historically, DORA metrics were introduced to quantify elite engineering organizations based on two critical categories: speed and safety. Speed alone does not indicate elite engineering if the safety aspects are disregarded altogether, and AI-driven development cannot be exempt from the same safety considerations.

Running AI applications according to data privacy, governance, FinOps and policy standards is critical now more than ever, before this tech debt spirals out of control and data privacy is infringed upon by machines that are no longer in human control. Data is not the only thing at stake, of course. Costs and breakage should also be a consideration.

If the CrowdStrike outage from last month has taught us anything, it’s that even seemingly simple code changes can bring down entire mission-critical systems at a global scale when not properly released and governed. This involves enforcing rigorous data policies, cost-conscious policies, compliance checks and comprehensive tagging of AI-related resources.

The recent acquisition of Qwak.ai by JFrog is another indication that companies that have deep-enough pockets will be snatching up the emerging AI players for quicker time to market for competing AI solutions. With more than 50% of programmers today leveraging AI on a regular basis to write or augment code, any tools and platforms promising greater agility in this domain are gaining closer scrutiny and interest for potential acquisitions. Stay tuned for more happening on this front.

One interesting data point emerging from recent AI-driven research and development is code quality (apropos DORA metrics). This recent report by GitClear suggests that code quality is being adversely affected by AI. It states that there’s a significant uptick in code churn and a serious decline in code reuse. An interesting post on the findings can be read here.

Some of the recent critiques of AI indicate that text-based AI assistants are great, as there are massive amounts of text-based data to analyze. That is why AI assistants are able to augment typical text and deliver above-average results when it comes to generating creative or functional texts. However, the same does not hold true for code.

The large majority of code available to examine for AI modeling is actually below average. This includes early projects by aspiring engineers and students, and open code that is not in commercial use. It requires many years of domain expertise to produce performant, cost-effective and quality code. Yet these types of repositories are often parsed and collected for code-based LLMs, making AI-assisted code quality, at this point, below that of senior engineers. High-quality code repositories are often closed source and belong to commercial applications not available to LLMs for data modeling.

This underscores the importance of integrating AI-driven innovations with robust governance structures. Cloud asset managers must be equipped with the tools and knowledge to monitor and manage AI workloads effectively, within their context, understanding the nuances of the complex systems they are managing. This includes ensuring visibility into AI operations and maintaining stringent compliance with governance policies.

Preparing for the Future of AI

As we look to the future, it’s essential to ask: What does this mean for the day after tomorrow when it comes to running AI? For organizations not developing their own LLMs or models, the focus shifts to managing expensive cloud infrastructure. This needs to be done with the same governance and cost-efficiency in mind as any other cloud operation.

Organizations must develop strategies to balance the innovation AI brings with the need for greater, and even more meticulous, governance. This involves leveraging AI-aware tools and platforms that provide visibility and control over AI resources. By doing so, companies can channel the power of AI toward higher-order goals while maintaining a secure, compliant and cost-effective cloud environment.

As AI continues to drive innovation, its implications on cloud infrastructure and governance cannot be overlooked. Balancing the benefits of AI with effective management and governance practices is key to ensuring sustainable AI innovation powered by emerging cloud technologies.

The post Balancing AI Innovation and Tech Debt in the Cloud appeared first on The New Stack.

]]>
How Supabase Is Building Its Platform Engineering Strategy https://thenewstack.io/how-supabase-is-building-its-platform-engineering-strategy/ Thu, 29 Aug 2024 13:42:19 +0000 https://thenewstack.io/?p=22757159

Platform engineering is not a destination, but is an evolving and constant process of improvement, innovation and experimentation to provide

The post How Supabase Is Building Its Platform Engineering Strategy appeared first on The New Stack.

]]>

Platform engineering is not a destination but an evolving, constant process of improvement, innovation and experimentation to provide consistent, tested and productive application development tools for developer teams. That is the plan as most companies begin their platform engineering strategies, and that is how it continues to work for the open source PostgreSQL database infrastructure vendor, Supabase.

Supabase, which describes itself as an open source alternative to Google’s mobile and web app development platform, Firebase, began using platform engineering several years ago. The project began as the company realized that building its own internal developer platform (IDP) for its approximately 50 developers would allow the company to consolidate, standardize and automate its development applications to drive increased productivity, code quality and other benefits for its teams. Supabase has been in business since 2020.

“It is growing over time,” Samuel Rose, a platform engineer at Supabase who came to work for the company in February 2024, told The New Stack. “They were already doing this kind of thing and they [began to] formalize things that everybody was doing into a role that we can all be responsible for throughout the company.”

“Supabase has always had an evolving platform engineering approach, but I was brought on board to formalize and expand it more across the organization,” said Rose. “We will continue to grow this on a weekly basis and have already made huge progress” in the company’s evolving platform engineering strategy.

The company’s platform engineering project came about as IT administrators and developers across many teams contributed to an effort to create a platform engineering approach for their work, said Rose. “The needs grew so great that Supabase needed to take on at least one person full-time to drive it forward,” which is how he joined the company.

Leading to these decisions was Supabase’s continuously growing customer base and growing technical complexity in managing build, test, and release processes, Rose explained. Also important was the company’s “desire to use and leverage our product where it makes sense on our internal platform,” he added.

“I have been in this industry for more than 20 years,” said Rose. “Supabase started four years ago, and it is quite natural that this company would grow into these needs now, after four years of work and expansion. At Supabase we eat our own dog food and use some of our own components as tools in our internal platform.”

Doing Platform Engineering the Supabase Way

Supabase did not start from scratch to build and create its platform engineering strategy. Instead, it began with pre-built recommendations for platform engineering tools that were brought together in sample cloud native landscape outlines from organizations such as the Cloud Native Computing Foundation (CNCF), added Rose.

“Our platform pretty well maps to those platform engineering approaches,” he said. “But we use some of our own products in our platform, including our own API and Postgres-centered development,” instead of using some off the shelf or Software as a Service components offered in the pre-built recommendation grids.

This custom platform engineering approach works well for Supabase, allowing the company to bring together tools from inside and mix them with other industry-standard tools for application building.

“The main goal is to take the existing building blocks that we are using to do platform engineering and consolidate them, automate and give everybody a more solid foundation to stand on,” said Rose. “It is not a super-conventional platform engineering approach, but it is one that is a good fit for the company. It is growing over time.”

To accomplish this, Supabase is consolidating around certain standards and certain tools in key places for now, according to Rose.

“Sometimes when people talk about platform engineering, they mean adopting this whole platform that is literally like some other software that somebody wrote and putting everything in there, like diving all the way headfirst to the bottom of the pool,” said Rose. “We are not really doing that. We are [also using] our own tools.”

These platform engineering efforts began as a natural outgrowth of Supabase’s development efforts and processes, said Rose. “So, it was easy for them to see this need — they started working on this before I ever got involved. I have been working with them, and we … brought it into reality, in terms of doing the things you described as platform engineering, to the point that it is going into production. It is ongoing.”

What’s Included in Supabase’s Internal Developer Platform

To give its developers the tools they need to build their applications for Supabase, the company’s IT administrators have built their platform engineering platform around a spare number of development applications.

“We try to be economical about it so that we do not create [problems], because if you keep throwing tools at the basket, you are going to have to manage it all,” said Rose. “So, we are judicious about this. We have about five to seven major kinds of like components that we use, and we try to consolidate and leverage as much as we can of the existing systems.”

The tools included in Supabase’s IDP feature:

  • Developer control plane: Supabase utilizes internal wikis, GitHub, Terraform and Pulumi. The company also uses custom tooling based on Docker, other related tools to run its platform locally, and its SaaS backend as a service product that can also be run locally.
  • Integration plane: Supabase uses GitHub Actions, Nix package manager/Debian packages, Docker, Amazon S3 and a self-hosted Nix binary cache. In addition, it uses Humanitec’s Platform Orchestrator with in-house custom applications.
  • Security plane: Web application firewall (WAF), AWS GuardDuty, Google Cloud Platform Intrusion Detection System (IDS), AWS Secrets Manager, AWS EC2 Instance Connect and its own tools in some cases.
  • Monitoring and logging plane: Vector, Sentry, BigQuery, VictoriaMetrics and its own Logflare tool.
  • Resource plane: Supabase mostly uses the tools built into AWS and GCP platform, along with strategic use of its own product to manage metadata, clustering and more.

So far, the results of its platform engineering efforts are promising for Supabase.

“One thing is that it is giving us a pathway to manage doing new versions of Postgres for our customers that are better,” said Rose. “Our people in our teams internally can [create] what they call a deterministic build that they can build once and do not have to build again. That is part of the new platform that we are creating for platform engineering. [You] can just use that again and again unless you change something, so it can cut down on the build times and it can help assure that things are repeatable across systems. That was harder to do in the past.”

And by using the IDP, developers can just focus on their code, do their work and move on to the next projects without having to spend their valuable time configuring, collecting and maintaining their development tools, explained Rose. “We are already at the point where many developers can self-service the majority of their needs, while having secure access to monitoring and testing, and supporting production deployments.”

The post How Supabase Is Building Its Platform Engineering Strategy appeared first on The New Stack.

]]>
OpenJS Foundation’s Leader Details the Threats to Open Source https://thenewstack.io/openjs-foundations-leader-details-the-threats-to-open-source/ Thu, 29 Aug 2024 13:00:36 +0000 https://thenewstack.io/?p=22757013

Before and after the XZ Utils backdoor vulnerability was discovered in late March, the OpenJS Foundation got inquiries from would-be

The post OpenJS Foundation’s Leader Details the Threats to Open Source appeared first on The New Stack.

]]>

Before and after the XZ Utils backdoor vulnerability was discovered in late March, the OpenJS Foundation got inquiries from would-be contributors to open source JavaScript.

Many of those inquiries raised no alarm bells. “JavaScript communities are very much volunteer-led, as opposed to some corporate-led open source projects,” said Robin Ginn, executive director of the OpenJS Foundation, in this episode of The New Stack Makers.

“And of course, they’re overwhelmed, and we’re always trying to recruit new contributors, and so you get emails all the time, and you have contributions all the time, and those are very welcome.”

But after the news broke of how a single contributor, “Jia Tan,” planted a backdoor in XZ Utils, Ginn said, some emails “triggered that Spidey sense that maybe something was a little off. And I think it was. It was them asking for admin privileges to take over a project, and that is something that usually takes some time to earn.”

In this episode of Makers, Ginn spoke to Alex Williams, founder and publisher of TNS, about the impact of episodes like XZ on open source communities and the organizations that use open source code, how security differs from trust in working with open source software and the struggle to secure resources for project maintainers.

The XZ Utils example, Ginn said, clarified the difference between trust and security.

“Security has always been critical for open for any kind of developer, any sort of engineer,” she said. “But when you hand over the keys to your kingdom, your GitHub repository, you need to trust the people who are accepting changes to your codebase. So I think we found trust is not security, which I think we already knew, but it really hit home.”

Too Many Single-Maintainer Projects

The XZ vulnerability, Ginn said, is “likely not an isolated incident.”

In the days after the news about XZ broke, her foundation and the Open Source Security Foundation (OpenSSF) released a joint statement saying they had foiled a hacker’s attempt to gain access to the OpenJS software library last November.

“The XZ Utils had the one person identified. In our case, we saw multiple GitHub IDs, overlapping emails, avatars and things like that,” Ginn told Williams. “But they are real people, probably some bad actor somewhere who is not only getting close to understanding the code, but they’re also understanding how our open source communities work.”

The New Stack has previously written about the crisis in recruiting and compensating open source project maintainers. With nearly all websites using JavaScript, it’s especially alarming, Ginn said, that its maintainers remain so overmatched.

“We have Red Hat, who has a couple of people who work part of their day job is to support the Node.js project, and that’s fantastic,” she said. “Microsoft and Slack have employees contributing to Electron. But I would say probably 90% of our contributors are volunteers.”

Those nights-and-weekend maintainers have a lot to do, she added, noting that Node.js, jQuery, Webpack and other JavaScript projects have been around for many years. “So either you have a small group of maintainers, or sometimes even one maintainer, which is pretty common for JavaScript. I think if you look at some other open source projects, they require three maintainers and double checks. JavaScript as a whole has a lot of single maintainers.”

In 2023, the OpenJS Foundation received a €800,000 grant (roughly $893,000) from Germany’s Sovereign Tech Fund. The grant “almost doubled our budget,” Ginn said, but the foundation is still thinly resourced. “We have 35 open source projects and only two full-time staffers working to support those projects and those volunteers.”

A better long-term solution, she said, is for more of the companies that rely on open source software to pay their employees to take more responsibility for maintaining it.

“The best way to pay an open source maintainer is definitely to hire them, give them a full engineering role, or documentation or marketing. There’s lots of ways to contribute.”

Check out the full episode for more from Ginn, including how you can find out if your organization’s website is using outdated open source software (most sites are) and what’s new with jQuery.


Clarification: A previous version of this article stated that the OpenJS Foundation received an increased number of inquiries from aspiring project contributors after the XZ Utils vulnerability was discovered. The foundation has received a continuous stream of inquiries, with no spike after the XZ incident.

The post OpenJS Foundation’s Leader Details the Threats to Open Source appeared first on The New Stack.

]]>
Broadcom’s VMware Tanzu Platform 10 Becomes a PaaS https://thenewstack.io/broadcoms-vmware-tanzu-platform-10-becomes-a-paas/ Wed, 28 Aug 2024 19:34:30 +0000 https://thenewstack.io/?p=22757191

Broadcom’s Platform 10 release this week at the VMware Explore user conference in Las Vegas represents an ambitious effort to

The post Broadcom’s VMware Tanzu Platform 10 Becomes a PaaS appeared first on The New Stack.

]]>

Broadcom’s VMware Tanzu Platform 10 release this week at the VMware Explore user conference in Las Vegas represents an ambitious effort to solve a straightforward problem, one that involves a lot of difficult-to-solve engineering under the hood. The goal is to improve the developer experience while offering platform engineers more streamlined access to manage governance, compliance and security. In other words, VMware Tanzu Platform 10 is becoming a Platform as a Service (PaaS) for developers.

As the main Tanzu Platform release since Broadcom’s acquisition of VMware, this version reflects Broadcom’s emphasis on allowing developers to use the tools they want and on how a Platform as a Service has emerged as a necessary component to that end — a key aspect of Tanzu as an application delivery platform, particularly for cloud native applications. Tanzu is now more explicitly designed in pursuit of the long-sought “holy grail”: breaking down silos between developers, operations teams, platform engineers and other stakeholders, leading to faster release cycles and an easier life in general for anyone involved in DevOps and CI/CD.

If Broadcom can deliver on Tanzu’s promise in November, when VMware Tanzu Platform 10 is expected to become available, the release could be a boon to developers especially, but it will still be late in coming, said Torsten Volk, an analyst at TechTarget’s Enterprise Strategy Group. “VMware has consolidated its vast and confusing Tanzu portfolio into one consistent platform. This is exactly what Broadcom CEO Hock Tan promised when the acquisition closed, and it is what his company delivered with Tanzu 10. In a time of heated discussions about VMware’s new pricing model and the reset of its partner ecosystem, I think it is important to acknowledge that delivering the overall Tanzu portfolio as a unified persona-driven platform is a major achievement,” Volk said.

“The actual question now is if all this came too late, as adopting any developer platform requires a lot of trust in the platform vendor when Broadcom has squandered away a massive amount of said trust and also lost a lot of its cloud native manpower,” he said.

This PaaS and platform engineering approach translates into concrete pluses for developers and platform engineers — at least in theory. As Purnima Padmanabhan, Broadcom vice president and general manager of the Tanzu Division, said during a pre-conference analyst and press briefing, application teams are focused on committing code and want to reduce the time it takes “from my laptop to app in production.”

“As a developer, how do I make sure that as I push out these changes, my app stays resilient today? For the platform engineers and infrastructure teams, how can I make sure that I accelerate the changes — and not just for one app?” Padmanabhan said. “How do I make sure that I keep my infrastructure on the platform continuously fixed, keeping it available while these app changes are happening?”

Tanzu Platform 10 was designed to support multiple runtimes, including Cloud Foundry and Kubernetes. It offers a developer framework with polyglot support, includes a hardened set of AI services and provides integration with over 200 open source packages, Padmanabhan said. The platform also supports application integrations, such as API gateways, ETL, data transformation systems and databases, “ensuring that you have all the necessary components to start building and deploying your applications effectively,” Padmanabhan said.

AI Must Be Better

AI plays a significant role in this Tanzu release, thanks to the integration with Tanzu AI Solutions. While I have not yet tested it, if the claims hold true, it represents a very ambitious approach to integrating AI into the development cycle in a number of ways. While much of GenAI development happens in Python, many enterprise applications are written in Java. Tanzu AI Solutions bridges this gap by introducing an API for polyglot programming, bringing the power of GenAI to Java developers through an integrated VMware Spring AI experience so they can add GenAI to existing apps. This allows for faster GenAI development, secure model access, controlled usage, and improved accuracy and performance, Padmanabhan said.


Tanzu AI Solutions is also designed to deliver “GenAI observability” and monitoring, addressing accuracy and performance with analysis for apps and large language models (LLMs), Broadcom says.

As Padmanabhan explained, developers need “a simple, safe, and easy-to-use environment that allows them to move code to production efficiently. Second, the infrastructure must be dynamically provisioned and adjusted to meet the needs of the application. If the infrastructure is rigid and templated, flexibility, scale, and velocity are compromised,” Padmanabhan said.

“Finally, the application and infrastructure must be continuously updated to prevent outstanding risks,” she said. “By following these principles, we help accelerate the development, operation, and optimization of applications across both private and public clouds.”

The Tanzu portfolio is straightforward, offering a core platform that supports application development, deployment, and security at scale across various environments, including VCF and public clouds, Padmanabhan said. Tanzu Data Solutions ensures that applications are connected to the necessary data services, and Tanzu Cloud Health manages costs across the environment. Additionally, Tanzu Labs provides guidance to help organizations move from code to production faster, she said.

“With the introduction of the Tanzu Platform 10, we are offering a unified app development platform that can be deployed seamlessly across private cloud infrastructure and public cloud environments, including Kubernetes and VM runtimes,” Padmanabhan said. “The platform provides a consistent developer experience, whether working with Cloud Foundry or Kubernetes and empowers developers to build faster using tools like Spring, ensuring higher performance, governance and security.”

Development teams are “struggling tremendously to deliver AI-driven capabilities for their applications” and VMware simplifying AI development through its Tanzu platform “makes a lot of sense,” Volk said. But more work must be done. Enabling Java developers to leverage AI models and vector databases via a Python API only covers a small part of this challenge, Volk said.

“Data Scientists and the, still growing, majority of application developers ‘speak Python’ and would need Tanzu to directly support Python to a similar degree it supports Java to become excited about this platform,” Volk said. VMware needs to focus on supporting Python frameworks like Flask and Django, AI libraries like PyTorch, TensorFlow, Pandas, and NumPy, Jupyter-based notebooks, observability and monitoring for Python, service brokers, IDE plugins for Python, etc., Volk said.

A Simple PaaS

The VMware Tanzu Platform 10 release heavily involves VMware’s Cloud Foundry, which was originally created as a PaaS back in 2011, spun off by VMware and EMC into Pivotal Software, and brought back in-house when VMware acquired Pivotal in 2019. Today’s release offers a Cloud Foundry-like developer experience for Kubernetes with application spaces, which introduce an application-centric layer of abstraction that allows applications to run with consistent operational governance and compliance.

The abstraction layer is designed so all stakeholders can work at once, while a so-called separation of concerns, as Broadcom calls it, remains between stakeholders. In this way, developers can focus on their applications without worrying about infrastructure details, while platform and operations teams can focus on managing infrastructure at scale, defining configurations to meet organizational governance and compliance requirements, Broadcom says.

Tanzu Platform 10 integrates with existing VMware Cloud Foundation configurations, including VMware Cloud Foundation 9 released today. “VCF 9 is a transformational platform. We reorganize ourselves internally to give you a single, unified product, one product that’s going to allow you to run a public cloud experience anywhere you do business,” Chris Wolf, global head of AI and Advanced Services, VMware Cloud Foundation Division, Broadcom, said during the keynote.

The post Broadcom’s VMware Tanzu Platform 10 Becomes a PaaS appeared first on The New Stack.

]]>
Developers Rail Against JavaScript ‘Merchants of Complexity’ https://thenewstack.io/developers-rail-against-javascript-merchants-of-complexity/ Wed, 28 Aug 2024 19:00:46 +0000 https://thenewstack.io/?p=22757257

The rebelling against JavaScript frameworks continues. In the latest Lex Fridman interview, AI app developer Pieter Levels explained that he

The post Developers Rail Against JavaScript ‘Merchants of Complexity’ appeared first on The New Stack.

]]>

The rebelling against JavaScript frameworks continues. In the latest Lex Fridman interview, AI app developer Pieter Levels explained that he builds all his apps with vanilla HTML, PHP, a bit of JavaScript via jQuery, and SQLite. No fancy JavaScript frameworks, no modern programming languages, no Wasm.

“I’m seeing a revival now,” said Levels, regarding PHP. “People are getting sick of frameworks. All the JavaScript frameworks are so… what do you call it, like [un]wieldy. It takes so much work to just maintain this code, and then it updates to a new version, you need to change everything. PHP just stays the same and works.”

Levels lists seven different startups on his X profile, and on one of his websites he offers the advice, “Launch early and multiple times.” He’s a go-getter, in other words, and prefers to build quickly — which means eschewing complex web frameworks.

Other prominent developers chimed in on social media to echo Levels’ take.

“The merchants of complexity will try to convince you that you can’t do anything yourself these days,” wrote David Heinemeier Hansson (DHH), the creator of Ruby on Rails. “You can’t do auth, you can’t do scale, you can’t run a database, you can’t connect a computer to the internet. You’re a helpless peon who should just buy their wares. No. Reject.”

Some even voiced regrets over their JavaScript migrations of old.

“Migrating my main site off PHP in 2010 was one of my worst career mistakes,” wrote Marc Grabanski, founder and CEO of the web development training company Frontend Masters. “Back then I had a vanilla PHP website with over one million uniques per month, and migrating to ever newer languages and frameworks killed all momentum and eventually killed the site.” He added as clarification that his point wasn’t about PHP. “It’s that if you have a project that works with straightforward code, don’t over engineer it by chasing what’s hot. Keep it simple and protect your momentum on the project at all costs.”

The ‘keep it simple’ mantra is, of course, far from new in the computing sphere. Steve Jobs once said the following in a 1998 interview:

“Simple can be harder than complex: you have to work hard to get your thinking clean to make it simple. But it’s worth it in the end because once you get there, you can move mountains.”

The Other End of the Sophistication Scale

What’s really interesting here is that the simplicity philosophy is making a comeback not only in the “hustle culture” startup scene that Pieter Levels personifies, but also in the professional software engineering circles of web development.

I can’t think of two more different developers than Pieter Levels, PHP advocate and startup-er-upper, and Alex Russell, a Microsoft browser engineer and one of the most respected voices in web development. But despite their differences in outlook, they are both currently railing against complex web frameworks.

In a recent series of blog posts, Russell launched a one-man “investigation into JavaScript-first frontend culture and how it broke US public services.” The cons of using a lot of JavaScript in a public service website were starkly illustrated by Russell in an examination of BenefitsCal, which he describes as “the state of California’s recently developed portal for families that need to access SNAP benefits (née “food stamps”).”

Using web performance tools like WebPageTest.org and Google’s Core Web Vitals, Russell showed that BenefitsCal is bloated with JS and loads extremely slowly:

“The first problem is that this site relies on 25 megabytes of JavaScript (uncompressed, 17.4 MB over the wire) and loads all of it before presenting any content to users. This would be unusably slow for many, even if served well. Users on connections worse than the P75 baseline emulated here experience excruciating wait times. This much script also increases the likelihood of tab crashes on low-end devices.”

In part 4 of the series, Russell offers some solutions. Among other things, he recommends reading the UK’s progressive enhancement standard. It’s a part of the “service manual” of the UK government’s official website, gov.uk. The page Russell links to begins:

“Progressive enhancement is a way of building websites and applications. It’s based on the idea that you should start by making your page work with just HTML, before adding anything else like Cascading Style Sheets (CSS) and JavaScript.”

Just Say No to JS Frameworks

Admittedly, there’s a big difference between dashing off a quick web app using PHP and jQuery, and implementing a web standards-compliant web application using the progressive enhancement philosophy. The former is mainly concerned with enabling fast development and a “minimum viable product” — most likely for users with expensive devices like iPhones — that can be launched as soon as possible. The latter is mainly concerned with setting a solid foundation for future development, and for as wide a user base as possible (which in practice means even those without iPhones).

But either way, JavaScript frameworks are anathema to what they’re trying to achieve. Pieter Levels simply avoids them while creating his many apps, while Alex Russell is fighting the good fight and trying to convince public service websites to adopt better practices.

Perhaps the tide is finally turning against complex web frameworks.

The post Developers Rail Against JavaScript ‘Merchants of Complexity’ appeared first on The New Stack.

]]>
How To Implement InnerSource With an Internal Developer Portal https://thenewstack.io/how-to-implement-innersource-with-an-internal-developer-portal/ Wed, 28 Aug 2024 18:00:11 +0000 https://thenewstack.io/?p=22757094

Strengthening collaboration and breaking down silos is what InnerSource is all about. The methodology encourages an open source way of

The post How To Implement InnerSource With an Internal Developer Portal appeared first on The New Stack.

]]>

Strengthening collaboration and breaking down silos is what InnerSource is all about. The methodology encourages an open source way of thinking toward software development. It’s not a new practice; in fact, the term was first coined back in December 2000 by Tim O’Reilly, founder of O’Reilly Media.

Despite it being a somewhat older term in an industry that loves to move on to the latest buzzwords and trends, it is still very much an approach that many engineering teams want to incorporate into their organizations. Gartner expects 40% of software engineering organizations to have InnerSource programs by 2026, because those organizations believe the approach will improve code reusability, increase standardization and inspire a culture of autonomy and ownership among developers.

Ultimately, the goal of InnerSource is to reduce duplication in development, lack of reuse and the resulting increased costs. However, enterprises tend to struggle with the handoff between overarching strategy and tactical implementation.

While no single tool can ensure that developers will adopt InnerSource, there are approaches that can help to implement InnerSource, including the use of an internal developer portal.

Here are five key ways you can use an internal developer portal to help implement and encourage InnerSource within an organization:

The Importance of a ‘Trusted Committer’

In her book “Understanding the InnerSource Checklist,” Silona Bonewald describes the role of a “trusted committer” as crucial to implementing InnerSource best practices. The trusted committer is a developer — often on a two-week rotation — who mentors other developers and ensures standards are met when people create new pull requests (PRs). Trusted committers lead the effort to reduce silos for their service by:

  • Maintaining contribution guidelines
  • Reviewing incoming pull requests to ensure they’re in accordance with these guidelines
  • Mentoring developers who fall outside of the contribution guidelines
  • Requesting help from those who commit code to their service.

Portals create a place where the work of trusted committers becomes easier to do, more visible, acknowledged and easy to follow.

In the most basic sense, an internal developer portal makes the presence of trusted committers known, just like software ownership can be driven through a portal. Having a portal can ensure “trusted committers” for each service are known and rewarded by:

  • Including an automatically updated “trusted committer” schedule.
  • Assigning a “trusted committer” tag or property to the developer who is currently serving in this role.
  • Gamifying the contribution of each trusted committer by maintaining a dashboard (depicting, for example, the number of PRs merged under their watch or the speed with which they respond to each PR).

Finally, Bonewald notes that serving as a trusted committer takes developers away from writing code, so passively recording their contributions using a portal is an excellent way to provide objective performance metrics in year-end performance conversations.

Bonewald suggests a promotion path to “fellow” for developers who excel as trusted committers, which could be a tag or property depicted proudly on their user profile in a portal.

Developers may find it helpful to view the current trusted committer for a service they’ve discovered. Trusted committers will also find it helpful to be identified using an automatically updated schedule.
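
As a rough sketch of the schedule idea, the snippet below computes who is on duty for any two-week rotation window; the roster, the cadence and the function name are illustrative assumptions, not any particular portal's API.

    from datetime import date

    # Hypothetical roster and rotation start date; a real portal would pull
    # these from its service catalog rather than hard-coding them.
    ROSTER = ["alice", "bob", "carol"]
    ROTATION_START = date(2024, 1, 1)

    def current_trusted_committer(today=None):
        """Return the developer on duty for the current two-week rotation."""
        today = today or date.today()
        windows_elapsed = (today - ROTATION_START).days // 14
        return ROSTER[windows_elapsed % len(ROSTER)]

    print(current_trusted_committer())                    # whoever is on duty today
    print(current_trusted_committer(date(2024, 1, 20)))   # second window: bob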

Boosting Discoverability

This method and the next are particularly important for organizations that have grown inorganically through acquisitions. Whether acquired companies have become part of a single legal entity or remain subsidiaries, the administrative burden of consolidating into a single source code management tool, or of adding all developers to all existing source code management tools, is insurmountable. Without doing so, InnerSource efforts tend to languish in slide decks instead of thriving in the daily work of developers.

An alternative to consolidating tools or organizations is to integrate all existing repositories into a single catalog that acts as a foundation for a portal, where developers can discover metadata about all available services without exposing the source code by default. In doing so, developers can understand what a service does, how to contribute to it and who the trusted committer is without ever seeing the source code. This immediately reduces duplication of both services and APIs.

Being Able To Send Access Requests to the Right Person

Once developers are prepared to contribute to or use the service they’ve discovered using a portal, they can use a self-service action to request access to only the repository in question. By implementing dynamic approvals, this request can be sent to the right person, whether that is the trusted committer, product manager or technical lead.

Access to a repository can be accomplished with a dropdown and a brief message, then can be routed to the trusted committer (or whoever is best to field these requests).

Creating New Services That Are InnerSource-Ready

Engineering organizations that do not use a portal already struggle with streamlining new service creation: Developers must submit individual, co-dependent tickets for a new repository, new pipelines, new project management tools and more. Adding InnerSource requirements to scaffolding a new service is yet another trigger for developers to switch contexts when they should be — and want to be — writing code.

A welcome alternative to a ticket-driven process is a self-service action that allows developers to easily satisfy these requirements from the beginning. Instead of directing them to find, modify and add the InnerSource documentation requirements (README.md, CONTRIBUTING.md, GETTINGSTARTED.md and HELPWANTED.md), simply ask them to fill out the minimum requirements for these files from the beginning in a self-service form. The automation then creates the new repository, pipeline, project-management tool and other resources, and writes these files to the new repository, allowing developers to shift their focus to writing the code for the new service almost immediately.

Auto-populate the templates in this self-service action to ensure developers provide the right information from the beginning.
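
Below is a minimal sketch of what that automation step could look like; the form fields, file templates and function name are assumptions made for illustration, not a specific portal's self-service schema.

    from pathlib import Path

    # Hypothetical scaffolding step: take the fields a developer filled out in
    # the portal's self-service form and write the InnerSource docs into the
    # newly created repository working copy.
    def scaffold_innersource_docs(repo_dir, form):
        repo = Path(repo_dir)
        repo.mkdir(parents=True, exist_ok=True)
        files = {
            "README.md": f"# {form['service_name']}\n\n{form['description']}\n",
            "CONTRIBUTING.md": f"Contribution guidelines owner: {form['trusted_committer']}\n",
            "GETTINGSTARTED.md": f"Getting started:\n{form['getting_started']}\n",
            "HELPWANTED.md": f"Help wanted:\n{form['help_wanted']}\n",
        }
        for name, body in files.items():
            (repo / name).write_text(body)

    scaffold_innersource_docs("payments-service", {
        "service_name": "payments-service",
        "description": "Handles card payments for the storefront.",
        "trusted_committer": "@alice",
        "getting_started": "Run make dev to start a local instance.",
        "help_wanted": "Add retry logic to the webhook handler.",
    })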

Scorecards for Services

The approach above will satisfy InnerSource requirements for new services, but organizations tend to have a vast number of existing services that must be evaluated for compliance with InnerSource standards. Before instructing an InnerSource or DevOps team to create a repository scanner that evaluates all repositories, consider using a custom scorecard in a portal. A scorecard can be used to define, measure and track metrics related to each service or entity in an internal developer portal. In this case, a scorecard can help establish metrics to grade compliance with InnerSource standards, which will help managers or team leads understand the gaps in existing services, then drive time-bound initiatives to fill those gaps.

Before building a repository scanner to check for InnerSource standards, consider using a scorecard instead.
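
As a sketch of the scorecard idea, the check below grades a repository on the four InnerSource documentation files mentioned earlier; the equal 25-point weights are an assumption for illustration, not any vendor's scoring model.

    from pathlib import Path

    # Hypothetical scorecard: each required InnerSource doc is worth 25 points.
    CHECKS = {"README.md": 25, "CONTRIBUTING.md": 25,
              "GETTINGSTARTED.md": 25, "HELPWANTED.md": 25}

    def innersource_score(repo_path):
        """Return a score out of 100 plus the list of missing files."""
        repo = Path(repo_path)
        present = {name: (repo / name).is_file() for name in CHECKS}
        score = sum(points for name, points in CHECKS.items() if present[name])
        return {"score": score, "missing": [n for n, ok in present.items() if not ok]}

    print(innersource_score("."))   # e.g. {'score': 25, 'missing': [...]}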

Conclusion

By implementing a portal and deliberately configuring it to serve InnerSource purposes, engineering leaders can enjoy the benefits of InnerSource in their organizations. Developers will similarly enjoy the benefits of enhanced discoverability and the ability to easily scaffold a new InnerSource-ready service and quickly find the right person to support their contributions.

The post How To Implement InnerSource With an Internal Developer Portal appeared first on The New Stack.

]]>
Rockset Users Stranded by OpenAI Acquisition: Now What? https://thenewstack.io/rockset-users-stranded-by-openai-acquisition-now-what/ Wed, 28 Aug 2024 17:00:55 +0000 https://thenewstack.io/?p=22757011

In June 2024, Rockset users got some stressful news: OpenAI had purchased the company, and current users of its platform

The post Rockset Users Stranded by OpenAI Acquisition: Now What? appeared first on The New Stack.

]]>

In June 2024, Rockset users got some stressful news: OpenAI had purchased the company, and current users of its platform had 90 days to find another solution. Evaluating new solutions, integrating them, and managing data migrations when the entire process is planned is already challenging enough.

But when it’s unplanned like this? When a solution like Rockset is tightly woven into mission-critical applications? It’s chaos and intensely stressful for the teams that must scramble to adapt.

Some will say it’s the cost of doing business. Others will argue that it’s better to use open source solutions, though these can become closed source, lose their maintainers, or be infeasible for other reasons. Some even argue for the most untenable approach: to go it alone, reinvent the wheel, and build it all yourself. But that’s nearly impossible when there are only 90 days.

Engineers and technical teams must be resilient, agile, and adaptable, like the solutions they build. Still, leaders and managers must also consider the human side of being forced to make a significant, unexpected pivot to another solution.

It’s summer (at least in the northern hemisphere) — the season of family vacations, weddings, get-togethers, barbeques. But now, at least for current Rockset users, there’s a sudden, unexpected change in a different kind of season. It’s not just the temperature going up, but also the pressure.

So, how should a migration like this be handled? Implementing policies, testing, validation, and ensuring the process is secure are all important. But let’s zoom out and look at the bigger picture and consider how to handle migration in a way that supports both your business and your talent.

Be Agile, Not Rushed

Ninety days isn’t much time — but it is enough time to evaluate several solutions and undergo a trial process. Business and engineering leaders need to set up preliminary calls with several solutions and see which ones are worth trying in more depth. In this case, being agile means being flexible and adapting based on the solutions available to your organization. It means giving yourself time to understand the transition process and create a strategy. This ties into the next point — knowing when to hold and when to fold.

Know When To Hold and When To Fold

How well was your prior integration working for you? Was there room for improvement — perhaps new opportunities with another solution? Do you need to keep and migrate all the data into a new solution? In this case, “holding” means trying to keep everything as close as possible to how it was — but that may not be possible. “Folding” doesn’t mean giving up; it means seeing if the next hand you draw is a better fit.

From a data perspective, if you already have a short retention window or are not regularly querying all of your data, you may only need to migrate some of it. Or you can move most of it to a storage bucket and worry about it later. If so, it’s okay to fold. Meanwhile, if you have mission-critical data, that’s a hold — and something to determine in the trial/POC process.

From a features perspective, Rockset has a unique range of features (and some weaknesses, too). Which of the features are absolute must-haves for your integration? Which are nice-to-haves that you can live without in the short term? And which weren’t essential? For example, Rockset distinguished itself as a rare solution combining OLAP functionality with mutable data. However, many use cases involving real-time analytics and log data don’t require mutability. Immutability is typically preferable for log data.

Which features and data must you hold, and which can you live without?

It’s All About the People

This is where people are important — not just your people but those working for the vendors with solutions you’re considering. You need to understand the feasibility of a solution, get it up and running quickly, and then integrate it into your business. It’s okay to need help. You’ll probably need all the help you can get. The trial/proof of concept process is an opportunity to quickly get up and running, typically at no cost, with support from another organization that profoundly understands how their product works.

If you need to migrate your data, the success team of your next solution should be able to support that. The same applies to setting up different data sources, ensuring that ingested data has the shape you expect, optimizing queries for your use case, and more.

Great people and support are absolute musts for this kind of rapid transition. Of course, work with the people at your current solution as much as possible. For example, Rockset has pledged to support its current customers with the transition.

Don’t Go It Alone

Do you have a team of superstar engineers that can set up an MVP of a real-time analytics database in a few months? Even if you do, resist the temptation to DIY. There’s too much risk involved, at least in the short term. It’s not just about building an MVP — it’s about the optimizations and consistency you need to develop into a viable solution — a polished solution that works as expected. Trying to build a DIY solution in such a compressed time frame is an example of being rushed, not agile.

Most importantly, trying to DIY will likely divert resources from the products you need to build. A DIY solution can be part of a long-term plan but not a surprise 90-day plan.

Open Source Isn’t the Only Answer

After an acquisition like this, one typical response is that open source is the answer. The reasoning is that open source will always remain available, and you won’t have to deal with surprise acquisitions. Open source is often a great solution, and organizations like the Apache Software Foundation are building powerful tools like Flink, Pinot, and Druid that can support real-time analytics use cases.

However, in the short term, putting together an open source solution that fits your use case has some of the same challenges as DIY. Even a robust open source community that is happy to answer your questions differs from a customer success team that’s deeply motivated to integrate the solution for you and win your business. If it were easy to replicate what Rockset does, it wouldn’t have existed in the first place. Customers would use RocksDB instead or some other open source solution. There is a cost with closed source, but the value is added. Specifically, the complex problems around scalability, reliability, and efficiency are already solved for you.

Ninety days is not much time, so beware of rushing into building something. Yes, open source is often a good answer, but be realistic about how much time you’ll need to set up a solution, how many resources it will take, and how much of a diversion it would be from your core business goals to go that route.

One Size Doesn’t Necessarily Fit All

Rockset built a unique product. There’s a reason that OpenAI spent nine figures to acquire it — and why they don’t want their competition to use it. It combines OLAP with other features like a converged index and mutable data.

Finding another solution that’s a very close replacement may be tempting, but this goes back to knowing when to hold and when to fold. You are going to lose some hands. Businesses that must leave Rockset (or that face the same issue with another solution) are already facing a loss. They must now unexpectedly spend time, energy, and resources finding new options.

You shouldn’t just evaluate one solution; you may even find that you want to use more than one solution. For example, in the short term, that could mean using a real-time analytics platform that provides powerful real-time analytics functionality and then sending some of the data from that platform to associated tables in another solution for machine learning applications. It could mean combining several closed source solutions, open source, and even some DIY in the long term.

Protect Your Business — and Your Talent

Surprise migrations don’t just threaten your business — they put tremendous pressure on your teams. To navigate this terrain, you need all the talent you can get — from your teams, leaders, and other businesses that have built solutions designed to fit your use case. For Rockset users, it’s a significant loss that they will no longer have access to the talented people and unique solutions that Rockset built. But there are plenty of other solutions out there, and there’s a good chance that there’s one (or even multiple solutions) that can unlock new business use cases for you.

The post Rockset Users Stranded by OpenAI Acquisition: Now What? appeared first on The New Stack.

]]>
How To Nail Your GenAI Product Data Strategy https://thenewstack.io/how-to-nail-your-genai-product-data-strategy/ Wed, 28 Aug 2024 15:52:02 +0000 https://thenewstack.io/?p=22756970

Enterprises are investing heavily in new applications that leverage generative AI to boost productivity and improve decision making, and many

The post How To Nail Your GenAI Product Data Strategy appeared first on The New Stack.

]]>

Enterprises are investing heavily in new applications that leverage generative AI to boost productivity and improve decision making, and many startups hope to capitalize on the opportunity with innovative new products. But how do you build and deploy a successful AI-powered product?

First, don’t build a technology in search of a solution; build a technology that addresses a real-world business need. That means looking at what businesses are really trying to achieve, whether that’s improving customer support, growing sales, increasing supply chain efficiency or something else entirely.

Given that data is central to generative AI, you also need to have the right data strategy for your AI-powered product. This is important for you as an application builder, because it determines your costs and the speed at which you can innovate, and for your enterprise customers, for whom data governance and security are paramount. Nailing this data strategy will be the difference between success and failure.

Use Data To Drive Insight and Actions

In virtually any line of business (LOB), the overall goal is to start with data, derive an insight from that data and then take an action based on that insight. The power of AI is that it can greatly simplify and shorten that process from data to action, in part through the power of large language models (LLMs).

This approach of going from data to insight to action is a good rubric to keep in mind when conceiving your application or tool. It can apply to products built for engineers, such as tools that help developers write code or manage infrastructure.

It also applies to the business users in marketing, product management, finance or other LOBs who will use your product. Most businesses have a ton of information about their customer interactions and spending habits, and they want to know who to target with which specific offers to increase sales. Instead of waiting for an engineer to build a SQL query, a marketer can interact with an AI-powered application in natural language to gain insights and recommended actions. Those actions create more data, which helps the system learn and improve, creating a positive loop.

Solving a real business need is one piece of the puzzle, but an application still needs to be differentiated to succeed, especially when your competitors building AI applications all have access to the same models and tools. This is where a strong data management strategy becomes critical, because enterprises care deeply about the security of their data.

Make Governance and Security Your Foundation

Concerns about governance and security are top of mind for enterprises, and they want to protect their sensitive data from inappropriate access. This is especially relevant when running AI applications against enterprise data.

One way to achieve this is by not requiring customers to move their data at all. Most Software as a Service (SaaS) apps require companies to upload their data to a third party, which means the SaaS provider is responsible not just for managing the application code, but also for properly securing the customer data the apps use. They ingest the customers’ data within their own app platform.

You can give potential customers far more confidence by bringing your application to their data where it already lives, instead of needing them to upload it to your platform. Achieve this by deploying your application code within the perimeter of a customer’s own cloud platform, or by connecting your application directly to the customer’s data platform. In each case, the customers’ data stays where it is, allowing them to decide what permissions they want to grant and to monitor how their data is used.

Enterprises also need to know that their data won’t be used to train models in a way that benefits competitors. Be very explicit about your data policies, including guarantees about how a customer’s data will and will not be used.

Provide Flexible, Usage-Based Pricing

Most enterprises are experimenting quickly with a variety of AI tools and don’t want to be tied to an annual contract or even a monthly SaaS license. For many AI-powered applications, usage-based pricing is highly attractive because it allows customers to pay only for the value they receive.

To determine the right pricing model, think about what your application provides — what’s the prime unit of value? If it’s a product for data transformation, you might charge based on the number of records transformed. Eventually, we may see enterprise AI products that charge based on the number of questions end users ask.
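
To make the idea of metering a prime unit of value concrete, here is a small sketch of tiered, usage-based billing for a hypothetical data-transformation product; the tier boundaries and per-record prices are invented for illustration only.

    # Hypothetical price tiers: (records up to this count, price per record in USD).
    TIERS = [
        (1_000_000, 0.0010),
        (10_000_000, 0.0007),
        (float("inf"), 0.0004),
    ]

    def monthly_bill(records_transformed):
        """Charge each record at the rate of the tier it falls into."""
        total, lower = 0.0, 0
        for upper, price in TIERS:
            in_tier = max(0, min(records_transformed, upper) - lower)
            total += in_tier * price
            lower = upper
            if records_transformed <= upper:
                break
        return round(total, 2)

    # 1M records at $0.0010 plus 1.5M at $0.0007 = $2,050.00
    print(monthly_bill(2_500_000))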

To further encourage adoption, provide a way for the customer to try the product at no charge before committing to purchase. Ultimately, you want to make the experience as friction-free and risk-free as possible.

Build Trust Through Transparency

End users need to trust your product to use it, so avoid the classic “black box” problem of AI by being transparent about how it arrives at answers and recommendations. For example, if you offer a system that helps sales reps decide which specific offers to make to a customer, to build trust, show the sources of data used alongside the recommendations.

You might even display the level of confidence in a recommendation, so that end users have as much information as possible to make their decision and aren’t being asked to blindly trust an algorithm that they don’t understand.

Optimize Your Development and Operating Costs

Most startups have limited funding, and there are ways to manage the costs of generative AI while still building a powerful product. For example, there are many different LLMs to choose from, and your selection will depend on what capabilities you need in terms of model size, model type and performance. You want the models that provide the functionality you need at the least cost.

Fine-tuning a model is technically challenging and requires specialized expertise, which means hiring expensive engineering talent in a competitive hiring market. Retrieval-augmented generation (RAG) can be a more cost-effective alternative to fine-tuning and provides sufficiently accurate results for many applications.
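
The sketch below shows the basic RAG shape: retrieve the most relevant documents, then ground the model's answer in them. The embed() and generate() functions are deliberately toy stand-ins (a hashed bag of words and an echo), since the real choices of embedding model and LLM are exactly the decisions described above.

    import numpy as np

    DIM = 256

    def embed(text):
        # Toy stand-in for a real embedding model: hashed bag of words.
        vec = np.zeros(DIM)
        for token in text.lower().split():
            vec[hash(token) % DIM] += 1.0
        return vec

    def generate(prompt):
        # Toy stand-in for an LLM call; a real system would call a model here.
        return "[model answer would be generated from]\n" + prompt

    def retrieve(query, docs, k=2):
        # Rank documents by cosine similarity to the query.
        q = embed(query)
        mat = np.stack([embed(d) for d in docs])
        scores = mat @ q / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q) + 1e-9)
        return [docs[i] for i in np.argsort(scores)[::-1][:k]]

    def answer(query, docs):
        context = "\n".join(retrieve(query, docs))
        return generate(f"Answer using only this context:\n{context}\n\nQuestion: {query}")

    docs = ["Refunds are processed within 5 business days.",
            "Premium support is included in the enterprise plan."]
    print(answer("How long do refunds take?", docs))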

It takes a lot of experimentation to arrive at these decisions, so engineers need a development environment that allows them to test different models and training patterns in a way that’s cost-efficient and doesn’t lock them into an approach that turns out to be incorrect. Generative AI projects are susceptible to technical debt, but these debts can be managed if engineers are thoughtful and deliberate about the development process.

Conclusion

As enterprises and startups ride the wave of generative AI, they must anchor their strategies in solving genuine business needs. The differentiator isn’t solely the technology but also the data strategy that underpins your AI application. Bringing your AI application to the data — emphasizing security, governance and operational transparency — is not just good practice; it’s a competitive advantage.

The post How To Nail Your GenAI Product Data Strategy appeared first on The New Stack.

]]>
eBPF Security Power and Shortfalls https://thenewstack.io/ebpf-security-power-and-shortfalls/ Wed, 28 Aug 2024 15:12:41 +0000 https://thenewstack.io/?p=22756120

Every security tool and platform that purports to offer comprehensive security should at least include some aspects of eBPF. However,

The post eBPF Security Power and Shortfalls appeared first on The New Stack.

]]>

Every security tool and platform that purports to offer comprehensive security should at least include some aspects of eBPF. However, for an organization’s security posture, eBPF should not, and will not, cover the entire spectrum of security use cases that are demanded and increasingly urgent in today’s DevOps and platform engineering practices.

But let’s just say eBPF is a necessary element for security. And it does have its limitations, which must either be expanded upon or accounted for in security offerings — or, in some cases, addressed separately.

A proper security provider that makes heavy use of eBPF will also build on eBPF’s functionalities to offer a comprehensive platform. Even so, it will be difficult to find a security platform that does not require complementary tools on top of its eBPF capabilities.

Ben Hirschberg, CTO at security provider ARMO and creator of the CNCF open source project Kubescape, both agreed and disagreed. “I don’t think many people assume that eBPF is meant to solve everything. It is like saying that someone assumes that a hammer is a tool for everything,” Hirschberg said. “In theory, one could prove that a hammer cannot bring down a tree, but an ax will always be better for that. No one meant that eBPF is a single tool/technology to solve all security needs.”

That said, a provider is critical if an organization wants to make effective use of eBPF. Most enterprises lack the expertise and skills necessary to build and integrate eBPF-based functions, according to Gartner. However, Gartner also recommends that organizations “seek eBPF-based Kubernetes CNI [container network interface] solutions when scale, performance, visibility and security are top of mind.”

Most security platforms use eBPF to monitor and analyze network traffic, system calls and other kernel-level activities, enabling real-time detection of suspicious behaviors or anomalies, said Utpal Bhatt, CMO at Tigera. “However, a holistic security approach should integrate prevention and risk mitigation strategies,” he said.

A significant percentage of attacks are from known malicious actors, for example, and a simple IDS/IPS system with threat feed integration can prevent these attacks “from happening in the first place,” Bhatt said.

Sources like MISP (Malware Information Sharing Platform) store, collect and share threat intelligence and indicators of compromise (IOCs) such as file hashes, Bhatt said. “An effective security strategy will be to prevent such malicious software from execution. In addition, most systems should assume that they will get attacked and should have access to quick risk mitigation tools,” he said.
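
As a minimal sketch of that strategy, the snippet below checks files against a set of known-bad SHA-256 hashes of the kind a threat feed export would provide; the hash set and the blocking logic are placeholders for illustration, not MISP's API.

    import hashlib
    import sys

    # Placeholder IOC set; in practice these hashes would come from a threat
    # feed such as a MISP export rather than being hard-coded.
    KNOWN_BAD_SHA256 = {
        "0000000000000000000000000000000000000000000000000000000000000000",
    }

    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def is_known_malicious(path):
        return sha256_of(path) in KNOWN_BAD_SHA256

    for path in sys.argv[1:]:
        print(("BLOCK " if is_known_malicious(path) else "allow ") + path)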

For example, in the case of Kubernetes, deploying mitigating network controls can reduce the blast radius of an attack, Bhatt said. “eBPF is one of the most significant innovations when it comes to threat detection,” he said. “However, it may not be able to detect everything and in the cat-and-mouse play between attackers and defenders, prevention and risk mitigation is equally important.”

The Workings

As Liz Rice defines in “Learning eBPF,” system calls, or syscalls, are the interface between user-space applications and the kernel: “If you restrict the set of syscalls an app can make, that will limit what the app is able to do. If you have been using Docker or Kubernetes, there is a very good chance you have already used a tool that uses BPF to limit syscalls: seccomp,” as Rice writes. Seccomp is widely used in the container world (in the form of the default Docker seccomp profile), but this was just the first step toward today’s eBPF-based security tools, Rice said.

With seccomp BPF, a set of BPF instructions is loaded that acts as a filter, Rice writes. Each time a syscall is called, the filter is triggered. The filter code has access to arguments that are passed to the syscall, allowing it to make decisions based on both the syscall itself and the arguments that have been passed to it. The outcome can include actions such as allowing syscalls to proceed, returning an error code to the user application space, killing a thread or notifying a user-space application, Rice said.
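
For a feel of what such a filter looks like in practice, here is a minimal sketch that assumes the libseccomp Python bindings (the seccomp module) are installed on a Linux host; the choice of chdir() as the blocked syscall is arbitrary and purely for demonstration.

    import errno
    import os
    import seccomp  # libseccomp Python bindings

    # Allow every syscall by default, but make chdir() fail with EPERM so the
    # effect of the loaded BPF filter is easy to observe from Python.
    f = seccomp.SyscallFilter(defaction=seccomp.ALLOW)
    f.add_rule(seccomp.ERRNO(errno.EPERM), "chdir")
    f.load()  # the generated BPF program is attached to this process here

    try:
        os.chdir("/tmp")
    except PermissionError:
        print("chdir() was rejected by the seccomp filter")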

With eBPF, many more powerful options and capabilities are possible. Where seccomp is limited to the syscall interface, eBPF programs can now be attached to virtually any part of the operating system, Rice said. “Instead of a predefined filter program, eBPF programmers have the flexibility to write custom and bespoke code,” Rice said.

For example, an eBPF program might be attached to a kernel event triggered whenever a file is opened, or when a network packet is received, Rice said. The program can report information about that event to user space for observability purposes, or it can potentially even modify the way that the kernel responds to that event. In network security, eBPF programs can be used to drop packets to comply with firewall rules or to prevent a DDoS attack, Rice said.
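
As a small taste of the observability case, the sketch below uses the BCC Python bindings to attach an eBPF program to the openat() syscall tracepoint and report each opened file to user space; it assumes a Linux host with bcc installed, a reasonably recent kernel and root privileges.

    from bcc import BPF

    # eBPF program: runs in the kernel each time the openat() syscall is entered
    # and reports the requested filename via the trace pipe.
    PROG = r"""
    TRACEPOINT_PROBE(syscalls, sys_enter_openat)
    {
        char fname[64];
        bpf_probe_read_user_str(&fname, sizeof(fname), args->filename);
        bpf_trace_printk("openat: %s\n", fname);
        return 0;
    }
    """

    b = BPF(text=PROG)  # compiled, checked by the eBPF verifier and loaded here
    print("Tracing openat() calls... Ctrl-C to stop")
    b.trace_print()     # stream the kernel-side output to stdout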

eBPF is being used in a wide range of applications and has become the foundation for many successful commercial projects, and it has been particularly influential in driving advancements in security. The challenges that once limited its adoption have largely been overcome. Early versions of eBPF had a limit of 4096 instructions, for example, but that limit was lifted to a million instructions in kernel 5.2, and even more complexity is now possible by chaining eBPF programs together with eBPF tail calls. This has led to the development of various tools and platforms that not only address these obstacles but also improve efficiency in areas like security, observability, networking and beyond.

Open Source Build Ons

Many companies or organizations aiming to leverage eBPF may lack the resources or the expertise to fully capitalize on eBPF’s benefits directly. Using these features effectively requires deep Linux knowledge, which is why most enterprises don’t write eBPF-based tooling themselves. This is where eBPF tool and platform providers play a crucial role. Their deep understanding of Linux (and Unix) enables them to create advanced capabilities in eBPF code.

“It is true that eBPF requires deep understanding and skills that most organizations lack. There is a very limited number of engineers who can develop and maintain eBPF code,” Hirschberg said. “Therefore it is important to have good vendors (open source or commercial) that enable the power of eBPF for the common user.”

Falco is an excellent case in point, illustrating both the power and the limitations of eBPF, although its limitations are minimal. Used on its own, it handles a range of runtime threat-detection tasks, especially when configured for distributed Kubernetes and cloud native infrastructures.

Falco covers a wide range of workloads and infrastructure, extending back, of course, to the kernel, as eBPF does so well. It is one of the open source projects that receives significant attention.

However, it does not cover every security aspect on the checklist, which creates opportunities for other security providers to build security solutions with Falco or provide separate functionalities that aren’t necessarily tied to eBPF.

Interestingly, Falco was initially created using kernel modules when it was first released at the end of 2016, without any eBPF involvement or integration. It was accepted as a cloud native project in 2018, when Sysdig contributed Falco to the CNCF. However, it wasn’t until 2021 that Falco’s creators began maintaining the eBPF probe and libraries as a sub-project of the Falco organization, as Loris Degioanni and Leonardo Grasso outlined in their book, “Practical Cloud Native Security With Falco.”

Many enterprises are reluctant to use kernel modules because a bug in the kernel can bring down the whole machine, and kernel modules will not have had the same level of testing and field-hardening as the kernel itself, so the chance of running into such a bug may be an unacceptable risk, Rice said. eBPF’s advantage over kernel modules is the eBPF Verifier, which analyzes programs as they are loaded to ensure that they can’t crash the kernel, eliminating this concern, Rice said.

“Moving away from kernel modules was a major milestone. Everything we implement today in eBPF could have been implemented in kernel modules in the past,” Hirschberg said. “However, kernel modules had a tendency to break the kernel and their perception of being unstable made the adoption of tooling based on them very limited.”

The post eBPF Security Power and Shortfalls appeared first on The New Stack.

]]>