Wednesday, July 24, 2024

Hedgehog is the AI network solution builder - plus more

If you are actively looking to build out AI network infrastructure and want to use cost-effective white box switching, one of the challenges is choosing the software you will use to design, deploy, operate, and manage those switches, because doing that by hand via a CLI will not be fun. Hedgehog wants to be the software you use to solve this problem, along with a few more use cases.

They kicked off their Network Field Day 35 (#NFD35) presentation with the pitch of saying Hedgehog is the AI network solution builder. They are the software to make this happen, but you can definitely use them for more than that, so you have to read between the lines to get the bigger picture on where they fit in the market.

I don't ding them for riding the AI wave; that is what is happening at the moment. And they aren't claiming they are "using AI" to build or deploy the fabric. They are saying you can use their solution to easily build out network infrastructure that makes AI workloads run faster and with better performance, without necessarily needing a network engineering team to do it. That is pretty attractive for companies that are mainly investing their money in teams that can build interesting AI models and data sets.



The pitch is that folks who are building and running AI workloads likely grew up with, and are used to, cloud-first tooling and services. They want an API and, for networking, a VPC-like concept to build out what they need to run their AI workloads. Because Hedgehog provides that user experience for white box, on-premises environments, the argument is that your existing cloud AI teams can easily get things up and running with a look, feel, and workflow that matches what they do in a public cloud.

I must admit, I have no idea how many companies are running AI workloads, want to avoid recurring cloud costs, and have decided to deploy colocated or dedicated infrastructure on their own to help drive down costs. I think that is the intersection of two customer markets in a small Venn diagram overlap, but clearly Hedgehog can provide a solution for those folks.

I think the compelling story is that you can build an Ethernet fabric easily with the same networking workflow and concepts you are used to from a public cloud service like AWS or Azure. This is important for organizations that are trying to streamline their AI workload processes and want the cost savings of running AI workloads on fixed capital investments rather than paying recurring costs in a public cloud. The fabric still leverages VXLAN and EVPN, but that is abstracted away from the operator. The reality is, this is where network automation is going: a controller that builds out Ethernet fabric solutions in a standard way, so you don't need to touch or maintain the underlay, plus an API that those who want to consume the fabric can use to set it up the way they need it with a simple abstraction.
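To make the abstraction idea concrete, here is a minimal sketch of what "ask for a VPC, let the controller deal with VXLAN" could look like. This is not Hedgehog's API. The class `FabricController`, the method `create_vpc`, and the starting VNI of 10000 are all hypothetical names and values picked for illustration.

```python
from ipaddress import ip_network


class FabricController:
    """Toy controller (hypothetical): hides VNI allocation behind a VPC-style request."""

    def __init__(self, first_vni=10000):
        # first_vni is an arbitrary starting point chosen for this sketch.
        self._next_vni = first_vni
        self._vnis = {}  # (vpc name, subnet name) -> VXLAN VNI

    def create_vpc(self, name, subnets):
        """Take {subnet name: CIDR} and return {subnet name: allocated VNI}."""
        # Validate every prefix first so a bad request allocates nothing.
        for cidr in subnets.values():
            ip_network(cidr)  # raises ValueError on a malformed prefix
        for subnet in subnets:
            self._vnis[(name, subnet)] = self._next_vni
            self._next_vni += 1
        return {subnet: self._vnis[(name, subnet)] for subnet in subnets}


controller = FabricController()
vpc = controller.create_vpc(
    "training", {"gpu": "10.0.1.0/24", "storage": "10.0.2.0/24"}
)
```

The person consuming the fabric only names a VPC and its subnets. Which VNI backs each subnet is the controller's problem, which is the point of the abstraction.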

Let me repeat that - this is where network automation is going. I'm not saying Hedgehog is the only way to do this, but they demonstrate where the industry is going to go and how the network industry is going to change.

Hedgehog VPC - multi-tenancy the cloud way


It is also worth noting that the solution is built on Kubernetes, but you don't need to know anything about that; it is just under the hood. The other interesting part is that the software is all open source. Their goal is to provide a fully automated network operations solution at a reasonable cost.

So who is the target?

I think, given what Hedgehog can do and the capabilities it provides, the target customer is really organizations that want to run networks that can scale and change without spending a crazy amount on network architects and engineers to design and deploy them. In addition, the solution is cost conscious, so those willing to purchase white box networking hardware to drive down their capital costs, compared to buying a Cisco, Juniper, Arista, or other mainstream network switch solution, will be an ideal customer. With open source software, low-cost commodity hardware, and the cost focused on annual software maintenance, it will be very appealing to shops that run lean, with low overhead and staff.

So where is Hedgehog going?

Hedgehog - What is next?

It is clear that Hedgehog can only really innovate at the pace at which white box network platforms gain the right supported capabilities. That being said, having a service gateway to make their cloud-like experience consistent with what the public clouds provide will be critical. The DPU integration will give them scale and security beyond what most commercial network solutions are providing today. And finally, having flow scheduling ability from a DPU gives them even more capabilities to optimize AI workloads.

Is there room in the market for Hedgehog? Yes. Will they likely get bought up to help a larger network vendor who wants to own that market? Yep, that will likely happen at some point. For now, if you are looking for where the network automation landscape is going, you might want to check out what Hedgehog is providing the industry: open source, industry standard network automation through an API, with the key value provided via annual software maintenance. I'll be keeping an eye on what they do.

- Ed


In a spirit of fairness (and also because it is legally required by the FTC), I am posting this Disclosure Statement. It is intended to alert readers to funding or gifts that might influence my writing. My participation in Network Field Day, a Tech Field Day event, was voluntary and I was invited to participate. Tech Field Day events are hosted by Gestalt IT (part of The Futurum Group) and my hotel, transportation, food and beverage was/is paid for by Gestalt IT for the duration of the event. In addition, small swag gifts or donations were/are provided by some of the sponsors of the event to delegates (I don't accept gifts but I do ask the sponsors to donate to causes that support Mental Health). It should be noted that there was/is no requirement to produce content about the sponsors and any content produced does not require review or editing by Gestalt IT or the sponsors of the event. So all the spelling mistakes, technical missteps, incorrect opinions, and grammar errors are my own.

Tuesday, July 16, 2024

Intel - Is it an IPU or a DPU or what?

Intel has developed and sold classic Ethernet Network Interface Cards (NICs) for a long time, but many might not be as familiar with their product offerings in the SmartNIC and more advanced NIC categories. Intel breaks down their product offering as follows:

Intel Ethernet Connectivity Solutions

Intel presented on the work they are doing around their Infrastructure Processing Unit (IPU), which they refer to as an "Improved DPU" or Data Processing Unit. It fits in the general bucket of "SmartNIC" and is developed by the Intel NEX Cloud Connectivity Group. This post focuses on the IPU, since that is what they presented at #NFD35. I must admit, I am interested in hearing more about their AI Optimized solutions, as I am sure they are being leveraged by some very large organizations for interesting workloads. Perhaps a future NFD?

There were two features in the IPU that infrastructure engineers should know about. The first is the capability to build out reliable transport between two hosts that both have an Intel IPU. The Ethernet fabric no longer needs to run special queuing and management to deal with congestion and microburst issues; instead, the IPU running Falcon, and leveraging programmable congestion control, deals with it. Effectively, Falcon is the method for reliable transport over an existing lossy fabric, which brings a lot of options to companies who may not want to build out a dedicated fabric for running storage or AI workloads.
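The general idea of getting reliable delivery out of a lossy fabric by doing the work at the endpoints can be sketched in a few lines. This is not Falcon and does not model its congestion control. It is a plain stop-and-wait retransmission loop over a simulated lossy link, and `send_reliably`, `loss_rate`, `seed`, and `max_tries` are all hypothetical names chosen for this example.

```python
import random


def send_reliably(packets, loss_rate=0.3, seed=7, max_tries=50):
    """Deliver every packet in order over a simulated lossy link.

    Each transmission is dropped with probability loss_rate; the sender
    keeps retransmitting a packet until one copy gets through.
    """
    rng = random.Random(seed)  # fixed seed keeps the simulation repeatable
    delivered = []
    transmissions = 0
    for seq, payload in enumerate(packets):
        for _ in range(max_tries):
            transmissions += 1
            if rng.random() >= loss_rate:  # this copy survived the fabric
                delivered.append((seq, payload))
                break
        else:
            raise RuntimeError(f"packet {seq} was not delivered")
    return delivered, transmissions


delivered, transmissions = send_reliably(["a", "b", "c"])
```

The fabric in this sketch does nothing special. All of the reliability comes from the sender retrying, which is the same division of labor described above: the endpoint handles loss so the switches don't have to.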

In a shared fabric environment, it can be difficult to structure and provision all the access ports with the right queuing and policies. Given that difficulty, it might make sense, for smaller networks and diverse compute environments, to simply purchase the more advanced IPUs for the servers that require them and have the IPU deal with the lossy fabric issues.

The other feature was demonstrating the use of the IPU for general compute capabilities and also AI inference, and the markets that could potentially use the solution. They are definitely targeting a wide audience of infrastructure engineers who might need to run services and workloads but might not have the capacity, budget, or fabric design to support what they are trying to do. Intel sees the following areas as potential good use cases for their IPU.

IPUs In & Beyond the Data Center

 

You can watch the overview presentation from Thomas Scheibe w/ Intel at:



If you want more information about their Reliable Transport over Lossy Fabrics, which is called Falcon, then check out:



Intel also provided some actual demos and you can watch those at:



What is always a little interesting about Intel and their solutions is that, typically, you and I aren't buying directly from Intel. You are normally purchasing their products through a distributor or hardware supplier like HPE, Dell, or Supermicro. But Intel still wants infrastructure engineers to know what their products are capable of, so that when you are building out the next server, you pick the right SmartNIC for your specific needs. So it makes sense they are out providing this information directly to the public, or at NFD events in this case, so you can pick the right solution for your Data Center and Enterprise server networking needs.

- Ed




Tuesday, July 02, 2024

Network Field Day 35

Network Field Day 35 (#NFD35) is happening July 10-11, 2024, and I am fortunate enough to be a delegate for the event. You can check out the full event schedule at the NFD35 website. The sponsor list has been growing, so checking the site is the best way to stay current until the event starts. I recommend watching live if you can, and I believe LinkedIn is likely the best place to catch it.

So far the sponsor line up is:

I will be attending in person, and I will be doing my best to take notes and ask interesting questions. Obviously, there is no way we can cover all the questions that those who are watching remotely might have, but hit us up on X/Twitter using #NFD35, via the Tech Field Day Slack channel, or even via LinkedIn, and we will all do our best to bring up your point.

So, there you go, let's get ready to have some serious fun with NFD35, as the delegate lineup is pretty impressive! If you are at all into networking, then I encourage you to follow along live on the Tech Field Day website or via LinkedIn. If you are interested in being a delegate, you can check out the website; they have all the details up there.

- Ed


In a spirit of fairness (and also because it is legally required by the FTC), I am posting this Disclosure Statement. It is intended to alert readers to funding or gifts that might influence my writing. My participation in Tech Field Day events was voluntary and I was invited to participate. Tech Field Day is hosted by Gestalt IT (part of The Futurum Group) and my hotel, transportation, food and beverage was/is paid for by Gestalt IT for the duration of the event. In addition, small swag gifts or donations were/are provided by some of the sponsors of the event to delegates (I don't accept gifts but I do ask the sponsors to donate to causes that support Mental Health). It should be noted that there was/is no requirement to produce content about the sponsors and any content produced does not require review or editing by Gestalt IT or the sponsors of the event. So all the spelling mistakes, technical missteps, incorrect opinions, and grammar errors are my own.