
Arm Server Update, Spring/Summer 2021

As usual, we are overdue for an update on all things Arm Servers! Today’s announcement of the Armv9 architecture makes this a great time to review the state of Arm Servers and what has changed since our last update.

First, let’s review our last update. Marvell canceled the ThunderX3 product, Ampere had announced the Altra but it wasn’t shipping, AWS Graviton was available, and Nuvia was designing a processor.

Fast forward to today, and Ampere Altras are now becoming available, with limited stock via the Works on Arm program at Equinix Metal, and some designs shown off by Avantek, a channel supplier. The Mt. Snow and Mt. Jade platforms, as they are known, have also been formally designated as “ServerReady” parts, having passed the standards compliance tests.

Nuvia, the startup that was designing a new Arm Server SoC from the ground up, was purchased by Qualcomm, in an apparent re-entry into the Arm Server market (or for use in Windows on Arm laptops?). Don’t forget, Qualcomm previously had an Arm Server part, the Centriq, though they scrapped it a few years ago. So, it now remains to be seen whether the Nuvia design will launch as a server-grade SoC, or pivot to smaller target devices.

The other emerging trend to cover is the role of Arm in the Edge Server ecosystem, where the practice of pushing small servers out of the datacenter and closer to customers and users is rapidly gaining momentum. In this scenario, non-traditional, smaller devices take on the role of a server, and the energy efficiency, small form factor, and varied capabilities of Arm-powered single board computers let them take on workloads previously handled by typical 1U and 2U rackmount servers in a datacenter. Small devices like the Nvidia Jetson AGX, Raspberry Pi Compute Module 4, and NXP Freeway boxes can perform Edge AI, data caching, or local workloads, and send only what is necessary up to the cloud. This trend has been accelerating over the past 12 to 18 months, so we may see more niche devices or SoCs start to fill this market.


Arm Server Update, Fall 2020

The announcement yesterday of the cancellation of Marvell’s ThunderX3 Arm Server processor was a reminder that we were overdue for an Arm Server update!  So, continuing our regular series, here is the latest news in the Arm Server ecosystem.

As mentioned, unfortunately it appears at this time that Marvell has canceled the ThunderX3 Arm Server processor that was shown earlier this year, which would have been the successor to the ThunderX and ThunderX2 parts released previously.  Current rumors indicate that a specialized version of the SoC may survive for an exclusive contract with a hyperscaler, but that means “regular” customers will not be able to acquire the part.  And with no general purpose, generally available part, the ThunderX3 is effectively unavailable.

That leaves AWS, providing the Graviton processor via EC2 cloud instances, and Ampere, with their current generation eMAG Arm Server and forthcoming Altra SoC, as the only server-class Arm processor options left (for now).  The Ampere Altra is brand new and available from our friends at Packet in an Early Access Program, but no specific General Availability date has been mentioned quite yet.  The processor is based on Arm Neoverse N1 cores and offers 80 or 128 cores.

There is another processor on the horizon, though, from Nuvia, a startup formed late last year that is designing an Arm-based, server-class SoC.  Nuvia has said it will take several years to bring their processor to market, which is a typical timeframe for an all-new custom processor design.  So in the meantime, only Amazon and Ampere are left in the market.

The NXP desktop-class LX2160A, as found in the SolidRun HoneyComb, could also be considered for some workloads, but it is a 16-core part based on Cortex-A72 cores.

There is one other Arm Server that exists, but unfortunately it cannot be acquired outside of China:  the Huawei TaiShan 2280, based on the HiSilicon Kunpeng 920.  This is a datacenter part that is likely used by the large cloud providers in China, but it seems difficult (or impossible) to obtain otherwise.  It is a dual processor server, with 64 cores in each processor, totaling 128 cores per server.

As usual, the Arm Server ecosystem moves quickly, and we look forward to seeing what’s new and exciting in our next update!

 


How to Run Rosetta@Home on Arm-Powered Devices


This week, after an amazing Arm community effort, the Rosetta@Home project released support for sending work units to 64-bit Arm devices, such as the Raspberry Pi 4, Nvidia Jetson Nano, Rockchip RK3399-based single board computers, and other SBCs that have 2 GB of memory or more.

Sahaj Sarup from Linaro, the Neocortix team, Arm, and the Baker Lab at the University of Washington all played a role in helping port the Rosetta software to aarch64, test it in the Ralph (Rosetta ALPHa) staging environment, validate the scientific results, and eventually push it to Rosetta@Home.

Now, anyone with spare compute capacity on an Arm-powered SBC running a 64-bit OS can help contribute to the project by running BOINC, crunching data and performing protein folding calculations that help researchers target the COVID-19 spike protein (among other medical and scientific workloads).

Here is a quick tutorial on how to get started, using a native operating system on your devices.  This is not the only way to run Rosetta@Home, but it is intended for technical users who want to run their own OS and manage the system themselves.

Raspberry Pi 4

To fight COVID-19 using a Raspberry Pi 4, you need a Raspberry Pi 4 with 2 GB or 4 GB of RAM.  The Rosetta work units are large scientific calculations, and they require 1.9 GB of memory to run.  You will also need a 64-bit OS, so Raspbian will not work, as it is a 32-bit OS.  Instead, download and flash Ubuntu Server from the official source, located here:  https://ubuntu.com/download/raspberry-pi.

Once the SD Card is written and your Pi 4 has booted up, connect an ethernet cable, and be sure to run ‘sudo apt-get update && sudo apt-get upgrade’ to make sure the system is up to date.  At this point a reboot may be necessary, and once the system comes back up, we can start to install BOINC and Rosetta.  Run ‘sudo apt-get install boinc-client boinctui’ to bring in the BOINC packages.

If you have a 4 GB RAM version of the Pi 4, you can skip this next item.  But if you are using a 2 GB RAM version, you will need to override one setting to cross the 1.9 GB threshold mentioned earlier:  type ‘sudo nano /var/lib/boinc-client/global_prefs_override.xml’ and enter the following to increase the default memory available to Rosetta to the maximum amount of memory on the board:

<global_preferences>
   <ram_max_used_busy_pct>100.000000</ram_max_used_busy_pct>
   <ram_max_used_idle_pct>100.000000</ram_max_used_idle_pct>
   <cpu_usage_limit>100.000000</cpu_usage_limit>
</global_preferences>

Press “Control-o” on the keyboard to save the file, then press Enter to keep the file name the same.  Next, press “Control-x” to quit nano.
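
One note: if the BOINC client was already running when you saved this file, it will not apply the override until it re-reads its preferences.  Assuming the standard Debian/Ubuntu packaging, restarting the service takes care of it, or you can ask the running client to re-read the override file from the BOINC data directory:

sudo systemctl restart boinc-client

or:

cd /var/lib/boinc-client && sudo boinccmd --read_global_prefs_override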

Next, using your desktop or laptop PC, head to http://boinc.bakerlab.org and create an account, and while there, be sure to join the “crunch-on-arm” team!  

Back on the Raspberry Pi, we can now run ‘boinctui’ from the command prompt, and a terminal GUI will load.  Press F9 on the keyboard to bring down the menu, navigate right to Projects, make sure Add Project is highlighted, and press Enter.  From the list of available projects, choose Rosetta, select “Existing User”, and enter the credentials you created on the website a moment ago.

It will take a moment, but Rosetta will download the necessary files, fetch some work units, and begin crunching data on your Raspberry Pi 4!

You can press ‘Q’ to quit boinctui and it will continue crunching in the background.
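
If you would like to check progress without reopening boinctui, the ‘boinccmd’ tool included with the BOINC packages can report task status.  A quick sketch, run from the BOINC data directory (the exact output fields may vary by BOINC version):

cd /var/lib/boinc-client && sudo boinccmd --get_tasks | grep -E 'name|fraction done'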

 

Nvidia Jetson

If you have an Nvidia Jetson Nano, you can actually follow the same directions outlined above directly on the Nvidia-provided version of Ubuntu.  To recap, these are the steps:

  • Open a Terminal, and run ‘sudo apt-get update && sudo apt-get upgrade’.  After that is complete, reboot.
  • Using your desktop or laptop PC, head to http://boinc.bakerlab.org and create an account, and join the “crunch-on-arm” team
  • Back on the Jetson Nano, run ‘sudo apt-get install boinc-client boinctui’
  • Run ‘boinctui’, press F9, navigate to Projects, Add Project, and choose Rosetta@Home.  Choose an Existing Account, enter your credentials, and wait for some work units to arrive!

 

Other Boards

If you have other single board computers that are 64-bit, have 2 GB of RAM, and run Armbian, the process is the same for those devices as well!  Examples of boards that could work include Rockchip RK3399 boards like the NanoPi M4 or T4, OrangePi 4, or RockPro64; Allwinner H5 boards like the Libre Computer Tritium H5 or NanoPi K1 Plus; and Amlogic boards like the Odroid C2, Odroid N2, or Libre Computer Le Potato.  Additionally, 96Boards offers high performance boards such as the HiKey960 and HiKey970, Qualcomm RB3, or Rock960, which all have excellent 64-bit Debian-based operating systems available.

For any of those, simply install the ‘boinc-client’ and ‘boinctui’ packages, and add the Rosetta project!
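
Alternatively, for fully headless setups, the project can be attached non-interactively with ‘boinccmd’ instead of the boinctui menus.  A minimal sketch, assuming the Debian-packaged client; the account key below is a placeholder for the one shown on your Rosetta@Home account page:

sudo apt-get install -y boinc-client
cd /var/lib/boinc-client && sudo boinccmd --project_attach https://boinc.bakerlab.org/rosetta/ YOUR_ACCOUNT_KEY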

Of course, if you just so happen to have a spare Ampere eMAG, Marvell ThunderX, or ThunderX2 lying around, those would work quite nicely as well.


The Future of AI Servers


Following up on the recent announcement of our new Raspberry Pi 4 AI Servers, AI servers running on Arm processors are gaining more and more traction in the market due to their natural fit at the IoT and Edge layers of infrastructure.  Let’s take a quick look at some of the unique properties that make AI Servers running on Arm a great strategy for AI/ML and AIoT deployments, to help understand why this is so important for the future.

Power – Many IoT deployments do not have luxuries that “regular” servers enjoy, such as reliable power and connectivity, or even ample power for that matter.  While Intel has spent decades making excellent, though power hungry, processors, Arm has focused on efficiency and battery life, which helps explain why Arm dominates the market in tablets and smartphones.  This same efficiency is leveraged by IoT devices running AI workloads, so Edge devices responsible for computer vision, image classification, object detection, deep learning, or other workloads can operate with a much lower thermal footprint than a comparable x86 device.

Size – For the same underlying reasons as the power efficiency, the physical size and dimensions of Arm AI Servers can be made smaller than the majority of x86 designs.  AI Accelerators such as the Gyrfalcon 2801 or 2803 can be attached via USB to boards as small as 2 inches square (such as the NanoPi Neo2), and the addition of a Google Coral TPU via the mini-PCIe slot on a NanoPi T4 brings an enormous amount of inferencing capability to AI Servers in tiny form factors.

Cost – Here again, Arm SoCs and single board computers typically have a rather large cost advantage versus x86 embedded designs.

Scalability – This is a critical factor in why Arm will play a massive role in the future of AI Servers, and why miniNodes has begun to offer our Raspberry Pi 4 AI Server.  As mentioned, low power, cheap devices make great endpoints, but, there is also a role for “medium” sized AI servers handling larger workloads, and Arm partners are just now starting to bring these products to market.  An example is the SolidRun Janux AI Server, which also makes use of the same Gyrfalcon AI Accelerators used by our nodes.  So, you can get started training your models, testing out your deployment pipeline, understanding the various AI frameworks and algorithms, and getting comfortable with the tools, and very easily scale up as your needs expand.  Of course, once you reach massive amounts of deep learning and AI/ML processing, enterprise Arm server options exist for that as well.

Flexibility – Taking Scalability one step further, the Arm AI servers also allow for a great amount of flexibility in the specific accelerator hardware (Gyrfalcon, Google Coral, Intel Movidius), the frameworks used (Caffe, PyTorch, TensorFlow, TinyML), and the models (ResNet, MobileNet, ShuffleNet) employed.  

Ubiquity – A final piece of the overall AI Server ecosystem is the ease of access to this type of hardware, and low barriers to entry.  The Raspberry Pi and similar types of boards are distributed globally, and readily available in nearly all markets.

As you can see, our view is that the future of AI servers is based on Arm SoCs, and now is the time to start exploring what that means for your workload.


Where to Buy an Arm Server

Being Arm enthusiasts and deeply embedded in the Arm Server ecosystem, one of the questions we get asked often is,

“Where can I buy an Arm Server?”

In the past, it was difficult to actually find Arm Server hardware available to individual end-users. Not long ago, the only way to gain access to Arm Servers was to have NDAs with major OEMs or the right connections to get engineering-sample hardware. However, over the course of the past 2 to 3 years, more providers have entered the market, and hardware is now readily available to end users and customers. Here are some of the easiest ways to buy an Arm Server, although this list is not exhaustive. These servers all have great performance, relatively low costs, and are well supported thanks to standards compliance and UEFI.

First and foremost, the AMD Opteron A1100 may not have been a commercial success, but it is a fantastic Arm Server platform that is supported upstream and runs perfectly out of the box. The SoftIron OverDrive 1000 comes in a small desktop-style case, while the OverDrive 3000 series comes in a 1U chassis ready for rackmount installation. It has a BMC, 10 Gb ethernet, 14 SATA ports (!), and 2 PCIe slots. A standard UEFI boot process allows for easy installation of CentOS, RedHat, Debian, Ubuntu, SUSE, and any other Linux flavor that has an ARM64 build.  Though their Cortex-A57 cores are getting a bit older now, they still make great build machines, especially when paired with fast SSDs.

Next up are the Cavium ThunderX and the newer ThunderX2. These chips are sold in servers from several vendors, which come in various shapes and sizes. Some of the examples we’ve found include the System76 Starling, the Avantek R-series in both 1U and 2U sizes, and the Gigabyte Arm offerings that closely match Avantek’s specs. There are high density designs, single processor and dual processor options, and 10 GbE as well as SFP options available.  The ThunderX2 has been more popular in HPC environments, but even a first-generation ThunderX is a great choice, and still a very powerful machine.  They can be purchased with up to 48 cores, or in dual-processor configurations containing up to 96 cores.

Another option is the Ampere eMAG Arm Server from Ampere Computing, a company formed a few years ago.  They ship a turnkey Arm Server that is sold by Lenovo, the HR330A or the HR350A.  The current-generation platform has 32 Arm cores running at 3.0 GHz, 42 lanes of PCIe bandwidth, and 1 TB of memory capacity, and the next-generation product is said to have up to 80 Arm Neoverse N1 cores.  Current models are available for purchase from Ampere’s website, or through Lenovo.

And of course, if buying physical servers and hosting them yourself, or placing them in a datacenter, is not feasible or cost effective in your situation, then our hosted Arm servers are a great option as well!  Our miniNodes Arm servers are certainly more modest in comparison to those mentioned above, but they are a great way to get started with Arm development, test existing code for compatibility, or run lighter workloads that don’t require quite so much compute capability.

Be sure to check back often for all things Arm Server related!


How-To: Install Minecraft Server on the Raspberry Pi Server or Ubuntu 18.04 Arm Server (2020 Edition)


Minecraft is one of the most popular games played online, and installing your own Minecraft Server on a Raspberry Pi or other Arm powered device is easy! These instructions will allow you to install Minecraft Server on our Raspberry Pi, Raspberry Pi 3, or on our Ubuntu 18.04 Arm Server.  They should also work locally on your own Raspberry Pi or other Arm powered single board computer!

To install Minecraft Server on your Raspberry Pi, just follow this quick tutorial to get you up and running!

Installing Java

Due to changes in Oracle licensing, it is no longer possible to download the JDK directly from their site without accepting a license agreement, as was possible in the past.  Thus, it is no longer possible to just use ‘wget’ from a terminal to download the JDK.  Instead, you will have to use a web browser, and how you do that depends on whether you are using SSH to connect to your server, or are using a local Raspberry Pi with a desktop.  First, if you are using a local Raspberry Pi with a keyboard, monitor, mouse, and desktop installed, you can simply open up a web browser, visit https://www.oracle.com/java/technologies/javase-jdk8-downloads.html, and select the “jdk-8u241-linux-arm32-vfp-hflt.tar.gz” file.  Take note of where it downloads; we will need that in a moment.

If you are connected via SSH, you will need to use a terminal (text only) web browser such as Lynx.  This won’t be pretty, but, it should be enough to prompt for the download of the JDK file.  First connect to your node via SSH using the IP address, username, and password.  Then, install lynx and navigate to the Oracle website in text-only mode:

sudo apt-get install -y lynx && lynx https://www.oracle.com/java/technologies/javase-jdk8-downloads.html

Look for the text on the page where the name of the file is listed, jdk-8u241-linux-arm32-vfp-hflt.tar.gz, and press Enter to start the download.  If you are on a desktop version of the Raspberry Pi, now is the time to switch to the Terminal application and change to the directory where your file was downloaded (most likely Downloads:  ‘cd Downloads’).  If you connected via SSH, then you are already in a terminal, and can proceed.

We need to extract Java, using this command:

sudo tar zxvf jdk-8u241-linux-arm32-vfp-hflt.tar.gz -C /opt/

If the download and extract were successful, we can test that Java is working (note that the tarball extracts to a versioned directory, jdk1.8.0_241):

sudo /opt/jdk1.8.0_241/bin/java -version

We should see this, confirming Java is now ready (your version may vary a bit):

java version "1.8.0-ea"
Java(TM) SE Runtime Environment (build 1.8.0-ea-b111)
Java HotSpot(TM) Client VM (build 25.0-b53, mixed mode)

Finally, let’s remove the downloaded gzip to save a bit of disk space:

sudo rm jdk-8u241-linux-arm32-vfp-hflt.tar.gz

Installing Minecraft Server

Now, it is time to download Minecraft Server!

Still in the terminal, download Minecraft Server from this URL, saving it under a versioned name so that the launch command below matches:

wget -O minecraft_server.1.15.2.jar https://launcher.mojang.com/v1/objects/bb2b6b1aefcd70dfd1892149ac3a215f6c636b07/server.jar

Once it has finished downloading, we can launch it by running:

sudo /opt/jdk1.8.0_241/bin/java -Xmx1024M -Xms1024M -jar minecraft_server.1.15.2.jar
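
Note that on first launch, Minecraft Server versions of this era (including 1.15.2) will write out an eula.txt file and then exit until the EULA has been accepted.  Edit that file to set eula=true (or use the one-liner below), then run the launch command above again:

sed -i 's/eula=false/eula=true/' eula.txt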

The original Raspberry Pi Model B only has 512 MB of RAM, so it will not actually allocate the full 1024 MB, but it will take the approximately 400 MB that is available to it.  The Raspberry Pi 3 and our Ubuntu 18.04 LTS Arm Server both have 1 GB of RAM, which definitely helps increase performance of the Minecraft Server.  Of course, the operating system does take up some of the available memory, but Minecraft Server will probably reserve about 750 MB to 800 MB of memory to run, which will be plenty.  The Raspberry Pi 4 is available in models with up to 4 GB of RAM, so if you have one of those, feel free to experiment with increasing the memory value (1024) in the above command line (perhaps to 2048).

At this point, Minecraft Server will go through its startup routine, and you will be able to join the newly created world by pointing your game to the IP address of your node.  (You can also modify game variables by editing the server.properties file, located in the directory you launched the server from.)
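
If you would like the server to start automatically at boot, a systemd unit is one tidy way to do it.  Here is a minimal sketch; the user, paths, and unit name are assumptions you should adjust to match where you placed the jar:

[Unit]
Description=Minecraft Server
After=network.target

[Service]
User=ubuntu
WorkingDirectory=/home/ubuntu/minecraft
ExecStart=/opt/jdk1.8.0_241/bin/java -Xmx1024M -Xms1024M -jar minecraft_server.1.15.2.jar nogui
Restart=on-failure

[Install]
WantedBy=multi-user.target

Save it as /etc/systemd/system/minecraft.service, then run ‘sudo systemctl enable --now minecraft’ to start the server and enable it at boot.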

Have fun!


Running AI Workloads on Arm Servers

Arm’s Role in Processing AI Workloads

The past several years have seen enormous gains in Artificial Intelligence, Machine Learning, Deep Learning, Autonomous Decision Making, and more.  The availability of powerful GPUs and FPGAs, both on-premise and in the cloud, has certainly helped, but more and more of this AI processing is actually being done at the Edge, in small devices.  The popularity of Amazon Alexa, Google Home, and AI-enabled features in smartphones such as Apple’s Siri has skyrocketed over the past few years.  The various frameworks and models such as TensorFlow, PyTorch, Caffe, and others have matured, and newer, lightweight versions have come along, such as TinyML, TensorFlow Lite, and other libraries designed to allow machine learning on the smallest devices possible.  Local processing of audio and detection of specific sounds via waveform pattern matching, object recognition in a camera’s frame, motions and gestures being monitored and observed, and vehicle safety systems that detect and respond immediately to changing conditions with no human intervention are some of the most common applications.

The work that it takes to develop these AI models is very specialized, but ultimately algorithms are created, a large sample of training data is fed into the system, and a model is developed that has a confidence factor and accuracy value.  Once the model is deployed, real-time stream processing occurs, and actions can be taken based upon the results of data flowing through the application.  In the case of a computer vision application, for example, identifying certain objects can result in alerts (hospital staff notified), corrective actions (apply the brakes immediately), or data stored for later use.

As mentioned, more and more AI/ML is actually being processed at the Edge, on small form factor devices.  And small form factor devices tend to be powered by Arm SoCs, as opposed to the more power hungry x86 designs commonplace in laptops and desktops.  Home devices like Alexa, Google Home, and nearly all smartphones are based on Arm SoCs.  Thus, AI models need to be created, tested, and verified for compatibility on Arm powered devices.  Even if an algorithm is developed and trained on a big GPU or FPGA, the resulting model should still be tested on Arm SoCs to ensure proper functionality.  In order to help speed the testing process, miniNodes now offers hosted Arm microservers with dedicated AI accelerators that can assist with offloading AI tasks from the CPU and offer excellent machine learning performance.  Testing of self-driving vehicle object detection, navigation and guidance, and action/behavior models, image classification and object recognition from cameras and video streams, convolutional neural networks and matrix multiplication workloads, robotics, weather simulation, and many types of deep learning activities can be quickly and easily processed.

Arm Lowers the Cost of AI Processing

AI training and inference in the cloud running on Arm microservers at miniNodes also offers a distinct cost advantage over Amazon AWS, Microsoft Azure, or Google GCE.  Those services can very quickly cost thousands to tens of thousands of dollars per month, but many AI workloads can get by just fine on more modest hardware when paired with a dedicated AI accelerator like a Google Coral TPU, Intel Movidius NPU, or Gyrfalcon Matrix Processing Engine.  AWS, Azure, and GCE provide great AI performance, sure, but you also pay heavily for the processor, memory, storage, and other components of the overall system.  If you are ready to make use of those immense resources, wonderful.  But if you are just starting out, are just learning AI/ML, are only beginning to test your AI modeling on Arm, or just have a lightweight use case, then going with a smaller underlying platform while retaining the dedicated AI processing capability can make more sense.

miniNodes is still in the process of building out the full product lineup, but in the meantime, Gyrfalcon 2801 and 2803 nodes are online and ready, with up to 16.8 TOPS of processing for ResNet, MobileNet, or VGG models.  They are an easy, cost effective way to get started with AI processing on Arm!

Check them out here:  https://www.mininodes.com/product/raspberrypi-4-ai-server/


Recap: Building an Arm-Powered IoT, Edge, and Cloud Infrastructure

Intro

At Arm’s annual TechCon event in San Jose, Arm CEO Simon Segars presented a vision of the future where a trillion connected devices interact seamlessly with each other and pass data between the Cloud, the Edge, and the Internet of Things, at a scale unimaginable even just a few years ago. Self-driving cars will generate massive amounts of sensor information and data, 5G wireless will enable increased connection speeds and reduced latency, and artificial intelligence will provide scientific breakthroughs in materials, technologies, medicines, and energy. This vision of the future state of the connected world is something we have heard about for several years now, with countless written articles, interviews, social media posts, conference talks, and various other forms of media addressing the topic.

However, when seeking out real-world examples of this architecture in practice to help learn and understand how the bits and pieces work together, we came up empty. There were no purpose-built sample projects, pre-written code examples, or other working prototypes of these principles available. Surely there are some internal, private teams building out this type of infrastructure for specific use-cases and organizational needs, but there were no public / open projects to learn from.

Thus, it was time to take action, and build a prototype infrastructure ourselves! With the help of the Arm Innovator Program, we set out on a journey to develop a proof-of-concept that encapsulates as many of these concepts as possible, leveraging currently available technologies and showcasing Arm’s diverse portfolio of products and ecosystems. With help from the Works on Arm program via Packet.com, we began brainstorming.  Our goal was to deploy IoT endpoints to a handful of locations around the world, and capture environmental data via sensors on those devices. From there, we wanted to feed that data to a local Edge Server, which would be responsible for translating the data to a usable format and sending it further upstream, to a Cloud Server functioning as a data warehouse and visualization platform.

In this article we’ll take an in-depth look at the project, and detail the key technologies to give a better idea of what this kind of system entails. I’ll also provide a summary of our lessons learned, which hopefully help you to build and iterate faster, and avoid some potential pitfalls along the way.

Design

When thinking about the design of this project, we wanted to keep things simple, as the purpose of this exercise is to demonstrate capability and build a proof-of-concept, not an actual product shipped to real, paying customers.  Thus, we made hardware and software selections based on cost and availability, as opposed to what is “most appropriate” for the intended use. We also knew we would have relatively small data-sets, and reliable power and internet connectivity for all of our devices.  Your real-world IoT deployments may not have these luxuries, so your hardware and software selections may not be as straightforward as ours were.  Many IoT projects have to be tolerant of lost network connectivity, unreliable power delivery, or harsh environmental conditions.  But we were fortunate to have consistent power and internet.  Let’s go through our inventory of Arm-powered hardware and software, keeping in mind the rather ideal conditions we’ve got:

1. IoT Endpoints

Hardware

  • Raspberry Pi 3B+
  • Sparkfun Qwiic HAT
  • Sparkfun Lightning Detector
  • Sparkfun Environmental Combo Sensor
  • Sparkfun GPS Sensor

Software

  • Arm Mbed Linux OS
  • Arm Pelion Device Management

 

2. Edge Nodes

Hardware

  • Linaro / 96Boards HiKey, and HiKey 960

Software

  • Linaro Debian Linux

 

3. Cloud Server

Hardware

  • Ampere eMAG, hosted by Packet.com

Software

  • Debian Linux
  • InfluxDB
  • Grafana

 

As you can see, we have made some selections that fit our small project well, but as mentioned, they may not be suitable for all IoT use cases, depending on your project’s environmental conditions.  However, let’s start detailing the items, beginning with the IoT Endpoint.  We’re using a Raspberry Pi 3B+, a Sparkfun Qwiic HAT, and Sparkfun sensors to capture temperature, humidity, barometric pressure, CO2, and tVOC (total volatile organic compounds).  We have lightning detection capability as well (currently not being used, but available), and GPS so that we can determine precisely where the Endpoint is located.  As for software, because these devices are out in the wild, scattered literally across the globe, we needed a framework to allow remote monitoring, updating, and application deployment.  Arm Mbed Linux OS is a lightweight, secure, container-based operating system that meets these requirements.  It is currently still in Technical Preview, but is far enough along in development that it meets our project needs and is working great.  A total of 10 Raspberry Pi Endpoints were built and sent around the globe, with several across the United States, as well as Cambridge, Budapest, Delhi, and Southern India, and one spare unit left over for local testing.

Turning to our Edge Nodes, these are the simplest component in our project’s infrastructure. These are 96Boards devices, chosen for their support and ease-of-use.  Linaro and the 96Boards team do an excellent job of building ready-made Debian images with updated kernels, applications, and drivers for their hardware, making for a great out-of-the-box experience. Two of these devices are currently provisioned, one in India and one in the United States, each serving their geographic region. The devices aggregate the IoT Endpoint data stream, convert it to the format needed by the Cloud Server, and publish the data to the Cloud.

Finally, the Arm-powered Cloud Server is an Ampere eMAG server, hosted by Packet.com. It is an enterprise-grade machine, and functions as the data warehouse for all of the IoT data, as well as a visual platform for charting and viewing the data in a time-series fashion thanks to InfluxDB and Grafana. Packet.com has datacenters around the world, and their native tooling and user interface make deploying Arm Servers quick and easy.

Now that the system architecture has been described, let’s take a look at the application architecture, and start to dissect how data flows from the IoT Endpoints, to the Edge, to the Cloud. As mentioned, Mbed Linux OS is a container-based OS, which is to say that it is a minimal underlying operating system based on Yocto, providing a small, lightweight, secure foundation to which the Open Container Initiative (OCI) “RunC” container engine is added.  RunC can launch OCI compliant containers built locally on your laptop, then pushed to the Endpoint via the Mbed Linux tooling, no matter where the device is located.  In our particular case, we chose a small Alpine Linux container, added Python, added the Sparkfun libraries for the sensors, and created a small startup script to begin reading data from the sensors when the container starts.  The container also has an MQTT broker in it, which is responsible for taking that sensor data, turning it into a small JSON snippet, and publishing it to a specific known location (the Edge Server).
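
To make that flow concrete, here is a minimal sketch of what the publishing loop inside the container could look like, using the paho-mqtt (1.x) Python library.  The sensor-reading function, topic name, and edge server address are illustrative stand-ins, not our production code:

import json
import time

import paho.mqtt.client as mqtt

EDGE_BROKER = "edge-node.example.local"  # stand-in for the real Edge Node address

def read_sensors():
    # Stand-in for the Sparkfun sensor library calls (temperature, humidity, CO2, etc.)
    return {"temp_c": 21.4, "humidity_pct": 40.2, "co2_ppm": 415}

client = mqtt.Client()
client.connect(EDGE_BROKER, 1883)
client.loop_start()  # handle MQTT network traffic in the background

while True:
    # Publish one JSON snippet per minute to a topic the Edge Node watches
    client.publish("sensors/endpoint01", json.dumps(read_sensors()))
    time.sleep(60)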

The Edge Servers run a more traditional Debian operating system, with Python installed as well.  A Python script running as a daemon captures and parses the incoming MQTT messages from the IoT Endpoints, converts them to InfluxDB’s write format, and publishes them to the specified Influx database, which is running on the Ampere eMAG Cloud Server.
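
The core of that daemon can be sketched in a similar way with paho-mqtt and the influxdb (1.x) Python client; the hostnames, topic, measurement, and database names below are again assumptions for illustration:

import json

import paho.mqtt.client as mqtt
from influxdb import InfluxDBClient

# Stand-in address for the Ampere eMAG Cloud Server running InfluxDB
influx = InfluxDBClient(host="cloud-server.example.com", port=8086, database="iot")

def on_message(client, userdata, msg):
    # Parse the incoming JSON snippet and convert it into an InfluxDB point
    influx.write_points([{
        "measurement": "environment",
        "tags": {"endpoint": msg.topic},
        "fields": json.loads(msg.payload),
    }])

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)  # MQTT broker running on this Edge Node
client.subscribe("sensors/#")
client.loop_forever()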

Finally, the Cloud Server is an enterprise-grade Ampere eMAG Arm Server.  It is graciously hosted by the Works on Arm project at Packet.com, in their New Jersey datacenter. This server is also running Debian, and has InfluxDB and Grafana installed for storage and visualization of the data being sent to it from the Edge Nodes.  Thus, our IoT, Edge, and Cloud layers are all Arm-powered!

Construction Challenges

Building a container to hold our application did prove more challenging than anticipated, as a result of some needed functionality not being provided by the ready-made Mbed Linux downloads. Normally, this could be easily solved by adding the desired packages to the Yocto build scripts and rebuilding from source…however, there is one additional and very unique quirk to this project: we decided to exclusively use Arm-powered Windows on Snapdragon laptops to build the project!  These laptops are highly efficient, with all-day battery life and far better performance than previous generations offered.  One limitation, however, is that they are currently unable to run Docker, which we would need in order to re-build Mbed Linux from source.  Thus, instead of adding the necessary packages to Yocto and recompiling, we had to manually port Device Tree functionality, gain access to the GPIO pins on the Pi, enable the I2C bus and tooling, and finally expose that functionality from the host OS to the container, all by way of manually lifting from Raspbian.  Obviously, we placed this limitation upon ourselves, but it does demonstrate that there are still a few shortcomings in the developer experience on Arm.

A second valuable lesson learned concerns the native Mbed tooling for initially deploying devices.  Provisioning and updating devices with Pelion Device Management is a straightforward process, except for one small but critical hiccup we experienced.  It is worth noting here again that Mbed Linux OS is in a Technical Preview status, and the feedback we were able to give the Mbed team as a result of this process has been incorporated and will make the final product even better!  When following the documentation to provision devices for the first time, a Developer Certificate is issued. That certificate is only valid for 90 days, and after that time you can no longer push containers to a device in the field. The certificate can certainly be renewed via the re-provisioning process, but you must be on the same network as the device in order to perform that action. Our devices are already out in the field, so that is not possible at this point.  Thus, we have a fleet of devices that cannot receive their intended application.  On the plus side, this exercise proved its worth by highlighting this point of failure, and it resulted in a valuable documentation update so that your project can be a success!

Conclusion

In the end, we were able to successfully provision just a few devices that we still had local access to, and prove that the theory was sound and demonstrate a functional prototype at Arm TechCon!

Using a pair of freshly provisioned Raspberry Pi’s, the containerized application was pushed Over The Air to them via the Mbed CLI.  Pelion showed the devices as Online, and the device and application logs in the Dashboard reported that the container had started successfully.  Sure enough, data began streaming in on the Edge Node, where those transmissions were translated to Influx and sent upstream to the Cloud Server.  Logged into Grafana running on the Cloud Server, that data could then be inspected and visualized.

Thus, while it wasn’t quite as geographically diverse as hoped, we did actually accomplish what we set out to do:  build an end-to-end IoT, Edge, and Cloud infrastructure running entirely on Arm!  The data that is flowing is certainly just a minimal example, but as a proof-of-concept we can truthfully say that the design is valid and the system works!  Now, we’re excited to see what you can build to bring Simon Segars’ vision of the connected future to life!


Raspberry Pi 4 AI Server Now Available

As pioneers in the Arm micro server ecosystem, miniNodes has been an innovator and leading expert in the use of small devices to provide compute capacity at the Edge, has watched as IoT has matured and impacted all industries, and is now witnessing AI and Machine Learning depart the Cloud to instead be performed on-device or at the Edge of the network. More and more phones, home assistant devices (such as the Echo), and even laptops now include custom AI accelerator chips designed to handle voice recognition, gesture and motion control, and object detection, analyze video and camera feeds, and perform many other deep learning tasks.

The AI models and algorithms that make this happen have to be trained and tested on specialized hardware accelerators as well, and historically that has been very expensive to perform in the cloud. miniNodes is taking a different approach however, and pairing custom hardware AI Accelerators with cost effective Raspberry Pi 4 servers, to lower the cost of testing and training these models while still maintaining high levels of performance. Deep learning, neural network, and matrix multiplication activities can be offloaded to the AI hardware, rapidly accelerating the model training.

The first product to launch in the new miniNodes AI Server lineup is a Raspberry Pi 4 server combined with a Gyrfalcon 2801 NPU, for a maximum of 5.6 TOPS of dedicated AI processing power. In the future, we will expand the lineup to include Google Coral and Intel Movidius hardware as well.

The Raspberry Pi has always been one of the most popular hosted Arm Servers at miniNodes, even as far back as the original Raspberry Pi (1) Model B, some of which are still running! Over the years, we upgraded to Raspberry Pi 2’s, 3’s, and the 3+. So, it was only a matter of time until we deployed new, faster Raspberry Pi 4 servers.

However, with the launch of the Pi 4 and its increased capabilities, we decided it was time to upgrade our infrastructure, management, and backend systems to match. That work is actually still ongoing, but in order to start testing the ability to run AI workloads, we have made a few units available for early adopters to begin testing their ML models. If you are looking for a cost effective way to get started with AI processing or are interested in testing AI/ML on Arm cores, this is a great way to get started. Check out the new miniNodes AI Arm Server here: https://www.mininodes.com/product/raspberrypi-4-ai-server/

And if you have any questions, just drop us a note at info@mininodes.com!


Arm Server Update, Winter 2020 Edition

In the months since our last update, as usual, much has changed in the Arm Server ecosystem!  When assessing the industry and product performance in such an emerging field, things move fast!  Here are a few observations and notes on the second half of 2019, and a look ahead to what is forthcoming in 2020 for Arm Servers.

First, Marvell has continued to focus on the HPC market, promoting their ThunderX2 processor in talks, marketing materials, and social media posts focused on their National Laboratory projects and installations.  There is also some preliminary talk about their next generation product, the ThunderX3, though details are limited at the time of this writing.  In a sign of confidence, Arm has also directly invested a significant sum of money in Marvell’s Arm Server processor line.

Meanwhile, Ampere has had continued success with their eMAG processor, including server sales through Lenovo, and a workstation version of the platform is now available as well.  Similar to Marvell, Ampere has begun to discuss their next generation processor, stating that it will have 80 cores and will be based on the Arm Neoverse N1 reference architecture.  Again similar to Marvell, Arm has invested directly in Ampere to keep momentum and product development strong, and more recently Oracle has invested as well.

Amazon has been competing strongly in the Arm Server market for the past year with their Graviton platform, which powers the AWS A1 instance type.  At the re:Invent conference, Amazon announced the Graviton2 processor, which will be available soon, increases the core count to 64, and, according to Amazon, will “deliver 7x performance, 4x the number of compute cores, 2x larger caches, and 5x faster memory compared to the first-generation Graviton processors”.

The last item to make note of is the SolidRun HoneyComb platform, which is technically marketed as a developer workstation, but could quite easily be adapted into a small server.  It offers a 16-core NXP Layerscape SoC, 4x 10 Gb Ethernet, SATA, PCIe, and a standard mini-ITX footprint.

As usual, if you have anything to add to the conversation, simply add your comments below, and we will continue to offer analysis and insight to all things Arm Servers!