In a short filesystem-only discussion at the 2018 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), Jérôme Glisse wanted to talk about some (more) changes to support GPUs, FPGAs, and RDMA devices. In other talks at LSFMM, he discussed changes to struct page in support of these kinds of devices, but here he was looking to discuss other changes to support mapping a device’s memory into multiple processes. It should be noted that I had a hard time following the discussion in this session, so there may be significant gaps in the article.
What happens when you combine the Internet of Things, Machine Learning, and Edge Computing? Before I tell you, let’s review each one and discuss what AWS has to offer.
Internet of Things (IoT) – Devices that connect the physical world and the digital one. The devices, often equipped with one or more types of sensors, can be found in factories, vehicles, mines, fields, homes, and so forth. Important AWS services include AWS IoT Core, AWS IoT Analytics, AWS IoT Device Management, and Amazon FreeRTOS, along with others that you can find on the AWS IoT page.
Machine Learning (ML) – Systems that can be trained using an at-scale dataset and statistical algorithms, and used to make inferences from fresh data. At Amazon we use machine learning to drive the recommendations that you see when you shop, to optimize the paths in our fulfillment centers, to fly drones, and much more. We support leading open source machine learning frameworks such as TensorFlow and MXNet, and make ML accessible and easy to use through Amazon SageMaker. We also provide Amazon Rekognition for images and for video, Amazon Lex for chatbots, and a wide array of language services for text analysis, translation, speech recognition, and text to speech.
Edge Computing – The power to have compute resources and decision-making capabilities in disparate locations, often with intermittent or no connectivity to the cloud. AWS Greengrass builds on AWS IoT, giving you the ability to run Lambda functions and keep device state in sync even when not connected to the Internet.
ML Inference at the Edge
Today I would like to toss all three of these important new technologies into a blender! You can now perform Machine Learning inference at the edge using AWS Greengrass. This allows you to use the power of the AWS cloud (including fast, powerful instances equipped with GPUs) to build, train, and test your ML models before deploying them to small, low-powered, intermittently connected IoT devices running in those factories, vehicles, mines, fields, and homes that I mentioned.
Here are a few of the many ways that you can put Greengrass ML Inference to use:
Precision Farming – With an ever-growing world population and unpredictable weather that can affect crop yields, the opportunity to use technology to increase yields is immense. Intelligent devices that are literally in the field can process images of soil, plants, pests, and crops, taking local corrective action and sending status reports to the cloud.
Physical Security – Smart devices (including the AWS DeepLens) can process images and scenes locally, looking for objects, watching for changes, and even detecting faces. When something of interest or concern arises, the device can pass the image or the video to the cloud and use Amazon Rekognition to take a closer look.
Industrial Maintenance – Smart, local monitoring can increase operational efficiency and reduce unplanned downtime. The monitors can run inference operations on power consumption, noise levels, and vibration to flag anomalies, predict failures, and detect faulty equipment.
Greengrass ML Inference Overview
There are several different aspects to this new AWS feature. Let’s take a look at each one:
Machine Learning Models – Precompiled TensorFlow and MXNet libraries, optimized for production use on the NVIDIA Jetson TX2 and Intel Atom devices, and development use on 32-bit Raspberry Pi devices. The optimized libraries can take advantage of GPU and FPGA hardware accelerators at the edge in order to provide fast, local inferences.
Model Deployment – SageMaker models can (if you give them the proper IAM permissions) be referenced directly from your Greengrass groups. You can also make use of models stored in S3 buckets. You can add a new machine learning resource to a group with a couple of clicks in the Greengrass console.
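The same attachment can also be scripted as a Greengrass resource definition. Here is a minimal CLI sketch; the resource ID, name, bucket, and destination path are hypothetical placeholders:

aws greengrass create-resource-definition \
    --initial-version '{"Resources": [{"Id": "ml-model-1", "Name": "MyInferenceModel", "ResourceDataContainer": {"S3MachineLearningModelResourceData": {"S3Uri": "s3://my-bucket/my-model.tar.gz", "DestinationPath": "/ml/model"}}}]}'

Lambda functions in the group can then read the model from the destination path on the device.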
This post courtesy of Aaron Friedman, Healthcare and Life Sciences Partner Solutions Architect, AWS and Angel Pizarro, Genomics and Life Sciences Senior Solutions Architect, AWS
Precision medicine is tailored to individuals based on quantitative signatures, including genomics, lifestyle, and environment. It is often considered to be the driving force behind the next wave of human health. Through new initiatives and technologies such as population-scale genomics sequencing and IoT-backed wearables, researchers and clinicians in both commercial and public sectors are gaining new, previously inaccessible insights.
Many of these precision medicine initiatives are already happening on AWS. A few of these include:
PrecisionFDA – This initiative is led by the US Food and Drug Administration. The goal is to define the next-generation standard of care for genomics in precision medicine.
Deloitte ConvergeHEALTH – Gives healthcare and life sciences organizations the ability to analyze their disparate datasets on a singular real world evidence platform.
Central to many of these initiatives is genomics, which gives healthcare organizations the ability to establish a baseline for longitudinal studies. Due to its wide applicability in precision medicine initiatives, from rare disease diagnosis to improving outcomes of clinical trials, genomics data is growing globally at a rate that outpaces Moore’s law. Many expect these datasets to grow to the tens of exabytes by 2025.
Genomics data is also regularly re-analyzed by the community as researchers develop new computational methods or compare older data with newer genome references. These trends are driving innovations in data analysis methods and algorithms to address the massive increase of computational requirements.
Edico Genome, an AWS Partner Network (APN) Partner, has developed a novel solution that accelerates genomics analysis using field-programmable gate arrays, or FPGAs. Historically, Edico Genome deployed their FPGA appliances on-premises. When AWS announced the Amazon EC2 F1 FPGA-based instance family in December 2016, Edico Genome adopted a cloud-first strategy, became an F1 launch partner, and was one of the first partners to deploy FPGA-enabled applications on AWS.
On October 19, 2017, Edico Genome partnered with the Children’s Hospital of Philadelphia (CHOP) to demonstrate DRAGEN, their FPGA-accelerated genomic pipeline software, which can significantly reduce time-to-insight for patient genomes. Together they analyzed 1,000 genomes from the Center for Applied Genomics Biobank in the shortest time possible, setting a Guinness World Record for the fastest analysis of 1,000 whole human genomes, and they did this using 1,000 EC2 f1.2xlarge instances in a single AWS Region. Not only did they analyze genomes at high throughput, they did so at an average of approximately $3 of AWS compute per whole human genome.
The version of DRAGEN that Edico Genome used for this analysis was also the same one used in the precisionFDA Hidden Treasures – Warm Up challenge, where they were one of the top performers in every assessment.
In the remainder of this post, we walk through the architecture used by Edico Genome, combining EC2 F1 instances and AWS Batch to achieve this milestone.
EC2 F1 instances and Edico’s DRAGEN
EC2 F1 instances provide access to programmable hardware-acceleration using FPGAs at a cloud scale. AWS customers use F1 instances for a wide variety of applications, including big data, financial analytics and risk analysis, image and video processing, engineering simulations, AR/VR, and accelerated genomics. Edico Genome’s FPGA-backed DRAGEN Bio-IT Platform is now integrated with EC2 F1 instances. You can access the accuracy, speed, flexibility, and low compute cost of DRAGEN through a number of third-party platforms, AWS Marketplace, and Edico Genome’s own platform. The DRAGEN platform offers a scalable, accelerated, and cost-efficient secondary analysis solution for a wide variety of genomics applications. Edico Genome also provides a highly optimized mechanism for the efficient storage of genomic data.
Scaling DRAGEN on AWS
Edico Genome used 1,000 EC2 F1 instances to help their customer, CHOP, process and analyze all 1,000 whole human genomes in parallel. They used AWS Batch to provision compute resources and orchestrate DRAGEN compute jobs across the 1,000 EC2 F1 instances. This solution successfully addressed the challenge of creating a genomic processing pipeline that easily scales to thousands of engines running in parallel.
A simplified view of the architecture used for the analysis is shown in the following diagram:
DRAGEN’s portal uses Elastic Load Balancing and Auto Scaling groups to scale out the EC2 instances that submit jobs to AWS Batch.
Job metadata is stored in their Workflow Management (WFM) database, built on top of Amazon Aurora.
The DRAGEN Workflow Manager API submits jobs to AWS Batch.
These jobs are executed in an AWS Batch managed compute environment that is responsible for launching the EC2 F1 instances.
These jobs run as Docker containers that have the requisite DRAGEN binaries for whole genome analysis.
As each job runs, it retrieves and stores genomics data that is staged in Amazon S3.
The steps listed previously can also be bucketed into the following higher-level layers:
Workflow: Edico Genome used their Workflow Management API to orchestrate the submission of AWS Batch jobs. Metadata for the jobs (such as the S3 locations of the genomes) resides in the Workflow Management Database backed by Amazon Aurora.
Batch execution: AWS Batch launches EC2 F1 instances and coordinates the execution of DRAGEN jobs on these compute resources. AWS Batch enabled Edico to quickly and easily scale up to the full number of instances they needed as jobs were submitted. They also scaled back down as each job was completed, to optimize for both cost and performance.
Compute/job: Edico Genome stored their binaries in a Docker container that AWS Batch deployed onto each of the F1 instances, giving each instance the ability to run DRAGEN without the need to pre-install the core executables. The AWS-based DRAGEN solution streams all genomics data from S3 for local computation and then writes the results to a destination bucket. They used an AWS Batch job role that specified the IAM permissions. The role ensured that DRAGEN only had access to the buckets or S3 key space it needed for the analysis. Jobs didn’t need to embed AWS credentials.
In the following sections, we dive deeper into several tasks that enabled Edico Genome’s scalable FPGA genome analysis on AWS:
Prepare your Amazon FPGA Image for AWS Batch
Create a Dockerfile and build your Docker image
Set up your AWS Batch FPGA compute environment
Preparing your Amazon FPGA Image for AWS Batch
You can use any Amazon Machine Image (AMI) or Amazon FPGA Image (AFI) with AWS Batch, provided that it meets the Compute Resource AMI specification. In brief, you need a modern Linux distribution (kernel 3.10+), the Amazon ECS Container Agent, the awslogs driver, and Docker configured on your image; there are additional recommendations in the Compute Resource AMI specification. This gives you the ability to customize any workload by increasing the size of root or data volumes, adding instance stores, and connecting with the FPGA (F) and GPU (G and P) instance families.
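As a concrete sketch, on a CentOS-based image (such as the FPGA Developer AMI described below) the preparation might look like the following; the Docker package source and the container-based ECS agent install are assumptions to verify against the Compute Resource AMI specification:

sudo yum -y install docker
sudo systemctl enable --now docker
sudo docker run --name ecs-agent --detach --restart=on-failure:10 \
    --volume=/var/run/docker.sock:/var/run/docker.sock \
    --volume=/var/log/ecs:/log \
    --volume=/var/lib/ecs/data:/data \
    --env=ECS_LOGFILE=/log/ecs-agent.log \
    --env=ECS_DATADIR=/data \
    --env=ECS_AVAILABLE_LOGGING_DRIVERS='["json-file","awslogs"]' \
    amazon/amazon-ecs-agent:latest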
Next, install the AWS CLI:
pip install awscli
Add any additional software required to interact with the FPGAs on the F1 instances.
As a starting point, AWS publishes an FPGA Developer AMI in the AWS Marketplace. It is based on a CentOS Linux image and includes pre-integrated FPGA development tools. It also includes the runtime tools required to develop and use custom FPGAs for hardware acceleration applications.
There are two common methods for connecting to AWS Batch to run FPGA-enabled algorithms. The first method, which is the route Edico Genome took, involves storing your binaries in the Docker container itself and running that on top of an F1 instance with Docker installed. The following example shows what a Dockerfile for building such a container might look like in this scenario.
# DRAGEN_EXEC Docker image generator --
# Run this Dockerfile from a local directory that contains the latest release of
# - Dragen RPM and Linux DMA Driver available from Edico
# - Edico's Dragen WFMS Wrapper files
# (the FROM instruction was missing from this excerpt; a CentOS 7 base is assumed, matching the yum/EPEL usage below)
FROM centos:centos7
RUN rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
# Install Basic packages needed for Dragen
# (the package list was truncated in this excerpt; the packages below are illustrative stand-ins)
RUN yum -y install \
    perl coreutils gdb time rsync
# Install the Dragen RPM
RUN mkdir -m777 -p /var/log/dragen /var/run/dragen
ADD . /root
RUN rpm -Uvh /root/edico_driver*.rpm || true
RUN rpm -Uvh /root/dragen-aws*.rpm || true
# Auto generate the Dragen license
RUN /opt/edico/bin/dragen_lic -i auto
# Now install the Edico WFMS "Wrapper" functions
# Add development tools needed for some util
RUN yum groupinstall -y "Development Tools"
# Install necessary standard packages
# (the yum package list was truncated in this excerpt; python-pip is assumed here so that pip is available below)
RUN yum -y install \
    python-pip \
    tree && \
    pip install --upgrade pip && \
    easy_install requests && \
    pip install psutil && \
    pip install python-dateutil && \
    pip install constants
# Setup Python path used by the wrapper
RUN mkdir -p /opt/workflow/python/bin
RUN ln -s /usr/bin/python /opt/workflow/python/bin/python2.7
RUN ln -s /usr/bin/python /opt/workflow/python/bin/python
# Install d_haul and dragen_job_execute wrapper functions and associated packages
RUN mkdir -p /root/wfms/trunk/scheduler/scheduler
COPY scheduler/d_haul /root/wfms/trunk/scheduler/
COPY scheduler/dragen_job_execute /root/wfms/trunk/scheduler/
COPY scheduler/scheduler/aws_utils.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/constants.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/job_utils.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/logger.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/scheduler_utils.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/webapi.py /root/wfms/trunk/scheduler/scheduler/
COPY scheduler/scheduler/wfms_exception.py /root/wfms/trunk/scheduler/scheduler/
RUN touch /root/wfms/trunk/scheduler/scheduler/__init__.py
# Landing directory should be where DJX is located
# (the WORKDIR instruction was missing from this excerpt; the path below matches the layout above)
WORKDIR /root/wfms/trunk/scheduler
# Debug print of container's directories
RUN tree /root/wfms/trunk/scheduler
# Default behaviour. Over-ride with --entrypoint on docker run cmd line
# (the ENTRYPOINT instruction was missing from this excerpt; the wrapper installed above is assumed)
ENTRYPOINT ["/root/wfms/trunk/scheduler/dragen_job_execute"]
Note: Edico Genome’s custom Python wrapper functions for its Workflow Management System (WFMS) in the latter part of this Dockerfile should be replaced with functions that are specific to your workflow.
The second method is to install binaries and then use Docker as a lightweight connector between AWS Batch and the AFI. For example, this might be a route you would choose to use if you were provisioning DRAGEN from the AWS Marketplace.
In this case, the Dockerfile would not contain the installation of the binaries to run DRAGEN, but would contain any other packages necessary for job completion. When you run your Docker container, you enable Docker to access the underlying file system.
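For example, the docker run invocation might mount the host paths that hold the DRAGEN installation and run with access to the FPGA devices. A minimal sketch, where the image name and mount paths are hypothetical:

docker run --privileged \
    -v /opt/edico:/opt/edico \
    -v /var/lib/edico:/var/lib/edico \
    dragen-connector:latest \
    /opt/edico/bin/dragen --help

The --privileged flag is one blunt way to give the container access to the FPGA device files; a narrower --device mapping would also work.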
Connecting to AWS Batch
AWS Batch provisions compute resources and runs your jobs, choosing the right instance types based on your job requirements and scaling down resources as work is completed. AWS Batch users submit a job, based on a template or “job definition”, to an AWS Batch job queue.
Job queues are mapped to one or more compute environments that describe the quantity and types of resources that AWS Batch can provision. In this case, Edico created a managed compute environment that was able to launch 1,000 EC2 F1 instances across multiple Availability Zones in us-east-1. As jobs are submitted to a job queue, the service launches the required quantity and types of instances that are needed. As instances become available, AWS Batch then runs each job within appropriately sized Docker containers.
The Edico Genome workflow manager API submits jobs to an AWS Batch job queue. This job queue maps to an AWS Batch managed compute environment containing On-Demand F1 instances. The following steps show how you can set this up yourself.
To create the compute environment that DRAGEN can use:
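In the AWS CLI, creating the managed compute environment and an associated job queue might look like the following sketch. The names are hypothetical, the subnets, security groups, and roles are placeholders, and maxvCpus is sized for 1,000 f1.2xlarge instances at 8 vCPUs each:

aws batch create-compute-environment \
    --compute-environment-name dragen-f1 \
    --type MANAGED \
    --service-role AWSBatchServiceRole \
    --compute-resources "type=EC2,minvCpus=0,desiredvCpus=0,maxvCpus=8000,instanceTypes=f1.2xlarge,subnets=<subnet-ids>,securityGroupIds=<sg-ids>,instanceRole=ecsInstanceRole"

aws batch create-job-queue \
    --job-queue-name dragen-queue \
    --priority 1 \
    --compute-environment-order order=1,computeEnvironment=dragen-f1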
An f1.2xlarge EC2 instance contains one FPGA, eight vCPUs, and 122 GiB of RAM. Because DRAGEN requires an entire FPGA to run, Edico Genome needed to ensure that only one analysis executed on an instance at a time. By using the f1.2xlarge vCPU and memory counts as a proxy in their AWS Batch job definition, Edico Genome could ensure that only one job runs on an instance at a time. Here’s what that looks like in the AWS CLI:
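A job definition along these lines reserves effectively the whole instance; the job definition name and image are hypothetical, and memory is set just below the instance’s 122 GiB so that a single job consumes it:

aws batch register-job-definition \
    --job-definition-name dragen-wgs \
    --type container \
    --container-properties '{"image": "<account>.dkr.ecr.us-east-1.amazonaws.com/dragen:latest", "vcpus": 8, "memory": 120000, "privileged": true, "jobRoleArn": "<job role ARN>"}'

Each genome then goes in as its own job:

aws batch submit-job \
    --job-name genome-0001 \
    --job-queue dragen-queue \
    --job-definition dragen-wgs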
You can query the status of your DRAGEN job with the following command:
aws batch describe-jobs --jobs <the job ID from the above command>
The logs for your job are written to the /aws/batch/job CloudWatch log group.
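To pull those logs from the command line, you can list the job’s log stream and then fetch its events using the standard CloudWatch Logs CLI calls:

aws logs describe-log-streams --log-group-name /aws/batch/job
aws logs get-log-events --log-group-name /aws/batch/job --log-stream-name <log stream name from the previous command>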
In this post, we demonstrated how to set up an environment with AWS Batch that can run DRAGEN on EC2 F1 instances at scale. If you followed the walkthrough, you’ve replicated much of the architecture Edico Genome used to set the Guinness World Record.
There are several ways in which you can harness the computational power of DRAGEN to analyze genomes at scale. First, DRAGEN is available through several different genomics platforms, such as the DNAnexus Platform. DRAGEN is also available on the AWS Marketplace. You can apply the architecture presented in this post to build a scalable solution that is both performant and cost-optimized.
For more information about how AWS Batch can facilitate genomics processing at scale, be sure to check out our aws-batch-genomics GitHub repo on high-throughput genomics on AWS.
The recent addition of Xilinx FPGAs to AWS Cloud compute offerings is one way that AWS is enabling global growth in the areas of advanced analytics, deep learning, and AI. The customized F1 servers use pooled accelerators, enabling interconnectivity of up to 8 FPGAs, each one including 64 GiB of DDR4 ECC-protected memory and a dedicated PCIe x16 connection. That makes this a powerful engine with the capacity to process advanced analytical applications at scale, significantly faster than general-purpose compute. For example, AWS commercial partner Edico Genome is able to achieve an approximately 30X speedup in analyzing whole genome sequencing datasets using their DRAGEN platform powered by F1 instances.
While the availability of FPGA F1 compute on-demand provides clear accessibility and cost advantages, many mainstream users still find that the “threshold to entry” for developing or running FPGA-accelerated simulations is too high. Researchers at the UC Berkeley RISE Lab have developed FireSim, an open-source resource powered by Amazon EC2 F1 instances. FireSim lowers that entry bar and makes it easier for everyone to leverage the power of an FPGA-accelerated compute environment. Whether you are part of a small start-up development team or working at a large datacenter scale, hardware-software co-design enables faster time-to-deployment, lower costs, and more predictable performance. We are excited to feature FireSim in this post from Sagar Karandikar and his colleagues at UC Berkeley.
―Mia Champion, Sr. Data Scientist, AWS
Figure 1. Mapping an 8-node FireSim cluster simulation to Amazon EC2 F1.
As traditional hardware scaling nears its end, the data centers of tomorrow are trending towards heterogeneity, employing custom hardware accelerators and increasingly high-performance interconnects. Prototyping new hardware at scale has traditionally been either extremely expensive, or very slow. In this post, I introduce FireSim, a new hardware simulation platform under development in the computer architecture research group at UC Berkeley that enables fast, scalable hardware simulation using Amazon EC2 F1 instances.
FireSim benefits both hardware and software developers working on new rack-scale systems: software developers can use the simulated nodes with new hardware features as they would use a real machine, while hardware developers have full control over the hardware being simulated and can run real software stacks while hardware is still under development. In conjunction with this post, we’re releasing the first public demo of FireSim, which lets you deploy your own 8-node simulated cluster on an F1 Instance and run benchmarks against it. This demo simulates a pre-built “vanilla” cluster, but demonstrates FireSim’s high performance and usability.
Why FireSim + F1?
FPGA-accelerated hardware simulation is by no means a new concept. However, previous attempts to use FPGAs for simulation have been fraught with usability, scalability, and cost issues. FireSim takes advantage of EC2 F1 and open-source hardware to address the traditional problems with FPGA-accelerated simulation:
Problem #1: FPGA-based simulations have traditionally been expensive, difficult to deploy, and difficult to reproduce. FireSim uses public-cloud infrastructure like F1, which means no upfront cost to purchase and deploy FPGAs. Developers and researchers can distribute pre-built AMIs and AFIs, as in this public demo (more details later in this post), to make experiments easy to reproduce. FireSim also automates most of the work involved in deploying an FPGA simulation, essentially enabling one-click conversion from new RTL to deploying on an FPGA cluster.
Problem #2: FPGA-based simulations have traditionally been difficult (and expensive) to scale. Because FireSim uses F1, users can scale out experiments by spinning up additional EC2 instances, rather than spending hundreds of thousands of dollars on large FPGA clusters.
Problem #3: Finding open hardware to simulate has traditionally been difficult. Finding open hardware that can run real software stacks is even harder. FireSim simulates RocketChip, an open, silicon-proven, RISC-V-based processor platform, and adds peripherals like a NIC and disk device to build up a realistic system. Processors that implement RISC-V automatically support real operating systems (such as Linux) and even support applications like Apache and Memcached. We provide a custom Buildroot-based FireSim Linux distribution that runs on our simulated nodes and includes many popular developer tools.
Problem #4: Writing hardware in traditional HDLs is time-consuming. Both FireSim and RocketChip use the Chisel HDL, which brings modern programming paradigms to hardware description languages. Chisel greatly simplifies the process of building large, highly parameterized hardware components.
How to use FireSim for hardware/software co-design
FireSim drastically improves the process of co-designing hardware and software by acting as a push-button interface for collaboration between hardware developers and systems software developers. The following diagram describes the workflows that hardware and software developers use when working with FireSim.
Figure 2. The FireSim custom hardware development workflow.
The hardware developer’s view:
Write custom RTL for your accelerator, peripheral, or processor modification in a productive language like Chisel.
Run a software simulation of your hardware design in standard gate-level simulation tools for early-stage debugging.
Run FireSim build scripts, which automatically build your simulation, run it through the Vivado toolchain/AWS shell scripts, and publish an AFI.
Deploy your simulation on EC2 F1 using the generated simulation driver and AFI.
Run real software builds released by software developers to benchmark your hardware.
The software developer’s view:
Deploy the AMI/AFI generated by the hardware developer on an F1 instance to simulate a cluster of nodes (or scale out to many F1 nodes for larger simulated core-counts).
Connect to the simulated nodes in the cluster using SSH and boot the Linux distribution included with FireSim. This distribution is easy to customize, and already supports many standard software packages.
Directly prototype your software using the same exact interfaces that the software will see when deployed on the real future system you’re prototyping, with the same performance characteristics as observed from software, even at scale.
FireSim demo v1.0
Figure 3. Cluster topology simulated by FireSim demo v1.0.
This first public demo of FireSim focuses on the aforementioned “software-developer’s view” of the custom hardware development cycle. The demo simulates a cluster of 1 to 8 RocketChip-based nodes, interconnected by a functional network simulation. The simulated nodes work just like “real” machines: they boot Linux, you can connect to them using SSH, and you can run real applications on top. The nodes can see each other (and the EC2 F1 instance on which they’re deployed) on the network and communicate with one another. While the demo currently simulates a pre-built “vanilla” cluster, the entire hardware configuration of these simulated nodes can be modified after FireSim is open-sourced.
In this post, I walk through bringing up a single-node FireSim simulation for experienced EC2 F1 users. For more detailed instructions for new users and instructions for running a larger 8-node simulation, see FireSim Demo v1.0 on Amazon EC2 F1. Both demos walk you through setting up an instance from a demo AMI/AFI and booting Linux on the simulated nodes. The full demo instructions also walk you through an example workload, running Memcached on the simulated nodes, with YCSB as a load generator to demonstrate network functionality.
Deploying the demo on F1
In this release, we provide pre-built binaries for driving simulation from the host and a pre-built AFI that contains the FPGA infrastructure necessary to simulate a RocketChip-based node.
Starting your F1 instances
First, launch an f1.2xlarge instance using the free FireSim Demo v1.0 product available on the AWS Marketplace. After your instance has booted, log in using the user name centos. On the first login, you should see the message “FireSim network config completed.” This sets up the necessary tap interfaces and bridge on the EC2 instance to enable communication with the simulated nodes.
The AMI contains a variety of tools to help you run simulations and build software for RISC-V systems, including the riscv64 toolchain, a Buildroot-based Linux distribution that runs on the simulated nodes, and the simulation driver program. For more details, see the AMI Contents section on the FireSim website.
First, you need to flash the FPGA with the FireSim AFI. To do so, run:
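The flash step uses the AWS FPGA management tools that ship on the AMI; the demo’s AGFI ID comes from the demo instructions and is left as a placeholder here:

sudo fpga-load-local-image -S 0 -I <demo AGFI ID>

Then start the simulation with the single-node boot helper included on the AMI (the script name here follows the demo instructions):

boot-firesim-singlenode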
This automatically calls the simulation driver, telling it to load the Linux kernel image and root filesystem for the Linux distro. This produces output similar to the following:
Simulations Started. You can use the UART console of each simulated node by attaching to the following screens:
There is a screen on:
1 Socket in /var/run/screen/S-centos.
You could connect to the simulated UART console by attaching to this screen, but here you will use SSH to access the node instead.
First, ping the node to make sure it has come online. This is currently required because nodes may get stuck at Linux boot if the NIC does not receive any network traffic. For more information, see Troubleshooting/Errata. The node is always assigned the IP address 192.168.1.10:
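ping 192.168.1.10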
This should eventually produce the following output:
PING 192.168.1.10 (192.168.1.10) 56(84) bytes of data.
From 192.168.1.1 icmp_seq=1 Destination Host Unreachable
64 bytes from 192.168.1.10: icmp_seq=1 ttl=64 time=2017 ms
64 bytes from 192.168.1.10: icmp_seq=2 ttl=64 time=1018 ms
64 bytes from 192.168.1.10: icmp_seq=3 ttl=64 time=19.0 ms
At this point, you know that the simulated node is online. You can connect to it using SSH with the user name root and password firesim. It is also convenient to make sure that your TERM variable is set correctly. In this case, the simulation expects TERM=linux, so provide that:
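TERM=linux ssh root@192.168.1.10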
At this point, you’re connected to the simulated node. Run uname -a as an example. You should see the following output, indicating that you’re connected to a RISC-V system:
# uname -a
Linux buildroot 4.12.0-rc2 #1 Fri Aug 4 03:44:55 UTC 2017 riscv64 GNU/Linux
Now you can run programs on the simulated node, as you would with a real machine. For an example workload (running YCSB against Memcached on the simulated node) or to run a larger 8-node simulation, see the full FireSim Demo v1.0 on Amazon EC2 F1 demo instructions.
Finally, when you are finished, you can shut down the simulated node by running the following command from within the simulated node:
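poweroff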
You can confirm that the simulation has ended by running screen -ls, which should now report that there are no detached screens.
At Berkeley, we’re planning to keep improving the FireSim platform to enable our own research in future data center architectures, like FireBox. The FireSim platform will eventually support more sophisticated processors, custom accelerators (such as Hwacha), network models, and peripherals, in addition to scaling to larger numbers of FPGAs. In the future, we’ll open source the entire platform, including Midas, the tool used to transform RTL into FPGA simulators, allowing users to modify any part of the hardware/software stack. Follow @firesimproject on Twitter to stay tuned to future FireSim updates.
FireSim is the joint work of many students and faculty at Berkeley: Sagar Karandikar, Donggyu Kim, Howard Mao, David Biancolin, Jack Koenig, Jonathan Bachrach, and Krste Asanović. This work is partially funded by AWS through the RISE Lab, by the Intel Science and Technology Center for Agile HW Design, and by ASPIRE Lab sponsors and affiliates Intel, Google, HPE, Huawei, NVIDIA, and SK hynix.