
UC Berkeley Cloud Computing Meetup 005


– All right, great. Thank you everybody for coming out on a beautiful Tuesday night. It’s great to see so many RSVPs. I think it was the most RSVPs yet, but maybe also the most drop-off. (laughing) Must be ’cause it’s such a beautiful day. So I’m not gonna say too much other than just to introduce who our
first three guests are. This is our first UC Berkeley
Cloud Computing meetup. And I’ll talk a little bit
about why we have the meetup. But our first speaker is Ira Tarshis, who is a cloud services
architect at UC Davis. And he’s gonna talk about controlled unclassified information, CUI, why it’s part of NIST 800-171, and there’s a lot of security people in the audience, I can tell. So that’ll be the first talk. And then we’re gonna
hear from Pavan Gupta, who’s a digital health
engineer working at UCSF. It’s the summer, so I
think we’re widening out and we’re including other parts of the UC. (laughing) Which is pretty great. So he’s deployed a HIPAA-compliant
research environment, and he’s gonna talk about that. And then our third speaker
is the head of product for Kiwi Campus, that is Sasha Iatsenia. I think I’ve got that right. And so, he’s gonna talk about how they use different cloud providers to do different types of tasks, ’cause they have the machine learning that they use to train the
mostly autonomous robots that deliver food all over campus and that you’ve seen around campus. I guarantee you’ve seen them. So I wanted to thank our sponsors. So, Caroline Winnett is the executive director of the SkyDeck. She couldn’t be here today, but I’m gonna briefly hand
the microphone to Peng. He’s a program coordinator
here at the SkyDeck, and responsible for all the logistics and helping us get this set up. – Thank you, Bill. Hi everyone, my name is Gordon. I’m the program coordinator
here at SkyDeck. Thank you all for coming today. Please enjoy our space. We host this every last
Tuesday of the month. And what we do is we are UC Berkeley’s
own startup accelerator, and we have a fund
component to the accelerator that contributes half
of the ultimate carry, when any startup exits, back to the campus itself. So it’s an ecosystem that drives not only entrepreneurship at Berkeley and across the UC system, but also it has this
feedback loop going on to generate more revenue
for not only the university but also for future endeavors, especially in the entrepreneurship space. So if you have any questions, feel free to reach out to me
as well, with regard to that. – Thank you, Gordon. So I also wanna thank
all of the co-organizers that helped us out in getting this set up. So Amy Neeser from Research IT, Jason Christopher, also from Research IT, Anthony Suen, my partner
here in crime today from the Division of Data Sciences, Cathryn Carson is also one of our sponsors from Data Science Initiative, along with Jenn Stringer in the back, our deputy CIO for campus. And I’m gonna ask Cathryn to
come up and say a few words, and then we’ll talk about why
we’re actually doing this. – Yeah, thank you Bill. As you see, there’s a
coalition of partners who are bringing together the cloud meetup. And just speaking on
behalf of all of them, I wanted to make sure that
the spirit of inclusion and welcoming to people with all kinds of technical backgrounds and all kinds of concerns and all kinds of places, you know, sitting in all kinds of places
on their learning curve, is something that you feel
as you come into this room. Because it’s really
important for all of us, and I speak here in the
role as the faculty lead for the last two years of our undergraduate data science program, to help everyone feel that
they can get the benefits of cloud computing, new technologies that we
can now run on the cloud, to make things easier for
whatever areas we’re working in. So this notion that is
expressed in the meetups, the orientation materials, the experience that we want
you to have in the room here, is one of collective learning
and sharing information, and everyone feeling like
they’re welcome to contribute and welcome to question. I can speak a little bit also about the kinds of ways that
we try to embody the spirit in the new division of Data
Science and Information, which has been formally
set up starting July first and which will come into being as a kind of emergent,
inclusive academic space for everything connecting to
computing and data and society across UC Berkeley. And so, standing here as one of the team of faculty and staff and students who will build the division, I wanna express that special welcome to those of you who are coming
from outside UC Berkeley and are feeling that this
might be a place for you to do the learning and the contributions that you feel you have
particular to contribute to this process of not just technological, but also social, and even
organizational change. The division itself is an
emergent thing, as I mentioned, and so I won’t bore you with the details of how it’s actually possible to build an organization at UC Berkeley that includes the words,
quantum superposition, in its charter.
(laughing) If you’re interested in that,
come talk to me afterwards. But just take the spirit
of, we will find a way, we will figure out how
to connect to each other and share what we have,
and learn from each other. It’s fundamental to the
mission of the new division. And to share it with all of the partners who put together the cloud meetup. So with that spirit of
inclusion and also innovation that comes from bringing
those different voices, I wanna welcome you, and then pass the
microphone back once again to have the introduction
to our presenters. – Thank you, Cathryn. So one thing that I
always wanted to address sort of at the beginning is why, ’cause I get asked this, you know, why are we having a cloud meetup? And you know, I kind of view
that the answer is in the room. So we’ve done a poll at the
beginning of every session, and what we’ve found is we end
up with about 1/3, 1/3, 1/3 between IT staff, academics, and people from startups
and the outside community. And one of the things that I think is part of the spirit of inclusion is bringing everybody together to break down walls and silos and have people share information. And it was great. And the video will be posted
soon once we caption it, from last time. Someone watching the video
said this reminded them of a graduate seminar. There was one particular
place where someone was trying to solve a research problem, and a bunch of people
from all over the place helped them solve that problem. And I think it was relating to Globus and moving data around. So I’m gonna just gonna
do a quick poll again. How many of you here are academics, on the academic side of the university? So, three and a half. (laughing) It’s the summer. How many are IT staff? Okay, a large number
of IT staff this time. And how many of you are from
the community or from startups? So we have probably eight,
nine, or maybe 10 altogether. All right, thank you. So, as one of the other
things that we usually do, ’cause it’s a meetup, is take a moment for introductions. I wanted you to turn to
the person next to you on either side and talk about why you’re here tonight. We’ll give you a couple minutes
to introduce yourselves, and then we’ll come back
and we’ll start the talks. (murmuring) – Thanks. Yeah, so like Bill said,
my name’s Ira Tarshis. I’m from UC Davis. I do cloud systems engineering,
development, architecture. A bunch of cloud stuff within
our central IT organization. (clears throat) And yeah, I’m here today to talk about our NIST 800-171
compliant cloud environment that we built last year. So on top of some other stuff that I do, this was something that I worked on for my first year at UC Davis. And it’s sort of a
cross-functional, cross-team effort. And so, I’m gonna go through, you know, how we built it, a lot of
sort of administrative hurdles that we went through to get it set up, and then go through some of the technical architecture we have. And hopefully it’s interesting. And if it’s not, hopefully
the view makes up for it. (laughing) All right, so we’ll get started. So this is an agenda. Mostly a pointless slide,
(laughing) but that’s what’s gonna happen. So, our secure research
computing environment, our NIST 800-171 environment,
we nicknamed SRCE, which stands for Secure
Research Computing Environment, ’cause we’re super creative people. So, it’s, as I mentioned, it’s a NIST 800-171-compliant environment. We recently got a third-party attestation at the end of the year last year. It’s entirely deployed
on Amazon Web Services, so it’s a cloud-native, well, sort of. It’s cloud infrastructure, at any rate. And we did this as sort of a shared effort between my department,
the departmental IT group, and our ISO office, Information
Security Office on campus. And then a bunch of researchers
who did pilot testing, and are our clients,
basically, for this project. We were approached by a research team who were really the impetus for doing this. Back before I even joined UC Davis, they came to the ISO’s office asking for an environment where they could do research on controlled unclassified information. Basically, they had been
presented by data providers with a contract or a liability waiver asking them to assume liability for CUI that they were going to do research on, and telling them that they would be fined some tremendous amount of money if it was discovered
that they were not using a NIST 800-171 compliant environment. So they approached the
ISO office and central IT to help them do that, and sort of assume some
of the technical overhead and security administrative
overhead doing that, because they didn’t have
the resources to invest in the operational overhead of becoming compliant, basically. So the challenge was,
we needed an environment that could provide secure handling of CUI, offer researchers something
that they were familiar with, that was a low barrier of entry for them, where they could run the tools that they’re used to
running against the data without a whole lot of
new training or education. We also needed to limit
our IT support footprint because we have limited resources in the central team that I work in. And so, we needed a lot of
help from the ISO’s office to produce all of the
tons of documentation that we needed for this, and we needed something
that we could support with minimal IT staff. And the idea at the end of all this was that we would get a
third-party attestation to compliance with NIST 800-171 so the researchers could feel comfortable assuming the liability for the data they were storing in the cloud. (clears throat) Yeah. So to scope the project,
we sort of created a cross-functional team. So we had volunteers who really needed this type of research
environment immediately from the College of Engineering. Professor Frank Loge, who works on water energy
efficiency research at the College of Engineering, approached us and wanted
to sort of beta test and pilot test the project, and has been incredibly
supportive during that time. Cheryl Washington, our chief information security
officer and her office and members of her team were instrumental in creating the documentation and reviewing the NIST standards to make sure we were meeting them, as well as providing sort of audit support throughout the process. And then campus IT, my team, which is sort of the central
IT organization of the campus, we had, usually on this
project, one full-time staff plus maybe some assistants working on it, and we assumed the brunt of
the technical implementation. And then we had some
departmental IT assistants from the College of
Engineering’s IT group, which was also instrumental
in getting this done. We wanted a really secure
and flexible environment, and a narrow scope for our audit purposes was really important. We didn’t have a lot of staff to build it, so shrinking it down to a
size that was easily auditable and isolated from all our other
stuff was really important. And so, we chose a public cloud provider. And we went with Amazon Web Services partially because that’s
where the expertise lay, partially because they
offer a bunch of services for doing compliance, and
have excellent support. And also, because they were
more mature at the time, when we went into this endeavor. So, why we chose to put this in the cloud. One, probably the primary reason is that it’s easy to limit the scope by putting a data center in the cloud. Our environment exists as a very small, sort of isolated environment with its own domain and its
own network infrastructure, and I’ll talk about that in a bit. And it was really easy to do this by putting it in the cloud. That way, you’re not connected
to any of your on-prem stuff, you don’t bring a bunch of
physical security into scope, and your campus data center
kind of stays out of scope. And as you’ll see in a minute, AWS actually assumes a lot of, sort of, they provide us a lot of things to cover the compliance workload. We also needed to build
something that we could automate and replicate very easily, and be sure that when we
were doing new deployments, or when new projects came onboard, we were consistently deploying things in a manner that, like, ensured
the security configurations were set up the way we wanted them. And that is much easier to do in the cloud than it is on-prem, at
least in my experience. And it also matches my
background a bit better. So, going to the cloud ended up actually providing us with quite a bit in terms of value. So, NIST 800-171, for
folks who aren’t familiar, is basically a security standard with 14 control families. So these security standards are typically drafted with control families, and then within the control families, there are controls that can either be prescriptive or they can be sort of generalized. In the case of NIST 800-171,
they were fairly generalized. They don’t prescribe technical solutions. They say things like, you know, be sure to provide an access control and identification system. Okay. So they range in how they’re characterized and what the categories are from like, physical protection, media protection, to access control and identity management. They even have some standards around awareness and
training for personnel, that kind of thing. When we went to AWS, one of
the things that we noticed was a lot of the control
families no longer applied to the work we were doing. So what I did here is just some math to demonstrate this. I took the 111 NIST 800-171 controls. We have to create a system security plan. That’s just something you have to do when you’re getting audited. And the system security plan is basically a list of each control, then the responsible party, and then how you address that control. And so, I took them and
I did some math on them. And basically what came
out was that AWS satisfies about 23% of our security controls. So for instance, like, physical security, we don’t even think about it, really, because it’s mostly covered by AWS. We can download the
attestations that they have through their auditors and present them to, you know,
to our auditor, basically. Media protection’s another one. But basically, a number of these controls are either serviced partly by AWS, or they have services
that are assisting us in addressing them. So that was a big help. And then the rest of it,
the other close to 2/3 of it is broken down like you
see in the chart here. So central IT, my organization, handles about 60% of
the compliance workload. ISO’s about 9%, and then
departmental IT’s about 10%. So that’s kinda how the
breakdown was for us in terms of the cross-functional work that needed to be done.
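As a rough sketch of that kind of math, assuming the SSP were exported as a CSV with a responsible-party column (the file name and headers here are invented, not the actual SSP format):

```python
# Hypothetical tally of who is responsible for each control in an exported SSP.
# Assumes a CSV named ssp_controls.csv with "control_id" and "responsible_party"
# columns (e.g. "AWS", "Central IT", "ISO", "Departmental IT").
import csv
from collections import Counter

with open("ssp_controls.csv", newline="") as f:
    parties = Counter(row["responsible_party"] for row in csv.DictReader(f))

total = sum(parties.values())
for party, count in parties.most_common():
    print(f"{party}: {count} controls ({count / total:.0%})")
```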
And most of that stuff is documentation. So to meet the security requirement, the compliance requirement, about half the work was creating documentation, and about 35% of it
was following processes and making sure the
processes were documented and making sure we
documented our execution of the processes in the proper way. And then about 15% of it was
my job, which is, you know, the easy part, which was the technology and the technological implementation. So really, I think meeting
these security standards, people maybe sometimes think that it’s a big technical thing. But actually, the technical
part is the easy part. The rest of it was really difficult,
who’s primarily technical. So with that said, I’m the technical guy, not the documentation guy. (laughs) So I’m gonna talk about the
technical implementation. And, yeah. So my boss really likes this slide because it sort of
inventories all of the stuff that is involved in SRCE. So you can see, there’s some
documentation components there, we have policies and standards. It’s about over, I’d say,
450 pages of documentation, the last time I looked,
that had been reviewed and assessed over and over again. And we turned them all over
to the auditor and stuff. And that includes policies, standards, standard operating procedures,
research unit handbooks for the research units
that are our customers, and then our SSP. And then we have a bunch
of server components. I’m gonna talk about those a bit later. But the big thing in the middle is all the AWS components we use. So AWS provides a ton of
managed services for compliance. And obviously, you run
networks in AWS and things. So this is sort of the
list of AWS services that right now, in the current
state of the environment, we’re making use of. I’d say the big ones from
a compliance perspective are Amazon GuardDuty, AWS
Config, CloudWatch, CloudTrail, AWS Inspector, those kinds of things. And then, automation is a
crucial component of our system. I’ll talk about that in a bit,
but the key takeaway here is, we basically use Terraform for everything. And I highly recommend it. So the way we got started technically is we used Amazon’s Quick
Start architecture for NIST. So they have a Quick Start architecture that covers NIST 800-53 and NIST 800-171, which is a subset of NIST 800-53. And that Quick Start architecture is basically a collection
of CloudFormation scripts, pretty turnkey, for people who
are fluent in CloudFormation. You basically upload these specifications, and AWS does all the work to
deploy your infrastructure. Works pretty well.
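As a rough sketch of what launching a Quick Start template like that looks like with boto3; the stack name, template URL, and parameter are placeholders rather than the real Quick Start values:

```python
# Rough sketch of launching a CloudFormation template such as a NIST Quick Start.
# The template URL, stack name, and parameter below are placeholders.
import boto3

cfn = boto3.client("cloudformation", region_name="us-west-2")

cfn.create_stack(
    StackName="nist-quickstart-main",
    TemplateURL="https://example-bucket.s3.amazonaws.com/templates/main.template",
    Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    Parameters=[
        {"ParameterKey": "pNotifyEmail", "ParameterValue": "security@example.edu"},
    ],
)

# Block until the nested stacks finish deploying.
cfn.get_waiter("stack_create_complete").wait(StackName="nist-quickstart-main")
```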
Then we ended up modifying the baseline architecture a fair amount for our needs. But it provided a good starting point. One of the other useful
things, for NIST anyway, was AWS provides a security controls matrix, which is basically a spreadsheet that lists all the compliance controls and then how the AWS
Quick Start architecture satisfies them. So it gives you a little bit of a roadmap or a taste for how you’re going to have to technically address the
security control standards. And I believe it covers
both 800-53 and 800-171. I think they have these
for FedRAMP and FISMA, various other things. So in order to enhance and solve
some of the problems we saw with the AWS Quick Start architecture, we looked at other universities that were doing NIST 800-171
compliant research environments in the cloud, and Purdue sort of popped
up as one of those. Basically, Purdue had developed, in 2016, what they call the data airlock model, which was an infrastructure
based on SFTP airlock for getting data into the system and getting it out of the system. And I’m going to go through how we sort of implemented that
idea a little bit later, but that was something we adapted from Purdue’s EDUCAUSE article, which is linked down there,
and is probably updated now. I think they do something different. I know we do. But they published it originally in 2016. And so, this is what we
ended up coming up with. I’m gonna go through the architecture in a bit more detail in a second. But basically, we have a
multi-account structure. So when new researchers come on board or new research projects come on board, we deploy a new AWS account for them, and that account is linked back to our master management account and the security account through
IAM roles and VPC peering, you know, so we have network connectivity and we have IAM role permissions, so AWS sort of IAM access
to deploy our infrastructure and things like that. We have consolidated billing, which is one of the major things that, at our university
anyway, has been an issue: it’s kinda difficult to pay with a PO and invoicing through AWS, so we consolidate billing and chargeback to the project accounts that are onboarded in our environment. And then we have network control through the VPC flow architecture, and we deploy this
stuff through Terraform, which I’ll talk about in a bit. So yeah, I’ll just talk
about each of these accounts sort of individually. In the security account,
we have a single VPC. In that VPC, we have
third-party applications, Splunk, and Sophos. Splunk is used as our
primary logging, monitoring, and compliance auditing service. So we lean very heavily on
Splunk in this environment. It’s really customizable,
for folks who’ve used Splunk, it’s really expensive, but it does a lot of the things we need and it’s sort of a
catch-all solution for us. We use a bunch of AWS
native solutions as well that run out of the security account, so GuardDuty, which is an AWS service that provides sort of a
perimeter threat detection layer, I guess, on AWS. So if malicious IPs are trying
to access your environment through any public interfaces you have, GuardDuty checks for that. If there’s abnormal activity
within IAM on your accounts, GuardDuty will alert on that. AWS Config is a semi-customizable AWS compliance enforcement service. So it has visibility on AWS’s,
like, network architecture. So security groups, firewall
rules, firewall ACLs, and your network configuration
can all be sort of audited using AWS Config. Additionally, you can audit things like, are backups occurring, do
you have public S3 buckets, you know, that kind of stuff. CloudTrail is what we
use for audit logging. So any time a change is made in AWS, we log it to a specific place, and run a report every, you know, week, on what changes were
made in the environments using CloudTrail. And then we’re monitoring VPC Flow Logs for traffic that’s
flowing through the system between the private VPCs and out to the public, if that happens.
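A minimal sketch of the kind of weekly change report described here, pulling recent management events from CloudTrail's event history; the region and seven-day window are illustrative:

```python
# Sketch of a weekly "what changed" report from CloudTrail's event history API.
from datetime import datetime, timedelta, timezone
import boto3

cloudtrail = boto3.client("cloudtrail", region_name="us-west-2")
end = datetime.now(timezone.utc)
start = end - timedelta(days=7)

paginator = cloudtrail.get_paginator("lookup_events")
for page in paginator.paginate(StartTime=start, EndTime=end):
    for event in page["Events"]:
        # Each record says who made the change and which API call it was.
        print(event["EventTime"], event.get("Username", "-"), event["EventName"])
```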
We have a management account where all our identity and access management infrastructure lives. So right now, we’re using Active Directory as our identity
and access management service. It lives in a private subnet. And we also run Systems
Manager and AWS Inspector out of this account to scan for vulnerabilities
on the different servers that are running within our environment. And we also route all network traffic outbound to the internet through this management account so that we can control, you know, what kind of traffic’s going on, where people are browsing,
that kind of thing. And then finally, we have our
project or client accounts. So like I said, each research
project gets its own account. They get a single VPC, and we deploy some basic
infrastructure within that VPC. We deploy an egress airlock
and an ingress airlock server, as well as a terminal server. So the ingress airlock
is how they get data in, egress airlock is how they take data out. And then the terminal
server is where they land to sort of move that data
into their research clusters, which live in a private VPC. So they have the ability to deploy systems into a private VPC and
configure them there, but they route all their traffic
through a terminal server. Additionally, when
researchers are onboarded we give them access to
their own SSL VPN tunnel using Pulse VPN. And that is how they get access
to their terminal server. So they need to log into VPN, you have two-factor set up on that, and then they access their
terminal server via RDP, and they can put things
into the airlocks via SFTP or take them out via SFTP. Yeah, the crucial component
of these client environments is the airlock design. So the airlocks are basically running on Amazon Linux AMIs
right now, unfortunately. And we have a number of, you know, Linux-based Active Directory
binding technologies that we use to bind
them to Active Directory so that we manage
authentication centrally. And you know, some other security things. We use CIS Benchmark AMIs, and everything is encrypted,
FIPS-compliant encryption on our SSH and SFTP services. And basically, this is the workflow for each of the airlocks. So the ingress airlock, basically a researcher would log in through their SSL VPN tunnel. They would then have access
to a SFTP drop location. When they drop their data in there, it’s scanned through
Sophos, logged to Splunk, and then made available
through a Samba share on their terminal server. To get data out, we have
an approval process. So a researcher would put something into an outbound SFTP airlock. It would trigger an approval workflow, and an authorized approver
would have to review what was in the airlock, approve it for egress, or not approve it, in which case it’s deleted. And once it’s approved for egress, it gets put onto a SFTP endpoint, where the researcher can pull
it off through the VPN tunnel. This is all sort of deployed, and provisioning is done using Terraform. So we have a versioned state of the environment, and we have versions of
the Terraform scripting and specification in Bitbucket, and then a system for having Terraform pull versions of the
environment and deploy. So when we add a new project VPC, we
specify a new project VPC in the Terraform configuration, and it gets deployed via the
Terraform’s execution module. And so, we use it to deploy
all our network infrastructure. So basically every VPC and every account is consistent in terms
of network configuration, in terms of the servers that exist there, what their configurations look like, and all the peering and IAM access roles with the new account.
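A hedged sketch of wrapping Terraform to stamp out a project environment like that; the var-file layout and project name are assumptions, not the actual SRCE repository structure:

```python
# Sketch of driving Terraform to deploy a new project account/VPC.
# Paths, flags to a generic Terraform CLI only; the layout is hypothetical.
import subprocess

def deploy_project(project: str) -> None:
    var_file = f"projects/{project}.tfvars"
    subprocess.run(["terraform", "init", "-input=false"], check=True)
    subprocess.run(
        ["terraform", "apply", "-input=false", "-auto-approve", f"-var-file={var_file}"],
        check=True,
    )

# Hypothetical project name for illustration.
deploy_project("water-energy-lab")
```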
Yeah. Some lessons learned. So I think we underestimated the time required. So this project took
about two years, which, I think my boss told me one time he thought it was gonna take a month. (laughing) So I think we sorta went into this without all of the understanding of what meeting a security
compliance standard entails. As I mentioned earlier, it was about 450 pages of documentation, which takes more than a month to write, and more than, you know,
one or two people to review over the course of time. It also cost a lot more than we thought. Dev environments cost a lot of money. AWS will let you deploy a lot of resources into their accounts before
you hit a roadblock, and they can get very expensive over time. So doing a cost assessment early, and a real practical cost assessment that looks at what you’re gonna
spend for dev environments and test environments and how long you’re
gonna run in production before you start recouping money
and costs from your clients is a good thing to do. I think the other
technical takeaway I have is use automation early. So I think, our environment
was initially built by hand, or through modifications to CloudFormation deployed architectures. And then we had to go
through the painful task of retroactively specifying
all of that stuff in automation, and
getting it to work again after doing that. That cost a lot of time. And it’s tempting to go into the console and just deploy things and spin things up, because it’s very easy,
it’s a nice console to use. But I would encourage never doing that and only putting stuff
into your automation code because it really saves time in the future and makes it much easier
to get a read on things. Yeah, so. But the future, so this summer, a couple projects that we’re working on is we’re working on cost optimizations. So we want to make things more elastic, utilize S3 more for storage instead of EBS block
storage on the servers. We wanna implement Spot instances and more sort of auto-scaling
within our environment, and more time-based or
need-based scaling up and down of the systems within the environment. Right now, everything is very static. Researchers deploy the
research systems they need, and they spend the money
whether it’s a weekend, whether it’s a holiday, whenever. We wanna make it so that it’s need-based when they get those services. We’re also looking to
optimize the airlocks and make them more cloud-native using a combination of
Lambda, S3, Cognito, and ADFS. So this solution involves a Lambda frontend and API, basically, that puts files and data into S3. It’s then scanned, and we use Cognito and an ADFS link to our SRCE domain, our secure domain, to authenticate users.
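A minimal sketch of one piece of that cloud-native airlock idea, assuming a Lambda handler fired by an S3 upload notification; the bucket layout, tag names, and downstream scanner are assumptions:

```python
# Sketch: tag a newly uploaded object as pending scan/approval so nothing
# downstream releases it early. The tag names are invented for illustration.
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        s3.put_object_tagging(
            Bucket=bucket,
            Key=key,
            Tagging={"TagSet": [{"Key": "airlock-status", "Value": "pending-review"}]},
        )
```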
That is mostly built at this point, and hopefully will be rolled out soon, and we’ll get away from using servers for the airlocks, and then moving away from Splunk because it’s expensive
and hard to maintain. And then some other cloud initiatives. This is mostly selfish, because I want people to come up to me and tell me how to do these things. (laughing) ‘Cause these are things
we’re working on currently on our campus. So one of those is building a data lake. One of the things we’ve done so far is automate the ingestion
of infrastructure data, cataloging it using
Glue, putting it into S3, and then using Athena
to provision connections to various systems using
JDBC and ODBC connectors. That’s something that’s
in the works right now, and we’re working on
building systems to get data, mostly systems to get the
first types of data into S3.
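A small sketch of that flow, assuming a Glue crawler over the S3 data and an Athena query against the resulting catalog; the crawler, database, table, and bucket names are placeholders:

```python
# Sketch of the data-lake flow: crawl new S3 data into the Glue catalog,
# then query the table with Athena. All names below are placeholders.
import boto3

glue = boto3.client("glue")
athena = boto3.client("athena")

glue.start_crawler(Name="infrastructure-data-crawler")

athena.start_query_execution(
    QueryString="SELECT * FROM infrastructure_db.server_inventory LIMIT 10",
    QueryExecutionContext={"Database": "infrastructure_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
```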
And then the other thing we’re working on is a cloud brokerage initiative, which is basically a way of deploying AWS accounts, adding them to an inventory database, and configuring them for billing recharge. And that uses Lambda,
RDS, IAM, an IdP, and ADFS, and we configure single sign-on through our domain with ADFS when we do that.
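A rough sketch of the account-vending piece of such a brokerage using AWS Organizations; the email, account name, and the hand-off to an inventory system are illustrative:

```python
# Sketch: create a member account in the AWS Organization and check on its
# provisioning status before recording it for billing recharge.
import boto3

org = boto3.client("organizations")

response = org.create_account(
    Email="pi-lab-aws@example.edu",   # placeholder contact
    AccountName="pi-lab-research",    # placeholder account name
)
request_id = response["CreateAccountStatus"]["Id"]

# Poll once here for brevity; a real workflow would wait until State is SUCCEEDED,
# then hand the new account id to the inventory/billing system.
status = org.describe_create_account_status(CreateAccountRequestId=request_id)
print(status["CreateAccountStatus"]["State"])
```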
So those are sort of the two other initiatives I’m working on right now. And that’s it for me. Anyone have questions? (clapping) – I think we have time for… – [Audience Member] I’m
wondering how you pay for this. Is this a recharge service? – Not yet. (laughs) So right now,
we are charging cost to our clients, and that just happened. Up until last October when
we got the attestation, I believe the provost
was footing the bill. I dunno, somebody. Not me.
(laughing) Yeah, but no, we’re not recharging. – [Audience Member] So
along those same lines, how are you shutting people
down when the projects are done? Is there an automated mechanism for that, or do they have to notify you? – So there’s not an automated
mechanism currently. A lot of this, so the
cloud brokerage project that I talked about briefly at the end is going to be driven
through ServiceNow forms for provisioning accounts and
then de-provisioning accounts. So that is something
we’re looking at using to de-provision these services. But right now, they notify us, we tear it down using our Terraform, archive data if they need it. But usually they say they’re
done with everything. – [Audience Member] And
when a researcher signs up, do they get a select amount
of bandwidth from AWS, or do you allow bursting within accounts, or are you able to share that resource among several researchers? – So one of the things we can
do is share reserved instances across the sub-accounts in the org, but that’s about it right now, in terms of cost optimization. One of the things we’re
working on right now is being able to burst to
create Spot instances as needed for the capacities of
the research application that’s running. But that automation work
has sort of just started, and so we’re trying to
provision that stuff now. – [Audience Member] Do you
know if there are any plans down the road to open this up as a service to other campuses? (laughs) – Um, way down the road.
(laughing) Yeah, so yes. There is some discussion
about that, but, yeah. We haven’t talked too much about that. – [Audience Member] (clears
throat) So just on 800-171, do you feel that meets the
needs of most researchers? Because, like, a challenge that we have is people come to us with very different
compliance requirements. Is that, like, a bar that’s
at a pretty good level where you can accommodate most people? – I think it would be better if we had some more DFARS compliance. Like, NIST 800-53 is
probably, I would say, more of a generalized standard. I think a lot of our researchers and the researcher that
came to us initially are looking at CUI and working with CUI, so this really fits their need. But some other researchers
that we’ve talked to, and just talking to
researchers around campus, 800-53 seems like a standard
that we could potentially bring in a lot more customers using. And so, I think we’re looking
at sort of moving towards that kind of compliance. Once we optimize things for usability, because right now it’s kind of, we got a lot of feedback from researchers doing UAT that things are a little bit difficult to use. So we’re making optimizations there. – I think that’s all the time
we have for questions. We can hopefully take more later, but
I’m gonna open it to Pavan. – Hello. So I’m Pavan Gupta, I’m
a digital health engineer with the Center for
Digital Health Innovation. And I’m gonna take this quickly, I think. So I thought it’d be fun to offer four sort of opinionated thoughts about how you can
consider using computing. And I will pivot around an advance in technology
called Kubernetes. And if you’re not familiar with that, essentially it’s the
full orchestration tool for sort of container and
cloud-native environments. All right, so Ira just did a great job describing a fantastic Amazon environment. It turns out we also, at UCSF,
worked on a version of that. In the end, our researchers
basically wanted a console, they wanted that console to be simple, they wanted to log in, they
wanted to do their work, and that was it. In fact, oh yeah, there we go. Can you hear me a little better now? Hello.
(laughing) Okay, so they wanted
something very simple. They wanted it to be usable. The problem was, as it turns out, Amazon is a particularly expensive tool, and in certain scenarios
it works a lot better than other scenarios,
like for a research group. And it turns out there
may be better solutions to consider in this. So here’s the first opinion I have. There’s a better way to
think about cloud computing. So there is a, I stole that image, by
the way, from Microsoft. Should’ve stolen from
Amazon, I’ll steal it. Don’t worry, Matt Jackson. So to be clear, there is
a notion that the world has kind of been going
from things that look like, you know, on-premise implementations into sort of fancy,
cloud-based implementations. And a lot of the cloud-native tooling is starting to return to
data centers everywhere. And it’s actually really exciting. It means that you can operate IT services as if they were just consumable APIs, and it offers a number
of game-changing options. So, one, it’s worth noting that if you’re a grant-funded researcher, and that’s a couple academics in the room, yeah, you don’t want your
machines to just go away. You wanna be able to have
something you can continue to use. Like I said, cloud native’s
pretty interesting. And it turns out it’s possible to build on-prem infrastructure in really thoughtful, cloud-native ways. In fact, I think there’s a research platform that is doing that the best across
the University of California, as far as I can tell. And we are certainly implementers
of what they’re doing. And I want you to take
this away from this slide. A better way to think
about cloud computing is that it’s a mechanism for agility. It’s not necessarily a
mechanism for all computing. So in doing that, I think
I was introduced to a guy who was doing HIPAA-related
secure research. At some point, August 22nd, we will be presenting in
Santa Clara, at least briefly, about a NIST 800-53 exercise
that we’re busy going through, which, goodness gracious,
it’s surprisingly boring. But it’s about showing that we can do cool, sort of cloud-native things with high-performance computing using stuff like Singularity, which I’m told was built somewhere up on the hill over there. So we’re gonna plan on that. But one of the things that comes with it is sort of this new notion
of sort of zero trust. So if you’re talking to security nerds, they’re all about this
thing called zero trust. And so, zero trust is basically the idea that up and down, so let me first say, I don’t think there’s any
UCSF people in the room, so pretend I didn’t say it. But yeah, maybe your internal
and external networks are no longer trustworthy. There’s probably good reasons
why they are trustworthy, but let’s just assume they’re not. It turns out researchers engaging in synchronous security
controls is probably a mistake. That’s not their expertise, it’s probably not where
they wanna spend their time, and frankly, no one really wants to clean up the mess afterwards. And so, there’s this
notion that if you isolate all the way up the stack, right, so you have isolation
layer after isolation layer after isolation layer, there’s potentially a way of thinking about those isolation layers as security by design, right? So let’s say, I’ll just
do a very simple thing, I can explain this beautiful picture. Which by the way, I also
stole from sylabs.com? Who knows what that is. But the point is, if you’re
thinking about things that look like users and applications, you have to use data and the network, and everything is sort of
encapsulating each other, maybe in order for you to
escape your isolation, right, in order for there to be a chain of events that leads to a security problem, you really have to go tackling one isolation layer after another. You know, to be clear,
it’s still possible, but it’s harder, as you sort of maintain these isolation layers. So what I want you to
take away from that is, maybe there’s a new way
to sort of think about how you can isolate your workloads. Keep your secure stuff deep
inside far-away corners that are attached in like,
well-secured container platforms and attached to secure VMs, and
attached to secure hardware, that are attached to isolated SDNs,
software-defined networks, and you know, have those things riding even more isolated host networks. There’s a lot of ways to think about that.
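One concrete flavor of that layered isolation, sketched with the Kubernetes Python client: a default-deny NetworkPolicy applied to a hypothetical research namespace (the namespace and policy names are made up):

```python
# Sketch: default-deny network isolation for every pod in a research namespace.
from kubernetes import client, config

config.load_kube_config()

deny_all = client.V1NetworkPolicy(
    metadata=client.V1ObjectMeta(name="default-deny-all"),
    spec=client.V1NetworkPolicySpec(
        pod_selector=client.V1LabelSelector(),   # selects every pod in the namespace
        policy_types=["Ingress", "Egress"],      # block both directions by default
    ),
)

client.NetworkingV1Api().create_namespaced_network_policy(
    namespace="secure-research", body=deny_all
)
```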
So thinking about zero trust, I’ll explain why that’s important here in a second. And so, the next big takeaway is this. I think that you can build your cloud solution on-prem. And in fact, if you
aren’t building on-prem, you might be doin’ it wrong. So we have, I haven’t
defined what ALICE is, ’cause unfortunately I had to
delete some slides from this. (laughs) I thought I’d have,
like, five minutes to talk. ALICE is, I think, the Artificial Learning and
Intelligence Compute Environment at the Center for Digital
Health Innovation, which we’re now using for machine learning research. It’s kinda cool, actually. This is the implementation for us. So, a strong shoutout to people who are doing this better than us, the NIST research platform,
and then across the UCs. But it turns out, so there’s a
lot of content on this slide. But lemme just explain. We’re running a small supercomputer within our sort of research environment. That research environment is designed to be easily accessible. And it’s turned out that actually, the easily accessible part is coming true. Researchers don’t have to
be extremely sophisticated to be able to use extremely
sophisticated security designs, make sense? But it turns out, it’s actually
very hard to run Kubernetes in a successful way, and it requires significant expertise. And it turns out, it’s nice to have people
within the UC system that have that expertise. And believe me, we’re stealing
some of that expertise. But it turns out, things
are getting better and they’re getting better faster. The Kubernetes community
is actually quite large, and like I’ve said a million times, the UC version of that
community is extremely strong. In fact, I think at the
University of California Berkeley, you guys are running one of
the most impressive sort of implementations in the world. So, and I think you are also hosting some of the genius behind it. And like I said, it’s just
an enjoyable user experience. And it turns out, researchers
probably don’t care, well, unless they’re
researching security problems, they don’t care about the security, they care about the end use. Okay. So I wanted to end with this. And I don’t have a way for me to just prescribe the solution. But I’ve taken you down
a path where I’ve said maybe the version of the cloud
that we’ve been talking about for at least the two years
that I’ve been around and certainly the years before that should be rethought. The cloud is a tool that lets you sort of expand and contract, but it’s not the fundamental
place for you to compute. Right? I told you that you should be thinking about isolating things,
in that case of security, and I told you it looks like Kubernetes is a viable and workable solution with a strongly growing base. On August 22nd, I think we’ll have our completed NIST 800-53
variation of Kubernetes in a high-performance
computing environment within sort of a HIPAA context. And I think the way the world should think about this is this, so you build your cluster on-prem. When you decide you need to extend, which always happens, right, there’s like, some research
project that’s due tomorrow, the paper must be finished, that’s when you consume cloud resources. In that moment, you expand
through, like, Spot instances or whatever you wanna do. You grab them, you finish your research, shut ’em down, and that’s the
end of your cloud deployment.
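A minimal sketch of that burst-and-tear-down pattern with Spot capacity via boto3; the AMI, instance type, and counts are placeholders:

```python
# Sketch: grab Spot capacity, run the job, then shut it all down.
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

launched = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="c5.4xlarge",
    MinCount=1,
    MaxCount=4,
    InstanceMarketOptions={"MarketType": "spot"},
)
ids = [i["InstanceId"] for i in launched["Instances"]]

# ... run the research job that was due yesterday ...

ec2.terminate_instances(InstanceIds=ids)
```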
To be clear, that does not obviate things like TR systems and clinical systems, and you know, systems that have to
run on a regular basis, and also, you know, you might wanna do fast
prototyping in the cloud. But to run your cloud problems, to run your problems at
full capacity on the cloud is a mistake. Go look at how Dropbox
rebuilt their data centers. Lots of people need to be acknowledged. I will, you know, I’ll let
the picture show up here for a second. So everyone will know I work
for really fancy people, and lots of people fund this,
and lots of people help. And here’s the part that I wanted you to see.
(laughing) So we have two data science
positions that are open. We are trying to save the world, one small research
computing problem at a time. And I am told you guys have
the best data scientists in the land. Please come and work with us. (laughing) – [Audience Member] Pavan, you need to come work for Berkeley. And if you wanna work for
UCSF, you can go work for ’em. – It’s better that way, right? (laughs) And yeah, that’s it. You guys have any questions? Yeah. – [Audience Member] What sort
of networks, digital networks, which providers, and possible
challenges you have had? – That is a super great question. So in the high-performance
computing world, I think Lawrence Livermore
actually put out some research on how they’re losing something
like 18% to 25% performance with their Kubernetes-equivalent
implementation. And most of that’s falling
apart at the SDN layer. So we are a Calico implementation at UCSF. There are many different
players in that space from, you know, Flannel to
WeaveWorks, and Calico, and a bunch of others. It turns out, there is an open question about whether high-performance computing should be operating in
a Kubernetes environment as it stands today. I don’t think that question
will forever be open, but I think it will be solved. Down at the base layer,
we have host networks that are physically connected. In fact, you asked a question that has enough attached to it. So we run ALICE, the cluster
that we have right now where we’re messing with things. We run all of our storage, which runs, again something really cool, called Rook, which is an implementation
of Ceph on Kubernetes. Pretty nerdy, but there you go. But that host network is where
we do the storage transit. Everything that happens between sort of containerized workloads is taking place at that
software defined network level. Does that help? – [Audience Member] Yeah. And what’s the hardware
you’re using for your network? – Yeah, actually gonna
give a shoutout to Intel. Intel gave us a slew of fancy 1U boxes. I think I even have them listed somewhere. We had a, okay, so I’ll give
you three different variations. Very simple. We’re running some virtual servers to test to see whether
we can just, you know, drop in virtualized boxes and expand our cluster on demand. We’re using maybe 10, like, surprisingly large and
complicated Intel clusters that actually have, like, 48
actual cores sitting on them, some fancy Xeon processors. I could give you more
details if you wanted ’em. And then we had a slew of sort of, I think NVIDIA makes this illegal, but I think we’re running
a bunch of GPU hardware that Megan Dewey wouldn’t be happy about. But that’s how we’re
sort of expanding things. So it’s a very hybrid and odd thing. – [Audience Member] Do
you have an AI sandbox, and if so, would you put it in this? Or what would you do with that? – Yeah. In fact, we do. And look, that’s a
great question! (laughs) So the reason why we’re turning into more of a Kubernetes
shop in the classic HPC world is because some of the
data science tooling that is kinda critical involves actually looking at applications that are being hosted that have a need for strong
compute in the background. So, like, a classic HPC storage implementation would make that very hard. So yeah, you’re exposing
namespaces that can host sandboxes sort of on demand. (Kiwibot grinding) (laughing) Okay, thank you very much. (clapping) – We’re hearin’ some Kiwi. And it can get over here on its own. I think it’s good. And this is Sasha, who’s gonna be giving
our third presentation. Do you have slides or anything? – I don’t, but I actually
have a full-on computer. – Okay, if you have a
computer, we’ll plug you in. – It’s in the robot. (murmuring) – So when the robot
arrives with his laptop, I’ll plug it in. (laughing) – Well thank you everyone for being here. I appreciate it. Sorry, we’re a little bit late, we had a meeting that was unscheduled. But I’m really excited to be here and share all the magic that
we’ve been building with Kiwi. And this is actually the Kiwibot, our helpful little delivery robot that delivers food all over Berkeley, and soon Stanford and other locations. So it’s really, really exciting stuff. And inside, we have my computer. So, on the computer we’ll
play the story of Kiwi. There’s actually a really,
really cool story behind Kiwi, so I’m really excited to share with you. So as we’re setting this
up, maybe a raise of hands. Who has heard of Kiwi before? Who has ordered with Kiwi before? Nice, nice. – [Audience Member] Who
has been hit by a Kiwi? (laughing) (murmuring)
(Kiwibot grinding) – Do we need to change
the input or something? – [Pavan] Did I steal the input? – Oh, there’s no HDMI cable on this one. All right, should be
loading just right now. What you’re seeing right now on the screen is the future of cities. Kiwi’s mission is to build an
operating system of the city. And that’s a pretty broad statement, but if you think about it, our cities right now are
structured around cars, around parking. A lot of our infrastructure
is pretty legacy. It’s not something that people wanna see in our cities as much, moving forwards. We definitely see a city
that is more made for people, that is made for living. And that’s where robots come in. So a bunch of years ago, we started building
these robots in Berkeley. And since then, we’ve
had over 40,000 orders. And we’ve built over
150 robots to do that. What’s really incredible
is to see how people adapt their lifestyles to
accommodate more robots in their environment. So since we’ve launched,
we’ve had three generations. And we’ve seen a transformation in the way Berkeley approaches robots. So when we first launched,
people were kind of hesitant. Like, what is this thing, what does it do, is it gonna take my job? And quickly people realized that actually, it’s there to help people. It’s there to empower communities. Deliveries are more
affordable and accessible. Early on, we had this idea that we’re gonna fully replace people, that we’re gonna fully automate things. But after building a robot
that was largely autonomous, our second generation was
actually 99% autonomous, we realized that even though we’re able to build an autonomous robot, it’s actually not the answer
for what our cities need. Instead, we decided to build a
robot with parallel autonomy. This means that the robots
are actually helping people instead of replacing them. And this has allowed us to build a model that scales to hundreds of orders a day. In fact, last semester
we had over 18,000 orders and over 2,800 unique clients. So almost 10% of the student population. It’s pretty incredible. And over here on the screen, we can actually see the
story of a day of operations. And it’s really beautiful because you see robotics meshing within the fabric of our sidewalks and live in our communities. The blue lines are actually
robots rolling around, and the yellow lines are people. So people, you might ask. That’s a great question. Why are there people involved in the mix? Well, that’s ’cause we’ve
found that generally speaking, it’s more efficient to have
people that are helped by robots than to just have robots
or just have people. So what you’re seeing
here is people in yellow, so these are people who
are goin’ to restaurants, they’re picking up the food. These people are
delivering it to clusters. Once the food is in the
cluster, they feed robots, and then the robots do
the last few hundred feet, the last few blocks to your doorstep. This way, instead of doing maybe one or two deliveries an hour as a DoorDash or Uber
Eats driver would do, our couriers do 15, which is
a significant improvement. And that’s why we’re able to offer free delivery with Kiwi Prime. You subscribe to Kiwi Prime and you can order as much as you want to anywhere in Berkeley. And some people really use it a lot. We have some people who have ordered more than 300 times in a semester. And our top 50 users, they ordered more than 50 times a week. That’s more than twice a day. So it’s pretty crazy stuff. Where does the cloud come into play here? Just about everywhere. (laughs) So these robots that are
actually fully connected are like driving cloud extensions. We have a Jetson inside. So it’s a computer that has a
really powerful GPU on board, lots of video RAM, but it’s always
communicating with the cloud. And actually, we have a couple different points of presence on the cloud. We have stuff that’s running
on Heroku, for example. So we have like, 15
different services on Heroku, some of them for
interfaces with the robot, others are more to do with ordering, customer service,
interfaces with restaurants. We also have other cloud
platforms that we’re using. So for example, we’re using Google’s cloud
platform for routing, we’re using them for storage as well. Also other services like AWS. It’s kind of a mix of different services. So it’s sort of what the
developer wants to do, that’s what we choose. In addition to these
services that provide us our hosting, our compute, our storage, we also use some other cloud services. I guess maybe it’s going beyond the traditional stretch of imagination of what you’d call a cloud service, but we use Workplace by Facebook, and we use this really
cool tool called Node, which is where we’re running
this analysis right now. And prior to this, we actually had a couple different data
platforms that we built. But they were all like, Jupyter Notebooks, so it was really difficult to use. And then we went onto this one, and we have these really,
really beautiful visualizations that help us understand
what’s happening with Kiwi, what’s happening with our operations. And they don’t require
any technical knowledge, so people can use them who
are from a business background or a product background. So it’s a much better approach
to doing data analysis. Where else do we use the cloud? Yeah, I mean, this would not be possible without the cloud. It’s always connected to 4G, so we always have people who
are supervising the robot, and the persistent 4G connection has redundant connections, actually. So it wouldn’t be possible to
build this without the cloud. Yeah. Any questions? No questions? Am I that boring?
(laughing) Yes. – [Audience Member] So
you’re using your robot, when you connect it to
the cloud, what happens if the cell phone module inside dies, or whatever happens? Does it break, or does it finish its last command or delivery or something? – That’s a great question. So let me just repeat the question so everybody can hear it. So the question was, what happens if we lose 4G connectivity? That’s a fantastic question. So typically what happens is that we do lose 4G
connectivity from time to time, and that is a challenge because if we lose connectivity and the robot does not
connect again to 4G, we would have to dispatch a person to restart the robot manually. Fortunately, it doesn’t happen too often so that’s not much of an issue. However, we do have some areas in Berkeley where we have a lot of latency. So what we recently started to do is we started to pull
all of our latency data from our database, and
we started to analyze it. And recently what we did is
we actually published a map which shows us where in
Berkeley we have high latency. So we’re actually able to merge that map with another map we have which shows the dangerous
parts of Berkeley. So we have certain
streets where, you know, it’s a little too tight for cars, and it’s a construction zone, so we actually combine these two maps together, and now we have a map that tells our supervisors where it’s safe for the robot to go, based on the latency and based on real conditions.
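A toy sketch of combining per-segment latency stats with a hazard map the way that's described here; the data shapes and thresholds are invented for illustration:

```python
# Toy example: flag sidewalk segments to avoid based on latency and hazard data.
from statistics import mean

latency_samples = {            # ms samples keyed by hypothetical sidewalk segment
    "bancroft-telegraph": [180, 220, 600, 540],
    "hearst-euclid": [90, 110, 95],
}
hazardous_segments = {"bancroft-telegraph"}   # construction, tight streets, etc.

for segment, samples in latency_samples.items():
    high_latency = mean(samples) > 400
    risky = segment in hazardous_segments
    status = "avoid" if (high_latency or risky) else "ok"
    print(f"{segment}: avg {mean(samples):.0f} ms -> {status}")
```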
But connectivity’s probably our number one issue. Yes.
calls from somebody, or is it all kind of automated as well? – Yeah, that’s a great question. And that’s probably where
most of our magic is. ‘Cause for us, it’s not about
building the best robot, it’s about building the best
experience for the customer. At the end of the day, we’re
offering a commercial service that allows you to order food. And you as a user, you just
wanna order food with your app and you just want food. You don’t care about the robotics, you don’t care about
all that stuff behind it. There’s actually a lot of magic, (laughs) a lot of magic that goes on behind the scenes. So we have a lot of
internal tools that help us make that delivery possible. For the robot specifically, we
have a couple of interfaces. We have like, a God mode interface where we can see all the robots. We use a tool called Food Robotics, and that has like, a dashboard
of all of the robots, like, their state, like, are they working, are they charged, do they have any faults. And also for the supervision of the robot, we actually have a supervisor interface. So people set waypoints
for the robot to follow and then the robot follows
these waypoints autonomously. Any more questions? Yes. – [Audience Member] So I
was just wonderin’ about, I mean, we’ve heard about Lime scooters being
thrown into Lake Merritt, and like, rage directed
at technology. (laughs) Do you have problems with
that with the Kiwibots, and as they’re monitoring, to just make sure they’re
not being vandalized? – Yeah, that’s a fantastic question. So, we’ve been very fortunate in that, with the onset of Kiwi, we’re always understanding
what’s happening around us. We’re always trying to analyze the market, we’re always trying to
see how people react, like, look at what people
say on social media, look at how people react
to the robot next to, that’s why I’m standing next to it. And what we found is that
adding a face to the robot actually really helps us a lot. (light laughing) So once we added the face, people started seeing it more as a friend rather than as a foe. Actually in the very first generation, it was literally shopping
basket on wheels. (laughing) I might have a video of that somewhere. This is the very first Kiwibot,
that’s what it looked like. So it was literally a
shopping basket on an RC car. And it would’ve had a Raspberry Pi inside, an Arduino, and a phone that was doing a video call to somebody who was controlling it with an Xbox controller. So it was very, very basic. And it worked so badly
that most of the time, we just dropped it off in front
of the customer’s doorstep. (laughing) So, with this one, we
did have some issues. People were messing around with it. But we quickly went on to a second generation, where we actually adopted an organic shape. So we had a really round
shape, really friendly. We actually built it in this penthouse in the Harvard room over there. So Gordon really loved
us during that time. And yeah, it was a cool experience. We realized that actually having
a face makes it friendlier. We adopted this kawaii style of design, which people really love
and associate well with. And yeah, we’ve had people
who try to steal it, but nobody got away with it. (laughing) Yes. – [Audience Member] How do you keep people from stealing stuff out of it? – Well, we actually have
a locking mechanism here. So you’d have to apply,
like, a lot of force to open the door. And it got to the point where
now it’s actually cheaper for us to replace the food than to put in any more
sophisticated locking mechanisms. So incidents of like,
people stealing food, are very, very low. Maybe happens once or twice a
month or something like that. So it’s a very rare occurrence, and it actually costs less for us to have the
system as it is right now than to upgrade it. – [Audience Member] Have
you considered placing this, you know, in downtown San Francisco? I wonder how it would fare. – Yeah. (laughing) I actually brought the robot
once to downtown San Francisco, and people loved it. There was this one person who started changing their clothes right there. They were using it as, like,
a thing to hold their clothes. (laughing) That was on Market Street. (laughing) I think right now, our model is more
centered toward campuses. ‘Cause with college campuses, we have a really friendly environment. A lot of people are
really keen to try robots and have an open mind and open heart. With cities, we have a lot more different types of personalities
that we’d have to handle. And also, it’s a question of cost. ‘Cause right now, we’re
operating the robot at about $4.60 an hour. We have to make sure we’re able to scale our business model to cities, and I don’t think we’re ready yet. We’d have to reach more like a $2 target before we’re able to do it,
deploy that into cities. – [Audience Member] So
do you have statistics on how many of these get hit by a car or a truck? How many get kicked into the lake by people? Do you have some statistics like that? (laughing) – Well, we firstly– – [Audience Member] I never
did kick one into the lake. (laughing) – Firstly–
– I do know of someone. – Well, we’ve never had
any robots in the lake yet, so I’m really thankful for that. We do have minor bumps and scratches that happen with robots. We don’t necessarily
have statistics for that, as we don’t track that very closely. We get some emails from time to time from really angry people. But then, like, we bring
them over to our office, we show them what we’re doing, like, explain things to them, give them a tour of the building, we give them a plushie,
and they walk away happy. (laughing) We haven’t had any issues with that. Berkeley’s really welcoming
of what we’re building. It’s actually the very
first time in the world that we’ve seen a community
that adopts a robot. Like, if you walk around Berkeley, you’ll see people helping out the robots: if a robot gets stuck in a pothole or falls off the side of the sidewalk, somebody will come rescue it. (laughing) So it’s truly incredible to
see the love in the community. – [Audience Member] I’m
wondering, do certain areas ever get too crowded
at the lunch hour, say, for the robot to move? ‘Cause they’re not as agile as people. – Yeah. And we’re actually solving this right now. We’re making the robot far more agile. So we have a new version
that’s coming out really soon. And I’m excited to
share the news with you, but I can’t quite yet. But it’ll actually solve
this problem specifically. Yes. – [Audience Member] Have
you had any challenges from government entities and meeting different municipal codes? – So at first, yes. Because when we started
rolling around Berkeley, we had no permit, we had no permission. But we actually built up a relationship, and now we have a really good relationship with the university. They invest in us through SkyDeck. So that’s going great at Berkeley. And we actually, when we
initially went to other campuses, we did a trial at UCLA,
a trial at Stanford. And what we found is that actually, it wasn’t a municipal code or any regulation that was the challenge, it was communication. We needed to communicate clearly our plans and what we want to do. So once we had that on paper, once the local authorities can say, oh okay, we actually
understand what they’re doing, then they’re happy with it. Another example was San Jose. So San Jose was, trying to
think of the right word, overtaken by scooters at one point. And there were like,
several scooter companies that just like, dumped
their scooters everywhere to the point where San
Jose had to hire a person, like, they had to hire extra resources to manage all these scooters. And they were really frustrated because nobody approached them telling them what’s gonna happen, what the scooters are,
how they would be used. And when Kiwi approached
them, they were like, oh, hey, that’s so bizarre. We didn’t expect a company
like yours to approach us, and we’re really excited to work with you because now we can actually
build something together instead of like, trying
to oppose each other. So for us, it’s been a more
pragmatic approach now. Yes. – [Audience Member] So how do you handle, I guess, like, obstructions that occur that are not mapped? Does the robot send some information back that other robots can then pull down to understand that there’s an
obstruction on the sidewalk? – Yeah, that’s a good question. I think you might be over-imagining how automated this robot is. (laughing) For the most part, it is just people who are sitting behind the laptop and setting waypoints
for the robot to follow. So they typically see a
video stream of the robot, and an overlay of sensor data. So they can see, like,
if there are any obstacles, any people, if it has a risk of, like, colliding with something on the side or if it’s about to fall off. So it’s just, like, an augmented video feed where they click for the robot to go to. So the supervisor would actually see if there was an obstruction, and the robot would navigate around it.
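As a sketch of what such an augmented feed could look like (assuming detections arrive from some upstream detector; this is not Kiwi’s actual supervisor tooling), boxes and labels can be drawn onto each camera frame before it is streamed:

    # Rough sketch of an "augmented video feed" overlay; the detection format
    # is assumed, and this is not Kiwi's actual supervisor tooling.
    import cv2

    def draw_overlay(frame, detections):
        """Draw labeled boxes (person, car, bike, ...) onto a camera frame."""
        for det in detections:  # e.g. {"label": "person", "box": (10, 20, 80, 200)}
            x1, y1, x2, y2 = det["box"]
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, det["label"], (x1, max(y1 - 5, 10)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
        return frame
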
– [Audience Member] So I saw near the Starbucks here,
following people annoyingly, and the person would get into Starbucks and then it would start
following another one. So somebody was doing that. (laughing) – That’s pretty funny. Yeah, I haven’t heard that before, but that’s actually pretty funny. Yes. – [Audience Member] How many
people does each controller, how many robots does
each controller control? – Typically three. But we’re pushing that up. We’re expecting to have more
as we improve the automation, as we improve how smoothly they perform. So we definitely expect to
have more and more robots supervised by each person. Some questions here. Yes. – [Audience Member] I think
you answered this question, but is it, how is it navigated? There’s no campus map that it’s following, or is it all related to
those waypoints that are set? – Pretty much, yeah. We actually have a map,
as I explained earlier, with the latency and the dangerous areas. But that’s just more of, like, a pointer. So the supervisors can see the map and they can see the robot on the map, but ultimately, they make the decisions. We do have sensors onboard the robot so it can detect whether
it’s about to fall off. We also use object detection, so we detect people, cars, bikes, and all sorts of different objects. So it’s quite intelligent in that respect. And it’s a very pragmatic approach. ‘Cause at first you’re like, oh okay, we’re going to build a neural network that does image segmentation and plots paths, and we can do all this routing. And it turns out it’s possible, but it’s actually only 99% reliable. So what ends up happening is that once you’re crossing 10,000 intersections a week, as we were doing last year, and now we’re doing more, you still have, like, 100 collisions or 100 problems, because 1% of 10,000 is 100. So 99% accuracy isn’t enough; we have to get much closer to perfect for a commercial service. Yes, in the back. – [Audience Member] So
today is food delivery. What’s the future? – Great question. I mean, you can imagine a lot of things. I think long-term, what we’ll see is probably, like, an API. Today we have an internet that allows us to communicate
to anybody in the world, but it’s limited to bits. You know, you can send information. But what if you wanna send atoms? What we’re building is
the internet of atoms, the physical layer for a city. And this food delivery robot
is just the first step. Yes. – [Audience Member] Have the communities that you’re working in,
the campuses and cities, have they ever asked you for data back? For example, for the busiest streets, the busiest walkways, the
easiest paths to take? It seems like city
planners could use that. It seems that people who are interested in, like, helping the disabled,
like, people in wheelchairs, could somehow benefit from all that data. – Yeah. Surprisingly, we haven’t
had too many governments or officials reach out to
us for this kind of data. With our advisors, we speak
about what kind of data we have available. Probably the most interesting data set is cell coverage and latency maps. We were strongly advised
against showing that because that could endanger our relationship with our carriers. ‘Cause I mean, like, all
of these governments, where are they getting this data from? It’s like, oh, it’s Kiwi’s! Wait, we’re the ones providing Kiwi’s service, so why should we keep working with them? So it’s kind of, like, a tricky game you have to figure out. But we haven’t been directly approached for the kind of data
you were talking about. Any more questions? Yes, sorry. – [Audience Member] How
do you actually know when to cross the road? – That’s a good question.
(laughing) – [Audience Member] Traffic signals? – Yeah. So when we first started the robot, we built a machine learning algorithm that’d try to detect the traffic light. And we did not quite
get it accurate enough; even though we were able to detect the color of the signal most of the time, it was not reliable enough for us. So now instead what we do is we detect the traffic
light, which is super accurate. We can 100% detect the traffic light. And then the person actually
judges what color it is. So it’s, like, super zoomed in and they can actually see it.
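A minimal sketch of the crop-and-zoom step just described, assuming a traffic-light bounding box from some detector; the function and parameters are illustrative, not Kiwi’s code:

    # Rough sketch: crop the detected traffic-light box and zoom it for the
    # supervisor, who then judges the color. The detection source is assumed.
    import cv2

    def zoomed_light(frame, box, zoom=4):
        """Return an enlarged crop of the traffic-light region of the frame."""
        x1, y1, x2, y2 = box  # bounding box (pixels) from the detector
        crop = frame[y1:y2, x1:x2]
        return cv2.resize(crop, None, fx=zoom, fy=zoom,
                          interpolation=cv2.INTER_NEAREST)
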
– [Audience Member] So it’s real-time, and the supervisor is determining when to cross the road? – Yeah. Yes. – [Audience Member] So,
two linked questions. How large is your engineering team? And when you were talking
about vendor selection to run cloud services, I believe you said that it’s
up to just the developer and what they know or wanna do. Is that true, and is that
your longer-term plan? Or where do you see it
evolving as you scale? – Some great questions. So to answer the first one,
we have 48 people at Kiwi. Our engineering team, well, okay. So we have, like, an AI and robotics team, which is, like, seven-ish people. We have, like, a backend team,
which is five-ish people. We have a mobile apps team
which is a couple more people, and a product team which is five people. So something like that. (laughs) That’s our engineering team. In terms of choosing cloud providers, it’s more about, like, how
quickly can we finish a product, how quickly can we get
something out the door? So developers, they’re
typically saying like, okay, how can I get this
out as quickly as possible, so they’re just going
for the easiest solution. For us, it’s very important to try things. So for us, it’s more
important to try something and see whether it fails rather than to build a perfect solution. For some systems, we’re
actually starting to see, okay, we’ve done this a lot, we know that we need to use this system. So some of the early
systems that we built, we’re rebuilding them. And we’re trying to
standardize around Heroku, which is running on AWS, because the deployment and the management and the metrics are super
straightforward, super simple. It’s a little on the expensive side, but for now it’s serving
our needs pretty well. So most of our backend
code is running on Heroku. Some things run elsewhere; for example, we have our OSRM server
that’s running on AWS because we couldn’t figure
out how to run it on Heroku. So yeah.
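For context, a self-hosted OSRM instance exposes a plain HTTP routing API, so a backend can query it roughly like this; the host name and coordinates below are made up for illustration:

    # Rough sketch of querying a self-hosted OSRM routing server over HTTP.
    # The host name and the coordinates are made up for illustration.
    import requests

    OSRM = "http://osrm.internal.example:5000"  # hypothetical internal server
    start = (-122.2585, 37.8719)  # lon, lat near the UC Berkeley campus
    end = (-122.2660, 37.8690)

    url = f"{OSRM}/route/v1/foot/{start[0]},{start[1]};{end[0]},{end[1]}"
    resp = requests.get(url, params={"overview": "false"}, timeout=5)
    route = resp.json()["routes"][0]
    print(route["distance"], "m,", route["duration"], "s")
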
I think I have a question in the back there somewhere. Yes. – [Audience Member] You said
you started with an Arduino, and then you moved up to the Jetson. I just have a question, does the robot push the
limits of the Jetson hardware, or is there still more headroom for the robot to improve? – Yeah, I love that question. And we definitely pushed
the limits of the robot. We develop a lot of features
that marginally improve the performance of the robot, but we actually cannot include them in final releases for the robot because they use too much CPU. So we definitely do hit the limits. And I think the next question
is, is our code optimized? And the answer is definitely no. So I think the reason
why we’ve hit the limits is we’re not optimizing our code. But I think that’s a really
good constraint to have, because we’re actually able to ship units of functioning code that is able to run on a mobile device. So that’s probably the
most important part. Yes. – [Audience Member] So
these have cameras on ’em, and you guys are retrieving
video data from ’em. Do you store that data? – That’s a great question, and actually a question we get a lot. So we do not store the raw data anywhere; however, we do process it. And we also do stream
it for the supervisors who are monitoring the robots. Yes. – [Audience Member] Do
they have a microphone? – Yep. (laughing) – That’s the last
question, we can wrap up. It’s actually 6:30. Feel free to mingle afterwards. But maybe another hand for him. (clapping)

