Ryan Marcus

Photo Credit: Simon Goodacre

May 9, 2019

Simon Goodacre | Graduate School of Arts and Sciences

The primary reason that organizations are moving to cloud computing is elasticity: the ability to engage more servers during periods of high usage. However, the increasing complexity of clouds creates challenges associated with scaling up the architecture of existing data systems. Ryan Marcus, PhD'19, discusses how machine learning can address these issues.

Transcript

Simon:

Hello, and welcome to the Highlights Podcast. I'm Simon Goodacre, the Assistant Director of Communications and Marketing for the Graduate School of Arts and Sciences at Brandeis University. Today I'll be speaking with Ryan Marcus, who recently completed his dissertation defense in Computer Science. Ryan, welcome to the podcast. Thanks for making time. 

Ryan:

Thanks, Simon. Happy to be here.

Simon:

I would like to ask you today a little bit about your dissertation and also just your general experience at Brandeis. But to start off with, I thought, we better give you a chance to introduce yourself and how you came to Brandeis in the first place.

Ryan:

Sure. So I'm now a fifth-year PhD student. I just finished my defense. I came to Brandeis after I got an email from taking the GRE. I hadn't heard of the university before, but when I checked it out I was very surprised I hadn't heard of it. It is an excellent place with really amazing faculty, and it made my decision very easy when I was considering all the schools I could go to.

Simon:

Let's talk about your dissertation. As more and more organizations are moving towards cloud computing, you've identified many new challenges associated with the systems. Could you perhaps describe some of those challenges?

Ryan:

Sure. The main reason people are moving to the cloud is scalability. So, if you're eBay.com, you're going to get a lot more users around Christmas and other gift-giving holidays than you're going to get at any other time. So, if you needed 100 servers to handle the amount of traffic that you were going to get at Christmas, before the cloud, you would have to buy 100 servers, and you would have to pay to run all 100 of those servers year-round, even if normally, on an average day, like May 8, you only need 50 servers. The appeal and the advantage of the cloud is that you can pay for 50 servers when you need 50 servers, and then suddenly on that one day when you need 100, you can scale up, and we call that elasticity.

The problems and the challenges that organizations encounter are that even though we can do this really cool thing, and we can reduce our costs, all the data systems we have designed assume that you have a constant cluster size. So if we want to go from 50 nodes to 100 nodes, we have to re-architect every component of eBay.com to be able to do that kind of scaling automatically, and those challenges are huge. They range from maintaining consistency to even deciding when to scale up. How do we know when we're going to get a big spike in workload, for example?
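
Editor's note: the arithmetic behind the 50-versus-100-server example can be sketched in a few lines of Python. This is only an illustration, not anything from Ryan's dissertation. It assumes a single peak day per year, measures cost in "server-days" at one flat price, and ignores real cloud pricing details; the function names are hypothetical.

```python
def fixed_server_days(peak_servers: int, days: int = 365) -> int:
    """Server-days paid for when provisioning for the peak all year."""
    return peak_servers * days


def elastic_server_days(baseline_servers: int, peak_servers: int,
                        peak_days: int, days: int = 365) -> int:
    """Server-days paid for when scaling up only on peak days.

    Assumption: demand is exactly baseline_servers on normal days and
    exactly peak_servers on peak days.
    """
    normal_days = days - peak_days
    return baseline_servers * normal_days + peak_servers * peak_days


# Numbers from the interview: 100 servers at the peak, 50 on an average day.
fixed = fixed_server_days(100)
elastic = elastic_server_days(50, 100, peak_days=1)
savings = 1 - elastic / fixed

print(f"fixed: {fixed} server-days, elastic: {elastic} server-days")
print(f"fraction saved by elasticity: {savings:.3f}")
```

Under these simplified assumptions the elastic setup pays for roughly half as many server-days, which is the cost advantage described above. The hard part, as Ryan notes, is making the data systems themselves cope with the change in cluster size.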

Simon:

And you argue that machine learning techniques can assist with these challenges? Could you explain how that would work?

Ryan:

Clouds are getting more and more complex every day. So, maybe with Amazon cloud version 1.0, it was possible for a human being to sit down and look at all of the options and figure out what would be the most appropriate thing to do for a particular organization. But clouds are getting more complex every day. Just last month, Amazon practically doubled the number of instance types, which is one measure of complexity that you can use inside of that cloud. And so, as the number of nodes and configurations and options goes from dozens to hundreds to thousands, a human being can't really fit it all in their head anymore. So instead of asking human beings to fit it all in their head, our argument is that we should instead try to use machine learning techniques that can help organizations sort of figure out what subset of cloud features they might need without having to be overwhelmed by all the options.

Simon:

And how did you come to select this topic?

Ryan:

When I initially came to Brandeis, I really thought I knew what I wanted to do, and I had a very specific idea. And after I selected an advisor and talked to them for a little bit, they essentially had to explain to me that that idea was not very good. And from the get-go, Olga, my advisor, was very kind and, in so many words, she told me that's not going to work. And then she sort of pointed me in the direction of cloud computing. And once we realized, kind of working through some of the problems, that we weren't going to be able to solve them from a human point of view, the machine learning naturally kind of came into the equation. So I was very lucky to have a good advisor who guided me towards sort of a hotspot in the area, and then we came up with a good solution.

Simon:

And do you anticipate continuing to work in this vein after Brandeis?

Ryan:

Yeah, so the field of applying machine learning techniques to systems problems, like the cloud, or database systems, or operating systems, or even things on your cell phone, is really, really blowing up, because the cloud isn't the only thing that's showing this pattern. Even your computer today has drastically more configurations and nodes and components than computers did 20 years ago. And so almost all of the software, all of the assumptions that we make about computer systems from 20 years ago, are starting to decay or not apply very well anymore. And because of this, the variance in those assumptions, the massive difference between what we could assume before and what we can assume now and what we're going to be able to assume in 20 years, means that machine learning can be applied much more rigorously over the next five or six years to a lot more applications.

Simon:

How would you describe the faculty at Brandeis?

Ryan:

Brandeis is an R1 research institution; they get money from the NSF, they do top-tier, cutting-edge research, but they're a lot smaller than a lot of larger universities. And so, when you come to do a PhD at Brandeis, you're probably going to be in a lab of one to four people. Whereas, if you do a PhD at one of those larger universities, you might be in a lab of 18 to 20 people. And because of that, the time that you're going to get with your advisor, the time that they're going to spend in a room, just with you, explaining and helping you progress as an academic, is a lot bigger here at Brandeis than in other places. And that's sort of the primary advantage that I think Brandeis has over these larger research institutions. You’re going to get a lot more one-on-one attention, and that one-on-one attention is going to let you grow faster and reach a higher peak by the time you graduate.

Simon:

And in speaking about that kind of attention, do you think that you have any particular mentors among the faculty on campus?

Ryan:

Yeah, so my advisor, Olga, is an obvious example. But the computer science department tends to work pretty closely together. I also did a good amount of work with a professor named Jim Storer and also Antonella DiLillo. And they were both very willing to sort of take me into a lab that was unrelated to what I was looking at, let me come to their meetings, learn about what they were doing, and even make a contribution. So everybody talks to everybody, everyone will look at your research, and they'll think to themselves, “Hey, that might solve this thing that I've been working on.” And then you'll get invited into that room. And then you'll do work with them. And you'll learn more, and it becomes a very collaborative environment.

There are not a lot of silos and everyone talks.

Simon:

And you've reached the end of the road now; you've defended your dissertation. So what are your plans post-Brandeis?

Ryan:

After Brandeis, I'm going to go work as a postdoc at MIT for a little bit. I'm excited to join the Computer Science and Artificial Intelligence Laboratory, or CSAIL, over there. We're continuing a lot of the work that we did at Brandeis, and I'm still going to be connected and involved with Brandeis. But it is very exciting to get to move just 10 miles and be with a completely different group.

Simon:

Okay, so my final question is: What advice do you have for students who are considering pursuing a graduate education in computer science?

Ryan:

For PhD students, I would say it's very easy to fall into the trap where you apply to a whole bunch of schools, and you get into a couple that might have a big name, but you don't know who your advisor is going to be. And they'll tell you, “Oh, come here, and you'll find an advisor eventually,” and that might be true. But the advisor is much, much, much more important than any other factor that might influence your decision. And so if you're looking for a PhD program, you should figure out: Who are my potential advisors at these schools? How big are their labs? Where do their students end up? And if you do that, you might find yourself coming to a different conclusion than you thought you would before.

Simon:

Well, thank you so much for making time for this today, Ryan, and to our listeners, I hope you'll join us next time on the Highlights Podcast.