Scale Contest

Win an iPad and tickets to RAMP
by solving a real-world problem

It’s time for you to test your skills. Work with Prezi’s engineering team on a real-life problem and get the chance to win an iPad and tickets to RAMP! Join the challenge and help us scale Prezi’s conversion systems!

Prezi is available on multiple platforms, so we have to convert some of the uploaded media files to other formats before we can display them on all devices. This conversion happens on virtual machines. At peak times, we need to launch more of these, but over the weekend we can stop most of them. Help us find the perfect algorithm that launches exactly the number we need!

This contest is for university students who want to test their knowledge of control and queueing theory on actual, real-life problems.


Your task

To compete in the contest, create a program in your favorite language which receives conversion requests on STDIN and issues commands to turn virtual machines on or off. We pay for these virtual machine instances by the hour, so it’s best to terminate them near the end of the billing hour. The goal is to have conversion jobs spend no more than five seconds in the queue while minimizing the overall cost incurred.

There are three queues:

export
contains jobs which result in downloadable zipped prezis.

url
these jobs download an image from a URL and insert it into a prezi.

general
all other conversion jobs (audio, video, PDF, PPT, etc.)

The format of the input:

                2012-12-14 21:35:12 237 general 9.134963
                

This means that at the given time, a job enters the general queue with the id 237. The job will take 9.134963 seconds.
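As an illustration, a log line in this format could be parsed as follows. This is a minimal sketch: the `Job` type and `parse_job` function are hypothetical names chosen for this example and are not part of the contest materials.

```python
from datetime import datetime
from typing import NamedTuple


class Job(NamedTuple):
    """One conversion request (hypothetical helper type)."""
    arrival: datetime  # when the job enters its queue
    job_id: str        # id exactly as it appears in the log
    queue: str         # "export", "url" or "general"
    duration: float    # processing time in seconds


def parse_job(line: str) -> Job:
    """Parse a line such as '2012-12-14 21:35:12 237 general 9.134963'."""
    date, time, job_id, queue, duration = line.split()
    arrival = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    return Job(arrival, job_id, queue, float(duration))
```

Calling `parse_job` on the sample line above yields a job in the general queue with id 237 and a duration of 9.134963 seconds.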

The logs used as input by contestants’ programs contain three weeks of actual data accumulated by Prezi’s conversion system. The first two weeks of logs are publicly available. It’s a good idea to separate this data into a development and a verification set. The former can be used while you develop your program, while the latter can act as a safeguard against depending too much on the particulars of the development set. We will use the logs from the third and final week to evaluate your submissions.
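One possible way to make the development/verification split is by date, for example one week each. The sketch below assumes every log line starts with a `YYYY-MM-DD` date as in the format shown earlier; the function name `split_by_date` and the choice of cutoff are illustrative only.

```python
from typing import Iterable, List, Tuple


def split_by_date(lines: Iterable[str], cutoff: str) -> Tuple[List[str], List[str]]:
    """Split log lines into (development, verification) sets.

    `cutoff` is a 'YYYY-MM-DD' string: lines dated before it go to the
    development set, the rest to the verification set. ISO dates sort
    correctly as plain strings, so no date parsing is needed.
    """
    development: List[str] = []
    verification: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        target = development if line[:10] < cutoff else verification
        target.append(line)
    return development, verification
```

Passing the first day of the second week as `cutoff` gives one week of data in each set.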

Your program should copy every line read from STDIN to STDOUT and then add all necessary launch or terminate commands:

                2012-12-14 21:35:12 237 general 9.134963
                2012-12-14 21:35:34 terminate url
                2012-12-14 21:35:42 launch general
                

Prezi provides an emulator application, which takes this output and simulates the entire conversion system, showing the evaluation of the control program upon completion. The terminate command always terminates the virtual machine instance assigned to the given queue that is closest to the end of its billing period. After issuing a launch command, you must account for a two-minute boot time before the new instance can process conversion jobs. At the start, the system contains no virtual machines.
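The overall shape of a control program might look like the sketch below: copy every input line to the output, then append any commands. The names `control` and `launch_once_per_queue` are hypothetical, and the policy shown (launch one machine the first time a queue is seen, never terminate) is only a placeholder, not a competitive strategy. Stamping each command with the time of the line that triggered it is also an assumption of this sketch.

```python
from typing import Callable, Iterable, Iterator, List


def control(lines: Iterable[str], decide: Callable[[str], List[str]]) -> Iterator[str]:
    """Echo each input line, then emit the commands `decide` returns for it."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        yield line
        timestamp = line[:19]  # "YYYY-MM-DD HH:MM:SS"
        for command in decide(line):
            # Assumption: commands carry the triggering line's timestamp.
            yield f"{timestamp} {command}"


def launch_once_per_queue() -> Callable[[str], List[str]]:
    """Placeholder policy: one launch per queue, on its first job."""
    seen = set()

    def decide(line: str) -> List[str]:
        queue = line.split()[3]  # fourth field is the queue name
        if queue in seen:
            return []
        seen.add(queue)
        return [f"launch {queue}"]

    return decide
```

To run it as a filter, iterate over `control(sys.stdin, launch_once_per_queue())` and print each result. A real policy would also have to account for the two-minute boot time and issue terminate commands.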

Although submissions will be evaluated based on a week of data, the first day does not count towards the evaluation. This grace period prevents your program from being penalized before it has a chance to reach the optimal virtual machine count.

Submissions can be written using any language or framework as long as they are easily run on a typical Linux or Mac OS X installation.

We judge entries based on the instance hours used by the logic: the lower the number, the better. If any job spends more than 5 seconds in the queue, you can't win. If multiple entries achieve the same lowest number of instance hours, we'll look at the code and announce the winner based on how clever and nice the solution is (this will be subjective on our part).

How to enter?

We accept submissions from both individuals and teams, but the first prize is a single iPad and two tickets to the RAMP conference, regardless of how many participants contributed to the solution.

Submission deadline: May 19, 2013

Download the evaluator and input logs here:

Please send submissions to: scale@prezi.com

RAMP Conf, July 11-12, 2013


FAQ

Q: Who can participate?

A: While we made this contest with university students in mind, anyone from anywhere is welcome to participate.

Q: How many jobs can a machine do at a time?

A: Just one (there's no concurrency).

Q: Is there any difference between the virtual machines in terms of performance?

A: Nope. While different machines are assigned to process different queues, they are otherwise identical.

Q: Does using a VM for 30 minutes cost the same as 59 minutes?

A: Yes, they are billed at the beginning of the hour.
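The billing rule can be sketched as a small calculation. The function names below are hypothetical, and the sketch assumes that an instance stopped exactly on an hour boundary is not charged for the following hour; check the emulator source for the exact behavior.

```python
import math


def billed_hours(uptime_seconds: float) -> int:
    """Hours charged for an instance: every started hour is paid in full."""
    # Assumption: stopping exactly on the boundary does not start a new hour.
    return max(1, math.ceil(uptime_seconds / 3600))


def seconds_left_in_billing_hour(uptime_seconds: float) -> float:
    """Already-paid time remaining in the current billing hour."""
    return 3600 - (uptime_seconds % 3600)
```

Under these assumptions, 30 minutes and 59 minutes both cost one instance hour, while 61 minutes costs two.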

Q: Can I use a database?

A: Yes, as long as it does not complicate the setup of your submission for us. For example, don’t expect the availability of an Oracle cluster, but using SQLite is fine.

Q: What is the format of submissions?

A: The source code and all additional assets required to execute your program should be packaged in a single zip file. Either send this file as an attachment or send a link to where we can download it to the submission address (see above). Alternatively, you can send a link to the GitHub repository hosting your code.

Q: What is the deadline for submissions?

A: May 19, 2013.

Q: Can I cheat?

A: You can certainly try, but since we’ll look at your source code, we’ll probably figure out what you’re trying to do and penalize you accordingly. For example, it’s a neat trick to buffer input and issue commands based on the “near future.” We’ll be on the lookout for tricks like that.

Q: How will we understand how the emulator works?

A: You will have access to the source code of the emulator. In addition, the emulator has a debug mode which displays the entire state of the system after each command.

Q: Can there be periods of extreme utilization in the input data?

A: Yes. Your algorithm should be prepared to handle situations where lots of conversions are started in a short timeframe.

Q: Can I decide which virtual machine executes a job?

A: No. Each of the three job queues has a set of associated virtual machines. The machine which executes the next task is randomly chosen from this set.

Q: If the assignment of jobs to virtual machine instances is random, do I need luck to win?

A: Not really. Each submission will be evaluated 9 times. The final score will be the median value of these results. If during one of the 9 runs a job waits in the queue for more than 5 seconds, then the result of that run will be invalid, and its calculated cost will be replaced by a humorously large number.
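The scoring rule described in this answer can be sketched as follows. The names are hypothetical, and `PENALTY_COST` is a stand-in value: the actual "humorously large number" used by the evaluator is not stated here.

```python
from statistics import median
from typing import Iterable, Tuple

MAX_QUEUE_SECONDS = 5.0
PENALTY_COST = 10**9  # hypothetical stand-in for the evaluator's large number


def final_score(runs: Iterable[Tuple[float, float]]) -> float:
    """Median cost over all runs.

    Each run is (instance_hours, longest_queue_wait_seconds). A run in which
    any job waited more than 5 seconds is invalid and counts as PENALTY_COST.
    """
    costs = [
        hours if longest_wait <= MAX_QUEUE_SECONDS else PENALTY_COST
        for hours, longest_wait in runs
    ]
    return median(costs)
```

Because the median is used, up to four invalid runs out of nine still leave the final score equal to the cost of a valid run, while five or more invalid runs push it to the penalty value.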

Q: I found an error in the contest description / I’m stuck and need some tips on how to solve a problem.

A: Contact us at scale@prezi.com.


Updates

  • 2013-04-27 Log entries are now sorted by date (thank you Marton Sereg for letting us know about the unsorted logs).