By Curtis Miller, April 27, 2009

Velocity Labs believes strongly in the power of cloud computing. Hardware that is allocated dynamically and billed on a pay-for-what-you-need basis makes it much easier to help clients manage and provision their clusters. The main advantage is that the hardware can grow, or shrink, as the needs of the application change.

Because of the dynamic nature of cloud computing, we don't need a guaranteed answer on hardware requirements up front. However, a client may want a ballpark figure in order to set aside the right amount of budget or let investors know the estimated operational cost. You could crunch the numbers yourself, but why would you do that when we've already automated the process for you?

Determining optimal cost

We've constructed a very basic model that minimizes the cost of Amazon EC2 hardware resources while satisfying a minimum number of EC2 Compute Units and a given amount of RAM per process.

The technique uses linear programming and the GNU Linear Programming Kit (GLPK). Note: I'm a math geek who likes linear modeling, so if you're unfamiliar with either, I'd be happy to chat with you about them over lunch.

Installation

First, install GLPK. On Ubuntu, execute the command:

sudo aptitude install glpk

On Mac OS X, execute the command:

sudo port install glpk

Next, create the following as cloud_cost.txt.

## Amazon EC2 Cloud
## Optimizing cost per processor

# The set of instance types
set InstanceTypes;

# The costs per instance
param InstanceCosts{a in InstanceTypes};

# The compute units per instance
param InstanceCU{b in InstanceTypes};

# The amount of RAM per instance
param InstanceRAM{b in InstanceTypes};

# The number of compute units needed
param unitsNeeded;

# The amount of RAM required per instance of the application
param ramRequiredPerAppInstance;

# The quantity of each instance to purchase
var InstanceQuantity{q in InstanceTypes}, integer, >= 0;

# The objective function to minimize: monthly cost (hourly cost * 720 hours)
minimize cost: sum{i in InstanceTypes} InstanceCosts[i] * InstanceQuantity[i] * 720;

# Minimum total compute unit constraint
s.t.  supply: sum{d in InstanceTypes} InstanceCU[d] * InstanceQuantity[d] >= unitsNeeded;

# RAM constraint: an instance type may only be purchased if it has enough RAM
# to run one application process per compute unit
s.t.  ramrequired{d in InstanceTypes}: InstanceCU[d] * ramRequiredPerAppInstance * InstanceQuantity[d] <= InstanceRAM[d] * InstanceQuantity[d];

solve;

display{i in InstanceTypes}: InstanceQuantity[i];

data;

set InstanceTypes := Small Large XLarge HCPULarge HCPUXLarge;

param InstanceCosts := Small 0.1
                       Large 0.4
                       XLarge 0.8
                       HCPULarge 0.2
                       HCPUXLarge 0.8;

param InstanceCU := Small 1
                    Large 4
                    XLarge 8
                    HCPULarge 5
                    HCPUXLarge 20;

param InstanceRAM := Small 1700
                    Large 7500
                    XLarge 15000
                    HCPULarge 1700
                    HCPUXLarge 7000;

# The number of compute units our cluster will need
param unitsNeeded := 500;

# The amount of RAM each process requires
param ramRequiredPerAppInstance := 125;

end;
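
For readers who prefer notation, the model above can be written as the following integer program. The symbols are labels introduced here for illustration, not names from the model file: $x_t$ stands for InstanceQuantity, $c_t$ for InstanceCosts, $u_t$ for InstanceCU, $r_t$ for InstanceRAM, $U$ for unitsNeeded, $m$ for ramRequiredPerAppInstance, and $T$ for the set InstanceTypes.

```latex
\begin{aligned}
\min_{x} \quad & 720 \sum_{t \in T} c_t \, x_t \\
\text{s.t.} \quad & \sum_{t \in T} u_t \, x_t \ge U \\
& u_t \, m \, x_t \le r_t \, x_t \quad \forall t \in T \\
& x_t \in \mathbb{Z}_{\ge 0} \quad \forall t \in T
\end{aligned}
```

Read this way, the RAM constraint forces $x_t = 0$ for any instance type where $u_t m > r_t$ and places no restriction on the others.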

Computing the cost

The model requires you to specify two parameters: the total number of EC2 Compute Units and the amount of RAM each process needs. Both are set near the bottom of the file with param unitsNeeded and param ramRequiredPerAppInstance respectively. Change these params to reflect your particular situation. Note: A future article will explore capacity planning in more detail.
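
One way to sanity-check new parameter values before running the solver is to price out fleets built from a single instance type. The Python sketch below is a rough illustration and not part of the original model: the function name `single_type_estimates` and the cents-per-hour representation are assumptions made here, and the data is copied from the model's data section. It applies the same RAM test as the ramrequired constraint, then rounds up the number of instances needed. Because it ignores mixed fleets, it only gives an upper bound on the cost the solver would find.

```python
from math import ceil

# Data copied from the model's data section.
# Prices are kept in cents per hour so the arithmetic stays in integers.
INSTANCES = {
    # name: (cents per hour, compute units, RAM in MB)
    "Small": (10, 1, 1700),
    "Large": (40, 4, 7500),
    "XLarge": (80, 8, 15000),
    "HCPULarge": (20, 5, 1700),
    "HCPUXLarge": (80, 20, 7000),
}
HOURS_PER_MONTH = 720  # same factor the model's objective uses


def single_type_estimates(units_needed, ram_per_process):
    """Return {type: (instance count, monthly cost in dollars)} for fleets
    built from one instance type, skipping types with too little RAM."""
    estimates = {}
    for name, (cents, compute_units, ram) in INSTANCES.items():
        # Mirrors the ramrequired constraint: CU * RAM per process <= RAM.
        if compute_units * ram_per_process > ram:
            continue
        count = ceil(units_needed / compute_units)
        estimates[name] = (count, count * cents * HOURS_PER_MONTH / 100)
    return estimates


if __name__ == "__main__":
    for name, (count, dollars) in single_type_estimates(500, 125).items():
        print(f"{name}: {count} instances, ${dollars:,.0f}/month")
```

With the article's values (500 compute units, 125 MB per process), both high-CPU types come out at $14,400/month, which matches the objective value reported in the results section. Raising the RAM requirement to 400 MB per process would rule out both high-CPU types under this check.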

When you're ready, execute the solver using the following command:

glpsol --model cloud_cost.txt --output result.txt

Analyzing the results

The solver writes its results to a file called result.txt. Assuming 500 EC2 Compute Units with 125 MB of RAM per process, the file will look something like the following:

Problem:  cloud_cost
Rows:    7
Columns:  5 (5 integer, 0 binary)
Non-zeros: 15
Status:   INTEGER OPTIMAL
Objective: cost = 14400 (MINimum)

 1 InstanceQuantity[Small]
                    *              0             0
 2 InstanceQuantity[Large]
                    *              0             0
 3 InstanceQuantity[XLarge]
                    *              0             0
 4 InstanceQuantity[HCPULarge]
                    *              0             0
 5 InstanceQuantity[HCPUXLarge]
                    *             25             0

The objective function was cost, so the optimal arrangement of hardware that provides that computational power costs $14,400/month (the model multiplies hourly prices by 720, the number of hours in a 30-day month). The second column of each entry in the hardware arrangement is the number of instances needed. In this case we need 25 high-CPU, extra-large instances.
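
If you want to read the quantities programmatically, a small parser is enough. The Python sketch below is an illustration under stated assumptions: it expects the layout shown in the listing above (a column name, then a line starting with `*` followed by the quantity), and the names `parse_quantities` and `monthly_cost_dollars` are hypothetical helpers introduced here. It also recomputes the monthly cost from the parsed quantities as a cross-check on the objective value.

```python
import re

# Excerpt of result.txt in the layout shown in the article.
SAMPLE = """\
 1 InstanceQuantity[Small]
                    *              0             0
 2 InstanceQuantity[Large]
                    *              0             0
 3 InstanceQuantity[XLarge]
                    *              0             0
 4 InstanceQuantity[HCPULarge]
                    *              0             0
 5 InstanceQuantity[HCPUXLarge]
                    *             25             0
"""

# Hourly prices in cents, copied from the model's data section.
HOURLY_CENTS = {"Small": 10, "Large": 40, "XLarge": 80,
                "HCPULarge": 20, "HCPUXLarge": 80}
NAME = re.compile(r"InstanceQuantity\[(\w+)\]")


def parse_quantities(text):
    """Map each instance type to the quantity listed after its '*' marker."""
    quantities = {}
    pending = None
    for line in text.splitlines():
        match = NAME.search(line)
        if match:
            pending = match.group(1)
            line = line[match.end():]  # values may follow on the same line
        fields = line.split()
        if pending is not None and len(fields) >= 2 and fields[0] == "*":
            quantities[pending] = int(fields[1])
            pending = None
    return quantities


def monthly_cost_dollars(quantities, hours=720):
    """Recompute the objective: quantity * hourly price * hours per month."""
    cents = sum(count * HOURLY_CENTS[name] for name, count in quantities.items())
    return cents * hours / 100


if __name__ == "__main__":
    parsed = parse_quantities(SAMPLE)
    print(parsed, monthly_cost_dollars(parsed))
```

The parser only looks at lines that follow an InstanceQuantity name, so header lines such as Problem and Status are ignored.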

Did you find this useful? Let us know in the comments!

Curtis Miller

Managing Partner

Startup junkie, Rubyist and gamer. Loves to brainstorm about new ideas.
