# Dynamic programming for fun and profit

April 20, 2019Imagine that you decide to make an investment. You have 1720 currency units (CU) at your disposal and the investment options that are available are:

Option# | min amount | max amount | interest | sign up bonus |
---|---|---|---|---|

1 | 50 | 99 | 1% | 0.5 |

2 | 100 | 299 | 1.1% | 1 |

3 | 300 | 499 | 1.6% | 2.2 |

4 | 500 | 999 | 1.8% | 2.5 |

5 | 1000 | 1999 | 2% | 4 |

You may use multiple investment options, each one possibly multiple times. Each option accepts a minimum and maximum amount of CUs, and provides an interest on that amount plus an additional bonus amount which does not depend on the invested amount. How will you invest your money? Take your time and continue reading when ready...

Let see how you did! If you invested all 1720 CU in option 5 then... you chose
poorly. Your profit is going to be 38.4 CU — not bad, but you can do better. If
you invested 1020 in option 5 and 700 in option 4, that's a bit better at 39.5
CU, but still not optimal. What about 1020 in option 5, 500 in option 4 and 200
in option 2? That's actually a bit worse at 39.1 CU! I'll spare you the agony
and reveal the best solution: 1020 in option 1, 300 in option 3 *twice*, and
100 in option 2, giving a profit of 40.5 CU. This is about 5.5% better than the
straightforward solution of assigning everything to option 5.

As this short exercise has hopefully exhibited, finding the optimal portofolio of options is far from trivial. Is there a way to programmatically find the optimal solution?

## The road to dynamic programming

First, let's define what are trying to find. A portofolio is a multiset of pairs, each pair representing that we are assigning CU to option . For this solution to be acceptable, all the pairs need to be valid and the total sum of all pairs must not exceed the total amount of money we want to use. We are trying to find the portofolio which provides the maximum profit.

### Brute force

To understand the solution space a bit more let's devise a way to enumerate
*all* the portofolios, which is in fact not trivial.

One way to think about this is that to produce a portofolio we first have to split our total amount in some way, and assign an option to each element in the split. Each such split is formally called a partition of . The number of partitions for a number is given by the partition function which becomes spectacular very quickly. For each such partition we also need to factor in all the different acceptable option permutations.

Another way to think about this, which turns out to be more helpful, is with a recursive, constructive view of the problem. Let be the set of portfolios for total CU. From this set, the subset of porfolios that use option with CU consists of all portofolios with total CU combined with the option . Expressing this with set notation we have ( is the multiset sum operator, which in our case is used to add items to the multiset):

The whole set is the union of all sets for all valid combinations of and :

Using this recursive set description of the solution space we can write a
program to enumerate all the solutions for a particular amount and set of
options. This description provides a straightforward, albeit rather
inefficient, way to solve this problem: by brute force — evaluate *all*
solutions and use the best. In pseudocode this would look like:

solve_brute_force(n): solutions = {} for i in [1..n]: for o in OPTIONS: if o accepts i: for s in (solve_brute_force(n-i) ∪ {∅}): solutions = solutions ∪ (s ⊎ {(o,k)}) return solutions solve_driver(n): return argmax(solve_brute_force(n), profit)

### Optimal substructure

Let's see if this problem has any interesting properties we can take advantage of to find the optimal solution more efficiently.

If we look at we see that we are combining with all portofolios for , and we ultimately check all these combinations to find the best solution. This is wasteful, however, since the choice of doesn't affect the profit of any of the members of . To put it more formally, the profit function is an additive multiset function:

We can thus get the best total value by using one of the members of the
set with the highest profit. Any other choice from that set would yield a lower
total profit. We have thus shown that our problem exhibits the *optimal
substructure* property: an optimal solution includes optimal solutions to
subproblems.

Armed with this revelation we can now produce a much more efficient formulation:

This leads us to the following pseudocode:

solve_optimal_substructure(n): solutions = {} for i in [1..n]: for o in OPTIONS: if o accepts i: solutions = solutions ∪ (solve_optimal_substructure(n-i) ⊎ {(o,k)}) return argmax(solutions, profit) solve_driver(n): return solve_optimal_substructure(n)

### Overlapping subproblems

Another interesting thing to note is that to solve for
we need to solve
for
. Similarly, to solve for
we need to solve
for
and so on. We are solving the same problems over and
over again! Our problem thus exhibits the property of *overlapping subproblems*.
We can take advantage of this property by storing results in a cache and
reusing them instead of recalculating them.

Updating our original pseudocode to use a cache we have:

solve_overlapping_subproblems(n): if n in CACHE: return CACHE[n] solutions = {} for i in [1..n]: for o in OPTIONS: if o accepts i: for s in (solve_overlapping_subproblems(n-i) ∪ {∅}): solutions = solutions ∪ (s ⊎ {(o,k)}) CACHE[n] = solutions return CACHE[n] solve_driver(n): return argmax(solve_overlapping_subproblems(n), profit)

### Dynamic programming

If we combine both optimizations, taking advantage of both the optimal substructure property and the overlapping subproblems property, we reach the dynamic programming solution:

solve_dp_recursive(n): if n in CACHE: return CACHE[n] solutions = {} for i in [1..n]: for o in OPTIONS: if o accepts i: solutions = solutions ∪ (solve_dp_recursive(n-i) ⊎ {(o,k)}) CACHE[n] = argmax(solutions, profit) return CACHE[n] solve_driver(n): return solve_dp_recursive(n)

Alternatively, we can express this without recursion:

solve_dp_iterative(n): solutions = {} for i in [1..n]: for o in OPTIONS: if o accepts i: solutions = solutions ∪ (CACHE[n-i] ⊎ {(o,k)}) return argmax(solutions, profit) solve_driver(n): for i in [1..n]: CACHE[i] = solve_dp_iterative(i) return CACHE[n]

## Scaling down

Instances of this problem with a large can still be prohibitive to solve even when using the dynamic programming approach I just described. Not all is lost, however, if we are willing to relax our requirement for a guaranteed optimal solution.

One approach to cutting down our solution time, while still producing a
possibly good solution, is to scale down our problem. We do this by dividing
the start, end and bonus values of all options, and also the target number for
which we solve for by a particular *scale* factor. Another way to view this is
that scaling down by *scale* corresponds to performing the algorithms with a
minimum step of value *scale* instead of value 1.

For example if we scaled down our original problem by 10 we would solve for 172 (= 1720 / 10) and use the following scaled down options:

Option# | min amount | max amount | interest | sign up bonus |
---|---|---|---|---|

1 | 5 | 9 | 1% | 0.05 |

2 | 10 | 29 | 1.1% | 0.1 |

3 | 30 | 49 | 1.6% | 0.22 |

4 | 50 | 99 | 1.8% | 0.25 |

5 | 100 | 199 | 2% | 0.4 |

When we find the optimal solution we can scale it back up using the same
*scale* factor.

The expectation is that the optimal solution for the scaled down version of our
problem is going to be in the proximity of the optimal solution for the
original version, and assuming that the *Profit* function is generally
smooth, the profit is also going to be near the optimal one.

## Getting greedy

Another approach to solving this problem faster is to explore whether there is greedy solution for it. Below I will describe an approach that works in many cases, but is not guaranteed to provide the optimal solution.

Taking inspiration from the greedy solution to the fractional knapsack problem,
at each step we greedily select to put as many CUs as possible in the option
with the best marginal profit. The key observation here is that in our case
each option has *two* different marginal profits. The first one involves
investing in a new instance of an option using its minimum acceptable amount.
The second one involves investing more in an instance of an option we have
already invested in. In the first case the sign up bonus kicks in and increases
the marginal profit. In the second case the marginal profit is simply the interest.

For our original scenario we have:

Option# | min amount | max amount | min marginal profit | add marginal profit |
---|---|---|---|---|

1 | 50 | 99 | 2% | 1% |

2 | 100 | 299 | 2.1% | 1.1% |

3 | 300 | 499 | 2.3333% | 1.6% |

4 | 500 | 999 | 2.3% | 1.8% |

5 | 1000 | 1999 | 2.4% | 2% |

For a total amount of 1720 this method works flawlessly. We first select to add to our portofolio for a marginal profit of 2.4%. From the remaining 720 we add for a marginal profit of 2.333%. From the remaining 420 we again choose . We now have 120 left, for which we choose , and the final 20 we add to the instance we already have.

Unfortunately, this method doesn't find the optimal solution in all cases. Take for example a total amount of 500. The greedy algorithm chooses , , for a total profit of 11.2. Alas, the optimal solution is simply for a total profit of 11.4.

Perhaps there is some other greedy approach that provides optimal solutions in all cases. Still, the described greedy approach is useful if we want to quickly get a possibly good solution. We could also try both the greedy approach and the scaling approach to raise our confidence level in the generated solution.

## Can we do better?

We have mostly focused on solving our investment problem using a dynamic programming approach. Taking a step back, we may ask if this is indeed the best way to get an optimal solution. Can we approach this problem from a completely different angle to improve performance?

It turns out that the answer is yes! With some creativity our problem can be expressed as an integer linear program (ILP), for which we have algorithms that work quite efficiently in practice. I plan to present this approach in more detail in a future post.

## An ounce of action...

If you want to try out the ideas mentioned in this post, I have created a program implementing all the algorithms and variations. You can find it at: https://github.com/afrantzis/invest.

Enjoy!