For you gurus out there, I stumbled upon this interview question online:

Write a function that given a string of digits and a target value, prints where to put +'s and *'s between the digits so they combine exactly to the target value. Note there may be more than one answer, it doesn't matter which one you print.


"1231231234",11353 -> "12*3+1+23*123*4"

"3456237490",1185 -> "3*4*56+2+3*7+490"

"3456237490",9191 -> "no solution"

Let's say the string is 100 digits!


Note: this was written before the "100 digits" part was added to the question. I'm not going to attempt to work out a non-brute-force solution.

I would brute force it. Basically we can think of any potential solution as a number in base 3, with as many trits (trinary digits) as there are digits in the original string, -1. Each trits could be:

0 : no symbol
1 : +
2 : *

So for the example of "3456237490" and a trinary number of 100212001 the result would be 3+456*2+3*749+0.

Then you've just got to run through all the possibilities and evaluate each one. It's fiddly, but not hard. (In an interview I'd then ask if the interviewer wanted me to actually code it up, and if so whether to code the whole thing or just part of it. I'm happy to code it up in C# if anyone wants...)

There may be more intelligent ways of doing it, but as a friend of mine says: "If brute force isn't working for you, you're not using enough of it."


Dynamic programming solution (O(len(s) * n**3))

def place_op(s, n):
    Given a string of digits s and a nonnegative integer n, return a
    string t of times, plus and digits with s as a subsequence that
    evaluates to n.  Runs in time O(len(s) * n**3).
    >>> eval(place_op('1231231234', 11353))
    >>> eval(place_op('3456237490', 1185))
    >>> place_op('3456237490', 9191)  # returns None, for no solution
    # What follows is a dynamic programming algorithm.  Each partial
    # solution can be described by a partial sum, a partial product, and
    # a partial integer.  For 1+2*3+4*5*67, the partial sum is 1+2*3 ==
    # 7, the partial product is 4*5 == 20, and the partial integer is
    # 67.  Given a new digit d, we can extend this string by d, *d, or
    # +d and update the partial sum, product, and integer appropriately.
    # All values are capped at n + 1 so that this remains an O(len(s) *
    # n**3) algorithm.  This is a valid optimization because all
    # operations except multiplying by zero only increase the quantity
    # in question, and multiplying by zero always yields the same
    # result, zero, no matter the other operand.
    sol = {(0, 1, int(s[0])): s[0]}
    for c in s[1:]:
        d = int(c)
        sol1 = {}
        for (psum, pprod, pint), t in sol.iteritems():
            pint1 = min(pint * 10 + d, n + 1)
            sol1[(psum, pprod, pint1)] = t + c
            pprod1 = min(pprod * pint, n + 1)
            sol1[(psum, pprod1, d)] = t + '*' + c
            psum1 = psum + pprod * pint
            if psum1 <= n:
                sol1[(psum1, 1, d)] = t + '+' + c
        sol = sol1
    for (psum, pprod, pint), t in sol.iteritems():
        if psum + pprod * pint == n:
            return t
    return None

For numbers of this size (10 digits), provided that you can only insert +'s and *'s, there are only about 20,000 possible ways to insert the operators (between any two digits, insert a +, a *, or nothing, so 3^9 = 19683), so you should be able to brute force a solution quickly.


edit: This approach won't work, because I assumed x * c > x + c, but this is only true for c >= 2.

I think I'd start with trying to search the space... Lets see, for a n-digit first number, you have 3^(n-1) choices (each gap can have +,*,or nothing).

That gives you a search space of size 19,683 for your first example. That would be doable, I guess.

You can prune the space - because you've got only + and * and positive numbers, You can know how the total behaves... If you start with all plusses (smallest possible value) and start changing them to * or nothing, then your total will not decrease (except for the zero digits...). If it's gone over the target, you can prune out the rest of the choices from here. That should prune out a reasonable amount of the search space.

I'm not sure what to do about the zero digits. But then, that's the sort of thing which probably makes this a good interview question. Generally, I have the feeling that this problem is hard when you get to the 100-digit case. I can't see a clear way to do this except for search and pruning (And pruning doesn't work as well as I initially thought).


Here's a brute-force solution in F# in under 30 lines (golf-style, not particularly readable), that I started writing before the mention of "100 digits":

let rec AllPartitions = function
    | [x] -> [[[x]]]
    | [a;b] -> [ [[a;b]] ; [[a];[b]] ]
    | h::t -> [for th::tt in AllPartitions t do 
                   yield (h::th)::tt
                   yield [h]::th::tt]
let ListProd l1 l2 f = [for x in l1 do for y in l2 do yield f x y]
type Mul = Mul of int * Mul | MI
type Sum = Sum of Mul * Sum | SI
let C s1 op s2 = if s1 = "" then s2 else s1 ^ op ^ s2
let rec EMul = function | Mul(x,y) -> x*(EMul y) | MI -> 1
let rec ESum = function | Sum(x,y) -> (EMul x)+(ESum y) | SI -> 0
let rec PMul = function | Mul(x,y) -> C (PMul y)"*"(x.ToString()) | MI -> ""
let rec PSum = function | Sum(x,y) -> C (PSum y)"+"(PMul x) | SI -> ""
let Make l = [l |> List.fold (fun s x -> 10 * s + x) 0]
let MakeMul l = AllPartitions l |> List.collect (List.fold (fun s x -> 
    ListProd (Make x) s (fun x y -> Mul(x,y))) [MI])
let MakeSum l = AllPartitions l |> List.collect (List.fold (fun s x -> 
    ListProd (MakeMul x) s (fun x y -> Sum(x,y))) [SI])
//let origStr,target = "1231231234",11353
let origStr,target = "3456237490",1185 
//let origStr,target = "3456237490",9191 
let data = origStr |> Seq.map (fun c -> int c - int '0') |> Seq.to_list 
printfn "The solutions for '%s' -> '%d' are:" origStr target
for s in MakeSum data do
    let ans = ESum s
    if ans = target then
        printfn "%s = %d" (PSum s) ans

Edit: See Greg's comment. The precedence of + over * makes this approach invalid.

We can somehow partition the search space in Jon's solution by ruling out combinations based on how they end.

If a combination is successful, it will have a sequence of m intermediate results R(1)...R(m) in which R(m)==target number, let's say it is 1185.

Let's say your string is "3456237490" which has length = 10, and implies 1 <= m <= 10.

Then R(m-1) , the last step before getting the wanted result, must be a number that either:

  • (Multiplying Option 1) Multiplied by 0 gives 1185
  • (Multiplying Option 2) Multiplied by 90 gives 1185
  • (Multiplying Option 3) Multiplied by 490 gives 1185
  • (Multiplying Option 4) Multiplied by 7490 gives 1185
  • ...
  • (Multiplying Option 10) Multiplied by 3456237490 gives 1185
  • (Addition Option 1) Having 0 added to it gives 1185, i.e. R(m-1)==1185
  • (Addition Option 2) Having 90 added to it gives 1185 i.e. R(m-1)==1095
  • (Addition Option 3) Having 490 added to it gives 1185 i.e. R(m-1)==695
  • (Addition Option 4) Having 7490 added to it gives 1185
  • ...
  • (Addition Option 10) Having 3456237490 added to it gives 1185

You discard all the multiplying options which contain numbers which are not divisors of your current target, and the adding options with too large numbers. Now we are left with 3 possible paths. 3, of course, satisfies 3 <= 20 == 2 * length, i.e. you could have at most 20 possible paths but you will typically discard a whole lot of them.

Specifically, the only valid paths contain a last number which is either a divisor (for multiplying paths) or lower (for adding paths) than the target.

Then you can continue recursively exploring the paths. You have a new target (either 1185, 1095 or 695) and a new string (the part of the original string that you still haven't used). It shouldn't be hard.


After a bit of thinking, I came up with this Python solution.

The whole solution tree is explored, but if the value of a partial expression is found to be bigger than the target, the branch is closed, and children are not evaluated.

All solutions are printed, not just one.

To work correctly with operator precedence, and to avoid re-evaluating the whole expression at each step, you are forced to represent a partial expression as a (sum + product * int)

For the record, the aim of the code is to be clear and readable, not to be the most efficient hacked-tricky code that is the fastest. Pythonistas, if something can be improved to be clearer, please do tell me ;) (in particular, I am not so fond of the deepcopy..)

import copy

class Calculator(object):
    Representing a simple calculator.
    As you enter digits or operators, it computes
    the value of the resulting expression.

    An expression is always represented as:
    sum + product * number

    def __init__(self, value):
        @type value: str
        n = int(value)

        self.sum = 0
        self.prod = 1
        self.num = n

        # value of expression
        self.value = n

        # string representation
        self.repr = value

    def __str__(self):
        return self.repr

    def getvalue(self):
        return self.value

    def operation(f):
        decorator recalculating the value of the expression
        after an operation.
        It works on a deepcopy to avoid changing the value,
        and returns the newly created object.
        def decorated(self, digit):
            deepcopy = copy.deepcopy(self)
            f(deepcopy, digit)
            deepcopy.value = deepcopy.sum + deepcopy.prod*deepcopy.num

            return deepcopy

        decorated.func_name = f.func_name
        decorated.func_doc = f.func_name
        return decorated

    def digit(self, digit):
        @type digit: str
        self.repr += digit

        self.num *= 10
        self.num += int(digit)

    def add(self, digit):
        @type digit: str
        self.repr += '+' + digit

        self.sum = self.sum + self.prod * self.num
        self.prod = 1
        self.num = int(digit)

    def mult(self, digit):
        @type digit: str
        self.repr += '*' + digit

        self.prod *= self.num
        self.num = int(digit)

def solve(input, target):
    @type input: str
    @type target: int
    if not input:
        print 'No Solution.'

    calc = Calculator(input[0])
    thislevel = dict()

    thislevel[calc.getvalue()] = [calc]

    for digit in input[1:]:
        nextlevel = {}
        for nodes in thislevel.itervalues():
            for node in nodes:
                for op in (node.digit, node.add, node.mult):
                    child = op(digit)
                    val = child.getvalue()
                    if val <= target:

        thislevel = nextlevel

        for node in thislevel[target]:
            print node
    except KeyError:
        print 'No Solution.'

solve("1231231234", 11353)
solve("3456237490", 1185)

Edit Most of this is wrong.

Operator precedence make this approach wrong. First, you have to stack up the last products anyway, so right to left or left to right is the same. And... You should not delete the same values on each level.

Also, the fact that 2 expressions have the same values doesnt mean that the derived expressions will all have the same values. Think of 6 and 2*3: if you add 1, it gets 61 and 2*31. Different, so you have to keep both 2*3 and 6 on each level

As a remark, you will probably want to generate sequences starting from the end, because of the multiplication priority. If you consider 123:

23=2*10+3 6=2*3 5=2+3
23 24 123 6 7 16 5 6 16

It allows you to calculate recursively the values, which is not possible in the other way: 1 -> 1+2 -> 1+2*3 here, 1+2*3 cannot be calculated from 1+2

You can also order and remove on each generation step the duplicates. The first tree becomes:

5 6 23
5 6 7 16 23 24 123

Also, since the values are increasing, if at one step one value is bigger than the target, you can safely ignore it.


Coming from the brute-force method (proposed by Jon Skeet) but being confronted with lots of digits, I would guess an alternative could be to use a greedy DFS.

The smallest possible result will be '1...1', you're summing up all digits. If this is already larger than the target, you're done (no solution).

Otherwise you could now consider switching each of the n ones to either 0 or 2: both actions will bring you closer to the goal number (how close can be calculated easily). Then you choose the single action that brings you closest to the goal action, and repeat. Backtracking ensures that you will eventually search the whole search space (i.e., visiting all trigits like in the brute-force method, but now even slower due to search overhead :).


EDIT: This one was wrong


I just realized that the problem does not allow for parentheses, which simplifies things a bit. There are three partitions: first into numbers, then into products, and finally into a sum. So a good solution might be to compose the final number from a combination of sums, which will be sums of sums and products, which will be products of products and concatenations, calculating the possible paths dynamically.

In pseudocode:

for partLength=2..inputLength
  for i=1..inputLength-partLength
    numbers = input.range(i, i+partLength)
    if map.has(numbers)
    if concat(numbers)<=target
      subsolution.addProduct(concat(numbers)) # add the number as a product for simplicity 
    pairs = numbers.split()
    for pair in pairs
       products = pairwiseProduct(concat(pair.first), concat(pair.second))
       products.add(pairwiseProducts(map.get(pair.first).products, map.get(pair.second).products))
       sums = pairwiseSum(concat(pair.first), concat(pair.second))
       sums.add(pairwiseSums(map.get(pair.first).products, map.get(pair.second).products))
       sums.add(pairwiseSums(map.get(pair.first).sums, map.get(pair.second).sums))
    map.add(numbers, subsolution)
if map.get(target).sums.contains(target) or map.get(target).products.contains(target)
  return success
  return failure

Since the result calls for the final formula, the pairwise functions have to store with each result the operands it is formed from, and then on success you simply flatten the operation tree that got you to your target number.


Here's a Skeet brute force implementation in Python, with calc times. Horribly inefficient I'm sure but I'm a beginner. Note that OP's 2nd example has 3 solutions:

import time

def increment( symbols ):
    if len( symbols ) <= 1:
        return { '': ' ', ' ': '+', '+': '*', '*': '+ ' }[symbols]
    inc = increment( symbols[1:] )
    if len( inc ) == len( symbols ):
        return increment( symbols[0] ) + inc[1:]
    return symbols[0] + inc

def expr( sequence, ops ):
    padops = ' ' * ( len( sequence ) - len( ops ) - 1 ) + ops + ' '
    exp = ''
    for i in range( len( sequence ) ):
        exp += sequence[ i ] + padops[ i ]
    exp = exp.replace( ' ', '' )
    return exp

def equate( sequence, target, ops = '' ):
    print '%s: %s ==> %d' % ( time.asctime(), sequence, target )
    while len( ops ) < len( sequence ):
        exp = expr( sequence, ops )
        val = eval( exp )
        if val == target:
            print '%s = %d' % ( exp, target )
        ops = increment( ops )
    print '%s: done\n' % time.asctime()

equate( '1231231234', 11353 )
equate( '3456237490', 1185 )
equate( '3456237490', 9191 )

Fri Jun 05 16:39:29 2009: 1231231234 ==> 11353
12*3+1+23*123*4 = 11353
Fri Jun 05 16:39:31 2009: done

Fri Jun 05 16:39:31 2009: 3456237490 ==> 1185
34*5+6*23*7+49+0 = 1185
34*5*6+23*7+4+9*0 = 1185
3*4*56+2+3*7+490 = 1185
Fri Jun 05 16:39:32 2009: done

Fri Jun 05 16:39:32 2009: 3456237490 ==> 9191
Fri Jun 05 16:39:34 2009: done


Thoughts on better implementation of brute force/pruning:

  • Insertion of a + or * symbol into an existing expression creates a new expression of equal or lesser value, never greater. So starting at the initial, symbol-less string, one should insert symbols in a way that brings the value down toward the target as quickly as possible. Because once we get to an expression that is LESS than the target, we can immediately prune any expression that adds symbols, as they cannot possibly increase our value to come up to the target.

  • We get to smaller values quicker by effectively bisecting unbroken sequences of digits. So our first operator should go smack in the middle, next should bisect one half, etc.

As you can see from the results, 10 digits solves in a sec or two, even with gross brute force. With intelligent operator insertion and pruning, this could run very fast even for large strings. If anyone can find a longer test I'll try it out.


Use a genetic algorithm to search the solution space.