Scenario

Determine number of possible ways to decode a given Caeser cipher encoded message that maps:
‘A’ -> ‘1’
‘B’ -> ‘2’

‘Z’ -> ‘26’
The input is a string of n numbers and the output is a number counting all the possible ways a message could be decoded…

For message = “122”, there are 3 possible ways to decode the message:
“ABB” (1 2 2)
“LB” (12 2)
“AV” (1 22)

The good solution here should take O(n) time.

import random
def generate_message():
    """
    generates message and cipher strings
    """
    ALPHA = "_ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    MAX_LENGTH = random.randint(0,100)
    message = ""
    cipher = ""
    for _ in range(MAX_LENGTH):
        i = random.randint(1,26)
        message += ALPHA[i]
        cipher += str(i)
    return message, cipher
random.seed(113)
generate_message()
('UZX', '212624')

Observation

For a long string of 1’s of length n, the total number of letter combinations ends up being the number of ways for n-1 plus number of ways for n-2.

For example, take cases for n=2, n=3, and n=4. For n=2, ‘11’ decodes to one of (‘A’,‘A’) or (‘K’), making 2 possible decodings. For n=3, ‘111’ could be a set of 3 messages: (‘A’,‘A’,‘A’) (‘A’,‘K’), or (‘K’,‘A’). If you write out the possible decoding set for n=4 ‘1111’, you’ll have written 5 possible messages, which, importantly, is 2+3 (the addition of counts of previous two message lengths).

With that in mind, I’ll use memoization to precalculate validly consecutive values for future use.

I’ll only memoize 51 elements because the message is randomly and independently generated from equal probability distribution of letters, and there is a very low chance we’ll have a very long string of letters that fit this kind of observation. Similarly, in the real world we’ll most likely run into halting letter (see next observation).

# caveat you could have test case with intentionally very long message
ways = [1]*51
for i in range(2,len(ways)):
    ways[i] = ways[i-1] + ways[i-2]

So generally, ways[j] represents how many letter combinations are possible considering numbers [0:j-1], inclusive, of the message, but there are special numbers that signal a subset of all the possible letters.

Observations

  • ways[0] = 1 because a single number can only represent a single letter, but also, it can’t be letters [J:Z]
  • ways[1] = 1 way if message[0] > 2 because only [CDEFGHI] could be > 2, so there are 2 letters; else (since message[0] <= 2), ways = 2 because it could be two letters where message[0] can only be [AB] and message[1] takes on a value of [1:9] corresponding to [A:I], assuming the message terminates at this position.

Because there are numbers that indicate a specific subset of possible letters, I’ll take a dynamic programming approach and consider whether to include the current number in the previous section of numbers, or whether there should be a section break. A section break means I’ll have to multiply the possible ways of the last section with the possible ways of the next section. I’ll break sections by considering classes of digits.

Classes of digits are characterized base on its value and the previous value. Assume x as a qualified number that is to be considered with the current section.

  • 00 disqualifies entire message
  • x0 the number must end at zero, so don’t include this zero in the section or the next, but calculate total ways and reset to 0
  • 1[3:9] can be included (+1) as a digit on its own unless the previous number is 2. Must also end here, so reset to 0 in this loop
  • 2[3:6] is can be included to length of previous section (+1) or be a number on its own. Must also end here, so reset to 0 in this loop
  • 2[7:9] this is not a value continuation, so don’t include it in this section but treat it as reseting the next section to 1
  • x[1:2] this can be included (+1) in the section up to this number, and it can be a number on its own. May continue onto next section, so don’t reset section length
def do_it(message):
    n = len(message)

    if n == 0:
        return 1
    if n == 1 :
        if int(message[0]) == 0:
            return 0
        else:
            return n

    section_length = 1
    total_ways = 1

    # Decide whether to include i in length (+1) or end it calc and reset
    for i in range(1, n):
        prev = int(message[i-1])
        curr = int(message[i])
        #previous value is a nonstarter going forward eg it would be 3x or higher, or 0x which is not a letter
        if prev > 2 or prev == 0:
            if curr == 0:
                return 0
            total_ways = total_ways * ways[section_length]
            section_length = 1 # reset
            one_more = False
        # previous value is 2, so check self
        # 20 25 26
        # previous value is 1 so add prev value
        elif (prev == 1 or prev == 2) and curr == 0:
            section_length -= 1
            total_ways = total_ways * ways[section_length]
            section_length = 1 # reset
            one_more = False
        elif prev == 2 and curr <= 6:  
            section_length +=1
            one_more = True
        # 27 28 29 don't add this to section length and calc
        elif prev == 2 and curr > 6:
            total_ways = total_ways * ways[section_length]
            section_length = 1 # reset
            one_more = False           
        else:
            section_length +=1
            one_more = True

    if one_more:
        # do one final calculation since the last thing we did was add to section length (instead of calculating totals ways)
        total_ways = total_ways * ways[section_length]


    return total_ways

Let’s do it!

random.seed(1)
message, cipher = generate_message()
print(f"for a cipher text of {cipher}, we have {do_it(cipher)} possible message decodings")
for a cipher text of 1926253941625151621132674161, we have 2560 possible message decodings
def test_it():
    known = {
        "1": 1,
        "10122110": 5,
        "301": 0,
    }
    
    for c, ans in known.items():
        assert(do_it(c) == ans)
        
    print("Checks good!")
    
test_it()
Checks good!

Visual verification

Here is a message with small n so to visually check that the count is correct.

random.seed(113)
message, cipher = generate_message()
print(f"message is {message}")
print(f"cipher is {cipher}")
message is UZX
cipher is 212624
# knowing the cipher numbers, I'll manually build the possible set of messages 
combos = [[2,1,2,6,2,4],
          [21,1,2,6,2,4],
          [2,12,6,2,4],
          [2,1,26,2,4],
          [21,26,2,4],
          [2,1,26,24],
          [2,1,2,6,24],
          [2,12,6,24],
          [21,2,6,24],
          [21,26,24],
         ]
ALPHA = "_ABCDEFGHIJKLMNOPQRSTUVWXYZ"
s = set()
for c in combos:
    s.add(
        "".join(
        map(lambda x: ALPHA[x], c)
    ))
print(f"possible {len(s)} messages {s}")

print(f"total possible messages calculated: {do_it(cipher)}")
possible 10 messages {'UABFBD', 'UBFX', 'BABFX', 'BLFBD', 'BAZX', 'BLFX', 'UZX', 'BAZBD', 'BABFBD', 'UZBD'}
total possible messages calculated: 10