While reading about all the different ways in which someone can better their understanding of a problem, I consistently came upon one piece of advice: write about what you are learning.

The two sum problem is a variation of the subset sum problem, in which the goal of a function is to return the sets of numbers that add to a given sum. The two-sum challenge is similar, where the goal is to find all the pairs of two integers in an unsorted array that sum up to a given integer. For example, if the array is [1, 5, 3 9, 11, -3 ] and the sum is 6, your program should return [[9, -3], [1, 5]] because 9 + -3 = 6 and 1 + 5 = 6.

The simpler, albeit slower solution is to loop through the array for each element in the array, logging a pair if they sum to our given integer. While this will work, it runs in O(n^2), as we loop through n time, checking n elements. Thankfully, there is a faster solution, using the glory that be; hash tables.

To achieve a faster running time, we want to create an empty hash table, and for each “i” element in our array, we check to see if the i element minus our sum, S, exists in our table. If it doesn’t, we add it to our table, and if it does, we have found a pair. Let’s demonstrate how that works:

def two_sum(arr, S)

sums = []

hastable = {}

for i in range (0, len[arr]): //for as many elements as are in out array

sum_minus_element = S -arr[i]

if sum_minus_element in hashtable: //if sum_minus_element is in our has table:

sum.append([arr[i], sum_minus_element]) //add and our current index in arr to it

hashtable[arr[i]] = arr[i] //otherwise add sum_minus_element to our hash table

return sums //returns an array of sum arrays

Using the same array and sum as last time, ( [1, 5, 3 9, 11, -3 ], summing to 6 ) we begin by placing the first element in the array, 1, into our hash table. We then move to the next element, and check if 6 – 5 = 1 is in our hash table, and it is! This means we have found a pair! We then check if 6-3 = 3 is in our table, and since it isn’t we add 3 to our hash table. We only have to loop through our array once, so our time complexity is roughly O(n).

While researching this post I found two explanations of the run time, another being O(sum*n), which I thought was slightly odd.

http://algorithms.tutorialhorizon.com/dynamic-programming-subset-sum-problem/

While I really like their visuals, I personally think visualizing the time complexity as a linear function of the number of elements in our array emphasizes how useful this algorithm is. Hash tables are a vitally important part of software development, and knowing how to use them effectively is necessary skill for all successful programmers and computer scientists.

Cheers!