What THEY Do Not Teach YOU In Data Science! - By Anish Arya (Machine Learning and Reinforcement Learning Consultant)

To be honest, I’ll not talk here about what is Data Science and why should we use it. For, that there are already millions of pages to read on the web. Am not going to waste my as well as your time on that.

Via this platform, will try share with you as much as possible all practical problems which I face/have faced at my workplace or in google code jams where I held Rank #93 in 2019, and Rank #581 in 2018.

In a coding interview, the problem is not Technology which we all fear of, the real problem is how can we can break the problem in various sets/pieces. The thing here, which am talking about, is Data Structures and Algorithmic Design. This simple yet complex subject is embedded EVERYWHERE in data science realm.

Assume any Machine Learning technique you use, the science behind is the way how that ALGORITHM is constructed. Therefore, it becomes vital for us to understand the application of this subject.

I will try to post solved questions in this forum, with complexity varying from easy to very hard, and if you have any queries regarding the same, let’s take the advantage of this forum to discuss the same.

Prerequisites:

  1. Any Programming language (I’ll recommend Python for all you Data Analysts and Data Scientists)
  2. How to measure algorithm complexity (Such as ** “BIG-O” ** Not necessary to know in advance but if you do that is a plus) as I’ll be using some Complexity Notations alongside algorithm for better understanding

So here’s comes a sweet easy problem: SUM OF THE TWO NUMBERS
CAUTION!!! Do not go by the name as it is not simply RETURN (a+b) :wink: . We’ll use the concept DYNAMIC Programming (Subset Problem) here.

Problem Statement:
Write a function that takes in a nonempty array of distinct integers and an integer representing a target sum. If any two numbers in the input array sum up to the target sum, the function should return them in an array, in any order. If no two numbers sum up to the target sum, the function should return an empty array.

  • Note that the target sum has to be obtained by summing two different integers in the array; you can’t add a single integer to itself in order to obtain the target sum.

You can assume that there will be at most one pair of numbers summing up to the target sum.

Sample Input:
array = [3, 5, -4, 8, 11, 1, -1, 6]
targetSum = 10

Sample Output:
[-1, 11] // The numbers could be in any order

So let’s start writing Pseudo-Code for the problem:
Albeit the optimal complexity that can be achieved in O(logn) which I’ve mentioned in the comments, but to understand the problem better, I had to go with this approach “”" Complexity --> (n-1)(n-1) + c --> O(n^2)
# Psuedo-Code
** “”" Complexity --> (n-1)(n-1) + c --> O(n^2)**
I/P - array, targetSum

FOR index_of_array_input_seq from 0 TO length(array)-2 DO
FOR index_of_array_next_seq from index_of_array_input_seq+1 to index_of_array_input_seq -1 DO
IF array[index_of_array_input_seq] + array[index_of_array_next_seq] = targetSum THEN
return [ array[index_of_array_input_seq], array[index_of_array_next_seq] ] # Pair
# End Inner FOR
return [] # empty_array
#End Outer FOR

Since, I have done 95% task for you, all I ask from all of you is to code the same and post the solution here and If you have a better solution with less Runtime complexity, then please post the same :slight_smile:

I’ll post the possible solutions for this.

P.S. Note: My frequency of questions will be based ion the response I receive.

By: Anish Arya

1 Like

Hi Anish,
The below snippet will get the desired output.

def func(arr,s):
h=arr[::-1]
print(“numbers are:”)
for i in arr:
for j in p:
if(i!=j):
if(i+j==s):
print([i,j])
p=list(map(int,input("\nEnter the numbers with space(like 1 2 3): ").strip().split()))
t=input("Enter Target Sum: ")
r=int(t)
func(p,r)

1 Like

def myFun(lst, targetSum):
# number of elemetns in the list
n = len(lst)

# iterating from first to last but one element
for i in range(0, n - 1):
    # iterating from current index to last element
    for j in range(i + 1, n):
        # comparing with targetSum
        if lst[i] + lst[j] == targetSum:
            return [lst[i], lst[j]]
return []

if name == “main”:

# Creating List and Targetsum
lst = [3, 5, -4, 8, 11, 1, -1, 7]
targetSum = 18
print(myFun(lst, targetSum))
1 Like

Code is correct but, this code needs better naming conventions (i, j, lst, n, and myFun) as these are the heart of any program and lets the other person (usually colleague) understand your code better : )
You also kept comments in the source code, that reflected your understanding of what you were doing in the program step-by-step. Keep up the good work Saloni.

  1. Just one correction in code: instead of “current index” in the comment, it is current +1 index. Rest everything seems fine.

for i in range(0,len(array)-2):
for j in range(i+1,n-1):
if (array[i]+array[j]==Targetsum):
return [array[i], array[j]]
return []

For your 1st code:
The output is incorrect (as it will return every possible pair which the problem statement didn’t want and moreover it will return 2xN outputs always where N is the number of pairs E.g. [x1,y1] and [y1,x1], which meant the same things), albeit the rationale behind the logic was right.

** But, I’ll appreciate your thinking. Well Done 0/10 for code, and 10/10 for effort**

  1. There is mention of variable h, but never used.
  2. When you put i != j, you considered 1 extra case of comparison which will cost more to the algorithm. You can eliminate this step by limiting the iteration in For loop.

Thank you so much for the feedback :slightly_smiling_face:

1 Like
def twoNumberSum(array, targetSum):
#Am not putting comments on the code as the pseudo code has already been told in the article itself
    length_of_the_input_seq = len(array)
    for index_of_array_input_seq in range(0, length_of_the_input_seq - 1):
        for index_of_array_next_seq in range(index_of_array_input_seq + 1, length_of_the_input_seq):
            if array[index_of_array_input_seq] + array[index_of_array_next_seq] == targetSum:
                 return [ array[index_of_array_input_seq], array[index_of_array_next_seq] ]
     return []
# O(n) time | O(n) space ............... Better Solution

def twoNumberSum(array, targetSum):
     nums = {}
     for num in array:
         potentialMatch = targetSum - num
         if potentialMatch in nums:
             return [potentialMatch, num]
         else:
             nums[num] = True
     return []
# O(nlog(n)) | O(1) space .... 3rd possible solution
def twoNumberSum(array, targetSum):
    array.sort()
    left = 0
    right = len(array) - 1
    while left < right:
        currentSum = array[left] + array[right]
        if currentSum == targetSum:
            return [array[left], array[right]]
        elif currentSum < targetSum:
            left += 1
        elif currentSum > targetSum:
            right -= 1
    return []