Product Of Other Numbers

Problem:

You have a list of integers, and for each index you want to find the product of every integer except the integer at that index.

Restrictions: divisions are not allowed.

First thought that comes to my mind is to use 2 for-loops and multiply all the indices except the current one of the first loop. Easy, finished in less than a minute. The issue is the solution has an n^2 complexity.

How to optimize?

As usual the problem is the data is not organized in a nice way for this problem. Time to analyze!

Given an example:

input: [1,2,3,4]
expected output: [24, 12, 8, 6]
output expanded: [2*3*4, 1*3*4, 1*2*4, 1*2*3]

By looking at the expanded output, I can see there is a pattern going on. Exactly what it is took me a few minutes to figure it out, but right out of the box I can definitely see that:

The lowest index of the input is added on the expanded output with an index of 1. In other words the number 1, with an index 0, of the input array, was added on the index 1 of the output array; which makes sense given that index 0 of the input cannot be considered in the product of the output in index 0.

Note: this sort of reminds me to the fibonacci problem were you have to reuse operations to avoid recalculating the same thing over and over. Its hard to think how to reuse operations though. In this case I have to break it into sub-problems to identify the patterns.

A few minutes pass of me staring at the whiteboard and I realize that the current index has all the numbers to the left and to the right of it.

index 2 of ouput expanded: [—,—,1*2*4,—]

The 1 and 2 are numbers to the left of the current number and 4 is to the right. Sounds silly, but that helps to think about the solution. In other words, the current number is the product of all the numbers to the left and all the numbers to the right!

What if I have 2 extra arrays:

  • before[] where its current index is the product of all the indices before it.
  • after[] where its current index is the product of all the indices before it.

I can then loop through the before[] and after[] array and simply multiply the current index and store it in a result[] array.

I am not sure, but that sounds like it reduces the complexity (hopefully to 2 individual loops).

It sounds like a good moment to start drawing some code (I say drawing because I am getting used to whiteboards so I write it there first).

My first attempt looked something like this:

public int[] getProduct(int[] numbers){
int[] result = new int[numbers.length];
int[] arrBefore = new int[numbers.length]; 
int[] arrAfter = new int[numbers.length]; 
int productBefore = 1;
int productAfter = 1; 
int b = numbers.length - 1; 

//Organize the data 
for(int a = 0; a < numbers.length; a++){
productBefore *= numbers[a];
productAfter *= numbers[b];
arrBefore[a] = productBefore;
arrAfter[b] = productAfter;
 b--; 
} 

//Calculate the result 
for(int a = 0; a < numbers.length; a++){ 
result[a] = arrBefore[a] * arrAfter[a]; 
}
return result;
}

I made a mistake here! I did not realize right away the ordering of the multiplication and assigning the arrays was important. First I need to assign the value to the product array, otherwise I am skipping ahead.

(I briefly made another mistake but corrected it in my mind. Initializing the product variables to 0 because there are no values before the first or last index. What is interesting is the representation of this emptiness works well with the number 1 and not 0 – also 0 would just 0 out the entire multiplication)

Small correction:

public int[] getProduct(int[] numbers){
int[] result = new int[numbers.length];
int[] arrBefore = new int[numbers.length]; 
int[] arrAfter = new int[numbers.length]; 
int productBefore = 1;
int productAfter = 1; 
int b = numbers.length - 1; 

//Organize the data 
for(int a = 0; a < numbers.length; a++){
arrBefore[a] = productBefore;
arrAfter[b] = productAfter;
productBefore *= numbers[a];
productAfter *= numbers[b];
 b--; 
} 

//Calculate the result 
for(int a = 0; a < numbers.length; a++){ 
result[a] = arrBefore[a] * arrAfter[a]; 
}
return result;
}

It works and its optimized!

  • n+n or 2n is the new processing complexity.
  • I also sacrificed some memory, 2 arrays of n or 2n memory to be exact. On the bright side we can buy more memory.

I liked this problem because it involves memoization, which I am not very good at spotting right away; my brain is not used to seeing it. Good practice!

Code available on github

Advertisements

Understanding AWS – API Gateway

API Gateway is a service from AWS that provides a ReSTful interface between a client/requester and an endpoint.

The term client/requester and endpoint are abstracted on purpose because it can literally be anything that can send or receive an HTTP/HTTPs request.

Another concept that is hard to define is “ReSTful”. Recently I realized I could not really define what restful really meant; I could briefly mention characteristics about it, but not really dive deep into it, even though I have been using it for years.

Note to myself – restful interfaces are a style of sending or receiving http requests. It is a “style” because the request will not actually break or not work if it is not sent in a restful manner.

The client:

  • Describes where the request is going via a URL.
  • Controls the request with headers.
  • Sends additional data via: query string parameters and the body (for methods that allow it). Additional data can also be sent via the headers, but they are mostly meant for control.
It is important to note why "control with headers" is done via the client 
and not the server. The server does not make any assumptions about the
request; it can enforce rules (example: authorization), but does not make 
assumptions. This allows the server to have more flexibility with each 
request it receives. In this case the server/interface is API Gateway.

Endpoint:

  • What the endpoint has to do will depend a lot on the particular endpoint.
  • Examples of endpoints: HTTPAWS Lambda, a database, a mail server, a queue. Note that the endpoint does not have to understand anything about restful APIs, because another service (API Gateway) has abstracted that for us.
  • The endpoint does have to return a status code or the interface will assume there was an error. Additionally the endpoint can return more information if the interface allows it.

What else can be said about a restful request?

The style also uses resources and methods/verbs to describe the actions you are trying to do.

The request

For example the URL: https://httpbin.org/ip contains several parts that describe our request:

  • https – describes the protocol the request is trying to use.
  • httpbin.org – is the domain that gets asked to a name server to try to find the IP that can (hopefully) serve the request.
  • /ip – is a resource that describes what are we trying to access under that domain.
  • We do not get to see it in the URL, but the resource is tied to verbs or methods. In this case it is a GET method. Notice also how it is very easy to describe what that URL is trying to accomplish by reading it with simple english words: use the https protocol to GET the ip. If we were super strict it would be GET the ip of the httpbin.org domain. But in reality this request returns the IP of the client.
  • There are also headers that automatically get sent even though we do not specify any. host and accept, are two very common headers.
  • The request does not contain a body. How do I know that? Because it is a GET method. The standard does not actually say the GET method does not need to contain a body, but in this restful style we are adhering to the verbs. GET is a verb used to retrieve something, and the same request should always return the same thing; if we send data in the body then we allow data to control what is going to be returned and loses the style.

Note to myself – a restful interface defines the following verbs: GET, POST, PUT, DELETE, PATCH, HEAD and OPTIONS.

The response

In the simplest response, the client really only needs a status code, to know the status of what happened with the request. Of course, it gets more complicated because the response could also have a body and headers. And each header means something different.

For example: when a website is opened like ‘https://httpbin.org/&#8217;, the client is actually making a GET request to (usually) an HTML page. The response contains a very important header: ‘Content-Type’, without it your page would not even render because the browser would not know how to interpret the document.

Phew! That was a lot of information!

So, with all the previous data, it is easier to appreciate what API Gateway does for us. It makes it very easy to process a restful request, send it to an endpoint and respond to the client. It is very interesting also because API Gateway allows us to customize every single process of the request and response. It even allows us to completely customize the errors.

Further deep diving into API Gateway in future posts!

Design – Simplified Twitter

The problem reads:

“Design a simplified version of Twitter where users can post tweets, follow/unfollow another user and is able to see the 10 most recent tweets in the user’s news feed. Your design should support the following methods:

  1. postTweet(userId, tweetId): Compose a new tweet.
  2. getNewsFeed(userId): Retrieve the 10 most recent tweet ids in the user’s news feed. Each item in the news feed must be posted by users who the user followed or by the user herself. Tweets must be ordered from most recent to least recent.
  3. follow(followerId, followeeId): Follower follows a followee.
  4. unfollow(followerId, followeeId): Follower unfollows a followee.”

 

Understanding the data

Each userId is related to a tweetId, in the form of a “twitter post”.

Each userId is related to multiple userIds, in the form of a “follower” or “followee”.

I need to store and organize the data in such a way that makes it easy to relate a userId to a tweetId; to perform find operations.

The data also needs to be organized in such a way that I can find a userId and based on this userId find all the followees, to then extract all the “tweetIds” from the followees.

We could create all these relationships in a Twitter class; using lists, maps and sets. The problem is the code then becomes convoluted and there is no clear separation of logic. The previous leads me to believe I need to use classes.

Class User
-id:int
-following:Set<Integer>
-tweets:List<Tweet>
+postTweet():void
+follow(userId):void
+unfollow(userId):void
+getTweets():List<Tweet>
+getUsersFollowed():Set<Integer>
+setId(id):void
Class Tweet
-id:int
-datePosted:Date
+getId():int
+getDatePosted():Date
Class Twitter
-userMap:Map<Integer,User>
+postTweet(userId,tweetId):void
+getNewsFeed(userId):List<Integer>
+follow(followerId,followeeId):void
+unfollow(followerId,followeeId):void
+addUser(userId):void
+getUser(userId):User
+sortTweetsBasedOnDate(List<Tweet>):void
+getAllTweetsFromUser(userId):List<Tweet>
+getRangeOfTweetIDs(List<Tweet>,start,end):List<Integer>

Not terribly sophisticated but gets the job done.

The Twitter class ended up being the most complicated one because it is the one that orchestrates the other 2 classes and gives them the functionality.

I chose to use List for non randomized access to data, HashMap for 1:1 relations and Set whenever I need a unique-random-access data.

The pros:

  • Organized data and easy to identify each user and tweetId.
  • Users and tweets are also encapsulated.

The cons:

  • Relating data outside its data structure requires some processing – retrieving the news feed of a user requires retrieving tweets of several other users. Perhaps the relationship can be improved?

Thoughts:

As part of my learning experience I thought about using the Java Streaming API somewhere in this implementation, but I did not find a good use for it. The stream appears to work better when all the data is in a single point. I thought about using the stream in the news feed method but I have not really found a good use for it when the data is scattered through different users – again, maybe the design needs to be improved.

The design does not consider any scenarios of an actual implementation: networking, databases, distributed systems. These 3 classes get much more complicated if tried to implement in a real scenario.

Implementation with a few tests

I also added some tests. To be honest the tests need more work as I am only testing for the ideal case.

Update:

A friend of mine suggested to think of how it would work in a database and that the list of tweets would not be held by the user table or class. By moving the tweet list to the Twitter class I was able to remove a lot of the aggregated complexity. In general, the design became much simpler and… I was able to use the Java streaming API!

Currently the design does not need date comparison since its single threaded, in a distributed system the software would more than likely need date-based sorting.