Product Of Other Numbers

Problem:

You have a list of integers, and for each index you want to find the product of every integer except the integer at that index.

Restrictions: divisions are not allowed.

First thought that comes to my mind is to use 2 for-loops and multiply all the indices except the current one of the first loop. Easy, finished in less than a minute. The issue is the solution has an n^2 complexity.

How to optimize?

As usual the problem is the data is not organized in a nice way for this problem. Time to analyze!

Given an example:

input: [1,2,3,4]
expected output: [24, 12, 8, 6]
output expanded: [2*3*4, 1*3*4, 1*2*4, 1*2*3]

By looking at the expanded output, I can see there is a pattern going on. Exactly what it is took me a few minutes to figure it out, but right out of the box I can definitely see that:

The lowest index of the input is added on the expanded output with an index of 1. In other words the number 1, with an index 0, of the input array, was added on the index 1 of the output array; which makes sense given that index 0 of the input cannot be considered in the product of the output in index 0.

Note: this sort of reminds me to the fibonacci problem were you have to reuse operations to avoid recalculating the same thing over and over. Its hard to think how to reuse operations though. In this case I have to break it into sub-problems to identify the patterns.

A few minutes pass of me staring at the whiteboard and I realize that the current index has all the numbers to the left and to the right of it.

index 2 of ouput expanded: [—,—,1*2*4,—]

The 1 and 2 are numbers to the left of the current number and 4 is to the right. Sounds silly, but that helps to think about the solution. In other words, the current number is the product of all the numbers to the left and all the numbers to the right!

What if I have 2 extra arrays:

  • before[] where its current index is the product of all the indices before it.
  • after[] where its current index is the product of all the indices before it.

I can then loop through the before[] and after[] array and simply multiply the current index and store it in a result[] array.

I am not sure, but that sounds like it reduces the complexity (hopefully to 2 individual loops).

It sounds like a good moment to start drawing some code (I say drawing because I am getting used to whiteboards so I write it there first).

My first attempt looked something like this:

public int[] getProduct(int[] numbers){
int[] result = new int[numbers.length];
int[] arrBefore = new int[numbers.length]; 
int[] arrAfter = new int[numbers.length]; 
int productBefore = 1;
int productAfter = 1; 
int b = numbers.length - 1; 

//Organize the data 
for(int a = 0; a < numbers.length; a++){
productBefore *= numbers[a];
productAfter *= numbers[b];
arrBefore[a] = productBefore;
arrAfter[b] = productAfter;
 b--; 
} 

//Calculate the result 
for(int a = 0; a < numbers.length; a++){ 
result[a] = arrBefore[a] * arrAfter[a]; 
}
return result;
}

I made a mistake here! I did not realize right away the ordering of the multiplication and assigning the arrays was important. First I need to assign the value to the product array, otherwise I am skipping ahead.

(I briefly made another mistake but corrected it in my mind. Initializing the product variables to 0 because there are no values before the first or last index. What is interesting is the representation of this emptiness works well with the number 1 and not 0 – also 0 would just 0 out the entire multiplication)

Small correction:

public int[] getProduct(int[] numbers){
int[] result = new int[numbers.length];
int[] arrBefore = new int[numbers.length]; 
int[] arrAfter = new int[numbers.length]; 
int productBefore = 1;
int productAfter = 1; 
int b = numbers.length - 1; 

//Organize the data 
for(int a = 0; a < numbers.length; a++){
arrBefore[a] = productBefore;
arrAfter[b] = productAfter;
productBefore *= numbers[a];
productAfter *= numbers[b];
 b--; 
} 

//Calculate the result 
for(int a = 0; a < numbers.length; a++){ 
result[a] = arrBefore[a] * arrAfter[a]; 
}
return result;
}

It works and its optimized!

  • n+n or 2n is the new processing complexity.
  • I also sacrificed some memory, 2 arrays of n or 2n memory to be exact. On the bright side we can buy more memory.

I liked this problem because it involves memoization, which I am not very good at spotting right away; my brain is not used to seeing it. Good practice!

Code available on github

Advertisements

Understanding AWS – API Gateway

API Gateway is a service from AWS that provides a ReSTful interface between a client/requester and an endpoint.

The term client/requester and endpoint are abstracted on purpose because it can literally be anything that can send or receive an HTTP/HTTPs request.

Another concept that is hard to define is “ReSTful”. Recently I realized I could not really define what restful really meant; I could briefly mention characteristics about it, but not really dive deep into it, even though I have been using it for years.

Note to myself – restful interfaces are a style of sending or receiving http requests. It is a “style” because the request will not actually break or not work if it is not sent in a restful manner.

The client:

  • Describes where the request is going via a URL.
  • Controls the request with headers.
  • Sends additional data via: query string parameters and the body (for methods that allow it). Additional data can also be sent via the headers, but they are mostly meant for control.
It is important to note why "control with headers" is done via the client 
and not the server. The server does not make any assumptions about the
request; it can enforce rules (example: authorization), but does not make 
assumptions. This allows the server to have more flexibility with each 
request it receives. In this case the server/interface is API Gateway.

Endpoint:

  • What the endpoint has to do will depend a lot on the particular endpoint.
  • Examples of endpoints: HTTPAWS Lambda, a database, a mail server, a queue. Note that the endpoint does not have to understand anything about restful APIs, because another service (API Gateway) has abstracted that for us.
  • The endpoint does have to return a status code or the interface will assume there was an error. Additionally the endpoint can return more information if the interface allows it.

What else can be said about a restful request?

The style also uses resources and methods/verbs to describe the actions you are trying to do.

The request

For example the URL: https://httpbin.org/ip contains several parts that describe our request:

  • https – describes the protocol the request is trying to use.
  • httpbin.org – is the domain that gets asked to a name server to try to find the IP that can (hopefully) serve the request.
  • /ip – is a resource that describes what are we trying to access under that domain.
  • We do not get to see it in the URL, but the resource is tied to verbs or methods. In this case it is a GET method. Notice also how it is very easy to describe what that URL is trying to accomplish by reading it with simple english words: use the https protocol to GET the ip. If we were super strict it would be GET the ip of the httpbin.org domain. But in reality this request returns the IP of the client.
  • There are also headers that automatically get sent even though we do not specify any. host and accept, are two very common headers.
  • The request does not contain a body. How do I know that? Because it is a GET method. The standard does not actually say the GET method does not need to contain a body, but in this restful style we are adhering to the verbs. GET is a verb used to retrieve something, and the same request should always return the same thing; if we send data in the body then we allow data to control what is going to be returned and loses the style.

Note to myself – a restful interface defines the following verbs: GET, POST, PUT, DELETE, PATCH, HEAD and OPTIONS.

The response

In the simplest response, the client really only needs a status code, to know the status of what happened with the request. Of course, it gets more complicated because the response could also have a body and headers. And each header means something different.

For example: when a website is opened like ‘https://httpbin.org/&#8217;, the client is actually making a GET request to (usually) an HTML page. The response contains a very important header: ‘Content-Type’, without it your page would not even render because the browser would not know how to interpret the document.

Phew! That was a lot of information!

So, with all the previous data, it is easier to appreciate what API Gateway does for us. It makes it very easy to process a restful request, send it to an endpoint and respond to the client. It is very interesting also because API Gateway allows us to customize every single process of the request and response. It even allows us to completely customize the errors.

Further deep diving into API Gateway in future posts!

Design – Simplified Twitter

The problem reads:

“Design a simplified version of Twitter where users can post tweets, follow/unfollow another user and is able to see the 10 most recent tweets in the user’s news feed. Your design should support the following methods:

  1. postTweet(userId, tweetId): Compose a new tweet.
  2. getNewsFeed(userId): Retrieve the 10 most recent tweet ids in the user’s news feed. Each item in the news feed must be posted by users who the user followed or by the user herself. Tweets must be ordered from most recent to least recent.
  3. follow(followerId, followeeId): Follower follows a followee.
  4. unfollow(followerId, followeeId): Follower unfollows a followee.”

 

Understanding the data

Each userId is related to a tweetId, in the form of a “twitter post”.

Each userId is related to multiple userIds, in the form of a “follower” or “followee”.

I need to store and organize the data in such a way that makes it easy to relate a userId to a tweetId; to perform find operations.

The data also needs to be organized in such a way that I can find a userId and based on this userId find all the followees, to then extract all the “tweetIds” from the followees.

We could create all these relationships in a Twitter class; using lists, maps and sets. The problem is the code then becomes convoluted and there is no clear separation of logic. The previous leads me to believe I need to use classes.

Class User
-id:int
-following:Set<Integer>
-tweets:List<Tweet>
+postTweet():void
+follow(userId):void
+unfollow(userId):void
+getTweets():List<Tweet>
+getUsersFollowed():Set<Integer>
+setId(id):void
Class Tweet
-id:int
-datePosted:Date
+getId():int
+getDatePosted():Date
Class Twitter
-userMap:Map<Integer,User>
+postTweet(userId,tweetId):void
+getNewsFeed(userId):List<Integer>
+follow(followerId,followeeId):void
+unfollow(followerId,followeeId):void
+addUser(userId):void
+getUser(userId):User
+sortTweetsBasedOnDate(List<Tweet>):void
+getAllTweetsFromUser(userId):List<Tweet>
+getRangeOfTweetIDs(List<Tweet>,start,end):List<Integer>

Not terribly sophisticated but gets the job done.

The Twitter class ended up being the most complicated one because it is the one that orchestrates the other 2 classes and gives them the functionality.

I chose to use List for non randomized access to data, HashMap for 1:1 relations and Set whenever I need a unique-random-access data.

The pros:

  • Organized data and easy to identify each user and tweetId.
  • Users and tweets are also encapsulated.

The cons:

  • Relating data outside its data structure requires some processing – retrieving the news feed of a user requires retrieving tweets of several other users. Perhaps the relationship can be improved?

Thoughts:

As part of my learning experience I thought about using the Java Streaming API somewhere in this implementation, but I did not find a good use for it. The stream appears to work better when all the data is in a single point. I thought about using the stream in the news feed method but I have not really found a good use for it when the data is scattered through different users – again, maybe the design needs to be improved.

The design does not consider any scenarios of an actual implementation: networking, databases, distributed systems. These 3 classes get much more complicated if tried to implement in a real scenario.

Implementation with a few tests

I also added some tests. To be honest the tests need more work as I am only testing for the ideal case.

Update:

A friend of mine suggested to think of how it would work in a database and that the list of tweets would not be held by the user table or class. By moving the tweet list to the Twitter class I was able to remove a lot of the aggregated complexity. In general, the design became much simpler and… I was able to use the Java streaming API!

Currently the design does not need date comparison since its single threaded, in a distributed system the software would more than likely need date-based sorting.

Website up!

These days it is a 5 minutes step to have a professional website up and running. I decided I would configure everything from the ground up to gain more practice with the tools and technologies that are normally automatically configured for us.

Step #1 – My domain and Route53

I had already bought a domain jfalco.com in Route53 but it was not being routed anywhere yet. I followed this guide to forward the jfalco.com traffic to the EC2 instance.

  • Created an A record http://www.jfalco.com that pointed to the EC2 instance IP. Wait! Isn’t http://www.jfalco.com different than jfalco.com? I tried to access my EC2 instance through my newly configured domain but my browser kept returning “This site can’t be reached – ERR_NAME_RESOLUTION_FAILED”. What?! But I just configured the route! Maybe it has not propagated yet, it being the cloud and all… Waited a good 20minutes and still would not budge.
  • After a lot of reading and googling, turns out my jfalco.com name servers were different than my hosted zones name servers. Doh! I made the assumption Route53 would automatically use the same name servers as my domain given that I literally created the hosted zone by clicking on my domain through the Route53 console. I copy pasted my hosted zones name servers to my domain name servers and it took roughly 30 seconds to start working. I also added a tiny CNAME record to link jfalco.com to http://www.jfalco.com. Woot! My domain is finally pointing to my website!

Step #2 – EC2, Tomcat8 and the actual content of the Website

  • Previously I had very hastily installed tomcat8 just to make sure my EC2 instance could be accessed. Now I had to make the real configuration and deploy my war file. First, I moved the manager app outside of the root and very quickly found the configuration I required to make my war be the root as well as change the port to be 80 instead of :8080; it would not look very good if people had to type http://www.jfalco.com:8080/mywebsite.
  • To make the war be root, the war file has to be named ROOT.war and the folder with the content has to also be named ROOT.
  • To change the port to be :80, the file ‘server.xml’ in the tomcat8/conf folder needs to have the changes for port :80.
<Connector port="80" protocol="HTTP/1.1"
 connectionTimeout="20000"
 redirectPort="8443" />
  • Also had to change users to secure access.

The Website

  • Using Java EE I went ahead to setup my project in Eclipse. First hurdle, how do I configure the web.xml to serve the content of my website? At first I tried
<servlet>
    <servlet-name>home</servlet-name>
    <jsp-file>/index.html</jsp-file>
</servlet>
<servlet-mapping>
    <servlet-name>home</servlet-name>
    <url-pattern>/</url-pattern>
</servlet-mapping>

And that did work with my <html><body>Hello, World!</body></html>, but failed once I tried to serve any CSS, JS or image file.

  • One more thing to note in here is that the html files need to be at the root of the WebContent folder, right above WEB-INF.
  • Turns out I did not need any servlet mappings yet.
<welcome-file-list><welcome-file-list> <welcome-file>index.html</welcome-file> <welcome-file>index.htm</welcome-file> <welcome-file>index.jsp</welcome-file> <welcome-file>default.html</welcome-file> <welcome-file>default.htm</welcome-file> <welcome-file>default.jsp</welcome-file> </welcome-file-list>

Was all I needed.

  • Great, now to the actual development of the content. Since I am doing everything from scratch I want to leverage on some frameworks, that will also help me learn. ReactJS + Typescript + bootstrap + LESS or SASS and a package manager like npm? Maybe I could use AngularJS? What about the testing frameworks? Wait… I need to take a step back. Do I really need all this complexity? I am going to have a very simple home page with some of my information and the blog is currently hosted in wordpress, so… Maybe I just need bootstrap  to make my CSS life easier and npm to make packaging slightly easier.
  • My initial setup includes jquery and bootstrap. I created a very simple introductory home page that needs more styling but its not terrible. I think its a good start and I finally have my landing page that gives me an online presence. I have a ton of plans for that website, but one step at a time.

A lot of good lessons learnt in here: how to configure name servers in Route53; a good tomcat8 installation in EC2 that displays the WAR file at the root level; and refreshing those web development skills. I am not happy with the theme yet but its something.

Two Sum

Problem:

Given an array of integers and an integer target, find any 2 numbers in the list that add up to the target.

My first naive approach was to brute force it: for every number, iterate it against every other number and check to see if the addition matches the target. That is great, we will find the solution but its a very slow algorithm. It consumes n^2 processing cycles in the worse case scenario.

Trying to do better – initial thoughts:

  • If the target is greater than zero, we do not need to consider any number that is larger than the target.
  • If the target is small than zero, we do not need to consider any number that is smaller than the target.
  • Is the data organized in the data structure that we need? Maybe the algorithm is slow because it is constrained by the list/array of integers.
  • Is there any data structure that gives me O(1) access to the data? A map or a set could do the job but then how would the algorithm look like? Plus I would need to process the data to store it in the new data structure, that would take n cycles.

This did not come right away to my thoughts but after reading more about it I did understand it: the difference of any index in the array and the target can be found in the newly built set fairly quickly. If we do not find the data, then there is no answer. I also ignore any number larger than target.

Java solution

Not terribly bad – 2 for loops over n, so n+n(2n). Definitely better than n^2. The solution also traded processing for memory because now an extra Set is required.

Update and reminder to myself: at first I did not take into consideration negative numbers. My initial logic was discarding all negative numbers because of the logic: “we do not need to consider any number that is smaller than the target”. The previous logical statement works only when target is greater than zero. Funny enough by trying to optimize, I ended up making it more complicated and breaking the whole thing with any target less than zero.

Update #2: The optimization was actually breaking a case. What happens when the target is 10 and the array has numbers like 12 and -2?

jfalco

The intention of this initial post is to hopefully serve as a guide for myself to remind me of the goal for this blog.

I often find myself writing small notes or documents so that I remember a tutorial, a problem I was working on, how to configure a service or some other technical piece of information that I want to keep for a later date. I am far from a good writer, but I thought to give it a try at writing things in a blog to keep it organized. At the same time I might also begin establishing a professional footprint on the web.

Lets see how it goes!