Google Code Jam

2013 Google Code Jam Round 1B

Better than Round 1, but still missed by several hundred.  If I had been able to get a second problem complete, I think I could have advanced.  However, my attempt at the third problem turned out to be way too slow.


Problem Osmos:

Armin is playing Osmos, a physics-based puzzle game developed by Hemisphere Games. In this game, he plays a "mote", moving around and absorbing smaller motes.

A "mote" in English is a small particle. In this game, it's a thing that absorbs (or is absorbed by) other things! The game in this problem has a similar idea to Osmos, but does not assume you have played the game.

When Armin's mote absorbs a smaller mote, his mote becomes bigger by the smaller mote's size. Now that it's bigger, it might be able to absorb even more motes. For example: suppose Armin's mote has size 10, and there are other motes of sizes 9, 13 and 19. At the start, Armin's mote can only absorb the mote of size 9. When it absorbs that, it will have size 19. Then it can only absorb the mote of size 13. When it absorbs that, it'll have size 32. Now Armin's mote can absorb the last mote.

Note that Armin's mote can absorb another mote if and only if the other mote is smaller. If the other mote is the same size as his, his mote can't absorb it.

You are responsible for the program that creates motes for Armin to absorb. The program has already created some motes, of various sizes, and has created Armin's mote. Unfortunately, given his mote's size and the list of other motes, it's possible that there's no way for Armin's mote to absorb them all.

You want to fix that. There are two kinds of operations you can perform, in any order, any number of times: you can add a mote of any positive integer size to the game, or you can remove any one of the existing motes. What is the minimum number of times you can perform those operations in order to make it possible for Armin's mote to absorb every other mote?

For example, suppose Armin's mote is of size 10 and the other motes are of sizes [9, 20, 25, 100]. This game isn't currently solvable, but by adding a mote of size 3 and removing the mote of size 100, you can make it solvable in only 2 operations. The answer here is 2.

My Solution:

This is actually a pretty easy problem, both for the small and large sets.  The strategy in Osmos should always be just absorb the smallest remaining mote, if you can.  Once you can't absorb the smallest, you can't absorb any other ones, so it is up to us to either add a new mote which you can absorb, or remove all the remaining motes.  Since we have a vested interest in seeing you absorb the motes with as little help as possible, we want to give you the biggest mote you can currently absorb.  This lends itself well to a recursive solution, where you have your current size, and a sorted list of the remaining motes.




There is one edge case to concern ourselves with... if we are a mote of size 1, then it isn't possible for our mote to absorb anyone.  In this case, we have to return that we remove all remaining motes, as our strategy for comparison would go into an infinite loop.

Another concern is the runtime.  Fortunately, our recursion only needs to run down a single branch for each decision, since if we have to remove motes, we know we have to remove all of them.  With that in mind, we run in O(N) time with respect to the length of the mote list...  The other concern is when the next remaining mote >> our current mote.  But by picking the largest mote available to absorb, we nearly double in size with each iteration, which means we can cover the gap in O(log(M)) time, where M is with respect to the maximum size of a mote.  So our algorithm is something like O( N log(M) ).





Problem Falling Diamonds:

Diamonds are falling from the sky. People are now buying up locations where the diamonds can land, just to own a diamond if one does land there. You have been offered one such place, and want to know whether it is a good deal.
Diamonds are shaped like, you guessed it, diamonds: they are squares with vertices (X-1, Y), (X, Y+1), (X+1, Y) and (X, Y-1) for some X, Y which we call the center of the diamond. All the diamonds are always in the X-Y plane. X is the horizontal direction, Y is the vertical direction. The ground is at Y=0, and positive Y coordinates are above the ground.
The diamonds fall one at a time along the Y axis. This means that they start at (0, Y) with Y very large, and fall vertically down, until they hit either the ground or another diamond.
When a diamond hits the ground, it falls until it is buried into the ground up to its center, and then stops moving. This effectively means that all diamonds stop falling or sliding if their center reaches Y=0.
When a diamond hits another diamond, vertex to vertex, it can start sliding down, without turning, in one of the two possible directions: down and left, or down and right. If there is no diamond immediately blocking either of the sides, it slides left or right with equal probability. If there is a diamond blocking one of the sides, the falling diamond will slide to the other side until it is blocked by another diamond, or becomes buried in the ground. If there are diamonds blocking the paths to the left and to the right, the diamond just stops.

Consider the example in the picture. The first diamond hits the ground and stops when halfway buried, with its center at (0, 0). The second diamond may slide either to the left or to the right with equal probability. Here, it happened to go left. It stops buried in the ground next to the first diamond, at (-2, 0). The third diamond will also hit the first one. Then it will either randomly slide to the right and stop in the ground, or slide to the left, and stop between and above the two already-placed diamonds. It again happened to go left, so it stopped at (-1, 1). The fourth diamond has no choice: it will slide right, and stop in the ground at (2, 0).


My Solution:

So I didn't reach the problem until after the contest had ended, but I was I had.  There are a couple of insights that make this problem much easier to solve than the third problem.  First, you may notice that what we are making is a pyramid, one layer at a time.  Each layer of the pyramid (centered at 0,0) is completed before the next layer can begin, because we need the peak of the pyramid in place before a diamond can slide out to the next tier.

So how many diamonds are in a tier?  Well, we can inspect the sequence to find a pattern.




So if we sum all those layers together, we get...

Where D is the number of diamonds, and L is the layer of interest.  So given that that previous layer must be complete before we start our own layer, we can subtract away everything from the layer below our own, and are left with N-D diamonds.

Now I'd reform the question as follows... how many of the remaining diamonds would have to fall on the opposite side from ours for the layer not to complete up to our location?  Well, our location at height Y means that there would have to be at least Y diamonds falling on our side for us to get a diamond.  That means that if only Y-1 diamonds fell to our side, N-D-(Y-1) diamonds must fall to the other side.  If this number is greater than the other side can hold, then there is a 100% chance that our location will get a diamond.  If N-D < Y-1, then there is a 0% chance that our location will get a diamond.

A quick google search tells me...

 "
The general formula for the probability of w wins out of n trials, where the probability of each win is p, is combin(n,w) × pw × (1-p)(n-w) = [n!/(w! × (n-w)!] × pw × (1-p)(n-w)"  


Since the probability p = 0.5 (its equally likely to fall left or right), and we want to consider all cases from Y to N-D, we can reformulate as follows...

So we've arrived at a formula for calculating the probability.  I haven't checked my numbers, since I didn't actually reach this part in competition, but I'm fairly certain that it will work.  Another optimization in implementing this summation is to recognize that as w gets larger, the added probability gets smaller.  We probably can stop the search early if the probability of the last iteration times the remaining iterations is less than the threshold google checks for... In cases where N-D is quite large relative to Y, this can be very useful.


Problem Garbled Email:

Gagan just got an email from her friend Jorge. The email contains important information, but unfortunately it was corrupted when it was sent: all of the spaces are missing, and after the removal of the spaces, some of the letters have been changed to other letters! All Gagan has now is a string S of lower-case characters.

You know that the email was originally made out of words from the dictionary described below. You also know the letters were changed after the spaces were removed, and that the difference between the indices of any two letter changes is not less than 5. So for example, the string "code jam" could have become "codejam", "dodejbm", "zodejan" or "cidejab", but not "kodezam" (because the distance between the indices of the "k" change and the "z" change is only 4).

What is the minimum number of letters that could have been changed?

My Solution:

This was the problem I actually attempted... and while I produced valid results for the small example test cases, it was far too slow to actually finish even the small dataset.  Funny, I work at an email company, and have a coworker named Jorge... I couldn't resist trying this problem instead of the second.  I'll present my ideas here, but would be interested in seeing what the correct approach is to this problem from someone else.  It reminded me of a problem I'd done on my blog a few months ago... a Morse Decoder... so I felt like I could reuse some of those ideas.  The wildcard matching needed seemed to make this problem much harder... so I went about setting up the dictionary in a tree, where each node of the tree contains a dictionary mapping the next character to the list of possible nodes, as well as a list of all sub nodes.  When inserting the dictionary into this tree, I broke it up into N trees, where each tree was of words of length n.  The tree class then had a function for returning the distance between itself and a given substring from the test case... that code looked like this:



Here, scoreSoFar is initialized to 0, and bestSoFar would be the length of the word (where we would have to change all letters to match something in the dictionary).  Then this code was used in a recursive search, where we'd map out all possible first word fits in the test case, and recursively search the remaining part, keeping the minimum score of that.  I tried caching as I went for reuse, but that didn't really help. This was just way too slow.  Interested in feedback on successful solutions.


So those are my thoughts on Round 1B of the code jam.  Last chance for me is 1C... if I can stay awake that late.

2013 Google Code Jam Round 1A

Well that was a bust.  I figured out how to do the first problem within the first 15 minutes, and spent the rest of my time screwing it up.  The problem is as follows...


Problem

Maria has been hired by the Ghastly Chemicals Junkies (GCJ) company to help them manufacture bullseyes. A bullseye consists of a number of concentric rings (rings that are centered at the same point), and it usually represents an archery target. GCJ is interested in manufacturing black-and-white bullseyes.



Maria starts with t millilitres of black paint, which she will use to draw rings of thickness 1cm (one centimetre). A ring of thickness 1cm is the space between two concentric circles whose radii differ by 1cm.
Maria draws the first black ring around a white circle of radius r cm. Then she repeats the following process for as long as she has enough paint to do so:
  1. Maria imagines a white ring of thickness 1cm around the last black ring.
  2. Then she draws a new black ring of thickness 1cm around that white ring.
Note that each "white ring" is simply the space between two black rings.
The area of a disk with radius 1cm is π cm2. One millilitre of paint is required to cover area π cm2. What is the maximum number of black rings that Maria can draw? Please note that:
  • Maria only draws complete rings. If the remaining paint is not enough to draw a complete black ring, she stops painting immediately.
  • There will always be enough paint to draw at least one black ring.

My Solution

The key here is writing a generalized expression relating area to number of rings...  Lets start with the area for the ith ring... this will be the area of the circle with a radius going to the outside of the black paint, minus the area of the paint just inside the black line...

Equation 1: Area of ring #i

Here, R is the radius just at the inside of the ith ring, and we know that the ring radius will be one centimeter larger.  Next, lets formalize the radius with respect to the ring number...

Equation 2: Incorporating formalized Ri
Now all we need to do is sum all N of our rings... which means summing from 0 to N-1, the way we formulated our answer...

Equation 3: Total Area of paint as a function of N rings
Equation 3 gives us a total area... and the problem specifies that 1 ml of paint covers an area of pi... we get a final answer as follows...
Equation 4: Quadratic equation for number of rings that can be painted
So this is where I made my big mistake... I have a quadratic equation that I can stuff into the Quadratic Formula... so that is what I did.  Convert everything to double precision and solve.  Then the number of rings would be the floor of the solution to this equation.  But double precision runs into rounding errors... I'd get answers that would be correct MOST of the time, and then fail for edge cases where rounding matters, and you can bet Google checks for rounding.

However, I was convinced that I had made an algebra mistake, and so I went through this math several times, trying to figure out if I had an off by one error.  I was left with only a few minutes to work through the rest of this problem, when it occurred to me that left in the final form in equation 4, I could use a binary search to finish the search in a way that wouldn't induce round-off errors.

Unfortunately, my first attempt at the binary search for the small set didn't pass, and at this point, the round ended.  I'm pretty much convinced that my approach was correct, but I got caught in the technical details of the solution, which killed me.  I'll be back next week, trying to get through to Round 2.  But I'll have to do a lot better than this.  None the less, Google Code Jam is always my favorite programming event of the year, and this is no exception.  Cheers, Google!

Google Code Jam Qualification Round

Today was the qualification round for the 2013 Google Code Jam... always my favorite programming contest.  I decided to make an attempt at the first three problems, and didn't attack the last problem, given the points I figured a depth first search was too naive to work.

The first problem was a 4x4 tic tac toe game, with wildcard squares.  The goal was just to determine if there was a winner, or if it was a draw, or if the game is not done yet.  I've done many chess games based on bitmasks, so I decided to use that approach here.  I created the masks by hand, they are easy enough to figure out.  The problem is then solved with 10 bitwise checks per side, and an additional draw check.


The second problem was about figuring out if a given lawnmower pattern could be achieved given certain rules... first, the lawnmower could only make straight line cuts perpendicular to one of the edges.  Second, the height of the blade could only be set at the start of an edge.  The lawnmower had to be driven all the way across the grass, at which point you could drive the lawnmower across from another edge starting point.  The key insight was that for any given patch of grass, the height had to be greater than or equal to the max height in the row, or the max height in the column.  Finding the max for each row and each column has to be done only once, and then once through each patch to do the check, leads to O(n*m) time.


The third problem was sneaky .. but I've been messing around with palindromes for fun with some coworkers, so we had figured out how to get them efficiently as a sequence.  The problem asks us to find all numeric palindromes which are also squares of numeric palindromes.  So for instance, 121 is a palindrome, which is 11^2, where 11 is also a palindrome.  The hard part about this problem is the interval bounds.  The small set is only squares less than a thousand, but the middle set could be all such numbers from 0 to 10^14, and the large set was numbers from 0 to 10^100.  I gave up on the large data set immediately, figuring that it would take some mathy trick to get it to pass.  The key insight for the other two data sets is first recognizing that instead of looking for numbers to be taking the square root of, you start with smaller palindromes, and check to see that the squares are also palindromes.  This means we really are only searching for numbers less than 10^7, which is much more tractable.  In fact, there are only 10999 palindromes less than 10^7, so I went with the approach of just finding all of the requested numbers in the full range up front.  Then, for each interval, just test how many in the list fall within the bounds.

I didn't attempt the last problem, but I did notice a couple of things that could have helped improve the performance of a solution.  This problem was described as having a set of locked chests, each which may contain some keys, and each chest can only be unlocked by certain classes of keys.  You start with at least one key, and need to check if you can unlock all the chests.

If you go with a depth first search approach, don't include the empty boxes in the initial search.  They are outside the critical path, and could be added back in to the critical sequence after the fact, maintaining the "lexigraphical order".  Second, at each node of the search, you should probably check if there are enough reachable keys left to unlock any given box... this is a more directed search, and could cut off much more quickly.  Third, it may be possible to identify if the graph of chests is fully connected graph up front, and immediately fail without searching for an unlock order if it isn't connected.  Other than that, I didn't take a crack at this one.

I hope everyone is enjoying the code jam so far, and am interested in hearing other approaches.

Google Code Jam Prep

  I love the Code Jam... I've done it every year since I first discovered it in grad school.  As a competitive person, I was always looking for an edge that could improve my scores.  With Google Code Jam right around the corner, I thought I'd share some of the tips I've picked up over the last several years.  Feel free to include them into your contest.

  • Optimize your work space for efficiency.
With the exception of the first round, you'll have a really limited time frame to work with, and you want to avoid as many disturbances as possible.  I like to close all the miscellaneous windows I generally keep open, save for some music and maybe a quick window for Google searches, in addition to my IDE and a window with the problem sets.  I have some water handy, and anything that could distract me during the contest hours conveniently taken care of.
  • Write your boilerplate before the contest.
All Google problems look exactly the same from an IO standpoint. You read some data from a file, perhaps multiple lines per problem.  You parse the data into their respective data types, do some real work on the data, and then output to a file a line starting with a leading number, followed by whatever answer you are going to give.  Typically, they have three problems for you to work on, so I have a project set up with all three problem classes already set up to do some generic parsing of a simple problem, and generic output of whatever you are going to solve.  With three hours available for critical thinking, these presious minutes can be the difference between a working solution and an almost finished solution.
  • Have a resource file easily available to paste in the free test data they give you
Google problems all give you some example inputs and outputs as a starting point for what your program should be able to do.  In the boilerplate code, I like having an empty text file pre-wired to be parsed as the input file of choice, which I can just paste the examples in and run against my program.  I also have a pre-wired text file for the output, already open, which automatically refreshes when I run my program.  It gives me instant feedback as I am writing my algorithms.
  • Read all the problems before you start.
This is more of a generic test taking strategy, but it applies well here.  Google assigns different points to different problems.  Typically in the first round, a score large enough to get you on to the next round doesn't have to be a perfect score... but it also typically isn't the score of the easiest problem either.  Also, I feel that reading all the problems before starting gives your mind a chance to process a bit before you jump into coding.
  • Have a plan of attack
Unless there is a very obvious optimal type answer, you'll need to think about how you are going to attack a problem before you jump into it.  The big points require optimized algorithms.  O(n^2) is the quickest way to not having a working answer.  Diagram out on a piece of paper the implications of what you want to achieve.
  • Know where to find your file I/O.
Once you download an input file for processing, you typically have around 5 minutes to upload your program's results.  The more time you are fumbling around looking for where your input file downloaded to, or wiring up your program to use it... and again finding the file your program outputs for upload, the less time your program can be crunching on the answer.  Sometimes, sub-optimal algorithms can still pass the large dataset... but usually with very little time to spare.  Seconds count here, and having done a few practice runs against Google's practice problems will help iron out the wrinkles.
  • Fuzzy test your big data sets before submitting.
This will be a new one for me this year... but it is something that I tried in the Facebook Hacker Cup.  Before attempting the large dataset, you need to know the performance pain points of your algorithm.  The best solution I've come up with is to Fuzzy Test your algorithm against the boundaries supplied in the question.  Making sure your fuzzy testing module tries various combinations on the boundaries especially, because this is where the tricks occur.


Any additional suggestions you may have about how to improve your Google Code Jam experience, I'd love to hear.  Good luck in the 2013 Code Jam!

Google Code Jam 2013


Google Code Jam is right around the corner!  registration will open on March 12, 2013... this is always my favorite programming contest each year.  I'll be trying to get a group of coworkers to participate in the action.  Last year, I reached the second round, and actually had one of the better scores in round 1C, which was the best round of my life.  Hope to see you there!