- Clear the boilerplate that `mix` generated
- Consume `Math.factor` to test which inputs are semiprimes
- Cache all inputs that are semiprimes as soon as we discover that they are
- semiprimes
I considered a couple things related to the Cache:
- Could save space by storing all semiprime factors in a tree. This would make
the lookups more expensive. Also because the tree's depth would never exceed
two (because all semiprimes only have two factors), the tree would be quite
broad, and we may not be saving enough space for the trade to be worthwhile. I
might be wrong about that though.
- We could consider pre-computing all semiprimes when we start the app, but
without running some tests firsts, I'm not sure whether or not it's worth the
trouble.
Define a simple in-memory key-value store for our cache.
TL;DR:
- Define `Cache` as a simple state-keeping `Agent`
- Define `Sup`, which starts our Cache process
- Define `App`, which starts our Supervisor process
- Whitelist `App` in `mix.exs`, so that it starts after calling `iex -S mix`
In case I want to package this project with Nix. For now, it's useful to store
this at the project root because it help my Emacs's `project-find-file`
function.
I've subtly been implementing breadth-first traversals in graphs
incorrectly. The change is subtle, but updating `seen` needs to happen
immediately after queuing an item.
The results will remain the same, but the runtimes will differ dramatically. I
didn't notice this until I attempted to complete LeetCode's "count islands"
challenge, and LeetCode rejected my solution because it could not finish before
timing out. After looking at other candidates' solutions and comparing them to
mine, I couldn't see any difference... except for this subtle difference.
This SO answer provides a helpful explanation:
https://stackoverflow.com/questions/45623722/marking-node-as-visited-on-bfs-when-dequeuing
The take-away lesson here is to always call `seen.add(..)` immediately after
enqueuing.
The space cost is O(n). The runtime cost of enqueue is O(1); the runtime cost of
dequeue is O(n). Using the "accounting method", the cost of an item in the
system is O(1). Here's why:
+------------+----------------------------+------+
| enqueue | push onto lhs | O(1) |
+------------+----------------------------+------+
| lhs -> rhs | pop off lhs; push onto rhs | O(1) |
+------------+----------------------------+------+
| dequeue | pop off rhs | O(1) |
+------------+----------------------------+------+
The bottom-up solution run in O(n) time instead of O(2^n) time, which the
recursive solution runs as:
```
def fib(n):
return fib(n - 2) + fib(n - 1)
```
Remember that exponential algorithms are usually recursive algorithms with
multiple sibling calls to itself.
Write a predicate for checking if a linked-list contains a cycle. For additional
practice, I also implemented a function that accepts a linked-list containing a
cycle and returns the first element of that cycle.
I believe the previous solution is invalid. This solution works and it should be
more time and space efficient.
Space-wise our stack grows proportionate to the depth of our tree, which for a
"balanced" BST should be log(n). Doing a BFT on a BST results in memory usage of
n because when we encounter the leaf nodes at the final level in the tree, they
will be 1/2 * n for a balanced BST.
InterviewCake.com has a section on Facebook's interview, so I'm attempting to
solve all of the problems on there even if that means I'm resolving
problems. The more practice, the better. Right?
URL: interviewcake.com/facebook-interview-questions
This is the mother of dynamic programming algorithms in my opinion. It computes
the minimal "edit distance" between two input strings where an edit is
considered one of:
- inserting a character into `a`
- deleting a character from `a`
- substituting a character in `a` with a character from `b`
It took me awhile to grok the algorithm, but I implemented this from my
understanding of something that I read ~3 nights prior, so I must've understood
what I read. Good news!
This morning, I attended the "Interview Club" and was asked this question by the
interviewer in front of ~20 FTEs. While I struggled to fully solve it during the
abridged (i.e. 20 minute) timeslot, I completed the problem afterwards.
Here is my solution.
Create a suffix tree from an input string. This implementation uses a stack to
control the flow of the program.
I expected this attempt to be easier than my first attempt, but surprisingly, it
was similarly difficult. It took me ~30-45 minutes to successfully implement
this function, and I'm still not pleased with the final result.
While it took me awhile to implement, this exercise was definitely worth
doing. I think there should be a more elegant way to construct the tree using
maybe a stack, but I couldn't find it.
All of this was part of a larger effort to search a string for a variety of
patterns. The solution is to compile the string into a suffix tree and then
search the suffix tree for each of the patterns.
I'm glad I didn't gloss over this exercise.
Tonight I learned that random sample where each element in the sampling corpus
has an equal likelihood of being chosen is a brand of algorithms known as
"reservoir sampling".
- Implement random.shuffle(..)
- Implement random.choice(..)
Surprisingly, candidates are expected to encounter problems like this during
interviews.
Given an input like "gello" suggest an correction like "hello".
This is a proof-of-concept problem for writing a simplistic auto-correction
algorithm for a mobile device.
This algorithm is pretty interesting because it runs in linear time with respect
to the length of the `corpus` string. It does this by using a sliding window
hash. This hash -- because it's a sliding window -- runs in constant time for
each iteration; we're only adding and subtracting one character each time and
not re-hashing the whole "window".
When our hashes match, only then do we compare the "window" to the
`pattern`. String comparisons are linear because they compare each character to
each character one at a time. But because we only compare strings when are
hashes match (a check which runs in constant time), this spares us the
performance hit.
Firstly, implement a function that adds two arguments together... without using
the `+` operator. I need to drill this problem. Thankfully I took a Coursera
course that taught me how to make a half-adder and a full-adder, but the
recommended solution for this is a bit more difficult.
I was always curious how hashing functions were implemented, so I read about the
"polynomial rolling hash function", and I decided implementing it would be a
good exercise. After writing that, writing a hash table was simple.
Write a function to modify an array of integers in-place such that all of the
zeroes in the array are at the end, and the order of the other integers is not
changed.
This solution operates in O(n) time instead of O(n*log(n)) time, which
surprisingly isn't *that* big of a difference...
Consider a size of n of 10M...
1) ~10s
2) ~0.5s
So, yes, the O(n*log(n)) will take 100x longer to complete, but for an enormous
input size of 10M elements, it can still complete in under a minute. The
difference between that and the second, faster, algorithm, is just 9s.
Write a function that reads a string of compressed XML and outputs the
decompressed version.
Note to self: Now that I'm growing more comfortable writing parsers, I'd like to
become equally comfortable writing pretty-printers.