Last year I made a post about Attaca.
In this post I want to talk a little about my current project, Attaca.
This summer, I was lucky enough to be accepted to work with Liquid Haskell as part of the Summer of Haskell project! This blog post is an explanation of one of the more interesting puzzles involved, as well as what I learned about Liquid Haskell, its strengths, and its limitations.
I’m currently working on Attaca, a fast, distributed, and resilient version control system for extremely large files and repositories, designed for scientists working on projects containing terabytes or petabytes of data. (Warning: not currently in any sort of working condition as of 8/29/17!) There are three key components to this design. The first is the git data structure, with one special modification. The second is a distributed hash table (DHT), which stores the objects of that graph data structure. The third is a technique called “hashsplitting” or “hash chunking”. Resilience is handled by the distributed object storage system, and version control itself by the git-like data structure.
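To give a rough feel for how hashsplitting and content-addressed storage fit together, here is a minimal Python sketch. It is not Attaca's actual code: the rolling sum, the `WINDOW` and `MASK` values, the function names `split_chunks` and `store`, and the in-memory dict standing in for the DHT are all assumptions made for illustration. The idea is that chunk boundaries are chosen from the bytes themselves rather than at fixed offsets, and each chunk is stored under the hash of its contents.

```python
import hashlib
from typing import Dict, List

WINDOW = 16          # size of the rolling window in bytes (assumed value)
MASK = (1 << 6) - 1  # boundary test on the low 6 bits (assumed value)


def split_chunks(data: bytes, window: int = WINDOW, mask: int = MASK) -> List[bytes]:
    """Split data at positions picked by a rolling sum over the last `window` bytes."""
    chunks = []
    start = 0    # index where the current chunk begins
    rolling = 0  # sum of the bytes currently inside the window
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= window:
            # Drop the byte that just slid out of the window.
            rolling -= data[i - window]
        # Cut only once the chunk holds a full window, and only when
        # the low bits of the rolling sum are all ones.
        if i - start + 1 >= window and (rolling & mask) == mask:
            chunks.append(data[start:i + 1])
            start = i + 1
            rolling = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes form the last chunk
    return chunks


def store(data: bytes, objects: Dict[str, bytes]) -> List[str]:
    """Store each chunk under its SHA-256 digest; the dict stands in for a DHT."""
    keys = []
    for chunk in split_chunks(data):
        key = hashlib.sha256(chunk).hexdigest()
        objects[key] = chunk  # identical chunks land on the same key
        keys.append(key)
    return keys
```

Calling `store` twice on the same bytes returns the same list of keys and adds nothing new to `objects`, which is the deduplication property that makes this approach attractive for very large files. A real system would use a stronger rolling hash and much larger chunk sizes than this toy does.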
(N.B. The word “fuck” appears multiple times in this post. I recommend that the reader temporarily not consider “fuck” as profanity, as it isn’t used that way here.)