The Rsync Algorithm in Python
Tyler Cipriani Posted

I’ve often pondered this great and terrible beast – rsync. Its spiny, nearly impenetrable command-line interface. Its majestic and wonderful efficiency. The depths of its man page, and the heights of its use-cases.

Leaving aside the confusing implications of trailing slashes, rsync is amazing. The Wikimedia deployment tooling – scap (which at this point has been iterated on for over a decade) – still makes heavy use of rsync. At $DAYJOB - 3, rsync is used to manage a library of hundreds of thousands of flac, mp3, and ogg files. It’s hard to argue with rsync. How little network traffic rsync generates is really hard to beat with any other program.

But what’s it doing?

rsync is fast. rsync is ubiquitous. rsync uses few client resources, and little network IO. OK…Why?

I started reading about the rsync algorithm when a fellow I work alongside began espousing the relative superiority of zsync for the case of our deployment server. Currently scap has a built-in (and quite fancy) fan-out system so as not to put too high of a load on only 1 server; however, zsync flips the rsync algorithm on its head, running the rsync algorithm on the client rather than the server. What exactly is rsync doing that makes the load on the server so high?

The Meat

For the purposes of explanation, let’s say you ran the command: rsync α β.

The rsync algorithm boils down to 5 steps:

  1. Split file β into chunks of length n.
  2. Calculate a weak (adler32) and strong (md4) checksum for each chunk of file β.
  3. Send those checksums to the rsync server (where file α is)
  4. Find all the chunks of length n in α that are in β by comparing checksums
  5. Create a list of instructions to recreate α from β

Easy.

Do it then

I actually think it would have been easier for me to understand a bad python implementation of the rsync algorithm, than to read a tech report on rsync. So with that in mind, here’s a bad python implementation of the rsync algorithm.

Pythonic!

First it might be helpful to define my block size, and create a couple of helper functions to create the rolling checksums.

import collections
import hashlib
import zlib


BLOCK_SIZE = 4096


# Helper functions
# ----------------
def md5_chunk(chunk):
    """
    Returns md5 checksum for chunk
    """
    m = hashlib.md5()
    m.update(chunk)
    return m.hexdigest()


def adler32_chunk(chunk):
    """
    Returns adler32 checksum for chunk
    """
    return zlib.adler32(chunk)

I’ll also need a function that creates a rolling checksum of a file. The checksums_file function will read in BLOCK_SIZE bytes through to the end of the file, calculate both the adler32 checksum and the md5 checksum for those chunks, and then put those checksums in a data structure.

I’d like a nice interface beyond primitives for both the signatures and the list of checksums – I’ll create 2 objects Signature and Chunks to make that interface. Chunks is basically a list of Signatures with a few other methods for fanciness.

# Checksum objects
# ----------------
Signature = collections.namedtuple('Signature', 'md5 adler32')


class Chunks(object):
    """
    Data structure that holds rolling checksums for file B
    """
    def __init__(self):
        self.chunks = []
        self.chunk_sigs = {}

    def append(self, sig):
        self.chunks.append(sig)
        self.chunk_sigs.setdefault(sig.adler32, {})
        self.chunk_sigs[sig.adler32][sig.md5] = len(self.chunks) - 1

    def get_chunk(self, chunk):
        adler32 = self.chunk_sigs.get(adler32_chunk(chunk))

        if adler32:
            return adler32.get(md5_chunk(chunk))

        return None

    def __getitem__(self, idx):
        return self.chunks[idx]

    def __len__(self):
        return len(self.chunks)


# Build Chunks from a file
# ------------------------
def checksums_file(fn):
    """
    Returns object with checksums of file
    """
    chunks = Chunks()
    with open(fn, 'rb') as f:  # binary mode: the checksums operate on bytes
        while True:
            chunk = f.read(BLOCK_SIZE)
            if not chunk:
                break

            chunks.append(
                Signature(
                    adler32=adler32_chunk(chunk),
                    md5=md5_chunk(chunk)
                )
            )

        return chunks
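
The reason Chunks keeps a dictionary keyed on adler32 first is that the weak checksum is cheap, so the expensive md5 only has to run when the weak checksum already matches. Here is a minimal in-memory sketch of that two-level lookup. The names build_table and lookup are mine, not part of the module above, and the tiny block size is only there to keep the example readable.

```python
import hashlib
import zlib

BLOCK_SIZE = 4  # tiny on purpose, so the example is easy to follow


def build_table(data):
    """Map weak checksum -> {strong checksum -> chunk index}."""
    table = {}
    for index, start in enumerate(range(0, len(data), BLOCK_SIZE)):
        chunk = data[start:start + BLOCK_SIZE]
        weak = zlib.adler32(chunk)
        strong = hashlib.md5(chunk).hexdigest()
        table.setdefault(weak, {})[strong] = index
    return table


def lookup(table, chunk):
    """Return the chunk index, or None. md5 only runs after a weak hit."""
    candidates = table.get(zlib.adler32(chunk))
    if candidates is None:
        return None
    return candidates.get(hashlib.md5(chunk).hexdigest())


table = build_table(b"aaaabbbbcccc")
print(lookup(table, b"bbbb"))  # 1
print(lookup(table, b"zzzz"))  # None
```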

Now I need a couple of methods to complete the algorithm – one that will find the BLOCK_SIZE chunks in file β that are in file α, and one that will produce instructions that can be used to assemble the new and improved β from the β we’ve already got.

The _get_block_list function will return a list of chunk indices and bytes. The chunk indices are indices of chunks already present in file β (we know from the checksums_file function), the bytes are raw bytes that are in α but may not be in β. If a chunk is found in α that is not in β then the first byte of that chunk is appended to the output list and the checksums are calculated again for the BLOCK_SIZE chunk that starts one byte later.

This is why network IO for rsync is so efficient – the only raw data that is sent is the information missing from the remote. This is also why rsync causes higher load on the server than the client – it’s not just checksumming files, it’s checksumming, comparing, and building a diff. And it’s doing that process for every machine to which it is attempting to sync.

def _get_block_list(file_one, file_two):
    """
    The good stuff.

    1. create rolling checksums file_two
    2. for each chunk in file one, determine if chunk is already in file_two
        a. If so:
            i. return the index of that chunk
            ii. move the read head by the size of a chunk
        b. If not:
            i. return the next byte
            ii. move the read head by 1 byte
    3. start over at 2 until you're out of file to read
    """
    checksums = checksums_file(file_two)
    blocks = []
    offset = 0
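    with open(file_one, 'rb') as f:
        while True:
            chunk = f.read(BLOCK_SIZE)
            if not chunk:
                break

            chunk_number = checksums.get_chunk(chunk)

            if chunk_number is not None:
                # Matched: the read head has already advanced past the chunk
                offset += len(chunk)
                blocks.append(chunk_number)
            else:
                # No match: emit one literal byte and slide forward by 1.
                # chunk[:1] stays a byte string, so it can never be mistaken
                # for an integer chunk index.
                offset += 1
                blocks.append(chunk[:1])
                f.seek(offset)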

    return blocks
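
One thing worth noting: the loop above recomputes adler32 from scratch every time the window slides by a byte. The "rolling" part of a rolling checksum is that it can be updated in constant time from the previous value. The sketch below uses the a/b sums described in the rsync tech report with a modulus of 2^16; it is not the same function as zlib.adler32, and the names weak_checksum and roll are hypothetical.

```python
MOD = 1 << 16


def weak_checksum(block):
    """Compute the two running sums for a whole block from scratch."""
    a = sum(block) % MOD
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % MOD
    return a, b


def roll(a, b, out_byte, in_byte, size):
    """Slide the window one byte: drop out_byte, take in in_byte."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - size * out_byte + a) % MOD
    return a, b


data = b"the quick brown fox jumps over the lazy dog"
size = 8
a, b = weak_checksum(data[:size])
for start in range(1, len(data) - size + 1):
    a, b = roll(a, b, data[start - 1], data[start + size - 1], size)
    # The rolled value always agrees with a full recomputation
    assert (a, b) == weak_checksum(data[start:start + size])
```

Each slide costs a handful of arithmetic operations instead of a pass over BLOCK_SIZE bytes, which matters when most offsets do not match.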

The poorly named file function (but it’s in the rsync.py module, so rsync.file is good…right? No? OK.) takes the list of chunk indices and raw bytes from _get_block_list, finds the chunks in β referenced by the index, combines those chunks with the raw bytes from α and returns a string that is the same as file α – it just took a weird route to get there :)

def file(file_one, file_two):
    """
    Essentially this returns file one, but in a fancy way :)

    The output from _get_block_list is a list of either chunk indexes or raw
    data.

    If it's a chunk index, then read that chunk from the file and append it to
    output. If it's not a chunk index, then it's actual data and should just be
    appended to output directly.
    """
    output = b''
    with open(file_two, 'rb') as ft:
        for block in _get_block_list(file_one, file_two):
            if isinstance(block, int):
                ft.seek(block * BLOCK_SIZE)
                output += ft.read(BLOCK_SIZE)
            else:
                output += block

    return output
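
To see the whole round trip without touching the filesystem, here is a compact in-memory sketch of the same idea. It is a simplification under my own assumptions: make_delta and apply_delta are hypothetical names, the block size is tiny, and the weak and strong checksums are combined into a single dictionary key rather than checked in two stages.

```python
import hashlib
import zlib

BLOCK_SIZE = 4


def signatures(old):
    """Map (adler32, md5) of each BLOCK_SIZE chunk of `old` to its index."""
    sigs = {}
    for index, start in enumerate(range(0, len(old), BLOCK_SIZE)):
        chunk = old[start:start + BLOCK_SIZE]
        sigs[(zlib.adler32(chunk), hashlib.md5(chunk).digest())] = index
    return sigs


def make_delta(new, old):
    """Return a list of chunk indexes into `old` and literal bytes."""
    sigs = signatures(old)
    delta, offset = [], 0
    while offset < len(new):
        chunk = new[offset:offset + BLOCK_SIZE]
        index = sigs.get((zlib.adler32(chunk), hashlib.md5(chunk).digest()))
        if index is not None:
            delta.append(index)
            offset += len(chunk)
        else:
            delta.append(new[offset:offset + 1])
            offset += 1
    return delta


def apply_delta(delta, old):
    """Rebuild `new` from `old` plus the delta instructions."""
    out = b""
    for item in delta:
        if isinstance(item, int):
            out += old[item * BLOCK_SIZE:(item + 1) * BLOCK_SIZE]
        else:
            out += item
    return out


old = b"aaaabbbbcccc"
new = b"aaaaXbbbbcccc"
delta = make_delta(new, old)
print(delta)  # [0, b'X', 1, 2]
assert apply_delta(delta, old) == new
```

Only the single byte X travels as raw data; everything else is a reference to a chunk the receiver already has.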

Creating a python file that imports this script as a module, and invokes file is all you need to actually run it. I wrote a bunch of tests to help me write the script. The core of the test file was simply:

import rsync

if __name__ == '__main__':
    rsync.file('fixtures/foo.txt', 'fixtures/bar.txt')

And at the end of our python-ing, we came to see the rsync beast in a new light – not a beast at all, just a misunderstood algorithm with a lot of command line flags. The End.

Literate Vimrc
Tyler Cipriani Posted

Literate ~/.vimrc

This blog post is my ~/.vimrc now.

Hold on, let me explain.

Why would anyone do a thing like this?

I have tons of comments in my ~/.vimrc file. I try to leave a lot of comments because everything in my Vim configuration is hard-won knowledge. My ~/.vimrc is accumulated knowledge from the day I first opened the editor and couldn’t figure out how to leave until now, many years later, when I open my editor and don’t have any reason to leave. I have tweaked, cajoled, and tamed this beast called Vim. I think a lot of folks have configuration kind of like that – it’s knowledge that is locked-away, not privately or purposefully, just badly-documented and ubiquitous. This is a problem.

So I made a thing.

This is an experiment that seeks to answer: can my ~/.vimrc be a page on my website?

Literate programming

Literate programming is the idea that your code should work as a narrative as well as a functional piece of software. This is a practice that was popularized by Donald Knuth and was utilized in writing the source code for TeX.

There are times when literate programming may not be appropriate; however, in this instance – the case of a .vimrc – I feel like literate programming will be helpful to remind me what a setting is and why I applied it. This also may mean that my accumulated Vim knowledge (however limited it may be) is more generally accessible than if it were just a plaintext document sitting in my dotfiles.

Literate Vimscript

My resolve to make a literate ~/.vimrc led to some searching on DuckDuckGo and a slow, settling realization that I was going to have to write the plugin that does this. Keeping your Emacs configuration as a literate document written in org-mode is pretty common practice at this point, so I made the faulty assumption that a Vim configuration written in Markdown would be a common aspiration as well.

I was wrong and thus was the genesis of LiterateVimrc.

“Literate” Vimrc is more than a bit of a misnomer. It, unlike Donald Knuth’s vision of Literate Programming, does not support a loosely coupled web of macros – instead it just lets you write a file in Markdown that has codeblocks (as defined by v0.27 of the CommonMark spec) containing Vimscript as your ~/.vimrc file.
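
The plugin itself is Vimscript, but the core idea (keep only what sits between code fences, throw away the prose) is easy to sketch in Python. This is my own rough illustration, not the plugin's implementation: extract_code is a hypothetical name, and the fence handling covers only the basic CommonMark rules (a run of three or more backticks or tildes, closed by a run of the same character that is at least as long).

```python
import re

# An opening or closing fence: up to 3 spaces, then 3+ backticks or tildes
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def extract_code(markdown):
    """Return the lines found inside fenced code blocks, joined by newlines."""
    code = []
    fence = None  # the fence string that opened the current block, if any
    for line in markdown.splitlines():
        match = FENCE.match(line)
        run = match.group(1) if match else None
        if fence is None:
            if run:
                fence = run  # entering a code block
        elif (run and run[0] == fence[0] and len(run) >= len(fence)
              and line.strip() == run):
            fence = None  # leaving the code block
        else:
            code.append(line)
    return "\n".join(code)


doc = "\n".join([
    "Some prose about compatible mode.",
    "",
    "~~~",
    "set nocompatible",
    "~~~",
    "",
    "More prose.",
    "",
    "   ```vim",
    "set number",
    "   ```",
])
print(extract_code(doc))
```

Running it prints the two Vimscript lines and none of the prose.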

Install

To install LiterateVimrc copy the autoload/literatevimrc.vim file into your ~/.vim/autoload directory, move your current ~/.vimrc to ~/.vimrc.md and add ~~~ to the first and last lines of that file (to create a fenced codeblock). Inside your (now empty) ~/.vimrc add the line execute literatevim#load("~/.vimrc.md") and everything should remain the same.

Now you are free to add text explaining your configuration and divide your configuration into blocks. This blog post is currently acting as my ~/.vimrc file, as of this writing. This may be a bad idea. I do expect to update it somewhat regularly, but this frontmatter should remain the same.

My ~/.vimrc

This begins the content of my ~/.vimrc file. It includes various explanations and links that may only make sense to me. Here be dragons. You have been warned.

The first lines that should be in every ~/.vimrc – disabling arrow keys – because it’s not enough to use vim, you’ve got to live it.

noremap <up> <nop>
inoremap <up> <nop>

noremap <down> <nop>
inoremap <down> <nop>

noremap <left> <nop>
noremap <right> <nop>
inoremap <left> <nop>
inoremap <right> <nop>
" B A start

What is this ‘Vee-Eye’ of which you speak?

Vim’s compatible mode means lots of plugins won’t work. It evidently means that you can’t use \ in vim scripts to break a command across a few lines. nocompatible is also necessary for Vundle, which is my vim package manager of choice.

set nocompatible

Plugins

All of my plugins. To install all of my plugins on a brand new machine I can run vim +PluginInstall +qall, which is pretty neat.

filetype off " required
set rtp+=~/.vim/bundle/Vundle.vim
call vundle#begin()

" let Vundle manage Vundle, required
Plugin 'gmarik/Vundle.vim'

" Other plugins
Plugin 'vim-pandoc/vim-pandoc-syntax'
Plugin 'tpope/vim-surround'
Plugin 'groenewege/vim-less'
Plugin 'kchmck/vim-coffee-script'
Plugin 'kien/ctrlp.vim'
Plugin 'Lokaltog/vim-easymotion'
Plugin 'jistr/vim-nerdtree-tabs'
Plugin 'scrooloose/syntastic'
Plugin 'godlygeek/tabular'
Plugin 'majutsushi/tagbar'
Plugin 'goldfeld/vim-seek'
Plugin 'altercation/vim-colors-solarized'
Plugin 'joonty/vdebug'
Plugin 'vim-airline/vim-airline'
" BOOO!
Plugin 'vim-airline/vim-airline-themes'
Plugin 'scrooloose/nerdtree'
Plugin 'Shougo/neocomplete'
Plugin 'Shougo/neosnippet'
Plugin 'juvenn/mustache'
Plugin 'nathanaelkane/vim-indent-guides'
Plugin 'tpope/vim-fugitive'
Plugin 'tpope/vim-rsi'
Plugin 'dag/vim-fish'
Plugin 'rodjek/vim-puppet'
Plugin 'ClockworkNet/vim-vcl'
Plugin 'airblade/vim-gitgutter'
Plugin 'fatih/vim-go'


call vundle#end()            " required
filetype plugin indent on    " required

A Random solution to a random problem

This disables Background Color Erase (BCE) when the $TERM env var contains the text 256color. The problem I was seeing is summarized on Suraj N. Kurapati’s blog post.

This may, in fact, not be necessary. At least according to reddit.

if &term =~ '256color'
   set t_ut=
endif

Basic options

let mapleader=','
set t_Co=256              " My terminal's got all those colors, yo

set title                 " Change the terminal title
set encoding=utf-8        " Show utf-8 chars
set showcmd               " count highlighted
set ruler                 " Show where I am in the command area
set showmode              " -- INSERT (appreciation)-- :)
set laststatus=2          " always show the status line
                          " ↪ (0 = never, 1 = default [multi-window only])
set mouse=a               " Use the mouse

set modelines=0           " Don't read first/last lines of file for settings
set hidden                " Stash unwritten files in buffer
set vb                    " Don't beep at me
set cursorline            " Highlight current line
set scrolloff=3           " Start scrolling when I'm 3 lines from top/bottom
set history=1000          " Remember commands and search history
set backspace=2           " Backspace over indent, eol, and insert
set mousehide             " Hide the mouse pointer while typing

set number                " Show linenumbers
set nowrap                " Turn off linewrap
set list                  " Show invisible chars
set tabstop=4             " 4 spaces
set shiftwidth=4          " 4 spaces
set softtabstop=4         " 4 spaces
set expandtab             " Expand tabs to spaces

set hlsearch              " highlight my search
set incsearch             " incremental search
set wrapscan              " Set the search scan to wrap around the file

set ignorecase            " when searching
set smartcase             " …unless I use an uppercase character

Syntax

I have actually been considering what it would be like to work without syntax highlighting. I may be too far gone to start trying, but it’d certainly make me more attuned to various code problems rather than code colors. In any event, with some of the huge files I view, the syntax sync minlines option and synmaxcol=2048 are lifesavers.

syntax on                 " Syntax highlighting
syntax sync minlines=256  " Makes big files slow
set synmaxcol=2048        " Also long lines are slow
set autoindent            " try your darndest to keep my indentation
set smartindent           " Be smarter about indenting dummy

Format options

c - Auto-wrap comments using textwidth, inserting the current comment leader automatically.

o - Automatically insert the current comment leader after hitting ‘o’ or ‘O’ in Normal mode.

t - Auto-wrap text using textwidth

q - Allow formatting of comments with “gq”.

r - Automatically insert the current comment leader after hitting <Enter> in Insert mode.

I have a different setting for format options when emailing in mutt. There I use set formatoptions=aw which works for format-flowed emails.

set formatoptions=cotqr  " I like smart comments

Folding

At some point in the past I used the marker method for folding (which was opened with {{{ and closed by }}}). I found this a nice way to divide up things like my ~/.vimrc; however, what’s even nicer is folding on indent. It makes reading yaml files super easy. I read lots of very (VERY) large yaml files. I suppose I could do this with an autocmd, but I set it globally instead ¯\_(ツ)_/¯.

" set foldmethod=marker     " Fold on 3x{
set foldmethod=indent
set nofoldenable          " But turn it off initially

Fishshell

I needed this for running some commands in Vim inside fishshell, which I used for a time and then abandoned. I think that fishshell is really great (and it’s an actual programming language); the problem with it is two-fold:

  1. Nobody wants to push fishshell on a server – which I understand
  2. Using it requires a million little hacks like the one below

This is the flip side of fixing all the dumb decisions of the bourne shell – everyone else has already worked around them (obligatory xkcd).

" set shell=/bin/bash\ --login
" set shell=bash            " Needed if using fishshell

Backup and Swap files

I hate all the little backup and swap files all over the place. There has been 1 time when these would have come in handy in the past decade as far as I can remember.

set nobackup
set nowritebackup
set noswapfile

Colorschemes

" Colorscheme
" https://github.com/altercation/vim-colors-solarized
" colorscheme solarized
" let darkcolorbg='#839496'

set background=dark
let darkcolorbg=234
colorscheme Tomorrow-Night

" GUI Font (same as my gnome-terminal font)
" https://github.com/adobe/source-code-pro
set guifont=Source\ Code\ Pro\ 14

" Use the same symbols as TextMate for tabstops and EOLs
set listchars=tab:▸\ ,eol:¬\,trail:·

Vim 7.3 new features

Maybe I should update the stuff that hedges on a machine not having Vim 7.3 since version 8 is out now. This is a project for later.

if v:version >= 703
  set colorcolumn=75
  hi ColorColumn ctermbg=234
  set undodir=~/.vim-undo
  set undofile
  set undolevels=1000 "max number of changes that can be undone
  set undoreload=10000 "max number lines to save for undo on buffer reload

  " Toggle line numbers in normal mode, set by default
  set number relativenumber
  function! NumberToggle()
    if(&relativenumber == 1)
      set number norelativenumber
    else
      set number relativenumber
    endif
  endfunc

  nnoremap <leader>n :call NumberToggle()<cr>
endif
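
A note on the number in that check: v:version packs the version as major times 100 plus minor, so 703 means Vim 7.3. A tiny Python sketch of the decoding, where vim_version is just a name I made up for illustration:

```python
def vim_version(v_version):
    """Decode Vim's v:version integer (major * 100 + minor)."""
    major, minor = divmod(v_version, 100)
    return "{}.{}".format(major, minor)


print(vim_version(703))  # 7.3
print(vim_version(800))  # 8.0
```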

OSX

I don’t really know why I still have OSX hacks in all my dotfiles. Maybe someday I’ll be forced to jump off the Linux ship. I assume I’ll be jumping to a BSD or something at that point though.

if has("unix")
    let s:uname = system("uname")
    if s:uname == "Darwin\n"
        set clipboard=unnamed
    endif
endif

Very Magic

I’m not sure if I should turn this off. Half the time I think it’s great, the other half it’s a pain. I guess leaving it on for now seems fine.

nnoremap / /\v

Mappings

" Vimrc editing
nnoremap <silent><leader>ev :vsplit $MYVIMRC<cr>
nnoremap <silent><leader>sv :source $MYVIMRC<cr>

" un-highlight search results
nnoremap <silent><leader><space> :noh<cr>

" Toggle auto-indent before clipboard paste
set pastetoggle=<leader>p

" Shortcut to rapidly toggle `set list`
nnoremap <silent><leader>l :set list!<cr>

" Normal/Visual tab for bracket pairs
nnoremap <tab> %
vnoremap <tab> %

"Opens a vertical split and switches over (,v)
nnoremap <leader>v <C-w>v<C-w>l

"Moves around split windows
nnoremap <leader>w <C-w><C-w>

"Close a window
nnoremap <silent><leader>q :q<cr>

" Close buffer
noremap <silent><leader>d :bd<cr>

" Buffer previous
noremap <silent><leader>z :bp<CR>

" Buffer next
noremap <silent><leader>x :bn<CR>

nnoremap <S-Tab> gT
nnoremap <silent> <S-t> :tabnew %<CR>

" Set working directory
nnoremap <leader>. :lcd %:p:h<CR>

" Vmap for maintain Visual Mode after shifting > and <, prevents the use of
" '.' to repeat
" vmap < <gv
" vmap > >gv

" Better use of folding
" nnoremap <leader>z za

Status Line

I leave this status line here more as a reference than anything. Currently (as can be seen in the Plugins section) I’m using vim-airline for my statusbar.

augroup ft_statuslinecolor
    au!
    au InsertEnter * hi StatusLine ctermfg=196 guifg=#FF3145
    au InsertLeave * hi StatusLine ctermfg=130 guifg=#CD5907
augroup END
set statusline=%f    " Path.
set statusline+=%m   " Modified flag.
set statusline+=%r   " Readonly flag.
set statusline+=%w   " Preview window flag.
set statusline+=\    " Space.
set statusline+=%=   " Right align.
" Line and column position and counts.
set statusline+=\ %l\/%L\ \/\/\ %03c)

Autocmds

Don’t expand tabs in Makefiles, php files, or go files

autocmd FileType make setlocal noexpandtab
autocmd Filetype php setlocal noexpandtab
autocmd Filetype go setlocal noexpandtab

Highlight any trailing whitespace in red

highlight ExtraWhitespace ctermbg=red guibg=red
match ExtraWhitespace /\s\+$/
autocmd BufWinEnter * match ExtraWhitespace /\s\+$/
autocmd InsertEnter * match ExtraWhitespace /\s\+\%#\@<!$/
autocmd InsertLeave * match ExtraWhitespace /\s\+$/
autocmd BufWinLeave * call clearmatches()

Ensure that puppet files are handled properly in Vim. IIRC I pulled this line from the office wiki somewhere…

" detect puppet filetype
autocmd BufRead,BufNewFile *.pp set filetype=puppet
autocmd BufRead,BufNewFile *.pp setlocal tabstop=4 shiftwidth=4 softtabstop=4 expandtab textwidth=80 smarttab

These are the settings I use for my email. I try to send format=flowed emails that still look good in terminal readers like Mutt.

augroup mail_filetype
    autocmd!
    autocmd VimEnter /tmp/mutt* set formatoptions=aw tw=72
augroup END

Plugin settings

" Tagbar (requires Exuberant ctags 5.5+)
nnoremap <silent><leader>c :TagbarToggle<cr>

It’s kind of insane that you can use Vim to drive a debugger like XDebug. As evidenced by the path maps here, I haven’t used this in a while and it was a pain to set up when I did. Kept here as a reminder that Vim is amazing.

" Xdebug local debugger
let g:vdebug_options = {
\    'server': '33.33.33.1',
\    'port': '9000',
\    'path_maps' : {
\        '/srv/www/local.people.dev': '/Users/tyler/Development/upsync-vagrant/shared/people'
\    }
\}
let g:syntastic_javascript_jshint_conf="$HOME/.jshintrc"
let g:syntastic_error_symbol = '✘'
let g:syntastic_warning_symbol = "▲"
" let g:syntastic_python_python_exec = '/usr/bin/python3'
let g:jedi#force_py_version=2

NERDTree was the first plugin I ever installed.

" NERDTree Settings---------------------------------------------------- {{{
"map <leader>t :NERDTreeToggle<cr>
nnoremap <leader>t :NERDTreeTabsToggle<cr>
" Close vim if NERDTree is the last thing standing
autocmd bufenter * if (winnr("$") == 1 && exists("b:NERDTreeType") && b:NERDTreeType == "primary") | q | endif
" }}}

" Included for Airline ------------------------------------------------ {{{
let g:airline_theme = 'tomorrow'
let g:airline_powerline_fonts = 1
" let g:airline#extensions#tabline#enabled = 1
" }}}

" CtrlP --------------------------------------------------------------- {{{
let g:ctrlp_max_files = 0 " Set no max file limit
let g:ctrlp_working_path_mode = 0 " Search current directory not project root
if executable("ag")
  set grepprg=ag\ --nogroup\ --nocolor
"  let g:ctrlp_user_command = 'ag %s -l --nocolor -g ""'
  let g:ctrlp_user_command = {
    \ 'types': {
      \ 1: ['.git', 'git --git-dir=%s/.git ls-files -oc --exclude-standard'],
      \ 2: ['.hg', 'hg --cwd %s locate -I .'],
      \ },
    \ 'fallback': 'ag %s -l --nocolor -g ""'
    \ }

endif

let g:ctrlp_show_hidden = 1
"}}}

These are all settings for neocomplete and neosnippet, which require Vim 7.4 or higher and +lua in the output of vim --version.

if v:version >= 704 && has("lua")
  " Neocomplete.vim --------------------------------------------------- {{{
  " Disable AutoComplPop.
  let g:acp_enableAtStartup = 0
  " Use neocomplete.
  let g:neocomplete#enable_at_startup = 1
  " Use smartcase.
  let g:neocomplete#enable_smart_case = 1
  " Set minimum syntax keyword length.
  let g:neocomplete#sources#syntax#min_keyword_length = 3
  let g:neocomplete#lock_buffer_name_pattern = '\*ku\*'
  " I will probably never hit <TAB> 10 times
  let g:neocomplete#max_list = 10

  " Automatically open and close the popup menu / preview window
  " https://github.com/JessicaKMcIntosh/TagmaBufMgr/issues/8
  au CursorMovedI,InsertLeave * if pumvisible() == 0|silent! pclose|endif
  set completeopt=menuone,menu,longest

  " Define dictionary.
  let g:neocomplete#sources#dictionary#dictionaries = {
      \ 'default' : '',
      \ 'vimshell' : $HOME.'/.vimshell_hist',
      \ 'scheme' : $HOME.'/.gosh_completions'
          \ }

  " Define keyword.
  if !exists('g:neocomplete#keyword_patterns')
      let g:neocomplete#keyword_patterns = {}
  endif
  let g:neocomplete#keyword_patterns['default'] = '\h\w*'

  " Plugin key-mappings.
  inoremap <expr><C-g>     neocomplete#undo_completion()
  inoremap <expr><C-l>     neocomplete#complete_common_string()

  " Recommended key-mappings.
  " <CR>: close popup and save indent.
  inoremap <silent> <CR> <C-r>=<SID>my_cr_function()<CR>
  function! s:my_cr_function()
    return neocomplete#smart_close_popup() . "\<CR>"
    " For no inserting <CR> key.
    "return pumvisible() ? neocomplete#close_popup() : "\<CR>"
  endfunction
  " <TAB>: completion.
  inoremap <expr><TAB>  pumvisible() ? "\<C-n>" : "\<TAB>"
  " <C-h>, <BS>: close popup and delete backword char.
  inoremap <expr><C-h> neocomplete#smart_close_popup()."\<C-h>"
  inoremap <expr><BS> neocomplete#smart_close_popup()."\<C-h>"
  inoremap <expr><C-y>  neocomplete#close_popup()
  inoremap <expr><C-e>  neocomplete#cancel_popup()

  " Close popup by <Space>.
  " inoremap <expr><Space> pumvisible() ? neocomplete#close_popup() : "\<Space>"

  " Enable omni completion.
  autocmd FileType css setlocal omnifunc=csscomplete#CompleteCSS
  autocmd FileType html,markdown setlocal omnifunc=htmlcomplete#CompleteTags
  autocmd FileType javascript setlocal omnifunc=javascriptcomplete#CompleteJS
  autocmd FileType python setlocal omnifunc=pythoncomplete#Complete
  autocmd FileType xml setlocal omnifunc=xmlcomplete#CompleteTags
  autocmd FileType php set omnifunc=phpcomplete#CompletePHP

  " Enable heavy omni completion.
  if !exists('g:neocomplete#sources#omni#input_patterns')
    let g:neocomplete#sources#omni#input_patterns = {}
  endif
  let g:neocomplete#sources#omni#input_patterns.php = '[^. \t]->\h\w*\|\h\w*::'
  let g:neocomplete#sources#omni#input_patterns.c = '[^.[:digit:] *\t]\%(\.\|->\)'
  let g:neocomplete#sources#omni#input_patterns.cpp = '[^.[:digit:] *\t]\%(\.\|->\)\|\h\w*::'

  " For perlomni.vim setting.
  " https://github.com/c9s/perlomni.vim
  let g:neocomplete#sources#omni#input_patterns.perl = '\h\w*->\h\w*\|\h\w*::'
  " }}}

  " Neosnippet.vim ------------------------------------------------------ {{{
  " Plugin key-mappings.
  imap <C-k>     <Plug>(neosnippet_expand_or_jump)
  smap <C-k>     <Plug>(neosnippet_expand_or_jump)
  xmap <C-k>     <Plug>(neosnippet_expand_target)

  " SuperTab like snippets behavior.
  imap <expr><TAB> neosnippet#expandable_or_jumpable() ?
  \ "\<Plug>(neosnippet_expand_or_jump)"
  \: pumvisible() ? "\<C-n>" : "\<TAB>"
  smap <expr><TAB> neosnippet#expandable_or_jumpable() ?
  \ "\<Plug>(neosnippet_expand_or_jump)"
  \: "\<TAB>"

  " For snippet_complete marker.
  if has('conceal')
    set conceallevel=2 concealcursor=i
  endif

  " Enable snipMate compatibility feature.
  let g:neosnippet#enable_snipmate_compatibility = 1

  " Tell Neosnippet about the other snippets
  let g:neosnippet#snippets_directory='~/.vim/bundle/vim-snippets/snippets'
  " }}}
endif
" }}}

Indent guides are neat. I need to figure out a way to have the ctermbg vary with colorscheme though.

" Indent Guides ------------------------------------------------------- {{{
let g:indent_guides_enable_on_vim_startup = 1
let g:indent_guides_auto_colors = 0
let g:indent_guides_space_guides = 1
autocmd VimEnter,Colorscheme * :hi IndentGuidesOdd  ctermbg=234
autocmd VimEnter,Colorscheme * :hi IndentGuidesEven ctermbg=none
" }}}

Custom Functions

:BangOpen lets you open the file that results from the output of a command. I was sort of surprised I had to write a function that did this. The most common use case would be: :BangOpen which script_I_wrote_thats_in_my_path_but_I_forget_where.

SetSpaces is a function that Steve Barbera wrote that I think is kinda neat. It sets all of your spaces and things to the same value with one command. This is kinda what modelines are for, but modelines have always struck me as weird and potentially dangerous (which is why I have them disabled in Basic options).

" BangOpen ------------------------------------------------------------ {{{
function! BangOpen(arg)
    execute 'tabe ' . system(a:arg)
endfunction

command! -nargs=1 BangOpen :call BangOpen(<f-args>)
" }}}

function! SetSpaces(arg)
    echo "setting spaces to: " . a:arg
    execute 'set tabstop=' . a:arg
    execute 'set shiftwidth=' . a:arg
    execute 'set softtabstop=' . a:arg
endfunction

command! -nargs=1 SetSpaces :call SetSpaces(<f-args>)

I must’ve been having some trouble with keycode mapping and added these. I can’t remember the context of these.

set timeout
set timeoutlen=2500
set ttimeoutlen=10

Deterministic Turing Machine in Python
Tyler Cipriani Posted

Below is a Turing machine implementation based on the example in Understanding Computation by Tom Stuart. The Turing machine is an entertaining thought exercise that makes computer science feel a bit like poetry. Most things I learn about computers on a day-to-day basis are decidedly unpoetic.
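
For a feel of what such a machine does before reading the full implementation, here is a much smaller sketch of my own. It is not the code from the book: the run function and the INCREMENT_RULES table are hypothetical, and the rules are one possible way to add 1 to a binary number, with state 3 as the accept state.

```python
BLANK = '_'
LEFT, RIGHT = -1, 1

# (state, symbol under head) -> (next state, symbol to write, head movement)
# Hypothetical rule table for "add 1 to a binary number"; state 3 accepts.
INCREMENT_RULES = {
    (1, '0'): (2, '1', RIGHT),
    (1, '1'): (1, '0', LEFT),
    (1, BLANK): (2, '1', RIGHT),
    (2, '0'): (2, '0', RIGHT),
    (2, '1'): (2, '1', RIGHT),
    (2, BLANK): (3, BLANK, LEFT),
}


def run(rules, tape, head, state, accept_states, max_steps=1000):
    """Step the machine until it accepts, gets stuck, or hits max_steps."""
    tape = list(tape)
    for _ in range(max_steps):
        if state in accept_states:
            break
        # Grow the tape with blanks whenever the head walks off either end
        if head < 0:
            tape.insert(0, BLANK)
            head = 0
        elif head >= len(tape):
            tape.append(BLANK)
        rule = rules.get((state, tape[head]))
        if rule is None:
            break  # stuck: no rule applies
        state, write, move = rule
        tape[head] = write
        head += move
    return state, ''.join(tape).strip(BLANK)


# Start on the rightmost digit of 1011 (eleven) in state 1
print(run(INCREMENT_RULES, '1011', head=3, state=1, accept_states={3}))
# (3, '1100')
```

The machine carries by flipping trailing 1s to 0s while moving left, writes a 1 at the first 0 or blank it meets, then walks back to the right edge and accepts.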

"""
Deterministic Turing Machine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is a turing machine written in python. This code closely follows code from
chapter 5 of the book _Understanding Computation_ by Tom Stuart.
This is basically a Finite State Machine with a tape.
"""

import collections


LEFT = 'left'
RIGHT = 'right'
BLANK = '_'


Rule = collections.namedtuple('Rule', [
    'state',
    'head',
    'next_state',
    'write',
    'move'])


def rule_applies(rule, state, tape):
    """Determine whether a rule applies to a state."""
    correct_state = rule.state == state
    correct_read = rule.head == tape.head
    return correct_state and correct_read


def follow_rule(rule, state, tape):
    """Follows the current rule."""
    if rule_applies(rule, state, tape):
        state = rule.next_state
        tape.middle = rule.write
        tape.move(rule.move)

    return rule, state, tape


class Tape(object):
    """Represents the tape in a turing machine."""
    def __init__(self, left=None, middle=None, right=None, blank=BLANK):
        """Initialize and show initial state."""
        self.left = left or []
        self.right = right or []

        self.middle = middle
        if self.middle is None:
            self.middle = blank

        self.blank = blank

    def move_right(self):
        """Move tape one unit right, add blanks as needed."""
        self.left = self.left + [self.middle]
        if self.right:
            self.middle = self.right.pop(0)
        else:
            self.middle = self.blank

    def move_left(self):
        """Move tape one unit left, add blanks as needed."""
        self.right = [self.middle] + self.right
        if self.left:
            self.middle = self.left.pop()
        else:
            self.middle = self.blank

    @property
    def head(self):
        return str(self.middle)

    def move(self, direction):
        """Move tape left or right."""
        if direction not in [LEFT, RIGHT]:
            raise RuntimeError('Unrecognized direction "%s"' % direction)

        if direction == LEFT:
            return self.move_left()

        if direction == RIGHT:
            return self.move_right()

    def __repr__(self):
        """Tape state with current head in parens, like _12(3)4."""
        out = '{}({}){}'.format(
            ''.join(map(str, self.left)),
            self.middle,
            ''.join(map(str, self.right)))
        return out


class DeterministicTuringMachine(object):
    """This is a Turing machine."""
    def __init__(self, state, tape, accept_states, rules):
        """
        Initialize machine
        :state: - integer - that represents the current state of the machine
        :tape: - Tape - the machine's tape
        :accept_states: - [integer] - represents states when the machine has
                                      exited successfully
        :rules: - [Rule] - list of rules for the machine to follow
        """
        self.state = state
        self.tape = tape
        self.accept_states = accept_states
        self.rules = rules

    @property
    def accepting(self):
        return self.state in self.accept_states

    @property
    def stuck(self):
        """Stuck when we have no next rule."""
        return not self.next_rule

    @property
    def working(self):
        """Working when not done and we still have rules to apply."""
        return not (self.accepting or self.stuck)

    @property
    def next_rule(self):
        """Get next rule."""
        rules = self._find_rules()

        if rules:
            return rules[0]

        return rules

    def _find_rules(self):
        """Find the rules we can apply."""
        applicable_rules = [rule for rule in self.rules
                            if rule_applies(rule, self.state, self.tape)]

        return applicable_rules

    def step(self):
        """Apply any rules we can find."""
        _, self.state, self.tape = follow_rule(
            self.next_rule, self.state, self.tape)

    def run(self):
        while self.working:
            self.step()

This machine contains objects for a tape (Tape), rules for a machine to follow (Rule), and an object representing the state of the Turing machine itself (DeterministicTuringMachine).
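
The tape is the least obvious of the three, so here is a minimal sketch of the same left/middle/right idea using plain functions instead of the Tape class. The names move_left, move_right, and show are made up for this illustration; the behavior (blanks appear whenever the head runs off either end) is meant to mirror the class above.

```python
BLANK = '_'


def move_right(left, middle, right, blank=BLANK):
    """Return the (left, middle, right) triple after the head moves right."""
    new_left = left + [middle]
    if right:
        return new_left, right[0], right[1:]
    # Ran off the end of the written tape: supply a blank cell.
    return new_left, blank, []


def move_left(left, middle, right, blank=BLANK):
    """Return the (left, middle, right) triple after the head moves left."""
    new_right = [middle] + right
    if left:
        return left[:-1], left[-1], new_right
    # Ran off the start of the written tape: supply a blank cell.
    return [], blank, new_right


def show(left, middle, right):
    """Render the tape with the head in parens, like 101(1)0."""
    return '{}({}){}'.format(
        ''.join(map(str, left)), middle, ''.join(map(str, right)))


tape = ([1, 0, 1, 1], 1, [])
print(show(*tape))                     # 1011(1)
tape = move_left(*tape)
print(show(*tape))                     # 101(1)1
tape = move_right(*move_right(*tape))
print(show(*tape))                     # 10111(_)
```

Nothing is ever thrown away when the head moves: the cell it leaves is pushed onto one side and the next cell is pulled from the other, which is why the tape can pretend to be infinite.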

Incrementing binary numbers

Given the appropriate set of rules, this machine can perform general computing tasks. In the book, the rules for incrementing a binary number are used as an example.

We start with the number 10111 (a.k.a. 23), which we’d like to increment by 1 to get 11000 (a.k.a. 24). To begin, we set up the tape with the number we’d like to increment, with the read head of the tape resting on the right-most digit of the binary number:

# Tape looks like: 1011(1)
# Where () represents the read/write head
t = Tape(left=[1,0,1,1], middle=1)

This machine will have three available machine “states” that help to define the rules for the Turing machine to follow. When the machine is in a particular state, and encounters a particular condition (i.e., the read head is over a particular number) it will follow a particular rule – that is, it will write either a 1 or a 0, move the read head either LEFT or RIGHT, and, possibly, change machine state. These rules are based on machine state in combination with a read condition.

The machine will start in state 1. When the machine enters into one of the accept_states, the machine will stop processing. The only accept_state for this machine is 3.

# Available states in this example are 1, 2, 3
initial_state = 1

m = DeterministicTuringMachine(
    state=initial_state, tape=t, accept_states=[3], rules=[
        # if the state is 1 and the read head is ... etc
        Rule(state=1, head='0', next_state=2, write='1', move=RIGHT),
        Rule(state=1, head='1', next_state=1, write='0', move=LEFT),
        Rule(state=1, head=BLANK, next_state=2, write='1', move=RIGHT),

        # if the state is 2 and the read head is ... etc
        Rule(state=2, head='0', next_state=2, write='0', move=RIGHT),
        Rule(state=2, head='1', next_state=2, write='1', move=RIGHT),
        Rule(state=2, head=BLANK, next_state=3, write=BLANK, move=LEFT)
    ]
)
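
Since next_rule quietly takes the first matching rule, nothing stops a rule list from containing two rules for the same state and head, which would make the machine not very deterministic. A small sanity check might look like the sketch below. Note that conflicting_keys is a hypothetical helper, not something from the book or the code above, and the clashing rule is invented purely for the demo.

```python
import collections

Rule = collections.namedtuple(
    'Rule', ['state', 'head', 'next_state', 'write', 'move'])


def conflicting_keys(rules):
    """Return the (state, head) pairs claimed by more than one rule."""
    counts = collections.Counter((r.state, r.head) for r in rules)
    return sorted(key for key, n in counts.items() if n > 1)


rules = [
    Rule(state=1, head='1', next_state=1, write='0', move='left'),
    Rule(state=1, head='0', next_state=2, write='1', move='right'),
]
print(conflicting_keys(rules))  # []

# Hypothetical extra rule that clashes with the first one above.
rules.append(Rule(state=1, head='1', next_state=2, write='1', move='right'))
print(conflicting_keys(rules))  # [(1, '1')]
```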

If we call the step method of the state machine we can trace how it follows rules.

>>> m.tape
1011(1)
>>> m.step()
>>> print(m.state)
1
>>> print(m.tape)
101(1)0

Since it was in state 1 and the read head was over a 1 it followed rule Rule(state=1, head='1', next_state=1, write='0', move=LEFT) – it wrote 0 in its current location, it moved the read head LEFT, and stayed in the 1 state. Since the state is still 1 and the read head is once-again over a 1, the same rule will be followed again:

>>> m.tape
101(1)0
>>> m.step()
>>> print(m.state)
1
>>> print(m.tape)
10(1)00

Calling the run method of the machine will continue to follow the defined rules until state 3 is reached:

>>> m.run()
>>> print(m.state)
3
>>> print(m.tape)
1100(0)_
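
One way to convince yourself that these six rules really do increment is to run them against ordinary integer arithmetic. This is a minimal sketch rather than the machine above: increment, RULES, and the dict-based sparse tape are names and choices made up for this illustration, and only the six rules themselves come from the example.

```python
LEFT, RIGHT, BLANK = 'left', 'right', '_'
ACCEPT = 3

# (state, symbol under head) -> (next state, symbol to write, direction)
RULES = {
    (1, '0'): (2, '1', RIGHT),
    (1, '1'): (1, '0', LEFT),
    (1, BLANK): (2, '1', RIGHT),
    (2, '0'): (2, '0', RIGHT),
    (2, '1'): (2, '1', RIGHT),
    (2, BLANK): (3, BLANK, LEFT),
}


def increment(bits):
    """Run the increment rules over a binary string and return the result."""
    tape = dict(enumerate(bits))    # sparse tape: position -> symbol
    pos, state = len(bits) - 1, 1   # head starts on the right-most digit
    while state != ACCEPT:
        head = tape.get(pos, BLANK)  # unwritten cells read as blank
        state, tape[pos], move = RULES[(state, head)]
        pos += 1 if move == RIGHT else -1
    cells = (tape[i] for i in sorted(tape))
    return ''.join(c for c in cells if c != BLANK)


print(increment('10111'))  # 11000
print(increment('111'))    # 1000

# Compare against plain integer arithmetic for a range of inputs.
for n in range(64):
    assert int(increment(format(n, 'b')), 2) == n + 1
```

The all-ones input is the interesting case: the carry runs off the left edge of the number, and the rule for state 1 reading a blank is what writes the new leading 1.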

Turing machines are magic, I guess is what I’m saying.

What I Told the FCC
Tyler Cipriani Posted

I think that the new FCC proposal for “Restoring Internet Freedom” would completely destroy the internet. Should this proposal go any further, the internet will no longer be recognizable within a very short amount of time. I’m worried that I will no longer have a place on the internet because my internet activity and work aren’t designed to invade people’s homes and privacy to sell them things they don’t need. Non-profits on the internet are dead.

This proposal is entirely rent-seeking by big ISPs. ISPs will make more money without providing any additional service for users, and the FCC would be enabling it all.

Big Disclaimer

This is a post where I feel it is important to make absolutely clear that my views are my own and do not represent the views of my employer(s), past or present.

My comments to the FCC

The FCC has opened their proposal to public comment and I encourage you to comment.

Here’s what I wrote:

I am a Colorado resident and I’ve worked as an internet engineer for my entire professional career (small disclaimer that my views are my own and do not represent those of my employer), and I’m worried that new FCC rules may allow large ISPs to destroy the internet.

The current state of the internet is possible because ISPs aren’t selling tiers of websites to their users. If we create internet fast-lanes, then we create a tiered internet. This is the way cable channels already work: if you just want CNN and ESPN you can pay one amount, but if you want HBO you pay a little more. The problem is that the internet is constantly in flux: new websites are being built all the time. This is a major driver of economic growth and is good for everyone who uses the internet.

If you start offering internet fast-lanes, internet tiers become possible: if you just want Facebook and Google then you pay one amount, but if you want Wikipedia you pay a little more. Ipso facto, Wikipedia is unavailable to a portion of the internet population.

There may be those who make the argument that this is how the free market works: if Wikipedia were good, people would be willing to pay more for it; however, this artificial choice between Google and Wikipedia is only made possible by a perversion of the design of the internet. Bits of information are bits of information; an ISP is arguing that websites should pay more while it offers the same service. ISPs want Netflix to pay more than Google for them to move 100 bytes from the internet backbone to your house. Moving 100 bytes from the backbone to someone’s house, the so-called “last mile”, is the same amount of work whether those bytes are from Google or Netflix, so why should Netflix have to pay more?

Meanwhile, large ISPs that move bits from the internet backbone over the “last mile” do not compete with one another, thanks to designed but unspoken collusion. The suggestion that an internet user should have to choose between Google and my blog is being made by companies that make all of their money by rent-seeking and now demand more money for less service. Consumers are not being protected, only ISP shareholders are.

This proposal has the net-effect of censoring the internet so that only websites that have enough money to pay protection fees to ISPs will be available to people. This proposal is censorship that stifles freedom and innovation and will destroy the internet.

/me drops mic.

Yearly Reading: Done and To Do
Tyler Cipriani Posted

Reading fiction is not that dissimilar from watching a movie in your brain. Where audio-visual consumption of fiction and reading fiction diverge is in their demand on engagement – when reading I must learn new words, I struggle to find meaning, I use my imagination, I slowly strengthen understanding – I meet books more than half-way in my mind’s eye.

Movies and television, in contrast, demand less engagement from the viewer – much of the vision and interpretation is handed to (or forced on) the audience. As popular movies and TV demand less, they also seem more apt to acquiesce to popular whim – mirroring the culture that consumes them rather than challenging it. The level of engagement demanded by audio-visual entertainment is also subject to a shifting baseline – that is, it’s gotten less challenging over time.

This is not a knock on all television and movies as a form of art. Often television, movies, and documentaries are the perfect medium to convey a particular message. Also, there are many movies and television programs which are, in fact, very challenging to their audience in addition to being very enjoyable.

Also, it is noteworthy that I tend to read mass-market crap – the book-equivalent of high-fructose corn syrup. I’ve come to terms with this fact. I read some crap, I read some good things, occasionally I’ll read a few great (and nominally “great”) things. I end up with a greater sense of accomplishment and satisfaction from reading crap rather than watching crap – YMMV.

This is a long way of saying – I think that reading is a Good Thing™ and I want to do more of it. To enforce more of a habit, I set reading goals for myself at the start of each year. In 2015 I wanted to read a book a month. In 2016 I wanted to read 2 books a month. In both 2015 and 2016 my grasp has exceeded my reach – which is not how goals are supposed to work.

Books read in 2016

  1. Stranger in a Strange Land by Robert A. Heinlein
  2. Black Hole by Charles Burns
  3. Fun Home by Alison Bechdel
  4. The Girl on the Train by Paula Hawkins
  5. Starship Troopers by Robert A. Heinlein
  6. Ready Player One by Ernest Cline
  7. The Circle by Dave Eggers
  8. Notorious RBG by Irin Carmon and Shana Knizhnik
  9. The Three-Body Problem by Cixin Liu
  10. A Wizard of Earthsea by Ursula K. Le Guin
  11. Stoner by John Williams
  12. Prost! The Story of German Beer by Horst D. Dornbusch
  13. Bock by Darryl Richman
  14. Do Androids Dream of Electric Sheep by Philip K. Dick
  15. Room by Emma Donoghue
  16. Kitchen Confidential: adventures in the culinary underbelly by Anthony Bourdain
  17. H is for Hawk by Helen Macdonald
  18. Watchers by Dean Koontz
  19. The Tombs of Atuan by Ursula K. Le Guin
  20. A Confederacy of Dunces by John Kennedy Toole
  21. Everything but Espresso by Scott Rao
  22. Harry Potter and the Sorcerer’s Stone by J.K. Rowling
  23. Harry Potter and the Chamber of Secrets by J.K. Rowling
  24. Harry Potter and the Prisoner of Azkaban by J.K. Rowling
  25. Naked Lunch by William S. Burroughs
  26. CODE: The Hidden Language of Computer Hardware and Software by Charles Petzold
  27. The Crying of Lot 49 by Thomas Pynchon
  28. The Best Christmas Pageant Ever by Barbara Robinson
  29. Harry Potter and the Goblet of Fire by J.K. Rowling

I also reviewed a couple of these books over the course of 2016; hopefully I can continue down that path in the year to come.

Goals for 2017

I’m going to try to read roughly a book a week in 2017 – 50 books for the year. Hopefully 50 books is an ambitious enough goal. Additionally, I’d like to try to do 6 book reviews on this blog as a means of:

  • Tracking what I’ve read
  • Checking my understanding
  • Engaging and owning what I’ve read

I’d also like to lay claim to my crazy ambitious plan to read 4 whole, real, grown-up (i.e., not beer- or computer-related) non-fiction books this year.

As I’m proclaiming these goals publicly for the first time, I’m sure I’ll come up woefully and shamefully short – yay blogging! :)