Data Infra by AH

The VC firm Andreessen Horowitz published a blog post detailing different approaches to data architecture for AI/ML and business analytics. It highlights two main points: 1) architectures are growing increasingly complex, and 2) data is central. The latter point is in line with a recent presentation by Andrew Ng on MLOps, which stresses the need for a shift from model-centric AI to data-centric AI.

Install pdflatex on MacOS

Tex distributions

The first step is to install a LaTeX distribution, either basictex or mactex. If you want to be thrifty, go for basictex, but be prepared to install a lot of missing style files. Both can be installed via brew, for example:

brew install --cask basictex

Installing missing style files

All of this is done via tlmgr. You can first update it:

sudo tlmgr update --self

Then, if you’re missing comment.sty when trying to compile a LaTeX document, you’ll need to run:

sudo tlmgr install comment

Note that you always need to run tlmgr with sudo.

A last comment: I went with basictex. But if I had to do it again, I would install the full distribution so that I don’t have to install so many missing style files.

Ref: Getting started and productive with latex basictex on OSX terminal

Setting up a Raspberry Pi 4

I decided to start playing with the Raspberry Pi, and this post is here to summarize the steps I went through to set it up.

Setting up the OS

Some packages come equipped with a micro SD card already burned with an image. I thought it would be more interesting to do that step myself. It’s actually pretty simple: besides buying an SD card, all you need to do is install etcher, then follow the steps:

I actually first re-formatted the micro SD card to the FAT format using Disk Utility. But I’m not even sure this was needed as etcher might very well take care of that step.

Sample from categorical distribution

It is possible to (approximately) sample from a categorical distribution in a continuous, differentiable form. This ICLR 2017 paper introduces the Gumbel-Softmax distribution which relies on the Gumbel distribution to sample one-hot vectors from a categorical distribution.

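The blog post I linked shows how the Gumbel-Max trick (adding Gumbel noise to the logits and taking the argmax) produces samples distributed according to the softmax of those logits. The paper then shows that you can replace the argmax with a temperature-controlled softmax, so that the sample becomes differentiable.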

The Gumbel-Softmax distribution can be used for neural architecture search, for instance in Differentiable Neural Architecture Search (DNAS).
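
To make the sampling idea more concrete, here is a minimal sketch in plain Python (standard library only). It is not the paper's code: the function names, the example probabilities and the temperature values are my own assumptions, chosen purely for illustration. It draws hard samples with the Gumbel-Max trick and relaxed samples with a temperature-scaled softmax over the same noisy logits.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample via inverse transform: -log(-log(U))."""
    u = rng.random()
    # Clamp away from 0 and 1 so that both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_max_sample(logits, rng):
    """Gumbel-Max trick: index of the largest (logit + Gumbel noise)."""
    noisy = [l + sample_gumbel(rng) for l in logits]
    return max(range(len(logits)), key=noisy.__getitem__)


def gumbel_softmax_sample(logits, temperature, rng):
    """Relaxed sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = [(l + sample_gumbel(rng)) / temperature for l in logits]
    m = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: a 3-class categorical distribution.
rng = random.Random(0)
probs = [0.1, 0.6, 0.3]
logits = [math.log(p) for p in probs]

# Empirical frequencies of Gumbel-Max samples should approach probs.
n = 20000
counts = [0] * len(probs)
for _ in range(n):
    counts[gumbel_max_sample(logits, rng)] += 1
freqs = [c / n for c in counts]

# Lower temperatures tend to give vectors closer to one-hot,
# higher temperatures tend to give smoother vectors.
sharp = gumbel_softmax_sample(logits, temperature=0.1, rng=rng)
smooth = gumbel_softmax_sample(logits, temperature=5.0, rng=rng)
```

In an actual model you would compute the relaxed sample with an autodiff framework so that gradients can flow through it; the pure-Python version above only illustrates the sampling step.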

Upgrade jekyll and Ruby

When I switched to a new Mac, my version of Ruby got bumped up to 3.0 and jekyll to 4.2.0. A few things changed along the way and broke this website. So here is what I had to do to fix it:

  • In the _config.yml file, replace gems with plugins. See here
  • To install packages, you don’t need to do gem install <package>. You can instead add them to a Gemfile, which looks something like this:
source "https://rubygems.org"

gem "webrick", "~> 1.7"

gem "kramdown-parser-gfm", "~> 1.1"

gem "jekyll-watch", "~> 2.2"

gem "jekyll-paginate"
gem "jekyll-seo-tag"
gem "jekyll-sitemap"
gem "jekyll-gist"

You then run bundle install (you need to have Bundler installed first).

  • I had some version conflicts in my dependencies. This was resolved by resorting to bundle again. Instead of doing jekyll serve, you can do:
bundle exec jekyll serve