Dept. Dirigible Flightcraft

This web page is where I collect my experiments on data visualization, machine learning, and other technical miscellany.

News

Keming Labs

2011 June 1

There was a huge response on “The Twitter” to the Kindle typography article I posted in April. Amazon has an SDK for their lil' e-reader, so I started a company to explore the possibilities. Drop us a line if you want a Kindle app or have other digital typography needs.

Typography screenshot

Kindle typography

2011 April 18

Amazon’s Kindle ebook reader is an impressive piece of hardware, but its typographic rendering software leaves a bit to be desired. Let’s add hyphenation and use the Knuth and Plass line-breaking algorithm via the built-in WebKit browser. Advanced typographic features such as hanging punctuation and non-rectangular paragraph shapes are also possible.
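To give a feel for what "optimal" line breaking means, here is a minimal Python sketch of a much-simplified relative of the Knuth and Plass algorithm. It is an illustration under stated assumptions, not the code from the article: it measures width in characters, uses squared leftover space as the badness of a line, and ignores stretchable glue, penalties, and hyphenation. The function name `break_lines` is made up for this example.

```python
def break_lines(words, width):
    """Choose breaks that minimize total squared slack; the last line is free."""
    if any(len(word) > width for word in words):
        raise ValueError("a word is longer than the line width")
    n = len(words)
    # best[i] is the minimal cost of setting words[i:];
    # next_break[i] is the index where the first of those lines ends.
    best = [float("inf")] * n + [0.0]
    next_break = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        length = -1  # cancels the space counted before the first word
        for j in range(i + 1, n + 1):
            length += 1 + len(words[j - 1])
            if length > width:
                break
            # Leftover space on the final line costs nothing.
            cost = 0.0 if j == n else float((width - length) ** 2)
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                next_break[i] = j
    lines, i = [], 0
    while i < n:
        j = next_break[i]
        lines.append(" ".join(words[i:j]))
        i = j
    return lines
```

Unlike a greedy first-fit breaker, this looks at the whole paragraph at once: for the words `aaa bb cc ddddd` at width 6, first-fit would set `aaa bb` and strand `cc` on a nearly empty line, while the sketch prefers `aaa` / `bb cc` / `ddddd`.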

Prote.cs dictionary screenshot

Interaction (un)design in data visualization

2011 February 23

I am giving a talk on data-driven JavaScript applications at 6:30 on Wednesday, February 23rd in downtown Portland.

Webtrends
851 SW 6th Avenue
Floor 16
Portland OR 97204

Slides

Prote.cs: A novel protein fold search algorithm

2010 December 4

Prote.cs dictionary screenshot

Prote.cs is an algorithm that assigns an uncharacterized protein structure to a fold family by answering the question:

“Which fold family has members that can best sum to approximate this protein?”

More specifically, proteins are mapped into a vector space via their alpha-carbon distance matrices and then assigned to a fold family by an $l^1$-norm-minimized linear regression on a basis derived from protein structures with known folds.

An accuracy of 95% was achieved on a set of 466 CATH fold families using just six-by-six pixel distance matrix thumbnails (21 dimensions).
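The mapping from a structure to a 21-dimensional vector can be sketched roughly as follows. This is a guess at the general shape, not the Prote.cs code: the block-averaging downsampler and the function names are assumptions, and reading "21 dimensions" as the diagonal-and-above entries of a symmetric six-by-six matrix (6 × 7 / 2 = 21) is my interpretation.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances between alpha-carbon coordinates."""
    return [[math.dist(p, q) for q in coords] for p in coords]

def thumbnail(matrix, size=6):
    """Downsample a square matrix to size x size by block averaging (assumed scheme)."""
    n = len(matrix)
    if n < size:
        raise ValueError("matrix is smaller than the requested thumbnail")
    # Block k covers row/column indices edges[k] up to (not including) edges[k + 1].
    edges = [k * n // size for k in range(size + 1)]
    thumb = []
    for i in range(size):
        row = []
        for j in range(size):
            block = [matrix[r][c]
                     for r in range(edges[i], edges[i + 1])
                     for c in range(edges[j], edges[j + 1])]
            row.append(sum(block) / len(block))
        thumb.append(row)
    return thumb

def upper_triangle(square):
    """Distinct entries of a symmetric matrix: the diagonal and everything above it."""
    n = len(square)
    return [square[i][j] for i in range(n) for j in range(i, n)]

# Toy "protein": twelve alpha carbons spaced one unit apart along a line.
coords = [(float(i), 0.0, 0.0) for i in range(12)]
features = upper_triangle(thumbnail(distance_matrix(coords)))
```

Here `features` has 21 entries, one per distinct pixel of the symmetric thumbnail, and would be the vector handed to the regression step.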

Details and examples here, code on github

YALL1

2010 November 14

I’ve rewritten YALL1 (“Your ALgorithms for L1”) for Octave compatibility. YALL1 is a Matlab program by Yin Zhang, Junfeng Yang, and Wotao Yin (see original) that solves several $l^1$ minimization problems that have become notable in connection with compressed sensing. These problems are variations on the basis pursuit theme:

\[ \min \norm{\vec x}_1 \quad \mbox{such that} \quad \mat A \vec x = \vec b, \]

where the measurement matrix $\mat A \in \real^{m \times n}$ has $m \ll n$, and the solution $\vec x$ is known to be (approximately) sparse. As it turns out, sparse reconstructions have a wide variety of applications in image compression, signal acquisition, and even classification/machine learning problems.
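One textbook way to see why this problem is tractable (not necessarily how YALL1 solves it) is to rewrite it as a linear program by introducing an auxiliary vector $\vec u$, a symbol not in the original notation, that bounds the entries of $\vec x$ in absolute value:

\[ \min_{\vec x, \vec u} \; \sum_{i=1}^{n} u_i \quad \mbox{such that} \quad \mat A \vec x = \vec b, \quad -u_i \le x_i \le u_i \;\; (i = 1, \dots, n). \]

At the optimum $u_i = |x_i|$, so the objective equals $\norm{\vec x}_1$. As a tiny illustration with $m = 1$ and $n = 2$: for $\mat A = (1 \;\; 2)$ and $\vec b = 2$, the candidates $(2, 0)$ and $(0, 1)$ both satisfy the constraint, but the second has the smaller $l^1$ norm (1 versus 2) and is the minimizer.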

Compressed sensing details and examples here, code on github

Rliblinear

2010 November 5

I’ve forked an R interface to the LIBLINEAR C++ library, which solves classification and regression problems having millions of instances and features. These problems are variations on the theme:

\[ \min_{\mathbf w} \quad \tfrac{1}{2}\mathbf w^{\mathrm{T}} \mathbf w + C \sum_{i=1}^{l} \max\left(0, 1 - y_{i} \mathbf{w}^{\mathrm T}\mathbf x_{i} \right)^{2}\kern-6pt, \]

with \( \mathbf x_i \) a sample vector, \( y_i \in \{-1, +1\} \) its class label, and \( \mathbf w \) the vector that defines a decision hyperplane (problems with \(k\) classes are handled automatically by finding \(k\) one-vs-all hyperplanes).
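For concreteness, the objective above can be evaluated directly. The following is a small Python sketch, not part of LIBLINEAR or the R interface; the function name is hypothetical, and labels are assumed to be coded as -1/+1 so that \( y_i \mathbf w^{\mathrm T} \mathbf x_i \) acts as a signed margin.

```python
def l2_loss_svm_objective(w, samples, labels, C=1.0):
    """Return 0.5 * w.w + C * sum(max(0, 1 - y * w.x) ** 2) for labels in {-1, +1}."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    regularizer = 0.5 * dot(w, w)
    # Squared hinge loss: zero for samples beyond the margin, quadratic otherwise.
    loss = sum(max(0.0, 1.0 - y * dot(w, x)) ** 2
               for x, y in zip(samples, labels))
    return regularizer + C * loss

# Two samples in two dimensions: the first sits safely beyond the margin,
# the second is on the wrong side of the hyperplane and is penalized.
value = l2_loss_svm_objective([1.0, 0.0], [[2.0, 0.0], [0.5, 0.0]], [1, -1])
```

With these numbers the regularizer contributes 0.5 and the misclassified sample contributes 1.5² = 2.25, so `value` is 2.75.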

Essentially, this is a very fast, large-scale support vector machine library that has no desire to play in fancy-pants kernel spaces. The R interface automatically expands categorical features (i.e. factors) with $c$ levels into $c$ binary dimensions.
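The factor expansion amounts to what is usually called one-hot or dummy coding. A minimal Python illustration of the idea follows; the helper name and the alphabetical ordering of levels are assumptions made for the example, not the behavior of the R code.

```python
def expand_factor(values):
    """Turn a categorical column with c levels into c binary columns."""
    levels = sorted(set(values))
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows

levels, rows = expand_factor(["red", "green", "red", "blue"])
```

Each row has exactly one 1, in the column belonging to its level, so a factor with three levels becomes three binary dimensions.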

Details and examples here, code on github

Protein viewer update

2010 October 2

I’ve updated my protein residue & Voronoi cell viewer to use the o3d-WebGL renderer. The WebGL canvas is supported by Firefox 4 and the current development builds of Chrome and Safari.

VisWeek 2010 paper

2010 September 1

A paper describing my work on information visualization for medical bill analytics has been accepted by IEEE VisWeek 2010.

See older news