Computational Linguistics Portfolio

Simple Dynamical Systems of Language Change

I built models of simple dynamical systems of language change in Python. The process involved deriving and implementing formulas for statistical model analysis and optimizing parallel process and file writing runtimes from 120 minutes to 30 seconds.

Dynamical Systems Image 1

Spanish Proto-Lexicon Kernel Density Scoring

I developed an SQL tool to query word frequency and score data from 1000 sample Spanish proto-lexicon (a set of remembered word forms without associated meaning) stimuli databases and determined best samples by calculating Jensen-Shannon divergence 1000 times for 30 classifications of data using kernel density curves.

Density Image 1

Spanish Proto-Lexicon Stimuli Generation

I conducted word-based and morph-based phonotactic scoring and analysis of 200000 fake Spanish words using Python Pandas, SRILM, and Morfessor to generate an accurate set of stimuli consisting of fake words based on 10257 real spanish words for an experiment examining Spanish proto-lexicons (a set of remembered word forms without associated meaning) in English speakers in California and Texas.

Stimuli Image 1