Data Science / Analytics
I have 8+ years of experience in data analysis and scientific computing, including the cleaning, mining, and statistical analysis of large data sets, often larger than 1 TB. To speed up this work, I have written comprehensive software packages with professional-grade GUIs at every stage of my career. The goal of each project was to go from raw data through full statistical analysis to publication-ready figures at the click of a button. Below is a brief description of these numerical routines. The code for each can be found on my personal GitHub page here.
Time-domain signal processor
Software suite for processing an electrical or optical signal as a function of time. The software performs a fast Fourier transform of user-loaded sample and reference signals, from which the complex spectral transmission function is calculated. It then numerically inverts the complex transmission equation with a Newton-Raphson algorithm to extract the complex index of refraction, optical conductivity, or magnetic susceptibility of the sample. The code can perform the analysis as a function of two external parameters (originally temperature and magnetic field) and supports three distinct sample types. The output is publication-ready plots of user-specified physical quantities. These routines remain in extensive use in the Complex Materials Spectroscopy Group at Johns Hopkins University. The routines comprise approximately 5,500 lines of code.
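The core of this pipeline can be illustrated with a minimal sketch. It is not the group's code: the single-pass slab transmission model, the function names, and the parameters (slab thickness `d`, starting guess `n0`) are assumptions chosen for illustration, and the phase sign follows NumPy's FFT convention, in which an absorbing sample has n = n' - i*kappa.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def transmission_spectrum(t, e_sample, e_reference):
    """Return angular frequencies and the complex transmission E_sample / E_reference."""
    dt = t[1] - t[0]
    omega = 2 * np.pi * np.fft.rfftfreq(len(t), d=dt)
    return omega, np.fft.rfft(e_sample) / np.fft.rfft(e_reference)


def slab_transmission(n, omega, d):
    """Assumed model: one pass through a slab in vacuum, no internal reflections."""
    return 4 * n / (n + 1) ** 2 * np.exp(-1j * omega * d * (n - 1) / C)


def invert_index(t_meas, omega, d, n0, tol=1e-12, max_iter=50):
    """Newton-Raphson solve of slab_transmission(n) = t_meas for complex n."""
    n = complex(n0)
    for _ in range(max_iter):
        t_model = slab_transmission(n, omega, d)
        # Analytic derivative dT/dn of the assumed model
        dt_dn = t_model * ((1 - n) / (n * (n + 1)) - 1j * omega * d / C)
        step = (t_model - t_meas) / dt_dn
        n -= step
        if abs(step) < tol:
            break
    return n


# Usage: recover a known index from a synthetic transmission value
omega_test = 2 * np.pi * 1e12   # 1 THz
thickness = 0.5e-3              # 0.5 mm slab (hypothetical)
n_true = 3.4 - 0.05j
n_found = invert_index(slab_transmission(n_true, omega_test, thickness),
                       omega_test, thickness, n0=3.3)
```

Because the phase term is periodic in n, the equation has many roots, so the starting guess must lie close to the physical one; in practice it would be carried forward from the neighboring frequency point.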
CCD image loader and radial integrator
Software suite for processing CCD images. The software loads, despeckles, and averages multiple user-specified images into a single image before performing radial integration within a user-defined area. The code can perform the analysis as a function of an external parameter (originally temperature) and supports four distinct measurement geometries. The outputs are publication-ready polar plots of intensity as a function of angle. These routines remain in extensive use in the Search for Quantum Phases Group at the California Institute of Technology. The routines comprise approximately 1,600 lines of code.
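The despeckle, average, and integrate steps can be sketched as follows. This is a minimal illustration, not the group's code: the outlier rule (replace pixels far from the per-pixel median across frames), the annular integration area, and all function and parameter names are assumptions.

```python
import numpy as np


def combine_frames(frames, n_sigma=5.0):
    """Replace per-pixel outliers (speckles) with the median, then average the frames."""
    stack = np.asarray(frames, dtype=float)
    median = np.median(stack, axis=0)
    # Robust per-pixel scatter from the median absolute deviation
    mad = np.median(np.abs(stack - median), axis=0)
    outlier = np.abs(stack - median) > n_sigma * 1.4826 * mad + 1e-12
    return np.where(outlier, median, stack).mean(axis=0)


def radial_integrate(image, center, r_min, r_max, n_bins=36):
    """Mean intensity versus azimuthal angle inside the annulus r_min <= r < r_max."""
    yy, xx = np.indices(image.shape)
    dx, dy = xx - center[0], yy - center[1]
    r = np.hypot(dx, dy)
    phi = np.arctan2(dy, dx) % (2 * np.pi)
    mask = (r >= r_min) & (r < r_max)
    edges = np.linspace(0.0, 2 * np.pi, n_bins + 1)
    total, _ = np.histogram(phi[mask], bins=edges, weights=image[mask])
    counts, _ = np.histogram(phi[mask], bins=edges)
    angles = 0.5 * (edges[:-1] + edges[1:])
    return angles, total / np.maximum(counts, 1)


# Usage: three flat frames, one with a single hot pixel
frames = [np.ones((101, 101)) for _ in range(3)]
frames[1][60, 70] = 1000.0
combined = combine_frames(frames)
angles, intensity = radial_integrate(combined, center=(50, 50), r_min=10, r_max=40)
```

The returned angles and intensities can be passed directly to a Matplotlib polar axis to produce the polar plot.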
Languages, Platforms, & Software Experience
Languages / Platforms
Python, including the Matplotlib, NumPy, and pandas libraries
Data Analysis Software