Summary

A programmer passionate about using automation, data, and technology to promote open government and address socioeconomic and environmental issues. I have 11 years of experience as a programmer applying data and technology to inform research, policy, and regulation in banking and biopharmaceuticals.

Projects

Predicting Building Site Energy Usage - Modeled building site energy usage from building performance data. Trained linear regression and random forest models; the random forest had the lowest mean absolute error (TensorFlow, scikit-learn).
Exploratory Data Analysis of Los Angeles Traffic Collisions - Course final project demonstrating experience with EDA and visualization in Python (pandas, matplotlib, seaborn).

Experience

Data Scientist II

2023 - Present
AbbVie Biopharmaceutical, Information Research
  • Developed a custom Python data pipeline to orchestrate and transform source data into RDF (triple store) for entity reconciliation
  • Orchestrated data pipelines and implemented data testing framework using Airflow
  • Reconfigured three knowledge extraction pipelines through normalization, knowledge extraction, and graph representation
  • Implemented new code deployment and promotion process with external contractors
  • Developed a Python module to standardize and extend PySpark functionality across the team
  • Developed custom Python code to parse corrupted XML files
  • Documented new features and processes in Confluence and created training videos to help cross-train and coach new employees and contractors
  • Created new Development and QA environments to ensure seamless code handoffs to Production operations team

Programmer Analyst II

2019 - 2023
Federal Reserve Bank of Dallas, Statistics
  • Developed four extract-transform-notify Python automation jobs, eliminating over 700 hours of staff work per year
  • Designed and implemented new ETL process to transition survey collection platform to PaaS
  • Performed data migration and database design to on-board two new survey collections
  • Developed a dashboard for preliminary survey data to give researchers real-time data access, eliminating a 1-day delay
  • Developed K-means clustering and k-nearest neighbors (KNN) notebooks as a proof of concept for piloting an AWS SageMaker environment
  • Led requirements, development, testing, and deployment of five minor version software releases
  • Developed Python scripts to migrate survey data to a new database environment

Economic Research Assistant

2018 - 2019
Federal Reserve Bank of Dallas, Research
  • Developed an R script to automate the retrieval, cleaning, and validation of data for economic modeling
  • Developed an Excel VBA workbook to automatically update FOMC regional data, eliminating 4 days of rework in every 6-week cycle
  • Provided research support and analysis for real-time economic updates in the 11th District

Business Operations Analyst

2016 - 2018
Federal Reserve Bank of Dallas, Houston
  • Improved data collection and analysis for the district cafeteria's IRS de minimis calculation
  • Analyzed and summarized regional economic conditions for the Board of Directors
  • Supported executive leadership research requests during and after Hurricane Harvey
  • Represented Houston Branch on the President’s Sustainability Initiative Council

Technical Skills

Programming

  • Python, R, Scala, SAS

Data engineering

  • PySpark, Airflow, Hadoop
  • Neo4j
  • PostgreSQL (psycopg2)
  • SQL Server (SSIS)
  • MongoDB, Redis
  • MarkLogic (RDF/triple store), SPARQL

Data munging

  • Spark, PySpark
  • Pandas, NumPy
  • Hive, Impala, T-SQL, pgSQL

Cloud

  • Linux, Docker, AWS (S3, EC2, SageMaker, boto3)

Collaboration

  • Jira, Agile, Git, GitHub/GitLab, CML, Anaconda, Jupyter

Machine learning

  • Keras, TensorFlow
  • Scikit-learn

Visualization

  • Tableau, Power BI, Spotfire
  • Python (Matplotlib, Seaborn, Plotly)
  • R (ggplot2)
  • HTML, CSS

Methods

  • Machine learning: clustering, k-means, k-nearest neighbors, decision trees, random forest, naive Bayes, neural networks, CNN
  • Statistics: linear regression, gradient descent

Publications

  • Minimum Wages and Occupational Skills Acquired During High School
    Benjamin Meier, Kyrstin Shadle, Brent E. Kreider and Peter F. Orazem
    Iowa State University, 2018
  • At the Heart of Texas: Cities' Industry Clusters Drive Growth
    Pia Orrenius, Laila Assanie, Michael Weiss, Alex Abraham, Stephanie Gullo, Benjamin Meier
    Federal Reserve Bank of Dallas, 2018