hoffmann.zhu@duke.edu

Hey there, my name is

Hoffmann Zhu

Business Analyst & Data Scientist

Welcome to my professional space. I am a business analytics professional who specializes in driving decisions with data. I bring to the table a unique blend of creativity and analytical prowess.

Check Out My Projects!

01. About Me

Hey there!

I'm Hoffmann Zhu, an Aussie with a background in business, data and music. My academic foundation spans a Bachelor of Music in Violin Performance from Fei Tian College and a Master of Science in Business Analytics from Duke University's Fuqua School of Business, a combination that pairs technical proficiency with leadership acumen.

My transition from a career in the music industry to data science and analytics has been driven by curiosity and a growing interest in data. Conversations with friends in the tech industry sparked my fascination with how data informs decisions, shapes strategies, and reveals insights. That passion led me to explore data science on my own through projects and online courses in Python, SQL, and machine learning.

My pursuit of a Master of Science in Business Analytics from Duke University’s Fuqua School of Business was a pivotal step, deepening my technical expertise and reinforcing my commitment to this career path. Through practical projects, I've applied analytical techniques to solve complex problems, blending the meticulousness of a musician with the analytical rigour required in tech. This unique blend of skills has positioned me to tackle challenges in data science with creativity and precision.

My expertise lies in using data analytics tools such as SQL, Python, and R, alongside business intelligence platforms like Tableau and Power BI, to uncover insights and drive business decisions. Through projects that range from market segmentation analyses to user behavior predictions, I've demonstrated a capacity to translate complex data sets into actionable strategies.

Here, you will find a curated showcase of my work and projects that highlight my commitment to excellence and innovation in the field of business analytics. I invite you to explore my portfolio and consider how my blend of analytical skills and strategic thinking can contribute to your organization's success.

In my spare time I love bouldering, improvising on the piano, and reading, all of which foster a mindset that values adaptability, continuous learning, and innovation. These pursuits provide a creative outlet and bring fresh approaches to my analytical problem-solving.

Here are some of the tools and skills I work with:

  • A/B Testing
  • Hypothesis testing
  • Generative AI
  • Diffusion Modeling
  • Hyperparameter Tuning
  • K-Means Clustering
  • Factor Analysis
  • Inferential Statistics
  • Descriptive Statistics
  • Time Series Analysis
  • Survival Analysis
  • Machine Learning Algorithms
  • Snowflake
  • R (dplyr, tidyverse, ggplot2)
  • Python
  • SQL
  • NoSQL
  • MySQL
  • Tableau
  • Advanced Excel
  • Power BI
  • Jupyter
  • Google Analytics
  • MATLAB
  • CRM
  • dbt
  • Data Analysis
  • Data Mining
  • Monte Carlo
  • Leadership
  • Project Management
  • Project Leadership
  • Bayesian Statistics
  • Data Visualization
  • Regression Analysis
  • Statistical Modeling
  • Business Strategy
  • Fivetran

02. Where I've Worked

Business Data Analyst @ Durham, NC

2024

    ·     Uncovered key insights that informed strategic decisions on orthodontist segments by cleaning and validating 97,000+ rows of customer account data using SQL and Python

    ·     Built interactive Tableau dashboards displaying key metrics that helped stakeholders define concrete marketing strategies to identify at-risk customers and boost retention

    ·     Decreased marketing expenses by 7% by building and implementing a BG/NBD model to pinpoint and engage high-value customers, enhancing marketing ROI and customer interaction

    ·     Streamlined communication with non-technical stakeholders by creating comprehensive documentation on data processes, including data cleaning procedures and model implementation

    ·     Collaborated with cross-functional teams in marketing and sales to implement recommended retention strategies and track metrics with Tableau dashboards

    ·     Developed and executed a comprehensive A/B testing strategy to measure the impact of marketing strategies on user retention, increasing customer engagement by 5%
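
As an illustration of the kind of check that sits behind an A/B test like this, here is a minimal sketch of a two-proportion z-test in plain Python. The counts are made up for the example and are not results from this role, and the function name `two_proportion_z_test` is hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (a) and variant (b)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up counts: 200 of 1,000 retained in control, 250 of 1,000 in the variant.
z, p = two_proportion_z_test(conv_a=200, n_a=1000, conv_b=250, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference in retention is unlikely to be due to chance alone; in practice the sample size and significance threshold would be fixed before the test runs.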

Business Data Analyst @ Cuddebackville, NY

2020-2023

    ·     Statistical Modeling: Utilized Python to conduct regression analysis on performance data from 121 shows per year to identify factors impacting performance, increasing performance quality by 15%.

    ·     Data Analysis: Developed and managed complex SQL queries to extract, clean and analyze large datasets (800,000+ rows) on performance metrics, operational data and financial records, reducing query times by 20% and enhancing financial tracking by 12%.

    ·     A/B Testing: Created and executed a comprehensive A/B testing strategy using Python to measure the impact of marketing initiatives on audience retention, increasing customer engagement by 5%.

    ·     Market Research: Conducted market research by analyzing audience feedback and demographic data using Python, cluster analysis and factor analysis to develop customized marketing strategies, boosting ticket sales by 10% in target segments.

    ·     Business Intelligence: Leveraged Power BI to create interactive dashboards that visualized key metrics, leading to data-driven recommendations and a 20% increase in operational efficiency.

    ·     Cross-Functional Collaboration: Facilitated effective communication between cross-functional teams by understanding and translating business requirements into actionable insights, leading to improved alignment on goals.

03. Some Things I've Built

Featured Projects

Athena Project: Game Revenue and Customer Analysis

Objective

Leverage advanced data analytics to determine the game with the highest revenue potential for Athena Softworks by analyzing extensive customer survey data.

Tools and Methods

Techniques: Factor analysis and K-means clustering in Python to segment the customer base, followed by regression analysis to develop a dynamic pricing strategy.

Data Analysis: Segmented customers into five distinct groups, facilitating targeted marketing and optimal pricing strategies based on customer willingness to pay.
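
To give a flavor of the segmentation step, here is a minimal, illustrative K-means sketch in plain Python. It is not the project code (the full report is linked below); the survey rows, the two feature names, and the choice of two clusters are made-up assumptions for the example, whereas the project itself segmented customers into five groups.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm on a small list of numeric tuples."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return labels, centroids

# Hypothetical survey rows: (price sensitivity score, weekly hours played).
survey = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 9.0), (8.5, 9.5), (7.8, 8.8)]
labels, centroids = kmeans(survey, centroids=[survey[0], survey[3]])
print(labels)
print(centroids)
```

Each resulting label marks a customer segment, and the centroids summarize the typical profile of each one, which is what a per-segment pricing analysis would then build on.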

Data

Used extensive customer survey data to understand preferences and willingness to pay, which informed the segmentation and pricing strategy.

Insights & Impact

Strategic Game Selection: Identified the game likely to maximize revenue through detailed analysis, enhancing Athena Softworks' portfolio strategically.

Revenue Optimization: Implemented a tailored pricing strategy across different customer segments that significantly boosted potential revenue.

Business Impact: The project demonstrated the ability to translate complex data into actionable insights, driving strategic business decisions and potentially increasing market competitiveness.

Key Skills Demonstrated

Proficiency in Python for complex data analysis.

Strategic application of statistical methods to solve business problems and inform decision-making.

To see the full report and code, please click on the GitHub icon below

Python, Statistics, Pricing Models, Pricing Strategy, Factor Analysis, K-Means Clustering, Business Strategy, Unsupervised Learning

Featured Project 2

Image Recognition using AI Model (CNN)

Objective

Create a machine learning model using convolutional neural networks (CNNs) to accurately classify varieties of rice based on image data, aiming to enhance agricultural quality control.

Tools and Methods

Techniques: Convolutional Neural Networks (CNNs) designed for deep learning image recognition tasks.

Data Preprocessing: Images were resized, augmented, and normalized to prepare for effective CNN training.

Data

The project utilized the Kaggle Rice Image Dataset, featuring 75,000 images across five rice varieties, annotated with morphological, shape, and color features.

Insights & Impact

High Accuracy: The model achieved 99.33% classification accuracy across the five rice varieties.

Agricultural Application: The accuracy and efficiency of this model can significantly improve seed selection and crop management practices, enhancing yield and minimizing losses.

Industrial Adoption Potential: Integrating this model into rice processing machinery could revolutionize quality control measures, potentially reducing waste and increasing productivity in the agriculture sector.

Key Skills Demonstrated

Expertise in building and tuning high-accuracy neural networks.

Practical application of advanced machine learning to solve significant industry challenges.

To see the full report and code, please click on the GitHub icon below

Python, PyTorch, Deep Learning, Neural Networks, Convolutional Neural Networks, AI, Machine Learning

Featured Project 3

Spotify User Conversion Enhancement

Objective

Develop strategies to convert Spotify's free users to premium subscribers, thereby boosting revenue and profitability.

Tools and Methods

Languages & Software: R, RStudio

Techniques: PCA for dimensionality reduction, K-Means Clustering for user segmentation, and Logistic Regression for conversion probability prediction.

Data

Analysis was based on the "Spotify User Behavior Dataset" from Kaggle, encompassing user demographics, usage history, and behavior patterns.

Insights & Impact

Targeted User Segments: Identified specific user segments such as Cluster 6, which showed a 66.7% willingness to subscribe, offering a strategic target for conversion efforts.

Predictive Accuracy: Employed a Random Forest model that demonstrated high effectiveness in predicting user conversion, as evidenced by robust model performance metrics and a well-performing ROC curve.

Business Impact: The strategies derived from this analysis are projected to significantly increase the rate of conversion from free to premium, thereby enhancing overall revenue. The focus on Cluster 6 is expected to leverage high-conversion potential efficiently.

To see the full report and code, please click on the GitHub icon below

R, Statistical Modeling, Unsupervised Learning, RStudio, PCA, K-Means Clustering, Logistic Regression

04. Education

Duke University, The Fuqua School of Business @ Durham, NC

Master of Science in Business Analytics (MQM)

May 2024

Fei Tian College @ Cuddebackville, NY

Bachelor of Music: Violin Performance

May 2020

·         GPA 3.79

·         Cum Laude

·         President’s List

·         Full Merit Scholarship

05. What's Next?

Get in Touch

Hi, I’m currently exploring new opportunities in the data field, and my inbox is wide open. Whether you’d like to chat, collaborate on a project, ask a question, or just say hi, feel free to reach out and I’ll do my best to get back to you.

Say Hello!