### Slide 1 – Title

**Title:** Did Stack Overflow Answers Increase After ChatGPT?

- Changes in Stack Overflow answer activity post-ChatGPT launch
- Impact of related policy events
- Developer behavior balancing Stack Overflow vs. AI tools

---

### Slide 2 – Research Question

**Research Questions:**

1. Volume of answers:
   - Did Stack Overflow answers change systematically after ChatGPT launched (late 2022)?
2. Policy/event impact:
   - Did AI-answer policies and moderation events create additional shifts?
3. Substitution effect:
   - Are heavy ChatGPT users visiting/answering less on Stack Overflow?

**Approach:**

- Look for structural breaks in answer time series
- Link site-level patterns to developer survey data

---

### Slide 3 – Data Sources

**Dataset 1:**

- Monthly new answer counts (2018–2025)
- Pulled from Stack Exchange Data Explorer
- Includes deleted posts
- Provides pre-ChatGPT baseline and post-event window

**Dataset 2:**

- Microdata from Stack Overflow Developer Surveys (2023–2025)
- Focus:
  - Visit frequency
  - Adoption of AI tools like ChatGPT

**Exploratory Plots:**

- Raw time series
- Pre/post comparisons
- Seasonality
- Moving averages

---

### Slide 4 – Preliminary Patterns

**Key Observations:**

- Long-run time series:
  - Downward drift in answers pre-2022
  - Sharper drop in level and slope post-ChatGPT launch
- Pre/post comparison:
  - Post-ChatGPT period sits lower, even after accounting for seasonal dips (e.g., summer, year-end)
- Seasonal plots:
  - 2018–2025 share consistent within-year rhythm
  - Confirms changes aren’t due to seasonality

---

### Slide 5 – Methodology

**Modelling Strategies:**

1. **Interrupted Time-Series Regression (ITS):**
   - Predictors: time trend, level jump (ChatGPT launch), slope change
   - Optional indicators: policy/moderation periods
2. **Poisson/Negative-Binomial Count Models:**
   - Predictors: same as ITS
   - Suitable for count data
   - Quantifies percentage changes per month
3. **ARIMA Model:**
   - Trained on pre-ChatGPT data
   - Forecasts counterfactual trajectory
   - Compares observed vs. predicted post-event counts
4. **Survey Logistic Regression:**
   - Predicts frequent Stack Overflow visits
   - Predictors: ChatGPT usage, demographics

**Diagnostics:**

- Residual checks
- Over-dispersion
- Out-of-sample performance

---

### Slide 6 – Model Fits & Counterfactuals

**Findings:**

- **Interrupted Time-Series Regression:**
  - Downward level shift post-2022
  - Steeper negative slope post-ChatGPT
  - Controls for pre-existing trend
- **Poisson Model:**
  - Pre-ChatGPT: mild monthly contraction
  - Post-ChatGPT: steeper decline (compounds over time)
- **ARIMA Forecast:**
  - Trained on pre-ChatGPT data
  - Post-2022 counts fall below 80% prediction interval
  - Observed counts never recover

**Takeaway:**

- Structural break in answer supply post-ChatGPT and policy changes
- Changes not explained by trend/seasonality alone

---

### Slide 7 – Survey Results

**Key Insights:**

- **ChatGPT Adoption (2023):**
  - Widespread among developers, especially heavy coders
  - Daily use common in workflows
- **Visit Frequency (2023–2024):**
  - 2023: Heavy ChatGPT users visit Stack Overflow at similar daily rates as non-users
  - 2024: Frequent visits drop more for heavy ChatGPT users
- **Logistic Regression:**
  - ChatGPT usage alone: weak predictor of visit frequency (low-50% accuracy)
  - Combined with cross-tabs: supports partial substitution (marginal questions shifted to ChatGPT)

---

### Slide 8 – Key Findings

**Summary:**

- Monthly answers on Stack Overflow:
  - Sharp drop post-ChatGPT release
  - Continued lower trend (even after controlling for pre-existing decline)
- Policy/moderation events:
  - Additional dips align with governance decisions
  - Suggest amplification of ChatGPT effect
- ARIMA counterfactuals:
  - Post-2022 counts outside expected range of pre-ChatGPT dynamics
- Substitution effect:
  - Heavy ChatGPT users less likely to visit Stack Overflow daily over time

---

### Slide 9 – Limitations

**Caveats:**

1. **Causality:**
   - Overlap of ChatGPT, AI policies, moderation strike
   - Broader economic/tooling trends also in play
2. **SEDE Data:**
   - Doesn’t capture moderation queues/private spaces
   - Some activity may be invisible
3. **Survey Data:**
   - Self-reported
   - May under-represent active answerers or certain regions/roles

**Interpretation:**

- Results are **correlational evidence** of shifts in answer supply/usage patterns
- Not a precise causal estimate of “ChatGPT effect”

---

### Slide 10 – Implications & Future Work

**Implications:**

- Answer supply sensitive to:
  - Assistance tooling
  - Governance decisions
- Platforms should:
  - Carefully consider AI policies/moderation capacity
  - Explore integration with conversational assistants (e.g., structured answer APIs)

**Future Work:**

- Tag-level/user-cohort analyses
- Stronger quasi-experimental designs (e.g., synthetic controls)
-