diff --git a/Assignment IV/Presentation Notes.md b/Assignment IV/Presentation Notes.md
new file mode 100644
index 0000000..1e4d703
--- /dev/null
+++ b/Assignment IV/Presentation Notes.md
@@ -0,0 +1,171 @@
+### Slide 1 – Title
+
+**Title:** Did Stack Overflow Answers Increase After ChatGPT?
+- Changes in Stack Overflow answer activity post-ChatGPT launch
+- Impact of related policy events
+- Developer behavior balancing Stack Overflow vs. AI tools
+
+---
+
+### Slide 2 – Research Question
+
+**Research Questions:**
+1. Volume of answers:
+   - Did Stack Overflow answers change systematically after ChatGPT launched (late 2022)?
+2. Policy/event impact:
+   - Did AI-answer policies and moderation events create additional shifts?
+3. Substitution effect:
+   - Are heavy ChatGPT users visiting/answering less on Stack Overflow?
+
+**Approach:**
+- Look for structural breaks in answer time series
+- Link site-level patterns to developer survey data
+
+---
+
+### Slide 3 – Data Sources
+
+**Dataset 1:**
+- Monthly new answer counts (2018–2025)
+- Pulled from Stack Exchange Data Explorer
+- Includes deleted posts
+- Provides pre-ChatGPT baseline and post-event window
+
+**Dataset 2:**
+- Microdata from Stack Overflow Developer Surveys (2023–2025)
+- Focus:
+  - Visit frequency
+  - Adoption of AI tools like ChatGPT
+
+**Exploratory Plots:**
+- Raw time series
+- Pre/post comparisons
+- Seasonality
+- Moving averages
+
+---
+
+### Slide 4 – Preliminary Patterns
+
+**Key Observations:**
+- Long-run time series:
+  - Downward drift in answers pre-2022
+  - Sharper drop in level and slope post-ChatGPT launch
+- Pre/post comparison:
+  - Post-ChatGPT period sits lower, even after accounting for seasonal dips (e.g., summer, year-end)
+- Seasonal plots:
+  - 2018–2025 share consistent within-year rhythm
+  - Confirms changes aren’t due to seasonality
+
+---
+
+### Slide 5 – Methodology
+
+**Modelling Strategies:**
+1. **Interrupted Time-Series Regression (ITS):**
+   - Predictors: time trend, level jump (ChatGPT launch), slope change
+   - Optional indicators: policy/moderation periods
+2. **Poisson/Negative-Binomial Count Models:**
+   - Predictors: same as ITS
+   - Suitable for count data
+   - Quantifies percentage changes per month
+3. **ARIMA Model:**
+   - Trained on pre-ChatGPT data
+   - Forecasts counterfactual trajectory
+   - Compares observed vs. predicted post-event counts
+4. **Survey Logistic Regression:**
+   - Predicts frequent Stack Overflow visits
+   - Predictors: ChatGPT usage, demographics
+
+**Diagnostics:**
+- Residual checks
+- Over-dispersion
+- Out-of-sample performance
+
+---
+
+### Slide 6 – Model Fits & Counterfactuals
+
+**Findings:**
+- **Interrupted Time-Series Regression:**
+  - Downward level shift post-2022
+  - Steeper negative slope post-ChatGPT
+  - Controls for pre-existing trend
+- **Poisson Model:**
+  - Pre-ChatGPT: mild monthly contraction
+  - Post-ChatGPT: steeper decline (compounds over time)
+- **ARIMA Forecast:**
+  - Trained on pre-ChatGPT data
+  - Post-2022 counts fall below 80% prediction interval
+  - Observed counts never recover
+
+**Takeaway:**
+- Structural break in answer supply post-ChatGPT and policy changes
+- Changes not explained by trend/seasonality alone
+
+---
+
+### Slide 7 – Survey Results
+
+**Key Insights:**
+- **ChatGPT Adoption (2023):**
+  - Widespread among developers, especially heavy coders
+  - Daily use common in workflows
+- **Visit Frequency (2023–2024):**
+  - 2023: Heavy ChatGPT users visit Stack Overflow at similar daily rates as non-users
+  - 2024: Frequent visits drop more for heavy ChatGPT users
+- **Logistic Regression:**
+  - ChatGPT usage alone: weak predictor of visit frequency (low-50% accuracy)
+  - Combined with cross-tabs: supports partial substitution (marginal questions shifted to ChatGPT)
+
+---
+
+### Slide 8 – Key Findings
+
+**Summary:**
+- Monthly answers on Stack Overflow:
+  - Sharp drop post-ChatGPT release
+  - Continued lower trend (even after controlling for pre-existing decline)
+- Policy/moderation events:
+  - Additional dips align with governance decisions
+  - Suggest amplification of ChatGPT effect
+- ARIMA counterfactuals:
+  - Post-2022 counts outside expected range of pre-ChatGPT dynamics
+- Substitution effect:
+  - Heavy ChatGPT users less likely to visit Stack Overflow daily over time
+
+---
+
+### Slide 9 – Limitations
+
+**Caveats:**
+1. **Causality:**
+   - Overlap of ChatGPT, AI policies, moderation strike
+   - Broader economic/tooling trends also in play
+2. **SEDE Data:**
+   - Doesn’t capture moderation queues/private spaces
+   - Some activity may be invisible
+3. **Survey Data:**
+   - Self-reported
+   - May under-represent active answerers or certain regions/roles
+
+**Interpretation:**
+- Results are **correlational evidence** of shifts in answer supply/usage patterns
+- Not a precise causal estimate of “ChatGPT effect”
+
+---
+
+### Slide 10 – Implications & Future Work
+
+**Implications:**
+- Answer supply sensitive to:
+  - Assistance tooling
+  - Governance decisions
+- Platforms should:
+  - Carefully consider AI policies/moderation capacity
+  - Explore integration with conversational assistants (e.g., structured answer APIs)
+
+**Future Work:**
+- Tag-level/user-cohort analyses
+- Stronger quasi-experimental designs (e.g., synthetic controls)
+-