added presentation

2025-12-08 17:38:01 -05:00
parent 091831c67c
commit a9f73e4314
1 changed files with 171 additions and 0 deletions
@@ -0,0 +1,171 @@
+### Slide 1 – Title
+
+**Title:** Did Stack Overflow Answers Increase After ChatGPT?
+- Changes in Stack Overflow answer activity post-ChatGPT launch
+- Impact of related policy events
+- Developer behavior balancing Stack Overflow vs. AI tools
+
+---
+
+### Slide 2 – Research Questions
+
+**Research Questions:**
+1. Volume of answers:
+   - Did Stack Overflow answers change systematically after ChatGPT launched (late 2022)?
+2. Policy/event impact:
+   - Did AI-answer policies and moderation events create additional shifts?
+3. Substitution effect:
+   - Are heavy ChatGPT users visiting/answering less on Stack Overflow?
+
+**Approach:**
+- Look for structural breaks in answer time series
+- Link site-level patterns to developer survey data
+
+---
+
+### Slide 3 – Data Sources
+
+**Dataset 1:**
+- Monthly new answer counts (2018–2025)
+- Pulled from Stack Exchange Data Explorer
+- Includes deleted posts
+- Provides pre-ChatGPT baseline and post-event window
+
+**Dataset 2:**
+- Microdata from Stack Overflow Developer Surveys (2023–2025)
+- Focus:
+  - Visit frequency
+  - Adoption of AI tools like ChatGPT
+
+**Exploratory Plots:**
+- Raw time series
+- Pre/post comparisons
+- Seasonality
+- Moving averages
+
+---
+
+### Slide 4 – Preliminary Patterns
+
+**Key Observations:**
+- Long-run time series:
+  - Downward drift in answers pre-2022
+  - Sharper drop in level and slope post-ChatGPT launch
+- Pre/post comparison:
+  - Post-ChatGPT period sits lower, even after accounting for seasonal dips (e.g., summer, year-end)
+- Seasonal plots:
+  - 2018–2025 share a consistent within-year rhythm
+  - Suggests the post-launch drop is not a seasonal artifact
+
+---
+
+### Slide 5 – Methodology
+
+**Modelling Strategies:**
+1. **Interrupted Time-Series Regression (ITS):**
+   - Predictors: time trend, level jump (ChatGPT launch), slope change
+   - Optional indicators: policy/moderation periods
+2. **Poisson/Negative-Binomial Count Models:**
+   - Predictors: same as ITS
+   - Suitable for count data
+   - Exponentiated coefficients give the percentage change in expected answers per month
+3. **ARIMA Model:**
+   - Trained on pre-ChatGPT data
+   - Forecasts counterfactual trajectory
+   - Compares observed vs. predicted post-event counts
+4. **Survey Logistic Regression:**
+   - Predicts frequent Stack Overflow visits
+   - Predictors: ChatGPT usage, demographics
+
+**Diagnostics:**
+- Residual checks
+- Over-dispersion
+- Out-of-sample performance
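
One common way to run the over-dispersion check is the Pearson dispersion statistic: the Pearson chi-square divided by the residual degrees of freedom. Values well above 1 suggest the Poisson variance assumption is too tight and a negative-binomial model may fit better. The helper name and the toy numbers below are assumptions for illustration. The fitted means would come from whichever count model is actually used.

```python
# Sketch only: the helper name and toy values are hypothetical.

def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    # Under a Poisson model the variance equals the mean, so each squared
    # residual is scaled by the fitted mean.
    chi_sq = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_sq / dof


toy_observed = [8, 12, 10, 14]
toy_fitted = [10.0, 10.0, 10.0, 10.0]  # e.g. an intercept-only fit
print(pearson_dispersion(toy_observed, toy_fitted, n_params=1))
```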
+
+---
+
+### Slide 6 – Model Fits & Counterfactuals
+
+**Findings:**
+- **Interrupted Time-Series Regression:**
+  - Downward level shift post-2022
+  - Steeper negative slope post-ChatGPT
+  - Controls for pre-existing trend
+- **Poisson Model:**
+  - Pre-ChatGPT: mild monthly contraction
+  - Post-ChatGPT: steeper decline (compounds over time)
+- **ARIMA Forecast:**
+  - Trained on pre-ChatGPT data
+  - Post-2022 counts fall below the lower bound of the 80% prediction interval
+  - Observed counts do not return to the forecast range within the sample
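
To make the "compounds over time" point concrete: with a log link, a monthly slope coefficient β corresponds to a percentage change of (exp(β) − 1) × 100 per month, and over n months the cumulative change is (exp(β·n) − 1) × 100. The coefficients in this sketch are hypothetical placeholders, not estimates from the deck.

```python
import math

# Sketch only: coefficient values are hypothetical, not fitted results.

def monthly_pct_change(beta):
    """Percentage change in expected count per month for a log-link slope."""
    return (math.exp(beta) - 1.0) * 100.0


def cumulative_pct_change(beta, months):
    """Compounded percentage change after the given number of months."""
    return (math.exp(beta * months) - 1.0) * 100.0


pre_slope = -0.005      # hypothetical pre-launch monthly slope
slope_change = -0.020   # hypothetical additional slope after the launch
post_slope = pre_slope + slope_change

print(f"pre:  {monthly_pct_change(pre_slope):.2f}% per month")
print(f"post: {monthly_pct_change(post_slope):.2f}% per month")
print(f"post, 12 months: {cumulative_pct_change(post_slope, 12):.2f}%")
```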
+
+**Takeaway:**
+- Structural break in answer supply post-ChatGPT and policy changes
+- Changes not explained by trend/seasonality alone
+
+---
+
+### Slide 7 – Survey Results
+
+**Key Insights:**
+- **ChatGPT Adoption (2023):**
+  - Widespread among developers, especially heavy coders
+  - Daily use common in workflows
+- **Visit Frequency (2023–2024):**
+  - 2023: Heavy ChatGPT users visit Stack Overflow daily at rates similar to non-users
+  - 2024: Frequent visits drop more for heavy ChatGPT users
+- **Logistic Regression:**
+  - ChatGPT usage alone: weak predictor of visit frequency (classification accuracy in the low-50% range)
+  - Combined with cross-tabs: supports partial substitution (marginal questions shifted to ChatGPT)
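
The cross-tabs mentioned above amount to the share of daily visitors within each year and usage group. A minimal sketch follows, assuming each survey response has already been reduced to a `(year, heavy_user, visits_daily)` tuple. That tuple layout is an assumption, the real survey columns are named differently, and the toy rows are made up purely to exercise the function.

```python
from collections import defaultdict

# Sketch only: the row layout and toy rows are hypothetical, not survey data.

def daily_visit_share(rows):
    """Share of daily visitors per (year, heavy_user) cell."""
    totals = defaultdict(int)
    daily = defaultdict(int)
    for year, heavy_user, visits_daily in rows:
        key = (year, heavy_user)
        totals[key] += 1
        daily[key] += 1 if visits_daily else 0
    return {key: daily[key] / totals[key] for key in totals}


toy_rows = [
    (2023, True, True), (2023, True, False),
    (2023, False, True), (2023, False, False),
    (2024, True, True), (2024, True, False),
    (2024, True, False), (2024, True, False),
    (2024, False, True), (2024, False, False),
]
shares = daily_visit_share(toy_rows)
for key in sorted(shares):
    print(key, shares[key])
```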
+
+---
+
+### Slide 8 – Key Findings
+
+**Summary:**
+- Monthly answers on Stack Overflow:
+  - Sharp drop post-ChatGPT release
+  - Continued lower trend (even after controlling for pre-existing decline)
+- Policy/moderation events:
+  - Additional dips align with governance decisions
+  - Suggest amplification of ChatGPT effect
+- ARIMA counterfactuals:
+  - Post-2022 counts outside expected range of pre-ChatGPT dynamics
+- Substitution effect:
+  - Heavy ChatGPT users less likely to visit Stack Overflow daily over time
+
+---
+
+### Slide 9 – Limitations
+
+**Caveats:**
+1. **Causality:**
+   - ChatGPT launch, AI-answer policies, and the moderation strike overlap in time, so their effects cannot be separated
+   - Broader economic/tooling trends also in play
+2. **SEDE (Stack Exchange Data Explorer) Data:**
+   - Doesn’t capture moderation queues/private spaces
+   - Some activity may be invisible
+3. **Survey Data:**
+   - Self-reported
+   - May under-represent active answerers or certain regions/roles
+
+**Interpretation:**
+- Results are **correlational evidence** of shifts in answer supply/usage patterns
+- Not a precise causal estimate of “ChatGPT effect”
+
+---
+
+### Slide 10 – Implications & Future Work
+
+**Implications:**
+- Answer supply sensitive to:
+  - Assistance tooling
+  - Governance decisions
+- Platforms should:
+  - Carefully consider AI policies/moderation capacity
+  - Explore integration with conversational assistants (e.g., structured answer APIs)
+
+**Future Work:**
+- Tag-level/user-cohort analyses
+- Stronger quasi-experimental designs (e.g., synthetic controls)