# *ARM* Chapter 5: Logistic models of well-switching in Bangladesh

The logistic regression we ran for chapter 2 of *Machine Learning for
Hackers* was pretty simple. So I wanted to find an example that would
dig a little deeper into statsmodels’s capabilities and the power of the
patsy formula language.

So I’m taking an intermission from *Machine Learning for Hackers* to
show an example from Gelman and Hill’s *Data Analysis Using
Regression and Multilevel/Hierarchical Models* *(“ARM”)*. Chapter 5
walks through the process of building, interpreting, and diagnosing
a logistic regression model. We’ll end up
with a model with lots of interactions and variable transforms, which is
a great showcase for patsy and the statsmodels formula API.
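
To give a flavor of what "interactions and variable transforms" look like in a patsy formula, here is a minimal sketch on synthetic data. The column names (`switch`, `arsenic`, `dist`, `educ`), the simulated coefficients, and the particular formula are assumptions for illustration only; they are not the real dataset or the final model from the chapter.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data: column names and coefficients are made up for illustration.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "arsenic": rng.uniform(0.5, 5.0, n),   # hypothetical arsenic level
    "dist": rng.uniform(0.0, 300.0, n),    # hypothetical distance in meters
    "educ": rng.integers(0, 17, n),        # hypothetical years of education
})

# Simulate a binary outcome from an assumed "true" linear predictor.
true_logit = (0.5 * np.log(df["arsenic"]) - 0.008 * df["dist"] + 0.03 * df["educ"]).to_numpy()
df["switch"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

# In a patsy formula:
#   np.log(...)  applies a transform inline,
#   I(...)       protects arithmetic so "/" means division,
#   a * b        expands to a + b + a:b (main effects plus interaction).
formula = "switch ~ np.log(arsenic) * I(dist / 100.) + educ"
model = smf.logit(formula, data=df).fit(disp=0)
print(model.params)
```

The formula string carries the transforms and the interaction, so the data frame itself needs no extra derived columns.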

## Logistic model of well-switching in Bangladesh

Our data cover about 3,000 respondent households in Bangladesh whose wells contain an unsafe amount of arsenic. They record the amount of arsenic in the respondent’s well, the distance to the nearest safe well (in meters), whether that respondent “switched” wells by using a neighbor’s safe well instead of their own, as well as the respondent’s years of education and a dummy variable indicating whether they belong to ...