{"id":99,"date":"2024-08-30T15:24:40","date_gmt":"2024-08-30T05:24:40","guid":{"rendered":"https:\/\/xinranhu.com\/?p=99"},"modified":"2024-11-19T22:46:32","modified_gmt":"2024-11-19T12:46:32","slug":"why-should-you-include-your-controls-in-the-first-stage-of-2sls","status":"publish","type":"post","link":"https:\/\/xinranhu.com\/index.php\/2024\/08\/30\/why-should-you-include-your-controls-in-the-first-stage-of-2sls\/","title":{"rendered":"Why should you include your controls in the first stage of 2SLS?"},"content":{"rendered":"\n<p>So I was tutoring an undergraduate\/master&#8217;s level applied econometrics course, and several students asked me why it is necessary to include also the exogenous control variables in both stages of 2SLS &#8211; More specifically, wouldn&#8217;t this double-count the correlation between the instrument and the controls?<\/p>\n\n\n\n<p>This turns out to be a fascinating question that I somehow have never thought carefully about, and seemingly also lacks documentation online (Although I am known as a horrible search engine user&#8230;). Hence I spent half an afternoon creating an illustration, which, to maximize accessibility to an audience with little matrix algebra background, is purely algebraic:<\/p>\n\n\n\n<p>For simplicity, consider the following regression<br>$$\\qquad\\qquad Y_i=\\beta_0+\\beta_1X_i+\\beta_2K_i+u_i\\,, \\qquad\\qquad(1)$$<\/p>\n\n\n\n<p>where $X_i$ and $u_i$ are correlated, and $K_i$ is exogenous. We will need IV(s) for $X_i$ because it&#8217;s endogenous.<\/p>\n\n\n\n<p>A quick reminder that the second stage of 2SLS is running the following regression<\/p>\n\n\n\n<p>$$Y_i=\\beta_0+\\beta_1\\hat{X}_i+\\beta_2K_i+\\nu_i\\,, $$<\/p>\n\n\n\n<p>where $\\hat{X}_i$ is the predicted value of $X_i$ from the first stage, whatever that might be.<\/p>\n\n\n\n<p>It makes sense that $K_i$ needs to be included in the second stage, as that is just running (1) using $\\hat{X}_i$ instead. 
In the first stage, for simplicity let&#8217;s say that we are using only one instrument, $Z_i$. So the question becomes, why does<\/p>\n\n\n\n<p>\\begin{equation}<br>\\qquad\\qquad X_i=\\pi_{0,1}+\\pi_{1,1}Z_i+\\pi_{2,1}K_i+\\xi_{1,i} \\qquad\\qquad(2)<br>\\end{equation}<\/p>\n\n\n\n<p>make more sense than<\/p>\n\n\n\n<p>\\begin{equation}<br>\\qquad\\qquad X_i=\\pi_{0,2}+\\pi_{1,2}Z_i+\\xi_{2,i}\\,? \\qquad\\qquad(3)<br>\\end{equation}<\/p>\n\n\n\n<p>Well, it&#8217;s because of the <strong>omitted variable bias<\/strong>! $Z$ and $K$ might well be correlated (the assumptions for $Z$ to be a valid IV do not rule this out!), and if they are, we get ourselves into trouble unless we account for that in the first stage.<\/p>\n\n\n\n<p>For illustrative purposes, assume $K_i=C_i+\\theta Z_i$ where $corr(C_i,Z_i)=0$, and we are fully aware of that (i.e., both $C_i$ and $\\theta$ are actually known). Then (2) becomes<\/p>\n\n\n\n<p>\\begin{align*} X_i&amp;=\\pi_{0,1}+\\pi_{1,1}Z_i+\\pi_{2,1}(C_i+\\theta Z_i)+\\xi_{1,i} \\\\ &amp;=\\pi_{0,1}+(\\pi_{1,1}+\\pi_{2,1}\\theta)Z_i+\\pi_{2,1}C_i+\\xi_{1,i}\\,. \\end{align*}<\/p>\n\n\n\n<p>So if we regress $X$ on $Z$ and $K$, we are basically regressing $X$ on $Z$ and $C$, i.e., estimating $\\pi_{2,1}$ and $\\pi_{1,1}+\\pi_{2,1}\\theta$ and backing out $\\pi_{1,1}$, as we know the value of $\\theta$.<sup data-fn=\"4cf08db5-dfdf-40dd-8dde-3366a6676b1a\" class=\"fn\"><a href=\"#4cf08db5-dfdf-40dd-8dde-3366a6676b1a\" id=\"4cf08db5-dfdf-40dd-8dde-3366a6676b1a-link\">1<\/a><\/sup> $\\hat{\\pi}_{1,1}$ and $\\hat{\\pi}_{2,1}$ will be unbiased in this situation.<\/p>\n\n\n\n<p>On the other hand, in (3) we have an omitted variable, $K_i$. We can actually pin down the size of the OVB because we know exactly what&#8217;s omitted here! 
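<\/p>\n\n\n\n<p>Before pinning it down algebraically, here is a quick simulation sketch of where this is heading. (Everything about the data-generating process below &#8211; the parameter values, the noise structure &#8211; is invented purely for illustration.) It runs 2SLS by hand once with first stage (2) and once with first stage (3), and compares the resulting estimates of $\\beta_1$:<\/p>\n\n\n\n

```python
import numpy as np

# Illustrative DGP (all parameter values are invented for this sketch):
#   K = C + theta*Z                -> the control K is correlated with the IV Z
#   X = 1 + 0.5*Z + 0.8*K + noise  -> the noise shares a shock with u, so X is endogenous
#   Y = 2 + beta1*X + beta2*K + u  -> beta1 = 1.5 is the target parameter
rng = np.random.default_rng(0)
n = 200_000
theta, beta1, beta2 = 0.7, 1.5, -0.4

Z = rng.normal(size=n)
C = rng.normal(size=n)
K = C + theta * Z
e = rng.normal(size=n)  # common shock creating the endogeneity
X = 1 + 0.5 * Z + 0.8 * K + rng.normal(size=n) + e
Y = 2 + beta1 * X + beta2 * K + rng.normal(size=n) + e

def ols(y, *cols):
    """OLS of y on a constant plus the given columns; returns the coefficients."""
    A = np.column_stack([np.ones_like(y), *cols])
    return np.linalg.lstsq(A, y, rcond=None)[0]

# First stage (2): instrument AND control.
p = ols(X, Z, K)
Xhat_with_K = p[0] + p[1] * Z + p[2] * K

# First stage (3): instrument only.
q = ols(X, Z)
Xhat_without_K = q[0] + q[1] * Z

# Second stage (K included in both cases); the coefficient on X-hat estimates beta1.
b_with = ols(Y, Xhat_with_K, K)[1]
b_without = ols(Y, Xhat_without_K, K)[1]
print("beta1-hat, control in first stage:    ", round(b_with, 3))
print("beta1-hat, control NOT in first stage:", round(b_without, 3))
```

\n\n\n\n<p>With $K$ in the first stage, the estimate sits essentially on the true $\\beta_1=1.5$; without it, the first-stage coefficient on $Z$ absorbs the part of $K$ that is correlated with $Z$, and the estimate comes out visibly off. Now back to pinning this down with algebra.<\/p>\n\n\n\n<p>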
Equation (3) is equivalent to a slight rearrangement of what we just did:<\/p>\n\n\n\n<p>\\[X_i=(\\pi_{0,1}+\\pi_{2,1}C_i)+(\\pi_{1,1}+\\pi_{2,1}\\theta)Z_i+\\xi_{1,i}\\,,\\]<\/p>\n\n\n\n<p>which means<\/p>\n\n\n\n<p>\\begin{align*}\\hat{\\pi}_{0,2}&amp;=\\hat{\\pi}_{0,1}+\\hat{\\pi}_{2,1}\\bar{C}\\,,\\\\ \\hat{\\pi}_{1,2}&amp;=\\hat{\\pi}_{1,1}+\\hat{\\pi}_{2,1}\\theta\\,, \\end{align*}<\/p>\n\n\n\n<p>indicating that $\\hat{\\pi}_{1,2}$ comes with a bias of size $\\hat{\\pi}_{2,1}\\theta$, which we will not be able to remove if we have only estimated (3), as we have no idea of the size of $\\hat{\\pi}_{2,1}$ unless we also estimate (2).<\/p>\n\n\n\n<p>So imagine plugging each of the two $\\hat{X}$s into the second-stage equation:<\/p>\n\n\n\n<p>\\begin{align*}(2)\\quad\\Rightarrow\\quad Y_i&amp;=\\beta_{0,1}+\\beta_{1,1}\\hat{X}_{i,1}+\\beta_{2,1}K_i+u_i \\\\<br>&amp;=\\beta_{0,1}+\\beta_{1,1}(\\color{blue}{\\hat{\\pi}_{0,1}}\\color{black}{+}(\\color{blue}{\\hat{\\pi}_{1,1}}\\color{black}{+}\\color{blue}{\\hat{\\pi}_{2,1}\\theta}\\color{black}{)Z_i+}\\color{blue}{\\hat{\\pi}_{2,1}}\\color{black}{C_i)+\\beta_{2,1}(C_i+}\\color{blue}{\\theta} \\color{black}{Z_i)+u_i} \\\\<br>&amp;=(\\beta_{0,1}+\\beta_{1,1}\\color{blue}{\\hat{\\pi}_{0,1}}\\color{black}{)+[\\beta_{1,1}(}\\color{blue}{\\hat{\\pi}_{1,1}}\\color{black}{+}\\color{blue}{\\hat{\\pi}_{2,1}\\theta}\\color{black}{)+\\beta_{2,1}}\\color{blue}{\\theta}\\color{black}{]Z_i+(\\beta_{1,1}}\\color{blue}{\\hat{\\pi}_{2,1}}\\color{black}{+\\beta_{2,1})C_i+u_i} \\\\<br>(3)\\quad\\Rightarrow\\quad Y_i&amp;=\\beta_{0,2}+\\beta_{1,2}\\hat{X}_{i,2}+\\beta_{2,2}K_i+u_i \\\\<br>&amp;=\\beta_{0,2}+\\beta_{1,2}(\\color{blue}{\\hat{\\pi}_{0,2}}\\color{black}{+}\\color{blue}{\\hat{\\pi}_{1,2}}\\color{black}{Z_i)+\\beta_{2,2}(C_i+}\\color{blue}{\\theta} \\color{black}{Z_i)+u_i} 
\\\\<br>&amp;=(\\beta_{0,2}+\\beta_{1,2}\\color{blue}{\\hat{\\pi}_{0,2}}\\color{black}{)+(\\beta_{1,2}}\\color{blue}{\\hat{\\pi}_{1,2}}\\color{black}{+\\beta_{2,2}}\\color{blue}{\\theta}\\color{black}{)Z_i+\\beta_{2,2}C_i+u_i}\\end{align*}<\/p>\n\n\n\n<p>I&#8217;m colouring all the <em>numbers<\/em> we know in each scenario in blue. The reason I intentionally use $\\beta_{j,1}$ and $\\beta_{j,2}$ is that, as you&#8217;ll soon see, plugging in the two different $\\hat{X}$s gives you different estimates!<\/p>\n\n\n\n<p>To see how, suppose we run the following regression:<\/p>\n\n\n\n<p>$$Y_i=\\gamma_0+\\gamma_1Z_i+\\gamma_2C_i+u_i\\,.$$<\/p>\n\n\n\n<p>Since $Z$ and $C$ are both exogenous, we can obtain unbiased $\\hat{\\gamma}\\,$s:<\/p>\n\n\n\n<p>\\begin{array}{rclcl} \\color{blue}{\\hat{\\gamma}_0}&amp;\\color{black}{=}&amp;\\hat{\\beta}_{0,1}+\\hat{\\beta}_{1,1}\\color{blue}{\\hat{\\pi}_{0,1}}&amp;\\color{black}{=}&amp;\\hat{\\beta}_{0,2}+\\hat{\\beta}_{1,2}\\color{blue}{\\hat{\\pi}_{0,2}} \\\\ \\color{blue}{\\hat{\\gamma}_1}&amp;\\color{black}{=}&amp;\\hat{\\beta}_{1,1}(\\color{blue}{\\hat{\\pi}_{1,1}}\\color{black}{+}\\color{blue}{\\hat{\\pi}_{2,1}\\theta}\\color{black}{)+\\hat{\\beta}_{2,1}}\\color{blue}{\\theta}&amp;\\color{black}{=}&amp;\\hat{\\beta}_{1,2}\\color{blue}{\\hat{\\pi}_{1,2}}\\color{black}{+\\hat{\\beta}_{2,2}}\\color{blue}{\\theta} \\\\<br>\\color{blue}{\\hat{\\gamma}_2}&amp;\\color{black}{=}&amp;\\hat{\\beta}_{1,1}\\color{blue}{\\hat{\\pi}_{2,1}}\\color{black}{+\\hat{\\beta}_{2,1}}&amp;\\color{black}{=}&amp;\\hat{\\beta}_{2,2}<br>\\end{array}<\/p>\n\n\n\n<p>A little algebra then solves for the IV estimates corresponding to the two different first stages:<\/p>\n\n\n\n<p>\\begin{align*} (2)\\quad\\Rightarrow\\quad \\hat{\\beta}_{1,1}&amp;=\\frac{\\color{blue}{\\hat{\\gamma}_1}\\color{black}{-}\\color{blue}{\\hat{\\gamma}_2\\theta}}{\\color{blue}{\\hat{\\pi}_{1,1}}}\\\\<br>(3)\\quad\\Rightarrow\\quad 
\\hat{\\beta}_{1,2}&amp;=\\frac{\\color{blue}{\\hat{\\gamma}_1}\\color{black}{-}\\color{blue}{\\hat{\\gamma}_2\\theta}}{\\color{blue}{\\hat{\\pi}_{1,2}}}=\\frac{\\color{blue}{\\hat{\\gamma}_1}\\color{black}{-}\\color{blue}{\\hat{\\gamma}_2\\theta}}{\\hat{\\pi}_{1,1}+\\hat{\\pi}_{2,1}\\color{blue}{\\theta}}<br>\\end{align*}<\/p>\n\n\n\n<p>The only difference between the two IV estimates is the denominator! $\\color{blue}{\\hat{\\pi}_{1,1}}$ is unbiased, while $\\color{blue}{\\hat{\\pi}_{1,2}}$ is biased. Hence, if we run the first stage without including $K_i$, we end up with a biased IV estimate of $\\beta_1$.<\/p>\n\n\n<ol class=\"wp-block-footnotes\"><li id=\"4cf08db5-dfdf-40dd-8dde-3366a6676b1a\">In practice you can imagine that we regress $K$ on $Z$ and $C$ to get $\\hat{\\theta}$ <a href=\"#4cf08db5-dfdf-40dd-8dde-3366a6676b1a-link\" aria-label=\"Jump to footnote reference 1\">\u21a9\ufe0e<\/a><\/li><\/ol>","protected":false},"excerpt":{"rendered":"<p>So I was tutoring an undergraduate\/master&#8217;s level applied econometrics course, and several students asked me why it is necessary to include also the exogenous control variables in both stages of 2SLS &#8211; More specifically, wouldn&#8217;t this double-count the correlation between the instrument and the controls? 
This turns out to be a fascinating question that I [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_themeisle_gutenberg_block_has_review":false,"footnotes":"[{\"content\":\"In practice you can imagine that we regress $K$ on $Z$ and $C$ to get $\\\\hat{\\\\theta}$\",\"id\":\"4cf08db5-dfdf-40dd-8dde-3366a6676b1a\"}]"},"categories":[8],"tags":[],"class_list":["post-99","post","type-post","status-publish","format-standard","hentry","category-economics"],"_links":{"self":[{"href":"https:\/\/xinranhu.com\/index.php\/wp-json\/wp\/v2\/posts\/99","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/xinranhu.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/xinranhu.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/xinranhu.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/xinranhu.com\/index.php\/wp-json\/wp\/v2\/comments?post=99"}],"version-history":[{"count":60,"href":"https:\/\/xinranhu.com\/index.php\/wp-json\/wp\/v2\/posts\/99\/revisions"}],"predecessor-version":[{"id":162,"href":"https:\/\/xinranhu.com\/index.php\/wp-json\/wp\/v2\/posts\/99\/revisions\/162"}],"wp:attachment":[{"href":"https:\/\/xinranhu.com\/index.php\/wp-json\/wp\/v2\/media?parent=99"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/xinranhu.com\/index.php\/wp-json\/wp\/v2\/categories?post=99"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/xinranhu.com\/index.php\/wp-json\/wp\/v2\/tags?post=99"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}