ISSN 1403-2473 (Print)
ISSN 1403-2465 (Online)

Working Paper in Economics No. 808

An App Call a Day Keeps the Patient Away? Substitution of Online and In-Person Doctor Consultations Among Young Adults

Lina Maria Ellegård, Gustav Kjellsson and Linn Mattisson*

Department of Economics, June 2021

June 7, 2021

Abstract

The emergence of markets for online physician consultations – direct-to-consumer telemedicine (DCT) – is transforming healthcare services in many nations. The convenience of DCT lowers the cost of seeking care, thus potentially increasing demand. Yet, it is not known whether patients consuming online care turn to traditional providers as well. This is one of the first studies to causally assess the degree to which online physician consultations substitute for in-person consultations. We exploit the rapid emergence of a DCT market and exogenous changes in patient fees in a fuzzy difference-in-discontinuities analysis of young adults in two Swedish regions. We find evidence in support of partial substitution and an increase in total physician consultations.

*Ellegård: Department of Economics, Lund University, linamaria.ellegard@nek.lu.se. Kjellsson: Department of Economics, gustav.kjellsson@economics.gu.se. Mattisson: Department of Economics, Lund University, linn.mattisson@nek.lu.se. Acknowledgements: We are thankful to Otàvio Bartalotti, Jan Bietenbeck, Carl Bonander, Yang He, Annika Herr, Naimi Johansson, Filippo Tassinari, and seminar participants at the EuHea seminar series, University of Gothenburg, and Lund University for helpful comments. This work was funded by Jan Wallanders och Tom Hedelius stiftelse and Tore Browaldhs stiftelse (grant no. P19-0064), the Swedish Research Council for Health, Working Life and Welfare (FORTE, grant no.
2019-00123), Adlerbert Research Foundation, and Foundation for Economic Research in West Sweden.

1 Introduction

In only a couple of decades, digitalisation has transformed many sectors including retail, travel and the financial industry. In the health care sector, the outbreak of the Covid-19 pandemic greatly reinforced the ongoing digitalisation process, as remote consultations almost overnight became the preferred – or only – option available to examine patients without contributing to the spread of contagion (Mehrotra, Bhatia and Snoswell, 2021). But even before the pandemic, companies offering on-demand physician consultations via video calls or chats, so-called direct-to-consumer telemedicine (DCT), were rapidly growing in many countries.1 To illustrate, the Swedish DCT market, which emerged only in 2016, accounted for almost 5% of all primary care physician consultations in 2018 (SALAR, 2020).

In contrast to traditional primary health care, DCT services are available around the clock to patients irrespective of their geographic location, and patients incur no travel costs. These features make DCT an attractive substitute for in-person consultations, but also suggest that the availability of DCT might increase demand. In settings where DCT is (partly) financed by a third-party payer, such as an employer health plan or a national health insurance,2 this in turn means that DCT may aggravate existing moral hazard problems and spur inefficient consumption.

Although the unit costs of DCT consultations may be lower than for in-person visits (Uscher-Pines et al., 2015a; Shi et al., 2018; Ekman, 2018), DCT will cause a net rise in health care costs if the demand effect is sufficiently strong (Licurse and Mehrotra, 2018).
Given long-standing concerns over rising health care costs, it is thus important to assess the degree to which patients use DCT services as a substitute for traditional care, and to what extent the availability of these services increases total demand.

We address this question in the context of the two largest regions in Sweden (Västra Götaland and Stockholm), where a number of private DCT companies have operated on a fee-for-service basis since 2016. To obtain causal estimates, we exploit exogenous variation in consultation fees facing young DCT patients in 2016-18. Before their 20th birthday, individuals could access DCT for free; thereafter they had to pay a fee of approximately EUR 25/USD 30 per consultation. We use the change in DCT consultations induced by the exogenous fee increase at the 20th birthday to identify the degree of substitution between DCT and in-person physician consultations. Recognising that the demand for in-person visits may change for other reasons around the 20th birthday, we purge our estimates of general 20th birthday effects using the change in traditional, in-person visits around the 20-year threshold of cohorts that reached the age limit before the emergence of the DCT market.

1 Examples include Teladoc and K-Health in the US, Babylon GP at Hand in the UK, Ping An Good Doctor in China, and Kry (Salisbury et al., 2020).
2 According to a survey by America’s Health Insurance Plans (AHIP), a vast majority of commercial plans offered virtual care already in 2019 (AHIP, 2019).

Our results suggest that approximately 50% of online consultations replace in-person visits. The remaining consultations represent additional demand, i.e. consultations that would not have taken place if DCT was not available.
Breaking the results down by conditions, we find tentative evidence that the degree of substitution is substantially larger for respiratory infections, while online consultations related to skin conditions and reproductive health primarily represent new consumption or substitution from cheaper professional categories.

These results speak to the present challenge of designing incentives for the continued use of telemedicine in the post-pandemic era (Cutler, Nikpay and Huckman, 2020; Mehrotra, Bhatia and Snoswell, 2021). First, our results provide proof-of-concept that even in the absence of a pandemic, providers are willing to supply remote consultations when there are financial incentives to do so. Second, payers ought to be aware of the risks of inefficient use of physician time (i.e. when other professional categories could treat the patient at a lower cost), and they ought to discourage the treatment of minor issues that patients would normally not see a doctor for. Third, current co-payment levels may need to be revised in the face of lower indirect costs on the part of patients.

Although our analysis is restricted to individuals in a narrow age span, we show that our study population is similar to other young adults in terms of morbidity. The relevance of our analysis of early adopters may arguably have increased after the pandemic, as greater shares of the population now have experience of and are comfortable with remote consultations. It is also notable that we obtain similar estimates in two independent administrative regions, with quite different incentive structures for traditional care providers.

The earlier literature on substitution of DCT and in-person consultations is small. In two surveys of patients in the US (Martinez et al., 2018; Nord et al., 2018), most respondents stated that the DCT consultation replaced a practice-based consultation; only about 15% stated that they would have abstained from seeking care if DCT was not available.
In Nord et al. (2018), most respondents further stated that the DCT consultation fully addressed their problem. Notably, the responses in Martinez et al. (2018) indicated that DCT consultations primarily substituted for nurse visits.

The reliability of survey evidence is limited by the risk that respondents, either consciously or unconsciously, do not state the truth. To date, Ashwood et al. (2017) and Ellegård and Kjellsson (2019) are the only peer-reviewed studies that use care utilisation data to estimate the degree of substitution. Ashwood et al. (2017) studied care utilisation for acute respiratory infections among Californian public employees. Their matched difference-in-differences (DiD) analysis indicated that 12% of DCT consultations replaced in-person visits, and thus 88% of DCT consultations represented additional demand.3 Ellegård and Kjellsson (2019) studied a representative sample of the population in the Swedish region Skåne. They compared the number of in-person primary care consultations before and after the emergence of the DCT market in DiD regressions using entropy balancing to adjust for pre-existing differences in demography and morbidity between DCT users and other individuals. The results indicated that DCT consultations did not replace in-person physician visits at all.

We find a substantially higher degree of substitution than previous studies with actual utilisation data. While the difference might relate to differences in populations and institutional settings, we also note that the DiD approaches of earlier studies may fail to account for time-variant unobserved heterogeneity and thus underestimate the degree of substitution.
DCT providers typically address sudden and transitory health problems, such as respiratory infections or skin conditions; i.e., issues that are neither subsumed by time-invariant group-specific characteristics (fixed effects) nor possible to account for by controlling for the patient’s previously observed health issues.

Outside the DCT setting, our study also relates to the literature on the adoption of similar telemedicine technology within traditional health care organisations. In line with the findings from the DCT market, studies of patient portals offering patients the opportunity to communicate with their regular physician via electronic visits or secure messages indicate that such new modes of contacting one’s regular provider only partially replace in-person visits (Pearl, 2014; North et al., 2014; Shah et al., 2018; Bavafa, Hitt and Terwiesch, 2018).

Our study also relates to the literature on how user fees affect care utilisation. We find that the fee reduces the number of DCT consultations by about 50% (.15 visits per year). This may be compared to the drop of about 10% (.8-.10) in the number of in-person consultations previously estimated for individuals reaching the age-based fee thresholds in the Swedish regions Västra Götaland (Johansson, Jakobsson and Svensson, 2019) and Skåne (Nilsson and Paul, 2018). The price elasticity of demand for DCT services is thus considerably higher.4

3 On the same note, two studies of DCT visits for acute respiratory infections in the US suggest that the number of follow-up visits is greater for patients whose index visit was in a DCT setting than for patients with in-person index visits (Shi et al., 2018; Li et al., 2021).

Finally, while this paper analyses a highly current topic, the emergence of DCT services and its effect on the overall primary care sector is an example of how markets and their actors change with the adoption of new technology.
The emergence of DCT services provides an example of the distinguishing features of digitalisation discussed in Goldfarb and Tucker (2019), i.e., the reduction in indirect costs such as search costs and transportation costs. Examples from other sectors include automation in production (Cinco, 2021; Arntz, Gregory and Zierahn, 2019) and substitution between online and brick-and-mortar retail (Baugh, Ben-David and Park, 2018).

The paper proceeds as follows. The next sections provide a background to the institutional setting (Section 2), describe the data (Section 3) and our empirical strategy (Section 4). Section 5 provides results and Section 6 concludes.

4 Notably, the results of these earlier studies highlight the need to go beyond a simple regression discontinuity approach when using the 20-year age threshold as basis for identification, as the same age threshold is used for both the DCT consultation fee and for in-person consultation fees in one of our study regions (Västra Götaland).

2 Institutional background

2.1 Traditional primary care

Financing and provision of healthcare services in Sweden is delegated to 21 independent regions. Healthcare is mainly financed by regional proportional income taxes (71%), central government grants (20%) and patient fees (5%) (SALAR, 2017).

Primary care handles health issues that can be treated outside hospitals, and is organised in group practices – primary care centers (PCCs) – typically staffed by 4–6 general practitioners (GPs) and nurses (Anell, 2015).5 PCCs may be public or private, and the staff are salaried employees (Anell, Glenngaard and Merkur, 2012). Patients may register at any PCC. Notably, being registered does not entail any restrictions on the patient’s choice of provider; patients are free to seek care with any outpatient provider (but the user fee may vary).

5 PCCs may also employ, or contract with, other professional categories, e.g., physiotherapists and cognitive therapists.
During our study period, patients seeking care would normally first contact their PCC by phone.6 After a nurse-led phone triage (screening), the patient would either get an appointment at the PCC (with a GP or a nurse), or the nurse would give self-care advice. Low access to primary care is a long-standing and much debated issue in Sweden (Anell, 2015). Nevertheless, among those who were offered an appointment at a PCC in 2018, around 70% were able to see the physician on the same day.7

The regions have the right to levy fees for consultations taking place within the region, subject to the nationally set restriction that each patient must not pay more than EUR 100 annually. Throughout our study period, the fee level in Stockholm was EUR 20, while patients in Västra Götaland (VGR) paid EUR 10 for visits at the PCC where they were registered and EUR 30 for visits at other providers. The lower age limits for user fees in Stockholm and Västra Götaland were 18 and 20, respectively.

The main reimbursement of PCCs comes from other sources. In Västra Götaland (VGR), PCCs were (and still are) almost entirely reimbursed by capitation, i.e., a fixed monthly sum per registered patient. In the Stockholm region (Sthlm), capitation accounted for roughly 60% of reimbursement, and most of the residual was variable payment based on the number of visits. Comparing the two regions, PCCs in Stockholm thus had stronger incentives to provide consultations.

2.2 The Swedish DCT market

Since 2016, the PCCs have faced competition from a number of private DCT providers. The emergence of the Swedish DCT market was an unintended consequence of the Patient Right Law, enacted in 2015, which gave patients the right to seek care outside their region of residence. DCT entrepreneurs realised that they could locate a DCT company in one region, treat patients in other regions, and then bill the patients’ home regions.
Notably, this arrangement means that the DCT providers operate outside the regular reimbursement systems, and are instead subject to the regulation for inter-regional reimbursement, which is negotiated by the Swedish Association of Local Governments and Regions (SALAR). In practice, this means that DCT companies are reimbursed on a fee-for-visit basis.8 If the reimbursement exceeds the DCT companies’ marginal cost of a consultation, the DCT providers have incentives to be generous with appointments.

6 More recently, all regions have developed systems for asynchronous contacts allowing patients to describe their problems in a form and then be contacted by the PCC.
7 Figures from https://www.vantetider.se/Kontaktkort/Sveriges/PrimarvardBesok, accessed on Feb 15 2021.

Patients get in touch with a DCT provider by describing their symptoms in a smartphone app or on the company’s website. During our study period, patients did not have to undergo nurse-led triage before being connected to a physician when contacting the two dominating DCT companies Kry and Min Doktor.9 Around 80% of patients waited less than 30 minutes for their online consultation at Kry (Kry, 2019).

Online consultations may take the form of asynchronous chats or video consultations. These modes of contact obviously preclude physical examinations, but DCT providers are allowed to set diagnoses, prescribe drugs, and write referrals to other providers. In other words, physicians working at a DCT company have the same responsibilities and authorities as physicians working in the traditional primary care sector. Importantly, few traditional PCCs offered chats or video calls in 2018.10

In comparison with patients attending practice-based primary care, infants and adolescents are over-represented at DCT providers. The reasons for consultation also differ, with skin conditions and respiratory infections accounting for a higher share of consultations in DCT settings (Ellegård and Kjellsson, 2019).
Patients pay a consultation fee according to the rules of the region where the DCT provider is located (Blix and Jeansson, 2019). During our study period, the largest providers Kry and Min Doktor were located in Region Jönköping, where patients paid a consultation fee of EUR 25. A minor provider (Doktor.se) was located in Sörmland, where there was no consultation fee.11 Consultations in Jönköping accounted for almost 90% of DCT consultations in 2018.

8 Initially, the reimbursement was approximately EUR 180, which roughly corresponds to the cost per consultation in primary care (a crude average). Recognising that this reimbursement level did not correspond to DCT providers’ marginal costs (due to their very different casemix), SALAR reduced the reimbursement to EUR 65 in 2017 (SALAR, 2020).
9 The third largest company, Doktor.se, which only served a small fraction of the market, had an initial nurse-led triage step.
10 The traditional provider Capio and public providers in Västra Götaland launched online platforms in 2018, but the outreach was negligible. The user fee for these platforms also changed at the 20th birthday.
11 In 2019, Kry and Min Doktor re-located to Sörmland.

For ease of reference, Table 1 summarises the user fee and reimbursement systems in the traditional primary care and DCT sectors.

Table 1: Summary of Institutional Setting

                          Sthlm                    VGR
TRADITIONAL PCC
  Reimbursement:          Capitation + Per visit   Capitation
  User fee age limit:     18                       20
  User fee:               €20                      €10/30*
DCT (ONLINE) MARKET
  Reimbursement:          Per visit                Per visit
  User fee age limit:     20                       20
  User fee:               €25                      €25

Note: The table describes the reimbursement system and user fee for traditional in-person visits and DCT consultations. The user fee for the online consultations refers to the user fee in Region Jönköping, where the two largest DCT providers were located during the study period.
*In VGR the user fee is lower if the patient visits the PCC where they are registered.

3 Data

3.1 Study population and data sources

Using data from the Swedish population register (held by Statistics Sweden, SCB), we define a study population consisting of all individuals who belonged to the 19-20 year age group in any of the years 2012-2018, who resided in either the Stockholm or the Västra Götaland region on two consecutive New Year’s Eves,12 and who had lived in Sweden at least since they were 15 years old.13

The care utilisation data are collected from regional administrative registers. The daily data include the universe of consultations with primary care physicians and nurses in these regions in 2012-2018, including diagnosis codes (ICD-10). To identify online consultations, we use billing information available in the regional care registers. These data cover all billable contacts with primary care providers in other regions. We also obtain information on diagnoses set at each DCT consultation in Region Jönköping, where the two main DCT providers were located during the study period.

12 Our annual data on place of residence is measured on December 31st.
13 We employ this restriction to avoid compositional changes driven by the immigration wave in 2014-15.

We link the care data to annual data on demographic and socioeconomic characteristics of the study population and their parents obtained from SCB. Importantly, these data include the exact dates of birth of the study population, and we can thus examine a narrow time span around the 20th birthday. The estimation dataset is structured as a daily panel, where the time dimension is defined relative to the 20th birthday.
3.2 Variable definitions

3.2.1 Online consultations

To measure the number of online consultations per day, we use the billing data to identify contacts with primary care providers in the two regions where DCT providers were located (Jönköping and Sörmland).14 This approach slightly overestimates the number of online physician consultations, as the billing data does not allow us to distinguish between online and in-person consultations. Reassuringly, auxiliary analyses show that the overestimation is completely inconsequential.15

In our main analysis, we include all online consultations regardless of diagnosis. In sub-analyses, we use data from Region Jönköping to look specifically at diagnoses that are commonly set by DCT providers. Our definition of common DCT diagnoses covers roughly 90% of all online consultations with a physician.16 We also classify the set of common diagnoses into four subsets: upper respiratory infections, skin conditions, diagnoses related to genital and reproductive organs, and a residual category (other).

14 The measure also includes a small number of DCT contacts with a provider that was located in a third region (Skåne) before it moved to Jönköping, and with the public online service in Västra Götaland.
15 Using additional data obtained from Jönköping, where the age-based user fee was applied, we note that online consultations in this region account for almost 90% of the consultations in our preferred measure, and that 9 out of 10 online consultations were with a physician (rather than a nurse etc). Our first stage estimates are practically the same when we use the billing data and when we use data from Jönköping (see Appendix A).
16 Common diagnoses = ICD-codes (on a three-digit level) that cover 80% of the registered diagnoses for 19- and 20-year-olds during online consultations with private providers, including diagnoses within the same ICD-block with more than 10 registered episodes. See Appendix B.
3.2.2 In-person consultations

Our main outcome variable counts the daily number of in-person physician consultations (visits) at a regular PCC or at an extended-hours practice (EHP) in the patient’s region of residence. In sub-analyses, we categorise the in-person visits into the same categories as described above. This allows us to establish the degree of substitution within diagnosis type (e.g., upper respiratory infections).

3.2.3 Background variables

We use the following predetermined background variables in this study: Father in white collar profession (Yes/No), father with university education (Y/N), mother with university education (Y/N), father’s income above national median (Y/N), mother’s income above national median (Y/N), one or both parents born outside Scandinavia (Y/N), number of physician visits at age 18 (0/1/>1), rurality of municipality of residence (sparsely populated / densely populated / metropolitan)17. Descriptive statistics summarising these characteristics as well as the physician visits are provided in Table C.1 in Appendix C.

When comparing the 2018 cohorts, consumers of DCT on average visit doctors (in person) twice as often as non-users of DCT. Women and city residents are clearly overrepresented in the sample of DCT-users, which makes it interesting to examine how our results vary by gender and geography. There are some slight differences in averages when it comes to indicators of family background, where DCT-users tend to have parents that are slightly more likely to be Swedish, and have higher income and education. However, the standard deviations of all these variables are fairly large relative to the means.

In a wider perspective, we want to compare our study population of 19-20 year-olds to other parts of the age distribution to say something about the generalisability of our analysis.
If the health care demand of our study population differed markedly from that of the rest of the Swedish population, the generalisability of our results would be limited. Appendix L shows that the 19-20 year-olds are quite similar to other adolescents and young adults (<35) in terms of expected health care costs. The share of individuals with a DCT-relevant diagnosis is similar for a much wider age range, up to 49 years.

17 The rurality variable follows Statistics Sweden’s definition. The metropolitan category includes the city of Stockholm, the city of Gothenburg, and municipalities close to these cities.

4 Empirical strategy

4.1 Fuzzy difference-in-discontinuity design

Our objective is to estimate the causal effect of an online consultation, $DCT$, on the number of in-person consultations, $y$. The identification problem is that an individual’s decision to contact a DCT provider may correlate with unobservable characteristics that in turn influence the decision to make an in-person visit at a regular PCC. To eliminate the influence of such omitted variables, we need to find factors that exogenously alter the individual’s incentives to contact DCT providers, while not directly changing the incentives to contact traditional providers.

During our study period, the incentives to contact a DCT provider changed exogenously at the 20th birthday, due to the application of the DCT user fee.18 A natural starting point for the estimation of the degree of substitution is therefore to consider a fuzzy regression discontinuity (RD) design, using the discontinuity at the 20th birthday as an instrument for the number of DCT consultations. The problem with such a strategy is that the incentives to contact traditional health care may also change at the 20th birthday. As already noted, one of our study regions (Västra Götaland) used the same age limit for in-person consultation fees.
Moreover, 20 is the lower age limit for being allowed to buy strong liquor in Sweden, which has been shown to increase the risk of being hospitalised (Heckley, Gerdtham and Jarl, 2018). To purge our estimates of other effects of turning 20, we use an older cohort to estimate discontinuities around the 20th birthday in the period before DCT was available – a differences-in-discontinuities (diff-in-disc) strategy.

Before we introduce our fuzzy diff-in-disc estimand of the degree of substitution, it is instructive to first express a sharp diff-in-disc estimand for any random variable $Z$:

$$\tau_Z = (Z_1^+ - Z_1^-) - (Z_0^+ - Z_0^-) \tag{1}$$

Here, $Z_c^+$ ($Z_c^-$) denotes the upper (lower) limit of the regression function $E(Z_c \mid age_c = 20)$ of cohort $c \in \{0,1\}$ as it approaches the age threshold. Thus, the sharp diff-in-disc estimand compares discontinuities at the 20th birthday of two cohorts: a young cohort, who had access to DCT services both before and after they turned 20 (cohort 1), and an old cohort, who turned 20 before DCT services were available (cohort 0).19

18 After 2018, all DCT companies relocated to a region with a zero consultation fee.

Grembi, Nannicini and Troiano (2016) provide assumptions under which the sharp diff-in-disc identifies the causal effects of the new treatment in a setting where other, pre-existing, treatments are assigned at the same threshold. To identify the treatment effect on the cohort affected by both the new and the confounding treatments, two assumptions have to be satisfied. The first is the standard RD assumption that the conditional expectations of all potential outcomes must be continuous around the threshold (for both cohorts). Second, the effects of the confounding treatments must be time-invariant. The second assumption implies that the only reason why the discontinuity at the 20th birthday would look different for the two cohorts is that the younger cohort had access to DCT.
Under these assumptions, the sharp diff-in-disc identifies the effect of becoming subject to the DCT consultation fee: When $Z = y$, Eq. (1) describes the effect of the DCT fee on the number of in-person consultations, and when $Z = DCT$, the equation describes the effect of the DCT fee on the number of DCT consultations. In principle, the second term of Eq. (1) is zero when $Z = DCT$, as the DCT market did not yet exist for the older cohort. In practice, our billing data includes a small number of other consultations in Region Jönköping and so the term differs from zero (though very slightly).

In order to estimate the degree of substitution between online and in-person consultations, we turn to a fuzzy diff-in-disc framework. Analogous with the standard fuzzy RD, we construct the fuzzy diff-in-disc estimand as the ratio of the sharp diff-in-discs of $y$ and $DCT$:20

$$\theta = \frac{\tau_y}{\tau_{DCT}} = \frac{(y_1^+ - y_1^-) - (y_0^+ - y_0^-)}{(DCT_1^+ - DCT_1^-) - (DCT_0^+ - DCT_0^-)} \tag{2}$$

19 Specifically, the old cohort comprises individuals turning 20 before July 1 2016.
20 Another example of a fuzzy diff-in-disc is Galindo-Silva, Some and Tchuente (2019). These authors discuss a special case in which the treatment of interest – buying insurance – is affected by multiple policies in a young cohort, but only by one policy in an old cohort. This setup differs from our setting, where the treatment of interest – the number of DCT consultations – is affected by one policy in a young cohort, but not available at all to the old cohort.

The fuzzy diff-in-disc identifies a local average treatment effect that can be interpreted as the degree of substitution for compliers – i.e., individuals who consult DCT providers less often only because of the fee – under the assumption of monotonicity (Millán-Quijano, 2020). In our context, monotonicity implies that no one responds to the DCT fee by consulting DCT providers more often.
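To make the arithmetic of Eqs. (1) and (2) concrete, the following minimal Python sketch computes the sharp and fuzzy diff-in-disc estimands from the four one-sided limits of each variable. The function names and all numerical values are hypothetical and purely illustrative; the limits are chosen only so that the DCT discontinuity is of roughly the magnitude discussed in the introduction, and they are not estimates from our data.

```python
def sharp_diff_in_disc(young_above, young_below, old_above, old_below):
    """Eq. (1): jump at age 20 for the young cohort minus the jump for the old cohort."""
    return (young_above - young_below) - (old_above - old_below)


def fuzzy_diff_in_disc(tau_y, tau_dct):
    """Eq. (2): ratio of the sharp diff-in-discs of y and DCT."""
    if tau_dct == 0:
        raise ZeroDivisionError("no first stage: tau_DCT is zero")
    return tau_y / tau_dct


# Hypothetical limits (annual consultations per capita) just above/below age 20.
# DCT: the young cohort drops from 0.30 to 0.15; the old cohort has (almost) no DCT.
tau_dct = sharp_diff_in_disc(0.15, 0.30, 0.01, 0.01)

# In-person visits: both cohorts drop at 20 (own fee, other age-20 effects),
# but the young cohort drops by less because some DCT demand shifts back.
tau_y = sharp_diff_in_disc(0.925, 1.00, 0.85, 1.00)

theta = fuzzy_diff_in_disc(tau_y, tau_dct)
print(f"tau_DCT = {tau_dct:.3f}, tau_y = {tau_y:.3f}, theta = {theta:.2f}")
```

With these made-up numbers, the fee lowers DCT use by 0.15 consultations per year and raises in-person visits by 0.075, so the ratio is about -0.5: each DCT consultation offsets roughly half an in-person visit.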
Monotonicity thus rules out that DCT services are Giffen goods, which seems a plausible assumption to make.

The first assumption of continuous conditional expectations around the threshold warrants some extra discussion. On one hand, the assumption fits well to a context using age as the running variable: Individuals will age and eventually be observed on the other side of the age threshold. On the other hand, individuals may anticipate the onset of user fees, and adjust by scheduling care appointments before rather than after the 20th birthday. A strength of the diff-in-disc approach is that, as seen from the numerator of Eq. (2), inter-temporal substitution of in-person consultations due to anticipation effects would be purged by the difference of the two RDs, assuming that the incentives for inter-temporal substitution are the same for both cohorts.

For online consultations, it is by definition impossible to use the old cohort to net out "usual" inter-temporal substitution. In Section 5.2, we instead examine if inter-temporal substitution is an issue by checking if the estimated $\tau_{DCT}$ is sensitive to removing observations close to the 20th birthday. As seen from the denominator of Eq. (2), inter-temporal substitution of online consultations would imply that we underestimate the degree of substitution.

4.2 Estimation

It is standard to estimate parameters of an RD model using a local linear (first order) polynomial regression for a given bandwidth (e.g., Calonico, Cattaneo and Titiunik, 2014). We follow the same route to estimate our diff-in-disc model. We apply a uniform kernel throughout. Estimating a fuzzy diff-in-disc in this way is equivalent to estimating a two-stage least squares model.
The first-stage and reduced-form equations for the number of DCT and in-person consultations made by observation $i$ in age-bin $j$ in cohort $c$ are specified as follows:

$$Z_{ijc} = \beta_1^Z + \beta_2^Z I(20)_{ijc} + \beta_3^Z C_{ij} + \beta_4^Z I(20)_{ijc} \times C_{ij} + f(age_{ijc}, I(20)_{ijc}, C_{ij}) + \gamma_{ijc}^Z \tag{3}$$

where $I(20)_{ijc}$ is a dummy for being at least 20 years old, $C_{ij} \in \{0,1\}$ is a cohort dummy, and $\gamma_{ijc}^Z$ is an error term. $f(age_{ijc}, I(20)_{ijc}, C_{ij})$ is a function of the running variable $age_{ijc}$ (normalised to 0 at the 20th birthday) and the age and cohort thresholds. In our main specification,21 this function equals

$$f(age_{ijc}, I(20)_{ijc}, C_{ij}) = age_{ijc} \left( \beta_5^Z + \beta_6^Z I(20)_{ijc} + \beta_7^Z C_{ij} + \beta_8^Z I(20)_{ijc} \times C_{ij} \right) \tag{4}$$

The coefficient of main interest in Equation (3) is $\beta_4^Z$, the diff-in-disc estimate. As we rescale all care utilisation variables to reflect annual averages, the sharp diff-in-disc coefficient provides an estimate of the effect of the DCT fee on the number of consultations per year (e.g., a value of 1 implies one additional consultation annually per capita).

The second-stage equation for the number of in-person visits $y_{ijc}$, in which the endogenous $DCT_{ijc}$ is replaced by its prediction from the first-stage equation, can be expressed as follows:

$$y_{ijc} = \alpha_1 + \alpha_2 I(20)_{ijc} + \alpha_3 C_{ij} + \alpha_4 \widehat{DCT}_{ijc} + f(age_{ijc}, I(20)_{ijc}, C_{ij}) + \varepsilon_{ijc} \tag{5}$$

where $\varepsilon_{ijc}$ is an error term and all other variables are defined as above.

The fuzzy diff-in-disc estimate $\alpha_4 = \beta_4^y / \beta_4^{DCT}$ can be interpreted as the degree of substitution between online and in-person consultations. $\alpha_4 = -1$ implies that each DCT consultation replaces exactly one in-person visit. If $\alpha_4 < -1$, then the online consultation replaces more than one in-person visit; this might occur for problems for which regular PCCs, but not DCT companies, would provide both an initial and a follow-up consultation. $\alpha_4 \in (-1,0)$ implies that each DCT consultation offsets less than one in-person visit.
In this case, the net effect of the availability of DCT is an increase in the total number of physician consultations (DCT + in-person).

In our main estimations, we select the bandwidth using a data-driven procedure that minimises the mean squared error of the reduced-form equation for in-person visits y (Calonico, Cattaneo and Titiunik, 2014). We use the optimal bandwidth selected for y also for the first-stage equation for DCT (compare Imbens and Kalyanaraman, 2012), but use different bandwidths on each side of the 20th-birthday threshold and for each cohort (young/old). To ensure that our results are not completely dependent on the bandwidth selection procedure, we also estimate the model across a range of bandwidths in robustness checks.

We cluster standard errors on the running variable (= day relative to 20th birthday), using separate clusters for the young and old cohorts (Lee and Card, 2008).22 With standard errors clustered at the daily level, we can greatly reduce computation time – without affecting the point estimates or standard errors – by estimating the model on aggregated data. We therefore collapse the individual-day-level data to cells defined by age (in days relative to the 20th birthday), gender, region, and time period,23 and include frequency weights (= the number of individuals in each cell) in the estimations. To examine whether auto-correlation in the age dimension is a problem in our main specification, we also estimate a model on individual-level data in which we cluster standard errors by individual (Appendix G).

21 In the appendix, we adopt the stronger assumption that $\beta^Z_5 = \dots = \beta^Z_8 = 0$ to see if we can increase precision.

Figure 1: DCT consultations over time
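The logic of Equations (3)–(4) and of the collapse to frequency-weighted cells can be illustrated with a small simulation. The sketch below is not the estimation code used in the paper: the data are simulated, all names (`wls_intercept`, `diff_in_disc`, `TRUE_B4`) are hypothetical, a uniform kernel is assumed, and standard errors are ignored. It relies on the fact that the fully interacted linear specification is equivalent to four separate linear fits (cohort by side of the threshold), so that the coefficient on $I(20) \times C$ equals the jump at the threshold in the post cohort minus the jump in the pre cohort.

```python
import random
from collections import defaultdict

def wls_intercept(x, y, w):
    """Intercept of a weighted simple linear regression of y on x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return ybar - (sxy / sxx) * xbar

def diff_in_disc(rows):
    """rows: (age, cohort, outcome, weight). Returns the analogue of beta_4."""
    groups = defaultdict(lambda: ([], [], []))
    for age, cohort, y, w in rows:
        xs, ys, ws = groups[(cohort, age >= 0)]
        xs.append(age)
        ys.append(y)
        ws.append(w)
    jump = {}
    for c in (0, 1):
        # Discontinuity at age 0: right-hand intercept minus left-hand intercept.
        jump[c] = wls_intercept(*groups[(c, True)]) - wls_intercept(*groups[(c, False)])
    return jump[1] - jump[0]

# Simulated individual-level data (illustrative values only).
random.seed(0)
TRUE_B4 = -0.15
individuals = []
for _ in range(40000):
    age = random.randint(-100, 99)      # days relative to the 20th birthday
    cohort = random.randint(0, 1)       # 1 = post-DCT cohort
    over20 = age >= 0
    mean = 0.05 + 0.25 * cohort + 0.0002 * age + TRUE_B4 * over20 * cohort
    individuals.append((age, cohort, mean + random.gauss(0, 0.1), 1))

# Collapse to (age, cohort) cells: cell mean plus frequency weight.
cells = defaultdict(lambda: [0.0, 0])
for age, cohort, y, _ in individuals:
    cells[(age, cohort)][0] += y
    cells[(age, cohort)][1] += 1
collapsed = [(age, cohort, total / n, n) for (age, cohort), (total, n) in cells.items()]

b4_individual = diff_in_disc(individuals)
b4_cells = diff_in_disc(collapsed)
print(round(b4_individual, 4), round(b4_cells, 4))
```

The two printed numbers coincide up to floating-point error, which is the point-estimate part of the claim about aggregated data; the equivalence of the standard errors is not demonstrated here.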
Figure 2: Online consultations in the postperiod by gender

5 Results

5.1 The user fee discontinuity and online consultations

Initially, we need to establish the existence of a first-stage relationship, i.e., that the demand for DCT services changes discontinuously at the 20th birthday, when the patient starts paying a fee. Figure 1 illustrates the annual number of online consultations per capita in different years,24 sorted by the day of the online contact relative to the individual's 20th birthday (day "zero"). The subgraphs clearly show the development of the DCT market. Prior to 2016, there was little to no consumption.25 In 2016, the market was still small, indeed too small to result in a visible first stage. In 2017, the utilisation of these services increased and we can identify a jump at the 20th birthday. The trend continued in 2018, with an annual average of .3 online consultations per capita for individuals below the 20th birthday and an even more distinct drop at the age threshold.

22 A recent literature discusses methods to obtain bias-corrected estimates and robust confidence intervals for settings with data-driven bandwidth choices in standard RD settings (Calonico, Cattaneo and Titiunik, 2014; He and Bartalotti, 2020). Such methods are yet to be developed for the diff-in-disc setting. However, in a robustness check we modify the wild bootstrap procedure of He and Bartalotti (2020) to fit our fuzzy diff-in-disc setting. See Appendix.

23 Time periods are equivalent to birth year for the younger cohort (who turned 20 in 2017 or 2018). For the old cohort (pre-DCT), we define four 365-day time periods, each running from July 1 in year t to June 30 in year t+1 for t ∈ (2013 to 2015).

24 For each day relative to the 20th birthday, the number of consultations is multiplied by 365 to give an annual interpretation.

25 Recall that we define DCT consumption as out-of-home-region care contacts, which account for any non-zero consumption in the preperiods.
This drop represents a decrease of about 50%.

Figure 2 shows the online care consumption in the post-period years by gender. Here two characteristics emerge: women consistently consume more online consultations (reflecting the pattern of use of traditional primary care), and the jump at the threshold appears larger for women than for men. As women and men react differently to the price discontinuity at the 20th birthday, we present the subsequent analyses for the full sample and split by sex.

Overall, the figures support the idea that the onset of the DCT fee at the 20th birthday reduced the demand for DCT services once the market had gained some size, in particular in 2018. As there was still no valid first-stage relationship in the fall of 2016, we exclude that year from the subsequent analyses.

5.2 Formalising the first stage: Does the experiment hold?

The graphical analysis supports the validity of our natural experiment. In this section, we present formal estimates of the effect of the user fee on online consultations and investigate the threats to our identification strategy discussed in Section 4.

Table 2 shows the sharp difference-in-discontinuity estimates that constitute our first stage: the effect of the onset of the user fee, identified as the difference in the discontinuity at the 20th birthday between the pre- and post-periods. For the pre-period, we pool data from July 2012 to June 2016. To account for the development of the market and the strength of the first stage illustrated in Figures 1 and 2, we estimate separate models for the post-periods 2017 and 2018.

The first panel in Table 2 presents first-stage estimates using the optimal bandwidth. The estimates are statistically significant and the F-statistics are of considerable size except for men in 2017. Consistent with the growth of the market observed in the descriptive figures, the magnitude of the coefficient is consistently smaller in 2017 than in 2018.
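As a rough cross-check of the first-stage strength, note that with a single excluded instrument the first-stage F-statistic should equal the squared t-ratio of the instrument's coefficient, up to rounding and any small-sample adjustment. The sketch below applies this to the coefficients and standard errors reported in the first panel of Table 2; the variable names are illustrative, and the inputs are the rounded published values, so small deviations from the reported F-statistics are expected.

```python
# (coefficient, standard error, reported F) from the first panel of Table 2.
first_stage = {
    ("2017", "All"): (-0.0477, 0.00792, 36.25),
    ("2017", "Men"): (-0.0171, 0.00712, 5.770),
    ("2017", "Women"): (-0.0819, 0.0122, 45.00),
    ("2018", "All"): (-0.149, 0.0119, 154.8),
    ("2018", "Men"): (-0.0772, 0.0143, 29.31),
    ("2018", "Women"): (-0.238, 0.0252, 89.28),
}

def squared_t(coef, se):
    """Squared t-ratio; with one instrument this approximates the first-stage F."""
    return (coef / se) ** 2

implied_f = {key: squared_t(coef, se) for key, (coef, se, _) in first_stage.items()}

for key, (_, _, reported) in first_stage.items():
    flag = "below 10" if implied_f[key] < 10 else "above 10"
    print(key, round(implied_f[key], 2), "reported:", reported, flag)
```

Only the 2017 estimate for men falls below the conventional benchmark of 10 in this panel, in line with the discussion in the text.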
Our analysis of in-person consultations in section 5.3 also indicates that the market outreach in 2017 was not large enough to affect the use of in-person consultations. Our main focus in the paper will therefore be substitution during 2018, when market outreach was larger and we document a strong first stage. In 2018, the onset of the user fee reduces the number of online consultations by .15 visits per year, corresponding to a decrease of about 50%. When splitting the sample by gender, we note that both the coefficient and the F-statistic are noticeably larger for women than for men.

The remaining panels in Table 2 explore a potential threat to our identification strategy: inter-temporal substitution of care consumption. We particularly worry that individuals who anticipate the fee increase decide to contact DCT providers before their 20th birthday rather than after. Such behaviour would lead us to overestimate the effect of the user fee, and consequently underestimate the degree of substitution. (As mentioned earlier, such concerns are smaller in the reduced form equation linking in-person consultations to the user fee change, as the diff-in-disc strategy may be assumed to difference out such behaviour.) The second panel and onwards of Table 2 show how the estimates change as we remove days just around the 20th birthday threshold. The estimates show that there is, at worst, very limited intertemporal substitution. We can remove two weeks on each side of the threshold without the estimates being notably affected, and many estimates are still similar when removing +/-21 days. As a separate point, the F-statistic is comfortably above the benchmark of 10 for all 2018 specifications up to excluding 21 days around the 20th birthday, but not for 2017. This is another reason to put more trust in the 2018 results.

Intertemporal substitution might of course have occurred over a period longer than three weeks before the 20th birthday.
However, when excluding observations several weeks or months around the 20th birthday, we worry that the results might be driven by the heavily reduced sample size. We therefore use a longer, fixed bandwidth in complementary analyses, which lowers the impact of the donuts on the sample size. In Appendix G we estimate donut specifications for a 365-day outer bandwidth, varying the donut up to 14 weeks, thus maximising our sample size relative to the observations excluded. To ensure the results are not unique to the long bandwidth, we also run regressions for a 180-day outer bandwidth, varying the donut up to 10 weeks. The results appear stable across the donut specifications for the 365-day bandwidth. The coefficient appears to decrease somewhat as the donut exceeds eight weeks for the 180-day bandwidth, but at this point it is possible that the reduction in sample size and the consequent loss of statistical power affect the results. As we shall see, we are further reassured when we perform a similar bandwidth sensitivity analysis for the fuzzy diff-in-disc IV estimate.
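To make the trade-off between donut size and sample size concrete, the following sketch counts how many days of the running variable remain after cutting a donut out of a fixed outer bandwidth. It is a minimal illustration, not the paper's code: the function name `donut_sample` and the convention that day 0 is the 20th birthday, with half-open intervals on both sides, are assumptions. It also counts days rather than individuals, which is only a proxy for sample size if cell sizes are roughly constant across days.

```python
def donut_sample(days, donut, bandwidth):
    """Return the days kept when a donut hole is cut out of a fixed window.

    Assumed convention: day 0 is the 20th birthday; the window keeps
    -bandwidth <= day < bandwidth and the donut drops -donut <= day < donut.
    """
    return [d for d in days
            if -bandwidth <= d < bandwidth and not (-donut <= d < donut)]

all_days = range(-400, 400)

# Largest donuts mentioned in the text: 14 weeks (365-day window), 10 weeks (180-day window).
kept_365 = donut_sample(all_days, donut=14 * 7, bandwidth=365)
kept_180 = donut_sample(all_days, donut=10 * 7, bandwidth=180)

share_365 = len(kept_365) / (2 * 365)
share_180 = len(kept_180) / (2 * 180)
print(len(kept_365), round(share_365, 3))
print(len(kept_180), round(share_180, 3))
```

Under these assumptions, the 365-day window retains a larger share of its days at its largest donut than the 180-day window does, which is the sense in which a longer outer bandwidth lowers the impact of the donut.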
Table 2: First Stage estimates, optimal bw

            2017                                   2018
            All          Men          Women        All          Men          Women
0 ±20th BIRTHDAY
FS coeff    -0.0477***   -0.0171**    -0.0819***   -0.149***    -0.0772***   -0.238***
            (0.00792)    (0.00712)    (0.0122)     (0.0119)     (0.0143)     (0.0252)
F-stat      36.25        5.770        45.00        154.8        29.31        89.28
7 ±20th BIRTHDAY
FS coeff    -0.0399***   -0.0132      -0.0679***   -0.156***    -0.0823***   -0.241***
            (0.00896)    (0.00868)    (0.0149)     (0.0155)     (0.0162)     (0.0338)
F-stat      19.84        2.300        20.93        101.3        25.78        50.91
14 ±20th BIRTHDAY
FS coeff    -0.0334***   0.00175      -0.0642***   -0.161***    -0.0861***   -0.253***
            (0.0125)     (0.0134)     (0.0211)     (0.0193)     (0.0203)     (0.0447)
F-stat      7.070        0.0170       9.284        69.93        18.06        32.12
21 ±20th BIRTHDAY
FS coeff    -0.0439**    0.00636      -0.105***    -0.151***    -0.0453*     -0.252***
            (0.0179)     (0.0259)     (0.0306)     (0.0400)     (0.0254)     (0.0778)
F-stat      6.010        0.0602       11.70        14.26        3.178        10.47

Note: * P<0.1, ** P<0.05, *** P<0.01. 95% confidence intervals. The table displays first stage coefficient estimates. Robust standard errors in parentheses clustered by days-to birthday and period pairs. The baseline case (the first panel) estimates using optimal bandwidths for the first stage.

In order to establish our natural experiment, we also want to confirm that there is no other change at the discontinuity for any observable background variables. Such a change would imply selection into either side of the threshold, which would bias our estimated substitution effect. We therefore estimate sharp difference-in-discontinuity models for a set of background covariates (see section 3). For each covariate we estimate the optimal bandwidth as discussed in section 4 (Cattaneo, Idrobo and Titiunik, 2019). Consistent with our main specification, we allow the bandwidths to differ between discontinuities in the pre- and post period as well as each side of the cut-off. The results of this exercise, which are presented in Appendix Table D.1, are reassuring. All coefficients are small.
The only three coefficients that are statistically significant are from specifications using data for 2017.

5.3 In-person visits and the DCT user fee threshold

Having established the existence of a robust first stage, the next question is whether the reduction in online consultations at the threshold reflects substitution between online and in-person care. Figure 3 plots annual in-person physician visits per capita,26 by the day relative to the 20th birthday (i.e., the DCT user fee threshold), first for the whole sample and then by gender. The middle and rightmost sub-graphs illustrate the two years after the DCT market emerged (2017 and 2018), and the leftmost sub-graphs show the pooled pre-period (Figure G.1 shows that the pattern is similar for each of the years in the pre-period). We note from these graphs that individuals on average make one visit per year, with slightly more visits for women than for men.

The graphs further give some indication of substitution. In the pre-period, the 20th birthday was associated with a drop in the number of in-person visits, likely driven by the onset of the fee for in-person visits (in Västra Götaland). The sudden decrease in in-person visits was similar in 2017, when the DCT market still had limited outreach. In 2018, when the DCT market exploded, the drop in in-person visits was much less pronounced than before. Taken together with the distinct drop in the number of online consultations at the threshold in Figure 1, this difference in the discontinuities between 2018 and the pre-DCT period suggests that the PCCs take up some of the demand served by DCT companies before the 20th birthday.

By contrast, the comparison between 2017 and the pre-DCT periods implies no measurable substitution between DCT and in-person consultations.
However, the change to 2018, together with the weaker first stage we documented for 2017 in section 5.2, may suggest that the outreach of the DCT market was too limited in 2017 to enable us to draw strong conclusions about the market that year. We provide a formal analysis for 2017 in the appendix, but in the remainder of the main text we focus on formally assessing the degree of substitution during 2018.

The thin, black lines on either side of the user fee threshold reflect how the data portrayed in the figure are used in our main regression specifications. The lengths of the lines, which we allow to differ on either side of the threshold, are the optimal bandwidths chosen by the cross-validation process suggested by Cattaneo, Idrobo and Titiunik (2019). The procedure chooses a bandwidth of about 3-4 months on either side of the threshold for in-person visits. However, the scatter plots show that the relationship between the running and the outcome variable is almost flat for larger bandwidths, while the regression lines within the optimal bandwidths appear to be influenced by a few outliers. While we present our main results using the optimal bandwidth, the noisy care consumption data and the lack of a clear relationship between age (in days) and care consumption imply that other approaches may be as appropriate. We thus examine the stability of the results with alternative specifications: we vary the bandwidth and we use a 0-degree polynomial in the running variable (i.e., comparing the difference in mean level of care utilisation on each side of the threshold for the two cohorts).

26 As before, the statistics are scaled by 365 to allow an annual interpretation.

Figure 3: In-person physician visits

5.4 Fuzzy difference-in-discontinuity results

Table 3 provides our fuzzy diff-in-disc results for 2018. We estimate our regression equation for the full sample as well as by gender.
Each coefficient-standard error pair in the table comes from a separate estimation, where only the coefficient of interest is displayed. The estimates in the first three columns are obtained using the optimal bandwidths of each estimation sample (all, men, women), whereas the last two columns present estimates by gender when using the optimal bandwidth for the full sample.

The results are presented in three panels corresponding to each step of the empirical strategy. Panel A revisits the first stage estimates (the first three columns are identical to the top row estimates for 2018 in Table 2). Panel B presents our reduced form estimates, i.e. the sharp diff-in-disc at age 20, capturing the change in the number of in-person visits at the 20th birthday. The reduced form estimates are positive, as expected given the figures presented earlier (i.e., a smaller drop in 2018 compared to the pre-period). The estimate is statistically significant with the full sample, but not when splitting the sample by gender. When using the optimal bandwidths estimated separately for each gender (columns Opt bw each sample), the reduced form estimates for both genders are smaller than the estimate for the full sample. To show how each gender contributes to the main estimate, we use the same bandwidth as in the main specification (columns Opt bw, combined sample). The reduced form estimates for men and women then lie on each side of the estimate for the full sample (as expected).

The last panel presents our fuzzy difference-in-discontinuity (IV) estimate that can be interpreted as the degree of substitution between the online and in-person consultations. For the full sample, the estimate of -.45 suggests that roughly every other online consultation replaces an in-person visit. The corresponding 95% confidence interval covers neither zero substitution (0) nor full substitution (-1), but the interval is quite wide.
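The arithmetic linking the three panels of Table 3 for the full sample can be sketched as follows. The inputs are the rounded values reported in the table; the use of the normal critical value 1.96 is an assumption (the text does not state which critical value underlies the reported intervals), and the variable names are illustrative.

```python
Z_95 = 1.96  # assumed normal critical value for a 95% interval

# Full-sample estimates from Table 3 (column "All").
first_stage = -0.149     # Panel A: change in online consultations
reduced_form = 0.0673    # Panel B: change in in-person visits
iv_reported = -0.453     # Panel C: fuzzy diff-in-disc estimate
iv_se = 0.206            # Panel C: standard error

# Ratio of reduced form to first stage (the alpha_4 of the empirical strategy).
wald_ratio = reduced_form / first_stage

# Approximate 95% confidence interval around the reported IV estimate.
ci_low = iv_reported - Z_95 * iv_se
ci_high = iv_reported + Z_95 * iv_se

# Implied change in total consultations (DCT + in-person) per additional DCT consultation.
net_total_per_dct = 1 + iv_reported

print(round(wald_ratio, 3), (round(ci_low, 3), round(ci_high, 3)), round(net_total_per_dct, 3))
```

The ratio of the rounded coefficients is close to the reported estimate, and the resulting interval lies strictly between -1 and 0. Under these numbers, each additional online consultation is associated with roughly half an additional consultation in total.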
When splitting the sample by gender, the estimated degree of substitution is larger for men (-.46 and -.90) than for women (-.18 and -.31), independent of the bandwidth choice. As the reduced form effects are similar between the genders, the difference in the estimated degree of substitution is due to the larger drop in online consultations at age 20 for women. For men, the uncertainty of the estimates is large and the 95% confidence interval contains both 0 and -1. For women, on the other hand, we can at least rule out complete substitution (-1) with some confidence.

Table 3: Fuzzy diff-in-disc, different opt bandwidths

                       OPT BW EACH SAMPLE                     OPT BW, COMBINED SAMPLE
                       All          Men          Women        Men          Women
A. FIRST STAGE (SHARP DIFF-IN-DISC)
Online consultations   -0.149***    -0.0772***   -0.238***    -0.0711***   -0.232***
                       (0.0119)     (0.0143)     (0.0252)     (0.0120)     (0.0240)
B. RF (SHARP DIFF-IN-DISC)
In-person visits       0.0673**     0.0356       0.0437       0.0632       0.0709
                       (0.0307)     (0.0464)     (0.0500)     (0.0390)     (0.0485)
C. IV (FUZZY DIFF-IN-DISC)
In-person visits       -0.453**     -0.461       -0.184       -0.889       -0.305
                       (0.206)      (0.596)      (0.210)      (0.550)      (0.208)
Total bw pre           235          151          230          235          235
Total bw post          217          158          181          217          217
Avg. ind/day pre       151589       78183        73547        78168        73421
Avg. ind/day post      31483        16345        15207        16299        15184

Note: Variable names in the left column represent the dependent variable. In the first three columns, each model uses MSE-optimal bandwidths for in-person visits for the respective estimation sample (All, Men, Women), with separate bandwidths for the pre-DCT cohorts and the post-DCT cohort (2018), and varying bandwidths on the left- and right-hand side of the age cutoff. In the two columns to the right, each model uses the optimal bandwidth for the total sample in column 1. Bandwidth refers to the range of included values of the running variable, i.e. days to/since the 20th birthday. Total bw = sum of bandwidths on the left and right side of the age cutoff. Avg. ind/day = average number of individuals per day in cells, shown separately for the pre cohorts (which include several birth cohorts) and the post cohort (which only includes individuals turning 20 in 2018). Estimates use data collapsed by region, gender, birth year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

5.5 Robustness and precision

We first examine the robustness of our main results to the choice of bandwidth and inclusion of covariates. We then examine how sensitive the results are to the number of included pre-periods, the assumption of no changes in effects of confounding treatments, and the linear polynomial specification. Finally, we show robustness to various standard error estimation methods.

Table 4 shows how the estimates are affected by the choice of bandwidth (table rows) and by the inclusion of covariates, i.e., the pre-determined individual characteristics used in our balance test (columns "No Covs/With Covs"). The last row of the table displays the main estimates using the MSE-optimal bandwidth. We note that the estimates are rather stable for bandwidths in the range 60–180 days. Across bandwidths, the included covariates do not affect the estimates (as expected), but neither do they serve to improve the precision of the estimates.
Table 4: IV estimates over different bandwidths

            ALL                                     MEN                                     WOMEN
            No Covs             With Covs           No Covs             With Covs           No Covs             With Covs
BW 30       -0.797              -0.795              -3.687              -3.684              0.0434              0.0415
            (-2.057, 0.463)     (-2.056, 0.465)     (-7.320, -0.0536)   (-7.317, -0.0506)   (-1.158, 1.245)     (-1.161, 1.244)
KP F-stat   34.00               34.21               5.550               5.616               19.17               19.41
BW 60       -0.228              -0.224              -0.753              -0.748              -0.0497             -0.0469
            (-0.827, 0.371)     (-0.823, 0.376)     (-2.222, 0.717)     (-2.218, 0.722)     (-0.645, 0.546)     (-0.643, 0.549)
KP F-stat   93.51               93.81               19.94               20.06               55.21               55.56
BW 90       -0.440              -0.431              -0.965              -0.958              -0.266              -0.258
            (-0.884, 0.00345)   (-0.875, 0.0125)    (-2.144, 0.214)     (-2.137, 0.220)     (-0.713, 0.180)     (-0.704, 0.189)
KP F-stat   126.2               126.4               29.36               29.47               81.67               82.04
BW 120      -0.450              -0.444              -1.026              -1.026              -0.260              -0.253
            (-0.842, -0.0576)   (-0.837, -0.0520)   (-2.102, 0.0496)    (-2.102, 0.0509)    (-0.645, 0.124)     (-0.638, 0.131)
KP F-stat   163.8               164.0               37.76               37.86               103.4               103.7
BW 150      -0.471              -0.467              -1.122              -1.133              -0.271              -0.262
            (-0.848, -0.0940)   (-0.844, -0.0904)   (-2.200, -0.0451)   (-2.211, -0.0539)   (-0.633, 0.0916)    (-0.625, 0.0996)
KP F-stat   176.9               177.1               37.17               37.25               117.8               118.0
BW 180      -0.371              -0.367              -0.842              -0.857              -0.217              -0.206
            (-0.709, -0.0332)   (-0.705, -0.0292)   (-1.780, 0.0972)    (-1.797, 0.0832)    (-0.551, 0.118)     (-0.540, 0.128)
KP F-stat   207.8               208.0               47.23               47.31               137.4               137.7
BW 365      -0.260              -0.267              -0.707              -0.739              -0.135              -0.137
            (-0.508, -0.0120)   (-0.514, -0.0190)   (-1.563, 0.149)     (-1.598, 0.119)     (-0.363, 0.0928)    (-0.365, 0.0908)
KP F-stat   348.2               348.6               57.53               57.49               279.8               280.7
BW opt      -0.453              -0.406              -0.461              -0.558              -0.184              -0.134
            (-0.857, -0.0496)   (-0.822, 0.00931)   (-1.629, 0.708)     (-1.749, 0.633)     (-0.595, 0.227)     (-0.556, 0.287)
KP F-stat   154.8               146.3               29.31               28.57               89.28               81.53

Note: The table shows IV estimates for the full sample and for men and women separately. Each panel, other than the last one, is associated with a fixed bandwidth on either side of the threshold (and for both cohorts). The last panel presents the results by optimal bandwidth, for ease of comparison to the previously shown results. 95% confidence intervals shown in parentheses, derived from standard errors clustered by the running variable (days since 20th birthday, with separate clusters for pre- and post cohorts).

Our identification strategy relies critically on the assumptions spelled out in the empirical strategy. We showed in section 5.2 that the standard RD assumptions are likely to hold: we find no evidence of an anticipation effect of the onset of the user fee, and our findings support that individuals are similar on each side of the threshold. As we use the discontinuity of the pre-DCT cohorts to purge out effects of any confounding treatment, another crucial assumption is that the effects of confounding treatments are time invariant. Some support for this assumption has already been mentioned (see Figure G.1): the discontinuity at the 20th birthday is of similar size when we split the pre-period into four 12-month windows. In Appendix Table G.1, we further re-estimate the fuzzy diff-in-disc specification several times, starting with only one pre-period (the most recent) and then adding one at a time. Regardless of the number of pre-periods, we arrive at an estimate of around 50% substitution for both genders combined. The estimates by gender are more sensitive to the number of included pre-periods, which is not surprising given the large uncertainty around the gender-specific estimates in the main specification (which includes the largest number of periods).

Nonetheless, Figure 3 suggests that the average number of in-person visits declined between the pre- and the post-period. It is possible that this secular decrease would have reduced the discontinuity at age 20 even in the absence of DCT (although the similarity of gaps in 2017, when most of the decline had taken place, suggests that this is not the case), in which case we overestimate the degree of substitution.
In Appendix G.2, we show that this overestimation is likely to be small, estimating the size to be about 6% for the main result. An alternative strategy that relaxes the assumption of time-invariant effects of confounding policies would be to focus exclusively on Region Stockholm, where there was no confounding change of in-person visit fees at age 20. Assuming there are no other confounding policies at age 20, we can then use an ordinary fuzzy RD specification (instead of a diff-in-disc). Appendix Table G.2 shows that such a specification yields similar estimates at bandwidths of 90 days or more.

We also examine robustness to changing the assumptions of our modelling framework. In particular, recalling the lack of a trend in the running variable in Figure 3, we estimate models using a zero-degree polynomial in the running variable (i.e., we restrict the coefficients on functions of age to zero). Under this assumption, there is no reason to stick to the previous MSE-optimal bandwidths and so we estimate this model using various bandwidths. As shown in Appendix G.6, this approach yields point estimates in the same neighbourhood as our main estimate for most bandwidths; for bandwidths of 120 days or wider, the estimates are noticeably more precise. For instance, using a +/-180 days bandwidth, the zero-degree polynomial point estimate is -.35 with a standard error of .15, implying a 95% confidence interval of (-.64 to -.05). This should be compared to the confidence interval in the last row of Table 4, for which the lower limit is considerably more negative (-.86).

We finally examine the sensitivity to the estimation of standard errors. To address worries that the standard errors of the preferred model disregard auto-correlation in the age dimension, we estimate the main model on individual-level data with standard errors clustered by individual (Table G.3).
The standard error of the main estimate only increases slightly (s.e.=.23 compared to .21 before), suggesting that the clustering dimension is not so important in our case. This is reasonable given the sporadic nature of primary care consultations in our sample; i.e., there is limited scope for auto-correlation for events that occur around once a year.

A recent literature discusses methods to obtain bias-corrected estimates and robust confidence intervals for settings with data-driven bandwidth choices in standard RD settings (Calonico, Cattaneo and Titiunik, 2014; He and Bartalotti, 2020). Such methods are yet to be developed for a fuzzy diff-in-disc setting. However, we modify the wild bootstrap procedure of He and Bartalotti (2020) to fit our setting (Appendix G.5). We obtain similar confidence intervals with this bootstrap procedure.

In sum, our robustness checks support our conclusion that DCT consultations partially substitute for in-person consultations in regular primary care.

5.6 Heterogeneity Analysis

The following section provides heterogeneity analysis across two dimensions: location and type of diagnoses associated with the DCT visit. This decomposition will help us understand if the results are driven by certain subgroups, but splitting the sample also introduces more noise. The subsequent analyses should therefore be viewed as exploratory.

5.6.1 Regional heterogeneity

Our main analysis merges data from two independent administrative regions – Region Stockholm (Sthlm) and Region Västra Götaland (VGR). As the regions have different reimbursement systems in primary care, it is of interest to examine heterogeneity across them. As discussed in section 2.1, Region Stockholm relies more on fee-for-service and Västra Götaland relies exclusively on capitation. As primary care centres in Stockholm thus have stronger incentives to offer consultations, we expect the degree of substitution to be higher there.
Table 5: Fuzzy diff-in-disc results, by region

                            All          Men          Women
A. FIRST STAGE (SHARP DIFF-IN-DISC)
Sthlm                       -0.164***    -0.0819***   -0.252***
                            (0.0176)     (0.0173)     (0.0340)
F-stat                      86.95        22.40        54.98
VGR                         -0.130***    -0.0576***   -0.208***
                            (0.0182)     (0.0172)     (0.0331)
F-stat                      50.70        11.15        39.49
B. REDUCED FORM (SHARP DIFF-IN-DISC)
Sthlm                       0.112**      0.104*       0.117*
                            (0.0454)     (0.0539)     (0.0704)
VGR                         0.0104       0.0122       0.00877
                            (0.0492)     (0.0604)     (0.0736)
C. IV (FUZZY DIFF-IN-DISC)
Sthlm                       -0.681**     -1.272*      -0.465
                            (0.282)      (0.698)      (0.285)
VGR                         -0.0802      -0.212       -0.0422
                            (0.380)      (1.051)      (0.353)
Sthlm: Avg. ind/day pre     82282        42391        39891
Sthlm: Avg. ind/day post    17493        9005         8487
VGR: Avg. ind/day pre       69328        35789        33539
VGR: Avg. ind/day post      13988        7292         6696

Note: The table shows estimates with the sample split by regions and sex. Each model uses the MSE-optimal bandwidth for in-person visits for the full sample (to remove heterogeneity due to changes in bandwidth). Thus, refer to Table 3 for the bandwidth used. 'Average ind/day pre (post)' refers to average individuals per cell in the pre (post) cohort. Standard errors in parentheses clustered by relative age (running variable), with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Table 5 presents the results where we run estimations split by regions. For comparability, we use the optimal bandwidths for the regions combined (though the results are very similar when estimating optimal bandwidths for each region separately, see Table H.1). The fuzzy diff-in-disc estimates in the third panel are in line with our hypothesis: The size of the Stockholm coefficient (-.68) is considerably larger than the coefficient for Region Västra Götaland (-.08), and it is only in Stockholm that we can reject the null hypothesis of no substitution.
As the first stage estimates in the first panel suggest that the change in online consultations in response to the onset of the user fee is similar across regions, the difference in the degree of substitution is instead driven by differences in the reduced form estimates.

Notably, explanations other than the regional incentive structures could account for the differences between the regions. One candidate is that Västra Götaland has more rural areas. We know that the DCT companies advertised a lot in the public transport systems of Stockholm and Gothenburg in 2018, and we also know that the uptake of these services was larger in urban areas (see the descriptive statistics in Appendix C).

We hence examine whether the regional heterogeneity masks urban-rural heterogeneity by splitting the sample into one urban and one rural subsample.27 Table 6 presents the fuzzy diff-in-disc results.

The IV estimates in the leftmost part of Table 6 show that the degree of substitution does not differ much between the urban areas of the two regions.28 This suggests that the regional heterogeneity is likely not due to different administrative structures, but to differences in the share of urban residents. The estimates by gender are very imprecise, especially when also broken down by region. The most precise results are obtained for urban women, for whom we can rule out complete, but not zero, substitution.

In our most highly powered specification (including both genders and regions), the point estimate in rural areas is smaller in magnitude than that for urban areas (-.256 vs -.503), although the estimates are not significantly different from each other. The first stage results (available in Table H.3) indicate that this is related mainly to differences in the reduced form (the first stage in rural areas is smaller than in urban areas, but still strongly significant).
Intuitively, one might think that rural individuals, living far from their primary care providers, would be relatively more prone to replace in-person visits. However, such a conclusion fails to account for the higher accessibility to health care practices in urban areas, which also implies a higher demand for in-person visits – and thus a larger number of visits that can be replaced by online consultations.

27 The variable definition is based on the categorisation by Statistics Sweden, where "urban" essentially includes residents in municipalities that belong to the City of Stockholm or the City of Gothenburg.

28 The similarity of the urban areas in the two regions is even more striking in models using a zero-degree polynomial and longer bandwidths, see Appendix H.

Table 6: IV estimate by urban/rural heterogeneity

               ALL                     MEN                     WOMEN
               Rural       Urban       Rural       Urban       Rural       Urban
Both           -0.256      -0.503**    -2.010      -0.732      0.0214      -0.407*
               (0.543)     (0.216)     (2.791)     (0.512)     (0.450)     (0.229)
Sthlm          -2.718      -0.544**    19.69       -1.032      -1.752      -0.344
               (2.058)     (0.277)     (94.97)     (0.631)     (1.443)     (0.287)
VGR            0.288       -0.397      -0.807      0.0996      0.526       -0.575
               (0.615)     (0.449)     (2.210)     (0.980)     (0.539)     (0.471)
AVG INDIVIDUALS/DAY (CELL)
Both Pre:      52500       99109       27251       50929       25249       48179
Both Post:     10193       21287       5307        10989       4885        10298
Sthlm Pre:     9382        72898       4829        37560       4553        35337
Sthlm Post:    1867        15626       956         8050        911         7577
VGR Pre:       43117       26211       22421       13369       20696       12842
VGR Post:      8326        5661        4351        2940        3974        2721

Note: The table shows estimates with the sample split by regions, urban/rural dimension and sex. Each model uses the MSE-optimal bandwidth for in-person visits for the full sample (to remove heterogeneity due to changes in bandwidth, see Table 3 for bandwidth). 'Average ind/day pre (post)' refers to average individuals per cell in the pre (post) cohort. Standard errors in parentheses clustered by relative age (running variable), with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

5.6.2 Substitution within diagnosis types

We next focus exclusively on consultations with diagnoses that are commonly set during online consultations. Table 7 shows the estimates from models relating online consultations within diagnosis category j to in-person visits within the same diagnosis category (using the same bandwidth as in the main results section). The first column, Common, shows the degree of substitution for the 80% most common diagnoses during online consultations (covering 90% of all online consultations). The results for both genders are very similar to the main results, indicating slightly less than 1:2 substitution.

We then divide the common DCT diagnoses into subsets: upper respiratory infections (Resp), skin related diseases (Skin), genital and reproductive organs (Gen/Rep), and Other common diagnoses set by DCT providers; these categories respectively cover 19%, 25%, 22%, and 28% of all online consultations in 2018. (Because each consultation may have more than one diagnosis, the contacts in these subsets are not completely mutually exclusive.) The first stage coefficients are in line with what is expected, given each category's share of all DCT consultations, but the degree of substitution varies between categories.

Consultations related to upper respiratory infections display the highest degree of substitution.
The estimate in Table 7 indeed indicates that online consultations (more than) fully replace in-person consultations for this reason (when looking at the results by gender, it should be noted that the first stage for men is weak).

For the categories including skin conditions or diagnoses related to the genital and reproductive organs, the estimates are small and not statistically significant. Although we cannot rule out partial substitution of similar magnitudes as before, we believe it is plausible to find little substitution for these categories. The gen/rep category mainly includes gynecological diagnosis codes, which are only relevant for women; accordingly, the first-stage relationship is virtually non-existent for men. Further, contraceptives are normally prescribed by midwives in Sweden, so there are few physician consultations to replace. As for skin conditions, it is plausible that the increased convenience of seeking care mainly spurs additional demand for issues that otherwise would not lead to a physician contact. That is, in the absence of accessible DCT, the patient might not have been offered a physician consultation, or might not even have tried to get such an appointment.

For the category of other DCT-relevant diagnoses (including various diagnoses with vague symptoms, mental health related diagnoses, and renewal of prescriptions) the degree of substitution is similar to that for the overall category (Common) but not statistically significant.

Estimations using other bandwidths support the main pattern of the substitution within diagnosis groups (see Appendix I).

5.6.3 Other consultations and antibiotic prescriptions

The results in the previous section suggest that online physician consultations may replace in-person visits with health care professionals other than physicians.
Table 8 presents fuzzy diff-in-disc results for in-person visits with other health care professions: nurse visits at a primary care center; visits at a midwife, youth or STD clinic; the sum of these two types of visits (Nurse + midwife/youth/STD); and the sum of these two types and in-person physician visits (All consultations). For the combined sample, there are small but insignificant negative effects on nurse visits and visits at midwife/youth/STD clinics. The estimates of substitution between online consultations and any type of in-person visits within primary care (All consultations) are marginally larger but more imprecise, compared to the estimates of the substitution with in-person physician visits only (i.e., the main results presented in Table 3). When splitting the sample by gender, the results suggest that the degree of substitution among women is larger compared to the main results due to the inclusion of midwife consultations (in line with the results by type of diagnosis above), and that DCT consultations increase the number of consultations at youth/STD clinics among men. These patterns are consistent when using longer fixed bandwidths or a bandwidth optimal for the relevant outcome variable and estimation sample (see Appendix J). Overall, these results suggest that there is partial substitution between online physician consultations and any type of primary care consultations (although we cannot rule out full substitution).

Table 7: Decomposition by type of diagnosis

A: FIRST STAGE (SHARP DIFF-IN-DISC)
           Common       Resp         Skin         Gen/Rep      Other
All       -0.129***    -0.0256***   -0.0379***   -0.0259***   -0.0468***
          (0.0110)     (0.00572)    (0.00609)    (0.00584)    (0.00749)
Men       -0.0600***   -0.00581     -0.0206***   -0.00484**   -0.0298***
          (0.0105)     (0.00618)    (0.00632)    (0.00207)    (0.00679)
Women     -0.204***    -0.0468***   -0.0566***   -0.0487***   -0.0651***
          (0.0208)     (0.0104)     (0.0103)     (0.0119)     (0.0129)

B: IV (FUZZY DIFF-IN-DISC)
           Common       Resp         Skin         Gen/Rep      Other
All       -0.440**     -1.447**     -0.105       -0.0393      -0.472
          (0.205)      (0.575)      (0.301)      (0.346)      (0.417)
Men       -0.624       -2.253       -0.391       -1.169       -0.478
          (0.546)      (3.886)      (0.695)      (1.501)      (0.771)
Women     -0.378*      -1.334***     0.00721      0.0863      -0.464
          (0.202)      (0.504)      (0.294)      (0.329)      (0.460)

Note: Table shows results from the first stage equation (sharp diff-in-disc) and the IV-model (fuzzy diff-in-disc) by diagnosis groups. The first column shows all Common diagnoses set by DCT providers. In the second to fifth columns, these common diagnoses are decomposed into subgroups: upper respiratory infections (Resp), skin related diseases (Skin), genital and reproductive organs (Gen/Rep), and Other common diagnoses. Each row presents the diff-in-disc estimates for the diagnosis groups for the given estimation sample (All, Men, Women). All models apply the bandwidths used for the main results in Table 3 (i.e., MSE-optimal bandwidths for In-person visits for both regions and both genders). Estimates using data collapsed by region, gender, year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Table 8: Visits to other health care professionals

FUZZY DIFF-IN-DISC
                                        All        Men        Women
Nurse visits at PCC                    -0.0268    -0.519*     0.137
                                       (0.123)    (0.312)    (0.127)
Visits at midwife/youth/STD clinic     -0.0611     0.376*    -0.197
                                       (0.183)    (0.212)    (0.230)
Nurse+midwife/youth/STD                -0.0879    -0.143     -0.0602
                                       (0.220)    (0.347)    (0.265)
All consultations                      -0.541     -1.032     -0.366
                                       (0.331)    (0.691)    (0.360)

Note: Table shows fuzzy diff-in-disc estimates of the effect of online consultations on in-person consultations with other health care professionals than physicians at primary care centers: consultations with a nurse at a primary care center; consultations with a midwife/nurse/physician at a midwife/youth/STD clinic; the sum of these two types of consultations; and the sum of all consultations, including physician consultations at a primary care center. Each model uses the MSE-optimal bandwidth for the main outcome variable (in-person consultations) for both regions and both genders (to remove heterogeneity due to changes in bandwidth). Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

A considerable share of the DCT contacts relates to conditions for which physicians may consider prescribing antibiotics. This is in particular true for upper respiratory infections but also for skin conditions (acne) and genital and reproductive health (cystitis). The increased access to DCT consultations and substitution away from in-person visits may therefore affect antibiotic use. Appendix K presents an analysis of how the onset of the fee for online consultations affects the number of antibiotic prescriptions. The sharp diff-in-disc estimates on the total number of antibiotic prescriptions are positive, and thus of opposite sign to the first stage estimate on online consultations. This means that the decrease in online consultations due to the onset of the user fee is not associated with a decrease in antibiotic prescriptions.
Thus, DCT physicians seem to be at least as restrictive as other physicians are in terms of prescribing antibiotics.29 Notably, the result holds also when examining the subset of antibiotics prescribed against respiratory infections. This is consistent with the available medical guidelines that advocate against prescriptions without a physical examination. Speculatively, the complete substitution of in-person visits for respiratory infections thus prevents some unnecessary antibiotics prescriptions.

29 This is consistent with recent descriptive evidence from Sweden (Entezarjou et al., 2021) and the US (Shi et al., 2018). Earlier studies of DCT in the US indicated less appropriate antibiotics prescription in a DCT setting (Uscher-Pines et al., 2015a,b). According to Shi et al. (2018), the difference between the US results may be due to increased attention to prescription guidelines in DCT settings.

6 Concluding remarks

We show that the demand for DCT consultations falls by half as individuals in our study regions reach the age at which they start to pay a consultation fee. We exploit this exogenous variation in demand to estimate the degree to which online consultations substitute for in-person physician consultations at the primary care practice. Our estimates imply that about 50 per cent of all online consultations replace in-person visits. Consequently, the availability of a direct-to-consumer telemedicine market increases the total number of physician consultations (online and in-person). This conclusion is robust to different bandwidths, functional form assumptions, and methods for estimating standard errors.

A decomposition of the main estimate by gender reveals that the degree of substitution is higher among men, but the estimate is very imprecise due to men’s lower care utilisation and weaker response to the onset of the DCT fee. The evidence of gender heterogeneity is therefore weak. Nonetheless, one noteworthy reason why men might display higher substitution is that a larger share of women’s DCT contacts relates to reproductive health, which otherwise is usually handled by midwives rather than by primary care physicians. That is, women substitute across professional categories to some extent.

Holding demand constant, the maximum attainable degree of substitution is limited by the fact that physicians sometimes need to examine the patient physically, and by the likelihood that traditional care providers would offer the patient an in-person appointment with a physician. These factors vary across medical conditions. When decomposing the main estimate across categories of common DCT diagnoses, we find that the degree of substitution is higher for upper respiratory infections, which relatively often warrant a physical examination. Conversely, the degree of substitution is lower for diagnoses for which traditional care providers would either delegate the treatment to other medical professions (e.g., contraceptive management), or not even offer an appointment (e.g., mild skin conditions). Given the uncertainty surrounding these estimates, this heterogeneity should be viewed as tentative, though.

Our conclusion that DCT increases the total demand for physician consultations aligns with the few previous studies in this field (Ashwood et al., 2017; Ellegård and Kjellsson, 2019), but our point estimates indicate a higher degree of substitution. One compelling reason for our different estimates is that the previous studies fail to account for important time-variant unobserved heterogeneity such as transitory health issues. But the differences may also relate to the external validity of our analysis. Here, the study population and the study context are of particular concern.
With regard to the study population, it should be acknowledged that our identification strategy captures causal effects for individuals around 20 years of age. As shown in the supplementary material (Appendix L), this age group is similar to the (rest of the) 15-34 age group – a large fraction of DCT users – in terms of the proportion with a common DCT-related diagnosis and expected health care spending. It thus appears reasonable to generalise our results to other persons in these ages, at least to the extent that their care seeking behaviour and access to in-person consultations are similar.

Regarding the context, we study a publicly financed primary care market in which DCT providers and traditional primary care providers operate under different contracts. As long as there is a third-party payer involved, patients’ willingness to substitute DCT consultations for in-person consultations is likely similar irrespective of whether this third-party payer is a national health system or a private insurance plan. The access to on-demand online consultations gives similar incentives for patients to seek care in both cases.

Variation in provider incentives is a more likely source of differences in the degree of substitution across contexts. In an international comparison, the Swedish healthcare system is characterised by relatively few primary care physicians, more task shifting towards nurses, and a stronger reliance on fixed payments rather than fee-for-service. The scope for substitution might be greater in contexts where such factors, which limit the access to physician consultations, are not present or less pronounced. However, despite the large role played by nurses, we do not find evidence of substantially more substitution when considering nurse and physician consultations together.
Further, traditional providers’ incentives to offer appointments vary considerably between the two study regions (e.g., fee-for-service is only used in one of the regions), but the estimated degrees of substitution are similar once we account for regional differences in the demographic structure. We therefore think that our results may be relevant also outside our study context(s).

One might finally ask how the Covid-19 pandemic affects the validity of our results. It is plausible to assume that the pandemic has increased the tendency of both patients and health care providers to adopt a ‘digital-first’ approach. As far as patients are concerned, the pandemic might thus have increased the relevance of our analysis, which essentially concerns a group of early adopters. On the provider side, the transformation of traditional care providers’ view of the necessity to offer physical examinations might have reduced the propensity to offer in-person consultations. This would imply a lower scope for substitution after the pandemic. Notably, such a development would not affect our conclusion that the effect of the availability of a DCT market is to increase total demand. However, it would indicate that our estimates provide upper bounds for the degree of substitution.

To calculate the total economic consequences of DCT, an estimate of the degree of substitution is only a first step. An increasing volume of doctor consultations does not have to imply increasing total costs. A key question is to what extent unit costs differ between DCT and traditional care for a given patient. Our preferred estimated substitution rate of 50% implies that if the unit cost of a DCT consultation is 50% of the unit cost of an in-person visit, then the availability of the DCT market implies larger volumes at the same total cost. If unit costs of DCT are even lower, then the DCT market increases care volumes while simultaneously reducing costs.
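The break-even reasoning in the preceding paragraph can be illustrated with a small numerical sketch. It assumes, as a simplification, that every DCT consultation costs the payer a fixed unit cost and that a fraction of DCT consultations (the substitution rate) each displace exactly one in-person visit with its own fixed unit cost. The function names and the cost figures are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
def net_cost_per_dct_consultation(substitution_rate, cost_dct, cost_in_person):
    """Change in payer spending caused by one additional DCT consultation.

    Assumption (illustrative): each DCT consultation costs `cost_dct` and
    displaces `substitution_rate` in-person visits costing `cost_in_person` each.
    """
    return cost_dct - substitution_rate * cost_in_person


def break_even_cost_ratio(substitution_rate):
    """Ratio cost_dct / cost_in_person at which total spending is unchanged.

    Setting the net cost above to zero gives cost_dct / cost_in_person
    equal to the substitution rate.
    """
    return substitution_rate


if __name__ == "__main__":
    s = 0.5  # substitution rate of about 50%, as in the main estimate
    for cost_dct in (40.0, 50.0, 80.0):  # hypothetical unit costs
        delta = net_cost_per_dct_consultation(s, cost_dct, cost_in_person=100.0)
        print(f"DCT unit cost {cost_dct:5.1f}: net cost change {delta:+.1f}")
```

Under these assumptions, a DCT unit cost equal to half the in-person unit cost leaves total spending unchanged, a lower DCT unit cost reduces spending, and a higher one increases it.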
The cost functions of both DCT and traditional providers are unknown,30 but it appears unlikely that the direct (treatment) cost of a consultation, of which a large part reflects labour costs, is twice as high in the traditional setting as in the DCT setting. For comparable cases (i.e., consultations during which the physician does not consider a physical examination necessary), it is hard to see why the time spent would be considerably shorter just because the consultation takes place online.31 Thus, it seems reasonable to assume that the net effect of the availability of DCT services is to increase the financial burden on third-party payers. However, the indirect costs of DCT services may in many cases be low enough to make the cost-benefit calculation positive. In particular, the value of the time saved for patients not having to take time off work to travel to the primary care practice might be substantial (Ekman, 2018).

30 In our setting, the only available attempt to compare costs uses the average cost of one DCT company and the national average primary care spending per consultation (Ekman, 2018). This comparison is not informative, as it does not account for the huge differences in case mix facing DCT and traditional primary care providers. Further, as pointed out by Salisbury et al. (2020), we do not know if DCT companies are taking initial losses for future profits.
31 It is possible that wage levels differ, but such differences are only interesting to the extent that they do not reflect selection of e.g. more junior (low-paid) physicians into either sector.

From a strict health policy perspective, it may still be a concern that up to 50% of all spending on DCT reflects spending on patients who would not otherwise have seen a doctor.32 Given a fixed budget, those resources might have otherwise gone to individuals with greater health needs, for whom online consultations are inadequate (Roland, 2019).
Such distributional concerns, together with conventional cost containment goals, suggest that the regulation of the DCT market should incorporate similar incentives for nurse triage as in traditional primary care. A further step in that direction might be to apply higher reimbursement rates for consultations with higher priority. Finally, a straightforward policy implication of our results is that even complete substitution might be problematic from a budget perspective, if the alternative would be a consultation with cheaper professional categories such as nurses or midwives.

32 In the Swedish case, the figures for 2019 would imply that 1.5% of the expenditure on primary care physician consultations was redirected in this fashion.

References

AHIP. 2019. “Virtual Care Delivers Value.” https://www.ahip.org/virtual-care-delivers-value/ Accessed: 2021-05-10.
Anell, A. 2015. “The Public-Private Pendulum – Patient Choice and Equity in Sweden.” New England Journal of Medicine, 372(1): 1–4.
Anell, Anders, Anna H Glenngaard, and Sherry M Merkur. 2012. “Sweden: Health system review.” Health systems in transition, 14(5): 1–159.
Arntz, Melanie, Terry Gregory, and Ulrich Zierahn. 2019. “Digitalization and the Future of Work: Macroeconomic Consequences.” Handbook of Labor, Human Resources and Population Economics, by Klaus F. Zimmermann (Editor-in-Chief).
Ashwood, J. Scott, Ateev Mehrotra, David Cowling, and Lori Uscher-Pines. 2017. “Direct-To-Consumer Telehealth May Increase Access To Care But Does Not Decrease Spending.” Health Affairs, 36(3): 485–491.
Baugh, Brian, Itzhak Ben-David, and Hoonsuk Park. 2018. “Can Taxes Shape an Industry? Evidence from the Implementation of the “Amazon Tax”.” The Journal of Finance, 73(4): 1819–1855.
Bavafa, Hessam, Lorin M. Hitt, and Christian Terwiesch. 2018. “The Impact of E-Visits on Visit Frequencies and Patient Health: Evidence from Primary Care.” Management Science, 64(12): 5461–5480.
Blix, Mårten, and Johanna Jeansson. 2019.
“Telemedicine and the welfare state: The Swedish experience.” In Digital Transformation and Public Services, ed. Anthony Larsson and Robin Teigland. Taylor & Francis.
Branson, Zach, and Fabrizia Mealli. 2018. “Local randomization and beyond for regression discontinuity designs: Revisiting a causal analysis of the effects of university grants on dropout rates.”
Calonico, Sebastian, Matias D Cattaneo, and Rocio Titiunik. 2014. “Robust nonparametric confidence intervals for regression-discontinuity designs.” Econometrica, 82(6): 2295–2326.
Cattaneo, Matias D., Nicolás Idrobo, and Rocío Titiunik. 2019. A Practical Introduction to Regression Discontinuity Designs: Foundations. Cambridge: Cambridge University Press.
Cinco, Samantha Joy. 2021. “Companion or Substitution? Automation and Digitisation in the Workplace.” Journal of Entrepreneurship and Innovation in Emerging Economies, 1–6.
Cutler, David M., Sayeh Nikpay, and Robert S. Huckman. 2020. “The Business of Medicine in the Era of COVID-19.” JAMA, 323(20): 2003.
Ekman, Björn. 2018. “Cost Analysis of a Digital Health Care Model in Sweden.” PharmacoEconomics - Open, 2(3): 347–354.
Ellegård, Lina Maria, and Gustav Kjellsson. 2019. “Nätvårdsanvändare i Skåne kontaktar oftare vårdcentral och gör inte färre akutbesök.” Läkartidningen, 116.
Entezarjou, Artin, Susanna Calling, Tapomita Bhattacharyya, Veronica Milos Nymberg, Lina Vigren, Ashkan Labaf, Ulf Jakobsson, and Patrik Midlöv. 2021. “Antibiotic Prescription Rates After eVisits Versus Office Visits in Primary Care: Observational Study.” JMIR Medical Informatics, 9(3): e25473.
Galindo-Silva, Hector, Nibene Habib Some, and Guy Tchuente. 2019. “Does Obamacare Care? A Fuzzy Difference-in-discontinuities Approach.”
Goldfarb, Avi, and Catherine Tucker. 2019. “Digital Economics.” Journal of Economic Literature, 57(1).
Grembi, Veronica, Tommaso Nannicini, and Ugo Troiano. 2016. “Do Fiscal Rules Matter?” American Economic Journal: Applied Economics, 8(3): 1–30.
Heckley, Gawain, Ulf-G. Gerdtham, and Johan Jarl. 2018. “Too Young to Die: Regression Discontinuity of a Two-Part Minimum Legal Drinking Age Policy and the Causal Effect of Alcohol on Health.” Working Paper, Department of Economics, Lund University 2018:4.
He, Yang, and Otavio Bartalotti. 2020. “Wild Bootstrap for Fuzzy Regression Discontinuity Designs: Obtaining Robust Bias-Corrected Confidence Intervals.” Econometrics Journal, 23(2): 211–231.
Imbens, Guido, and Karthik Kalyanaraman. 2012. “Optimal Bandwidth Choice for the Regression Discontinuity Estimator.” The Review of Economic Studies, 79(3): 933–959.
Johansson, Naimi, Niklas Jakobsson, and Mikael Svensson. 2019. “Effects of primary care cost-sharing among young adults: varying impact across income groups and gender.” The European Journal of Health Economics, 20(8): 1271–1280.
Kry. 2019. “Kvalitetsrapport januari–september 2019 Sverige.” https://www.kry.se/medicinsk-kvalitet/kvalitetsrapport-q1-q3-2019/ Last accessed: 2021-06-04.
Lee, David S., and David Card. 2008. “Regression discontinuity inference with specification error.” Journal of Econometrics, 142(2): 655–674.
Licurse, Adam M., and Ateev Mehrotra. 2018. “The Effect of Telehealth on Spending: Thinking Through the Numbers.” Annals of Internal Medicine, 168(10): 737.
Li, Kathleen Yinran, Ziwei Zhu, Sophia Ng, and Chad Ellimoottil. 2021. “Direct-To-Consumer Telemedicine Visits For Acute Respiratory Infections Linked To More Downstream Visits.” Health Affairs (Project Hope), 40(4): 596–602.
Martinez, Kathryn A., Mark Rood, Nikhyl Jhangiani, Lei Kou, Susannah Rose, Adrienne Boissy, and Michael B. Rothberg. 2018. “Patterns of Use and Correlates of Patient Satisfaction with a Large Nationwide Direct to Consumer Telemedicine Service.” Journal of General Internal Medicine, 33(10): 1768–1773.
Mehrotra, Ateev, R. Sacha Bhatia, and Centaine L. Snoswell. 2021. “Paying for Telemedicine After the Pandemic.” JAMA, 325(5): 431.
Millán-Quijano, Jaime. 2020.
“Fuzzy difference in discontinuities.” Applied Economics Letters, 27(19): 1552–1555.
Nilsson, Anton, and Alexander Paul. 2018. “Patient cost-sharing, socioeconomic status, and children’s health care utilization.” Journal of Health Economics, 59: 109–124.
Nord, Garrison, Kristin L. Rising, Roger A. Band, Brendan G. Carr, and Judd E. Hollander. 2018. “On-demand synchronous audio video telemedicine visits are cost effective.” The American Journal of Emergency Medicine.
North, Frederick, Sarah J. Crane, Rajeev Chaudhry, Jon O. Ebbert, Karen Ytterberg, Sidna M. Tulledge-Scheitel, and Robert J. Stroebel. 2014. “Impact of patient portal secure messages and electronic visits on adult primary care office visits.” Telemedicine Journal and E-Health, 20(3): 192–198.
Pearl, Robert. 2014. “Kaiser Permanente Northern California: Current Experiences With Internet, Mobile, And Video Technologies.” Health Affairs, 33(2): 251–257.
Roland, Martin. 2019. “General practice by smartphone.” BMJ, 366.
SALAR. 2017. “Sveriges Kommuner och Landsting – Sektorn i siffror.” http://skl.se/ekonomijuridikstatistik/ekonomi/sektornisiffror.1821.html Last Accessed 2021-06-04.
SALAR, (Swedish Association of Local Authorities and Regions). 2020. “Digitala vårdtjänster i primärvården.” https://skr.se/halsasjukvard/ehalsa/digitalavardtjansteriprimarvarden.28301.html Last Accessed: 2021-06-03.
Salisbury, Chris, Anna Quigley, Nick Hex, and Camille Aznar. 2020. “Private Video Consultation Services and the Future of Primary Care.” Journal of Medical Internet Research, 22(10): e19415.
Shah, Sachin J., Lee H. Schwamm, Adam B. Cohen, Marcy R. Simoni, Juan Estrada, Marcelo Matiello, Atheendar Venkataramani, and Sandhya K. Rao. 2018. “Virtual Visits Partially Replaced In-Person Visits In An ACO-Based Medical Specialty Practice.” Health Affairs, 37(12): 2045–2051.
Shi, Zhuo, Ateev Mehrotra, Courtney A. Gidengil, Sabrina J. Poon, Lori Uscher-Pines, and Kristin N. Ray. 2018.
“Quality Of Care For Acute Respiratory Infections During Direct-To-Consumer Telemedicine Visits For Adults.” Health Affairs, 37(12): 2014–2023.
Uscher-Pines, Lori, Andrew Mulcahy, David Cowling, Gerald Hunter, Rachel Burns, and Ateev Mehrotra. 2015a. “Access and Quality of Care in Direct-to-Consumer Telemedicine.” Telemedicine and e-Health, 22(4): 282–287.
Uscher-Pines, Lori, Andrew Mulcahy, David Cowling, Gerald Hunter, Rachel Burns, and Ateev Mehrotra. 2015b. “Antibiotic Prescribing for Acute Respiratory Infections in Direct-to-Consumer Telemedicine Visits.” JAMA internal medicine, 175(7): 1234–1235.

Appendix A Comparison of DCT measures

The main data sources of our DCT measure are Stockholm and Västra Götaland’s registers of billing information for their residents’ care consumption in other regions. (Regions are financially responsible for their residents and get billed when their residents are treated by providers in other regions.) The DCT variable in the main analysis includes

• all primary care contacts in the Sörmland and Jönköping regions.
• contacts with private DCT providers in the Skåne region (Capio Go, until May 2018) and a public DCT provider in Västra Götaland (Närhälsan Online).

The vast majority of the DCT contacts comes from Region Jönköping, where the largest DCT providers were located at the time (Min Doktor, Kry, Doktor 24, Medicoo, Accumbo, and Capio Go from June 2018). Doktor.se was the only provider located in Region Sörmland before 2019.

In Tables A.1 and A.2, we examine how sensitive the estimated first stage sharp diff-in-disc is to various definitions of the DCT variable. In Table A.1, all models are estimated with the same bandwidth (the MSE-optimal bandwidth for in-person visits for both genders and regions). Table A.2 presents results from models using the MSE-optimal bandwidth for each estimation sample. In column 1, the definition of the DCT variable is the same as in the main analysis.
This variable is based on billing information, and includes all primary care visits in Sörmland and Jönköping. In the next two columns, we use the same definition as in column 1 but include only consultations registered in Region Sörmland (column 2) or Region Jönköping (column 3). In column 4, we apply the same definition of DCT contacts, but use information from register data from Region Jönköping (instead of billing information from Västra Götaland and Stockholm). In column 5, the DCT definition only includes registered remote contacts at the private PCCs in Jönköping that have agreements with DCT providers.33 In column 6, we further restrict this definition to only include remote consultations with physicians.

The estimates are similar across outcomes, except when we look at contacts in Sörmland only (column 2). The similarity between the other coefficients implies that the discontinuity at age 20 primarily affected the number of DCT contacts with providers located in Region Jönköping. This pattern is explained by the age differentiated DCT user fee in Region Jönköping; in Region Sörmland the user fee was zero for all age groups. Notably, only one minor provider had an agreement with a primary care center in Region Sörmland during this time period, and this provider had a nurse triage system in place making the patient less likely to see a physician. The positive coefficient in column 2 indeed suggests that the user fee in Region Jönköping, if anything, had a reverse (although small) effect on the number of DCT contacts in Region Sörmland. This is also supported by the coefficients in columns 3 and 4 being larger than in column 1.

33 It was the agreements with these PCCs that enabled the DCT providers to get public funding via the inter-regional agreement.
The similarity between columns 1, 3, 4, and 5 suggests that the billing information from the home regions’ own registers provides an accurate measure of the DCT contacts and a reliable estimate of the effect of the reduction of the DCT user fee at age 20. (The estimated degree of substitution (fuzzy diff-in-disc) is also very similar when using the data from Jönköping to measure DCT contacts.) The small difference between columns 5 and 6 further suggests that only about 10% of the first stage coefficient relates to changes in consultations with other health care professionals than physicians.

Table A.1: Variation of DCT definitions (Opt.bw. combined sample)

           DCT          DCT-Smld     DCT-Jkpg1    DCT-Jkpg2    DCT-digpr    DCT-phys
A: BOTH REGIONS (2018)
All       -0.149***     0.00649     -0.155***    -0.161***    -0.159***    -0.147***
          (0.0119)     (0.00413)    (0.0111)     (0.0116)     (0.0114)     (0.0109)
Men       -0.0711***   -0.000695    -0.0678***   -0.0677***   -0.0680***   -0.0613***
          (0.0120)     (0.00499)    (0.0106)     (0.0106)     (0.0105)     (0.0106)
Women     -0.232***     0.0142*     -0.249***    -0.261***    -0.258***    -0.239***
          (0.0240)     (0.00806)    (0.0222)     (0.0229)     (0.0224)     (0.0216)
B: REGION STOCKHOLM (2018)
All       -0.164***     0.0111*     -0.175***    -0.183***    -0.182***    -0.166***
          (0.0176)     (0.00627)    (0.0171)     (0.0177)     (0.0176)     (0.0163)
Men       -0.0793***   -0.00151     -0.0778***   -0.0762***   -0.0783***   -0.0693***
          (0.0190)     (0.00958)    (0.0179)     (0.0182)     (0.0182)     (0.0177)
Women     -0.266***     0.0235**    -0.289***    -0.304***    -0.301***    -0.276***
          (0.0378)     (0.0116)     (0.0355)     (0.0364)     (0.0356)     (0.0334)
C: REGION VÄSTRA GÖTALAND (2018)
All       -0.126***     0.000894    -0.128***    -0.133***    -0.132***    -0.125***
          (0.0191)     (0.00575)    (0.0167)     (0.0175)     (0.0173)     (0.0152)
Men       -0.0576***    0.00136     -0.0531***   -0.0520***   -0.0482***   -0.0477***
          (0.0173)     (0.00603)    (0.0146)     (0.0146)     (0.0144)     (0.0145)
Women     -0.208***     0.0000444   -0.214***    -0.222***    -0.222***    -0.206***
          (0.0332)     (0.00959)    (0.0293)     (0.0304)     (0.0295)     (0.0267)

Note: Table shows sharp diff-in-disc results for various definitions of online consultations.
Each column presents the sharp diff-in-disc estimates for a given outcome for each estimation sample (All, Men, Women), for both regions jointly and separately. Each model uses the MSE-optimal bandwidths for in-person visits used in the main estimations for both genders and regions jointly. Estimates use data collapsed by region, gender, year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Table A.2: Variation of DCT definitions (Opt. bw. each sample)

            DCT          DCT-Smld     DCT-Jkpg1    DCT-Jkpg2    DCT-digpr    DCT-phys
A: BOTH REGIONS (2018)
All         -0.149***    0.00649      -0.155***    -0.161***    -0.159***    -0.147***
            (0.0119)     (0.00413)    (0.0111)     (0.0116)     (0.0114)     (0.0109)
Men         -0.0772***   -0.00246     -0.0743***   -0.0709***   -0.0695***   -0.0620***
            (0.0143)     (0.00624)    (0.0122)     (0.0124)     (0.0123)     (0.0125)
Women       -0.238***    0.0195**     -0.263***    -0.273***    -0.268***    -0.249***
            (0.0252)     (0.00939)    (0.0228)     (0.0235)     (0.0229)     (0.0223)
B: REGION STOCKHOLM (2018)
All         -0.162***    0.0132**     -0.175***    -0.182***    -0.181***    -0.167***
            (0.0177)     (0.00621)    (0.0170)     (0.0176)     (0.0175)     (0.0162)
Men         -0.0793***   -0.00151     -0.0778***   -0.0762***   -0.0783***   -0.0693***
            (0.0190)     (0.00958)    (0.0179)     (0.0182)     (0.0182)     (0.0177)
Women       -0.266***    0.0235**     -0.289***    -0.304***    -0.301***    -0.276***
            (0.0378)     (0.0116)     (0.0355)     (0.0364)     (0.0356)     (0.0334)
C: REGION VÄSTRA GÖTALAND (2018)
All         -0.126***    0.000894     -0.128***    -0.133***    -0.132***    -0.125***
            (0.0191)     (0.00575)    (0.0167)     (0.0175)     (0.0173)     (0.0152)
Men         -0.0591***   -0.000414    -0.0573***   -0.0529***   -0.0485***   -0.0489***
            (0.0180)     (0.00628)    (0.0150)     (0.0150)     (0.0148)     (0.0150)
Women       -0.183***    0.0123       -0.204***    -0.216***    -0.211***    -0.191***
            (0.0382)     (0.0121)     (0.0324)     (0.0336)     (0.0322)     (0.0305)

Note: Table shows sharp diff-in-disc results for various definitions of online consultations.
Each column presents the sharp diff-in-disc estimates for a given outcome for each estimation sample (All, Men, Women), for both regions jointly and separately. Each model uses the MSE-optimal bandwidths for in-person visits used in the main estimations for the respective estimation sample. Estimates use data collapsed by region, gender, year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Appendix B List of diagnosis categories

We define the most common diagnoses using information on registered ICD-codes from the local care register in Region Jönköping. These data exactly identify the online consultations with physicians at DCT providers in Jönköping, but have the limitation that they do not include any consultations with DCT providers located elsewhere. Each care contact may have up to five registered diagnoses classified according to the ICD-10. The data include the complete ICD-code, but we define the most common diagnosis categories at the three-digit level.

To define the most common diagnoses, we first generate a list of all complete ICD-codes and the corresponding number of recorded registrations at online consultations from 2016 to 2018 (by gender). We then sum the number of registered diagnoses within each three-digit ICD-code, and rank these three-digit codes by the number of registrations (again by gender). A three-digit ICD-code is defined as being among the most common diagnoses if it is included among the top 80% of all registered diagnoses for either male or female individuals. For non-administrative ICD-codes (i.e., any ICD-code not in Z00-Z99), we also include three-digit ICD-codes that belong to the same ICD-block (subchapter) as any of the most common ICD-codes and had been registered at least 10 times during the study period.
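The ranking step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the input format (gender, complete ICD-code, number of registrations), the function name `common_three_digit_codes`, and the reading of "top 80%" as a cumulative share of registrations within each gender are all assumptions. The additional ICD-block rule for non-administrative codes is left out, and the counts in the example are invented.

```python
from collections import Counter


def common_three_digit_codes(registrations, share=0.80):
    """Return the three-digit codes that are 'most common' for either gender.

    registrations: iterable of (gender, complete_icd_code, count) tuples.
    Assumption: a code is kept if it is needed to reach `share` of all
    registrations when codes are ranked by frequency within a gender.
    """
    by_gender = {}
    for gender, code, count in registrations:
        # Collapse complete codes (e.g. "J069") to the three-digit level ("J06").
        by_gender.setdefault(gender, Counter())[code[:3]] += count

    common = set()
    for counts in by_gender.values():
        total = sum(counts.values())
        cumulative = 0
        for code3, n in counts.most_common():  # ranked by registrations
            if cumulative >= share * total:
                break
            common.add(code3)
            cumulative += n
    return common


# Invented example counts, for illustration only.
example = [
    ("F", "J069", 50), ("F", "J060", 10), ("F", "N300", 30), ("F", "L709", 10),
    ("M", "J069", 50), ("M", "L709", 35), ("M", "R059", 15),
]
result = common_three_digit_codes(example)
print(sorted(result))
```

Taking the union across genders mirrors the "for either male or female individuals" condition in the text: a code that is common for only one gender still enters the list.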
We classify the set of common diagnoses into four subsets: upper respiratory infections, skin conditions, diagnoses related to genital and reproductive organs, and a residual category (other). Tables B.1, B.2, B.3, and B.4 list the three-digit ICD-codes in each subset. In 2018, 90% of the online consultations with a physician had at least one registered ICD-code that is included in the definition of the most common diagnoses. The four categories respectively covered 19%, 25%, 22%, and 28% of all online consultations in the same year. Because each consultation may have more than one diagnosis, the contacts in these subsets are not completely mutually exclusive.

Table B.1: Upper respiratory infections

icd   icd-block   Description
B27   B25-B34     Infectious mononucleosis
B30   B25-B34     Viral conjunctivitis
B34   B25-B34     Viral infection of unspecified site
J00   J00-J06     Acute nasopharyngitis [common cold]
J01   J00-J06     Acute sinusitis
J02   J00-J06     Acute pharyngitis
J03   J00-J06     Acute tonsillitis
J06   J00-J06     Acute upper respiratory infections of multiple and unspecified sites
J30   J30-J39     Vasomotor and allergic rhinitis
J31   J30-J39     Chronic rhinitis, nasopharyngitis and pharyngitis
J35   J30-J39     Chronic diseases of tonsils and adenoids
J45   J40-J47     Asthma
R05   R00-R09     Cough

Table B.2: Skin related conditions

icd   icd-block   Description
B00   B00-B09     Herpesviral [herpes simplex] infections
B02   B00-B09     Zoster [herpes zoster]
B07   B00-B09     Viral warts
B08   B00-B09     Other viral infections characterized by skin and mucous membrane lesions, not elsewhere classified
B35   B35-B49     Dermatophytosis
B36   B35-B49     Other superficial mycoses
B37   B35-B49     Candidiasis (excluding B373, B373P, or B374)
L01   L00-L08     Impetigo
L02   L00-L08     Cutaneous abscess, furuncle and carbuncle
L03   L00-L08     Cellulitis
L08   L00-L08     Other local infections of skin and subcutaneous tissue
L20   L20-L30     Atopic dermatitis
L21   L20-L30     Seborrhoeic dermatitis
L23   L20-L30     Allergic contact dermatitis
L29   L20-L30     Pruritus
L30   L20-L30     Other dermatitis
L50   L50-L54     Urticaria
L60   L60-L75     Nail disorders
L63   L60-L75     Alopecia areata
L64   L60-L75     Androgenic alopecia
L65   L60-L75     Other nonscarring hair loss
L70   L60-L75     Acne
L71   L60-L75     Rosacea
L73   L60-L75     Other follicular disorders
R21   R20-R23     Rash and other nonspecific skin eruption
R22   R20-R23     Localized swelling, mass and lump of skin and subcutaneous tissue
R23   R20-R23     Other skin changes

Table B.3: Genital & reproductive organs

icd   icd-block   Description
B37   B35-B49     Candidiasis (only B373, B373P, or B374)
F52   F50-F59     Sexual dysfunction, not caused by organic disorder or disease
N30   N30-N39     Cystitis
N39   N30-N39     Other disorders of urinary system
N76   N70-N77     Other inflammation of vagina and vulva
N77   N70-N77     Vulvovaginal ulceration and inflammation in diseases classified elsewhere
N92   N80-N98     Excessive, frequent and irregular menstruation
N94   N80-N98     Pain and other conditions associated with female genital organs and menstrual cycle
R10   R10-R19     Abdominal and pelvic pain
Y42   Y40-Y59     Hormones and their synthetic substitutes and antagonists, not elsewhere classified
Z30   Z           Contraceptive management
Z92   Z           Personal history of medical treatment

Table B.4: Other common diagnoses

icd   icd-block   Description
A08   A00-A09     Viral and other specified intestinal infections
A09   A00-A09     Other gastroenteritis and colitis of infectious and unspecified origin
A69   A65-A69     Other spirochaetal infections
B80   B65-B83     Enterobiasis
F32   F30-F39     Depressive episode
F33   F30-F39     Recurrent depressive disorder
F40   F40-F48     Phobic anxiety disorders
F41   F40-F48     Other anxiety disorders
F42   F40-F48     Obsessive-compulsive disorder
F43   F40-F48     Reaction to severe stress, and adjustment disorders
F45   F40-F48     Somatoform disorders
F51   F50-F59     Nonorganic sleep disorders
G43   G40-G47     Migraine
G44   G40-G47     Other headache syndromes
G47   G40-G47     Sleep disorders
H10   H10-H13     Conjunctivitis
K12   K00-K14     Stomatitis and related lesions
K13   K00-K14     Other diseases of lip and oral mucosa
K14   K00-K14     Diseases of tongue
K21   K20-K31     Gastro-oesophageal reflux disease
K29   K20-K31     Gastritis and duodenitis
K30   K20-K31     Functional dyspepsia
M54   M50-M54     Dorsalgia
M79   M70-M79     Other soft tissue disorders, not elsewhere classified
R00   R00-R09     Abnormalities of heart beat
R06   R00-R09     Abnormalities of breathing
R07   R00-R09     Pain in throat and chest
R11   R10-R19     Nausea and vomiting
R19   R10-R19     Other symptoms and signs involving the digestive system and abdomen
R50   R50-R69     Fever of other and unknown origin
R51   R50-R69     Headache
R52   R50-R69     Pain, not elsewhere classified
R53   R50-R69     Malaise and fatigue
R61   R50-R69     Hyperhidrosis
T14   T08-T14     Injury of unspecified body region
T38   T36-T50     Poisoning by hormones and their synthetic substitutes and antagonists, not elsewhere classified
T78   T66-T78     Adverse effects, not elsewhere classified
W57   W55-W65     Bitten or stung by nonvenomous insect and other nonvenomous arthropods
W64   W55-W65     Exposure to other and unspecified animate mechanical forces
X58   X58-X59     Exposure to other specified factors
Z00   Z           General examination and investigation of persons without complaint and reported diagnosis
Z02   Z           Examination and encounter for administrative purposes
Z03   Z           Medical observation and evaluation for suspected diseases and conditions
Z71   Z           Persons encountering health services for other counselling and medical advice, not elsewhere classified
Z76   Z           Persons encountering health services in other circumstances

Appendix C Descriptive statistics

Table C.1 shows descriptive statistics for three groups: the subset of individuals who turned 20 in 2018 and had at least one DCT consultation in 2018 (DCT users 2018), individuals in the same cohort who did not contact a DCT company in 2018 (Non-users 2018), and individuals turning 20 in the 2012-15 period (cohorts 2012-2015).

The first row shows the average annual number of in-person physician visits (our main outcome variable).
DCT users visited a physician more often than non-users, which may reflect a generally greater propensity to seek care or just a worse health status.34 Further below in the table, we also note that the proportion of individuals who did not visit a physician during their 19th life year (0 Phys vis ’18) was lower among DCT users than in other groups. Women are heavily over-represented among DCT users, who also tend to have parents with higher socioeconomic status than non-users in the same as well as the earlier cohorts. A larger fraction of DCT users live in the Stockholm region (vs Västra Götaland), and a larger fraction of DCT users live in a large city (Stockholm or Gothenburg) rather than a town or a rural area.

Table C.1: Descriptives across cohorts and DCT users

                        COHORT 2018         COHORT 2018         COHORTS 2012-2015
                        (DCT-USERS)         (NON-USERS)         (NON-USERS)
                        Mean     SD         Mean     SD         Mean     SD
In-person phys visits   1.424    22.858     0.884    18.016     1.063    19.756
0 Phys vis ’18          0.381    0.486      0.524    0.499      0.473    0.499
1 Phys vis ’18          0.263    0.441      0.246    0.430      0.251    0.434
≥ 2 Phys vis ’18        0.356    0.479      0.231    0.421      0.276    0.447
Female                  0.716    0.451      0.452    0.498      0.484    0.500
Share Sthlm             0.632    0.482      0.546    0.498      0.542    0.498
Resides Rural           0.068    0.253      0.103    0.305      0.109    0.312
Resides Town            0.171    0.376      0.233    0.422      0.238    0.426
Resides City            0.761    0.427      0.664    0.472      0.652    0.476
Mum inc > median        0.498    0.500      0.424    0.494      0.402    0.490
Dad uni                 0.411    0.492      0.376    0.484      0.350    0.477
Par non-nordic          0.243    0.429      0.263    0.440      0.225    0.418

Note: Descriptive statistics are presented with the sample split by telemed and non-telemed users. Non-users are also presented by cohort: those turning 20 in 2012-2015, before DCT was introduced, and those turning 20 in 2018. ’In-person phys visits’ is our main outcome variable, any in-person visit to a physician.

34 A similar pattern was also demonstrated for the study population in Ellegård and Kjellsson (2019).
Appendix D Balance test

Table D.1 shows balance estimation results for the background characteristics in the leftmost column, which are here used as outcome variables. Each outcome variable is coded either as a binary or as a categorical variable. The variables are time-invariant (for instance, we study the number of physician visits measured during the individual’s 19th life year) so that the estimations capture sample composition changes and nothing else. For each background characteristic, we obtain the MSE-optimal bandwidth and estimate the sharp difference-in-discontinuity with individual-level data using the specification in Eq. (1). The table displays the estimated diff-in-discs with standard errors clustered on the individual level in parentheses. As can be seen from the table, very few estimates are statistically significant. In particular, there is no sign of imbalance for any background characteristic when using the 2018 cohort, which is our preferred study cohort.

Table D.1: Both regions balance

                           BOTH GENDERS                MEN                         WOMEN
                           2017         2018           2017          2018          2017         2018
Dad uni                    -0.000163    0.000134       -0.00168      0.000646      0.00110      -0.000672
                           (0.00110)    (0.00113)      (0.00151)     (0.00151)     (0.00170)    (0.00167)
Mum uni                    -0.00218*    -0.000366      -0.00374***   0.000413      0.00115      -0.00114
                           (0.00115)    (0.00121)      (0.00145)     (0.00154)     (0.00162)    (0.00164)
Mum inc > median           -0.000649    -0.000466      -0.00195      0.000289      0.000688     -0.00106
                           (0.00121)    (0.00109)      (0.00157)     (0.00158)     (0.00178)    (0.00159)
Dad inc > median           0.000933     -0.00110       0.00109       -0.00144      0.000513     -0.000838
                           (0.00125)    (0.00119)      (0.00164)     (0.00155)     (0.00166)    (0.00167)
Lives with parent          -0.000361    0.000490       -0.000439     0.000354      -0.000117    0.000894
                           (0.000654)   (0.000627)     (0.000792)    (0.000834)    (0.00105)    (0.000984)
Parents non-nordic         0.000109     0.000616       0.00129       -0.000268     -0.000564    0.000966
                           (0.00106)    (0.000990)     (0.00140)     (0.00139)     (0.00141)    (0.00135)
Rurality (cat)             -0.00181     0.000541       -0.00219      -0.00202      -0.00177     0.00285
                           (0.00156)    (0.00152)      (0.00216)     (0.00209)     (0.00240)    (0.00220)
Phys visits age 18 (cat)   -0.00192     0.00175        -0.00466*     0.000811      0.000342     0.00245
                           (0.00214)    (0.00201)      (0.00252)     (0.00251)     (0.00287)    (0.00282)

Note: Balance tests. Each table row represents an outcome variable and each cell shows a sharp diff-in-disc estimate contrasting the changes at age 20 for the cohort turning 20 in 2017 (2018) and the pre-DCT cohort. Robust standard errors clustered by individual in parentheses. * P<0.1, ** P<0.05, *** P<0.01.

Appendix E First stage sensitivity to bandwidths

Table E.1 displays first stage results, i.e. the sharp difference-in-discontinuity where the outcome is online consultations, for a selection of bandwidths ranging from one month to one year. All estimates indicate a drop at the 20th birthday in both 2017 and 2018, and all but one (men in 2017 at the 90-day bandwidth) are statistically significant at the 5% level. The estimates for bandwidths of 60 to 365 days are also very similar in size to the main estimates. The estimates from the shortest bandwidth of 30 days deviate to some extent (being more negative in 2017 and less negative in 2018), but are still qualitatively the same.

Complementary to these results, Tables E.2 and E.3 show estimates with fixed bandwidths of 365 and 180 days before and after the 20th birthday, respectively. Here, each panel after the first panel’s baseline specification corresponds to a donut estimation: two weeks on either side of the 20th birthday are excluded first, and each subsequent panel excludes two more weeks on either side of the threshold, up to 10 and 14 weeks for the 180- and 365-day outer bandwidths, respectively.

The results are very stable across the donut specifications for the 365-day bandwidth. With the 180-day bandwidth, the coefficient decreases in magnitude as the excluded time period grows longer, but it is at most 10% smaller than the main estimate for donuts of 2-8 weeks. The estimate decreases more when we exclude 10 weeks on either side of the threshold, but the decrease may also stem from noise due to the reduction in sample size (70 of the 180 days on each side are dropped).
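To make the donut construction concrete, the following Python sketch selects the days that enter a donut estimation. It is an illustration only: the function name `donut_days` and the convention that day 0 belongs to the post-birthday side are assumptions, not details taken from the estimation code behind Tables E.2 and E.3.

```python
def donut_days(bandwidth, weeks_excluded):
    """Return (left, right) lists of days relative to the 20th birthday.

    Assumed conventions (not from the paper): day 0 is the first day on
    the post-birthday side, and the donut removes 7 * weeks_excluded
    days on each side of the threshold.
    """
    hole = 7 * weeks_excluded
    # Days before the birthday that lie outside the excluded window.
    left = [d for d in range(-bandwidth, 0) if d < -hole]
    # Days from the birthday onwards that lie outside the excluded window.
    right = [d for d in range(0, bandwidth) if d >= hole]
    return left, right


left, right = donut_days(bandwidth=180, weeks_excluded=10)
print(len(left), len(right))
```

Under these conventions, a 180-day bandwidth with a 10-week donut keeps 110 of the 180 days on each side, which is the sample reduction the text refers to.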
Table E.1: First stage over fixed bandwidths

          2017                                                            2018
          All                  Men                   Women                All                  Men                   Women
BW 30     -0.0738              -0.0382               -0.112               -0.107               -0.0460               -0.172
          (-0.0993, -0.0484)   (-0.0600, -0.0163)    (-0.155, -0.0689)    (-0.142, -0.0708)    (-0.0843, -0.00773)   (-0.250, -0.0953)
F-stat    32.40                11.71                 25.98                34.00                5.550                 19.17
BW 60     -0.0561              -0.0273               -0.0870              -0.144               -0.0693               -0.226
          (-0.0751, -0.0372)   (-0.0450, -0.00956)   (-0.119, -0.0549)    (-0.174, -0.115)     (-0.0998, -0.0389)    (-0.286, -0.166)
F-stat    33.60                9.109                 28.24                93.51                19.94                 55.21
BW 90     -0.0491              -0.0147               -0.0857              -0.151               -0.0712               -0.238
          (-0.0653, -0.0328)   (-0.0300, 0.000534)   (-0.112, -0.0591)    (-0.177, -0.125)     (-0.0969, -0.0454)    (-0.289, -0.186)
F-stat    34.92                3.578                 39.96                126.2                29.36                 81.67
BW 120    -0.0499              -0.0195               -0.0824              -0.149               -0.0700               -0.234
          (-0.0637, -0.0361)   (-0.0332, -0.00575)   (-0.105, -0.0600)    (-0.172, -0.126)     (-0.0923, -0.0477)    (-0.280, -0.189)
F-stat    49.99                7.728                 51.92                163.8                37.76                 103.4
BW 180    -0.0482              -0.0145               -0.0840              -0.141               -0.0646               -0.224
          (-0.0599, -0.0365)   (-0.0263, -0.00270)   (-0.103, -0.0651)    (-0.160, -0.122)     (-0.0830, -0.0462)    (-0.262, -0.187)
F-stat    64.94                5.793                 75.70                207.8                47.23                 137.4
BW 240    -0.0539              -0.0268               -0.0828              -0.134               -0.0622               -0.213
          (-0.0639, -0.0439)   (-0.0375, -0.0161)    (-0.0991, -0.0666)   (-0.151, -0.117)     (-0.0783, -0.0460)    (-0.246, -0.180)
F-stat    111.9                24.18                 99.95                242.7                56.73                 162.9
BW 365    -0.0564              -0.0318               -0.0826              -0.138               -0.0523               -0.231
          (-0.0644, -0.0485)   (-0.0404, -0.0232)    (-0.0957, -0.0695)   (-0.152, -0.123)     (-0.0658, -0.0388)    (-0.259, -0.204)
F-stat    195.2                53.01                 152.0                348.2                57.53                 279.8

Note: The table displays first stage coefficient estimates and confidence intervals, where the outcome is online consultations. 95% confidence intervals in parentheses based on robust standard errors clustered by running variable with separate clusters for pre and post period. 2017 and 2018 refer to the post-period used.
Table E.2: Long donut: First stage with 365 bandwidth

            All                Men                  Women
0 WEEKS EXCLUDED
FS coeff    -0.138             -0.0523              -0.231
            (-0.152, -0.123)   (-0.0658, -0.0388)   (-0.259, -0.204)
F-stat      348.2              57.53                279.8
2 WEEKS EXCLUDED
FS coeff    -0.141             -0.0522              -0.238
            (-0.157, -0.124)   (-0.0670, -0.0373)   (-0.267, -0.208)
F-stat      284.7              47.40                243.6
4 WEEKS EXCLUDED
FS coeff    -0.135             -0.0481              -0.229
            (-0.152, -0.117)   (-0.0644, -0.0318)   (-0.262, -0.196)
F-stat      218.4              33.37                185.2
6 WEEKS EXCLUDED
FS coeff    -0.132             -0.0433              -0.229
            (-0.152, -0.112)   (-0.0609, -0.0257)   (-0.265, -0.194)
F-stat      174.8              23.20                160.4
8 WEEKS EXCLUDED
FS coeff    -0.134             -0.0426              -0.235
            (-0.155, -0.113)   (-0.0621, -0.0231)   (-0.273, -0.197)
F-stat      152.3              18.38                147.8
10 WEEKS EXCLUDED
FS coeff    -0.131             -0.0375              -0.235
            (-0.155, -0.107)   (-0.0585, -0.0166)   (-0.277, -0.192)
F-stat      116.5              12.33                115.3
12 WEEKS EXCLUDED
FS coeff    -0.132             -0.0345              -0.240
            (-0.159, -0.106)   (-0.0582, -0.0109)   (-0.287, -0.193)
F-stat      97.10              8.211                99.54
14 WEEKS EXCLUDED
FS coeff    -0.143             -0.0371              -0.260
            (-0.172, -0.114)   (-0.0628, -0.0113)   (-0.311, -0.209)
F-stat      93.26              7.974                98.10

Note: The table displays first stage sharp diff-in-disc estimates and confidence intervals, where the outcome is online consultations. The bandwidth is set to 365 days on either side of the 20th birthday. Starting with a baseline of no donut, the consecutive panels exclude two weeks more than the prior panel on either side of the threshold (day 0). 95% confidence intervals in parentheses based on robust standard errors clustered by running variable with separate clusters for pre and post period.
Table E.3: Long donut: First stage with 180 bandwidth

            All                 Men                   Women
0 WEEKS EXCLUDED
FS coeff    -0.141              -0.0646               -0.224
            (-0.160, -0.122)    (-0.0830, -0.0462)    (-0.262, -0.187)
F-stat      207.8               47.23                 137.4
2 WEEKS EXCLUDED
FS coeff    -0.148              -0.0687               -0.235
            (-0.172, -0.124)    (-0.0908, -0.0466)    (-0.280, -0.190)
F-stat      146.7               37.17                 105.1
4 WEEKS EXCLUDED
FS coeff    -0.135              -0.0643               -0.212
            (-0.164, -0.106)    (-0.0911, -0.0374)    (-0.268, -0.156)
F-stat      84.48               22.07                 55.23
6 WEEKS EXCLUDED
FS coeff    -0.126              -0.0560               -0.203
            (-0.161, -0.0905)   (-0.0878, -0.0242)    (-0.268, -0.137)
F-stat      48.94               11.90                 36.25
8 WEEKS EXCLUDED
FS coeff    -0.128              -0.0595               -0.205
            (-0.171, -0.0860)   (-0.0996, -0.0194)    (-0.280, -0.129)
F-stat      35.34               8.464                 28.23
10 WEEKS EXCLUDED
FS coeff    -0.110              -0.0458               -0.181
            (-0.165, -0.0546)   (-0.0935, 0.00187)    (-0.282, -0.0805)
F-stat      15.20               3.546                 12.42

Note: The table displays first stage sharp diff-in-disc estimates and confidence intervals, where the outcome is online consultations. The bandwidth is set to 180 days on either side of the 20th birthday. Starting with a baseline of no donut, the consecutive panels exclude two weeks more than the prior panel on either side of the threshold (day 0). 95% confidence intervals in parentheses based on robust standard errors clustered by running variable with separate clusters for pre and post period.

Appendix F Main results for 2017 cohorts

Table F.1 presents diff-in-disc estimates equivalent to the main results in Table 3 but compares 2017 (instead of 2018) to the pre-DCT periods. The estimates confirm the results from the graphical analysis in section 5.3. The first stage estimates are generally smaller and weaker compared to 2018. The sharp diff-in-disc for the full sample provides no indication that the onset of the DCT user fee increased consumption of in-person visits. The fuzzy diff-in-disc estimate is positive but insignificant.
Table F.1: Fuzzy diff-in-disc, different opt bandwidths

                       OPT BW EACH SAMPLE                       OPT BW, COMBINED SAMPLE
                       All           Men          Women         Men          Women
A. FIRST STAGE (SHARP DIFF-IN-DISC)
Online consultations   -0.0477***    -0.0171**    -0.0819***    -0.0171**    -0.0803***
                       (0.00792)     (0.00712)    (0.0122)      (0.00739)    (0.0131)
B. RF (SHARP DIFF-IN-DISC)
In-person visits       -0.0216       0.0152       -0.0355       -0.00973     -0.0340
                       (0.0379)      (0.0445)     (0.0574)      (0.0426)     (0.0616)
C. IV (FUZZY DIFF-IN-DISC)
In-person visits       0.453         -0.891       0.434         0.568        0.423
                       (0.791)       (2.598)      (0.701)       (2.499)      (0.763)
Total bw pre           235           151          230           235          235
Total bw post          198           202          234           198          198
Avg. ind/day pre       151589        78183        73547         78168        73421
Avg. ind/day post      31907         16439        15476         16462        15445

Note: Variable names in left column represent the dependent variable. In the first three columns, each model uses MSE-optimal bandwidths for In-person visits for the respective estimation sample (All, Men, Women), with separate bandwidths for the pre-DCT cohorts and the post DCT cohort (2017), and varying bandwidths on the left- and righthand side of the age cutoff. In the two columns to the right, each model uses the optimal bandwidth for the total sample in column 1. Bandwidth refers to extent of inclusion of values of running variable i.e. days to/since 20th birthday. Total bw = sum of bandwidths on left- and right side of age cutoff. Avg. ind/day = average number of individuals per day in cells, shown separately for the pre cohorts (which include several birth cohorts) and the post cohort (which only includes individuals turning 20 in 2017). Estimates using data collapsed by region, gender, birth year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.
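As a quick consistency check on Table F.1, the fuzzy diff-in-disc estimate in panel C should equal the reduced-form coefficient in panel B divided by the first-stage coefficient in panel A. The Python snippet below recomputes that ratio from the rounded coefficients printed in the first three columns of the table. The helper name `wald_ratio` is ours, and small deviations from panel C are to be expected because the inputs are rounded.

```python
def wald_ratio(reduced_form, first_stage):
    """Fuzzy (IV) estimate as reduced form divided by first stage."""
    return reduced_form / first_stage


# (first stage, reduced form, reported IV), copied from Table F.1,
# columns "OPT BW EACH SAMPLE".
table_f1 = {
    "All": (-0.0477, -0.0216, 0.453),
    "Men": (-0.0171, 0.0152, -0.891),
    "Women": (-0.0819, -0.0355, 0.434),
}

for sample, (fs, rf, reported) in table_f1.items():
    implied = wald_ratio(rf, fs)
    print(f"{sample}: implied {implied:.3f}, reported {reported:.3f}")
```

The same check applies to any table that reports first stage, reduced form and IV estimates for the same sample and bandwidths.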
Appendix G Robustness

G.1 Sensitivity to inclusion of pre-periods of varying length

Figure G.1 plots annual in-person physician visits per capita by the day relative to the 20th birthday (i.e., the DCT user fee threshold) for six time periods, each corresponding to one year. The first four sub-graphs show each of the four years preceding the introduction of DCT services in mid-2016. The last two sub-graphs present figures for 2017 and 2018 (when the DCT market emerged).

In the pre-period, the 20th birthday was associated with a drop in the number of in-person visits. These graphs show that the drop at the 20th birthday was stable during the pre-period (although somewhat smaller in the first year, July 2012 to June 2013). The drop in 2017 is similar to the years before the DCT emerged, while in 2018 we no longer observe a drop at the 20th birthday.

The thin black lines on each side of the user fee threshold reflect how the data portrayed in the figure are used to estimate the diff-in-disc estimates. The lengths of the lines, which we allow to differ on either side of the threshold, are the optimal bandwidths chosen by the cross-validation process suggested by Cattaneo, Idrobo and Titiunik (2019). In the periods before June 2016, the length of these lines corresponds to the bandwidths used in the main estimations (i.e. the MSE-optimal bandwidth for the full pre-period).

[Figure G.1, six panels: 2012 Jul - 2013 Jun, 2013 Jul - 2014 Jun, 2014 Jul - 2015 Jun, 2015 Jul - 2016 Jun, 2017, 2018. Vertical axis: In-person visits (.8 to 1.2). Horizontal axis: -400 to 400.]

Figure G.1: Physician visits over time, separate graph for each pre-period

Table G.1 presents fuzzy diff-in-disc estimates for different lengths of the pre-period.
The first row presents estimates using only the last 12 months before June 30, 2016. Each row then adds another 12 months to the pre-period. The last row presents the estimates from the main results in Table 3. For correspondence with the main results, bandwidths are based on the sample using the full pre-period. The first three columns use MSE-optimal bandwidths for in-person visits for the corresponding estimation sample (All, Men, Women). In the two columns to the right, each model uses the optimal bandwidth for the total sample in column 1. Overall, the results are similar across various lengths of the pre-period. Compared to our main estimate, which includes all pre-periods, we would obtain a slightly higher degree of substitution if we excluded the first period (for which the observed drop at the 20th birthday is smaller). The estimated degree of substitution is noticeably higher if we remove all but the last year in the pre-period (-.571, with a 95% confidence interval covering -1). We conclude that our main specification, which relies on more data points, yields a conservative estimate of the degree of substitution.

Table G.1: Fuzzy diff-in-disc, different pre-periods

                      OPT BW EACH SAMPLE                    OPT BW COMBINED SAMPLE
                      All          Men        Women         Men         Women
2015 Jul - 2016 Jun   -0.571**     -0.140     -0.386        -0.621      -0.548**
                      (0.250)      (0.669)    (0.276)       (0.611)     (0.273)
2014 Jul - 2016 Jun   -0.506**     -0.637     -0.219        -0.934      -0.361
                      (0.218)      (0.623)    (0.222)       (0.576)     (0.222)
2013 Jul - 2016 Jun   -0.506**     -0.496     -0.260        -0.974*     -0.350*
                      (0.204)      (0.603)    (0.209)       (0.560)     (0.207)
2012 Jul - 2016 Jun   -0.453**     -0.461     -0.184        -0.889      -0.305
                      (0.206)      (0.596)    (0.210)       (0.550)     (0.208)
Total bw pre          235          151        230           235         235
Total bw post         217          158        181           217         217
Avg. ind/day pre      151589       78183      73547         78168       73421
Avg. ind/day post     31483        16345      15207         16299       15184

Note: The table presents fuzzy diff-in-disc estimates using various lengths of the pre-period.
The first row presents estimates using only the last 12 months before June 30, 2016. Each row then adds another 12 months to the pre-period. The last row presents the estimates from the main results in Table 3. For correspondence with the main results, bandwidths are based on the sample using the full pre-period. The first three columns use MSE-optimal bandwidths for In-person visits for the corresponding estimation sample (All, Men, Women). In the two columns to the right, each model uses the optimal bandwidth for the total sample in column 1. Total bw = sum of bandwidths on left- and right side of age cutoff. Avg. ind/day = average number of individuals per day in cells, shown separately for the pre cohorts (which include several birth cohorts) and the post cohort (which only includes individuals turning 20 in 2018). Estimates using data collapsed by region, gender, birth year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

G.2 Sensitivity to secular decrease of in-person visits

The graphs in Figure 3 suggest that there is a general decrease in the number of in-person visits between the pre-period and the post-period, also among individuals above 20 years old. The graphs indicate that this decrease is about 5 to 10%. Not accounting for this overall reduction may bias the diff-in-disc estimates, as it is possible that this secular decrease would have reduced the discontinuity at age 20 even in the absence of the DCT user fee. In section 4, we presented the diff-in-disc estimand as

$$\theta = \frac{\tau_y}{\tau_{DCT}} = \frac{(Y_1^+ - Y_1^-) - (Y_0^+ - Y_0^-)}{(DCT_1^+ - DCT_1^-) - (DCT_0^+ - DCT_0^-)} \tag{6}$$

To identify the degree of substitution, this estimand relies on the assumption that any effects of confounding treatments are time-invariant.
Millán-Quijano (2020) suggests an estimand that relaxes this assumption and that in our context corresponds to

$$\theta^M = \frac{\tau_y}{\tau_{DCT}} = \frac{(Y_1^+ - Y_1^-) - (1-\gamma)(Y_0^+ - Y_0^-)}{(DCT_1^+ - DCT_1^-) - (DCT_0^+ - DCT_0^-)} \tag{7}$$

where $\gamma$ is equal to the overall (proportional) reduction in the in-person consultations between the two periods. Subtracting $\theta$ from $\theta^M$ (i.e. equation 6 from equation 7) yields an expression for the bias of the standard diff-in-disc estimand under these circumstances:

$$bias = \gamma \, \frac{(Y_0^+ - Y_0^-)}{\tau_{DCT}}. \tag{8}$$

Thus, in order to get an estimate of the bias, we replace $\tau_{DCT}$ and $(Y_0^+ - Y_0^-)$ by estimates from the first stage regression and a standard RD model in the pre-period. Assuming that $\gamma$ is equal to the proportional decrease in the in-person consultations among individuals just above 20 (i.e. $(Y_1^+ - Y_0^+)/Y_0^+$), we can use the coefficients from our reduced form equation to compute $\gamma$. This exercise yields an estimate of the bias of the main results of .03, which corresponds to an overestimation of the degree of substitution of about 6%. The estimated bias for results split by gender is of similar size. Thus, we do not consider the general decrease in the number of in-person consultations to be a major concern for our conclusions.

G.3 Fuzzy RD for Stockholm

The diff-in-disc assumes that any confounding policy has time-invariant effects. An alternative estimation strategy that relaxes that assumption would be to focus exclusively on Region Stockholm, where there was no confounding change of in-person visit fees at age 20. Assuming there are no other confounding policies at age 20, we can then use a standard fuzzy RD specification (instead of a diff-in-disc). Table G.2 shows that such a specification yields similar estimates at bandwidths of 90 days or more (including when we use an MSE-optimal bandwidth for the relevant estimation sample).
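The standard fuzzy RD referred to here can be illustrated with a deliberately simplified Python sketch: fit a straight line on each side of the cutoff within a fixed bandwidth, take the jump in the outcome and the jump in the treatment at day 0, and divide. Everything in the block is an assumption made for illustration (the function names, a uniform kernel, a symmetric fixed bandwidth, and noise-free synthetic data built so that the true ratio is -0.4). It omits the MSE-optimal bandwidth selection and the clustered standard errors used for Table G.2.

```python
def intercept_at_zero(xs, ys):
    """OLS intercept of y on x, i.e. the fitted value at the cutoff x = 0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x


def jump_at_cutoff(days, values, bandwidth):
    """Right-side minus left-side fitted value at day 0 (uniform kernel)."""
    left = [(d, v) for d, v in zip(days, values) if -bandwidth <= d < 0]
    right = [(d, v) for d, v in zip(days, values) if 0 <= d < bandwidth]
    return intercept_at_zero(*zip(*right)) - intercept_at_zero(*zip(*left))


def fuzzy_rd(days, treatment, outcome, bandwidth):
    """Wald ratio: jump in the outcome divided by jump in the treatment."""
    return (jump_at_cutoff(days, outcome, bandwidth)
            / jump_at_cutoff(days, treatment, bandwidth))


# Synthetic, noise-free data: treatment drops by 0.15 at day 0 and each
# unit of treatment lowers the outcome by 0.4 (hypothetical numbers).
days = list(range(-120, 120))
treatment = [0.30 + 0.0002 * d - 0.15 * (d >= 0) for d in days]
outcome = [1.0 + 0.001 * d - 0.4 * t for d, t in zip(days, treatment)]

estimate = fuzzy_rd(days, treatment, outcome, bandwidth=90)
print(round(estimate, 3))
```

Because the synthetic data are exactly linear on each side of the cutoff, the sketch recovers the ratio built into the data. With real data the estimate would depend on the bandwidth, as the rows of Table G.2 show.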
Table G.2: Fuzzy Regression Discontinuity, Sthlm 2018

             All         Men        Women
BW 30        0.139       -1.186     0.739
             (0.650)     (1.212)    (0.641)
KP F-stat    16.05       6.038      8.757
BW 60        0.0844      -0.0211    0.123
             (0.298)     (0.736)    (0.279)
KP F-stat    60.79       13.32      39.86
BW 90        -0.299      -0.528     -0.221
             (0.258)     (0.673)    (0.251)
KP F-stat    78.90       16.65      54.03
BW 120       -0.480**    -0.697     -0.398*
             (0.244)     (0.614)    (0.239)
KP F-stat    84.64       21.59      57.11
BW 365       -0.282*     -0.531     -0.199
             (0.155)     (0.504)    (0.139)
KP F-stat    191.3       29.29      169.6
Opt BW       -0.414*     -0.626     -0.157
             (0.243)     (0.691)    (0.239)
Left bw      115         78         80
Right bw     110         98         105
KP F-stat    86.70       16.62      51.63

Note: The table presents fuzzy regression discontinuity estimates for Stockholm in 2018 (that is, no differencing across cohorts as in our main specifications). The bandwidth is fixed for all but the last panel, where the bandwidth is chosen flexibly on either side of the threshold. ’KP F-stat’ refers to the Kleibergen-Paap F-statistic. Standard errors clustered by the running variable. * P<0.1, ** P<0.05, *** P<0.01.

G.4 Main specification on individual-level data

Table G.3 shows the main specification estimated on individual-level daily data and Table G.4 shows the corresponding optimal bandwidths. The individual-level data have not been transformed to an annual basis, i.e., the coefficients in panels A and B measure the change in the number of visits per day. For comparison with the estimates from collapsed data, these coefficients should be multiplied by 365. For the full sample, this implies a first stage coefficient of -.149 and a reduced form coefficient of .079. (Any differences to the estimates using the collapsed data are due to the differences in the bandwidths.) The Wald estimate in panel C, i.e. the ratio of the estimate in panel B to the estimate in panel A, is directly comparable with the corresponding estimate for the collapsed data (multiplying both the numerator and the denominator by 365 does not affect the ratio), and is only slightly larger (due to the larger reduced form estimate). Just as for the collapsed data, the individual-level data give us a 95% confidence interval that excludes both 0 and -1 (-.979 to -.082), i.e. we may rule out both zero and complete substitution.

Table G.3: Fuzzy diff-in-disc results; individual-level data

                       All             Men             Women
A. FIRST STAGE (SHARP DIFF-IN-DISC)
Online consultations   -0.000409***    -0.000203***    -0.000642***
                       (0.0000413)     (0.0000431)     (0.0000791)
B. RF (SHARP DIFF-IN-DISC)
In-person visits       0.000217**      0.000136        0.000164
                       (0.0000902)     (0.000130)      (0.000155)
C. IV (FUZZY DIFF-IN-DISC)
In-person visits       -0.531**        -0.672          -0.255
                       (0.229)         (0.659)         (0.244)
Observations           38711023        14316737        17477450
Individuals            227265          110600          108768

Note: The content in this table mirrors the results in Table 3 on collapsed data, but here estimations are based on individual-level daily data. Variable names in left column represent the dependent variable. Each model uses the MSE-optimal bandwidth for In-person visits for the respective estimation sample (All, Men, Women). Standard errors in parentheses clustered by individual. * P<0.1, ** P<0.05, *** P<0.01.

Table G.4: Opt bandwidths for individual-level estimations, main specification

                   BANDWIDTH
                   Before 20th Birthday   After 20th Birthday
Both    Pre-DCT    118                    91
        2018       116                    115
Men     Pre-DCT    72                     77
        2018       92                     71
Women   Pre-DCT    89                     107
        2018       119                    79

Note: Optimal bandwidths for y=in-person visits before and after 20th birthday using individual-level data. Separate bandwidth estimations for pre-DCT and 2018 birth cohorts.

G.5 Wild bootstrapping precision adjustment

In this section we examine the sensitivity of our results to the estimation of standard errors.
A recent literature discusses methods to obtain bias-corrected estimates and robust confidence intervals for standard RD settings with data-driven bandwidth choices (Calonico, Cattaneo and Titiunik, 2014; He and Bartalotti, 2020). This literature has not yet developed methods adapted to a diff-in-disc setting, so we modify the wild bootstrap procedure developed (and thoroughly described) in He and Bartalotti (2020) to fit such a setting.

In short, He and Bartalotti (2020) estimate the bias from choosing an optimal bandwidth, h, using a higher-order polynomial for a longer bandwidth, b. For a given set of h and b, the procedure consists of two algorithms (both h and b are allowed to vary on each side of the threshold). The first algorithm estimates the bias, and the second algorithm estimates the distribution. Both algorithms rely on a higher-order polynomial with bandwidth b mimicking the data generating process, and on the estimation of linear polynomials with bandwidth h on a dataset obtained from that data generating process.

Using the notation from our study setting, the procedure in He and Bartalotti (2020) estimates $Z_c^+$ and $Z_c^-$ for $Z \in \{DCT, y\}$ and a single cohort, c, in order to obtain the bias and distribution of a fuzzy RD,

$$\theta^{RD} = \frac{y_c^+ - y_c^-}{DCT_c^+ - DCT_c^-}.$$

By contrast, our modified procedure estimates $Z_c^+$ and $Z_c^-$ for $Z \in \{DCT, y\}$ and $c \in \{0, 1\}$ in order to obtain the bias and distribution of

$$\theta = \frac{\tau_y}{\tau_{DCT}} = \frac{(y_1^+ - y_1^-) - (y_0^+ - y_0^-)}{(DCT_1^+ - DCT_1^-) - (DCT_0^+ - DCT_0^-)} \qquad (9)$$

The first three rows in each of the three panels of Table G.5 reproduce the coefficients, standard errors and 95% confidence intervals obtained in our main specification using the MSE-optimal bandwidth for the relevant estimation sample (see Table 3). The table further displays the bias-corrected estimates and confidence intervals obtained from the bootstrap procedure.

The bias-corrected bootstrap results are overall in line with the main results.
The bias-corrected coefficient for the fuzzy diff-in-disc for the full sample equals -.51, compared to the main estimate of -.45. The bootstrapped confidence interval is only slightly broader, and excludes both zero and minus one. When splitting the sample by gender, we observe that the bias-corrected coefficients are larger in absolute terms for women, but smaller for men, compared to the corresponding standard coefficients. While the bootstrapped confidence intervals are broader, they lead to the same conclusions as the standard ones.

Notably, both the bias-corrected coefficients and the bootstrapped confidence intervals for the first stage are very similar to the standard coefficients and confidence intervals. Thus, the difference in the fuzzy diff-in-disc comes from the bias-correction of the reduced form.

Table G.5: Bootstrapped standard errors comparison

                        All                    Men                    Women
FIRST STAGE
Coefficient             -0.149                 -0.0772                -0.238
SE                      (0.0119)               (0.0143)               (0.0252)
CI                      (-0.172, -0.125)       (-0.105, -0.0493)      (-0.287, -0.189)
Bias Corrected Coeff    -0.151                 -0.0828                -0.245
Bootstrapped CI         [-0.176, -0.128]       [-0.119, -0.0622]      [-0.301, -0.201]
F-stat                  154.8                  29.31                  89.28
REDUCED FORM
Coefficient             0.0673                 0.0356                 0.0437
SE                      (0.0307)               (0.0464)               (0.0500)
CI                      (0.00704, 0.128)       (-0.0553, 0.126)       (-0.0543, 0.142)
Bias Corrected Coeff    0.0767                 0.0293                 0.0617
Bootstrapped CI         [0.0185, 0.148]        [-0.0731, 0.115]       [-0.0244, 0.180]
IV
Coefficient             -0.453                 -0.461                 -0.184
SE                      (0.206)                (0.596)                (0.210)
CI                      (-0.857, -0.0496)      (-1.629, 0.708)        (-0.595, 0.227)
Bias Corrected Coeff    -0.510                 -0.272                 -0.248
Bootstrapped CI         [-0.989, -0.0800]      [-1.599, 1.317]        [-0.718, 0.112]
F-stat                  35.01                  6.656                  24.30

Note: The table shows IV results with the sample split by gender and two sets of coefficients, standard errors and confidence intervals. Each model uses the MSE-optimal bandwidth for In-person visits, varying by sex; see the bandwidths in the first three columns of Table 3. In each panel, the first three rows (Coefficient, SE, CI) represent the results as we have presented them so far, with the standard way of calculating standard errors. The following two rows in each panel (Bias Corrected Coeff, Bootstrapped CI) present the same coefficient corrected for potential bias, and the corresponding bootstrapped confidence intervals.

G.6 Local randomisation estimates

In this section we examine another potential avenue to increase precision. Recalling our discussion of the regression lines in Figure 3 (see Section 5.3), we consider changing our modelling assumptions to better fit the structure of the data. In particular, we change the modelling framework from the standard continuity-based (CB) framework – which assumes and models a continuous relationship between the running and the outcome variables – to the so-called local randomisation (LR) framework (e.g., Branson and Mealli, 2018).

The LR framework relies on the intuition that usually motivates the RD design, i.e., that units close to the threshold can be thought of as "as good as randomly assigned" over values of the running variable. With local randomisation, this random assignment is assumed to hold not only at the threshold, as with the standard RD design, but within a window (bandwidth) on either side of the threshold. Under this assumption, the researcher can use the standard toolkit available for analysing randomised experiments (including instrumental variables techniques to deal with non-compliance) to estimate causal effects.

Adopting the LR framework thus essentially means that we restrict the coefficients on functions of the continuous running variable to zero. While this is a strong assumption, we think it might be plausible for two reasons. One reason is that, as seen in Section 5.3, the lack of a trend over the range of the running variable supports the notion that there is no sorting with respect to age at the daily level.
Our second motivation for adopting the LR framework is that the as- good-as-random assumption can be interpreted as saying that people are not systematically different on each side of the age threshold. This appears plau- sible as long as we use relatively short bandwidths around the 20th birthday. The larger window – or bandwidth – one uses to estimate, the stronger this key assumption becomes (Branson and Mealli, 2018). Table G.6 shows both standard CB and LR specifications for the whole sam- ple and by gender, using different bandwidths. Looking at the results for both genders (column All), the LR estimates hover around the estimate from our pre- ferred specification (Table 3, with an estimate of -.453 and a s.e. of .206). For the very shortest bandwidth, the confidence interval covers 0 but not -1, whereas the confidence interval for the next shortest bandwidth covers -1 but not 0. For longer bandwidths, the confidence intervals cover neither -1 nor 0, just as in the main specification. With bandwidths of 180 or wider, LR estimates are much more precise than the main estimate. For instance, the s.e. of the estimate with a bandwidth of 180 is .15. 
Nonetheless, the confidence intervals are still somewhat wide.

When dividing by gender, the LR estimates with very long bandwidths indicate that there is some substitution going on for both men and women, and potentially more so for men.

Table G.6: Continuity-based and local randomisation models

                 ALL                                     MEN                                     WOMEN
                 CB                  LR                  CB                  LR                  CB                  LR
BW 30            -0.797              -0.337              -3.687              -0.877              0.0434              -0.180
                 (-2.057, 0.463)     (-0.809, 0.136)     (-7.320, -0.0536)   (-2.292, 0.539)     (-1.158, 1.245)     (-0.646, 0.286)
  K-P F-stat     34.00               72.35               5.550               14.53               19.17               58.49
  N              1200                5982010             600                 3081349             600                 2900661
BW 60            -0.228              -0.647              -0.753              -1.268              -0.0497             -0.449
                 (-0.827, 0.371)     (-1.101, -0.192)    (-2.222, 0.717)     (-2.619, 0.0835)    (-0.645, 0.546)     (-0.889, -0.00946)
  K-P F-stat     93.51               79.66               19.94               18.57               55.21               62.24
  N              2400                11965070            1200                6162953             1200                5802117
BW 90            -0.440              -0.564              -0.965              -1.046              -0.266              -0.400
                 (-0.884, 0.00345)   (-0.953, -0.175)    (-2.144, 0.214)     (-2.155, 0.0622)    (-0.713, 0.180)     (-0.779, -0.0196)
  K-P F-stat     126.2               110.3               29.36               25.95               81.67               86.16
  N              3600                17950218            1800                9246606             1800                8703612
BW 120           -0.450              -0.438              -1.026              -0.850              -0.260              -0.305
                 (-0.842, -0.0576)   (-0.785, -0.0897)   (-2.102, 0.0496)    (-1.926, 0.226)     (-0.645, 0.124)     (-0.638, 0.0282)
  K-P F-stat     163.8               134.5               37.76               25.65               103.4               112.2
  N              4800                23942202            2400                12332562            2400                11609640
BW 180           -0.371              -0.346              -0.842              -0.699              -0.217              -0.231
                 (-0.709, -0.0332)   (-0.643, -0.0498)   (-1.780, 0.0972)    (-1.622, 0.224)     (-0.551, 0.118)     (-0.514, 0.0531)
  K-P F-stat     207.8               190.3               47.23               34.49               137.4               161.8
  N              7200                35939757            3600                18511303            3600                17428454
BW 240           -0.260              -0.481              -0.644              -1.055              -0.130              -0.315
                 (-0.570, 0.0499)    (-0.757, -0.205)    (-1.504, 0.215)     (-1.993, -0.117)    (-0.435, 0.176)     (-0.572, -0.0571)
  K-P F-stat     242.7               223.0               56.73               36.94               162.9               194.7
  N              9600                47953591            4800                24698701            4800                23254890
BW 360           -0.250              -0.520              -0.665              -0.859              -0.130              -0.401
                 (-0.502, 0.00176)   (-0.765, -0.276)    (-1.514, 0.185)     (-1.578, -0.140)    (-0.363, 0.104)     (-0.637, -0.165)
  K-P F-stat     332.5               276.2               58.43               55.31               266.8               230.5
  N              14400               71921313            7200                37037111            7200                34884202

Note: IV estimates at various bandwidths using continuity-based (CB) or local randomisation (LR) specifications. CB models use a linear polynomial and the LR models use a zero-degree polynomial in the running variable. K-P F-stat is the Kleibergen-Paap F-statistic. N = number of cells in CB specifications (collapsed data) and N = number of individual-days in LR specifications. Standard errors clustered by the running variable (separate clusters for pre- and post cohorts) in CB models and by individual in LR models. 95% confidence intervals in parentheses. * P<0.1, ** P<0.05, *** P<0.01.

Appendix H Regional and urban-rural heterogeneity

H.1 Regional heterogeneity

Figure H.1 plots annual DCT consultations per capita by the day relative to the 20th birthday (i.e., the DCT user fee threshold) for men and women, separated by region, for 2017 and 2018. The average number of consultations is larger in Region Stockholm (green triangles) than in Region Västra Götaland (grey circles) for both years and genders. The drop at the 20th birthday is also more pronounced in Stockholm, at least in 2018. This is also confirmed by the formal analysis in panel A of Table H.1. Figures H.2, H.3 and H.4 plot in-person physician consultations by region for the full sample, men, and women, respectively.

Figure H.1: DCT consultations in the postperiod, by gender and region

Table H.1 presents the estimates corresponding to the graphical analysis in Figure H.1 (panel A) and Figures H.2, H.3 and H.4 (panel B). In contrast to the specification presented in Table 5 in the main text, these estimates are obtained using the optimal bandwidth for the relevant estimation sample. Results are overall similar.
Lastly, Table H.2 presents IV results over regions for a range of fixed bandwidths. For bandwidths not too dissimilar from the optimal bandwidths used before, the confidence intervals of the estimates are qualitatively similar to the results seen in the main paper. See Table 5 also for the average number of individuals used per cell in the pre and post periods, respectively.

Figure H.2: Physician visits over time by region

Figure H.3: Physician visits over time, men by region

Figure H.4: Physician visits over time, women by region

H.2 Urban-rural dimension

In the main text, we present heterogeneity in the urban/rural dimension only for the fuzzy diff-in-disc estimates using the optimal bandwidth from the main estimations. Table H.3 provides complementary information. The first two panels present first stage estimates, where the first panel (A) corresponds to the optimal bandwidth used in the main paper, and the second panel (B) to the optimal bandwidth as specified by each sex-region sample. The last panel (C) presents the fuzzy diff-in-disc results from an analysis splitting the sample by region and urban/rural dimension, using the optimal bandwidth for the relevant estimation samples. Results are similar to the results in Table 6, where we use the bandwidths from the main estimation. See Table 6 also for the average number of individuals used per cell in the pre and post periods, respectively.

As complementary information we also provide Table H.4, which shows estimates across the urban dimension over a range of fixed bandwidths. For the range of bandwidths not far from the optimal bandwidth, we find similar results as in the main paper.

Table H.1: Regional heterogeneity

             All            Men            Women
A. FIRST STAGE (SHARP DIFF-IN-DISC)
Both         -0.149***      -0.0772***     -0.238***
             (0.0119)       (0.0143)       (0.0252)
Sthlm        -0.162***      -0.0793***     -0.266***
             (0.0177)       (0.0190)       (0.0378)
VGR          -0.126***      -0.0591***     -0.183***
             (0.0191)       (0.0180)       (0.0382)
B. RF (SHARP DIFF-IN-DISC)
Both         0.0673**       0.0356         0.0437
             (0.0307)       (0.0464)       (0.0500)
Sthlm        0.121***       0.112*         0.0833
             (0.0450)       (0.0597)       (0.0752)
VGR          0.000813       0.0216         0.0162
             (0.0507)       (0.0664)       (0.0841)
C. IV (FUZZY DIFF-IN-DISC)
Both         -0.453**       -0.461         -0.184
             (0.206)        (0.596)        (0.210)
Sthlm        -0.744***      -1.408*        -0.313
             (0.286)        (0.794)        (0.283)
VGR          -0.00646       -0.364         -0.0887
             (0.402)        (1.131)        (0.456)

Note: The table shows estimates for both regions combined (Both), Region Stockholm (Sthlm) and Region Västra Götaland (VGR). Each model uses the MSE-optimal bandwidths for In-person visits for the respective estimation sample. Estimates using data collapsed by region, gender, birth year and day relative to 20th birthday. Standard errors in parentheses, clustered by the running variable with separate clusters for pre- and post cohorts. * P<0.1, ** P<0.05, *** P<0.01. Variable names in the left column represent the dependent variable.

Table H.2: Regional heterogeneity over bandwidths: IV

             All            Men            Women
BANDWIDTH 60
Both         -0.228         -0.753         -0.0497
             (0.306)        (0.750)        (0.304)
Sthlm        -0.274         -0.860         -0.0746
             (0.347)        (0.831)        (0.341)
VGR          -0.119         -0.526         0.0192
             (0.680)        (1.660)        (0.634)
BANDWIDTH 90
Both         -0.440*        -0.965         -0.266
             (0.226)        (0.601)        (0.228)
Sthlm        -0.633**       -1.266*        -0.422
             (0.305)        (0.769)        (0.304)
VGR          -0.112         -0.482         0.0101
             (0.433)        (1.080)        (0.411)
BANDWIDTH 120
Both         -0.450**       -1.026*        -0.260
             (0.200)        (0.549)        (0.196)
Sthlm        -0.767***      -1.384**       -0.543*
             (0.288)        (0.706)        (0.287)
VGR          0.0137         -0.440         0.145
             (0.343)        (0.974)        (0.313)
BANDWIDTH 180
Both         -0.371**       -0.842*        -0.217
             (0.172)        (0.479)        (0.171)
Sthlm        -0.671***      -1.132*        -0.502**
             (0.249)        (0.608)        (0.244)
VGR          0.0719         -0.363         0.197
             (0.290)        (0.871)        (0.275)
BANDWIDTH 365
Both         -0.260**       -0.707         -0.135
             (0.126)        (0.437)        (0.116)
Sthlm        -0.370**       -0.919*        -0.205
             (0.174)        (0.544)        (0.159)
VGR          -0.0911        -0.381         -0.0197
             (0.209)        (0.738)        (0.192)

Note: IV (fuzzy diff-in-disc) estimates in a set of panels, each with a different fixed bandwidth. Robust standard errors in parentheses, clustered on relative age with separate clusters for pre and post period. Estimates for both regions combined (Both) and for the regions separately: Region Stockholm (Sthlm; labelled SLL in the original table) and Region Västra Götaland (VGR). * P<0.1, ** P<0.05, *** P<0.01.

Table H.3: Urban/rural heterogeneity, different bandwidths

             ALL                          MEN                          WOMEN
             Rural         Urban          Rural         Urban          Rural         Urban
A: FIRST STAGE, OPT BANDWIDTH (MAIN SPEC)
Both         -0.0974***    -0.173***      -0.0266*      -0.0923***     -0.174***     -0.260***
             (0.0191)      (0.0155)       (0.0147)      (0.0167)       (0.0362)      (0.0300)
Sthlm        -0.0963*      -0.172***      0.00869       -0.0927***     -0.205*       -0.258***
             (0.0582)      (0.0187)       (0.0420)      (0.0187)       (0.108)       (0.0361)
VGR          -0.0978***    -0.175***      -0.0343**     -0.0908**      -0.167***     -0.266***
             (0.0206)      (0.0322)       (0.0151)      (0.0360)       (0.0401)      (0.0527)
B: FIRST STAGE, OPT BANDWIDTH BY REGION AND SEX
Both         -0.0974***    -0.173***      -0.0329*      -0.0983***     -0.170***     -0.270***
             (0.0191)      (0.0155)       (0.0169)      (0.0202)       (0.0371)      (0.0317)
Sthlm        -0.0970*      -0.170***      -0.0162       -0.0869***     -0.200*       -0.274***
             (0.0572)      (0.0187)       (0.0480)      (0.0205)       (0.116)       (0.0397)
VGR          -0.0991***    -0.164***      -0.0332**     -0.0970**      -0.149***     -0.231***
             (0.0211)      (0.0330)       (0.0155)      (0.0383)       (0.0462)      (0.0609)
C: IV, OPT BANDWIDTH BY REGION AND SEX
Both         -0.256        -0.503**       -2.046        -0.191         0.00510       -0.242
             (0.543)       (0.216)        (2.650)       (0.556)        (0.459)       (0.228)
Sthlm        -2.998        -0.589**       -11.16        -1.180         -1.817        -0.182
             (2.147)       (0.278)        (35.34)       (0.735)        (1.629)       (0.287)
VGR          0.475         -0.456         -0.953        -0.0715        0.518         -0.677
             (0.631)       (0.500)        (2.472)       (1.044)        (0.669)       (0.650)

Note: Panels A and B display first stage estimates. Panel A shows the first stage equivalent to that in the main paper, but here with the sample split by urban dimension. Panel B uses the bandwidth optimised for each sample (sex, region). Thus, the row 'Both' in the first and second panel for the two 'All' columns are equivalent. Panel C shows IV estimates (fuzzy diff-in-disc) using the bandwidth optimised for each sample (sex, region), in contrast to the results presented in the main paper. Region Stockholm (Sthlm) is labelled SLL in panels A and B of the original table. Estimates use data collapsed by region, gender, birth year and day relative to 20th birthday. Standard errors in parentheses, clustered by relative age (running variable), with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Table H.4: Urban heterogeneity over fixed bandwidths: IV

             ALL                          MEN                          WOMEN
             Rural         Urban          Rural         Urban          Rural         Urban
BANDWIDTH 30
Both         -0.248        -0.221         -1.463        -0.554         0.162         -0.104
             (0.772)       (0.299)        (2.070)       (0.772)        (0.733)       (0.299)
Sthlm        -1.932        -0.173         -10.74        -0.677         -1.226        0.0103
             (2.205)       (0.344)        (43.26)       (0.781)        (1.880)       (0.341)
VGR          0.130         -0.445         -0.918        0.133          0.570         -0.567
             (0.885)       (0.958)        (1.958)       (2.859)        (0.871)       (0.877)
BANDWIDTH 90
Both         -0.296        -0.478**       -1.659        -0.779         0.0325        -0.363
             (0.527)       (0.242)        (1.858)       (0.601)        (0.467)       (0.252)
Sthlm        -2.189        -0.505*        -4.867        -1.155         -1.941        -0.275
             (1.648)       (0.301)        (14.89)       (0.726)        (1.592)       (0.303)
VGR          0.157         -0.400         -1.364        0.254          0.592         -0.638
             (0.613)       (0.568)        (1.786)       (1.198)        (0.565)       (0.600)
BANDWIDTH 120
Both         -0.0490       -0.561***      -1.467        -0.971*        0.158         -0.397*
             (0.503)       (0.212)        (2.684)       (0.515)        (0.412)       (0.220)
Sthlm        -2.920        -0.631**       12.11         -1.113*        -1.731        -0.433
             (2.291)       (0.280)        (31.91)       (0.625)        (1.463)       (0.288)
VGR          0.495         -0.391         -0.170        -0.619         0.630         -0.304
             (0.558)       (0.400)        (2.083)       (0.942)        (0.483)       (0.406)
BANDWIDTH 180
Both         0.213         -0.512***      -0.951        -0.840*        0.362         -0.384**
             (0.499)       (0.180)        (2.996)       (0.440)        (0.418)       (0.183)
Sthlm        -3.704        -0.564**       1.813         -1.013*        -2.417        -0.382
             (4.002)       (0.239)        (4.573)       (0.544)        (2.099)       (0.241)
VGR          0.695         -0.387         -0.307        -0.407         0.893*        -0.381
             (0.528)       (0.324)        (2.041)       (0.782)        (0.477)       (0.334)
BANDWIDTH 365
Both         0.0928        -0.365***      -0.577        -0.748*        0.224         -0.245*
             (0.298)       (0.137)        (1.223)       (0.450)        (0.270)       (0.127)
Sthlm        -0.331        -0.375**       2.046         -1.021*        -0.618        -0.177
             (0.890)       (0.172)        (6.432)       (0.539)        (0.861)       (0.160)
VGR          0.173         -0.338         -0.851        0.0176         0.393         -0.407*
             (0.321)       (0.254)        (1.201)       (0.823)        (0.297)       (0.240)

Note: IV estimates over panels, each with a different fixed bandwidth, using the continuity-based model. Robust standard errors in parentheses, clustered on relative age with separate clusters for pre and post period. Models estimated separately for individuals in the most urban (Urban) and other (Rural) municipalities. Estimates for both regions (Both), Region Stockholm (Sthlm), and Region Västra Götaland (VGR). * P<0.1, ** P<0.05, *** P<0.01.

Appendix I Robustness analysis of diagnosis data

Table I.1 shows estimations similar to the corresponding table in the main text but with the optimal bandwidth for the outcome of interest and the relevant estimation sample. Tables I.2, I.3, I.4, and I.5 present results using the same estimation approach but for a variety of fixed bandwidths.

The results from estimations using other bandwidths support the main pattern of substitution within diagnosis groups from Table 7. The results are overall similar, suggesting an overall degree of substitution of about 30-45%, except for the shortest bandwidth of 60 days. The result that consultations related to upper respiratory infections display the highest degree of substitution holds for all bandwidths.

The overall pattern from Table 7 remains also when splitting the sample by gender. Although there is more variation in size across bandwidths, consultations related to upper respiratory infections display the highest degree of substitution for both genders. The first stage estimates are similar across bandwidths for both genders (and jointly). The variation comes from the reduced form (the sharp diff-in-disc for in-person visits).
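The two summary statements for the full sample can be checked directly against the point estimates reported in panel B (row All) of Tables I.1 to I.5. The sketch below simply transcribes those estimates; the dictionary layout and the function names are illustrative assumptions, and the "degree of substitution" is read as the negative of the IV coefficient on the Common column.

```python
# Panel B (IV), row "All", transcribed from Tables I.1 to I.5.
# Keys are the bandwidth labels ("opt" = MSE-optimal, otherwise days).
IV_ALL = {
    "opt": {"Common": -0.377, "Resp": -1.243, "Skin": -0.190, "Gen/Rep": 0.0206, "Other": -0.498},
    "60": {"Common": -0.160, "Resp": -1.424, "Skin": -0.0804, "Gen/Rep": 0.147, "Other": 0.385},
    "90": {"Common": -0.378, "Resp": -1.207, "Skin": -0.203, "Gen/Rep": -0.00501, "Other": -0.268},
    "120": {"Common": -0.436, "Resp": -1.589, "Skin": -0.0426, "Gen/Rep": 0.0846, "Other": -0.504},
    "180": {"Common": -0.308, "Resp": -0.882, "Skin": -0.0191, "Gen/Rep": 0.0997, "Other": -0.397},
}
SUBGROUPS = ("Resp", "Skin", "Gen/Rep", "Other")


def common_substitution_range(exclude=("60",)):
    """Smallest and largest degree of substitution (minus the IV estimate)
    for all common diagnoses, leaving out the bandwidths in `exclude`."""
    shares = [-est["Common"] for bw, est in IV_ALL.items() if bw not in exclude]
    return min(shares), max(shares)


def top_subgroup(bandwidth):
    """Diagnosis subgroup with the most negative IV estimate at a bandwidth."""
    return min(SUBGROUPS, key=lambda group: IV_ALL[bandwidth][group])


low, high = common_substitution_range()
print(f"Common diagnoses, excluding bw=60: {low:.0%} to {high:.0%}")
for bw in IV_ALL:
    print(f"bw={bw}: strongest substitution in {top_subgroup(bw)}")
```

This compares point estimates only and ignores the standard errors, so it says nothing about whether the differences between subgroups are statistically significant.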
Table I.1: Decomposition by type of diagnosis, Opt bw each sample

             Common         Resp           Skin           Gen/Rep        Other
A: FIRST STAGE (SHARP DIFF-IN-DISC)
All          -0.130***      -0.0258***     -0.0457***     -0.0259***     -0.0461***
             (0.0110)       (0.00608)      (0.00615)      (0.00577)      (0.00704)
Men          -0.0587***     -0.00586       -0.0252***     -0.00562**     -0.0268***
             (0.0117)       (0.00622)      (0.00700)      (0.00227)      (0.00784)
Women        -0.216***      -0.0507***     -0.0644***     -0.0490***     -0.0694***
             (0.0226)       (0.0113)       (0.0116)       (0.0119)       (0.0143)
B: IV (FUZZY DIFF-IN-DISC)
All          -0.377*        -1.243**       -0.190         0.0206         -0.498
             (0.211)        (0.597)        (0.263)        (0.344)        (0.411)
Men          -0.189         -3.424         -0.412         -1.464         1.287
             (0.644)        (4.939)        (0.610)        (1.415)        (1.104)
Women        -0.389*        -0.637         -0.0415        0.212          -0.0560
             (0.205)        (0.436)        (0.286)        (0.338)        (0.465)

Note: Table shows results from the first stage equation (sharp diff-in-disc) and the IV model (fuzzy diff-in-disc) by diagnosis groups. The first column shows all Common diagnoses set by DCT providers. In the second to fifth columns, these common diagnoses are decomposed into subgroups: upper respiratory infections (Resp), skin related diseases (Skin), genital and reproductive organs (Gen/Rep), and Other common diagnoses. Each row presents the diff-in-disc estimates for the diagnosis groups for the given estimation sample (All, Men, Women). Each model uses MSE-optimal bandwidths for In-person visits for the respective diagnosis group and estimation sample (All, Men, Women), with separate bandwidths for the pre-DCT cohorts and the post-DCT cohort (2018), and varying bandwidths on the left- and right-hand side of the age cutoff. Estimates using data collapsed by region, gender, year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Table I.2: Decomposition by type of diagnosis (bw=60)

             Common         Resp           Skin           Gen/Rep        Other
A. FIRST STAGE (SHARP DIFF-IN-DISC)
All          -0.133***      -0.0218***     -0.0481***     -0.0242***     -0.0478***
             (0.0154)       (0.00808)      (0.00845)      (0.00736)      (0.0112)
Men          -0.0604***     -0.00183       -0.0329***     -0.00553**     -0.0224**
             (0.0141)       (0.00794)      (0.00843)      (0.00271)      (0.00900)
Women        -0.212***      -0.0434***     -0.0646***     -0.0445***     -0.0753***
             (0.0274)       (0.0143)       (0.0139)       (0.0152)       (0.0183)
B. IV (FUZZY DIFF-IN-DISC)
All          -0.160         -1.424         -0.0804        0.147          0.385
             (0.275)        (0.956)        (0.320)        (0.474)        (0.522)
Men          -0.312         -13.72         -0.175         -2.393         1.152
             (0.708)        (61.82)        (0.554)        (2.004)        (1.399)
Women        -0.109         -0.860         -0.0269        0.491          0.144
             (0.261)        (0.650)        (0.335)        (0.484)        (0.522)

Note: Table shows results from the first stage equation (sharp diff-in-disc) and the IV model (fuzzy diff-in-disc) by diagnosis groups. The first column shows all Common diagnoses set by DCT providers. In the second to fifth columns, these common diagnoses are decomposed into subgroups: upper respiratory infections (Resp), skin related diseases (Skin), genital and reproductive organs (Gen/Rep), and Other common diagnoses. Each row presents the diff-in-disc estimates for the diagnosis groups for the given estimation sample (All, Men, Women). Each model uses a fixed bandwidth of 60 days on each side of the age cutoff. Estimates using data collapsed by region, gender, year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Table I.3: Decomposition by type of diagnosis (bw=90)

             Common         Resp           Skin           Gen/Rep        Other
A. FIRST STAGE (SHARP DIFF-IN-DISC)
All          -0.132***      -0.0272***     -0.0383***     -0.0274***     -0.0480***
             (0.0125)       (0.00648)      (0.00690)      (0.00629)      (0.00844)
Men          -0.0601***     -0.00566       -0.0237***     -0.00466**     -0.0277***
             (0.0114)       (0.00675)      (0.00687)      (0.00224)      (0.00720)
Women        -0.210***      -0.0504***     -0.0541***     -0.0521***     -0.0700***
             (0.0227)       (0.0114)       (0.0114)       (0.0129)       (0.0142)
B. IV (FUZZY DIFF-IN-DISC)
All          -0.378*        -1.207**       -0.203         -0.00501       -0.268
             (0.220)        (0.583)        (0.324)        (0.345)        (0.437)
Men          -0.630         -3.224         -0.384         -2.322         -0.136
             (0.593)        (5.282)        (0.650)        (1.916)        (0.892)
Women        -0.295         -0.957**       -0.116         0.223          -0.320
             (0.211)        (0.468)        (0.331)        (0.328)        (0.462)

Note: Table shows results from the first stage equation (sharp diff-in-disc) and the IV model (fuzzy diff-in-disc) by diagnosis groups. The first column shows all Common diagnoses set by DCT providers. In the second to fifth columns, these common diagnoses are decomposed into subgroups: upper respiratory infections (Resp), skin related diseases (Skin), genital and reproductive organs (Gen/Rep), and Other common diagnoses. Each row presents the diff-in-disc estimates for the diagnosis groups for the given estimation sample (All, Men, Women). Each model uses a fixed bandwidth of 90 days on each side of the age cutoff. Estimates using data collapsed by region, gender, year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Table I.4: Decomposition by type of diagnosis (bw=120)

             Common         Resp           Skin           Gen/Rep        Other
A. FIRST STAGE (SHARP DIFF-IN-DISC)
All          -0.131***      -0.0260***     -0.0424***     -0.0244***     -0.0457***
             (0.0106)       (0.00553)      (0.00571)      (0.00555)      (0.00696)
Men          -0.0603***     -0.00689       -0.0214***     -0.00347*      -0.0308***
             (0.00988)      (0.00605)      (0.00578)      (0.00197)      (0.00639)
Women        -0.208***      -0.0467***     -0.0650***     -0.0470***     -0.0619***
             (0.0198)       (0.00980)      (0.00949)      (0.0113)       (0.0119)
B. IV (FUZZY DIFF-IN-DISC)
All          -0.436**       -1.589***      -0.0426        0.0846         -0.504
             (0.193)        (0.553)        (0.259)        (0.339)        (0.407)
Men          -0.693         -2.336         -0.223         -1.418         -0.601
             (0.526)        (3.216)        (0.656)        (2.036)        (0.716)
Women        -0.351*        -1.465***      0.0223         0.208          -0.446
             (0.186)        (0.490)        (0.241)        (0.319)        (0.462)

Note: Table shows results from the first stage equation (sharp diff-in-disc) and the IV model (fuzzy diff-in-disc) by diagnosis groups. The first column shows all Common diagnoses set by DCT providers. In the second to fifth columns, these common diagnoses are decomposed into subgroups: upper respiratory infections (Resp), skin related diseases (Skin), genital and reproductive organs (Gen/Rep), and Other common diagnoses. Each row presents the diff-in-disc estimates for the diagnosis groups for the given estimation sample (All, Men, Women). Each model uses a fixed bandwidth of 120 days on each side of the age cutoff. Estimates using data collapsed by region, gender, year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Table I.5: Decomposition by type of diagnosis (bw=180)

             Common         Resp           Skin           Gen/Rep        Other
A. FIRST STAGE (SHARP DIFF-IN-DISC)
All          -0.129***      -0.0297***     -0.0339***     -0.0257***     -0.0459***
             (0.00834)      (0.00432)      (0.00483)      (0.00454)      (0.00542)
Men          -0.0568***     -0.0109**      -0.0168***     -0.000292      -0.0306***
             (0.00808)      (0.00494)      (0.00490)      (0.00169)      (0.00499)
Women        -0.208***      -0.0500***     -0.0526***     -0.0534***     -0.0626***
             (0.0157)       (0.00761)      (0.00798)      (0.00931)      (0.00952)
B. IV (FUZZY DIFF-IN-DISC)
All          -0.308*        -0.882**       -0.0191        0.0997         -0.397
             (0.157)        (0.349)        (0.263)        (0.263)        (0.322)
Men          -0.739         -1.193         -0.348         -24.77         -0.741
             (0.454)        (1.322)        (0.685)        (143.7)        (0.607)
Women        -0.174         -0.800**       0.0951         0.252          -0.205
             (0.153)        (0.322)        (0.250)        (0.238)        (0.362)

Note: Table shows results from the first stage equation (sharp diff-in-disc) and the IV model (fuzzy diff-in-disc) by diagnosis groups. The first column shows all Common diagnoses set by DCT providers. In the second to fifth columns, these common diagnoses are decomposed into subgroups: upper respiratory infections (Resp), skin related diseases (Skin), genital and reproductive organs (Gen/Rep), and Other common diagnoses. Each row presents the diff-in-disc estimates for the diagnosis groups for the given estimation sample (All, Men, Women). Each model uses a fixed bandwidth of 180 days on each side of the age cutoff. Estimates using data collapsed by region, gender, year and day relative to 20th birthday. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Appendix J Other primary care consultations

Tables J.1 and J.2 display fuzzy diff-in-disc results for primary care consultations with other health care professionals than physicians (e.g., nurses), using other bandwidths than Table 8 in the main text. The patterns are consistent across tables and bandwidths. The results suggest that the degree of substitution is slightly larger when including consultations with other health care professionals than physicians. The estimates suggest that there is partial substitution, but they are noisier than when we include physician consultations only, and the confidence intervals no longer exclude full substitution.

For women, there is a consistent negative (but insignificant) coefficient on consultations at midwife/youth/STD clinics.
To further explore this pattern, we retain the same outcome variable but restrict the first stage to only include online consultations with diagnoses related to the genital and reproductive organs. Table J.3 presents the fuzzy diff-in-disc estimates. For women, the results hover around -1, which suggests that women substitute online consultations (with physicians) for in-person midwife visits related to contraceptive management. In other words, the observed lack of substitution between in-person physician visits and online consultations for these diagnoses (Table 7) is explained by another type of substitution. Note that the large positive coefficients among men are primarily due to a very weak first stage (and likely relate to visits at an STD or youth clinic).

Table J.1: Visits to other health care professionals

FUZZY DIFF-IN-DISC
                                         All          Men          Women
Nurse visits at PCC                      -0.0882      -0.592       0.143
                                         (0.160)      (0.403)      (0.140)
Visits at midwife/youth/STD clinic       -0.137       0.393*       -0.146
                                         (0.206)      (0.227)      (0.240)
Nurse+midwife/youth/STD                  -0.372       -0.317       0.000166
                                         (0.264)      (0.460)      (0.275)
All consultations                        -0.762*      -0.809       -0.329
                                         (0.389)      (0.882)      (0.374)

Note: Table shows fuzzy diff-in-disc estimates of the effect of online consultations on in-person consultations with other health care professionals than physicians at primary care centers: consultations with a nurse at a primary care center; consultations with a midwife/nurse/physician at a midwife/youth/STD clinic; the sum of these two types of consultations; and the sum of all consultations, including physician consultations at a primary care center. Each model uses the MSE-optimal bandwidth for the relevant outcome variable and sample. Standard errors clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.
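The remark above that the large coefficients among men stem from a very weak first stage can be illustrated with a small numerical sketch. The first-stage values below are the Gen/Rep estimates at a 180-day bandwidth from Table I.5, assumed here to be comparable to the first stage behind Table J.3; the reduced-form value is purely hypothetical and is set equal for both groups only to isolate the effect of the denominator. The function name `wald_ratio` is likewise an illustrative choice.

```python
import math


def wald_ratio(reduced_form: float, first_stage: float) -> float:
    """IV (Wald) estimate: reduced-form jump divided by first-stage jump."""
    if first_stage == 0.0:
        raise ZeroDivisionError("first stage is exactly zero; the ratio is undefined")
    return reduced_form / first_stage


# First-stage estimates for Gen/Rep diagnoses, bw=180 (Table I.5, panel A).
FIRST_STAGE = {"Men": -0.000292, "Women": -0.0534}

# Hypothetical reduced-form jump, NOT an estimate from the paper.
HYPOTHETICAL_RF = 0.01

ratios = {group: wald_ratio(HYPOTHETICAL_RF, fs) for group, fs in FIRST_STAGE.items()}
for group, ratio in ratios.items():
    print(f"{group}: first stage {FIRST_STAGE[group]:+.6f} -> ratio {ratio:+.2f}")

# With the same numerator, the ratios differ by the inverse ratio of the first stages.
scale = abs(ratios["Men"] / ratios["Women"])
assert math.isclose(scale, 0.0534 / 0.000292)
```

Because the denominator for men is close to zero, even a small reduced-form jump translates into a very large ratio, which is why the point estimates and standard errors for men in Table J.3 should be read with caution.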
Table J.2: Visits to other health care professionals, fixed bandwidth

FUZZY DIFF-IN-DISC
                                        All        Men        Women
BANDWIDTH=60
Nurse visits at a PCC                   -0.217     -0.474     -0.130
                                        (0.166)    (0.416)    (0.176)
Visits at a midwife/youth/STD clinic    0.00387    0.457      -0.136
                                        (0.260)    (0.279)    (0.328)
Nurse+midwife/youth/STD                 -0.213     -0.0166    -0.266
                                        (0.311)    (0.458)    (0.391)
All consultations                       -0.441     -0.769     -0.316
                                        (0.497)    (0.929)    (0.552)
BANDWIDTH=120
Nurse visits at a PCC                   -0.0545    -0.605*    0.124
                                        (0.118)    (0.312)    (0.120)
Visits at a midwife/youth/STD clinic    -0.129     0.352*     -0.274
                                        (0.176)    (0.208)    (0.219)
Nurse+midwife/youth/STD                 -0.183     -0.253     -0.150
                                        (0.211)    (0.350)    (0.249)
All consultations                       -0.633**   -1.279*    -0.410
                                        (0.320)    (0.694)    (0.339)
BANDWIDTH=180
Nurse visits at a PCC                   -0.0594    -0.608**   0.113
                                        (0.104)    (0.284)    (0.104)
Visits at a midwife/youth/STD clinic    -0.151     0.459**    -0.323*
                                        (0.150)    (0.199)    (0.184)
Nurse+midwife/youth/STD                 -0.211     -0.149     -0.209
                                        (0.184)    (0.321)    (0.210)
All consultations                       -0.582**   -0.991     -0.426
                                        (0.273)    (0.613)    (0.282)

Note: Table shows fuzzy diff-in-disc estimates of the effect of online consultations on in-person consultations with health care professionals other than physicians at primary care centers: consultations with a nurse at a primary care center; consultations with a midwife/nurse/physician at a midwife/youth/STD clinic; the sum of these two types of consultations; and the sum of all consultations, including physician consultations at a primary care center. Each model uses a fixed bandwidth of 60/120/180 days on each side of the threshold. Standard errors are clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.
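One way to read Table J.2 is to note that, with a fixed bandwidth, all outcomes in a panel share the same sample and first stage, so an IV estimate is linear in the outcome: the coefficient for the summed outcome should equal the sum of the two separate coefficients, up to rounding. That linearity is a general property of IV estimators with a common specification and is an assumption of this sketch, not something stated in the table note. The check below uses the reported coefficients; the tolerance and the function name `additivity_gap` are hypothetical choices for illustration. The Table J.1 row is included for contrast, since there each outcome has its own MSE-optimal bandwidth.

```python
# Coefficients from Table J.2 (fixed bandwidths).
# Each entry: (nurse visits, midwife/youth/STD visits, reported sum row).
table_j2 = {
    (60, "All"): (-0.217, 0.00387, -0.213),
    (60, "Men"): (-0.474, 0.457, -0.0166),
    (60, "Women"): (-0.130, -0.136, -0.266),
    (120, "All"): (-0.0545, -0.129, -0.183),
    (120, "Men"): (-0.605, 0.352, -0.253),
    (120, "Women"): (0.124, -0.274, -0.150),
    (180, "All"): (-0.0594, -0.151, -0.211),
    (180, "Men"): (-0.608, 0.459, -0.149),
    (180, "Women"): (0.113, -0.323, -0.209),
}

# "All" column of Table J.1 (outcome-specific MSE-optimal bandwidths).
table_j1_all = (-0.0882, -0.137, -0.372)

# Tolerance chosen (as an assumption) to absorb rounding in the printed table.
TOLERANCE = 0.002


def additivity_gap(nurse, midwife, reported_sum):
    """Absolute gap between the sum of the parts and the reported sum row."""
    return abs((nurse + midwife) - reported_sum)


for (bandwidth, sample), row in table_j2.items():
    gap = additivity_gap(*row)
    print(f"BW={bandwidth:>3} {sample:<5} gap={gap:.4f} additive={gap < TOLERANCE}")

j1_gap = additivity_gap(*table_j1_all)
print(f"Table J.1 All   gap={j1_gap:.4f} additive={j1_gap < TOLERANCE}")
```

The contrast illustrates why sums of coefficients should only be compared within a common bandwidth.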
Table J.3: Consultations at a midwife/youth/STD clinic

FUZZY DIFF-IN-DISC, INSTRUMENT: GEN/REP DCT
                                        All        Men        Women
OPTIMAL BW, MAIN OUTCOME
Visits at a midwife/youth/STD clinic    -0.350     5.522      -0.939
                                        (1.048)    (3.881)    (1.114)
OPTIMAL BW, RELEVANT OUTCOME
Visits at a midwife/youth/STD clinic    -0.787     6.846      -0.693
                                        (1.196)    (5.141)    (1.155)
BANDWIDTH=60
Visits at a midwife/youth/STD clinic    0.0231     5.730      -0.691
                                        (1.549)    (4.424)    (1.673)
BANDWIDTH=120
Visits at a midwife/youth/STD clinic    -0.785     7.115      -1.366
                                        (1.085)    (5.721)    (1.135)
BANDWIDTH=180
Visits at a midwife/youth/STD clinic    -0.829     101.6      -1.356*
                                        (0.829)    (589.4)    (0.798)

Note: Table shows fuzzy diff-in-disc estimates of the effect of online consultations (with a registered diagnosis related to genital and reproductive health) on in-person consultations with a midwife/nurse/physician at a midwife/youth/STD clinic. That is, the excluded instrument is DCT consultations with a registered diagnosis related to genital and reproductive health. See section 5.6.2. The lower three panels use a fixed bandwidth of 60/120/180 days on each side of the threshold. Standard errors are clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Appendix K Prescriptions of antibiotics

Table K.1 shows sharp diff-in-disc estimates of the effects of the onset of the DCT user fee on various outcomes related to antibiotic prescriptions.35 The positive and significant coefficients in the first column indicate that the user fee has a positive effect on the total number of antibiotic prescriptions. That is, the larger use of online consultations among 19-year-olds relative to 20-year-olds does not increase their antibiotics consumption, but rather decreases it.

The results by antibiotic type (columns 2-4) further suggest that the increase is primarily driven by prescriptions of antibiotics related to respiratory infections (rather than antibiotics related to skin conditions or cystitis).
Together with the results in section 5.6.2, which suggest that close to all DCT consultations for respiratory infection replace in-person visits, we interpret these results as indicating that physicians are more (or at least not less) restrictive during online consultations in terms of prescribing antibiotics for respiratory infections.

As for the main analysis, the diff-in-disc estimates are sensitive to general trends that would affect the size of the drop at the 20th birthday even without the onset of the DCT user fee. Indeed, antibiotic use has declined over the last decade, and a proportional decrease in the number of prescriptions for individuals on each side of the cut-off would shrink the drop at the cut-off in absolute terms and thereby generate positive diff-in-disc estimates such as those observed in Table K.1. We therefore study the components of the diff-in-disc estimates: the pre and post RD estimates. We also specifically estimate a standard RD in Stockholm, where there is no confounding user fee for in-person visits.

Table K.2 shows results from a sharp RD before (panel A includes the complete pre-period) and after (panel B includes 2018) the introduction of the DCT services. The 20th birthday is associated with a decrease in the number of prescriptions and daily doses before DCT, likely driven by the onset of the user fee in the region of Västra Götaland. In 2018, the same discontinuity is associated with an insignificant increase in the number of prescriptions and daily doses. Thus, these results imply that the estimates in Table K.1 are not only driven by a general decrease in the number of prescriptions but also by an actual change in the sign of the effect at the discontinuity.

35 Antibiotics are defined as all ATC codes within J01 except methenamine (J01XX05).
We follow the Public Health Agency of Sweden in defining three categories of antibiotics, relating to respiratory infections (J01AA02, J01CE02, J01CA04, J01CR02, J01DB, J01DC, J01DE, J01FA), cystitis (J01CA08, J01EA01, J01MA02, J01MA06, J01XE01), and skin and soft tissues (J01FF01, J01CF05).

Table K.3 presents results for the same outcomes as in the previous tables from a sharp diff-in-disc (panel A) and an RD in 2018 (panel B) for the region of Stockholm. Although the RD estimates are all insignificant and tend to be smaller than the diff-in-disc estimates, the conclusion is still that physicians are not less restrictive in terms of prescribing antibiotics during online consultations; if anything, they are more restrictive.

Table K.1: Antibiotic prescription, sharp diff-in-disc

A. SHARP DIFF-IN-DISC
         All          Resp         Skin         Cystitis     Daily doses
All      0.0504***    0.0236**     0.0119*      0.0121       0.608
         (0.0163)     (0.0116)     (0.00688)    (0.00861)    (0.455)
Men      0.0537***    0.0228       0.00937      0.00309      0.460
         (0.0203)     (0.0152)     (0.0107)     (0.00440)    (0.647)
Women    0.0563**     0.0240       0.00848      0.0198       -0.0683
         (0.0258)     (0.0196)     (0.0101)     (0.0171)     (0.645)

B. BANDWIDTH OPTIMAL FOR IN-PERSON VISITS
         All          Resp         Skin         Cystitis     Daily doses
All      0.0403***    0.0202*      0.00452      0.0102       0.589
         (0.0146)     (0.0111)     (0.00639)    (0.00788)    (0.460)
Men      0.0268       0.0120       0.00207      0.00146      0.668
         (0.0183)     (0.0138)     (0.00958)    (0.00393)    (0.608)
Women    0.0544**     0.0288       0.00716      0.0193       0.500
         (0.0240)     (0.0178)     (0.00888)    (0.0157)     (0.569)

Note: Table shows results from a sharp diff-in-disc (reduced form) on antibiotic prescriptions. The first column presents diff-in-disc estimates for the total number of antibiotic prescriptions; columns 2 to 4 present estimates for prescriptions of various types: respiratory infections, skin conditions, and cystitis. Column 5 presents estimates for the total number of daily doses. In panel A, each model uses the MSE-optimal bandwidth for the relevant outcome variable and estimation sample.
In panel B, each model uses the MSE-optimal bandwidth for in-person visits for both regions and both genders (to prevent bandwidth variation from driving results). Standard errors are clustered by the running variable, with separate clusters for pre-post periods. * P<0.1, ** P<0.05, *** P<0.01.

Table K.2: Sharp RD of antibiotic prescriptions, pre/post

A: SHARP RD OF ANTIBIOTIC PRESCRIPTIONS, PRE
         All          Resp         Skin         Cystitis     Daily doses
All      -0.0317***   -0.0193***   -0.00506     -0.00733     -0.397*
         (0.00952)    (0.00516)    (0.00352)    (0.00474)    (0.214)
Men      -0.0335***   -0.0231***   -0.00790     -0.00239     -0.219
         (0.0118)     (0.00817)    (0.00482)    (0.00219)    (0.295)
Women    -0.0340**    -0.0137*     -0.00299     -0.0118      -0.478
         (0.0133)     (0.00831)    (0.00549)    (0.00944)    (0.320)

B: SHARP RD OF ANTIBIOTIC PRESCRIPTIONS, 2018
         All          Resp         Skin         Cystitis     Daily doses
All      0.0187       0.00426      0.00683      0.00473      0.211
         (0.0133)     (0.0104)     (0.00592)    (0.00721)    (0.403)
Men      0.0202       -0.000309    0.00147      0.000699     0.241
         (0.0166)     (0.0128)     (0.00957)    (0.00383)    (0.578)
Women    0.0223       0.0102       0.00548      0.00802      -0.546
         (0.0222)     (0.0178)     (0.00856)    (0.0143)     (0.563)

Note: Table shows results from a sharp regression discontinuity before (panel A) and after (panel B) the introduction of DCT services. In panel B, we show results for 2018. The first column presents the RD estimates for any type of prescription; columns 2 to 4 present estimates for antibiotic types related to respiratory infections, skin conditions, and cystitis. Column 5 presents the same estimate for the total number of daily doses. Each model uses the MSE-optimal bandwidth for the relevant outcome variable and estimation sample. Data are collapsed by gender, year and day relative to 20th birthday. Standard errors are clustered by the running variable. * P<0.1, ** P<0.05, *** P<0.01.
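The decomposition of the diff-in-disc estimates into pre and post RD components can be cross-checked numerically: a diff-in-disc estimate is the post-period discontinuity minus the pre-period discontinuity. The sketch below takes the coefficients from Table K.2 (panels A and B) and from Table K.1 (panel A) and compares them. The tolerance and the function names are hypothetical choices for illustration, and the comparison only makes sense to the extent that the models share bandwidths.

```python
# Column order: All, Resp, Skin, Cystitis, Daily doses.

# Table K.2, panel A: sharp RD in the pre-period.
rd_pre = {
    "All": [-0.0317, -0.0193, -0.00506, -0.00733, -0.397],
    "Men": [-0.0335, -0.0231, -0.00790, -0.00239, -0.219],
    "Women": [-0.0340, -0.0137, -0.00299, -0.0118, -0.478],
}

# Table K.2, panel B: sharp RD in 2018.
rd_2018 = {
    "All": [0.0187, 0.00426, 0.00683, 0.00473, 0.211],
    "Men": [0.0202, -0.000309, 0.00147, 0.000699, 0.241],
    "Women": [0.0223, 0.0102, 0.00548, 0.00802, -0.546],
}

# Table K.1, panel A: sharp diff-in-disc.
diff_in_disc = {
    "All": [0.0504, 0.0236, 0.0119, 0.0121, 0.608],
    "Men": [0.0537, 0.0228, 0.00937, 0.00309, 0.460],
    "Women": [0.0563, 0.0240, 0.00848, 0.0198, -0.0683],
}

# Tolerance chosen (as an assumption) to absorb rounding in the printed tables.
TOLERANCE = 5e-4


def implied_diff_in_disc(pre, post):
    """Post-period RD minus pre-period RD, outcome by outcome."""
    return [b - a for a, b in zip(pre, post)]


def max_gap(sample):
    """Largest absolute gap between implied and reported diff-in-disc."""
    implied = implied_diff_in_disc(rd_pre[sample], rd_2018[sample])
    return max(abs(i - r) for i, r in zip(implied, diff_in_disc[sample]))


for sample in diff_in_disc:
    print(f"{sample}: max gap = {max_gap(sample):.5f}")
```

A small gap for every sample indicates that the sign change at the discontinuity, and not only a general decline in prescriptions, accounts for the positive diff-in-disc estimates.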
Table K.3: Sharp RD of antibiotic prescriptions (Stockholm 2018)

A: DIFF-IN-DISC
         All          Resp         Skin         Cystitis     Daily doses
All      0.0446**     0.0289*      0.00622      0.0142       0.883
         (0.0210)     (0.0160)     (0.00918)    (0.0116)     (0.610)
Men      0.0246       0.0213       0.000804     0.00470      0.349
         (0.0253)     (0.0200)     (0.0146)     (0.00607)    (0.776)
Women    0.0651*      0.0215       0.00300      0.0324       0.982
         (0.0354)     (0.0260)     (0.0129)     (0.0226)     (0.916)

B: REGRESSION DISCONTINUITY
         All          Resp         Skin         Cystitis     Daily doses
All      0.0263       0.0107       -0.00113     0.00605      0.775
         (0.0181)     (0.0142)     (0.00859)    (0.00930)    (0.507)
Men      0.0166       0.0112       -0.00530     0.00166      0.717
         (0.0228)     (0.0173)     (0.0129)     (0.00512)    (0.612)
Women    0.0357       0.00978      0.00331      0.0102       0.828
         (0.0298)     (0.0234)     (0.0101)     (0.0184)     (0.723)

Note: The table shows estimates from a sharp diff-in-disc in Stockholm in panel A and a sharp regression discontinuity for the Stockholm region using data from 2018 in panel B. The first column presents estimates for any type of prescription; columns 2 to 4 present estimates for antibiotic types related to respiratory infections, skin conditions, and cystitis. Column 5 presents the same estimate for the total number of daily doses. Each model uses the MSE-optimal bandwidth for the relevant outcome variable and estimation sample. Data are collapsed by gender, year and day relative to 20th birthday. Standard errors are clustered by the running variable. * P<0.1, ** P<0.05, *** P<0.01.

Appendix L External validity

To obtain an idea of the generalisability of our results, we use care register data for 2015 to compare one of the pre-DCT cohorts in our study population with other age groups with respect to a set of health measures. Specifically, we compare the study cohort in period -1, who turned 20 in the second half of 2015 or the first half of 2016, to individuals who resided in the study regions throughout 2015 (and did not move between the two regions).
We use pre-DCT data for this exercise because DCT, to the extent that it affected care utilisation, may have had a different impact on our post-DCT study cohort than on other age groups.

Our first health measure is the individual's predicted health care costs according to the Johns Hopkins ACG® System (v 11). This software, which is used for risk adjustment in many settings (including Swedish primary care), uses diagnoses recorded in care registers to group individuals into Adjusted Clinical Groups (ACGs) with similar expected costs (similar to DRGs). The software produces a value for each individual showing the expected costs relative to the average costs in the region. Due to the presence of outliers, we recode the ACG values of individuals above the 95th percentile or below so that they get the ACG of the 95th percentile (using the winsor2 package in Stata).

Figure L.1 compares the ACG values of the 2015 cohort (empty bars) to those of the other residents in 2015 in various age groups. As seen from the figure, the study cohort is very similar to the 21-34 age groups in terms of expected health care costs, and quite similar to the 15-19 age group. The study cohort is less similar to children <15 (in particular 0-4 year-olds) and to individuals above 35.

Secondly, we compare the groups with respect to the share of individuals who had been diagnosed with at least one of the most common DCT-related diagnoses (the "All" category in Table 7) in primary care in 2015. Notably, this group only includes individuals who visited a (non-DCT) primary care provider in 2015. Figure L.2 shows that roughly 40% of the study cohort received such a diagnosis in 2015, which is similar to most age groups except the very youngest and oldest. We then look at each of the four categories of common DCT diagnoses (resp, gen/rep, skin, other).
The proportions with a diagnosis are generally similar in the study cohort and the 15-29 age group, and often also for other age groups.

Figure L.1: ACG distribution
[Figure: twelve histogram panels, one per age group (below 5, 5-9, 10-14, 15-19, 21-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-59, 60 plus), each compared with the 2015 cohort. Horizontal axis: ACG value (0 to 5); vertical axis: fraction.]
Note: The figure shows the distribution of ACG values in a pre-DCT cohort ("2015 cohort") and the general population in 2015, by age group. ACG captures expected health care costs and is based on diagnoses set during the year. Due to extreme values, the ACG variable is winsorised at the 95th percentile.

Figure L.2: Share with a DCT-relevant diagnosis
Note: The figure shows the proportion of individuals who were diagnosed with one of the most common diagnoses in DCT in traditional primary care in 2015. The leftmost bar ("2015 cohort") shows the proportion for the youngest pre-DCT cohort.

Figure L.3: Share with a DCT-relevant diagnosis; by category
Note: The figure shows the proportion of individuals who received a diagnosis in our resp, gen/rep, skin and other diagnosis categories in traditional primary care in 2015. The leftmost bar ("2015 cohort") shows the proportion for the youngest pre-DCT cohort.
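The top-coding of ACG values described in this appendix can be illustrated with a short sketch. The analysis itself uses the winsor2 package in Stata; the Python version below is not that implementation. It handles only the upper tail (following the note to Figure L.1), uses a nearest-rank percentile definition that may differ from Stata's, and runs on simulated values. The function names and the simulated distribution are hypothetical.

```python
import math
import random


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def winsorise_upper(values, pct=95):
    """Replace every value above the pct-th percentile with that percentile."""
    cap = percentile_nearest_rank(values, pct)
    return [min(v, cap) for v in values]


# Simulated right-skewed cost index (illustrative data, not from the study).
random.seed(0)
acg = [random.lognormvariate(-0.5, 1.0) for _ in range(1000)]
acg_winsorised = winsorise_upper(acg)

print(f"max before: {max(acg):.2f}, max after: {max(acg_winsorised):.2f}")
```

Capping rather than dropping the extreme values keeps every individual in the comparison while limiting the influence of outliers on the histograms.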