Summary
Background
There is growing evidence suggesting that beyond the acute phase of SARS-CoV-2 infection, people with COVID-19 could experience a wide range of post-acute sequelae, including diabetes. However, the risks and burdens of diabetes in the post-acute phase of the disease have not yet been comprehensively characterised. To address this knowledge gap, we aimed to examine the post-acute risk and burden of incident diabetes in people who survived the first 30 days of SARS-CoV-2 infection.
Methods
In this cohort study, we used the national databases of the US Department of Veterans Affairs to build a cohort of 181 280 participants who had a positive COVID-19 test between March 1, 2020, and Sept 30, 2021, and survived the first 30 days of COVID-19; a contemporary control (n=4 118 441) that enrolled participants between March 1, 2020, and Sept 30, 2021; and a historical control (n=4 286 911) that enrolled participants between March 1, 2018, and Sept 30, 2019. Both control groups had no evidence of SARS-CoV-2 infection. Participants in all three comparison groups were free of diabetes before cohort entry and were followed up for a median of 352 days (IQR 245–406). We used inverse probability weighted survival analyses, including predefined and algorithmically selected high dimensional variables, to estimate post-acute COVID-19 risks of incident diabetes, antihyperglycaemic use, and a composite of the two outcomes. We reported two measures of risk: hazard ratio (HR) and burden per 1000 people at 12 months.
Findings
In the post-acute phase of the disease, compared with the contemporary control group, people with COVID-19 exhibited an increased risk (HR 1·40, 95% CI 1·36–1·44) and excess burden (13·46, 95% CI 12·11–14·84, per 1000 people at 12 months) of incident diabetes; and an increased risk (1·85, 1·78–1·92) and excess burden (12·35, 11·36–13·38) of incident antihyperglycaemic use. Additionally, analyses to estimate the risk of a composite endpoint of incident diabetes or antihyperglycaemic use yielded a HR of 1·46 (95% CI 1·43–1·50) and an excess burden of 18·03 (95% CI 16·59–19·51) per 1000 people at 12 months. Risks and burdens of post-acute outcomes increased in a graded fashion according to the severity of the acute phase of COVID-19 (whether patients were non-hospitalised, hospitalised, or admitted to intensive care). All the results were consistent in analyses using the historical control as the reference category.
Interpretation
In the post-acute phase, we report increased risks and 12-month burdens of incident diabetes and antihyperglycaemic use in people with COVID-19 compared with a contemporary control group of people who were enrolled during the same period and had not contracted SARS-CoV-2, and a historical control group from a pre-pandemic era. Post-acute COVID-19 care should involve identification and management of diabetes.
Funding
US Department of Veterans Affairs and the American Society of Nephrology.
Introduction
Although diabetes and other glycometabolic abnormalities have been widely reported during the acute phase of COVID-19, less is known about the risk and burden of diabetes and related outcomes in the post-acute phase of COVID-19.
A detailed assessment of the risk and burden of diabetes in the post-acute phase of COVID-19 is needed to inform post-acute COVID-19 care strategies.
In this study, we used the national health-care databases of the US Department of Veterans Affairs (VA) Veterans Health Administration (VHA) to build a cohort of US veterans who survived the first 30 days of COVID-19 between March 1, 2020, and Sept 30, 2021, and two control groups: a contemporary cohort consisting of participants with no evidence of SARS-CoV-2 infection who used VHA services during 2019, and a historical cohort consisting of participants with no evidence of SARS-CoV-2 infection who used VHA services during 2017. These cohorts were followed up longitudinally to estimate the risks and burdens of incident diabetes, antihyperglycaemic use, and a composite outcome of these endpoints in the overall cohort and according to the care setting in the acute phase of the disease (non-hospitalised, hospitalised, or admitted to intensive care).
Evidence before this study
We searched PubMed for human studies published between Dec 1, 2019, and Sept 6, 2021, using terms “COVID-19”, “SARS CoV-2” or “long COVID”, and “diabetes”, with no language restrictions. Small studies (<1000 people) limited to short follow-up periods (up to 3 months) showed that people with COVID-19 might be at increased risk of incident diabetes. A large-scale in-depth assessment of the risks and burdens of incident diabetes over a longer time horizon has not been done. In this study, we aimed to examine the post-acute risk and burden of diabetes in people who survived the first 30 days of SARS-CoV-2 infection.
Added value of this study
In this study involving 181 280 people with COVID-19, 4 118 441 contemporary controls, and 4 286 911 historical controls, we provide estimates of risks and 12-month burdens of incident diabetes outcomes. Our results suggest that beyond the first 30 days of infection, COVID-19 survivors exhibited increased risks and burdens of incident diabetes and antihyperglycaemic use. The risks and burdens were significant among those who were non-hospitalised and increased in a graded fashion according to the care setting of the acute phase of the disease (that is whether people were non-hospitalised, hospitalised, or admitted to intensive care during the acute phase of COVID-19). The risks and associated burdens were evident in comparisons versus both the contemporary control group and the historical control group.
Implications of all the available evidence
Altogether, there is evidence to suggest that beyond the acute phase of COVID-19, survivors might be at an increased risk of incident diabetes and incident antihyperglycaemic use. Diabetes should be considered as a facet of the multifaceted long COVID syndrome. Post-acute care strategies of people with COVID-19 should integrate screening and management of diabetes.
Methods
Study design and participants
We did a cohort study using data from the US Department of Veterans Affairs, which operates the largest nationally integrated health-care system in the US and provides health care to veterans discharged from the US armed forces. We identified 6 242 360 users of the VHA in the year 2019. Of these, 285 656 had a record of a positive COVID-19 test between March 1, 2020, and Sept 30, 2021. We then selected 271 689 participants who were alive 30 days after their positive COVID-19 test. The date of testing positive was set as T0.
From 6 242 360 users of the VHA in 2019, 5 961 637 participants were alive as of March 1, 2020, 5 689 948 of whom were not in the COVID-19 group. To ensure that the contemporary control group had a similar follow-up distribution as the COVID-19 group, we assigned the T0 to the contemporary control group following the same T0 distribution as the COVID-19 group. 5 479 834 contemporary control participants were alive at T0, of whom 5 460 230 were alive 30 days after T0.
Separately, we constructed a historical comparison group by identifying 6 462 011 participants who used the VHA in 2017, of whom 6 151 063 were alive as of March 1, 2018. For the 5 900 962 of those who were not in the COVID-19 group, T0 was assigned following the T0 distribution of the COVID-19 group, shifted 2 years earlier. 5 712 311 participants were alive at T0, of whom 5 695 490 were alive 30 days after T0.
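The T0 assignment for the two control groups can be illustrated with a minimal sketch. It assumes that "following the same T0 distribution" means drawing each control's T0 at random from the empirical T0 dates of the COVID-19 group; the function names `shift_back` and `assign_t0`, the seed, and the example dates are hypothetical and are not taken from the study.

```python
import random
from datetime import date

def shift_back(d, years):
    """Move a date back by whole calendar years (hypothetical helper)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; fall back to 28 February.
        return d.replace(year=d.year - years, day=28)

def assign_t0(case_t0_dates, n_controls, shift_years=0, seed=0):
    """Draw a T0 for each control from the empirical T0 distribution of the cases.

    shift_years=0 mimics the contemporary control group;
    shift_years=2 mimics the historical control group (T0 set 2 years earlier).
    """
    rng = random.Random(seed)
    sampled = rng.choices(case_t0_dates, k=n_controls)
    return [shift_back(d, shift_years) for d in sampled]

# Illustrative (made-up) T0 dates for three COVID-19 participants.
case_dates = [date(2020, 3, 1), date(2020, 11, 15), date(2021, 9, 30)]
contemporary_t0 = assign_t0(case_dates, n_controls=5)
historical_t0 = assign_t0(case_dates, n_controls=5, shift_years=2)
```

Sampling with replacement from the observed case dates keeps the controls' calendar-time and follow-up distribution aligned with that of the COVID-19 group, which is the stated purpose of the assignment.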
The study was approved by the VA St. Louis Health Care System Institutional Review Board, which granted a waiver of informed consent.
Data sources
Diagnoses were obtained from the VA Corporate Data Warehouse (CDW) inpatient and outpatient encounters domains. Laboratory measurements were collected from the CDW laboratory results domain, and medication data were collected from the CDW outpatient pharmacy domain and the CDW bar code medication administration domain.
Information on COVID-19 was obtained from the VA COVID-19 shared data resource.
The Area Deprivation Index was used as a summary measure of contextual disadvantage at participants’ residential locations.
Outcomes
Post-acute COVID-19 diabetes outcomes were examined in the period of follow-up from 30 days after T0 up to the end of follow-up. Diabetes status was defined based on the ICD-10 codes (E08.X to E13.X) or a HbA1c measurement of more than 6·4% (46 mmol/mol), identified based on the Logical Observation Identifiers Names and Codes (LOINC). Antihyperglycaemic use was defined based on prescription record of diabetes medications for more than 30 days. A composite endpoint was also defined as the first occurrence of diabetes or antihyperglycaemic use.
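The outcome definitions above can be sketched in code. This is an illustration under stated assumptions, not the study's implementation: ICD-10 codes are assumed to be stored in dotted form (for example "E11.9"), "more than 30 days" of medication is assumed to mean a cumulative days' supply above 30 with the outcome dated at the first fill, and all function names and input layouts are hypothetical.

```python
import re
from datetime import date

# ICD-10 E08.x to E13.x, assuming dotted codes such as "E11.9".
DIABETES_ICD10 = re.compile(r"^E(0[89]|1[0-3])(\.|$)")

def is_diabetes_code(code):
    """True if the ICD-10 code falls in the E08 to E13 range."""
    return bool(DIABETES_ICD10.match(code.strip().upper()))

def first_diabetes_date(diagnoses, hba1c_results, start):
    """Earliest qualifying diagnosis or HbA1c > 6.4% on or after `start`.

    diagnoses: list of (date, icd10_code); hba1c_results: list of (date, percent).
    """
    dates = [d for d, code in diagnoses if d >= start and is_diabetes_code(code)]
    dates += [d for d, pct in hba1c_results if d >= start and pct > 6.4]
    return min(dates, default=None)

def first_antihyperglycaemic_date(prescriptions, start):
    """Date of the first diabetes-medication fill if cumulative supply exceeds 30 days.

    prescriptions: list of (date, days_supply). The cumulative rule is an assumption.
    """
    fills = sorted(p for p in prescriptions if p[0] >= start)
    total_days = sum(days for _, days in fills)
    return fills[0][0] if total_days > 30 else None

def composite_date(diabetes_date, drug_date):
    """First occurrence of either outcome, or None if neither occurred."""
    dates = [d for d in (diabetes_date, drug_date) if d is not None]
    return min(dates, default=None)

# Illustrative (made-up) records; follow-up starts 30 days after T0.
follow_up_start = date(2021, 1, 1)
dx = [(date(2021, 1, 10), "I10"), (date(2021, 3, 5), "E11.9")]
labs = [(date(2021, 2, 1), 6.0), (date(2021, 2, 20), 6.5)]
rx = [(date(2021, 4, 1), 30), (date(2021, 5, 1), 30)]
outcome = composite_date(
    first_diabetes_date(dx, labs, follow_up_start),
    first_antihyperglycaemic_date(rx, follow_up_start),
)
```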
Covariates
Covariates were assessed within 1 year before T0. Predefined baseline variables included age, race (White, Black, or other race), sex, area deprivation index, BMI, smoking status (current smoker, former smoker, or never smoker), use of long-term care (including nursing homes and assisted-living centres), number of outpatient and inpatient encounters, and number of HbA1c measurements. Comorbidities such as cancer, cardiovascular disease, cerebrovascular disease, chronic lung disease, dementia, HIV, hyperlipidaemia, and peripheral artery disease were also included as predefined covariates. We also adjusted for laboratory test results including estimated glomerular filtration rate (eGFR) and HbA1c; vital signs including systolic and diastolic blood pressure; and medications including the use of steroids. Missingness of BMI, blood pressure, eGFR, and HbA1c was 1·02%, 1·28%, 6·20%, and 15·43%, respectively. Mean imputation conditional on age, race, sex, and group assignment was applied to missing values, and continuous variables were transformed into restricted cubic spline functions to account for potential non-linear relationships.
We obtained all patient encounter data, prescription data, and laboratory data for the cohort of participants within 1 year before T0. We classified more than 70 000 ICD-10 diagnosis codes into 540 diagnostic categories based on the Clinical Classifications Software Refined (version 2021.1), which is developed as part of the Healthcare Cost and Utilization Project sponsored by the Agency for Healthcare Research and Quality.
We classified 3425 medications, on the basis of the VA drug classification system, into 543 medication classes.
In total, 62 laboratory test abnormalities from 38 laboratory measurements were identified on the basis of LOINC. Because rare conditions occurring in fewer than 100 people in a group might not be sufficiently substantial to describe the characteristics of the group, only diagnoses, medications, or laboratory test abnormalities that occurred in more than 100 participants within each group, and that were not already included as predefined variables, were used to estimate the univariate relative risk for COVID-19 group assignment.
The top 100 variables with the strongest univariate relative risk were selected.
The selection process was done independently for COVID-19 versus contemporary control groups, and COVID-19 versus historical control groups.
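The algorithmic selection step can be sketched as follows. This is a minimal illustration, assuming that the relative risk compares the probability of being in the COVID-19 group among participants with versus without a variable, and that "strongest" means the largest absolute log relative risk; the function name `select_variables` and the example counts are hypothetical.

```python
import math

def select_variables(counts, n_covid, n_control, min_events=100, top_k=100):
    """Rank candidate variables by univariate relative risk of COVID-19 group assignment.

    counts: {variable: (n_with_variable_in_covid, n_with_variable_in_control)}.
    Variables must occur in more than `min_events` participants in each group.
    """
    scored = []
    for name, (a, b) in counts.items():
        if a <= min_events or b <= min_events:
            continue  # too rare in at least one group
        without_covid = n_covid - a
        without_total = without_covid + (n_control - b)
        if without_covid <= 0 or without_total <= 0:
            continue  # relative risk undefined
        risk_with = a / (a + b)                      # P(COVID-19 group | variable present)
        risk_without = without_covid / without_total  # P(COVID-19 group | variable absent)
        rr = risk_with / risk_without
        # Assumption: strength is distance from the null on the log scale.
        scored.append((abs(math.log(rr)), name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

# Illustrative (made-up) counts for four candidate variables.
example_counts = {"A": (500, 1000), "B": (150, 1500), "C": (50, 5000), "D": (120, 4000)}
selected = select_variables(example_counts, n_covid=1000, n_control=10000, top_k=2)
```

In this sketch the function would be called twice, once per comparison, mirroring the independent selection for each control group.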
Statistical analyses
Inverse probability weighting was then applied to a Cox survival model to estimate the association between COVID-19 and diabetes outcomes. Two measures of risk were estimated: adjusted hazard ratios (HRs) and excess burdens. To generate the excess burdens, burdens of diabetes outcomes at 12 months in each group were estimated based on the survival probability at 12 months of follow-up. Excess burdens per 1000 people at 12 months from COVID-19 compared with controls were estimated based on the difference in survival probability between groups and transformed into an event rate difference. Comparisons were done between COVID-19 and contemporary control groups, and independently between COVID-19 and historical control groups. The analyses were then repeated in subgroups based on age (≤65 years and >65 years), race (White and Black; subgroup analyses for other race category were not done because of the heterogeneity within this category), sex (male and female), BMI categories (>18·5 to ≤25 kg/m2; >25 to ≤30 kg/m2; and >30 kg/m2), area deprivation index quartiles, and diabetes risk score quartiles. A diabetes risk score was built using logistic regression to predict the probability of having a composite diabetes outcome within 1 year. The risk score was built within control groups based on diabetes risk factors including age, race, sex, BMI, HbA1c, cardiovascular disease, hypertension, and hyperlipidaemia status. The risk score was then applied to the COVID-19 group to evaluate the risk of diabetes outcomes before exposure to COVID-19.
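The conversion from survival probability to burden and excess burden can be sketched as below. The study estimated survival from inverse probability weighted Cox models; for illustration only, this sketch swaps in a simpler weighted Kaplan-Meier estimator, and all function names and example values are hypothetical.

```python
def weighted_km_survival(times, events, weights, horizon):
    """Weighted Kaplan-Meier survival probability at `horizon` (a simplified stand-in).

    times: follow-up time per participant; events: 1 if the outcome occurred, 0 if censored;
    weights: inverse probability weight per participant.
    """
    records = sorted(zip(times, events, weights))
    at_risk = sum(weights)
    surv = 1.0
    i = 0
    while i < len(records) and records[i][0] <= horizon:
        t = records[i][0]
        weighted_events = 0.0
        leaving = 0.0
        # Pool all participants whose follow-up ends at the same time t.
        while i < len(records) and records[i][0] == t:
            _, event, w = records[i]
            if event:
                weighted_events += w
            leaving += w
            i += 1
        if at_risk > 0:
            surv *= 1.0 - weighted_events / at_risk
        at_risk -= leaving
    return surv

def burden_per_1000(survival_prob):
    """Event rate per 1000 people implied by a survival probability."""
    return (1.0 - survival_prob) * 1000.0

def excess_burden_per_1000(surv_covid, surv_control):
    """Difference in event rates per 1000 people between the two groups."""
    return burden_per_1000(surv_covid) - burden_per_1000(surv_control)

# Illustrative (made-up) 12-month survival probabilities.
example_excess = excess_burden_per_1000(surv_covid=0.95, surv_control=0.96)
```

With these made-up inputs, 5% versus 4% of participants have the outcome by 12 months, which corresponds to roughly 10 additional events per 1000 people.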
To gain a better understanding of which subgroups with COVID-19 are more likely to have post-acute COVID-19 diabetes events, we estimated the effect of risk factors including diabetes risk scores, age, race, cardiovascular diseases, hypertension, hyperlipidaemia, prediabetes status (HbA1c >5·6% and <6·4%), and BMI categories on diabetes outcomes within 30-day survivors of COVID-19. We constructed logistic regressions within each COVID-19 subgroup to estimate the probability of assignment to the target population, conditional on covariates other than the subgrouping definition. Inverse probability weightings were then computed, and survival models were used to examine the HRs and burdens of these risk factors on diabetes outcomes.
We then separated the COVID-19 group into three mutually exclusive groups based on the care setting of the acute phase of the disease; that is whether people were non-hospitalised, hospitalised, or admitted to intensive care during the first 30 days after a COVID-19 positive test. Logistic regressions were applied to each care setting group to estimate the inverse probability weights. Cox survival models with inverse probability weighting were then applied and HRs, burdens, and excess burdens were reported.
To test the robustness of our findings, we applied an alternative analytic plan. Only cohort participants with complete data and at least 12 months of follow-up were selected and censored at 12 months (COVID-19 group n=62 110 and contemporary control group n=1 277 659). Multinomial logistic regression adjusting for predefined covariates was used to estimate the propensity scores for cohort participants. Average treatment effect weights were then constructed from the propensity score with stabilisation based on proportions of each group in the overall cohort. Weighted logistic regressions were then applied to estimate the odds ratios and predicted probabilities of having the outcome. Variance was estimated through generalised estimating equation, which considers the within-participant correlation after weightings.
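The stabilised weights described in this alternative plan can be sketched as follows, assuming the propensity scores have already been estimated (for example by a multinomial logistic regression, which is not shown). Each participant's weight is the overall proportion of their group divided by their estimated probability of belonging to that group; the function name and the example values are hypothetical.

```python
def stabilised_weights(groups, propensity):
    """Stabilised average treatment effect weights.

    groups: list of group labels, one per participant.
    propensity: list of dicts, one per participant, mapping each group label
                to the estimated probability of that group given covariates.
    """
    n = len(groups)
    # Marginal proportion of each group in the overall cohort (the stabilising numerator).
    marginal = {g: groups.count(g) / n for g in set(groups)}
    return [marginal[g] / p[g] for g, p in zip(groups, propensity)]

# Illustrative (made-up) cohort of four participants.
example_groups = ["covid", "control", "control", "control"]
example_scores = [
    {"covid": 0.50, "control": 0.50},
    {"covid": 0.25, "control": 0.75},
    {"covid": 0.25, "control": 0.75},
    {"covid": 0.50, "control": 0.50},
]
example_weights = stabilised_weights(example_groups, example_scores)
```

Stabilisation keeps the weights centred near 1, which typically reduces the variance of the weighted estimates compared with unstabilised inverse probability weights.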
Sixth, we applied the doubly robust adjustment method to further adjust for covariates after applying inverse probability weighting. Seventh, to further account for missing data, we applied multiple imputation to generate ten imputed datasets based on fully conditional specification regression method and estimated results.
Eighth, to remove the influence of steroid use during the acute phase of the infection, we additionally adjusted for steroid use during the acute phase of the infection. Finally, to reduce the bias associated with increased surveillance for COVID-19 patients during follow-up, we additionally adjusted for the number of outpatient visits, number of hospitalisations, and number of HbA1c measurements during follow-up as time-varying variables.
Robust sandwich estimators were used to estimate variances when weightings were applied. For all analyses, a 95% CI that excluded unity or a p value of less than 0·05 was considered evidence of statistical significance. Analyses were done using SAS Enterprise Guide (version 8.2) and results were visualised using SAS Enterprise Guide (version 8.2) and R (version 4.0.4).
Role of the funding source
The funder of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report.
Results
There were 4 299 721 US veterans in the cohort overall, recruited from March 1, 2020, to Sept 30, 2021; 181 280 were in the COVID-19 group and 4 118 441 were in the contemporary control group. The median follow-up time was 352 (IQR 244–406) days in the COVID-19 group and 352 (245–406) days in the contemporary control group, corresponding to 163 881 person-years and 3 763 155 person-years of follow-up, respectively.
To test the consistency of the results, we also built a historical cohort of 4 286 911 participants followed up for a median of 352 (IQR 245–406) days, corresponding to 3 916 979 person-years of follow-up.
Table: Demographic and health characteristics of the COVID-19, contemporary control, and historical control groups after adjustment
Data are mean (SD) or n (%).

Figure 1: Risks and burdens of post-acute COVID-19 diabetes outcomes compared with the contemporary control group
The outcomes were ascertained from day 30 after COVID-19 infection until the end of follow-up. Adjusted hazard ratios and 95% CIs are presented in a base 10 logarithmic scale. Adjusted event rates per 1000 people at 12 months for the COVID-19 group and the contemporary control group, and the excess burden per 1000 people at 12 months and related 95% CIs are also presented.

Figure 2: Risks and burdens of post-acute COVID-19 diabetes outcomes by severity of the acute infection compared with the contemporary control group
Severity of the acute infection was defined as non-hospitalised (blue), hospitalised (purple), and admitted to intensive care (orange). The outcomes were ascertained from day 30 after COVID-19 infection until the end of follow-up. Adjusted hazard ratios and 95% CIs are presented in a base 10 logarithmic scale. Adjusted event rates per 1000 people at 12 months for each care setting during the acute infection, contemporary control group, and excess burden per 1000 people at 12 months and related 95% CIs are also presented.

Figure 3: Risks of post-acute diabetes outcomes among people with COVID-19
(A) Diabetes risk score quartile. (B) Individual risk factors including age, race, cardiovascular disease, hypertension, hyperlipidaemia, prediabetes, and BMI. The outcomes were ascertained from day 30 after COVID-19 infection until the end of follow-up. Adjusted hazard ratios and 95% CIs are presented in a base 10 logarithmic scale. Excess burden per 1000 people at 12 months and 95% CIs are also presented.
Discussion
In this study involving participants with COVID-19, contemporary controls, and historical controls, we provide evidence that suggests that beyond the first 30 days of infection, COVID-19 survivors exhibited increased risks and burdens of incident diabetes, and antihyperglycaemic use. The risks and burdens of all outcomes were significant among those non-hospitalised and increased in a graded fashion according to the care setting of the acute phase of the infection. The risks and burdens were also consistent in comparisons versus a historical control group. Altogether, our results indicate that beyond the acute phase of COVID-19, survivors are at an increased risk of developing incident diabetes and antihyperglycaemic use; therefore diabetes should be considered as a component of the multifaceted long COVID. Post-acute care strategies of people with COVID-19 should also integrate screening and management of diabetes.
These absolute numbers might translate into substantial overall population-level burdens and could further strain already overwhelmed health systems. Governments and health systems around the world should be prepared to screen and manage the glycometabolic sequelae of COVID-19. Although the optimal composition of post-acute COVID clinics is still not clear, evidence from this report indicates that they should include attention to and care for diabetes.
Our approach examines the risks and burdens of diabetes in comparisons versus a contemporary control group exposed to the same contextual forces of the pandemic (eg, economic, social, and environmental stressors) and a historical control group from a pre-pandemic era that represents a baseline unaffected by the pandemic. COVID-19 consistently exhibited an increased risk of diabetes in comparisons versus both the contemporary and historical control groups, suggesting enhanced vulnerability to diabetes among people with COVID-19.
Our subgroup analyses suggest that even people with a low risk of diabetes before exposure to COVID-19 exhibited increased risk compared with both contemporary and historical controls. In addition, our analyses of who is at risk of diabetes among people with COVID-19 suggest a graded association between COVID-19 and diabetes according to baseline risk of diabetes. This suggests that diabetes could manifest in people at low risk (compared with controls), and that COVID-19 could amplify baseline risks and further accelerate manifestation of disease among individuals already at high risk.
Additionally, the risk of new diabetes was higher in COVID-19 than in those with pre-pandemic acute respiratory infections.
This study did not report the proportion of type 1 or type 2 diabetes.
An analysis, which has not yet been peer reviewed, of 1·8 million people aged younger than 35 years suggested increased risk of type 1 diabetes within, but not beyond, the first 30 days after SARS-CoV-2 infection.
Studies in adults are generally more concordant and show evidence of increased risk of diabetes in people with COVID-19.
Our study sheds light on this and provides evidence of increased risk in adults among both non-hospitalised and hospitalised individuals at 1 year after COVID-19 diagnosis; and that most (>99%) of diagnoses of diabetes in our cohort relate to type 2 diabetes.
Evidence suggests that SARS-CoV-2 can infect and replicate in insulin-producing pancreatic beta cells subsequently resulting in impaired production and secretion of insulin.
However, in-vitro SARS-CoV-2-infected human pancreatic islets exhibit largely non-cytopathic, modest cellular perturbations and inflammatory responses, suggesting that direct infection of pancreatic cells is, on its own, unlikely to fully explain new-onset diabetes in people with COVID-19.
Other potential explanations include autonomic dysfunction, hyperactivated immune response or autoimmunity, and persistent low-grade inflammation leading to insulin resistance.
It is also possible that people with COVID-19 might have differentially experienced some of the broader contextual changes (social, economic, environmental, and other) that characterised the pandemic and that might have indirectly contributed to shaping the outcomes evaluated in this study.
There are several strengths of this study. We leveraged the breadth and depth of the US Department of Veterans Affairs electronic health-care databases to build a large national cohort of veterans, without a history of diabetes, to investigate the association between COVID-19 and risks of diabetes outcomes. We tested the association using two large controls (contemporary and historical controls), an approach that allowed us to deduce that the associations between COVID-19 and risks of diabetes are not related to the broader temporal changes between the pre-pandemic and the pandemic eras, but rather related (possibly through both a direct and indirect pathway) to exposure to COVID-19 itself. Our covariates specification approach included 22 predefined variables selected based on previous evidence and 100 algorithmically selected variables from high dimensional data domains including diagnostic codes, prescription records, and laboratory test results. We evaluated several incident diabetes outcomes across the continuum of the severity scale, including diabetes diagnoses and initiation of antihyperglycaemic therapy. We tested robustness of our approach in multiple sensitivity analyses, and successfully applied positive and negative outcome controls. We provided estimates of risks on both the ratio scale (HRs) and the absolute scale (burden per 1000 people at 12 months). The absolute scale also reflects the contribution of baseline risk and provides an estimate of potential harm that is more easily explainable to the general public than risk reported on the ratio scale (eg, HR).
In conclusion, we suggest that in the post-acute phase of the disease, people with COVID-19 exhibit increased risk and burden of diabetes, and antihyperglycaemic use. The risks and burdens were evident among those who were non-hospitalised during the acute phase of the infection and increased according to the severity of the acute infection as proxied by the care setting (non-hospitalised, hospitalised, and admitted to intensive care). Taken together, current evidence suggests that diabetes is a facet of the multifaceted long COVID syndrome and that post-acute care strategies of people with COVID-19 should include identification and management of diabetes.
ZA-A was responsible for the research and study design; administrative, technical, or material support; and supervision and mentorship. YX acquired the data and did the statistical analysis. YX and ZA-A were responsible for the data analysis and interpretation, drafting the manuscript, and critical revision of the manuscript. Each author contributed important intellectual content during manuscript drafting or revision, and accepts accountability for the overall work by ensuring that questions pertaining to the accuracy or integrity of any portion of the work are appropriately investigated and resolved. ZA-A takes responsibility that this study has been reported honestly, accurately, and transparently; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned have been explained. Both authors had full access to all the data, and both have verified the accuracy of all underlying data. Both authors had final responsibility for the decision to submit for publication.