Federal IT operations time and again have proved to be the resilient lifeline to vital citizen services during the COVID-19 pandemic. As government and society map the path to recovery, MeriTalk is chronicling the untold stories and lessons of how government IT has innovated on the fly to meet the demands of the crisis and anchor more resilient services going forward. In the latest installment of CIO Crossroads, we turn to the Department of Health and Human Services (HHS) five months into the fight.
HHS Pivots From Cyber Defense to Virus Relief – CIO Q&A
With the primary missions of protecting the health of all Americans and providing essential human services, HHS bore the brunt of the coronavirus pandemic. Its component agencies – including the Centers for Disease Control and Prevention (CDC), Food and Drug Administration (FDA), National Institutes of Health (NIH), and Centers for Medicare & Medicaid Services (CMS) – live on the front lines of public health.
Jose Arrieta is CIO and Chief Data Officer (CDO) for HHS, which employs 83,000 staff and nearly as many contractors. Arrieta, who said this month he plans to leave the agency soon, tells the story of HHS’ IT missions in two chapters. The first is a race for resiliency in the face of massive cyberattacks early in the pandemic. The second is the birth of a COVID-19 data gathering and analytics operation that promises to pave the nation’s way out of the crisis.
The CIO-level effort involved additions of data circuits, big jumps in VPN capacity, firewall upgrades, and a transition to 95 percent telework on the day after one of the largest cyber assaults the agency has ever seen.
At the CDO level, Arrieta explains in an interview with MeriTalk how HHS is taking in billions of crucial healthcare data elements from thousands of sources, creating access management systems to ensure privacy and security, and leveraging machine-learning technologies to drive higher-value insights as science races for the cure.
MeriTalk: As CIO and CDO at HHS, what have been some of your largest priorities and successes during the last five months, and what accomplishments are you proudest of?
Arrieta: I’m most proud of our decision to add circuit capability across the HHS internet at the beginning of the pandemic. We have one of the largest internet service areas in the world because we’re a data sharing organization. Then we upgraded our VPN capabilities, and we upgraded our firewalls. We made the investments to make our network resilient.
Just as we were finalizing those upgrades, we experienced what was probably the largest cyber event in the public history of the U.S. government – not including anything that’s classified. [Ed.: reported to be a large-scale, foreign-directed distributed denial-of-service attack on HHS networks.] The cyber event was on a Sunday, and teleworking started on Monday. Through that event – and a number of others after that – the network proved resilient.
I’m very proud of the fact that we made those investments in advance, and we were able to push maximum telework at HHS without a blip in service.
MeriTalk: What can you put on the record for us about the cyberattack?
Arrieta: The cyber event we experienced over 18 hours was enormous in size, scale, and scope. In less than 24 hours, we defeated an attack that was 20 million times greater than anything we had seen previously. The team did an amazing job. The government community really came to our aid. Federal CIO Suzette Kent was a key player in that, and I’ll always be appreciative of her leadership. The Cybersecurity and Infrastructure Security Agency (CISA) was an amazing partner. They immediately mobilized and provided tools and support. We also partnered with entities within the Defense Department (DoD) for support. In the face of follow-on events over the next two months, we were extremely resilient.
MeriTalk: How many HHS employees are teleworking, and how much did the agency have to boost VPN capacity to handle that?
Arrieta: We have 83,000 employees and 80,000 contractors, and we got to well over 95 percent teleworking.
To put the VPN question in perspective, HHS is the sixth largest hospital system in the world and the largest funder of research on the face of the Earth, based upon the volume of data that we collect. Each of our agencies has different functions and needs. We ensured that the entire NIH workforce would be able to leverage the VPN multiple times over, because it’s a data-sharing entity. Same thing with the FDA. In organizations that are not moving as much data, we increased it one and a half times over.
MeriTalk: The HHS Protect project for sharing pandemic data has become a very important resource in responding to COVID-19. How did that come about?
Arrieta: We started working on HHS Protect on April 5 after receiving a proof of concept from the CDC, and we launched it on April 10. All 50 states and all six territories are sharing data with it and taking data from it.
In addition to providing data, HHS Protect is driving clinical trials in the sense that the data is identifying areas where outbreaks may occur, and that is correlated with successful clinical trials and speeding them up. We named it HHS Protect because it’s about protecting people.
MeriTalk: Can you walk us through some of the data science behind it and some of the project goals?
Arrieta: We want to turn the United States inside out from a public health perspective, and create visibility so that first responders, policymakers, and community leaders understand what is actually happening across the country, so they can respond better.
There’s no [protected] health information and no personally identifiable information on HHS Protect, but here’s what we do have: We have data on hospital bed availability and ventilator usage. We have all of the commercial and lab test data in the United States, and we have 80 to 85 percent of the private hospital lab and tribal lab data in the United States. We regularly receive data from between 5,000 and 5,800 hospitals. About 50 percent of data we’re pulling from open-source capabilities that exist in the marketplace. The other 50 percent are datasets within Federal, state, and local partners or commercial partners.
MeriTalk: Can you share some metrics from the project?
Arrieta: By April 28 we reached 2.5 billion data elements, and since then have gotten to 4 billion data elements. That’s pretty awesome. I’m very proud of my team.
Not only did we amass the data, but we onboarded Federal users first, then state and local users. Now, we’re even offering it for insight to congressional leaders. And it’s been built, managed, and administered by Federal employees [and Federal contractors], all non-political.
MeriTalk: What are some of the technical considerations?
Arrieta: We’ve used modern technology for identity access management. Every agency has their own identity access management system – and some agencies have multiple systems – so we layered on top of that. Once we authenticate somebody and they have access to HHS Protect, we have an additional authentication step to access discrete data sets. Protecting privacy is a cornerstone of what we are doing.
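[Ed.: The layered check Arrieta describes, a platform sign-in followed by a separate step for each discrete data set, can be illustrated with a minimal sketch. This is not HHS Protect code; the names `User`, `platform_authenticated`, `dataset_grants`, and `can_open_dataset` are hypothetical and chosen only for illustration.]

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical user record; not an actual HHS Protect data structure."""
    name: str
    # Step 1: set once the user has signed in to the platform.
    platform_authenticated: bool = False
    # Step 2: data sets the user has separately been cleared to open.
    dataset_grants: set = field(default_factory=set)


def can_open_dataset(user: User, dataset: str) -> bool:
    """Allow access only when both layers have been satisfied."""
    if not user.platform_authenticated:
        return False
    return dataset in user.dataset_grants


analyst = User("analyst", platform_authenticated=True)
analyst.dataset_grants.add("hospital_capacity")
```

In this sketch, signing in to the platform alone is never enough: `can_open_dataset(analyst, "lab_results")` stays false until that data set is explicitly added to the user's grants.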
We want to share data, so we have a modern, secure file transfer capability leveraging another commercial technology that we’re molding together. With that file transfer capability, a user has a folder and controls the data in it, and then has visibility into who wants access to that data and the ability to approve it. It’s another layer of privacy and security controls.
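[Ed.: As a rough illustration of the owner-controlled folder described above, the sketch below models a folder whose owner sees pending requests and approves them. The class `SharedFolder` and its methods are hypothetical; the commercial file transfer product HHS uses is not named in the interview and is not represented here.]

```python
from dataclasses import dataclass, field


@dataclass
class SharedFolder:
    """Hypothetical folder in which the owner controls who may read the data."""
    owner: str
    pending: set = field(default_factory=set)    # requests the owner can see
    approved: set = field(default_factory=set)   # users the owner has approved

    def request_access(self, user: str) -> None:
        """Record a request so that the owner has visibility into it."""
        if user != self.owner and user not in self.approved:
            self.pending.add(user)

    def approve(self, acting_user: str, requester: str) -> None:
        """Only the owner may approve, and only for a request that exists."""
        if acting_user != self.owner:
            raise PermissionError("only the folder owner can approve access")
        if requester not in self.pending:
            raise ValueError("no pending request from this user")
        self.pending.discard(requester)
        self.approved.add(requester)

    def can_read(self, user: str) -> bool:
        return user == self.owner or user in self.approved


folder = SharedFolder(owner="state_epidemiologist")
folder.request_access("federal_analyst")
folder.approve("state_epidemiologist", "federal_analyst")
```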
We also want to create a level of transparency that’s greater than any other Federal system. To do that, when the data hits our platform, we create a record of lineage that is timestamped on a hash for the entire time that data exists on our platform. When data is curated, parsed, or shared, a record of those behaviors is timestamped down to the half-second. It’s completely transparent. It runs in a commercial cloud environment, which is moderate level.
Then, we took it a step further. When you share data, you want to be able to protect the integrity of that data. So we took that hash and we put a QR code on top of it. Now, when we share a data element or a dataset, if anyone tries to change the underlying data, we have a record. That is a very powerful set of capabilities.
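[Ed.: The lineage and integrity ideas above can be sketched with standard cryptographic hashing. This is a minimal illustration, not the HHS implementation: the use of SHA-256, the chaining of each record to the previous one, and the names `LineageLog`, `fingerprint`, and `matches_shared_hash` are all assumptions. Encoding the hash as a QR code is omitted because it would need a third-party library.]

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest of the data (an assumed choice of hash)."""
    return hashlib.sha256(data).hexdigest()


class LineageLog:
    """Hypothetical append-only log of timestamped, hash-linked events."""

    def __init__(self):
        self.records = []

    def record(self, action: str, data: bytes, when=None) -> dict:
        """Log an action (e.g. ingested, curated, shared) against the data."""
        when = when or datetime.now(timezone.utc)
        entry = {
            "action": action,
            "data_hash": fingerprint(data),
            "timestamp": when.isoformat(),
            # Linking to the previous record is an assumption of this sketch.
            "prev_hash": self.records[-1]["record_hash"] if self.records else "",
        }
        entry["record_hash"] = fingerprint(json.dumps(entry, sort_keys=True).encode())
        self.records.append(entry)
        return entry

    def is_intact(self) -> bool:
        """Recompute every record hash and link; False if anything was altered."""
        prev = ""
        for entry in self.records:
            body = {k: v for k, v in entry.items() if k != "record_hash"}
            if entry["prev_hash"] != prev:
                return False
            if fingerprint(json.dumps(body, sort_keys=True).encode()) != entry["record_hash"]:
                return False
            prev = entry["record_hash"]
        return True


def matches_shared_hash(data: bytes, shared_hash: str) -> bool:
    """Check a received dataset against the hash published when it was shared."""
    return fingerprint(data) == shared_hash
```

A recipient who holds the hash that accompanied a shared dataset can call `matches_shared_hash` on the bytes received. Any change to the underlying data produces a different digest, which is the property the interview describes as having "a record" of attempted changes.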
We bolted commercial technologies together to do all of this, and we did it in-house. I am super proud of that.
MeriTalk: How about technology closer to the contributor or user level?
Arrieta: We have a mapping technology that enables us to look at those 4 billion data elements from 225 different datasets all the way down to the building level. In addition, we have multiple collection mechanisms. Instead of making the health sector adjust to us, we adjust to the health sector, so that we can receive the data. We’re very flexible in terms of how we integrate with a hospital, a state that’s submitting information to a lab, or an Internet of Things-enabled device.
MeriTalk: What are some of the payoffs?
Arrieta: We are running a supervised machine learning platform using the 4 billion data elements. We can model 100 outcomes and 25 simulations on each outcome every minute. Every user that’s doing any type of modeling on HHS Protect is not only sharing their model, but also any insights.
Public health surveillance is all about being able to see the outbreak as it happens or before it happens. That’s what allows you to make decisions that protect people.
If you move lab supplies in advance to a place where there’s going to be an outbreak, it has a significant impact on behavior because if I get tested and find out that I have COVID-19, I’m less likely to go out and infect other people. If clinical trials take a long time, more people are impacted by disease. But if you have insight into where an outbreak is going to occur so that you can drive the clinical trials on the basis of certain metrics, you can finish the clinical trials faster.
MeriTalk: What’s the technology behind how the data is collected, and how has that evolved?
Arrieta: We had four different ways that we collected data from hospitals. One was a small form on HHS Protect for hospitals that fax, one was direct integration with states, one was a system at CDC called the National Healthcare Safety Network, or NHSN, and the fourth was HHS TeleTracking. Imagine a series of tentacles pushing data into HHS Protect. We decided to use TeleTracking to collect data from hospitals. We are getting information from between 4,700 and 5,800 hospitals per day. We want to get to the total universe of 6,200 to 6,400 hospitals.
Another thing I’m super proud of is the steps we have taken to dashboard the data we are receiving. We’ve taken the raw data and we published it in a location where it can be shared with the scientific community and aspiring scientists. They can use the data and the modeling work that we did to test our model or run their own model. That is a big step in the direction of transparency and data sharing. I’m very proud that we are generating a conversation across the United States on the importance of data sharing.
MeriTalk: What is your greatest lesson learned since the pandemic? If you could time travel back to four and a half months ago, what would have been your advice to get ready?
Arrieta: Sleep is very important. Your wife is extremely important. Seeing your kids is important. We’ve been working seven days a week since March 15 in the technology function. It has just been a grind.
In all seriousness, though, the biggest lesson learned for me is that agencies need to modernize their identity and access management (IAM) capabilities. As I learned more about the cyber event that we experienced and I reached out to CIOs and Chief Information Security Officers (CISOs) in large companies, I found that everybody is experiencing these types of events, and they’re focused on IAM. It’s definitely a weak point, and the lack of flexibility with identity access management and authentication makes it very hard to share data and to do supervised machine learning and unsupervised machine learning.
Supervised machine learning and unsupervised machine learning require flexible identity access management authentication capabilities. They also require flexible data sharing and integration tool sets. Instead of re-platforming everything, if you focus on connecting an ecosystem so you can bring data together to run predictive analytics, you can dramatically and dynamically change your view of what’s going on within your organization. It’s pretty amazing when you think about it.
A second lesson learned is that building in pieces by integrating commercial technologies allows you to go so much faster. You really have to commit to objective-based contracting. When you do that, it puts a lot of pressure on you and your staff because it requires a lot of communication. It can be absolutely exhausting, but if you find a company that you’re comfortable interacting with in that way – going through the competitive process – you can make a difference very quickly by integrating technologies and not relying so much on an integrator. There are roles for integrators, but forcing yourself to walk in those shoes is extremely important.
MeriTalk: Big picture, what do you think will change in our government and in our society moving forward because of the COVID-19 experience?
Arrieta: I think you’ll see increased prevalence of distributed ecosystems. The pandemic has forced an investment in technology. That investment for us has been focused on a response, but it forced us to get comfortable with some modern technologies that otherwise would have taken years to adopt.
In the healthcare space, I think it’s going to fundamentally change the way healthcare is delivered. I think the pandemic will force the healthcare system to become more patient-centric. I think there will be modernized infrastructure that allows for digital identity amongst machines, testing devices, government, and individuals. When you empower individuals with the ability to maintain their health information and connect with the doctor or pharmacy electronically, that fundamentally shifts the industry. The pandemic has driven that change.
When we think about the cyber events that we survived at HHS, one of the reasons we were successful is because we have multiple Trusted Internet Connection Access Providers (TICAPs), and they gave us a lot of resiliency and redundancy. It may look fragmented, but it became the ideal way to deliver a solution, and it became the ideal platform to modernize on. We wouldn’t have known that if we had not experienced this pandemic.
MeriTalk: Would you like to give any shout outs to your team members at HHS or other departments that have been particularly valuable as you navigate the pandemic response?
Arrieta: There are so many people to thank. The CISA team from the Department of Homeland Security has been amazingly supportive. Suzette Kent and Greg Schneider at the Office of Management and Budget (OMB). A number of folks at OMB were amazingly supportive. Our industry partners have been really amazing – they’ve been working just as hard. The direct members of my team – Perryn Ashmore, Robin Collins, John Shoe, Janet Vogel, Kim Kalik, Kevin Duvall – and my entire team. Our service team made sure everybody had a laptop and Internet connectivity and made sure we got phones and computers out rapidly for all the folks that are helping us in the response. You can’t do any of this without contracting; we’ve had wonderful support from the contracting function. Michael McFarland, James Simpson, and Jen Browning have been absolutely fantastic. The HHS secretary and deputy secretary have been very supportive. Staffers Nick Uehlecke and Will Brady have really helped us navigate. When you’re working at this speed – and we are fundamentally changing our entire business model at HHS – there’s not a lot of time to communicate. It’s a huge asset to be able to quickly convey a message and know that message will be communicated to the right people so you can focus on executing. Our partnerships with DoD and the Army have been wonderful. And of course, my family members, who have been extremely supportive.