2022 Federal Index
Data
Did the agency collect, analyze, share, and use high-quality administrative and survey data, consistent with strong privacy protections, to improve (or help other entities improve) outcomes, cost effectiveness, and/or the performance of federal, state, local, and other service provider programs in FY22 (examples: model data-sharing agreements or data-licensing agreements, data tagging and documentation, data standardization, open data policies, and data use policies)?
Score
8
Millennium Challenge Corporation
5.1 Did the agency have a strategic data plan, including an open data policy [example: Evidence Act 202(c), Strategic Information Resources Plan]?
- In FY22, MCC is continuing to develop a strategic data plan. As detailed on the Digital Strategy and Open Government pages of the MCC website, MCC promotes transparency to provide people with access to information that facilitates their understanding of MCC’s model, MCC’s decision-making processes, and the results of MCC’s investments. Transparency, and therefore open data, is a core principle for MCC because it is the basis for accountability, provides strong checks against corruption, builds public confidence, and supports informed participation of citizens.
- As a testament to MCC’s commitment to and implementation of transparency and open data, the agency was again the highest-ranked bilateral donor and U.S. government agency for the sixth consecutive Index. In addition, the U.S. government is part of the Open Government Partnership, is a signatory to the International Aid Transparency Initiative, and must adhere to the Foreign Aid Transparency and Accountability Act. All of these initiatives require foreign assistance agencies to make it easier to access, use, and understand data. They have also created further impetus for MCC’s work in this area by establishing specific goals and timelines for adoption of transparent business processes.
- Additionally, MCC convenes an internal Data Governance Board, an independent group consisting of representatives from departments throughout the agency, to streamline its approach to data management and advance data-driven decision-making across its investment portfolio.
5.2 Did the agency have an updated comprehensive data inventory (example: Evidence Act 3511)?
- Through its Open Data Catalog, which includes an enterprise data inventory of all data resources across the agency for release of data in open, machine-readable formats, MCC makes extensive program data, including financials and results data, publicly available. The Department of Policy and Evaluation leads the MCC Disclosure Review Board process for publicly releasing the de-identified microdata that underlie the independent evaluations on the MCC Evidence Platform, following MCC’s Microdata Management Guidelines to ensure appropriate balance in transparency efforts with the protection of human subjects’ confidentiality.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement [examples: model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c)]?
- The new Evidence Platform, which links and provides access to all of MCC’s microdata from evaluation packages, offers a first-of-its-kind data enclave for users to access and use public and restricted-use data. The virtual data enclave connects datasets to qualitative reports and results for integrated research and learning. The platform encourages research, learning, and reproducibility and connects datasets to analytical products across the portfolio. In addition to the Evidence Platform, MCC’s Data Analytics Program enables enterprise data-driven decision-making through the capture, storage, analysis, publishing, and governance of MCC’s core programmatic data. It streamlines the agency’s data life cycle, facilitating increased efficiency. Additionally, it promotes agency-wide coordination, learning, and transparency. For example, MCC has developed custom software applications to capture program data, established the infrastructure for consolidated storage and analysis, and connected robust data sources to end-user tools that power up-to-date dynamic reporting and also streamline content maintenance on MCC’s public website. As a part of this effort, the M&E team has developed an Evaluation Pipeline application that provides up-to-date information on the status, risk, cost, and milestones of the full evaluation portfolio for better performance management.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information (example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)?
- The corporation’s Disclosure Review Board ensures that data collected from surveys and other research activities are made public according to relevant laws and ethical standards that protect research participants while recognizing the potential value of the data to the public. The board is responsible for reviewing and approving procedures for the release of data products to the public; reviewing and approving data files for disclosure; ensuring that de-identification procedures adhere to legal and ethical standards for the protection of research participants; and initiating and coordinating any necessary research related to disclosure risk potential in individual, household, and enterprise-level survey microdata on MCC’s beneficiaries.
- The Microdata Evaluation Guidelines inform MCC staff and contractors, as well as other partners, about how to store, manage, and disseminate evaluation-related microdata. These microdata are distinct from other data MCC disseminates because they typically include personally identifiable information and sensitive data as required for independent evaluations. With this in mind, MCC’s Guidelines govern how to manage three competing objectives: share data for verification and replication of the independent evaluations, share data to maximize usability and learning, and protect the privacy and confidentiality of evaluation participants. These guidelines were established in 2013 and updated in January 2017. Following these guidelines, MCC has publicly released 117 de-identified, public use, microdata files for its evaluations and evidence studies. It also has 25 restricted data packages cleared by the Disclosure Review Board that it can make accessible on the new MCC Evidence Platform. The corporation’s experience with developing and implementing this rigorous process for data management and dissemination while protecting human subjects throughout the evaluation life cycle is detailed in Opening Up Evaluation Microdata: Balancing Risks and Benefits of Research Transparency. MCC is committed to ensuring transparent, reproducible, and ethical data and documentation and seeks to further encourage data use through its new Evidence Platform.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- Both MCC and its partner in-country teams produce and provide data that are continuously updated and accessed. The MCC website is routinely updated with the most recent information, and in-country teams are required to do the same on their websites. As such, all MCC program data are publicly available on MCC’s website and individual Millennium Challenge Account (MCA) websites for use by MCC country partners and other stakeholder groups. As a part of each country’s program, MCC provides resources to ensure that data and evidence are continually collected, captured, and accessed. In addition, each project’s evaluation has an Evaluation Brief that distills key learning from MCC-commissioned independent evaluations. Select Evaluation Briefs have been posted in local languages, including Mongolian, Georgian, French, and Romanian, to better facilitate use by country partners.
- Millennium Challenge Corporation also has a partnership with the President’s Emergency Plan for AIDS Relief (PEPFAR), referred to as the Data Collaboratives for Local Impact (DCLI). This partnership is improving the use of data analysis for decision-making within PEPFAR and MCC partner countries by working toward evidence-based programs to address challenges in HIV/AIDS and health, empowerment of women and youth, and sustainable economic growth. Data-driven priority setting and insights gathered by citizen-generated data and community mapping initiatives contribute to improved allocation of resources in target communities to address local priorities, such as job creation, access to services, and reduced gender-based violence. The impact of DCLI is being extended through a new partnership in Côte d’Ivoire, where MCC, Microsoft, and others are partnering to develop a women’s data lab and network program. The program will empower women-owned or women-led small and medium enterprises and female innovators and entrepreneurs with digital and data skills to effectively participate in the digital economy and grow their businesses.
Score
8
U.S. Department of Education
5.1 Did the agency have a strategic data plan, including an open data policy [example: Evidence Act 202(c), Strategic Information Resources Plan]?
- The ED Data Strategy, the first of its kind for the U.S. Department of Education, was released in December 2020. It recognized that ED can, should, and will do more to improve student outcomes through the more strategic use of data. The ED Data Strategy goals are highly interdependent with cross-cutting objectives requiring a highly collaborative effort across ED’s principal offices. The strategy calls for strengthening data governance to administer the data the department uses for operations, answer important questions, and meet legal requirements. To accelerate evidence building and enhance operational performance, ED must make its data more interoperable and accessible for tasks ranging from routine reporting to advanced analytics. The high volume and evolving nature of ED’s data tasks necessitate a focus on developing a workforce with skills commensurate with a modern data culture in a digital age. At the same time, safely and securely providing access for researchers and policymakers helps foster innovation and evidence-based decision-making at the federal, state, and local levels.
- Goal 4 of the ED Data Strategy calls for ED to “improve data access, transparency, and privacy.” Objective 1.4 under this goal is to “develop and implement an open data plan that describes the department’s efforts to make its data open to the public.” Improving access to ED data, while maintaining quality and confidentiality, is key to expanding the agency’s ability to generate evidence to inform policy and program decisions. Increasing access to data for ED staff; federal, state, and local lawmakers; and researchers can help ED make new connections and foster evidence-based decision-making. Increasing access can also spur innovations that support ED’s stakeholders, provide transparency about ED’s activities, and serve the public good. The Department of Education seeks to improve user access by ensuring that open data assets are in a machine-readable open format and accessible via its comprehensive data inventory. The department will better leverage expertise in the field to expand its base of evidence by establishing a process for researchers to access non-public data. Further, ED will develop a cohesive and consistent approach to privacy and enhance information collection processes to ensure that department data are findable, accessible, interoperable, and reusable.
- The department continues to wait for Phase 2 guidance from OMB to understand the required parameters for the open data plan. In the meantime, ED continues to draft the plan, which, when finalized, will conform to the new requirements associated with the release of OMB Phase 2 guidance. ED also continues to release open data; the department soft-launched the Open Data Platform (ODP) in September 2020 and publicly released it in December 2020.
5.2 Did the agency have an updated comprehensive data inventory (example: Evidence Act 3511)?
- The ED Data Inventory (EDI) was developed in response to the requirements of M-13-13 and initially served as ED’s external asset inventory. It describes data reported to ED as part of grant activities, along with administrative and statistical data assembled and maintained by ED. It includes descriptive information about each data collection along with information on the specific data elements in individual data collections.
- The ODP is ED’s solution for publishing, finding, and accessing public data profiles. This open data catalog brings together the department’s data assets in a single location, making them available with their metadata, documentation, and APIs for use by the public. The ODP makes existing public data from all ED principal offices accessible to the public, researchers, and ED staff in one location. It improves the department’s ability to grow and operationalize its comprehensive data inventory while progressing on open data requirements. The Evidence Act requires government agencies to make data assets open and machine-readable by default. The Open Data Platform is ED’s comprehensive data inventory satisfying these requirements while also providing privacy and security. It features standard metadata contained in data profiles for each data asset. Before new assets are added, data stewards conduct quality review checks on the metadata to ensure accuracy and consistency. As the platform matures and expands, ED staff and the public will find it a powerful tool for accessing and analyzing ED data, either through the platform directly or through other tools powered by its API.
- Information about department data collected by the National Center for Education Statistics (NCES) has historically been made publicly available online. Prioritized data are further documented or featured on ED’s data page. NCES is also leading a government-wide effort to automatically populate metadata from Information Collection Request packages to data inventories. This may facilitate the process of populating the EDI and the comprehensive data inventory.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement?
- As ED collaboratively took stock of organizational data strengths and weaknesses, key themes arose and provided context for the development of the ED Data Strategy. The Strategy addresses new and emerging mandates such as open data by default, interagency data sharing, data standardization, and other principles found in the Evidence Act and Federal Data Strategy. However, improving strategic data management has benefits far beyond compliance; solving persistent data challenges and making progress against a baseline data maturity assessment offers ED the opportunity to close capability gaps and enable staff to make evidence-based decisions.
- One of the first priorities for the ED Data Governance Board (DGB) in FY21 was to assess the current state of data maturity at ED. In early 2020, the Office of the Chief Data Officer (OCDO) held “discovery” meetings with stakeholders from each ED office to capture information about successes and challenges in the current data landscape. This activity yielded over 300 data challenges and 200 data successes that provided a wealth of information to inform future data governance priorities. The DGB used the understanding gained of the ED data landscape during the discovery phase to develop a Data Maturity Assessment (DMA) for each office and the overall enterprise focusing on data and related data infrastructure in line with requirements in the Federal Data Strategy 2020 Action Plan. Data maturity is a metric that will be measured and reported as part of ED’s Annual Performance Plan. Several of these activities have been supported by ED’s investment in a Data Governance Board and Data Governance Infrastructure (DGBDGI) contract.
- ED has also made concerted efforts to improve the availability and use of its data with the release of the revised College Scorecard that links data from NCES, the Office of Federal Student Aid, and the Internal Revenue Service. Through a series of recent updates, the College Scorecard team has improved the functionality of the tool to allow users to find, compare, and contrast different fields of study more easily, access expanded data on the typical earnings of graduates two years post-graduation, view median parent PLUS loan debt at specific institutions, and learn about the typical amount of federal loan debt for students who transfer. OCDO facilitated reconsideration of IRS risk assumptions to enhance data coverage and utility while still protecting privacy. The Scorecard enhancement discloses for prospective students how well borrowers from institutions are meeting their federal student loan repayment obligations, as well as how borrower cohorts are faring at certain intervals in the repayment process.
- IES continues to make available all data collected as part of its administrative data collections, sample surveys, and evaluation work. Its support of the Common Education Data Standards (CEDS) Initiative has helped to develop a common vocabulary, data model, and tool set for P-20 education data. The CEDS Open Source Community is active, providing a way for users to contribute to the standards development process.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information (example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)?
- The Student Privacy Policy Office (SPPO) leads the U.S. Department of Education’s (Department) efforts to protect privacy. SPPO manages and maintains the Department’s privacy program, including the enforcement of student privacy laws. It also serves as the center of that program, ensuring compliance with applicable privacy requirements, developing and evaluating privacy policy, and managing privacy risks across the Department. Through its role as the Department’s leader in privacy policy, implementation, education, and training, SPPO raises awareness of privacy issues, demonstrates how Departmental personnel can safeguard personally identifiable information (PII), and fosters a culture of accountability for protecting PII within the Department. These efforts are implemented through the work of SPPO’s Privacy Safeguards Team, including the Disclosure Review Board (DRB), and are supported by its Privacy Technical Assistance Center.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- The Department’s Disclosure Review Board (DRB) assists Principal Offices in managing privacy risks before releasing data assets to the public. The DRB reviews data assets prior to public release to evaluate and manage the risk of unauthorized disclosure of PII, as well as to minimize unnecessary privacy risks with any authorized disclosure of PII. In addition, the DRB establishes best practices and provides technical assistance for applying privacy and confidentiality protections in the context of the public release of data assets.
- SPPO’s Privacy Technical Assistance Center (PTAC) was established by the Department as a technical assistance resource center for education stakeholders to learn about privacy requirements and related best practices for student data systems. In this capacity, PTAC responds to technical assistance inquiries on student privacy issues and provides online FERPA training to state and school district officials. The Office of Federal Student Aid (FSA) conducted a postsecondary institution breach response assessment to determine the extent of a potential breach and provide the institutions with remediation actions around their protection of FSA data and best practices associated with cybersecurity.
- The Institute of Education Sciences (IES) administers a restricted-use data licensing program to make detailed data available to researchers when needed for in-depth analysis and modeling. NCES loans restricted-use data only to qualified organizations in the United States. Individual researchers must apply through an organization (e.g., a university, a research institution, or a company). To qualify, an organization must provide a justification for access to the restricted-use data, submit the required legal documents, agree to keep the data safe from unauthorized disclosures at all times, and participate fully in unannounced, unscheduled inspections of the researcher’s office to ensure compliance with the terms of the License and the Security Plan form.
- The National Center for Education Statistics (NCES) provides free online training on using its data tools to analyze data while protecting privacy. Distance Learning Dataset Training includes modules on NCES’s data-protective analysis tools, including QuickStats, PowerStats, and TrendStats. A full list of NCES data tools is available on its website.
Score
8
U.S. Agency for International Development
5.1 Did the agency have a strategic data plan, including an open data policy [example: Evidence Act 202(c), Strategic Information Resources Plan]?
- The agency’s data-related investments and efforts are guided by its Information Technology Strategic Plan. This includes support for the Agency’s Development Data Policy, which provides a framework for systematically collecting agency-funded data, structuring the data to ensure usability, and making the data public while ensuring rigorous protections for privacy and security. In addition, this policy sets requirements for how USAID data are documented, submitted, and updated. Guidance for USAID’s Open Data Policy may be seen in the user guide, FAQs, and help videos.
- In 2020, USAID revised the Development Data Policy to require development activities to create and submit data management plans before collecting or acquiring data. The Development Data Library (DDL) is the agency’s repository of USAID-funded machine-readable data, created or collected by the agency and its implementing partners. The DDL, as a repository of structured and quantitative data, complements the Development Experience Clearinghouse (DEC), which publishes qualitative reports and information. The agency’s data governance body, the DATA Board, is guided by annual data roadmaps that include concrete milestones, metrics, and objectives for agency data programs. A variety of stakeholder engagement tools are available on USAID’s DDL, including open data community questions and video tutorials on using DDL.
- People-level indicators for development data have traditionally been disaggregated by sex (male or female), sometimes by age, and occasionally by other demographic markers. In 2022, the DATA Board organized a Data Disaggregation Working Group to address data disaggregation issues including but not limited to how to better measure disability status, sex vs. gender identity definitions, and collection requirements and disaggregation standards for development programming and research purposes.
- In many countries it may be politically complicated or potentially unsafe to collect these data or data that ask about racial or ethnic identity. However, data can often be disaggregated by geographic location, region, or state and mapped with other demographic data to build a picture of geographic disparities. Country expertise can then be applied to analyze racial and ethnic equity dimensions, as described in ADS 205. Also in 2022, USAID’s GeoCenter developed a Geospatial Strategy, currently under review, that will guide the agency in collecting, analyzing, interpreting, and using geospatial data.
- Conducted in August 2021, USAID’s equity assessment acknowledges the urgency of addressing diversity, equity, inclusion, and accessibility through an agency-wide approach. It recommended that USAID use a consistent approach to incorporate racial and ethnic equity and diversity into policy, planning, and learning. To address this issue, the agency is working toward increased participation of local stakeholders in the evaluation/learning process by recommending, where possible and appropriate, that USAID evaluation contractors use local experts, especially those from marginalized or underrepresented communities, as external evaluation team leaders for designing and conducting evaluations.
5.2 Did the agency have an updated comprehensive data inventory (example: Evidence Act 3511)?
- Launched in November 2018 as part of the Development Information Solution, USAID’s public-facing DDL provides a comprehensive inventory of data assets available to the agency. It has posted the data inventory as a JavaScript Object Notation (JSON) file since 2015. Following the passage of the Foundations for Evidence-Based Policymaking Act, and in preparation for specific guidance expected in the upcoming release of phase 2 guidance for the act, USAID will make any necessary changes to its Comprehensive Data Inventory and continue reporting with quarterly updates as required. The DDL’s data catalog is also harvested via JSON on an ongoing basis for further distribution on the federal Data.gov website. Currently 566 USAID data assets are in the Comprehensive Data Inventory, available via USAID’s DDL, a 24% increase over the last Results for America report.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement [examples: model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c)]?
- The U.S. Agency for International Development provides both internal agency and public access to data through its enterprise digital repository solutions. Agency staff and the public can access USAID-funded reports and publications in the DEC and access USAID-funded data assets in the DDL. Both repositories store and publish information and data with standard metadata (tags), documentation such as data dictionaries, and clearly labeled standard licenses.
- To strengthen staff and public access to usable data, the agency has established data management planning requirements that promote delivery of high-quality data with rich documentation, standards, and clear licensing and terms of use. These requirements direct USAID staff to work with USAID-funded partners on the creation of activity-level data management plans, which outline the data assets collected during an activity and plan for their documentation and the use of standards. These data management plans can help ensure that USAID-funded data are submitted to the DDL as high-quality assets that are ready for publication and easy reuse.
- The agency is also advancing modernization of data access and data linkage solutions. It is exploring an advanced analytics environment called the Development Data Commons (DDC) that will enable staff to access heterogeneous data, merge these data, and analyze and visualize them in a central place for evidence building and program improvement. In addition, USAID is prototyping an Informatica data quality solution that can automate the delivery of standardized data elements and support the ability of staff to link data more efficiently.
- The USAID Data Services team, located in USAID’s Management Bureau’s Office of the Chief Information Officer, manages a comprehensive portfolio of data services in support of the agency’s mission. This includes enhancing the internal and external availability and ease of use of USAID data and information via technology platforms such as the AidScape platform, broadening global awareness of USAID’s data and information services, and bolstering the agency’s capacity to use data and information via training and the provision of demand-driven analytical services.
- The Data Services Team also manages and develops the agency’s digital repositories, including the DDL, the agency’s central data repository. Both USAID and external users can search for and access datasets from completed evaluations and program monitoring by country and sector.
- USAID staff also have access to an internal database of more than 100 standard foreign assistance program performance indicators and associated baseline, target, and actual data reported globally each year. This database and reporting process, known as the Performance Plan and Report, promotes evidence building and informs internal learning and decisions related to policy, strategy, budgets, and programs.
- The United States is a signatory to the International Aid Transparency Initiative (IATI) and reports some data to the IATI registry as frequently as monthly. The standard links an activity’s financial data to its evaluations. Partner country governments, civil society organizations, other initiatives, and websites can pull these data into their respective systems or view visualizations of IATI data. This supports the coordination and management of foreign aid and serves as an effective tool in standardizing and centralizing information about foreign aid flows within a country or to a specific topic, such as COVID-19. The agency continues to improve and add to its published IATI data and is looking into ways to utilize these data as best practice, including using these data to populate partner country systems, fulfill transparency reporting as part of the U.S. commitment to the Grand Bargain, and make decisions internally, including based on what other development actors are doing by using the Development Cooperation Landscape tool. Throughout FY21 USAID continued to publish financial and descriptive information about its COVID-19 activities.
- The agency continues to pursue better ways to communicate data insights. Its GeoCenter uses programmatic and demographic data linked with geospatial data to inform decision-making, emphasizing mapping to identify gaps in service provision and inform resource provision (for example, to compare gender-based violence hot spots and access to relevant support services and to identify geographies and communities disparately impacted by natural disasters).
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information (example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)?
- The agency’s Privacy Program and privacy policy (ADS 508) direct policies and practices for protecting personally identifiable information and data, while several policy references provide guidance for protecting information to ensure the health and safety of implementing partners. Its Development Data Policy (ADS Chapter 579) details a data publication process that provides governance for data access and data release in ways that ensure protections for personal and confidential information. As a reference to the Development Data Policy, ADS579maa explains USAID’s foreign assistance data publications and the protection of any sensitive information prior to release. The agency applies extensive statistical disclosure control on all public data before publication or inclusion in the DDL.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- While specific data on this topic are limited, USAID does invest in contracts or grants that provide support to build local organizational or governmental capacity in data collection, analysis, and use. In addition, to date, 566 USAID data assets are held in the agency’s Comprehensive Data Inventory via USAID’s DDL, a 24% increase over last year. These assets include microdata related to USAID’s initiatives that provide partner countries and development partners with insight into emerging trends and opportunities for expanding peace and democracy, reducing food insecurity, and strengthening the capacity to deliver quality educational opportunities for children and youth around the globe. Grantees are encouraged to use the data in the DDL, which provides an extensive user guide to aid in accessing, using, securing, and protecting data. The Data Services team conducts communication and outreach to expand awareness of the DDL, how to access it, and how to contact the team for support. In addition, the Data Services team has developed a series of videos to show users how to access the data available. The Data Services email account responds to requests for assistance and guidance on a range of data services from both within the agency and from implementing partners and the public.
- In late 2022, Data Services’ Data Literacy Learning Series will publish learning opportunities for both internal and external audiences on public-facing web pages, helping these audiences access the agency’s datasets while protecting privacy.
Score
6
6
AmeriCorps
5.1 Did the agency have a strategic data plan, including an open data policy (example: Evidence Act 202(c), Strategic Information Resources Plan)?
- In FY22 the agency’s chief data officer developed and implemented an organizational structure to support the maturation of enterprise data management at AmeriCorps. The data stewardship framework is a formal organizational structure that assigns documented roles and responsibilities for enterprise data to the appropriate individuals within the enterprise. Through these roles and responsibilities, individuals and the organizations to which they belong are empowered as stewards, not owners, of the data assets. In this role, they manage various aspects of the data assets in the best interest of the enterprise. By promoting accountability for data as an enterprise asset and providing for effective collaboration among the necessary stakeholders, data stewardship fosters an environment ensuring optimal performance. AmeriCorps’ data stewardship framework includes three governance tiers: a Strategic Advisory Board, a Data Governance Council, and data practitioners. Each governance tier has a charter that defines its purpose, authority, scope, functions, and membership. These governance tiers execute the agency’s data strategic goals and support the agency’s data management policy (Policy 383, currently under revision).
- The Data Governance Council is supported by collaborative working groups organized around a major category of enterprise data or a capability of data management. Three collaborative working groups were created in FY22 to support management of the agency’s diversity data, the agency’s strategic plan performance measures, and the agency’s grant management technology modernization efforts. (Note that AmeriCorps’ Technology Modernization Fund proposal for $14,000,000 was approved by the Federal Technology Modernization Board in FY22.)
5.2 Did the agency have an updated comprehensive data inventory (example: Evidence Act 3511)?
- The agency’s Information Technology Data Management Policy addresses the need to have a current and comprehensive data inventory. The agency has an open data platform. The chief data officer procured updated modules and technical support for this platform in FY22 to better support and advance the agency’s data management capacities.
5.3 Did the agency promote data access or data linkage for evaluation, evidence building, or program improvement [examples: model data-sharing or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c)]?
- AmeriCorps has a data request form and an MOU template so that anyone interested in accessing agency data may use the protocol to request data. In addition, public data sets are accessible through the agency’s open data platform. The agency’s member exit survey data were made publicly available for the first time in FY19. In addition, nationally representative civic engagement and volunteering statistics are available on an interactive platform through a data sharing agreement with the U.S. Census Bureau. The goal of these platforms is to make these data more accessible to all interested end users.
- The portfolio navigator pulls data from the AmeriCorps data warehouse for use by the agency’s portfolio managers and senior portfolio managers. The goal is to use this information for grants management and continuous improvement throughout the grant lifecycle.
5.4 Did the agency have policies and procedures to secure data and protect personal confidential information (example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)?
- The agency has a privacy policy (Policy 153) that was signed in FY20 and posted internally. The Information Technology Data Governance Policy addresses data security. The agency conducts privacy impact assessments, consisting of a privacy review of each of AmeriCorps’ largest electronic systems, which are then published online.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees in accessing the agency’s datasets while protecting privacy?
- AmeriCorps provides assistance to grantees, including governments, to help them access agency data. For example, AmeriCorps provides assistance on using the AmeriCorps Member Exit Survey data to state service commissions (many of which are part of state government) and other grantees as requested, through briefings integrated into standing calls with these entities. ORE worked with a few state commissions in FY22 to develop a prototype report tailored to each state that contains data from the Current Population Survey Civic Engagement and Volunteering (CPS CEV) Supplement and Member Exit Survey. The goal is to create these reports for state commissions to use for catalyzing data-driven national service and volunteering efforts.
Score
8
8
U.S. Department of Labor
5.1 Did the agency have a strategic data plan, including an open data policy [example: Evidence Act 202(c), Strategic Information Resources Plan]?
- The department’s Office of Data Governance led the launch of its data strategy, published in 2022, which includes the following strategic areas: ensuring data quality, building and maintaining data talent, integrating data into existing agency management and planning systems to create a practical and realizable path forward, and expanding the data capabilities for producing sophisticated analytics. This data strategy also includes five data principles and details about public and partner engagement in the development of the plan (p. 4). In addition, DOL has open data assets aimed at developers and researchers who desire data-as-a-service through application programming interfaces hosted by both the Office of Public Affairs and the Bureau of Labor Statistics (BLS). Each of these has clear documentation; is consistent with the open data policy; and offers transparent, repeatable, machine-readable access to data on an as-needed basis.
5.2 Did the agency have an updated comprehensive data inventory (example: Evidence Act 3511)?
- The department has conducted extensive inventories over the last ten years, in part to support common activities such as information technology modernization, White House Office of Management and Budget (OMB) data calls, and the general goal of transparency through data sharing. These form the current basis of DOL’s planning and administration. Some sections of the Evidence Act have led to a different federal posture with respect to data, such as the requirement for data to be open by default and considered shareable unless there is a legal requirement not to do so or a risk that releasing such data could contribute to disclosure. Led by the chief data officer and DOL Data Board, the department is currently reevaluating its inventories and its public data offerings in light of this very specific requirement and revisiting this issue among all its programs. Because this is a critical prerequisite to developing open data plans, as well as data governance and data strategy frameworks, the agency launched a website housing an updated inventory in FY22.
5.3 Did the agency promote data access or data linkage for evaluation, evidence building, or program improvement [examples: model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c)]?
- The department also has multiple restricted use access systems that exceed what would be possible with simple open data efforts. The Bureau of Labor Statistics has a confidential researcher access program, offering access under appropriate conditions to sensitive data. Similarly, the Chief Evaluation Office is launching a restricted use access program for evaluation study partners to leverage sensitive data in a consistent manner to help make evidence generation more efficient.
- The department’s Chief Evaluation Office, Employment and Training Administration, and Veterans Employment and Training Service have worked with the U.S. Department of Health and Human Services (HHS) to develop a secure mechanism for obtaining and analyzing earnings data from the National Directory of New Hires. Since FY20, DOL has entered into interagency data sharing agreements with HHS and obtained data to support ten job training and employment program evaluations.
- Since FY20, the department has continued to expand efforts to improve the quality of and access to data for evaluation and performance analysis through the Data Analytics Unit in the Chief Evaluation Office and through new pilots beginning in the BLS to access and exchange state labor market and earnings data for statistical and evaluation purposes.
5.4 Did the agency have policies and procedures to secure data and protect personal confidential information (example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)?
- The Department of Labor has a shared services approach to data security. In addition, the privacy provisions for BLS and ETA are explicit and publicly available online.
- The department has consistently sought to make as much data as possible available to the public regarding its activities. Examples of this include its Public Enforcement Database, which makes available records of activity from the worker protection agencies and the Office of Labor Management Standards’ online public disclosure room.
- The Bureau of Labor Statistics has a confidential researcher access program, offering access to sensitive data under appropriate conditions. Similarly, the Chief Evaluation Office is launching a restricted use access program for evaluation study partners to leverage sensitive data in a consistent manner to help make evidence generation more efficient.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- The State Wage Interchange System is a mechanism through which states can exchange wage data with other states in order to satisfy performance related reporting requirements under the Workforce Innovation and Opportunity Act (WIOA), as well as for other permitted purposes specified in the agreement. The State Wage Interchange System agreement includes the DOL’s Adult, Dislocated Worker, and Youth programs (Title I) and Employment Service program (Title III); the Department of Education’s Adult Education and Family Literacy Act program (Title II) and programs authorized under the Carl D. Perkins Career and Technical Education Act of 2006 (as amended); and the Vocational Rehabilitation program (Title IV). These departments have established agreements with all fifty states, the District of Columbia, and Puerto Rico.
- The Employment and Training Administration continues to fund and provide technical assistance to states under the Workforce Data Quality Initiative to link earnings and workforce data with education data in support of state program administration and evaluation. These grants support the development and expansion of longitudinal databases and enhance their ability to share performance data with stakeholders. The databases include information on programs that provide training and employment services and obtain similar information in the service delivery process.
- The Employment and Training Administration is also working to assess the completeness of self-reported demographic data to inform both agency level equity priorities and future technical assistance efforts for states and grantees to improve the completeness and quality of this information. It incorporated into funding opportunity announcements the requirement to make any data on credentials transparent and accessible through use of open linked data formats.
- In addition, ETA is working with the department’s Office of the Chief Information Officer to complete a new case management system, known as the Grants Performance Management System, for its national and discretionary grantees. In addition to supporting case management by grantees, this new system supports these grantees in meeting WIOA-mandated performance collection and reporting requirements and enables automation to ensure that programs can continue to meet updated WIOA requirements. As programs onboard to and interact with the Grants Performance Management System, the administration continues to integrate this system into the Workforce Integrated Performance System to seamlessly calculate and report WIOA primary indicators of performance and other calculations in programs’ quarterly performance reports.
- The department is currently developing a new application programming interface (version 3) that will expand the open data offerings, extend the capabilities, and offer a suite of user-friendly tools.
Score
7
7
Administration for Children and Families (HHS)
5.1 Did the agency have a strategic data plan, including an open data policy [example: Evidence Act 202(c), Strategic Information Resources Plan]?
- The Administration for Children and Families’ Interoperability Action Plan was established in 2017 to formalize its vision for effective and efficient data sharing. Under this plan ACF and its program offices will develop and implement a Data Sharing First strategy that starts with the assumption that data sharing is in the public interest. The plan states that ACF will encourage and promote data sharing broadly, constrained only when required by law or when there are strong countervailing considerations.
5.2 Did the agency have an updated comprehensive data inventory (example: Evidence Act 3511)?
- In 2020, ACF released a Compendium of ACF Administrative and Survey Data Resources. The compendium documents administrative and survey data collected by ACF that could be used for evidence building purposes. It includes summaries of twelve major ACF administrative data sources and seven surveys. Each summary includes an overview, basic content, available documentation, available data sets, restrictions on use, capacity to link to other data sources, and examples of prior research. It is a joint product of ACF’s Office of Planning, Research, and Evaluation and HHS’s Office of the Assistant Secretary for Planning and Evaluation.
- In addition, in 2019 OPRE compiled the descriptions and locations of hundreds of its archived datasets that are currently available for secondary analysis and made this information available on a single web page. The office continues to regularly update this website with current archiving information. It regularly archives research and evaluation data for secondary analysis, consistent with the ACF Evaluation Policy, which promotes rigor, relevance, transparency, independence, and ethics in the conduct of evaluation and research. This new consolidated web page serves as a one-stop resource that will help to make it easier for potential users to find and use the data that OPRE archives for secondary analysis.
- In 2020 ACF launched the Data Governance Consulting and Support project, which is providing information gathering, analysis, consultation, and technical support to ACF and its partners to strengthen data governance practices within ACF offices, and between ACF and its partners at the federal, state, local, and tribal levels. Initial work is focusing on data asset tracking and metadata management, among other topics.
5.3 Did the agency promote data access or data linkage for evaluation, evidence-building, or program improvement [examples: model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c)]?
- The Administration for Children and Families has multiple efforts underway to promote and support the use of documented data for research and improvement, including making numerous administrative and survey datasets publicly available for secondary use and actively promoting the archiving of research and evaluation data for secondary use. These data are machine readable, downloadable, and de-identified as appropriate for each data set. For example, individual-level data for research are held in secure restricted use formats, while public use data sets are made available online. To make it easier to find these resources, ACF released a Compendium of ACF Administrative and Survey Data and consolidated information on archived research and evaluation data on the OPRE website.
- Many data sources that may be useful for data linkage for building evidence on human services programs reside outside of ACF.
- In 2020, OPRE released the Compendium of Administrative Data Sources for Self-Sufficiency Research, describing promising administrative data sources that may be linked to evaluation data in order to assess long-term outcomes of economic and social interventions. It includes national, federal, and state sources covering a range of topical areas. In addition, in October 2021 OPRE released A Guide for Using Administrative Data to Examine Long-Term Outcomes in Program Evaluation, a resource to assist program evaluation project teams (including funders, sponsors, and evaluation research partners) in assessing the feasibility and potential value of examining long-term outcomes using administrative data. It describes common steps involved in linking evaluation data and administrative data and how to assess the quality of the linked study and administrative data. While it is primarily targeted to research audiences seeking to access administrative data to assess long-term outcomes once an evaluation has been completed, it is also useful for designing research and evaluations up front in order to enable such analysis at a later date. Both publications were produced under contract by MDRC as a part of OPRE’s Assessing Options to Evaluate Long-Term Outcomes Using Administrative Data project.
- The Office of Planning, Research, and Evaluation has also released multiple publications to assist states, localities, and research teams in negotiating the privacy and confidentiality requirements of linking and accessing data for research, evaluation, and/or operational and program improvement purposes. For example, in October 2021 OPRE released an updated Confidentiality Toolkit, which contains information on how to responsibly share personally identifiable information collected by human services and related programs to improve program outcomes. It discusses key federal privacy requirements, strategies to resolve challenges, and information technology security. It also includes documents used to facilitate record sharing and links to helpful resources. Building on the toolkit, OPRE released a Case Study Report on Iowa’s Integrated Data System for Decision-Making (I2D2) in May 2022. This report is the first in a planned series of publications that highlight innovative and unique state and local data sharing initiatives that are functional while protecting data privacy and confidentiality, consistent with the federal- and state-level legal requirements. These reports focus on the privacy and confidentiality challenges that states and localities face—and how they can be overcome—and provide model “tools” and resources (e.g., data sharing agreements) in downloadable and editable formats. Similarly, in June 2022 OPRE released the Sharing and Accessing Administrative Data: Promising Practices and Lessons Learned from the Child Maltreatment Incidence Data Linkages Project, which highlights promising practices for sharing and accessing data and discusses lessons learned related to four key activities essential to sharing and accessing data: (1) developing agreements for data sharing and use; (2) protecting the data’s security, confidentiality, and privacy; (3) securing institutional review board (IRB) and other approvals; and (4) accessing the data.
- Additionally, ACF is actively exploring how enhancing and scaling innovative data linkage practices can improve our understanding of the populations served by ACF and build evidence on human services programs more broadly. For instance, the Child Maltreatment Incidence Data Linkages (CMI Data Linkages) project is examining the feasibility of leveraging administrative data linkages to better understand child maltreatment incidence and related risk and protective factors. Similarly, the Child and Caregiver Outcomes Using Linked Data project, a partnership between OPRE and ASPE, is working with states to enhance capacity to examine outcomes for children and parents who are involved in state child welfare systems and who may have behavioral health issues. The Office of Planning, Research, and Evaluation will shortly release a publication documenting Florida and Kentucky’s projects to link the Medicaid records of parents with the records of their children from the child welfare system and produce de-identified linked files for research use. This publication examines the practical aspects of creating such data linkages, including the language and interpretations of relevant state laws, and can be used as a guide for other states seeking to conduct the same linkages. In 2023 the project will be making available to researchers de-identified state-level datasets through a restricted use data archive. Also, in August 2021, OPRE published a brief presenting findings from the 2019 TANF Data Innovation Needs Assessment. This survey of state TANF agencies was designed to understand state strengths and challenges in linking and analyzing administrative data for program improvement. Findings from the needs assessment informed technical assistance provided to states through ACF’s TANF Data Collaborative. Information from the brief may be helpful to states, policymakers, and other funders in helping to support states in linking data for the purpose of evidence building.
- The Administration for Children and Families actively promotes archiving of research and evaluation data for secondary use. Research contracts initiated by OPRE include a standard clause requiring contractors to make data and analyses supported through federal funds available to other researchers and to establish procedures and parameters for all aspects of data and information collection necessary to support archiving information and data collected under the contract. Many datasets from past ACF projects are stored in archives including the ACF-funded National Data Archive on Child Abuse and Neglect, the ICPSR Child and Family Data Archive, and the ICPSR data archive more broadly. Grants for secondary analysis of ACF/OPRE data have been funded by OPRE; examples in recent years include secondary analysis of strengthening families datasets and early care and education datasets. In 2019 ACF awarded Career Pathways Secondary Data Analysis Grants to stimulate and fund secondary analysis of data collected through the Pathways for Advancing Careers and Education (PACE) Study, HPOG Impact Study, and HPOG National Implementation Evaluation on questions relevant to career pathways programs’ goals and objectives. Information on all archived datasets that are currently available for secondary analysis is available on OPRE’s website. In 2022, OPRE developed a learning agenda to support ACF’s data archiving activities. Activities will document key lessons learned and identify a conceptual model to help federal project officers plan for data archiving. The resulting brief and toolkit will disseminate best practices for making data from federally funded research studies available for secondary use, as well as identify areas for future learning and growth.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information (example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)?
- The Administration for Children and Families receives privacy and security guidance from both the ACF and the HHS Office of the Chief Information Officer. Between these two offices, there are several policies and practices in place to ensure that all ACF data are protected. The HHS policies govern the departmental policies and procedures broadly, and ACF issues more specific policies and procedures as needed to govern ACF-specific data. This includes a process by which systems are evaluated and receive an authorization to operate. There are also teams in both offices that collectively respond to all incidents and assure they are handled in an appropriate manner. The requirements are supported by auditing mechanisms and a privacy and security training program.
- In October 2021 OPRE released an updated Confidentiality Toolkit, which contains information on how to responsibly share personally identifiable information collected by human services and related programs to improve program outcomes. It discusses key federal privacy requirements, strategies to resolve challenges, and information technology security. It also includes documents used to facilitate record sharing and links to helpful resources. Building on the toolkit, OPRE released a Case Study Report in May 2022. This report is the first in a planned series of publications that highlight innovative and unique state and local data sharing initiatives that are functional while protecting data privacy and confidentiality, consistent with the federal- and state-level legal requirements. These publications focus on the privacy and confidentiality challenges that states and localities face—and how they can be overcome—and provide model “tools” and resources (e.g., data sharing agreements) in downloadable and editable formats. These publications were issued under the ACF Responsibly Sharing Confidential Data: Tools and Recommendations project, launched in 2020. The project is also exploring creating and maintaining a compendium of existing privacy and confidentiality laws for use by ACF staff.
- The Administration for Children and Families also takes appropriate measures to safeguard the privacy and confidentiality of individuals contributing data for research throughout the archiving process, consistent with its core principle of ethics. Research data may be made available as public use files when the data would not likely lead to harm or to the re-identification of an individual, or through restricted access. Restricted access files are de-identified and made available to approved researchers through secure transmission and download, virtual data enclaves, physical data enclaves, or restricted online analysis.
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- The Administration for Children and Families undertakes many program-specific efforts to support state, local, and tribal efforts to use human services data while protecting privacy and confidentiality. For example, ACF’s TANF Data Innovation Project supports innovation and improved effectiveness of state TANF programs by enhancing the use of data from TANF and related human services programs. This work includes transforming and documenting state-reported data to facilitate research use, establishing a data governance process to enable research requests, and granting secure remote data access to approved researchers, including those from states and other ACF grantees.
- Similarly, in 2020 OPRE awarded Human Services Interoperability Demonstration Grants to Georgia State University and Kentucky’s Department of Medicaid Services. These grants are intended to expand data sharing efforts by state, local, and tribal governments to improve human services program delivery and to identify novel data sharing approaches that can be replicated in other jurisdictions. For example, Georgia State University achieved interoperability between units of Georgia’s Division of Family and Children Services by adopting an approach already implemented in their school system: an open source, cloud-based hashing solution to match and link records. Georgia’s solution performs identity-matching functions without requiring a Social Security number for matching within units of the Division of Family and Children Services as well as sister agencies. The tool is open-source and available for reuse. In December 2021 OPRE awarded a second round of Interoperability Demonstration grants. The focus of the second round of grants is on using an HL7 FHIR-based approach to achieving interoperability with human services programs. Reusable tools developed through these grants will be made available via the HL7 Human and Social Services Workgroup, which OPRE established in December 2021. Also in 2019, in partnership with ASPE, OPRE began a project to support states in linking Medicaid and child welfare data at the parent-child level to support outcomes research.
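The source does not describe how the Georgia hashing solution works internally. As a rough illustration of the general idea of matching records on hashed identifiers without a Social Security number, the sketch below derives a keyed hash (HMAC-SHA-256) from normalized name and date-of-birth fields and links records whose tokens are equal. The field names, the shared key, and the sample records are hypothetical, and this exact-match approach is an assumption, not the grantee's method.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical shared secret; in practice it would be managed securely
# and known only to the parties performing the linkage.
SECRET_KEY = b"shared-secret-for-illustration-only"

def normalize(value: str) -> str:
    """Lowercase, strip accents, and drop spaces and punctuation."""
    text = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in text if ch.isalnum()).lower()

def link_token(first: str, last: str, dob: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash of the normalized identifying fields."""
    message = "|".join([normalize(first), normalize(last), normalize(dob)])
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()

def link_records(left, right):
    """Pair record ids from two lists whose tokens match exactly."""
    index = {link_token(r["first"], r["last"], r["dob"]): r["id"] for r in right}
    pairs = []
    for r in left:
        token = link_token(r["first"], r["last"], r["dob"])
        if token in index:
            pairs.append((r["id"], index[token]))
    return pairs

# Hypothetical records held by two separate units.
unit_a = [
    {"id": "A1", "first": "Maria", "last": "O'Brien", "dob": "1990-01-05"},
    {"id": "A2", "first": "John", "last": "Smith", "dob": "1985-07-30"},
]
unit_b = [
    {"id": "B7", "first": " maria ", "last": "OBrien", "dob": "1990-01-05"},
    {"id": "B8", "first": "Jon", "last": "Smith", "dob": "1985-07-30"},
]
pairs = link_records(unit_a, unit_b)
print(pairs)
```

Only the first pair of records links here, because normalization absorbs spacing and punctuation differences but not the "John"/"Jon" spelling variation. Exact-match hashing is simple and keeps raw identifiers out of the exchange, but it misses near matches, so real deployments often add fuzzy or probabilistic matching.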
Score
6
6
Substance Abuse and Mental Health Services Administration
5.1 Did the agency have a strategic data plan, including an open data policy [example: Evidence Act 202(c), Strategic Information Resources Plan]?
- The current SAMHSA Strategic Plan includes priority 4: improving data collection, analysis, dissemination, and program and policy evaluation. The next strategic plan is currently under revision and is expected to include the prioritization of equity, trauma-informed approaches, and a commitment to data and evidence across all policies and programs.
- In addition to its strategic plan, SAMHSA is developing a SAMHSA Data Plan. Development of the plan includes collecting input from fourteen listening sessions with internal and external stakeholders and users of SAMHSA data. Through the work of the newly created position of chief diversity officer, equity, diversity, and inclusion principles will infuse the full scope of the data cycle (collection, analysis, and use in evidence-informed practices).
- SAMHSA partners with the National Center for Health Statistics to offer individuals access to restricted use data for research and evaluation purposes. This is a carefully controlled process designed to ensure that data and the individuals who provide the data are protected.
5.2 Did the agency have an updated comprehensive data inventory (example: Evidence Act 3511)?
- SAMHSA’s Report and Dissemination site identifies eight data collections: the National Survey on Drug Use and Health (NSDUH); Treatment Episode Data Set (TEDS); National Survey of Substance Abuse Treatment Services (N-SSATS); the National Mental Health Services Survey (N-MHSS); Drug Abuse Warning Network (DAWN); Mental Health Client-Level Data (MH-CLD); and the Uniform Reporting System.
- Five of eight data collection and survey datasets (i.e., TEDS, DAWN, NSDUH, N-MHSS, N-SSATS) and archived studies are publicly available at the Substance Abuse and Mental Health Data Archive (SAMHDA). SAMHDA is a one-stop shop for SAMHSA public use data including online analytic capabilities and downloadable datasets. SAMHSA provides two data analysis systems (PDAS for public use and RDAS for restricted use) for researchers. As part of PDAS, there is also an NSDUH small area estimates tool to quickly see data by state and substate areas. Datasets are available for public use in multiple formats: ASCII, SAS, SPSS, Stata, R, and TSV. An additional publicly available resource on the SAMHSA website is the Behavioral Health Barometer. It draws on the National Survey on Drug Use and Health and National Survey of Substance Abuse Treatment Services data to provide detailed reports on behavioral health indicators by state or HHS region. Also publicly available on the SAMHSA website is the SAMHSA Evaluation of Programs and Policies document, which was revised on May 31, 2022.
- In FY22, through the leadership and coordination of the Evidence and Evaluation Board, SAMHSA began the development of an evaluation repository or inventory of evaluations and evidence building activities over the past five years. As evaluations are submitted by centers/offices, CBHSQ staff are creating an inventory of the data, which are posted on the internal Evidence and Evaluation Board SharePoint site.
5.3 Did the agency promote data access or data linkage for evaluation, evidence building, or program improvement [examples: model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c)]?
- The Center for Behavioral Health Statistics and Quality oversees data collection initiatives and provides publicly available datasets so that as much data as possible can be shared with researchers and other stakeholders while preserving client confidentiality and privacy.
- The center recently developed a data transfer agreement for uniform and protected sharing of data; the agreement was updated and began implementation by CBHSQ in FY22. Additionally, as the main center within SAMHSA that collects, stewards, and disseminates data, CBHSQ is central to the process of developing a short-term and long-term Strategic Data Plan. SAMHSA is also working with HHS on a department-wide data strategy including a data maturity model and policies for data governance and sharing.
- In FY21, CBHSQ built internal technical capacity for data collections and began the process of modernizing them. For example, the N-SSATS and N-MHSS have been combined into the National Substance Use and Mental Health Services Survey (NSUMHSS) to decrease burden and duplication of responses. The Substance Abuse and Mental Health Data Archive contains substance use disorder and mental illness research data available from CBHSQ’s seven data collections for restricted and public use. To promote the access and use of SAMHSA’s substance abuse and mental health data, SAMHDA provides public use data files and documentation for download, as well as online analysis tools to support a better understanding of this critical area of public health.
- In addition, SAMHSA partners with the National Center for Health Statistics (NCHS) to make restricted use data available to researchers through the Research Data Center (RDC), which NCHS operates. For access to these data, researchers must submit a research proposal outlining the need for their use. In FY21, many of the procedures for the application process moved in-house from NCHS, and a CBHSQ RDC website was created.
- In FY22, SAMHSA engaged in rigorous activities to update the disparity impact statement (DIS), a secretarial priority from the HHS Action Plan to Reduce Racial and Ethnic Health Disparities (2011). The current objective is to “assess and heighten the impact of all HHS policies, programs, processes, and resource decisions to reduce health disparities. In support of this objective, HHS leadership will assure that . . . program grantees, as applicable, will be required to submit health disparity impact statements as part of their grant applications.” The secretarial priority focuses on underserved racial and ethnic minority populations (e.g., Black/African American, Hispanic/Latino, Asian American, Native Hawaiian and Pacific Islander, and American Indian/Alaska Native). The Office of Behavioral Health Equity also includes LGBTQI+ populations as underserved disparity-vulnerable groups.
- Through SPARS, grantees and SAMHSA program staff monitor the performance of grantees and, when performance is below targets, provide technical assistance and support. This allows SAMHSA to support communities during the grant process. Staff at SAMHSA meet with grantees regularly to discuss progress and to examine data entered in SPARS, thus ensuring timely submission of data. To quickly resolve issues as they arise, SPARS contractors and SAMHSA staff meet weekly and work closely together.
5.4 Did the agency have policies and procedures to secure data and protect personal confidential information (example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)?
- The Substance Abuse and Mental Health Services Administration shares data in three ways: (1) on its website (SAMHDA), (2) through the restricted use data program, and (3) through a data use agreement. Policies and procedures to secure SAMHSA’s data and protect personal confidential information mirror those of sister operating divisions.
- Data Use Agreement: In FY22, CBHSQ updated its data use agreement to enable data to be used internally by SAMHSA staff interested in gaining access to data sets as well as by external stakeholders, such as contractors and partnering federal agencies. Signed documents and data use agreements are held by SAMHSA’s confidentiality officer to ensure that all procedures are followed and adhered to including confidentiality training.
- SAMHSA is piloting the revised data use agreement with two of its data systems; the current data use agreement is not available on the SAMHSA website while in pilot testing. This updated agreement manages personal confidential information by limiting its release. Direct identifying information of respondents is rarely released outside of the agency, except to CBHSQ data collection contractors that use these data to conduct survey operations. Any sharing of this information must be conducted under a written agreement or contract that is reviewed by the confidentiality officer and then approved by the director.
- In all instances, CBHSQ must satisfy the requirements of federal law prior to any release. When combined, data on unique characteristics can identify a respondent. These detailed data are released only to agents. Agents can be other federal agencies, state or local governments, university researchers, private businesses, or CBHSQ contractors, as part of CBHSQ’s restricted use data program. There is an application process, with approved agents required to implement and adhere to security procedures to protect the data from unauthorized disclosure or access.
- Once the application has been approved, confidentiality training must be completed and signed off, documenting that the researcher has read and will follow the RDC disclosure review policies and procedures. The confidentiality training, confidentiality forms, and disclosure manual outline the policies and procedures required to protect the data and prevent the disclosure of confidential information. Both the principal investigator and the analyst must complete the confidentiality training and sign the confidentiality forms. The completed certificate and data user agreement forms must be uploaded with the application to be considered a complete package.
- Agents are subject to unannounced or announced inspections of their facilities to assess compliance with CBHSQ data security requirements. More importantly, measures specific to the source and type of data are implemented to protect confidentiality of the data.
- Micro-agglomeration, Substitution, Subsampling, and Calibration: The NSDUH survey has developed a statistical disclosure control technique called micro-agglomeration, substitution, subsampling, and calibration (MASSC) to protect confidentiality of the data. This is a disclosure limitation methodology specifically developed for NSDUH to meet the requirements of the Confidential Information Protection and Statistical Efficiency Act (CIPSEA). The goal of this technique is to control the disclosure risks while minimizing the impact of the disclosure control measures on the quality of the data in a comprehensive and integrated manner. It has been successfully used to create NSDUH public use files since 1999.
- Confidentiality Officer and Training: In addition to having a confidentiality officer within CBHSQ who ensures that staff complete training and sign a confidentiality statement, SAMHSA offers a certificate of confidentiality that protects grantees from legal requests for names or other information that would personally identify participants in the evaluation of a grant, project, or contract. CBHSQ trains all staff in good data stewardship, whether the data are covered by CIPSEA or the Privacy Act (5 U.S.C. 552a) and the Public Health Service Act [42 U.S.C. 290aa(n)].
- The Center for Behavioral Health Statistics and Quality National Data Sets: Multiple means are used to protect data and ensure the protection of personally identifiable information including encryption and limiting access to data.
- Discretionary Grant Data: The data entry, technical assistance request, and training system for grantees to report performance data to SAMHSA is hosted by SPARS. This system serves as the data repository for the agency’s three centers: the Center for Substance Abuse Prevention, Center for Mental Health Services (CMHS), and Center for Substance Abuse Treatment. To safeguard confidentiality and privacy, the current data transfer agreement limits the use of grantee data to internal reports, so data collected by SAMHSA grantees will not be available to share with researchers or stakeholders beyond SAMHSA, and publications based on grantee data will not be permitted.
5.5 Did the agency provide assistance to city, county, and/or state governments and/or other grantees on accessing the agency’s datasets while protecting privacy?
- The Substance Abuse and Mental Health Services Administration provides both public access and restricted use access to its datasets in a variety of ways, for example:
- Data from CBHSQ’s various data collections are available (1) as prepublished estimates, (2) via online systems, and (3) as microdata files. A description of CBHSQ’s products can be found in SAMHDA.
- SAMHSA partners with NCHS to give researchers access to restricted use data through the RDC. For access to the restricted use data, researchers must submit a research proposal outlining the need for these data. The proposal provides a framework for CBHSQ to identify potential disclosure risks and determine how the data will be used.
Score
6
6
U.S. Dept. of Housing & Urban Development
5.1 Did the agency have a strategic data plan, including an open data policy [example: Evidence Act 202(c), Strategic Information Resources Plan]?
- The department’s FY22-26 Strategic Plan lists a more accessible data system as an objective. In its FY22-26 Learning Agenda, HUD set forth the roles of the chief data officer, which include developing a HUD Enterprise Data Strategy, updating the Data Asset Catalog, and finalizing the Open Data Plan in compliance with the Evidence Act. The document also presented the statistical official’s role to encompass supporting roles for developing and implementing the Data Asset Catalog and Open Data policy. Currently, HUD’s open data program includes assets such as administrative datasets on data.hud.gov; spatially enabled data on the eGIS portal; PD&R datasets for researchers and practitioners; and robust partnerships with the U.S. Census Bureau, U.S. Postal Service vacancy data, and health data linkages with the National Center for Health Statistics. The department’s public datasets are designed to allow analysis by race/ethnicity, gender, and other equity-related characteristics to the extent possible given the nature of the data and privacy constraints.
5.2 Did the agency have an updated comprehensive data inventory (example: Evidence Act 3511)?
- The Department of Housing and Urban Development has extensive data sharing processes including public sharing, interagency sharing, and internal sharing, with each mode requiring specific controls and documentation. The department’s chief data officer assumes responsibility for creating a master inventory of HUD data assets as publicly noted in the department’s FY22-FY26 Learning Agenda. Currently HUD is reviewing its existing data inventory and updating it accordingly to produce a comprehensive data inventory. The datasets at Huduser.gov could be considered the most accessible and user friendly version of the data inventory that is managed by PD&R. The office shares the data update schedule and datasets by research categories.
5.3 Did the agency promote data access or data linkage for evaluation, evidence building, or program improvement [examples: model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c)]?
- The Department of Housing and Urban Development has extensively promoted data access and data linkage, including the following approaches:
- An updated list of open data assets; numerous PD&R-produced datasets for researchers and practitioners, including tenant public use microdata samples, A Picture of Subsidized Households, fair market rents and income limits, Comprehensive Housing Affordability Strategy special tabulations of the American Community Survey; and an eGIS portal providing geo-identified open data to support public analysis of housing and community development issues using GIS tools. The eGIS portal is a comprehensive geospatial data source with web-mapping tools and application program interfaces (APIs). Uploaded data sets are categorized and tagged by ten major topics of HUD programs, and HUD provides an exploratory image where users can filter and download data with a few clicks.
- Data linkage agreements with the National Center for Health Statistics (NCHS) and the Census Bureau. Policy Development and Research has formed a partnership with NCHS, which links HUD’s administrative data on rental assistance participants with NCHS health surveys and Medicare, Medicaid, and mortality data. The Department of Housing and Urban Development has partnered with the Census Bureau across multiple projects to link data products, including American Housing Survey (AHS) data and American Community Survey data.
- HUD has created a repository of properties, units, and tenants that merges data across the various HUD rental assistance programs for use in research, evaluation, and reporting. This allows for standardization and greater access to sociodemographic characteristics of HUD’s clients.
- Engagement in cooperative agreements with research organizations, including both funded research partnerships and unfunded data license agreements, to support innovative research that leverages HUD’s data assets and informs HUD’s policies and programs. Data licensing protocols ensure that confidential information is protected.
5.4 Did the agency have policies and procedures to secure data and protect personal, confidential information (example: differential privacy, secure multiparty computation, or homomorphic encryption; or developing audit trails)?
- The Department of Housing and Urban Development’s statistical official supports the evidence officer on issues related to protection of confidential data and statistical efficiency. Its Evaluation Policy specifies that HUD protects client privacy by adhering to the Rule of Eleven to prevent disclosure from tabulations with small cell sizes. Data licensing protocols ensure that researchers protect confidential information when using HUD’s administrative data or program demonstration datasets.
- The statistical official collaborates with statistical agencies to create data linkages and develop data products that are machine-readable and include robust privacy protections. The Department of Housing and Urban Development has an interagency agreement with the Census Bureau to conduct the AHS and collaborates with Census staff to examine disclosure issues for AHS public use files and the potential for “synthetic” public datasets to support researchers in estimating summary statistics with no possibility of reidentifying survey respondents. Another interagency agreement allows the Census Bureau to link data from HUD’s RCTs with other administrative data collected under the privacy protections of its Title 13 authority. These RCT datasets are the first intervention data added to Federal Statistical Research Data Centers (RDCs) by any federal agency. Strict RDC protocols and review of all output ensure that confidential information is protected, and the open data and joint support for researchers are currently facilitating seven innovative research projects at minimal cost to HUD.
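The small-cell protection mentioned above can be illustrated with a short sketch. It assumes that the Rule of Eleven means a tabulated count is withheld whenever it is below 11; the function name, the threshold parameter, and the sample table are hypothetical and are not taken from HUD's Evaluation Policy. Complementary suppression (hiding additional cells so that suppressed values cannot be recovered from row or column totals) is deliberately left out.

```python
from typing import Dict, Optional


def suppress_small_cells(
    table: Dict[str, int], threshold: int = 11
) -> Dict[str, Optional[int]]:
    """Return a copy of a tabulation with small cells withheld.

    Assumption: a cell is withheld (set to None) when its count is
    below `threshold`. Cells at or above the threshold are unchanged.
    """
    return {
        cell: (count if count >= threshold else None)
        for cell, count in table.items()
    }


# Hypothetical tabulation of assisted households by category.
counts = {"category_a": 240, "category_b": 11, "category_c": 10, "category_d": 3}
published = suppress_small_cells(counts)
# category_a and category_b are kept; category_c and category_d become None.
```

A real disclosure review would also check whether withheld cells can be derived from published totals, which this sketch does not attempt.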
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- The Department of Housing and Urban Development has an updated list of open data assets, an open data program, numerous PD&R datasets for researchers and practitioners, and an eGIS portal providing geo-identified data to support public analysis of housing and community development issues related to multiple programs and policy domains using GIS tools. For example, HUD supports local governments in assessing and planning for housing needs by providing summary data files about HUD-supported public and assisted housing and about local housing needs. These accessible data assets have privacy protections. Researchers needing detailed microdata can obtain access through data licensing agreements.
- Numerous resources and training opportunities to help program partners use data assets more effectively are available through HUDExchange. Additional technical assistance is offered through the program; a $91,000,000 technical assistance program equips HUD’s customers with the knowledge, skills, tools, capacity, and systems to implement HUD programs and policies successfully and provide effective oversight of federal funding. The department supports in-depth, one-on-one technical assistance for HUD funding recipients. The technical support is delivered by HUD headquarters and field office staff and resources such as online courses, webinars, guidebooks, and virtual help desk responses.
Score
8
8
Administration for Community Living (HHS)
5.1 Did the agency have a strategic data plan, including an open data policy [example: Evidence Act 202(c), Strategic Information Resources Plan]?
- As an operating division of a Chief Financial Officers Act agency, HHS, ACL is not required to have its own strategic data plan and utilizes HHS’s data strategy. In 2016, ACL implemented a Public Access Plan as a mechanism for compliance with the White House Office of Science and Technology Policy’s public access policy. The plan focused on making published results of ACL/NIDILRR-funded research more readily accessible to the public, making scientific data collected through ACL/NIDILRR-funded research more readily accessible to the public, and increasing the use of research results and scientific data to further advance scientific endeavors and other tangible applications. In 2019, ACL created a council to improve its data governance, including the development of improved processes and standards for defining, collecting, reviewing, certifying, analyzing, and presenting data collected through its evaluation, grant reporting, and administrative performance measures. In 2020, its first year, the ACL Data Council produced an annotated bibliography to provide essential background information about the topic, a primer to detail best practices in data governance specifically as they apply to ACL, and a Data Quality 101 infographic to guide decision-making processes related to data quality. The next phase of the ACL Data Council is currently under review.
5.2 Did the agency have an updated comprehensive data inventory (example: Evidence Act 3511)?
- The Administration for Community Living provides comprehensive public access to its programmatic data through its Aging, Independence, and Disability Program Data Portal (AGID), which is currently being rebuilt. It also has two data inventories available to the public on the NARIC website: REHABDATA, a database of rehabilitation and disability literature, and the Online Program Directory, which contains NIDILRR’s previously funded, currently funded, and newly funded grants. An ACL/NIDILRR public access plan, first published in February 2016, makes available to the public peer-reviewed publications and scientific data arising from research funded in whole or part by ACL through the NIDILRR to the extent feasible and permitted by law and available resources. The requirements outlined in this plan are being applied prospectively and not retrospectively. In addition, ACL is creating an internal evidence inventory that staff will be able to use to search for relevant program performance and evaluation data by agency priority question.
5.3 Did the agency promote data access or data linkage for evaluation, evidence building, or program improvement [examples: model data-sharing agreements or data-licensing agreements; data tagging and documentation; data standardization; and downloadable machine-readable, de-identified tagged data; Evidence Act 3520(c)]?
- The OPE has access to all of ACL’s performance and evaluation data and is able to link those data and advise programs about their availability and usability. In March 2019, ACL completed the ACL Data Restructuring Project to assess the data hosted on AGID and to develop and test a potential restructuring of the data in order to make these data useful and usable for stakeholders. In 2021, ACL published a final report on the Data Restructuring II Project in which ACL awarded a follow-on contract to further integrate its datasets along the lines of conceptual linkages and to better align the measures within its data collections. It funded several grants to promote data linkage. Grants to Enhance State Adult Protective Services (APS) were awarded in FY19 to increase intra- and inter-state sharing of information on APS cases. The 2020 Empowering Communities to Reduce Falls and Falls Risk grant, which was awarded to develop robust partnerships and a result-based comprehensive strategy for reducing falls and fall risks among older adults and adults with disabilities living in the community, directs grantees to consider Centers for Disease Control and Prevention opportunities to broaden and improve the linkage between primary care providers and evidence-based community falls prevention programs supported by ACL.
5.4 Did the agency have policies and procedures to secure data and protect personal confidential information (example: differential privacy; secure, multiparty computation; homomorphic encryption; or developing audit trails)?
- As an operating division of HHS, ACL follows all departmental guidance regarding data privacy and security. This includes project-specific reviews by ACL’s Office of Information Resource Management, which monitors all of ACL’s data collection activities to ensure the safety and security of ACL’s data assets. In FY19, ACL awarded a contract to establish a data council to enhance the quality, security, and statistical usability of the data collected by ACL through its evaluation, grant reporting, and administrative data collections and to develop effective data governance standards.
- The NIDILRR model systems’ data centers have extensive standard operating procedures that are designed to secure data and protect personal and confidential information. Below are a few illustrative examples from these data centers:
- The Burn Model System National Data and Statistical Center has a procedures page that lists all of the standard operating procedures that grantees contributing to this database must follow.
- The Traumatic Brain Injury Model Systems National Data and Statistical Center has a standard operating procedures page that describes the procedures that all grantees contributing to this database must follow.
- The Spinal Cord Injury Model Systems National Data and Statistical Center has a page on Using the National Spinal Cord Injury Model Systems Database. Descriptions of what constitutes “de-identified data” can be found on this page.
- In addition to the model systems data centers referenced above, NIDILRR developed Part 2: Preparing Data and Documentation. This page and video are part of the larger training course, NIDILRR Data Archiving and Sharing Training, that NIDILRR grantees must complete. Additional guidance is available on the ICPSR web page Resources for National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR) Grantees. Each funding opportunity announcement states that “a data and safety monitoring board (DSMB) is required for all multi-site clinical trials involving interventions.”
5.5 Did the agency provide assistance to city, county, and/or state governments, and/or other grantees on accessing the agency’s datasets while protecting privacy?
- Through the AGID system, which is currently being redeveloped to meet current privacy, security, and technology needs, ACL data sets are made publicly available. While AGID is under construction, ACL is releasing as much data publicly as is feasible given resources and capacity. Technical assistance is provided by ACL staff through presentations and by ACL’s technical assistance resource centers to grantees, including state, tribal, and local governments. The resource centers providing technical assistance include the National Resource Center on Nutrition and Aging, the Alzheimer’s Disease Supportive Services Program, and the University Centers for Excellence in Developmental Disabilities Education, Research, and Service. This technical assistance includes annual workshops and presentations at the Title VI National Training and Technical Assistance Conference and training available through the ACL-funded National Ombudsman Resource Center and the Disability and Rehabilitation Research Program, which funds capacity building for minority research entities. In addition, NIDILRR has a number of resources to help the public access its data responsibly. The National Spinal Cord Injury Statistical Center, for example, has a PDF document entitled Using the National Spinal Cord Injury Model Systems Database. This same center also has an online Data Request Form that requestors need to complete before gaining access to data. The National Data and Statistical Center for the Traumatic Brain Injury Model Systems has a web page entitled How to obtain a dataset from the TBIMS. Similarly, the Burn Model Systems’ National Data and Statistical Center has a page with instructions on how to access data.