
February 2, 2025

Backing Up US Federal Databases

Posted by John Baez

I hope you’ve read the news:

Many of the pages taken down mention DEI, but they also include research papers on optics, chemistry, medicine and much more. They may reappear, but at this time nobody knows.

If you want to help save US federal web pages and databases, here are some things to do.

  • First check to see if they’re already backed up. You can go to the Wayback Machine and type a website’s URL into the search bar. Also check out the Safeguarding Research Discourse Group, which has a list of what’s been backed up.

  • If they’re not already on the Wayback Machine, you can save web pages there. The easiest way to do this is by installing the Wayback Machine extension for your browser. The add-ons and extensions are listed on the left-hand panel of the website’s homepage.

  • If you’re concerned that certain websites or web pages may be removed, you can suggest federal websites and content that end in .gov, .mil and .com to the End of Term Web Archive.

  • You can suggest federal climate and environmental databases to the Environmental Data and Governance Initiative.

  • You can suggest databases to The Data Liberation Project.

  • You can suggest databases and also report databases you’ve backed up to the Safeguarding Research Discourse Group. This seems to be a community devoted to such issues.

  • For Centers for Disease Control data: tell science journalist Maggie Koerth which CDC data you’ve downloaded and whether you’ve made them publicly available.

I’ve taken these suggestions from Naseem Miller and added a bit. As you can see, there are overlapping efforts that are not yet coordinated with each other. This has some advantages (for example the Safeguarding Research Discourse Group is based outside the US) and some disadvantages (it’s hard to tell definitively what hasn’t been backed up yet).
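If you have a long list of pages to check, the first step above (seeing whether the Wayback Machine already has a copy) can be scripted. Here is a minimal sketch in Python, assuming the Internet Archive's public availability endpoint at `https://archive.org/wayback/available` and the JSON layout it has documented (`archived_snapshots` containing a `closest` entry). The function names are my own, chosen for illustration, and the endpoint only reports the closest snapshot, so a hit does not guarantee that every file linked from a page was captured.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Wayback Machine availability endpoint (assumed unchanged; check the
# Internet Archive's API documentation before relying on it).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str) -> str:
    """Build the request URL that asks about snapshots of page_url."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(response: dict):
    """Return (snapshot_url, timestamp) from a parsed reply, or None.

    None means the reply lists no available snapshot for the page.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def check_archived(page_url: str, timeout: float = 30.0):
    """Query the Wayback Machine over the network for page_url."""
    with urlopen(availability_query(page_url), timeout=timeout) as reply:
        return closest_snapshot(json.load(reply))
```

Calling `check_archived` on a page's address returns the closest snapshot's URL and timestamp, or `None` if the Wayback Machine reports nothing. Pages that come back as `None` are the ones worth saving or suggesting to the groups listed above.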

For more on the situation, go here:

Posted at February 2, 2025 11:40 PM UTC


4 Comments & 0 Trackbacks

Re: Backing Up US Federal Databases

A little bit off topic, but do we know how well protected the arXiv is against the whims of the US government?

Donkeys’ years ago, there were arXiv mirrors all over the world, and one was encouraged to use the nearest one. Is the full arXiv repository still stored in multiple countries?

It would obviously be a disaster if something happened to the arXiv. I’d imagine that those who run it have been wise enough to protect themselves against possible legal as well as technological threats. But given how worried people seem to be about PubMed, it can’t be taken as a given.

Posted by: Tom Leinster on March 12, 2025 11:08 AM

Re: Backing Up US Federal Databases

Tom wrote:

A little bit off topic, but do we know how well protected the arXiv is against the whims of the US government?

We don’t—that is, the people I know are worrying about this, but I haven’t been talking to people who are helping run the arXiv, so I’m not the best one to answer this.

Donkeys’ years ago, there were arXiv mirrors all over the world, and one was encouraged to use the nearest one.

Yes, that system of “mirror sites” was replaced by “the cloud”, and now some people are noticing that “the cloud” may possibly be under US control.

Is the full arXiv repository still stored in multiple countries?

“That’s an excellent question,” as people say immediately before not answering.

Posted by: John Baez on March 12, 2025 7:11 PM

Re: Backing Up US Federal Databases

It turns out the network of arXiv mirrors, located in different countries, was shut down on September 15, 2024.

Posted by: John Baez on March 13, 2025 9:27 PM

Re: Backing Up US Federal Databases

If anyone wants to back up the arXiv, they should go here:

The complete arXiv was 2.7 terabytes in March 2023, and growing at about 100 gigabytes a month.
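To get a rough idea of how much disk space a backup needs today, you can extrapolate from those two figures. This is only a sketch: it assumes the growth rate has stayed at a constant 100 gigabytes a month since March 2023 and that a terabyte means 1000 gigabytes, neither of which I've checked. The function name is made up for illustration.

```python
# Figures quoted above: 2.7 TB in March 2023, growing about 100 GB a month.
# Assumption: 1 TB = 1000 GB and the monthly growth has stayed constant.
BASE_SIZE_GB = 2700
GROWTH_GB_PER_MONTH = 100


def estimated_size_gb(year: int, month: int) -> int:
    """Linearly extrapolate the repository size from the March 2023 figure."""
    months_elapsed = (year - 2023) * 12 + (month - 3)
    return BASE_SIZE_GB + GROWTH_GB_PER_MONTH * months_elapsed
```

Under those assumptions, `estimated_size_gb(2025, 3)` gives 5100 gigabytes, or about 5.1 terabytes. If submissions have been accelerating, the true figure is larger.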

Also read this, if you want to keep your backup up to date:

Posted by: John Baez on March 14, 2025 1:02 AM
